#|------------------------------------------------------------------------------
|
| NAME
|
| LR(1)AndLALR(1)ParserGenerator.lsp
|
|
| DESCRIPTION
|
| LR parser generator which produces goto graphs and action and goto tables
| for both LR(1) and LALR(1) grammars.
|
| It gives the same parsing tables (and conflicts) as UNIX's yacc
| compiler-compiler, except that some states may be numbered in a different
| order.
|
|
| CALLING SEQUENCE
|
| Once you are in a Common Lisp interpreter, load this file using
| your path:
|
| (load "LR(1)AndLALR(1)ParserGenerator.lsp")
|
| For an LR(1) grammar, type
|
| (parser-generator "grammar.dat" "parser.dat" :parser-type 'LR1)
|
| For an LALR(1) grammar, type
|
| (parser-generator "grammar.dat" "parser.dat")
| or
| (parser-generator "grammar.dat" "parser.dat" :parser-type 'LALR1)
|
| Parser-generator prints the warning message "Conflicts were detected" to
| the console if any shift-reduce or reduce-reduce conflicts occur.
|
| For testing, you can also call
|
| (test-parser-generator)
|
| but you need to modify this function to your taste by setting the
| file paths.
|
| Online documentation when you're in the lisp interpreter is given by the
| standard documentation functions; for example,
|
| (apropos 'getHead)
| => GETHEAD
| GETHEADOFLISTUPTO (fbound)
| (describe 'getHeadOfListUpTo)
| COMMON-LISP-USER::GETHEADOFLISTUPTO
| [symbol]
|
| GETHEADOFLISTUPTO names a compiled function:
| Lambda-list: (ITEM LIST)
| Derived type: (FUNCTION (T T) (VALUES LIST &OPTIONAL))
| Documentation:
| -------------------------------------------------------------------------------
| |
| | DESCRIPTION
| |
| | Return the list from the beginning up to but not including a given item,
| | or the whole list if the item wasn't found.
| ...
| |-------------------------------------------------------------------------------
| Source file: /Users/seanoconnor/ParserGeneratorAndParser/SourceCode/ParserGenerator/LR(1)AndLALR(1)ParserGenerator.lsp
|
| (describe '*productions*)
| COMMON-LISP-USER::*PRODUCTIONS*
| [symbol]
|
| *PRODUCTIONS* names a special variable:
| Value: (((1) (S -> POLY MOD)) ((2) (MOD -> COMMA INTEGER))
| ((3) (MOD -> EPSILON)) ((4) (POLY -> POLY + TERM))
| ((5) (POLY -> TERM)) ((6) (TERM -> MULTIPLIER POWER))
| ((7) (MULTIPLIER -> INTEGER)) ((8) (MULTIPLIER -> EPSILON))
| ((9) (POWER -> X)) ((10) (POWER -> X ^ INTEGER))
| ((11) (POWER -> EPSILON)))
| Documentation:
| List of productions of the unaugmented grammar.
|
|
| INPUT FILES:
|
| grammar.dat A list of the productions of the grammar followed by
| a list of terminal symbols. The file grammar.dat
| shows an example. Epsilon productions are allowed.
|
| We assume the start symbol is the one which begins the first production
| listed in grammar.dat.
|
| Don't include $ (the right endmarker) in the list of terminals. It is
| added automatically by the program.
|
|
|
| OUTPUT FILES:
|
| parser.dat A numbered list of productions, followed by the LR(1)
| or LALR(1) goto graph (i.e. set of items) of the
| grammar and the action and goto tables. See the files
| parser.dat and lalrparser.dat for examples.
|
| The LALR(1) tables are the same as the ones in the y.output file
| generated by UNIX's yacc compiler-compiler running with the -v
| option. The only difference is that some states may be numbered in
| a different order.
|
| Shift-reduce or reduce-reduce conflicts are inserted into the action
| and goto tables at the end of the line for the state in which they
| occur.
|
| You can feed the action and goto tables to my Common Lisp LR parser
| program "parser.lisp". The goto graph indicates the state of the
| parse, just as in yacc's output, and can help to define the parsing
| error messages.
|
|
| AUTHOR
|
| Sean E. O'Connor 01 Jun 1989 Version 1.0
| 11 Mar 2008 Version 5.6 released.
|
| LEGAL
|
| LR(1)AndLALR(1)ParserGenerator Version 5.6
| An LR(1) and LALR(1) Parser Generator written in Common Lisp.
|
| Copyright (C) 1989-2024 by Sean Erik O'Connor. All Rights Reserved.
|
| This program is free software: you can redistribute it and/or modify
| it under the terms of the GNU General Public License as published by
| the Free Software Foundation, either version 3 of the License, or
| (at your option) any later version.
|
| This program is distributed in the hope that it will be useful,
| but WITHOUT ANY WARRANTY; without even the implied warranty of
| MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
| GNU General Public License for more details.
|
| You should have received a copy of the GNU General Public License
| along with this program. If not, see .
|
| The author's address is seanerikoconnor!AT!gmail!DOT!com
| with the !DOT! replaced by . and the !AT! replaced by @
|
|
| METHOD
|
| This is a Common Lisp implementation and will run under CLISP.
| The software design is layered, the simpler list manipulation
| utilities coming first, building up gradually to the specialized
| and higher level parser functions. I've put lots of examples to
| ease the pain.
|
| To construct the LR(1) goto graph (i.e. set of items) we use Algorithm
| 4.9 of [Aho 86, pg. 231-232]. To create the cannonical LR(1) parsing
| action and goto tables, algorithm 4.10 [Aho 86, pg. 234] is used.
|
| To construct the LALR(1) parsing tables, we use the much simpler
| algorithm of [Aho 74, pg. 115] instead of algorithm 4.11 in [Aho 86,
| pgs. 238-239].
|
| For computing FIRST (first derived terminals) we use algorithm 5.5 of
| [Aho 72, pgs. 357-359].
|
| The function EFF (epsilon-free first derived terminals) is described
| in [Aho 72, pg. 381]. We base the algorithm used in the function
| first-terminals-of-symbol on exercise 5.2.19 [Aho 72, pg. 398]. The
| modifications to algorithm 5.5 to make it compute EFF are my own and
| are described in my notes.
|
| In the first version of this program, we used the algorithm for FIRST
| of [Aho 86, pgs. 188-189]. But this algorithm does not always
| terminate! In particular, it fails for the grammar,
|
| S -> A S | b
| A -> S A | a
|
| of example 4.33 [Aho 86, pg. 272] by getting into the following
| infinite loop: FIRST( S ) = FIRST( A ) = FIRST( S ) ... The algorithm
| we use always terminates.
|
|
| REFERENCES
|
| See http://www.seanerikoconnor.freeservers.com for a review of the
| parsing theory behind this program.
|
|
| [Aho 86] COMPILERS: PRINCIPLES, TECHNIQUES, AND TOOLS,
| Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman,
| Addison-Wesley, 1986.
|
| [Aho 74] "LR Parsing", Alfred V. Aho and Stephen C. Johnson,
| Computing Surveys, Vol. 6, No. 2, June 1974, pg. 99-124.
|
| [Aho 72] THE THEORY OF PARSING, TRANSLATION AND COMPILING, VOLUME 1:
| PARSING, Alfred V. Aho and Jeffrey D. Ullman, Prentice-Hall,
| 1972.
|
| BUGS
|
| Have the output look like the file y.output generated by yacc -v, or
| eyacc -v.
|
| NOTES
|
| In Common Lisp, functions and variables don't share the same namespace.
| So you need to tell LISP that the variable func is actually a function by using funcall.
|
| (defun apply-func (func arg1 arg2) (funcall func arg1 arg2))
|
| So you can't do the simpler expression (func arg1 arg2).
|
| It goes the other way too. You can't say (apply-func + 1 2) because + denotes a variable.
| You have to tell LISP it's a function,
|
| (apply-func #'+ 1 2)
|
| which is a shorthand for
|
| (apply-func (function +) 1 2)
|
| first and car are synonyms as are rest and cdr
| (null '()) => T
| (null nil) => T
| (null 'a) => NIL
|
| Optional arguments using keywords example:
|
| (defun doggie-looks( dog &key (nose-color 'red) (hair-color 'white))
| (list 'my dog 'has 'a nose-color 'nose 'and hair-color 'hair))
|
| * (doggie-looks 'husky) => (MY HUSKY HAS A RED NOSE AND WHITE HAIR)
| * (doggie-looks 'husky :nose-color 'black) => (MY HUSKY HAS A BLACK NOSE AND WHITE HAIR)
|
+-------------------------------------------------------------------------------|#
; ==============================================================================
; | Constants |
; ==============================================================================
; Use the naming convention +variable-name+ to denote constants.
(defconstant +initial-hash-table-size+ 100
"-------------------------------------------------------------------------------
| Initial hash table length. Don't worry, lisp hash tables are extensible
| at run time.
--------------------------------------------------------------------------------"
)
(defconstant +hash-value-upper-limit+ 65536
"-------------------------------------------------------------------------------
| Upper limit on hash value on core of items.
|-------------------------------------------------------------------------------"
)
; ==============================================================================
; | Dynamically Bound (i.e. Global) Variables |
; ==============================================================================
; Use the naming convention *variable-name* to denote them as global.
(defvar *productions* nil
"-------------------------------------------------------------------------------
| List of productions of the unaugmented grammar (without S' -> S).
| e.g. ( (S -> S |a| S |b| / EPSILON) )
| which represents S -> S a S b, S -> EPSILON
|-------------------------------------------------------------------------------"
)
(defvar *has-epsilon-productions* nil
"-------------------------------------------------------------------------------
| T if we have any epsilon productions of the form A -> EPSILON, but NIL otherwise.
--------------------------------------------------------------------------------"
)
(defvar *terminals* nil
"-------------------------------------------------------------------------------
| List of terminal symbols for the grammar. e.g. ( |a| |b| )
|-------------------------------------------------------------------------------"
)
(defvar *first-derived-terminals* nil
"-------------------------------------------------------------------------------
| Hash table containing the first derived terminals for each grammar symbol.
|-------------------------------------------------------------------------------"
)
(defvar *epsilon-free-first-derived-terminals* nil
"-------------------------------------------------------------------------------
| Hash table containing the epsilon-free first derived terminals for each grammar symbol.
|-------------------------------------------------------------------------------"
)
(defvar *goto-graph* nil
"---------------------------------------------------------------------------------------------------------------
| Goto graph of the LR(1) or LALR(1) grammar of the form
|
| (
| ( ------+
| (6 |a| 4) <-- Transition in Goto graph from |
| state 6 to state 4 on symbol a. +------ List of graph edges and transitions.
| (1 |a| 2) <-- Transition from state 1 to state 2 |
| on a. |
| ) ------+
|
| ) --------+
| ( 0 <-- State number 0. |
| 3668 <-- Hash value of core of items. |
| ( |
| (SP -> DOT S |,| $) ----+ |
| ( S -> DOT S |a| S |b| |,| $) | |
| ( S -> DOT EPSILON |,| $) +---- Set of items for state 0. |
| ( S -> DOT S |a| S |b| |,| |a|) | |
| ( S -> DOT EPSILON |,| |a|) | |
| ) ----+ |
| ) +-- List of sets of items.
| |
| ( 2 <-- State number 2. |
| 5168 <-- Hash values of core of items. |
| ( |
| (S -> S |a| DOT S |b| |,| $) ----+ |
| (S -> S |a| DOT S |b| |,| |a|) | |
| (S -> DOT S |a| S |b| |,| |b|) | |
| (S -> DOT EPSILON |,| |b|) +-- Set of items for state 2. |
| (S -> DOT S |a| S |b| |,| |a|) | |
| (S -> DOT EPSILON |,| |a|) ----+ |
| ) |
| ) |
| ) --------+
| )
|--------------------------------------------------------------------------------------------------------------"
)
(defvar *action-table* nil
"-------------------------------------------------------------------------------
| Action table of the form,
|
| (
| ( (0) <-- state number
| (
| ($ (R 2)) <-- reduce action on end of input $
| (|a| (R 2)) <-- reduce action on symbol a.
| (DEFAULT (ERROR)) <-- otherwise must be error
| )
| )
|
| ( (1) <-- next line of action table.
| (
| ($ (ACC NIL)) <-- accept action on end of input $
| (|a| (S 2)) <-- shift action on symbol a.
| (DEFAULT (ERROR))
| )
| )
| )
|-------------------------------------------------------------------------------"
)
(defvar *goto-table* nil
"-------------------------------------------------------------------------------
| Goto table of the form,
|
| (
| ( (0) <-- state number
| (
| (S 1) <-- transition to state 1 on symbol S
| (DEFAULT (ERROR)) <-- otherwise error
| )
| )
|
| ( (2)
| (
| (S 3)
| (DEFAULT (ERROR))
| )
| )
| )
|-------------------------------------------------------------------------------"
)
(defvar *conflicts* nil
"-------------------------------------------------------------------------------
| Set to true if we have any shift-reduce or reduce-reduce conflicts.
|-------------------------------------------------------------------------------"
)
; ==============================================================================
; | General Purpose List Processing Primitives |
; ==============================================================================
(defun getHeadOfListUpTo( item list )
"-------------------------------------------------------------------------------
|
| DESCRIPTION
|
| Return the list from the beginning up to but not including a given item,
| or the whole list if the item wasn't found.
|
| CALLING SEQUENCE
|
| (getHeadOfListUpTo item list)
| => New list of all symbols before the item.
|
| EXAMPLE
|
| (getHeadOfListUpTo 'rat '(you are a rat fink)) => (YOU ARE A)
| (getHeadOfListUpTo 'cat '(you are a rat fink)) => (YOU ARE A RAT FINK)
| (getHeadOfListUpTo 'rat '(rat) ) => nil
| (getHeadOfListUpTo 'rat nil ) => nil
|
|-------------------------------------------------------------------------------"
(cond ( (null list) nil) ; Empty list.
( (equal (first list) item) nil) ; List = (item). Return ().
; Recurse.
( (cons (first list)
(getHeadOfListUpTo item (rest list)))))
)
(defun removeItemFromList( item list &key (equalityTest #'equal) )
"-------------------------------------------------------------------------------
|
| DESCRIPTION
|
| Remove all occurences of a given item from a list. Test item equality
| with a function.
|
| CALLING SEQUENCE
|
| (removeItemFromList item list :equalityTest testFunction)
| => New list with all occurrences of symbol taken out.
|
| testFunction The name of the function which tests if two symbols are
| equal. It should be a function of two arguments which
| returns T if the symbols are equal and NIL otherwise.
| It defaults to #'equal.
|
| EXAMPLE
|
| (removeItemFromList '(rat bad) '( (cat good) (rat good)))
| => ( (CAT GOOD) (RAT GOOD) )
|
| (defun sameAnimal( s1 s2 ) (equal (first s1) (first s2)))
| (funcall #'sameAnimal '(rat good) '(rat bad)) => T
|
| (removeItemFromList '(rat bad) '( (cat good) (rat good))
| :equalityTest #'sameAnimal)
| => ( (CAT GOOD) )
|
+-------------------------------------------------------------------------------"
(cond ( (null list) nil) ; Nothing in the list.
( (funcall equalityTest ; First item matches.
item (first list)) ; according to equality
; test.
(removeItemFromList item (rest list) ; Discard it and remove
; all other
:equalityTest equalityTest)) ; items too.
( t (cons (first list) ; First item does not match.
(removeItemFromList item ; Add it back and remove the
(rest list) ; remaining items.
:equalityTest equalityTest))))
)
(defun itemInList( element list &key (test #'equal) )
"-------------------------------------------------------------------------------
|
| DESCRIPTION
|
| Find out if an atom or a list is a member of a given list. Test for
| equality with a function.
|
| CALLING SEQUENCE
|
| (itemInList item list :equalityTest testFunc)
| => T if item is in list; NIL if not.
|
| testFunc The name of the function which tests if two symbols are
| equal. It should be a function of two arguments which
| returns T if the symbols are equal and NIL otherwise.
| test defaults #'equal.
|
| EXAMPLE
|
| (itemInList '(hot dog) '((cool cat) (cool dog)) ) => NIL
|
| (defun sameAnimal( s1 s2 ) (equal (first s1) (first s2)))
|
| (itemInList '(hot dog) '((cool cat) (cool dog))
| :equalityTest #'sameAnimal) => T
|
+---------------------------------------------------------------------------------"
(cond ( (null list) nil) ; Not in the list.
( (funcall test element (first list)) ; First item matches.
t)
( t (itemInList element (rest list) ; Try again on rest of list.
:test test)))
)
(defun positionInList( item list )
"-------------------------------------------------------------------------------
|
| DESCRIPTION
|
| Find the position of an item in a list.
|
| CALLING SEQUENCE
|
| (positionInList item list)
|
| item Atom or list to be found.
|
| list Any list.
|
| Returns: The position of item in the list or NIL if it is not there.
| The first position is zero.
|
| EXAMPLE
|
| (positionInList '(winter mute) '(I am (winter mute))) => 2
| (positionInList 'ratfinn '(Who you ? ratfink ?)) => NIL
|
+---------------------------------------------------------------------------------"
(cond ( (null list) nil ) ; Nothing in the list.
( (equal item (first list)) 0) ; list = (item ...), return
; position = 0.
; If the item is in the rest of the list, find its position in the
; rest of the list, then add 1 to fix up the count.
( (itemInList item (rest list))
(1+ (positionInList item (rest list))))
( t nil )) ; Item was not found ---
; return NIL.
)
(defun insertItemIntoList( item L &key (test #'equal) (precedence nil) )
"------------------------------------------------------------------------------
|
| DESCRIPTION
|
| If an object isn't already in the list, add it to the end. If it is,
| overwrite it (see below).
|
| CALLING SEQUENCE
|
| (insertItemIntoList item L :test test :precedence precedence)
|
| item An atom or list.
|
| test The test to perform to see if an item is is in the list.
| It is the name of a function with two arguments which should
| return T if its arguments are equal and NIL if they aren't.
| The test function defaults to #'equal if omitted.
|
| precedence The test function to perform to say which object has the
| higher precedence when both are equal. The one of higher
| precedence is kept. An item of higher precedence overwrites
| its lower precedence brother in the list. The function should
| be of the form (precedence x y), returning the object of
| higher precedence. Defaults to NIL (Don't care).
|
| L List of non-duplicated elements (according to equality test
| specified above).
|
| Returns: Unchanged list if item is already in it. Otherwise, returns
| the list L with the item in the last position.
|
| EXAMPLE
|
| (insertItemIntoList '(rat good) '( (rat bad) (bat good) ) )
| => ((RAT BAD) (BAT GOOD) (RAT GOOD))
| We compared for exact equality, so the new item gets inserted.
|
|
| (defun sameAnimal( s1 s2 ) (equal (first s1) (first s2)))
|
| (insertItemIntoList '(rat good)
| '((rat bad) (bat good))
| :test #'sameAnimal)
| => ((RAT BAD) (BAT GOOD))
| Rats are already in the list, so don't add the item.
|
| (defun good-always-wins( x y ) (cond ((equal (second x) 'good) x) (t y)))
|
| (insertItemIntoList '(rat good) '( (rat bad) (bat good) )
| :test #'sameAnimal
| :precedence 'good-always-wins) =>
| => ((RAT GOOD) (BAT GOOD))
| Rats are already in the list, but we now compare equal items further
| to see which have higher precedence.
|
|------------------------------------------------------------------------------"
(cond ( (null L) (list item)) ; Nothing there. Add the item.
( (funcall test item (first L)) ; Item is already in the list.
; Of the two equal objects --- item and the first element in the list ---
; keep the one of higher precedence.
(if (not (null precedence))
(cons (funcall precedence item (first L)) (rest L))
L)) ; Don't care about precedence, so
; keep the original list.
( t (cons (first L)
(insertItemIntoList item
(rest L)
:test test
:precedence precedence))))
)
(defun combine( list1 list2 &key (test #'equal) (precedence nil) )
"------------------------------------------------------------------------------
|
| DESCRIPTION
|
| Take the union of two lists. We can do a generalized test for
| equality of elements. Also, if two elements are equal, we can
| keep the one of higher precedence.
|
| CALLING SEQUENCE
|
| (combine list1 list2 :test test :precedence precedence)
|
| list1 Arbitrary lists.
|
| list2
|
| item An atom or list.
|
| test The test to perform to see if an item is is in the list.
| It is the name of a function with two arguments which should
| return T if its arguments are equal and NIL if they aren't.
| The test function defaults to #'equal if omitted.
|
| precedence The test function to perform to say which object has the
| higher precedence when both are equal. The one of higher
| precedence is kept. An item of higher precedence overwrites
| its lower precedence brother in the list. The function should
| be of the form (precedence x y), returning the object of higher
| precedence. precedence defaults to NIL (Don't care).
|
| Returns: The set theoretic union of the two lists, except that we
| always keep the element of highest precedence when two
| elements are the same.
|
| EXAMPLE
|
| (combine '((rat good) (rat awful)) '((rat bad) (bat good)))
| => ((RAT AWFUL) (RAT GOOD) (RAT BAD) (BAT GOOD))
|
| (defun sameAnimal( s1 s2 ) (equal (first s1) (first s2)))
|
| (combine '((rat good) (rat awful)) '((rat bad) (bat good))
| :test #'sameAnimal)
| => ((RAT AWFUL) (BAT GOOD))
|
| (defun good-always-wins( x y ) (cond ((equal (second x) 'good) x) (t y)))
|
| (combine '((rat good) (rat awful)) '((rat bad) (bat good))
| :test #'sameAnimal
| :precedence #'good-always-wins) => ((RAT GOOD) (BAT GOOD))
|
|------------------------------------------------------------------------------"
; Successively add elements from both lists to nil, eliminating
; duplicated or low precedence items. If both lists are nil, dolist
; does not loop, and we return nil.
(let ( (new-list nil) )
(dolist (item (union list1 list2 :test #'equal))
(setq new-list (insertItemIntoList item new-list
:test test :precedence precedence)))
new-list)
)
; ------------------------------------------------------------------------------
; | core-of-item! |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Get the core of an item: the item but without the lookahead.
;
; CALLING SEQUENCE
;
; (core-of-item! item)
;
; item [A -> alpha . beta , gamma ]
;
; Returns: [A -> alpha . beta]
;
; EXAMPLE
;
; (core-of-item! '(sandwich -> bread meat DOT bread |,| knife))
; => (SANDWICH -> BREAD MEAT DOT BREAD)
;
; ------------------------------------------------------------------------------
(defun core-of-item!( item )
(getHeadOfListUpTo '|,| item)
)
; ------------------------------------------------------------------------------
; | element-of-item? |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Find out if an item or its core is in a set of items.
;
; CALLING SEQUENCE
;
; (element-of-item item set-of-items compare-type)
;
; item [A -> alpha . beta , gamma]
;
; set-of-items ... [A' -> alpha' . beta' , gamma'] ...
;
; compare-type Whether to compare the whole item or only its core.
;
; Returns: if compare-type = 'core then T if A = A', alpha =
; alpha', beta = beta'. gamma need not equal gamma'.
; if compare-type = item then T if in addition,
; gamma = gamma', and NIL otherwise.
;
; EXAMPLE
;
; (element-of-item? '(eat -> living death |,| scum)
; '( (eat -> hot fudge |,| scum)
; (eat -> living death |,| wimp))
; 'core) => T
;
; But (element-of-item? . . . 'item) => NIL
;
; ------------------------------------------------------------------------------
(defun element-of-item?( item set-of-items compare-type )
(cond ((null set-of-items) nil) ; No items, no match.
( (if (equal compare-type 'core)
(equal (core-of-item! item) ; Core of first item
(core-of-item! (first set-of-items))) ; was found.
(equal item (first set-of-items))) ; First items match.
T) ; First item is in set
( t (element-of-item? item ; Continue to search.
(rest set-of-items)
compare-type)))
)
; ------------------------------------------------------------------------------
; | contained-in-item? |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Find if the first set of items is contained in the second
; set of items. Alternatively, find out if the cores of the
; first set of items are contained in the cores of the second set.
;
; CALLING SEQUENCE
;
; (contained-in-item? set-of-items1 set-of-items2 compare-type)
;
; set-of-items1 First and ...
;
; set-of-items2 ... second sets of of items.
;
; compare-type Type of comparison: 'item or 'core.
;
; Returns: T if compare-type = 'item and the first set of items
; is contained in the second set of items.
; T if compare-type = 'core and the cores of the first
; set of items are contained in the cores of the second
; set of items.
; EXAMPLE
;
; (contained-in-item? '( (a -> b DOT c |,| x)
; (e -> f DOT |,| g))
;
; '( (a -> b DOT c |,| h)
; (e -> f DOT |,| i)
; (f -> g DOT h i |,| j) )
; 'core ) => T
;
; However, (contained-in-item? . . . 'item) => NIL
;
; ------------------------------------------------------------------------------
(defun contained-in-item?( set-of-items1 set-of-items2 compare-type )
(cond ((null set-of-items1) T) ; Null set is contained in
; every set.
((element-of-item? (first set-of-items1) ; First item (or its core)
set-of-items2 ; is in the second set.
compare-type)
(if (null (rest set-of-items1)) ; No other elements in first set
T
(contained-in-item? (rest set-of-items1) ; Are the remaining
set-of-items2 ; items of the first
compare-type))) ; set in the second?
( t nil )) ; First element of first set isn't in
; the second set: first set can't be
; contained in second set.
)
; ------------------------------------------------------------------------------
; | equal-sets-of-items? |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Check if two sets of items are the same (or have the same core).
; You can also use it to check if two arbitrary lists contain the
; same elements.
;
; CALLING SEQUENCE
;
; (equal-sets-of-items? set-of-items1 set-of-items2 :compare-type type)
;
; set-of-items1 First and ...
;
; set-of-items2 ... second sets of of items.
;
; type Optional argument defaulting to 'item.
;
; Returns: T for type = 'item if both sets of items are identical.
; T for type = 'core if both sets of items have the same
; cores.
; METHOD
;
; Two sets of items are the same if each one is contained within the
; other. They have the same core if the core of one is contained in
; the core of the other and vice-versa. We can't just test for equality
; because the order of the items could be different.
;
; EXAMPLE
;
; (equal-sets-of-items? '( (a -> b DOT c |,| d) )
; '( (a -> b DOT c |,| d) )) => T
;
; (equal-sets-of-items? '( (a -> b DOT c |,| d) )
; '( (a -> b DOT c |,| e) )
; :compare-type 'core ) => T
;
; (equal-sets-of-items? '(a (b c) d) '(d a (b c))) => T
;
; ------------------------------------------------------------------------------
(defun equal-sets-of-items?( set-of-items1 set-of-items2
&key (compare-type 'item))
(and (contained-in-item? set-of-items1 set-of-items2 compare-type)
(contained-in-item? set-of-items2 set-of-items1 compare-type))
)
; ==============================================================================
; | Helper Functions on Symbols and Productions |
; ==============================================================================
; ------------------------------------------------------------------------------
; | terminal? |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Find out if a symbol of the grammar is a terminal.
;
; CALLING SEQUENCE
;
; (terminal? symbol)
;
; *terminals* Global list of terminal symbols for the grammar.
;
; symbol: A grammar symbol
;
; Returns: T if the symbol is a terminal symbol, NIL otherwise.
; EPSILON is not a terminal.
;
; EXAMPLE
;
; Let *terminals* = (c C),
;
; (terminal? '|c| ) => T
; (terminal? 'C ) => NIL
; (terminal? 'EPSILON) => NIL
;
; ------------------------------------------------------------------------------
(defun terminal?( symbol )
(itemInList symbol *terminals*)
)
; ------------------------------------------------------------------------------
; | nonterminal? |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Find out if a symbol of the grammar is a nonterminal.
;
; CALLING SEQUENCE
;
; (nonterminal? symbol)
;
; symbol: A grammar symbol
;
; Returns: T if the symbol is a nonterminal symbol, NIL otherwise.
; EPSILON is not a non-terminal.
;
; EXAMPLE
;
; (nonterminal? 'C ) => T
; (nonterminal? '|c| ) => NIL
; (nonterminal? 'EPSILON ) => NIL
;
; ------------------------------------------------------------------------------
(defun nonterminal?( symbol )
(and (not (equal symbol 'EPSILON))
(not (terminal? symbol)))
)
; ------------------------------------------------------------------------------
; | derives-leading-terminal? |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Check if a production derives a leading terminal.
;
; CALLING SEQUENCE
;
; (derives-leading-terminal? production)
;
; production: A production of the form X -> a Y
;
; Returns: T if a is a terminal, NIL otherwise.
;
; EXAMPLE
;
; (derives-leading-terminal? '(C -> |c| C)) => T
; (derives-leading-terminal? '(C -> C C)) => NIL
;
; ------------------------------------------------------------------------------
(defun derives-leading-terminal?( production )
(terminal? (third production))
)
; ------------------------------------------------------------------------------
; | derives-leading-nonterminal? |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Check if a production derives a leading nonterminal.
;
; CALLING SEQUENCE
;
; (derives-leading-nonterminal? production)
;
; production: A production of the form X -> a Y
;
; Returns: T if a is a terminal, NIL otherwise.
;
; EXAMPLE
;
; (derives-leading-nonterminal? '(C -> C C)) => T
; (derives-leading-nonterminal? '(C -> |c| C)) => NIL
;
; ------------------------------------------------------------------------------
(defun derives-leading-nonterminal?( production )
(nonterminal? (third production))
)
; ------------------------------------------------------------------------------
; | valid-production? |
; ------------------------------------------------------------------------------
;
; DESCRPTION
;
; Find out if a production starts with the given symbol.
;
; CALLING SEQUENCE
;
; (valid-production? symbol production)
;
; symbol Grammar symbol
;
; production A production of the form X -> alpha
;
; Returns: T if X = symbol, NIL otherwise.
;
; EXAMPLE
;
; (valid-production? 'C '(C -> |c| C)) => T
; (valid-production? 'S '(C -> |c| C)) => NIL
;
; ------------------------------------------------------------------------------
(defun valid-production?( symbol production )
(equal symbol (first production))
)
; ------------------------------------------------------------------------------
; | reduction? |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Find out if an item calls for a reduction.
;
; CALLING SEQUENCE
;
; (reduction? item)
;
; item Any item [A -> alpha . beta , gamma].
;
; Returns: T if the item is of the form [A -> alpha . , gamma ]
; and NIL otherwise.
; Note [A -> alpha . EPSILON , gamma] =
; [A -> alpha . , gamma], to this is a reduction too.
;
; EXAMPLE
;
; (reduction? '(C -> |d| DOT |,| |c|)) => T
; (reduction? '(C -> |d| DOT |e| |,| |c|)) => NIL
; (reduction? '(C -> |d| DOT EPSILON |,| |c|)) => T
;
; ------------------------------------------------------------------------------
(defun reduction?( item )
; Get everything between the dot and comma. It will be empty for a reduction.
(let ((between-dot-and-comma (getHeadOfListUpTo 'DOT (reverse (getHeadOfListUpTo '|,| item )))))
(or (null between-dot-and-comma)
(equal (first between-dot-and-comma) 'epsilon)))
)
; ------------------------------------------------------------------------------
; | is-accept? |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Find out if an item calls for an accept.
;
; CALLING SEQUENCE
;
; (is-accept? item)
;
; item Arbitrary item [A -> alpha . beta , gamma].
;
; *productions* Global list of productions for our grammar.
;
; Returns: T if the item is of the form [S' -> S . , $]
; S is the start symbol --- the left hand side symbol
; of the first production. S' (represented as SP) is
; the extra start symbol of the augmented grammar.
; EXAMPLE
;
; *productions* => ( (S -> C C) (C -> |c| C) (C -> |d|) )
; (is-accept? '(SP -> S DOT |,| $)) => T
;
; ------------------------------------------------------------------------------
(defun is-accept?( item )
(equal item `(SP -> ,(first (first *productions*)) DOT |,| $))
)
; ------------------------------------------------------------------------------
; | symbol-after-dot! |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Find the symbol following the dot in an item.
;
;
; CALLING SEQUENCE
;
; (symbol-after-dot! item)
;
; item Item of the form [A -> alpha . B beta , gamma ]
;
; Returns: The symbol B, or NIL if there is none.
;
; EXAMPLE
;
; (symbol-after-dot! '(frogs -> are DOT keen |,| you bet)) => KEEN
; (symbol-after-dot! '(toads -> are not)) => NIL
;
; ------------------------------------------------------------------------------
(defun symbol-after-dot!( item )
(cond ( (null item) nil ) ; No dot was ever found.
; Return the symbol after the dot or nil if there is none.
( (equal (first item) 'DOT) (second item) )
( T (symbol-after-dot! (rest item)))) ; Keep
; looking.
)
; ------------------------------------------------------------------------------
; | terminal-after-dot? |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Check if an item has a terminal symbol after the dot.
;
; CALLING SEQUENCE
;
; (terminal-after-dot? item)
;
; item Any item [A -> alpha . beta delta , gamma]
;
; Returns: T if beta is a terminal symbol and NIL otherwise.
;
; EXAMPLE
;
; (terminal-after-dot? '(C -> C DOT |c| |,| $ )) => T
; (terminal-after-dot? '(C -> C DOT C |,| $ )) => NIL
;
; ------------------------------------------------------------------------------
(defun terminal-after-dot?( item )
(if (null (symbol-after-dot! item)) ; No symbol after the dot.
nil
(terminal? (symbol-after-dot! item))) ; Check if the symbol after the dot
; is a terminal.
)
; ------------------------------------------------------------------------------
; | epsilon-production? |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Find out if a production derives only epsilon.
;
; CALLING SEQUENCE
;
; (epsilon-production? production)
;
; production [A -> alpha].
;
; Returns: T if alpha = EPSILON.
;
; EXAMPLE
;
; (epsilon-production? '(A -> EPSILON)) => T
;
; ------------------------------------------------------------------------------
(defun epsilon-production?( production )
(and (equal (length production) 3)
(equal (third production) 'EPSILON))
)
; ------------------------------------------------------------------------------
; | same-symbol? |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Test if two tagged grammar symbols are equal.
;
; CALLING SEQUENCE
;
; (same-symbol? s1 s2)
;
; s1, s2 Tagged grammar symbols of the form (symbol tag), with
; tag = 'EPSILON-FREE or NIL.
;
; Returns T if the symbol parts are equal.
; EXAMPLE
;
; (same-symbol? '(a NIL) '(a EPSILON-FREE)) => T
; (same-symbol? '(a EPSILON-FREE) '(b EPSILON-FREE)) => NIL
;
; ------------------------------------------------------------------------------
(defun same-symbol?( s1 s2 )
(equal (first s1) (first s2))
)
; ------------------------------------------------------------------------------
; | first-alternate! |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Return the first alternate of a production.
;
; CALLING SEQUENCE
;
; (first-alternate! rhs)
;
; rhs The right hand side of a production containing alternates:
; alpha / beta / ...
;
; Returns: alpha
;
; EXAMPLE
;
; (first-alternate! '(A B / C D)) => (A B)
;
; ------------------------------------------------------------------------------
(defun first-alternate!( rhs )
(getHeadOfListUpTo '/ rhs)
)
; ------------------------------------------------------------------------------
; | all-but-first-alternate! |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Return all but the first alternates of of a production.
;
; CALLING SEQUENCE
;
; (all-but-first-alternate! rhs)
;
; rhs The right hand side of a production, with alternates
; (alpha / beta / ...)
;
; Returns: (beta / ...) or NIL if rhs has only one alternate:
; (alpha)
;
; EXAMPLE
;
; (all-but-first-alternate! '(B C / D E / F)) => (D E / F)
; (all-but-first-alternate! '(B C)) => NIL
; (all-but-first-alternate! '(B C / D E / F)) => (D E / F)
;
; ------------------------------------------------------------------------------
(defun all-but-first-alternate!( rhs )
; Get the first alternate, then strip it off.
(nthcdr (1+ (length (getHeadOfListUpTo '/ rhs))) rhs)
)
; ------------------------------------------------------------------------------
; | production-rhs! |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Return the right hand side of a production.
;
; CALLING SEQUENCE
;
; (production-rhs! production)
;
; production Production of the form [A -> alpha].
;
; Returns: alpha
;
; EXAMPLE
;
; (production-rhs! '(sandwich -> bread meat bread)) => (BREAD MEAT BREAD)
;
; ------------------------------------------------------------------------------
(defun production-rhs!( production )
(nthcdr 2 production)
)
; ------------------------------------------------------------------------------
; | string-before-comma! |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Return the symbols between the first symbol after the dot and the comma
; in an item.
;
; CALLING SEQUENCE
;
; (string-before-comma! item)
;
; item An item of the form [A -> alpha . B beta , gamma]
;
; Returns: beta or NIL if beta is empty.
;
; EXAMPLE
;
; (string-before-comma! '(A -> + DOT B * + + |,| a)) => (* + +)
; (string-before-comma! '(A -> + DOT B |,| a)) => NIL
; (string-before-comma! '(A -> + DOT |,| a)) => NIL
;
; ------------------------------------------------------------------------------
(defun string-before-comma!( item )
(let ((temp (reverse (getHeadOfListUpTo 'DOT (reverse item))))) ; Get everything past the
; dot.
(if (equal (first temp) '|,|) ; No symbol after the dot.
nil
(getHeadOfListUpTo '|,| (rest temp)))) ; Get everything past the
; symbol after the dot
; (which could be nil).
)
; ------------------------------------------------------------------------------
; | lookahead-of! |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Return the lookahead symbol of an item.
;
; CALLING SEQUENCE
;
; (lookahead-of! item)
;
; item [A -> alpha . beta , gamma]
;
; Returns: gamma
;
; EXAMPLE
;
; (lookahead-of! '(SP -> DOT |d| |,| |c|)) => |c|
;
; ------------------------------------------------------------------------------
(defun lookahead-of!( item )
(cond ( (null item) nil ) ; Nothing at all.
( (equal (first item) '|,|) (second item)) ; Lookahead (or nothing)
; follows the comma.
( T (lookahead-of! (rest item)))) ; Search
)
; ------------------------------------------------------------------------------
; | split-up-production |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Break a single production with alternates into separate productions.
;
; CALLING SEQUENCE
;
; (split-up-production production)
;
; production [A -> alpha / beta / ... ]
;
; Returns: The list [A -> alpha], [A -> beta], ...
;
; EXAMPLE
;
; (split-up-production '(A -> B C / D E / F))
; => ( (A -> B C) (A -> D E) (A -> F) )
;
; ------------------------------------------------------------------------------
(defun split-up-production( production )
; production = A -> B | C | ...
(let ( (head `(,(first production)
,(second production))) ; Get the left hand side: A ->
(tail (nthcdr 2 production)) ; Get the right hand side: B | C | ...
(new-productions nil)
(new-production nil) )
(loop (if (null tail) (return))
(setq new-production ; A -> B, A -> C, etc.
(append head
(first-alternate! tail)))
; Strip off next list up to bar.
(setq tail (all-but-first-alternate! tail))
(setq new-productions (append new-productions (list new-production))))
new-productions)
)
; ------------------------------------------------------------------------------
; | split-up-productions |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Break a list of productions with alternates into a list of separate
; productions.
;
; CALLING SEQUENCE
;
; (split-up-productions production-list)
;
; production-list [A -> alpha / beta / ... ] [B -> gamma / delta ...] ...
;
; Returns: The list [A -> alpha], [A -> beta], ... [B -> gamma]
; [B -> delta], ...
; EXAMPLE
;
; (split-up-productions '((S -> C C) (C -> |c| C / |d|)))
; => ((S -> C C) (C -> |c| C) (C -> |d|))
;
; ------------------------------------------------------------------------------
(defun split-up-productions( production-list )
(let ((new-production-list nil))
(dolist (production production-list) ; Split up each production.
(setq new-production-list ; Add it to the growing list.
(append new-production-list
(split-up-production production))))
new-production-list)
)
; ------------------------------------------------------------------------------
; | make-item |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Create an item with a leading dot from a production and a
; lookahead symbol.
;
; CALLING SEQUENCE
;
; (make-item production lookahead)
;
; production Any production [A -> alpha]
;
; lookahead Any lookahead symbol b.
;
; Returns: The item [A -> . alpha , b]
;
; EXAMPLE
;
; (make-item '(A -> B C) '|d|) => (A -> DOT B C |,| |d|)
;
; ------------------------------------------------------------------------------
(defun make-item( production lookahead )
`( ,(first production) ; Get the first symbol A.
,(second production) ; Get the arrow ->
DOT ; Add the leading dot.
,@(nthcdr 2 production) ; Add the right hand side of the production.
|,| ; Comma.
,lookahead ) ; Add the lookahead symbol last.
)
; ------------------------------------------------------------------------------
; | move-dot-right |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Move the dot in an item to the right if possible.
;
; CALLING SEQUENCE
;
; (move-dot-right item)
;
; item [A -> alpha . X beta, b]
;
; Returns: [A -> alpha X . beta, b]
; If the item is of the form [A -> alpha . , b], we return it
; unchanged.
;
; EXAMPLE
;
; (move-dot-right '( A -> B DOT C |,| D)) => ( A -> B C DOT |,| D)
; (move-dot-right '( A -> B C DOT |,| D)) => ( A -> B C |,| DOT D)
;
; ------------------------------------------------------------------------------
(defun move-dot-right( item )
(cond ( (null item) nil )
( (equal (first item) 'DOT) ; The item begins with a dot.
(if (null (second item)) ; item = DOT
item ; Leave it alone.
; Move the dot right over the next symbol. We change [ . b c d ] to [b . c d].
`( ,(second item) ; b
DOT ; Add a dot.
,@(nthcdr 2 item)))) ; The remainder, (c d).
( t (cons (first item) ; Item doesn't begin with a dot.
(move-dot-right (rest item)))))
)
; ------------------------------------------------------------------------------
; | create-augmenting-item |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Create the accept item of the augmented grammar.
;
; CALLING SEQUENCE
;
; (create-augmenting-item)
;
; *productions* List of productions for this grammar.
;
; Returns: The item (SP -> DOT S |,| $) where S is the start
; symbol: the left hand side nonterminal of the first
; production S -> alpha.
; EXAMPLE
;
; *productions* => ((E -> E T) (E -> id) (T -> id))
;
; (create-augmenting-item) => (SP -> DOT E |,| $)
;
; ------------------------------------------------------------------------------
(defun create-augmenting-item()
; Assume the first symbol of the left hand side of the first production is
; the start symbol.
`(SP -> DOT ,(first (first *productions*)) |,| $)
)
; ------------------------------------------------------------------------------
; | find-grammar-symbols |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Create all the grammar symbols (terminal and nonterminal) by looking
; at the list of productions.
;
; CALLING SEQUENCE
;
; (find-grammar-symbols)
;
; Returns: List of grammar symbols. Note we don't include the endmarker,
; $ or the null string, EPSILON.
;
; EXAMPLE
;
; For productions S -> C C, S -> |c| C | d,
;
; (find-grammar-symbols) => (S C |c| |d|)
;
; ------------------------------------------------------------------------------
(defun find-grammar-symbols()
(let ((symbols nil))
; Scan through all productions, collecting all terminals and nonterminals.
(dolist (production *productions*)
(setq symbols (append symbols (removeItemFromList '-> production))))
; Remove duplicated elements which occur later in the sequence. Remove any
; EPSILON's introduced by epsilon productions, A -> EPSILON.
(removeItemFromList 'EPSILON (remove-duplicates symbols :from-end T)))
)
; ------------------------------------------------------------------------------
; | item-to-production |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Change an item to a production by removing the dot and lookahead part.
;
; CALLING SEQUENCE
;
; (item-to-production item)
;
; item [A -> alpha . beta , gamma]
;
; Returns: [A -> alpha beta]
;
; EXAMPLE
;
; (item-to-production '(rat -> on DOT rye |,| tail)) => (RAT -> ON RYE)
;
; ------------------------------------------------------------------------------
(defun item-to-production( item )
(removeItemFromList 'DOT (getHeadOfListUpTo '|,| item )))
; ------------------------------------------------------------------------------
; | production-number |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Return the number of a production.
;
; CALLING SEQUENCE
;
; (production-number production)
;
; production Any production.
; *production* List of productions for the grammar.
;
; Returns: The number of the production in the list *productions*.
; The first production is numbered 1. Recall alternates
; of productions were split off into separate productions
; in the function load-input-and-initialize.
; EXAMPLE
;
; (production-number '(S -> C C)) => 1
;
; ------------------------------------------------------------------------------
(defun production-number( production )
(1+ (positionInList production *productions*))
)
; ==============================================================================
; | First Derived Symbol Utilities |
; ==============================================================================
; | |
; | NOTE: In this section we will be using the sample grammar, |
; | |
; | S -> A B |
; | A -> C a | EPSILON |
; | B -> b |
; | C -> c | EPSILON |
; | |
; | with terminal symbols a b c |
; | |
; ==============================================================================
; ------------------------------------------------------------------------------
; | tag-symbol |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Convert a grammar symbol A to tagged form (A NIL).
;
; CALLING SEQUENCE
;
; (tag-symbol symbol)
;
; symbol A grammar symbol.
;
; Returns: The list, (symbol NIL).
;
; EXAMPLE
;
; (tag-symbol 'a) => (A NIL)
;
; ------------------------------------------------------------------------------
(defun tag-symbol( s )
`(,s NIL)
)
; ------------------------------------------------------------------------------
; | flag-epsilon-free! |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Flag a tagged grammar symbol as coming from an epsilon-free derivation.
;
; CALLING SEQUENCE
;
; (flag-epsilon-free! tagged-symbol)
;
; tagged-symbol Tagged grammar symbol of the form (symbol tag).
;
; Returns: (symbol NIL)
;
; EXAMPLE
;
; (flag-epsilon-free! '(a nil)) => (A EPSILON-FREE)
;
; ------------------------------------------------------------------------------
(defun flag-epsilon-free!( s )
`(,(first s) epsilon-free)
)
; ------------------------------------------------------------------------------
; | epsilon-free-only |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Return only tagged grammar symbols with epsilon-free derivations.
;
; CALLING SEQUENCE
;
; (epsilon-free-only list)
;
; list List of tagged symbols, ( (s1 tag1) ... )
;
; Returns: Only those symbols in the list for which
; tag = 'epsilon-free
; EXAMPLE
;
; (epsilon-free-only '((a NIL) (b EPSILON-FREE) (c NIL)))
; => ((B EPSILON-FREE))
;
; ------------------------------------------------------------------------------
(defun epsilon-free-only( l )
(cond ( (null l) nil )
( (equal (second (first l)) ; Keep this symbol: It is epsilon-free.
'epsilon-free)
(cons (first l) (epsilon-free-only (rest l))))
( t (epsilon-free-only (rest l)))) ; Discard this symbol.
)
; ------------------------------------------------------------------------------
; | untag-list |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Remove the tags from a list of tagged grammar symbols.
;
; CALLING SEQUENCE
;
; (untag-list list)
;
; list List of tagged grammar symbols,
; ((a NIL) (b EPSILON-FREE) ... ).
;
; Returns: Untagged symbols, (a b ...).
;
; EXAMPLE
;
; (untag-list '( (a NIL) (b EPSILON-FREE) (c nil))) => (A B C)
;
; ------------------------------------------------------------------------------
(defun untag-list( list )
(mapcar #'car list)
)
; ------------------------------------------------------------------------------
; | flag-non-epsilon-free |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Flag a list of tagged symbols as being all not epsilon-free derived.
;
; CALLING SEQUENCE
;
; (flag-epsilon-free tagged-list)
;
; tagged-list Tagged list of grammar symbols, ( (s1 tag1) ... )
;
; Returns: List with all tags set to NIL, ( (s1 NIL) ... )
;
; EXAMPLE
;
; (flag-non-epsilon-free '((a epsilon-free) (b nil) (c epsilon-free))) =>
; ((A NIL) (B NIL) (C NIL))
;
; ------------------------------------------------------------------------------
(defun flag-non-epsilon-free( s )
; Apply an anonymous function to all elements of the list.
(mapcar #'(lambda (x) (cons (first x) '(NIL))) s)
)
; ------------------------------------------------------------------------------
; | precedence |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Compare two tagged symbols and return the one of higher precedence.
;
; CALLING SEQUENCE
;
; (precedence s1 s2)
;
; s1, s2 Tagged grammar symbols of the form (s1 tag1), (s2 tag2) with
; tag1, tag2 = 'EPSILON-FREE or NIL.
;
; Returns (s1 tag1) if tag1 = EPSILON-FREE, (s2 tag2) otherwise.
;
; EXAMPLE
;
; (precedence '(a NIL) '(b EPSILON-FREE)) => (B EPSILON-FREE)
; (precedence '(a EPSILON-FREE) '(b EPSILON-FREE)) => (A EPSILON-FREE)
;
; ------------------------------------------------------------------------------
(defun precedence( s1 s2 )
(cond ((equal (second s1) 'epsilon-free) s1) ; Return the epsilon-free one.
( t s2))
)
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Return the derived leading terminal of a production.
; e.g. the derived leading terminal a of the production X -> a Y
;
; CALLING SEQUENCE
;
; (derived-leading-terminal production)
;
; production [A -> a beta]
;
; Returns: a
;
; EXAMPLE
;
; (derived-leading-terminal '(A -> + B A)) =>
;
; ------------------------------------------------------------------------------
(defun derived-leading-terminal( production )
(third production)
)
; ------------------------------------------------------------------------------
; | first-terminals-of-rhs |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Find the ith approximation of the first derived terminals of the right
; hand side of a production.
;
; CALLING SEQUENCE
;
; (first-terminals-of-rhs rhs hash-table)
;
; rhs The right hand side Y1 ... Yn of the production.
;
; hash-table The current approximation to the FIRST() function for
; the non-terminals. FIRST of the terminals is already exact.
;
; Returns: The first derived terminals of the string Y1 ... Yn. We
; tag the epsilon-free first derived symbols.
;
; METHOD
;
; F ( X ) = F( Y1 ) + ... + F( Yn )
; i 1 1
;
; EXAMPLE
;
; Assume we have just called initial-first-derived-terminals, so that
; hash-table contains the zeroth approximation to FIRST.
;
; (first-terminals-of-rhs '(A B) hash-table) => ( (b NIL) )
; because S => A B => EPSILON b => b, which is not in EFF().
;
; ------------------------------------------------------------------------------
(defun first-terminals-of-rhs( rhs hash-table )
; Compute FIRST( Y1 ) and FIRST( Y1 ) - EPSILON.
(let* ((first-terms (gethash (first rhs) hash-table))
(first-terms-minus-epsilon (removeItemFromList '(EPSILON NIL) first-terms
:equalityTest 'same-symbol?)))
(cond ( (null rhs) NIL)
; If we have the case A -> alpha beta with FIRST( alpha ) = {}, we want to
; return {}.
( (null first-terms) nil )
; If epsilon is in FIRST( Y1 ) , add all non-epsilon symbols in FIRST( Y1 )
; to the first derived terminals in the rest of the list. Flag all these new
; symbols as epsilon-derived. If there are duplicated symbols, keep only
; the epsilon-free ones.
( (itemInList '(EPSILON NIL) first-terms :test 'same-symbol?)
(combine first-terms-minus-epsilon
(flag-non-epsilon-free
(first-terminals-of-rhs (rest rhs) hash-table))
:test 'same-symbol? :precedence 'precedence))
; Otherwise, Y1 has only non-epsilon terminals. Return FIRST( Y1 ). Whether
; these symbols are epsilon-free depends on their previous flags.
( t first-terms )))
)
; ------------------------------------------------------------------------------
; | update-first-derived-function |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Update the first derived terminals function to create a new version.
;
; CALLING SEQUENCE
;
; (update-first-derived-function hash-table update-hash-table)
;
; hash-table Old hash table.
;
; update-hash-table Changes to the old table. If an entry is NIL,
; it indicates no change is to be made to hash-table.
;
; Returns: Updated hash-table of the first derived terminals.
;
; EXAMPLE
;
; (update-first-derived-function) => Updated hash table.
;
; ------------------------------------------------------------------------------
(defun update-first-derived-function( hash-table update-hash-table )
; Update only changes to nonterminals because the FIRST of a terminal symbol
; does not change.
(dolist (symbol (find-grammar-symbols))
(if (nonterminal? symbol)
(if (not (null (gethash symbol update-hash-table)))
(setf (gethash symbol hash-table)
(gethash symbol update-hash-table)))))
)
; ------------------------------------------------------------------------------
; | initial-first-derived-terminals |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; The zeroth approximation to the first derived terminals function.
; It is exact for all terminals and epsilon.
;
; CALLING SEQUENCE
;
; (initial-first-derived-terminals hash-table :type type)
;
; *terminals* List of all the terminal symbols including the endmarker $
;
; hash-table Empty, but allocated hash table.
;
; Returns: Hash table of the zeroth approximation to the first
; derived terminals function for every grammar symbol.
; Epsilon-free first derived symbols are tagged with the
; flag 'epsilon-free.
;
; EXAMPLE
;
; (setq F0 (make-hash-table :size 100))
; (setq F0 (initial-first-derived-terminals F0))
; (gethash 'A F0) => ( (EPSILON NIL) )
;
; Grammar FIRST Grammar FIRST
; Symbol Symbol
; ------------------- -------------------
; S NIL |a| ((|a| EPSILON-FREE))
; A ((EPSILON NIL)) |b| ((|b| EPSILON-FREE))
; B ((|b| EPSILON-FREE)) |c| ((|c| EPSILON-FREE))
; C ((|c| EPSILON-FREE) EPSILON ((EPSILON NIL))
; (EPSILON NIL)) $ (($ EPSILON-FREE))
;
; |c| is flagged as being in EFF( C ). EPSILON is in FIRST( C ) but
; not in EFF( C ).
;
; ------------------------------------------------------------------------------
(defun initial-first-derived-terminals( hash-table )
(let ( (first-symbols nil)
(nonterm nil)
(new-symbol nil) )
; FIRST( X ) = { (X 'epsilon-free) } if X is a terminal.
(dolist (terminal *terminals*)
(setf (gethash terminal hash-table)
(list (flag-epsilon-free! (tag-symbol terminal)))))
; FIRST( EPSILON ) = { (EPSILON NIL) }.
(setf (gethash 'EPSILON hash-table) (list (tag-symbol 'EPSILON)))
; Every nonterminal appears as the left hand side of some production.
; Thus we can scan through the productions to define FIRST( A ) for every
; nonterminal A.
; Compute the zeroth approximation to FIRST(). Look for a production of
; the form A -> a alpha, where a is a nonterminal. Find the entry for
; A in the table, and add a to it.
; We tag productions of the form A -> EPSILON as not being epsilon-free
; derivations.
(dolist (production *productions*)
(cond ( (or (derives-leading-terminal? production)
(epsilon-production? production))
(setq nonterm (first production))
(setq first-symbols (gethash nonterm hash-table))
; Get a or EPSILON.
(setq new-symbol
(tag-symbol (derived-leading-terminal production)))
; Flag a as an epsilon-free derivation, but EPSILON as not.
(if (not (epsilon-production? production))
(setq new-symbol (flag-epsilon-free! new-symbol)))
; Add a to FIRST( A ).
(setq first-symbols
(insertItemIntoList new-symbol first-symbols))
(setf (gethash nonterm hash-table) first-symbols))))
hash-table)
)
; ------------------------------------------------------------------------------
; | create-all-first-derived-terminals |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Create a hash table of all first derived terminals for every grammar
; symbol.
;
; CALLING SEQUENCE
;
; (create-all-first-derived-terminals)
;
; Returns: A hash table of the first derived terminals for every grammar
; symbol, including EPSILON and $. Flag the epsilon-free
; derived terminals.
; METHOD
;
; Successive approximation by transitive closure.
;
; EXAMPLE
;
; (setq h (create-all-first-derived-terminals)) => #
;
; (gethash 'S h) => ((|a| NIL) (|c| EPSILON-FREE) (|b| NIL))
; i.e. FIRST( S ) = { |a| |b| |c| }, but EFF( S ) = { |c| }
;
;
; Grammar FIRST Grammar FIRST
; Symbol Symbol
; -------------------------------- --------------------------------
; S ((|a| NIL) |a| ((|a| EPSILON-FREE))
; (|b| NIL) |b| ((|a| EPSILON-FREE))
; (|c| EPSILON-FREE)) |c| ((|c| EPSILON-FREE))
; A ((EPSILON NIL) EPSILON ((EPSILON NIL))
; (|a| NIL) $ (($ EPSILON-FREE))
; (|c| EPSILON-FREE))
; B ((|b| EPSILON-FREE))
; C ((|c| EPSILON-FREE)
; (EPSILON NIL))
;
; ------------------------------------------------------------------------------
(defun create-all-first-derived-terminals()
; Initialize the hash table. The size is extensible at run time.
(let ( (hash-table (make-hash-table :size +initial-hash-table-size+))
(update-hash-table (make-hash-table :size +initial-hash-table-size+))
(nonterm nil)
(new-first-symbols nil)
(old-first-symbols nil)
(change-flag T) )
; Create the zeroth approximation to FIRST(), accurate for all terminals.
(initial-first-derived-terminals hash-table)
; Loop until no more changes occur in the approximation to FIRST.
(loop
(setq change-flag nil)
; Compute FIRST[i+1](A) for all the nonterminals A.
(dolist (production *productions*) ; Scan all productions A -> alpha
(setq nonterm (first production)) ; A
; FIRST[i+1]( A ) =
; first terminal of( FIRST[i]( Y1 ) ... FIRST[Yn]) U FIRST[i]( A ).
(setq old-first-symbols (gethash nonterm hash-table))
(setq new-first-symbols
(combine old-first-symbols ; FIRST[i](A)
(first-terminals-of-rhs (production-rhs! production)
hash-table)
:test 'same-symbol? :precedence 'precedence))
; Record if any changes occurred, and save FIRST[i+1]( A ) in a separate
; update hash table.
(cond ((not (equal-sets-of-items? new-first-symbols
old-first-symbols))
(setq change-flag T)
(setf (gethash nonterm update-hash-table)
new-first-symbols))))
; Add updates to the old hash table for FIRST[i]() to create FIRST[i+1](),
; then clear out the update hash table.
(update-first-derived-function hash-table update-hash-table)
(clrhash update-hash-table)
(if (null change-flag) (return))) ; No more changes --- exit.
; Return the hash table of first derived terminals for every grammar symbol.
hash-table)
)
; ------------------------------------------------------------------------------
; | first-terminals-of-symbol |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Return a list of the first derived terminals of a grammar symbol.
;
; CALLING SEQUENCE
;
; (first-terminals-of-symbol s)
;
; s Any grammar symbol (or EPSILON or $).
;
; *first-derived-terminals* Hash table of the first derived terminals
; for every grammar symbol.
;
; *epsilon-free-first-derived-terminals*
;
; Hash table of the epsilon-free first
; derived terminals for every grammar symbol.
;
; type Defaults to NIL for computing FIRST() and
; equals 'epsilon-free for computing EFF().
;
; Returns: List of first derived terminals of s.
; If type = 'epsilon-free, return the
; epsilon-free first derived terminals
; instead.
;
; Creates *first-derived-terminals* and
; *epsilon-free-first-derived-terminals*
; if they do not already exist.
; EXAMPLE
;
; (first-terminals-of-symbol 'S)
; => (|a| |c| |b|)
; (first-terminals-of-symbol 'S :type 'epsilon-free)
; => (|c|)
;
; ------------------------------------------------------------------------------
(defun first-terminals-of-symbol( symbol &key (type NIL) )
; Create the hash tables for FIRST() and EFF() if they do not exist.
(cond ( (or (null *first-derived-terminals*)
(null *epsilon-free-first-derived-terminals*))
; Sort out first derived terminals from epsilon-free first derived terminals.
(setq *first-derived-terminals*
(make-hash-table :size +initial-hash-table-size+))
(setq *epsilon-free-first-derived-terminals*
(make-hash-table :size +initial-hash-table-size+))
(let ( (old-hash-table (create-all-first-derived-terminals)) )
(dolist (symbol (cons '$ (find-grammar-symbols)))
(setf (gethash symbol *first-derived-terminals*)
(untag-list (gethash symbol old-hash-table)))
(setf (gethash symbol *epsilon-free-first-derived-terminals*)
(untag-list
(epsilon-free-only
(gethash symbol old-hash-table))))))))
; Return FIRST() or EFF() depending on the customer's request.
(if (equal type 'epsilon-free)
(gethash symbol *epsilon-free-first-derived-terminals*)
(gethash symbol *first-derived-terminals*))
)
; ------------------------------------------------------------------------------
; | first-derived-terminals |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Return a list of the first derived terminals of a grammar string.
;
; CALLING SEQUENCE
;
; (first-derived-terminals string :type type)
;
; string A list of grammar symbols, X1 ... Xn
;
; type NIL by default.
;
; Returns: First derived terminals of the list, FIRST( X1 ... Xn )
; if type = NIL, but EFF( X1 ... Xn ) if type = 'epsilon-free.
;
; METHOD
;
; FIRST( X1 ... Xn ) = FIRST( X1 ) + ... + FIRST( Xn )
; 1 1
;
; EFF( X1 ... Xn ) = EFF( X1 ) + FIRST( X2 ... Xn )
; 1
;
; EXAMPLE
;
; (first-derived-terminals '(S A)) => (|a| |c| |b|)
; (first-derived-terminals '(S A) :type 'epsilon-free) => (|c|)
;
; because FIRST( S ) = (|a| |b| |c|) and FIRST( A ) = (|a| |c| EPSILON)
; and |c| is the only terminal with a non-epsilon derivation,
; S => A B => C a b => c a b. |b|, for example has only the derivation
; S => A B => A b => EPSILON b = b, in which we must replace a leading
; non-terminal A with EPSILON.
;
; (first-derived-terminals '(A B)) => (|c| |a| |b|)
; because FIRST( B ) = { |b| }
;
; (first-derived-terminals '(A B)) => (|c|)
;
; ------------------------------------------------------------------------------
(defun first-derived-terminals( string &key (type NIL) )
; We want FIRST( EPSILON ) = (EPSILON) and EFF( EPSILON ) = NIL.
(cond ( (null string) (if (equal type 'epsilon-free)
NIL
(list 'EPSILON)))
; If EPSILON is in FIRST( Y1 ), add all non-epsilon terminals of FIRST( Y1 ).
; If we are computing EFF(), we do EFF( Y1 ) instead.
( (itemInList 'EPSILON
(first-terminals-of-symbol (first string) :type type))
(union (removeItemFromList 'EPSILON
(first-terminals-of-symbol (first string)))
(first-derived-terminals (rest string))
:test #'equal
)
)
; Otherwise, return the non-epsilon symbols of FIRST( Y1 ) or of EFF( Y1 ).
(t (first-terminals-of-symbol (first string) :type type)))
)
; ==============================================================================
; | Item functions: closure, goto, cores, etc. |
; ==============================================================================
; ------------------------------------------------------------------------------
; | closure |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Return the closure of a list of items.
;
; CALLING SEQUENCE
;
; (closure set-of-items)
;
; *productions* Global list of productions.
; set-of-items
;
; Returns: For each item [A -> alpha . B beta , a] in the set, add
; [B -> . gamma , b] where (B -> gamma) is a production
; and b is the first derived symbol of the string "beta a".
; METHOD
;
; Intuitively, we saw alpha already and expect to see B next.
; That is, we expect to see any string of terminals gamma derived from B.
; The next symbol we expect is the lookahead, which is the first
; derived terminal symbol of the string beta a.
;
; EXAMPLE
;
; (closure '( (SP -> DOT S |,| $) ) ) =>
;
; ( (SP -> DOT S |,| $ )
; (S -> DOT C C |,| $ )
; (C -> DOT |c| C |,| |c|)
; (C -> DOT |c| C |,| |d|)
; (C -> DOT |d| |,| |c|)
; (C -> DOT |d| |,| |d|) )
;
;
; ------------------------------------------------------------------------------
(defun closure( item-list )
(let ((closed-item-list item-list) ; Closure of item-list.
(item-num -1) ; nth item in item-list.
(nonterm nil) ; B
(first-syms nil) ; FIRST[ beta a ]
(item nil) ; Current item in item-list.
(new-item nil)) ; [B -> . gamma, b]
(loop ; Loop over each item.
(setq item-num (1+ item-num)) ; Advance to next item.
(setq item (nth item-num closed-item-list)) ; Get current item,
; [A -> alpha . B beta , a]
(if (null item) (return)) ; End of the list.
(setq nonterm (symbol-after-dot! item)) ; Get B.
(if (nonterminal? nonterm) ; B is nonterminal.
(dolist (production *productions*)
(cond ((valid-production? nonterm ; production = [B -> gamma]
production)
(setq first-syms ; FIRST[ beta a ]
(first-derived-terminals
`(,@(string-before-comma! item) ; Get beta.
,(lookahead-of! item)))) ; Get a.
(dolist (lookahead first-syms) ; for each b in
; FIRST[ beta a ]
(setq new-item ; [ B -> . gamma , b ]
(make-item production
lookahead))
(setq closed-item-list ; Add to end of list.
(insertItemIntoList new-item
closed-item-list))))))))
closed-item-list)
)
; ------------------------------------------------------------------------------
; | compute-goto |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Compute the goto on a set-of-items and a grammar symbol.
;
; CALLING SEQUENCE
;
; (compute-goto set-of-items grammar-symbol)
;
; set-of-items Set of items I.
;
; grammar-symbol Grammar symbol X.
;
; Returns: Goto function GOTO( I, X ) defined as follows:
;
; For all items [A -> alpha . X beta , a] in I,
; add together the closures of [A -> alpha X . beta, a].
; EXAMPLE
;
; (compute-goto '( (SP -> DOT s |,| $)
; ( S -> DOT c c |,| $)
; ( c -> |c| c |,| c)
; ( c -> |c| c |,| d)
; ( c -> d |,| c)
; ( c -> d |,| d))
;
; 'S) => ( (SP -> S DOT |,| $))
;
; ------------------------------------------------------------------------------
(defun compute-goto( set-of-items grammar-symbol )
(let ( (new-set nil) )
(dolist (item set-of-items)
; Examine each of the form [A -> alpha . X beta, a]
(if (equal (symbol-after-dot! item)
grammar-symbol)
; Add [A -> alpha X . beta , a ] if not already there.
(setq new-set
(insertItemIntoList (move-dot-right item)
new-set))
)
)
; Closure of the new list.
(closure new-set)
)
)
; ==============================================================================
; | Goto Graph Functions |
; ==============================================================================
; ------------------------------------------------------------------------------
; | create-new-node |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Create a new node in the goto graph given the data.
;
; CALLING SEQUENCE
;
; (create-new-node state hash-value set-of-items)
;
; hash-value Hash value of core of items.
;
; Returns:
;
; EXAMPLE
;
; (create-new-node 1 3668 '((SP -> S DOT |,| $)) 3648)
; => (1 3668 ((SP -> S DOT |,| $)))
;
; ------------------------------------------------------------------------------
(defun create-new-node( current-state
hash-value-of-core-of-items
list-of-items )
`(,current-state ,hash-value-of-core-of-items ,list-of-items)
)
(defun create-new-link( current-state
grammar-symbol
next-state)
`(,current-state ,grammar-symbol ,next-state)
)
; ------------------------------------------------------------------------------
; | select-items! |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Select out the set of items in a node of the goto graph.
;
; CALLING SEQUENCE
;
; (select-items! node)
;
; node A node in the goto graph, (i2 i1 X SET-OF-ITEMS)
;
; Returns: SET-OF-ITEMS
;
; EXAMPLE
;
; (select-items!
; '( 0
; 3668
; (
; (SP -> DOT S |,| $)
; ( S -> DOT S |a| S |b| |,| $)
; )
; )
; )
;
; =>
;
; ------------------------------------------------------------------------------
(defun select-items!( node )
(third node)
)
(defun hash-value!( node )
(second node)
)
(defun links!( goto-graph )
(first goto-graph)
)
(defun nodes!( goto-graph )
(second goto-graph)
)
(defun nth-node!( node-num goto-graph )
(nth node-num (nodes! goto-graph))
)
(defun first-node( goto-graph )
(first (nodes! goto-graph))
)
(defun rest-node( goto-graph )
(rest (nodes! goto-graph))
)
(defun insert-node( node goto-graph )
`( ,(links! goto-graph)
,(insertItemIntoList node
(nodes! goto-graph))
)
)
(defun insert-link( link goto-graph )
`( ,(insertItemIntoList link (links! goto-graph))
,(nodes! goto-graph)
)
)
; ------------------------------------------------------------------------------
; | current-state! |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Get the number of a node in the goto graph.
;
; CALLING SEQUENCE
;
; (current-state! node)
;
; node A node in the goto graph, (i1 i2 X SET-OF-ITEMS))
;
; Returns: i1
;
; EXAMPLE
;
; (current-state! '(1 0 S ((SP -> S DOT |,| $)))) => 1
;
; -----------------------------------------------------------------------------
(defun current-state!( node )
(first node)
)
; ------------------------------------------------------------------------------
; | transition-symbol! |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Return the symbol upon which an action is performed.
;
; CALLING SEQUENCE
;
; (transition-symbol! node)
;
; node A node in the goto graph, (i1 i2 X SET-OF-ITEMS))
;
; Returns: X
;
; EXAMPLE
;
; (transition-symbol! '(1 0 S ((SP -> S DOT |,| $)))) => S
;
; ------------------------------------------------------------------------------
(defun transition-symbol!( node )
(third node)
)
; ------------------------------------------------------------------------------
; | set-of-items-in-graph? |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Find out if a goto graph contains a given a set of items (or
; their cores).
;
; CALLING SEQUENCE
;
; (set-of-items-in-graph? set-of-items goto-graph :compare-type type)
;
; set-of-items Any set of items.
;
; goto-graph The goto graph of the grammar.
;
; type Optional keyword. If omitted, it defaults to 'item.
;
; Returns T if any node in goto-graph has the same set of items
; (for type = 'item) or the same core (for type = 'core)
; as set-of-items.
;
; EXAMPLE
;
; (set-of-items-in-graph?
; '( (a -> b DOT c |,| d) (e -> f DOT g |,| h) )
; '( ( (-1 nil 0) (0 a 1) )
; ( (0 12 ( (a -> b DOT c |,| ddd) (e -> f DOT g |,| h) ) )
; (1 23 ( (i -> j DOT |,| k) ) ) ))
; ) => nil
;
; but
;
; (set-of-items-in-graph?
; '( (a -> b DOT c |,| d) (e -> f DOT g |,| h) )
; '( ( (-1 nil 0) (0 a 1) )
; ( (0 12 ( (a -> b DOT c |,| d) (e -> f DOT g |,| h) ) )
; (1 23 ( (i -> j DOT |,| k) ) ) ))
; :compare-type 'core) => T
;
;
; ------------------------------------------------------------------------------
(defun set-of-items-in-graph?( set-of-items goto-graph
&key (compare-type 'item) )
(cond ( (null goto-graph) nil) ; goto graph = ()
( (null (first-node goto-graph)) nil) ; goto graph = ( (...) () )
( t
; Scan all nodes in the goto-graph, looking for one which has
; a matching item.
(dolist (node (nodes! goto-graph))
(if (equal-sets-of-items? set-of-items (select-items! node)
:compare-type compare-type)
(return t)
)
)
; return nil by default
)
)
)
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Find out the current node (state) number of a node in the goto
; graph which contains the given set of items or their cores.
;
; CALLING SEQUENCE
;
; (node-number set-of-items goto-graph :compare-type compare-type)
;
; set-of-items Set of items to search for.
;
; goto-graph Goto graph of the grammar.
;
; compare-type If 'item, find identical set of items, but if 'core,
; find identical cores. Defaults to 'item.
;
; Returns: Number of the node in the goto graph containing the given
; set of items or core.
;
; EXAMPLE
;
; (node-number '(( S -> C C DOT |,| $)) *goto-graph*) => 5
; (node-number '(( S -> C C DOT |,| |e|)) *goto-graph*) => NIL
; but
; (node-number '(( S -> C C DOT |,| |e|)) *goto-graph*
; :compare-type 'core) => 5
;
; ------------------------------------------------------------------------------
(defun node-number( set-of-items goto-graph &key (compare-type 'item) )
(cond ( (null goto-graph) -1) ; goto graph = ()
( (null (first-node goto-graph)) -1) ; goto graph = ( (...) () )
( t
; Scan all nodes in the goto-graph, looking for one which has
; a matching item.
(dolist (node (nodes! goto-graph))
(if (equal-sets-of-items? set-of-items (select-items! node)
:compare-type compare-type)
(return (current-state! node))
)
)
; return nil by default
)
)
)
; Hash value of the core of an item.
; (core-hash-value-of-item '(S -> S DOT |a| S |b| |,| $)) => 1790
;
; Sum up the integer value of all characters in each of the symbols.
; Multiply by the position of DOT in the item to distinguish items
; with the same symbols.
;
(defun core-hash-value-of-item( item )
(let ( (hash-value 0)
(symbol-position -1)
(string-of-symbol "")
(length-of-string 0) )
; Hash the core of the item only.
(dolist (s (core-of-item! item))
; Symbol index in the item, starting with 0.
(setq symbol-position (+ 1 symbol-position))
; Convert symbol to string and get its length.
(setq string-of-symbol (symbol-name s))
(setq length-of-string (length string-of-symbol))
; Sum the integer values of each character in the symbol.
(dotimes (i length-of-string)
(setq hash-value
(+ hash-value
(char-int (char string-of-symbol i)))
)
)
; Multiply by the index position of the DOT symbol
; to distinguish between same items with dots in different
; locations such as
; [S -> a . b , c] and [S -> a b . , c]
(if (equal s 'DOT)
(setq hash-value (* hash-value symbol-position))
)
)
hash-value)
)
; Only the core matters:
; (core-hash-value-of-set-of-items
; '( (SP -> S DOT |,| $)
; ( S -> S DOT |a| S |b| |,| $)
; ( S -> S DOT |a| S |b| |,| |a|))) => 3542
;
; (core-hash-value-of-set-of-items
; '( (SP -> S DOT |,| $)
; ( S -> S DOT |a| S |b| |,| $))) => 3542
;
; (core-hash-value-of-set-of-items
; '( (SP -> S DOT |,| $))) => 1752
;
(defun core-hash-value-of-set-of-items( set-of-items )
(let ( (hashes-of-items (mapcar #'core-hash-value-of-item set-of-items))
(sum 0)
)
; Hash value on the entire set of items.
; Don't count duplicate items.
(dolist (i (remove-duplicates hashes-of-items))
(setq sum (+ sum i))
)
; Modulo to keep within size of an integer.
(mod sum +hash-value-upper-limit+)
)
)
; ==============================================================================
; | LALR(1) Core Merging Functions
; ==============================================================================
;
; Partition
;
; merge-equivalence-classes( '(2 4) '() )
; => ( (2 4) )
;
; merge-equivalence-classes( '(2 4) '( (4 5) (6 7) ) )
; => ( (2 4 5) (6 7) )
;
; merge-equivalence-classes( '(2 4) '( (4 5) (2 7) (3 6) ) )
; => ( (3 6) (2 4 5 7) )
;
(defun merge-equivalence-classes( equivalence partition )
(cond
; Dispose of trivial inputs.
( (null equivalence) partition)
( (= (length equivalence) 1) partition)
; Partition is empty.
( (null partition) (list equivalence) )
; First set in the partition has common elements
; with the equivalence.
( (intersection equivalence (first partition) )
; Sort the elements in the equivalence classes.
(mapcar #'(lambda (x) (sort x #'(lambda (x y) (< x y))))
; Remerge the other sets in the partition.
(merge-equivalence-classes
; Merge the equivalence into the first set in the
; partition.
(union equivalence (first partition) :test #'equal)
(rest partition)
)
)
)
(t
; Sort the elements in the equivalence classes.
(mapcar #'(lambda (x) (sort x #'(lambda (x y) (< x y))))
; Merge the other sets in the partition.
(cons (first partition)
(merge-equivalence-classes equivalence (rest partition))
)
)
)
)
)
; If member of equiv. class in the partition, return the smallest
; equivalent element.
(defun remap-equivalent( num partition )
(cond ( (null partition) num)
( (member num (first partition))
(caar partition)
)
(t
(remap-equivalent num (rest partition))
)
)
)
; ------------------------------------------------------------------------------
; | merge-lookaheads |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Collapse together two sets of items, eliminating duplicated items.
;
; CALLING SEQUENCE
;
; (merge-lookaheads set-of-items1 set-of-items2)
;
; set-of-items
;
; node A node in the goto graph with the same cores as in
; set-of-items.
;
; Returns: Updated node with the same core as before, but
; added lookaheads.
;
; METHOD
;
; Remove duplicates and use a set union to merge the lookaheads.
;
; EXAMPLE
;
; (merge-lookaheads '( (D -> E DOT F |,| |b|)
; (A -> B DOT C |,| |a|) )
;
; '( (A -> B DOT C |,| |a|)
; (A -> B DOT C |,| |a|)
; (D -> E DOT F |,| |c|) )) =>
;
; ( (D -> E DOT F |,| |b|)
; (A -> B DOT C |,| |a|)
; (D -> E DOT F |,| |c|) )
;
; ------------------------------------------------------------------------------
(defun merge-lookaheads( set-of-items1 set-of-items2 )
; Use the equal function to test for duplicates, since we are handling
; elements which are lists, not atoms.
(union (remove-duplicates set-of-items1)
(remove-duplicates set-of-items2)
:test #'equal)
)
; ------------------------------------------------------------------------------
; | merge-cores |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; If the given set of items has the same core as a node in the goto graph,
; merge it into the node.
;
; CALLING SEQUENCE
;
; merge-cores( goto-graph )
;
; goto-graph Goto graph of the grammar.
;
; Returns: All sets of items with the same cores are merged,
; and states and links are renumbered.
;
; EXAMPLE
;
; ------------------------------------------------------------------------------
(defun merge-cores( goto-graph )
(let ( (nodes (nodes! goto-graph)) ; Get the nodes out
(links (links! goto-graph)) ; and the links.
(previous-node nil)
(previous-state -1)
(previous-hash-value -1)
(equiv-class nil)
(merged-goto-graph '(() ()) ) ; New merged goto graph.
)
; Sort the nodes on hash value to keep sets of items with the same
; cores adjacent.
(setq nodes
(sort nodes #'(lambda (x y) (< (hash-value! x) (hash-value! y)))))
; Scan through the nodes, looking for sets of items with the
; same cores.
(dolist (node nodes)
; Current node and previous node have same cores.
(cond ( (= (hash-value! node) previous-hash-value)
(setq equiv-class
(merge-equivalence-classes
(list (current-state! node) previous-state)
equiv-class
)
)
; Create a new merged node.
(setq node
(create-new-node
; Use the lowest numbered state
; for the new node number.
(if (< (current-state! node)
(current-state! previous-node))
(current-state! node)
(current-state! previous-node)
)
; Hash value.
(hash-value! node)
; Merge the cores in the items.
(merge-lookaheads (select-items! node)
(select-items! previous-node))
)
)
)
; Current node differs, send off previous node.
(t
(if (not (null previous-node))
(setq merged-goto-graph
(insert-node previous-node merged-goto-graph))
)
)
)
(setq previous-node node)
(setq previous-hash-value (hash-value! node))
(setq previous-state (current-state! node))
)
; Send off last node, merged or otherwise, in any case.
(setq merged-goto-graph
(insert-node previous-node merged-goto-graph))
; Renumber states in the links.
(dolist (link links)
(setq link
`( ,(remap-equivalent (first link) equiv-class)
,(second link)
,(remap-equivalent (third link) equiv-class)
)
)
(setq merged-goto-graph (insert-link link merged-goto-graph))
)
merged-goto-graph
)
)
; ==============================================================================
; | LR(1) Action and Goto table utilities |
; ==============================================================================
; ------------------------------------------------------------------------------
; | create-goto-graph |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Create a goto graph containing the sets of items for the grammar.
;
; CALLING SEQUENCE
;
; (create-goto-graph parser-type)
;
; parser-type The type of grammar: 'LR1 or 'LALR1
;
; Returns: Goto graph of the grammar.
;
; METHOD
;
; We create a DFA which recognizes the viable prefixes of the grammar.
;
; The DFA is called the goto graph. Each node in the graph is of the
; form (i2 i1 X SET-OF-ITEMS).
;
; i1 = V( gamma ) = (set of all items valid for viable prefix gamma).
; i2 = V( gamma X )
; X = a grammar symbol (but not EPSILON).
;
; An item [A -> alpha . beta] is valid for viable prefix gamma alpha if
; gamma A is also a viable prefix.
;
; The prefix gamma is viable if there is a rightmost derivation
; S =>* gamma w.
;
; The first state is I0 = V( EPSILON ) = { [S -> . S , $] }.
;
; We process nodes as follows:
;
; node-num ---> 0
; ...
; node-num ---> 3 ----+----+
; | a |
; 4 <---+ | b
; |
; 5 <---+----+
; ...
; 3 ----+----+
; | a |
; 4 <---+ | b
; |
; node-num ---> 5 <---+----+
;
; -----------------------------------------------------------------------------
(defun create-goto-graph( parser-type )
(let* ( (goto-of-item nil) ; Set of items, GOTO( I, X )
(node nil) ; Node in graph.
(node-num 0) ; Next node in goto graph to process.
(new-node nil) ; New node in Goto graph.
(new-link nil)
(new-state-num 1) ; State number of the next node.
(goto-graph '( () () ) ); Initial goto-graph.
)
; Our very first set of items I0 is the closure of [S' -> .S, $]
(setq goto-of-item (closure (list (create-augmenting-item))))
; The initial node has state 0, items as above, and hash value.
(setq node (create-new-node 0
(core-hash-value-of-set-of-items
goto-of-item)
goto-of-item))
(setq new-link (create-new-link -1 nil 0))
; Insert nodes and links into the goto graph.
(setq goto-graph (insert-node node goto-graph))
(setq goto-graph (insert-link new-link goto-graph))
(loop
; Latest unprocessed node in the goto graph. Starting with I0.
(setq node (nth-node! node-num goto-graph))
(if (null node) (return)) ; No more sets of items.
; For each grammar symbol X ...
(dolist (grammar-symbol (find-grammar-symbols))
; ...compute GOTO( I, X ), the new set of items.
(setq goto-of-item
(compute-goto (select-items! node)
grammar-symbol))
; Create a new node with set of items GOTO( I, X ),
(setq new-node
(create-new-node new-state-num
(core-hash-value-of-set-of-items
goto-of-item)
goto-of-item )
)
; GOTO( I, X ) is empty.
(if (not (null goto-of-item))
(cond (
; Our GOTO( I, X ) has computed the same sets of
; items.
(set-of-items-in-graph? goto-of-item
goto-graph)
; Insert a new link
; X
; I --->
(setq new-link (create-new-link
node-num
grammar-symbol
(node-number goto-of-item
goto-graph))
)
(setq goto-graph (insert-link new-link goto-graph))
)
; Add a new node with a new set of items and
; a new link.
; Increment the current state number.
(t
(setq goto-graph (insert-node new-node goto-graph))
(setq new-link (create-new-link (current-state! node)
grammar-symbol
new-state-num))
(setq goto-graph (insert-link new-link goto-graph))
(setq new-state-num (1+ new-state-num))
)
) ; end cond
) ; end if empty GOTO( I, X )
) ; end dolist
; Bump up the node number.
(setq node-num (1+ node-num ))
) ; end loop
; For LALR(1) languages, sort the goto graph on core hash value
; then merge states with the same cores.
(if (equal parser-type 'LALR1)
(setq goto-graph (merge-cores goto-graph))
)
goto-graph
) ; end let
)
; ------------------------------------------------------------------------------
; | goto |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; The LR GOTO function derived from the goto graph.
;
; CALLING SEQUENCE
;
; (goto i A goto-graph)
;
; goto-graph The goto graph with entries of the form
; (i2 i1 X )
;
; state The initial state i1.
;
; symbol The transition symbol X.
;
; Returns: The next state i2, or NIL if GOTO is undefined.
;
; EXAMPLE
;
; Suppose *goto-graph* = ( ( (6 |a| 4) ) (nodes) )
;
; (goto 6 '|a| goto-graph) => 4
;
; ------------------------------------------------------------------------------
(defun goto( state symbol goto-graph)
(dolist (link (links! goto-graph))
(if (and (= state (first link))
(equal symbol (second link))
)
(return (third link))
)
)
; Return nil by default.
)
; ------------------------------------------------------------------------------
; | action-list! |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Return the list of actions in a line of the action table.
;
; CALLING SEQUENCE
;
; (action-list action-table-line)
;
; action-table-line One line of action table of the form
; ( (stateNumber) (listOfActions) )
;
; Returns: (listOfActions)
;
; EXAMPLE
;
; (action-list! '( (0) ((|c| (S 3)) (|d| (S 4)))))
; => ((|c| (S 3)) (|d| (S 4)))
;
; ------------------------------------------------------------------------------
(defun action-list!( line-of-table )
(second line-of-table)
)
; ------------------------------------------------------------------------------
; | action-line-state! |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Return the state of a line of the action table.
;
; CALLING SEQUENCE
;
; (action-line-state! action-table)
;
; action-table Table of the form ( (stateNum) (listOfActions) )
;
; Returns: stateNum
;
; EXAMPLE
;
; (action-line-state! '( (0) ((|c| (S 3)) (|d| (S 4)) (DEFAULT (ERROR)))) )
; => 0
;
; ------------------------------------------------------------------------------
(defun action-line-state!( action-table-line )
(first (first action-table-line))
)
; ------------------------------------------------------------------------------
; | action-trigger-symbol! |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Return the transition symbol in an action pair.
;
; CALLING SEQUENCE
;
; (action-trigger-symbol! action-pair)
;
; action-pair An action/new-state pair of a line in the action table
; of the form (X (action i)).
;
; Returns: X
;
; EXAMPLE
;
; (action-trigger-symbol! '(|c| (S 3))) => |c|
;
; ------------------------------------------------------------------------------
(defun action-trigger-symbol!( action-pair )
(first action-pair)
)
; ------------------------------------------------------------------------------
; | insert-action-or-goto-into-list |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Insert an action into a line of the action table. Check for conflicts.
;
; CALLING SEQUENCE
;
; (insert-action-or-goto-into-list symbol new-state list-of-actions action)
;
; symbol The transition symbol X.
;
; new-state The new state i.
;
; list-of-actions The action list part of one line of the action or
; goto table.
;
; action If 'NONE, update a goto list, else add this action
; to an action list.
;
; Returns: Augmented action list containing a new action pair
; (X (action i)) or a conflict pair
; (CONFLICT (X (action i)) (X (old-action j)))
; Similarly for a goto list.
; EXAMPLE
;
; (insert-action-or-goto-into-list 'a 666 '((b (s 3)) (c (r 2))) :action 's)
; =>((B (S 3)) (C (R 2)) (A (S 666)))
;
; (insert-action-or-goto-into-list 'b 666 '((b (s 3)) (c (r 2))) :action 's)
; => ((B (S 3)) (C (R 2)) (CONFLICT ((B (S 666)) (B (S 3)))))
;
; (insert-action-or-goto-into-list 'a 666 '((b 5) (c 6)))
; => ((B 5) (C 6) (A 666))
;
; ------------------------------------------------------------------------------
(defun insert-action-or-goto-into-list( symbol new-state list-of-actions
&key (action 'NONE) )
; Nothing or only a default in the list. Insert a new action.
(cond ( (or (null list-of-actions)
(equal (first list-of-actions) '(default (error))))
(cons (if (equal action 'NONE)
`(,symbol ,new-state) ; Insert a goto.
`(,symbol (,action ,new-state))) ; Insert an action.
list-of-actions))
; Ignore duplicate actions.
((equal (first list-of-actions)
(if (equal action 'NONE)
`(,symbol ,new-state) ; Compare a goto.
`(,symbol (,action ,new-state)))) ; Compare an action.
list-of-actions) ; Return list unchanged.
; We have a conflict on the first action. Insert a conflict report at the
; end of the row, unless it is there already.
((equal symbol (action-trigger-symbol! (first list-of-actions)))
(setq *conflicts* T)
(insertItemIntoList (if (equal action 'NONE)
`(conflict ((,symbol ,new-state)
(,@(first list-of-actions))))
`(conflict ((,symbol (,action ,new-state))
(,@(first list-of-actions)))))
list-of-actions))
; No conflict yet --- try insertion in the rest of the list.
( t (cons (first list-of-actions)
(insert-action-or-goto-into-list symbol
new-state
(rest list-of-actions)
:action action))))
)
; ------------------------------------------------------------------------------
; | add-action-or-goto |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Add an action to the action table or a goto to the goto table.
;
;
; CALLING SEQUENCE
;
; (add-action-or-goto( state symbol new-state table action)
;
; state The current state i1.
;
; new-state The new state i2
;
; table The action or goto table.
;
; action Defaults to 'NONE for the goto table, otherwise, the
; action to take (e.g. S, R, ACC, ERROR)
;
; Returns: The updated action or goto table.
;
; EXAMPLE
;
; Let table = ( ( (2) ( (a (s 5))
; (b (r 2))
; (default (error))))
; ( (4) ( ($ (acc nil))
; (default (error)))))
;
; To insert ACTION[ 2, b ] = (shift 6) into the table, call
;
; (add-action-or-goto 2 'b 6 table :action 's) =>
;
; ( ( (2) ( (A (S 5))
; (B (R 2))
; (DEFAULT (ERROR))
; (CONFLICT ((B (S 6)) (B (R 2))))))
; ( (4) ( ($ (ACC NIL))
; (DEFAULT (ERROR)))))
;
; We detect a shift/reduce conflict on symbol b and report it.
;
; On the other hand,
;
; (add-action-or-goto 2 'c 6 table :action 's) =>
;
; ( ( (2) ( (A (S 5))
; (B (R 2))
; (C (S 6))
; (DEFAULT (ERROR))))
; ( (4) ( ($ (ACC NIL))
; (DEFAULT (ERROR)))))
;
; Suppose we have a goto table,
;
; table = ( ( (0) ( (a 10)
; (b 20)
; (default (error))))
; ( (4) ( (a 11)
; (default (error)))))
;
; To insert GOTO[ 0, c ] = 6 call
;
; (add-action-or-goto 0 'c 6 table) =>
;
; ( ( (0) ( (A 10)
; (B 20)
; (C 6)
; (DEFAULT (ERROR))))
; ( (4) ( (A 11)
; (DEFAULT (ERROR)))))
;
; -----------------------------------------------------------------------------
(defun add-action-or-goto( state symbol new-state table
&key (action 'NONE))
; The table has no entries. Create a new action table of the form
; ( (State) ( (TransitionSymbol (Action NewState)) (default (error))))
; or Goto table of the form,
; ( (State) ( (TransitionSymbol (NewState)) (default (error)))).
;
; NOTE:
; We assume the Goto graph starts with state 0.
; Since we insert new states into the action table in order,
; the order will be maintained as we scan through the Goto graph.
(cond ( (null table) `(
( (,state)
(
,(if (equal action 'NONE)
`(,symbol ,new-state) ; goto table
`(,symbol (,action ,new-state))
)
(default (error))
)
)
)
)
; Found state in first line of table. Add the new action to
; this line.
( (= (action-line-state! (first table))
state)
(cons (list (first (first table)) ; Get state of first line.
(insert-action-or-goto-into-list symbol
new-state
(action-list! (first table))
:action action))
(rest table)))
; State is smaller than first line's state. Create a new line
; containing a new state, action and (default (error)) and add it
; before the first line.
( (< state (action-line-state! (first table)))
(cons `( (,state)
( ,(if (equal action 'NONE)
`(,symbol ,new-state) ; goto table
`(,symbol (,action ,new-state))
)
(default (error))
)
)
table)
)
; State is bigger than the first line's state. Decide later.
( t (cons (first table)
(add-action-or-goto state symbol new-state (rest table)
:action action))))
)
; ------------------------------------------------------------------------------
; | build-action-table |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Build the ACTION table of a cannonical LR(1) parser.
;
;
; CALLING SEQUENCE
;
; (build-action-table goto-graph)
;
; goto-graph The goto graph generated by make-items.
;
; Returns: Action table of the grammar.
;
; METHOD
;
; We initially add the action (default (error)) to each line of the table.
; If we generate any shift-reduce or reduce-reduce conflicts,
; we record them in the action table and check them later.
;
; EXAMPLE
;
; ------------------------------------------------------------------------------
(defun build-action-table( goto-graph )
(let ( (action-table nil)
(first-symbols-after-dot nil) )
(dolist (node (nodes! goto-graph)) ; Scan through every node in
; in the goto graph.
(dolist (item (select-items! node))
(cond (
; For the item [S' -> S. , $],
; ACTION[ i, $ ] = (accept).
(is-accept? item)
(setq action-table
(add-action-or-goto
(current-state! node) ; Current state i
'$ ; Transition.
'nil ; No state.
action-table
:action 'acc) ; No state.
)
)
; For the item [A -> alpha . , b]
; ACTION[ i, b ] = (reduce k)
; where k is the number of the production A -> alpha
( (reduction? item)
(setq action-table
(add-action-or-goto
(current-state! node) ; Current state i
(lookahead-of! item) ; b.
(production-number ; Production num.
(item-to-production item))
action-table
:action 'r)
)
)
; Prepare to add a possible shift.
( t
(if *has-epsilon-productions*
; When the grammar has epsilon-productions, for the item
; [A -> alpha . beta , b]
; where beta is not equal to the null-string EPSILON,
; for all a in EFF( beta b ), we add
; ACTION[ i, a ] = (shift j) where j = GOTO( i , a ).
(setq first-symbols-after-dot
(if (reduction? item) ; beta = epsilon
nil
(first-derived-terminals `(,(symbol-after-dot! item)
,@(string-before-comma! item)
,(lookahead-of! item))
:type 'epsilon-free)))
; For a grammar with no epsilon productions, for the item
; [A -> alpha . a beta , b]
; where a is a terminal, we add
; ACTION[ i, a ] = (shift j) where j = GOTO( i , a ).
(setq first-symbols-after-dot
(if (terminal-after-dot? item)
(list (symbol-after-dot! item))
nil))
)
; Add a shift, if any.
(dolist (term first-symbols-after-dot)
(setq action-table
(add-action-or-goto (current-state! node) ; Current state i
term ; Terminal a.
(goto ; into state j.
(current-state! node)
term
goto-graph)
action-table
:action 's))))))) ; Do a shift.
action-table)
)
; ------------------------------------------------------------------------------
; | build-goto-table |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Build the GOTO table for an LR(1) parser.
;
;
; CALLING SEQUENCE
;
; (build-goto-table)
;
; goto-graph The goto graph.
;
; Returns: The goto table.
;
;
; METHOD
;
; Whenever we have a link i which has a transition
; on a nonterminal A to the link j, we fill the table with
; GOTO( i, A ) = j.
;
; EXAMPLE
;
; ------------------------------------------------------------------------------
(defun build-goto-table( goto-graph )
(let ((goto-table nil))
(dolist (link (links! goto-graph))
(if (and (> (first link) -1)
(nonterminal? (second link)))
(setq goto-table
(add-action-or-goto (first link)
(second link)
(third link)
goto-table)
)
)
)
goto-table)
)
; ==============================================================================
; | Input and Output Functions |
; ==============================================================================
; ------------------------------------------------------------------------------
; | write-header |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Write a header for the parse tables file.
;
; CALLING SEQUENCE
;
; (write-header fp parser-type)
;
; fp Pointer to the currently open file.
;
; parser-type 'LR1 or 'LALR1. The title will be adjusted
; automatically based on the parser type.
;
; Returns: Header text written to file.
;
; EXAMPLE
;
; ------------------------------------------------------------------------------
(defun write-header( fp parser-type )
(format fp "~A~%~A~%~A~%~A~%~A~%~A~%~A~%~A~%~%"
";---------------------"
(if (equal parser-type 'LR1)
"; LR(1) parse tables"
"; LALR(1) parse tables")
";---------------------"
";"
"; Suitable for input to the Common Lisp program "
";"
"; LR(1)AndLALR(1)Parser.lsp"
";"
)
)
; ------------------------------------------------------------------------------
; | write-terminals |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Write a header for the parse tables file.
;
; CALLING SEQUENCE
;
; (write-terminals fp)
;
; fp Pointer to the currently open file.
;
; Returns: Terminal symbols written to file.
;
; EXAMPLE
;
; ------------------------------------------------------------------------------
(defun write-terminals( fp terminals )
(format fp "~A~%~A~%~%"
"; TERMINALS"
";"
)
(format fp "~S~%~%" terminals)
(fresh-line fp)
(fresh-line fp)
)
; ------------------------------------------------------------------------------
; | write-productions |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Write the split up productions with their numbers.
;
; CALLING SEQUENCE
;
; (write-productions fp productions)
;
; fp Pointer to the currently open file.
;
; productions List of productions to write.
;
; Returns: Neat list of numbered productions. (These are
; the ones expanded from the alternates.)
;
; EXAMPLE
;
; See the files lalrparser.dat and parser.dat for examples.
;
; ------------------------------------------------------------------------------
(defun write-productions( fp productions )
(format fp "~A~%~A~%~A~%~A~%~%~A~%"
"; PRODUCTIONS"
";"
"; Productions are numbered starting with 1."
"; All alternates were expanded into separate productions."
"(" )
(dolist (production productions)
; Print each production.
(format fp "~A~D~A~S~A~%"
" ( "
`(,(production-number production))
" "
production
" )" )
)
(format fp "~A~%~%"
")" )
)
(defun construct-error-messages( action-table )
(let ( (error-messages nil)
(transition-symbols nil) )
; Scan through each line of the action table.
(dolist (action-line action-table)
(setq transition-symbols nil)
; Scan through the actions in each line.
(dolist (action (action-list! action-line))
; Found an error state; add message to the list.
(if (equal (second action) '(ERROR))
(push `( (,(action-line-state! action-line))
(,(concatenate 'string
"error - expecting one of the symbols "
(string-trim "("
(string-trim ")"
(write-to-string transition-symbols))))))
error-messages)
; else keep collecting transition symbols.
(setq transition-symbols
(cons (first action)
transition-symbols))
)
)
)
(reverse error-messages))
)
; ------------------------------------------------------------------------------
; | write-error-message-table |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Write the error message table with templates for the user to fill in.
;
; CALLING SEQUENCE
;
; (write-error-message-table fp action-table)
;
; fp Pointer to the currently open file.
;
; action-table
;
; Returns: Error message table.
;
; EXAMPLE
;
;
; ------------------------------------------------------------------------------
(defun write-error-message-table( fp action-table )
(format fp "~A~%~%~%"
"
; ERROR MESSAGE TABLE
;
; If the action table has an error state, the other non-error
; actions show which symbol was failed to appear next on the input.
;
; The user can modify these minimal error messages.
" )
; Opening parenthesis.
(format fp "(~%~%")
; Iterate over error states.
(dolist (error-message (construct-error-messages action-table))
(format fp " ~S ~%" error-message)
)
; Closing parenthesis.
(format fp ")~%~%")
)
; ------------------------------------------------------------------------------
; | write-goto-graph |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Write the formatted goto graph to a file.
;
; CALLING SEQUENCE
;
; (write-goto-graph fp goto-graph)
;
; fp Pointer to (open) file which is to contain the graph.
;
; goto-graph Goto graph itself, which will be pretty-printed.
;
;
; EXAMPLE
;
; (write-goto-graph fp " *goto-graph*)
; => ... see the sample output files parser.dat and lalrparser.dat.
;
; ------------------------------------------------------------------------------
(defun write-goto-graph( fp goto-graph )
; Write the title first and the opening parenthesis.
(format fp "~A~%~%"
"; GOTO GRAPH
;
; Not needed for the parser, but here for reference and debugging.
; **********
; Goto graph of the LR(1) or LALR(1) grammar of the form
;
; (
; ( <-- List of links.
; (6 |a| 4) <-- Transition in Goto graph from state 6 to
; state 4 on symbol a.
; (1 |a| 2) <-- Transition from state 1 to state 2 on a.
; )
;
; ( <-- List of sets of items.
; ( 0 <-- State number 0.
; 3668 <-- Hash value of core.
; (
; (SP -> DOT S |,| $) ----+
; ( S -> DOT S |a| S |b| |,| $) |
; ( S -> DOT EPSILON |,| $) +---- Set of items for state 0
; ( S -> DOT S |a| S |b| |,| |a|) |
; ( S -> DOT EPSILON |,| |a|) |
; ) ----+
; ) "
)
; Opening parenthesis of graph.
(format fp "(~%")
; Opening parenthesis of links.
(format fp "~3,4@T(~%")
; Print each link.
(dolist (link (links! goto-graph))
(format fp "~3,8@T(~D ~S ~D)~%"
(first link)
(second link)
(third link))
)
; Closing parenthesis of links.
(format fp "~3,4@T)~%")
; Opening parenthesis of nodes.
(format fp "~3,4@T(~%")
; Print each node in the graph.
(dolist (node (nodes! goto-graph))
; Print open paren of node, state and hash value.
(format fp "~3,8@T(~D~%~3,8@T~D~%"
(current-state! node)
(hash-value! node))
; Print out each item.
(dolist (item (select-items! node))
(format fp "~3,12@T~S~%" item))
; Closing paren of node.
(format fp "~3,8@T)~%")
)
; Closing parenthesis of nodes.
(format fp "~3,4@T)~%")
; Closing parenthesis of graph.
(format fp ")~%~%")
)
; ------------------------------------------------------------------------------
; | write-action-or-goto-table |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Write the formatted action or goto table to a file.
;
; CALLING SEQUENCE
;
; (write-action-or-goto-table fp table)
;
; fp Pointer to (open) file which is to contain the
; action-table.
;
; table Action or goto table itself, which will be
; pretty-printed.
;
;
; EXAMPLE
;
; (write-action-or-goto-table fp " *action-table)
; => ... see the sample output files parser.dat and lalrparser.dat.
;
; ------------------------------------------------------------------------------
(defun write-action-or-goto-table( fp table &key (table-type 'ACTION))
; Write the title first and the opening parenthesis.
(format fp "~A~%~A~%~A~%~A~%~A~%~%~A~%"
(cond ( (equal table-type 'ACTION) "; ACTION TABLE")
( (equal table-type 'GOTO) "; GOTO TABLE" ))
";"
"; (state"
"; (item)"
"; ..."
"(" )
; Print actions for each state.
(dolist (state table)
; Print the opening paren of the table and the state
; number in parentheses.
(format fp "~3,4@T( (~D) ~%"
(action-line-state! state)
)
; Print the word NIL explicitly if the list of items is empty.
(if (null (action-list! state))
(format fp "~3,8@TNIL~%")
; Print out the list of actions.
(progn
; Print first paren of action list.
(format fp "~3,8@T(~%")
; Print actions.
(dolist (item (action-list! state))
(format fp "~3,12@T~S~%" item))
; Print first paren of action list.
(format fp "~3,8@T)~%")
)
)
(format fp "~3,4@T)~%")
)
; Closing parenthesis.
(format fp ")~%~%")
)
; ------------------------------------------------------------------------------
; | print-legal-notice |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Write legal notice when the program starts up.
;
; CALLING SEQUENCE
;
; Returns: Legal notice to standard output.
;
; EXAMPLE
;
; ------------------------------------------------------------------------------
(defun print-legal-notice()
; Print a few newlines, the notice and a few more newlines.
(format t "~%~%~A~%~%"
"
LR(1)AndLALR(1)ParserGenerator Version 5.6
An LR(1) and LALR(1) Parser Generator written in Common Lisp.
Copyright (C) 1989-2024 by Sean Erik O'Connor. All Rights Reserved.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see .
The author's address is seanerikoconnor!AT!gmail!DOT!com
with the !DOT! replaced by . and the !AT! replaced by @"
)
)
; ------------------------------------------------------------------------------
;
; NAME
;
; load-input-and-initialize
;
; DESCRIPTION
;
; Load the grammar from file. Initialize global variables.
;
; CALLING SEQUENCE
;
; (load-input-and-initialize filename)
;
; filename Name of the file containing the productions and terminals for
; the grammar.
;
; Returns: *terminals* After this is read from file, we add the extra
; terminal $ (the language's right endmarker).
;
; *productions* After reading the list of productions,
;
; [A -> alpha | beta ...]
;
; from file, we split up the alternates to
; generate the set of productions
;
; [A -> alpha], [A -> beta], ...
;
; *first-derived-terminals*
; *epsilon-free-first-derived-terminals*
;
; Set to NIL.
;
; *has-epsilon-productions*
;
; Set to NIL unless we have epsilon
; productions of the
; form A -> EPSILON.
;
; *conflicts* Set to NIL.
;
; *goto-graph*,
; *action-table*
; *goto-table* Set to NIL just for the hell of it.
;
; EXAMPLE
;
; (load-input-and-initialize "grammar.dat")
;
; *productions* => ((S -> S |a| S |b|) (S -> EPSILON)
; *terminals* => (|a| |b| $)
; *first-derived-terminals* => NIL
; *epsilon-free-first-derived-terminals* => NIL
; *conflicts* => NIL
; *has-epsilon-productions* => NIL
;
; ------------------------------------------------------------------------------
(defun load-input-and-initialize( grammar-file )
; Better safe than sorry.
(setq *goto-graph* nil)
(setq *action-table* nil)
(setq *goto-table* nil)
(setq *conflicts* nil)
(setq *first-derived-terminals* nil)
(setq *epsilon-free-first-derived-terminals* nil)
; Split up productions and add the endmarker to the list of terminals.
(let ( (fp (open grammar-file :direction :input)) )
(setq *productions* (read fp))
(setq *terminals* (read fp))
; Add the endmarker to the list of terminals.
(setq *terminals* (append *terminals* '($)))
; Split up productions (so we don't handle alternates directly)
(setq *productions* (split-up-productions *productions*))
; Detect epsilon productions.
(setq *has-epsilon-productions* nil)
(dolist (production *productions*)
(if (epsilon-production? production)
(setq *has-epsilon-productions* T)))
(close fp))
)
; ------------------------------------------------------------------------------
; | compile-all |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Compile all the functions in this program, except compile-all itself.
;
; CALLING SEQUENCE
;
; (compile-all)
;
; EXAMPLE
;
; (compile-all) =>
;
; ;;; Compiling function LOAD-INPUT-AND-INITIALIZE...tail-merging...
; assembling...emitting...done.
;
; --- and so on, with pauses for garbage collection ---
;
; ;;; Compiling function PARSER-GENERATOR...assembling...emitting...done
; NIL
;
; ------------------------------------------------------------------------------
(defun compile-all()
; Tell the compiler the following variables are global (have dynamic binding).
(proclaim '(special *productions*))
(proclaim '(special *has-epsilon-productions*))
(proclaim '(special *terminals*))
(proclaim '(special *first-derived-terminals*))
(proclaim '(special *epsilon-free-first-derived-terminals*))
(proclaim '(special *goto-graph*))
(proclaim '(special *action-table*))
(proclaim '(special *goto-table*))
(proclaim '(special *conflicts*))
(let ( (functions-to-compile
'( print-legal-notice
load-input-and-initialize
getHeadOfListUpTo
removeItemFromList
positionInList
insertItemIntoList
combine
itemInList
terminal?
nonterminal?
derives-leading-terminal?
derives-leading-nonterminal?
valid-production?
set-of-items-in-graph? reduction?
is-accept? terminal-after-dot?
equal-sets-of-items?
contained-in-item?
element-of-item?
epsilon-production?
same-symbol?
first-alternate!
all-but-first-alternate!
production-rhs!
symbol-after-dot!
string-before-comma!
lookahead-of!
select-items!
hash-value!
current-state!
action-list!
transition-symbol!
action-line-state!
action-trigger-symbol!
core-of-item!
core-hash-value-of-item
core-hash-value-of-set-of-items
merge-lookaheads
merge-cores
split-up-production
split-up-productions
make-item
move-dot-right
create-augmenting-item
find-grammar-symbols
create-new-node
create-new-link
node-number
item-to-production
production-number
tag-symbol
flag-epsilon-free!
epsilon-free-only
untag-list
flag-non-epsilon-free precedence
derived-leading-terminal
initial-first-derived-terminals
first-terminals-of-rhs
update-first-derived-function
create-all-first-derived-terminals
first-terminals-of-symbol
first-derived-terminals
add-action-or-goto
insert-action-or-goto-into-list
goto closure compute-goto
create-goto-graph
build-action-table
build-goto-table
write-header
write-terminals
write-productions
write-goto-graph
write-action-or-goto-table
write-error-message-table
construct-error-messages
parser-generator
file-exists?
base-path!
test-parser-generator)))
; Compile all the functions, except compile-all itself.
(dolist (function-to-compile functions-to-compile)
(compile function-to-compile)))
)
; ==============================================================================
; | Main Program |
; ==============================================================================
; ------------------------------------------------------------------------------
; | parser-generator |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Main program which produces the LR(1) and LALR(1) parsing tables.
;
; CALLING SEQUENCE
;
; (parser-generator in-file out-file :parser-type parser-type)
;
; in-file Productions and terminals for the grammar. See
; the file grammar.dat for an example.
;
; out-file The numbered productions, goto graph, action and
; parsing tables for the grammar. See the files
; lalrparser.dat and parser.dat for examples.
;
; parser-type 'LR1 or 'LALR1 parsing. The default is 'LALR1.
;
; Returns: A string indicating if any conflicts have occurred.
;
; EXAMPLE
;
; (parser-generator "grammar.dat" "parser.dat" :parser-type 'lr1)
; => NIL and the file parser.dat
;
; (parser-generator "grammar.dat" "lalrparser.dat" :parser-type 'lalr1)
; => NIL and the file lalrparser.dat
;
; (parser-generator "grammar.dat" "lalrparser.dat")
; => same as above
;
; (parser-generator "grammar4.dat" "junk" :parser-type 'lalr1)
; => "Conflicts were detected" and the file junk.
;
; ------------------------------------------------------------------------------
(defun parser-generator( in-file out-file &key (parser-type 'LALR1) )
; Keep my lawyer happy.
(print-legal-notice)
; Read in the grammar file productions and terminals.
(load-input-and-initialize in-file)
(let ( (fp (open out-file :direction :output :if-exists :supersede)) )
; Compute the goto graph for the grammar.
(setq *goto-graph* (create-goto-graph parser-type))
; Construct the action and goto parsing tables.
(setq *action-table* (build-action-table *goto-graph*))
(setq *goto-table* (build-goto-table *goto-graph*))
; Write out the terminals and productions for reference.
(write-header fp parser-type)
(write-terminals fp *terminals*)
(write-productions fp *productions*)
; Write out the goto graph.
(write-goto-graph fp *goto-graph*)
; Write out the action and goto parse tables.
(write-action-or-goto-table fp *action-table* :table-type 'ACTION)
(write-action-or-goto-table fp *goto-table* :table-type 'GOTO)
; Write out the error message template.
(write-error-message-table fp *action-table*)
(close fp)
(if *conflicts*
"Conflicts were detected")
)
)
; ------------------------------------------------------------------------------
; | print-file-to-console |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; List the lines of a file to the console.
;
; CALLING SEQUENCE
;
; (print-file-to-console filename)
;
; filename Name of the file.
;
; Returns:
;
; EXAMPLE
;
; (print-file-to-console "grammar.dat")
; => ; GrammarE=E+T_T.dat
; ---------------------------------------------------------------------------
;
; A grammar of arithmetic expressions,
;
; E -> E + T | T
; ...
;
; ------------------------------------------------------------------------------
(defun print-file-to-console( file-name )
(format t "~%~%=========================== ~A =============================~%~%~%" file-name)
(with-open-file (stream file-name)
(do ( (line (read-line stream nil) ; nil inhibits throw at eof
(read-line stream nil) ) ; and read-line returns nil at eof
)
( (null line) ) ; Terminate at eof
(format t "~A~%" line)
)
)
)
(defun component-present-p (value)
(and value (not (eql value :unspecific))))
(defun directory-pathname-p (p)
(and
(not (component-present-p (pathname-name p)))
(not (component-present-p (pathname-type p)))
p))
(defun pathname-as-directory (name)
(let ((pathname (pathname name)))
(when (wild-pathname-p pathname)
(error "Can't reliably convert wild pathnames."))
(if (not (directory-pathname-p name))
(make-pathname
:directory (append (or (pathname-directory pathname) (list :relative))
(list (file-namestring pathname)))
:name nil
:type nil
:defaults pathname)
pathname)))
; ------------------------------------------------------------------------------
; | file-exists? |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Portable way to check if a file or directory exists.
;
; CALLING SEQUENCE
;
; (file-exists? directory-or-file)
;
; directory-or-file Pathname for directory or file
; Returns: t if it is there, nil if not.
;
;
; EXAMPLES
;
; (file-exists? "/NotThere") => nil
; (file-exists? "/Volumes/seanoconnor") => t
;
; ------------------------------------------------------------------------------
(defun file-exists? (pathname)
"Check if the file exists"
#+(or sbcl lispworks openmcl)
(probe-file pathname)
#+(or allegro cmu)
(or (probe-file (pathname-as-directory pathname))
(probe-file pathname))
#+clisp
(or (ignore-errors
(probe-file (pathname-as-file pathname)))
(ignore-errors
(let ((directory-form (pathname-as-directory pathname)))
(when (ext:probe-directory directory-form)
directory-form))))
#-(or sbcl cmu lispworks openmcl allegro clisp)
(error "file-exists-p not implemented")
)
; ------------------------------------------------------------------------------
; | base-path! |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Try to find out where the base directory for the web page is located.
;
; CALLING SEQUENCE
;
; (base-path!)
;
; Returns: String of base path or nil if it can't find it.
;
;
; EXAMPLES
;
; (base-path!) => "C:/Sean/WebSite" ; Got it.
; (base-path!) => nil ; Could't find it.
;
; ------------------------------------------------------------------------------
(defun base-path!()
(let ( (possible-directories-list '(
"/cygdrive/c/Sean/WebSite" ; Windows / Cygwin
"/Users/seanoconnor/Desktop/Sean/WebSite" ; Mac OS
"/home/seanoconnor/Desktop/Sean/WebSite" ; Ubuntu Linux
)))
(dolist (base-path possible-directories-list)
; (format t "base path = ~S exists = ~S~%" base-path (file-exists? base-path) )
(if (file-exists? base-path) (return (concatenate 'string base-path "/"))))
)
)
; ------------------------------------------------------------------------------
; | test-parser-generator |
; ------------------------------------------------------------------------------
;
; DESCRIPTION
;
; Run the parser generator on a test grammar and produce test parsing
; tables.
;
; CALLING SEQUENCE
;
; (test-parser-generator)
;
; Set the files and paths to your requirements. I'm assuming you've
; installed cygwin if you're on a Windows machine.
;
; ------------------------------------------------------------------------------
(defun test-parser-generator()
; Compile all the functions for speed.
(compile-all)
; Garbage collect.
(gc)
; Generate a set of parse tables from a test grammar, both LR(1) and
; LALR(1).
(let* (
; Set up the base directory paths.
(base-path (base-path!))
(sub-path "ComputerScience/Compiler/ParserGeneratorAndParser/")
(grammar-path "Grammars/" )
(parse-table-path "ParseTables/")
; List the grammar files (input) and parse table files (output).
(grammar-file '( "GrammarS=SaSbEPSILON.dat"
"GrammarE=E+T_T.dat"
"GrammarPoly.dat"
"GrammarLR(1)NotLALR(1).dat"
"GrammarNotLR(1)NotLALR(1).dat") )
(parse-file-LR1 '( "ParseTablesLR(1)_S=SaSbEPSILON.dat"
"ParseTablesLR(1)_E=E+T_T.dat"
"ParseTablesLR(1)_Poly.dat"
"ParseTablesLR(1)_NotLALR(1).dat"
"ParseTablesLR(1)_NotLR(1)NotLALR(1).dat") )
(parse-file-LALR1 '( "ParseTablesLALR(1)_S=SaSbEPSILON.dat"
"ParseTablesLALR(1)_E=E+T_T.dat"
"ParseTablesLALR(1)_Poly.dat"
"ParseTablesLALR(1)_NotLALR(1).dat"
"ParseTablesLALR(1)_NotLR(1)NotLALR(1).dat") )
)
(dotimes (i (length grammar-file))
(let* (
; Create the full file path.
(full-grammar-file
(concatenate 'string
base-path sub-path grammar-path
(nth i grammar-file))
)
(full-parse-file-LR1
(concatenate 'string
base-path sub-path parse-table-path
(nth i parse-file-LR1))
)
(full-parse-file-LALR1
(concatenate 'string
base-path sub-path parse-table-path
(nth i parse-file-LALR1))
)
)
; Call the parser generator to generate parse tables for
; both LR(1) and LALR(1).
(parser-generator full-grammar-file full-parse-file-LR1
:parser-type 'LR1)
(parser-generator full-grammar-file full-parse-file-LALR1)
; Display the results to the console.
(print-file-to-console full-grammar-file)
(print-file-to-console full-parse-file-LR1)
(print-file-to-console full-parse-file-LALR1)
)
)
)
)