1#|------------------------------------------------------------------------------
   2| 
   3| NAME
   4| 
   5|    LR(1)AndLALR(1)ParserGenerator.lsp
   6| 
   7| 
   8| DESCRIPTION
   9| 
  10|     LR parser generator which produces goto graphs and action and goto tables
  11|     for both LR(1) and LALR(1) grammars.
  12| 
  13|     It gives the same parsing tables (and conflicts) as UNIX's yacc
  14|     compiler-compiler, except that some states may be numbered in a different
  15|     order.
  16| 
  17| 
  18| CALLING SEQUENCE
  19| 
  20|     Once you are in a Common Lisp interpreter, load this file using
  21|     your path:
  22| 
  23|        (load "LR(1)AndLALR(1)ParserGenerator.lsp")
  24| 
  25|     For an LR(1) grammar, type
  26| 
  27|         (parser-generator "grammar.dat" "parser.dat" :parser-type 'LR1)    
  28| 
  29|     For an LALR(1) grammar, type
  30| 
  31|         (parser-generator "grammar.dat" "parser.dat")
  32|     or
  33|         (parser-generator "grammar.dat" "parser.dat" :parser-type 'LALR1)
  34| 
  35|     Parser-generator prints the warning message "Conflicts were detected" to
  36|     the console if any shift-reduce or reduce-reduce conflicts occur.
  37| 
  38|     For testing, you can also call
  39| 
  40|         (test-parser-generator)
  41| 
  42|     but you need to modify this function to your taste by setting the 
  43|     file paths.
  44| 
  45|     Online documentation when you're in the lisp interpreter is given by the
  46|     standard documentation functions; for example,
  47| 
  48|         (apropos 'getHead)
  49|             => GETHEAD
  50|                GETHEADOFLISTUPTO (fbound)
  51|         (describe 'getHeadOfListUpTo)
  52|                COMMON-LISP-USER::GETHEADOFLISTUPTO
  53|                  [symbol]
  54|                
  55|                GETHEADOFLISTUPTO names a compiled function:
  56|                  Lambda-list: (ITEM LIST)
  57|                  Derived type: (FUNCTION (T T) (VALUES LIST &OPTIONAL))
  58|                  Documentation:
  59|                    -------------------------------------------------------------------------------
  60|                    |
  61|                    |  DESCRIPTION
  62|                    |
  63|                    |      Return the list from the beginning up to but not including a given item,
  64|                    |      or the whole list if the item wasn't found.
  65|                    ...
  66|                    |-------------------------------------------------------------------------------
  67|                    Source file: /Users/seanoconnor/ParserGeneratorAndParser/SourceCode/ParserGenerator/LR(1)AndLALR(1)ParserGenerator.lsp
  68|
  69|         (describe '*productions*)
  70|         COMMON-LISP-USER::*PRODUCTIONS*
  71|           [symbol]
  72|         
  73|         *PRODUCTIONS* names a special variable:
  74|           Value: (((1) (S -> POLY MOD)) ((2) (MOD -> COMMA INTEGER))
  75|                   ((3) (MOD -> EPSILON)) ((4) (POLY -> POLY + TERM))
  76|                   ((5) (POLY -> TERM)) ((6) (TERM -> MULTIPLIER POWER))
  77|                   ((7) (MULTIPLIER -> INTEGER)) ((8) (MULTIPLIER -> EPSILON))
  78|                   ((9) (POWER -> X)) ((10) (POWER -> X ^ INTEGER))
  79|                   ((11) (POWER -> EPSILON)))
  80|           Documentation:
  81|              List of productions of the unaugmented grammar.
  82|         
  83| 
  84| INPUT FILES:
  85| 
  86|     grammar.dat     A list of the productions of the grammar followed by
  87|                     a list of terminal symbols.  The file grammar.dat
  88|                     shows an example.  Epsilon productions are allowed.
  89| 
  90|     We assume the start symbol is the one which begins the first production
  91|     listed in grammar.dat.
  92| 
  93|     Don't include $ (the right endmarker) in the list of terminals.  It is
  94|     added automatically by the program.
  95| 
  96| 
  97| 
  98| OUTPUT FILES:
  99| 
 100|     parser.dat      A numbered list of productions, followed by the LR(1) 
 101|                     or LALR(1) goto graph (i.e. set of items) of the 
 102|                     grammar and the action and goto tables.  See the files 
 103|                     parser.dat and lalrparser.dat for examples.
 104| 
 105|     The LALR(1) tables are the same as the ones in the y.output file 
 106|     generated by UNIX's yacc compiler-compiler running with the -v 
 107|     option. The only difference is that some states may be numbered in 
 108|     a different order.
 109| 
 110|     Shift-reduce or reduce-reduce conflicts are inserted into the action 
 111|     and goto tables at the end of the line for the state in which they 
 112|     occur.
 113| 
 114|     You can feed the action and goto tables to my Common Lisp LR parser 
 115|     program "parser.lisp".  The goto graph indicates the state of the 
 116|     parse, just as in yacc's output, and can help to define the parsing
 117|     error messages.
 118| 
 119| 
 120| AUTHOR
 121| 
 122|      Sean E. O'Connor       01  Jun 1989  Version 1.0
 123|                             11  Mar 2008  Version 5.6 released.
 124| 
 125| LEGAL
 126| 
 127|     LR(1)AndLALR(1)ParserGenerator Version 5.6
 128|     An LR(1) and LALR(1) Parser Generator written in Common Lisp.
 129| 
 130|     Copyright (C) 1989-2024 by Sean Erik O'Connor.  All Rights Reserved.
 131| 
 132|     This program is free software: you can redistribute it and/or modify
 133|     it under the terms of the GNU General Public License as published by
 134|     the Free Software Foundation, either version 3 of the License, or
 135|     (at your option) any later version.
 136|
 137|     This program is distributed in the hope that it will be useful,
 138|     but WITHOUT ANY WARRANTY; without even the implied warranty of
 139|     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 140|     GNU General Public License for more details.
 141|
 142|     You should have received a copy of the GNU General Public License
 143|     along with this program.  If not, see <http://www.gnu.org/licenses/>.
 144|    
 145|     The author's address is seanerikoconnor!AT!gmail!DOT!com
 146|     with the !DOT! replaced by . and the !AT! replaced by @
 147| 
 148| 
 149| METHOD
 150| 
 151|     This is a Common Lisp implementation and will run under CLISP.  
 152|     The software design is layered, the simpler list manipulation
 153|     utilities coming first, building up gradually to the specialized
 154|     and higher level parser functions.  I've put lots of examples to
 155|     ease the pain.
 156| 
 157|     To construct the LR(1) goto graph (i.e. set of items) we use Algorithm 
 158|     4.9 of [Aho 86, pg. 231-232].  To create the cannonical LR(1) parsing 
 159|     action and goto tables, algorithm 4.10 [Aho 86, pg. 234] is used.
 160| 
 161|     To construct the LALR(1) parsing tables, we use the much simpler 
 162|     algorithm of [Aho 74, pg. 115] instead of algorithm 4.11 in [Aho 86,
 163|     pgs. 238-239].
 164| 
 165|     For computing FIRST (first derived terminals) we use algorithm 5.5 of 
 166|     [Aho 72, pgs. 357-359].
 167| 
 168|     The function EFF (epsilon-free first derived terminals) is described
 169|     in [Aho 72, pg. 381].  We base the algorithm used in the function
 170|     first-terminals-of-symbol on exercise 5.2.19 [Aho 72, pg. 398].  The 
 171|     modifications to algorithm 5.5 to make it compute EFF are my own and 
 172|     are described in my notes.
 173| 
 174|     In the first version of this program, we used the algorithm for FIRST 
 175|     of [Aho 86, pgs. 188-189].  But this algorithm does not always 
 176|     terminate!  In particular, it fails for the grammar, 
 177| 
 178|         S -> A S | b 
 179|         A -> S A | a  
 180| 
 181|     of example 4.33 [Aho 86, pg. 272] by getting into the following 
 182|     infinite loop:  FIRST( S ) = FIRST( A ) = FIRST( S ) ... The algorithm 
 183|     we use always terminates.
 184| 
 185| 
 186| REFERENCES
 187| 
 188|         See http://www.seanerikoconnor.freeservers.com for a review of the
 189|         parsing theory behind this program.
 190|                   
 191| 
 192|         [Aho 86]  COMPILERS: PRINCIPLES, TECHNIQUES, AND TOOLS,
 193|                   Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman,
 194|                   Addison-Wesley, 1986.
 195| 
 196|         [Aho 74]  "LR Parsing", Alfred V. Aho and Stephen C. Johnson, 
 197|                   Computing Surveys, Vol. 6, No. 2, June 1974, pg. 99-124.
 198| 
 199|         [Aho 72]  THE THEORY OF PARSING, TRANSLATION AND COMPILING, VOLUME 1:
 200|                   PARSING, Alfred V. Aho and Jeffrey D. Ullman, Prentice-Hall,
 201|                   1972.
 202| 
 203| BUGS
 204| 
 205|    Have the output look like the file y.output generated by yacc -v, or 
 206|    eyacc -v.
 207| 
 208| NOTES
 209|
 210|        In Common Lisp, functions and variables don't share the same namespace.
 211|        So you need to tell LISP that the variable func is actually a function by using funcall.
 212|
 213|        (defun apply-func (func arg1 arg2) (funcall func arg1 arg2))
 214|
 215|        So you can't do the simpler expression (func arg1 arg2).
 216|
 217|        It goes the other way too.  You can't say (apply-func + 1 2) because + denotes a variable.
 218|        You have to tell LISP it's a function,
 219|
 220|        (apply-func #'+ 1 2)
 221|
 222|        which is a shorthand for
 223|
 224|        (apply-func (function +) 1 2)
 225|
 226|        first and car are synonyms as are rest and cdr
 227|        (null '()) => T
 228|        (null nil) => T
 229|        (null 'a) => NIL
 230|
 231|        Optional arguments using keywords example:
 232|
 233|        (defun doggie-looks( dog &key (nose-color 'red) (hair-color 'white)) 
 234|            (list 'my dog 'has 'a nose-color 'nose 'and hair-color 'hair))
 235|
 236|        * (doggie-looks 'husky)                    => (MY HUSKY HAS A RED NOSE AND WHITE HAIR)
 237|        * (doggie-looks 'husky :nose-color 'black) => (MY HUSKY HAS A BLACK NOSE AND WHITE HAIR)
 238|
 239+-------------------------------------------------------------------------------|#
 240
 241
 242; ==============================================================================
 243; |                             Constants                                      |
 244; ==============================================================================
 245
 246; Use the naming convention +variable-name+ to denote constants.
 247
 248(defconstant +initial-hash-table-size+ 100
 249"-------------------------------------------------------------------------------
 250|  Initial hash table length.  Don't worry, lisp hash tables are extensible
 251|  at run time.
 252--------------------------------------------------------------------------------"
 253)
 254
 255
 256
 257(defconstant +hash-value-upper-limit+ 65536
 258"-------------------------------------------------------------------------------
 259|  Upper limit on hash value on core of items.
 260|-------------------------------------------------------------------------------"
 261)
 262
 263
 264
 265; ==============================================================================
 266; |                Dynamically Bound (i.e. Global) Variables                   |
 267; ==============================================================================
 268
 269; Use the naming convention *variable-name* to denote them as global.
 270
 271(defvar *productions* nil
 272"-------------------------------------------------------------------------------
 273|  List of productions of the unaugmented grammar (without S' -> S).  
 274|  e.g. ( (S -> S |a| S |b| / EPSILON) )
 275|  which represents S -> S a S b, S -> EPSILON
 276|-------------------------------------------------------------------------------"
 277)
 278
 279
 280
 281(defvar *has-epsilon-productions* nil
 282"-------------------------------------------------------------------------------
 283|  T if we have any epsilon productions of the form A -> EPSILON, but NIL otherwise.
 284--------------------------------------------------------------------------------"
 285)
 286
 287
 288
 289(defvar *terminals* nil
 290"-------------------------------------------------------------------------------
 291|  List of terminal symbols for the grammar.  e.g. ( |a| |b| )
 292|-------------------------------------------------------------------------------"
 293)
 294
 295
 296
 297(defvar *first-derived-terminals* nil
 298"-------------------------------------------------------------------------------
 299|  Hash table containing the first derived terminals for each grammar symbol.
 300|-------------------------------------------------------------------------------"
 301)
 302
 303
 304
 305(defvar *epsilon-free-first-derived-terminals* nil
 306"-------------------------------------------------------------------------------
 307|  Hash table containing the epsilon-free first derived terminals for each grammar symbol.
 308|-------------------------------------------------------------------------------"
 309)
 310
 311
 312
 313(defvar *goto-graph* nil
 314"---------------------------------------------------------------------------------------------------------------
 315|  Goto graph of the LR(1) or LALR(1) grammar of the form
 316|
 317| (
 318|   (                                                       ------+
 319|       (6 |a| 4)   <-- Transition in Goto graph from             |
 320|                       state 6 to state 4 on symbol a.           +------ List of graph edges and transitions.
 321|       (1 |a| 2)   <-- Transition from state 1 to state 2        |      
 322|                       on a.                                     |
 323|   )                                                       ------+
 324|
 325|   )                                                                           --------+ 
 326|       ( 0                                <-- State number 0.                          |
 327|         3668                             <-- Hash value of core of items.             |
 328|         (                                                                             |
 329|            (SP -> DOT S           |,|  $)  ----+                                      |
 330|            ( S -> DOT S |a| S |b| |,|  $)      |                                      |
 331|            ( S -> DOT EPSILON     |,|  $)      +---- Set of items for state 0.        |
 332|            ( S -> DOT S |a| S |b| |,| |a|)     |                                      |
 333|            ( S -> DOT EPSILON     |,| |a|)     |                                      |
 334|         )                                  ----+                                      |
 335|       )                                                                               +-- List of sets of items.
 336|                                                                                       |
 337|       ( 2                                <-- State number 2.                          |
 338|         5168                             <-- Hash values of core of items.            |
 339|          (                                                                            |
 340|            (S -> S |a| DOT S |b|       |,|  $)  ----+                                 |
 341|            (S -> S |a| DOT S |b|       |,| |a|)     |                                 |
 342|            (S ->       DOT S |a| S |b| |,| |b|)     |                                 |
 343|            (S ->       DOT EPSILON     |,| |b|)     +-- Set of items for state 2.     |
 344|            (S ->       DOT S |a| S |b| |,| |a|)     |                                 |
 345|            (S ->       DOT EPSILON     |,| |a|) ----+                                 |
 346|          )                                                                            |
 347|      )                                                                                |
 348|   )                                                                           --------+
 349| ) 
 350|--------------------------------------------------------------------------------------------------------------"
 351)
 352
 353
 354
 355(defvar *action-table* nil
 356"-------------------------------------------------------------------------------
 357|  Action table of the form,
 358|
 359| (
 360|    ( (0)                        <-- state number
 361|      (
 362|        ($ (R 2))                <-- reduce action on end of input $
 363|        (|a| (R 2))              <-- reduce action on symbol a.
 364|        (DEFAULT (ERROR))        <-- otherwise must be error
 365|      )
 366|    )
 367|    
 368|    ( (1)                         <-- next line of action table.
 369|      (
 370|        ($ (ACC NIL))             <-- accept action on end of input $
 371|        (|a| (S 2))               <-- shift action on symbol a.
 372|        (DEFAULT (ERROR))
 373|      )
 374|    )
 375| )
 376|-------------------------------------------------------------------------------"
 377)
 378
 379
 380
 381(defvar *goto-table* nil
 382"-------------------------------------------------------------------------------
 383|  Goto table of the form,
 384|
 385| (
 386|    ( (0)                   <-- state number
 387|      (
 388|        (S 1)               <-- transition to state 1 on symbol S
 389|        (DEFAULT (ERROR))   <-- otherwise error
 390|      )
 391|    )
 392|
 393|    ( (2) 
 394|      (
 395|        (S 3)
 396|        (DEFAULT (ERROR))
 397|      )
 398|    )
 399| )
 400|-------------------------------------------------------------------------------"
 401)
 402
 403
 404
 405(defvar *conflicts* nil
 406"-------------------------------------------------------------------------------
 407|  Set to true if we have any shift-reduce or reduce-reduce conflicts.
 408|-------------------------------------------------------------------------------"
 409)
 410
 411
 412
 413; ==============================================================================
 414; |                  General Purpose List Processing Primitives                |
 415; ==============================================================================
 416
 417(defun getHeadOfListUpTo( item list )
 418
 419"-------------------------------------------------------------------------------
 420|
 421|  DESCRIPTION
 422|
 423|      Return the list from the beginning up to but not including a given item,
 424|      or the whole list if the item wasn't found.
 425|      
 426|  CALLING SEQUENCE
 427|
 428|     (getHeadOfListUpTo item list)
 429|            => New list of all symbols before the item.
 430|
 431|  EXAMPLE
 432|
 433|     (getHeadOfListUpTo 'rat '(you are a rat fink)) => (YOU ARE A)
 434|     (getHeadOfListUpTo 'cat '(you are a rat fink)) => (YOU ARE A RAT FINK)
 435|     (getHeadOfListUpTo 'rat '(rat)               ) =>  nil
 436|     (getHeadOfListUpTo 'rat   nil                ) =>  nil
 437|
 438|-------------------------------------------------------------------------------"
 439
 440    (cond ( (null list)                nil)  ; Empty list.
 441          ( (equal (first list) item)  nil)  ; List = (item).  Return ().
 442
 443          ;  Recurse.
 444          ( (cons  (first list)
 445                   (getHeadOfListUpTo item (rest list)))))
 446)
 447
 448
 449
 450
 451(defun removeItemFromList( item list &key (equalityTest #'equal) )
 452
 453"-------------------------------------------------------------------------------
 454|
 455|    DESCRIPTION
 456|  
 457|        Remove all occurences of a given item from a list.  Test item equality
 458|        with a function.
 459|  
 460|    CALLING SEQUENCE
 461|  
 462|        (removeItemFromList item list :equalityTest testFunction)
 463|            => New list with all occurrences of symbol taken out.
 464|  
 465|        testFunction   The name of the function which tests if two symbols are 
 466|                       equal.  It should be a function of two arguments which
 467|                       returns T if the symbols are equal and NIL otherwise.
 468|                       It defaults to #'equal.
 469|  
 470|    EXAMPLE
 471|  
 472|        (removeItemFromList '(rat bad) '( (cat good) (rat good)))
 473|             => ( (CAT GOOD) (RAT GOOD) )
 474|  
 475|        (defun sameAnimal( s1 s2 ) (equal (first s1) (first s2)))
 476|        (funcall #'sameAnimal '(rat good) '(rat bad)) => T
 477|  
 478|        (removeItemFromList '(rat bad) '( (cat good) (rat good))
 479|                        :equalityTest #'sameAnimal) 
 480|             => ( (CAT GOOD) )
 481|
 482+-------------------------------------------------------------------------------"
 483
 484    (cond ( (null list)  nil)                        ; Nothing in the list.
 485 
 486          ( (funcall equalityTest                    ; First item matches.
 487                     item (first list))              ; according to equality
 488                                                     ; test.
 489
 490            (removeItemFromList item (rest list)     ; Discard it and remove
 491                                                     ; all other
 492                                :equalityTest equalityTest))  ; items too.
 493
 494          ( t  (cons (first list)                 ; First item does not match.
 495
 496                     (removeItemFromList item     ; Add it back and remove the 
 497                                    (rest list)   ; remaining items.
 498                                    :equalityTest equalityTest)))) 
 499)
 500
 501
 502
 503(defun itemInList( element list &key (test #'equal) )
 504
 505"-------------------------------------------------------------------------------
 506|  
 507|    DESCRIPTION
 508|  
 509|        Find out if an atom or a list is a member of a given list.  Test for
 510|        equality with a function.
 511|  
 512|    CALLING SEQUENCE
 513|   
 514|        (itemInList item list :equalityTest testFunc)
 515|        =>  T if item is in list; NIL if not.
 516|  
 517|        testFunc    The name of the function which tests if two symbols are 
 518|                    equal.  It should be a function of two arguments which
 519|                    returns T if the symbols are equal and NIL otherwise.
 520|                    test defaults #'equal.
 521|  
 522|    EXAMPLE
 523|  
 524|        (itemInList '(hot dog) '((cool cat) (cool dog)) ) => NIL
 525|  
 526|        (defun sameAnimal( s1 s2 ) (equal (first s1) (first s2)))
 527|  
 528|        (itemInList '(hot dog) '((cool cat) (cool dog))
 529|                    :equalityTest #'sameAnimal) => T
 530|  
 531+---------------------------------------------------------------------------------"
 532
 533(cond ( (null list) nil)                        ; Not in the list.
 534
 535      ( (funcall test element (first list))     ; First item matches.
 536
 537                         t)
 538
 539      ( t  (itemInList element (rest list)      ; Try again on rest of list.
 540                        :test test))) 
 541)
 542
 543
 544
 545(defun positionInList( item list )
 546
 547"-------------------------------------------------------------------------------
 548|
 549|  DESCRIPTION
 550|
 551|      Find the position of an item in a list.
 552|
 553|  CALLING SEQUENCE
 554|
 555|     (positionInList item list)
 556|
 557|     item      Atom or list to be found.
 558|
 559|     list      Any list.
 560|
 561|     Returns:  The position of item in the list or NIL if it is not there.
 562|               The first position is zero.
 563|
 564|  EXAMPLE
 565|
 566|     (positionInList '(winter mute)  '(I am (winter mute))) => 2
 567|     (positionInList 'ratfinn        '(Who you ? ratfink ?)) => NIL
 568|
 569+---------------------------------------------------------------------------------"
 570
 571    (cond ( (null list)  nil )                ; Nothing in the list.
 572
 573          ( (equal item (first list))  0)     ; list = (item ...), return
 574                                              ; position = 0.
 575
 576          ;  If the item is in the rest of the list, find its position in the 
 577          ;  rest of the list, then add 1 to fix up the count.
 578
 579          ( (itemInList item (rest list))
 580                     (1+ (positionInList item (rest list))))
 581
 582          ( t nil ))                          ; Item was not found --- 
 583                                              ; return NIL.
 584)
 585
 586
 587
 588(defun insertItemIntoList( item L &key (test #'equal) (precedence nil) )
 589
 590"------------------------------------------------------------------------------
 591|
 592|  DESCRIPTION
 593|
 594|      If an object isn't already in the list, add it to the end.  If it is,
 595|      overwrite it (see below).
 596|
 597|  CALLING SEQUENCE
 598|
 599|      (insertItemIntoList item L :test test :precedence precedence)
 600|
 601|      item       An atom or list.
 602|
 603|      test       The test to perform to see if an item is is in the list.  
 604|                 It is the name of a function with two arguments which should
 605|                 return T if its arguments are equal and NIL if they aren't.
 606|                 The test function defaults to #'equal if omitted.
 607|
 608|      precedence The test function to perform to say which object has the
 609|                 higher precedence when both are equal.  The one of higher
 610|                 precedence is kept.  An item of higher precedence overwrites
 611|                 its lower precedence brother in the list.  The function should
 612|                 be of the form (precedence x y), returning the object of 
 613|                 higher precedence.  Defaults to NIL (Don't care).
 614|
 615|      L          List of non-duplicated elements (according to equality test
 616|                 specified above).
 617|
 618|      Returns:   Unchanged list if item is already in it.  Otherwise, returns
 619|                 the list L with the item in the last position.
 620|
 621|  EXAMPLE
 622|
 623|      (insertItemIntoList '(rat good) '( (rat bad) (bat good) ) )
 624|              => ((RAT BAD) (BAT GOOD) (RAT GOOD))
 625|      We compared for exact equality, so the new item gets inserted.
 626|
 627|
 628|      (defun sameAnimal( s1 s2 ) (equal (first s1) (first s2)))
 629|
 630|      (insertItemIntoList '(rat good) 
 631|                          '((rat bad) (bat good)) 
 632|                          :test #'sameAnimal)
 633|         =>  ((RAT BAD) (BAT GOOD))
 634|      Rats are already in the list, so don't add the item.
 635|
 636|      (defun good-always-wins( x y ) (cond ((equal (second x) 'good) x) (t y)))
 637|
 638|      (insertItemIntoList '(rat good) '( (rat bad) (bat good) )
 639|                        :test #'sameAnimal 
 640|                        :precedence 'good-always-wins) =>
 641|         =>  ((RAT GOOD) (BAT GOOD))
 642|      Rats are already in the list, but we now compare equal items further
 643|      to see which have higher precedence.
 644|
 645|------------------------------------------------------------------------------"
 646
 647(cond ( (null L)                 (list item))   ; Nothing there.  Add the item.
 648
 649      ( (funcall test item (first L))           ; Item is already in the list.
 650
 651      ; Of the two equal objects --- item and the first element in the list ---
 652      ; keep the one of higher precedence.
 653 
 654                (if (not (null precedence))
 655
 656                  (cons (funcall precedence item (first L)) (rest L))
 657
 658                             L))               ; Don't care about precedence, so
 659                                               ; keep the original list.
 660
 661      ( t                        (cons (first L) 
 662                                       (insertItemIntoList item 
 663                                                         (rest L)
 664                                                         :test test
 665                                                         :precedence precedence))))
 666)
 667
 668
 669
 670(defun combine( list1 list2 &key (test #'equal) (precedence nil) )
 671
 672"------------------------------------------------------------------------------
 673| 
 674|  DESCRIPTION
 675|
 676|      Take the union of two lists.  We can do a generalized test for
 677|      equality of elements.  Also, if two elements are equal, we can
 678|      keep the one of higher precedence.
 679|
 680|  CALLING SEQUENCE
 681|
 682|      (combine list1 list2 :test test :precedence precedence)
 683|
 684|      list1      Arbitrary lists.
 685| 
 686|      list2
 687|
 688|      item       An atom or list.
 689|
 690|      test       The test to perform to see if an item is is in the list.  
 691|                 It is the name of a function with two arguments which should
 692|                 return T if its arguments are equal and NIL if they aren't.
 693|                 The test function defaults to #'equal if omitted.
 694|
 695|      precedence The test function to perform to say which object has the
 696|                 higher precedence when both are equal.  The one of higher
 697|                 precedence is kept.  An item of higher precedence overwrites
 698|                 its lower precedence brother in the list.  The function should
 699|                 be of the form (precedence x y), returning the object of higher
 700|                 precedence.  precedence defaults to NIL (Don't care).
 701|
 702|      Returns:   The set theoretic union of the two lists, except that we 
 703|                 always keep the element of highest precedence when two 
 704|                 elements are the same.
 705|
 706|  EXAMPLE
 707|
 708|      (combine '((rat good) (rat awful)) '((rat bad) (bat good)))
 709|              => ((RAT AWFUL) (RAT GOOD) (RAT BAD) (BAT GOOD))
 710|
 711|      (defun sameAnimal( s1 s2 ) (equal (first s1) (first s2)))
 712|
 713|      (combine '((rat good) (rat awful)) '((rat bad) (bat good))
 714|                :test #'sameAnimal)
 715|         =>  ((RAT AWFUL) (BAT GOOD))
 716|
 717|      (defun good-always-wins( x y ) (cond ((equal (second x) 'good) x) (t y)))
 718|
 719|      (combine '((rat good) (rat awful)) '((rat bad) (bat good))
 720|               :test #'sameAnimal 
 721|               :precedence #'good-always-wins) => ((RAT GOOD) (BAT GOOD))
 722|
 723|------------------------------------------------------------------------------"
 724
 725;  Successively add elements from both lists to nil, eliminating
 726;  duplicated or low precedence items.  If both lists are nil, dolist
 727;  does not loop, and we return nil.
 728
 729(let ( (new-list nil) )
 730
 731    (dolist (item (union list1 list2 :test #'equal))
 732
 733        (setq new-list (insertItemIntoList item new-list
 734                                        :test test :precedence precedence)))
 735new-list)
 736)
 737
 738
 739
 740
 741; ------------------------------------------------------------------------------
 742; |                               core-of-item!                                |
 743; ------------------------------------------------------------------------------
 744;
 745;  DESCRIPTION
 746;
 747;      Get the core of an item:  the item but without the lookahead.
 748;      
 749;  CALLING SEQUENCE
 750;
 751;      (core-of-item! item)
 752;
 753;      item          [A -> alpha . beta , gamma ]
 754;
 755;      Returns:      [A -> alpha . beta]
 756;
 757;  EXAMPLE
 758;
 759;      (core-of-item! '(sandwich -> bread meat DOT bread |,| knife)) 
 760;      => (SANDWICH -> BREAD MEAT DOT BREAD)
 761;
 762; ------------------------------------------------------------------------------
 763
 764(defun core-of-item!( item )
 765
 766    (getHeadOfListUpTo '|,| item)
 767
 768)
 769
 770
 771
 772
 773; ------------------------------------------------------------------------------
 774; |                               element-of-item?                             |
 775; ------------------------------------------------------------------------------
 776;
 777;  DESCRIPTION
 778;
 779;      Find out if an item or its core is in a set of items.
 780;     
 781;  CALLING SEQUENCE
 782;
 783;     (element-of-item item set-of-items compare-type)
 784;
 785;     item               [A -> alpha . beta , gamma]
 786;
 787;     set-of-items       ... [A' -> alpha' . beta' , gamma'] ...
 788;
 789;     compare-type       Whether to compare the whole item or only its core.
 790;
 791;     Returns:           if compare-type = 'core then T if A = A', alpha = 
 792;                        alpha', beta = beta'.  gamma need not equal gamma'. 
 793;                        if compare-type = item then T if in addition, 
 794;                        gamma = gamma', and NIL otherwise.
 795;
 796;  EXAMPLE
 797;
 798;     (element-of-item?   '(eat -> living death |,| scum)
 799;                       '( (eat -> hot fudge    |,| scum)
 800;                          (eat -> living death |,| wimp))
 801;                       'core) => T
 802;
 803;     But (element-of-item? . . . 'item) => NIL
 804;
 805; ------------------------------------------------------------------------------
 806
 807(defun element-of-item?( item set-of-items compare-type )
 808
 809(cond ((null set-of-items) nil)                           ; No items, no match. 
 810
 811      ( (if (equal compare-type 'core)
 812
 813            (equal (core-of-item! item)                   ; Core of first item
 814                   (core-of-item! (first set-of-items)))  ; was found.
 815
 816            (equal item (first set-of-items)))            ; First items match.
 817      
 818             T)                                           ; First item is in set
 819
 820     ( t    (element-of-item? item                        ; Continue to search.
 821                              (rest set-of-items) 
 822                              compare-type)))
 823)
 824
 825
 826
 827
 828; ------------------------------------------------------------------------------
 829; |                               contained-in-item?                           |
 830; ------------------------------------------------------------------------------
 831;
 832;  DESCRIPTION
 833;
 834;      Find if the first set of items is contained in the second 
 835;      set of items.  Alternatively, find out if the cores of the
 836;      first set of items are contained in the cores of the second set.
 837;      
 838;  CALLING SEQUENCE
 839;
 840;      (contained-in-item? set-of-items1 set-of-items2 compare-type)
 841;
 842;      set-of-items1  First and ...
 843;
 844;      set-of-items2  ... second sets of of items.
 845;
 846;      compare-type    Type of comparison:  'item or 'core.
 847;
 848;      Returns:        T if compare-type = 'item and the first set of items
 849;                      is contained in the second set of items.
 850;                      T if compare-type = 'core and the cores of the first 
 851;                      set of items are contained in the cores of the second
 852;                      set of items.
 853;  EXAMPLE
 854;
 855;     (contained-in-item? '( (a -> b DOT c |,| x)
 856;                            (e -> f DOT |,| g))
 857;
 858;                         '( (a -> b DOT c |,| h)
 859;                            (e -> f DOT |,| i)
 860;                            (f -> g DOT h i |,| j) )
 861;                         'core ) => T
 862;
 863;    However, (contained-in-item? . . . 'item) => NIL
 864;
 865; ------------------------------------------------------------------------------
 866
 867(defun contained-in-item?( set-of-items1 set-of-items2 compare-type )
 868
 869(cond ((null set-of-items1) T)                  ; Null set is contained in
 870                                                ; every set.
 871
 872      ((element-of-item? (first set-of-items1)  ; First item (or its core) 
 873                         set-of-items2          ; is in the second set.
 874                         compare-type)
 875
 876            (if (null (rest set-of-items1))     ; No other elements in first set
 877
 878                T
 879
 880                (contained-in-item? (rest set-of-items1) ; Are the remaining
 881                                    set-of-items2        ; items of the first
 882                                    compare-type)))      ; set in the second?
 883   
 884       ( t  nil ))                 ; First element of first set isn't in 
 885                                   ; the second set: first set can't be 
 886                                   ; contained in second set.
 887)
 888
 889
 890
 891
 892; ------------------------------------------------------------------------------
 893; |                             equal-sets-of-items?                           |
 894; ------------------------------------------------------------------------------
 895;
 896;  DESCRIPTION
 897;
 898;      Check if two sets of items are the same (or have the same core).
 899;      You can also use it to check if two arbitrary lists contain the
 900;      same elements.
 901;      
 902;  CALLING SEQUENCE
 903;
 904;      (equal-sets-of-items? set-of-items1 set-of-items2 :compare-type type)
 905;     
 906;      set-of-items1  First and ...
 907;
 908;      set-of-items2  ... second sets of of items.
 909;
 910;      type           Optional argument defaulting to 'item.
 911;
 912;      Returns:       T for type = 'item if both sets of items are identical.
 913;                     T for type = 'core if both sets of items have the same
 914;                     cores.
 915;  METHOD
 916;
 917;      Two sets of items are the same if each one is contained within the 
 918;      other. They have the same core if the core of one is contained in 
 919;      the core of the other and vice-versa.  We can't just test for equality
 920;      because the order of the items could be different.
 921;
 922;  EXAMPLE
 923;
 924;     (equal-sets-of-items? '( (a -> b DOT c |,| d) )
 925;                           '( (a -> b DOT c |,| d) )) => T
 926;
 927;     (equal-sets-of-items? '( (a -> b DOT c |,| d) )
 928;                           '( (a -> b DOT c |,| e) )
 929;                           :compare-type 'core ) => T
 930;
 931;     (equal-sets-of-items? '(a (b c) d) '(d a (b c))) => T
 932;
 933; ------------------------------------------------------------------------------
 934
 935(defun equal-sets-of-items?( set-of-items1 set-of-items2 
 936                              &key (compare-type 'item))
 937
 938(and (contained-in-item? set-of-items1 set-of-items2 compare-type)
 939     (contained-in-item? set-of-items2 set-of-items1 compare-type))
 940)
 941
 942
 943
 944
 945; ==============================================================================
 946; |                   Helper Functions on Symbols and Productions              |
 947; ==============================================================================
 948
 949; ------------------------------------------------------------------------------
 950; |                              terminal?                                     |
 951; ------------------------------------------------------------------------------
 952;
 953;  DESCRIPTION
 954;
 955;      Find out if a symbol of the grammar is a terminal.
 956;
 957;  CALLING SEQUENCE
 958;
 959;      (terminal? symbol)
 960;
 961;      *terminals*  Global list of terminal symbols for the grammar.
 962;
 963;      symbol:      A grammar symbol
 964;
 965;      Returns:     T if the symbol is a terminal symbol, NIL otherwise.
 966;                   EPSILON is not a terminal.
 967;
 968;  EXAMPLE
 969;
 970;      Let *terminals* = (c C),
 971;
 972;      (terminal? '|c| ) => T
 973;      (terminal? 'C )   => NIL
 974;      (terminal? 'EPSILON)   => NIL
 975;
 976; ------------------------------------------------------------------------------
 977
 978(defun terminal?( symbol )
 979
 980(itemInList symbol *terminals*)
 981)
 982
 983
 984
 985
 986; ------------------------------------------------------------------------------
 987; |                               nonterminal?                                 |
 988; ------------------------------------------------------------------------------
 989; 
 990;  DESCRIPTION
 991;
 992;      Find out if a symbol of the grammar is a nonterminal.
 993;
 994;  CALLING SEQUENCE
 995;
 996;      (nonterminal? symbol)
 997;
 998;      symbol:     A grammar symbol
 999;
1000;      Returns:    T if the symbol is a nonterminal symbol, NIL otherwise.
1001;                  EPSILON is not a non-terminal.
1002;
1003;  EXAMPLE
1004;
1005;      (nonterminal? 'C )   => T
1006;      (nonterminal? '|c| ) => NIL
1007;      (nonterminal? 'EPSILON ) => NIL
1008;
1009; ------------------------------------------------------------------------------
1010
1011(defun nonterminal?( symbol )
1012
1013(and (not (equal symbol 'EPSILON))
1014     (not (terminal? symbol)))
1015
1016)
1017
1018
1019
1020
1021; ------------------------------------------------------------------------------
1022; |                         derives-leading-terminal?                          |
1023; ------------------------------------------------------------------------------
1024;
1025; DESCRIPTION
1026;
1027;     Check if a production derives a leading terminal.
1028;
1029; CALLING SEQUENCE
1030;
1031;     (derives-leading-terminal? production)
1032;
1033;     production:    A production of the form X -> a Y
1034;
1035;     Returns:       T if a is a terminal, NIL otherwise.
1036;
1037; EXAMPLE
1038;
1039;     (derives-leading-terminal? '(C -> |c| C)) => T
1040;     (derives-leading-terminal? '(C -> C C))   => NIL
1041;
1042; ------------------------------------------------------------------------------
1043
1044(defun derives-leading-terminal?( production )
1045
1046(terminal? (third production))
1047
1048)
1049
1050
1051
1052
1053; ------------------------------------------------------------------------------
1054; |                       derives-leading-nonterminal?                         |
1055; ------------------------------------------------------------------------------
1056;
1057; DESCRIPTION
1058;
1059;     Check if a production derives a leading nonterminal.
1060;
1061; CALLING SEQUENCE
1062;
1063;     (derives-leading-nonterminal? production)
1064;
1065;     production:    A production of the form X -> a Y
1066;
1067;     Returns:       T if a is a terminal, NIL otherwise.
1068;
1069; EXAMPLE
1070;
1071;     (derives-leading-nonterminal? '(C -> C C))   => T
1072;     (derives-leading-nonterminal? '(C -> |c| C)) => NIL
1073;
1074; ------------------------------------------------------------------------------
1075
1076(defun derives-leading-nonterminal?( production )
1077
1078(nonterminal? (third production))
1079
1080)
1081
1082
1083
1084
1085; ------------------------------------------------------------------------------
1086; |                                valid-production?                           |
1087; ------------------------------------------------------------------------------
1088;
1089; DESCRPTION
1090;
1091;     Find out if a production starts with the given symbol.
1092;
1093; CALLING SEQUENCE
1094;
1095;     (valid-production? symbol production)
1096;
1097;     symbol             Grammar symbol
1098;
1099;     production         A production of the form X -> alpha
1100;
1101;     Returns:           T if X = symbol, NIL otherwise.
1102;
1103; EXAMPLE
1104;
1105;     (valid-production? 'C '(C -> |c| C)) => T
1106;     (valid-production? 'S '(C -> |c| C)) => NIL
1107;
1108; ------------------------------------------------------------------------------
1109
1110(defun valid-production?( symbol production )
1111
1112(equal symbol (first production)) 
1113
1114)
1115
1116
1117
1118; ------------------------------------------------------------------------------
1119; |                               reduction?                                   |
1120; ------------------------------------------------------------------------------
1121; 
1122;  DESCRIPTION
1123;
1124;      Find out if an item calls for a reduction.
1125;
1126;  CALLING SEQUENCE
1127;
1128;     (reduction? item)
1129;
1130;     item          Any item [A -> alpha . beta , gamma].
1131;
1132;     Returns:      T if the item is of the form [A -> alpha . , gamma ]
1133;                   and NIL otherwise.  
1134;                   Note [A -> alpha . EPSILON , gamma] =
1135;                        [A -> alpha . , gamma], to this is a reduction too.
1136;
1137;  EXAMPLE
1138;
1139;     (reduction? '(C -> |d| DOT |,| |c|)) => T
1140;     (reduction? '(C -> |d| DOT |e| |,| |c|)) => NIL
1141;     (reduction? '(C -> |d| DOT EPSILON |,| |c|)) => T
1142;
1143; ------------------------------------------------------------------------------
1144
1145(defun reduction?( item )
1146
1147;  Get everything between the dot and comma.  It will be empty for a reduction.
1148
1149(let ((between-dot-and-comma (getHeadOfListUpTo 'DOT (reverse (getHeadOfListUpTo '|,| item )))))
1150
1151    (or (null between-dot-and-comma) 
1152        (equal (first between-dot-and-comma) 'epsilon)))
1153)
1154
1155
1156
1157
1158; ------------------------------------------------------------------------------
1159; |                                  is-accept?                                |
1160; ------------------------------------------------------------------------------
1161; 
1162;  DESCRIPTION
1163;
1164;      Find out if an item calls for an accept.
1165;
1166;  CALLING SEQUENCE
1167;
1168;      (is-accept? item)
1169;
1170;      item           Arbitrary item [A -> alpha . beta , gamma].
1171;
1172;      *productions*  Global list of productions for our grammar.
1173;
1174;      Returns:       T if the item is of the form [S' -> S . , $]
1175;                     S is the start symbol --- the left hand side symbol 
1176;                     of the first production.  S' (represented as SP) is 
1177;                     the extra start symbol of the augmented grammar.
1178;  EXAMPLE
1179;
1180;      *productions* => ( (S -> C C) (C -> |c| C) (C -> |d|) )
1181;      (is-accept? '(SP -> S DOT |,| $)) => T
1182;
1183; ------------------------------------------------------------------------------
1184
1185(defun is-accept?( item )
1186
1187    (equal item `(SP -> ,(first (first *productions*)) DOT |,| $))
1188
1189)
1190
1191
1192
1193
1194; ------------------------------------------------------------------------------
1195; |                               symbol-after-dot!                            |
1196; ------------------------------------------------------------------------------
1197; 
1198;  DESCRIPTION
1199;
1200;      Find the symbol following the dot in an item.
1201;      
1202;
1203;  CALLING SEQUENCE
1204;
1205;      (symbol-after-dot! item)
1206;
1207;      item      Item of the form [A -> alpha . B beta , gamma ]         
1208;
1209;      Returns:  The symbol B, or NIL if there is none.
1210;
1211;  EXAMPLE
1212;
1213;      (symbol-after-dot! '(frogs -> are DOT keen |,| you bet)) => KEEN
1214;      (symbol-after-dot! '(toads -> are not)) => NIL
1215;
1216; ------------------------------------------------------------------------------
1217
1218(defun symbol-after-dot!( item )
1219
1220(cond ( (null item)                 nil )            ; No dot was ever found.
1221
1222; Return the symbol after the dot or nil if there is none.
1223
1224      ( (equal (first item) 'DOT)  (second item) ) 
1225
1226      ( T                          (symbol-after-dot! (rest item)))) ; Keep 
1227                                                                     ; looking.
1228)
1229
1230
1231
1232
1233; ------------------------------------------------------------------------------
1234; |                             terminal-after-dot?                            |
1235; ------------------------------------------------------------------------------
1236; 
1237;  DESCRIPTION
1238;
1239;      Check if an item has a terminal symbol after the dot.
1240;
1241;  CALLING SEQUENCE
1242;
1243;     (terminal-after-dot? item)
1244;
1245;     item       Any item [A -> alpha . beta delta , gamma]
1246;
1247;     Returns:   T if beta is a terminal symbol and NIL otherwise.
1248; 
1249;  EXAMPLE
1250;
1251;     (terminal-after-dot? '(C -> C DOT |c| |,| $ )) => T
1252;     (terminal-after-dot? '(C -> C DOT C |,| $ )) => NIL
1253;
1254; ------------------------------------------------------------------------------
1255
1256(defun terminal-after-dot?( item )
1257
1258(if (null (symbol-after-dot! item))         ; No symbol after the dot.
1259
1260    nil
1261
1262    (terminal? (symbol-after-dot! item)))   ; Check if the symbol after the dot
1263                                            ; is a terminal.
1264)
1265
1266
1267
1268
1269
1270; ------------------------------------------------------------------------------
1271; |                             epsilon-production?                            |
1272; ------------------------------------------------------------------------------
1273;
1274;  DESCRIPTION
1275;
1276;      Find out if a production derives only epsilon.
1277;     
1278;  CALLING SEQUENCE
1279;
1280;     (epsilon-production? production)
1281;
1282;     production    [A -> alpha].
1283;
1284;     Returns:      T if alpha = EPSILON.
1285;
1286;  EXAMPLE
1287;
1288;     (epsilon-production? '(A -> EPSILON)) => T
1289;
1290; ------------------------------------------------------------------------------
1291
1292(defun epsilon-production?( production )
1293
1294(and (equal (length production) 3)
1295     (equal (third production) 'EPSILON))
1296)
1297
1298
1299
1300
1301; ------------------------------------------------------------------------------
1302; |                               same-symbol?                                 |
1303; ------------------------------------------------------------------------------
1304;
1305; DESCRIPTION
1306;
1307;     Test if two tagged grammar symbols are equal.
1308;
1309; CALLING SEQUENCE
1310;
1311;     (same-symbol? s1 s2)
1312;
1313;     s1, s2     Tagged grammar symbols of the form (symbol tag), with
1314;                tag = 'EPSILON-FREE or NIL.
1315;
1316;     Returns    T if the symbol parts are equal.
1317
1318; EXAMPLE
1319;
1320;    (same-symbol? '(a NIL) '(a EPSILON-FREE)) => T
1321;    (same-symbol? '(a EPSILON-FREE) '(b EPSILON-FREE)) => NIL
1322;
1323; ------------------------------------------------------------------------------
1324
1325(defun same-symbol?( s1 s2 )
1326
1327(equal (first s1) (first s2))
1328
1329)
1330
1331
1332
1333; ------------------------------------------------------------------------------
1334; |                               first-alternate!                             |
1335; ------------------------------------------------------------------------------
1336;
1337;  DESCRIPTION
1338;
1339;      Return the first alternate of a production.
1340;
1341;  CALLING SEQUENCE
1342;
1343;      (first-alternate! rhs)
1344;
1345;      rhs        The right hand side of a production containing alternates:
1346;                 alpha / beta / ...
1347;
1348;      Returns:   alpha
1349;
1350;  EXAMPLE
1351;
1352;      (first-alternate! '(A B / C D)) => (A B)
1353;
1354; ------------------------------------------------------------------------------
1355
1356(defun first-alternate!( rhs )
1357
1358  (getHeadOfListUpTo '/ rhs)
1359)
1360
1361
1362
1363
1364; ------------------------------------------------------------------------------
1365; |                            all-but-first-alternate!                        |
1366; ------------------------------------------------------------------------------
1367;
1368;  DESCRIPTION
1369;
1370;      Return all but the first alternates of of a production.
1371;
1372;  CALLING SEQUENCE
1373;
1374;      (all-but-first-alternate! rhs)
1375;
1376;      rhs              The right hand side of a production, with alternates
1377;                       (alpha / beta / ...)
1378;
1379;      Returns:         (beta / ...) or NIL if rhs has only one alternate:  
1380;                       (alpha) 
1381;
1382;  EXAMPLE
1383;
1384;      (all-but-first-alternate! '(B C / D E / F)) => (D E / F)
1385;      (all-but-first-alternate! '(B C))  => NIL
1386;      (all-but-first-alternate! '(B C / D E / F))  => (D E / F)
1387;
1388; ------------------------------------------------------------------------------
1389
1390(defun all-but-first-alternate!( rhs )
1391
1392;  Get the first alternate, then strip it off.
1393
1394    (nthcdr (1+ (length (getHeadOfListUpTo '/ rhs))) rhs)
1395)
1396
1397
1398
1399
1400; ------------------------------------------------------------------------------
1401; |                              production-rhs!                               |
1402; ------------------------------------------------------------------------------
1403;
1404;  DESCRIPTION
1405;
1406;      Return the right hand side of a production.
1407;
1408;  CALLING SEQUENCE
1409;
1410;      (production-rhs! production)
1411;
1412;      production      Production of the form [A -> alpha].
1413;
1414;      Returns:        alpha 
1415;
1416;  EXAMPLE
1417;
1418;      (production-rhs! '(sandwich -> bread meat bread)) => (BREAD MEAT BREAD)
1419;
1420; ------------------------------------------------------------------------------
1421
1422(defun production-rhs!( production )
1423
1424(nthcdr 2 production)
1425)
1426
1427
1428
1429
1430; ------------------------------------------------------------------------------
1431; |                               string-before-comma!                         |
1432; ------------------------------------------------------------------------------
1433; 
1434;  DESCRIPTION
1435;
1436;     Return the symbols between the first symbol after the dot and the comma
1437;     in an item.
1438;
1439;  CALLING SEQUENCE
1440;
1441;      (string-before-comma! item)
1442;     
1443;      item       An item of the form [A -> alpha . B beta , gamma]
1444;
1445;      Returns:   beta or NIL if beta is empty.
1446;
1447;  EXAMPLE
1448;
1449;      (string-before-comma! '(A -> + DOT B * + + |,| a)) => (* + +)  
1450;      (string-before-comma! '(A -> + DOT B |,| a)) => NIL
1451;      (string-before-comma! '(A -> + DOT |,| a)) => NIL
1452;
1453; ------------------------------------------------------------------------------
1454
1455(defun string-before-comma!( item )
1456
1457(let ((temp (reverse (getHeadOfListUpTo 'DOT (reverse item)))))  ; Get everything past the 
1458                                                     ; dot.
1459
1460    (if (equal (first temp) '|,|)                    ; No symbol after the dot.
1461
1462        nil 
1463
1464        (getHeadOfListUpTo '|,| (rest temp))))                   ; Get everything past the
1465                                                     ; symbol after the dot
1466                                                     ; (which could be nil).
1467)
1468
1469
1470
1471
1472; ------------------------------------------------------------------------------
1473; |                               lookahead-of!                                |
1474; ------------------------------------------------------------------------------
1475; 
1476;  DESCRIPTION
1477;
1478;      Return the lookahead symbol of an item.
1479;      
1480;  CALLING SEQUENCE
1481;
1482;      (lookahead-of! item)
1483;
1484;      item      [A -> alpha . beta , gamma]
1485;
1486;      Returns:  gamma
1487;
1488;  EXAMPLE
1489;
1490;      (lookahead-of! '(SP -> DOT |d| |,| |c|)) => |c|
1491;
1492; ------------------------------------------------------------------------------
1493
1494(defun lookahead-of!( item )
1495
1496(cond ( (null item)                   nil )            ; Nothing at all.
1497
1498      ( (equal (first item) '|,|)     (second item))   ; Lookahead (or nothing)
1499                                                       ; follows the comma.
1500
1501      ( T                             (lookahead-of! (rest item)))) ; Search
1502)
1503
1504
1505
1506
1507; ------------------------------------------------------------------------------
1508; |                               split-up-production                          |
1509; ------------------------------------------------------------------------------
1510;
1511;  DESCRIPTION
1512;
1513;      Break a single production with alternates into separate productions.
1514;      
1515;  CALLING SEQUENCE
1516;
1517;      (split-up-production production)
1518;
1519;      production    [A -> alpha / beta / ... ]
1520;
1521;      Returns:      The list [A -> alpha], [A -> beta], ... 
1522;
1523;  EXAMPLE
1524;
1525;      (split-up-production '(A -> B C / D E / F))
1526;      => ( (A -> B C) (A -> D E) (A -> F) )
1527;
1528; ------------------------------------------------------------------------------
1529
1530(defun split-up-production( production )
1531
1532;  production = A -> B | C | ...
1533
1534(let ( (head `(,(first production) 
1535               ,(second production)))    ; Get the left hand side:  A ->
1536
1537       (tail (nthcdr 2 production))      ; Get the right hand side:  B | C | ...
1538
1539       (new-productions nil)
1540       (new-production nil) )
1541
1542    (loop  (if (null tail) (return))
1543      
1544        (setq new-production             ; A -> B, A -> C, etc. 
1545              (append head 
1546                      (first-alternate! tail)))
1547
1548; Strip off next list up to bar. 
1549
1550        (setq tail (all-but-first-alternate! tail)) 
1551
1552        (setq new-productions (append new-productions (list new-production))))
1553
1554new-productions)
1555)
1556
1557
1558
1559
1560; ------------------------------------------------------------------------------
1561; |                               split-up-productions                         |
1562; ------------------------------------------------------------------------------
1563;
1564;  DESCRIPTION
1565;
1566;      Break a list of productions with alternates into a list of separate 
1567;      productions.
1568;      
1569;  CALLING SEQUENCE
1570;
1571;      (split-up-productions production-list)
1572;
1573;      production-list [A -> alpha / beta / ... ] [B -> gamma / delta ...] ...
1574;
1575;      Returns:        The list [A -> alpha], [A -> beta], ... [B -> gamma] 
1576;                      [B -> delta], ... 
1577;  EXAMPLE
1578;
1579;      (split-up-productions '((S -> C C) (C -> |c| C / |d|)))
1580;      => ((S -> C C) (C -> |c| C) (C -> |d|))
1581;
1582; ------------------------------------------------------------------------------
1583
1584(defun split-up-productions( production-list )
1585
1586(let ((new-production-list nil)) 
1587
1588    (dolist (production production-list)      ; Split up each production.
1589
1590        (setq new-production-list             ; Add it to the growing list.
1591              (append new-production-list 
1592                      (split-up-production production))))
1593                     
1594    new-production-list)
1595)
1596
1597
1598
1599
1600; ------------------------------------------------------------------------------
1601; |                                  make-item                                 |
1602; ------------------------------------------------------------------------------
1603; 
1604;  DESCRIPTION
1605;
1606;      Create an item with a leading dot from a production and a 
1607;      lookahead symbol.
1608;      
1609;  CALLING SEQUENCE
1610;
1611;      (make-item production lookahead)
1612;
1613;      production   Any production [A -> alpha]
1614;
1615;      lookahead    Any lookahead symbol b.
1616;
1617;      Returns:     The item [A -> . alpha , b]
1618;
1619;  EXAMPLE
1620;
1621;      (make-item '(A -> B C) '|d|) => (A -> DOT B C |,| |d|)
1622;
1623; ------------------------------------------------------------------------------
1624
1625(defun make-item( production lookahead )
1626
1627`( ,(first production)          ;  Get the first symbol A.
1628   ,(second production)         ;  Get the arrow ->
1629    DOT                         ;  Add the leading dot.
1630   ,@(nthcdr 2 production)      ;  Add the right hand side of the production.
1631    |,|                         ;  Comma.
1632   ,lookahead )                 ;  Add the lookahead symbol last.
1633)
1634
1635
1636
1637
1638; ------------------------------------------------------------------------------
1639; |                                   move-dot-right                           |
1640; ------------------------------------------------------------------------------
1641; 
1642;  DESCRIPTION
1643;
1644;      Move the dot in an item to the right if possible.
1645;
1646;  CALLING SEQUENCE
1647;
1648;      (move-dot-right item)
1649;
1650;      item     [A -> alpha . X beta, b]
1651;
1652;      Returns: [A -> alpha X . beta, b]
1653;               If the item is of the form [A -> alpha . , b], we return it
1654;               unchanged.
1655;
1656;  EXAMPLE
1657;
1658;      (move-dot-right '( A -> B DOT C |,| D)) => ( A -> B C DOT |,| D)
1659;      (move-dot-right '( A -> B C DOT |,| D)) => ( A -> B C |,| DOT D)
1660;
1661; ------------------------------------------------------------------------------
1662
1663(defun move-dot-right( item )
1664
1665(cond ( (null item)      nil )
1666
1667      ( (equal (first item) 'DOT)         ; The item begins with a dot.
1668
1669            (if (null (second item))      ; item = DOT
1670
1671                item                      ; Leave it alone.
1672               
1673;  Move the dot right over the next symbol.  We change [ . b c d ] to [b . c d].
1674 
1675               `( ,(second item)          ; b
1676                  DOT                     ; Add a dot.
1677                 ,@(nthcdr 2 item))))     ; The remainder, (c d).
1678
1679      ( t           (cons (first item)    ; Item doesn't begin with a dot.
1680
1681                          (move-dot-right (rest item)))))
1682)
1683
1684
1685
1686
1687; ------------------------------------------------------------------------------
1688; |                               create-augmenting-item                       |
1689; ------------------------------------------------------------------------------
1690; 
1691;  DESCRIPTION
1692;
1693;      Create the accept item of the augmented grammar.
1694;
1695;  CALLING SEQUENCE
1696;
1697;      (create-augmenting-item)
1698;
1699;      *productions*   List of productions for this grammar.
1700;
1701;      Returns:        The item (SP -> DOT S |,| $) where S is the start 
1702;                      symbol:  the left hand side nonterminal of the first 
1703;                      production S -> alpha.
1704;  EXAMPLE
1705;
1706;     *productions* => ((E -> E T) (E -> id) (T -> id))
1707;
1708;     (create-augmenting-item) => (SP -> DOT E |,| $)
1709;
1710; ------------------------------------------------------------------------------
1711
1712(defun create-augmenting-item()
1713
1714;  Assume the first symbol of the left hand side of the first production is 
1715;  the start symbol.
1716
1717`(SP -> DOT ,(first (first *productions*)) |,| $)
1718
1719)
1720
1721
1722
1723
1724; ------------------------------------------------------------------------------
1725; |                               find-grammar-symbols                         |
1726; ------------------------------------------------------------------------------
1727; 
1728;  DESCRIPTION
1729;
1730;      Create all the grammar symbols (terminal and nonterminal) by looking
1731;      at the list of productions.
1732;
1733;  CALLING SEQUENCE
1734;
1735;      (find-grammar-symbols)
1736;
1737;      Returns:  List of grammar symbols.  Note we don't include the endmarker,
1738;                $ or the null string, EPSILON.
1739;
1740;  EXAMPLE
1741;
1742;      For productions S -> C C, S -> |c| C | d,
1743;
1744;      (find-grammar-symbols) => (S C |c| |d|)
1745;
1746; ------------------------------------------------------------------------------
1747
1748(defun find-grammar-symbols()
1749
1750(let ((symbols nil))
1751
1752;  Scan through all productions, collecting all terminals and nonterminals.
1753
1754    (dolist (production *productions*)
1755
1756        (setq symbols (append symbols (removeItemFromList '-> production))))
1757
1758
1759;  Remove duplicated elements which occur later in the sequence.  Remove any
1760;  EPSILON's introduced by epsilon productions, A -> EPSILON.
1761
1762(removeItemFromList 'EPSILON (remove-duplicates symbols :from-end T)))
1763
1764)
1765
1766
1767; ------------------------------------------------------------------------------
1768; |                            item-to-production                              |
1769; ------------------------------------------------------------------------------
1770; 
1771;  DESCRIPTION
1772;
1773;      Change an item to a production by removing the dot and lookahead part.
1774;
1775;  CALLING SEQUENCE
1776;
1777;      (item-to-production item)
1778;
1779;      item       [A -> alpha . beta , gamma]
1780;
1781;      Returns:   [A -> alpha beta]
1782;
1783;  EXAMPLE
1784;
1785;      (item-to-production '(rat -> on DOT rye |,| tail)) => (RAT -> ON RYE)
1786;
1787; ------------------------------------------------------------------------------
1788
1789(defun item-to-production( item )
1790
1791(removeItemFromList 'DOT (getHeadOfListUpTo '|,| item )))
1792
1793
1794
1795
1796; ------------------------------------------------------------------------------
1797; |                              production-number                             |
1798; ------------------------------------------------------------------------------
1799; 
1800;  DESCRIPTION
1801;
1802;      Return the number of a production.
1803;
1804;  CALLING SEQUENCE
1805;
1806;      (production-number production)
1807;
1808;      production     Any production.
1809;      *production*   List of productions for the grammar.
1810;
1811;      Returns:       The number of the production in the list *productions*.
1812;                     The first production is numbered 1.  Recall alternates
1813;                     of productions were split off into separate productions
1814;                     in the function load-input-and-initialize.
1815;  EXAMPLE
1816;
1817;      (production-number '(S -> C C)) => 1
1818;
1819; ------------------------------------------------------------------------------
1820
1821(defun production-number( production )
1822
1823(1+ (positionInList production *productions*))
1824
1825)
1826
1827
1828
1829
1830; ==============================================================================
1831; |                        First Derived Symbol Utilities                      |
1832; ==============================================================================
1833; |                                                                            |
1834; |       NOTE: In this section we will be using the sample grammar,           |
1835; |                                                                            |
1836; |            S -> A B                                                        |
1837; |            A -> C a | EPSILON                                              |
1838; |            B -> b                                                          |
1839; |            C -> c | EPSILON                                                |
1840; |                                                                            |
1841; |       with terminal symbols a b c                                          |
1842; |                                                                            |
1843; ==============================================================================
1844
1845
1846; ------------------------------------------------------------------------------
1847; |                                  tag-symbol                                |
1848; ------------------------------------------------------------------------------
1849;
1850;  DESCRIPTION
1851;
1852;      Convert a grammar symbol A to tagged form (A NIL).
1853;
1854;  CALLING SEQUENCE
1855;
1856;      (tag-symbol symbol)
1857;
1858;      symbol    A grammar symbol.
1859;    
1860;      Returns:  The list, (symbol NIL).
1861;
1862;  EXAMPLE
1863;
1864;      (tag-symbol 'a) => (A NIL)
1865;
1866; ------------------------------------------------------------------------------
1867
1868(defun tag-symbol( s )
1869
1870  `(,s NIL)
1871
1872)
1873
1874
1875
1876; ------------------------------------------------------------------------------
1877; |                              flag-epsilon-free!                            |
1878; ------------------------------------------------------------------------------
1879;
1880;  DESCRIPTION
1881;
1882;    Flag a tagged grammar symbol as coming from an epsilon-free derivation.
1883;
1884;  CALLING SEQUENCE
1885;
1886;      (flag-epsilon-free! tagged-symbol) 
1887;
1888;      tagged-symbol   Tagged grammar symbol of the form (symbol tag).
1889; 
1890;      Returns:        (symbol NIL)
1891;
1892;  EXAMPLE
1893;
1894;      (flag-epsilon-free! '(a nil)) => (A EPSILON-FREE)
1895;
1896; ------------------------------------------------------------------------------
1897
1898(defun flag-epsilon-free!( s )
1899
1900  `(,(first s) epsilon-free)
1901
1902)
1903
1904
1905
1906; ------------------------------------------------------------------------------
1907; |                                epsilon-free-only                           |
1908; ------------------------------------------------------------------------------
1909;
1910;  DESCRIPTION
1911;
1912;      Return only tagged grammar symbols with epsilon-free derivations.
1913;
1914;  CALLING SEQUENCE
1915;
1916;      (epsilon-free-only list)
1917;
1918;      list          List of tagged symbols, ( (s1 tag1) ... )
1919;
1920;      Returns:      Only those symbols in the list for which 
1921;                    tag = 'epsilon-free
1922;  EXAMPLE
1923;
1924;      (epsilon-free-only '((a NIL) (b EPSILON-FREE) (c NIL))) 
1925;           => ((B EPSILON-FREE))
1926;
1927; ------------------------------------------------------------------------------
1928
1929(defun epsilon-free-only( l )
1930
1931(cond ( (null l) nil )
1932
1933      ( (equal (second (first l))    ;  Keep this symbol:  It is epsilon-free.
1934               'epsilon-free)
1935                              (cons (first l) (epsilon-free-only (rest l))))
1936
1937      ( t   (epsilon-free-only (rest l))))   ; Discard this symbol.
1938)
1939
1940
1941
1942; ------------------------------------------------------------------------------
1943; |                                  untag-list                                |
1944; ------------------------------------------------------------------------------
1945;
1946;  DESCRIPTION
1947;
1948;      Remove the tags from a list of tagged grammar symbols.
1949;
1950;  CALLING SEQUENCE
1951;
1952;      (untag-list list)
1953;
1954;      list          List of tagged grammar symbols, 
1955;                    ((a NIL) (b EPSILON-FREE) ... ).
1956; 
1957;      Returns:      Untagged symbols, (a b ...).
1958;
1959;  EXAMPLE
1960;
1961;      (untag-list '( (a NIL) (b EPSILON-FREE) (c nil))) => (A B C)
1962;
1963; ------------------------------------------------------------------------------
1964
1965(defun untag-list( list )
1966
1967    (mapcar #'car list)
1968
1969)
1970
1971
1972
1973; ------------------------------------------------------------------------------
1974; |                             flag-non-epsilon-free                          |
1975; ------------------------------------------------------------------------------
1976;
1977;  DESCRIPTION
1978;
1979;      Flag a list of tagged symbols as being all not epsilon-free derived.
1980;
1981;  CALLING SEQUENCE
1982;
1983;      (flag-epsilon-free tagged-list)
1984;
1985;      tagged-list      Tagged list of grammar symbols, ( (s1 tag1) ... )
1986;
1987;      Returns:         List with all tags set to NIL, ( (s1 NIL) ... )
1988;
1989;  EXAMPLE
1990;
1991;      (flag-non-epsilon-free '((a epsilon-free) (b nil) (c epsilon-free))) =>
1992;         ((A NIL) (B NIL) (C NIL))
1993;
1994; ------------------------------------------------------------------------------
1995
1996(defun flag-non-epsilon-free( s )
1997
1998    ; Apply an anonymous function to all elements of the list.
1999    (mapcar #'(lambda (x) (cons (first x) '(NIL)))  s)
2000)
2001
2002
2003
2004; ------------------------------------------------------------------------------
2005; |                                  precedence                                |
2006; ------------------------------------------------------------------------------
2007;
2008; DESCRIPTION
2009;
2010;     Compare two tagged symbols and return the one of higher precedence.
2011;
2012; CALLING SEQUENCE
2013;
2014;     (precedence s1 s2)
2015;
2016;     s1, s2     Tagged grammar symbols of the form (s1 tag1), (s2 tag2) with
2017;                tag1, tag2 = 'EPSILON-FREE or NIL.
2018;
2019;     Returns    (s1 tag1) if tag1 = EPSILON-FREE, (s2 tag2) otherwise.
2020;
2021; EXAMPLE
2022;
2023;     (precedence '(a NIL) '(b EPSILON-FREE)) => (B EPSILON-FREE)
2024;     (precedence '(a EPSILON-FREE) '(b EPSILON-FREE)) => (A EPSILON-FREE)
2025;
2026; ------------------------------------------------------------------------------
2027
2028(defun precedence( s1 s2 )
2029
2030(cond ((equal (second s1) 'epsilon-free)  s1)   ; Return the epsilon-free one.
2031      ( t                                 s2))
2032)
2033
2034
2035
2036; ------------------------------------------------------------------------------
2037;
2038;  DESCRIPTION
2039;
2040;      Return the derived leading terminal of a production.
2041;      e.g. the derived leading terminal a of the production X -> a Y
2042;
2043;  CALLING SEQUENCE
2044;
2045;     (derived-leading-terminal production)
2046;
2047;     production      [A -> a beta]
2048;
2049;     Returns:        a
2050;
2051;  EXAMPLE
2052;
2053;     (derived-leading-terminal '(A -> + B A)) =>
2054;
2055; ------------------------------------------------------------------------------
2056
2057(defun derived-leading-terminal( production )
2058
2059(third production)
2060
2061)
2062
2063
2064
2065; ------------------------------------------------------------------------------
2066; |                             first-terminals-of-rhs                         |
2067; ------------------------------------------------------------------------------
2068;
2069;  DESCRIPTION
2070;
2071;      Find the ith approximation of the first derived terminals of the right 
2072;      hand side of a production.
2073;
2074;  CALLING SEQUENCE
2075; 
2076;      (first-terminals-of-rhs rhs hash-table)
2077;
2078;      rhs         The right hand side Y1 ... Yn of the production.
2079;
2080;      hash-table  The current approximation to the FIRST() function for
2081;                  the non-terminals.  FIRST of the terminals is already exact.
2082;
2083;      Returns:    The first derived terminals of the string Y1 ... Yn.   We
2084;                  tag the epsilon-free first derived symbols.
2085;
2086;  METHOD
2087;
2088;      F ( X ) = F( Y1 ) +  ... +  F( Yn )
2089;       i                 1      1
2090;
2091;  EXAMPLE
2092;
2093;      Assume we have just called initial-first-derived-terminals, so that
2094;      hash-table contains the zeroth approximation to FIRST.
2095;
2096;      (first-terminals-of-rhs '(A B) hash-table) => ( (b NIL) )
2097;      because S => A B => EPSILON b => b, which is not in EFF().
2098;
2099; ------------------------------------------------------------------------------
2100
2101(defun first-terminals-of-rhs( rhs hash-table )
2102
2103; Compute FIRST( Y1 ) and FIRST( Y1 ) - EPSILON.
2104
2105(let* ((first-terms (gethash (first rhs) hash-table))   
2106       (first-terms-minus-epsilon (removeItemFromList '(EPSILON NIL) first-terms
2107                                                 :equalityTest 'same-symbol?)))
2108    (cond ( (null rhs) NIL)
2109
2110;  If we have the case A -> alpha beta with FIRST( alpha ) = {}, we want to 
2111;  return {}.  
2112
2113          ( (null first-terms) nil ) 
2114
2115
2116; If epsilon is in FIRST( Y1 ) , add all non-epsilon symbols in FIRST( Y1 ) 
2117; to the first derived terminals in the rest of the list.  Flag all these new
2118; symbols as epsilon-derived.  If there are duplicated symbols, keep only
2119; the epsilon-free ones.
2120
2121          ( (itemInList '(EPSILON NIL) first-terms :test 'same-symbol?) 
2122
2123                (combine first-terms-minus-epsilon 
2124                         (flag-non-epsilon-free 
2125                             (first-terminals-of-rhs (rest rhs) hash-table))
2126                         :test 'same-symbol? :precedence 'precedence))
2127
2128
2129; Otherwise, Y1 has only non-epsilon terminals.  Return FIRST( Y1 ).  Whether
2130; these symbols are epsilon-free depends on their previous flags.
2131
2132      ( t     first-terms )))
2133)
2134
2135
2136
2137; ------------------------------------------------------------------------------
2138; |                            update-first-derived-function                   |
2139; ------------------------------------------------------------------------------
2140;
2141;  DESCRIPTION
2142;
2143;      Update the first derived terminals function to create a new version.
2144;
2145;  CALLING SEQUENCE
2146;
2147;      (update-first-derived-function hash-table update-hash-table)
2148;
2149;      hash-table         Old hash table.
2150;
2151;      update-hash-table  Changes to the old table.  If an entry is NIL,
2152;                         it indicates no change is to be made to hash-table.
2153; 
2154;      Returns:           Updated hash-table of the first derived terminals.
2155;
2156;  EXAMPLE
2157;
2158;      (update-first-derived-function) => Updated hash table.
2159;
2160; ------------------------------------------------------------------------------
2161
2162(defun update-first-derived-function( hash-table update-hash-table )
2163
2164; Update only changes to nonterminals because the FIRST of a terminal symbol 
2165; does not change.
2166
2167(dolist (symbol (find-grammar-symbols))
2168
2169(if (nonterminal? symbol)
2170
2171    (if (not (null (gethash symbol update-hash-table)))
2172
2173        (setf (gethash symbol hash-table)
2174              (gethash symbol update-hash-table)))))
2175)
2176
2177
2178
2179; ------------------------------------------------------------------------------
2180; |                           initial-first-derived-terminals                  |
2181; ------------------------------------------------------------------------------
2182;
2183;  DESCRIPTION
2184;
2185;      The zeroth approximation to the first derived terminals function.
2186;      It is exact for all terminals and epsilon.
2187;
2188;  CALLING SEQUENCE
2189;
2190;      (initial-first-derived-terminals hash-table :type type)
2191; 
2192;      *terminals*    List of all the terminal symbols including the endmarker $
2193;
2194;      hash-table     Empty, but allocated hash table.
2195;
2196;      Returns:       Hash table of the zeroth approximation to the first 
2197;                     derived terminals function for every grammar symbol.
2198;                     Epsilon-free first derived symbols are tagged with the
2199;                     flag 'epsilon-free.
2200;
2201;  EXAMPLE
2202;
2203;      (setq F0 (make-hash-table :size 100))
2204;      (setq F0 (initial-first-derived-terminals F0)) 
2205;      (gethash 'A F0) => ( (EPSILON NIL) )
2206;
2207;      Grammar     FIRST                 Grammar     FIRST
2208;      Symbol                            Symbol
2209;      -------------------               -------------------
2210;      S           NIL                   |a|         ((|a| EPSILON-FREE))
2211;      A           ((EPSILON NIL))       |b|         ((|b| EPSILON-FREE))
2212;      B           ((|b| EPSILON-FREE))  |c|         ((|c| EPSILON-FREE))
2213;      C           ((|c| EPSILON-FREE)   EPSILON     ((EPSILON NIL))
2214;                   (EPSILON NIL))       $           (($ EPSILON-FREE))
2215;
2216;      |c| is flagged as being in EFF( C ). EPSILON is in FIRST( C ) but
2217;      not in EFF( C ).
2218;
2219; ------------------------------------------------------------------------------
2220
2221(defun initial-first-derived-terminals( hash-table )
2222
2223(let ( (first-symbols nil)
2224       (nonterm nil) 
2225       (new-symbol nil) )
2226
2227;  FIRST( X ) = { (X 'epsilon-free) } if X is a terminal.
2228
2229    (dolist (terminal *terminals*)
2230
2231        (setf (gethash terminal hash-table) 
2232              (list (flag-epsilon-free! (tag-symbol terminal)))))
2233
2234
2235;  FIRST( EPSILON ) = { (EPSILON NIL) }.
2236
2237        (setf (gethash 'EPSILON hash-table) (list (tag-symbol 'EPSILON)))
2238
2239
2240;      Every nonterminal appears as the left hand side of some production.
2241;  Thus we can scan through the productions to define FIRST( A ) for every
2242;  nonterminal A.
2243;      Compute the zeroth approximation to FIRST().  Look for a production of
2244;  the form A -> a alpha, where a is a nonterminal.  Find the entry for
2245;  A in the table, and add a to it.  
2246;      We tag productions of the form A -> EPSILON as not being epsilon-free
2247;  derivations.
2248
2249    (dolist (production *productions*)
2250
2251        (cond ( (or (derives-leading-terminal? production)
2252                    (epsilon-production? production))
2253
2254                      (setq nonterm (first production))
2255
2256                      (setq first-symbols (gethash nonterm hash-table))
2257   
2258; Get a or EPSILON. 
2259                      (setq new-symbol 
2260                            (tag-symbol (derived-leading-terminal production)))
2261
2262; Flag a as an epsilon-free derivation, but EPSILON as not.
2263
2264                      (if (not (epsilon-production? production))
2265                          (setq new-symbol (flag-epsilon-free! new-symbol)))
2266
2267; Add a to FIRST( A ).
2268
2269                      (setq first-symbols 
2270                            (insertItemIntoList new-symbol first-symbols))
2271
2272                      (setf (gethash nonterm hash-table) first-symbols))))
2273    hash-table)
2274)
2275
2276
2277
2278; ------------------------------------------------------------------------------
2279; |                        create-all-first-derived-terminals                  |
2280; ------------------------------------------------------------------------------
2281;
2282;  DESCRIPTION
2283;
2284;      Create a hash table of all first derived terminals for every grammar
2285;      symbol.
2286;
2287;  CALLING SEQUENCE
2288;
2289;      (create-all-first-derived-terminals)
2290;
2291;      Returns:  A hash table of the first derived terminals for every grammar 
2292;                symbol, including EPSILON and $.  Flag the epsilon-free 
2293;                derived terminals.
2294;  METHOD
2295;
2296;      Successive approximation by transitive closure.
2297;
2298;  EXAMPLE
2299;
2300;      (setq h (create-all-first-derived-terminals)) => #<Hash-Table 8EDA53>
2301;
2302;      (gethash 'S h) => ((|a| NIL) (|c| EPSILON-FREE) (|b| NIL))
2303;      i.e. FIRST( S ) = { |a| |b| |c| }, but EFF( S ) = { |c| }
2304;
2305;
2306;      Grammar     FIRST                     Grammar     FIRST
2307;      Symbol                                Symbol
2308;      --------------------------------      --------------------------------
2309;      S           ((|a| NIL)                |a|         ((|a| EPSILON-FREE))
2310;                   (|b| NIL)                |b|         ((|a| EPSILON-FREE))
2311;                   (|c| EPSILON-FREE))      |c|         ((|c| EPSILON-FREE))
2312;      A           ((EPSILON NIL)            EPSILON     ((EPSILON NIL))
2313;                   (|a| NIL)                $           (($ EPSILON-FREE))
2314;                   (|c| EPSILON-FREE))
2315;      B           ((|b| EPSILON-FREE))  
2316;      C           ((|c| EPSILON-FREE)  
2317;                   (EPSILON NIL))       
2318;
2319; ------------------------------------------------------------------------------
2320
2321(defun create-all-first-derived-terminals() 
2322
2323;  Initialize the hash table.  The size is extensible at run time.
2324
2325(let ( (hash-table        (make-hash-table :size +initial-hash-table-size+))  
2326       (update-hash-table (make-hash-table :size +initial-hash-table-size+)) 
2327       (nonterm nil) 
2328       (new-first-symbols nil) 
2329       (old-first-symbols nil) 
2330       (change-flag T) )
2331
2332;  Create the zeroth approximation to FIRST(), accurate for all terminals.
2333
2334       (initial-first-derived-terminals hash-table)
2335
2336
2337;  Loop until no more changes occur in the approximation to FIRST.
2338
2339    (loop  
2340
2341        (setq change-flag nil)  
2342
2343
2344;  Compute FIRST[i+1](A) for all the nonterminals A.
2345
2346        (dolist (production *productions*)     ; Scan all productions A -> alpha
2347
2348            (setq nonterm (first production))  ; A
2349
2350
2351;  FIRST[i+1]( A ) = 
2352;  first terminal of( FIRST[i]( Y1 ) ... FIRST[Yn]) U FIRST[i]( A ).
2353
2354
2355            (setq old-first-symbols (gethash nonterm hash-table)) 
2356
2357            (setq new-first-symbols 
2358
2359                  (combine old-first-symbols         ; FIRST[i](A)
2360
2361                           (first-terminals-of-rhs (production-rhs! production)
2362                                                 hash-table)
2363                           :test 'same-symbol? :precedence 'precedence))
2364
2365;  Record if any changes occurred, and save FIRST[i+1]( A ) in a separate 
2366;  update hash table.
2367
2368            (cond ((not (equal-sets-of-items? new-first-symbols 
2369                                              old-first-symbols))
2370
2371                         (setq change-flag T)
2372
2373                         (setf (gethash nonterm update-hash-table) 
2374                               new-first-symbols))))
2375
2376;  Add updates to the old hash table for FIRST[i]() to create FIRST[i+1](),
2377;  then clear out the update hash table.
2378
2379        (update-first-derived-function hash-table update-hash-table)
2380
2381        (clrhash update-hash-table)
2382
2383        (if (null change-flag) (return)))    ;  No more changes --- exit.
2384
2385;  Return the hash table of first derived terminals for every grammar symbol.
2386
2387hash-table)
2388
2389)
2390
2391
2392
2393; ------------------------------------------------------------------------------
2394; |                            first-terminals-of-symbol                       |
2395; ------------------------------------------------------------------------------
2396;
2397;  DESCRIPTION
2398;
2399;      Return a list of the first derived terminals of a grammar symbol.
2400;
2401;  CALLING SEQUENCE
2402;
2403;      (first-terminals-of-symbol s)
2404; 
2405;      s                            Any grammar symbol (or EPSILON or $).
2406;
2407;      *first-derived-terminals*    Hash table of the first derived terminals
2408;                                   for every grammar symbol.  
2409;
2410;      *epsilon-free-first-derived-terminals*    
2411;
2412;                                   Hash table of the epsilon-free first 
2413;                                   derived terminals for every grammar symbol.
2414;
2415;      type                         Defaults to NIL for computing FIRST() and
2416;                                   equals 'epsilon-free for computing EFF().
2417;
2418;      Returns:                     List of first derived terminals of s.
2419;                                   If type = 'epsilon-free, return the 
2420;                                   epsilon-free first derived terminals
2421;                                   instead.
2422;
2423;                                   Creates *first-derived-terminals* and
2424;                                   *epsilon-free-first-derived-terminals*
2425;                                   if they do not already exist.
2426;  EXAMPLE
2427;
2428;      (first-terminals-of-symbol 'S)  
2429;             => (|a| |c| |b|)
2430;      (first-terminals-of-symbol 'S :type 'epsilon-free) 
2431;             => (|c|)
2432;
2433; ------------------------------------------------------------------------------
2434
2435(defun first-terminals-of-symbol( symbol &key (type NIL) )
2436
2437;  Create the hash tables for FIRST() and EFF() if they do not exist.
2438
2439(cond ( (or (null *first-derived-terminals*)
2440            (null *epsilon-free-first-derived-terminals*))
2441
2442         ;  Sort out first derived terminals from epsilon-free first derived terminals.
2443
2444         (setq *first-derived-terminals* 
2445             (make-hash-table :size +initial-hash-table-size+))
2446         (setq *epsilon-free-first-derived-terminals* 
2447              (make-hash-table :size +initial-hash-table-size+))
2448
2449         (let ( (old-hash-table (create-all-first-derived-terminals)) )
2450
2451             (dolist (symbol (cons '$ (find-grammar-symbols)))
2452
2453             (setf (gethash symbol *first-derived-terminals*)
2454                   (untag-list (gethash symbol old-hash-table)))
2455
2456             (setf (gethash symbol *epsilon-free-first-derived-terminals*)
2457                   (untag-list 
2458                      (epsilon-free-only 
2459                       (gethash symbol old-hash-table))))))))
2460
2461
2462
2463    ;  Return FIRST() or EFF() depending on the customer's request.
2464
2465    (if (equal type 'epsilon-free)
2466
2467        (gethash symbol *epsilon-free-first-derived-terminals*)
2468
2469        (gethash symbol *first-derived-terminals*))
2470)
2471
2472
2473
2474; ------------------------------------------------------------------------------
2475; |                            first-derived-terminals                         |
2476; ------------------------------------------------------------------------------
2477;
2478;  DESCRIPTION
2479;
2480;      Return a list of the first derived terminals of a grammar string.
2481;
2482;  CALLING SEQUENCE
2483;
2484;      (first-derived-terminals string :type type)
2485;
2486;      string     A list of grammar symbols, X1 ... Xn
2487;
2488;      type       NIL by default.
2489;
2490;      Returns:   First derived terminals of the list, FIRST( X1 ... Xn )
2491;                 if type = NIL, but EFF( X1 ... Xn ) if type = 'epsilon-free.
2492;
2493;  METHOD
2494;
2495;      FIRST( X1 ... Xn ) = FIRST( X1 ) + ... +  FIRST( Xn )
2496;                                        1     1
2497;
2498;      EFF( X1 ... Xn ) = EFF( X1 ) +  FIRST( X2 ... Xn )
2499;                                    1
2500;
2501;  EXAMPLE
2502;
2503;      (first-derived-terminals '(S A)) => (|a| |c| |b|)
2504;      (first-derived-terminals '(S A) :type 'epsilon-free) => (|c|)
2505;
2506;      because FIRST( S ) = (|a| |b| |c|) and FIRST( A ) = (|a| |c| EPSILON)
2507;      and |c| is the only terminal with a non-epsilon derivation,
2508;      S => A B => C a b => c a b.  |b|, for example has only the derivation
2509;      S => A B => A b => EPSILON b = b, in which we must replace a leading
2510;      non-terminal A with EPSILON.
2511;     
2512;      (first-derived-terminals '(A B)) => (|c| |a| |b|)
2513;      because FIRST( B ) = { |b| }
2514;      
2515;      (first-derived-terminals '(A B)) => (|c|)
2516;
2517; ------------------------------------------------------------------------------
2518
2519(defun first-derived-terminals( string &key (type NIL) )
2520
2521; We want FIRST( EPSILON ) = (EPSILON) and EFF( EPSILON ) = NIL.
2522
2523(cond ( (null string) (if (equal type 'epsilon-free)
2524                          NIL
2525                          (list 'EPSILON)))
2526
2527; If EPSILON is in FIRST( Y1 ), add all non-epsilon terminals of FIRST( Y1 ).
2528; If we are computing EFF(), we do EFF( Y1 ) instead.
2529
2530      ( (itemInList 'EPSILON 
2531                     (first-terminals-of-symbol (first string) :type type))
2532
2533          (union (removeItemFromList 'EPSILON 
2534                                (first-terminals-of-symbol (first string)))
2535                 (first-derived-terminals (rest string))
2536                 :test #'equal
2537          )
2538      )
2539
2540; Otherwise, return the non-epsilon symbols of FIRST( Y1 ) or of EFF( Y1 ).
2541
2542      (t  (first-terminals-of-symbol (first string) :type type)))
2543)
2544
2545
2546
2547
2548; ==============================================================================
2549; |                    Item functions:  closure, goto, cores, etc.             |
2550; ==============================================================================
2551
2552
2553; ------------------------------------------------------------------------------
2554; |                                 closure                                    |
2555; ------------------------------------------------------------------------------
2556; 
2557;  DESCRIPTION
2558;
2559;      Return the closure of a list of items.
2560;
2561;  CALLING SEQUENCE
2562;
2563;      (closure set-of-items)
2564;
2565;      *productions*  Global list of productions.
2566;      set-of-items
2567;
2568;      Returns:       For each item [A -> alpha . B beta , a] in the set, add 
2569;                     [B -> . gamma , b] where (B -> gamma) is a production
2570;                     and b is the first derived symbol of the string "beta a".
2571;  METHOD
2572;
2573;      Intuitively, we saw alpha already and expect to see B next.  
2574;      That is, we expect to see any string of terminals gamma derived from B.
2575;      The next symbol we expect is the lookahead, which is the first
2576;      derived terminal symbol of the string beta a.
2577;
2578;  EXAMPLE
2579;
2580;      (closure '( (SP -> DOT S |,| $) ) ) =>
2581;
2582;      ( (SP -> DOT S     |,|  $ )
2583;        (S  -> DOT C C   |,|  $ )
2584;        (C  -> DOT |c| C |,| |c|)
2585;        (C  -> DOT |c| C |,| |d|)
2586;        (C  -> DOT |d|   |,| |c|)
2587;        (C  -> DOT |d|   |,| |d|) )
2588;
2589;
2590; ------------------------------------------------------------------------------
2591
2592(defun closure( item-list )
2593
2594(let ((closed-item-list item-list)             ; Closure of item-list.
2595      (item-num      -1)                       ; nth item in item-list.
2596      (nonterm      nil)                       ; B
2597      (first-syms    nil)                      ; FIRST[ beta a ]
2598      (item          nil)                      ; Current item in item-list.
2599      (new-item      nil))                     ; [B -> . gamma, b]
2600
2601    (loop                                      ; Loop over each item.
2602
2603        (setq item-num (1+ item-num))                ; Advance to next item.
2604
2605        (setq item (nth item-num closed-item-list))  ; Get current item,
2606                                                     ; [A -> alpha . B beta , a]
2607
2608        (if (null item)  (return))                   ; End of the list.
2609
2610        (setq nonterm (symbol-after-dot! item))      ; Get B.
2611
2612
2613        (if (nonterminal? nonterm)                   ; B is nonterminal.
2614
2615            (dolist (production *productions*)
2616
2617                (cond ((valid-production? nonterm    ; production = [B -> gamma]
2618                                       production) 
2619
2620                     (setq first-syms                ; FIRST[ beta a ]
2621                           (first-derived-terminals
2622                                 `(,@(string-before-comma! item)   ; Get beta.
2623                                   ,(lookahead-of! item))))        ; Get a.
2624
2625                    (dolist (lookahead first-syms)   ; for each b in 
2626                                                     ; FIRST[ beta a ]
2627
2628                        (setq new-item               ; [ B -> . gamma , b ]
2629                              (make-item production 
2630                                         lookahead))
2631
2632                        (setq closed-item-list       ; Add to end of list.
2633                              (insertItemIntoList new-item 
2634                                                closed-item-list))))))))
2635    closed-item-list)
2636)
2637
2638
2639
2640
2641; ------------------------------------------------------------------------------
2642; |                                  compute-goto                              |
2643; ------------------------------------------------------------------------------
2644; 
2645;  DESCRIPTION
2646;
2647;      Compute the goto on a set-of-items and a grammar symbol.
2648;
2649;  CALLING SEQUENCE
2650;
2651;      (compute-goto set-of-items grammar-symbol)
2652;
2653;      set-of-items      Set of items I.
2654;
2655;      grammar-symbol    Grammar symbol X.
2656;
2657;      Returns:          Goto function GOTO( I, X ) defined as follows:
2658;
2659;                        For all items [A -> alpha . X beta , a] in I, 
2660;                        add together the closures of [A -> alpha X . beta, a]. 
2661;  EXAMPLE
2662;
2663;      (compute-goto '( (SP -> DOT s   |,| $)
2664;                       ( S -> DOT c c |,| $)
2665;                       ( c -> |c| c   |,| c)
2666;                       ( c -> |c| c   |,| d)
2667;                       ( c -> d       |,| c)
2668;                       ( c -> d       |,| d))
2669;
2670;                     'S) => ( (SP -> S DOT |,| $))
2671;
2672; ------------------------------------------------------------------------------
2673
2674(defun compute-goto( set-of-items grammar-symbol )
2675
2676    (let ( (new-set nil) )
2677
2678        (dolist (item set-of-items)  
2679
2680            ; Examine each of the form [A -> alpha . X beta, a]
2681            (if (equal (symbol-after-dot! item)
2682                        grammar-symbol)                 
2683
2684                ;  Add [A -> alpha X . beta , a ] if not already there.
2685                (setq new-set
2686                      (insertItemIntoList (move-dot-right item) 
2687                                    new-set))
2688            )
2689        )
2690    
2691        ; Closure of the new list.
2692        (closure new-set)
2693    )
2694)
2695
2696
2697
2698
2699; ==============================================================================
2700; |                        Goto Graph Functions                                |
2701; ==============================================================================
2702
2703
2704; ------------------------------------------------------------------------------
2705; |                              create-new-node                               |
2706; ------------------------------------------------------------------------------
2707;
2708; DESCRIPTION
2709;
2710;     Create a new node in the goto graph given the data.
2711;
2712; CALLING SEQUENCE
2713;
2714;     (create-new-node state hash-value set-of-items)
2715;
2716;     hash-value Hash value of core of items.
2717;
2718;     Returns:   
2719;
2720; EXAMPLE
2721;
2722;     (create-new-node 1 3668 '((SP -> S DOT |,| $)) 3648)
2723;     =>  (1 3668 ((SP -> S DOT |,| $)))
2724;
2725; ------------------------------------------------------------------------------
2726
2727(defun create-new-node( current-state 
2728                        hash-value-of-core-of-items
2729                        list-of-items )
2730
2731    `(,current-state ,hash-value-of-core-of-items ,list-of-items)
2732)
2733
2734(defun create-new-link( current-state 
2735                        grammar-symbol
2736                        next-state)
2737
2738    `(,current-state ,grammar-symbol ,next-state)
2739)
2740
2741
2742; ------------------------------------------------------------------------------
2743; |                             select-items!                                  |
2744; ------------------------------------------------------------------------------
2745;
2746;  DESCRIPTION
2747;
2748;      Select out the set of items in a node of the goto graph.
2749;
2750;  CALLING SEQUENCE
2751;
2752;     (select-items! node)
2753;
2754;     node        A node in the goto graph, (i2 i1 X SET-OF-ITEMS)
2755;
2756;     Returns:    SET-OF-ITEMS
2757;
2758;  EXAMPLE
2759;
2760;     (select-items! 
2761;       '( 0         
2762;          3668     
2763;         (
2764;            (SP -> DOT S           |,|  $)
2765;            ( S -> DOT S |a| S |b| |,|  $) 
2766;         )
2767;       )   
2768;     )
2769;
2770;     => 
2771;
2772; ------------------------------------------------------------------------------
2773
2774(defun select-items!( node )
2775
2776(third node)
2777
2778)
2779
2780
2781(defun hash-value!( node )
2782  (second node)
2783)
2784
2785(defun links!( goto-graph )
2786  (first goto-graph)
2787)
2788
2789(defun nodes!( goto-graph )
2790  (second goto-graph)
2791)
2792
2793(defun nth-node!( node-num goto-graph )
2794
2795  (nth node-num (nodes! goto-graph))
2796
2797)
2798
2799(defun first-node( goto-graph )
2800    (first (nodes! goto-graph))
2801)
2802
2803(defun rest-node( goto-graph )
2804    (rest (nodes! goto-graph))
2805)
2806
2807(defun insert-node( node goto-graph )
2808
2809    `( ,(links! goto-graph)
2810       ,(insertItemIntoList node
2811                            (nodes! goto-graph))
2812     )
2813)
2814         
2815(defun insert-link( link goto-graph )
2816
2817    `( ,(insertItemIntoList link (links! goto-graph))
2818       ,(nodes! goto-graph)
2819     )
2820)
2821
2822
2823
2824
2825; ------------------------------------------------------------------------------
2826; |                                current-state!                              |
2827; ------------------------------------------------------------------------------
2828; 
2829;  DESCRIPTION
2830;
2831;      Get the number of a node in the goto graph.
2832;
2833;  CALLING SEQUENCE
2834;
2835;      (current-state! node)
2836;
2837;      node      A node in the goto graph, (i1 i2 X SET-OF-ITEMS))
2838;
2839;      Returns:  i1
2840;
2841;  EXAMPLE
2842;
2843;      (current-state! '(1 0 S ((SP -> S DOT |,| $)))) => 1
2844;
2845; -----------------------------------------------------------------------------
2846
2847(defun current-state!( node )
2848
2849(first node)
2850
2851)
2852
2853
2854
2855
2856; ------------------------------------------------------------------------------
2857; |                              transition-symbol!                            |
2858; ------------------------------------------------------------------------------
2859;
2860;  DESCRIPTION
2861;
2862;      Return the symbol upon which an action is performed.
2863;
2864;  CALLING SEQUENCE
2865;
2866;      (transition-symbol! node)
2867;
2868;      node      A node in the goto graph, (i1 i2 X SET-OF-ITEMS))
2869;
2870;      Returns:  X
2871;
2872;  EXAMPLE
2873;
2874;      (transition-symbol! '(1 0 S ((SP -> S DOT |,| $)))) => S
2875;
2876; ------------------------------------------------------------------------------
2877
2878(defun transition-symbol!( node )
2879
2880(third node)
2881
2882)
2883
2884
2885; ------------------------------------------------------------------------------
2886; |                               set-of-items-in-graph?                       |
2887; ------------------------------------------------------------------------------
2888; 
2889;  DESCRIPTION
2890;
2891;      Find out if a goto graph contains a given a set of items (or 
2892;      their cores).
2893;
2894;  CALLING SEQUENCE
2895;
2896;      (set-of-items-in-graph? set-of-items goto-graph :compare-type type)
2897;
2898;      set-of-items    Any set of items.
2899;
2900;      goto-graph      The goto graph of the grammar.
2901;
2902;      type            Optional keyword.  If omitted, it defaults to 'item.
2903;
2904;      Returns         T if any node in goto-graph has the same set of items
2905;                      (for type = 'item) or the same core (for type = 'core) 
2906;                      as set-of-items.
2907;
2908;  EXAMPLE
2909;                     
2910;      (set-of-items-in-graph? 
2911;              '( (a -> b DOT c |,| d) (e -> f DOT g |,| h) )
2912;              '( ( (-1 nil 0) (0 a 1) )
2913;                 ( (0 12 ( (a -> b DOT c |,| ddd) (e -> f DOT g |,| h) ) )
2914;                   (1 23 ( (i -> j DOT |,| k) ) ) )) 
2915;      ) => nil
2916;
2917;      but
2918;
2919;      (set-of-items-in-graph? 
2920;              '( (a -> b DOT c |,| d) (e -> f DOT g |,| h) )
2921;              '( ( (-1 nil 0) (0 a 1) )
2922;                 ( (0 12 ( (a -> b DOT c |,| d) (e -> f DOT g |,| h) ) )
2923;                   (1 23 ( (i -> j DOT |,| k) ) ) )) 
2924;              :compare-type 'core) => T
2925;
2926;
2927; ------------------------------------------------------------------------------
2928
2929(defun set-of-items-in-graph?( set-of-items goto-graph 
2930                               &key (compare-type 'item) )
2931
2932(cond ( (null goto-graph)              nil)   ; goto graph = ()
2933      ( (null (first-node goto-graph)) nil)   ; goto graph = ( (...) () )
2934
2935      ( t
2936
2937           ; Scan all nodes in the goto-graph, looking for one which has
2938           ; a matching item.
2939           (dolist (node (nodes! goto-graph))
2940
2941               (if (equal-sets-of-items? set-of-items (select-items! node)
2942                                         :compare-type compare-type)
2943                   (return t)
2944               )
2945           )
2946           ; return nil by default
2947      )
2948)
2949
2950)
2951
2952
2953
2954; ------------------------------------------------------------------------------
2955;
2956; DESCRIPTION
2957;
2958;     Find out the current node (state) number of a node in the goto 
2959;     graph which contains the given set of items or their cores.
2960;
2961; CALLING SEQUENCE
2962;
2963;     (node-number set-of-items goto-graph :compare-type compare-type)
2964;
2965;     set-of-items  Set of items to search for.
2966;
2967;     goto-graph    Goto graph of the grammar.
2968;
2969;     compare-type  If 'item, find identical set of items, but if 'core, 
2970;                   find identical cores.  Defaults to 'item.
2971;
2972;     Returns:      Number of the node in the goto graph containing the given
2973;                   set of items or core.
2974;
2975;  EXAMPLE
2976;
2977;      (node-number '(( S -> C C DOT |,| $)) *goto-graph*) => 5
2978;      (node-number '(( S -> C C DOT |,| |e|)) *goto-graph*) => NIL
2979;    but
2980;      (node-number '(( S -> C C DOT |,| |e|)) *goto-graph* 
2981;                   :compare-type 'core) => 5
2982;
2983; ------------------------------------------------------------------------------
2984
2985(defun node-number( set-of-items goto-graph &key (compare-type 'item) )
2986
2987(cond ( (null goto-graph)              -1)   ; goto graph = ()
2988      ( (null (first-node goto-graph)) -1)   ; goto graph = ( (...) () )
2989
2990      ( t
2991
2992           ; Scan all nodes in the goto-graph, looking for one which has
2993           ; a matching item.
2994           (dolist (node (nodes! goto-graph))
2995
2996               (if (equal-sets-of-items? set-of-items (select-items! node)
2997                                         :compare-type compare-type)
2998                   (return (current-state! node))
2999               )
3000           )
3001           ; return nil by default
3002      )
3003)
3004
3005)
3006
3007
3008
3009; Hash value of the core of an item.
3010; (core-hash-value-of-item '(S -> S DOT |a| S |b| |,| $)) => 1790
3011;
3012;  Sum up the integer value of all characters in each of the symbols.
3013;  Multiply by the position of DOT in the item to distinguish items
3014;  with the same symbols.
3015;  
3016(defun core-hash-value-of-item( item )
3017
3018    (let ( (hash-value        0)
3019           (symbol-position  -1)
3020           (string-of-symbol "")
3021           (length-of-string  0) ) 
3022
3023         ; Hash the core of the item only.
3024         (dolist (s (core-of-item! item))
3025
3026             ; Symbol index in the item, starting with 0.
3027             (setq symbol-position (+ 1 symbol-position))
3028
3029             ; Convert symbol to string and get its length.
3030             (setq string-of-symbol (symbol-name s))
3031             (setq length-of-string (length string-of-symbol))
3032
3033             ; Sum the integer values of each character in the symbol.
3034             (dotimes (i length-of-string)
3035                  (setq hash-value
3036                        (+ hash-value
3037                           (char-int (char string-of-symbol i)))
3038                  )
3039             )
3040
3041             ; Multiply by the index position of the DOT symbol
3042             ; to distinguish between same items with dots in different 
3043             ; locations such as
3044             ;     [S -> a . b , c]  and [S -> a b . , c]
3045             (if (equal s 'DOT)
3046                 (setq hash-value (* hash-value symbol-position))
3047             )
3048         )
3049    hash-value)    
3050)
3051
3052; Only the core matters:
3053; (core-hash-value-of-set-of-items
3054;     '( (SP -> S DOT           |,| $) 
3055;        ( S -> S DOT |a| S |b| |,| $) 
3056;        ( S -> S DOT |a| S |b| |,| |a|))) => 3542
3057; 
3058; (core-hash-value-of-set-of-items
3059;     '( (SP -> S DOT           |,| $) 
3060;        ( S -> S DOT |a| S |b| |,| $))) => 3542
3061;
3062; (core-hash-value-of-set-of-items
3063;     '( (SP -> S DOT           |,| $))) => 1752
3064;                                  
3065(defun core-hash-value-of-set-of-items( set-of-items )
3066
3067   (let ( (hashes-of-items (mapcar #'core-hash-value-of-item set-of-items))
3068          (sum 0)
3069        )
3070
3071        ; Hash value on the entire set of items.
3072        ; Don't count duplicate items.
3073        (dolist (i (remove-duplicates hashes-of-items))
3074            (setq sum (+ sum i))
3075        )
3076
3077        ; Modulo to keep within size of an integer.
3078        (mod sum +hash-value-upper-limit+)
3079   )
3080)
3081
3082
3083
3084
3085; ==============================================================================
3086; |                    LALR(1) Core Merging Functions
3087; ==============================================================================
3088
3089;
3090; Partition
3091;
3092;     merge-equivalence-classes( '(2 4) '() ) 
3093;                => ( (2 4) )
3094;
3095;     merge-equivalence-classes( '(2 4) '( (4 5) (6 7) ) ) 
3096;                => ( (2 4 5) (6 7) )
3097;
3098;     merge-equivalence-classes( '(2 4) '( (4 5) (2 7) (3 6) ) ) 
3099;                => ( (3 6) (2 4 5 7) )
3100;
3101(defun merge-equivalence-classes( equivalence partition )
3102
3103  (cond 
3104       ; Dispose of trivial inputs.
3105       ( (null equivalence)         partition)
3106       ( (= (length equivalence) 1) partition)
3107
3108       ;  Partition is empty.
3109       ( (null partition)           (list equivalence) )
3110
3111       ;  First set in the partition has common elements
3112       ;  with the equivalence.
3113       ( (intersection equivalence (first partition) )
3114
3115         ; Sort the elements in the equivalence classes.
3116         (mapcar #'(lambda (x) (sort x #'(lambda (x y) (< x y))))
3117
3118             ; Remerge the other sets in the partition.
3119             (merge-equivalence-classes
3120
3121                         ; Merge the equivalence into the first set in the
3122                         ; partition.
3123                         (union equivalence (first partition) :test #'equal)
3124                         (rest partition)
3125             )
3126         )
3127       )
3128
3129       (t
3130
3131         ; Sort the elements in the equivalence classes.
3132         (mapcar #'(lambda (x) (sort x #'(lambda (x y) (< x y))))
3133
3134             ; Merge the other sets in the partition.
3135             (cons (first partition) 
3136                   (merge-equivalence-classes equivalence (rest partition))
3137             )
3138         )
3139       )
3140  )
3141)
3142
3143
3144; If member of equiv. class in the partition, return the smallest
3145; equivalent element.
3146(defun remap-equivalent( num partition )
3147
3148    (cond ( (null partition) num)
3149       
3150          ( (member num (first partition)) 
3151            (caar partition)
3152          )
3153
3154          (t
3155              (remap-equivalent num (rest partition))
3156          )
3157    )
3158)
3159
3160
3161
3162; ------------------------------------------------------------------------------
3163; |                               merge-lookaheads                             |
3164; ------------------------------------------------------------------------------
3165;
3166;  DESCRIPTION
3167;
3168;      Collapse together two sets of items, eliminating duplicated items.
3169;      
3170;  CALLING SEQUENCE
3171;
3172;      (merge-lookaheads set-of-items1 set-of-items2)
3173;
3174;      set-of-items 
3175;
3176;      node          A node in the goto graph with the same cores as in
3177;                    set-of-items.
3178;
3179;      Returns:      Updated node with the same core as before, but
3180;                    added lookaheads.
3181;      
3182;  METHOD
3183;
3184;    Remove duplicates and use a set union to merge the lookaheads.
3185;
3186;  EXAMPLE
3187;
3188;    (merge-lookaheads '( (D -> E DOT F |,| |b|)
3189;                         (A -> B DOT C |,| |a|) )
3190;
3191;                      '( (A -> B DOT C |,| |a|)
3192;                         (A -> B DOT C |,| |a|)
3193;                         (D -> E DOT F |,| |c|) )) =>
3194;
3195;      ( (D -> E DOT F |,| |b|) 
3196;        (A -> B DOT C |,| |a|) 
3197;        (D -> E DOT F |,| |c|) )
3198;
3199; ------------------------------------------------------------------------------
3200
3201(defun merge-lookaheads( set-of-items1 set-of-items2 )
3202
3203    ;  Use the equal function to test for duplicates, since we are handling
3204    ;  elements which are lists, not atoms.
3205    (union  (remove-duplicates set-of-items1)
3206            (remove-duplicates set-of-items2)
3207            :test #'equal)
3208)
3209
3210
3211
3212
3213; ------------------------------------------------------------------------------
3214; |                               merge-cores                                  |
3215; ------------------------------------------------------------------------------
3216;
3217;  DESCRIPTION
3218;
3219;      If the given set of items has the same core as a node in the goto graph,
3220;      merge it into the node.
3221;      
3222;  CALLING SEQUENCE
3223;
3224;      merge-cores( goto-graph ) 
3225;   
3226;      goto-graph    Goto graph of the grammar.
3227;
3228;      Returns:      All sets of items with the same cores are merged,
3229;                    and states and links are renumbered.
3230;
3231;  EXAMPLE
3232;
3233; ------------------------------------------------------------------------------
3234
3235(defun merge-cores( goto-graph )
3236
3237(let ( (nodes               (nodes! goto-graph))  ; Get the nodes out
3238       (links               (links! goto-graph))  ; and the links.
3239       (previous-node       nil)
3240       (previous-state       -1)
3241       (previous-hash-value  -1)
3242       (equiv-class          nil)
3243       (merged-goto-graph   '(() ()) )            ; New merged goto graph.
3244     )
3245
3246     ; Sort the nodes on hash value to keep sets of items with the same
3247     ; cores adjacent. 
3248     (setq nodes
3249           (sort nodes #'(lambda (x y) (< (hash-value! x) (hash-value! y)))))
3250
3251     ; Scan through the nodes, looking for sets of items with the
3252     ; same cores.
3253     (dolist (node nodes)
3254
3255         ; Current node and previous node have same cores.
3256         (cond ( (= (hash-value! node) previous-hash-value)
3257         
3258                 (setq equiv-class 
3259                      (merge-equivalence-classes
3260                            (list (current-state! node) previous-state)
3261                            equiv-class
3262                      )
3263                 )
3264
3265                 ; Create a new merged node.
3266                 (setq node
3267                       (create-new-node 
3268
3269                           ; Use the lowest numbered state
3270                           ; for the new node number.
3271                           (if (< (current-state! node) 
3272                                  (current-state! previous-node))
3273                                (current-state! node)
3274                                (current-state! previous-node)
3275                           )
3276
3277                           ; Hash value.
3278                           (hash-value! node)
3279
3280                           ; Merge the cores in the items.
3281                           (merge-lookaheads (select-items! node)
3282                                             (select-items! previous-node))
3283                       )
3284                  )
3285               )
3286
3287               ; Current node differs, send off previous node.
3288               (t
3289                   (if (not (null previous-node))
3290                       (setq merged-goto-graph 
3291                         (insert-node previous-node merged-goto-graph))
3292                   )
3293               )
3294         )
3295
3296         (setq previous-node       node)
3297         (setq previous-hash-value (hash-value! node))
3298         (setq previous-state      (current-state! node))
3299     )
3300
3301     ; Send off last node, merged or otherwise, in any case.
3302     (setq merged-goto-graph 
3303           (insert-node previous-node merged-goto-graph))
3304
3305     ; Renumber states in the links.
3306     (dolist (link links)
3307      
3308         (setq link 
3309               `( ,(remap-equivalent (first link) equiv-class)
3310                  ,(second link)
3311                  ,(remap-equivalent (third link) equiv-class) 
3312                )
3313         )
3314         (setq merged-goto-graph (insert-link link merged-goto-graph))
3315     )
3316     merged-goto-graph
3317)
3318
3319)
3320
3321
3322
3323
3324
3325; ==============================================================================
3326; |                    LR(1) Action and Goto table utilities                   |
3327; ==============================================================================
3328
3329
3330
3331; ------------------------------------------------------------------------------
3332; |                              create-goto-graph                             |
3333; ------------------------------------------------------------------------------
3334; 
3335;  DESCRIPTION
3336;
3337;      Create a goto graph containing the sets of items for the grammar.
3338;
3339;  CALLING SEQUENCE
3340;
3341;      (create-goto-graph parser-type)
3342;
3343;      parser-type  The type of grammar:  'LR1 or 'LALR1
3344;
3345;      Returns:     Goto graph of the grammar.
3346;
3347;  METHOD
3348;
3349;      We create a DFA which recognizes the viable prefixes of the grammar.
3350;
3351;      The DFA is called the goto graph.  Each node in the graph is of the 
3352;      form (i2 i1 X SET-OF-ITEMS).
3353;
3354;          i1 = V( gamma ) = (set of all items valid for viable prefix gamma).
3355;          i2 = V( gamma X )
3356;          X = a grammar symbol (but not EPSILON).
3357;
3358;      An item [A -> alpha . beta] is valid for viable prefix gamma alpha if
3359;      gamma A is also a viable prefix.  
3360;
3361;      The prefix gamma is viable if there is a rightmost derivation 
3362;      S =>* gamma w.  
3363;
3364;      The first state is I0 = V( EPSILON ) = { [S -> . S , $] }.
3365;
3366;      We process nodes as follows:
3367;
3368;      node-num ---> 0
3369;            ...
3370;      node-num ---> 3 ----+----+
3371;                          | a  |
3372;                    4 <---+    | b
3373;                               |
3374;                    5 <---+----+
3375;              ...
3376;                    3 ----+----+
3377;                          | a  |
3378;                    4 <---+    | b
3379;                               |
3380;      node-num ---> 5 <---+----+
3381;
3382; -----------------------------------------------------------------------------
3383
3384(defun create-goto-graph( parser-type )
3385
3386(let* ( (goto-of-item   nil)        ; Set of items, GOTO( I, X )
3387        (node           nil)        ; Node in graph.
3388        (node-num         0)        ; Next node in goto graph to process.
3389        (new-node       nil)        ; New node in Goto graph.
3390        (new-link       nil)
3391        (new-state-num    1)        ; State number of the next node.
3392        (goto-graph     '( () () ) ); Initial goto-graph.
3393       )
3394
3395
3396    ; Our very first set of items I0 is the closure of [S' -> .S, $]
3397    (setq goto-of-item    (closure (list (create-augmenting-item))))
3398
3399    ; The initial node has state 0, items as above, and hash value.
3400    (setq node (create-new-node 0
3401                                (core-hash-value-of-set-of-items 
3402                                      goto-of-item)
3403                                goto-of-item))
3404
3405    (setq new-link (create-new-link -1 nil 0))
3406
3407    ; Insert nodes and links into the goto graph.
3408    (setq goto-graph (insert-node node     goto-graph))
3409    (setq goto-graph (insert-link new-link goto-graph))
3410
3411    (loop
3412          ; Latest unprocessed node in the goto graph.  Starting with I0.
3413          (setq node (nth-node! node-num goto-graph))
3414
3415          (if (null node) (return))              ; No more sets of items.
3416
3417          ; For each grammar symbol X ...
3418          (dolist (grammar-symbol (find-grammar-symbols))
3419
3420              ; ...compute GOTO( I, X ), the new set of items.
3421              (setq goto-of-item
3422                    (compute-goto (select-items! node)
3423                                  grammar-symbol))
3424
3425              ; Create a new node with set of items GOTO( I, X ), 
3426              (setq new-node
3427                    (create-new-node new-state-num
3428                                     (core-hash-value-of-set-of-items
3429                                           goto-of-item)
3430                                     goto-of-item  )
3431              )
3432            
3433              ; GOTO( I, X ) is empty.
3434              (if (not (null goto-of-item))
3435
3436                  (cond ( 
3437                            ; Our GOTO( I, X ) has computed the same sets of
3438                            ; items.
3439                            (set-of-items-in-graph? goto-of-item
3440                                                    goto-graph)
3441
3442                            ; Insert a new link 
3443                            ;                 X
3444                            ;              I ---> <existing node in graph>
3445                            (setq new-link (create-new-link
3446                                                   node-num
3447                                                   grammar-symbol
3448                                                   (node-number goto-of-item
3449                                                                goto-graph))
3450                            )
3451
3452                            (setq goto-graph (insert-link new-link goto-graph))
3453                         )
3454
3455               
3456
3457                         ;  Add a new node with a new set of items and
3458                         ;  a new link.
3459                         ;  Increment the current state number.
3460                         (t 
3461                             (setq goto-graph (insert-node new-node goto-graph))
3462
3463                             (setq new-link (create-new-link  (current-state! node)
3464                                                              grammar-symbol
3465                                                              new-state-num))
3466                             (setq goto-graph (insert-link new-link goto-graph))
3467                            
3468                             (setq new-state-num (1+ new-state-num))
3469                          )
3470
3471                ) ; end cond
3472
3473            ) ; end if empty GOTO( I, X )
3474        ) ; end dolist
3475
3476        ; Bump up the node number.
3477        (setq node-num  (1+ node-num ))
3478
3479    ) ; end loop
3480
3481
3482    ; For LALR(1) languages, sort the goto graph on core hash value
3483    ; then merge states with the same cores.
3484    (if (equal parser-type 'LALR1)
3485        (setq goto-graph (merge-cores goto-graph))
3486    )
3487    goto-graph
3488
3489) ; end let
3490
3491)
3492
3493; ------------------------------------------------------------------------------
3494; |                                   goto                                     |
3495; ------------------------------------------------------------------------------
3496;
3497;  DESCRIPTION
3498;
3499;      The LR GOTO function derived from the goto graph.
3500;
3501;  CALLING SEQUENCE
3502;
3503;      (goto i A goto-graph)
3504;
3505;      goto-graph     The goto graph with entries of the form
3506;                     (i2 i1 X <list of items>)
3507;
3508;      state          The initial state i1.
3509;
3510;      symbol         The transition symbol X.
3511;
3512;      Returns:       The next state i2, or NIL if GOTO is undefined.
3513;     
3514;  EXAMPLE
3515;
3516;      Suppose *goto-graph* = ( ( (6 |a| 4) ) (nodes) )
3517;
3518;      (goto 6 '|a| goto-graph) => 4
3519;
3520; ------------------------------------------------------------------------------
3521
3522(defun goto( state symbol goto-graph)
3523
3524    (dolist (link (links! goto-graph))
3525
3526        (if (and (=      state (first link))
3527                 (equal symbol (second link))
3528            )
3529
3530            (return (third link))
3531        )
3532    )
3533    ; Return nil by default.
3534)
3535
3536
3537; ------------------------------------------------------------------------------
3538; |                               action-list!                                 |
3539; ------------------------------------------------------------------------------
3540;
3541;  DESCRIPTION
3542;
3543;      Return the list of actions in a line of the action table.
3544;
3545;  CALLING SEQUENCE
3546;
3547;      (action-list action-table-line)
3548;
3549;      action-table-line    One line of action table of the form 
3550;                           ( (stateNumber) (listOfActions) )
3551;
3552;      Returns:             (listOfActions)
3553;
3554;  EXAMPLE
3555;
3556;      (action-list! '( (0) ((|c| (S 3)) (|d| (S 4)))))
3557;      =>   ((|c| (S 3)) (|d| (S 4)))
3558;
3559; ------------------------------------------------------------------------------
3560
3561(defun action-list!( line-of-table )
3562
3563(second line-of-table)
3564)
3565
3566
3567
3568; ------------------------------------------------------------------------------
3569; |                               action-line-state!                           |
3570; ------------------------------------------------------------------------------
3571; 
3572;  DESCRIPTION
3573;
3574;      Return the state of a line of the action table.
3575;
3576;  CALLING SEQUENCE
3577;
3578;      (action-line-state! action-table)
3579;
3580;      action-table   Table of the form ( (stateNum) (listOfActions) )
3581;
3582;      Returns:       stateNum
3583;
3584;  EXAMPLE
3585;
3586;      (action-line-state! '( (0) ((|c| (S 3)) (|d| (S 4)) (DEFAULT (ERROR)))) )
3587;      => 0
3588;
3589; ------------------------------------------------------------------------------
3590
3591(defun action-line-state!( action-table-line )
3592
3593(first (first action-table-line))
3594
3595)
3596
3597
3598
3599
3600; ------------------------------------------------------------------------------
3601; |                          action-trigger-symbol!                            |
3602; ------------------------------------------------------------------------------
3603; 
3604;  DESCRIPTION
3605;
3606;      Return the transition symbol in an action pair.
3607;
3608;  CALLING SEQUENCE
3609;
3610;     (action-trigger-symbol! action-pair)
3611;
3612;      action-pair    An action/new-state pair of a line in the action table
3613;                     of the form (X  (action i)).
3614;
3615;      Returns:       X
3616;
3617;  EXAMPLE
3618;
3619;     (action-trigger-symbol! '(|c| (S 3))) => |c|
3620;
3621; ------------------------------------------------------------------------------
3622
3623(defun action-trigger-symbol!( action-pair )
3624
3625(first action-pair)
3626
3627)
3628
3629
3630
3631; ------------------------------------------------------------------------------
3632; |                        insert-action-or-goto-into-list                     |
3633; ------------------------------------------------------------------------------
3634;
3635;  DESCRIPTION
3636;
3637;      Insert an action into a line of the action table.  Check for conflicts.
3638;
3639;  CALLING SEQUENCE
3640;
3641;     (insert-action-or-goto-into-list symbol new-state list-of-actions action)
3642;
3643;     symbol           The transition symbol X.
3644;
3645;     new-state        The new state i.
3646;
3647;     list-of-actions  The action list part of one line of the action or 
3648;                      goto table.
3649;
3650;     action           If 'NONE, update a goto list, else add this action
3651;                      to an action list.
3652;
3653;     Returns:         Augmented action list containing a new action pair
3654;                      (X (action i)) or a conflict pair 
3655;                      (CONFLICT (X (action i)) (X (old-action j)))
3656;                      Similarly for a goto list.
3657;  EXAMPLE
3658;
3659;     (insert-action-or-goto-into-list 'a 666 '((b (s 3)) (c (r 2))) :action 's)
3660;       =>((B (S 3)) (C (R 2)) (A (S 666))) 
3661;
3662;     (insert-action-or-goto-into-list 'b 666 '((b (s 3)) (c (r 2))) :action 's)
3663;       => ((B (S 3)) (C (R 2)) (CONFLICT ((B (S 666)) (B (S 3)))))
3664;
3665;     (insert-action-or-goto-into-list 'a 666 '((b 5) (c 6)))
3666;       => ((B 5) (C 6) (A 666))
3667;
3668; ------------------------------------------------------------------------------
3669
3670(defun insert-action-or-goto-into-list( symbol new-state list-of-actions 
3671                                &key (action 'NONE) )
3672
3673
3674;  Nothing or only a default in the list.  Insert a new action.
3675
3676    (cond ( (or (null list-of-actions)
3677                (equal (first list-of-actions) '(default (error))))
3678
3679                 (cons (if (equal action 'NONE)
3680                          `(,symbol ,new-state)            ; Insert a goto.
3681                          `(,symbol (,action ,new-state))) ; Insert an action.
3682                       list-of-actions))
3683
3684;  Ignore duplicate actions.
3685
3686          ((equal  (first list-of-actions)
3687                   (if (equal action 'NONE)
3688                      `(,symbol ,new-state)                 ; Compare a goto.
3689                      `(,symbol (,action ,new-state))))     ; Compare an action.
3690
3691                   list-of-actions)                ; Return list unchanged.
3692
3693
3694; We have a conflict on the first action.  Insert a conflict report at the
3695; end of the row, unless it is there already.
3696
3697          ((equal symbol (action-trigger-symbol! (first list-of-actions)))
3698
3699               (setq *conflicts* T)
3700
3701               (insertItemIntoList (if (equal action 'NONE)
3702                                    `(conflict ((,symbol ,new-state)
3703                                                (,@(first list-of-actions))))
3704                                    `(conflict ((,symbol (,action ,new-state))
3705                                                (,@(first list-of-actions)))))
3706
3707                                  list-of-actions))
3708
3709
3710;  No conflict yet --- try insertion in the rest of the list.
3711
3712           ( t (cons (first list-of-actions)
3713                     (insert-action-or-goto-into-list symbol 
3714                                                      new-state 
3715                                                      (rest list-of-actions)
3716                                                      :action action))))
3717)
3718
3719
3720
3721; ------------------------------------------------------------------------------
3722; |                              add-action-or-goto                            |
3723; ------------------------------------------------------------------------------
3724;
3725;  DESCRIPTION
3726;
3727;      Add an action to the action table or a goto to the goto table.
3728;
3729;
3730;  CALLING SEQUENCE
3731;
3732;     (add-action-or-goto( state symbol new-state table action)
3733;
3734;     state     The current state i1.
3735;
3736;     new-state The new state i2
3737;
3738;     table     The action or goto table.
3739;
3740;     action    Defaults to 'NONE for the goto table, otherwise, the
3741;               action to take (e.g. S, R, ACC, ERROR)
3742;
3743;     Returns:  The updated action or goto table.
3744;
3745;  EXAMPLE
3746;
3747;     Let table = ( ( (2) ( (a (s 5)) 
3748;                        (b (r 2)) 
3749;                        (default (error))))
3750;                ( (4) ( ($ (acc nil)) 
3751;                        (default (error)))))
3752;
3753;     To insert ACTION[ 2, b ] = (shift 6) into the table, call
3754;
3755;     (add-action-or-goto 2 'b 6 table :action 's) =>
3756;
3757;               ( ( (2) ( (A (S 5)) 
3758;                         (B (R 2)) 
3759;                         (DEFAULT (ERROR))
3760;                         (CONFLICT ((B (S 6)) (B (R 2))))))
3761;                 ( (4) ( ($ (ACC NIL)) 
3762;                         (DEFAULT (ERROR)))))
3763;
3764;      We detect a shift/reduce conflict on symbol b and report it.
3765;
3766;      On the other hand,
3767;
3768;      (add-action-or-goto 2 'c 6 table :action 's) =>
3769;
3770;               ( ( (2) ( (A (S 5)) 
3771;                         (B (R 2)) 
3772;                         (C (S 6))
3773;                         (DEFAULT (ERROR))))
3774;                 ( (4) ( ($ (ACC NIL)) 
3775;                         (DEFAULT (ERROR)))))
3776;
3777;      Suppose we have a goto table,
3778;
3779;      table = ( ( (0) ( (a 10) 
3780;                        (b 20)
3781;                        (default (error))))
3782;                ( (4) ( (a 11)
3783;                        (default (error)))))
3784;
3785;      To insert GOTO[ 0, c ] = 6 call
3786;
3787;      (add-action-or-goto 0 'c 6 table) =>
3788;
3789;             ( ( (0) ( (A 10) 
3790;                       (B 20)
3791;                       (C 6)
3792;                       (DEFAULT (ERROR))))
3793;               ( (4) ( (A 11)
3794;                       (DEFAULT (ERROR)))))
3795;
3796; -----------------------------------------------------------------------------
3797
3798(defun add-action-or-goto( state symbol new-state table 
3799                           &key (action 'NONE))
3800
3801    ;  The table has no entries.  Create a new action table of the form
3802    ;      ( (State) ( (TransitionSymbol (Action NewState)) (default (error))))
3803    ;  or Goto table of the form,
3804    ;      ( (State) ( (TransitionSymbol (NewState)) (default (error)))).
3805    ;
3806    ; NOTE:
3807    ;   We assume the Goto graph starts with state 0.
3808    ;   Since we insert new states into the action table in order,
3809    ;   the order will be maintained as we scan through the Goto graph.
3810
3811    (cond ( (null table)   `( 
3812                                ( (,state)
3813
3814                                  ( 
3815                                     ,(if (equal action 'NONE)
3816                                         `(,symbol ,new-state) ; goto table
3817                                         `(,symbol (,action ,new-state))
3818                                      )
3819
3820                                      (default (error)) 
3821                                  )
3822                                )
3823                            )
3824          )
3825
3826
3827             ;  Found state in first line of table.  Add the new action to 
3828             ;  this line.
3829             ( (= (action-line-state! (first table))
3830                  state)                                
3831
3832                (cons (list (first (first table))  ; Get state of first line.
3833
3834                             (insert-action-or-goto-into-list symbol 
3835                                                              new-state 
3836                                                    (action-list! (first table))
3837                                                    :action action))
3838                           (rest table)))
3839
3840           ;  State is smaller than first line's state.  Create a new line
3841           ;  containing a new state, action and (default (error)) and add it 
3842           ;  before the first line.
3843           ( (< state (action-line-state! (first table)))
3844
3845                       (cons `( (,state)
3846                                (  ,(if (equal action 'NONE)
3847                                       `(,symbol ,new-state) ; goto table
3848                                       `(,symbol (,action ,new-state))
3849                                    )
3850                                    (default (error))
3851                                )
3852                              )
3853                             table)
3854           )
3855
3856           ; State is bigger than the first line's state.  Decide later.
3857           ( t (cons (first table)
3858                     (add-action-or-goto state symbol new-state (rest table) 
3859                                         :action action))))
3860)
3861
3862
3863; ------------------------------------------------------------------------------
3864; |                              build-action-table                            |
3865; ------------------------------------------------------------------------------
3866;
3867;  DESCRIPTION
3868;
3869;      Build the ACTION table of a cannonical LR(1) parser.
3870;
3871;
3872;  CALLING SEQUENCE
3873;
3874;      (build-action-table goto-graph)
3875;
3876;       goto-graph  The goto graph generated by make-items.
3877;
3878;       Returns:    Action table of the grammar.  
3879;
3880;  METHOD
3881;
3882;      We initially add the action (default (error)) to each line of the table.
3883;      If we generate any shift-reduce or reduce-reduce conflicts,
3884;      we record them in the action table and check them later.
3885;
3886;  EXAMPLE
3887;
3888; ------------------------------------------------------------------------------
3889
3890(defun build-action-table( goto-graph )
3891
3892(let ( (action-table nil)
3893       (first-symbols-after-dot nil) )
3894
3895  (dolist (node (nodes! goto-graph))             ; Scan through every node in
3896                                                 ; in the goto graph.
3897    (dolist (item (select-items! node))
3898
3899        (cond ( 
3900                ;  For the item [S' -> S. , $],
3901                ;      ACTION[ i, $ ] = (accept).
3902                (is-accept? item)
3903
3904                (setq action-table
3905                      (add-action-or-goto 
3906                           (current-state! node)    ; Current state i
3907                           '$                       ; Transition.
3908                           'nil                     ; No state.
3909                           action-table
3910                           :action 'acc)            ; No state.
3911                )          
3912              )
3913
3914              ;  For the item [A -> alpha . , b]
3915              ;       ACTION[ i, b ] = (reduce k)      
3916              ;  where k is the number of the production A -> alpha
3917              ( (reduction? item)
3918
3919                (setq action-table
3920                      (add-action-or-goto 
3921                           (current-state! node)    ; Current state i
3922                           (lookahead-of! item)     ; b.
3923                           (production-number       ; Production num.
3924                                 (item-to-production item))   
3925                           action-table
3926                           :action 'r)
3927                )
3928              )
3929 
3930;  Prepare to add a possible shift.
3931
3932             ( t  
3933
3934        (if *has-epsilon-productions* 
3935
3936            ; When the grammar has epsilon-productions, for the item
3937            ;     [A -> alpha . beta , b]
3938            ; where beta is not equal to the null-string EPSILON, 
3939            ; for all a in EFF( beta b ), we add
3940            ;     ACTION[ i, a ] = (shift j)       where j = GOTO( i , a ).
3941            (setq first-symbols-after-dot
3942
3943                  (if (reduction? item)    ; beta = epsilon
3944
3945                      nil
3946
3947                      (first-derived-terminals `(,(symbol-after-dot! item)
3948                                                 ,@(string-before-comma! item)
3949                                                 ,(lookahead-of! item))
3950                                                :type 'epsilon-free)))
3951 
3952            ;  For a grammar with no epsilon productions, for the item 
3953            ;      [A -> alpha . a beta , b]
3954            ;  where a is a terminal, we add
3955            ;      ACTION[ i, a ] = (shift j)       where j = GOTO( i , a ).
3956            (setq first-symbols-after-dot 
3957
3958                  (if (terminal-after-dot? item) 
3959
3960                      (list (symbol-after-dot! item))
3961
3962                      nil))
3963        )
3964
3965
3966;  Add a shift, if any.
3967      (dolist (term first-symbols-after-dot)
3968
3969          (setq action-table
3970                 (add-action-or-goto (current-state! node)     ; Current state i
3971                                     term                      ; Terminal a. 
3972                                     (goto                     ; into state j.
3973                                           (current-state! node)
3974                                            term
3975                                            goto-graph)
3976                                     action-table
3977                                     :action 's)))))))          ; Do a shift.
3978
3979    action-table)
3980)
3981
3982
3983
3984; ------------------------------------------------------------------------------
3985; |                              build-goto-table                              |
3986; ------------------------------------------------------------------------------
3987;
3988;  DESCRIPTION
3989;
3990;      Build the GOTO table for an LR(1) parser.
3991;
3992;
3993;  CALLING SEQUENCE
3994;
3995;      (build-goto-table)
3996;
3997;       goto-graph    The goto graph.
3998;
3999;       Returns:      The goto table.
4000;
4001;      
4002;  METHOD
4003;
4004;       Whenever we have a link i which has a transition 
4005;       on a nonterminal A to the link j, we fill the table with 
4006;       GOTO( i, A ) = j.
4007;
4008;  EXAMPLE
4009;
4010; ------------------------------------------------------------------------------
4011
4012(defun build-goto-table( goto-graph )
4013
4014(let ((goto-table nil))
4015
4016    (dolist (link (links! goto-graph))
4017          
4018            (if (and (> (first link) -1)
4019                     (nonterminal? (second link)))
4020
4021                 (setq goto-table 
4022                       (add-action-or-goto (first  link)
4023                                           (second link)
4024                                           (third  link)
4025                                           goto-table)
4026                 )
4027            )
4028    )
4029    goto-table)
4030)
4031
4032
4033
4034
4035
4036; ==============================================================================
4037; |                    Input and Output Functions                              |
4038; ==============================================================================
4039
4040; ------------------------------------------------------------------------------
4041; |                              write-header                                  |
4042; ------------------------------------------------------------------------------
4043;
4044;  DESCRIPTION
4045;
4046;      Write a header for the parse tables file.
4047;
4048;  CALLING SEQUENCE
4049;
4050;      (write-header fp parser-type)
4051;
4052;      fp            Pointer to the currently open file.
4053;
4054;      parser-type   'LR1 or 'LALR1.  The title will be adjusted 
4055;                    automatically based on the parser type.
4056;
4057;      Returns:      Header text written to file.
4058;
4059;  EXAMPLE
4060;
4061; ------------------------------------------------------------------------------
4062
4063(defun write-header( fp parser-type )
4064
4065(format fp "~A~%~A~%~A~%~A~%~A~%~A~%~A~%~A~%~%"
4066           ";---------------------"
4067
4068           (if (equal parser-type 'LR1) 
4069                      "; LR(1) parse tables"
4070                      "; LALR(1) parse tables")
4071
4072           ";---------------------"
4073           ";"
4074           "; Suitable for input to the Common Lisp program "
4075           ";"
4076           ";     LR(1)AndLALR(1)Parser.lsp"
4077           ";"
4078)
4079
4080)
4081
4082
4083; ------------------------------------------------------------------------------
4084; |                              write-terminals                               |
4085; ------------------------------------------------------------------------------
4086;
4087;  DESCRIPTION
4088;
4089;      Write a header for the parse tables file.
4090;
4091;  CALLING SEQUENCE
4092;
4093;      (write-terminals fp)
4094;
4095;      fp            Pointer to the currently open file.
4096;
4097;      Returns:      Terminal symbols written to file.
4098;
4099;  EXAMPLE
4100;
4101; ------------------------------------------------------------------------------
4102
4103(defun write-terminals( fp terminals )
4104
4105    (format fp "~A~%~A~%~%" 
4106               "; TERMINALS" 
4107               ";"
4108               )
4109
4110    (format fp "~S~%~%" terminals)
4111
4112    (fresh-line fp)
4113    (fresh-line fp)
4114)
4115
4116
4117; ------------------------------------------------------------------------------
4118; |                              write-productions                             |
4119; ------------------------------------------------------------------------------
4120;
4121;  DESCRIPTION
4122;
4123;      Write the split up productions with their numbers.
4124;
4125;  CALLING SEQUENCE
4126;
4127;      (write-productions fp productions)
4128;
4129;      fp            Pointer to the currently open file.
4130;
4131;      productions   List of productions to write.
4132;
4133;      Returns:      Neat list of numbered productions.  (These are
4134;                    the ones expanded from the alternates.)
4135;
4136;  EXAMPLE
4137;
4138;     See the files lalrparser.dat and parser.dat for examples.
4139;
4140; ------------------------------------------------------------------------------
4141
4142(defun write-productions( fp productions )
4143
4144    (format fp "~A~%~A~%~A~%~A~%~%~A~%" 
4145               ";  PRODUCTIONS"
4146               ";"
4147               ";  Productions are numbered starting with 1."
4148               ";  All alternates were expanded into separate productions." 
4149               "(" )
4150
4151    (dolist (production productions)
4152
4153        ; Print each production.
4154        (format fp "~A~D~A~S~A~%" 
4155                   "  ( "
4156                   `(,(production-number production))
4157                   "   "
4158                   production
4159                   " )" )
4160    )
4161
4162    (format fp "~A~%~%" 
4163               ")" )
4164)
4165
4166
4167
4168(defun construct-error-messages( action-table )
4169
4170(let ( (error-messages     nil)
4171       (transition-symbols nil) )
4172
4173        ; Scan through each line of the action table.
4174        (dolist (action-line action-table)
4175
4176            (setq transition-symbols nil)
4177          
4178            ; Scan through the actions in each line.
4179            (dolist (action (action-list! action-line))
4180
4181                ; Found an error state;  add message to the list.
4182                (if (equal (second action) '(ERROR))
4183                    (push `( (,(action-line-state! action-line))
4184                             (,(concatenate 'string 
4185                                   "error - expecting one of the symbols " 
4186                                   (string-trim "("
4187                                   (string-trim ")"
4188                                    (write-to-string transition-symbols))))))
4189                           error-messages)
4190
4191                    ; else keep collecting transition symbols.
4192                    (setq transition-symbols
4193                          (cons (first action)
4194                                transition-symbols))
4195
4196                )
4197            )
4198        )
4199
4200    (reverse error-messages))
4201)
4202
4203
4204; ------------------------------------------------------------------------------
4205; |                         write-error-message-table                          |
4206; ------------------------------------------------------------------------------
4207;
4208;  DESCRIPTION
4209;
4210;      Write the error message table with templates for the user to fill in.
4211;
4212;  CALLING SEQUENCE
4213;
4214;      (write-error-message-table fp action-table)
4215;
4216;      fp            Pointer to the currently open file.
4217;
4218;      action-table 
4219;
4220;      Returns:      Error message table.
4221;
4222;  EXAMPLE
4223;
4224;
4225; ------------------------------------------------------------------------------
4226
4227(defun write-error-message-table( fp action-table )
4228
4229    (format fp "~A~%~%~%"
4230"
4231;  ERROR MESSAGE TABLE
4232;
4233;  If the action table has an error state, the other non-error
4234;  actions show which symbol was failed to appear next on the input.
4235;
4236;  The user can modify these minimal error messages.
4237" )
4238
4239    ; Opening parenthesis.
4240    (format fp "(~%~%")
4241
4242    ; Iterate over error states.
4243    (dolist (error-message (construct-error-messages action-table))
4244
4245        (format fp "    ~S ~%"  error-message)
4246    )
4247
4248    ; Closing parenthesis.
4249    (format fp ")~%~%")
4250)
4251
4252
4253
4254; ------------------------------------------------------------------------------
4255; |                               write-goto-graph                             |
4256; ------------------------------------------------------------------------------
4257;
4258;  DESCRIPTION
4259;
4260;      Write the formatted goto graph to a file.
4261;
4262;  CALLING SEQUENCE
4263;
4264;      (write-goto-graph fp goto-graph)
4265;
4266;      fp                Pointer to (open) file which is to contain the graph.
4267;
4268;      goto-graph        Goto graph itself, which will be pretty-printed.
4269;
4270;
4271;  EXAMPLE
4272;
4273;      (write-goto-graph fp " *goto-graph*)
4274;      =>  ... see the sample output files parser.dat and lalrparser.dat.
4275;
4276; ------------------------------------------------------------------------------
4277
4278(defun write-goto-graph( fp goto-graph )
4279
4280    ; Write the title first and the opening parenthesis.
4281    (format fp "~A~%~%"
4282";  GOTO GRAPH
4283;                 
4284;  Not needed for the parser, but here for reference and debugging.
4285; **********
4286;  Goto graph of the LR(1) or LALR(1) grammar of the form
4287;                
4288; (
4289;   (                     <-- List of links.
4290;       (6 |a| 4)         <-- Transition in Goto graph from state 6 to
4291;                             state 4 on symbol a.
4292;       (1 |a| 2)         <-- Transition from state 1 to state 2 on a.
4293;   )
4294;                   
4295;   (                     <-- List of sets of items.
4296;       ( 0                                <-- State number 0.
4297;         3668                             <-- Hash value of core.
4298;         (
4299;            (SP -> DOT S           |,|  $)  ----+
4300;            ( S -> DOT S |a| S |b| |,|  $)      |
4301;            ( S -> DOT EPSILON     |,|  $)      +---- Set of items for state 0
4302;            ( S -> DOT S |a| S |b| |,| |a|)     |
4303;            ( S -> DOT EPSILON     |,| |a|)     |
4304;         )                                  ----+
4305;       ) "
4306)
4307
4308    ; Opening parenthesis of graph.
4309    (format fp "(~%")
4310
4311    ; Opening parenthesis of links.
4312    (format fp "~3,4@T(~%")
4313
4314    ; Print each link.
4315    (dolist (link (links! goto-graph))
4316
4317        (format fp "~3,8@T(~D ~S ~D)~%"
4318                   (first  link)
4319                   (second link)
4320                   (third  link))
4321    )
4322
4323    ; Closing parenthesis of links.
4324    (format fp "~3,4@T)~%")
4325
4326    ; Opening parenthesis of nodes.
4327    (format fp "~3,4@T(~%")
4328
4329    ; Print each node in the graph.
4330    (dolist (node (nodes! goto-graph))
4331       
4332        ; Print open paren of node, state and hash value.
4333        (format fp "~3,8@T(~D~%~3,8@T~D~%"
4334            (current-state! node)
4335            (hash-value!    node))
4336
4337        ; Print out each item.
4338        (dolist (item (select-items! node))
4339                (format fp "~3,12@T~S~%" item))
4340
4341        ; Closing paren of node.
4342        (format fp "~3,8@T)~%")
4343    )
4344
4345    ; Closing parenthesis of nodes.
4346    (format fp "~3,4@T)~%")
4347
4348    ; Closing parenthesis of graph.
4349    (format fp ")~%~%")
4350)
4351
4352
4353; ------------------------------------------------------------------------------
4354; |                       write-action-or-goto-table                           |
4355; ------------------------------------------------------------------------------
4356;
4357;  DESCRIPTION
4358;
4359;      Write the formatted action or goto table to a file.
4360;
4361;  CALLING SEQUENCE
4362;
4363;      (write-action-or-goto-table fp table)
4364;
4365;      fp                Pointer to (open) file which is to contain the
4366;                        action-table.
4367;
4368;      table             Action or goto table itself, which will be
4369;                        pretty-printed.
4370;
4371;
4372;  EXAMPLE
4373;
4374;      (write-action-or-goto-table fp " *action-table)
4375;      =>  ... see the sample output files parser.dat and lalrparser.dat.
4376;
4377; ------------------------------------------------------------------------------
4378
4379(defun write-action-or-goto-table( fp table &key (table-type 'ACTION))
4380
4381    ; Write the title first and the opening parenthesis.
4382    (format fp "~A~%~A~%~A~%~A~%~A~%~%~A~%" 
4383               (cond ( (equal table-type 'ACTION)   ";  ACTION TABLE")
4384                     ( (equal table-type 'GOTO)     ";  GOTO TABLE"  ))
4385               ";"
4386               ";  (state"
4387               ";         (item)" 
4388               ";         ..." 
4389               "(" )
4390
4391    ; Print actions for each state.
4392    (dolist (state table)
4393
4394        ; Print the opening paren of the table and the state
4395        ; number in parentheses.
4396        (format fp "~3,4@T( (~D) ~%"
4397                   (action-line-state! state)
4398        )
4399       
4400       ; Print the word NIL explicitly if the list of items is empty.
4401       (if (null (action-list! state))
4402           (format fp "~3,8@TNIL~%")
4403           
4404           ; Print out the list of actions.
4405           (progn
4406               ; Print first paren of action list.
4407               (format fp "~3,8@T(~%")
4408
4409               ; Print actions.
4410               (dolist (item (action-list! state))
4411                   (format fp "~3,12@T~S~%" item))
4412
4413               ; Print first paren of action list.
4414               (format fp "~3,8@T)~%")
4415           )
4416       )
4417
4418        (format fp "~3,4@T)~%")
4419    )
4420
4421    ; Closing parenthesis.
4422    (format fp ")~%~%")
4423)
4424
4425
4426; ------------------------------------------------------------------------------
4427; |                          print-legal-notice                                |
4428; ------------------------------------------------------------------------------
4429;
4430;  DESCRIPTION
4431;
4432;      Write legal notice when the program starts up.
4433;
4434;  CALLING SEQUENCE
4435;
4436;      Returns:      Legal notice to standard output.
4437;
4438;  EXAMPLE
4439;
4440; ------------------------------------------------------------------------------
4441
4442(defun print-legal-notice()
4443
4444    ; Print a few newlines, the notice and a few more newlines.
4445    (format t "~%~%~A~%~%" 
4446        "
4447        LR(1)AndLALR(1)ParserGenerator Version 5.6
4448
4449        An LR(1) and LALR(1) Parser Generator written in Common Lisp.
4450
4451        Copyright (C) 1989-2024 by Sean Erik O'Connor.  All Rights Reserved.
4452
4453        This program is free software: you can redistribute it and/or modify
4454        it under the terms of the GNU General Public License as published by
4455        the Free Software Foundation, either version 3 of the License, or
4456        (at your option) any later version.
4457
4458        This program is distributed in the hope that it will be useful,
4459        but WITHOUT ANY WARRANTY; without even the implied warranty of
4460        MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
4461        GNU General Public License for more details.
4462
4463        You should have received a copy of the GNU General Public License
4464        along with this program.  If not, see <http://www.gnu.org/licenses/>.
4465    
4466        The author's address is seanerikoconnor!AT!gmail!DOT!com
4467        with the !DOT! replaced by . and the !AT! replaced by @"
4468    )
4469) 
4470
4471
4472
4473; ------------------------------------------------------------------------------
4474;
4475;  NAME 
4476;
4477;      load-input-and-initialize
4478;
4479;  DESCRIPTION
4480;
4481;      Load the grammar from file.  Initialize global variables.
4482;
4483;  CALLING SEQUENCE
4484;
4485;      (load-input-and-initialize filename)
4486;
4487;      filename  Name of the file containing the productions and terminals for
4488;                the grammar.
4489;
4490;      Returns:  *terminals*       After this is read from file, we add the extra 
4491;                                  terminal $ (the language's right endmarker).
4492;
4493;                *productions*     After reading the list of productions,
4494;
4495;                                      [A -> alpha | beta ...] 
4496;
4497;                                  from file, we split up the alternates to
4498;                                  generate the set of productions 
4499;
4500;                                      [A -> alpha], [A -> beta], ...
4501;                
4502;                *first-derived-terminals* 
4503;                *epsilon-free-first-derived-terminals* 
4504;
4505;                                  Set to NIL.
4506;
4507;                *has-epsilon-productions* 
4508;
4509;                                  Set to NIL unless we have epsilon
4510;                                  productions of the 
4511;                                  form A -> EPSILON.
4512;
4513;                *conflicts*       Set to NIL.
4514;
4515;                *goto-graph*, 
4516;                *action-table* 
4517;                *goto-table*      Set to NIL just for the hell of it.
4518;
4519;  EXAMPLE
4520;
4521;      (load-input-and-initialize "grammar.dat") 
4522;
4523;      *productions* => ((S -> S |a| S |b|) (S -> EPSILON)
4524;      *terminals*   =>  (|a| |b| $)
4525;      *first-derived-terminals* => NIL
4526;      *epsilon-free-first-derived-terminals* => NIL
4527;      *conflicts* => NIL
4528;      *has-epsilon-productions* => NIL
4529;
4530; ------------------------------------------------------------------------------
4531
4532(defun load-input-and-initialize( grammar-file )
4533
4534; Better safe than sorry.
4535(setq *goto-graph*   nil)
4536(setq *action-table* nil)
4537(setq *goto-table*   nil)
4538(setq *conflicts*    nil)
4539(setq *first-derived-terminals*              nil)
4540(setq *epsilon-free-first-derived-terminals* nil)
4541
4542
4543;  Split up productions and add the endmarker to the list of terminals.
4544(let ( (fp (open grammar-file :direction :input)) )
4545
4546    (setq *productions* (read fp))
4547    (setq *terminals*   (read fp))
4548
4549    ; Add the endmarker to the list of terminals.
4550    (setq *terminals*    (append *terminals*   '($)))
4551
4552    ;  Split up productions (so we don't handle alternates directly)
4553    (setq *productions*  (split-up-productions *productions*))
4554
4555
4556    ;  Detect epsilon productions.
4557    (setq *has-epsilon-productions* nil)
4558
4559    (dolist (production *productions*)
4560
4561        (if (epsilon-production? production)
4562
4563            (setq *has-epsilon-productions* T)))
4564
4565    (close fp))
4566)
4567
4568
4569; ------------------------------------------------------------------------------
4570; |                               compile-all                                  |
4571; ------------------------------------------------------------------------------
4572;
4573;  DESCRIPTION
4574;
4575;      Compile all the functions in this program, except compile-all itself.
4576;
4577;  CALLING SEQUENCE
4578;
4579;      (compile-all)
4580;
4581;  EXAMPLE
4582;
4583;      (compile-all) =>
4584;
4585;      ;;; Compiling function LOAD-INPUT-AND-INITIALIZE...tail-merging...
4586;      assembling...emitting...done.
4587;
4588;          --- and so on, with pauses for garbage collection ---
4589;
4590;      ;;; Compiling function PARSER-GENERATOR...assembling...emitting...done
4591;      NIL
4592;
4593; ------------------------------------------------------------------------------
4594
4595(defun compile-all() 
4596
4597;  Tell the compiler the following variables are global (have dynamic binding).
4598
4599    (proclaim '(special *productions*))
4600    (proclaim '(special *has-epsilon-productions*))
4601    (proclaim '(special *terminals*))
4602    (proclaim '(special *first-derived-terminals*))
4603    (proclaim '(special *epsilon-free-first-derived-terminals*))
4604    (proclaim '(special *goto-graph*))
4605    (proclaim '(special *action-table*))
4606    (proclaim '(special *goto-table*))
4607    (proclaim '(special *conflicts*))
4608
4609(let ( (functions-to-compile
4610
4611      '(  print-legal-notice
4612          load-input-and-initialize
4613
4614          getHeadOfListUpTo  
4615          removeItemFromList  
4616          positionInList       
4617          insertItemIntoList
4618          combine
4619          itemInList
4620
4621          terminal?
4622          nonterminal?
4623          derives-leading-terminal?
4624          derives-leading-nonterminal?
4625          valid-production? 
4626          set-of-items-in-graph? reduction?
4627          is-accept? terminal-after-dot?
4628          equal-sets-of-items?
4629          contained-in-item? 
4630          element-of-item? 
4631          epsilon-production?  
4632          same-symbol?
4633
4634          first-alternate! 
4635          all-but-first-alternate!
4636          production-rhs! 
4637          symbol-after-dot! 
4638          string-before-comma! 
4639          lookahead-of! 
4640          select-items! 
4641          hash-value! 
4642          current-state! 
4643          action-list! 
4644          transition-symbol! 
4645          action-line-state! 
4646          action-trigger-symbol! 
4647
4648          core-of-item! 
4649          core-hash-value-of-item
4650          core-hash-value-of-set-of-items
4651          merge-lookaheads 
4652          merge-cores
4653
4654          split-up-production 
4655          split-up-productions
4656          make-item 
4657          move-dot-right 
4658          create-augmenting-item 
4659          find-grammar-symbols 
4660          create-new-node  
4661          create-new-link  
4662
4663          node-number  
4664          item-to-production
4665          production-number
4666          tag-symbol  
4667          flag-epsilon-free!  
4668          epsilon-free-only
4669          untag-list  
4670          flag-non-epsilon-free precedence
4671
4672          derived-leading-terminal
4673          initial-first-derived-terminals 
4674          first-terminals-of-rhs 
4675          update-first-derived-function 
4676          create-all-first-derived-terminals 
4677          first-terminals-of-symbol 
4678          first-derived-terminals 
4679
4680          add-action-or-goto
4681          insert-action-or-goto-into-list 
4682          goto closure compute-goto 
4683
4684          create-goto-graph 
4685          build-action-table
4686          build-goto-table 
4687
4688          write-header 
4689          write-terminals
4690          write-productions  
4691          write-goto-graph
4692          write-action-or-goto-table
4693          write-error-message-table
4694          construct-error-messages
4695
4696          parser-generator 
4697          file-exists?
4698          base-path!
4699          test-parser-generator)))
4700
4701;  Compile all the functions, except compile-all itself.
4702
4703(dolist (function-to-compile functions-to-compile)
4704
4705    (compile function-to-compile)))
4706)
4707
4708
4709
4710; ==============================================================================
4711; |                               Main Program                                 |
4712; ==============================================================================
4713
4714; ------------------------------------------------------------------------------
4715; |                                  parser-generator                          |
4716; ------------------------------------------------------------------------------
4717;
4718;  DESCRIPTION
4719;
4720;      Main program which produces the LR(1) and LALR(1) parsing tables.
4721;
4722;  CALLING SEQUENCE
4723;
4724;      (parser-generator in-file out-file :parser-type parser-type)
4725;
4726;       in-file       Productions and terminals for the grammar.  See
4727;                     the file grammar.dat for an example.
4728;
4729;       out-file      The numbered productions, goto graph, action and 
4730;                     parsing tables for the grammar.  See the files 
4731;                     lalrparser.dat and parser.dat for examples.
4732;
4733;       parser-type   'LR1 or 'LALR1 parsing.  The default is 'LALR1.
4734;
4735;       Returns:      A string indicating if any conflicts have occurred.
4736;
4737;  EXAMPLE
4738;
4739;      (parser-generator "grammar.dat" "parser.dat" :parser-type 'lr1) 
4740;        =>  NIL and the file parser.dat
4741;
4742;      (parser-generator "grammar.dat" "lalrparser.dat" :parser-type 'lalr1) 
4743;        => NIL and the file lalrparser.dat
4744;
4745;      (parser-generator "grammar.dat" "lalrparser.dat")
4746;        => same as above
4747;    
4748;      (parser-generator "grammar4.dat" "junk" :parser-type 'lalr1) 
4749;      => "Conflicts were detected" and the file junk.
4750;
4751; ------------------------------------------------------------------------------
4752
4753(defun parser-generator( in-file out-file &key (parser-type 'LALR1) )
4754
4755    ; Keep my lawyer happy.
4756    (print-legal-notice)
4757
4758    ; Read in the grammar file productions and terminals.
4759    (load-input-and-initialize in-file)
4760
4761    (let ( (fp (open out-file :direction :output :if-exists :supersede)) )
4762
4763        ; Compute the goto graph for the grammar.
4764        (setq *goto-graph*   (create-goto-graph   parser-type))
4765
4766        ; Construct the action and goto parsing tables.
4767        (setq *action-table* (build-action-table *goto-graph*))
4768        (setq *goto-table*   (build-goto-table   *goto-graph*))
4769
4770        ; Write out the terminals and productions for reference.
4771        (write-header      fp  parser-type)
4772        (write-terminals   fp  *terminals*)
4773        (write-productions fp  *productions*)
4774
4775        ; Write out the goto graph.
4776        (write-goto-graph fp *goto-graph*)
4777
4778        ; Write out the action and goto parse tables.
4779        (write-action-or-goto-table fp *action-table* :table-type 'ACTION)
4780        (write-action-or-goto-table fp *goto-table* :table-type 'GOTO)
4781
4782        ; Write out the error message template.
4783        (write-error-message-table fp *action-table*)
4784
4785        (close fp)
4786
4787        (if *conflicts* 
4788            "Conflicts were detected")
4789    )
4790)
4791
4792
4793
4794; ------------------------------------------------------------------------------
4795; |                              print-file-to-console                         |
4796; ------------------------------------------------------------------------------
4797;
4798;  DESCRIPTION
4799;
4800;      List the lines of a file to the console.
4801;
4802;  CALLING SEQUENCE
4803;
4804;      (print-file-to-console filename)
4805;
4806;      filename  Name of the file.
4807;
4808;      Returns:  
4809;
4810;  EXAMPLE
4811;
4812;      (print-file-to-console "grammar.dat") 
4813;      =>      ;  GrammarE=E+T_T.dat
4814;              ---------------------------------------------------------------------------
4815;
4816;              A grammar of arithmetic expressions,
4817;
4818;              E -> E + T | T
4819;              ...
4820;
4821; ------------------------------------------------------------------------------
4822
4823(defun print-file-to-console( file-name )
4824    (format t "~%~%=========================== ~A =============================~%~%~%" file-name)
4825
4826    (with-open-file (stream file-name)
4827      (do ( (line (read-line stream nil)    ; nil inhibits throw at eof
4828                  (read-line stream nil) )  ; and read-line returns nil at eof
4829          )
4830          ( (null line) )  ; Terminate at eof
4831          (format t "~A~%" line)
4832      )
4833    )
4834)
4835
4836
4837(defun component-present-p (value)
4838  (and value (not (eql value :unspecific))))
4839
4840(defun directory-pathname-p  (p)
4841  (and
4842   (not (component-present-p (pathname-name p)))
4843   (not (component-present-p (pathname-type p)))
4844   p))
4845
4846(defun pathname-as-directory (name)
4847  (let ((pathname (pathname name)))
4848    (when (wild-pathname-p pathname)
4849      (error "Can't reliably convert wild pathnames."))
4850    (if (not (directory-pathname-p name))
4851      (make-pathname
4852       :directory (append (or (pathname-directory pathname) (list :relative))
4853                          (list (file-namestring pathname)))
4854       :name      nil
4855       :type      nil
4856       :defaults pathname)
4857      pathname)))
4858
4859
4860; ------------------------------------------------------------------------------
4861; |                           file-exists?                                     |
4862; ------------------------------------------------------------------------------
4863;
4864; DESCRIPTION
4865;
4866;      Portable way to check if a file or directory exists.
4867;
4868; CALLING SEQUENCE
4869;
4870;      (file-exists? directory-or-file)
4871;
4872;      directory-or-file    Pathname for directory or file
4873;      Returns:             t if it is there, nil if not.
4874;
4875;
4876; EXAMPLES
4877;
4878;     (file-exists? "/NotThere") => nil
4879;     (file-exists? "/Volumes/seanoconnor") => t
4880;
4881; ------------------------------------------------------------------------------
4882
4883(defun file-exists? (pathname)
4884  "Check if the file exists"
4885      #+(or sbcl lispworks openmcl)
4886      (probe-file pathname)
4887
4888      #+(or allegro cmu)
4889      (or (probe-file (pathname-as-directory pathname))
4890                    (probe-file pathname))
4891
4892      #+clisp
4893      (or (ignore-errors
4894           (probe-file (pathname-as-file pathname)))
4895                        (ignore-errors
4896                                  (let ((directory-form (pathname-as-directory pathname)))
4897                                              (when (ext:probe-directory directory-form)
4898                                                            directory-form))))
4899
4900      #-(or sbcl cmu lispworks openmcl allegro clisp)
4901      (error "file-exists-p not implemented")
4902)
4903
4904
4905
4906; ------------------------------------------------------------------------------
4907; |                               base-path!                                   |
4908; ------------------------------------------------------------------------------
4909;
4910; DESCRIPTION
4911;
4912;      Try to find out where the base directory for the web page is located.
4913;
4914; CALLING SEQUENCE
4915;
4916;      (base-path!)
4917;
4918;      Returns:             String of base path or nil if it can't find it.
4919;
4920;
4921; EXAMPLES
4922;
4923;     (base-path!) => "C:/Sean/WebSite"         ; Got it.
4924;     (base-path!) => nil                       ; Could't find it.
4925;
4926; ------------------------------------------------------------------------------
4927
4928(defun base-path!()
4929    (let ( (possible-directories-list '( 
4930                                         "/cygdrive/c/Sean/WebSite"                 ; Windows / Cygwin
4931                                         "/Users/seanoconnor/Desktop/Sean/WebSite"  ; Mac OS
4932                                         "/home/seanoconnor/Desktop/Sean/WebSite"   ; Ubuntu Linux
4933                                        )))
4934
4935     (dolist (base-path possible-directories-list)  
4936;       (format t "base path = ~S exists = ~S~%" base-path (file-exists? base-path) )
4937         (if (file-exists? base-path) (return (concatenate 'string base-path "/"))))
4938    )
4939)
4940
4941
4942
4943; ------------------------------------------------------------------------------
4944; |                           test-parser-generator                            |
4945; ------------------------------------------------------------------------------
4946;
4947; DESCRIPTION
4948;
4949;     Run the parser generator on a test grammar and produce test parsing
4950;     tables.
4951;
4952; CALLING SEQUENCE
4953;
4954;     (test-parser-generator)
4955;
4956;      Set the files and paths to your requirements.  I'm assuming you've
4957;      installed cygwin if you're on a Windows machine.
4958;
4959; ------------------------------------------------------------------------------
4960
4961(defun test-parser-generator()
4962
4963    ;  Compile all the functions for speed.
4964    (compile-all)
4965
4966    ; Garbage collect.
4967    (gc)
4968
4969    ;  Generate a set of parse tables from a test grammar, both LR(1) and
4970    ;  LALR(1).
4971    (let* ( 
4972            ; Set up the base directory paths.
4973            (base-path             (base-path!))
4974
4975            (sub-path              "ComputerScience/Compiler/ParserGeneratorAndParser/")
4976            (grammar-path          "Grammars/" )
4977            (parse-table-path      "ParseTables/")
4978
4979            ;  List the grammar files (input) and parse table files (output).
4980            (grammar-file         '( "GrammarS=SaSbEPSILON.dat"
4981                                     "GrammarE=E+T_T.dat"
4982                                     "GrammarPoly.dat"
4983                                     "GrammarLR(1)NotLALR(1).dat"
4984                                     "GrammarNotLR(1)NotLALR(1).dat") )
4985            (parse-file-LR1       '( "ParseTablesLR(1)_S=SaSbEPSILON.dat"
4986                                     "ParseTablesLR(1)_E=E+T_T.dat" 
4987                                     "ParseTablesLR(1)_Poly.dat"
4988                                     "ParseTablesLR(1)_NotLALR(1).dat"
4989                                     "ParseTablesLR(1)_NotLR(1)NotLALR(1).dat") )
4990            (parse-file-LALR1     '( "ParseTablesLALR(1)_S=SaSbEPSILON.dat"
4991                                     "ParseTablesLALR(1)_E=E+T_T.dat"
4992                                     "ParseTablesLALR(1)_Poly.dat"
4993                                     "ParseTablesLALR(1)_NotLALR(1).dat"
4994                                     "ParseTablesLALR(1)_NotLR(1)NotLALR(1).dat") )
4995          )
4996
4997          (dotimes (i (length grammar-file))
4998
4999              (let* (
5000                      ;  Create the full file path.
5001                      (full-grammar-file    
5002                              (concatenate 'string 
5003                                           base-path sub-path grammar-path 
5004                                           (nth i grammar-file))
5005                      )
5006
5007                      (full-parse-file-LR1  
5008                              (concatenate 'string 
5009                                       base-path sub-path parse-table-path 
5010                                       (nth i parse-file-LR1))
5011                      )
5012
5013                      (full-parse-file-LALR1 
5014                               (concatenate 'string 
5015                                        base-path sub-path parse-table-path 
5016                                        (nth i parse-file-LALR1))
5017                      )
5018                    )
5019
5020                    ; Call the parser generator to generate parse tables for 
5021                    ; both LR(1) and LALR(1).
5022                    (parser-generator full-grammar-file full-parse-file-LR1
5023                                      :parser-type 'LR1)
5024
5025                    (parser-generator full-grammar-file full-parse-file-LALR1)
5026
5027                    ; Display the results to the console.
5028                    (print-file-to-console full-grammar-file)
5029                    (print-file-to-console full-parse-file-LR1)
5030                    (print-file-to-console full-parse-file-LALR1)
5031                )
5032         )
5033    )
5034)