• LALR look-ahead sets from item right context grammar?

    From Richard Rogers@rprogers@seanet.com to comp.compilers on Thu Jan 29 18:25:19 2026
    From Newsgroup: comp.compilers

    In discussing LR-Regular parsing, Grune [1] gives an algorithm for computing the CFG of LR(0) item right contexts. I wondered if LALR(1) look-ahead sets
    are just the FIRST of the item right context grammar?

    1. @book{Grune:1990:PTP:130365,
    author = {Grune, Dick and Jacobs, Ceriel J. H.},
    title = {Parsing Techniques: A Practical Guide},
    year = {1990},
    isbn = {0-13-651431-6},
    publisher = {Ellis Horwood},
    address = {Upper Saddle River, NJ, USA},
    }
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Chris Clark@cclark@imachinesinc.com to comp.compilers on Sat Jan 31 01:30:19 2026
    From Newsgroup: comp.compilers

    While I haven't read the book you reference, and thus am not familiar with the details of the "right context grammar", I would suggest you look to verify
    that in cases where there are nullable non-terminals in the right context grammar, that it would also include the "follow" set and not just the "first' set. We did something similar to that in our version of "Yacc++ and the Language Objects Library" but our algorithm is a non-standard implementation
    of the technology. Still the LR(0) machine has more information in it than is generally assumed.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Richard Rogers@rprogers@seanet.com to comp.compilers on Sat Feb 7 01:42:35 2026
    From Newsgroup: comp.compilers

    The right context grammar for item i in state j of the LR(0) automaton
    of CFG G describes all-a of the suffixes that can follow any prefix
    arriving at (i, j) when parsing a string in L(G). If the start symbol of
    the right context grammar is nullable, it would mean a string in L(G)
    could end at (i, j). If we augment G with S' -> S $ then S' -> S $ *
    should be the only item with a nullable right context grammar start
    symbol, and FOLLOW would also be the empty string.

    And my hypothesis is that FIRST(RCG(i,j)) is the LALR look-ahead set for
    (i, j), where RCG(i, j) is the start symbol of the right context grammar
    for item i in state j of the LR(0) automaton of the augmented G.

    The RCG computing algorithm gives a right-regular grammar for the right contexts if the non-terminals of G are treated as terminals in the RCG.
    For making an LR-Regular parser, G's non-terminals are replaced by their approximate regular envelopes to give a regular approximation of the
    right contexts. But if we include G's non-terminals and productions in
    the RCG, the RCG is an accurate CFG for the right contexts.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Richard Rogers@rprogers@seanet.com to comp.compilers on Sat Feb 28 13:43:17 2026
    From Newsgroup: comp.compilers

    Qwen3-coder just pointed out that what I've described is actually "LALR(1)
    with context-sensitive look-aheads," also known by a couple other names. To
    get true LALR(1), you'd have to take the union of the FIRST(RCG(i,j)) over states j containing item i. But that seems silly since it's better and less code without doing the union :)

    --- Synchronet 3.21d-Linux NewsLink 1.2