-
LALR look-ahead sets from item right context grammar?
From
Richard Rogers@rprogers@seanet.com to
comp.compilers on Thu Jan 29 18:25:19 2026
From Newsgroup: comp.compilers
In discussing LR-Regular parsing, Grune [1] gives an algorithm for computing the CFG of LR(0) item right contexts. I wondered if LALR(1) look-ahead sets
are just the FIRST of the item right context grammar?
1. @book{Grune:1990:PTP:130365,
author = {Grune, Dick and Jacobs, Ceriel J. H.},
title = {Parsing Techniques: A Practical Guide},
year = {1990},
isbn = {0-13-651431-6},
publisher = {Ellis Horwood},
address = {Upper Saddle River, NJ, USA},
}
--- Synchronet 3.21b-Linux NewsLink 1.2
-
From
Chris Clark@cclark@imachinesinc.com to
comp.compilers on Sat Jan 31 01:30:19 2026
From Newsgroup: comp.compilers
While I haven't read the book you reference, and thus am not familiar with the details of the "right context grammar", I would suggest you look to verify
that in cases where there are nullable non-terminals in the right context grammar, that it would also include the "follow" set and not just the "first' set. We did something similar to that in our version of "Yacc++ and the Language Objects Library" but our algorithm is a non-standard implementation
of the technology. Still the LR(0) machine has more information in it than is generally assumed.
--- Synchronet 3.21b-Linux NewsLink 1.2
-
From
Richard Rogers@rprogers@seanet.com to
comp.compilers on Sat Feb 7 01:42:35 2026
From Newsgroup: comp.compilers
The right context grammar for item i in state j of the LR(0) automaton
of CFG G describes all-a of the suffixes that can follow any prefix
arriving at (i, j) when parsing a string in L(G). If the start symbol of
the right context grammar is nullable, it would mean a string in L(G)
could end at (i, j). If we augment G with S' -> S $ then S' -> S $ *
should be the only item with a nullable right context grammar start
symbol, and FOLLOW would also be the empty string.
And my hypothesis is that FIRST(RCG(i,j)) is the LALR look-ahead set for
(i, j), where RCG(i, j) is the start symbol of the right context grammar
for item i in state j of the LR(0) automaton of the augmented G.
The RCG computing algorithm gives a right-regular grammar for the right contexts if the non-terminals of G are treated as terminals in the RCG.
For making an LR-Regular parser, G's non-terminals are replaced by their approximate regular envelopes to give a regular approximation of the
right contexts. But if we include G's non-terminals and productions in
the RCG, the RCG is an accurate CFG for the right contexts.
--- Synchronet 3.21b-Linux NewsLink 1.2
-
From
Richard Rogers@rprogers@seanet.com to
comp.compilers on Sat Feb 28 13:43:17 2026
From Newsgroup: comp.compilers
Qwen3-coder just pointed out that what I've described is actually "LALR(1)
with context-sensitive look-aheads," also known by a couple other names. To
get true LALR(1), you'd have to take the union of the FIRST(RCG(i,j)) over states j containing item i. But that seems silly since it's better and less code without doing the union :)
--- Synchronet 3.21d-Linux NewsLink 1.2