Forum: Too Lazy BBS

LALR look-ahead sets from item right context grammar?

From Richard Rogers@rprogers@seanet.com to comp.compilers on Thu Jan 29 18:25:19 2026

From Newsgroup: comp.compilers

In discussing LR-Regular parsing, Grune [1] gives an algorithm for computing the CFG of LR(0) item right contexts. I wondered if LALR(1) look-ahead sets
are just the FIRST of the item right context grammar?

1. @book{Grune:1990:PTP:130365,
author = {Grune, Dick and Jacobs, Ceriel J. H.},
title = {Parsing Techniques: A Practical Guide},
year = {1990},
isbn = {0-13-651431-6},
publisher = {Ellis Horwood},
address = {Upper Saddle River, NJ, USA},
}
--- Synchronet 3.21b-Linux NewsLink 1.2

From Chris Clark@cclark@imachinesinc.com to comp.compilers on Sat Jan 31 01:30:19 2026

From Newsgroup: comp.compilers

While I haven't read the book you reference, and thus am not familiar with the details of the "right context grammar", I would suggest you look to verify
that in cases where there are nullable non-terminals in the right context grammar, that it would also include the "follow" set and not just the "first' set. We did something similar to that in our version of "Yacc++ and the Language Objects Library" but our algorithm is a non-standard implementation
of the technology. Still the LR(0) machine has more information in it than is generally assumed.
--- Synchronet 3.21b-Linux NewsLink 1.2

From Richard Rogers@rprogers@seanet.com to comp.compilers on Sat Feb 7 01:42:35 2026

From Newsgroup: comp.compilers

The right context grammar for item i in state j of the LR(0) automaton
of CFG G describes all-a of the suffixes that can follow any prefix
arriving at (i, j) when parsing a string in L(G). If the start symbol of
the right context grammar is nullable, it would mean a string in L(G)
could end at (i, j). If we augment G with S' -> S $ then S' -> S $ *
should be the only item with a nullable right context grammar start
symbol, and FOLLOW would also be the empty string.

And my hypothesis is that FIRST(RCG(i,j)) is the LALR look-ahead set for
(i, j), where RCG(i, j) is the start symbol of the right context grammar
for item i in state j of the LR(0) automaton of the augmented G.

The RCG computing algorithm gives a right-regular grammar for the right contexts if the non-terminals of G are treated as terminals in the RCG.
For making an LR-Regular parser, G's non-terminals are replaced by their approximate regular envelopes to give a regular approximation of the
right contexts. But if we include G's non-terminals and productions in
the RCG, the RCG is an accurate CFG for the right contexts.
--- Synchronet 3.21b-Linux NewsLink 1.2

From Richard Rogers@rprogers@seanet.com to comp.compilers on Sat Feb 28 13:43:17 2026

From Newsgroup: comp.compilers

Qwen3-coder just pointed out that what I've described is actually "LALR(1)
with context-sensitive look-aheads," also known by a couple other names. To
get true LALR(1), you'd have to take the union of the FIRST(RCG(i,j)) over states j containing item i. But that seems silly since it's better and less code without doing the union :)

--- Synchronet 3.21d-Linux NewsLink 1.2

Who's Online
Recent Visitors
- Geek2
  Sun May 17 07:06:15 2026
  from Euclid, Oh via Telnet
- Geek2
  Sat May 16 21:25:04 2026
  from Euclid, Oh via Telnet
- Jas Hud
  Sat May 16 00:50:28 2026
  from Bbs.Eob-Bbs.Com,wi via Telnet
- Geek2
  Fri May 15 19:53:20 2026
  from Euclid, Oh via Telnet

System Info

Sysop:	Amessyroom
Location:	Fayetteville, NC
Users:	65
Nodes:	6 (0 / 6)
Uptime:	08:01:29
Calls:	862
Files:	1,311
D/L today:	1 files (1,366K bytes)
Messages:	264,936