From Newsgroup: comp.compilers
On 2025-04-05, Kaz Kylheku <
643-408-1753@kylheku.com> wrote:
Hi all,
I recently came up with neat trick in order to obtain particular
(to me) desirable results out of an infix parser (which more
or less follows Shunting Yard).
This "precedence demotion" algorithm turns out to have some
undesirable properties, leading to unintuitive parses, such as,
oh, a + sqrt b * sqrt c parsing as (a + sqrt b) * (sqrt c).
This led me to search for another way to capture the good stuff while
avoiding the bad.
When the loop is just about to process an infix operator O, we perform
these steps before anything else:
1. Determine whether the infix operator O is immediately followed by a
sequence of one or more consecutive prefix operators P1, P2.
We keep this step.
2. If such a sequence is identified, we determine which of the Pi's
has lowest precedence, and we subtract one from that precedence.
We also keep this, but don't subtract one from the prededence.
3. Then we compare the precedence of infix operator O with the
precedence calculated in (3). If O has higher precedence, we
*demote* it to the calculated precedence. (Only the currently
occurring instance of that operator.)
We replace this step entirely. No dynamic precedence adjustment takes
place. Instead:
3. We search the operator stack to find an prefix operator,
which has a precdedence at least as high as the one determined in 2.
If such an operator exists, we perform reductions until we remove it.
If no such operator exists in the stack, we do nothing.
4. We then continue with the usual algorithm, processing the operator
stack based on comparing precedence and building nodes in the
output tree and so on.
Some live examples:
1> (parse-infix '(sin x + x + cos y + y))
(+ (sin (+ x x))
(cos (+ y y)))
[...]
2> (parse-infix '(- x * x * - y * y))
(* (- (* x x))
(- (* y y)))
[...]
3> (parse-infix '(- x + x + - y + y)) ;; ... x + - y ...
(+ (+ (+ (- x) x)
(- y))
y)
[...]
4> (parse-infix '(sin x + x + - cos y + y))
(+ (sin (+ x x))
(- (cos (+ y y))))
I get exactly the same output on these, and all my existing test cases pass.
Effectively we are still demoting the precedence of the infix operator,
but in a surgical way: we are giving it lower precedence than
specifically identified unparsed material to the left in the parsing
stack, by doing reductions until we clear that material.
E.g. in the case of
(sin x + x + - cos y + y))
At the second + , we see a sequence of prefix operators to
the right, (- cos). The lowest precedence among them is that of cos.
We the search the parsing stack until we find a prefix operator
at least as low, and so sin is identified.
What is going in is a kind of bracketing. We see some prefix operators to the right, and so we scan the left context to find a parallel prefix operator whose parse is not yet complete. If we find it, we then close off that operator, reducing it to a syntax node.
--
TXR Programming Language:
http://nongnu.org/txr
Cygnal: Cygwin Native Application Library:
http://kylheku.com/cygnal
Mastodon: @
Kazinator@mstdn.ca
--- Synchronet 3.21a-Linux NewsLink 1.2