• Nice trick for "superfix" relational operators operator precedence parsing.

    From Kaz Kylheku@643-408-1753@kylheku.com to comp.compilers on Wed May 7 02:44:45 2025
    From Newsgroup: comp.compilers

    I wanted to have compounding relational operators (which I call
    "superfix") in an operator precedence parser. This means that
    syntax like:

    1 < j < k <= N

    behaves as one comparison operation meaning
    (1 < j) and (j < k) and (k <= N), except that the j and k
    terms are not evaluated twice. There may be short-circuiting
    of the logical and.

    We want parentheses to override this, so that

    1 < (j < k) <= N

    is treated as

    1 < T < N

    where T is a single term consisting of (j < k).

    Anyway, I came up with a nice solution for bringing this about
    without complicating the Shunting-Yard-based operator precedence
    parser.

    1. I put all the relational operators into a single level of
    precedence, and made them right-associative.

    2. A pass is made through the resulting syntax tree, whic
    recognizes the right-associative tree patterns involving these
    operators and rewrites them to the "and" terms.

    We can see this live:

    (parse-infix '(1 < i < j <= N))
    (< 1 (< i (<= j N)))

    (sys:finish-infix *1)
    (and (< 1 i j)
    (<= j N))

    (In my actual implementation, the < function is n-ary so we reduce
    to a single (< 1 i j) term. Because this is a strictly evaluated
    function call, it means that even if (< 1 i) is false, j is
    still evaluated, unlike with (and (< 1 i) (< i j))).

    Tree pattern matching matchin and rewriting is used to handle patterns
    like (relop1 A (relop2 B C)) into (and (relop1 A B) (relop2 B C)).

    Oh, about multiple evaluation, let's replace i with (inc i),
    an expression wth a side-effect of pre-incrementing i (returning
    the new value) which must happen once:

    (parse-infix '(1 < (inc i) < j <= N))
    (< 1 (< (inc i) (<= j N)))
    (sys:finish-infix *1)
    (let (#:g0014)
    (and (< 1 (set #:g0014
    (inc i)))
    (< #:g0014 j)
    (<= j N)))

    Code is emitted to bind a temporary hidden variable, which receives
    the result of the first (inc i). The subsequent term that would
    be an occurrence of (inc i) under the < function instead refers
    to that variable. The strategy both preserves left-to-right evaluation
    of all terms, and also depends on it or correctness.

    Other than adjusting the declarations of associativity and precedence, nothing had to be done in the parser; all the work in in the separate "finish infix" pass.

    It is then easy to document that way, too; the parsing model is very clear, requiring no concepts beyond binary operator precedence, and the heuristics of the "superfix" transformation are easy to clearly document also.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    [COBOL had roughly that shortcut comparison syntax in 1960 so a lot of people must have looked at this. -John]
    --- Synchronet 3.21b-Linux NewsLink 1.2