• infix via code walker hook.

    From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.lisp on Mon Mar 31 21:24:03 2025
    From Newsgroup: comp.lang.lisp

    I've been toying with the idea for a while of using a code walker
    or hooking into an existing code walker in order to support S-exp-ified
    infix syntax for arithmetic.

    I have an advanced proof of concept happening.

    What is interesting here is that the work gives rise to a powerful
    argument in favor of the separation of function and variable bindings,
    which helps to make it trivial:

    $ ./txr -i infix.tl
    TXR's no-spray organic production means every bug is carefully removed by hand.
    1> (ifx (let ((a 1) (b 2) (c 3))
    ((a + b) * 3)))
    9

    Notice how ifx "works through" the let (and trust me when I say, any layering of syntax it is not interested in).

    ifx will find all infix expressions in its interior and transform them
    based on the pattern (arg1 op [arg ...]) to (op arg1 [arg ...]).

    This is the complete implementation, but relies on an unreleased
    *expand-hook* feature:

    (defun-match infix-expand-hook
    ((@(as form (@x @y . @rest)) @env nil)
    (if (fboundp y)
    ^(,y ,x ,*rest)
    form))
    ((@form @nil @nil) form))

    (defmacro ifx (:env env . body)
    (let ((*expand-hook* (fun infix-expand-hook)))
    (expand ^(progn ,*body) env)))

    The implementation's macro-expanding code-walker invokes *expand-hook*
    for each macro (symbol or compound), function call or malformed form
    that it encounters. It passes the form, the macro-expansion-time
    environment, and a type symbol. The type symbol is :macro when the
    form is a macro invocation, :fun if it is a function call, or else
    nil if it is invalid.

    The above infix module won't make the transformation on a form
    that looks valid. Concretely, if you define a function a, whether
    globally or locally, then (a + b) will not be transformed.

    As you can see, this also showcases the benefits of a function/variable namespace separation in Lisp. Even though a has a binding, (a + b) can
    be transformed to (+ a b). This is because its binding is a variable
    binding, which we can ignore.

    In a Lisp-1 like Clojure or Scheme, there is a difficulty here:
    we have to determine that a is not a function. That requires
    type information or else a run-time check. The latter is a nonstarter,
    since we are looking to transform code at macro time.
    But even if you have the former, type info, who is to say that is
    correct; maybe someone wants (f + g) where those are functions.

    Here is a toast to variable/function namespace separation.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    --- Synchronet 3.21d-Linux NewsLink 1.2
  • From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.lisp on Wed Apr 2 03:26:20 2025
    From Newsgroup: comp.lang.lisp

    On 2025-03-31, Kaz Kylheku <643-408-1753@kylheku.com> wrote:
    (defmacro ifx (:env env . body)
    (let ((*expand-hook* (fun infix-expand-hook)))
    (expand ^(progn ,*body) env)))

    Doh! I just though of expander-let which can do this nicely.

    And ... it turns out I already developed it! expander-let
    appeared in TXR 287, in June 2023. (Current is 299).

    Thus:

    (defmacro ifx (. body)
    ^(expander-let ((*expand-hook* (fun infix-expand-hook)))
    ,*body))

    That is way better. The expander encounters the expander-let and does
    the binding for us, wraps the body in a (progn ...) and naturally
    recurses over it. Then the expander-let disappears.

    The explicit expand call causes a wasteful double expansion; whatever it produces will be recursed into by the expander again.

    Arguably, this design is wrong because, the binding for expand hooks
    should ideally be in the macro environment, not in the dynamic
    environment. If a closure is ever produced that captures a macro
    environment, and does form expansion when called, it may not have the
    correct value of *expand-hook* that was in effect. Also, categorically,
    local expander hooks are kind of generalized macrolets, and therefore
    belong in the same space.

    There is the obvious workaround in that the scope around the closure can
    just copy the then-visible dynamic value of *expand-hook* into a lexical variable, and capture that. It's an ugly ritual to have to remember,
    and to have to carry out for correctness, though.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    --- Synchronet 3.21d-Linux NewsLink 1.2
  • From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.lisp on Fri Apr 4 03:14:58 2025
    From Newsgroup: comp.lang.lisp

    On 2025-04-02, Kaz Kylheku <643-408-1753@kylheku.com> wrote:
    On 2025-03-31, Kaz Kylheku <643-408-1753@kylheku.com> wrote:
    (defmacro ifx (:env env . body)
    (let ((*expand-hook* (fun infix-expand-hook)))
    (expand ^(progn ,*body) env)))

    Doh! I just though of expander-let which can do this nicely.

    I'm serious about getting a form of infix as a supported
    feature, and it's coming along.

    The ground rules are: no read syntax. Everything is done using
    existing syntax.

    (ifx (let ((i 0))
    (while (i ++ < 9)
    (put-line (pic "## ## #.#####" i (i * i) (sqrt sin(i) ** 2 + sin(2 * i) ** 2))))))
    1 1 1.23891
    2 4 1.18304
    3 9 0.31303
    4 16 1.24562
    5 25 1.10249
    6 36 0.60497
    7 49 1.18867
    8 64 1.03040
    9 81 0.85663

    I have a Shunting Yard type infix parser which is extended with handling prefix and postifix operators, as well as error checking. I converted the algorithm to a pattern-based term rewriting system with two stacks. There are a few pure term rewriting rules which do some nice things.

    Here is one. I took numerous monadic math functions like sin, cos, log, and turned them into very low precedence unary operators. What this means
    is that sin is applied to the entire formula here:

    1> (parse-infix '(sin a + b + c + d))
    (sin (+ (+ (+ a b) c)
    d))

    But, if we put parentheses on the operand of sin, then it's no longer
    the unary sin, but a function call with that argument:

    2> (parse-infix '(sin (a) + b + c + d))
    (+ (+ (+ (sin a) b)
    c)
    d)
    3> (parse-infix '(sin (a + b) + c + d))
    (+ (+ (sin a + b)
    c)
    d)

    The parser has a hack like "if we are processing a unary operator
    which is tagged as a function-like operator, and it is immediately
    followed by a compound expression, then insert the operator as
    the head operator into the compound expression".

    Another term rewriting rule handles C-like [ ] array reference
    agglutination:

    4> (parse-infix '(a + b[i][j][k]))
    (+ a [[[b i] j] k])

    It's like magic; just use (ifx ...) around any expression or
    even an entire top-level form like a defun, and you can freely
    mix infix and regular Lisp in all evaluation contexts.

    4> (ifx (list* 1 2 (mapcar (ret (@1 + @1)) '(10 20 30))))
    (1 2 20 40 60)
    5> (ifx (list* 1 2 (mapcar (lambda (x) (+ x x)) '(10 20 30))))
    (1 2 20 40 60)

    It's turning out ot be a pretty nice system, combining autodetection within an explicit (ifx ...) contour.

    I think it is already better than SRFI-105 curly braces and some other systems. --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    --- Synchronet 3.21d-Linux NewsLink 1.2
  • From Stefan Monnier@monnier@iro.umontreal.ca to comp.lang.lisp on Fri Apr 4 14:17:20 2025
    From Newsgroup: comp.lang.lisp

    1> (parse-infix '(sin a + b + c + d))
    (sin (+ (+ (+ a b) c)
    d))

    But, if we put parentheses on the operand of sin, then it's no longer
    the unary sin, but a function call with that argument:

    2> (parse-infix '(sin (a) + b + c + d))
    (+ (+ (+ (sin a) b)
    c)
    d)

    For someone coming from "statically typed functional programming", this
    is *very* weird.
    All those languages (Haskell/OCaml/SML/Coq/Idris/Agda/...) treat

    (sin a + b + c + d)

    the same as

    (sin (a) + b + c + d)

    I think it is already better than SRFI-105 curly braces and some
    other systems.

    A related effort was Honu: https://users.cs.utah.edu/plt/publications/gpce12-rf.pdf


    Stefan
    --- Synchronet 3.21d-Linux NewsLink 1.2
  • From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.lisp on Fri Apr 4 19:35:28 2025
    From Newsgroup: comp.lang.lisp

    On 2025-04-04, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
    1> (parse-infix '(sin a + b + c + d))
    (sin (+ (+ (+ a b) c)
    d))

    But, if we put parentheses on the operand of sin, then it's no longer
    the unary sin, but a function call with that argument:

    2> (parse-infix '(sin (a) + b + c + d))
    (+ (+ (+ (sin a) b)
    c)
    d)

    For someone coming from "statically typed functional programming", this
    is *very* weird.
    All those languages (Haskell/OCaml/SML/Coq/Idris/Agda/...) treat

    (sin a + b + c + d)

    the same as

    (sin (a) + b + c + d)

    Ah, but how about higher precedence operators?

    Would you want

    sin a * b * c

    to be

    sin (a) * b *c

    If not, you're asking for the operator version of sin to be
    a unary operator with a higher precedence than plus or minus,
    but lower than multiplication.

    In math boox and papers, you certainly see expression like

    sin 2cx

    I believe I have also seen

    sin t + 1

    But also:

    (sin t - cos t)**2

    It seems that a useful behavior to have would be for the
    range of the unary operator to stop when we encounter
    another such unary operator; i.e.

    sin a + b + c + sin d + e + f
    ^ ^

    Suppose that when the binary + is adjacent to a unary operator as above,
    we pretend that the + has a lower precedence.

    Ooh, that seems to nicely for all unary operators.

    I put into the parser a couple of hack lines which look like this:

    (iflet ((o3 [ifx-uops (car rest)]))
    (if (eq o1.arity :infix)
    (set o1 (copy o1)
    o1.prec (pred o3.prec))))

    When about to process an infix operator o1, we first peek at the rest of the input. If it starts with a prefix operator, which we call o3, (since o2 is being used for a previous operator pushed onto the operator stack) then we duplicate the current o1 operator object, and give it a precedence that is one lower than the infix.

    (A better implementation would be to have a lexical variable here holding
    a copy of the precedence, and just adjust the lexical, so as not to copy
    the operator object.)

    Bam:

    What we want works:

    2> (parse-infix '(sin x + x + cos y + y))
    (+ (sin (+ x x))
    (cos (+ y y)))

    Also the higher precedence unary minus now does
    the right thing, syntactically, FWIW:

    3> (parse-infix '(- x * x * - y * y))
    (* (- (* x x))
    (- (* y y)))

    The rule is easy to understand and leads to intuitive, symmetric treatment of symmetric-looking formulas.

    A lower precedence prefix operator suddenly appearing in a chain of higher precedence operators should interrupt that chain, and begin a new chain piece at the same level of depth.

    Say, I wonder whether this logic could be nicely expressed in LAR(1) anymore.
    I have a feeling that not, because LALR(1) makes only simple shift or reduce

    Next question: should the precedence-lowering propagate thorugh stacks
    of prefix operators?

    Like here:

    5> (parse-infix '(sin x + x + - cos y + y))
    (sin (+ x (+ x (- (cos (+ y y))))))

    When we are processing (+ - cos y ...), looking at the binarry +, should we take into consideration all the consecutive unary operators that prefix the rest of the input? In this case (- cos). And should we then take their lowest precedence? The idea being that cos propagates its low precedence to the higher precedence -, which propagates it to + (decremeted by 1).

    Then we could get the parse

    (+ (sin (+ x x))
    (- (cos (+ y y))))
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    --- Synchronet 3.21d-Linux NewsLink 1.2
  • From Stefan Monnier@monnier@iro.umontreal.ca to comp.lang.lisp on Fri Apr 4 16:33:12 2025
    From Newsgroup: comp.lang.lisp

    Ah, but how about higher precedence operators?

    Same: the "usual" convention in those languages (and in most ++-calculus papers) is that function application takes the shape "e1 e2" and binds
    tighter than "anything else".

    Would you want

    sin a * b * c

    to be

    sin (a) * b *c

    Yup.

    It seems that a useful behavior to have would be for the
    range of the unary operator to stop when we encounter
    another such unary operator; i.e.

    sin a + b + c + sin d + e + f
    ^ ^

    Well, now you're falling into the world of heuristics to try and handle
    the inconsistent mix of rules used by various authors.
    You're obviously free to go there, but don't count me in.


    Stefan
    --- Synchronet 3.21d-Linux NewsLink 1.2
  • From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.lisp on Sat Apr 5 01:09:55 2025
    From Newsgroup: comp.lang.lisp

    On 2025-04-04, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
    Ah, but how about higher precedence operators?

    Same: the "usual" convention in those languages (and in most ++-calculus papers) is that function application takes the shape "e1 e2" and binds tighter than "anything else".

    Would you want

    sin a * b * c

    to be

    sin (a) * b *c

    Yup.

    But then that is inconsistent with a very common treatment of
    sin, which is that product terms comprise one argument.

    For instance, oh sin 2 * pi * f * t .

    It seems that a useful behavior to have would be for the
    range of the unary operator to stop when we encounter
    another such unary operator; i.e.

    sin a + b + c + sin d + e + f
    ^ ^

    Well, now you're falling into the world of heuristics to try and handle
    the inconsistent mix of rules used by various authors.
    You're obviously free to go there, but don't count me in.

    It may seem like a heuristic, but it's actually one clear rule. (It may
    be novel; I should ping the CS StackExchange or something.)

    It's not some very specific, dark corner case but a general rule
    prominently situated at the heart of the algortihm.

    Whenever processing any infix operator O, we determine whether it is
    followed by a nonempty, contiguous sequence of prefix operators P1, P2,
    ... In other words, does the right operand of O begin with prefixes.
    If so, we calculate the minimum precedence among the Pi,
    and decrement that minium by one level (or perhaps a fraction of a
    level, just to make it smaller). If the precedence of O is not at least
    as low as that value, we demote O's precedence to that value,
    for the purposes of the usual operator processing logic that follows.

    As far as I can see, with that, I am done. I don't see any situations
    that yield results that grate on my eyes/nerves. I'm ready to move on
    from exploratory coding to cranking out test cases, and writing
    documentation.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    --- Synchronet 3.21d-Linux NewsLink 1.2
  • From ram@ram@zedat.fu-berlin.de (Stefan Ram) to comp.lang.lisp on Sat Apr 5 07:26:42 2025
    From Newsgroup: comp.lang.lisp

    Stefan Monnier <monnier@iro.umontreal.ca> wrote or quoted:
    sin a + b + c + sin d + e + f
    Well, now you're falling into the world of heuristics to try and handle
    the inconsistent mix of rules used by various authors.
    You're obviously free to go there, but don't count me in.

    Languages such as "AsciiMath" or "AsciiMathML" have been created
    to deal with such notations, and there is software available to
    convert them to LaTeX or MathML.

    In one case, a double space was made significant, to resolve
    some ambiguities as in "sin a/2", in which the "/" now binds
    stronger than the sin, different from "sin a/2".

    The implementation of ASCIIMathML is somewhat elegant insofar as
    some ambiguities are not resolved but just passed on to LaTeX or
    MathML as they are. The input is not actually parsed into an AST
    that represents the actual structure of the expression as it would
    be parsed by a parser for a programming language.

    And some implementations take care to handle /ungrammatical/ math
    because it could be needed in a text book to show what makes no sense.
    For example, input such as "sin sin sin" is handled too.


    --- Synchronet 3.21d-Linux NewsLink 1.2
  • From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.lisp on Sat Apr 5 07:59:25 2025
    From Newsgroup: comp.lang.lisp

    On 2025-04-05, Stefan Ram <ram@zedat.fu-berlin.de> wrote:
    Stefan Monnier <monnier@iro.umontreal.ca> wrote or quoted:
    sin a + b + c + sin d + e + f
    Well, now you're falling into the world of heuristics to try and handle
    the inconsistent mix of rules used by various authors.
    You're obviously free to go there, but don't count me in.

    Languages such as "AsciiMath" or "AsciiMathML" have been created
    to deal with such notations, and there is software available to
    convert them to LaTeX or MathML.

    In one case, a double space was made significant, to resolve
    some ambiguities as in "sin a/2", in which the "/" now binds
    stronger than the sin, different from "sin a/2".

    I feel that this anecdote validates my intutition that we want
    sin a / 2 to be sin(a / 2), and such. The ugliness of the double space
    hack is proportional to the depth of someone's regret, which he or she
    used to justify coding the hack.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    --- Synchronet 3.21d-Linux NewsLink 1.2