Forum: Too Lazy BBS

infix via code walker hook.

From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.lisp on Mon Mar 31 21:24:03 2025

From Newsgroup: comp.lang.lisp

I've been toying with the idea for a while of using a code walker
or hooking into an existing code walker in order to support S-exp-ified
infix syntax for arithmetic.

I have an advanced proof of concept happening.

What is interesting here is that the work gives rise to a powerful
argument in favor of the separation of function and variable bindings,
which helps to make it trivial:

$ ./txr -i infix.tl
TXR's no-spray organic production means every bug is carefully removed by hand.
1> (ifx (let ((a 1) (b 2) (c 3))
((a + b) * 3)))
9

Notice how ifx "works through" the let (and trust me when I say, any layering of syntax it is not interested in).

ifx will find all infix expressions in its interior and transform them
based on the pattern (arg1 op [arg ...]) to (op arg1 [arg ...]).

This is the complete implementation, but relies on an unreleased
*expand-hook* feature:

(defun-match infix-expand-hook
((@(as form (@x @y . @rest)) @env nil)
(if (fboundp y)
^(,y ,x ,*rest)
form))
((@form @nil @nil) form))

(defmacro ifx (:env env . body)
(let ((*expand-hook* (fun infix-expand-hook)))
(expand ^(progn ,*body) env)))

The implementation's macro-expanding code-walker invokes *expand-hook*
for each macro (symbol or compound), function call or malformed form
that it encounters. It passes the form, the macro-expansion-time
environment, and a type symbol. The type symbol is :macro when the
form is a macro invocation, :fun if it is a function call, or else
nil if it is invalid.

The above infix module won't make the transformation on a form
that looks valid. Concretely, if you define a function a, whether
globally or locally, then (a + b) will not be transformed.

As you can see, this also showcases the benefits of a function/variable namespace separation in Lisp. Even though a has a binding, (a + b) can
be transformed to (+ a b). This is because its binding is a variable
binding, which we can ignore.

In a Lisp-1 like Clojure or Scheme, there is a difficulty here:
we have to determine that a is not a function. That requires
type information or else a run-time check. The latter is a nonstarter,
since we are looking to transform code at macro time.
But even if you have the former, type info, who is to say that is
correct; maybe someone wants (f + g) where those are functions.

Here is a toast to variable/function namespace separation.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
--- Synchronet 3.21d-Linux NewsLink 1.2

From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.lisp on Wed Apr 2 03:26:20 2025

From Newsgroup: comp.lang.lisp

On 2025-03-31, Kaz Kylheku <643-408-1753@kylheku.com> wrote:

(defmacro ifx (:env env . body)
(let ((*expand-hook* (fun infix-expand-hook)))
(expand ^(progn ,*body) env)))

Doh! I just though of expander-let which can do this nicely.

And ... it turns out I already developed it! expander-let
appeared in TXR 287, in June 2023. (Current is 299).

Thus:

(defmacro ifx (. body)
^(expander-let ((*expand-hook* (fun infix-expand-hook)))
,*body))

That is way better. The expander encounters the expander-let and does
the binding for us, wraps the body in a (progn ...) and naturally
recurses over it. Then the expander-let disappears.

The explicit expand call causes a wasteful double expansion; whatever it produces will be recursed into by the expander again.

Arguably, this design is wrong because, the binding for expand hooks
should ideally be in the macro environment, not in the dynamic
environment. If a closure is ever produced that captures a macro
environment, and does form expansion when called, it may not have the
correct value of *expand-hook* that was in effect. Also, categorically,
local expander hooks are kind of generalized macrolets, and therefore
belong in the same space.

There is the obvious workaround in that the scope around the closure can
just copy the then-visible dynamic value of *expand-hook* into a lexical variable, and capture that. It's an ugly ritual to have to remember,
and to have to carry out for correctness, though.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
--- Synchronet 3.21d-Linux NewsLink 1.2

From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.lisp on Fri Apr 4 03:14:58 2025

From Newsgroup: comp.lang.lisp

On 2025-04-02, Kaz Kylheku <643-408-1753@kylheku.com> wrote:

On 2025-03-31, Kaz Kylheku <643-408-1753@kylheku.com> wrote:

(defmacro ifx (:env env . body)
(let ((*expand-hook* (fun infix-expand-hook)))
(expand ^(progn ,*body) env)))

Doh! I just though of expander-let which can do this nicely.

I'm serious about getting a form of infix as a supported
feature, and it's coming along.

The ground rules are: no read syntax. Everything is done using
existing syntax.

(ifx (let ((i 0))

(while (i ++ < 9)
(put-line (pic "## ## #.#####" i (i * i) (sqrt sin(i) ** 2 + sin(2 * i) ** 2))))))
1 1 1.23891
2 4 1.18304
3 9 0.31303
4 16 1.24562
5 25 1.10249
6 36 0.60497
7 49 1.18867
8 64 1.03040
9 81 0.85663

I have a Shunting Yard type infix parser which is extended with handling prefix and postifix operators, as well as error checking. I converted the algorithm to a pattern-based term rewriting system with two stacks. There are a few pure term rewriting rules which do some nice things.

Here is one. I took numerous monadic math functions like sin, cos, log, and turned them into very low precedence unary operators. What this means
is that sin is applied to the entire formula here:

1> (parse-infix '(sin a + b + c + d))
(sin (+ (+ (+ a b) c)
d))

But, if we put parentheses on the operand of sin, then it's no longer
the unary sin, but a function call with that argument:

2> (parse-infix '(sin (a) + b + c + d))
(+ (+ (+ (sin a) b)
c)
d)
3> (parse-infix '(sin (a + b) + c + d))
(+ (+ (sin a + b)
c)
d)

The parser has a hack like "if we are processing a unary operator
which is tagged as a function-like operator, and it is immediately
followed by a compound expression, then insert the operator as
the head operator into the compound expression".

Another term rewriting rule handles C-like [ ] array reference
agglutination:

4> (parse-infix '(a + b[i][j][k]))
(+ a [[[b i] j] k])

It's like magic; just use (ifx ...) around any expression or
even an entire top-level form like a defun, and you can freely
mix infix and regular Lisp in all evaluation contexts.

4> (ifx (list* 1 2 (mapcar (ret (@1 + @1)) '(10 20 30))))
(1 2 20 40 60)
5> (ifx (list* 1 2 (mapcar (lambda (x) (+ x x)) '(10 20 30))))
(1 2 20 40 60)

It's turning out ot be a pretty nice system, combining autodetection within an explicit (ifx ...) contour.

I think it is already better than SRFI-105 curly braces and some other systems. --
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
--- Synchronet 3.21d-Linux NewsLink 1.2

From Stefan Monnier@monnier@iro.umontreal.ca to comp.lang.lisp on Fri Apr 4 14:17:20 2025

From Newsgroup: comp.lang.lisp

1> (parse-infix '(sin a + b + c + d))
(sin (+ (+ (+ a b) c)
d))

But, if we put parentheses on the operand of sin, then it's no longer
the unary sin, but a function call with that argument:

2> (parse-infix '(sin (a) + b + c + d))
(+ (+ (+ (sin a) b)
c)
d)

For someone coming from "statically typed functional programming", this
is *very* weird.
All those languages (Haskell/OCaml/SML/Coq/Idris/Agda/...) treat

(sin a + b + c + d)

the same as

(sin (a) + b + c + d)

I think it is already better than SRFI-105 curly braces and some
other systems.

A related effort was Honu: https://users.cs.utah.edu/plt/publications/gpce12-rf.pdf

Stefan
--- Synchronet 3.21d-Linux NewsLink 1.2

From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.lisp on Fri Apr 4 19:35:28 2025

From Newsgroup: comp.lang.lisp

On 2025-04-04, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

1> (parse-infix '(sin a + b + c + d))
(sin (+ (+ (+ a b) c)
d))

But, if we put parentheses on the operand of sin, then it's no longer
the unary sin, but a function call with that argument:

2> (parse-infix '(sin (a) + b + c + d))
(+ (+ (+ (sin a) b)
c)
d)

For someone coming from "statically typed functional programming", this
is *very* weird.
All those languages (Haskell/OCaml/SML/Coq/Idris/Agda/...) treat

(sin a + b + c + d)

the same as

(sin (a) + b + c + d)

Ah, but how about higher precedence operators?

Would you want

sin a * b * c

to be

sin (a) * b *c

If not, you're asking for the operator version of sin to be
a unary operator with a higher precedence than plus or minus,
but lower than multiplication.

In math boox and papers, you certainly see expression like

sin 2cx

I believe I have also seen

sin t + 1

But also:

(sin t - cos t)**2

It seems that a useful behavior to have would be for the
range of the unary operator to stop when we encounter
another such unary operator; i.e.

sin a + b + c + sin d + e + f
^ ^

Suppose that when the binary + is adjacent to a unary operator as above,
we pretend that the + has a lower precedence.

Ooh, that seems to nicely for all unary operators.

I put into the parser a couple of hack lines which look like this:

(iflet ((o3 [ifx-uops (car rest)]))
(if (eq o1.arity :infix)
(set o1 (copy o1)
o1.prec (pred o3.prec))))

When about to process an infix operator o1, we first peek at the rest of the input. If it starts with a prefix operator, which we call o3, (since o2 is being used for a previous operator pushed onto the operator stack) then we duplicate the current o1 operator object, and give it a precedence that is one lower than the infix.

(A better implementation would be to have a lexical variable here holding
a copy of the precedence, and just adjust the lexical, so as not to copy
the operator object.)

Bam:

What we want works:

2> (parse-infix '(sin x + x + cos y + y))
(+ (sin (+ x x))
(cos (+ y y)))

Also the higher precedence unary minus now does
the right thing, syntactically, FWIW:

3> (parse-infix '(- x * x * - y * y))
(* (- (* x x))
(- (* y y)))

The rule is easy to understand and leads to intuitive, symmetric treatment of symmetric-looking formulas.

A lower precedence prefix operator suddenly appearing in a chain of higher precedence operators should interrupt that chain, and begin a new chain piece at the same level of depth.

Say, I wonder whether this logic could be nicely expressed in LAR(1) anymore.
I have a feeling that not, because LALR(1) makes only simple shift or reduce

Next question: should the precedence-lowering propagate thorugh stacks
of prefix operators?

Like here:

5> (parse-infix '(sin x + x + - cos y + y))
(sin (+ x (+ x (- (cos (+ y y))))))

When we are processing (+ - cos y ...), looking at the binarry +, should we take into consideration all the consecutive unary operators that prefix the rest of the input? In this case (- cos). And should we then take their lowest precedence? The idea being that cos propagates its low precedence to the higher precedence -, which propagates it to + (decremeted by 1).

Then we could get the parse

(+ (sin (+ x x))
(- (cos (+ y y))))
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
--- Synchronet 3.21d-Linux NewsLink 1.2

From Stefan Monnier@monnier@iro.umontreal.ca to comp.lang.lisp on Fri Apr 4 16:33:12 2025

From Newsgroup: comp.lang.lisp

Ah, but how about higher precedence operators?

Same: the "usual" convention in those languages (and in most ++-calculus papers) is that function application takes the shape "e1 e2" and binds
tighter than "anything else".

Would you want

sin a * b * c

to be

sin (a) * b *c

Yup.

It seems that a useful behavior to have would be for the
range of the unary operator to stop when we encounter
another such unary operator; i.e.

sin a + b + c + sin d + e + f
^ ^

Well, now you're falling into the world of heuristics to try and handle
the inconsistent mix of rules used by various authors.
You're obviously free to go there, but don't count me in.

Stefan
--- Synchronet 3.21d-Linux NewsLink 1.2

From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.lisp on Sat Apr 5 01:09:55 2025

From Newsgroup: comp.lang.lisp

On 2025-04-04, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

Ah, but how about higher precedence operators?

Same: the "usual" convention in those languages (and in most ++-calculus papers) is that function application takes the shape "e1 e2" and binds tighter than "anything else".

Would you want

sin a * b * c

to be

sin (a) * b *c

Yup.

But then that is inconsistent with a very common treatment of
sin, which is that product terms comprise one argument.

For instance, oh sin 2 * pi * f * t .

It seems that a useful behavior to have would be for the
range of the unary operator to stop when we encounter
another such unary operator; i.e.

sin a + b + c + sin d + e + f
^ ^

Well, now you're falling into the world of heuristics to try and handle
the inconsistent mix of rules used by various authors.
You're obviously free to go there, but don't count me in.

It may seem like a heuristic, but it's actually one clear rule. (It may
be novel; I should ping the CS StackExchange or something.)

It's not some very specific, dark corner case but a general rule
prominently situated at the heart of the algortihm.

Whenever processing any infix operator O, we determine whether it is
followed by a nonempty, contiguous sequence of prefix operators P1, P2,
... In other words, does the right operand of O begin with prefixes.
If so, we calculate the minimum precedence among the Pi,
and decrement that minium by one level (or perhaps a fraction of a
level, just to make it smaller). If the precedence of O is not at least
as low as that value, we demote O's precedence to that value,
for the purposes of the usual operator processing logic that follows.

As far as I can see, with that, I am done. I don't see any situations
that yield results that grate on my eyes/nerves. I'm ready to move on
from exploratory coding to cranking out test cases, and writing
documentation.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
--- Synchronet 3.21d-Linux NewsLink 1.2

From ram@ram@zedat.fu-berlin.de (Stefan Ram) to comp.lang.lisp on Sat Apr 5 07:26:42 2025

From Newsgroup: comp.lang.lisp

Stefan Monnier <monnier@iro.umontreal.ca> wrote or quoted:

sin a + b + c + sin d + e + f

Well, now you're falling into the world of heuristics to try and handle
the inconsistent mix of rules used by various authors.
You're obviously free to go there, but don't count me in.

Languages such as "AsciiMath" or "AsciiMathML" have been created
to deal with such notations, and there is software available to
convert them to LaTeX or MathML.

In one case, a double space was made significant, to resolve
some ambiguities as in "sin a/2", in which the "/" now binds
stronger than the sin, different from "sin a/2".

The implementation of ASCIIMathML is somewhat elegant insofar as
some ambiguities are not resolved but just passed on to LaTeX or
MathML as they are. The input is not actually parsed into an AST
that represents the actual structure of the expression as it would
be parsed by a parser for a programming language.

And some implementations take care to handle /ungrammatical/ math
because it could be needed in a text book to show what makes no sense.
For example, input such as "sin sin sin" is handled too.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.lisp on Sat Apr 5 07:59:25 2025

From Newsgroup: comp.lang.lisp

On 2025-04-05, Stefan Ram <ram@zedat.fu-berlin.de> wrote:

Stefan Monnier <monnier@iro.umontreal.ca> wrote or quoted:

sin a + b + c + sin d + e + f

Well, now you're falling into the world of heuristics to try and handle
the inconsistent mix of rules used by various authors.
You're obviously free to go there, but don't count me in.

Languages such as "AsciiMath" or "AsciiMathML" have been created
to deal with such notations, and there is software available to
convert them to LaTeX or MathML.

In one case, a double space was made significant, to resolve
some ambiguities as in "sin a/2", in which the "/" now binds
stronger than the sin, different from "sin a/2".

I feel that this anecdote validates my intutition that we want
sin a / 2 to be sin(a / 2), and such. The ugliness of the double space
hack is proportional to the depth of someone's regret, which he or she
used to justify coding the hack.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
--- Synchronet 3.21d-Linux NewsLink 1.2

Who's Online
Recent Visitors
- Geek2
  Sun May 17 07:06:15 2026
  from Euclid, Oh via Telnet
- Geek2
  Sat May 16 21:25:04 2026
  from Euclid, Oh via Telnet
- Jas Hud
  Sat May 16 00:50:28 2026
  from Bbs.Eob-Bbs.Com,wi via Telnet
- Geek2
  Fri May 15 19:53:20 2026
  from Euclid, Oh via Telnet

System Info

Sysop:	Amessyroom
Location:	Fayetteville, NC
Users:	65
Nodes:	6 (0 / 6)
Uptime:	08:16:37
Calls:	862
Files:	1,311
D/L today:	2 files (6,679K bytes)
Messages:	264,942

infix via code walker hook.

Who's Online

Recent Visitors

System Info