Forum: Too Lazy BBS

Re: Calling conventions (particularly 32-bit ARM)

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.arch on Sat Feb 14 20:40:26 2026

From Newsgroup: comp.arch

George Neuner <gneuner2@comcast.net> writes:

On Mon, 27 Jan 2025 17:09:59 -0800, Tim Rentsch
<tr.17687@z991.linuxsc.com> wrote:

George Neuner <gneuner2@comcast.net> writes:

On Mon, 6 Jan 2025 20:10:13 +0000, mitchalsup@aol.com (MitchAlsup1)
wrote:

I looked high and low for codes using more than 8 arguments and
returning aggregates larger than 8 double words, and about the
only things I found were a handful of []print[]() calls.

Large numbers of parameters may be generated either by closure
conversion or by lambda lifting. These are FP language
transformations that are analogous to, but potentially more complex
than, the rewriting of object methods and their call sites to pass the
current object in an OO language.

[The difference between closure conversion and lambda lifting is the
scope of the transformation: conversion limits code transformations to
within the defining call chain, whereas lifting pulls the closure to
top level making it (at least potentially) globally available.]

In either case the original function is rewritten such that non-local
variables can be passed as parameters. The function's code must be
altered to access the non-locals - either directly as explicit
individual parameters, or by indexing from a pointer to an environment
data structure.
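
As a rough illustration of the two rewrites just described, here is a minimal Python sketch. The function names and the non-locals `a` and `b` are made up for the example and do not come from any particular compiler; the tuple `env` simply stands in for an environment data structure.

```python
# Original nested form: `inner` reads the non-locals a and b.
def make_nested(a, b):
    def inner(x):
        return a * x + b
    return inner

# Rewrite 1: the non-locals become explicit individual parameters.
def inner_params(a, b, x):
    return a * x + b

# Rewrite 2: the non-locals are reached by indexing from an
# environment record passed as a single extra parameter.
def inner_env(env, x):
    return env[0] * x + env[1]      # env[0] is a, env[1] is b

def make_converted(a, b):
    env = (a, b)                    # environment built where the closure is created
    return (inner_env, env)         # closure = code reference + environment reference

def call_closure(closure, x):
    code, env = closure
    return code(env, x)
```

With two non-locals the first rewrite adds two parameters and the second adds one; the parameter count of the first form grows with the number of captured variables, which is where the large argument lists come from.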

While in a simple case this could look exactly like the OO method
transformation, recall that a general closure may require access to
non-local variables spread through multiple environments. Even if
whole environments are passed via single pointers, there still may
need to be multiple parameters added.

Isn't it the case that access to all of the enclosing environments
can be provided by passing a single pointer? I'm pretty sure it
is.

Certainly, if the enclosing environments somehow are chained together.
In real code though, in many instances such a chain will not already
exist when the closure is constructed. The compiler would have to
install pointers to the needed environments (or, alternatively,
pointers directly to the needed values) into the new closure's
immediate environment.
[essentially this creates a private "display" for the closure.]

Completely doable: it is simply that, if there are enough registers,
passing the pointers as parameters will tend to be more performant.

Sounds like you're saying that you agree that passing
just one value is always feasible. Also that, depending
on individual circumstances, either approach might have
better performance.
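
To make the alternatives concrete, here is a small Python sketch of the three shapes discussed above: a single pointer into a chain, a private display, and separate pointer parameters. The `Env` class, the slot names `y` and `z`, and the nesting depth are all assumptions for illustration. Python only shows the structure; which form is faster depends on the target and its register budget.

```python
class Env:
    """One environment frame: its own slots plus a link to the enclosing frame."""
    def __init__(self, slots, parent=None):
        self.slots = slots
        self.parent = parent

# Option A: one pointer; outer variables are reached by walking the chain.
def body_chained(env, x):
    y = env.slots["y"]                   # innermost frame
    z = env.parent.parent.slots["z"]     # two hops up the chain
    return x + y + z

# Option B: a private display; at closure creation the needed frame
# pointers are copied into one flat record, still passed as one pointer.
def make_display(env):
    return (env, env.parent.parent)

def body_display(display, x):
    return x + display[0].slots["y"] + display[1].slots["z"]

# Option C: the needed frames are passed as separate parameters.
def body_params(env_y, env_z, x):
    return x + env_y.slots["y"] + env_z.slots["z"]
```

Options A and B both pass exactly one value. Option C avoids the extra loads through the chain or display but adds one parameter for each frame the closure needs.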
--- Synchronet 3.21b-Linux NewsLink 1.2

From George Neuner@gneuner2@comcast.net to comp.arch on Tue Feb 17 15:35:33 2026

From Newsgroup: comp.arch

Hi Tim,

On Sat, 14 Feb 2026 20:40:26 -0800, Tim Rentsch
<tr.17687@z991.linuxsc.com> wrote:

George Neuner <gneuner2@comcast.net> writes:

On Mon, 27 Jan 2025 17:09:59 -0800, Tim Rentsch
<tr.17687@z991.linuxsc.com> wrote:

George Neuner <gneuner2@comcast.net> writes:

On Mon, 6 Jan 2025 20:10:13 +0000, mitchalsup@aol.com (MitchAlsup1)
wrote:

I looked high and low for codes using more than 8 arguments and
returning aggregates larger than 8 double words, and about the
only things I found were a handful of []print[]() calls.

Large numbers of parameters may be generated either by closure
conversion or by lambda lifting. These are FP language
transformations that are analogous to, but potentially more complex
than, the rewriting of object methods and their call sites to pass the
current object in an OO language.

[The difference between closure conversion and lambda lifting is the
scope of the transformation: conversion limits code transformations to
within the defining call chain, whereas lifting pulls the closure to
top level making it (at least potentially) globally available.]

In either case the original function is rewritten such that non-local
variables can be passed as parameters. The function's code must be
altered to access the non-locals - either directly as explicit
individual parameters, or by indexing from a pointer to an environment
data structure.

While in a simple case this could look exactly like the OO method
transformation, recall that a general closure may require access to
non-local variables spread through multiple environments. Even if
whole environments are passed via single pointers, there still may
need to be multiple parameters added.

Isn't it the case that access to all of the enclosing environments
can be provided by passing a single pointer? I'm pretty sure it
is.

Certainly, if the enclosing environments somehow are chained together.
In real code though, in many instances such a chain will not already
exist when the closure is constructed. The compiler would have to
install pointers to the needed environments (or, alternatively,
pointers directly to the needed values) into the new closure's
immediate environment.
[essentially this creates a private "display" for the closure.]

Completely doable: it is simply that, if there are enough registers,
passing the pointers as parameters will tend to be more performant.

Sounds like you're saying that you agree that passing
just one value is always feasible. Also that, depending
on individual circumstances, either approach might have
better performance.

You are correct ... it always is possible to pass the closure
environment to the function using a single pointer.

But you may not want to do it that way.

My point was about the structure of closure environments. In general,
you want to minimize what data needs to be persisted - particularly in
a program that generates lots of /related/ closures - while also
keeping in mind that data may need to be shared among multiple
closures [not just among multiple functions in a common closure].

It may be necessary, e.g., to pull data out of a stack context and
heap allocate it instead. That requires changing the stack context to
be a pointer rather than a value, rewriting any functions that expect
the value to use the pointer instead, and constructing new persistent
"environment" structures that can find the relocated data.
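
A minimal Python sketch of that relocation, under the assumption of a hypothetical counter variable captured by two closures. The `Box` class stands in for the heap cell, and the (code, environment) pairs stand in for whatever closure representation the compiler uses.

```python
class Box:
    """Heap cell standing in for a variable that used to live in a stack frame."""
    def __init__(self, value):
        self.value = value

# Rewritten functions: each takes the pointer (box) instead of the value.
def increment(box):
    box.value += 1
    return box.value

def read(box):
    return box.value

def make_counter():
    count = Box(0)          # was a plain local: count = 0
    # Both closures hold the same box, so an update made through one
    # is visible through the other after make_counter has returned.
    return (increment, count), (read, count)
```

Every function that formerly used `count` as a value now has to go through the box, which is the rewriting cost mentioned above.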

This can require a lot of effort by the compiler.

OTOH, if the structure of the program is such that the closure's
non-local data is guaranteed to be in scope when the closure is
invoked, it often is simpler just to rewrite closure functions to
access that data via a pointer parameter, and change the call sites to
pass the required pointer(s).
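
A sketch of this case in Python, assuming a hypothetical tree walk whose callback never outlives its caller. The dict `frame` stands in for the caller's live stack frame, and all names are invented for the example.

```python
def walk(tree, visit, frame):
    """Traversal whose call sites were changed to pass `frame` through."""
    if tree is None:
        return
    left, value, right = tree
    walk(left, visit, frame)
    visit(frame, value)
    walk(right, visit, frame)

# Lifted closure body: the non-local `total` is reached via the pointer parameter.
def add_to_total(frame, value):
    frame["total"] += value

def sum_tree(tree):
    frame = {"total": 0}    # stands in for sum_tree's own stack frame
    walk(tree, add_to_total, frame)
    return frame["total"]
```

Because `sum_tree` is still active for every call to `add_to_total`, nothing has to be moved to the heap and no chained environment is built; only the closure body and its call sites change.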

The closure may still need a persistent environment, but this method
reduces or eliminates the need for /chained/ environments, and having
to rewrite other non-closure functions that happen to use the data.

This also can require a lot of effort by the compiler, but the effort
can be more focused on the closures, and less on "regular" code.

What you really don't want in any case is to have to preserve entire
stacks just to support creating closures. It doesn't matter whether
the stack is linear or a chain of heap allocated structures[*]. Some
rewriting and data relocation (out of the stack) will be necessary in
any case.

[*] yes, this actually is done in some GC'd language implementations.
When the stack shrinks, discarded contexts are cleaned up by the GC.

