The use of overlays was a hack to get around small address spaces.
But, it disciplined developers to partition their code into
self-consistent chunks -- you couldn't flip back and forth
between overlaid images (like you could with bank switching).
You had to ensure any identifiers/labels you needed "now"
were accessible "from here".
In a VMM environment, you don't care -- you just let the
page get faulted in while you wait (assuming it's not presently
in place).
My apps tend to run "forever". Yet, lots of resources they
use are only transitory; no need for them to persist beyond
the point where they were used.
[Think of initialization code; once done, you're not going
to revisit it until the application is restarted/reloaded]
I'd like to be able to shed resources that are no longer
needed. But, have some assurances that they truly *aren't*
needed (referenced) going forward. This lets the system free
up those resources for use by other applications (and runtime
diagnostics, once you know "no one" is using a resource).
[I can do this automatically or let the developer do it "on demand"
by lumping resources in specific sections with judicious use
of linker scripts, etc. Applications developed with this
in mind will tend to be more resilient because they will tend
to be allowed to continue execution when the system is running
in overload; applications that hold onto unneeded resources
will look "more wasteful" and be selected for removal.]
But, there is nothing in the traditional build process that
ensures references from "some point" in the code don't refer
back to some *other* point (that you thought you were done with).
The overlay build process had to ensure THIS overlay didn't refer to
anything in THAT overlay. (Bankswitching code had to rely on "trampoline"
logic in some shared/persistent area to get from one bank to another,
but, there was nothing prohibiting such references!)
Does anyone still develop with overlays ...
... (where the overlay has to
replace the existing overlay at run time)? Besides personal
familiarity with the codebase, how do you select code portions
to place in each overlay?
Note, this is different than paged VMM where individual pages are
swapped in and out on demand (I have no backing store so once "out",
coming back *in* is costly).
I'd like to be able to shed resources that are no longer
needed. But, have some assurances that they truly *aren't*
needed (referenced) going forward. This lets the system free
up those resources for use by other applications (and runtime
diagnostics, once you know "no one" is using a resource).
[I can do this automatically or let the developer do it "on demand"
by lumping resources in specific sections with judicious use
of linker scripts, etc. Applications developed with this
in mind will tend to be more resilient because they will tend
to be allowed to continue execution when the system is running
in overload; applications that hold onto unneeded resources
will look "more wasteful" and be selected for removal.]
But, there is nothing in the traditional build process that
ensures references from "some point" in the code don't refer
back to some *other* point (that you thought you were done with).
Granted most existing compilers do not understand the concept of
"overlay" per se, but they do certainly understand both definition and
execution scope [in many languages these are identical, but in some
they can be different].
The problems with overlays almost all are related to closures and
[escape] continuations - in C terms: stored function pointers and
exceptions (or longjmp).
The overlay build process had to ensure THIS overlay didn't refer to
anything in THAT overlay. (Bankswitching code had to rely on "trampoline"
logic in some shared/persistent area to get from one bank to another,
but, there was nothing prohibiting such references!)
Just scope handling.
The issue with (classic) C was it had only global and local scopes.
Supporting overlays required the addition of non-global-non-local
scopes [at the load module level at least], and a bit of runtime magic
behind the scenes to load/unload them.
Does anyone still develop with overlays ...
I haven't seen an overlay compiler since 8-bit. Of course, 8-bit
cpus/mpus still are used in the embedded world.
More broadly: explicitly loaded DLLs [ using dlopen()/dlsym(),
LoadLibrary()/GetProcAddress(), etc. ] could be considered modern
moral equivalents of overlays.
Though they may not share memory (or other resources) and their
coming/going is not handled automagically by the compiler runtime,
they are similar [in the geometric meaning] in that: they might not be
loaded/mapped when needed, they may be nested, and storing references
to anything defined or created by them can be fraught with danger.
Granted, few developers ever muck with explicit DLL management ...
most will just let the linker/runtime/loader handle it.
... (where the overlay has to
replace the existing overlay at run time)? Besides personal
familiarity with the codebase, how do you select code portions
to place in each overlay?
Deliberately ignoring namespaces and "static", the problem with C is
that functions all are defined globally, so any function potentially
can be called from anywhere.
Think instead how you would code it in Pascal ... or any language
having nested functions but without using closures. Functions that
support specific computations (as opposed to generally useful) should
be defined / in scope only where they need to be callable.
Disjoint definition scopes then are exactly analogous to overlays, and
the functions defined within them can be packaged appropriately.
Note that grafting this onto a language lacking the notion of nested
functions will not be easy.
Note, this is different than paged VMM where individual pages are
swapped in and out on demand (I have no backing store so once "out",
coming back *in* is costly).
George
But, there is nothing in the traditional build process that
ensures references from "some point" in the code don't refer
back to some *other* point (that you thought you were done with).
Does anyone still develop with overlays (where the overlay has to
replace the existing overlay at run time)? Besides personal
familiarity with the codebase, how do you select code portions
to place in each overlay?
According to Don Y <blockedofcourse@foo.invalid>:
But, there is nothing in the traditional build process that
ensures references from "some point" in the code don't refer
back to some *other* point (that you thought you were done with).
Depends on your tools.
Does anyone still develop with overlays (where the overlay has to
replace the existing overlay at run time)? Besides personal
familiarity with the codebase, how do you select code portions
to place in each overlay?
Still develop, no, but back in the 1980s I was one of the developers of
Javelin, a modelling package that ran on MS-DOS. We wrote it in Wizard C,
which later became Turbo C, in medium model so all of the static data was
in one segment that was always resident, with the code in each module in
its own segment. We used a third party linker that provided an overlay
scheme similar to the OS/360 one I described in "Linkers and Loaders."
The linker let us (well, me) assign code modules to overlays and
complained if it saw calls that weren't strictly up or down the overlay
tree. The digital origami involved a lot of trial and error and guessing
to see what in fact caused overlay swapping, largely affected in our case
by the way the users used the program. There were about six views of the
model, of which you could put any two on the screen at once, so I had to
try and guess which views were likely to be used together so I could put
them in the same overlay.
On 2026-01-14, John Levine <johnl@taugh.com> wrote:
According to Don Y <blockedofcourse@foo.invalid>:
But, there is nothing in the traditional build process that
ensures references from "some point" in the code don't refer
back to some *other* point (that you thought you were done with).
Depends on your tools.
Indeed, it's pretty simple to do something like that with gcc and
binutils. You can link each "chunk" separately so that it exposes
only a few (or even no) global symbols.
Then you link those chunks together to build an executable, you know
there's a very limited (or empty) set of "connections" between
them. However, that set of possible connections is static and doesn't
change over time the way it does when you swap overlays in/out.
I do this regularly to eliminate the possibility of accidental name
collisions and so that I know exactly what functions can be called by
"others" outside a chunk/module.
That's not quite the same as overlays, because all the chunks/modules
are resident in memory all the time, and there might be multiple
threads active in various different chunks at any point in time --
even though they can't access each other's functions or variables.
Hi George,
Hoping all is well (Mom?)
But, there is nothing in the traditional build process that
ensures references from "some point" in the code don't refer
back to some *other* point (that you thought you were done with).
Granted most existing compilers do not understand the concept of
"overlay" per se, but they do certainly understand both definition and
execution scope [in many languages these are identical, but in some
they can be different].
The problems with overlays almost all are related to closures and
[escape] continuations - in C terms: stored function pointers and
exceptions (or longjmp).
I am hoping to assume "modules" can't be split (so the compiler
always knows where local targets reside) and confine the "problem"
to the linkage editor -- assuming the targeted module can "need
help" being accessed.
I further assume that "overlay" is a misnomer; that the address space
is large enough that all targets have unique addresses and just need
assistance being accessed. (or, prohibitions against FURTHER access
if they have been "discarded"/marked "extinct")
The overlay build process had to ensure THIS overlay didn't refer to
anything in THAT overlay. (Bankswitching code had to rely on "trampoline"
logic in some shared/persistent area to get from one bank to another,
but, there was nothing prohibiting such references!)
Just scope handling.
If they can coexist in a shared address space, that's an issue.
But, if you're just trying to ensure nothing "here" ever refers
to something "there" (marked extinct), I don't think it is as
much of an issue.
The problem becomes one of discipline -- "inherently" knowing how to
organize your modules/references so you can be "sure" when you execute
one specific line of code (that marks a section as "extinct") that you
will NEVER reference anything that it containED (past tense because it
no longer exists).
Think in terms of FORTRAN's "COMMON" and "CHAIN"; you want to be able
to put any shared data created in "section 1" into COMMON and then
fall into the code in the next "section", having the benefit of
all that "common" data.
Knowing that the code from the first section is forever gone.
[Ideally, I would like to be able to section off more varied instances
of resources instead of just chopping a program into discrete CONSECUTIVE
sections.]
:
More broadly: explicity loaded DLLs [ using dlopen()/dlsym(),
LoadLibrary()/GetProcAddress(), etc. ] could be considered modern
moral equivalents of overlays.
Yes. Though you can REload a DLL if you decide you need it
later. The CHAIN/COMMON distinction was that prior sections
are gone -- until you restart the program.
... (where the overlay has to
replace the existing overlay at run time)? Besides personal
familiarity with the codebase, how do you select code portions
to place in each overlay?
Deliberately ignoring namespaces and "static", the problem with C is
that functions all are defined globally, so any function potentially
can be called from anywhere.
Think instead how you would code it in Pascal ... or any language
having nested functions but without using closures. Functions that
support specific computations (as opposed to generally useful) should
be defined / in scope only where they need to be callable.
Disjoint definition scopes then are exactly analogous to overlays, and
the functions defined within them can be packaged appropriately.
But, that assumes those objects are NEVER accessed -- despite the
obvious need to access them /at least once/ to gain entry to that domain.
Note that grafting this onto a language lacking the notion of nested
functions will not be easy.
I can handle some cases easily:
main() {
    Initialize();
    Extinctify(&Initialize);
    DoWork();
}
and place "Initialize" in its own section, commanding the linker to locate
it "conveniently" (likely on a page-frame boundary so as to maximize the
amount of usable space in that page-frame).
But, aside from such obvious choices, I think it is hard to mentally
subdivide a piece of code for such a partitioning. And, then, remembering
to "extinctify" portions that you no longer need. You'd have to be
keenly aware of the cost of each such "portion" of the algorithm so
you could identify things that could/should be excised.
Past expiration date.
But, there is nothing in the traditional build process that
ensures references from "some point" in the code don't refer
back to some *other* point (that you thought you were done with).
With discipline it (miserably) can be done.
See below.
The overlay build process had to ensure THIS overlay didn't refer to
anything in THAT overlay. (Bankswitching code had to rely on "trampoline"
logic in some shared/persistent area to get from one bank to another,
but, there was nothing prohibiting such references!)
Just scope handling.
If they can coexist in a shared address space, that's an issue.
But, if you're just trying to ensure nothing "here" ever refers
to something "there" (marked extinct), I don't think it is as
much of an issue.
The problem becomes one of discipline -- "inherently" knowing how to
organize your modules/references so you can be "sure" when you execute
one specific line of code (that marks a section as "extinct") that you
will NEVER reference anything that it containED (past tense because it
no longer exists).
Think in terms of FORTRAN's "COMMON" and "CHAIN"; you want to be able
to put any shared data created in "section 1" into COMMON and then
fall into the code in the next "section", having the benefit of
all that "common" data.
Knowing that the code from the first section is forever gone.
Data produced by X and consumed later by Y is not a problem unless X
and Y disagree on the data's format.
Problems are only possible when there can exist stored references to
things which no longer exist.
In Fortran that was not possible: a COMMON block could not include
pointers [even in the Fortrans that had pointers], and so CHAIN with
COMMON could not fail in that way.
Similarly, creating a data structure stored by the main program with
overlay X and then swapping in overlay Y to process the data will not
be a problem so long as the data structure contains no references to
anything inside X.
[Ideally, I would like to be able to section off more varied instances
of resources instead of just chopping a program into discrete CONSECUTIVE
sections.]
:
More broadly: explicity loaded DLLs [ using dlopen()/dlsym(),
LoadLibrary()/GetProcAddress(), etc. ] could be considered modern
moral equivalents of overlays.
Yes. Though you can REload a DLL if you decide you need it
later. The CHAIN/COMMON distinction was that prior sections
are gone -- until you restart the program.
You're confused.
With explicit DLL management, the programmer has to deliberately write
code to load a DLL and map its exported API to function pointers. If
the DLL then is unloaded, the mapped pointers become garbage: trying
to call the functions will _NOT_ reload the DLL - rather it will,
almost certainly, crash the program.
If the program was written in a CHAIN/COMMON fashion, using and
discarding a progression of DLLs, then there will be NO programmer
supplied code to "go back" and reload one of them.
Note that grafting this onto a language lacking the notion of nested
functions will not be easy.
I can handle some cases easily:
main() {
    Initialize();
    Extinctify(&Initialize);
    DoWork();
}
and place "Initialize" in its own section, commanding the linker to locate
it "conveniently" (likely on a page-frame boundary so as to maximize the
amount of usable space in that page-frame).
But, aside from such obvious choices, I think it is hard to mentally
subdivide a piece of code for such a partitioning. And, then, remembering
to "extinctify" portions that you no longer need. You'd have to be
keenly aware of the cost of each such "portion" of the algorithm so
you could identify things that could/should be excised.
It's hard because you work in languages that make it hard(er).
The most straightforward way to arrange the code in C would be to
separate the disjoint "overlay" scopes by source file / compilation
unit.
pseudo'ing the example above in C:
----------------
void F1(void);
void F2(void);

int main(void)
{
    F1();
    F2();
}
----------------
static void G1(void) { /* ... */ }
static void H1(void) { /* ... */ }

void F1(void)
{
    /* ... may call G1(), H1() ... */
}
----------------
static void G2(void) { /* ... */ }
static void H2(void) { /* ... */ }

void F2(void)
{
    /* ... may call G2(), H2() ... */
}
----------------
where the horizontal lines denote source file / compilation unit
boundaries.
This way the compiler will catch most problems, and the code will be
in modules that potentially could be loaded/unloaded independently
(assuming you have a way to do that).
But, as you said, it takes some discipline.
The best way is to just use DLLs if you can. Export only the entry
points and everything else will be hidden.
But, there is nothing in the traditional build process that
ensures references from "some point" in the code don't refer
back to some *other* point (that you thought you were done with).
With discipline it (miserably) can be done.
See below.
In addition to knowing what MIGHT be accessible (exported), you
also have to track down every reference to each of those and
identify when, in time, they occur (or might occur) relative to
your explicitly declaring them to be "no longer needed".
THAT is the tough part -- a link map tells you "what talks to
what" but says nothing about when, in time, those interactions occur.
If FOO calls BAR and BAR calls BAZ, then anything that calls
BAZ, BAR *or* FOO, at a point in time AFTER you have declared
BAZ to no longer be needed will SIGSEGV.
You have to keep track of these interdependencies when making
decisions declaring particular "identifiers" (code or data)
to be no longer needed.
The "Initialize()" example I provided is easy to conceive of
occurring exactly once in a "program's" execution. So, as long as
it is only invoked once AND THERE ARE NO PATHS BACK TO Main() -- in
my example -- there is no way it can be reaccessed/referenced after
the "extinctify" invocation that follows it.
Can you say that about string(3C) functions -- can you decide
when those library functions are no longer needed, thereby
freeing up the resources that they require? Or, some function
that you wrote to handle a particular error condition?
Think in terms of FORTRAN's "COMMON" and "CHAIN"; you want to be able
to put any shared data created in "section 1" into COMMON and then
fall into the code in the next "section", having the benefit of
all that "common" data.
Knowing that the code from the first section is forever gone.
Data produced by X and consumed later by Y is not a problem unless X
and Y disagree on the data's format.
The format isn't the issue. Rather, that X /no longer exists/.
References to the address(es) that it occupied aren't mapped to
anything, anymore.
You must be sure Y (and every other X consumer) have made use
of X before you unmap its resources. THAT is what is hard to track
because tools don't lay out the temporal relationships of various
identifiers.
Problems are only possible when there can exist stored references to
things which no longer exist.
The references need not be explicitly "stored" but, rather, can be
part of an instruction stream generated by a compiler.
In Fortran that was not possible: a
COMMON block could not include pointers [even in the Fortrans that had
pointers], and so CHAIN with COMMON could not fail in that way.
I mention COMMON and CHAIN because they implement the mechanisms that
must be present for one part of a "program" to do work and pass those
results to a followup part (via COMMON) while execution is passed
to that followup part (via CHAIN).
You (typ) need data and code bridges to connect separate parts of a
program together, esp if you are deliberately trying to cut a program
into smaller pieces to reduce its "current" footprint (for any value
of "current").
Similarly, creating a data structure stored by the main program with
overlay X and then swapping in overlay Y to process the data will not
be a problem so long as the data structure contains no references to
anything inside X.
That doesn't address having the data structure overlaid (or, in
my case, unmapped) because *IT* was considered "no longer needed".
At some point in the "earlier" execution of the "program", that data
structure had meaning. But, once the developer declares that data to
no longer be needed, it disappears. Any code that later references it
(in error!) crashes. No, it doesn't get the wrong values for the data
or interpret the values incorrectly. The memory reference simply FAILS.
With explicit DLL management, the programmer has to deliberately write
code to load a DLL and map its exported API to function pointers. If
the DLL then is unloaded, the mapped pointers become garbage: trying
to call the functions will _NOT_ reload the DLL - rather it will,
almost certainly, crash the program.
That is EXACTLY the situation I am describing. The developer explicitly
decided the DLL was NO LONGER NEEDED. *He* unloaded it. If he wasn't
disciplined enough to know that he wasn't yet done with it, then his
program crashes.
However, he can choose to unload the DLL (to reduce his resource
usage) and then, at some later time, explicitly REload it as if
for the first time and make continued use of it.
With CHAIN/COMMON, the past is past. The code that preceded is no
longer available (unless you rerun the program).
... my deliberate marking of portions of that memory as "no longer
needed" (mapped) discards them and their contents, irretrievably.
The only way to recreate the data and code is to restart the
program.
You have some codebase. Tools will tell you where a particular
identifier is referenced. You can use that to build a list
of other objects that implicitly reference that identifier.
*What* is going to tell you WHEN any of those objects are
invoked, relative to a particular statement that is executed
at a specific point in time:
unmap the memory used by identifier X
------
void X4(void)
{
    X7();
}
------
void X5(void)
{
    F2();
}
------
void X6(void)
{
    if (blah)
        X4();
}
------
void X7(void)
{
    F1();
}
------
void X8(void)
{
    X4();
}
[This isn't really far-fetched with libraries that can be interrelated
or interact in other ways]
At some point in time, in some module, I decide to unmap the resources
that F1 consumes. Maybe that happens conditionally.
But, at some later point in time, X8() is invoked. Or X4. Or X7. Or
X6 with blah being true.
You have to track the dependency hierarchy for all of these "objects"
and ensure that none of them are invoked *after* you have unmapped F1.
Because you can't recreate F1, its resources or its actual "value"/meaning.
And, you can't exhaustively test to ensure that every possible path
through the application is exercised. You could easily have a latent
bug that only surfaces in some particular set of circumstances that
you didn't consider, test or encounter, before.
We aren't accustomed to having objects (code or data) disappear during
the course of developing or executing a piece of code. If "foo"
existed at some point in the program -- and you are executing in
the same context and scope, then why would you NOT expect it to still
be there?
You can do an RPC and then be surprised that the NEXT invocation
fails -- because the remote host disappeared. You code against
that possibility.
I just delete objects and whatever they contain/represent goes away
in that action. Whether it is a collection of data that I no longer
need, a group of functions, a subassembly, etc. The developer
has to decide how to group "things" for their utility and temporal
pertinence.
Does anyone still develop with overlays (where the overlay has to
replace the existing overlay at run time)? Besides personal
familiarity with the codebase, how do you select code portions
to place in each overlay?
On 1/13/2026 5:03 PM, Don Y wrote:
Does anyone still develop with overlays (where the overlay has to
replace the existing overlay at run time)? Besides personal
familiarity with the codebase, how do you select code portions
to place in each overlay?
This actually wasn't as difficult as I thought it would be.
It's (unfortunately!) more of a "mindset" issue than anything else.
We had an offsite, this past week that let me poke at their code and
demo my prototypes.