Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
antispam@fricas.org (Waldek Hebisch) writes:
From my point of view the main drawbacks of the 286 are poor support for
large arrays and problems for Lisp-like systems which have a lot
of small data structures and traverse them via pointers.
Yes. In the first case the segments are too small, in the latter case
there are too few segments (if you have one segment per object).
In the second case one can pack several objects into a single
segment, so except for lost security properties this is not
a big problem. But there is a lot of segment-register loading,
and slow loading is a problem.
Using 16-bit offsets for jumps inside a procedure and a
segment-offset pair for calls is likely to lead to performance
better than or similar to a purely 32-bit machine.
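The trade-off Waldek describes is visible in the old 16-bit C dialects;
a minimal sketch (Borland/Microsoft near/far extensions, purely
illustrative):

    /* Near call: CALL rel16 (3 bytes), stays within the current code
       segment.  Far call: CALL ptr16:16 (5 bytes), reloads CS; on the
       286 in protected mode that CS load also walks the descriptor
       table, which is the slow part. */
    void near helper(void);   /* intra-segment: 16-bit offset only   */
    void far  service(void);  /* inter-segment: segment-offset pair  */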
With the 80286's segments and their slowness, that is very doubtful.
The 8086 has branches with 8-bit offsets and branches and calls with
16-bit offsets. The 386 in 32-bit mode has branches with 8-bit
offsets and branches and calls with 32-bit offsets; if 16-bit offsets
for branches were useful enough for performance, they could
instead have designed the longer branch length to be 16 bits, and
maybe a prefix for 32-bit branch offsets.
At that time Intel apparently wanted to avoid having too many
instructions.
I used Xenix on a 286 in 1986 or 1987; my impression is that programs
were limited to 64KB code and 64KB data size, exactly the PDP-11 model
you denounce.
Maybe. I have seen many cases where software essentially "wastes"
good things offered by hardware.
Every successful software product used direct hardware access because of
performance; the rest waned. Using BIOS calls was just too slow.
Lotus 1-2-3 won out over VisiCalc and Multiplan by being faster, thanks
to writing directly to video memory.
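For concreteness, the kind of thing 1-2-3 did instead of going through
int 10h, in a 16-bit Turbo C style sketch (MK_FP is from dos.h; assumes
an 80-column color text mode with its buffer at segment 0xB800):

    #include <dos.h>

    /* One far write per character cell, no software interrupt. */
    void putat(int row, int col, char ch, unsigned char attr)
    {
        char far *cell = (char far *) MK_FP(0xB800, (row * 80 + col) * 2);
        cell[0] = ch;    /* character byte       */
        cell[1] = attr;  /* color attribute byte */
    }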
For most early graphics cards direct screen access could be allowed
just by allocating an appropriate segment. And most non-games
could have gained good performance with a better system interface.
I think that the variety of tricks used in games, and their
popularity, made protected-mode systems much less appealing
to vendors. And that discouraged work on better interfaces
for non-games.
More generally, vendors could release separate versions of
programs for 8086 and 286 but few did so.
And users having
only binaries wanted to run 8086 programs on their new systems,
which led to heroic efforts like the OS/2 DOS box and later Linux
dosemu. But integration of 8086 programs with protected
mode was solved too late for the 286 model to gain traction
(and on the 286 the "DOS box" had to run in real mode, breaking
normal system protection).
There was various segmented hardware around, first and foremost (for
the designers of the 80286), the iAPX432. And as you write, all the
good reasons that resulted in segments on the iAPX432 also persisted
in the 80286. However, given the slowness of segmentation, only the
tiny (all in one segment), small (one segment for code and one for
data), and maybe medium memory models (one data segment) are
competitive in protected mode compared to real mode.
AFAICS that covered the vast majority of programs during the eighties.
Turbo Pascal offered only the medium memory model.
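As a reminder of what the medium model buys (sketched in the 16-bit C
dialects; Turbo Pascal's unit model is the same shape): code pointers
are far, data pointers stay near, so DS is loaded once at startup and
data accesses never touch a segment register.

    /* Medium model: several code segments, one 64KB data segment. */
    void far draw(void);   /* calls may cross code segments (CS reload) */
    char *buf;             /* plain 16-bit offset into the data segment */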
Intel apparently assumed that programmers are willing to spend
extra work to get good performance, and IMO this was right
as a general statement. Intel probably did not realize that
programmers would be very reluctant to spend work on security
features, and in particular to spend work on making programs
fast in 286 protected mode.
Intel probably assumed that the 286 would cover most needs,
especially given that most systems had much less memory than
the 16 MB theoretically allowed by the 286.
IMO this is partially true: there
is a class of programs which with some work fit into the medium
model, but using a flat address space is easier. I think that
on the 286 (that is, with a 16-bit bus) those programs (assuming
enough tuning) run faster than a flat 32-bit version would.
But I think that Intel segmentation had some
attractive features during the eighties.
Another thing is the 386. I think that the designers of the 286
thought that the 386 would remove some limitations. And the 386
allowed bigger segments, removing one major limitation. OTOH
for a 32-bit processor with segmentation it would be natural
to have 32-bit segment registers. It is not clear to
me if 16-bit segment registers in the 386 were deemed necessary
for backward compatibility, or maybe by the 386 period the flat
faction in Intel had won and they kept segmentation mostly
for compatibility.
Xenix, apart from OS/2 the only other notable protected-mode OS for the
286, was ported to the 386 in 1987, after SCO secured "knowledge from Microsoft insiders that Microsoft was no longer developing Xenix", so
SCO (or Microsoft) might have done it even earlier if the commercial situation had been less muddled; in any case, Xenix jumped the 286 ship
ASAP.
The bad taste of segments is from exposure to Intel's half-assed
implementation which exposed the segment selector as part of the
address.
Segments /should/ have been implemented similar to the way paging is
done: the program using flat 32-bit addresses and the MMU (SMU?)
consulting some kind of segment "database" [using the term loosely].
Intel had a chance to do it right with the 386, but instead they
doubled down and expanded the existing poor implementation to support
larger segments.
I realize that transistor counts at the time might have made an
on-chip SMU impossible, but ISTM the SMU would have been a very small
component that (if necessary) could have been implemented on-die as a
coprocessor.
How would the addresses be divided into segment and offset in your
model? What would the SMU have to do?
- anton
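One way to read George's proposal, sketched under assumptions of my own
(the SMU finds the enclosing segment by a range check over a descriptor
table, so no selector bits ever appear in the flat address):

    #include <stdint.h>
    #include <stddef.h>

    struct segment {           /* one entry in the segment "database" */
        uint32_t base;         /* flat start address                  */
        uint32_t limit;        /* segment length in bytes             */
        uint32_t perms;        /* read/write/execute bits             */
    };

    /* Return the segment containing addr, or NULL to signal a fault. */
    static const struct segment *smu_lookup(const struct segment *tab,
                                            size_t n, uint32_t addr)
    {
        for (size_t i = 0; i < n; i++)
            if (addr - tab[i].base < tab[i].limit)  /* unsigned range check */
                return &tab[i];
        return NULL;
    }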
Terje Mathisen <terje.mathisen@tmsw.no> writes:
The best idea I have seen to help detect out of bounds accesses, is to
round all requested memory blocks up to the next 4K boundary and mark
the next page as unavailable, then return a skewed pointer back, so that
the end of the requested region coincides with the end of the (last)
allocated page. This does require at least 8kB for every allocation, but
I guess they can all share a single trapping segment?
(This idea does not help locate negative buffer overruns (underruns?)
but they seem to be much less common?)
It also does not help for out-of-bounds accesses that are not just
adjacent to an earlier in-bounds access. That may also be a less
common vulnerability than adjacent positive-stride buffer overflows.
But if we throw hardware on the problem, do we want to spend hardware
on something that does not catch all out-of-bounds accesses?
- anton
Anton Ertl wrote:
George Neuner <gneuner2@comcast.net> writes:
The bad taste of segments is from exposure to Intel's half-assed
implementation which exposed the segment selector as part of the
address.
Segments /should/ have been implemented similar to the way paging is
done: the program using flat 32-bit addresses and the MMU (SMU?)
consulting some kind of segment "database" [using the term loosely].
What benefits do you expect from segments? One benefit usually
mentioned is security, in particular, protection against out-of-bounds
accesses (e.g., buffer overflow exploits).
The best idea I have seen to help detect out of bounds accesses, is to
round all requested memory blocks up to the next 4K boundary and mark
the next page as unavailable, then return a skewed pointer back, so that
the end of the requested region coincides with the end of the (last) allocated page.
This does require at least 8kB for every allocation, but
I guess they can all share a single trapping segment?
(This idea does not help locate negative buffer overruns (underruns?)
but they seem to be much less common?)
Terje
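In POSIX terms Terje's scheme is roughly what Electric Fence and the
other "malloc debug" allocators do; a minimal sketch (names are mine;
assumes 4 kB pages and an mmap-style interface):

    #include <stdint.h>
    #include <stddef.h>
    #include <sys/mman.h>

    void *guarded_alloc(size_t size)
    {
        size_t page = 4096;
        size_t rounded = (size + page - 1) & ~(page - 1);

        /* data pages plus one trapping guard page */
        uint8_t *p = mmap(NULL, rounded + page, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return NULL;
        mprotect(p + rounded, page, PROT_NONE); /* the "unavailable" page */
        return p + rounded - size;  /* skewed: block ends at the page end */
    }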
According to George Neuner <gneuner2@comcast.net>:
The bad taste of segments is from exposure to Intel's half-assed
implementation which exposed the segment selector as part of the
address.
Segments /should/ have been implemented similar to the way paging is
done: the program using flat 32-bit addresses and the MMU (SMU?)
consulting some kind of segment "database" [using the term loosely].
The whole point of a segmented architecture is that the segments are
visible and meaningful. You put a thing (for some definition of thing)
in a segment to control access to the thing. So if it's an array, all
of the address calculations are relative to the segment and out of
bounds references fail because they point to a non-existent part of the
segment. Similarly if it's code, a jump outside the segment's
boundaries fails.
Multics and the Burroughs machines had (still have, I suppose, for
emulated Burroughs) visible segments and programmers liked them just
fine. The problems were that the segment sizes were too small as
memories got bigger, and that they weren't byte addressed, which these
days is practically mandatory. The 286 added additional flaws: there
weren't enough segment registers, and segment loads were very slow.
What you're describing is multi-level page tables. Every virtual memory
system has them. Sometimes the operating systems make the higher level
tables visible to applications, sometimes they don't. For example, in
IBM mainframes the second level page table entries, which they call
segments, can be shared between applications.
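A sketch of the two-level walk John describes, with a hypothetical
32-bit split of my own choosing (10-bit top-level "segment" index,
10-bit page index, 12-bit offset); sharing a second-level table between
address spaces is what the IBM scheme amounts to:

    #include <stdint.h>

    /* Translate vaddr through a top-level table of pointers to page
       tables.  Presence checks omitted for brevity. */
    uint32_t translate(uint32_t *const *segtab, uint32_t vaddr)
    {
        const uint32_t *pagetab = segtab[vaddr >> 22];  /* "segment" entry */
        uint32_t pte = pagetab[(vaddr >> 12) & 0x3FF];  /* page table entry */
        return (pte & ~0xFFFu) | (vaddr & 0xFFF);       /* frame | offset */
    }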
Terje Mathisen <terje.mathisen@tmsw.no> schrieb:
The best idea I have seen to help detect out of bounds accesses, is to
round all requested memory blocks up to the next 4K boundary and mark
the next page as unavailable, then return a skewed pointer back, so that
the end of the requested region coincides with the end of the (last)
allocated page. This does require at least 8kB for every allocation, but
I guess they can all share a single trapping segment?
(This idea does not help locate negative buffer overruns (underruns?)
but they seem to be much less common?)
It is also problematic to allocate 8K (or more) for a small entity, or
on the stack.
Bounds checking should ideally impart minimum overhead so that it
can be enabled in production code.
Hmm... a beginning of an idea (for which I am ready to be shot
down, this is comp.arch :-)
This would work best for languages which explicitly pass
array bounds or sizes (like Fortran's assumed-shape arrays,
or, if I read this correctly, Rust's slices).
Assume a class of load and store instructions containing
- One source or destination register
- One base register
- One index register
- One ubound register
Memory access is to base + index, with one additional point:
If index > ubound, then the instruction raises an exception.
This works less well with C's pointers, for which you would have
to pass some sort of fat pointer. Compilers would have to make
sure that the address of the base object is passed.
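In software terms the semantics might look like this (my sketch; the
trap choice and the fat-pointer layout for C are assumptions):

    #include <signal.h>
    #include <stdint.h>

    struct fat_ptr {           /* what a C caller would have to pass */
        uint8_t  *base;        /* address of the base object         */
        uint64_t  ubound;      /* highest valid index                */
    };

    /* Semantics of the checked load: base + index addressing that
       traps when index > ubound, one instruction's worth of work. */
    static inline uint8_t checked_load(struct fat_ptr p, uint64_t index)
    {
        if (index > p.ubound)
            raise(SIGSEGV);    /* stands in for the hardware exception */
        return p.base[index];
    }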
Comments?