• Combining Practicality with Perfection

    From John Savard@quadibloc@invalid.invalid to comp.arch on Sat Jan 31 18:28:36 2026
    From Newsgroup: comp.arch

    I had looked into unusual memory architectures to allow a computer to be designed which had single-precision floats that were 36 bits long, so that
    it would be possible more often to avoid recourse to double precision, and which had double-precision floats that were 60 bits long, also a multiple
    of 12, because it wasn't necessary to have all the precision of 64-bit
    floats.

    Also thrown in were 48-bit floats, which were designed to have 11-digit precision and a range just exceeding 10^-99 to 10^99, so as to be
    comparable with what scientific calculators offer.
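    As a back-of-the-envelope check of those goals, here is a sketch assuming a plain sign/exponent/significand layout; the 1+10+37 field split is my guess at widths that meet the stated targets, not the article's actual format:

```python
import math

# Hypothetical 48-bit layout (an assumption, not the author's actual split):
# 1 sign bit + 10 exponent bits + 37 stored significand bits.
SIGN, EXP, FRAC = 1, 10, 37
assert SIGN + EXP + FRAC == 48

digits = (FRAC + 1) * math.log10(2)            # hidden bit: 38 effective bits
max_dec_exp = (2**(EXP - 1) - 1) * math.log10(2)

print(f"~{digits:.1f} decimal digits")         # ~11.4: meets the 11-digit goal
print(f"range ~10^+/-{max_dec_exp:.0f}")       # well past 10^+/-99
```

    A 9-bit exponent would only reach about 10^77, short of the stated range, which is why the sketch assumes 10 exponent bits.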

    While it was interesting to examine the possible ways this could be
    managed, all the possibilities involved awkwardness and complexity - as
    might be expected.

    So how could I achieve my original goals while avoiding awkwardness?

    Well, I came up with this:

    Have floating-point formats that are either 36 bits long or 72 bits long.

    That way, the 36-bit format is available, and longer formats, being twice
    as long, are easy to fetch from memory.

    One of the 72-bit formats has the same significand (or mantissa) length as
    the 48-bit floats in my idealized computer. But no bits are wasted;
    instead, the exponent field is just enlarged.

    It's still a conventional floating-point format, where the lengths of the exponent and significand are fixed, unlike John Gustafson's posits. But
    this gives it the advantage that either a computation will fail, or the precision of all the intermediate results will be the same as that of the final result; no catastrophic loss of precision will pass by unnoticed.

    The other 72-bit format has a significand the same size as that of IEEE
    754 64-bit floats. Offering lower precision, the same as that of a 60-bit float... would, no doubt, be too tough a sell. So the exponent field,
    while not as large as that of the other format, would still be 8 bits
    longer than usual, which, no doubt, would be helpful.
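    Taking "usual" to mean IEEE double's 11-bit exponent, the second 72-bit format tiles exactly: 1 sign + 19 exponent + 52 significand bits. A quick check of how far the exponent range stretches (a sketch inferred from the description, not a stated spec):

```python
import math

# Second 72-bit format as described: IEEE-double significand (52 stored bits)
# with an exponent field 8 bits longer than double's 11 bits.
SIGN, EXP, FRAC = 1, 11 + 8, 52
assert SIGN + EXP + FRAC == 72                      # fields tile 72 bits exactly

double_reach = (2**(11 - 1) - 1) * math.log10(2)    # ~10^308 for IEEE double
wide_reach = (2**(EXP - 1) - 1) * math.log10(2)     # ~10^78900 here

print(f"double: ~10^{double_reach:.0f}, 72-bit: ~10^{wide_reach:.0f}")
```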

    John Savard
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Thu Feb 5 01:57:22 2026
    From Newsgroup: comp.arch


    John Savard <quadibloc@invalid.invalid> posted:

    I had looked into unusual memory architectures to allow a computer to be designed which had single-precision floats that were 36 bits long, so that it would be possible more often to avoid recourse to double precision, and which had double-precision floats that were 60 bits long, also a multiple
    of 12, because it wasn't necessary to have all the precision of 64-bit floats.

    Does any machine on sale today (selling at least 100,000 machines/year)
    provide 36-bit or 60-bit or 72-bit FP ?!?

    If you want to build a 12-bit-base machine, go ahead--just don't
    expect much takeup.

    Also thrown in were 48-bit floats, which were designed to have 11-digit precision and a range just exceeding 10^-99 to 1^99, so as to be
    comparable with what scientific calculators offer.

    While it was interesting to examine the possible ways this could be
    managed, all the possibilities involved awkwardness and complexity - as might be expected.

    There really is something special about 2^(3+n) data sizes.
    It _IS_ what everyone wants...

    So how could I achieve my original goals while avoiding awkwardness?

    Avoid non 8^n design points altogether.

    Well, I came up with this:

    Have floating-point formats that are either 36 bits long or 72 bits long.

    Ok, better than the above, 12^n -> {12, 24, 48, 96}
    WOOPS no 36, 60 or 72 !!!
    ............................6^n -> {6, 12, 24, 48, 96} still does not work.

    That way, the 36-bit format is available, and longer formats, being twice
    as long, are easy to fetch from memory.

    Your problem is that 36 is not a 2^ of anything !

    One of the 72-bit formats has the same significand (or mantissa) length as the 48-bit floats in my idealized computer. But no bits are wasted;
    instead, the exponent field is just enlarged.

    72-bit FP (ala IEEE754 rules) is arguably better than Posits.

    It's still a conventional floating-point format, where the lengths of the exponent and significand are fixed, unlike John Gustafson's posits. But
    this gives it the advantage that either a computation will fail, or the precision of all the intermediate results will be the same as that of the final result; no catastrophic loss of precision will pass by unnoticed.

    The other 72-bit format has a significand
    ?? fraction ??
    the same size as that of IEEE
    754 64-bit floats. Offering lower precision, the same as that of a 60-bit float... would, no doubt, be too tough a sell. So the exponent field,
    while not as large as that of the other format, would still be 8 bits
    longer than usual, which, no doubt would be helpful.

    John Savard
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Thu Feb 5 06:06:05 2026
    From Newsgroup: comp.arch

    On Thu, 05 Feb 2026 01:57:22 +0000, MitchAlsup wrote:
    John Savard <quadibloc@invalid.invalid> posted:

    I had looked into unusual memory architectures to allow a computer to
    be designed which had single-precision floats that were 36 bits long,
    so that it would be possible more often to avoid recourse to double
    precision, and which had double-precision floats that were 60 bits
    long, also a multiple of 12, because it wasn't necessary to have all
    the precision of 64-bit floats.

    Any machine on sale today (selling at les 100,000 machines/year)
    provide 36-bit or 60-bit or 72-bit FP ?!?

    Not that I know of. Of course, there's Univac, which still sells machines supporting their old 36-bit architecture.

    If you want to build a 12-bit-base machine, go ahead--just don't expect
    much takeup.

    That's indeed the problem, so I tried to address the problem.

    So how could I achieve my original goals while avoiding awkwardness?

    Avoid non 8^n design points altogether.

    That, unfortunately, couldn't achieve my original goals.

    Well, I came up with this:

    Have floating-point formats that are either 36 bits long or 72 bits
    long.

    Ok, better than the above, 12^n -> {12, 24, 48, 96}
    WOOPS no 36, 60 or 72 !!!
    ............................6^n -> {6, 12, 24, 48, 96} still does not
    work.

    The idea is now there's a 9-bit byte, and everything is built around that 9-bit byte. Although 9 is not a power of two, all other lengths are 9
    times a power of two, so binary addressing of these bytes and two-byte and four-byte and eight-byte quantities remains just as simple as on a pure
    2^n machine.
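    The addressing claim can be sketched: the factor of 9 appears only when converting a byte address to a physical bit position, while alignment arithmetic stays pure power-of-two. A minimal illustration (not any real ISA):

```python
BITS_PER_BYTE = 9  # the hypothetical machine's byte width

def bit_offset(byte_addr: int) -> int:
    # Only the final byte-address-to-bit conversion involves the factor 9.
    return byte_addr * BITS_PER_BYTE

def align_down(byte_addr: int, size_bytes: int) -> int:
    # Sizes in bytes (1, 2, 4, 8) are still powers of two, so alignment
    # masking works exactly as on an 8-bit-byte machine.
    return byte_addr & ~(size_bytes - 1)

print(bit_offset(4))       # 36: one 36-bit word into memory
print(align_down(13, 4))   # 12: nearest 4-byte (36-bit) boundary
```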

    Since 2^n machines with *bit addressing* are just about as rare as 36-bit
    and 60-bit machines... now my proposal is "just as good".

    I _still_ don't _really_ expect much takeup, even though my floats have
    sizes that seem to match the precisions those engaged in scientific
    computing were fond of.

    One of the 72-bit formats has the same significand (or mantissa) length
    as the 48-bit floats in my idealized computer. But no bits are wasted;
    instead, the exponent field is just enlarged.

    72-bit FP (ala IEEE754 rules) is arguably better than Posits.

    At least one bit of positivity.

    The other 72-bit format has a significand

    ?? fraction ??

    A floating-point number usually has three parts; a sign, an exponent
    (which includes its own sign) and...

    a coefficient or mantissa or fraction... which is now referred to, in the
    IEEE standard, as a "significand", so I guess we have to get used to the
    new official name for it.

    John Savard

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Robert Finch@robfi680@gmail.com to comp.arch on Thu Feb 5 08:10:58 2026
    From Newsgroup: comp.arch

    On 2026-02-05 1:06 a.m., quadi wrote:
    On Thu, 05 Feb 2026 01:57:22 +0000, MitchAlsup wrote:
    John Savard <quadibloc@invalid.invalid> posted:

    I had looked into unusual memory architectures to allow a computer to
    be designed which had single-precision floats that were 36 bits long,
    so that it would be possible more often to avoid recourse to double
    precision, and which had double-precision floats that were 60 bits
    long, also a multiple of 12, because it wasn't necessary to have all
    the precision of 64-bit floats.

    Any machine on sale today (selling at les 100,000 machines/year)
    provide 36-bit or 60-bit or 72-bit FP ?!?

    Not that I know of. Of course, there's Univac, which still sells machines supporting their old 36-bit architecture.

    If you want to build a 12-bit-base machine, go ahead--just don't expect
    much takeup.

    That's indeed the problem, so I tried to address the problem.

    So how could I achieve my original goals while avoiding awkwardness?

    Avoid non 8^n design points altogether.

    That, unfortunately, couldn't achieve my original goals.

    Well, I came up with this:

    Have floating-point formats that are either 36 bits long or 72 bits
    long.

    Ok, better than the above, 12^n -> {12, 24, 48, 96}
    WOOPS no 36, 60 or 72 !!!
    ............................6^n -> {6, 12, 24, 48, 96} still does not
    work.

    The idea is now there's a 9-bit byte, and everything is built around that 9-bit byte. Although 9 is not a power of two, all other lengths are 9
    times a power of two, so binary addressing of these bytes and two-byte and four-byte and eight-byte quantities remains just as simple as on a pure
    2^n machine.

    Since 2^n machines with *bit addressing* are just about as rare as 36-bit
    and 60-bit machines... now my proposal is "just as good".

    I _still_ don't _really_ expect much takeup, even though my floats have
    sizes that seem to match the precisions those engaged in scientific
    computing were fond of.

    One of the 72-bit formats has the same significand (or mantissa) length
    as the 48-bit floats in my idealized computer. But no bits are wasted;
    instead, the exponent field is just enlarged.

    72-bit FP (ala IEEE754 rules) is arguably better than Posits.

    At least one bit of positivity.

    The other 72-bit format has a significand

    ?? fraction ??

    A floating-point number usually has three parts; a sign, an exponent
    (which includes its own sign) and...

    a coefficient or mantissa or fraction... which is now referred to, in the IEEE standard, as a "significand", so I guess we have to get used to the
    new official name for it.

    John Savard

    I do not see anything wrong with an odd sized machine. One just has to
    accept that it would not be accepted by the community at large. One
    would be expending a lot of effort. It may be considered an artistic
    endeavour.

    I have toyed with the idea of 10- or 11-bit bytes, as there could be byte
    error correction on them if they used 16 bits.

    An issue is that 2^n floats work very well. Some of the approximations
    are more costly to achieve (larger tables) with a wider byte. Estimating
    a reciprocal to eight bits is less costly than 9 or 10 bits. I would not
    do anything other than 32/64/128 bits floats even in a machine with odd
    sized bytes.

    I have made a couple of machines (not really finished off) with odd byte
    sizes and it is a ton of work to get the software working. Best to stick
    with eight bits.


    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Thu Feb 5 14:50:47 2026
    From Newsgroup: comp.arch

    quadi <quadibloc@ca.invalid> writes:
    On Thu, 05 Feb 2026 01:57:22 +0000, MitchAlsup wrote:
    John Savard <quadibloc@invalid.invalid> posted:

    I had looked into unusual memory architectures to allow a computer to
    be designed which had single-precision floats that were 36 bits long,
    so that it would be possible more often to avoid recourse to double
    precision, and which had double-precision floats that were 60 bits
    long, also a multiple of 12, because it wasn't necessary to have all
    the precision of 64-bit floats.

    Any machine on sale today (selling at les 100,000 machines/year)
    provide 36-bit or 60-bit or 72-bit FP ?!?

    Not that I know of. Of course, there's Univac, which still sells machines supporting their old 36-bit architecture.

    The last Unisys CMOS machines were shipped over a decade ago. Modern 2200 systems are all emulated on standard x86 cores running under linux.

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Thu Feb 5 23:58:52 2026
    From Newsgroup: comp.arch

    On Thu, 05 Feb 2026 06:06:05 +0000, quadi wrote:

    The idea is now there's a 9-bit byte, and everything is built around
    that 9-bit byte. Although 9 is not a power of two, all other lengths are
    9 times a power of two, so binary addressing of these bytes and two-byte
    and four-byte and eight-byte quantities remains just as simple as on a
    pure 2^n machine.

    Since 2^n machines with *bit addressing* are just about as rare as
    36-bit and 60-bit machines... now my proposal is "just as good".

    I have given more thought to interoperability with the 8-bit world.

    Giving it additional numeric types which are stored normally in registers,
    but which are stored in memory using only the least significant eight bits
    of each nine-bit byte, would allow it to exchange data with conventional machines based on the eight-bit byte.
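    The compatibility store can be sketched as packing each octet of an external value into the low 8 bits of a 9-bit byte, leaving every top bit zero. A 32-bit integer is illustrative (names and endianness are my choices, not the proposal's):

```python
def store_u32(value: int) -> list[int]:
    # A 32-bit compatibility integer occupies four 9-bit bytes,
    # big-endian, 8 payload bits per byte; each byte's top bit stays zero.
    return [(value >> (8 * i)) & 0xFF for i in (3, 2, 1, 0)]

def load_u32(four_bytes: list[int]) -> int:
    # Reassemble on the 9-bit machine, ignoring each byte's top bit.
    v = 0
    for b in four_bytes:
        v = (v << 8) | (b & 0xFF)
    return v

assert load_u32(store_u32(0xDEADBEEF)) == 0xDEADBEEF
```

    The same octet stream, read on an 8-bit-byte machine, is an ordinary 32-bit integer, which is the whole point of the format.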

    John Savard
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From jgd@jgd@cix.co.uk (John Dallman) to comp.arch on Fri Feb 6 16:37:00 2026
    From Newsgroup: comp.arch

    In article <10m3ars$3loo1$1@dont-email.me>, quadibloc@ca.invalid (quadi)
    wrote:

    Giving it additional numeric types which are stored normally in
    registers, but which are stored in memory using only the least
    significant eight bits of each nine-bit byte, would allow it to
    exchange data with conventional machines based on the eight-bit
    byte.

    Isn't that going to create opcode space pressure?

    It's certainly going to create some interesting new types of bad data if pointers to the two types of data get confused.

    How are you planning to handle UTF-8, UTF-16 and UTF-32 character data? Creating UTF-9, UTF-18 and UTF-36 seems like pointless complexity.


    John
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Fri Feb 6 22:32:25 2026
    From Newsgroup: comp.arch

    On Fri, 06 Feb 2026 16:37:00 +0000, John Dallman wrote:

    Isn't that going to create opcode space pressure?

    Well, that will be less of an issue in an architecture where the
    instructions are stored in wider memory.

    How are you planning to handle UTF-8, UTF-16 and UTF-32 character data? Creating UTF-9, UTF-18 and UTF-36 seems like pointless complexity.

    I think UTF-9 was described in an April 1st RFC. But I agree with that.

    Essentially, I am now thinking that a CPU with this architecture might
    have its primary application as a numerical co-processor for a
    conventional CPU. This would provide the opportunity for carrying out computations with extra exponent range or higher precision without having
    to switch to a much larger floating-point format, thus avoiding loss of
    speed.

    One would need to create a new kind of RAM module to support a 144-bit
    wide data bus, but it would be unrealistic to create new video cards and
    so on.

    So it would have its own FORTRAN compiler - that would be the highest
    priority in software development, after some kind of operating system for
    the compiler to run within. Well, maybe porting a C compiler would need to come first, to allow everything else to be ported.

    John Savard
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Fri Feb 6 22:34:24 2026
    From Newsgroup: comp.arch

    On Fri, 06 Feb 2026 22:32:25 +0000, quadi wrote:
    On Fri, 06 Feb 2026 16:37:00 +0000, John Dallman wrote:

    How are you planning to handle UTF-8, UTF-16 and UTF-32 character data?
    Creating UTF-9, UTF-18 and UTF-36 seems like pointless complexity.

    I think UTF-9 was described in an April 1st RFC.

    Ah, yes. Here we are:

    https://www.ietf.org/rfc/rfc4042.txt

    RFC 4042.

    John Savard
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Fri Feb 6 22:40:56 2026
    From Newsgroup: comp.arch

    On Fri, 06 Feb 2026 16:37:00 +0000, John Dallman wrote:

    How are you planning to handle UTF-8, UTF-16 and UTF-32 character data?

    Character strings would normally be handled as sequences of nine-bit bytes.

    Given that the various compatibility formats for numbers would place them
    in the least significant eight bits of successive bytes, this is also how eight-bit characters would be handled; they would be placed in successive nine-bit bytes.

    Then they would be converted to 31-bit numbers representing Unicode characters; presumably as normal 36-bit integers, but they could be placed
    in 32-bit compatibility-form integers if they were going back out to a conventional system.
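    As a sketch of the scheme described above: with one UTF-8 code unit in the low 8 bits of each 9-bit byte, ordinary 8-bit decoding logic applies after masking (the function name is mine, for illustration):

```python
def decode_utf8_9bit(nine_bit_bytes: list[int]) -> list[int]:
    # Each 9-bit byte carries one UTF-8 code unit in its low 8 bits;
    # masking off the top bit recovers the ordinary octet stream.
    octets = bytes(b & 0xFF for b in nine_bit_bytes)
    # Code points (at most 21 bits today, 31 bits in the original
    # UTF-8 design) fit easily in 36-bit integers.
    return [ord(c) for c in octets.decode("utf-8")]

print(decode_utf8_9bit([0xC3, 0xA9]))   # [233], i.e. U+00E9 'é'
```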

    John Savard
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Sat Feb 7 00:57:08 2026
    From Newsgroup: comp.arch


    quadi <quadibloc@ca.invalid> posted:

    On Fri, 06 Feb 2026 16:37:00 +0000, John Dallman wrote:

    Isn't that going to create opcode space pressure?

    Well, that will be less of an issue in an architecture where the instructions are stored in wider memory.

    I would think that with 36-bit instructions, you have the OpCode space
    to 'blow'...

    How are you planning to handle UTF-8, UTF-16 and UTF-32 character data? Creating UTF-9, UTF-18 and UTF-36 seems like pointless complexity.

    I think UTF-9 was described in an April 1st RFC. But I agree with that.

    Essentially, I am now thinking that a CPU with this architecture might
    have its primary application as a numerical co-processor for a
    conventional CPU. This would provide the opportunity for carrying out computations with extra exponent range or higher precision without having
    to switch to a much larger floating-point format, thus avoiding loss of speed.

    One would need to create a new kind of RAM module to support a 144-bit
    wide data bus, but it would be unrealistic to create new video cards and
    so on.

    So it would have its own FORTRAN compiler - that would be the highest priority in software development, after some kind of operating system for the compiler to run within. Well, maybe porting a C compiler would need to come first, to allow everything else to be ported.

    C and FORTRAN will put you in a position where everything is 9×2^n in
    size, so you might as well just design a 72-bit machine. 72-bit registers, 72-bit Virtual Address Space, ...

    And then have the MMUs do translations between 8-bit Byte-world and 9-bit Byte-world. Everything in CPU-land is 72-bits... Done right, LDs and STs through certain PTEs "translate" between 64-bit world and 72-bit world.

    If you have the MMU doing the translation, you do not need 8×2^n instruction calculations--saving OpCode space {from Concertina III !}

    John Savard

    Why Quadi ??
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From jgd@jgd@cix.co.uk (John Dallman) to comp.arch on Sat Jan 31 22:36:00 2026
    From Newsgroup: comp.arch

    In article <10llhkk$34374$2@dont-email.me>, quadibloc@invalid.invalid
    (John Savard) wrote:

    Have floating-point formats that are either 36 bits long or 72 bits
    long.

    That way, the 36-bit format is available, and longer formats, being
    twice as long, are easy to fetch from memory.
    ...
    The other 72-bit format has a significand the same size as that of
    IEEE 754 64-bit floats. Offering lower precision, the same as that
    of a 60-bit float... would, no doubt, be too tough a sell. So the
    exponent field, while not as large as that of the other format,
    would still be 8 bits longer than usual, which, no doubt would be
    helpful.

    If you're going to have non-IEEE-standard formats, that seems a way for
    them to be more acceptable. Accurate representation of smaller values
    without underflow is good.

    John
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Sat Feb 7 02:45:35 2026
    From Newsgroup: comp.arch

    On Sat, 07 Feb 2026 00:57:08 +0000, MitchAlsup wrote:
    quadi <quadibloc@ca.invalid> posted:

    John Savard

    Why Quadi ??

    http://www.quadibloc.com/crypto/co0407.htm

    John Savard

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Mon Feb 9 22:46:44 2026
    From Newsgroup: comp.arch

    On Thu, 05 Feb 2026 23:58:52 +0000, quadi wrote:

    I have given more thought to interoperability with the 8-bit world.

    Giving it additional numeric types which are stored normally in
    registers,
    but which are stored in memory using only the least significant eight
    bits of each nine-bit byte, would allow it to exchange data with
    conventional machines based on the eight-bit byte.

    I have now added a new page to my site,

    http://www.quadibloc.com/arch/per16.htm

    where this is explained more completely with illustrations.

    John Savard
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Tue Feb 10 03:49:51 2026
    From Newsgroup: comp.arch

    On Mon, 09 Feb 2026 22:46:44 +0000, quadi wrote:

    On Thu, 05 Feb 2026 23:58:52 +0000, quadi wrote:

    I have now added a new page to my site,

    http://www.quadibloc.com/arch/per16.htm

    where this is explained more completely with illustrations.

    I have further updated that page to show how this principle can be
    extended to connect the 36-bit word computer not only to a 32-bit word computer, but also to a 24-bit word computer, and I mention that integer formats as well as floating-point ones of this type are needed.

    John Savard
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Tue Feb 10 19:10:45 2026
    From Newsgroup: comp.arch


    quadi <quadibloc@ca.invalid> posted:

    On Mon, 09 Feb 2026 22:46:44 +0000, quadi wrote:

    On Thu, 05 Feb 2026 23:58:52 +0000, quadi wrote:

    I have now added a new page to my site,

    http://www.quadibloc.com/arch/per16.htm

    where this is explained more completely with illustrations.

    I have further updated that page to show how this principle can be
    extended to connect the 36-bit word computer not only to a 32-bit word computer, but also to a 24-bit word computer, and I mention that integer formats as well as floating-point ones of this type are needed.

    Over the last couple of days, I have come to the conclusion that
    your job, in the near future, is to sell the idea of a 72-bit
    computer architecture.

    Figure out some way to have 9-bit, 12-bit things (without resorting
    to 3-bit base things) and you have access to {9, 12, 18, 24, 30, 36,
    42, 48, 54, 60, 66, 72} data sizes.

    Provide a means to access 8×2^n via PTE translation, and presto, you
    have all the (odd) data sizes you have been struggling with for Oh so
    long.

    John Savard
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Wed Feb 11 23:04:20 2026
    From Newsgroup: comp.arch

    On Tue, 10 Feb 2026 19:10:45 +0000, MitchAlsup wrote:

    Over the last couple of days, I have come to the conclusion that your
    job, in the near future, is to sell the idea of a 72-bit computer architecture.

    Here, you are raising the one question that I have been merrily avoiding
    as irrelevant. Although it is anything but irrelevant in one way, as it directly deals with the value of all this in the real world.

    Trying to persuade the world to switch from 32 bits to 36 bits? How could
    I be anything other than an amusing crank if I did that?

    I remember having read one article in a computer magazine where someone mentioned that an unfortunate result of the transition from the IBM 7090
    to the IBM System/360 was that a lot of FORTRAN programs that were able to
    use ordinary real numbers had to be switched over to double precision to
    yield acceptable results.

    And I noticed that a lot of mathematical tables from the old days went up
    to 10 digit accuracy, and scientific calculators had 10 digit displays, calculating internally to a slightly higher precision.

    And a passing remark in Petr Beckmann's "A History of Pi" about how even
    using pi to the accuracy of a computer double precision number was 'artificial' encouraged me to think of trimming down double precision a
    bit - say by one digit, to match the precision of numbers in the Control
    Data 6600, with which scientists seemed to have been quite content in its
    day.

    All this was a rather slim basis on which to conclude that our 32-bit and 64-bit floats ought to be replaced by 36-bit, 48-bit, and 60-bit floats.

    And in the days that immediately followed the emergence of the IBM System/
    360, of course, transistors were still *expensive*. So it made sense to be concerned about optimizing floating-point formats, so that their precision
    was as much as necessary, but no more - so that a computer with as few transistors as possible could perform calculations as fast as possible to
    get the results needed.

    But now? Powerful microprocessors are cheap. The cost of buying a custom specialized part would be so high as to completely eliminate the potential savings of using 36-bit floats instead of 64-bit floats when they might do.

    So the only way a benefit would result... is if 36/72 bits became the ubiquitous new standard! I suppose that _could_ happen, if it were widely acknowledged that the requirements of scientific computing would be better
    met in that case.

    So it seems as if it's impossible for the 36/72 bit transition to start on
    a small scale, with something that fills a niche demand, because the lower production volumes would create higher costs that entirely negate the
    value for the niche.

    Except...

    Speaking of niche products, there's the SX-Aurora TSUBASA from NEC... it
    looks like a video card, but it's actually the last surviving *vector* supercomputer in the Cray tradition!

    As it happens, I encountered - in my years as a grad student - a computer add-on from Floating-Point Systems which, so that it could be attached to (then still existing) 36-bit computers or 18-bit minis in addition to the 32-bit and 16-bit ones... used 38-bit floating-point numbers internally.

    And Cray-style vector instructions are one thing I've been including in my various hypothetical architectures, on the grounds that they're about the
    only architectural feature aimed at providing more power that (some) mainframes had that isn't routine in micros these days. Of course, though, you've noted that it can't really be effective without huge memory
    bandwidth, which is impractical to provide.

    And the SX-Aurora TSUBASA has internal memory, which may even be HBM, so
    that removes the issue that standard memory modules are designed around
    the 32/64/128/256 -bit data bus width.

    So vector modules are a potential niche that could run in 36 bits while connecting to a 32 bit world - and making 36 bits connect to 32 bits is,
    of course, just what my latest brainstorm was dealing with.

    John Savard
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Wed Feb 11 23:55:29 2026
    From Newsgroup: comp.arch


    quadi <quadibloc@ca.invalid> posted:

    On Tue, 10 Feb 2026 19:10:45 +0000, MitchAlsup wrote:

    Over the last couple of days, I have come to the conclusion that your
    job, in the near future, is to sell the idea of a 72-bit computer architecture.

    Here, you are raising the one question that I have been merrily avoiding
    as irrelevant. Although it is anything but irrelevant in one way, as it directly deals with the value of all this in the real world.

    Trying to persuade the world to switch from 32 bits to 36 bits? How could
    I be anything other than an amusing crank if I did that?

    You do not.

    Remember I said you could use PTEs to access 32-bit data (or 36-bit data)
    and thus, calculations instructions do not need 32-bit-ed-ness. So, the
    whole data-path is 36/72/144-bits wide.

    Properly arranged, your disk files remain 32-bit, internet 32-bits, ...

    I remember having read one article in a computer magazine where someone mentioned that an unfortunate result of the transition from the IBM 7090
    to the IBM System/360 was that a lot of FORTRAN programs that were able to use ordinary real numbers had to be switched over to double precision to yield acceptable results.

    I remember that, too.

    And I noticed that a lot of mathematical tables from the old days went up
    to 10 digit accuracy, and scientific calculators had 10 digit displays, calculating internally to a slightly higher precision.

    And a passing remark in Petr Beckmann's "A History of Pi" about how even using pi to the accuracy of a computer double precision number was 'artificial' encouraged me to think of trimming down double precision a
    bit - say by one digit, to match the precision of numbers in the Control Data 6600, with which scientists seemed to have been quite content in its day.

    Scientists of the day were happy with 60-bit CRAY-quality FP. They were
    not happy with IBM 32-bit, and sort-of-OK with Univac 36-bit.

    All this was a rather slim basis on which to conclude that our 32-bit and 64-bit floats ought to be replaced by 36-bit, 48-bit, and 60-bit floats.

    36/72-bit formats have the property that the square of a 32/64-bit float does not overflow !!
    avoiding all sorts of IEEE_HYPOT() problems.
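    The overflow point is easy to demonstrate in IEEE double (which is what Python floats are): the naive sum-of-squares blows up, and hypot implementations must rescale to work around it; a wider exponent field would make that workaround unnecessary:

```python
import math

x = y = 1e200
naive = math.sqrt(x * x + y * y)   # x*x overflows double's exponent to inf
assert math.isinf(naive)

# math.hypot rescales internally to dodge the intermediate overflow --
# exactly the trouble a wider exponent field would avoid.
assert math.isclose(math.hypot(x, y), x * math.sqrt(2))
```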

    And in the days that immediately followed the emergence of the IBM System/ 360, of course, transistors were still *expensive*. So it made sense to be concerned about optimizing floating-point formats, so that their precision was as much as necessary, but no more - so that a computer with as few transistors as possible could perform calculations as fast as possible to get the results needed.

    So optimized that IBM forgot about the Guard digit !!!

    But now? Powerful microprocessors are cheap. The cost of buying a custom specialized part would be so high as to completely eliminate the potential savings of using 36-bit floats instead of 64-bit floats when they might do.

    So the only way a benefit would result... is if 36/72 bits became the ubiquitous new standard! I suppose that _could_ happen, if it were widely acknowledged that the requirements of scientific computing would be better met in that case.

    In this case YOU have to ask YOURSELF why are you providing any of those strange data sizes AT ALL ??? That is who is Concertina for ???

    So it seems as if it's impossible for the 36/72 bit transition to start on
    a small scale, with something that fills a niche demand, because the lower production volumes would create higher costs that entirely negate the
    value for the niche.

    Except...

    Speaking of niche products, there's the SX-Aurora TSUBASA from NEC... it looks like a video card, but it's actually the last surviving *vector* supercomputer in the Cray tradition!

    As it happens, I encountered - in my years as a grad student - a computer add-on from Floating-Point Systems which, so that it could be attached to (then still existing) 36-bit computers or 18-bit minis in addition to the 32-bit and 16-bit ones... used 38-bit floating-point numbers internally.

    And Cray-style vector instructions are one thing I've been including in my various hypothetical architectures, on the grounds that they're about the
    only architectural feature aimed at providing more power that (some) mainframes had that isn't routine in micros these days. Of course, though, you've noted that it can't really be effective without huge memory bandwidth, which is impractical to provide.

    And the SX-Aurora TSUBASA has internal memory, which may even be HBM, so that removes the issue that standard memory modules are designed around
    the 32/64/128/256 -bit data bus width.

    So vector modules are a potential niche that could run in 36 bits while connecting to a 32 bit world - and making 36 bits connect to 32 bits is,
    of course, just what my latest brainstorm was dealing with.

    John Savard
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From David Schultz@david.schultz@earthlink.net to comp.arch on Wed Feb 11 17:57:29 2026
    From Newsgroup: comp.arch

    On 2/11/26 5:04 PM, quadi wrote:
    I remember having read one article in a computer magazine where someone mentioned that an unfortunate result of the transition from the IBM 7090
    to the IBM System/360 was that a lot of FORTRAN programs that were able to use ordinary real numbers had to be switched over to double precision to
    yield acceptable results.

    This reminds me of when I took a numerical analysis course. (The many
    ways that computer calculations can go wrong and how to deal with it.)
    The professor said that the school's IBM (360 or 370, ca. 1980) was
    perfect for the course because of the defects in its floating point
    system. Guard digits and rounding sorts of things as near as I can recall.
    --
    http://davesrocketworks.com
    David Schultz
    "Gag me with a Smurf"
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From John Levine@johnl@taugh.com to comp.arch on Thu Feb 12 00:04:37 2026
    From Newsgroup: comp.arch

    According to David Schultz <david.schultz@earthlink.net>:
    This reminds me of when I took a numerical analysis course. (The many
    ways that computer calculations can go wrong and how to deal with it.)
    The professor said that the schools IBM (360 or 370, ca. 1980) was
    perfect for the course because of the defects in its floating point
    system. Guard digits and rounding sorts of things as near as I can recall.

    The 360's floating point is a famous and somewhat puzzling failure, considering how much else they got right.

    It does hex normalization rather than binary. They assumed that
    leading digits are evenly distributed so there'd be on average one
    zero bit, but in fact they're geometrically distributed, so on average
    there's two. They got one bit back by making the exponent units of 16
    rather than 2, but that's still one bit gone. They also truncated
    rather than rounded results, another bit gone.

    Originally there were no guard digits, which made the results comically
    bad but IBM retrofitted them at great cost to all the installed machines.

    IEEE floating point can be seen as a reaction to that, how do you use
    the same number of bits but get good results.
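    The cost of hex normalization and truncation is easy to model. The sketch below quantizes a double to a six-hex-digit, hex-normalized, truncating fraction in the style of S/360 single precision; `s360_single` is an illustrative name, and the real format's exponent range and encoding are ignored:

```python
import math
import random

def s360_single(x):
    """Model of IBM S/360 single precision: a six-hex-digit fraction
    in [1/16, 1), a hexadecimal exponent, and truncation.  (A sketch:
    the real format's exponent range and sign encoding are ignored.)"""
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    m, e = math.frexp(abs(x))            # abs(x) = m * 2**e, m in [0.5, 1)
    he = (e + 3) // 4                    # hex exponent: abs(x)/16**he in [1/16, 1)
    frac = abs(x) / 16.0 ** he
    frac = math.floor(frac * 16 ** 6) / 16 ** 6   # chop to 24 fraction bits
    return sign * frac * 16.0 ** he

# worst-case relative error over random samples approaches 2**-20,
# several bits worse than a rounded binary format of the same width
random.seed(1)
worst = max(abs(x - s360_single(x)) / x
            for x in (random.uniform(1.0, 1e6) for _ in range(10000)))
```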
    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Thu Feb 12 02:04:58 2026
    From Newsgroup: comp.arch


    John Levine <johnl@taugh.com> posted:

    According to David Schultz <david.schultz@earthlink.net>:
    This reminds me of when I took a numerical analysis course. (The many
    ways that computer calculations can go wrong and how to deal with it.)
    The professor said that the schools IBM (360 or 370, ca. 1980) was
    perfect for the course because of the defects in its floating point >system. Guard digits and rounding sorts of things as near as I can recall.

    The 360's floating point is a famous and somewhat puzzling failure, considering
    how much else they got right.

    It does hex normalization rather than binary. They assumed that
    leading digits are evenly distributed so there's be on average one
    zero bit, but in fact they're geometrically distributed, so on average there's two. They got one bit back by making the exponent units of 16
    rather than 2, but that's still one bit gone. It truncated rather than rounding, another bit gone. They also truncated rather than rounding results.

    Originally there wre no guard digits which made the results comically
    bad but IBM retrofitted them at great cost to all the installed machines.

    IEEE floating point can be seen as a reaction to that, how do you use
    the same number of bits but get good results.

    VAX got this correct too (the VAX format not the one inherited from
    PDP-11/45; PDP-11/40* FP was worse). VAX FP is arguably as good as
    IEEE 754 with the exception that more IEEE numbers have reciprocals
    due to the change in exponent bias by 1. {{One can STILL argue whether deNormals were a plus or a minus in IEEE}}

    CMU had a PDP-11/40 with writable control store in 1974. I programmed it
    to do PDP-11/45 FP instead of PDP-11/40 FP as a Jr. project.

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From John Levine@johnl@taugh.com to comp.arch on Thu Feb 12 03:31:02 2026
    From Newsgroup: comp.arch

    According to MitchAlsup <user5857@newsgrouper.org.invalid>:

    John Levine <johnl@taugh.com> posted:

    According to David Schultz <david.schultz@earthlink.net>:
    This reminds me of when I took a numerical analysis course. (The many
    ways that computer calculations can go wrong and how to deal with it.)
    The professor said that the schools IBM (360 or 370, ca. 1980) was
    perfect for the course because of the defects in its floating point
    system. Guard digits and rounding sorts of things as near as I can recall.
    The 360's floating point is a famous and somewhat puzzling failure, considering
    how much else they got right.

    It does hex normalization rather than binary. They assumed that
    leading digits are evenly distributed so there's be on average one
    zero bit, but in fact they're geometrically distributed, so on average
    there's two. They got one bit back by making the exponent units of 16
    rather than 2, but that's still one bit gone. It truncated rather than
    rounding, another bit gone. They also truncated rather than rounding
    results.

    Oh I forgot that using hex exponents meant there was no hidden bit, so
    in practice it lost three bits of precision on every operation. There was
    a great deal of grumbling that people with 709x Fortran codes had to
    make everything double precision to keep getting reasonably good results.

    Originally there wre no guard digits which made the results comically
    bad but IBM retrofitted them at great cost to all the installed machines.

    IEEE floating point can be seen as a reaction to that, how do you use
    the same number of bits but get good results.

    VAX got this correct too (the VAX format not the one inherited from PDP-11/45; PDP-11/40* FP was worse). ...

    The VAX is the first machine I know that used the hidden bit trick to
    get an extra bit of significance. The PDP-6/10 was pretty close but
    their format was two's complement which meant no hidden bit but they
    could use integer comparisons on normalized floating point numbers.
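    IEEE 754 kept a version of that property: for non-negative finite values the bit patterns, read as unsigned integers, sort in the same order as the numbers. A quick Python check on binary64:

```python
import random
import struct

def as_uint(x: float) -> int:
    """Reinterpret a binary64 value's bit pattern as an unsigned integer."""
    return struct.unpack('<Q', struct.pack('<d', x))[0]

random.seed(42)
xs = sorted(random.uniform(0.0, 1e12) for _ in range(1000))
# integer order of the bit patterns matches numeric order
assert [as_uint(v) for v in xs] == sorted(as_uint(v) for v in xs)
```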
    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Stephen Fuld@sfuld@alumni.cmu.edu.invalid to comp.arch on Wed Feb 11 19:50:00 2026
    From Newsgroup: comp.arch

    On 2/11/2026 3:04 PM, quadi wrote:

    snip


    And I noticed that a lot of mathematical tables from the old days went up
    to 10 digit accuracy, and scientific calculators had 10 digit displays, calculating internally to a slightly higher precision.

    The ten digit displays came from the design of the first electric
    calculators, made by such companies as Friden and Monroe in the 1940s
    and 50s. They had ten rows of numeric keys (0-9), so that the
    operator, who presumably had ten fingers (including thumbs) could
    operate them quickly. So 10 digits sort of became standard. When
    computers came along, and the designers wanted to use binary for them,
    they needed 35 bits (including sign) to hold the ten digits. Going with
    36 bits allowed six six-bit characters. The requirement from the US
    Navy (a major customer) for that precision led to the Univac 1100 series being a 36-bit machine. Once you have 36-bit integers, you might
    as well use 36-bit floating point numbers, and then 72-bit double
    precision floating point numbers, as the 1100 series did.
    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Thu Feb 12 05:51:03 2026
    From Newsgroup: comp.arch

    On Wed, 11 Feb 2026 17:57:29 -0600, David Schultz wrote:
    On 2/11/26 5:04 PM, quadi wrote:

    I remember having read one article in a computer magazine where someone
    mentioned that an unfortunate result of the transition from the IBM
    7090 to the IBM System/360 was that a lot of FORTRAN programs that were
    able to use ordinary real numbers had to be switched over to double
    precision to yield acceptable results.

    This reminds me of when I took a numerical analysis course. (The many
    ways that computer calculations can go wrong and how to deal with it.)
    The professor said that the schools IBM (360 or 370, ca. 1980) was
    perfect for the course because of the defects in its floating point
    system. Guard digits and rounding sorts of things as near as I can
    recall.

    Mitch Alsup mentioned that there was no guard digit in the floating-point arithmetic units of the various IBM System/360 models when they were
    initially released. However, this was so serious an omission, as was
    soon noted in practice, that IBM quickly modified the design and
    refitted all the units in the field.
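    The effect of the missing guard digit is easy to reproduce in a toy three-significant-digit decimal format; `toy_sub` below is an invention for illustrating the alignment problem, not a model of the actual S/360 datapath:

```python
from decimal import Decimal, ROUND_DOWN

def toy_sub(a, b, prec=3, guard=True):
    """Subtract b from a (0 < b <= a) in a toy decimal float with
    `prec` significant digits: b is aligned to a's exponent and
    chopped to prec digit positions, plus one guard digit if enabled."""
    a, b = Decimal(a), Decimal(b)
    keep = prec + (1 if guard else 0)
    ulp = Decimal(10) ** (a.adjusted() - keep + 1)  # smallest kept place
    b_kept = (b / ulp).to_integral_value(rounding=ROUND_DOWN) * ulp
    return a - b_kept

# 1.00 - 0.999 should be 0.001; without a guard digit the shifted-out
# trailing 9 is lost before the subtract, and the answer is ten times
# too large
assert toy_sub('1.00', '0.999', guard=True)  == Decimal('0.001')
assert toy_sub('1.00', '0.999', guard=False) == Decimal('0.01')
```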

    Even after this was done, though, since the exponent in IBM floating-point
    was a power of 16 rather than 2, and since floating-point calculations
    were truncated rather than rounded on the System/360, its floating-point
    was still considered to be less than the greatest.

    There were workarounds, though, which people have mostly forgotten about
    due to the ubiquity of IEEE 754 floating-point these days. A famous
    numerical analysis textbook which explained how to cope with the problems caused by substandard floating point formats was _Floating-Point
    Computation_ by Pat H. Sterbenz.

    John Savard

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Thu Feb 12 05:55:14 2026
    From Newsgroup: comp.arch

    On Wed, 11 Feb 2026 19:50:00 -0800, Stephen Fuld wrote:
    On 2/11/2026 3:04 PM, quadi wrote:

    And I noticed that a lot of mathematical tables from the old days went
    up to 10 digit accuracy, and scientific calculators had 10 digit
    displays, calculating internally to a slightly higher precision.

    The ten digit displays came from the design of the first electric calculators, made by such companies as Friden and Monroe in the 1940s
    and 50s). They had ten rows of numeric keys (0-9), so that the
    operator, who presumably had ten fingers (including thumbs) could
    operate them quickly.

    So you're saying that the tendency of log tables and the like to go up to
    a maximum of ten digits precision wasn't because ten digits were needed
    for, say, celestial mechanics or something like that, so my premise, that
    ten significant digits is what scientific computation usually needs, as
    reflected in the design of calculators and math tables, is completely
    mistaken.

    Drat!

    John Savard
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Thu Feb 12 06:04:49 2026
    From Newsgroup: comp.arch

    On Wed, 11 Feb 2026 23:55:29 +0000, MitchAlsup wrote:
    quadi <quadibloc@ca.invalid> posted:
    All this was a rather slim basis on which to conclude that our 32-bit
    and 64-bit floats ought to be replaced by 36-bit, 48-bit, and 60-bit
    floats.

    36/72-bit have the property that 32/64-bit^2 does not overflow !!
    avoiding all sorts of IEEE_HYPOT() problems.

    If one's primary floating-point format has a certain length of exponent
    and mantissa, then a second format is needed, with at least twice the
    mantissa length and an exponent field that's one bit longer, to hold
    intermediate results when exact squares are sometimes needed.

    That was a point I had not thought of.
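    For IEEE as it stands, this already holds between binary32 and binary64: double's exponent field is three bits wider than single's, more than the one-bit minimum, so the square of any finite float32 value fits in a double. A quick Python check (0x7F7FFFFF is the binary32 largest-finite bit pattern):

```python
import math
import struct

# decode the largest finite binary32 value from its bit pattern
big = struct.unpack('<f', struct.pack('<I', 0x7F7FFFFF))[0]
sq = big * big                 # evaluated in binary64 (11-bit exponent)
assert math.isfinite(sq)       # the square fits, with room to spare
```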

    So the only way a benefit would result... is if 36/72 bits became the
    ubiquitous new standard! I suppose that _could_ happen, if it were
    widely acknowledged that the requirements of scientific computing would
    be better met in that case.

    In this case YOU have to ask YOURSELF why are you providing any of those strange data sizes AT ALL ??? That is who is Concertina for ???

    As I've noted, the strange data sizes were provided on the basis that they would be preferable for scientific computation.

    And so I continued on to suggest a possibility might be to address a niche market (just like ARM became a thing in an x86 world by going into smartphones) - use the strange data sizes to perform computations
    internally in a numerical vector coprocessor... your internal values are a
    few bits more precise, so your final answer, back in a standard data type,
    is more accurate.

    John Savard
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.arch on Thu Feb 12 08:45:03 2026
    From Newsgroup: comp.arch

    MitchAlsup <user5857@newsgrouper.org.invalid> writes:
    {{One can STILL argue whether
    deNormals were a plus or a minus in IEEE}}

    I am surprised to read that from you, who has always written that
    denormals can be implemented cheaply and efficiently in hardware. The additional hardware cost (or the cost of trapping and software
    emulation) has been the only argument against denormals that I ever encountered.

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Michael S@already5chosen@yahoo.com to comp.arch on Thu Feb 12 10:49:47 2026
    From Newsgroup: comp.arch

    On Thu, 12 Feb 2026 02:04:58 GMT
    MitchAlsup <user5857@newsgrouper.org.invalid> wrote:
    John Levine <johnl@taugh.com> posted:

    According to David Schultz <david.schultz@earthlink.net>:
    This reminds me of when I took a numerical analysis course. (The
    many ways that computer calculations can go wrong and how to deal
    with it.) The professor said that the schools IBM (360 or 370, ca.
    1980) was perfect for the course because of the defects in its
    floating point system. Guard digits and rounding sorts of things
    as near as I can recall.

    The 360's floating point is a famous and somewhat puzzling failure, considering how much else they got right.

    It does hex normalization rather than binary. They assumed that
    leading digits are evenly distributed so there's be on average one
    zero bit, but in fact they're geometrically distributed, so on
    average there's two. They got one bit back by making the exponent
    units of 16 rather than 2, but that's still one bit gone. It
    truncated rather than rounding, another bit gone. They also
    truncated rather than rounding results.

    Originally there wre no guard digits which made the results
    comically bad but IBM retrofitted them at great cost to all the
    installed machines.

    IEEE floating point can be seen as a reaction to that, how do you
    use the same number of bits but get good results.

    VAX got this correct too (the VAX format not the one inherited from PDP-11/45; PDP-11/40* FP was worse). VAX FP is arguably as good as
    IEEE 754 with the exception that more IEEE numbers have reciprocals
    due to the change in exponent bias by 1. {{One can STILL argue whether deNormals were a plus or a minus in IEEE}}
    From the perspective of stability of convergence of a few common
    algorithms, denormals are a significant plus.
    From the perspective of minimizing surprises it is also a plus. On VAX
    (a > b) does not necessarily guarantee (a-b > 0).
    I wonder in which situation it can be seen as a minus?
    There are several things that I don't like about the IEEE-754 Standard,
    but none of them is related to the format of binary numbers.
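    The (a > b) versus (a-b > 0) guarantee is exactly what gradual underflow buys: with IEEE subnormals, the difference of two distinct finite numbers never rounds to zero. A Python check:

```python
import sys

a = sys.float_info.min * 1.5   # just above the smallest normal number
b = sys.float_info.min
d = a - b                      # exact result is subnormal

assert a > b and d > 0         # flush-to-zero would have given d == 0
assert d < sys.float_info.min  # d is indeed below the normal range
```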

    CMU had a PDP-11/40 with writable control store 1974. I programmed it
    to do PDP-11/45 FP instead of PDP-11/40 FP as a Jr. project.

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Michael S@already5chosen@yahoo.com to comp.arch on Thu Feb 12 10:53:58 2026
    From Newsgroup: comp.arch

    On Wed, 11 Feb 2026 17:57:29 -0600
    David Schultz <david.schultz@earthlink.net> wrote:

    On 2/11/26 5:04 PM, quadi wrote:
    I remember having read one article in a computer magazine where
    someone mentioned that an unfortunate result of the transition from
    the IBM 7090 to the IBM System/360 was that a lot of FORTRAN
    programs that were able to use ordinary real nubers had to be
    switched over to double precision to yield acceptable results.

    This reminds me of when I took a numerical analysis course. (The many
    ways that computer calculations can go wrong and how to deal with
    it.) The professor said that the schools IBM (360 or 370, ca. 1980)
    was perfect for the course because of the defects in its floating
    point system. Guard digits and rounding sorts of things as near as I
    can recall.


    Was not quality of arithmetic of CDC machines of the 70s even worse
    than that of IBM ?

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Thu Feb 12 15:54:46 2026
    From Newsgroup: comp.arch

    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:
    On 2/11/2026 3:04 PM, quadi wrote:

    snip


    And I noticed that a lot of mathematical tables from the old days went up
    to 10 digit accuracy, and scientific calculators had 10 digit displays,
    calculating internally to a slightly higher precision.

    The ten digit displays came from the design of the first electric >calculators, made by such companies as Friden and Monroe in the 1940s
    and 50s). They had ten rows of numeric keys (0-9), so that the
    operator, who presumably had ten fingers (including thumbs) could
    operate them quickly. So 10 digits sort of became standard. When
    computers came along, and the designers wanted to use binary for them,

    When computers came along, they used 40 bits to store 10 BCD digits
    (e.g. the ElectroData 220 (44-bit) from the mid 50s and the successor Burroughs machines, the B300 and B3500). The B3500 extended the maximum operand size
    to 100 BCD digits. 80's versions of the B3500 had a 40-bit memory
    bus (operating on 10 digits at a time).


    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Stephen Fuld@sfuld@alumni.cmu.edu.invalid to comp.arch on Thu Feb 12 08:40:36 2026
    From Newsgroup: comp.arch

    On 2/11/2026 9:55 PM, quadi wrote:
    On Wed, 11 Feb 2026 19:50:00 -0800, Stephen Fuld wrote:
    On 2/11/2026 3:04 PM, quadi wrote:

    And I noticed that a lot of mathematical tables from the old days went
    up to 10 digit accuracy, and scientific calculators had 10 digit
    displays, calculating internally to a slightly higher precision.

    The ten digit displays came from the design of the first electric
    calculators, made by such companies as Friden and Monroe in the 1940s
    and 50s). They had ten rows of numeric keys (0-9), so that the
    operator, who presumably had ten fingers (including thumbs) could
    operate them quickly.

    So you're saying that the tendency of log tables and the like to go up to
    a maximum of ten digits precision wasn't because ten digits were needed
    for, say, celestial mechanics or something like that, so my premise that
    ten significant digits was what scientific computation usually needs, as reflected in the design of calculators and math tables is completely mistaken.

    See

    https://en.wikipedia.org/wiki/36-bit_computing#History
    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Thu Feb 12 16:51:07 2026
    From Newsgroup: comp.arch

    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:
    On 2/11/2026 9:55 PM, quadi wrote:
    On Wed, 11 Feb 2026 19:50:00 -0800, Stephen Fuld wrote:
    On 2/11/2026 3:04 PM, quadi wrote:

    And I noticed that a lot of mathematical tables from the old days went >>>> up to 10 digit accuracy, and scientific calculators had 10 digit
    displays, calculating internally to a slightly higher precision.

    The ten digit displays came from the design of the first electric
    calculators, made by such companies as Friden and Monroe in the 1940s
    and 50s). They had ten rows of numeric keys (0-9), so that the
    operator, who presumably had ten fingers (including thumbs) could
    operate them quickly.

    So you're saying that the tendency of log tables and the like to go up to
    a maximum of ten digits precision wasn't because ten digits were needed
    for, say, celestial mechanics or something like that, so my premise that
    ten significant digits was what scientific computation usually needs, as
    reflected in the design of calculators and math tables is completely
    mistaken.

    See

    https://en.wikipedia.org/wiki/36-bit_computing#History

    The Burroughs class-1 machines from the early 1900s were
    built in several widths, but the bulk of them which were
    sold to banks, etc. had 9 columns, which were often treated
    as fixed point operating on values in pennies.

    A typical column:

    https://americanhistory.si.edu/collections/object/nmah_690198
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Stephen Fuld@sfuld@alumni.cmu.edu.invalid to comp.arch on Thu Feb 12 08:52:43 2026
    From Newsgroup: comp.arch

    On 2/12/2026 7:54 AM, Scott Lurndal wrote:
    Stephen Fuld <sfuld@alumni.cmu.edu.invalid> writes:
    On 2/11/2026 3:04 PM, quadi wrote:

    snip


    And I noticed that a lot of mathematical tables from the old days went up >>> to 10 digit accuracy, and scientific calculators had 10 digit displays,
    calculating internally to a slightly higher precision.

    The ten digit displays came from the design of the first electric
    calculators, made by such companies as Friden and Monroe in the 1940s
    and 50s). They had ten rows of numeric keys (0-9), so that the
    operator, who presumably had ten fingers (including thumbs) could
    operate them quickly. So 10 digits sort of became standard. When
    computers came along, and the designers wanted to use binary for them,

    When computers came along, they used 40 bits to store 10 BCD digits
    (e.g. the electrodata 220 (44 bit) from the mid 50s and the successor Burroughs
    machines (B300, B3500). The B3500 extended the maximum operand size
    to 100 BCD digits. 80's versions of the B3500 had a 40-bit memory
    bus (operating on 10 digits at a time).

    In the early days of computers, there was a distinction between
    "business" computers and "scientific" computers. Many (most?) of the
    business computers were decimal (e.g. the ones you mentioned and some
    IBM lines) and character oriented. Conversely, many of the scientific computers were binary and often used 36-bit words.

    https://en.wikipedia.org/wiki/36-bit_computing#History

    These often used 6 bit characters and conveniently used octal.

    But as the late great Tom Lehrer said, "Base eight is just like base ten
    if you have no thumbs!"

    Of course, the IBM S/360 line was designed to provide one architecture
    for both major uses. It also introduced the eight bit character (byte)
    and thus naturally hexadecimal.
    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Thu Feb 12 17:09:16 2026
    From Newsgroup: comp.arch


    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    MitchAlsup <user5857@newsgrouper.org.invalid> writes:
    {{One can STILL argue whether
    deNormals were a plus or a minus in IEEE}}

    I am surprised to read that from you, who has always written that
    denormals can be implemented cheaply and efficiently in hardware. The additional hardware cost (or the cost of trapping and software
    emulation) has been the only argument against denormals that I ever encountered.

    It is only after IEEE 754-2008 came with FMAC that deNormals became
    a low cost addition. {And that has been my point--you seem to have
    forgotten the -2008 part of the argument}

    - anton
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Thu Feb 12 17:10:45 2026
    From Newsgroup: comp.arch


    Michael S <already5chosen@yahoo.com> posted:

    On Thu, 12 Feb 2026 02:04:58 GMT
    MitchAlsup <user5857@newsgrouper.org.invalid> wrote:

    John Levine <johnl@taugh.com> posted:

    According to David Schultz <david.schultz@earthlink.net>:
    This reminds me of when I took a numerical analysis course. (The
    many ways that computer calculations can go wrong and how to deal
    with it.) The professor said that the schools IBM (360 or 370, ca. 1980) was perfect for the course because of the defects in its
    floating point system. Guard digits and rounding sorts of things
    as near as I can recall.

    The 360's floating point is a famous and somewhat puzzling failure, considering how much else they got right.

    It does hex normalization rather than binary. They assumed that
    leading digits are evenly distributed so there's be on average one
    zero bit, but in fact they're geometrically distributed, so on
    average there's two. They got one bit back by making the exponent
    units of 16 rather than 2, but that's still one bit gone. It
    truncated rather than rounding, another bit gone. They also
    truncated rather than rounding results.

    Originally there wre no guard digits which made the results
    comically bad but IBM retrofitted them at great cost to all the
    installed machines.

    IEEE floating point can be seen as a reaction to that, how do you
    use the same number of bits but get good results.

    VAX got this correct too (the VAX format not the one inherited from PDP-11/45; PDP-11/40* FP was worse). VAX FP is arguably as good as
    IEEE 754 with the exception that more IEEE numbers have reciprocals
    due to the change in exponent bias by 1. {{One can STILL argue whether deNormals were a plus or a minus in IEEE}}

    From the perspective of stability of convergence of few common
    algorithms denormals are significant plus.
    From the perspective of minimizing surprises it is also plus. On VAX
    (a > b) does not necessarily guarantee (a-b > 0).
    I wonder in which situation it can be seen as a minus?

    a-b underflows and takes a trap.

    There are several things that I don't like about IEEE-754 Standard, but
    none of them related to format of binary numbers.


    CMU had a PDP-11/40 with writable control store 1974. I programmed it
    to do PDP-11/45 FP instead of PDP-11/40 FP as a Jr. project.



    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Thu Feb 12 18:37:49 2026
    From Newsgroup: comp.arch

    On Thu, 12 Feb 2026 08:52:43 -0800, Stephen Fuld wrote:
    On 2/12/2026 7:54 AM, Scott Lurndal wrote:

    When computers came along, they used 40 bits to store 10 BCD digits
    (e.g. the electrodata 220 (44 bit) from the mid 50s and the successor
    Burroughs machines (B300, B3500). The B3500 extended the maximum
    operand size to 100 BCD digits. 80's versions of the B3500 had a
    40-bit memory bus (operating on 10 digits at a time).

    In the early days of computers, there was a distinction between
    "business" computers and "scientific" computers. Many (most?) of the business computers were decimal) e.g. the ones you mentioned and some
    IBM lines) and character oriented. Conversely, many of the scientific computers were binary and often used 36 bit words.

    https://en.wikipedia.org/wiki/36-bit_computing#History

    These often used 6 bit characters and conveniently used octal.

    But in the _really_ early days of computers, ones that did binary
    arithmetic also often came with 40-bit words. Like EDVAC.

    John Savard
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Thu Feb 12 18:40:07 2026
    From Newsgroup: comp.arch

    On Thu, 12 Feb 2026 10:53:58 +0200, Michael S wrote:

    Was not quality of arithmetic of CDC machines of the 70s even worse than
    that of IBM ?

    I don't know about that. But I do know that despite having a power-of-two exponent, quality of arithmetic on the Cray I was pretty terrible.

    John Savard

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Thu Feb 12 18:49:01 2026
    From Newsgroup: comp.arch

    On Thu, 12 Feb 2026 08:40:36 -0800, Stephen Fuld wrote:
    On 2/11/2026 9:55 PM, quadi wrote:

    So you're saying that the tendency of log tables and the like to go up
    to a maximum of ten digits precision wasn't because ten digits were
    needed for, say, celestial mechanics or something like that, so my
    premise that ten significant digits was what scientific computation
    usually needs, as reflected in the design of calculators and math
    tables is completely mistaken.

    See

    https://en.wikipedia.org/wiki/36-bit_computing#History

    After Seymour Cray left Univac to participate in founding Control Data,
    their first product was the CDC 1604 computer, which had a 48-bit word
    length. As it had a 36-bit mantissa (not including the sign of the number),
    it had two bits more than needed for 10-digit precision (10 bits give 1,024 combinations, so 10 bits give 3 digits; thus 34 bits give ten digits if you don't need a bit for the sign; you were indeed right about 35 bits being the minimum).

    So the idea that ten digits is good for integers moved over to ten digits
    is good for floating-point at that stage.
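    The bit counts here follow from log2(10) ≈ 3.32 bits per decimal digit, which a couple of lines confirm:

```python
import math

bits_per_digit = math.log2(10)                 # ~3.32 bits per decimal digit
assert math.ceil(10 * bits_per_digit) == 34    # ten digits fit in 34 bits
# plus a sign bit gives the 35-bit minimum; and the 1604's 36-bit
# mantissa delivers floor(36 * log10(2)) = 10 full decimal digits
assert math.floor(36 * math.log10(2)) == 10
```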

    John Savard
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Thu Feb 12 20:02:08 2026
    From Newsgroup: comp.arch


    quadi <quadibloc@ca.invalid> posted:

    On Thu, 12 Feb 2026 10:53:58 +0200, Michael S wrote:

    Was not quality of arithmetic of CDC machines of the 70s even worse than that of IBM ?

    I don't know about that. But I do know that despite having a power-of-two exponent, quality of arithmetic on the Cray I was pretty terrible.

    Unlike the CDC 6600/7600, the CRAY multiply did not have a full 'tree'
    of multiplication logic--leading to all sorts of "headaches" for
    numerical analysts. Given a 4×4 set of multiplication gates,
    CRAY only used 80%-odd of the required macros to get multiplication
    'right'.

    John Savard

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Michael S@already5chosen@yahoo.com to comp.arch on Thu Feb 12 23:38:52 2026
    From Newsgroup: comp.arch

    On Thu, 12 Feb 2026 17:10:45 GMT
    MitchAlsup <user5857@newsgrouper.org.invalid> wrote:

    Michael S <already5chosen@yahoo.com> posted:

    On Thu, 12 Feb 2026 02:04:58 GMT
    MitchAlsup <user5857@newsgrouper.org.invalid> wrote:

    John Levine <johnl@taugh.com> posted:

    According to David Schultz <david.schultz@earthlink.net>:
    This reminds me of when I took a numerical analysis course.
    (The many ways that computer calculations can go wrong and how
    to deal with it.) The professor said that the schools IBM (360
    or 370, ca. 1980) was perfect for the course because of the
    defects in its floating point system. Guard digits and
    rounding sorts of things as near as I can recall.

    The 360's floating point is a famous and somewhat puzzling
    failure, considering how much else they got right.

    It does hex normalization rather than binary. They assumed that
    leading digits are evenly distributed so there'd be on average
    one zero bit, but in fact they're geometrically distributed, so
    on average there are two. They got one bit back by making the
    exponent units of 16 rather than 2, but that's still one bit
    gone. They also truncated rather than rounding results, another
    bit gone.
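    The geometric distribution of leading digits can be simulated directly.
    A Python sketch, assuming log-uniformly distributed magnitudes (the
    Benford-like model behind the claim), counting the zero bits that hex
    normalization leaves in front of the fraction:

```python
import math
import random

random.seed(0)

def leading_zero_bits(frac):
    """Zero bits before the first 1 bit of a hex-normalized fraction."""
    n = 0
    while frac < 0.5:       # a binary-normalized fraction lies in [1/2, 1)
        frac *= 2
        n += 1
    return n

total = 0
N = 100_000
for _ in range(N):
    x = 2.0 ** random.uniform(-20.0, 20.0)   # log-uniform magnitude
    e = math.ceil(math.log(x, 16))           # smallest e with x <= 16**e
    frac = x / 16.0 ** e                     # hex-normalized: in (1/16, 1]
    total += leading_zero_bits(frac)

print(total / N)   # average zero bits wasted per hex-normalized value
```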

    Originally there were no guard digits, which made the results
    comically bad, but IBM retrofitted them at great cost to all the
    installed machines.

    IEEE floating point can be seen as a reaction to that, how do
    you use the same number of bits but get good results.

    VAX got this correct too (the VAX format not the one inherited
    from PDP-11/45; PDP-11/40* FP was worse). VAX FP is arguably as
    good as IEEE 754 with the exception that more IEEE numbers have reciprocals due to the change in exponent bias by 1. {{One can
    STILL argue whether deNormals were a plus or a minus in IEEE}}

    From the perspective of stability of convergence of few common
    algorithms denormals are significant plus.
    From the perspective of minimizing surprises it is also plus. On
    VAX (a > b) does not necessarily guarantee (a-b > 0).
    I wonder in which situation it can be seen as a minus?

    a-b underflows and takes a trap.


    Then, don't take traps on underflow. It's most certainly not a
    default behavior.
    If somebody decided to enable trap then hopefully he knows what he is
    doing.


    There are several things that I don't like about IEEE-754 Standard,
    but none of them related to format of binary numbers.


    CMU had a PDP-11/40 with writable control store 1974. I
    programmed it to do PDP-11/45 FP instead of PDP-11/40 FP as a Jr. project.





    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.arch on Sat Feb 14 16:33:41 2026
    From Newsgroup: comp.arch

    quadi <quadibloc@ca.invalid> wrote:

    I remember having read one article in a computer magazine where someone
    mentioned that an unfortunate result of the transition from the IBM 7090
    to the IBM System/360 was that a lot of FORTRAN programs that were able
    to use ordinary real numbers had to be switched over to double precision
    to yield acceptable results.

    Note that the IBM floating point format effectively lost about 3 bits of
    accuracy compared to the modern 32-bit format. I am not sure how much
    they lost compared to the IBM 7090 but it looks like it was at least 5
    bits. Assuming that accuracy requirements are uniformly distributed
    between 20 and say 60 bits, we can estimate that a loss of 5 bits
    affected about 25% (or more) of applications that could run using 36
    bits. That is "a lot" of programs.

    But it does not mean that 36 bits are somehow magical. Simply, given a
    36-bit machine, the original author had extra motivation to make sure
    that the program ran in 36-bit floating point.

    And I noticed that a lot of mathematical tables from the old days went up
    to 10 digit accuracy, and scientific calculators had 10 digit displays, calculating internally to a slightly higher precision.

    There were various accuracies. I have (or maybe had) 100-digit
    logarithm tables. The trouble with high-accuracy tables is that with
    naive use a k-digit table needs 10^k positions, so it quickly becomes
    unmanageably big. The usual way is to have a main table and a table of
    correction coefficients to allow easy interpolation. That means that a
    2k-digit table needs 10^k positions. With k=5 we get a reasonably sized
    table. So 10-digit tables are the largest ones that still have
    reasonable size and are easy to use.
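    The main-table-plus-correction scheme can be sketched in a few lines of
    Python. This is an illustration of the k-digit-table-gives-2k-digits
    argument, not a reconstruction of any particular historical table;
    linear interpolation stands in for the correction coefficients:

```python
import math

# A table of log10 at spacing 10**-3 over [1, 10): ~10**3 positions
# per leading digit, the analogue of a k-digit table with k = 3.
h = 1e-3
table = [math.log10(1.0 + i * h) for i in range(9001)]

def log10_from_table(x):
    """log10 via the coarse table plus a linear-interpolation 'correction'."""
    i = int((x - 1.0) / h)
    x0 = 1.0 + i * h
    t = (x - x0) / h
    return (1.0 - t) * table[i] + t * table[i + 1]

# A k = 3 table recovers roughly 2k = 6-8 correct digits, which is why
# 10-digit tables (k = 5) were the practical limit.
err = abs(log10_from_table(3.14159) - math.log10(3.14159))
print(err)   # well below 1e-7 here
```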

    Concerning calculators, in one case internal accuracy was about 14
    digits, which agrees with using an 8-byte BCD format (with 2-digit
    exponents).

    The early machines that I know about had really varying accuracies,
    ranging from 23 bits through 43 bits.

    AFAICS now the main consumer of FLOPS is graphics. Most graphics
    would be happy with lower accuracy, but 32-bit is what is available.
    There is one important thing that needs higher accuracy, namely
    determining the orientation of a triangle, which requires computing the
    sign of the determinant of a 3 by 3 matrix. Assuming 16 significant bits
    of input data (adequate for most graphic uses) the determinant needs
    about 48 significant bits, so it works with 64-bit doubles and would not
    work with 36-bit or 32-bit floating point numbers.
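    The orientation test is easy to demonstrate. With homogeneous
    coordinates the 3-by-3 determinant reduces to a 2-by-2 one; the Python
    sketch below (point values chosen by me to sit near collinearity)
    rounds every intermediate to IEEE binary32 to show the sign being lost:

```python
import struct

def f32(x):
    """Round a Python float (binary64) to IEEE binary32."""
    return struct.unpack('f', struct.pack('f', x))[0]

def orient(ax, ay, bx, by, cx, cy, r=lambda v: v):
    """Sign of the orientation determinant (bx-ax)(cy-ay) - (by-ay)(cx-ax),
    with `r` applied to every intermediate to mimic a narrower format."""
    d = r(r(r(bx - ax) * r(cy - ay)) - r(r(by - ay) * r(cx - ax)))
    return (d > 0) - (d < 0)

# 16-bit coordinates chosen to be nearly collinear: the true determinant
# is -1, which needs ~32 significant bits of intermediate precision.
pts = (0, 0, 32768, 32767, 65535, 65533)
print(orient(*pts))           # prints: -1 (binary64 gets the sign right)
print(orient(*pts, r=f32))    # prints: 0  (binary32 reports collinear)
```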

    The second big user is audio. Some audio fans want high accuracy;
    32-bit integer is probably good enough for them, 36-bit floating
    point is probably kind of borderline, and 32-bit float is not
    good enough. But it seems that most audio uses lower accuracy,
    in particular 32-bit floats.

    So, the biggest users of floating point probably have no demand
    for 36-bit floats.

    If you look at other uses, then note that GPU makers used to
    offer single-precision-only GPUs as the main option and fast double
    precision as an expensive extra. So they understand the need
    for better precision but are reluctant to offer it as a
    standard feature. 36-bit is clearly much more exotic than that.

    BTW, you seem to care about floating point. My personal interests
    go mainly toward integer calculations. More precisely, I am
    interested in exact results, that is, symbolic computation.
    In symbolic computation the main way to speed up computation is to
    compute modulo a prime number. For this, 61-bit self-complement
    arithmetic would be nice (self-complement means arithmetic modulo
    2^n - 1, and 2^61 - 1 is a prime). 64-bit self-complement would
    be of some use, but not so good, because 2^64 - 1 has several
    small factors. It would be good to have 3-argument arithmetic
    operations of the form

    (x op y) mod p

    where p could be of restricted form, say 2^64 - a, where a has a
    restricted range, for example 10-bit or 16-bit. IIUC for
    addition and subtraction the circuit cost of such an operation could
    be quite low; the main cost would be the data path to provide a, and
    encoding space. For multiplication the cost would be higher, as
    one would need to compute the high bits of the product, shift them
    down, multiply by a and subtract, then perform a final correction
    as in the case of addition. Still, that would probably add about
    20% of cost compared to normal multiplication, so not so high a
    cost if you need the operation. Considering that the alternative
    uses several instructions, some of them expensive ones
    (either division or an extra multiplication), such an instruction
    could be an attractive one.
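    The fold-the-high-bits trick that makes mod (2^n - 1) arithmetic cheap
    can be sketched in Python; the function names are mine, and this is the
    software analogue of the proposed hardware operation:

```python
P = (1 << 61) - 1   # the Mersenne prime 2^61 - 1

def mod_p(x):
    """Reduce a nonnegative integer mod 2^61 - 1.  Because 2^61 = 1 (mod P),
    the high bits can simply be folded down and added -- no division."""
    while x >> 61:
        x = (x >> 61) + (x & P)
    return x - P if x >= P else x

def addm(a, b):
    return mod_p(a + b)

def mulm(a, b):
    return mod_p(a * b)

# Matches ordinary modular arithmetic:
assert mulm(P - 2, P - 3) == ((P - 2) * (P - 3)) % P
print(addm(P - 1, 5), mulm(1 << 60, 4))   # prints: 4 2
```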
    --
    Waldek Hebisch
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Terje Mathisen@terje.mathisen@tmsw.no to comp.arch on Sat Feb 14 18:03:07 2026
    From Newsgroup: comp.arch

    MitchAlsup wrote:

    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    MitchAlsup <user5857@newsgrouper.org.invalid> writes:
    {{One can STILL argue whether
    deNormals were a plus or a minus in IEEE}}

    I am surprised to read that from you, who has always written that
    denormals can be implemented cheaply and efficiently in hardware. The
    additional hardware cost (or the cost of trapping and software
    emulation) has been the only argument against denormals that I ever
    encountered.

    It is only after IEEE 754-2008 came with FMAC that deNormals became
    a low cost addition. {And that has been my point--you seem to have
    forgotten the -2008 part of the argument}

    Not at all forgotten, I just didn't notice that caveat in the current discussion. Mea Culpa!

    Terje
    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From John Levine@johnl@taugh.com to comp.arch on Sat Feb 14 18:12:37 2026
    From Newsgroup: comp.arch

    According to Waldek Hebisch <antispam@fricas.org>:
    quadi <quadibloc@ca.invalid> wrote:

    I remember having read one article in a computer magazine where someone
    mentioned that an unfortunate result of the transition from the IBM 7090
    to the IBM System/360 was that a lot of FORTRAN programs that were able to >> use ordinary real nubers had to be switched over to double precision to
    yield acceptable results.

    Note that IBM floating point format effectively lost about 3 bits of
    accuracy compared to modern 32-bit format. I am not sure how much they
    lost compared to IBM 7090 but it looks that it was at least 5 bits.
    Assuming that accuracy requirements are uniformly distributed between
    20 and say 60 bits, we can estimate that loss of 5 bits affected about
    25% (or more) of applications that could run using 36-bits. That is
    "a lot" of programs.

    But it does not mean that 36-bits are somewhat magical. Simply, given
    36-bit machine original author had extra motivation to make sure that
    the program run in 36-bit floating point.

    It's worse than that, because the 360's floating point had wobbling precision. Depending on the number of leading zero bits in the fraction it could lose anywhere from 1 to 5 bits of precision compared to a rounded binary format. Hence the badness of the result depended more than usual on the input
    data.

    IBM had excellent numerical analysts who wrote the widely used Scientific Subroutine Package which got decent results with 360 arithmetic:

    https://bitsavers.org/pdf/ibm/ssp/GH20-0205-4-SSP-programmers_Aug70.pdf
    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From jgd@jgd@cix.co.uk (John Dallman) to comp.arch on Sun Feb 15 14:37:00 2026
    From Newsgroup: comp.arch

    In article <10mq853$106co$1@paganini.bofh.team>, antispam@fricas.org
    (Waldek Hebisch) wrote:

    Note that IBM floating point format effectively lost about 3 bits of
    accuracy compared to modern 32-bit format. I am not sure how much
    they lost compared to IBM 7090 but it looks that it was at least 5
    bits.

    It's somewhat worse than that. Because the mantissa is in whole hex
    digits, accuracy is lost in 4-bit lumps during a calculation. And because
    normalisation is of whole hex digits, and Benford's Law applies, accuracy
    and its loss are quite data-dependent.

    But it does not mean that 36-bits are somewhat magical.

    Definitely not.

    Quadi, have your computer architectures included IBM 360 floating point support? There is probably more demand for that than for 36-bit these
    days.

    John
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Sun Feb 15 16:53:10 2026
    From Newsgroup: comp.arch

    John Levine <johnl@taugh.com> schrieb:

    Oh I forgot that using hex exponents meant there was no hidden bit, so
    in practice it lost three bits of precision on every operation. There was
    a great deal of grumbling that people with 709x Fortran codes had to
    make everything double precision to keep getting reasonably good results.

    Hacker's Delight phrases this as

    "When IBM introduced the System/360 computer in 1964, numerical
    analysts were horrified at the loss of precision of single-precision arithmetic."

    and then goes on to show how the distribution of floating point
    values effectively reduces the precision of one quarter of floating
    point values to 21 bits. (There's a name for it, somebody's law, but it
    escapes me at the moment.)
    --
    This USENET posting was made without artificial intelligence,
    artificial impertinence, artificial arrogance, artificial stupidity,
    artificial flavorings or artificial colorants.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.arch on Tue Feb 17 01:16:33 2026
    From Newsgroup: comp.arch

    John Levine <johnl@taugh.com> wrote:
    According to Waldek Hebisch <antispam@fricas.org>:
    quadi <quadibloc@ca.invalid> wrote:

    I remember having read one article in a computer magazine where someone
    mentioned that an unfortunate result of the transition from the IBM 7090
    to the IBM System/360 was that a lot of FORTRAN programs that were able
    to use ordinary real numbers had to be switched over to double precision
    to yield acceptable results.

    Note that the IBM floating point format effectively lost about 3 bits of
    accuracy compared to the modern 32-bit format. I am not sure how much
    they lost compared to the IBM 7090 but it looks like it was at least 5
    bits. Assuming that accuracy requirements are uniformly distributed
    between 20 and say 60 bits, we can estimate that a loss of 5 bits
    affected about 25% (or more) of applications that could run using 36
    bits. That is "a lot" of programs.

    But it does not mean that 36 bits are somehow magical. Simply, given a
    36-bit machine, the original author had extra motivation to make sure
    that the program ran in 36-bit floating point.

    It's worse than that, because the 360's floating point had wobbling precision.
    Depending on the number of leading zero bits in the fraction it could lose anywhere from 1 to 5 bits of precision compared to a rounded binary format. Hence the badness of the result depended more than usual on the input
    data.

    Well, IBM format had twice the range of IEEE format, so effectively one
    bit moved from mantissa to exponent. Looking at representable values,
    except at the low end of the range only normalized values matter. In
    hex format 15/16 of bit patterns are normalized, which is better than
    binary without a hidden bit and marginally worse than binary with a
    hidden bit. One hex order of magnitude has 15/16 representable values
    compared to binary without a hidden bit and with IEEE range, and
    15/32 representable values compared to IEEE. This hex order of magnitude
    corresponds to 4 binary orders of magnitude, and each binary order
    of magnitude has the same number of values. So the hex block beginning
    with 1 has 1/16 of the values compared to all bit patterns of the given
    hex order of magnitude, while the corresponding IEEE binary order of
    magnitude has 1/2 of the bit patterns compared to the given hex order of
    magnitude. Which gives 8 times bigger density for IEEE binary, that is
    3 bits of accuracy. IBM truncated, which loses one extra bit, so AFAICS
    the worst case for IBM hex is a loss of 4 bits. At the high end of a
    hex order of magnitude the density is the same, but still there is
    one bit of loss due to truncation. So actually, the loss varies between
    1 and 4 bits. The simple average is a 2.5-bit loss, but 3 bits is more
    realistic, because once you lose a bit, performing following operations
    with better accuracy will not compensate for the loss.
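    The 1-to-4-bit loss estimate can be checked numerically. A Python
    sketch, assuming 24 significant bits on both sides (6 hex fraction
    digits, truncated, versus a rounded binary significand with hidden bit)
    and log-uniform data:

```python
import math
import random

random.seed(1)

def ibm_hex_single(x):
    """Hex float a la IBM 360 single: 6 hex fraction digits, truncated."""
    e = math.ceil(math.log(x, 16))      # smallest e with x <= 16**e
    frac = x / 16.0 ** e                # hex-normalized: in (1/16, 1]
    return math.floor(frac * 16**6) / 16**6 * 16.0 ** e

def binary_rounded(x, bits=24):
    """Binary significand of `bits` bits (hidden bit included), rounded."""
    m, e = math.frexp(x)                # m in [0.5, 1)
    return round(m * 2**bits) / 2**bits * 2.0 ** e

worst_hex = worst_bin = 0.0
for _ in range(100_000):
    x = 2.0 ** random.uniform(0.0, 20.0)
    worst_hex = max(worst_hex, abs(ibm_hex_single(x) - x) / x)
    worst_bin = max(worst_bin, abs(binary_rounded(x) - x) / x)

# Worst-case relative error of truncated hex vs rounded binary, in bits:
print(math.log2(worst_hex / worst_bin))   # close to the 4 bits argued above
```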

    Note that 1 bit is due to using truncation in arithmetic, which is
    independent of format. 1 bit is due to exponent range. Hex makes
    IBM's choice of range natural, but if they really wanted they could
    halve the exponent range and add one bit to the mantissa. So, compared
    to a binary machine using truncation, no hidden bit and the same
    range as IBM hex, one loses 1 bit in the worst case and gains 2 bits
    in the best case. So, IBM's choice was bad, but at that time others
    made bad choices too.
    --
    Waldek Hebisch
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From John Levine@johnl@taugh.com to comp.arch on Tue Feb 17 01:24:17 2026
    From Newsgroup: comp.arch

    According to Waldek Hebisch <antispam@fricas.org>:
    Well, IBM format had twice the range of IEEE format, so effectively one
    bit moved from mantissa to exponent. Looking at representable values,
    except at the low end of the range only normalized values matter. In
    hex format 15/16 of values are normalized, ...

    That's the same mistake IBM made when they designed the 360's FP.
    Leading fraction digits are geometrically distributed, not linearly.
    (Look at a slide rule to see what I mean.)

    There are on average two leading zeros so only half of the values are normalized.
    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.arch on Tue Feb 17 16:21:44 2026
    From Newsgroup: comp.arch

    John Levine <johnl@taugh.com> wrote:
    According to Waldek Hebisch <antispam@fricas.org>:
    Well, IBM format had twice the range of IEEE format, so effectively one
    bit moved from mantissa to exponent. Looking at representable values,
    except at the low end of the range only normalized values matter. In
    hex format 15/16 of values are normalized, ...

    That's the same mistake IBM made when they designed the 360's FP.
    Leading fraction digits are geometrically distributed, not linearly.
    (Look at a slide rule to see what I mean.)

    If you have read and understood what I wrote (and you snipped), you
    would see that I handle the distribution of numbers. Hint: the point of
    talking about hex orders of magnitude and binary orders of magnitude
    is to compare both distributions.

    There are on average two leading zeros so only half of the values are normalized.

    No. By _definition_ a hex floating point number is normalized if and
    only if its leading hex digit is different from zero. It is easy
    to check that different normalized hex bit patterns produce different
    values.
    --
    Waldek Hebisch
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Tue Feb 17 18:57:18 2026
    From Newsgroup: comp.arch

    On Sun, 15 Feb 2026 14:37:00 +0000, John Dallman wrote:

    Quadi, have your computer architectures included IBM 360 floating point support? There is probably more demand for that than for 36-bit these
    days.

    Yes, in fact they have. The goal there is to facilitate data interchange
    and emulation, not to provide better quality floating-point arithmetic... since, of course, it provides rather the opposite, as has been discussed
    in this thread.

    The original CISC Concertina I architecture went further; it had the goal
    of being able to natively emulate the floating-point of just about every computer ever made.

    John Savard
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Tue Feb 17 19:09:53 2026
    From Newsgroup: comp.arch

    On Tue, 17 Feb 2026 01:24:17 +0000, John Levine wrote:
    According to Waldek Hebisch <antispam@fricas.org>:

    Well, IBM format had twice the range of IEEE format, so effectively one
    bit moved from mantissa to exponent. Looking at representable values,
    except at the low end of the range only normalized values matter. In hex
    format 15/16 of values are normalized, ...

    That's the same mistake IBM made when they designed the 360's FP.
    Leading fraction digits are geometrically distributed, not linearly.
    (Look at a slide rule to see what I mean.)

    This is Benford's Law, and there was an interesting discussion of it in
    the December, 1969 issue of _Scientific American_ - in an article, not in Martin Gardner's _Mathematical Games_ column, as I would have expected.

    John Savard
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From John Levine@johnl@taugh.com to comp.arch on Tue Feb 17 19:20:33 2026
    From Newsgroup: comp.arch

    According to Waldek Hebisch <antispam@fricas.org>:
    There are on average two leading zeros so only half of the values are
    normalized.

    No. By _definition_ hex floating point number is normalized if and
    only if its leading hex digit is different than zero.

    I wrote sloppily. On average a normalized hex FP number has two leading
    zeros so you lose another bit compared to binary, in addition to what you
    lose by no hidden bit and no rounding.
    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.arch on Tue Feb 17 19:52:46 2026
    From Newsgroup: comp.arch

    John Levine <johnl@taugh.com> wrote:
    According to Waldek Hebisch <antispam@fricas.org>:
    There are on average two leading zeros so only half of the values are
    normalized.

    No. By _definition_ hex floating point number is normalized if and
    only if its leading hex digit is different than zero.

    I wrote sloppily. On average a normalized hex FP number has two leading zeros so you lose another bit compared to binary, in addition to what you lose by no hidden bit and no rounding.

    That is almost what I wrote, except that I sketched a proof that
    hex FP loses that one bit _in the worst case_, and the average is
    better. In the case of IBM hex float the tradeoff between range and
    mantissa bits leads to another bit lost from accuracy, so 4 bits in the
    worst case (but the range is twice as large as IEEE floats). To
    summarize: 1 bit of loss (compared to binary with no hidden bit) due to
    the uneven distribution of hex, 1 bit of loss due to the impossibility
    of using a hidden bit in hex, 1 bit of loss due to the larger range,
    1 bit of loss due to the lack of rounding.
    --
    Waldek Hebisch
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.arch on Tue Feb 17 20:43:35 2026
    From Newsgroup: comp.arch

    quadi <quadibloc@ca.invalid> wrote:
    On Sun, 15 Feb 2026 14:37:00 +0000, John Dallman wrote:

    Quadi, have your computer architectures included IBM 360 floating point
    support? There is probably more demand for that than for 36-bit these
    days.

    Yes, in fact they have. The goal there is to facilitate data interchange
    and emulation, not to provide better quality floating-point arithmetic... since, of course, it provides rather the opposite, as has been discussed
    in this thread.

    The original CISC Concertina I architecture went further; it had the goal
    of being able to natively emulate the floating-point of just about every computer ever made.

    That was probably already written, but since you are revising your
    design it may be worth stating some facts. If you have a 64-bit
    machine with convenient access to 32-bit, 16-bit and 8-bit parts,
    you can store any number of bits between 4 and 64 wasting at most
    50% of storage and have simple access to each item. So in terms
    of memory use you are trying to avoid this 50% loss. In practice
    the loss will be much smaller because:

    - power-of-2 quantities are quite popular
    - when a program needs a large number of items of some other size,
    the programmer is likely to use packing/unpacking routines, keeping
    data in a space-efficient packed format most of the time and unpacking
    it for processing
    - a machine with fast bit-extract/bit-insert instructions can perform
    most operations quite fast even on packed data

    so the possible gain in memory consumption is quite low. Given that
    non-standard memory modules and support chips tend to be much more
    expensive than standard ones, economically attempting such savings
    makes no sense.
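    Packing routines of the kind mentioned above are short to write. A
    Python sketch (names and layout are mine) that stores 36-bit items with
    no wasted bits, two items per 72 bits:

```python
def pack36(values):
    """Pack 36-bit unsigned integers into bytes, two values per 9 bytes."""
    acc, nbits, out = 0, 0, bytearray()
    for v in values:
        assert 0 <= v < (1 << 36)
        acc = (acc << 36) | v
        nbits += 36
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
    if nbits:                   # odd count: pad out the final half byte
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)

def unpack36(data, count):
    """Recover `count` 36-bit values packed by pack36."""
    acc = int.from_bytes(data, 'big')
    total = len(data) * 8
    return [(acc >> (total - 36 * (i + 1))) & ((1 << 36) - 1)
            for i in range(count)]

xs = [0, 1, (1 << 36) - 1, 12345678901]
packed = pack36(xs)
print(len(packed))              # 18 bytes for four values, vs 32 in 64-bit slots
assert unpack36(packed, len(xs)) == xs
```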

    Of course, there is also the question of speed. The argument above shows
    that the loss of speed on access itself can be quite small. So what
    remains is the speed of processing data. As long as you do processing
    on power-of-2 sized items (that is, unusual sizes are limited to
    storage), the loss of speed can be modest; basically a dedicated 36-bit
    machine can probably do 2 times as many 36-bit float operations
    as a standard machine can do 64-bit operations. Practically, this
    loss will be smaller than the loss of storage, but still does not look
    significant enough to warrant development of a special machine.

    Things are somewhat different when you want bit-accurate results
    using old formats. Here already one's-complement arithmetic has
    significant overhead on a two's-complement machine. And emulating
    old floating point formats is more expensive. OTOH, modern
    machines are much faster than old ones. For example a modern CPU
    seems to be more than 1000 times faster than a real CDC-6600, so
    even slow emulation is likely to be faster than the real machine,
    which means that the emulated machine can do the work of the original
    one.

    So to summarize: practical considerations leave rather small space
    for a machine using non-power-of-two formats, and it is rather
    unlikely that any design can fit there.

    Of course, there is a very good reason to explore non-mainstream
    approaches, namely having fun. But once you realize that
    mainstream designs make their choices for good reasons,
    exploring alternatives gets less fun (at least for me).
    --
    Waldek Hebisch
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Wed Feb 18 00:50:52 2026
    From Newsgroup: comp.arch


    antispam@fricas.org (Waldek Hebisch) posted:

    quadi <quadibloc@ca.invalid> wrote:
    On Sun, 15 Feb 2026 14:37:00 +0000, John Dallman wrote:

    Quadi, have your computer architectures included IBM 360 floating point
    support? There is probably more demand for that than for 36-bit these
    days.

    Yes, in fact they have. The goal there is to facilitate data interchange and emulation, not to provide better quality floating-point arithmetic... since, of course, it provides rather the opposite, as has been discussed in this thread.

    The original CISC Concertina I architecture went further; it had the goal of being able to natively emulate the floating-point of just about every computer ever made.

    That was probably already written, but since you are revising your
    design it may be worth stating some facts. If you have 64-bit
    machine with convenient access to 32-bit, 16-bit and 8-bit parts
    you can store any number of bits between 4 and 64 wasting at most
    50% of storage and have simple access to each item. So in terms
    of memory use you are trying to avoid this 50% loss. In practice
    loss will be much smaller because:

    - power of 2 quantities are quite popular
    - when program needs large number of items of some other size
    programmer is likely to use packing/unpacking routines, keeping
    data in a space-efficient packed format most of the time and unpacking
    it for processing
    - machine with fast bit-extract/bit-insert instruction can perform
    most operation quite fast even on packed data

    so possible gain in memory consumption is quite low. Given that
    non-standard memory modules and support chips tend to be much more
    expensive than standard ones, economically attempting such savings
    make no sense.

    Of course, that is also question of speed. The argument above shows
    that loss of speed on access itself can be quite small. So what
    remains is speed of processing data. As long as you do processing
    on power of 2 sized items (that is unusual sizes are limited to
    storage), loss of speed can be modest, basically dedicated 36-bit
    machine probably can do 2 times as much 36-bit float operations
    as standard machine can do 64-bit operations. Practically, this
    loss will be smaller than the loss of storage, but still does not look significant
    enough to warrant developement of special machine.

    Things are somewhat different when you want bit-accurate result
    using old formats. Here already one's-complement arithmetic has
    significant overhead on a two's-complement machine.

    The only useful difference between 1's-complement and 2's-complement in
    ADD is the end-around carry, and the adder will have the same number
    of gates and the same gates of delay. So, in theory, one could make
    a {1's- or 2's-} complement adder at the cost of 1 gate of delay and
    one logic gate.
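    The end-around carry is a one-line fold in software. A Python sketch of
    a ones'-complement adder (bit width and helper names are mine):

```python
def add_ones(a, b, bits=16):
    """Ones'-complement addition of two raw n-bit patterns:
    add, then fold the carry-out back in (the end-around carry)."""
    mask = (1 << bits) - 1
    s = a + b
    return ((s & mask) + (s >> bits)) & mask

def to_ones(v, bits=16):
    """Signed value -> n-bit ones'-complement pattern."""
    return v if v >= 0 else (~(-v)) & ((1 << bits) - 1)

def from_ones(p, bits=16):
    """n-bit ones'-complement pattern -> signed value (-0 reads as 0)."""
    return p if p < (1 << (bits - 1)) else -((~p) & ((1 << bits) - 1))

print(from_ones(add_ones(to_ones(-3), to_ones(5))))   # prints: 2
```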

    And emulating
    old floating point formats is more expensive. OTOH, modern
    machines are much faster than old ones. For example modern CPU
    seem to be more than 1000 times faster than real CDC-6600, so
    even slow emulation is likely to be faster than real machine,
    which means that the emulated machine can do the work of the original
    one.

    Access to 64×64->128 is the key unit of processing.

    So to summarize: practical consideration leave rather small space
    for machine using non-power-of-two formats, and it is rather
    unlikely that any design can fit there.

    Of course, there is a very good reason to explore non-mainstream
    approaches, namely having fun. But once you realize that
    mainstream designs make their choices for good reasons,
    exploring alternatives gets less fun (at least for me).

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Wed Feb 18 08:52:27 2026
    From Newsgroup: comp.arch

    On Tue, 17 Feb 2026 20:43:35 +0000, Waldek Hebisch wrote:

    But once you realize that mainstream
    designs make their choices for good reasons,
    exploring alternatives gets less funny (at least for me).

    At one time, back in the past, the mainstream computers had word lengths
    such as 12 bits, 18 bits, 24 bits, 30 bits, 36 bits, 48 bits, 60 bits...
    all multiples of 6 bits.

    The reason for this was that computers needed a character set with
    letters, numbers, and various special characters - and a six-bit
    character, with 64 possibilities, was adequate for that.

    As technology advanced, and computer power became cheaper, it became
    possible to think of using computers for more applications. Using an
    eight-bit character allowed the use of lower-case characters, removing
    a limitation of the older computers that could become annoying in
    the future. Of course, a 7-bit character would also be enough for that -
    and at least one company, ASI, actually made computers with word lengths
    that were multiples of 7 bits.

    Even before System/360, IBM made a computer built around a 64-bit word,
    the STRETCH. It was intended to be a very powerful scientific computer,
    but it also had the very rare feature of bit addressing - which a
    power-of-two word length made much more practical.

    Hardly any architectures provide bit addressing these days, though.

    Nonetheless, a character set that includes lower case is a good reason.
    Since a 36-bit word works better with addressable 9-bit characters than
    with 6-bit ones, nothing is really lost by going to 36 bits.

    Of course, there's another good reason for sticking with 32-bit or 64-bit designs: because that's what everyone else is using, standard memory
    modules have data buses corresponding to such widths, possibly with extra
    bits for ECC.

    To me, those don't seem to be enough "good reasons" to absolutely preclude
    different word lengths. But there would definitely have to be a real
    benefit to justify the cost and effort of using a different length. It seems
    to me there is a real benefit, in that the available data sizes in the
    32-bit world aren't optimized to the needs of scientific computation.

    But it's quite correct to feel this real benefit isn't enough to make
    machines oriented around the 36-bit word length likely.

    John Savard

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Robert Finch@robfi680@gmail.com to comp.arch on Wed Feb 18 08:40:38 2026
    From Newsgroup: comp.arch

    On 2026-02-18 3:52 a.m., quadi wrote:
    On Tue, 17 Feb 2026 20:43:35 +0000, Waldek Hebisch wrote:

    But once you realize that mainstream
    designs make their choices for good reasons,
    exploring alternatives gets less fun (at least for me).

    At one time, back in the past, the mainstream computers had word lengths
    such as 12 bits, 18 bits, 24 bits, 30 bits, 36 bits, 48 bits, 60 bits...
    all multiples of 6 bits.

    The reason for this was that computers needed a character set with
    letters, numbers, and various special characters - and a six-bit
    character, with 64 possibilities, was adequate for that.


    John Savard

    Maybe we should switch to 18-bit bytes to support UNICODE.

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Thu Feb 19 02:10:07 2026
    From Newsgroup: comp.arch

    On 2/12/2026 11:09 AM, MitchAlsup wrote:

    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    MitchAlsup <user5857@newsgrouper.org.invalid> writes:
    {{One can STILL argue whether
    deNormals were a plus or a minus in IEEE}}

    I am surprised to read that from you, who has always written that
    denormals can be implemented cheaply and efficiently in hardware. The
    additional hardware cost (or the cost of trapping and software
    emulation) has been the only argument against denormals that I ever
    encountered.

    It is only after IEEE 754-2008 came with FMAC that deNormals became
    a low cost addition. {And that has been my point--you seem to have
    forgotten the -2008 part or the argument}

    And, can note, this is assuming that one actually pays the cost of
    native hardware FMAC.


    Well, and the secondary irony that it is mainly cost-added for FMUL,
    whereas FADD almost invariably has the necessary support hardware already.

    But:
    FMUL is expensive operation + cheap normalizer (if no denormals);
    FADD is cheap operation with expensive normalizer.

    FMAC then is gluing the costs of the two units together, but:
    With roughly the latency of both;
    The need to be significantly wider internally to deal with some cases.

    So, FMAC is a single unit that costs more than both units taken
    separately, and with a higher latency.



    FMAC does suddenly get a bit cheaper if its scope is limited to
    FP8*FP8+FP16, but this operation is a bit niche.


    This one makes a lot of sense for NN's, but I haven't gotten my NN tech
    working well enough to make a strong use-case for it.

    Where, in terms of algorithmic or behavioral complexity relative to computational efficiency, NN's are significantly behind what is possible
    with genetic algorithms or genetic programming.


    So, for computational efficiency of the result:
    Hand-written native code, best efficiency;
    Genetic algorithm, moderate efficiency;
    Neural Net, very inefficient.

    The merit of NNs could then be if one could make them adaptive in some practical way:
    Native code: No adaptation apart from specific algos;
    Genetic algorithms: Only when running the evolver, static otherwise;
    NN's: Could be made adaptable in theory, usually fixed in practice.

    And, adaptation process:
    Native: None, maybe manual fiddling by programmer;
    Genetic algo: Initially very slow, gradually converges on answer;
    NNs, via genetic algorithm: Slow, but converges toward an answer;
    NNs, via backprop: Rapid adaptation initially, then hits a plateau.

    Backprop is seemingly prone to get stuck at a non-optimal solution, and
    then is hard pressed to make any further progress. Seemingly isn't
    really able to "fix" any obvious structural defects once it hits a
    plateau, but can sometimes jump up or down between various nearby
    options (when obvious suboptimal patterns persist).

    Some tricks that work with GA-NN's don't really work with backprop, and
    my initial attempts to glue GA handling onto backprop have not been
    effective. Also it seems to need at least FP16 weights for training to
    work effectively (though, one other option being FP8 with a bias
    counter; but this is effectively analogous to using a non-standard
    S.E4.M11 format).


    Seemingly, my own efforts are getting stuck at the level of very
    inefficiently solving very mundane issues, nowhere near the success
    being seen by more mainstream efforts.

    Nor, as of yet, even anything particularly interesting...



    Had started making some progress in other types of areas though, for
    example:
    Figured out a practical way to get below 16kbps for audio...

    By using 8kHz ADPCM and then using lookup table and reversed LZ search trickery to make the audio more LZ compressible (without changing the
    storage format).

    Or, basically, ADPCM encoding strategy like:
    Lookup a match for the last 4 bytes;
    Look for the longest backwards match (last N bytes);
    Evaluate if the next byte for pattern is within an error limit;
    Select based on combination of error and length
    Longer matches permit more error than shorter ones.
    Check a pattern table,
    seeing if anything is within an acceptable error limit;
    Use pattern if so.
    Else:
    Figure out best-match for next 6 samples,
    using this to encode next 4 samples (1 byte).

    Was able to get around a 20-30% reduction in bitrate, or around 12 kbps typical, before loss of audio quality becomes unacceptable (starts
    breaking down in obvious ways).


    Did version for 4-bit ADPCM, which can get a roughly similar reduction,
    or around 24 kbps, though trying to push it much lower makes 2-bit ADPCM preferable.

    A slightly higher reduction rate is possible if the baseline sample-rate
    is increased to 16kHz, but still doesn't get as low as when using 8 kHz.



    Note that it is possible to just use a pattern table directly to give an equivalent of 8 kbps ADPCM (each byte encoding an index into an 8-sample table, which is then decoded as 2-bit ADPCM), but the audio quality is unacceptably poor (for much of any use-case).
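
    For reference, a single 2-bit ADPCM decode step (this is a generic,
    hypothetical coder sketched for illustration, not BGB's actual
    bitstream or step table) looks roughly like:

    ```c
    #include <stdint.h>

    typedef struct { int pred; int step; } Adpcm2;

    static int16_t clamp16(int v) {
        if (v >  32767) return  32767;
        if (v < -32768) return -32768;
        return (int16_t)v;
    }

    /* decode one 2-bit code: bit 0 = magnitude, bit 1 = sign;
       the step size adapts up after large codes, down after small ones */
    int16_t adpcm2_decode(Adpcm2 *st, unsigned code) {
        int delta = (code & 1) ? st->step * 2 : st->step;
        if (code & 2) delta = -delta;
        st->pred = clamp16(st->pred + delta);
        st->step = (code & 1) ? st->step * 2 : st->step / 2;
        if (st->step < 1)    st->step = 1;
        if (st->step > 8192) st->step = 8192;
        return (int16_t)st->pred;
    }
    ```

    A pattern-table byte would then just index a table of eight such
    2-bit codes, decoded in sequence.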


    Though, all this was mostly dusting off an experiment from last year,
    and putting it to use in my packaging tool (inside BGBCC).

    Mostly it is a case of:
    It is "good enough" to at least allow for optional super-compression of
    ADPCM without breaking the existing decoders.


    ...





    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Thu Feb 19 17:30:50 2026
    From Newsgroup: comp.arch


    BGB <cr88192@gmail.com> posted:

    On 2/12/2026 11:09 AM, MitchAlsup wrote:

    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    MitchAlsup <user5857@newsgrouper.org.invalid> writes:
    {{One can STILL argue whether
    deNormals were a plus or a minus in IEEE}}

    I am surprised to read that from you, who has always written that
    denormals can be implemented cheaply and efficiently in hardware. The
    additional hardware cost (or the cost of trapping and software
    emulation) has been the only argument against denormals that I ever
    encountered.

    It is only after IEEE 754-2008 came with FMAC that deNormals became
    a low cost addition. {And that has been my point--you seem to have forgotten the -2008 part or the argument}

    And, can note, this is assuming that one actually pays the cost of
    native hardware FMAC.

    It is exceedingly difficult to get an IEEE quality rounded result if
    not done in HW.

    Well, and the secondary irony that it is mainly cost-added for FMUL,
    whereas FADD almost invariably has the necessary support hardware already.

    But:
    FMUL is expensive operation + cheap normalizer (if no denormals);
    FADD is cheap operation with expensive normalizer.

    FMAC then is gluing the costs of the two units together, but:
    With roughly the latency of both;
    The need to be significantly wider internally to deal with some cases.

    The add stage after the multiplication tree is <essentially> 2× as wide.
    FMUL needs a 108-bit 2-input adder;
    FMAC needs a 160-bit 3-input adder and a 52-bit incrementor.
    The multiplication tree is the same, the normalizer is larger.


    So, FMAC is a single unit that costs more than both units taken
    separately, and with a higher latency.

    Prior RISC processors did FMUL in 3-4 cycles (mostly 4).
    Later RISC processors and x86 did FMAC in 4-cycles (occasionally 5).

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Michael S@already5chosen@yahoo.com to comp.arch on Thu Feb 19 20:15:29 2026
    From Newsgroup: comp.arch

    On Thu, 19 Feb 2026 17:30:50 GMT
    MitchAlsup <user5857@newsgrouper.org.invalid> wrote:
    BGB <cr88192@gmail.com> posted:

    On 2/12/2026 11:09 AM, MitchAlsup wrote:

    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    MitchAlsup <user5857@newsgrouper.org.invalid> writes:
    {{One can STILL argue whether
    deNormals were a plus or a minus in IEEE}}

    I am surprised to read that from you, who has always written that
    denormals can be implemented cheaply and efficiently in
    hardware. The additional hardware cost (or the cost of trapping
    and software emulation) has been the only argument against
    denormals that I ever encountered.

    It is only after IEEE 754-2008 came with FMAC that deNormals
    became a low cost addition. {And that has been my point--you seem
    to have forgotten the -2008 part or the argument}

    And, can note, this is assuming that one actually pays the cost of
    native hardware FMAC.

    It is exceedingly difficult to get an IEEE quality rounded result if
    not done in HW.

    Well, and the secondary irony that it is mainly cost-added for
    FMUL, whereas FADD almost invariably has the necessary support
    hardware already.

    But:
    FMUL is expensive operation + cheap normalizer (if no denormals);
    FADD is cheap operation with expensive normalizer.

    FMAC then is gluing the costs of the two units together, but:
    With roughly the latency of both;
    The need to be significantly wider internally to deal with some
    cases.

    The add stage after the multiplication tree is <essentially> 2× as wide.
    FMUL needs a 108-bit 2-input adder;
    FMAC needs a 160-bit 3-input adder and a 52-bit incrementor.
    The multiplication tree is the same, the normalizer is larger.


    So, FMAC is a single unit that costs more than both units taken separately, and with a higher latency.

    Prior RISC processors did FMUL in 3-4 cycles (mostly 4).
    Later RISC processors and x86 did FMAC in 4-cycles (occasionally 5).

    Arm Inc. application processor cores have FMAC latency=4 for the
    multiplicands, but 2 for the accumulator.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Thu Feb 19 18:49:22 2026
    From Newsgroup: comp.arch

    John Dallman <jgd@cix.co.uk> schrieb:

    Quadi, have your computer architectures included IBM 360 floating point support? There is probably more demand for that than for 36-bit these
    days.

    It has been quite a few decades since the last large-scale
    scientific calculations in IBM hex float; I believe it must have
    been the Japanese vector computers (one of which I worked on in
    the mid to late 1990s). It is probably safe to say that any
    hex float these days is embedded firmly in the z ecosystem.

    Since every laptop these days has more performance than the old
    vector computers, I very much doubt that there is significant data
    saved in that format. Same thing for VAX floating point formats.

    Big- vs. little-endian data is a more recent issue. Around 20 years
    ago, I wrote code to convert between big- and little-endian data
    for gfortran. This is also quite irrelevant today.

    The last conversion issue I had a hand in was for IBM's "double
    double" 128-bit real. Now POWER supports this as IEEE in hardware
    (if not very fast), but this ABI change is very painful.

    There could, however, be a niche for 36-bit reals - graphics cards.
    I have recently discovered a GPU solver in a commercial package that
    I use, and it has an option for using 32-bit reals. 36-bit reals
    could extend the usefulness of such a solver.
    --
    This USENET posting was made without artificial intelligence,
    artificial impertinence, artificial arrogance, artificial stupidity,
    artificial flavorings or artificial colorants.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Thu Feb 19 19:55:40 2026
    From Newsgroup: comp.arch


    Michael S <already5chosen@yahoo.com> posted:

    On Thu, 19 Feb 2026 17:30:50 GMT
    MitchAlsup <user5857@newsgrouper.org.invalid> wrote:

    BGB <cr88192@gmail.com> posted:

    On 2/12/2026 11:09 AM, MitchAlsup wrote:

    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    MitchAlsup <user5857@newsgrouper.org.invalid> writes:
    {{One can STILL argue whether
    deNormals were a plus or a minus in IEEE}}

    I am surprised to read that from you, who has always written that
    denormals can be implemented cheaply and efficiently in
    hardware. The additional hardware cost (or the cost of trapping
    and software emulation) has been the only argument against
    denormals that I ever encountered.

    It is only after IEEE 754-2008 came with FMAC that deNormals
    became a low cost addition. {And that has been my point--you seem
    to have forgotten the -2008 part or the argument}

    And, can note, this is assuming that one actually pays the cost of native hardware FMAC.

    It is exceedingly difficult to get an IEEE quality rounded result if
    not done in HW.

    Well, and the secondary irony that it is mainly cost-added for
    FMUL, whereas FADD almost invariably has the necessary support
    hardware already.

    But:
    FMUL is expensive operation + cheap normalizer (if no denormals);
    FADD is cheap operation with expensive normalizer.

    FMAC then is gluing the costs of the two units together, but:
    With roughly the latency of both;
    The need to be significantly wider internally to deal with some
    cases.

    The add stage after the multiplication tree is <essentially> 2× as wide.
    FMUL needs a 108-bit 2-input adder;
    FMAC needs a 160-bit 3-input adder and a 52-bit incrementor.
    The multiplication tree is the same, the normalizer is larger.


    So, FMAC is a single unit that costs more than both units taken separately, and with a higher latency.

    Prior RISC processors did FMUL in 3-4 cycles (mostly 4).
    Later RISC processors and x86 did FMAC in 4-cycles (occasionally 5).


    Arm Inc. application processor cores have FMAC latency=4 for the
    multiplicands, but 2 for the accumulator.

    Thank you for that tidbit of information.

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From quadi@quadibloc@ca.invalid to comp.arch on Fri Feb 20 08:14:46 2026
    From Newsgroup: comp.arch

    On Wed, 18 Feb 2026 08:40:38 -0500, Robert Finch wrote:

    Maybe we should switch to 18-bit bytes to support UNICODE.

    It's true that Unicode has expanded beyond the old 16-bit Basic
    Multilingual Plane. But while all currently-defined characters would fit
    in 18 bits, code points as large as 31 bits were once envisaged; that is
    what the original UTF-8 design supports.
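
    The original 31-bit UTF-8 scheme (RFC 2279, before Unicode was capped
    at U+10FFFF) maps code-point magnitude to byte count like this (a
    sketch):

    ```c
    #include <stdint.h>

    /* byte length of a code point under the original 31-bit UTF-8
       scheme of RFC 2279 (modern UTF-8 stops at 4 bytes / U+10FFFF) */
    int utf8_len(uint32_t cp) {
        if (cp < 0x80)       return 1;   /*  7 bits */
        if (cp < 0x800)      return 2;   /* 11 bits */
        if (cp < 0x10000)    return 3;   /* 16 bits: the old BMP */
        if (cp < 0x200000)   return 4;   /* 21 bits: all current Unicode */
        if (cp < 0x4000000)  return 5;   /* 26 bits */
        return 6;                        /* up to 31 bits */
    }
    ```

    So every currently assigned character fits in 4 bytes, while the
    original design left room for 5- and 6-byte sequences.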

    If 9-bit bytes are used for simple applications, it certainly will be true that 18-bit halfwords will be an available data type.

    John Savard
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Fri Feb 20 05:08:28 2026
    From Newsgroup: comp.arch

    On 2/19/2026 11:30 AM, MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    On 2/12/2026 11:09 AM, MitchAlsup wrote:

    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    MitchAlsup <user5857@newsgrouper.org.invalid> writes:
    {{One can STILL argue whether
    deNormals were a plus or a minus in IEEE}}

    I am surprised to read that from you, who has always written that
    denormals can be implemented cheaply and efficiently in hardware. The
    additional hardware cost (or the cost of trapping and software
    emulation) has been the only argument against denormals that I ever
    encountered.

    It is only after IEEE 754-2008 came with FMAC that deNormals became
    a low cost addition. {And that has been my point--you seem to have
    forgotten the -2008 part or the argument}

    And, can note, this is assuming that one actually pays the cost of
    native hardware FMAC.

    It is exceedingly difficult to get an IEEE quality rounded result if
    not done in HW.

    Likely depends.


    Can use the trick of bumping to the next size up and use that for
    computation.

    So, for Binary32 compute it as Binary64, and for Binary64 compute it as Binary128.
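
    For a single multiply the bump-one-size-up trick is sound: a binary32
    product has at most 48 significand bits, so the binary64 product is
    exact and only the final narrowing rounds. A minimal sketch (single
    operations only; fused or chained operations are another matter):

    ```c
    /* binary32 multiply emulated in binary64: the double product is
       exact (24 + 24 = 48 <= 53 significand bits), so the single cast
       back to float performs the one and only rounding */
    float fmul32_via64(float a, float b) {
        return (float)((double)a * (double)b);
    }
    ```

    E.g. (2^24-1) * (2^24-1) = 2^48 - 2^25 + 1 is held exactly in the
    double, then rounds once to 2^48 - 2^25, matching native float
    multiplication.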

    Can special case the "Binary64 * Binary64 => Binary128" case to save
    cost over using a native Binary128 multiply.

    For Binary128 multiply, can also make sense to detect and special-case
    the "low order bits are zero" case:
    If low-order bits are zero, can use a multiply that only produces the
    high 128 bits;
    Vs a transient 128*128=>256 bit, and then needing to round.



    Relative cost is lower if one is already paying the cost of a trap
    handler or similar (except that if the ISA supports it, you really don't
    want the compiler to combine these operations).

    So, one can maybe document that, if using a compiler like GCC, one should use "-ffp-contract=off -fno-fdiv", ...



    Well, and the secondary irony that it is mainly cost-added for FMUL,
    whereas FADD almost invariably has the necessary support hardware already.
    But:
    FMUL is expensive operation + cheap normalizer (if no denormals);
    FADD is cheap operation with expensive normalizer.

    FMAC then is gluing the costs of the two units together, but:
    With roughly the latency of both;
    The need to be significantly wider internally to deal with some cases.

    The add stage after the multiplication tree is <essentially> 2× as wide.
    FMUL needs a 108-bit 2-input adder;
    FMAC needs a 160-bit 3-input adder and a 52-bit incrementor.
    The multiplication tree is the same, the normalizer is larger.


    A 160-bit 3-way adder happening "quickly" is still kinda asking a lot though...


    Though, granted, the first step is deciding to do a full-width
    multiply, and not discard the low-order results.

    Granted discarding the low results reduces rounding accuracy, but a way
    to fake full IEEE rounding was to detect this case and have the FMUL
    raise a fault (similar to denormal/underflow handling). Though, does
    mean there is a performance penalty if multiplying numbers where the
    low-order bits in both values are non-zero.


    In my ISA, the exact behavior depends on instruction and rounding mode.
    In RISC-V mode, it is partly based on the instruction's rounding mode
    and flag settings.

    For reasons, full IEEE emulation can't safely be enabled until after setting up virtual memory and similar.


    The handling of the RISC-V F/D extensions was non-standard in my case,
    though not in a way that affects GCC output (it seems to exclusively use
    the DYN rounding mode in instructions, assuming the rounding mode to be
    handled via CSRs). Also, ironically, and contrasting with the seeming
    design of these extensions, these registers are so rarely accessed in
    practice that it seemed most sensible to use trap-and-emulate for the CSRs.


    Granted, there are limits to corner cutting:
    If a design does not produce exact results in cases where it is trivial
    to verify that an exact answer exists (i.e., no rounding is required),
    IMO this is below the minimum limit for a usable general-purpose FPU.



    So, FMAC is a single unit that costs more than both units taken
    separately, and with a higher latency.

    Prior RISC processors did FMUL in 3-4 cycles (mostly 4).
    Later RISC processors and x86 did FMAC in 4-cycles (occasionally 5).


    Trying to push the latency down would be pretty bad for timing, unless
    there is some cheaper way to implement FPUs that I am not aware of.

    In my case:
    FMADD.D, RM=DYN: Trap
    FMADD.D, RM=RNE, 10-cycle, double-rounded (non-standard)
    FMADD.S, RM=DYN, 10-cycle (mimics single rounding, *)
    *: Happens internally at Binary64 precision.

    It could be possible to handle FMADD.D RM=DYN the same way as RNE
    internally, but then trap if the inputs would potentially give a
    non-IEEE result. Though, for now, trapping is the cheaper solution in
    terms of HW cost.




    The one exception is FP8*FP8 + FP16, but mostly because it is possible
    to do FP8*FP8 under 1 cycle.

    But, still not free here; and overly niche. Ended up going with a
    cheaper option of simply having an SIMD FP8*FP8=>FP16 multiply op (which
    still ends up as a 2-cycle op, because...).


    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Terje Mathisen@terje.mathisen@tmsw.no to comp.arch on Fri Feb 20 15:22:05 2026
    From Newsgroup: comp.arch

    BGB wrote:
    On 2/19/2026 11:30 AM, MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    On 2/12/2026 11:09 AM, MitchAlsup wrote:

    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    MitchAlsup <user5857@newsgrouper.org.invalid> writes:
    {{One can STILL argue whether
    deNormals were a plus or a minus in IEEE}}

    I am surprised to read that from you, who has always written that
    denormals can be implemented cheaply and efficiently in hardware. The
    additional hardware cost (or the cost of trapping and software
    emulation) has been the only argument against denormals that I ever
    encountered.

    It is only after IEEE 754-2008 came with FMAC that deNormals became
    a low cost addition. {And that has been my point--you seem to have
    forgotten the -2008 part or the argument}

    And, can note, this is assuming that one actually pays the cost of
    native hardware FMAC.

    It is exceedingly difficult to get an IEEE quality rounded result if
    not done in HW.

    Likely depends.


    Can use the trick of bumping to the next size up and use that for computation.

    So, for Binary32 compute it as Binary64, and for Binary64 compute it as Binary128.
    Neither of those work!
    I believed this to be true but I was shown the error of my thinking by
    more knowledgeable people in the 754 working group. I.e. they had a very
    simple/small example where doing the calculation in the next higher
    precision would still cause double rounding errors.
    Also note that Mitch has stated multiple times that you need ~160
    mantissa bits during FMAC double calculations.
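
    A concrete failing case is easy to construct (this is my own worked
    example, not the working group's): pick a and b so that a*b equals
    2^-24 + 2^-54 exactly; adding 1.0 in binary64 rounds the 2^-54 away,
    and the second rounding then lands on the wrong side of the tie.
    Assuming a correctly rounded libm fmaf:

    ```c
    #include <math.h>

    /* emulate binary32 FMA in binary64; volatile keeps the compiler from
       contracting the multiply-add into a real hardware fma */
    float fmaf_via_double(float a, float b, float c) {
        volatile double p = (double)a * (double)b;  /* exact: <= 48-bit product */
        volatile double s = p + (double)c;          /* first rounding */
        return (float)s;                            /* second rounding */
    }

    /* counterexample: 13325 * 80581 = 2^30 + 1, so with
         a = 13325 * 2^-25, b = 80581 * 2^-29, c = 1.0
       a*b = 2^-24 + 2^-54 exactly. The exact sum 1 + 2^-24 + 2^-54
       rounds (once) up to 1 + 2^-23, but via binary64 it first rounds
       down to 1 + 2^-24, and the tie then rounds to even: 1.0f. */
    ```

    So fmaf(a, b, 1.0f) gives 1 + 2^-23, while fmaf_via_double(a, b, 1.0f)
    gives exactly 1.0f.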
    Terje
    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Fri Feb 20 15:26:24 2026
    From Newsgroup: comp.arch

    On 2/20/2026 8:22 AM, Terje Mathisen wrote:
    BGB wrote:
    On 2/19/2026 11:30 AM, MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    On 2/12/2026 11:09 AM, MitchAlsup wrote:

    anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:

    MitchAlsup <user5857@newsgrouper.org.invalid> writes:
    {{One can STILL argue whether
    deNormals were a plus or a minus in IEEE}}

    I am surprised to read that from you, who has always written that
    denormals can be implemented cheaply and efficiently in hardware. The
    additional hardware cost (or the cost of trapping and software
    emulation) has been the only argument against denormals that I ever
    encountered.

    It is only after IEEE 754-2008 came with FMAC that deNormals became
    a low cost addition. {And that has been my point--you seem to have
    forgotten the -2008 part or the argument}

    And, can note, this is assuming that one actually pays the cost of
    native hardware FMAC.

    It is exceedingly difficult to get an IEEE quality rounded result if
    not done in HW.

    Likely depends.


    Can use the trick of bumping to the next size up and use that for
    computation.

    So, for Binary32 compute it as Binary64, and for Binary64 compute it
    as Binary128.

    Neither of those work!

    I believed this to be true but I was shown the error of my thinking by
    more knowledgeable people in the 754 working group. I.e. they had a very
    simple/small example where doing the calculation in the next higher
    precision would still cause double rounding errors.

    Also note that Mitch has stated multiple times that you need ~160
    mantissa bits during FMAC double calculations.


    Could look into this, the next option being to use a makeshift 192-bit FP
    format with a 176-bit mantissa (likely cheaper than going all the way to
    224 bits).

    This is slow/annoying, but not really likely a "hard" problem (when one
    is already doing this stuff in software in a trap handler).


    So, potentially:
    Binary32 -> FP96 (truncated Binary128, still stored as Binary128)
    Binary64 -> FP192 (extended Binary128)
    Binary128 -> FP384 (likewise)
    Big/ugly, but no one says this needs to be fast...


    Might end up on a sort of "TODO list"...


    In any case, actual native hardware support for single-rounded FMA is
    unlikely to happen in my case.


    ...

    Terje


    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Terje Mathisen@terje.mathisen@tmsw.no to comp.arch on Thu Feb 12 14:28:27 2026
    From Newsgroup: comp.arch

    MitchAlsup wrote:

    John Levine <johnl@taugh.com> posted:

    According to David Schultz <david.schultz@earthlink.net>:
    This reminds me of when I took a numerical analysis course. (The many
    ways that computer calculations can go wrong and how to deal with it.)
    The professor said that the school's IBM (360 or 370, ca. 1980) was
    perfect for the course because of the defects in its floating point
    system. Guard digits and rounding sorts of things, as near as I can recall.
    The 360's floating point is a famous and somewhat puzzling failure, considering
    how much else they got right.

    It does hex normalization rather than binary. They assumed that
    leading digits are evenly distributed so there'd be on average one
    zero bit, but in fact they're geometrically distributed, so on average
    there are two. They got one bit back by making the exponent units of 16
    rather than 2, but that's still one bit gone. They also truncated
    rather than rounded results, costing another bit.
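
    The wasted bits are easy to count: a normalized base-16 significand
    only guarantees its top hex digit is nonzero, so that digit can carry
    up to three leading zero bits. A quick sketch:

    ```c
    /* leading zero bits inside the top hex digit (1..15) of a
       normalized base-16 significand: 0 to 3 bits of precision lost */
    int wasted_bits(unsigned top_hex_digit) {
        int n = 0;
        while (!(top_hex_digit & 0x8)) {  /* shift until the top bit is set */
            top_hex_digit <<= 1;
            n++;
        }
        return n;
    }
    ```

    A top digit of 1 wastes three bits, 2-3 waste two, 4-7 waste one, and
    only 8-15 waste none; with geometrically distributed leading digits
    the average loss is the roughly two bits described above.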

    Originally there were no guard digits, which made the results comically
    bad, but IBM retrofitted them at great cost to all the installed machines.

    IEEE floating point can be seen as a reaction to that, how do you use
    the same number of bits but get good results.

    VAX got this correct too (the VAX format, not the one inherited from the
    PDP-11/45; PDP-11/40* FP was worse). VAX FP is arguably as good as
    IEEE 754, with the exception that more IEEE numbers have representable
    reciprocals due to the change in exponent bias by 1. {{One can STILL
    argue whether deNormals were a plus or a minus in IEEE}}

    You _can_ argue about that, but as you've told us on numerous occasions,
    it doesn't actually cost any clock cycles, and there are a few
    zero-seeking algorithms which would not be trivially stable without
    subnormals.

    Finally, having subnormals means that any possible bit pattern between
    negative NaN and positive NaN has a meaningful real value (or +/- Inf),
    so you can compare them, increment them, etc. without worry.

    Terje
    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"
    --- Synchronet 3.21d-Linux NewsLink 1.2