Forum: Too Lazy BBS

Re: Safety of casting from 'long' to 'int'

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Fri May 8 20:04:51 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[concerning UB when multiplying 16-bit unsigneds]

Yes. Btw, the fix is almost trivial:

```
uint16_t
mul(uint16_t a, uint16_t b)
{
unsigned int aa = a, bb = b;
return aa * bb;
}
```

Easier:

uint16_t
mul( unsigned a, unsigned b ){
return a*b;
}
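
For anyone wondering why the plain 16-bit version is a problem at all, here is a minimal sketch. It assumes a platform where int is wider than 16 bits (typically 32), so that uint16_t operands are promoted to signed int before the multiplication; mul16 is an illustrative name, not something from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption of this sketch: uint16_t is narrower than int, so
   `a * b` on two uint16_t values is carried out in signed int.
   With a 32-bit int, 65535 * 65535 exceeds INT_MAX, which is
   undefined behaviour. */
static_assert(sizeof(int) > sizeof(uint16_t),
              "sketch assumes uint16_t promotes to int");

/* Hypothetical name. Taking the operands as unsigned int forces the
   multiplication into unsigned arithmetic, which wraps and is always
   defined. */
uint16_t mul16(unsigned int a, unsigned int b)
{
    /* The conversion to uint16_t reduces the product modulo 65536. */
    return (uint16_t)(a * b);
}
```

With these definitions mul16(65535, 65535) yields 1, because 65535 is congruent to -1 modulo 65536.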
--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Sat May 9 06:25:25 2026

From Newsgroup: comp.lang.c

On 2026-05-08 22:30, Dan Cross wrote:

[...] Victor Yodaiken even wrote
a paper about this: https://arxiv.org/pdf/2201.07845

Very interesting! - Thanks for the link.

Janis


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Sat May 9 07:31:41 2026

From Newsgroup: comp.lang.c

On 2026-05-08 23:02, Waldek Hebisch wrote:

Bart <bc@freeuk.com> wrote:

On 06/05/2026 20:35, Dan Cross wrote:

In article <10tflij$19d6u$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

[...]
This is C:

uint64_t F(uint64_t s, uint64_t t, uint64_t u, uint64_t v) ...

This is my language:

func F(u64 s, t, u, v)u64 ...

All the parameters are the same type. If you change that first u64, they
all change. The parameter names are 's t u v'.

Yes, in this case having single type for parameters is simpler.

They are the same types in the C too, but you have to work harder to
double-check they are in fact identical. If you change that type, then
you have to change it at multiple sites; if you forget one, the compiler
will not tell you.

Well, if there is a mismatch, then the compiler will tell you. If the types
make sense, but are different than intended, then you have trouble.
But this trouble is not different from situation where you need to
change one type, but keep other unchanged. For example, if you
need

uint64_t F(uint64_t s, uint64_t t, uint32_t u, uint64_t v)

In such case C version is easier to modify correctly.
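
A common way to get the "change it in one place" property in C is a typedef. The sketch below is only an illustration: word_t, the function G and both function bodies are made up here, since the original prototypes elide the bodies.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical alias: change the type here and every use follows. */
typedef uint64_t word_t;

static_assert(sizeof(word_t) == 8, "sketch assumes a 64-bit word_t");

/* All four parameters share one type through the alias.
   The body is a placeholder for illustration only. */
word_t F(word_t s, word_t t, word_t u, word_t v)
{
    return s + t + u + v;
}

/* When one parameter must differ, it is simply spelled out. */
word_t G(word_t s, word_t t, uint32_t u, word_t v)
{
    return s + t + u + v;
}
```

The alias covers the "all the same type" case, while the per-parameter syntax still makes the mixed case easy to write.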

The point is that you can group them or declare them separately, at
your discretion - at least in the languages I know. (I suppose that
in Bart's language it's not different.)

Function parameters for example from some Algol 68 code

(INT x, y, CHAR sym, BOOL highlight)

(STRING what, INT min, max)

I think this option is good to have.[*]

(But it's boring to compare a private, unimportant language to a
widespread one. - To show the advantages we could as well pick any
other common language that supports one or the other syntax variant.)

Janis

[*] As in "C" where you can write

int x, y;

for variable declarations. Not allowing that in function signatures
is a restriction. - BTW, in K&R "C" we could do that, IIRC, as in

f(x,y)
int x, y;
{ ... }

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Sat May 9 07:37:13 2026

From Newsgroup: comp.lang.c

On 2026-05-08 23:58, Bart wrote:

So what happened since the late 70s? We're still doing independent compilation, still doing linking, still use makefiles.

Luckily!

In fact it's got a lot more complex rather than simpler.

I'm shocked that even after thorough explanations of the Real Life
outside your mental biotope nothing has reached your perception.

Janis

[...]


From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Sat May 9 05:50:37 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> wrote:

On 08/05/2026 22:02, Waldek Hebisch wrote:

Bart <bc@freeuk.com> wrote:

Note that:

Let me put numbers to identify the claims:

* There are no C-style header files

1.

* There are no separate declarations needed, neither in shared headers,
nor as prototypes

2.

* If a module is imported by 50 others, it is processed exactly once

3.

* Project info (modules etc) is present in only one module of a project.
(Most module schemes require import directives in every module)

4.

* That means no external build system is needed.

5.

Just that is HUGE.

Well, UCSD/Turbo Pascal, Ada, Modula 2, Oberon all have module system
satisfying 3 and 5. UCSD/Turbo Pascal and Oberon do not have
separate interface files, so satisfy 1 (Ada and Modula 2 have
separate interface files, but they have different nature than
C header files). Oberon satisfies 2. So only possibly novel part
is 4.
Looking at implementation decisions, there were strong voices for
separate interface files. So 1 is at least "subjective". I
would say that 2 is probably "arguably bad idea". Concerning 4,
I routinely use a module system where import directives are
sometimes unnecessary: basically 'import' means that names from
given module may be used without qualification. If somebody
decided to use only qualified names, then 'import' is not
necessary. Arguably, information saying from which module to
take given name (and given module may simultaneously use the
same name from multiple modules) is part of source code of
given module. You apparently think differently, but I want
to keep related things together, so want this information
included in module source. So for me your 4 is "arguably
bad idea".

I went through several versions of the module scheme. They all worked,
but with problems.

For example, with one version, each module of a 50-module project say,
would have a rag-tag collection of imports at the top, with whatever
subset of the other 49 was currently needed, that needed constant maintenance.

Then you renamed one module, or combined two, or split one into two, and
you had a lot of editing to do. This kind is very common.

In my coding, changes to module names happen at least an order of
magnitude less frequently than other changes. And when you
need to change module names I do not see how your scheme saves
editing (if you put info in a single file, then you need to
edit one file, but change it in multiple places).

Also, if I need to do massive change I would do something like
below:

for A in *.[ch]; do sed 's,file1,file2,' $A > $A.pp; done
for A in *.[ch]; do mv $A.pp $A; done

That is two commands to do simple mass renaming, regardless of
number of files involved. Could be done in one command, but
doing it in two steps gives me the opportunity to back up or check
things if I have any doubts about correctness of the first
command.

You see the equivalent in C with long collections of header files, but
here you have to generate and maintain the header files too, and you
have to hunt for them within the file system.

Note that all languages that I mention are at roughly similar
level as C, all are more (Oberon) or less (Ada) niche now.
But I think that each of them has more users than your
language (original UCSD Pascal and Turbo Pascal are dead, but
there are Turbo Pascal compatible products in current development).

Also, users and developers of those languages probably think
that they "run rings around C", but do not come here to complain
how bad C is.

C sits at a particular level and mine is at about the same place.

For example, FreePascal transpiles to C; you don't hear of C transpiling
to FreePascal!

I first see such a claim in your post. AFAIK Free Pascal always offered a
home-grown native backend. They used to have (and probably still have)
their own internal linker, so that if you wanted you could directly
generate executable (without creating intermediate .o or assembler
files). So it was closer to what you do than to typical C compiler.

AFAIK now Free Pascal can use LLVM as a backend, but last info
about this I have seen is that using LLVM is optional and that Free
Pascal's own backend will also be supported.

More generally, I do not see any relevance in your "transpiles to C"
argument. Translating from one language to another is a common and
valid way to implement a language. Translating from C to Free
Pascal makes little sense as on all platforms supported by Free
Pascal there are C compilers and Free Pascal can use object files
produced by a C compiler (unlike Turbo Pascal it can also produce
object files for linking into a C program). C is a frequent
target because it is widely available, but languages implemented
first by translation to C can get a native backend (that happened
with C++) and "via C" compilers can co-exist with native
compilers.

What matters is what kinds of constructs are supported and all languages
that I mention support low-level programming. Ada, Modula 2
and Oberon were used to write operating systems, at least in
case of Modula 2 and Oberon there were no other language involved
(beside some small pieces of assembler). I am not sure if
there was any operating system written in Free Pascal, but
Free Pascal has all constructs needed, so there could be
if anybody wanted such system.

So technically each of languages that I mentioned could be
used to implement full software stack, starting from the
lowest level. If any of them gained enough popularity there
would be translators targeting this language.

BTW: IIUC one popular PC "database" system used Modula 2 behind
the scene: it offered its own language which was translated to
Modula 2 which in turn was compiled by Modula 2 compiler to
native code.

So C is that kind of language, and mine would also be if someone kindly wrote decent compilers for it. But for its main platform, it can also be used the same way (I'm doing that right now).

Anyway, when you come here and propose your language as alternative
to C,

I'm not pushing my languages at all; they are personal. I was replying
to this:

"You keep referring to your "systems language" as evidence that C
is bad. C may be bad; there are even ways that I think that C
_is_ bad. But your opinion is not evidence, and given that you
have shown it to be founded on misconceptions, it is utterly
irrelevant."

I'm showing I can create a decent language, one that has been tried and tested so I know what worked and what didn't. And that enables me to
make an informed comparison with C, which is used in the same space.

then you implicitly claim that your language is better than
Ada, Modula 2, Pascal and a lot of other languages which competed
with C.

Well, they didn't compete with C very well. Where were the real contenders?!

Clearly they lost. IMO, there were strong non-technical reasons.
At purely technical level I see one thing favouring C: C allowed
better machine code from a simple compiler. Less technical
thing is that typical C implementation used operating system
linker, which ensured good interoperation with other languages.
Turbo Pascal insisted on using its own linker and "main" program
had to be in Turbo Pascal. Consequently, if you wanted to
offer a library written in Turbo Pascal, such library would
be usable only from Turbo Pascal. IIUC later there were
competing compilers allowing creation of normal object files.
But no wonder that library writers preferred other languages (like
C). Some technically good things lost because of too high price
demanded by vendors. IMO Pascal had a problem because there
were several incompatible dialects.

C was informal, and small,

Ada was formalized and big, but the others were small too. In fact,
Oberon was quite small, but also was late and made a bunch of
bad choices (for example first implementation was for rather
obscure processor).

and allowed you great freedom (more than

I am not sure what "great freedom" means here. Even Ada which
is considered the most strict among the languages that I mentioned allows
doing any needed low level tasks.

necessary), but the way it was presented was poor (syntax etc) and now
is very dated (header files and relying too much on its token-based
macros to fix shortcomings).

Anyway, I'm not making a comparison with those. If I hadn't devised my language (say my place of work provided the language to be used), then
most likely I would have been using C, not Pascal or Ada (which was
anyway still in the future).

Even more, since C programmers did not switch to other
languages earlier, your language must be _much_ better than
other ones. Do you realize how grandiose claim it is? Maybe
you do not mean this, but that is impression that you give.

No. I understand that in reality mine is a crappy little one-man
language that should have been put out of its misery at least 25 years
ago. It has no trendy modern features, no docs, no users, no libraries,
no nothing.

But one of the reasons it's still going is because C is, showing there
is still a demand for that class of primitive language. In that case I
know that niche pretty well!

In that case, then yes it is much more polished, is somewhat safer, with
fewer quirks and fewer surprises. This is the simplest function pointer
type in C, and in my language:

void(*)(void)
ref proc

and the same type used to declare variable:

void(*fnptr)(void);
ref proc fnptr

and here, an array of 10 of those pointers:

void(*table[10])(void);
[10]ref proc table

But, this is the kicker: to write that last C version, I had to use a
tool to figure where the parentheses and square brackets go. What kind
of HLL is that?!

I admit that in case above I would first declare pointer type and
only after that I would declare the array. Not nice, but if you
are bothered by this there are many languages that do not have
this problem.
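
The two-step approach described above could look like the following sketch. The names handler_fn, bump and run_table are made up for illustration; only the shape of the declarations comes from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Step 1: name the pointer type. handler_fn is a hypothetical name
   for "pointer to function taking no arguments and returning void". */
typedef void (*handler_fn)(void);

/* Step 2: declare the array. This has the same type as
   void (*table[10])(void); */
handler_fn table[10];

static_assert(sizeof table / sizeof table[0] == 10, "table has ten slots");

static int calls;                      /* counts invocations, for the demo */
static void bump(void) { calls++; }

/* Fill every slot with the same handler, call each one, and report
   how many calls were made. */
int run_table(void)
{
    calls = 0;
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        table[i] = bump;
        table[i]();
    }
    return calls;
}
```

With the typedef in place, neither declaration needs any thought about where the parentheses and brackets go.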

Unless you're going argue that C's syntax has the edge, then my language
is indeed better.

Well, you apparently are not getting a simple thing: C is "good
enough". That is, problems with C are (or at least were) not deal
breakers. Deal breakers are:
- having a compiler for needed target
- possibly quality (speed and size) of object code
- compatibility with other software (linking, interlanguage calls
and similar)
- ability to use existing code
- ability to express what is needed (higher level languages may
fail this, C and the competition I mentioned are OK)

Actually, for business "objective" quality of language matters
very little. They are interested in availability of programmers,
productivity (time needed to write programs), availability of
compilers and extra tools, possibly price of compilers.

Ada folks made reasonable argument that Ada gives about twice
productivity compared to C. I would say that as long as you
are solving task appropriate for C you are _very_ unlikely to
get a bigger factor. It seems that quality of programmers has
bigger impact. And from business point of view a successful
software project brings revenue which make factor 2 in cost
almost irrelevant.

In other words, to compete with C language must offer really
large advantage: it is not enough to be better, to have any
chance competitor must be much better. C++ managed to attain
its position by being "better C", that is by stressing its
compatibility with C. Rust makes big claim about memory safety
and consequently ability to write safe program. That may be
big enough to win. But you should understand that what
matter is whole ecosystem, including extra tools. If somebody
manages to add safety warranties to C code via external
tools, that could reduce advantage of other languages.

In a different spirtum, several currently popular languages
target higher level coding than C, here productivity gain
from different language is bigger and advantages of C smaller.

I'm not claiming it's unique either (eg. my syntax was taken from
Algol68), but most modern alternatives are bigger and more ambitious.

--
Waldek Hebisch

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Sat May 9 08:39:21 2026

From Newsgroup: comp.lang.c

On 2026-05-09 07:50, Waldek Hebisch wrote:

[...]

Also, if I need to do massive change I would do something like
below:

for A in *.[ch]; do sed 's,file1,file2,' $A > $A.pp; done
for A in *.[ch]; do mv $A.pp $A; done

That is two commands to do simple mass renaming, regardless of
number of files involved. Could be done in one command, but
doing it in two steps gives me the opportunity to back up or check
things if I have any doubts about correctness of the first
command.

If you happen to be on Linux or have the GNU 'sed' available
you can do in-place editing with option '-i' and also create
backups by providing the extension with '-i.pp' (for example).

And don't forget to double-quote the "$A" expressions.

Janis

[...]


From David Brown@david.brown@hesbynett.no to comp.lang.c on Sat May 9 12:40:43 2026

From Newsgroup: comp.lang.c

On 09/05/2026 00:10, Tim Rentsch wrote:

antispam@fricas.org (Waldek Hebisch) writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

Let me address an additional question, which may have been touched
on in other postings although I am not sure of that. What should we
do if we want to safely add signed integers, avoid nasal demons no
matter what, and are content with wrap-around semantics for cases
that "overflow"? Here is a function to do that:

signed
safely_add( signed a, signed b ){
unsigned ua = a, ub = b, u = ua+ub;
return u <= INT_MAX ? (int)u : -INT_MAX + (int)(u-INT_MAX-1) - 1;
}

No undefined behavior, no circumstances where ID behavior comes into
play, and gives desired answer in essentially all environments (the
exceptions are environments where UINT_MAX != INT_MAX*2+1, which is
almost non-existent today).
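
To make the wrap-around concrete, here is the quoted function again with the stated environment assumption turned into a compile-time check, plus a trace of one overflowing case in the comments. This is only a sketch; the static_assert is an addition and is not part of the original posting.

```c
#include <assert.h>
#include <limits.h>

/* The posting says the function relies on UINT_MAX == INT_MAX*2+1.
   Checking that at compile time is an addition made for this sketch. */
static_assert(UINT_MAX == 2u * INT_MAX + 1u,
              "safely_add assumes UINT_MAX == INT_MAX*2+1");

signed
safely_add( signed a, signed b ){
    /* Unsigned addition wraps modulo UINT_MAX+1 and is always defined. */
    unsigned ua = a, ub = b, u = ua+ub;
    /* Trace for a == INT_MAX, b == 1:
         u             == (unsigned)INT_MAX + 1, which is > INT_MAX
         u-INT_MAX-1   == 0
         -INT_MAX+0-1  == INT_MIN
       Every intermediate value stays in range of its type. */
    return u <= INT_MAX ? (int)u : -INT_MAX + (int)(u-INT_MAX-1) - 1;
}
```

So safely_add(INT_MAX, 1) gives INT_MIN and safely_add(INT_MIN, -1) gives INT_MAX, which are the usual two's complement wrap-around results.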

Now, same question, but for multiplication. The answer is almost
exactly the same:

signed
safely_multiply( signed a, signed b ){
unsigned ua = a, ub = b, u = ua*ub;
return u <= INT_MAX ? (int)u : -INT_MAX + (int)(u-INT_MAX-1) - 1;
}

Both gcc and clang compile these functions into one operation each
(along with 'ret').

Here is a little test driver folks may want to try:

#include <stdio.h>

int
main(){
for( signed i = -10000; i <= 10000; i++ ){
for( signed j = -10000; j <= 10000; j++ ){
signed p = i*j;
signed q = safely_multiply( i, j );
if( p == q ) continue;
printf( " %6d * %6d = %12d or %12d\n", i, j, p, q );
}
}
printf( " done.\n" );
return 0;
}

Compiling this with -S -O2 may give an amusing result, for those who
want to try it.

I did not try it, but I would expect 'safely_add' and 'safely_multiply'
to produce just a single machine instruction for computation,
possibly inlined (at -O3 gcc should inline them, but -O2 is more
conservative).

When I compile with -S, both gcc and clang generate (besides the
retq) one instruction (that being a leal) for safely_add, and two
instructions (those being an imull and a movl) for safely_multiply,
at level -O1 or higher.

Without the 'if( p==q ) continue;' test, gcc will inline at -O1 and
higher, and clang will inline at -O2 and higher. After inlining,
both are smart enough to strength-reduce the loop, eliminating the multiplication in favor of an addition.

The amusing result happens when the 'if( p==q ) continue;' test is
left in, at level -O2 or higher. I'm not giving away what happens;
let me just say I was surprised and amused by the result.

I find it surprising that you are surprised by the result. At least,
you would surely have predicted that it was a possibility.

(I'm assuming that you got the same result as I did. But we are talking
about optimisations that gcc has had since version 7, nearly a decade
ago, and I'd expect you have a more recent version. godbolt.org makes
this kind of testing far simpler, and easier to share.)


From David Brown@david.brown@hesbynett.no to comp.lang.c on Sat May 9 12:54:51 2026

From Newsgroup: comp.lang.c

On 08/05/2026 22:04, Bart wrote:

On 08/05/2026 17:58, Keith Thompson wrote:

If you have questions about the behavior of gcc's "-fwrapv" option,
a gcc forum is likely to be a better place to ask them.

Yeah, forget it.

All I get is, I can forget about C's UB for signed overflow, and pretend
it works like unsigned overflow, if I stipulate the '-fwrapv' option to
the compiler if it has one.

If you have written code that depends on signed overflow having wrapping semantics, then you need to compile it with an implementation that
guarantees those semantics.

"gcc -fwrapv" gives you that. So does "clang -fwrapv". So does your
own C compiler, according to what you said previously. I don't know of
any other C compiler that provides such guarantees, but that does not
exclude the possibility.

If you use a compiler that does not explicitly guarantee wrapping
semantics, then you are relying on luck to get compiled code that does
what you want it to do. Maybe that's fine for your needs, though it
strikes me as strange given that it is rarely particularly difficult to
avoid a risk of signed arithmetic overflow in practice.
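
As an illustration of how an overflow can be avoided rather than relied upon, here is a sketch of an addition that tests the operands before adding. checked_add is a hypothetical helper written for this example, not a standard function; it uses only standard C.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: stores a + b in *out and returns true when the
   sum is representable; returns false and leaves *out alone otherwise.
   The range tests themselves cannot overflow, because INT_MAX - b is
   only evaluated for b > 0 and INT_MIN - b only for b < 0. */
bool checked_add(int a, int b, int *out)
{
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;          /* the sum would not fit in an int */
    *out = a + b;              /* known to be in range here */
    return true;
}
```

Code written this way behaves the same with or without a wrapping option, at the cost of two comparisons per addition.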

If it doesn't then it will most likely work like that anyway since it
won't be smart enough to do anything clever.

So the answer to my question is my C source must go hand-in-hand with
some stipulations about how it is built. That already happens with my
generated code anyway as it assumes a 64-bit target.

Code rarely needs to be completely portable to any C implementation -
it's fine for code to be non-portable. But I recommend marking it as
such if there could be any confusion. (Or even better, as I've said
before, use pragmas and pre-processor directives to make sure it is
either compiled with known appropriate implementations, or fails to
compile at all.)
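
A minimal sketch of that kind of guard follows. The particular conditions are assumptions picked from this thread (a 64-bit target, and UINT_MAX == INT_MAX*2+1); a real project would list its own. build_assumptions_hold is a made-up name that only gives the file something to export.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Refuse to compile on targets the code was not written for.
   UINTPTR_MAX comes from <stdint.h>; the 64-bit requirement is an
   assumption of this sketch. */
#if UINTPTR_MAX != 0xFFFFFFFFFFFFFFFFu
#error "this code assumes a 64-bit target"
#endif

/* Conditions that are easier to state as expressions go in
   static_assert, which also stops the build when they fail. */
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
static_assert(UINT_MAX == 2u * INT_MAX + 1u,
              "this code assumes UINT_MAX == INT_MAX*2+1");

/* Hypothetical function: if this file compiled, the checks passed. */
int build_assumptions_hold(void)
{
    return 1;
}
```

The point of the design is that a wrong target produces a compile error with a readable message, not a program that misbehaves at run time.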


From Bart@bc@freeuk.com to comp.lang.c on Sat May 9 11:56:24 2026

From Newsgroup: comp.lang.c

On 09/05/2026 02:57, Dan Cross wrote:

In article <10tls2u$39j7a$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 08/05/2026 22:02, Waldek Hebisch wrote:

[snip]
Note that all languages that I mention are at roughly similar
level as C, all are more (Oberon) or less (Ada) niche now.
But I think that each of them has more users than your
language (original UCSD Pascal and Turbo Pascal are dead, but
there are Turbo Pascal compatible products in current development).

Also, users and developers of those languages probably think
that they "run rings around C", but do not come here to complain
how bad C is.

C sits at a particular level and mine is at about the same place.

For example, FreePascal transpiles to C;

You mean `fpc`? I see no evidence for that. I just looked at https://www.freepascal.org and I see no documentation about the
compiler generating C code; it appears to generate object code
for the target platform, and the software requirements don't
mention a C compiler.

When I tried it about a decade ago, it appeared to use a C backend from
what I can remember. The same with Nim, or GHC. Or languages like
Euphoria or Seed7 which are interpreted, but can lower to C as an option.

But it is also possible I mixed it up with FreeBasic (I tried both).

Programs however move on. If I download it now, then it does bundle
something called 'gcc.exe', but it is a stub: it can load C files, but
it is missing 'cc1'.

The point is that some HLL X is commonly transpiled to C, either for bootstrapping, or for early versions or as an option.

C rarely transpiles to some other HLL X, for the purposes of
implementing C. But it sometimes does when language X wants to migrate existing C code to X.

I'm showing I can create a decent language, one that has been tried and
tested so I know what worked and what didn't. And that enables me to
make an informed comparison with C, which is used in the same space.

No. Having a good understanding of C would enable you to make a
good comparison with C. But, again, you haven't demonstrated
that you have a good understanding of C, and you've expressed
negative interest in gaining such understanding, so whatever you
know about your own language is irrelevant.

As I kept saying, anybody can subjectively compare any language with any
other as it pertains to their sphere, their experience and their
requirements, down to individual features.

In my case:

* Pretty much all coding I did, outside of assembly and scripting, was
for applications that anyone else would have used C for.

* ALL of that was achieved via the features of my own languages

* All the generated code was done via my own tools right down to the binary

So for a particular micro-task, to get it from concept A in the source
code to B in the binary executable for machine M, I know exactly how I
expect it to work.

I can then compare that with using C to try and get from A to B.

I don't care how it does it internally or what are the reasons why it
might give different behaviour.

There are reasonable adjustments you need to make to switch languages,
and there are unreasonables ones, such as needing to become a guru in
the new language.

Or having to use workarounds because your code has to work without UB on
the DS9000, even though you are only interested in M, which has the same characteristics as all other target machines you are likely to use.

And for my language, you can substitute 'X'.

So I refute your claim that somebody can't make a comparison or express
a preference without such indepth knowledge.


From Bart@bc@freeuk.com to comp.lang.c on Sat May 9 13:10:16 2026

From Newsgroup: comp.lang.c

On 09/05/2026 06:50, Waldek Hebisch wrote:

Bart <bc@freeuk.com> wrote:

On 08/05/2026 22:02, Waldek Hebisch wrote:

Bart <bc@freeuk.com> wrote:

Note that:

Let me put numbers to identify the claims:

* There are no C-style header files

1.

* There are no separate declarations needed, neither in shared headers,
nor as prototypes

2.

* If a module is imported by 50 others, it is processed exactly once

3.

* Project info (modules etc) is present in only one module of a project.
(Most module schemes require import directives in every module)

4.

* That means no external build system is needed.

5.

Just that is HUGE.

Well, UCSD/Turbo Pascal, Ada, Modula 2, Oberon all have module system
satisfying 3 and 5. UCSD/Turbo Pascal and Oberon do not have
separate interface files, so satisfy 1 (Ada and Modula 2 have
separate interface files, but they have different nature than
C header files). Oberon satisfies 2. So only possibly novel part
is 4.
Looking at implementation decisions, there were strong voices for
separate interface files. So 1 is at least "subjective". I
would say that 2 is probably "arguably bad idea". Concerning 4,
I routinely use a module system where import directives are
sometimes unnecessary: basically 'import' means that names from
given module may be used without qualification. If somebody
decided to use only qualified names, then 'import' is not
necessary. Arguably, information saying from which module to
take given name (and given module may simultaneously use the
same name from multiple modules) is part of source code of
given module. You apparently think differently, but I want
to keep related things together, so want this information
included in module source. So for me your 4 is "arguably
bad idea".

I went through several versions of the module scheme. They all worked,
but with problems.

For example, with one version, each module of a 50-module project say,
would have a rag-tag collection of imports at the top, with whatever
subset of the other 49 was currently needed, that needed constant
maintenance.

Then you renamed one module, or combined two, or split one into two, and
you had a lot of editing to do. This kind is very common.

In my coding, changes to module names happen at least an order of
magnitude less frequently than other changes. And when you
need to change module names I do not see how your scheme saves
editing (if you put info in a single file, then you need to
edit one file, but change it in multiple places).

(1) I generally don't need qualifiers involving module names, so
changing the name doesn't require updates

(2) I sometimes use aliases for module names, and that can be used for a qualifier, allowing the actual name to be changed without affecting any code

However, if I have module M which has local entity F, and split it into
two modules A and B both of which use F, then I'd have to mark F as
'global', in whichever module it ends up in.

It can sometimes happen that other modules already share something
called F, then the compiler reports an ambiguity. But in this case I'd
rather rename one than resort to using qualifiers.

Now look at what's involved in splitting a C module into two.

For example, FreePascal transpiles to C; you don't hear of C transpiling
to FreePascal!

I first see such a claim in your post. AFAIK Free Pascal always offered a
home-grown native backend. They used to have (and probably still have)
their own internal linker, so that if you wanted you could directly
generate executable (without creating intermediate .o or assembler
files). So it was closer to what you do than to typical C compiler.

AFAIK now Free Pascal can use LLVM as a backend, but last info
about this I have seen is that using LLVM is optional and that Free
Pascal's own backend will also be supported.

See my other post to DC made an hour or so ago.

More generally, I do not see any relevance in your "transpiles to C"
argument. Translating from one language to another is a common and
valid way to implement a language.

Transpilation to C, is very, very common. I see it on lots of projects
on Reddit. However other HLLs are also used, when not using ASM or LLVM.
Those tend to be for more high level languages

C is popular because it is lower level and gives greater freedom to get
things done. For example it is trivial to get around the type system. It
even has 'goto' (which is disappearing from some languages).

What matters is what kinds of constructs are supported and all languages
that I mention support low-level programming. Ada, Modula 2
and Oberon were used to write operating systems, at least in
case of Modula 2 and Oberon there were no other language involved
(beside some small pieces of assembler). I am not sure if
there was any operating system written in Free Pascal, but
Free Pascal has all constructs needed, so there could be
if anybody wanted such system.

So technically each of languages that I mentioned could be
used to implement full software stack, starting from the
lowest level. If any of them gained enough popularity there
would be translators targeting this language.

BTW: IIUC one popular PC "database" system used Modula 2 behind
the scene: it offered its own language which was translated to
Modula 2 which in turn was compiled by Modula 2 compiler to
native code.

I last used Pascal to any great extent in 1980, in a college
environment. It was a teaching language.

It never came up in my work (not even Turbo Pascal), and I never used
other Wirthian languages. I think one (Modula or Oberon) even dropped
'goto'? In that case, no thanks. Oberon anyway comes across as a concept language.

C however was constantly in the background, and I heard about it from
others.

Well, they didn't compete with C very well. Where were the real contenders?!

Clearly they lost. IMO, there were strong non-technical reasons.
At purely technical level I see one thing favouring C: C allowed
better machine code from a simple compiler. Less technical
thing is that typical C implementation used operating system
linker, which ensured good interoperation with other languages.
Turbo Pascal insisted on using its own linker and "main" program
had to be in Turbo Pascal. Consequently, if you wanted to
offer a library written in Turbo Pascal, such library would
be usable only from Turbo Pascal. IIUC later there were
competing compilers allowing creation of normal object files.
But no wonder that library writers preferred other languages (like
C). Some technically good things lost because of too high price
demanded by vendors. IMO Pascal had a problem because there
were several incompatible dialects.

There was also Unix. Did you know that Unix was written in C? You
wouldn't have guessed!

That is a joke. Unix and C (and C compilers and libraries) are so
closely intertwined that you cannot separate them.

I'd say then that that gave C an unfair advantage.

C was informal, and small,

Ada was formalized and big, but others were small too. In fact,
Oberon was quite small, but also was late and made a bunch of
bad choices (for example first implementation was for rather
obscure processor).

I think the problem with Oberon is that it was one man's vision, even if
that man was Wirth, and was full of personal choices that not every one
would agree with. The language was also /too/ small. That was not necessary.

(My language is also a personal endeavour, but I'm not inflicting it on
the world, just sharing some ideas.)

and allowed you great freedom (more than

I am not sure what "great freedom" means here. Even Ada, which
is considered the strictest of the languages that I mentioned, allows
doing any needed low level tasks.

You can completely by-pass the type system including overriding a
function signature with another. You can access any memory address. You
can access the code bytes of any function.

You can pass control (call as a function) to any arbitrary address. You
can jump to any address (via a GNU extension).

You can execute any inline assembly (probably another extension).

Can you do that with Ada? Then good on it, but I'd imagine you'd need to
jump through a few hoops.

I would doubt it very much with Oberon. Mine of course allows all that.

In C these days, you need to work around the UB that most of the
above probably is. That is my beef with it.

C pretends to be a safe language by saying all those naughty things are
UB and should be avoided, at the same time, C compilers can be made to
do all that.

and here, an array of 10 of those pointers:

void(*table[10])(void);
[10]ref proc table

But, this is the kicker: to write that last C version, I had to use a
tool to figure where the parentheses and square brackets go. What kind
of HLL is that?!

I admit that in case above I would first declare pointer type and
only after that I would declare the array. Not nice, but if you
are bothered by this there are many languages that do not have
this problem.
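
The two-step approach described above (name the pointer type first, then declare the array) might look like this in C. This is only a minimal sketch: `handler_fn`, `bump`, `calls` and `run_all` are hypothetical names made up for illustration; only `table` comes from the declaration quoted earlier.

```c
#include <assert.h>
#include <stddef.h>

/* Step 1: name the pointer-to-function type once. */
typedef void (*handler_fn)(void);

/* Step 2: an array of 10 such pointers. This has the same type as
   void (*table[10])(void); but reads left to right. */
static handler_fn table[10];
static_assert(sizeof table / sizeof table[0] == 10, "table has 10 slots");

static int calls;                      /* how many times bump() has run */
static void bump(void) { calls++; }

/* Call every non-null entry; return the number of entries called. */
static int run_all(void)
{
    int n = 0;
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (table[i] != NULL) {
            table[i]();
            n++;
        }
    }
    return n;
}
```

Assigning `table[0] = bump;` and calling `run_all()` then invokes `bump` once. The typedef costs one extra line but removes the need to work out where the parentheses and brackets go.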

Unless you're going argue that C's syntax has the edge, then my language
is indeed better.

Well, you apparently are not getting simple thing: C is "good
enough".

Not for me, sorry. It's like being used to driving an ordinary modern
car, nothing special, then having to drive a Model T Ford.

(This is a very basic vehicle where everything is an extra. That made it flexible and customisable, but it was a Model T!

The analogy is not quite right; you'd have to imagine I could make my
own sleek-looking car, but with languages that would be doable.)

That is, problems with C are (or at least were) not deal
breakers. Deal breakers are:
- having a compiler for needed target
- possibly quality (speed and size) of object code

In the 1980s, the code from C compilers for 8/16-bit machines was just
as poor as mine. In 2020s, it is better, but by a surprisingly small factor.

- compatibility with other software (linking, interlanguage calls
and similar)

FFIs and ABIs address that. However, many libraries expose only an API expressed as C, so languages need a way of creating bindings, or somehow
have additional complexity to use the API directly.

Ada folks made reasonable argument that Ada gives about twice
productivity compared to C.

Programming in Ada is like doing so with one hand tied behind your back.
(With Rust, probably both hands!) I doubt that claim for people like me.

In other words, to compete with C language must offer really
large advantage: it is not enough to be better,

All the C replacements: C#, Java, D, Rust, Go, Zig, C3... are all much
bigger, more complex languages. They are too ambitious.

There is a need for a language at the level of C, with small scope,
small footprint (it can be implemented in 200KB or less; show me a 200KB Rustc), with lots of rope to be able to do what you like.

But, just look at C....

However, we're stuck: C is too firmly entrenched and ensconced.

Nobody making a replacement for it now will be able to resist adding too
much.

(This is where I got it about right with mine, and by design. I can't
compete with big shiny languages, so let's make a decent job of an 80s one.)

That may be
big enough to win. But you should understand that what
matter is whole ecosystem, including extra tools.

Anyone using Unix/Linux is already dependent on that ecosystem.


From David Brown@david.brown@hesbynett.no to comp.lang.c on Sat May 9 15:19:56 2026

From Newsgroup: comp.lang.c

On 09/05/2026 00:43, Dan Cross wrote:

In article <10tk4sg$2l19a$2@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

On 08/05/2026 00:08, Dan Cross wrote:

K&R is a wonderful book for its exposition: well-written,
concise, and the prose is beautiful. Kernighan is an amazing
writer, and Ritchie was well-known for his patience and clear
explanations.

However, it is a product of its time. It dates from a simpler
era, when programmers were expected to use books like it as a
starting point, and subsequently gain mastery either through
careful study of the standard, or extensive practice. (I'm
referring specifically to K&R2, of course, since the first
edition predated the first version of the standard by a decade.)
Machines were smaller and simpler, then, and so were compilers.

I am sad to say that I don't think it has aged particularly well.

I like the way you put that. Sometimes people have a tendency to put
too much reverence on particular texts - such as imagining that K&R says
all that needs to be said about C, and treating any modern tool, text,
standard or program that diverges from it as some kind of heresy, or
"not following the spirit of C". Languages evolve - tools evolve,
programs evolve, standards evolve, requirements evolve. K&R was a
milestone in the history of programming languages, and a model for
technical writing in its field, but C today is not C of fifty years ago.

Thank you. Yes, I pretty much agree.

It is unfortunate that this situation may be UB. I personally think
"unsigned short" should promote to "unsigned int", not "int" -
promotions should be signedness preserving. I don't like the "promote
to int" at all. But opinions don't change the standards, and I suppose
there are historical reasons for the rules that were made here.

But I am not sure I agree that such cases are "easy to stumble into".
How often would code like that be written, where overflowing the
uint16_t would be correct behaviour in the code on a 16-bit int system?
It is certainly possible, but it is perhaps more likely that cases of
overflow in the 16-bit system were also bugs in the code - moving the
code to 32-bit systems could give different undesirable effects from the
bug. It could also happen to remove the effects of the bug by holding
results in 32-bit registers and leading to correct results in later
calculations - UB can work either way.

Sure. This was a bit of a contrived example, but you ask a good
question: how often might one want to write code like that?

I think the particularly interesting thing about asking how often code
like this occurs, is that the potential impact of an oddity may be
higher for things that aren't often used. Most C programmers will
fairly quickly learn that overflowing signed arithmetic is UB and try to
avoid it - but the rarity of this example means that people are less
likely to realise it is UB.

In short, I don't know, but I can think of any number of hash
functions, checksums, etc, that may be implemented using 16-bit
arithmetic, and I can well see programmers wanting to take
advantage of the modular semantics afforded by using unsigned
types to do so. Every day? Probably not. But often enough.
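
As a rough illustration of the kind of 16-bit hash meant here, the sketch below does a multiply-add step in modular arithmetic without relying on `uint16_t * uint16_t`. The function name `hash16` and both constants are assumptions invented for this example; they are not taken from any published algorithm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-bit multiply-add hash, for illustration only. */
static uint16_t hash16(const unsigned char *data, size_t len)
{
    const uint16_t k = 65521u;   /* multiplier (assumed value) */
    uint16_t h = 65535u;         /* seed (assumed value) */

    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        /* Writing h * k would promote both operands to int where int is
           32 bits, and 65535 * 65521 does not fit in that int. Adding 0u
           first moves the whole expression to unsigned arithmetic, which
           wraps; the cast then reduces the result modulo 2^16. */
        h = (uint16_t)((h + 0u) * k + data[i]);
    }
    return h;
}
```

The result is the same whether `unsigned int` is 16, 32 or 64 bits wide, since each step is reduced modulo 2^16 at the end.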

I can imagine situations in the microcontroller world (as usual, many of
my examples come from there!) where code that was originally written for
8-bit or 16-bit devices was moved to 32-bit devices. Microcontroller programmers are big users of fixed-size integer types - sometimes a good thing, sometimes not.

One of the things I had to really internalize as an OS person is
that the universe of useful existing software is large. It
doesn't matter if I create the most beautiful abstractions for
them that are infinitely superior to whatever swill their code
is using now. If they don't get to run their program (or worse,
they have to make a bunch of invasive changes for no discernable
benefit from their perspective) because I know better about how
things ought to be done, they're not going to use whatever
system I'm working on unless they're forced. But even then they
will resent it and move to something else the first chance they
get (lookin' at you, DEC, Microsoft, IBM, and any number of
commercial Unix vendors).

Whatever _I_ think of the interfaces they chose to use is
immaterial; making it difficult for them wins me no friends.
This is one of the smart things Torvalds did with Linux: "don't
break userspace" (unless there's a really, really good reason)
probably did a lot to help make Linux popular.

Anyway, I think this is similar. It doesn't matter what anyone
thinks of whether one ought to prevent all overflow; the fact
is that the language supports it for unsigned integers (though
with some surprising semantics for types of lower rank than
`int`) simply is what it is. And if someone has a program that
avails of those semantics, and that program is important to them
for whatever reason, then there's little choice but to hold
one's nose. I know you know this, of course, but I think it's
worth repeating every now and then.

Agreed. Knowing the semantics (and knowing when no semantics are
defined) is more important than exactly what the semantics are. For any
real language, there are always going to be things you disagree with or
think could be done differently, but you live with it anyway. Just look
at the C or C++ standards committee voting records - very few changes
get voted through unanimously.

(I guess that's why Bart is so deliriously impressed with his own
language - as the language's only designer, implementer, and user, it presumably fits his preferences quite well. Real-world languages are
more of a compromise.)

Certainly, however, the fact that this expression could contain UB would
surprise many C programmers.

Yes. Btw, the fix is almost trivial:

```
uint16_t
mul(uint16_t a, uint16_t b)
{
unsigned int aa = a, bb = b;
return aa * bb;
}
```

But we must be careful - copying the same pattern to uint32_t would then
be incorrect if unsigned int is smaller than 32 bits. (Still no UB,
though.) A general pattern could be :

T mul(T a, T b) {
return (a + 0u) * b;
}

Then "a" would be promoted to unsigned int if it was smaller than that,
but not "demoted" if the unsigned type T is smaller than unsigned int.
It should also, I think, be safe for signed types T up to "int" - but
not bigger than "int".
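
Instantiated for two concrete widths, the `(a + 0u) * b` pattern might be written as below. This is a sketch; `mul16`, `mul32` and `mul_demo` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* (a + 0u) has type unsigned int when a's type is narrower than
   unsigned int, and keeps a's own unsigned type when it is wider.
   Either way the multiplication is done in an unsigned type. */
static uint16_t mul16(uint16_t a, uint16_t b)
{
    return (uint16_t)((a + 0u) * b);
}

static uint32_t mul32(uint32_t a, uint32_t b)
{
    return (uint32_t)((a + 0u) * b);
}

/* Usage: both products wrap modulo 2^16 and 2^32 respectively. */
static void mul_demo(void)
{
    assert(mul16(65535u, 65535u) == 1u);
    assert(mul32(4294967295u, 4294967295u) == 1u);
}
```

The explicit casts on the return values are not required by the language; they only make the final narrowing visible to the reader.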

But if a programmer is not already very familiar with the
language, it may look very odd.

Yes.

I don't think that this is what the authors originally intended
(in fact, I'm quite certain it is not, based on conversations
I've had with them in the past; they very much wanted the
original semantics for integer promotion and did not like those
chosen by the ANSI committee).

K&R has not been updated in almost 40 years, and 40 years ago,
it reflected a very different language, and moreover, reflected
the spirit intended by the original authors. But, regardless of
the original intent, that is not the language we have _today_.

I just picked my copy up off the shelf. The pages are yellowed
and the corners heavily dogeared; but flipping through it is
like seeing an old friend. Then I put it back on the shelf: you
can never go home again.

You make that sound so sad!

It is bittersweet. I have fond memories of times spent with
that copy of that book. I learned a lot from it, and it had an
outsized role in shaping my career and my development as an
engineer.

I met Dennis Ritchie several times. I think he would be pleased
and satisfied to know how many people look at K&R with fondness
and appreciation, but perhaps moreso how many have outgrown it,
as well. I worked in the same office as Kernighan for a while,
and occasionally ate breakfast with him. I managed to overcome
my embarrassment enough one morning and asked him to sign my copy
of K&R1, and I could tell he very much appreciated it; I'm
certain he feels much the way I just described. (Sadly, I never
asked Dennis to sign my copy before he passed away.)

Nice anecdote. It's nice to be reminded that such famous names are (or
were) just ordinary real people.

(My copy is in a box in the loft somewhere. I guess it is really one of
these books that should always be on the bookshelf, even if I never look
at it again.)

Absolutely.

I do have Knuth's Art of Computer Programming on my shelf - but while
it too was a milestone, it is not nearly as readable. (The TeXBook, on
the other hand, was quite enjoyable, as was the MetaFont book.)


From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sat May 9 06:35:06 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[some non-C, some C]

To get a taste for the flavor of SML, you may find https://www.cs.cmu.edu/~rwh/isml/book.pdf interesting.

The Fibonacci example in this "M" language is:

|func fib(n)=
| if n<3 then
| 1
| else
| fib(n-1)+fib(n-2)
| fi
|end

[sic: as a sequence, the Fibonacci numbers are undefined for
$n<0$, but this is a pedagogical example, so let's ignore that]

A comment on that further down...

In SML, this same program could be written as:

```
fun fib(n) =
if n<3 then
1
else
fib(n-1) + fib(n-2)
```

Here I'm trying to follow his style. Somewhat more
idiomatically, it would probably be written like:

```
fun fib n =
if n < 3 then 1
else fib (n - 1) + fib (n - 2)
```

Though typically one would use pattern matching so as to more
closely match the mathematical definition of the Fibonacci
numbers, expressed as a recurrence relation:

```
fun fib 0 = 1
| fib 1 = 1
| fib n = fib (n - 1) + fib (n - 2)
```

(Origin 0, not 1.)

fibonacci(0) is 0. There is no other.

But note that so far all of these programs are exponential in
both space and time. A more robust version, mirroring Harper's,
that runs in linear time and space is:

```
exception Range of string

fun fib n =
let fun fib' 0 = (1, 0)
| fib' 1 = (1, 1)
| fib' n =
let val (a, b) = fib' (n - 1)
in (a + b, a)
end
in if n >= 0
then #1 (fib' n)
else raise Range "fib: n must be non-negative"
end
```

Usually the fibonacci function is defined for negative numbers
through the same recurrence relation, so fib(n) is defined in
terms of fib(n+1) and fib(n+2). Here is (an exponential) fib
written in OCaml:

let rec fib = function
| 0 -> 0
| 1 -> 1
| k when k > 1 -> fib (k-2) + fib (k-1)
| k -> fib (k+2) - fib (k+1)

Incidentally, fibonacci(-n) = fibonacci(n), except with alternating
signs: ..., -8, 5, -3, 2, -1, 1, 0, 1, 1, 2, 3, 5, 8, ...
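
For comparison, the same extension to negative indices can be sketched in C using the sign identity just mentioned rather than the backward recurrence. The name `fib_signed` and the range limit are assumptions of this sketch.

```c
#include <assert.h>

/* Fibonacci extended to negative indices: F(-n) equals F(n) for odd n
   and -F(n) for even n. Limited to -92 <= n <= 92 so that the result
   fits in long long. */
static long long fib_signed(int n)
{
    assert(n >= -92 && n <= 92);

    unsigned m = n < 0 ? 0u - (unsigned)n : (unsigned)n;   /* |n| */

    /* Iterate in unsigned arithmetic: after the loop b holds F(m+1),
       which for m == 92 would overflow a signed 64-bit type. */
    unsigned long long a = 0, b = 1;                        /* F(0), F(1) */
    for (unsigned i = 0; i < m; i++) {
        unsigned long long next = a + b;
        a = b;
        b = next;
    }

    long long f = (long long)a;                             /* F(|n|) */
    return (n < 0 && m % 2 == 0) ? -f : f;
}
```

This runs in linear time, and it reproduces the sequence ..., -8, 5, -3, 2, -1, 1, 0, 1, 1, 2, 3, 5, 8, ... around zero.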

A tail recursive version that runs in linear time and constant
space is:

```
exception Range of string

fun fib n =
let fun fib' 0 a _ = a
| fib' n a b = fib' (n - 1) (a + b) a
in if n >= 0
then fib' n 1 0
else raise Range "fib: n must be non-negative"
end
```

Of course, in C, one might write the last as something like:

```
unsigned int
fib(unsigned int n)
{
unsigned int a = 1, b = 0;
while (n-- > 0) {
unsigned int sum = a + b;
b = a;
a = sum;
}
return a;
}
```

Here is my current favorite (for C) linear fibonacci function:

typedef unsigned long long ULL;

ULL
sfibonacci( unsigned n ){
ULL a = 0, b = 1;
while( n > 1 ) a += b, b += a, n -= 2;
return !n ? a : b;
}

Here is my current favorite fast fibonacci function (which happens
to be written in a functional and tail-recursive style):

static ULL ff( ULL, ULL, unsigned, unsigned );
static unsigned lone( unsigned );

ULL
ffibonacci( unsigned n ){
return ff( 1, 0, lone( n ), n );
}

ULL
ff( ULL a, ULL b, unsigned m, unsigned n ){
ULL c = a+b;
return
m & n ? ff( (a+c)*b, b*b+c*c, m>>1, n ) :
m ? ff( a*a+b*b, (a+c)*b, m>>1, n ) :
/*****/ b;
}

unsigned
lone( unsigned n ){
return n |= n>>1, n |= n>>2, n |= n>>4, n ^ n>>1;
}

Much faster than the linear version.
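
For readers who want to see the doubling idea spelled out, here is an iterative sketch of it next to a plain linear loop to check against. It is not the code above, only a different formulation built on the identities F(2k) = F(k)(2F(k+1) - F(k)) and F(2k+1) = F(k)^2 + F(k+1)^2. The names `fib_linear` and `fib_doubling` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Linear reference: F(0) = 0, F(1) = 1. */
static uint64_t fib_linear(unsigned n)
{
    uint64_t a = 0, b = 1;
    while (n-- > 0) {
        uint64_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}

/* Fast doubling: scan the bits of n from the top while keeping the
   pair (a, b) = (F(k), F(k+1)), where k is the prefix of n seen so far. */
static uint64_t fib_doubling(unsigned n)
{
    assert(n <= 93);                    /* F(93) is the last value that fits in 64 bits */

    uint64_t a = 0, b = 1;              /* F(0), F(1) */
    for (unsigned bit = 1u << 6; bit != 0; bit >>= 1) {
        uint64_t c = a * (2 * b - a);   /* F(2k)   */
        uint64_t d = a * a + b * b;     /* F(2k+1) */
        if (n & bit) {                  /* k becomes 2k+1 */
            a = d;
            b = c + d;
        } else {                        /* k becomes 2k */
            a = c;
            b = d;
        }
    }
    return a;
}
```

Seven bit positions are enough because n is at most 93. The second element of the pair may wrap on the final step, which is harmless since the arithmetic is unsigned and only the first element is returned.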

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Sat May 9 15:18:52 2026

From Newsgroup: comp.lang.c

In article <10tn3so$3j8hc$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 09/05/2026 02:57, Dan Cross wrote:

In article <10tls2u$39j7a$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 08/05/2026 22:02, Waldek Hebisch wrote:

[snip]
Note that all languages that I mention are at roughly similar
level as C, all are more (Oberon) or less (Ada) niche now.
But I think that each of them has more users than your
language (original UCSD Pascal and Turbo Pascal are dead, but
there are Turbo Pascal compatible products in current developement).

Also, users and developers of those languages probably think
that they "run rings around C", but do not come here to complain
how bad C is.

C sits at a particular level and mine is at about the same place.

For example, FreePascal transpiles to C;

You mean `fpc`? I see no evidence for that. I just looked at
https://www.freepascal.org and I see no documentation about the
compiler generating C code; it appears to generate object code
for the target platform, and the software requirements don't
mention a C compiler.

When I tried it about a decade ago, it appeared to use a C backend from
what I can remember. The same with Nim, or GHC. Or languages like
Euphoria or Seed7 which are interpreted, but can lower to C as an option.

That's not true for GHC, either. It does use an intermediate
representation language called "Cmm" that is described as a,
"simple, C like language". But that is not C, and the compiler
either generates native code or LLVM IR.

I don't know about Nim. A cursory glance indicates that it has
backends targeting a number of languages in the C family (C,
C++, Objective-C) and JavaScript. The C backend seems to be the
default.

But it is also possible I mixed it up with FreeBasic (I tried both).

FreeBASIC appears to generate native code.

Programs however move on. If I download it now, then it does bundle something called 'gcc.exe', but it is a stub: it can load C files, but
it is missing 'cc1'.

This doesn't seem particularly relevant to anything. However,
you may be confused because some of these tools may invoke
`gcc` (or similar) as a command driver to invoke the platform
assembler and/or linker. But that doesn't mean they're invoking
it to compile C source code (this can be surprisingly tedious,
depending on the platform).

The point is that some HLL X is commonly transpiled to C, either for bootstrapping, or for early versions or as an option.

C rarely transpiles to some other HLL X, for the purposes of
implementing C.

Why would it? C compilers are ubiquitous.

I don't see how this is relevant to anything.

But it sometimes does when language X wants to migrate
existing C code to X.

Ok. That's not C treating another language as (effectively) an
IR, though. That's automated conversion. It may technically
meet a definition of "transpilation", but it is qualitatively a
different thing than what, say, `cfront` did for C++ in the
early days.

I'm showing I can create a decent language, one that has been tried and
tested so I know what worked and what didn't. And that enables me to
make an informed comparison with C, which is used in the same space.

No. Having a good understanding of C would enable you to make a
good comparison with C. But, again, you haven't demonstrated
that you have a good understanding of C, and you've expressed
negative interest in gaining such understanding, so whatever you
know about your own language is irrelevant.

As I kept saying, anybody can subjectively compare any language with any other as it pertains to their sphere, their experience and their requirements, down to individual features.

Sure. That doesn't mean those analyses are well-informed or
useful.

In your case, you don't have a good handle on C. Despite your
protestations to the contrary, you've shown this repeatedly.

Therefore, neither your opinion about C, nor your comparisons
of C to your own language, are particularly useful.

Having successfully used your own language, which I am
sure that you do know very well, certainly does not factor into
the matter. In particular given that you lack the critical
requisite understanding _of C_ to make any comparison
meaningful.

Put another way, you can say whatever you want, but no one else
is obliged to care.

In my case:

* Pretty much all coding I did, outside of assembly and scripting, was
for applications that anyone else would have used C for.

...and so? You didn't use C for it, and you don't know C, so it
does not follow that any of that prior experience matters here.

* ALL of that was achieved via the features of my own languages

Ok. But that's not C, and you don't know C, so it's not
relevant to discussing C.

* All the generated code was done via my own tools right down to the binary

Ok. But that's not relevant to discussing C, either.

So for a particular micro-task, to get it from concept A in the source
code to B in the binary executable for machine M, I know exactly how I expect it to work.

Ok, but your expectation has no bearing on C.

I can then compare that with using C to try and get from A to B.

No, that doesn't follow, since you don't know C.

I don't care how it does it internally or what are the reasons why it
might give different behaviour.

This is why none of what you wrote above particularly matters.

You don't care about C _as it is defined_. You only care about
how _you think it should work based on your intuition_. Your
incredulity at its definition not matching your expectations has
no bearing on anything at all.

There are reasonable adjustments you need to make to switch languages,
and there are unreasonable ones, such as needing to become a guru in
the new language.

It strikes me that you need to know the language if you want to
use and discuss it.

I understand that you do not want to use C. That's fine; no one
is forcing you to. But if you want to discuss C, you'd get a
lot more traction (and a lot less pushback) if you actually
learned it.

Or having to use workarounds because your code has to work without UB on
the DS9000, even though you are only interested in M, which has the same characteristics as all other target machines you are likely to use.

Read: you need to know how to use the language.

I gather you find C fiddly. That's ok; I find C fiddly. It is
what it is, however, and no amount of gnashing teeth or wailing
in despair is going to change that.

And for my language, you can substitute 'X'.

So I refute your claim that somebody can't make a comparison or express
a preference without such in-depth knowledge.

You can complain about it all you want: the secret C police are
not going to show up at your door in the middle of the night and
take you away in your bathrobe.

However, if you want others to take your critique seriously,
then you need to actually _understand the language as it is_.
You have repeatedly shown that you do not understand the
language, and are not interested in understanding it. That,
too, is fine; no one is forcing you to study something you don't
want to study. But because of that, few are going to take your
critique particularly seriously, either.

- Dan C.


From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sat May 9 08:37:17 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10tj2h0$20gfo$1@dont-email.me>,
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:
...

The `realloc` thing was a particularly egregious example of a
thing that started life well-defined, then became IB, and then
UB; it's relevant because it shows the committee is willing to
weaken the language's guarantees about what is well-defined
over time, but I admit that that is rare.

When was it ever well-defined?

The C89 standard says:
"If the size of the space requested is zero, the behavior is
implementation-defined; the value returned shall be either a null
pointer or a unique pointer."
Prior to C89, the closest thing there was to a standard was K&R, which
didn't mention realloc() (or most of the rest of what became the C
standard library).

In section 7.10.3.4 ("The `realloc` function"), the last
sentence of the "Description" reads: "If `size` is zero and
`ptr` is not a null pointer, the object it points to is freed."
That statement is explicit, and unambiguous.

The text you quoted is from the prefatory material at the top
of section 7.10.3 ("Memory management functions") and clearly
applies to `malloc` and `calloc`.

I suppose one could make an argument to support it applying
to `realloc` as well because it doesn't explicitly *exclude* it,
but that would be a stretch.

Not at all. The rule in the C standard is that statements in a
higher node of the hierarchy apply to all the child nodes unless
a particular child node explicitly alters it.

I counter with two points: a) the
language in realloc is more specific, and thus should supersede
the general statement in the earlier introductory text, and b)
the language in 7.10.3 is talking about size requested for
allocation, but the language in 7.10.3.4 says that, in the case
it describes, the behavior is to _free_. In that specific case,
no size is "being requested" a la the 7.10.3 language, and thus
the statement about behavior in 7.10.3 does not apply.

The two provisions are not in conflict. The semantic description
in the realloc() section says the block is free()'d, but doesn't
say anything about the return value. The general prelude higher
up describes what is returned when the size requested is zero.
These two passages are talking about different things, and are
not in conflict with each other, and both apply.

The bottom line is that, despite the 7.10.3 wording, C89
explicitly defined `realloc(ptr, 0);` as equivalent to
`free(ptr)` when `ptr != NULL`.

You are simply wrong. There is different wording in C99, and
that newer wording is not a change but a clarification of the
earlier wording in C89. Such clarifications often occur in the
C99 standard.
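
Whichever reading of C89 is correct, code that has to behave the same across implementations and standard revisions can avoid the question by never passing a size of zero to `realloc`. A minimal sketch of that workaround follows; `resize_or_free` and `resize_demo` are hypothetical names, not standard functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Resize a block, treating a zero size as an explicit free.
   Returns NULL both after freeing and when allocation fails; on
   failure the original block is left untouched, as with realloc. */
static void *resize_or_free(void *ptr, size_t size)
{
    if (size == 0) {
        free(ptr);                 /* free(NULL) is a no-op */
        return NULL;
    }
    return realloc(ptr, size);     /* realloc(NULL, size) acts like malloc(size) */
}

/* Usage: allocate, then release through the same entry point. */
static void resize_demo(void)
{
    char *buf = resize_or_free(NULL, 16);
    if (buf != NULL) {
        buf[0] = 'x';
        buf = resize_or_free(buf, 0);
        assert(buf == NULL);
    }
}
```

One design consequence: a NULL return is ambiguous only when `size` is nonzero, so callers need to check for allocation failure only in that case.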

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sat May 9 08:48:27 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

Examples of statically typed languages include SML, Haskell,
Rust, etc. Those are also all strongly typed.

Rust is not generally considered to be strongly typed. Rust has
raw pointers and unsafe functions, both of which (can) violate
type safety.

From Bart@bc@freeuk.com to comp.lang.c on Sat May 9 17:16:08 2026

From Newsgroup: comp.lang.c

On 09/05/2026 16:18, Dan Cross wrote:

In article <10tn3so$3j8hc$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

When I tried it about a decade ago, it appeared to use a C backend from
what I can remember. The same with Nim, or GHC. Or languages like
Euphoria or Seed7 which are interpreted, but can lower to C as an option.

That's not true for GHC, either. It does use an intermediate
representation language called "Cmm" that is described as a,
"simple, C like language". But that is not C, and the compiler
either generates native code or LLVM IR.

That first download I did of it was 1800MB. 300MB of that was bundled
gcc installation. That may have changed.

I don't know about Nim. A cursory glance indicates that it has
backends targeting a number of languages in the C family (C,
C++, Objective-C) and JavaScript. The C backend seems to be the
default.

But it is also possible I mixed it up with FreeBasic (I tried both).

FreeBASIC appears to generate native code.

FBC seems to include a working gcc.exe program.

If I do 'fbc64 -R hello.bas' then it produces the C file below.

(You can also see from this that /nobody/ likes stdint.h types, even
though standardised from C99 which also introduced 'long long' used
here. That is another bugbear. Oh, I forgot, my criticism is not valid.)

Programs however move on. If I download it now, then it does bundle
something called 'gcc.exe', but it is a stub: it can load C files, but
it is missing 'cc1'.

This doesn't seem particularly relevant to anything.

We're drifting from my point, which is that C is in that small category,
of a deceptively simple and malleable language that would be a good fit
for a target language.

I'm saying mine would be in that group, which is why I'm doing
comparisons with Pascal or Ada which have been brought up.

However,
you may be confused because some of these tools may invoke
`gcc` (or similar) as a command driver to invoke the platform
assembler and/or linker.

Probably. But I don't know what FPC looked like when I first tried it.

Why would it? C compilers are ubiquitous.

For the major platforms, so are compilers for dozens of languages.

You don't care about C _as it is defined_. You only care about
how _you think it should work based on your intuition_. Your
incredulity at its definition not matching your expectations has
no bearing on anything at all.

If you disagree with an opinion of mine, would it make any difference
if I knew the C standard inside out? You are hardly going to change your
mind.

Suppose I proposed for example that C should deprecate, then ban, the
ability to write:

A[i]
B[i][j]

respectively as:

i[A]
j[i[B]]

(The last one is a little mind-blowing, as it turns one 2D array access
(two consecutive 1D accesses) into two /nested/ 1D accesses.)

Basically, it would mean addition between pointers and integers would
not be commutative: P + i, but not i + P.

You will either agree with this or not. But I can't see that it requires
any deep knowledge of the standard to make such a proposal, or why
somebody would require that of me in order to even consider it.
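
For anyone who has not seen the swapped forms in action, a small sketch follows. The function names are hypothetical; the `3["ABCDEF"]` expression is the one mentioned further down.

```c
#include <assert.h>

/* E1[E2] is defined as *((E1) + (E2)). Since + commutes, the two
   operands of [] can be written in either order. */
static int get_plain(int (*b)[3], int i, int j)   { return b[i][j]; }
static int get_swapped(int (*b)[3], int i, int j) { return j[i[b]]; }

/* Same as "ABCDEF"[3]. */
static char fourth_letter(void) { return 3["ABCDEF"]; }

/* Usage: both access forms read the same element. */
static void subscript_demo(void)
{
    int grid[2][3] = { {1, 2, 3}, {4, 5, 6} };
    assert(get_plain(grid, 1, 2) == get_swapped(grid, 1, 2));
    assert(fourth_letter() == 'D');
}
```

Banning the swapped form would make the second function a constraint violation while leaving the first unchanged.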

There are reasonable adjustments you need to make to switch languages,
and there are unreasonable ones, such as needing to become a guru in
the new language.

It strikes me that you need to know the language if you want to
use and discuss it.

You want EVERYBODY who uses C to know the standard in as much depth as
KT, JK and TR? (Maybe a few others too but they don't seem that bothered
about it.)

(I've just tried the above proposal in my C compiler. It took half a
minute to find where I had to comment out 4 lines to make it work.

As it happens, because this ability has been there a long time, some
programs use it, for example from sqlite:

nPage = nPageHeader = get4byte(28+(u8*)pPage1->aData);

So this change is not going to happen, and people will continue writing
quirky things like 3["ABCDEF"] just for the hell of it.

This is the story of C.)

Output from fbc64 -R hello.bas:
-----------------------------
typedef signed char int8;
typedef unsigned char uint8;
typedef signed short int16;
typedef unsigned short uint16;
typedef signed int int32;
typedef unsigned int uint32;
typedef signed long long int64;
typedef unsigned long long uint64;
typedef struct { char *data; int64 len; int64 size; } FBSTRING;
typedef int8 boolean;
void fb_PrintString( int32, FBSTRING*, int32 );
FBSTRING* fb_StrAllocTempDescZEx( char*, int64 );
void fb_Init( int32, char**, int32 );
void fb_End( int32 );
void fb_Sleep( int32 );

int32 main( int32 __FB_ARGC__$0, char** __FB_ARGV__$0 )
{
int32 fb$result$0;
__builtin_memset( &fb$result$0, 0, 4ll );
fb_Init( __FB_ARGC__$0, (char**)__FB_ARGV__$0, 0 );
label$0:;
FBSTRING* vr$1 = fb_StrAllocTempDescZEx( (char*)"Hello from FreeBASIC!", 21ll );
fb_PrintString( 0, (FBSTRING*)vr$1, 1 );
FBSTRING* vr$2 = fb_StrAllocTempDescZEx( (char*)"Press any key to continue...", 28ll );
fb_PrintString( 0, (FBSTRING*)vr$2, 1 );
fb_Sleep( -1 );
label$1:;
fb_End( 0 );
return fb$result$0;
}
-----------------------------

Looks like C to me!

--- Synchronet 3.22a-Linux NewsLink 1.2

From David Brown@david.brown@hesbynett.no to comp.lang.c on Sat May 9 18:38:57 2026

From Newsgroup: comp.lang.c

On 09/05/2026 18:16, Bart wrote:

On 09/05/2026 16:18, Dan Cross wrote:

In article <10tn3so$3j8hc$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

(You can also see from this that /nobody/ likes stdint.h types, even
though standardised from C99 which also introduced 'long long' used
here. That is another bugbear. Oh, I forgot, my criticism is not
valid.)

Don't you realise that when you write things like that, you are only demonstrating why so many people do not take you seriously? Have you
checked with every C programmer, and every person writing systems that generate C code, and checked that none of them like the <stdint.h>
types? No? I thought not.

Some people use them extensively. Some people have little use for size-specific types. Some people want size-specific types, but for some reason (good or bad) want to use C90 rather than C99. Some people like
the <stdint.h> types but for some reason (good or bad) are unable to use
them in certain cases. Some people dislike the <stdint.h> types, but
use them anyway.

I like them. I use them a lot (the size-specific types). I like the
names, which I think are clear. If I were programming in a language
that used a different naming scheme for fixed-size types, I'd use the
scheme from that language - whether or not I thought it was the best
names they could have (that would depend on the rest of the language).
I don't like the macros for printing them, but then I don't need printf
much except for testing and debugging.
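For anyone who has not met the printing macros mentioned above, here is a minimal sketch of what they look like in use. The function name `format_counts` and its parameters are made up for illustration; only the `PRIu32` and `PRId64` macros from `<inttypes.h>` are standard.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats a 32-bit unsigned and a 64-bit signed value
   into buf. Returns what snprintf returns, i.e. the length of the full
   formatted text (not counting the terminating '\0'). */
int format_counts(char *buf, size_t size, uint32_t hits, int64_t delta)
{
    /* The PRI macros expand to string literals, so they are pasted
       between the pieces of the format string. */
    return snprintf(buf, size, "hits=%" PRIu32 " delta=%" PRId64, hits, delta);
}
```

The string pasting is what makes these formats awkward to read, which is presumably the complaint; the types themselves need nothing more than `<stdint.h>`.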

But I can't tell you what anyone else does, or what anyone else likes.
I can't see how you can claim to know that /nobody/ likes them.

Programs however move on. If I download it now, then it does bundle
something called 'gcc.exe', but it is a stub: it can load C files, but
it is missing 'cc1'.

This doesn't seem particularly relevant to anything.

We're drifting from my point, which is that C is in that small category,
of a deceptively simple and malleable language that would be a good fit
for a target language.

Your language would not be a good fit, because it is a home-made
personal language with no traction. If it were popular (it does not
need to be massively popular), with multiple developers, a group who
discuss the language design decisions, proper documentation, and a
reasonable user base, /then/ maybe it would be a possible fit for such
use. The suitability of a language for a particular purpose cannot be determined from one single user's personal preferences.

I'm saying mine would be in that group, which is why I'm doing
comparisons with Pascal or Ada which have been brought up.

However, you may be confused because some of these tools may invoke
`gcc` (or similar) as a command driver to invoke the platform
assembler and/or linker.

Probably. But I don't know what FPC looked like when I first tried it.

Why would it? C compilers are ubiquitous.

For the major platforms, so are compilers for dozens of languages.

Almost invariably, C is the first language to be targeted for compilers
for a platform. It does not matter whether you like that or not, it is
a fact.

You don't care about C _as it is defined_. You only care about
how _you think it should work based on your intuition_. Your
incredulity at its definition not matching your expectations has
no bearing on anything at all.

If you disagree with an opinion of mine, would it make any difference
if I knew the C standard inside out? You are hardly going to change your mind.

Suppose I proposed for example that C should deprecate, then ban, the ability to write:

    A[i]
    B[i][j]

respectively as:

    i[A]
    j[i[B]]

(The last one is a little mind-blowing, as it turns one 2D array access
(two consecutive 1D accesses) into two /nested/ 1D accesses.)

I am happy to agree that this is an odd effect of the way these
operators are defined in C. I agree that code which is written this way
would seem unnecessarily confusing. On the face of it, I'd agree that
making the change you suggest would reduce the ability of people to
write odd code. I am fairly confident that none of the C committee
think it is a good idea to write "i[A]" or "j[i[B]]" rather than "A[i]"
or "B[i][j]".
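To make the equivalence concrete, here is a small sketch of the spellings under discussion. The function names are invented for the example; the point is only that, in C as currently standardised, `E1[E2]` means `*((E1)+(E2))`, so the operands can be swapped.

```c
#include <stddef.h>

/* Each of these returns the same element; only the spelling differs. */
int elem_usual(const int *a, size_t i)    { return a[i]; }
int elem_swapped(const int *a, size_t i)  { return i[a]; }     /* legal but odd */
int elem_explicit(const int *a, size_t i) { return *(a + i); }

/* The 2D case: b[i][j] is *(*(b + i) + j), hence j[i[b]]. */
int elem2d_usual(int (*b)[3], size_t i, size_t j)   { return b[i][j]; }
int elem2d_swapped(int (*b)[3], size_t i, size_t j) { return j[i[b]]; }
```

A compiler may reasonably warn about the swapped forms; they are shown here only to illustrate the rule, not as a recommendation.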

But does that mean it would be a good idea to make the change to the C standards? Changes to the standards are not free. There may be good
reasons to keep addition commutative here (I don't know what these might
be). Or it may simply be that it's not worth the effort. Trying to
lock down a language so that people can't write strange things is a
fool's errand.

Basically, it would mean addition between pointers and integers would
not be commutative: P + i, but not i + P.

You will either agree with this or not. But I can't see that it requires
any deep knowledge of the standard to make such a proposal, or why
somebody would require that of me in order to even consider it.

An opinion about preferences for a particular piece of syntax does not
need deep knowledge beyond that bit of code. An opinion on whether it
would be a good idea to change the standard to fit that preference, or
on what other peoples' preferences might be, or any unexpected
consequences or impacts of such a change - /that/ requires a deep knowledge.

--- Synchronet 3.22a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Sat May 9 17:39:30 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 08/05/2026 22:47, Scott Lurndal wrote:

antispam@fricas.org (Waldek Hebisch) writes:

Bart <bc@freeuk.com> wrote:

<snip>

* There are no separate declarations needed, neither in shared headers,
  nor as prototypes

* If a module is imported by 50 others, it is processed exactly once

* Project info (modules etc) is present in only one module of a project.
  (Most module schemes require import directives in every module)

* That means no external build system is needed.

Just that is HUGE.

All of this was the norm in the late 1970s. And it may be HUGE
to you, but clearly it's more YAWN to everyone else.

So what happened since the late 70s?

The state of the art has advanced. Modern
development tools provide additional capabilities
when compared to the rather primitive tools of the 1970s.

Much of this is due to the additional resources available
(disk space, memory and faster CPUs) over time.

--- Synchronet 3.22a-Linux NewsLink 1.2

From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Sat May 9 18:04:45 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> wrote:

On 09/05/2026 06:50, Waldek Hebisch wrote:

Bart <bc@freeuk.com> wrote:

On 08/05/2026 22:02, Waldek Hebisch wrote:

Bart <bc@freeuk.com> wrote:

What matters is what kinds of constructs are supported, and all languages
that I mention support low-level programming. Ada, Modula 2
and Oberon were used to write operating systems; at least in the
case of Modula 2 and Oberon there was no other language involved
(besides some small pieces of assembler). I am not sure if
there was any operating system written in Free Pascal, but
Free Pascal has all constructs needed, so there could be
if anybody wanted such a system.

So technically each of the languages that I mentioned could be
used to implement a full software stack, starting from the
lowest level. If any of them gained enough popularity there
would be translators targeting this language.

BTW: IIUC one popular PC "database" system used Modula 2 behind
the scene: it offered its own language which was translated to
Modula 2 which in turn was compiled by Modula 2 compiler to
native code.

I last used Pascal to any great extent in 1980, in a college
environment. It was a teaching language.

That was the original goal. But Pascal quickly got serious use.

and allowed you great freedom (more than

I am not sure what "great freedom" means here. Even Ada, which
is considered the strictest of the languages that I mentioned, allows
doing any needed low-level tasks.

You can completely by-pass the type system including overriding a
function signature with another. You can access any memory address. You
can access the code bytes of any function.

You can pass control (call as a function) to any arbitrary address. You
can jump to any address (via gnu extension).

You can execute any inline assembly (probably another extension).

Can you do that with Ada? Then good on it, but I'd imagine you'd need to jump through a few hoops.

AFAIK inline assembly is an extension which is provided by at least
some Ada compilers. I am not sure about jumps, but that should
be doable via inline assembly. AFAIK the others are available.
Concerning hoops, there are several special constructs, like
"unchecked conversion"; you need to use them instead of more
typical code.

I would doubt it very much with Oberon.

I do not think Wirth supported inline assembly. AFAIK you could
access arbitrary address and bypass type rules. That is enough
for low level work.

Mine of course allows all that.

In C these days, you need to work around the UB that most of the
above probably is. That is my beef with it.

This stuff is mostly defined by the implementation. The Ada standard
probably says more about low-level constructs than C, but
ultimately a standard cannot dictate too much, so the most important
things are defined by the implementation. For practical purposes
there is little difference to the user between undefined
behaviour and implementation-defined behaviour: you need to know
your implementation and what it promises.
--
Waldek Hebisch
--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Sat May 9 18:18:26 2026

From Newsgroup: comp.lang.c

In article <86ecjkshx0.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

Examples of statically typed languages include SML, Haskell,
Rust, etc. Those are also all strongly typed.

Rust is not generally considered to be strongly typed.

By whom?

Rust has
raw pointers and unsafe functions, both of which (can) violate
type safety.

That is orthogonal to whether or not the language is strongly
typed.

Perhaps you meant memory safety? In which case it is worth
stating that the language only guarantees memory safety for the
safe subset.

Dereferencing a raw pointer, including for assignment through
that pointer, can only be done in an `unsafe` block. Unsafe
functions have no _a priori_ bearing on memory (let alone type)
safety; they only impose the requirement that they must be
called from an `unsafe` block.

Using the `unsafe` parts of the language requires an `unsafe` block,
even in a function marked `unsafe` (admittedly, this did change
from earlier editions of the language).

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Sat May 9 19:20:49 2026

From Newsgroup: comp.lang.c

On 09/05/2026 17:38, David Brown wrote:

On 09/05/2026 18:16, Bart wrote:

(You can also see from this that /nobody/ likes stdint.h types, even
though standardised from C99 which also introduced 'long long' used
here. That is another bugbear. Oh, I forgot, my criticism is not
valid.)

Don't you realise that when you write things like that, you are only demonstrating why so many people do not take you seriously? Have you checked with every C programmer, and every person writing systems that generate C code, and checked that none of them like the <stdint.h>
types? No? I thought not.

So, what's the figure?

I see this pattern frequently (sometimes every other project seemingly)
so they are unpopular for some. And we don't know if people using
uint8_t etc are doing so because they genuinely like it or feel obliged
to use it.

(Tim Rentsch also seems to avoid it here.)

Perhaps try asking why somebody would invent a new type name for uint8_t
at all.

Some people use them extensively. Some people have little use for size-specific types. Some people want size-specific types, but for some
reason (good or bad) want to use C90 rather than C99. Some people like
the <stdint.h> types but for some reason (good or bad) are unable to use them in certain cases. Some people dislike the <stdint.h> types, but
use them anyway.

So they can be problematic. And they are optional which is another matter.

Your language would not be a good fit, because it is a home-made
personal language with no traction.

I'm not suggesting take up of it. The point is that it is in that same category.

Why would it? C compilers are ubiquitous.

For the major platforms, so are compilers for dozens of languages.

Almost invariably, C is the first language to be targeted for compilers
for a platform. It does not matter whether you like that or not, it is
a fact.

If you are looking for an HLL to target for a new language, this is
not going to be a brand-new platform.

It will be an established one with lots of choices.

Basically, it would mean addition between pointers and integers would
not be commutative: P + i, but not i + P.

You will either agree with this or not. But I can't see that it
requires any deep knowledge of the standard to make such a proposal,
or why somebody would require that of me in order to even consider it.

An opinion about preferences for a particular piece of syntax does not
need deep knowledge beyond that bit of code. An opinion on whether it would be a good idea to change the standard to fit that preference, or
on what other peoples' preferences might be, or any unexpected
consequences or impacts of such a change - /that/ requires a deep
knowledge.

Well I made that change and the first app I tried failed because it relied
on 'i + P', if not 'A[i]', but C doesn't allow you to separate those.

The next two were OK, but the fourth also used it:

add32le(p + 2, x + s1->plt->data - p);

(From Tiny C sources.) So it looks like use of 'i + P' is already too
widespread even to deprecate it.

It would have needed to be banned from the start. Then that line would
simply have been written as:

add32le(p + 2, s1->plt->data - p + x);

At least, I made the change and tested it on real programs.
--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Sat May 9 20:13:12 2026

From Newsgroup: comp.lang.c

In article <86mry8so39.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[sic: as a sequence, the Fibonacci numbers are undefined for
$n<0$, but this is a pedagogical example, so let's ignore that]

A comment on that further down...

[snip]

(Origin 0, not 1.)

fibonacci(0) is 0. There is no other.

You are correct, and I was incorrect stating that Fib(n) is
undefined for n<0.

I suppose I am in good company with my program, as Bob Harper's
version from his book on SML that I referenced earlier uses the
same basic definition:

```
(* for n>=0, fib' n evaluates to (a, b), where
a is the nth Fibonacci number, and
b is the (n-1)st *)
fun fib' 0 = (1, 0)
| fib' 1 = (1, 1)
| fib' (n:int) =
let
val (a:int, b:int) = fib' (n-1)
in
(a+b, a)
end
```

He only defined his version over the subset, Fib /\ N, and I was
basing my examples on Harper (which I hope I acknowledged).

[snip]
Here is my current favorite fast fibonacci function (which happens
to be written in a functional and tail-recursive style):

static ULL ff( ULL, ULL, unsigned, unsigned );
static unsigned lone( unsigned );

ULL
ffibonacci( unsigned n ){
return ff( 1, 0, lone( n ), n );
}

ULL
ff( ULL a, ULL b, unsigned m, unsigned n ){
ULL c = a+b;
return
m & n ? ff( (a+c)*b, b*b+c*c, m>>1, n ) :
m ? ff( a*a+b*b, (a+c)*b, m>>1, n ) :
/*****/ b;
}

unsigned
lone( unsigned n ){
return n |= n>>1, n |= n>>2, n |= n>>4, n ^ n>>1;
}

Much faster than the linear version.

Very nice. 64-bit `unsigned long long` overflows for n>93, so I
question how much it matters in practice, though; surely if
calling this frequently you simply cache it in some kind of
table?
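For what it's worth, the table idea could be sketched roughly as follows. This is only an illustration, not a benchmark candidate: the names `fib_cached` and `FIB_MAX_N` are invented here, the lazy initialisation is not thread-safe, and the bound of 93 is simply the overflow limit mentioned above.

```c
#include <assert.h>
#include <stdint.h>

enum { FIB_MAX_N = 93 };  /* F(93) is the last Fibonacci number below 2^64 */

/* Returns F(n) with F(0) = 0 and F(1) = 1; n must not exceed FIB_MAX_N. */
uint64_t fib_cached(unsigned n)
{
    static uint64_t table[FIB_MAX_N + 1];
    static int filled = 0;

    if (!filled) {
        /* Fill the whole table once, on first use, with the linear recurrence. */
        table[0] = 0;
        table[1] = 1;
        for (unsigned i = 2; i <= FIB_MAX_N; i++)
            table[i] = table[i - 1] + table[i - 2];
        filled = 1;
    }
    assert(n <= FIB_MAX_N);
    return table[n];
}
```

With only 94 entries the table could just as well be a constant initialiser; filling it at run time merely avoids typing out the values.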

I wondered how this compared to Binet's Formula, using floating
point:

```
#include <math.h>

unsigned long long
binet_fib(unsigned int n)
{
const long double sqrt5 = sqrtl(5.);

long double fn =
(powl(1. + sqrt5, n) - powl(1. - sqrt5, n)) /
(powl(2., n) * sqrt5);

return llroundl(fn);
}
```

Sadly, my quick test suggests accuracy suffers (presumably due
to floating point) for the larger representable values in the
sequence; specifically, n>90. As a result I didn't bother
attempting to benchmark it.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Sat May 9 20:14:14 2026

From Newsgroup: comp.lang.c

In article <86qznls2p8.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[concerning UB when multiplying 16-bit unsigneds]

Yes. Btw, the fix is almost trivial:

```
uint16_t
mul(uint16_t a, uint16_t b)
{
unsigned int aa = a, bb = b;
return aa * bb;
}
```

Easier:

uint16_t
mul( unsigned a, unsigned b ){
return a*b;
}

Presuming one can tolerate a change in function signature. That
is fine for this contrived example, but I would hesitate to say
that more generally.
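If the signature has to stay as it is, one possible variant (a sketch, not the only way to do it) is to force the arithmetic into unsigned int inside the body. The problem being avoided: where int is 32 bits, both uint16_t operands promote to signed int, and 65535 * 65535 overflows it.

```c
#include <stdint.h>

/* Same signature as the original. Multiplying by 1u first converts the
   operands to unsigned int (or wider), so the product wraps around
   instead of overflowing a signed int. */
uint16_t mul(uint16_t a, uint16_t b)
{
    return (uint16_t)(1u * a * b);
}
```

The result is the product reduced modulo 65536, which is what the caller of a uint16_t multiply presumably expects.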

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Sat May 9 22:15:35 2026

From Newsgroup: comp.lang.c

In article <86ik8wsifm.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10tj2h0$20gfo$1@dont-email.me>,
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:
...

[snip]
Prior to C89, the closest thing there was to a standard was K&R, which
didn't mention realloc() (or most of the rest of what became the C
standard library).

Kuyper is wrong.

The source reference documents for the ANSI C standard include
Dennis Ritchie's C reference manual (describing the language;
not K&R) and the 1984 /usr/group standard (which described the
library). It says that clearly in the "Introduction" on page
viii. Those are clearly closer to a "standard" than K&R.

In section 7.10.3.4 ("The `realloc` function"), the last
sentence of the "Description" reads: "If `size` is zero and
`ptr` is not a null pointer, the object it points to is freed."
That statement is explicit, and unambiguous.

The text you quoted is from the prefatory material at the top
of section 7.10.3 ("Memory management functions") and clearly
applies to `malloc` and `calloc`.

I suppose one could make an argument to support it applying
to `realloc` as well because it doesn't explicitly *exclude* it,
but that would be a stretch.

Not at all. The rule in the C standard is that statements in a
higher node of the hierarchy apply to all the child nodes unless
a particular child node explicitly alters it.

Could you please provide a reference to that rule, preferably
a section or page number, in the first C standard?

I counter with two points: a) the
language in realloc is more specific, and thus should supersede
the general statement in the earlier introductory text, and b)
the language in 7.10.3 is talking about size requested for
allocation, but the language in 7.10.3.4 says that, in the case
it describes, the behavior is to _free_. In that specific case,
no size is "being requested" a la the 7.10.3 language, and thus
the statement about behavior in 7.10.3 does not apply.

The two provisions are not in conflict. The semantic description
in the realloc() section says the block is free()'d, but doesn't
say anything about the return value. The general prelude higher
up describes what is returned when the size requested is zero.
These two passages are talking about different things, and are
not in conflict with each other, and both apply.

Yes, I was incorrect to bring up the return value.

The bottom line is that, despite the 7.10.3 wording, C89
explicitly defined `realloc(ptr, 0);` as equivalent to
`free(ptr)` when `ptr != NULL`.

You are simply wrong.

No.

*That* sentence is correct: I said nothing about the _return
value_ of `realloc` _there_. Nor did I say that that was the
_only_ thing that `realloc` did. I described a single, very
specific behavior.

You failed to read the sentence carefully. Please exercise more
care before saying someone is wrong. Or seek clarification if
you find something ambiguous, but do not assume.

There is different wording in C99, and
that newer wording is not a change but a clarification of the
earlier wording in C89.

You need to go read n2464.

The wording in C90 is absolutely clear that `realloc(ptr, 0)`
frees the object when `ptr != NULL`. That wording disappeared in
C99. Implementations took this as license to not free the
object. C17 tried to address this by making the behavior
explicitly implementation defined:

|If size is zero and memory for the new object is not allocated,
|it is implementation-defined whether the old object is
|deallocated.

So yes, we absolutely went from behavior that was _well-defined_
(deallocating the object if size was 0 and the ptr is non-null)
to IB (in C17) to UB (in C23).
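Given that history, code that wants the old C90 behavior regardless of which standard the implementation follows can spell it out instead of relying on `realloc(ptr, 0)`. A minimal sketch, with `resize_or_free` being a made-up name:

```c
#include <stdlib.h>

/* Hypothetical wrapper: a zero size always frees and yields NULL, so
   realloc is never called with size 0. */
void *resize_or_free(void *ptr, size_t size)
{
    if (size == 0) {
        free(ptr);          /* free(NULL) is a no-op, so ptr may be null */
        return NULL;
    }
    return realloc(ptr, size);  /* on failure returns NULL, ptr stays valid */
}
```

The caller still has to tell "freed on request" apart from "allocation failed"; here that is possible because NULL with a non-zero size can only mean failure.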

Such clarifications often occur in the
C99 standard.

C has evolved since C99.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Sun May 10 00:25:39 2026

From Newsgroup: comp.lang.c

On 2026-05-09 14:10, Bart wrote:

I last used Pascal to any great extent in 1980, in a college
environment. It was a teaching language.

Back in those days. It was also a language that had been used in
critical environments; hereabouts, for example, in a nuclear
reprocessing plant.

These are both application areas where you'll hardly find any
products of privately developed languages like yours, I'm sure.

[...]

(My language is also a personal endeavour, but I'm not inflicting it on
the world, just sharing some ideas.)

...ideas you borrowed from other languages and just assembled
them - as it seems, per design principle, arbitrarily - to your
personal liking.

[...]

Programming in Ada is like doing so with one hand tied behind your back.

I suppose you prefer the "freedom" of assembly.

Janis

--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Sat May 9 15:32:58 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:
[...]

FBC seems to include a working gcc.exe program.

If I do 'fbc64 -R hello.bas' then it produces the C file below.

FBC (FreeBasic) may or may not use C as an intermediate language.
I don't see why it matters, unless you're working with FBC.

(You can also see from this that /nobody/ likes stdint.h types, even
though standardised from C99 which also introduced 'long long' used
here. That is another bugbear. Oh, I forgot, my criticism is not
valid.)

This specific criticism is not valid because it's inaccurate.
Your claim that "nobody" likes <stdint.h> types is blatantly and
demonstrably false. You appear to have reached this laughable
conclusion based on the fact that one specific implementation of
one language uses something else.

It's not your credentials that make your criticism invalid.
It's the falsehoods.

Programs however move on. If I download it now, then it does bundle
something called 'gcc.exe', but it is a stub: it can load C files, but
it is missing 'cc1'.

This doesn't seem particularly relevant to anything.

We're drifting from my point, which is that C is in that small
category, of a deceptively simple and malleable language that would be
a good fit for a target language.

I'm saying mine would be in that group, which is why I'm doing
comparisons with Pascal or Ada which have been brought up.

It may well be true that your language would be a good fit for
this purpose. That has nothing to do with C, which is the topic
of this newsgroup. You know how to find comp.lang.misc (which,
as it happens, I follow).

[...]

If you disagree with an opinion of mine, would it make any
difference if I knew the C standard inside out? You are hardly going
to change your mind.

It might, depending on the circumstances.

Suppose I proposed for example that C should deprecate, then ban, the
ability to write:

A[i]
B[i][j]

respectively as:

i[A]
j[i[B]]

(The last one is a little mind-blowing, as it turns one 2D array
access (two consecutive 1D accesses) into two /nested/ 1D accesses.)

Basically, it would mean addition between pointers and integers would
not be commutative: P + i, but not i + P.

You will either agree with this or not. But I can't see that it
requires any deep knowledge of the standard to make such a proposal,
or why somebody would require that of me in order to even consider it.

It doesn't. You don't need to have a deep understanding of the
entire standard to be able to comment on specific aspects of it.
That's just your fantasy of what you think we're telling you.

As it happens, I would agree with that suggestion. As it happens,
the committee also agrees. The latest C2Y standard draft, N3854,
says "A postfix expression followed by an expression in square
brackets [] is a subscripted designation of an element of an
array. The use of this operator with the first operand of integer
type is an obsolescent feature." (I'll skip over the distinction
between "deprecated" and "obsolescent".)

Suggesting that index[array] should be deprecated is reasonable.

Whining about the meaning of "undefined behavior" while explicitly
refusing to even read the definition of that term is not.

There are reasonable adjustments you need to make to switch languages,
and there are unreasonable ones, such as needing to become a guru in
the new language.

It strikes me that you need to know the language if you want to
use and discuss it.

You want EVERYBODY who uses C to know the standard in as much depth as
KT, JK and TR? (Maybe a few others too but they don't seem that
bothered about it.)

No. Again, that's your fantasy interpretation. Your conceptual leap from
"you need to know the language" to "EVERYBODY" (not just you) needs to
know the standard in depth is breathtaking.

If you want to discuss details of a language, you should understand
those details. In the case of array indexing, you've already
demonstrated reasonable understanding -- not by telling us that
you've written a C compiler, but by talking about the feature here,
mostly correctly.

(I've just tried the above proposal in my C compiler. It took half a
minute to find where I had to comment out 4 lines to make it work.

The difficulty of making a change in one compiler has very little
to do with the effort needed to make a change in the language.

As it happens, because this ability has been there a long time, some
programs use it, for example from sqlite:

nPage = nPageHeader = get4byte(28+(u8*)pPage1->aData);

So this change is not going to happen, and people will continue
writing quirky things like 3["ABCDEF"] just for the hell of it.

This is the story of C.)

You were talking about the indexing operator. There is no indexing
operator in that code.

The ability to write index[array] rather than array[index] is
obsolescent in C2Y. The ability to write index+array rather than
array+index is not, and I don't believe anyone has proposed that it
should be. Array indexing and pointer arithmetic are closely tied
together, though the C2Y draft decouples them when the operand is
of array type.
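To illustrate the distinction: the sqlite line uses integer-plus-pointer arithmetic and contains no subscript at all. A rough sketch, where `read_be32` is a hypothetical stand-in for a helper like `get4byte` (assumed here to read a big-endian 32-bit value) and the offset 4 is arbitrary:

```c
#include <stdint.h>

/* Hypothetical stand-in for a get4byte-style helper: reads a big-endian
   32-bit value starting at p. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Both spellings compute the same address; neither uses []. */
uint32_t field_int_first(const unsigned char *base) { return read_be32(4 + base); }
uint32_t field_ptr_first(const unsigned char *base) { return read_be32(base + 4); }
```

Only the `index[array]` spelling is marked obsolescent in the draft quoted above; `4 + base` is ordinary pointer arithmetic and is unaffected.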

[...]

Bart, the following is directed to people who are interested in
the standard.

In C23 and earlier, the operands of the [] operator must be an
integer and a pointer. The pointer operand is commonly the result
of the "decay" of an array expression, such as the name of an
array object.

In the N3854 draft of C2Y, the rules are the same when one of the
operands is a pointer; E1[E2] is equivalent to *((E1)+(E2)) and is
an lvalue. If one operand is of array type, it does not decay, and
the result is described in terms of elements of the array (object
or value). The array operand is not required to be an lvalue.
(This reduces the need for "temporary lifetime". I don't know
whether it makes it unnecessary in all cases.)

In editions of the C standard up to and including C17, there are
three contexts in which array-to-pointer conversion does not occur: when the
expression is the operand of sizeof, when it's the operand of unary
"&", or when it's a string literal used to initialize an array
object. C23 adds another case, an operand of the typeof operators.
C2Y adds two more cases, an operand of the _Countof operator and
an operand of the array subscripting operator.

(The N1570 draft of C11 incorrectly included the _Alignof operator,
which can only be applied to a parenthesized type name.)
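The three long-standing exceptions can be observed directly. A small sketch follows (all names are invented for the example; it does not cover the newer typeof and _Countof cases):

```c
#include <stddef.h>

static int  numbers[10];
static char word[] = "abc";   /* string literal initializes the array; no decay */

size_t size_of_array(void)   { return sizeof numbers; }        /* whole array */
size_t size_of_decayed(void) { return sizeof (numbers + 0); }  /* an int * */
size_t size_of_word(void)    { return sizeof word; }           /* 4, counting '\0' */

/* &numbers has type int (*)[10], so stepping it by one skips the whole array. */
ptrdiff_t step_of_address(void)
{
    int (*whole)[10] = &numbers;
    return (char *)(whole + 1) - (char *)whole;
}
```

In `numbers + 0` the array is an operand of `+`, so it does decay, which is why that size is the size of a pointer rather than of the array.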
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Sat May 9 15:47:35 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:
[...]

Now look at what's involved in splitting a C module into two.

C doesn't have "modules".

[...]

That is a joke. Unix and C (and C compilers and libraries) are so
closely intertwined that you cannot separate them.

I'd say then that that gave C an unfair advantage.

It gave C an advantage. I don't know what you think is "unfair"
about it.

[...]

C pretends to be a safe language by saying all those naughty things
are UB and should be avoided, at the same time, C compilers can be
made to do all that.

C does not pretend to be a "safe language".

[...]

There is a need for a language at the level of C, with small scope,
small footprint (it can be implemented in 200KB or less; show me a
200KB Rustc), with lots of rope to be able to do what you like.

I have no such need. If you do, well, you've implemented languages
before. Go for it.

[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Sun May 10 00:16:26 2026

From Newsgroup: comp.lang.c

On 09/05/2026 23:25, Janis Papanagnou wrote:

On 2026-05-09 14:10, Bart wrote:

I last used Pascal to any great extent in 1980, in a college
environment. It was a teaching language.

Back in those days. It was also a language that had been used in
critical environments; hereabouts, for example, in a nuclear
reprocessing plant.

These are both application areas where you'll hardly find any
products of privately developed languages like yours, I'm sure.

I don't sell languages. I sold engineering software, which could have
been used in, and for, all sorts of environments, I wouldn't know.

Before that, I designed hardware which my company sold, mostly as
business computers, but also sometimes bare motherboards used in process control.

[...]

(My language is also a personal endeavour, but I'm not inflicting it
on the world, just sharing some ideas.)

...ideas you borrowed from other languages and just assembled
them - as it seems, per design principle, arbitrarily - to your
personal liking.

This sounds like a putdown.

Yes, lots of languages will be collections of ideas that have already
existed. You put them together in a certain way and you have a useful
product that may not exist in quite that form elsewhere.

I looked through mine, and I've identified a dozen or more features that
are either novel (at least I hadn't seen them elsewhere), or adapted in
a different way.

But remember I also develop the tools, and in the early days I had to be
creative when working on very low-end equipment. That is still the case
as I explore new ideas. Language and implementation can work together.

So, where's /your/ language? I mean it is apparently so easy to borrow
bits and pieces and create a new one.

[...]

Programming in Ada is like doing so with one hand tied behind your back.

I suppose you prefer the "freedom" of assembly.

I hate assembly. I prefer a HLL a couple of steps up. C could have been
that language, but I got spoiled by a decade of using my private one.

From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Sat May 9 23:18:15 2026

From Newsgroup: comp.lang.c

Dan Cross <cross@spitfire.i.gajendra.net> wrote:

In article <86mry8so39.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

<snip>

Here is my current favorite fast fibonacci function (which happens
to be written in a functional and tail-recursive style):

static ULL ff( ULL, ULL, unsigned, unsigned );
static unsigned lone( unsigned );

ULL
ffibonacci( unsigned n ){
    return  ff( 1, 0, lone( n ), n );
}

ULL
ff( ULL a, ULL b, unsigned m, unsigned n ){
    ULL c = a+b;
    return
        m & n  ?  ff( (a+c)*b, b*b+c*c, m>>1, n )  :
        m      ?  ff( a*a+b*b, (a+c)*b, m>>1, n )  :
        /*****/   b;
}

unsigned
lone( unsigned n ){
    return  n |= n>>1,  n |= n>>2,  n |= n>>4,  n ^ n>>1;
}

Much faster than the linear version.

Very nice. 64-bit `unsigned long long` overflows for n>93, so I
question how much it matters in practice, though; surely if
calling this frequently you simply cache it in some kind of
table?

I wondered how this compared to Binet's Formula, using floating
point:

```
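#include <math.h>   /* sqrtl, powl, llroundl */

unsigned long long
binet_fib(unsigned int n)
{
    const long double sqrt5 = sqrtl(5.);

    /* F(n) = ((1 + sqrt5)^n - (1 - sqrt5)^n) / (2^n * sqrt5) */
    long double fn =
        (powl(1. + sqrt5, n) - powl(1. - sqrt5, n)) /
        (powl(2., n) * sqrt5);

    return llroundl(fn);
}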
```

Sadly, my quick test suggests accuracy suffers (presumably due
to floating point) for the larger representable values in the
sequence; specifically, n>90. As a result I didn't bother
attempting to benchmark it.
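
A check of this kind can be sketched as follows. This is not the test that was actually run; the helper names are made up for illustration, and the closed form is evaluated as round(phi^n / sqrt(5)) with repeated multiplication instead of powl(), so the numbers it produces need not match the binet_fib() above exactly. Where the first disagreement shows up depends on the platform's long double, so no particular n is claimed here.

```c
/* Exact F(n) by the linear recurrence; fits in 64 bits for n <= 93. */
static unsigned long long fib_exact(unsigned n)
{
    unsigned long long a = 0, b = 1;
    while (n-- > 0) {
        unsigned long long t = a + b;
        a = b;
        b = t;
    }
    return a;
}

/* Rounded closed form F(n) ~ phi^n / sqrt(5), in long double. */
static unsigned long long fib_closed(unsigned n)
{
    const long double sqrt5 = 2.2360679774997896964L;
    const long double phi = (1.0L + sqrt5) / 2.0L;
    long double p = 1.0L;
    for (unsigned i = 0; i < n; i++)
        p *= phi;                       /* p = phi^n */
    return (unsigned long long)(p / sqrt5 + 0.5L);
}

/* Smallest n in [0, limit] where the two disagree, or -1 if none. */
int first_mismatch(unsigned limit)
{
    for (unsigned n = 0; n <= limit; n++)
        if (fib_exact(n) != fib_closed(n))
            return (int)n;
    return -1;
}
```

Calling first_mismatch(93) reports where the floating-point version stops being exact on a given machine; keep limit at 93 or below so the exact value still fits in 64 bits.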

Fast versions of Fibonacci depend on fast computation of a matrix
power (of a two-by-two matrix). One way to get a fast matrix power
is to diagonalize and use floating point (which is essentially
what Binet's Formula does), but as you noted this needs extra
precision. Tim's version looks like a somewhat obscure variant
of fast matrix powering. This has the advantage of doing all computations
on integers. Of course, to make sense this must use increased
precision, preferably arbitrary-precision arithmetic.
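
For comparison, a plainer integer-only version of the same idea is the usual "fast doubling" scheme, which follows from squaring the two-by-two matrix: F(2k) = F(k) * (2*F(k+1) - F(k)) and F(2k+1) = F(k)^2 + F(k+1)^2. The sketch below is not Tim's code; the function name is hypothetical, and it is limited to n <= 93 by the 64-bit result type.

```c
#include <stdint.h>

/* Fast-doubling Fibonacci: O(log n) multiplications.
 * The result is exact for n <= 93; beyond that it is F(n) mod 2^64. */
uint64_t fib_doubling(unsigned n)
{
    uint64_t a = 0, b = 1;              /* (F(k), F(k+1)), starting at k = 0 */

    /* Walk the bits of n from the most significant one down. */
    for (unsigned mask = ~(~0u >> 1); mask != 0; mask >>= 1) {
        uint64_t c = a * (2 * b - a);   /* F(2k)   */
        uint64_t d = a * a + b * b;     /* F(2k+1) */
        if (n & mask) {
            a = d;                      /* k becomes 2k+1 */
            b = c + d;
        } else {
            a = c;                      /* k becomes 2k */
            b = d;
        }
    }
    return a;
}
```

All arithmetic is unsigned, so intermediate wraparound is well defined; swapping uint64_t for an arbitrary-precision type would lift the n <= 93 limit.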
--
Waldek Hebisch

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Sat May 9 16:24:00 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <86ik8wsifm.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

[...]

There is different wording in C99, and
that newer wording is not a change but a clarification of the
earlier wording in C89.

You need to go read n2464.

That's <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2464.pdf>,
a 4-page PDF, title "Zero-size Reallocations are Undefined Behavior".

[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Sat May 9 23:33:39 2026

From Newsgroup: comp.lang.c

In article <10todi7$3vl63$2@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

Bart <bc@freeuk.com> writes:

[snip]
There is a need for a language at the level of C, with small scope,
small footprint (it can be implemented in 200KB or less; show me a
200KB Rustc), with lots of rope to be able to do what you like.

I have no such need. If you do, well, you've implemented languages
before. Go for it.

Yeah, I don't get that one.

He mentioned `rustc`, which is a compiler. One can write Rust
_programs_ that compile into a form that fits into very small
memories, comparable with C programs written for similar
environments, but the era of _needing_ a _compiler_ that small
is well and truly over.

- Dan C.

From Bart@bc@freeuk.com to comp.lang.c on Sun May 10 00:45:45 2026

From Newsgroup: comp.lang.c

On 09/05/2026 23:47, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Now look at what's involved in splitting a C module into two.

C doesn't have "modules".

You want to be /that/ pedantic?

This is exactly why I said that the C standard is your thing. If
somebody uses a term that doesn't appear in the standard, then it
doesn't exist.

So, what is involved in splitting a ... I don't even know what to call
it - a single .c 'source file'? Well, a lot of messy work.

[...]

That is a joke. Unix and C (and C compilers and libraries) are so
closely intertwined that you cannot separate them.

I'd say then that that gave C an unfair advantage.

It gave C an advantage. I don't know what you think is "unfair"
about it.

The context was why C became the dominant language for systems
programming. I offered that as an example. If it helped C over a
potential rival which wasn't used to implement a major OS, then it
strikes me as an unfair advantage.

Suppose Unix was implemented in some other language, then if C was still
more successful over rivals, that would have been fairer.

[...]

C pretends to be a safe language by saying all those naughty things
are UB and should be avoided, at the same time, C compilers can be
made to do all that.

C does not pretend to be a "safe language".

So, C can be unsafe even when you avoid all UB? Examples?

I suppose this depends on what you mean by unsafe. Take this:

m = monthnames[month];
d = daynames[day];

Suppose month and day indices got swapped by mistake, but both are still
within bounds; is this the kind of 'unsafe' in C that some languages can
fix through stricter typing?

But then, how about this one:

d1 = daynames[day1];
d2 = daynames[day2];

A type system can't stop day1 and day2 being swapped; it can still go wrong.
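
The distinction can be shown in C itself, as a rough sketch: wrapping each kind of index in its own struct type makes the first mix-up a compile-time error, while the second stays invisible to the compiler. The type names, tables and accessor functions below are invented for the illustration.

```c
#include <stddef.h>

/* Hypothetical wrapper types: two distinct struct types are not
 * interchangeable, even though both just hold an unsigned index. */
typedef struct { unsigned v; } Month;   /* 0..11 */
typedef struct { unsigned v; } Day;     /* 0..6  */

static const char *const monthnames[12] = {
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
};
static const char *const daynames[7] = {
    "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
};

/* Each table is reachable only through its own index type. */
const char *month_name(Month m) { return m.v < 12 ? monthnames[m.v] : NULL; }
const char *day_name(Day d)     { return d.v < 7  ? daynames[d.v]   : NULL; }
```

With these declarations, passing a Day to month_name() is rejected by the compiler because the struct types are incompatible. Passing day2 where day1 was meant still compiles, since both are of type Day.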

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Sat May 9 17:33:51 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 09/05/2026 23:47, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Now look at what's involved in splitting a C module into two.

C doesn't have "modules".

You want to be /that/ pedantic?

The point is that I don't know what you mean by "module" when you're
talking about C. C doesn't have a feature by that name.

This is exactly why I said that the C standard is your thing. If
somebody uses a term that doesn't appear in the standard, then it
doesn't exist.

I have never said that.

Your attempts to put words in my mouth have a close to 100%
failure rate. Consider *asking* me what I mean rather than
pretending to know.

So, what is involved in splitting a ... I don't even know what to call
it - a single .c 'source file'? Well, a lot of messy work.

Right, you don't know what to call it. I think the term you're
probably looking for is "translation unit".

If you have something to say about splitting a C translation unit
(something I don't think I've ever had a need to do), perhaps because
you've had difficulties doing so yourself, feel free to elaborate.

[...]

That is a joke. Unix and C (and C compilers and libraries) are so
closely intertwined that you cannot separate them.

I'd say then that that gave C an unfair advantage.

It gave C an advantage. I don't know what you think is "unfair"
about it.

The context was why C became the dominant language for systems
programming. I offered that as an example. If it helped C over a
potential rival which wasn't used to implement a major OS, then it
strikes me as an unfair advantage.

Suppose Unix was implemented in some other language, then if C was
still more successful over rivals, that would have been fairer.

OK. I still see nothing unfair about it, but I won't argue the
point.

[...]

C pretends to be a safe language by saying all those naughty things
are UB and should be avoided, at the same time, C compilers can be
made to do all that.

C does not pretend to be a "safe language".

So, C can be unsafe even when you avoid all UB? Examples?

Huh??

I suppose this depends on what you mean by unsafe. Take this:

m = monthnames[month];
d = daynames[day];

Suppose month and day indices got swapped by mistake, but both are
still within bounds; is this the kind of 'unsafe' in C that some
languages can fix through stricter typing?

But then, how about this one:

d1 = daynames[day1];
d2 = daynames[day2];

A type system can't stop day1 and day2 being swapped; it can still go wrong.

What on Earth are you talking about?

You said that C "pretends to be a safe language". It does not.
C can be used safely with considerable care. It can be and often
is used unsafely.

Examples of potentially unsafe C code have nothing at all to do
with my point.

Question: Does C, as you claim, "pretend to be a safe language"?
Can you cite a source to support that claim? If you were willing
to read the C standard, I'd refer you to first few paragraphs of
Annex K, introduced in C11.

You made a false statement. I've made plenty of mistakes here
myself. Acknowledging them would substantially increase your
credibility.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From Michael S@already5chosen@yahoo.com to comp.lang.c on Sun May 10 03:46:37 2026

From Newsgroup: comp.lang.c

On Sat, 09 May 2026 17:33:51 -0700
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

Right, you don't know what to call it. I think the term you're
probably looking for is "translation unit".

If you have something to say about splitting a C translation unit
(something I don't think I've ever had a need to do),

That surprises me greatly.
In my practice refactoring that includes splitting translation units is
rather common.

Or, maybe, I misunderstood your above sentence and you meant that you
never had a need *to say* something about splitting etc...?

perhaps because
you've had difficulties doing so yourself, feel free to elaborate.

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Sat May 9 17:54:49 2026

From Newsgroup: comp.lang.c

Michael S <already5chosen@yahoo.com> writes:

On Sat, 09 May 2026 17:33:51 -0700
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

Right, you don't know what to call it. I think the term you're
probably looking for is "translation unit".

If you have something to say about splitting a C translation unit
(something I don't think I've ever had a need to do),

That surprises me greatly.
In my practice refactoring that includes splitting translation units is
rather common.

Or, maybe, I misunderstood your above sentence and you meant that you
never had a need *to say* something about splitting etc...?

perhaps because
you've had difficulties doing so yourself, feel free to elaborate.

I didn't give it a lot of thought, but I haven't done a lot of
refactoring of C projects. My experience is of course not universal,
and may not be representative.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From James Kuyper@jameskuyper@alumni.caltech.edu to comp.lang.c on Sat May 9 21:25:57 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:
...

That is a joke. Unix and C (and C compilers and libraries) are so
closely intertwined that you cannot separate them.

I've successfully used C on DOS, VMS, Windows, OS2, and a very
specialized operating system that I doubt you've ever heard of, called
(IIRC) Crystal OS (a quick search turned up only a completely unrelated
thing called Crystal OS, which is a variant of Ubuntu Linux - that's NOT
what I'm talking about). Other people on this newsgroup have reported
using C in a far wider variety of environments. For something that is
supposedly inseparable from Unix, it seems to have been quite well
separated.

From Bart@bc@freeuk.com to comp.lang.c on Sun May 10 02:26:44 2026

From Newsgroup: comp.lang.c

On 10/05/2026 01:33, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:

So, what is involved in splitting a ... I don't even know what to call
it - a single .c 'source file'? Well, a lot of messy work.

Right, you don't know what to call it. I think the term you're
probably looking for is "translation unit".

A source file isn't a translation unit. A translation unit is the
primary source file with all the includes flattened out (and I guess
with all the comments removed and all macros expanded), and that does
not happen until compile time.

It's not what I see when I look at file.c in my editor.

Now you're going to tell me I'm wrong according to the standard.

If you have something to say about splitting a C translation unit
(something I don't think I've ever had a need to do), perhaps because
you've had difficulties doing so yourself, feel free to elaborate.

You've never had a module - sorry source file - sorry 'translation unit'
get too big for one file?

But you will surely know everything that might need doing if such a
file needs splitting into two or more files.

My point had been that in my module scheme, it would be less work.

Question: Does C, as you claim, "pretend to be a safe language"?
Can you cite a source to support that claim? If you were willing
to read the C standard, I'd refer you to first few paragraphs of
Annex K, introduced in C11.

You made a false statement. I've made plenty of mistakes here
myself. Acknowledging them would substantially increase your
credibility.

You know what, if all possible answers to all C-related questions were
contained within the C standard, why does this group even exist?

Just post a link to the standard document and be done with it.

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Sun May 10 01:35:29 2026

From Newsgroup: comp.lang.c

In article <10tnmk6$3os5b$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 09/05/2026 16:18, Dan Cross wrote:

In article <10tn3so$3j8hc$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
[snip]

(You can also see from this that /nobody/ likes stdint.h types, even
though standardised from C99 which also introduced 'long long' used
here. That is another bugbear. Oh, I forgot, my criticism is not
valid.)

Funny, I was just talking to some folks about this, who lamented
that it hadn't been done a decade sooner. It would have saved a
lot of pain.

This doesn't seem particularly relevant to anything.

We're drifting from my point, which is that C is in that small category,
of a deceptively simple and malleable language that would be a good fit
for a target language.

I'm saying mine would be in that group, which is why I'm doing
comparisons with Pascal or Ada which have been brought up.

Apparently you feel your language fits into a category that
could be categorized by the "level" at which you program in it.
Ok. I don't see what that has to do with anything.

Why would it? C compilers are ubiquitous.

For the major platforms, so are compilers for dozens of languages.

So what?

You don't care about C _as it is defined_. You only care about
how _you think it should work based on your intuition_. Your
incredulity at its definition not matching your expectations has
no bearing on anything at all.

If you disagree with an opinion of mine, would it make any difference
if I knew the C standard inside out? You are hardly going to change your
mind.

The point is that if you understood the language better, your
opinion would actually be worth agreeing or disagreeing with.
As it is, since you do not know the language particularly well,
your opinion carries little weight; certainly not enough to
merit significant discussion.

It's really not that hard to figure out what all of this means,
by the way. People (plural because, apparently, I am not the
only one) just don't find your opinions worth all that much.

Suppose I proposed for example that C should deprecate, then ban, the
ability to write:

A[i]
B[i][j]

respectively as:

i[A]
j[i[B]]

(The last one is a little mind-blowing, as it turns one 2D array access
(two consecutive 1D accesses) into two /nested/ 1D accesses.)

I haven't really given it much thought. This is an historical
artifact that came from B via "nb", where pointers were denoted
as `A[]`, so A[i] = A + i = i + A = i[A] because arithmetic in
the integers is commutative, and B was word-oriented.

It's a cute parlor trick, surprising to a few who haven't looked
closely at the history, but no deep mystery. I would not mourn
if it were removed from the language.

https://www.nokia.com/bell-labs/about/dennis-m-ritchie/chist.html

Basically, it would mean addition between pointers and integers would
not be commutative: P + i, but not i + P.

No, it would not mean that. It would merely mean that the
syntax for array accesses was divorced from its early history.
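
To make the separation concrete: the standard defines E1[E2] as identical to (*((E1)+(E2))), so the reversed subscript is a consequence of how [] is specified, while p + i and i + p are both valid on their own through the rules for the + operator. The function below is only an illustration with made-up names; it returns 1 when all the spellings agree.

```c
#include <stddef.h>

/* Each comparison pairs the usual spelling with the reversed one. */
int subscript_forms_agree(void)
{
    int A[4] = {10, 20, 30, 40};
    int B[2][3] = {{1, 2, 3}, {4, 5, 6}};
    int *p = A;
    size_t i = 2;

    return A[i] == i[A]                  /* reversed subscript          */
        && *(p + i) == *(i + p)          /* commutative pointer + int   */
        && B[1][2] == 2[1[B]]            /* nested form of a 2D access  */
        && "ABCDEF"[3] == 3["ABCDEF"];   /* the string-literal trick    */
}
```

A hypothetical rule requiring the left operand of [] to be the pointer would reject i[A] and 2[1[B]] but would leave i + p untouched, because that expression never goes through the subscript operator.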

You will either agree with this or not. But I can't see that it requires
any deep knowledge of the standard to make such a proposal, or why
somebody would require that of me in order to even consider it.

In observing your behavior, this fits the pattern of being about
the place where your argument breaks down. Chesterton's fence
applies, of course, but I do not think you are wrong to
question whether that surprising syntax should endure. But it
is your conclusion about commutativity of pointer arithmetic
that fails. You go from something that is, at least, open to
reasonable debate, and draw a specious conclusion that you then
assert as fact.

There are reasonable adjustments you need to make to switch languages,
and there are unreasonables ones, such as needing to become a guru in
the new language.

It strikes me that you need to know the language if you want to
use and discuss it.

You want EVERYBODY who uses C to know the standard in as much depth as
KT, JK and TR? (Maybe a few others too but they don't seem that bothered
about it.)

I did not say that. You presented a straw man. I merely
countered with what I feel is reasonable. (For the record, your
response is another strawman.)

If one wants to use a programming language, then it is axiomatic
that one must know that language. It does not follow that one
must necessarily be a total expert, or have internalized the
standard to the degree that one is capable of citing chapter and
verse for any given construct without the aid of references. I
don't know what your definition of "guru" is, but lots of people
who successfully use a language have a good but imperfect
working knowledge of it, and know to use references, ask someone
more knowledgeable than themselves, or avail themselves of other
resources when they are unsure of something about it.

The critical difference between you and those programmers is
that they are _open_ to learning new things. There are many
aspects of C that are (at a minimum) surprising to the
uninitiated; this is known. On learning about them they may
have a subjectively negative reaction; they may express dislike.
But there's a big difference between disliking a thing and
disregarding it.

One might find it silly to stop at a traffic signal in the
middle of the night, when it is easy to see there is no other
traffic and obviously no one else is nearby. But if you decide not
to stop, don't be upset when a cop pulls you over and gives you
a ticket.

(I've just tried the above proposal in my C compiler. It took half a
minute to find where I had to comment out 4 lines to make it work.

And did you break commutative arithmetic on pointers when you
were at it?

As it happens, because this ability has been there a long time, some
programs use it, for example from sqlite:

nPage = nPageHeader = get4byte(28+(u8*)pPage1->aData);

That's not using subscripting at all.

So this change is not going to happen, and people will continue writing
quirky things like 3["ABCDEF"] just for the hell of it.

This is the story of C.)

No, it's not.

This appears to be another of your misunderstandings. There are
reasons to dislike the semantic quirk of array subscripts
inherited from nb. But once again, your conclusion is specious.

Output from fbc64 -R hello.bas:
-----------------------------
[snip]
Looks like C to me!

Looks like a non-sequitur to me.

- Dan C.

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Sat May 9 19:01:57 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 10/05/2026 01:33, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:

So, what is involved in splitting a ... I don't even know what to call
it - a single .c 'source file'? Well, a lot of messy work.

Right, you don't know what to call it. I think the term you're
probably looking for is "translation unit".

A source file isn't a translation unit. A translation unit is the
primary source file with all the includes flattened out (and I guess
with all the comments removed and all macros expanded), and that does
not happen until compile time.

It's not what I see when I look at file.c in my editor.

That's basically correct, and consistent with the definition in
the standard.

Now you're going to tell me I'm wrong according to the standard.

No, you're right according to the standard. Your record of
incorrectly guessing what I think is unbroken.

I speculated that, in your vague complaint about splitting
something you don't know what to call, you might have been talking
about translation units. Apparently you weren't. Apparently you
meant to talk about splitting a source file, but you had difficulty
expressing it.

If you have something to say about splitting a C translation unit
(something I don't think I've ever had a need to do), perhaps because
you've had difficulties doing so yourself, feel free to elaborate.

You've never had a module - sorry source file - sorry 'translation
unit' get too big for one file?

But you will surely know everything that might need doing if such a
file needs splitting into two or more files.

My point had been that in my module scheme, it would be less work.

Good for you.

So you don't have a problem you're trying to solve, and you don't
want advice about how to do something.

Question: Does C, as you claim, "pretend to be a safe language"?
Can you cite a source to support that claim? If you were willing
to read the C standard, I'd refer you to first few paragraphs of
Annex K, introduced in C11.

You made a false statement. I've made plenty of mistakes here
myself. Acknowledging them would substantially increase your
credibility.

You know what, if all possible answers to all C-related questions were
contained within the C standard, why does this group even exist?

Just post a link to the standard document and be done with it.

This group exists because not all possible answers to all C-related
questions are contained within the C standard. You pretend that
someone has made such a ridiculous claim, but unless I missed
something nobody has.

I'll try this again. You claimed that C "pretends to be a safe
language". That was a false claim. Will you either provide evidence
that it was correct or acknowledge that it was incorrect?

It happens that the first few paragraphs of Annex K are relevant
to your statement. If you inferred from that remark that I think
"all possible answers to all C-related questions were contained
within the C standard", that was a very wrong and silly inference.

I expect that you will refuse yet again to respond, but I'm prepared to
be pleasantly surprised.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Sun May 10 04:15:07 2026

From Newsgroup: comp.lang.c

In article <10tntu0$3r6q3$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 09/05/2026 17:38, David Brown wrote:

On 09/05/2026 18:16, Bart wrote:

(You can also see from this that /nobody/ likes stdint.h types, even
though standardised from C99 which also introduced 'long long' used
here. That is another bugbear. Oh, I forgot, my criticism is not
valid.)

Don't you realise that when you write things like that, you are only
demonstrating why so many people do not take you seriously? Have you
checked with every C programmer, and every person writing systems that
generate C code, and checked that none of them like the <stdint.h>
types? No? I thought not.

So, what's the figure?

What does it matter?

I see this pattern frequently (sometimes every other project seemingly)
so they are unpopular for some. And we don't know if people using
uint8_t etc are doing so because they genuinely like it or feel obliged
to use it.

...so?

(Tim Rentsch also seems to avoid it here.)

...and?

That is one person. He has his own preferences. I don't know
whether or not he likes <stdint.h>, but what bearing does that
have on anything?

Actually, come to think of it, I've never seen Tim Rentsch post
about hamsters here, either. Clearly, that must mean that no
one in the world likes those absurdly cute little rodents.

Perhaps try asking why somebody would invent a new type name for uint8_t
at all.

I can think of a few reasons why someone might do that. Most of
them come down to subjective preference. So what?

Some people use them extensively. Some people have little use for
size-specific types. Some people want size-specific types, but for some
reason (good or bad) want to use C90 rather than C99. Some people like
the <stdint.h> types but for some reason (good or bad) are unable to use
them in certain cases. Some people dislike the <stdint.h> types, but
use them anyway.

So they can be problematic. And they are optional which is another matter.

*sigh*

Your language would not be a good fit, because it is a home-made
personal language with no traction.

I'm not suggesting take up of it. The point is that it is in that same
category.

Why would it? C compilers are ubiquitous.

For the major platforms, so are compilers for dozens of languages.

Almost invariably, C is the first language to be targeted for compilers
for a platform. It does not matter whether you like that or not, it is
a fact.

If are looking for a HLL language to target for a new language, this is
not going to be a brand-new platform.

It will be an established one with lots of choices.

I am struggling to see a point to this whole line of
discussion.

Earlier it seemed like you were somehow trying to say that this
notion that one can target any number of languages as
(effectively) an IR for the output of a compiler somehow had
something to do with C, particularly as one seldom sees C target
another language in a similar manner.

Now it just seems like you're just ... saying words. Though for
what reason, I cannot fathom.

Basically, it would mean addition between pointers and integers would
not be commutative: P + i, but not i + P.

You will either agree with this or not. But I can't see that it
requires any deep knowledge of the standard to make such a proposal,
or why somebody would require that of me in order to even consider it.

An opinion about preferences for a particular piece of syntax does not
need deep knowledge beyond that bit of code. An opinion on whether it
would be a good idea to change the standard to fit that preference, or
on what other peoples' preferences might be, or any unexpected
consequences or impacts of such a change - /that/ requires a deep
knowledge.

Well I made that change and the first app I tried failed because it relied
on 'i + P', if not 'A[i]', but C doesn't allow you to separate those.

This is getting silly.

You're talking about a hypothetical change to C in which the
abstruse quirk that allows one to write A[i] as i[A] is removed.
Ok; as a thought experiment, one can conceive of such a change.

But now you are asserting that this change must necessarily
break the existing commutative properties on pointer arithmetic,
justified by saying, "C doesn't allow you to separate those."

But what you are describing is an imagined change to the rules
of C; that is, of what C allows. If one wants to entertain the
idea of changing *that* language rule, then I see no reason why
one can not entertain simultaneously changing things so that
commutativity of pointer arithmetic is preserved. That that is
not done now is fine; we're talking about changing how things
are done right now anyway.

The next two were OK, but the fourth also used it:

add32le(p + 2, x + s1->plt->data - p);

(From Tiny C sources.) So it looks like 'i + P' is already too widespread
even to deprecate it.

I see no reason why you have to deprecate that. True, it is not
how the language is defined today, but you're talking about a
hypothetical.

It would have needed to be banned from the start. Then that line would
simply have been written as:

add32le(p + 2, s1->plt->data - p + x);

At least, I made the change and tested it on real programs.

You made _a_ change. If anything, the issues you uncovered
perhaps highlight that changing the language can have unintended
consequences, and should be done with care, from a place of deep
understanding of the language. Indeed, that understanding is
a prerequisite for not making a complete hash of things.

- Dan C.

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Sun May 10 06:19:16 2026

From Newsgroup: comp.lang.c

On 2026-05-10 03:35, Dan Cross wrote:

In article <10tnmk6$3os5b$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

[...]

Suppose I proposed for example that C should deprecate, then ban, the
ability to write:

A[i]
B[i][j]

respectively as:

i[A]
j[i[B]]

(The last one is a little mind-blowing, as it turns one 2D array access
(two consecutive 1D accesses) into two /nested/ 1D accesses.)

I haven't really given it much thought. This is an historical
artifact that came from B via "nb", where pointers were denoted
as `A[]`, so A[i] = A + i = i + A = i[A] because arithmetic in
the integers is commutative, and B was word-oriented.

It's a cute parlor trick, surprising to a few who haven't looked
closely at the history, but no deep mystery. I would not mourn
if it were removed from the language.

https://www.nokia.com/bell-labs/about/dennis-m-ritchie/chist.html

Basically, it would mean addition between pointers and integers would
not be commutative: P + i, but not i + P.

No, it would not mean that. It would merely mean that the
syntax for array accesses was divorced from its early history.

You will either agree with this or not. But I can't see that it requires
any deep knowledge of the standard to make such a proposal, or why
somebody would require that of me in order to even consider it.

In observing your behavior, this fits the pattern of being about
the place where your argument breaks down. Chesterton's fence
applies, of course, but I do not think you are wrong to
question whether that surprising syntax should endure. But it
is your conclusion about commutativity of pointer arithmetic
that fails. You go from something that is, at least, open to
reasonable debate, and draw a specious conclusion that you then
assert as fact.

There's a truth in Chesterton's Fence. But I have my doubts when
applied for the case here; basically asking Bart to inform himself
about the original rationales for that option/rule or "fence". It
would be fine if we'd have a source documenting specifically this
(and all debatable) feature's rationale. It could be that there is
somewhere such a (list of) rationale(s) - I'm not aware of one. It
could of course also be just an obvious property that most people
just recognize. In that light I wouldn't expect anyone criticizing
the 'P+i' symmetry to make any research in the wild first. A
formulated question or criticism (that I'd consider valid) is fine
per se. A rationale might not even exist; design may have been by
inheriting other bad (or deliberate primitive) design, it may have
been just a stupid idea, a personal preference, whatever - usually
we don't know. If I come from other languages - and not inferring
the 'int' commutativity to the case here - I'd see two different
types, a pointer and an integer, and I see an _asymmetric_ (non-
commutative) operator in between; if the operator would have been
'@' instead, say p @ i, the presumed commutativity would be less
obtrusive.

In short; what I was trying to say is that Chesterton's Fence is
probably a disproportionate, if not badly fitting, cannon.

Personally I'm not much surprised any more by "C" design-decisions.
We can observe the enduring, often hard (and sometimes impossible),
way to fix or enhance quality of legacy issues by the standards'
evolution. And until something gets available we use "C" as it is.
I'm writing my "C" code, if possible, without kludges or deliberate
swap of pointer and integer as shown above; that symmetry doesn't
hinder me.

Janis

[...]


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Sun May 10 06:39:55 2026

From Newsgroup: comp.lang.c

On 2026-05-10 01:16, Bart wrote:

On 09/05/2026 23:25, Janis Papanagnou wrote:

On 2026-05-09 14:10, Bart wrote:

(My language is also a personal endeavour, but I'm not inflicting it
on the world, just sharing some ideas.)

...ideas you borrowed from other languages and just assembled
them - as it seems, per design principle, arbitrarily - to your
personal liking.

This sounds like a putdown.

Since you appear to have suggested in your previous post that you're
sharing primarily the ideas and given that I don't see any idea that
you'd have actually invented, and recognizing that you always only
ever spoke about "your languages" in all your posts; yes, this is a
clear putdown of your achievements concerning substantial, novel
ideas.

[...]

I looked through mine, and I've identified a dozen or more features that
are either novel (at least I hadn't seen them elsewhere), or adapted in
a different way.

I'm not interested in "adaptions", but feel free to post novel ideas
you developed; yet I haven't seen any, and I'm honestly interested in
new ideas.

[...]

So, where's /your/ language? [...]

What makes you think that I'd need to write an own language given that
there's a plethora of languages of all kinds and paradigms existing.

You seem to think that the crude mindset you expose, to write an own
language because you don't like specific aspects of the existing ones,
would be a normal ubiquitous mental stance? - No, it is not. It's sick!

[...]

Programming in Ada is like doing so with one hand tied behind your back.

I suppose you prefer the "freedom" of assembly.

I hate assembly. I prefer a HLL a couple of steps up. C could have been
that language, but I got spoiled by a decade of using my private one.

Then I don't understand what you meant w.r.t. Ada and the "tied hand".

Janis


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Sun May 10 07:00:40 2026

From Newsgroup: comp.lang.c

On 2026-05-10 01:45, Bart wrote:

On 09/05/2026 23:47, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Now look at what's involved in splitting a C module into two.

C doesn't have "modules".

You want to be /that/ pedantic?

This is exactly why I said that the C standard is your thing. If
somebody uses a term that doesn't appear in the standard, then it
doesn't exist.

I suppose the word you wanted to use is "translation unit".

Mind that I'm not familiar with the "C Standard"; I know that term
only from listening to the discussions here. You have been in this
newsgroup much longer than I have, so I'd have expected that you'd
(meanwhile) know terms like that. Especially given that you make so
many posts, let me ask you: do you read (and perceive) the posts
you are replying to? - Given your posting history my guess would be
that you don't really read them, or don't understand them, and probably
are not even interested in them.

[...]

[...]

That is a joke. Unix and C (and C compilers and libraries) are so
closely intertwined that you cannot separate them.

I'd say then that that gave C an unfair advantage.

It gave C an advantage. I don't know what you think is "unfair"
about it.

The context was why C became the dominant language for systems
programming. I offered that as an example. If it helped C over a
potential rival which wasn't used to implement a major OS, then it
strikes me as an unfair advantage.

Keith already said that it was an advantage. Insisting on an "unfair"
qualification is inappropriate, especially without ethical measure
and without any substantial evidence. (That wording reminds me of the
wording in the communication style of the current POTUS.)

Janis

[...]


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Sun May 10 07:09:11 2026

From Newsgroup: comp.lang.c

On 2026-05-10 03:26, Bart wrote:

On 10/05/2026 01:33, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:

So, what is involved in splitting a ... I don't even know what to call
it - a single .c 'source file'? Well, a lot of messy work.

Right, you don't know what to call it. I think the term you're
probably looking for is "translation unit".

A source file isn't a translation unit. [...]

It's not what I see when I look at file.c in my editor.

Oh, above you expressed a huge *uncertainty* when naming it
"source file". Now you really have had doubts about whether
the term "source file" is correct? - That's getting silly!

Janis


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Sun May 10 07:23:20 2026

From Newsgroup: comp.lang.c

On 2026-05-09 03:36, Dan Cross wrote:

In article <10tlvaa$1l93l$16@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-09 00:43, Dan Cross wrote:

[...]

Sure. This was a bit of a contrived example, but you ask a good
question: how often might one want to write code like that?

In short, I don't know, but I can think of any number of hash
functions, checksums, etc, that may be implemented using 16-bit
arithmetic, and I can well see programmers wanting to take
advantage of the modular semantics afforded by using unsigned
types to do so. Every day? Probably not. But often enough.

I mentioned it before but it may have got lost in the lots of text
typically exchanged here; for hash functions a modulus based on
powers of two has *bad* _distribution properties_, so it's not
a sensible example or plausible rationale to vindicate modular
arithmetic for the few special cases (m=8, 16, 32, 64, etc.).

Maybe, maybe not, depending on the exact hashing function and
the values it uses. Since K&R2 came up elsewhere, consider the
hash function presented on pp 128-129:

(I don't have that version available so the reference doesn't
help me much.)

/* hash: form hash value for string s */
unsigned hash(char *s)
{
    unsigned hashval;

    for (hashval = 0; *s != '\0'; s++)
        hashval = *s + 31 * hashval;

    return hashval % HASHSIZE;
}

The item in question would be 'HASHSIZE'. I cannot infer from
that code whether a prime, a CPU-wordsize, or something else
has been defined for that entity.

Seeking in my older K&R translation I found a similar (but not
the same, a more primitive) function that has a HASHSIZE of 100
defined. (Clearly not a good choice.)

I wrote about collisions in this function a long time ago: https://pub.gajendra.net/2012/09/notes_on_collisions_in_a_common_string_hashing_function

In this case, the important characteristic with respect to
distribution is that the multiplier is relatively prime to the
modulus. Their choice of multiplier is 31, which is a prime
number, and thus co-prime to every positive modulus that is not
itself a multiple of 31.

They happened to choose 101 (also prime) for `HASHSIZE` but
assuming reasonably random input, the pathological behavior you
are referring to would be avoided even if the modulus were (say)
128.
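To make it easy to try other moduli, here is a sketch of the same hash with the table size passed as a parameter. The name hash_mod and the extra parameter are my own additions, not K&R's, and the cast to unsigned char is a deliberate deviation so that the result does not depend on whether plain char is signed.

```c
#include <assert.h>

/*
 * K&R2-style string hash with the table size as a parameter
 * (hypothetical variant; the book uses a fixed HASHSIZE).
 */
unsigned hash_mod(const char *s, unsigned hashsize)
{
    unsigned hashval = 0;

    assert(hashsize != 0);          /* a zero modulus would divide by zero */
    for (; *s != '\0'; s++)
        hashval = (unsigned char)*s + 31u * hashval;

    return hashval % hashsize;
}
```

With this, hash_mod(key, 101) and hash_mod(key, 128) can be compared over whatever key set is of interest; the accumulation itself already wraps modulo UINT_MAX + 1 because hashval is unsigned.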

Since you know what you're doing in your context all is fine. :-)

Janis


From David Brown@david.brown@hesbynett.no to comp.lang.c on Sun May 10 11:29:32 2026

From Newsgroup: comp.lang.c

On 09/05/2026 20:20, Bart wrote:

On 09/05/2026 17:38, David Brown wrote:

On 09/05/2026 18:16, Bart wrote:

(You can also see from this that /nobody/ likes stdint.h types, even
though standardised from C99 which also introduced 'long long' used
here. That is another bugbear. Oh, I forgot, my criticism is not
valid.)

Don't you realise that when you write things like that, you are only
demonstrating why so many people do not take you seriously? Have you
checked with every C programmer, and every person writing systems that
generate C code, and checked that none of them like the <stdint.h>
types? No? I thought not.

So, what's the figure?

I am not claiming figures - I /know/ I don't know. The point is, I also
know that /you/ don't know.

I see this pattern frequently (sometimes every other project seemingly)
so they are unpopular for some. And we don't know if people using
uint8_t etc are doing so because they genuinely like it or feel obliged
to use it.

We all know that some people don't use them - even in situations where
they could be useful.

C90 did not have <stdint.h>. So projects that were started before
widespread support for C99 would necessarily have a different solution
for size-specific integer types (assuming such types were useful for the
project). Future versions of these projects and other code that uses
such projects would be likely to continue usage of the non-standard
size-specific integer types for compatibility and consistency. This
tells us /nothing/ about whether or not anyone involved likes or
dislikes <stdint.h> types.

Outside of that, what do we actually know? I know that lots of people
use <stdint.h> types. I know that lots of people do not. I have no
idea of the likes or dislikes of any of them. In fact, I'd say that the
only data we can be sure about is that /you/ don't like them, and /I/ do
like them. So based on currently available survey data, 50% of C
programmers like <stdint.h> types. Or, since you don't actually do C
programming, 100% of surveyed C programmers who expressed an opinion
like <stdint.h> types. Feel free to provide more comprehensive data and
statistics, if you can back them up.

(Tim Rentsch also seems to avoid it here.)

Tim has his own opinions and preferences on many things. He may also
have reasons for writing code in a way that does not represent those
preferences. I've seen him post code using <stdint.h> types, and code
that does not use those types (I can't remember ever seeing him use a
size-specific type that was not a standard one, but I cannot rule out
that possibility). So I cannot see how you can conclude anything about
Tim's preferences on the matter - or why you could possibly think the
opinions of a single person are relevant to your claims.

Perhaps try asking why somebody would invent a new type name for uint8_t
at all.

I am not the one making unsubstantiated and wildly exaggerated claims.

Some people use them extensively. Some people have little use for
size-specific types. Some people want size-specific types, but for
some reason (good or bad) want to use C90 rather than C99. Some
people like the <stdint.h> types but for some reason (good or bad) are
unable to use them in certain cases. Some people dislike the
<stdint.h> types, but use them anyway.

So they can be problematic. And they are optional which is another matter.

I don't see why you think they are "problematic" - certainly that is not
something I said or implied.

The commonly used <stdint.h> types are not optional. An implementation
is required to provide types like int8_t and uint32_t for all bit sizes
for which it supports an appropriate standard or extended integer type.
Thus if an implementation has, for example, a 16-bit short int, then it
must provide int16_t and uint16_t. If it has a 24-bit int type, then it
must provide int24_t. And so on. The "least" and "fast" types of size
8, 16, 32 and 64 are not optional. Other sizes there /are/ optional -
otherwise the <stdint.h> header would be infinite in length and
implementations would have to support integers of every size conceivable.
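As a rough illustration of how code can probe which of these types an implementation provides, here is a small sketch. The helper names are invented for this example; the only thing it relies on is that each exact-width type comes with a limit macro such as INT24_MAX, which can be tested with #ifdef.

```c
#include <assert.h>
#include <stdint.h>

/* Count the value bits in an unsigned maximum such as UINT_LEAST16_MAX. */
unsigned width_of_max(uintmax_t max)
{
    unsigned bits = 0;

    while (max != 0) {
        bits++;
        max >>= 1;
    }
    return bits;
}

/* 1 if this implementation defines int24_t (detected via its limit macro). */
int has_int24(void)
{
#ifdef INT24_MAX
    return 1;
#else
    return 0;
#endif
}

/* The "least" types for 8, 16, 32 and 64 bits must always exist. */
void check_required_widths(void)
{
    assert(width_of_max(UINT_LEAST8_MAX)  >= 8);
    assert(width_of_max(UINT_LEAST16_MAX) >= 16);
    assert(width_of_max(UINT_LEAST32_MAX) >= 32);
    assert(width_of_max(UINT_LEAST64_MAX) >= 64);
}
```

On a typical desktop compiler has_int24() will most likely return 0, but that is a property of the implementation, not something the sketch assumes.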

Your language would not be a good fit, because it is a home-made
personal language with no traction.

I'm not suggesting take up of it. The point is that it is in that same category.

No, it is not. Being a simple relatively low-level language (assuming
that is correct) is not sufficient for the purpose.

Why would it? C compilers are ubiquitous.

For the major platforms, so are compilers for dozens of languages.

Almost invariably, C is the first language to be targeted for
compilers for a platform. It does not matter whether you like that or
not, it is a fact.

If you are looking for an HLL to target for a new language, this is
not going to be a brand-new platform.

It will be an established one with lots of choices.

You are arguing about hypotheticals. When new languages are made that
use an existing high-level language as a transpiler backend, that HLL is
most commonly C. While some effort has been made to establish re-usable
alternatives for the task (such as "C--"), no such effort has gained
significant traction. So despite its flaws for this purpose (real or
imagined), C is good enough that it is not worth the effort creating an
alternative, and other existing languages have more disadvantages.

Basically, it would mean addition between pointers and integers would
not be commutative: P + i, but not i + P.

You will either agree with this or not. But I can't see that it
requires any deep knowledge of the standard to make such a proposal,
or why somebody would require that of me in order to even consider it.

An opinion about preferences for a particular piece of syntax does not
need deep knowledge beyond that bit of code. An opinion on whether it
would be a good idea to change the standard to fit that preference, or
on what other peoples' preferences might be, or any unexpected
consequences or impacts of such a change - /that/ requires a deep
knowledge.

Well I made that change and the first app I tried failed because it relied
on 'i + P', if not 'A[i]', but C doesn't allow you to separate those.

The next two were OK, but the fourth also used it:

    add32le(p + 2, x + s1->plt->data - p);

(From Tiny C sources.) So it looks like use of 'i + P' is already too
widespread even to deprecate it.

It would have needed to be banned from the start. Then that line would
simply have been written as:

    add32le(p + 2, s1->plt->data - p + x);

At least, I made the change and tested it on real programs.

So you discovered that your knowledge was too superficial to give an
informed opinion, and after learning more, you discovered that something
you thought should "obviously" be changed in C, cannot be changed. I
guess that's progress!


From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Sun May 10 03:25:11 2026

From Newsgroup: comp.lang.c

David Brown <david.brown@hesbynett.no> writes:

On 09/05/2026 20:20, Bart wrote:

[...]

Well I made that change and the first app I tried failed because
it relied on 'i + P', if not 'A[i]', but C doesn't allow you to
separate those.
The next two were OK, but the fourth also used it:
    add32le(p + 2, x + s1->plt->data - p);
(From Tiny C sources.) So it looks like use of 'i + P' is already too
widespread even to deprecate it.
It would have needed to be banned from the start. Then that line
would simply have been written as:
    add32le(p + 2, s1->plt->data - p + x);
At least, I made the change and tested it on real programs.

So you discovered that your knowledge was too superficial to give an
informed opinion, and after learning more, you discovered that
something you thought should "obviously" be changed in C, cannot be
changed. I guess that's progress!

No, he made a change in his own compiler to forbid index[array]
and mistakenly thought that it should also forbid index+pointer.
Again, see the C2y draft that makes index[array] obsolescent without
touching pointer arithmetic.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From Bart@bc@freeuk.com to comp.lang.c on Sun May 10 11:49:36 2026

From Newsgroup: comp.lang.c

On 10/05/2026 02:35, Dan Cross wrote:

In article <10tnmk6$3os5b$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

One might find it silly to stop at a traffic signal in the
middle of the night, when it is easy to see there is no other
traffic and obviously no one else is nearby. But if you decide not
to stop, don't be upset when a cop pulls you over and gives you
a ticket.

Yes, and you don't need to be an expert in highways and traffic
management to be able to complain about it. Just a user.

Or maybe you are a visitor from another country (eg. Netherlands) where
they have a more sensible approach (for example, traffic lights switch
to flashing amber at night so that drivers can make their own decisions).

(I've just tried the above proposal in my C compiler. It took half a
minute to find where I had to comment out 4 lines to make it work.

And did you break commutative arithmetic on pointers when you
were at it?

Yes, that's exactly how it works, since A[i] is reduced to *(A + i). I
commented out the bit of code that swapped operands when the pointer was
on the right, as in (i + A).

As it happens, because this ability has been there a long time, some
programs use it, for example from sqlite:

nPage = nPageHeader = get4byte(28+(u8*)pPage1->aData);

That's not using subscripting at all.

In C, you can write i[A] /because/ it is exactly equivalent to *(i + A),
and pointer addition (between T* and int) is commutative.
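The sqlite-style expression and the string-literal trick can both be reproduced in a few lines. This is only a sketch: read_be32() is a stand-in written for this example, not sqlite's actual get4byte(), and field_at() and fourth_letter() are invented names.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a get4byte()-style helper: read a big-endian 32-bit value. */
uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Integer on the left, pointer on the right, and no subscript anywhere. */
uint32_t field_at(const void *data, unsigned offset)
{
    return read_be32(offset + (const uint8_t *)data);
}

/* The quirky subscript form: 3["ABCDEF"] is *(3 + "ABCDEF"). */
char fourth_letter(void)
{
    assert(3["ABCDEF"] == "ABCDEF"[3]);
    return 3["ABCDEF"];
}
```

The point of field_at() is that forbidding integer + pointer would break code like it even though it contains no i[A] at all, whereas forbidding only the subscript spelling would affect fourth_letter() alone.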

So this change is not going to happen, and people will continue writing
quirky things like 3["ABCDEF"] just for the hell of it.

This is the story of C.)

No, it's not.

This appears to be another of your misunderstandings.

What is the misunderstanding?

There are
reasons to dislike the semantic quirk of array subscripts
inherited from nb. But once again, your conclusion is specious.

There are cruder ways to stop people writing i[A] whilst still allowing
(i + P). But it would be more of a hack.

(In my languages, i + P is simply not allowed, while A[i] is not reduced
to pointer arithmetic at the AST level. For one thing, arrays have an
arbitrary lower bound so the mapping isn't as simple.)

Output from fbc64 -R hello.bas:
-----------------------------
[snip]
Looks like C to me!

Looks like a non-sequitur to me.

You suggested FBC didn't transpile to C. I actually tried it to find

- Dan C.


From Bart@bc@freeuk.com to comp.lang.c on Sun May 10 12:06:41 2026

From Newsgroup: comp.lang.c

On 10/05/2026 03:01, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:

My point had been that in my module scheme, it would be less work.

Good for you.

So you don't have a problem you're trying to solve, and you don't
want advice about how to do something.

You keep forgetting context. It was a throwaway remark in a brief
discussion WH and I were having about module schemes.

(And I still don't know what to call a 'primary source file'; that is,
one of these files:

gcc one.c two.c three.c

and not a .h file, or a .c or other file that is included indirectly.)

You know what, if all possible answers to all C-related questions were
contained within the C standard, why does this group even exist?

Just post a link to the standard document and be done with it.

This group exists because not all possible answers to all C-related
questions are contained within the C standard. You pretend that
someone has made such a ridiculous claim, but unless I missed
something nobody has.

I'll try this again. You claimed that C "pretends to be a safe
language". That was a false claim. Will you either provide evidence
that it was correct or acknowledge that it was incorrect?

It happens that the first few paragraphs of Annex K are relevant
to your statement. If you inferred from that remark that I think
"all possible answers to all C-related questions were contained
within the C standard", that was a very wrong and silly inference.

I expect that you will refuse yet again to respond, but I'm prepared to
be pleasantly surprised.

I've glanced at appendix K.1 and saw nothing relevant there. It's about
exceeding array bounds.

I am assuming that doing that would be UB.

My question was (it is always important to keep context!):

So, C can be unsafe even when you avoid all UB? Examples?

Really it comes down to what 'unsafe' means in a language, and in C,
whether it is tied to UB or can be more general.

But since 'unsafe' is not defined in the standard (not in N1570 anyway,
where it is used casually in only one instance), I expect you don't know,
and wouldn't want to speculate.

From Bart@bc@freeuk.com to comp.lang.c on Sun May 10 12:29:52 2026

From Newsgroup: comp.lang.c

On 10/05/2026 10:29, David Brown wrote:

On 09/05/2026 20:20, Bart wrote:

Well I made that change and the first app I tried failed because
it relied on 'i + P', if not 'A[i]', but C doesn't allow you to separate
those.

The next two were OK, but the fourth also used it:

    add32le(p + 2, x + s1->plt->data - p);

(From Tiny C sources.) So it looks like use of 'i + P' is already too
widespread even to deprecate it.

It would have needed to be banned from the start. Then that line would
simply have been written as:

    add32le(p + 2, s1->plt->data - p + x);

At least, I made the change and tested it on real programs.

So you discovered that your knowledge was too superficial to give an informed opinion,

So, what did I miss? And about what; the prevalence of i+P arithmetic in
C codebases? I suspect you didn't know that either.

and after learning more, you discovered that something
you thought should "obviously" be changed in C, cannot be changed. I
guess that's progress!

Well it /can/ be changed, but it would be too draconian when dealing
with legacy code.

It requires constructs like i[A] to be deprecated, while still allowing
i + A.

That is also possible, but is not as simple a change, since C currently
requires them to be interchangeable, and that is baked in to my compiler.

From Bart@bc@freeuk.com to comp.lang.c on Sun May 10 12:44:38 2026

From Newsgroup: comp.lang.c

On 10/05/2026 06:00, Janis Papanagnou wrote:

On 2026-05-10 01:45, Bart wrote:

On 09/05/2026 23:47, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Now look at what's involved in splitting a C module into two.

C doesn't have "modules".

You want to be /that/ pedantic?

This is exactly why I said that the C standard is your thing. If
somebody uses a term that doesn't appear in the standard, then it
doesn't exist.

I suppose the word you wanted to use is "translation unit".

No. That is a technical term used within the C standard and relates to a
subsequent representation of your source code within a compiler.

It is also C-specific. What is the generic term for one of the discrete
source files of a program?

Mind that I'm not familiar with the "C Standard"; I know that term
only from listening to the discussions here. You have been in this
newsgroup much longer than I have, so I'd have expected that you'd
(meanwhile) know terms like that. Especially given that you make so
many posts, let me ask you: do you read (and perceive) the posts
you are replying to? - Given your posting history my guess would be
that you don't really read them, or don't understand them, and probably
are not even interested in them.

Do people understand mine? 90% of my posts are about defending my
position especially when attacked on multiple fronts.

I can say something and immediately I get attacked and accused of not
knowing this or that, by people who get the wrong end of the stick or
pick up on a choice of word I used.

The context was why C became the dominant language for systems
programming. I offered that as an example. If it helped C over a
potential rival which wasn't used to implement a major OS, then it
strikes me as an unfair advantage.

Keith already said that it was an advantage. Insisting on an "unfair"
qualification is inappropriate, especially without ethical measure
and without any substantial evidence. (That wording reminds me of the
wording in the communication style of the current POTUS.)

Hmm, weren't Microsoft accused of unfair practice by bundling their
browsers with Windows?


From Bart@bc@freeuk.com to comp.lang.c on Sun May 10 13:22:11 2026

From Newsgroup: comp.lang.c

On 10/05/2026 05:39, Janis Papanagnou wrote:

On 2026-05-10 01:16, Bart wrote:

On 09/05/2026 23:25, Janis Papanagnou wrote:

On 2026-05-09 14:10, Bart wrote:

(My language is also a personal endeavour, but I'm not inflicting it
on the world, just sharing some ideas.)

...ideas you borrowed from other languages and just assembled
them - as it seems, per design principle, arbitrarily - to your
personal liking.

This sounds like a putdown.

Since you appear to have suggested in your previous post that you're
sharing primarily the ideas and given that I don't see any idea that
you'd have actually invented, and recognizing that you always only
ever spoke about "your languages" in all your posts; yes, this is a
clear putdown of your achievements concerning substantial, novel
ideas.

Originally my language was created to run on a bare board with very
little memory and no existing software /at all/, not even an assembler.

/You/ try it.

But even in that primitive early language, I could write:

print x

That was the start.

[...]

I looked through mine, and I've identified a dozen or more features
that are either novel (at least I hadn't seen them elsewhere), or
adapted in a different way.

I'm not interested in "adaptions", but feel free to post novel ideas
you developed; yet I haven't seen any, and I'm honestly interested in
new ideas.

This is just trolling? You ask me to post some ideas in good faith, but
have already decided in advance you're going to shoot them down,
whatever they are? Because I can't imagine you ever saying anything
positive about my stuff!

OK, I'll bite! I've long used embedding of text and binary files, for
example to incorporate system libs, C headers etc into my products to
make them self-contained.

It's not a new idea, and C will have #embed /any time now/, but it was
executed sweetly and very efficiently.

But tie it in with this one:

c:\mx>mm -ma mm
Compiling mm.m to mm.ma

This is an unusual /built-in/ feature where this /whole-program/
compiler takes all the source and support files and creates a single
amalgamated file, 'mm.ma'.

I could compile that directly without needing to unpack the files:

mm mm.ma

So it is great for distributing applications: you only need two files,
mm.exe and source code mm.ma.

But let's now combine the two features: I can embed the source code of
the compiler into itself, and add an option to the compiler to print it
out using:

when sources_sw then
println strinclude("mm.ma")
stop

Now, anyone who has the compiler binary, can obtain its full source code:

c:\mx>mm -sources >sources.ma

c:\mx>dir sources.ma
10/05/2026 13:10 614,945 sources.ma

Now for all the flak...

[...]

So, where's /your/ language? [...]

What makes you think that I'd need to write an own language given that
there's a plethora of languages of all kinds and paradigms existing.

So where's the one that works like mine?

And why are there so many new ones still appearing? Most of them you
will not know about.

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Sun May 10 12:37:48 2026

From Newsgroup: comp.lang.c

In article <10tp4o8$1l93k$7@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-09 03:36, Dan Cross wrote:

In article <10tlvaa$1l93l$16@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-09 00:43, Dan Cross wrote:

[...]

Sure. This was a bit of a contrived example, but you ask a good
question: how often might one want to write code like that?

In short, I don't know, but I can think of any number of hash
functions, checksums, etc, that may be implemented using 16-bit
arithmetic, and I can well see programmers wanting to take
advantage of the modular semantics afforded by using unsigned
types to do so. Every day? Probably not. But often enough.

I mentioned it before but it may have got lost in the lots of text
typically exchanged here; for hash functions a modulus based on
powers of two has *bad* _distribution properties_, so it's not
a sensible example or plausible rationale to vindicate modular
arithmetic for the few special cases (m=8, 16, 32, 64, etc.).

Maybe, maybe not, depending on the exact hashing function and
the values it uses. Since K&R2 came up elsewhere, consider the
hash function presented on pp 128-129:

(I don't have that version available so the reference doesn't
help me much.)

I mean, I gave you the function; you quoted it. :-)

/* hash: form hash value for string s */
unsigned hash(char *s)
{
    unsigned hashval;

    for (hashval = 0; *s != '\0'; s++)
        hashval = *s + 31 * hashval;

    return hashval % HASHSIZE;
}

The item in question would be 'HASHSIZE'. I cannot infer from
that code whether a prime, a CPU-wordsize, or something else
has been defined for that entity.

I told you what it was (101). It was in the text you quoted.

Seeking in my older K&R translation I found a similar (but not
the same, a more primitive) function that has a HASHSIZE of 100
defined. (Clearly not a good choice.)

Actually, it's fine; that's the point: as long as the multiplier
is coprime to the modulus, distribution is ok for sufficiently
random data.
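One way to see why coprimality is the property that matters: multiplying by a fixed value modulo m is a permutation of the residues 0..m-1 exactly when that value is coprime to m, so no bucket becomes unreachable from the multiply step. The brute-force check below is a sketch with a made-up function name, and it only examines the multiply step, not the full string hash.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Returns 1 if x -> (mult * x) % m hits every residue in 0..m-1 exactly
 * once, 0 if some residues are never produced, -1 on allocation failure.
 */
int multiplier_permutes_residues(unsigned mult, unsigned m)
{
    unsigned char *seen;
    unsigned x, hits = 0;

    assert(m != 0);
    seen = calloc(m, 1);
    if (seen == NULL)
        return -1;

    for (x = 0; x < m; x++) {
        /* widen before multiplying so the product cannot wrap */
        unsigned r = (unsigned)(((unsigned long long)mult * x) % m);
        if (!seen[r]) {
            seen[r] = 1;
            hits++;
        }
    }
    free(seen);
    return hits == m;
}
```

For a multiplier of 31 the check passes for moduli 100, 101 and 128 alike, and fails for a modulus such as 62 that shares the factor 31.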

- Dan C.


From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Sun May 10 12:39:50 2026

From Newsgroup: comp.lang.c

In article <10tpq7e$a6kp$3@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 10/05/2026 10:29, David Brown wrote:

On 09/05/2026 20:20, Bart wrote:

Well I made that change and the first app I tried failed because
it relied on 'i + P', if not 'A[i]', but C doesn't allow you to separate
those.

The next two were OK, but the fourth also used it:

    add32le(p + 2, x + s1->plt->data - p);

(From Tiny C sources.) So it looks like use of 'i + P' is already too
widespread even to deprecate it.

It would have needed to be banned from the start. Then that line would
simply have been written as:

    add32le(p + 2, s1->plt->data - p + x);

At least, I made the change and tested it on real programs.

So you discovered that your knowledge was too superficial to give an
informed opinion,

So, what did I miss? And about what; the prevalence of i+P arithmetic in
C codebases? I suspect you didn't know that either.

Apparently, you missed the changes afoot in the committee to do
exactly what everyone has been telling you: deprecate `i[A]` but
preserve `i + A`.

and after learning more, you discovered that something
you thought should "obviously" be changed in C, cannot be changed. I
guess that's progress!

Well it /can/ be changed, but it would be too draconian when dealing
with legacy code.

It requires constructs like i[A] to be deprecated, while still allowing
i + A.

How is that draconian?

That is also possible, but is not as simple a change, since C currently
requires them to be interchangeable, and that is baked in to my compiler.

Sounds like a problem for you and your compiler.

- Dan C.


From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Sun May 10 12:52:43 2026

From Newsgroup: comp.lang.c

In article <10tp104$1l93l$20@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-10 03:35, Dan Cross wrote:

[snip]
In observing your behavior, this fits the pattern of being about
the place where your argument breaks down. Chesterton's fence
applies, of course, but I do not think you are wrong to
question whether that surprising syntax should endure. But it
is your conclusion about commutativity of pointer arithmetic
that fails. You go from something that is, at least, open to
reasonable debate, and draw a specious conclusion that you then
assert as fact.

There's a truth in Chesterton's Fence. But I have my doubts when
applied for the case here; basically asking Bart to inform himself
about the original rationales for that option/rule or "fence".

My point is that one should seek to understand _why_ something
is the way that it is before unilaterally deciding that changing
it is the best thing to do.

In this case, I gave him a reference to the paper Dennis Ritchie
gave on the history of C, which discusses the origin: in "nb",
one declared a pointer to e.g. `int` as `int ptr[];`. In B, all
of the addressable memory address space was considered an
"array"; the square bracket subscript notation was merely
syntactic sugar. The "Critique" section of that paper also has
something to say on this notion:

|Moreover, some rules designed to ease early transitions
|contributed to later confusion. For example, the empty square
|brackets in the function declaration
|
| int f(a) int a[]; { ... }
|
|are a living fossil, a remnant of NB's way of declaring a
|pointer; a is, in this special case only, interpreted in C as a
|pointer. The notation survived in part for the sake of
|compatibility, in part under the rationalization that it would
|allow programmers to communicate to their readers an intent to
|pass f a pointer generated from an array, rather than a
|reference to a single integer. Unfortunately, it serves as
|much to confuse the learner as to alert the reader.
(https://www.nokia.com/bell-labs/about/dennis-m-ritchie/chist.html)

I think one can take from this that the designers of the
language regarded this as something of an embarrassing accident.
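
The effect Ritchie describes is easy to see in present-day C. The sketch below uses made-up names (`second_element`, `second_element_ptr`, `sample`) and prototype-style definitions rather than the old-style one in the quote. It shows that a parameter written with empty brackets behaves as a pointer: it can be incremented, which a true array cannot.

```c
/* Array syntax in a parameter: C adjusts the type of a to int *. */
int second_element(int a[])
{
    a++;        /* legal only because a is a pointer, not an array */
    return *a;
}

/* The same function written with explicit pointer syntax. */
int second_element_ptr(int *a)
{
    return *(a + 1);
}

/* Sample data for trying the two functions (hypothetical). */
int sample[3] = { 10, 20, 30 };
```

Both `second_element(sample)` and `second_element_ptr(sample)` return 20. The two declarations describe the same function type.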

- Dan C.


From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Sun May 10 12:59:33 2026

From Newsgroup: comp.lang.c

In article <10tpnru$a6kp$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 10/05/2026 02:35, Dan Cross wrote:

In article <10tnmk6$3os5b$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

[snip]
(I've just tried the above proposal in my C compiler. It took half a
minute to find where I had to comment out 4 lines to make it work.

And did you break commutative arithmetic on pointers when you
were at it?

Yes,

I was asking rhetorically.

that's exactly how it works, since A[i] is reduced to *(A + i).

Yes, we're all aware of this.

The point is that you are proposing a change but then declaring
it impossible because you don't want to make another, related
change.

I
commented out the bit of code that swapped operands when the pointer was
on the right, as in (i + A).

I suppose one needs to explicitly say that that is probably not
the only way to express the desired change.

As it happens, because this ability has been there a long time, some
programs use it, for example from sqlite:

nPage = nPageHeader = get4byte(28+(u8*)pPage1->aData);

That's not using subscripting at all.

In C, you can write i[A] /because/ it is exactly equivalent to *(i + A),
and pointer addition (between T* and int) is commutative.

We know.

So this change is not going to happen, and people will continue writing
quirky things like 3["ABCDEF"] just for the hell of it.

This is the story of C.)

No, it's not.

This appears to be another of your misunderstandings.

What is the misunderstanding?

That you can change two things at one time.

There are
reasons to dislike the semantic quirk of array subscripts
inherited from nb. But once again, your conclusion is specious.

There are cruder ways to stop people writing i[A] whilst still allowing
(i + P). But it would be more of a hack.

Perhaps in your compiler, you might consider it so.

(In my languages, i + P is simply not allowed, while A[i] is not reduced
to pointer arithmetic at the AST level. For one thing, arrays have an
arbitrary lower bound so the mapping isn't as simple.)

We're talking about C, not your language.

Output from fbc64 -R hello.bas:
-----------------------------
[snip]
Looks like C to me!

Looks like a non-sequitur to me.

You suggested FBC didn't transpile to C. I actually tried it to find

I suggested it was irrelevant. I honestly don't care what it
generates. I fail to see how that is at all related to C, other
than FBC (apparently) using C as an intermediate representation.

- Dan C.


From David Brown@david.brown@hesbynett.no to comp.lang.c on Sun May 10 15:03:05 2026

From Newsgroup: comp.lang.c

On 10/05/2026 13:29, Bart wrote:

On 10/05/2026 10:29, David Brown wrote:

On 09/05/2026 20:20, Bart wrote:

Well I made that change and the first app I tried failed because it
relied on 'i + P', if not 'A[i]', but C doesn't allow you to separate
those.

The next two were OK, but the fourth also used it:

add32le(p + 2, x + s1->plt->data - p);

(From Tiny C sources.) So it looks like use of 'i + P' is already too
widespread even to deprecate it.

It would have needed to be banned from the start. Then that line
would simply have been written as:

add32le(p + 2, s1->plt->data - p + x);

At least, I made the change and tested it on real programs.

So you discovered that your knowledge was too superficial to give an
informed opinion,

So, what did I miss? And about what; the prevalence of i+P arithmetic in
C codebases?

Yes - as well as how often i[A] is used in real code rather than what we
both think is a more "natural" order A[i].

I suspect you didn't know that either.

No, I have no idea how prevalent it is in real-world code. That's why I
said I didn't know, and why I can't say I think it would make sense to
change the C standards to disallow "i[A]" or "i + p" expressions. All I
can say is that I don't like that order of expressions.

Do you see the difference? I can have opinions on the syntax of C - so
can you. We can have /informed/ opinions on an aspect of the syntax if
we understand that particular bit of the language. But any opinions we
might have on whether or not it is a good idea or practical to change
the standards and/or the behaviour of real compilers need knowledge of
real-world usage, and an in-depth knowledge of the rest of the language
and standards in order to judge the impact. I am not saying /I/ have
the in-depth knowledge required to give a good argument for changing the
standards here - I am merely saying that /you/ don't have that knowledge.

and after learning more, you discovered that something you thought
should "obviously" be changed in C, cannot be changed. I guess that's
progress!

Well it /can/ be changed, but it would be too draconian when dealing
with legacy code.

It requires constructs like i[A] to be deprecated, while still allowing
i + A.

That is also possible, but is not as simple a change, since C currently requires them to be interchangeable, and that is baked in to my compiler.

Not only do you not have the knowledge required to give an informed
opinion about making this particular change to the standards, you don't
have the knowledge required to give an informed opinion about making
/any/ changes to the standard, the C language, or implementations.

This is not like making changes to your personal little languages or
your toy C compiler.


From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c,comp.lang.misc on Sun May 10 13:05:35 2026

From Newsgroup: comp.lang.c

In article <10tpt9j$c3i4$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 10/05/2026 05:39, Janis Papanagnou wrote:

[snip]
What makes you think that I'd need to write an own language given that
there's a plethora of languages of all kinds and paradigms existing.

So where's the one that works like mine?

I mean, Rust does exactly what you were just describing.

And why are there so many new ones still appearing? Most of them you
will not know about.

Consider the possibility that you may be unique in the world in
possessing the combination of requirements and aesthetic
judgement that makes you feel you need a language like yours.

As for new languages, there are a number of reasons. Most of
them are not particularly relevant here.

At this point, you may consider doing what Keith suggested, and
moving further discussion of your language to comp.lang.misc.
Here, I'll start by cross-posting to that group for you.

- Dan C.


From Bart@bc@freeuk.com to comp.lang.c on Sun May 10 14:38:59 2026

From Newsgroup: comp.lang.c

On 10/05/2026 14:03, David Brown wrote:

On 10/05/2026 13:29, Bart wrote:

I am not saying /I/ have
the in-depth knowledge required to give a good argument for changing the standards here - I am merely saying that /you/ don't have that knowledge.

So, this is a mystery: I am at fault for not knowing X, but you are not at
fault for not knowing X?!

This particular thing was just some simple example I'd thought up:

Suppose I proposed for example that C should deprecate, then ban, the ability to write:

...

But I can't see that it requires any deep knowledge of the standard to
make such a proposal, or why somebody would require that of me in order
to even consider it.

Now it turns out that the C committee is actually looking at such a
proposal. But funnily enough, no one has given me credit for that.

It requires constructs like i[A] to be deprecated, while still
allowing i + A.

That is also possible, but is not as simple a change, since C
currently requires them to be interchangeable, and that is baked in to
my compiler.

Not only do you not have the knowledge required to give an informed
opinion about making this particular change to the standards, you don't
have the knowledge required to give an informed opinion about making
/any/ changes to the standard, the C language, or implementations.

This is not like making changes to your personal little languages or
your toy C compiler.

Why, what's the difference? At least I attempted to make the change to
see what would happen, and I tried it out on some real non-toy code-bases.

My only mistake was thinking that C REQUIRED indexing syntax to be tied
to pointer arithmetic, but as far as I know, it currently does do that,
and will do so for some years yet.

But if we're allowed to separate them, then OK I'll have another go at
my compiler. It turns out to be even simpler: I had to modify three lines
of code.

Now P+i and i+P are still allowed, but not i[A], only A[i]. All the
tests I tried before now still work.
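
To make the distinction concrete, here is a small sketch of the forms involved. The function names and the `letters` array are illustrative only. Everything below is valid in C as currently standardized; only the last accessor uses the reversed subscript that such a change would reject. The two `offset_*` functions mirror the shape of the Tiny C line quoted earlier in the thread.

```c
#include <stddef.h>

const char letters[] = "ABCDEF";

/* Conventional subscript: A[i] */
char by_subscript(size_t i)    { return letters[i]; }

/* Pointer on the left: *(P + i) */
char by_ptr_plus_int(size_t i) { return *(letters + i); }

/* Integer on the left: *(i + P); relies on pointer + integer commuting */
char by_int_plus_ptr(size_t i) { return *(i + letters); }

/* Reversed subscript: i[A]; valid today, the form a ban would target */
char by_reversed(size_t i)     { return i[letters]; }

/* Integer first, as in "x + s1->plt->data - p": parsed as (x + base) - p.
   base and p must point into the same array. */
ptrdiff_t offset_int_first(const char *base, const char *p, int x)
{
    return x + base - p;
}

/* Pointer first, as in "s1->plt->data - p + x": parsed as (base - p) + x. */
ptrdiff_t offset_ptr_first(const char *base, const char *p, int x)
{
    return base - p + x;
}
```

All four accessors return 'D' for an index of 3, and the two offset functions agree whenever both intermediate pointers stay within the same array.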

So my toy compiler implements part of C2y!

The interesting thing is that to achieve it, I had to ignore my
knowledge of the current C standard (specifically 6.5.2.1p2).

SO *NOW* TELL ME WHAT I DID WRONG.

It seems to me that you guys just want to constantly pick on me for specious reasons.

From Bart@bc@freeuk.com to comp.lang.c on Sun May 10 14:51:43 2026

From Newsgroup: comp.lang.c

On 10/05/2026 13:39, Dan Cross wrote:

In article <10tpq7e$a6kp$3@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 10/05/2026 10:29, David Brown wrote:

On 09/05/2026 20:20, Bart wrote:

Well I made that change and the first app I tried failed because it
relied on 'i + P', if not 'A[i]', but C doesn't allow you to separate
those.

The next two were OK, but the fourth also used it:

add32le(p + 2, x + s1->plt->data - p);

(From Tiny C sources.) So it looks like use of 'i + P' is already too
widespread even to deprecate it.

It would have needed to be banned from the start. Then that line would
simply have been written as:

add32le(p + 2, s1->plt->data - p + x);

At least, I made the change and tested it on real programs.

So you discovered that your knowledge was too superficial to give an
informed opinion,

So, what did I miss? And about what; the prevalence of i+P arithmetic in
C codebases? I suspect you didn't know that either.

Apparently, you missed the changes afoot in the committee to do
exactly what everyone has been telling you: deprecate `i[A]` but
preserve `i + A`.

The current standard says that those have to be tied together: 6.5.2.1p2.

However, I also, independently (and off the top of my head), came up
with a proposal that is being actually considered.

Well done, Bart!

Oh, hang on, EVERY SINGLE THING I SAY AND DO HERE IS WRONG. I forgot
that part.

and after learning more, you discovered that something
you thought should "obviously" be changed in C, cannot be changed. I
guess that's progress!

Well it /can/ be changed, but it would be too draconian when dealing
with legacy code.

It requires constructs like i[A] to be deprecated, while still allowing
i + A.

How is that draconian?

If implemented by removing pointer+int commutativity, too many programs
would fail. My first attempt did that since I wanted to honour 6.5.2.1p2.

If I broke 6.5.2.1p2, then it was more successful. Programs using i[A]
are much rarer than those using i+P.

That is also possible, but is not as simple a change, since C currently
requires them to be interchangeable, and that is baked in to my compiler.

Sounds like a problem for you and your compiler.

Always with the positives! You just have to keep bullying don't you.

It turns out that disallowing i[A] while keeping i+P was even simpler
because of the way my compiler works.


From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Sun May 10 14:34:10 2026

From Newsgroup: comp.lang.c

In article <10togv8$b63$2@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 09/05/2026 23:47, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Now look at what's involved in splitting a C module into two.

C doesn't have "modules".

You want to be /that/ pedantic?

This is exactly why I said that the C standard is your thing. If
somebody uses a term that doesn't appear in the standard, then it
doesn't exist.

So, what is involved in splitting a ... I don't even know what to call
it - a single .c 'source file'? Well, a lot of messy work.

I move C code between source files pretty regularly.

It is not terribly difficult, if you know how to use a text
editor and some basic tools.

[snip]
The context was why C became the dominant language for systems
programming. I offered that as an example. If it helped C over a
potential rival which wasn't used to implement a major OS, then it
strikes me as an unfair advantage.

Yes, C became popular in large part through the spread of Unix.
Advantage? Yes. Unfair? That's a weird way to look at it.

C does not pretend to be a "safe language".

So, C can be unsafe even when you avoid all UB? Examples?

It depends on your definition of "safety". You appear to be
taking a very narrow view of it.

I suppose this depends on what you mean by unsafe. Take this:

m = monthnames[month];
d = daynames[day];

Suppose month and day indices got swapped by mistake, but both are still
within bounds; is this the kind of 'unsafe' in C that some languages can
fix through stricter typing?

Yes. One could define enumerations for these sorts of objects,
and then define operations to convert those to strings. For
example, in Rust,

```
enum Month {
January,
February,
March,
// ... etc.
December,
}

impl Month {
fn name(self) -> &'static str {
match self {
Self::January => "January",
Self::February => "February",
Self::March => "March",
// etc...
Self::December => "December",
}
}
}

fn main() {
let mon = Month::February;
println!("mon is {name}", name = mon.name());
}
```

(Similarly for a `Day` type.)
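
For comparison, C can approximate the same idea, though less conveniently. This is only a sketch with hypothetical names (`Month`, `Day`, `month_name`, `day_name`), and the numbering of the days is an arbitrary assumption. Wrapping each index in its own struct type means that passing a `Day` where a `Month` is expected is a constraint violation the compiler has to diagnose.

```c
#include <stddef.h>

typedef struct { unsigned v; } Month;   /* 0 = January ... 11 = December */
typedef struct { unsigned v; } Day;     /* 0 = Monday ... 6 = Sunday */

static const char *const monthnames[] = {
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December"
};

static const char *const daynames[] = {
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday"
};

/* Returns the month's name, or NULL if the value is out of range. */
const char *month_name(Month m)
{
    size_t n = sizeof monthnames / sizeof monthnames[0];
    return m.v < n ? monthnames[m.v] : NULL;
}

/* Returns the day's name, or NULL if the value is out of range. */
const char *day_name(Day d)
{
    size_t n = sizeof daynames / sizeof daynames[0];
    return d.v < n ? daynames[d.v] : NULL;
}
```

Here `month_name((Month){1})` yields "February", while `month_name((Day){1})` fails to compile because the struct types are incompatible. Nothing stops someone from building a `Month` out of a day number, so this is weaker than the Rust version.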

But then, how about this one:

d1 = daynames[day1];
d2 = daynames[day2];

A type system can't stop day1 and day2 being swapped; it can still go wrong.

That doesn't mean that the language is not "safe". "Safe" does
not usually mean "no bugs."

- Dan C.


From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Sun May 10 14:49:11 2026

From Newsgroup: comp.lang.c

antispam@fricas.org (Waldek Hebisch) writes:

Bart <bc@freeuk.com> wrote:

On 09/05/2026 06:50, Waldek Hebisch wrote:

I last used Pascal to any great extent in 1980, in a college
environment. It was a teaching language.

That was the original goal. But Pascal quickly got serious use.

Indeed. VAX-11 Pascal was a wonderful systems programming
language, with easy access to the full range of system
services (and even the Change Mode to Kernel system call
for applications executed or installed with the appropriate
privileges).

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Sun May 10 14:58:43 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 09/05/2026 17:38, David Brown wrote:

On 09/05/2026 18:16, Bart wrote:

(You can also see from this that /nobody/ likes stdint.h types, even
though standardised from C99 which also introduced 'long long' used
here. That is another bugbear. Oh, I forgot, my criticism is not
valid.)

Don't you realise that when you write things like that, you are only
demonstrating why so many people do not take you seriously? Have you
checked with every C programmer, and every person writing systems that
generate C code, and checked that none of them like the <stdint.h>
types? No? I thought not.

So, what's the figure?

One doesn't understand your question. Is 'figure' some britishism
in this context? Or do you expect David to provide an accurate
percentage describing the preferences of every C programmer on
the planet (or in orbit, if any of the current station occupants
can program in C :-).

Personally, for my working code, the stdint types are used
extensively.

Perhaps try asking why somebody would invent a new type name for uint8_t
at all.

Strawman. Please provide examples of "somebody inventing a new type name
for uint8_t" (post standardization). One swallow doesn't make a summer, so a single example
from some obscure project you found on the WWW isn't particularly
instructive.

Some people use them extensively. Some people have little use for
size-specific types. Some people want size-specific types, but for some
reason (good or bad) want to use C90 rather than C99. Some people like
the <stdint.h> types but for some reason (good or bad) are unable to use
them in certain cases. Some people dislike the <stdint.h> types, but
use them anyway.

So they can be problematic.

That's not what David said, or even implied.


From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Sun May 10 15:18:10 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 10/05/2026 05:39, Janis Papanagnou wrote:

Originally my language was created to run on a bare board with very
little memory and no existing software /at all/, not even an assembler.

/You/ try it.

Typical project in undergraduate computer science programs;
in my era, one wrote a recursive descent compiler for a
subset of Pascal[*] (or C) - the course took a single academic
quarter.

[*] Called Rascal (c.f. Compiler Design and Construction by Arthur B. Pyster)

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Sun May 10 15:42:05 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 09/05/2026 23:47, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Now look at what's involved in splitting a C module into two.

C doesn't have "modules".

You want to be /that/ pedantic?

This is exactly why I said that the C standard is your thing. If
somebody uses a term that doesn't appear in the standard, then it
doesn't exist.

C doesn't have a concept of 'module' per se. Perhaps you're looking
for "translation unit"?

So, what is involved in splitting a ... I don't even know what to call
it - a single .c 'source file'? Well, a lot of messy work.

An experienced C programmer uses independent translation units
without even thinking about it, when the application is
non-trivial. For many reasons, including reusability,
maintainability and collaboration. There are codebases that
have well over a million SLOC.

You are the only programmer who has ever claimed
that an entire application must be contained within a single
translation unit. It sounds like you've never actually worked
with either a team, or a non-trivial application.

The context was why C became the dominant language for systems
programming. I offered that as an example. If it helped C over a
potential rival which wasn't used to implement a major OS, then it
strikes me as an unfair advantage.

Suppose Unix was implemented in some other language, then if C was still
more successful over rivals, that would have been fairer.

Fair? What is your definition of "fair" with respect to programming
languages?


From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Sun May 10 15:46:05 2026

From Newsgroup: comp.lang.c

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Michael S <already5chosen@yahoo.com> writes:

On Sat, 09 May 2026 17:33:51 -0700
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

Right, you don't know what to call it. I think the term you're
probably looking for is "translation unit".

If you have something to say about splitting a C translation unit
(something I don't think I've ever had a need to do),

That surprises me greatly.
In my practice refactoring that includes splitting translation units is
rather common.

Or, maybe, I misunderstood your above sentence and you meant that you
never had a need *to say* something about splitting etc...?

perhaps because
you've had difficulties doing so yourself, feel free to elaborate.

I didn't give it a lot of thought, but I haven't done a lot of
refactoring of C projects. My experience is of course not universal,
and may not be representative.

I don't recall refactoring existing code, primarily because the
original programmers used multiple translation units logically
dividing the code into functionally related segments, where necessary,
from the start.

Likewise for new projects, whether C or C++, independent translation
units within the application have been, and are de rigueur.

From Bart@bc@freeuk.com to comp.lang.c on Sun May 10 16:47:55 2026

From Newsgroup: comp.lang.c

On 10/05/2026 15:58, Scott Lurndal wrote:

Bart <bc@freeuk.com> writes:

On 09/05/2026 17:38, David Brown wrote:

On 09/05/2026 18:16, Bart wrote:

(You can also see from this that /nobody/ likes stdint.h types, even
though standardised from C99 which also introduced 'long long' used
here. That is another bugbear. Oh, I forgot, my criticism is not
valid.)

Don't you realise that when you write things like that, you are only
demonstrating why so many people do not take you seriously? Have you
checked with every C programmer, and every person writing systems that
generate C code, and checked that none of them like the <stdint.h>
types? No? I thought not.

So, what's the figure?

One doesn't understand your question. Is 'figure' some britishism
in this context? Or do you expect David to provide an accurate
percentage describing the preferences of every C programmer on
the planet (or in orbit, if any of the current station occupants
can program in C :-).

Personally, for my working code, the stdint types are used
extensively.

You know, I could well be right, and nobody does like them, apart of
course from people here. Instead they could just be tolerated.

I doubt whether they are loved, otherwise we'd see those _t suffixes in
other languages too because they look so good.

Perhaps try asking why somebody would invent a new type name for uint8_t
at all.

Strawman. Please provide examples of "somebody inventing a new type name
for uint8_t" (post standardization). One swallow doesn't make a summer, so a single example
from some obscure project you found on the WWW isn't particularly
instructive.

You invite people to give examples, but then immediately qualify that by putting restrictions on quantity and popularity so that they can never win!

For other people's benefit:

typedef uint8_t byte;

(From: https://github.com/arduino/ArduinoCore-avr/blob/master/cores/arduino/Arduino.h)

typedef int64_t mz_int64;

(From a compression library called "miniz")

typedef uint32_t Uint32;

(From SDL2 header files)

Some people use them extensively. Some people have little use for
size-specific types. Some people want size-specific types, but for some
reason (good or bad) want to use C90 rather than C99. Some people like
the <stdint.h> types but for some reason (good or bad) are unable to use
them in certain cases. Some people dislike the <stdint.h> types, but
use them anyway.

So they can be problematic.

That's not what David said, or even implied.

Problematic in being a mess leading to a mix of classic, stdint and user-defined types. Compare with the use of comparable types in C#, D,
Java, Zig, Rust and Go.


From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sun May 10 09:14:43 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[..I am summarizing parts in an effort to get to key aspects..]

In article <86o6isuegr.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

[snip]
It's important to understand the perspectives of different groups
of participants in the C ecosystem. There are three main groups:

If you're a programmer, you hate undefined behavior, and avoid it
like the plague.

If you're a compiler writer, you love undefined behavior, because
it lets you do whatever you want.

If you're a member of the ISO C standards committee (and I admit
that to a degree I am speculating here), you think of undefined
behavior as a balancing test, of needing to weigh the tensions
inherent in what the first two groups would prefer.

This, I think, is the tragedy of C ("tragedy" in the dramatic,
Shakespearean sense).
[long exposition on the history of C]

My point here is that the users and developers of the language
were the same group, [elaboration]

But, as you pointed out, this is no longer the case. The two are
now distinct, with very different goals. [a consequence of which
is C usage is less uniform (my paraphrase)]

I think this is fair: pretty much no production OS is written in
pure ISO C, if they're written in C at all: they all use compiler
flags or custom toolchains to enable various extensions and pin
down aspects of UB they depend on in one form or another.

And this is the tragedy. This isn't how it started, and I don't
think the folks who created the language wanted it to go down this
way, but here we are. [rest omitted]

I wouldn't call it a tragedy, in fact just the opposite. If C had
stayed in its original environment it never would have become as
ubiquitous and widespread as it is today. The original ecosystem
doesn't scale. By letting C, and also Unix, enter the public
sphere, a great benefit accrued to the world at large.

A key component of that benefit is allowing variability. Different environments make different assumptions about how things should
work. Other languages, and other systems, maintain proprietary
control, partially in the interest of keeping things uniform, and
haven't been nearly as successful. The C community becoming more
diverse is a good thing, not a bad thing.

Similarly the C standardization efforts strengthened the language
rather than weakening it. That's because the original authors had the
foresight to allow some variation rather than insisting on complete
uniformity. During the 1980s it was common to see C code littered
with #ifdef's. These days, although there are still occasions when
some code needs to be platform specific, they occur much less often
than they did pre-standardization.

As for operating systems not being written in "pure" ISO C, that's
because operating systems need access to the hardware that most
programs don't need and shouldn't have. Also some of that is
historical artifact. Linux, to give a prominent example, was
written starting in the early 1990s, before C99 introduced inline
functions, and so the linux kernel allows the use of "statement
expressions" in gcc; these days such constructs can be supplied
using inline functions, perhaps also with generic type facilities
that were added to C starting in C11. Moreover such non-standard
language usages are not limited to C -- the Rust language is also
used in the linux kernel, and there too some non-standard language
features are used in kernel code.
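
As an illustration of the point about inline functions and generic selection, here is a sketch with made-up names; it is not code from the Linux kernel. A type-generic maximum that evaluates each argument once is a typical use of a gcc statement expression. In standard C11 it can be approximated with inline functions chosen by `_Generic`. Only `int` and `double` are handled in this sketch.

```c
/* One inline function per supported type. Because these are real
   functions, each argument is evaluated exactly once. */
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

static inline double max_double(double a, double b)
{
    return a > b ? a : b;
}

/*
 * Choose the function from the type of (a) + (b), that is, the type
 * after the usual arithmetic conversions. The controlling expression
 * of _Generic is not evaluated. Any result type other than int or
 * double fails to compile.
 */
#define max_of(a, b) \
    _Generic((a) + (b), int: max_int, double: max_double)((a), (b))
```

For example, `max_of(2, 3)` calls `max_int`, and `max_of(2.5, 1)` calls `max_double` after converting the second argument.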

I don't mean to compare C and Rust. My position here is only that,
in my view, the complaints raised about C are misplaced. Others
are welcome to their own views on the subject.

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Sun May 10 16:20:14 2026

From Newsgroup: comp.lang.c

In article <10tnc9s$3lm5l$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

On 09/05/2026 00:43, Dan Cross wrote:

In article <10tk4sg$2l19a$2@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

On 08/05/2026 00:08, Dan Cross wrote:

K&R is a wonderful book for its exposition: well-written,
concise, and the prose is beautiful. Kernighan is an amazing
writer, and Ritchie was well-known for his patience and clear
explanations.

However, it is a product of its time. It dates from a simpler
era, when programmers were expected to use books like it as a
starting point, and subsequently gain mastery either through
careful study of the standard, or extensive practice. (I'm
referring specifically to K&R2, of course, since the first
edition predated the first version of the standard by a decade.)
Machines were smaller and simpler, then, and so were compilers.

I am sad to say that I don't think it has aged particularly well.

I like the way you put that. Sometimes people have a tendency to put
too much reverence on particular texts - such as imagining that K&R says
all that needs to be said about C, and treating any modern tool, text,
standard or program that diverges from it as some kind of heresy, or
"not following the spirit of C". Languages evolve - tools evolve,
programs evolve, standards evolve, requirements evolve. K&R was a
milestone in the history of programming languages, and a model for
technical writing in its field, but C today is not C of fifty years ago.

Thank you. Yes, I pretty much agree.

It is unfortunate that this situation may be UB. I personally think
"unsigned short" should promote to "unsigned int", not "int" -
promotions should be signedness preserving. I don't like the "promote
to int" at all. But opinions don't change the standards, and I suppose
there are historical reasons for the rules that were made here.

But I am not sure I agree that such cases are "easy to stumble into".
How often would code like that be written, where overflowing the
uint16_t would be correct behaviour in the code on a 16-bit int system?
It is certainly possible, but it is perhaps more likely that cases of
overflow in the 16-bit system were also bugs in the code - moving the
code to 32-bit systems could give different undesirable effects from the
bug. It could also happen to remove the effects of the bug by holding
results in 32-bit registers and leading to correct results in later
calculations - UB can work either way.

Sure. This was a bit of a contrived example, but you ask a good
question: how often might one want to write code like that?

I think the particularly interesting thing about asking how often code
like this occurs, is that the potential impact of an oddity may be
higher for things that aren't often used. Most C programmers will
fairly quickly learn that overflowing signed arithmetic is UB and try to
avoid it - but the rarity of this example means that people are less
likely to realise it is UB.

That's a very interesting point: aspects of languages have costs
and language design should take into account the cost of
features vis-a-vis their utility.

A contemporary example I can think of here was the presence of
XML literals in Scala. Neat idea; and as I gathered at the time
I looked, it was implemented in terms of macros, but wow: talk
about something you would find in the dictionary as an example
of, "seemed like a good idea at the time."

In short, I don't know, but I can think of any number of hash
functions, checksums, etc, that may be implemented using 16-bit
arithmetic, and I can well see programmers wanting to take
advantage of the modular semantics afforded by using unsigned
types to do so. Every day? Probably not. But often enough.

I can imagine situations in the microcontroller world (as usual, many of
my examples come from there!) where code that was originally written for
8-bit or 16-bit devices was moved to 32-bit devices. Microcontroller
programmers are big users of fixed-size integer types - sometimes a good
thing, sometimes not.

One of the things I had to really internalize as an OS person is
that the universe of useful existing software is large. It
doesn't matter if I create the most beautiful abstractions for
them that are infinitely superior to whatever swill their code
is using now. If they don't get to run their program (or worse,
they have to make a bunch of invasive changes for no discernable
benefit from their perspective) because I know better about how
things ought to be done, they're not going to use whatever
system I'm working on unless they're forced. But even then they
will resent it and move to something else the first chance they
get (lookin' at you, DEC, Microsoft, IBM, and any number of
commercial Unix vendors).

Whatever _I_ think of the interfaces they chose to use is
immaterial; making it difficult for them wins me no friends.
This is one of the smart things Torvalds did with Linux: "don't
break userspace" (unless there's a really, really good reason)
probably did a lot to help make Linux popular.

Anyway, I think this is similar. It doesn't matter what anyone
thinks of whether one ought to prevent all overflow; the fact
that the language supports it for unsigned integers (though
with some surprising semantics for types of lower rank than
`int`) simply is what it is. And if someone has a program that
avails of those semantics, and that program is important to them
for whatever reason, then there's little choice but to hold
one's nose. I know you know this, of course, but I think it's
worth repeating every now and then.

Agreed. Knowing the semantics (and knowing when no semantics are
defined) is more important than exactly what the semantics are. For any
real language, there are always going to be things you disagree with or
think could be done differently, but you live with it anyway. Just look
at the C or C++ standards committee voting records - very few changes
get voted through unanimously.

(I guess that's why Bart is so deliriously impressed with his own
language - as the language's only designer, implementer, and user, it
presumably fits his preferences quite well. Real-world languages are
more of a compromise.)

Yes. "My language is great according to my criteria" doesn't
mean it maps to anything else more generally.

Certainly, however, the fact that this expression could contain UB would
surprise many C programmers.

Yes. Btw, the fix is almost trivial:

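```
#include <stdint.h>

uint16_t
mul(uint16_t a, uint16_t b)
{
    /* Convert to unsigned int first so the product is computed in
       unsigned arithmetic and cannot overflow a signed int. */
    unsigned int aa = a, bb = b;
    return aa * bb;
}
```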

But we must be careful - copying the same pattern to uint32_t would then
be incorrect if unsigned int is smaller than 32 bits. (Still no UB,
though.)

A general pattern could be:

T mul(T a, T b) {
    return (a + 0u) * b;
}

Yes. The example is carefully constructed to highlight a
particular case, but a more general problem would require a more
general solution.
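
A minimal sketch of both variants for anyone who wants to try them, assuming a C99 compiler with <stdint.h>. The names mul16, mul16_generic, mul32_generic and mul_demo are made up for illustration and appear nowhere in the thread; the explicit casts on the return values only make the narrowing visible.

```c
#include <assert.h>
#include <stdint.h>

/* Explicit-widening form: the operands are converted to unsigned int
   before the multiply, so the product wraps instead of overflowing int. */
uint16_t mul16(uint16_t a, uint16_t b)
{
    unsigned int aa = a, bb = b;
    return (uint16_t)(aa * bb);
}

/* "+ 0u" form: the usual arithmetic conversions make (a + 0u) at least
   unsigned int, so the multiply is again done in an unsigned type. */
uint16_t mul16_generic(uint16_t a, uint16_t b)
{
    return (uint16_t)((a + 0u) * b);
}

/* The same expression applied to uint32_t, the case where blindly
   copying the explicit-widening form could lose bits on a 16-bit int. */
uint32_t mul32_generic(uint32_t a, uint32_t b)
{
    return (uint32_t)((a + 0u) * b);
}

/* Hypothetical self-check showing the modular results. */
void mul_demo(void)
{
    /* 65535 * 65535 = 0xFFFE0001; the low 16 bits are 0x0001. */
    assert(mul16(0xFFFFu, 0xFFFFu) == 1u);
    assert(mul16_generic(0xFFFFu, 0xFFFFu) == 1u);
    /* 300 * 300 = 90000 = 65536 + 24464. */
    assert(mul16(300u, 300u) == 24464u);
    /* (2^32 - 1)^2 is congruent to 1 modulo 2^32. */
    assert(mul32_generic(0xFFFFFFFFu, 0xFFFFFFFFu) == 1u);
}
```

The `+ 0u` form lets the usual arithmetic conversions pick the wider of unsigned int and the operand type, which is why it should also cover the uint32_t case on a target with 16-bit int.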

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Sun May 10 16:22:12 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 10/05/2026 15:58, Scott Lurndal wrote:

Bart <bc@freeuk.com> writes:

Perhaps try asking why somebody would invent a new type name for uint8_t
at all.

Strawman. Please provide examples of "somebody inventing a new type name
for uint8_t" (post standardization). One swallow doesn't make a summer, so a single example
from some obscure project you found on the WWW isn't particularly
instructive.

You invite people to give examples, but then immediately qualify that by
putting restrictions on quantity and popularity so that they can never win!

For other people's benefit:

typedef uint8_t byte;

They're explicitly using uint8_t specifically for the purpose
it was intended for. The fact that they have an alias could
be for dozens of reasons, including code reuse or compatibility with older
C compilers that didn't yet support <stdint.h> (with suitable
preprocessor code to define uint8_t on targets that don't support
<stdint.h>). See autotools.

They didn't do this because the programmer disliked uint8_t
or the stdint.h types in general.

(From: https://github.com/arduino/ArduinoCore-avr/blob/master/cores/arduino/Arduino.h)

typedef int64_t mz_int64;

(From a compression library called "miniz")

typedef uint32_t Uint32;

(From SDL2 header files)

Again, they're using those stdint.h types exactly as they were meant
to be used. The fact that they alias them in no way implies that the programmer didn't _like_ the stdint.h types.
--- Synchronet 3.22a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c on Sun May 10 18:27:16 2026

From Newsgroup: comp.lang.c

Am 30.04.2026 um 02:39 schrieb Kalevi Kolttonen:

Is it always safe and not undefined behavior to do:
int i;
long l;
i = (int)l;
as long as you have first verified that 'l' is within
the range between INT_MIN and INT_MAX? Thanks.

Yes, that's a major problem with all 64 bit Unices. Use Windows with
that. On Windows long and int have the same size.
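
For the question as asked, a range-checked conversion could be sketched as follows. As far as I read C11 6.3.1.3, converting a value that is representable in the target type leaves it unchanged, and an out-of-range conversion to a signed type gives an implementation-defined result or raises a signal rather than being undefined behavior. The long_to_int and long_to_int_demo names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores l in *out and returns true when l is
   representable as int; otherwise returns false and leaves *out alone. */
bool long_to_int(long l, int *out)
{
    if (l < INT_MIN || l > INT_MAX)
        return false;
    *out = (int)l;      /* in range, so the value is preserved */
    return true;
}

/* Hypothetical self-check. */
void long_to_int_demo(void)
{
    int i = 0;
    assert(long_to_int(42L, &i) && i == 42);
    assert(long_to_int((long)INT_MIN, &i) && i == INT_MIN);
#if LONG_MAX > INT_MAX
    /* Only meaningful where long is wider than int (e.g. LP64 systems). */
    assert(!long_to_int(LONG_MAX, &i));
    assert(i == INT_MIN);   /* *out untouched on failure */
#endif
}
```

Where long and int have the same width the check can never fail, so the same code should work unchanged on both kinds of platform.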

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Sun May 10 16:28:40 2026

From Newsgroup: comp.lang.c

In article <10tq2he$d08i$2@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 10/05/2026 13:39, Dan Cross wrote:

In article <10tpq7e$a6kp$3@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 10/05/2026 10:29, David Brown wrote:

On 09/05/2026 20:20, Bart wrote:

Well I made that change and the first app I tried failed because it
relied on 'i + P', if not 'A[i]', but C doesn't allow you to separate
those.

The next two were OK, but the fourth also used it:

    add32le(p + 2, x + s1->plt->data - p);

(From Tiny C sources.) So it looks like 'i + P' is already too
widespread even to deprecate it.

It would have needed to be banned from the start. Then that line would
simply have been written as:

    add32le(p + 2, s1->plt->data - p + x);

At least, I made the change and tested it on real programs.

So you discovered that your knowledge was too superficial to give an
informed opinion,

So, what did I miss? And about what; the prevalence of i+P arithmetic in
C codebases? I suspect you didn't know that either.

Apparently, you missed the changes afoot in the committee to do
exactly what everyone has been telling you: deprecate `i[A]` but
preserve `i + A`.

The current standard says that those have to be tied together: 6.5.2.1p2.

Yes. But you're hypothesizing a change to the language. In
that context, binding yourself to that doesn't make a lot of
sense. Suggesting the subscript part of the change and then
pointing to this as an impediment, and then using that to make
some kind of statement about C (which is, I gather, what you're
doing) is specious.

However, I also, independently (and off the top of my head), came up
with a proposal that is being actually considered.

Ok.

Well, done, Bart!

Sure. You've come up with something that most folks seem to be
in agreement about. Good on you.

Oh, hang on, EVERY SINGLE THING I SAY AND DO HERE IS WRONG. I forgot
that part.

Now that is just dramatic.

and after learning more, you discovered that something
you thought should "obviously" be changed in C, cannot be changed. I
guess that's progress!

Well it /can/ be changed, but it would be too draconian when dealing
with legacy code.

It requires constructs like i[A] to be deprecated, while still allowing
i + A.

How is that draconian?

If implemented by removing pointer+int commutativity, too many programs
would fail. My first attempt did that since I wanted to honour 6.5.2.1p.

So don't do that.

If I broke 6.5.2.1p2, then it was more successful. Programs using i[A]
are much rarer than those using i+P.

Amazing.

That is also possible, but is not as simple a change, since C currently
requires them to be interchangeable, and that is baked in to my compiler.

Sounds like a problem for you and your compiler.

Always with the positives! You just have to keep bullying don't you.

You made a general statement about the language based on what
you perceived as a difficulty implementing a change in your
compiler. That doesn't follow.

I made a statement of fact: the issue you saw was a problem for
you to address in your compiler: it had little to do with the
language itself.

That's not bullying.

It turns out that disallowing i[A] while keeping i+P was even simpler
because of the way my compiler works.

Ok.

- Dan C.
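
For readers following along, the coupling under discussion comes from 6.5.2.1p2, which defines E1[E2] as (*((E1)+(E2))). A small sketch of what that means in current C follows; subscript_forms_agree is a hypothetical name, and a compiler tracking the proposed deprecation might warn on the i[A] line.

```c
#include <assert.h>

/* Returns 1 when all spellings of "element i of A" agree, as the
   current definition of the subscript operator requires. */
int subscript_forms_agree(void)
{
    int A[4] = {10, 20, 30, 40};
    int *P = A;
    int i = 2;

    return A[i] == *(A + i)      /* subscript is defined via pointer + int */
        && i[A] == *(i + A)      /* reversed operands, still valid today   */
        && *(P + i) == *(i + P)  /* pointer + int is commutative           */
        && A[i] == 30;
}
```

Dropping only the i[A] spelling leaves the second line's right-hand side, and every i + P expression like the one in the Tiny C source, untouched.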

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Sun May 10 16:37:25 2026

From Newsgroup: comp.lang.c

In article <10tq1pi$d08i$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 10/05/2026 14:03, David Brown wrote:

On 10/05/2026 13:29, Bart wrote:
[snip]
Suppose I proposed for example that C should deprecate, then ban, the
ability to write:

...

But I can't see that it requires any deep knowledge of the standard to
make such a proposal, or why somebody would require that of me in order
to even consider it.

Now it turns out that the C committee is actually looking at such a
proposal. But funnily enough, no one has given me credit for that.

Why would anyone give you credit for something the committee
came up with?

You may have independently come up with a similar idea, or even
the same idea, but I see no evidence the committee was aware of
that. Indeed, your disdain for standard C suggests that you've
never tried to engage with them actively, and thus, they
probably have no idea who you are.

[snip]
My only mistake was thinking that C REQUIRED indexing syntax to be tied
to pointer arithmetic, but as far as I know, it currently does do that,
and will do so for some years yet.

Again. You were talking about a hypothetical. Within that
framework, what the language "requires" is mutable.

In the real world, it is not.

You can seek to change the language, and that's fine, but the
process for doing that is involved and unfolds over a long time
horizon. But them's the breaks when you're talking about a
language used by more than a single person.

But if we're allowed to separate them, then OK I'll have another go at
my compiler. It turns out to be even simpler: I had to modify three lines
of code.

Now P+i and i+P are still allowed, but not i[A], only A[i]. All the
tests I tried before now still work.

So my toy compiler implements part of C2y!

See? Not that hard, huh?

The interesting thing is that to achieve it, I had to ignore my
knowledge of the current C standard (specifically 6.5.2.1p2).

SO *NOW* TELL ME WHAT I DID WRONG.

You are mixing up things that happen in two very different
contexts. See above.

It seems to me that guys just want to constantly pick on me for specious
reasons.

No one is picking on you. You're making public statements that
are based on poor logic or factual inaccuracies, and people are
pointing those out to you. You seem to have a persecution
complex about that. That is no one's problem but yours.

I doubt that anyone here bears you any ill will: I know I don't;
why would I? I don't even know you.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.lang.c on Sun May 10 16:44:34 2026

From Newsgroup: comp.lang.c

Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

I wouldn't call it a tragedy, in fact just the opposite. If C had
stayed in its original environment it never would have become as
ubiquitous and widespread as it is today. The original ecosystem
doesn't scale. By letting C, and also Unix, enter the public
sphere, a great benefit accrued to the world at large.

The portability of C in the POSIX/SUS systems has proved to be
quite good provided that the programmers have not made too
many unwarranted assumptions when writing their code.
So programs like Sendmail ran, and still run, on incredibly
many commercial and non-commercial UNIX platforms.

At home, I run FreeBSD just for fun as a little server:
Apache, PostgreSQL, OpenLDAP, Sendmail, pf firewall.
On my laptop I always have the latest Fedora Xfce spin. Every day
I do appreciate the fact that I can run approximately the same
software set on both operating systems even though the kernels
and libc are very different in their internals. Standardization
is really helpful and a great benefit.

I guess FreeBSD is getting left behind in certain areas
like not being able to run docker. To be honest, the Capsicum
security model is also quite stupid and not applicable
to most modern applications, but with Linux you have
LSMs like SELinux, which is amazingly versatile and
is actually useful in the real world for confining many
applications.

Concerning C portability, Linux and NetBSD also run on lots
of different architectures so carefully written kernel C
code can also be very portable. It has been decades since
the machine dependent and machine independent parts of UNIX
were identified and isolated. So when RISC-V was released
a while ago, it did not take too long before both FreeBSD
and Linux supported it. This is so cool and even amazing.

I know quite well that C is not a perfect language, but
I still love it and respect it.

I am so old now that I prefer to spend my time studying the
C sources of real projects like SELinux kernel and userspace
instead of learning new languages such as modern C++ or Rust.

I gladly leave that to the younger generations.

br,
KK
--- Synchronet 3.22a-Linux NewsLink 1.2

From David Brown@david.brown@hesbynett.no to comp.lang.c on Sun May 10 18:53:50 2026

From Newsgroup: comp.lang.c

On 10/05/2026 15:38, Bart wrote:

On 10/05/2026 14:03, David Brown wrote:

On 10/05/2026 13:29, Bart wrote:

I am not saying /I/ have the in-depth knowledge required to give a
good argument for changing the standards here - I am merely saying
that /you/ don't have that knowledge.

So, this is a mystery: I am at fault for not knowing X, but you are not at
fault for not knowing X?!

No, you are at fault for not knowing X but claiming you do, or making statements that depend on a knowledge of X.

Ignorance is not a problem in itself - it is usually fixable by asking questions and learning. Your brand of wilful ignorance, where you
refuse to learn, is more of a problem. And making claims or reaching conclusions despite a state of ignorance is just as bad.

It is hard to understand why any of this is a mystery to you.

--- Synchronet 3.22a-Linux NewsLink 1.2

From David Brown@david.brown@hesbynett.no to comp.lang.c on Sun May 10 18:55:25 2026

From Newsgroup: comp.lang.c

On 10/05/2026 16:58, Scott Lurndal wrote:

Bart <bc@freeuk.com> writes:

On 09/05/2026 17:38, David Brown wrote:

On 09/05/2026 18:16, Bart wrote:

(You can also see from this that /nobody/ likes stdint.h types, even
though standardised from C99 which also introduced 'long long' used
here. That is another bugbear. Oh, I forgot, my criticism is not not
valid.)

Don't you realise that when you write things like that, you are only
demonstrating why so many people do not take you seriously? Have you
checked with every C programmer, and every person writing systems that
generate C code, and checked that none of them like the <stdint.h>
types? No? I thought not.

So, what's the figure?

One doesn't understand your question. Is 'figure' some britishism
in this context? Or do you expect David to provide an accurate
percentage describing the preferences of every C programmer on
the planet (or in orbit, if any of the current station occupants
can program in C :-).

Personally, for my working code, the stdint types are used
extensively.

Even that does not give us another sample point - Bart's claim was not
about who /uses/ <stdint> types, but who /likes/ them. That's far
harder to tell from sample code.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Sun May 10 17:57:48 2026

From Newsgroup: comp.lang.c

On 10/05/2026 17:22, Scott Lurndal wrote:

Bart <bc@freeuk.com> writes:

On 10/05/2026 15:58, Scott Lurndal wrote:

Bart <bc@freeuk.com> writes:

Perhaps try asking why somebody would invent a new type name for uint8_t
at all.

Strawman. Please provide examples of "somebody inventing a new type name
for uint8_t" (post standardization). One swallow doesn't make a summer, so a single example
from some obscure project you found on the WWW isn't particularly
instructive.

You invite people to give examples, but then immediately qualify that by
putting restrictions on quantity and popularity so that they can never win!

For other people's benefit:

typedef uint8_t byte;

They're explicitly using uint8_t specifically for the purpose
it was intended for. The fact that they have an alias could
be for dozens of reasons, including code reuse or compatibility with older
C compilers that didn't yet support <stdint.h> (with suitable
preprocessor code to define uint8_t on targets that don't support
<stdint.h>). See autotools.

Well, OBVIOUSLY you're going to reject every example. That's how it
works here.

They didn't do this because the programmer disliked uint8_t
or the stdint.h types in general.

You don't know that: you actually said it could be for 'dozens of reasons'.

However, it is telling that when they had to choose a name for that type,
it was 'byte'.

--- Synchronet 3.22a-Linux NewsLink 1.2

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.lang.c on Sun May 10 16:58:57 2026

From Newsgroup: comp.lang.c

Bonita Montero <Bonita.Montero@gmail.com> wrote:

Yes, that's a major problem with all 64 bit Unices.Use Windows with
that. On Windows long and int have the same size.

I have used Linux since the summer of 1998 and would never
ever even consider installing Windows. It is so disgusting.

br,
KK
--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Sun May 10 18:00:48 2026

From Newsgroup: comp.lang.c

On 10/05/2026 17:37, Dan Cross wrote:

In article <10tq1pi$d08i$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 10/05/2026 14:03, David Brown wrote:

On 10/05/2026 13:29, Bart wrote:
[snip]
Suppose I proposed for example that C should deprecate, then ban, the
ability to write:

...

But I can't see that it requires any deep knowledge of the standard to
make such a proposal, or why somebody would require that of me in order
to even consider it.

Now it turns out that the C committee is actually looking at such a
proposal. But funnily enough, no one has given me credit for that.

Why would anyone give you credit for something the committee
came up with?

You may have independently come up with a similar idea, or even
the same idea, but I see no evidence the committee was aware of
that.

I never claimed that.
--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Sun May 10 17:03:04 2026

From Newsgroup: comp.lang.c

In article <10tq9ba$fe06$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 10/05/2026 15:58, Scott Lurndal wrote:

Bart <bc@freeuk.com> writes:

On 09/05/2026 17:38, David Brown wrote:

On 09/05/2026 18:16, Bart wrote:

(You can also see from this that /nobody/ likes stdint.h types, even
though standardised from C99 which also introduced 'long long' used
here. That is another bugbear. Oh, I forgot, my criticism is not not
valid.)

Don't you realise that when you write things like that, you are only
demonstrating why so many people do not take you seriously? Have you
checked with every C programmer, and every person writing systems that
generate C code, and checked that none of them like the <stdint.h>
types? No? I thought not.

So, what's the figure?

One doesn't understand your question. Is 'figure' some britishism
in this context? Or do you expect David to provide an accurate
percentage describing the preferences of every C programmer on
the planet (or in orbit, if any of the current station occupants
can program in C :-).

Personally, for my working code, the stdint types are used
extensively.

You know, I could well be right, and nobody does like them, apart of
course from people here. Instead they could just be tolerated.

It wouldn't matter. Your statement was that **nobody** likes
them. Even a single counter-example proves that false, and such
a counter example has been given. QED.

I doubt whether they are loved, otherwise we'd see those _t suffixes in
other languages too because they look so good.

Some people like the taste of pineapple on pizza. As a New
Yorker, I am personally offended by that. Unfortunately, that
does not mean that people don't like pizza with pineapple on it.

Perhaps try asking why somebody would invent a new type name for uint8_t
at all.

Strawman. Please provide examples of "somebody inventing a new type name
for uint8_t" (post standardization). One swallow doesn't make a summer, so a single example
from some obscure project you found on the WWW isn't particularly
instructive.

You invite people to give examples, but then immediately qualify that by
putting restrictions on quantity and popularity so that they can never win!

Probably because you tend to lean heavily on yet another logical
fallacy: anecdotal evidence.

That's not what David said, or even implied.

Problematic in being a mess leading to a mix of classic, stdint and
user-defined types. Compare with the use of comparable types in C#, D,
Java, Zig, Rust and Go.

All of which have the significant advantage of coming many years
after C was created, and all of which applied the lessons
learned from C (and other languages). Also all of which, except
for Java and maybe very early C#, were developed in the 21st
century, by which time 64-bit machines were available, if not
yet ubiquitous for some (for Go, Rust, and Zig, they were).
Even for Java and C#, however, 64-bit targets were available
(and used in their development).

Many of us who lived through the painful transition to 64-bit
CPUs in the late 1990s and early 2000s, and had to port lots of
software that made assumptions about the relative sizes of
integers and pointers to those machines, were ecstatic to see
explicitly sized types become common in later languages.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Sun May 10 17:27:17 2026

From Newsgroup: comp.lang.c

In article <86a4u7s0lo.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[..I am summarizing parts in an effort to get to key aspects..]

In article <86o6isuegr.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

[snip]
It's important to understand the perspectives of different groups
of participants in the C ecosystem. There are three main groups:

If you're a programmer, you hate undefined behavior, and avoid it
like the plague.

If you're a compiler writer, you love undefined behavior, because
it lets you do whatever you want.

If you're a member of the ISO C standards committee (and I admit
that to a degree I am speculating here), you think of undefined
behavior as a balancing test, of needing to weigh the tensions
inherent in what the first two groups would prefer.

This, I think, is the tragedy of C ("tragedy" in the dramatic,
Shakespearean sense).
[long exposition on the history of C]

My point here is that the users and developers of the language
were the same group, [elaboration]

But, as you pointed out, this is no longer the case. The two are
now distinct, with very different goals. [a consequence of which
is C usage is less uniform (my paraphrase)]

Here, I was referring to the _groups you defined_.

I think this is fair: pretty much no production OS is written in
pure ISO C, if they're written in C at all: they all use compiler
flags or custom toolchains to enable various extensions and pin
down aspects of UB they depend on in one form or another.

And this is the tragedy. This isn't how it started, and I don't
think the folks who created the language wanted it to go down this
way, but here we are. [rest omitted]

I wouldn't call it a tragedy, in fact just the opposite.

The tragedy I alluded to is precisely the difference
between the two groups you yourself suggested: programmers hate UB,
compiler writers love it. The compiler writers and programmers
often appear to have an antagonistic relationship.

If C had
stayed in its original environment it never would have become as
ubiquitous and widespread as it is today. The original ecosystem
doesn't scale. By letting C, and also Unix, enter the public
sphere, a great benefit accrued to the world at large.

I said as much.

[snip] Moreover such non-standard
language usages are not limited to C -- the Rust language is also
used in the linux kernel, and there too some non-standard language
features are used in kernel code.

Rust has no language standard, so saying that "non-standard
language features are used in kernel code" is incorrect.

I don't mean to compare C and Rust. My position here is only that,
in my view, the complaints raised about C are misplaced. Others
are welcome to their own views on the subject.

I don't think you read my note carefully, and your response is
to something I did not say.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sun May 10 12:53:37 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 09/05/2026 17:38, David Brown wrote:

On 09/05/2026 18:16, Bart wrote:

(You can also see from this that /nobody/ likes stdint.h types,
even though standardised from C99 which also introduced 'long long'
used here. That is another bugbear. Oh, I forgot, my criticism is
not not valid.)

Don't you realise that when you write things like that, you are only
demonstrating why so many people do not take you seriously? Have
you checked with every C programmer, and every person writing
systems that generate C code, and checked that none of them like the
<stdint.h> types? No? I thought not.

So, what's the figure?

I see this pattern frequently (sometimes every other project
seemingly) so they are unpopular for some. And we don't know if people
using uint8_t etc are doing so because they genuinely like it or feel
obliged to use it.

(Tim Rentsch also seems to avoid it here.)

I wouldn't say I avoid it; I just rarely encounter circumstances
where it seems called for. In almost all cases where uint8_t
might be used, unsigned char works just as well.
--- Synchronet 3.22a-Linux NewsLink 1.2

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.lang.c on Sun May 10 20:15:17 2026

From Newsgroup: comp.lang.c

Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

In almost all cases where uint8_t
might be used, unsigned char works just as well.

Why "almost"? Where is the difference if any?

As far as I know, ISO guarantees that
sizeof(unsigned char) is always 1 byte.

And operations on unsigned char are well defined,
including wrap-around. So I fail to see any
difference between unsigned char and uint8_t.

br,
KK
--- Synchronet 3.22a-Linux NewsLink 1.2

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sun May 10 13:21:30 2026

From Newsgroup: comp.lang.c

scott@slp53.sl.home (Scott Lurndal) writes:

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Michael S <already5chosen@yahoo.com> writes:

On Sat, 09 May 2026 17:33:51 -0700
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

Right, you don't know what to call it. I think the term you're
probably looking for is "translation unit".

If you have something to say about splitting a C translation unit
(something I don't think I've ever had a need to do),

That surprises me greatly.
In my practice refactoring that includes splitting translation units
is rather common.

Or, maybe, I misunderstood your above sentence and you meant that
you never had a need *to say* something about splitting etc...?

perhaps because
you've had difficulties doing so yourself, feel free to elaborate.

I didn't give it a lot of thought, but I haven't done a lot of
refactoring of C projects. My experience is of course not universal,
and may not be representative.

I don't recall refactoring existing code, primarily because the
original programmers used multiple translation units logically
dividing the code into functionally related segments, where necessary,
from the start.

There are various forces that influence the partitioning of programs
into multiple .c files. These forces can change over the life of a
project, as the code evolves. An obvious one is that as code is
added, a single .c file can grow to the point of being overly large,
and dividing it into two or three seems like a good idea.

Having said that, I don't remember it ever being a big deal. If
some source file needs to be subdivided, you simply subdivide it
and move on. The effort needed to do re-partitioning is a small
fraction of the overall code development effort. Not worth
worrying about.
--- Synchronet 3.22a-Linux NewsLink 1.2

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sun May 10 13:23:39 2026

From Newsgroup: comp.lang.c

scott@slp53.sl.home (Scott Lurndal) writes:

An experienced C programmer uses independent translation
units without even thinking about it, when the application
is non-trivial. For many reasons, including reusability,
maintainability and collaboration.

+1
--- Synchronet 3.22a-Linux NewsLink 1.2

From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Sun May 10 22:46:06 2026

From Newsgroup: comp.lang.c

Scott Lurndal <scott@slp53.sl.home> wrote:

Bart <bc@freeuk.com> writes:

On 10/05/2026 15:58, Scott Lurndal wrote:

Bart <bc@freeuk.com> writes:

Perhaps try asking why somebody would invent a new type name for uint8_t
at all.

Strawman. Please provide examples of "somebody inventing a new type name
for uint8_t" (post standardization). One swallow doesn't make a summer, so a single example
from some obscure project you found on the WWW isn't particularly
instructive.

You invite people to give examples, but then immediately qualify that by
putting restrictions on quantity and popularity so that they can never win!

For other people's benefit:

typedef uint8_t byte;

They're explicitly using uint8_t specifically for the purpose
it was intended for. The fact that they have an alias could
be for dozens of reasons, including code reuse or compatibility
with older C compilers that didn't yet support <stdint.h> (with suitable
preprocessor code to define uint8_t on targets that don't support
<stdint.h>). See autotools.
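
For readers unfamiliar with the kind of preprocessor fallback mentioned here, the following is a minimal sketch, not code from any project in the thread. It assumes the build system defines HAVE_STDINT_H when the header is present (the name Autoconf conventionally uses); the helper low_byte is hypothetical and exists only to exercise the alias.

```c
#include <limits.h>

#if defined(HAVE_STDINT_H) || \
    (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L)
#include <stdint.h>             /* C99 and later, or detected by configure */
#else
#if CHAR_BIT != 8
#error "no 8-bit type available for the uint8_t fallback"
#endif
typedef unsigned char uint8_t;  /* assumed fallback for pre-C99 compilers */
#endif

typedef uint8_t byte;           /* the project-local alias quoted above */

/* Hypothetical helper: return the low-order 8 bits of v. */
byte low_byte(unsigned int v)
{
    return (byte)(v & 0xFFu);
}
```

With this arrangement the rest of the code base can use byte (or uint8_t) without caring which branch was taken.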

They didn't do this because the programmer disliked uint8_t
or the stdint.h types in general.

My impression is that Bart disliked the type _names_.
--
Waldek Hebisch
--- Synchronet 3.22a-Linux NewsLink 1.2

From James Kuyper@jameskuyper@alumni.caltech.edu to comp.lang.c on Sun May 10 18:52:21 2026

From Newsgroup: comp.lang.c

On 2026-05-10 16:15, Kalevi Kolttonen wrote:

Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

In almost all cases where uint8_t
might be used, unsigned char works just as well.

Why "almost"? Where is the difference if any?

If uint8_t exists, CHAR_BIT must be 8, and unsigned char must therefore
meet the requirements to be the type that uint8_t is a typedef for.
However, the standard doesn't mandate it. If, for example, a machine
supported two different 8-bit types, with the order of the bits from low
to high reversed between them, uint8_t could be one of those types, and unsigned char could be the other - the C standard imposes no
requirements that would be broken by that choice.
This is not something you're likely to ever see, just a possibility
allowed by the standard that we're extremely unlikely to see.
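
As an illustration of the point above, here is a small C11 sketch that checks both halves of the argument on a given implementation. The function name uint8_is_unsigned_char is hypothetical. The static assertion reflects what the standard guarantees whenever uint8_t exists; the _Generic test only reports what this particular implementation chose.

```c
#include <limits.h>
#include <stdint.h>

/* Guaranteed whenever uint8_t exists: bytes are exactly 8 bits wide. */
_Static_assert(CHAR_BIT == 8, "uint8_t implies CHAR_BIT == 8");

/*
 * Returns 1 if uint8_t is a typedef for unsigned char on this
 * implementation, 0 if it names some other 8-bit type. The standard
 * permits either answer; mainstream compilers are expected to give 1.
 */
int uint8_is_unsigned_char(void)
{
    return _Generic((uint8_t)0, unsigned char: 1, default: 0);
}
```

Code that relies on the answer being 1 (for example, to alias arbitrary objects through uint8_t *) could turn the _Generic expression into a static assertion instead.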
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Sun May 10 16:11:15 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 10/05/2026 03:01, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:

My point had been that in my module scheme, it would be less work.

Good for you.

So you don't have a problem you're trying to solve, and you don't
want advice about how to do something.

You keep forgetting context. It was a throwaway remark in a brief
discussion WH and I were having about module schemes.

You said something about modules. I said that C doesn't have
modules. You misinterpreted my statement to mean that I was
telling you you shouldn't use the word "module" when discussing C.
The actual implication, which I stated in a followup, is that I
don't know what you mean by "module".

You could have simply explained what you meant.

(And I still don't know what to call a 'primary source file'; that is,
one of these files:

gcc one.c two.c three.c

and not a .h file, or a .c or other file that is included indirectly.)

You could call it "a C source file (not a header file)", or "a .c file".

You know what, if all possible answers to all C-related questions were
contained within the C standard, why does this group even exist?

Just post a link to the standard document and be done with it.

This group exists because not all possible answers to all C-related
questions are contained within the C standard. You pretend that
someone has made such a ridiculous claim, but unless I missed
something nobody has.

I'll try this again. You claimed that C "pretends to be a safe
language". That was a false claim. Will you either provide evidence
that it was correct or acknowledge that it was incorrect?

It happens that the first few paragraphs of Annex K are relevant
to your statement. If you inferred from that remark that I think
"all possible answers to all C-related questions were contained
within the C standard", that was a very wrong and silly inference.

I expect that you will refuse yet again to respond, but I'm prepared
to be pleasantly surprised.

I've glanced at appendix K.1 and saw nothing relevant there. It's
about exceeding array bounds.

Your false claim was that C "pretends" to be a safe language.
The first subsection of Annex K acknowledges a number of ways in
which C can be unsafe. This directly contradicts your claim.

I'm assuming that doing that would be UB.

My question was (it is always important to keep context!):

So, C can be unsafe even when you avoid all UB? Examples?

That's a question you asked later. I didn't answer it because I
didn't find it interesting, and I was still trying to get you to
acknowledge that your earlier claim that C "pretends" to be safe
was incorrect. I am not surprised that you are still dancing around
and avoiding addressing that issue.

Really it comes down to what 'unsafe' means in a language, and in C,
whether it is tied to UB or can be more general.

But since 'unsafe' is not defined in the standard (not in N1570
anyway, where it is used casually in only one instance), I expect you
don't know, and wouldn't want to speculate.

As usual, your expectations about me are wrong.

You've asked "So, C can be unsafe even when you avoid all UB?
Examples?". I have an answer to that. (I have an example that
I've posted here recently. I'm going to assume one example is
sufficient.)

Earlier, you claimed that C pretends to be safe. I will answer
your question (quoted in the previous paragraph) only after you
address that claim. Do you still falsely claim that C pretends
to be safe? Do you acknowledge that you were wrong? Was it a
deliberate exaggeration? Was it a deliberate lie?

Your exact words were:

C pretends to be a safe language by saying all those naughty
things are UB and should be avoided, at the same time, C
compilers can be made to do all that.

in Message-ID: <10tn877$3kg8u$1@dont-email.me>.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.lang.c on Sun May 10 23:19:28 2026

From Newsgroup: comp.lang.c

James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

On 2026-05-10 16:15, Kalevi Kolttonen wrote:

Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

In almost all cases where uint8_t
might be used, unsigned char works just as well.

Why "almost"? Where is the difference if any?

If uint8_t exists, CHAR_BIT must be 8, and unsigned char must therefore
meet the requirements to be the type that uint8_t is a typedef for.
However, the standard doesn't mandate it. If, for example, a machine supported two different 8-bit types, with the order of the bits from low
to high reversed between them, uint8_t could be one of those types, and unsigned char could be the other - the C standard imposes no
requirements that would be broken by that choice.
This is not something you're likely to ever see, just a possibility
allowed by the standard that we're extremely unlikely to see.

I see, thanks. So from a practical point of view today, they
appear pretty identical.

br,
KK
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Sun May 10 16:38:12 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 10/05/2026 14:03, David Brown wrote:

On 10/05/2026 13:29, Bart wrote:
I am not saying /I/ have the in-depth knowledge required to give a
good argument for changing the standards here - I am merely saying
that /you/ don't have that knowledge.

So, this is a mystery: I am at fault for not knowing X, but you are not at
fault for not knowing X?!

This particular thing was just some simple example I'd thought up:

Suppose I proposed for example that C should deprecate, then ban, the
ability to write:
...

But I can't see that it requires any deep knowledge of the standard
to make such a proposal, or why somebody would require that of me in
order to even consider it.

Now it turns out that the C committee is actually looking at such a
proposal.

Correct.

But funnily enough, no one has given me credit for that.

You independently came up with the idea of deprecating index[array].
It's a fairly obvious idea. The legality of index[array] has been
considered a (small) wart in the language for decades, used mostly
in deliberately obfuscated code. I presume you never mentioned your
idea to the C committee. You found out, because I told you about
it, that the latest proposed draft for C202y makes index[array]
obsolescent. The committee also realized something that you missed,
that this could be done without touching index+pointer.
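
To make the distinction concrete, here is a minimal sketch (not taken from the draft or from anyone's compiler) of the four spellings that C currently treats as equivalent under 6.5.2.1p2. The function name sum_four_ways is hypothetical. As described above, only the i[a] spelling is affected by the proposed change; the pointer-addition forms stay as they are.

```c
#include <stddef.h>

/*
 * Reads the same element four times, once per spelling.
 * Under the current rules, a[i] is defined as (*((a)+(i))),
 * and addition is commutative, so all four are identical.
 */
int sum_four_ways(const int *a, size_t i)
{
    return a[i]         /* conventional subscript            */
         + i[a]         /* the form proposed as obsolescent  */
         + *(a + i)     /* pointer + index                   */
         + *(i + a);    /* index + pointer                   */
}
```

A compiler that rejects i[a] while accepting *(i + a) only needs to check the operand order of the subscript operator; the arithmetic itself is untouched.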

You had no influence on the committee. The N3783 draft standard
was released in January, and the idea predates that (I haven't
checked the history).

As I understand it, the reason you made that suggestion was to
demonstrate that we would disagree with it because it came from you.
You were wrong about that.

What credit do you think you deserve?

Yeah, you had a good idea.

It requires constructs like i[A] to be deprecated, while still
allowing i + A.

That is also possible, but is not as simple a change, since C
currently requires them to be interchangeable, and that is baked in
to my compiler.

Not only do you not have the knowledge required to give an informed
opinion about making this particular change to the standards, you
don't have the knowledge required to give an informed opinion about
making /any/ changes to the standard, the C language, or
implementations.

This is not like making changes to your personal little languages or
your toy C compiler.

Why, what's the difference? At least I attempted to make the change to
see what would happen, and I tried it out on some real non-toy
code-bases.

My only mistake was thinking that C REQUIRED indexing syntax to be
tied to pointer arithmetic, but as far as I know, it currently does do
that, and will do so for some years yet.

But if we're allowed to separate them, then OK I'll have another go at
my compiler. It turns out to be even simpler: I had to modify three lines
of code.

Now P+i and i+P are still allowed, but not i[A], only A[i]. All the
tests I tried before now still work.

So my toy compiler implements part of C2y!

The interesting thing is that to achieve it, I had to ignore my
knowledge of the current C standard (specifically 6.5.2.1p2).

You didn't have to ignore anything. It was a change to the standard
that was being discussed. You were therefore not completely bound by
what the standard currently says.

I do not blame you for not initially realizing that you could disallow
index[array] while continuing to allow index+pointer by partially
decoupling indexing from pointer arithmetic. I imagine that I could have
missed that myself. But now that you know it's possible, you're still
complaining about it.

SO *NOW* TELL ME WHAT I DID WRONG.

You did nothing wrong in experimenting with the feature in your
own compiler. (What David wrote could be interpreted that way.
I think he overstated the case.)

What you've done wrong is to incessantly whine about it.

It seems to me that you guys just want to constantly pick on me for
specious reasons.

I'm sure it seems that way to you. Consider the possibility that
your perception is incorrect. Take into account your remarkably poor
track record in guessing what other people think or what they're
going to say. Just one example: you thought we would reject your
idea of deprecating index[array]. We didn't. Think about how
and why you made that mistake. You presented something that was
IMHO a pretty good idea, but you did so with the apparent intent
of demonstrating that we would reject it.

You have very little credibility when it comes to proposing
changes to the C standard, because you generally refuse to read or
understand it. That doesn't prevent you from occasionally coming
up with a decent idea (even if others did so first).
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Sun May 10 16:45:57 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 10/05/2026 06:00, Janis Papanagnou wrote:

On 2026-05-10 01:45, Bart wrote:

On 09/05/2026 23:47, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Now look at what's involved in splitting a C module into two.

C doesn't have "modules".

You want to be /that/ pedantic?

This is exactly why I said that the C standard is your thing. If
somebody uses a term that doesn't appear in the standard, then it
doesn't exist.

I suppose the word you wanted to use is "translation unit".

No. That is a technical term used within the C standard and relates to
a subsequent representation of your source code within a compiler.

It is also C-specific. What is the generic term for one of the
discrete source files of a program?

The C standard defines the term "source files". See 5.1.1.1.

Header files are "source files". Elsewhere in this thread, you've
used the term "primary source files" to refer to .c files that
are fed to the compiler, as distinct from .h files that are not.
That's a perfectly cromulent term, as long as you make it clear
what you mean by it.

You started this by making some point about the difficulty of
splitting "modules", by which you apparently meant what you've
called "primary source files". You've now embarked on a long
journey debating terminology, and avoided discussing anything
specific about why it's difficult to split them. (If you don't
want to discuss that, that's fine with me.)

[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Mon May 11 00:58:19 2026

From Newsgroup: comp.lang.c

On 11/05/2026 00:11, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:

I've glanced at appendix K.1 and saw nothing relevant there. It's
about exceeding array bounds.

Your false claim was that C "pretends" to be a safe language.

People keep jumping to conclusions without asking for clarification.
This is what I said (quite a few posts back):

C pretends to be a safe language by saying all those naughty things
are UB and should be avoided, at the same time, C compilers can be made
to do all that.

(I see now you quoted this yourself; I could have saved some time!)

The assumption made here is that unsafe-ness arises in C from UB. Then I
suggest that, while the language itself washes its hands of it, it lets
the compiler do the dirty work (as well as pushing the responsibility to
the user, by allowing the compiler to do something that is UB).

In a later follow-up to you I ask:

So, C can be unsafe even when you avoid all UB? Examples?

And yet later I ask for clarification for what it means to be 'unsafe'
and gave some examples of my own. I don't recall that being answered.

Do you still falsely claim that C pretends
to be safe?

"When you avoid all UB". You keep forgetting this bit.

Well, first tell me what it means for a language to be 'unsafe'. That
term has not been defined. Is it only what happens when UB is invoked,
or can it be at any time?

If you think I was wrong, then you can politely suggest that and offer
some enlightenment. Why become aggressive and give me the third degree? Sometimes I feel like I'm in the dock.

So, reading between the lines, you seem to be suggesting that C /can/ be
an unsafe language (whatever that means) whether or not UB is involved.

Do you acknowledge that you were wrong? Was it a
deliberate exaggeration? Was it a deliberate lie?

Please stop this. If you don't agree with what I said, then post a
counter-argument.

You should also look at the context: I was explaining the various
underhand, 'unsafe' things that are possible in C, which give it an edge
over competitors for systems work, then I suggest that many of those are likely to be UB so not officially sanctioned.

Phew! (Mopping sweaty brow with a handkerchief.)
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Sun May 10 17:04:08 2026

From Newsgroup: comp.lang.c

kalevi@kolttonen.fi (Kalevi Kolttonen) writes:

Bonita Montero <Bonita.Montero@gmail.com> wrote:

Yes, that's a major problem with all 64 bit Unices. Use Windows with
that. On Windows long and int have the same size.

I have used Linux since the summer of 1998 and would never
ever even consider installing Windows. It is so disgusting.

Let me encourage you to discuss C here, not the relative merits of
Linux and Windows.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Sun May 10 17:10:58 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:
[...]

(You can also see from this that /nobody/ likes stdint.h types, even
though standardised from C99 which also introduced 'long long' used
here. That is another bugbear. Oh, I forgot, my criticism is not
valid.)

Bart, you claimed here that literally *nobody* likes stdint.h types.

I like stdint.h types.

Your claim is therefore false.

Will you acknowledge that simple fact?
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Mon May 11 01:17:17 2026

From Newsgroup: comp.lang.c

On 10/05/2026 16:18, Scott Lurndal wrote:

Bart <bc@freeuk.com> writes:

On 10/05/2026 05:39, Janis Papanagnou wrote:

Originally my language was created to run on a bare board with very
little memory and no existing software /at all/, not even an assembler.

/You/ try it.

Typical project in undergraduate computer science programs;
in my era, one wrote a recursive descent compiler for a
subset of Pascal[*] (or C) - the course took a single academic
quarter.

My course had similar content for the compilers module, but we were only expected to get as far as a parser (although I went a lot further).

Final year projects varied but mine was porting the GEC 4000 systems
language to PDP10, in assembly then self-hosted.

(This was a nice language, lower level than C and above assembly, which
I recently wanted to re-implement, but the only existing info was in a computer museum in the UK, and they were unwilling to copy details as
the artefacts were too delicate.)

However, what I described above had nothing to do with that. It was
about bootstrapping a compiler from literally nothing, on a small microprocessor computer I'd devised (after building it with a soldering
iron).

No existing software, and no handy other computer to cross-compile on.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Mon May 11 01:21:10 2026

From Newsgroup: comp.lang.c

On 11/05/2026 01:10, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

(You can also see from this that /nobody/ likes stdint.h types, even
though standardised from C99 which also introduced 'long long' used
here. That is another bugbear. Oh, I forgot, my criticism is not
valid.)

Bart, you claimed here that literally *nobody* likes stdint.h types.

I like stdint.h types.

Your claim is therefore false.

Will you acknowledge that simple fact?

From Google:

3. Hyperbole (Exaggeration) for Emphasis

"Nobody" is frequently used in hyperbolic statements to emphasize that
almost no one was there, or that the number of people was negligible.
--- Synchronet 3.22a-Linux NewsLink 1.2

From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Mon May 11 00:26:37 2026

From Newsgroup: comp.lang.c

Kalevi Kolttonen <kalevi@kolttonen.fi> wrote:

Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

In almost all cases where uint8_t
might be used, unsigned char works just as well.

Why "almost"? Where is the difference if any?

As far as I know, ISO guarantees that
sizeof(unsigned char) is always 1 byte.

And operations on unsigned char are well defined,
including wrap-around. So I fail to see any
difference between unsigned char and uint8_t.

If a machine has bytes bigger than 8 bits, then uint8_t will not
exist, so trying to use uint8_t will fail at compile time, which
may be a good thing, if code depends on size being exactly 8
bits.
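
A short sketch of that trade-off, with hypothetical function names: the first function relies on uint8_t and therefore refuses to build where no exact 8-bit type exists, while the second uses unsigned char and masks explicitly so that it still wraps at 256 on a machine with wider bytes. The explicit #error is only there to give a clearer diagnostic than an unknown-type error would.

```c
#include <stdint.h>

#ifndef UINT8_MAX
#error "uint8_t is unavailable: this code assumes exactly 8-bit bytes"
#endif

/* Wraps modulo 256 because the int result is converted back to uint8_t. */
uint8_t add_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/*
 * Portable variant: unsigned char may be wider than 8 bits, so the
 * reduction modulo 256 has to be written out.
 */
unsigned char add_octet(unsigned char a, unsigned char b)
{
    return (unsigned char)((a + b) & 0xFFu);
}
```

On an ordinary 8-bit-byte platform both functions return the same values; they differ only in what happens when CHAR_BIT is larger than 8.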
--
Waldek Hebisch
--- Synchronet 3.22a-Linux NewsLink 1.2

From James Kuyper@jameskuyper@alumni.caltech.edu to comp.lang.c on Sun May 10 20:30:24 2026

From Newsgroup: comp.lang.c

On 2026-05-10 20:10, Keith Thompson wrote:
...

Bart, you claimed here that literally *nobody* likes stdint.h types.

I like stdint.h types.

Me too.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Sun May 10 17:31:05 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 11/05/2026 00:11, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:

I've glanced at appendix K.1 and saw nothing relevant there. It's
about exceeding array bounds.

Your false claim was that C "pretends" to be a safe language.

People keep jumping to conclusions without asking for
clarification. This is what I said (quite a few posts back):

C pretends to be a safe language by saying all those naughty things
are UB and should be avoided, at the same time, C compilers can be
made to do all that.

(I see now you quoted this yourself; I could have saved some time!)

The assumption made here

Made by whom?

is that unsafe-ness arises in C from UB. Then
I suggest that, while the language itself washes its hands of it, it
lets the compiler do the dirty work (as well as pushing the
responsibility to the user, by allowing the compiler to do something
that is UB).

You appear to be making the assumption either that C's only
unsafe-ness arises from UB, or that someone else is making that
assumption. I don't know where you got that idea.

C does not pretend to be a safe language. Nobody that I'm aware
of pretends that C is a safe language. The C standard explicitly
acknowledges some of the ways in which C can be unsafe. Undefined
behavior is not the only way in which C can be unsafe. I'm not
aware that anyone has claimed that it is.

In a later follow-up to you I ask:

So, C can be unsafe even when you avoid all UB? Examples?

And yet later I ask for clarification for what it means to be 'unsafe'
and gave some examples of my own. I don't recall that being answered.

Do you still falsely claim that C pretends
to be safe?

"When you avoid all UB". You keep forgetting this bit.

No, I'm not forgetting it. You claimed that "C pretends to be a
safe language". It does not. You claimed that C pretends to be a
safe language by [doing certain things]. Since C does not pretend
to be a safe language, how it does so is irrelevant.

C programs can be unsafe. C programs that exhibit no undefined
behavior can be unsafe. Nobody has pretended otherwise.
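
One possible illustration of that statement (this is an editorial sketch, not the example referred to earlier in the post): a bounds check written with unsigned arithmetic has fully defined behavior, yet it can silently approve an out-of-range request because the sum wraps around. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Flawed check: offset + len is computed modulo SIZE_MAX + 1, which is
 * well defined for unsigned types, so a huge offset can wrap to a small
 * sum and pass the test. No undefined behavior occurs in this function.
 */
bool range_ok_naive(size_t offset, size_t len, size_t size)
{
    return offset + len <= size;
}

/* Wrap-free version: compare against the space actually remaining. */
bool range_ok_checked(size_t offset, size_t len, size_t size)
{
    return offset <= size && len <= size - offset;
}
```

The naive version never does anything the standard leaves undefined; the danger lies in what a caller does after trusting its answer.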

Well, first tell me what it means for a language to be 'unsafe'. That
term has not been defined. Is it only what happens when UB is invoked,
or can it be at any time?

If you think I was wrong, then you can politely suggest that and offer
some enlightenment. Why become aggressive and give me the third
degree? Sometimes I feel like I'm in the dock.

I've tried that. It does not result in you acknowledging that you've
made an incorrect claim.

So, reading between the lines, you seem to be suggesting that C /can/
be an unsafe language (whatever that means) whether or not UB is
involved.

Do you acknowledge that you were wrong? Was it a
deliberate exaggeration? Was it a deliberate lie?

Please stop this. If you don't agree with what I said, then post a
counter-argument.

OK. You claimed that C pretends to be a safe language. C does not
pretend to be a safe language. I've posted counter arguments to your
false claim. It doesn't work.

I've made mistakes and acknowledged them when corrected. You've
made mistakes and refused to acknowledge them when corrected.
This damages your credibility. I would have expected you to care
about that.

You should also look at the context: I was explaining the various
underhand, 'unsafe' things that are possible in C, which give it an
edge over competitors for systems work, then I suggest that many of
those are likely to be UB so not officially sanctioned.

Did I express any disagreement with that?

Phew! (Mopping sweaty brow with a handkerchief.)

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From James Kuyper@jameskuyper@alumni.caltech.edu to comp.lang.c on Sun May 10 20:31:17 2026

From Newsgroup: comp.lang.c

On 2026-05-08 06:43, David Brown wrote:
...

Yes, I have heard that argument before. I am unconvinced that the
"value preserving" choice actually has any real advantages. I also
think it is a misnomer - it implies that "unsigned preserving" would
not preserve values, which is wrong.

Unsigned-preserving rules would convert a signed value which might be
negative to unsigned type more frequently than the value preserving
rules do. Such a conversion is not value-preserving.
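
A small sketch of the difference, assuming (as on typical implementations) that int can represent every unsigned char value. The first function uses the promotion rules C actually has; the second emulates what unsigned-preserving rules would do by converting both operands to unsigned int by hand. Both function names are hypothetical.

```c
#include <limits.h>

/*
 * Standard C (value-preserving): uc is promoted to int, so the
 * addition is done in int and a negative result stays negative.
 */
int sum_value_preserving(unsigned char uc, int i)
{
    return uc + i;
}

/*
 * Emulated unsigned-preserving rules: uc would promote to unsigned int,
 * which forces i to be converted to unsigned int as well. A negative i
 * becomes a large positive value, so its value is not preserved.
 */
unsigned int sum_unsigned_preserving(unsigned char uc, int i)
{
    return (unsigned int)uc + (unsigned int)i;
}
```

For uc = 1 and i = -2 the first function yields -1, while the second yields UINT_MAX.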

--- Synchronet 3.22a-Linux NewsLink 1.2

From James Kuyper@jameskuyper@alumni.caltech.edu to comp.lang.c on Sun May 10 20:36:49 2026

From Newsgroup: comp.lang.c

On 2026-05-10 20:26, Waldek Hebisch wrote:

Kalevi Kolttonen <kalevi@kolttonen.fi> wrote:

Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

In almost all cases where uint8_t
might be used, unsigned char works just as well.

Why "almost"? Where is the difference if any?

As far as I know, ISO guarantees that
sizeof(unsigned char) is always 1 byte.

And operations on unsigned char are well defined,
including wrap-around. So I fail to see any
difference between unsigned char and uint8_t.

If a machine has bytes bigger than 8 bits, then uint8_t will not
exist, so trying to use uint8_t will fail at compile time, which
may be a good thing, if code depends on size being exactly 8
bits.

He referred to "cases where uint8_t might be used". Such machines don't
allow such usage, so his statement doesn't cover them.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Sun May 10 17:42:15 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 11/05/2026 01:10, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

(You can also see from this that /nobody/ likes stdint.h types, even
though standardised from C99 which also introduced 'long long' used
here. That is another bugbear. Oh, I forgot, my criticism is not
valid.)

Bart, you claimed here that literally *nobody* likes stdint.h types.
I like stdint.h types.
Your claim is therefore false.
Will you acknowledge that simple fact?

From Google:

3. Hyperbole (Exaggeration) for Emphasis

"Nobody" is frequently used in hyperbolic statements to emphasize that
almost no one was there, or that the number of people was negligible.

Yes, thank you, I know what hyperbole means. You're obviously saying
that your statement was hyperbole.

You used the word "nobody". You've repeatedly defended your
statement at great length and raised other questions, like asking
for "figures".

You could have said, in response to the first criticism of your
statement, that it was merely hyperbole. It would have saved us
all a great deal of time.

It can honestly be difficult to tell which of your statements are
meant to be taken literally and which are hyperbolic or figurative.
If you make a statement that's not literally true, we can't always
tell whether you believe it or not.

If you make a statement like that and someone challenges it because
it's not literally true, just tell us it wasn't meant literally.
We are not, despite what you appear to believe about us, so pedantic
that we can't accept that.

I might also suggest that you try to avoid exaggerations when
posting here. As you've seen, they can be misinterpreted, and such misinterpretation is not necessarily malicious.

Unless, of course, you enjoy triggering long arguments.

I note your continuing refusal to directly acknowledge that your
statement was false.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sun May 10 18:19:57 2026

From Newsgroup: comp.lang.c

kalevi@kolttonen.fi (Kalevi Kolttonen) writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

In almost all cases where uint8_t
might be used, unsigned char works just as well.

Why "almost"? Where is the difference if any?

As far as I know, ISO guarantees that
sizeof(unsigned char) is always 1 byte.

And operations on unsigned char are well defined,
including wrap-around. So I fail to see any
difference between unsigned char and uint8_t.

I respond downthread to your subsequent posting.
--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Mon May 11 02:33:29 2026

From Newsgroup: comp.lang.c

On 11/05/2026 01:42, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:

On 11/05/2026 01:10, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

(You can also see from this that /nobody/ likes stdint.h types, even
though standardised from C99 which also introduced 'long long' used
here. That is another bugbear. Oh, I forgot, my criticism is not
valid.)

Bart, you claimed here that literally *nobody* likes stdint.h types.
I like stdint.h types.
Your claim is therefore false.
Will you acknowledge that simple fact?

From Google:

3. Hyperbole (Exaggeration) for Emphasis

"Nobody" is frequently used in hyperbolic statements to emphasize that
almost no one was there, or that the number of people was negligible.

Yes, thank you, I know what hyperbole means. You're obviously saying
that your statement was hyperbole.

You used the word "nobody". You've repeatedly defended your
statement at great length and raised other questions, like asking
for "figures".

You could have said, in response to the first criticism of your
statement, that it was merely hyperbole.

I think we've been here before. I was talking figuratively, and didn't
feel the need to point that out.

It would have saved us
all a great deal of time.

A great deal of doing what, arguing about nothing?

--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Sun May 10 18:43:48 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 11/05/2026 01:42, Keith Thompson wrote:

[...]

You could have said, in response to the first criticism of your
statement, that it was merely hyperbole.

I think we've been here before. I was talking figuratively, and didn't
feel the need to point that out.

Finally, a direct acknowledgement.

I can understand not feeling the need to point it out. But you've
repeatedly refused to say so when being directly questioned about it.

You seem unwilling or unable to accept that sometimes the rest of
us *can't tell* whether you're speaking literally or not.

Your unwillingness to give direct answers to direct questions has
resulted in a great deal of noise here. One might almost infer
that it's deliberate on your part.

[...]

Elsewhere in this thread, you said that "C pretends to be a safe
language". Was that hyperbole? It's obvious that it's not literally
true, but what did you actually mean by it?
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Mon May 11 01:44:46 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> wrote:

On 11/05/2026 00:11, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:

I've glanced at appendix K.1 and saw nothing relevant there. It's
about exceeding array bounds.

Your false claim was that C "pretends" to be a safe language.

People keep jumping to conclusions without asking for clarification.
This is what I said (quite a few posts back):

C pretends to be a safe language by saying all those naughty things
are UB and should be avoided, at the same time, C compilers can be made
to do all that.

(I see now you quoted this yourself; I could have saved some time!)

The assumption made here is that unsafe-ness arises in C from UB. Then I
suggest that, while the language itself washes its hands of it, it lets
the compiler do the dirty work (as well as pushing the responsibility to
the user, by allowing the compiler to do something that is UB).

You are seriously confused about what other people consider a
"safe language". First, I do not think it is possible to
give a satisfactory definition of safety; one either gets the idea
or not. One popular attempt at a definition is that a language
is safe if no untrapped errors are possible. Of course, this
definition has trouble because then one needs to say what
an error is. A reasonable definition could be that there is an
error if a program does something different from what its
creator intended. But as you noted, there are errors that a
language implementation cannot reasonably detect, so clearly
the attempt above plus this definition of error is not satisfactory.
So we need to restrict what we consider to be an error.
When talking about language safety, a possible (and popular)
approach is to restrict errors to things that break language
rules, like using out-of-bounds array indices or overflow
in C signed arithmetic.

Now, if you look at UB, UB in particular means that
implementation is not obliged to detect errors. So
UB in language definition means that language is more
or less unsafe.

I think that your formulation "allowing the compiler to do
something that is UB" is quite misleading. Standard says
that some things are UB. If UB appears in a program, it
is programmer who put it there. Essential part of UB is
that it is programmer responsibility to avoid UB.
Specific compiler may be helpful by detecting UB or
defining some useful behaviour, but in general compiler
is allowed to proceed blindly, trusting that there are
no UB in the source.

Coming back to safety, defining errors as violations of
language rules is not fully satisfactory either. Namely,
using a language that "allow anything", like assembler,
there will be no violation of language rules, but clearly
such a language does not help in detecting errors. So
to meaningfully talk about language safety there must
be rules such that some classes of error lead to
violation of rule and violation must be detected. C
has type rules and violations of type rules will
detect some errors at compile time. But by design C
does not require any error detection at runtime so
clearly is unsafe.
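
To make the "untrapped error" idea concrete, here is a minimal sketch of what a trapped out-of-bounds access could look like when written by hand in C. The name `checked_get` and its parameters are invented for illustration; nothing in the C standard requires or provides such a check, which is the point being made above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: reads arr[idx] only when idx is in range.
   Returns false (the "trap") instead of indexing out of bounds. */
bool checked_get(const int *arr, size_t len, size_t idx, int *out)
{
    if (arr == NULL || out == NULL || idx >= len)
        return false;   /* rule violation detected, nothing is read */
    *out = arr[idx];
    return true;
}
```

A language that is "safe" in the sense described would perform the equivalent of this test on every array access; in C it happens only where the programmer writes it.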

Now, unqualified "safe" is really a fuzzy concept, as
there is no hope of detecting all errors and while
detecting some errors is theoretically possible
cost of checking could be prohibitive. So basically
"safe" boils down to "due diligence": language rules
forbid things that are recognized as likely to be
errors and language uses state of the art methods
to detect or prevent violations of the rules.
Let me add that basically since the time Pascal
was invented it has been known how to define a language
rich enough to do most real-world tasks, having rules
which eliminate a substantial fraction of errors and
where _all_ violations of language rules are detected.
So languages that allow undetected violations of rules
are considered more or less unsafe.
--
Waldek Hebisch

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sun May 10 18:37:39 2026

From Newsgroup: comp.lang.c

kalevi@kolttonen.fi (Kalevi Kolttonen) writes:

James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

On 2026-05-10 16:15, Kalevi Kolttonen wrote:

Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

In almost all cases where uint8_t
might be used, unsigned char works just as well.

Why "almost"? Where is the difference if any?

If uint8_t exists, CHAR_BIT must be 8, and unsigned char must therefore
meet the requirements to be the type that uint8_t is a typedef for.
However, the standard doesn't mandate it. If, for example, a machine
supported two different 8-bit types, with the order of the bits from low
to high reversed between them, uint8_t could be one of those types, and
unsigned char could be the other - the C standard imposes no
requirements that would be broken by that choice.
This is not something you're likely to ever see, just a possibility
allowed by the standard that we're extremely unlikely to see.

I see, thanks. So from a practical point of view today, they
appear pretty identical.

The possibility of differing representations had no bearing on my
comment. In most cases two types[*] whose representations happen
to be different can be used interchangeably. ([*] of the same
width and signedness, of course.)

The key point is that unsigned char and uint8_t can be distinct
types even when they have the same representation. Where this
matters is when a pointer is needed to one type or the other. If
for example there is a function with a parameter whose type is
unsigned char *, it doesn't do to take the address of a uint8_t to
supply as the argument, and vice versa. It's easy to imagine that
an implementation could choose to make uint8_t be a type distinct
from unsigned char, in the interest of type safety. Thus one case
where I would choose uint8_t is where there is an externally defined
library function with a uint8_t * parameter. I don't remember ever
seeing that, but if it came up that would be a reason to use uint8_t
rather than unsigned char.
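
A minimal sketch of the pointer situation, assuming a C11 compiler (for `_Generic`) and an implementation that provides uint8_t. The names `fill_bytes`, `same_type`, and `fill_octets` are invented for illustration and do not come from any real library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical library function with an unsigned char * parameter. */
void fill_bytes(unsigned char *p, size_t n, unsigned char v)
{
    for (size_t i = 0; i < n; i++)
        p[i] = v;
}

/* 1 if this implementation makes uint8_t the same type as
   unsigned char, 0 if they are distinct types. */
int same_type(void)
{
    return _Generic((uint8_t)0, unsigned char: 1, default: 0);
}

/* Works either way: values are converted, and no uint8_t * is ever
   passed where an unsigned char * is expected. */
void fill_octets(uint8_t *dst, size_t n, uint8_t v)
{
    for (size_t i = 0; i < n; i++) {
        unsigned char tmp;
        fill_bytes(&tmp, 1, v);
        dst[i] = tmp;
    }
}
```

Where `same_type()` yields 0, passing a `uint8_t *` directly to `fill_bytes` would be a constraint violation without a cast, which is the case described above.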

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Mon May 11 02:18:02 2026

From Newsgroup: comp.lang.c

In article <10tr62r$nu2a$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 11/05/2026 00:11, Keith Thompson wrote:

[snip]

People keep jumping to conclusions without asking for clarification.
This is what I said (quite a few posts back):

C pretends to be a safe language by saying all those naughty things
are UB and should be avoided, at the same time, C compilers can be made
to do all that.

(I see now you quoted this yourself; I could have saved some time!)

The assumption made here is that unsafe-ness arises in C from UB. Then I
suggest that, while the language itself washes its hands of it, it lets
the compiler do the dirty work (as well as pushing the responsibility to
the user, by allowing the compiler to do something that is UB).

The compiler doesn't "do something that is UB." The compiler
detects that something in a program is undefined behavior and
does something as a result (that "something" may be nothing).

This is why people don't take you seriously. You could look
this up in the standard and most probably bring yourself to
understand it if you were so inclined, but you steadfastly
refuse to do so.

In a later follow-up to you I ask:

So, C can be unsafe even when you avoid all UB? Examples?

And yet later I ask for clarification for what it means to be 'unsafe'
and gave some examples of my own. I don't recall that being answered.

How do you possibly expect anyone to answer that? You're the
one who made the claim; you have to be the one who supplies the
definition.

Do you still falsely claim that C pretends
to be safe?

"When you avoid all UB". You keep forgetting this bit.

Well, first tell me what it means for a language to be 'unsafe'. That
term has not been defined. Is it only what happens when UB is invoked,
or can it be at any time?

An unsafe language is one that is not safe. Since you said
that C claims to be "safe" in the cases where "you avoid all UB"
then you must clearly have some definition in mind of what a
"safe" language is. If you are going to claim that C is a
"safe" language with respect to that definition, and someone is
asking you to justify that claim, it's on you to provide the
definition.

If you think I was wrong, then you can politely suggest that and offer
some enlightenment. Why become aggressive and give me the third degree?
Sometimes I feel like I'm in the dock.

Perhaps you feel that way because you make so many outright
false or uninformed statements, and people are telling you that
those are wrong and asking you to justify them. If you were
less inclined to make so many such erroneous statements, perhaps
you would not feel so put upon to justify them.

So, reading between the lines, you seem to be suggesting that C /can/ be
an unsafe language (whatever that means) whether or not UB is involved.

Who knows? You're the one who made a statement that is
predicated on some definition of "safe". You need to supply
that, and then we will all know what definition of "unsafe" you
are using.

Do you acknowledge that you were wrong? Was it a
deliberate exaggeration? Was it a deliberate lie?

Please stop this. If you don't agree with what I said, then post a
counter-argument.

He did.

You should also look at the context: I was explaining the various
underhand, 'unsafe' things that are possible in C, which give it an edge
over competitors for systems work, then I suggest that many of those are
likely to be UB so not officially sanctioned.

Phew! (Mopping sweaty brow with a handkerchief.)

When you say something factually incorrect, and someone corrects
you, you get all butt-hurt about the correction, accusing others
of bullying or "picking on" you.

Honestly, it's sad. Do you have no self-respect? Grow up and
take some responsibility for yourself.

- Dan C.


From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Sun May 10 19:48:46 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:
[...]

The compiler doesn't "do something that is UB." The compiler
detects that something in a program is undefined behavior and
does something as a result (that "something" may be nothing).

Or the compiler *doesn't* detect that something in the program
has undefined behavior, but assumes that the behavior is defined,
and generates code consistent with that assumption. A big part of
the rationale behind "undefined behavior" is that compilers don't
have to detect it.

An example:

#include <stdio.h>
#include <time.h>
#include <limits.h>

int main(void) {
    int n = time(NULL) > 0 ? INT_MAX : 0;
    printf("n=%d, n+1=%d, ", n, n+1);
    printf("%d %s %d\n",
           n+1,
           n+1 > n ? ">" : n+1 == n ? "==" : "<",
           n);
}

With different compilers and optimization settings, I get any of the
following outputs on my system:

n=2147483647, n+1=1, 1 > 2147483647

n=2147483647, n+1=-2147483648, -2147483648 < 2147483647

n=2147483647, n+1=-2147483648, -2147483648 > 2147483647

I'm fairly sure that none of the compilers detect that there will
be undefined behavior at run time. The fact that time(NULL) is
greater than 0 is not something I'd expect a compiler to assume.
(That's why I added that to the program.) Rather, some compilers
assume that the behavior is defined, and therefore that n + 1 must
be greater than n.
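
For comparison, a minimal sketch of how the increment can be written so that the overflow never happens. `checked_increment` is a name invented here and is not part of the program above.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores n + 1 in *out and returns true, or returns false when
   n + 1 would overflow int. The test happens before the addition,
   so no undefined behavior is ever evaluated. */
bool checked_increment(int n, int *out)
{
    if (n == INT_MAX)
        return false;
    *out = n + 1;
    return true;
}
```

Testing `n + 1 > n` after the fact is not a reliable overflow check, for exactly the reason the third output line shows: a compiler may assume the comparison is always true.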

[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Mon May 11 03:09:22 2026

From Newsgroup: comp.lang.c

In article <10trcac$1uosm$1@paganini.bofh.team>,
Waldek Hebisch <antispam@fricas.org> wrote:

[snip]
I think that your formulation "allowing the compiler to do
something that is UB" is quite misleading. Standard says
that some things are UB. If UB appears in a program, it
is programmer who put it there.

This may be strictly true, but it is a trivial statement that
conveys no useful information. This is what I was trying to get
at elsewhere in the thread with the `realloc` example.

Consider this program fragment:

if (p != NULL)
free(realloc(p, 0));

I claim, excepting allocation failures, this is,
1. Well-defined in C90.
2. IB in C17, and
3. UB in C23.

In C90, the return value from `realloc` is IB, but will either
be NULL or a unique pointer (if we accept for a moment that
failure of a zero-size allocation does not happen). In either
case the behavior of `free` is well defined; if `realloc`
returns NULL, "no action occurs." Otherwise, its argument is
obviously a pointer returned from `realloc` and the argument
will be `freed`, whatever that means for an object of zero-size.
Further, `realloc` is explicitly documented to free `p` in this
case, since size is 0 and p is not null, which we know from the
immediately preceding `if` statement.

In C17, the standard is explicit that whether or not `p` is
freed when size is 0 is IB, and it is known that there are
implementations that do not free it in that case.

In C23, the behavior when size is 0 is explicitly undefined.

Admittedly this is a weird program to write. However, I believe
that, barring allocation failures for zero-sized allocations,
the behavior of this program is well-defined if following C90,
and UB in C23. So the programmer who wrote this, presumably
doing their very best to faithfully follow the letter of the
standard as written at the time and ensure well-defined
behavior, will find this changed now.

It is, of course, strictly true that the programmer introduced
the UB; after all, the program would not exist (with UB or not)
had they not written it. But again, that's not a useful
statement.
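
One way to sidestep the question under all three standards is never to pass a size of 0 to `realloc`. A minimal sketch, with `resize_or_free` as an invented name; it is one possible convention, not something any of the standards prescribe.

```c
#include <stdlib.h>

/* Hypothetical wrapper: a requested size of 0 means "release the
   block", spelled as an explicit free() so that realloc(p, 0) is
   never called. Returns NULL after freeing, or on allocation
   failure (in which case the original block is still valid, as
   with realloc itself). */
void *resize_or_free(void *p, size_t size)
{
    if (size == 0) {
        free(p);        /* free(NULL) is a no-op */
        return NULL;
    }
    return realloc(p, size);
}
```

The caller has to look at the size it passed to tell a deliberate release from an allocation failure, since both return NULL.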

Essential part of UB is
that it is programmer responsibility to avoid UB.

See above.

Specific compiler may be helpful by detecting UB or
defining some useful behaviour, but in general compiler
is allowed to proceed blindly, trusting that there are
no UB in the source.

In fact, it kind of has no other choice, because C is not rich
enough to do anything else.

C code is full of things like, `memcpy(dst, src, len);`. This
function is just dangerous; there are so many ways that it can
fail:

1. Either `dst` or `src` could be invalid pointers (NULL,
dangling, uninitialized; whatever).
2. `dst` may not be writable.
3. `src` may not be readable (C has no real notion of this, but
write-only and execute-only memory is absolutely a thing).
4. Even if `src` and `dst` are valid, the language provides no
means to guarantee that the range bounded by `len` is valid
for both: it could overlap, span over multiple objects, and
so on.
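
About the most a wrapper can do is check what the caller tells it. A sketch, with `copy_checked` and its capacity parameters as invented names: it catches null pointers and lengths that exceed the stated sizes, and it cannot detect dangling pointers, unreadable or unwritable memory, or capacities that are simply wrong.

```c
#include <stddef.h>
#include <string.h>

/* Copies len bytes if both pointers are non-null and len fits in
   both stated capacities. Returns 0 on success, -1 otherwise.
   memmove is used so that overlapping ranges are handled. */
int copy_checked(void *dst, size_t dst_cap,
                 const void *src, size_t src_cap, size_t len)
{
    if (dst == NULL || src == NULL)
        return -1;
    if (len > dst_cap || len > src_cap)
        return -1;
    memmove(dst, src, len);
    return 0;
}
```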

There is no possible, universally "correct" thing C could do
here. And how any of those things impact the actual program
varies between systems. Moreover, there's no way that these can
be detected ahead of time. So what else _can_ the language do,
other than declaring the effect of all of the above to be
"undefined?"

- Dan C.


From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Sun May 10 21:21:49 2026

From Newsgroup: comp.lang.c

On 5/10/2026 4:58 PM, Bart wrote:

On 11/05/2026 00:11, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:

I've glanced at appendix K.1 and saw nothing relevant there. It's
about exceeding array bounds.

Your false claim was that C "pretends" to be a safe language.

People keep jumping to conclusions without asking for clarification.
This is what I said (quite a few posts back):

C pretends to be a safe language by saying all those naughty things

are UB and should be avoided, at the same time, C compilers can be made
to do all that.[...]

Safety? Like what? Can your lang/compiler/system prevent one from
creating a virus designed for another system, say it dumps out malicious
ASM code for another arch, or for your arch, and JMP's into it? Safe for
sure, but bad as can be? I am jesting here, but you do not want to put
"corks on the forks", right?

Imvho, C needs to be like it is... To allow one to shoot themselves in
the foot! Both feet. ;^)

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Sun May 10 21:28:58 2026

From Newsgroup: comp.lang.c

On 5/10/2026 8:42 AM, Scott Lurndal wrote:

Bart <bc@freeuk.com> writes:

On 09/05/2026 23:47, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Now look at what's involved in splitting a C module into two.

C doesn't have "modules".

You want to be /that/ pedantic?

This is exactly why I said that the C standard is your thing. If
somebody uses a term that doesn't appear in the standard, then it
doesn't exist.

C doesn't have a concept of 'module' per se. Perhaps you're looking
for "translation unit"?

So, what is involved in splitting a ... I don't even know what to call
it - a single .c 'source file'? Well, a lot of messy work.

An experienced C programmer uses independent translation units
without even thinking about it, when the application is
non-trivial. For many reasons, including reusability,
maintainability and collaboration. There are codebases that
have well over a million SLOC.

You are the only programmer who has ever claimed
that an entire application must be contained within a single
translation unit. It sounds like you've never actually worked
with either a team, or a non-trivial application.

I wonder if his system has pre-compiled header support.

The context was why C became the dominant language for systems
programming. I offered that as an example. If it helped C over a
potential rival which wasn't used to implement a major OS, then it
strikes me as an unfair advantage.

Suppose Unix was implemented in some other language, then if C was still
more successful over rivals, that would have been fairer.

Fair? What is your definition of "fair" with respect to programming languages?


From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sun May 10 21:49:50 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <86mry8so39.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[sic: as a sequence, the Fibonacci numbers are undefined for
$n<0$, but this is a pedagogical example, so let's ignore that]

A comment on that further down...

[snip]

(Origin 0, not 1.)

fibonacci(0) is 0. There is no other.

You are correct, and I was incorrect stating that Fib(n) is
undefined for n<0. [...]

No worries, I just wanted to stick to the usual formulation.

[snip]
Here is my current favorite fast fibonacci function (which happens
to be written in a functional and tail-recursive style):

static ULL ff( ULL, ULL, unsigned, unsigned );
static unsigned lone( unsigned );

ULL
ffibonacci( unsigned n ){
    return ff( 1, 0, lone( n ), n );
}

ULL
ff( ULL a, ULL b, unsigned m, unsigned n ){
    ULL c = a+b;
    return
        m & n ? ff( (a+c)*b, b*b+c*c, m>>1, n ) :
        m ? ff( a*a+b*b, (a+c)*b, m>>1, n ) :
        /*****/ b;
}

unsigned
lone( unsigned n ){
    return n |= n>>1, n |= n>>2, n |= n>>4, n ^ n>>1;
}

Much faster than the linear version.

Very nice. 64-bit `unsigned long long` overflows for n>93, so I
question how much it matters in practice, though; surely if
calling this frequently you simply cache it in some kind of
table?

Depends on the cost of a cache miss. Because the computational
version is very fast, I tend to prefer it over a lookup table
with its higher variability.

I wondered how this compared to Binet's Formula, using floating
point:

```
#include <math.h>

unsigned long long
binet_fib(unsigned int n)
{
    const long double sqrt5 = sqrtl(5.);

    long double fn =
        (powl(1. + sqrt5, n) - powl(1. - sqrt5, n)) /
        (powl(2., n) * sqrt5);

    return llroundl(fn);
}
```

Sadly, my quick test suggests accuracy suffers (presumably due
to floating point) for the larger representable values in the
sequence; specifically, n>90. As a result I didn't bother
attempting to benchmark it.

Yes, that is the perennial problem with floating point. Also
it's hard to scale up floating point, and fairly easy with
integers. Here is code that works up to n = 186:

typedef __uint128_t U128;

static U128 qff( U128, U128, unsigned, unsigned );
static unsigned lone( unsigned );

U128
qfibonacci( unsigned n ){
    return qff( 1, 0, lone( n ), n );
}

U128
qff( U128 a, U128 b, unsigned m, unsigned n ){
    U128 c = a+b;
    return
        m & n ? qff( (a+c)*b, b*b+c*c, m>>1, n ) :
        m ? qff( a*a+b*b, (a+c)*b, m>>1, n ) :
        /*****/ b;
}

unsigned
lone( unsigned n ){
    return n |= n>>1, n |= n>>2, n |= n>>4, n ^ n>>1;
}
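
A note on `lone`, offered as a reading of the code rather than anything stated in the post: it appears to isolate the highest set bit of n, which then serves as the starting mask for walking the bits of n from the top down. With shifts of only 1, 2, and 4, the smearing covers 8 bits, so the sketch below assumes n < 256, which is enough for n <= 186. `highest_bit8` is an invented name for the same expression written one step per statement.

```c
/* Same computation as lone(), spelled out. Assumes n < 256. */
unsigned highest_bit8(unsigned n)
{
    n |= n >> 1;          /* top set bit and the bit below it are now 1 */
    n |= n >> 2;          /* top set bit and the three below it are now 1 */
    n |= n >> 4;          /* every bit below the top set bit is now 1 */
    return n ^ (n >> 1);  /* clear everything except the top set bit */
}
```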

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sun May 10 22:31:59 2026

From Newsgroup: comp.lang.c

antispam@fricas.org (Waldek Hebisch) writes:

Dan Cross <cross@spitfire.i.gajendra.net> wrote:

In article <86mry8so39.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

<snip>

Here is my current favorite fast fibonacci function (which happens
to be written in a functional and tail-recursive style):

static ULL ff( ULL, ULL, unsigned, unsigned );
static unsigned lone( unsigned );

ULL
ffibonacci( unsigned n ){
    return ff( 1, 0, lone( n ), n );
}

ULL
ff( ULL a, ULL b, unsigned m, unsigned n ){
    ULL c = a+b;
    return
        m & n ? ff( (a+c)*b, b*b+c*c, m>>1, n ) :
        m ? ff( a*a+b*b, (a+c)*b, m>>1, n ) :
        /*****/ b;
}

unsigned
lone( unsigned n ){
    return n |= n>>1, n |= n>>2, n |= n>>4, n ^ n>>1;
}

Much faster than the linear version.

Very nice. 64-bit `unsigned long long` overflows for n>93, so I
question how much it matters in practice, though; surely if
calling this frequently you simply cache it in some kind of
table?

I wondered how this compared to Binet's Formula, using floating
point:

```
#include <math.h>

unsigned long long
binet_fib(unsigned int n)
{
    const long double sqrt5 = sqrtl(5.);

    long double fn =
        (powl(1. + sqrt5, n) - powl(1. - sqrt5, n)) /
        (powl(2., n) * sqrt5);

    return llroundl(fn);
}
```

Sadly, my quick test suggests accuracy suffers (presumably due
to floating point) for the larger representable values in the
sequence; specifically, n>90. As a result I didn't bother
attempting to benchmark it.

Fast versions of fibonacci depend on fast computation of a matrix
power (of a two by two matrix). One way to have fast matrix power
is to diagonalize and use floating point (which is essentially
what is done by Binet's Formula), but as you noted this needs extra
precision. Tim's version looks like a somewhat obscure variant
of fast matrix powering.

I started not from a matrix power approach but from a simple
recurrence relationship: if we know fib(k-1) and fib(k), we can
compute fib(2k-1) and fib(2k) (and so also fib(2k+1)) using just
multiplication and addition. After that it's just a matter of
deciding which track to go down.
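
The recurrence can also be written as a loop. The sketch below is an iterative restatement, not the posted code: the identities fib(2k-1) = fib(k-1)^2 + fib(k)^2, fib(2k) = fib(k) * (fib(k-1) + fib(k+1)), and fib(2k+1) = fib(k)^2 + fib(k+1)^2 are read off from the `ff` function quoted above, and the name `fib_doubling` is invented. It assumes 64-bit results, so values are exact only for n <= 93; beyond that the unsigned arithmetic wraps.

```c
#include <stdint.h>

uint64_t fib_doubling(unsigned n)
{
    /* Find the highest set bit of n; mask stays 0 when n is 0. */
    unsigned mask = 0;
    for (unsigned t = n; t != 0; t >>= 1)
        mask = mask ? mask << 1 : 1;

    /* Invariant: a = fib(k-1), b = fib(k), starting with k = 0. */
    uint64_t a = 1, b = 0;
    for (; mask != 0; mask >>= 1) {
        uint64_t c = a + b;              /* fib(k+1) */
        if (n & mask) {                  /* k becomes 2k+1 */
            a = (a + c) * b;             /* fib(2k)   */
            b = b * b + c * c;           /* fib(2k+1) */
        } else {                         /* k becomes 2k */
            uint64_t t = a * a + b * b;  /* fib(2k-1) */
            b = (a + c) * b;             /* fib(2k)   */
            a = t;
        }
    }
    return b;
}
```

Each pass consumes one bit of n, so the loop runs once per significant bit of n instead of n times.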

This has advantage of doing all computations on integers.

Right! Integers are cool.

Of course, to make sense this must use increased precision,
preferably arbitrary precision arithmetic.

For python enthusiasts here is a python version:

def fibonacci( n ) :
    return ff2( 1, 0, high_mask( 1, n ), n )

def high_mask( m, n ) :
    return m>>1 if m > n else high_mask( m<<1, n )

def ff2( a, b, m, n ) :
    c = a+b
    if m & n : return ff2( (a+c)*b, b*b+c*c, m>>1, n )
    if m : return ff2( a*a+b*b, (a+c)*b, m>>1, n )
    return b

That said, fibonacci numbers grow fast enough so in most cases we
probably don't need the full generality of arbitrary precision.
Here is my latest 128-bit version:

typedef __uint128_t U128;

static U128 qff( U128, U128, unsigned, unsigned );
static unsigned mone( unsigned );

U128
zfibonacci( unsigned n ){
    return n>10 ? qff( 0,1,mone(n),n ) : 0x410831483558b7 >>(60-6*n) &077;
}

U128
qff( U128 a, U128 b, unsigned m, unsigned n ){
    U128 c = a+b;
    return
        m & n ? qff( (a+c)*b, b*b+c*c, m>>1, n ) :
        m ? qff( a*a+b*b, (a+c)*b, m>>1, n ) :
        /*****/ b;
}

unsigned
mone( unsigned n ){
    return n |= n>>1, n |= n>>2, n |= n>>4, (n ^ n>>1) >>1;
}

This version might run just a tad faster than my last version, by
shaving one iteration off cases where n is not 0. Also the n > 10
test allows computing small values by what is sort of a table
lookup, encoding the first 11 values in a 64-bit integer quantity,
and extracting the needed value appropriately.

This gives fibonacci(186) is 332825110087067562321196029789634457848
(the largest value that fits in 128 bits).

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Mon May 11 07:43:23 2026

From Newsgroup: comp.lang.c

On 2026-05-11 06:21, Chris M. Thomasson wrote:

[ C's characteristics ]

To allow one to shoot themselves in the foot! Both feet. ;^)

To stress that picture...

"C" allows you to shoot yourself in your foot, but if you
manage to shoot in both of your feet with a single bullet
then it's the programmer's fault!

Janis


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Mon May 11 07:58:25 2026

From Newsgroup: comp.lang.c

On 2026-05-10 13:44, Bart wrote:

[...]

Do people understand mine? 90% of my posts are about defending my
position especially when attacked on multiple fronts.

I'm sure if you'd not open multiple fronts you'd not suffer from
such effects. I suggest to stay on topic, read and understand the
answers, and don't shift goalposts or create straw man arguments,
don't assume things inappropriately, or put words in others' mouth.

I can say something and immediately I get attacked and accused of not
knowing this or that, by people who get the wrong end of the stick or
pick up on a choice of word I used.

(How likely do you think is it that it's the fault of the hostile
environment and your personality and communication or the level of
expertise has nothing to do with it?)

The context was why C became the dominant language for systems
programming. I offered that as an example. If it helped C over a
potential rival which wasn't used to implement a major OS, then it
strikes me as an unfair advantage.

Keith already said that it was an advantage. Insisting on a "unfair"
qualification is inappropriate, especially without ethical measure
and without any substantial evidence. (That wording reminds me of the
communication style of the current POTUS.)

Hmm, weren't Microsoft accused of unfair practice by bundling their
browsers with Windows?

Those are completely different cases by any measure. You are comparing
apples and oranges.

But I now see where your misconception came from, so thanks for that
clarification. (It would have been helpful to explain your thinking
in the first place instead of driving around unnecessarily in many
posts.)

Janis


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Mon May 11 08:06:18 2026

From Newsgroup: comp.lang.c

On 2026-05-10 14:37, Dan Cross wrote:

In article <10tp4o8$1l93k$7@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-09 03:36, Dan Cross wrote:

Maybe, maybe not, depending on the exact hashing function and
the values it uses. Since K&R2 came up elsewhere, consider the
hash function they presented on pp 128-129:

(I don't have that version available so the reference doesn't
help me much.)

I mean, I gave you the function; you quoted it. :-)

Erm, no. I referred to something from an earlier K&R release.
The algorithm was different from the one you posted, and the
modulus was also different; using 100 (2*2*5*5) vs. 101 (this
is a prime) makes a difference. - It seems the newer K&R that
you were referring to used a better modulus than the old book.
Never mind.

Janis

[...]


From David Brown@david.brown@hesbynett.no to comp.lang.c on Mon May 11 08:55:54 2026

From Newsgroup: comp.lang.c

On 11/05/2026 02:31, James Kuyper wrote:

On 2026-05-08 06:43, David Brown wrote:
...

Yes, I have heard that argument before. I am unconvinced that the
"value preserving" choice actually has any real advantages. I also
think it is a misnomer - it implies that "unsigned preserving" would
not preserve values, which is wrong.

Unsigned-preserving rules would convert a signed value which might be
negative to unsigned type more frequently than the value preserving
rules do. Such a conversion is not value-preserving.

If you have a signed value, you have a signed type. Unsigned-preserving
rules are also signed-preserving - smaller unsigned types promote to
bigger unsigned types, while smaller signed types promote to bigger
signed types. I don't think anyone ever suggested smaller signed types
should promote to larger unsigned types.

Perhaps I am being bone-headed here and missing something obvious.
(Given that the C committee put in a lot of effort and came to a
different conclusion, it seems very likely that I'm missing something.)

Unsigned-preserving promotions would, AFAICS, preserve value and
signedness :

unsigned short -> unsigned int
signed short -> signed int

Value-preserving promotions would preserve values too :

unsigned short -> signed int
signed short -> signed int

The unsigned-preserving promotions could also safely be applied even if
short is the same size as int - that is not the case for the "always
promote to signed int" rules.
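
A small sketch of what the adopted rule means in practice, assuming C11 for `_Generic`. On implementations where int is wider than short, unsigned short operands promote to signed int, which is where the 16-bit multiplication overflow discussed earlier in the thread comes from. `promotes_to_int` and `mul16` are names invented here.

```c
#include <stdint.h>

/* 1 if unsigned short promotes to int on this implementation (the
   usual case, int wider than short), 0 if it promotes to unsigned. */
int promotes_to_int(void)
{
    return _Generic((unsigned short)0 + (unsigned short)0,
                    int: 1, default: 0);
}

/* Multiply two 16-bit values without signed overflow: converting one
   operand to unsigned int makes the arithmetic unsigned, which wraps
   instead of being undefined. */
uint16_t mul16(uint16_t a, uint16_t b)
{
    return (uint16_t)((unsigned int)a * b);
}
```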


From David Brown@david.brown@hesbynett.no to comp.lang.c on Mon May 11 09:29:29 2026

From Newsgroup: comp.lang.c

On 11/05/2026 01:19, Kalevi Kolttonen wrote:

James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

On 2026-05-10 16:15, Kalevi Kolttonen wrote:

Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

In almost all cases where uint8_t
might be used, unsigned char works just as well.

Why "almost"? Where is the difference if any?

If uint8_t exists, CHAR_BIT must be 8, and unsigned char must therefore
meet the requirements to be the type that uint8_t is a typedef for.
However, the standard doesn't mandate it. If, for example, a machine
supported two different 8-bit types, with the order of the bits from low
to high reversed between them, uint8_t could be one of those types, and
unsigned char could be the other - the C standard imposes no
requirements that would be broken by that choice.
This is not something you're likely to ever see, just a possibility
allowed by the standard that we're extremely unlikely to see.

I see, thanks. So from a practical point of view today, they
appear pretty identical.

You are right from the practical viewpoint. Tim is correct that there
is no guarantee in the standards to say that "unsigned char" and
"uint8_t" are the same type - but there are no (AFAIK) implementations
in which they are different. The C standards allow an implementation to
have "extended integer types", and uint8_t could be one of these, but I
have never heard of the existence of a compiler that has them.

To me, the biggest difference is the names. "uint8_t" says "this is an
8-bit unsigned integer" - a small number. "unsigned char" says
"standard C type for a raw byte of memory". So if I am using the type
to represent a number, I will always use "uint8_t". If it is for raw
memory access, I might use "unsigned char" which is standard practice in
general C programming, but "uint8_t" is also very common for the purpose
in my field.
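
As an illustration of the raw-memory use, a sketch that inspects an object's bytes through `unsigned char *`, which the standard permits for any object type. `count_nonzero_bytes` is an invented name.

```c
#include <stddef.h>

/* Counts the bytes of an object that are not zero, reading the
   object representation through unsigned char. */
size_t count_nonzero_bytes(const void *obj, size_t size)
{
    const unsigned char *bytes = obj;
    size_t count = 0;
    for (size_t i = 0; i < size; i++)
        if (bytes[i] != 0)
            count++;
    return count;
}
```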

I am happy to code on the assumption that they are always the same type
(except, obviously, on platforms where uint8_t does not exist -
something I have not had the displeasure of using for a couple of
decades, but which always looms in the shadows when a customer wants a
card that runs at high temperatures). But if I am, say, writing a code
snippet for c.l.c., I'll use a slightly different style from my
professional coding and would be sure to use "unsigned char" for raw
memory access and not mix it with uint8_t.

(Actually, the whole idea that a character - a letter, digit or other
such symbol - can be "signed" or "unsigned" makes no sense. But the
names are what they are.)

--- Synchronet 3.22a-Linux NewsLink 1.2

From David Brown@david.brown@hesbynett.no to comp.lang.c on Mon May 11 09:46:17 2026

From Newsgroup: comp.lang.c

On 11/05/2026 01:58, Bart wrote:

On 11/05/2026 00:11, Keith Thompson wrote:

In a later follow-up to you I ask:

So, C can be unsafe even when you avoid all UB? Examples?

And yet later I ask for clarification for what it means to be 'unsafe'
and gave some examples of my own. I don't recall that being answered.

I may be losing track of this discussion, but I think it was /you/ who
first talked about "C pretends to be a safe language" and "C can be unsafe".

What do /you/ mean by "safe" and "unsafe" ?

For my part, I don't think those words have any practical meaning as
they stand, without context or qualification.

You can have "dynamic memory safe" - the language makes it hard to make mistakes such as forgetting to free memory that is no longer in use, or
trying to free memory twice.

You can have "memory access safe" - the language makes it hard to have
buffer overflows, access arrays out of bounds, or dereference invalid pointers.

You can have "type safe" - the language makes it hard to mix up types or
use them incorrectly.

As it stands, a "safe" language would be one where it is hard to write
buggy or incorrect code. I've never seen such a language (though some languages certainly reduce the risks of certain classes of bugs). "Safe language" is the kind of term you expect from marketing folk and
politicians - not from programmers.

No, C does not "pretend to be a safe language". Yes, "C can be unsafe
even when you avoid all UB". You can make mistakes in your C
programming without hitting UB, and no C programmer would suggest otherwise.
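
One way to picture a mistake that involves no UB at all is unsigned wraparound. This is a hedged sketch, not an example taken from the thread; the function names `remaining_naive` and `remaining_checked` are hypothetical.

```c
#include <stddef.h>

/* Fully defined behaviour, but wrong when used > cap: unsigned
   subtraction wraps, so the caller sees an enormous "free space". */
static size_t remaining_naive(size_t cap, size_t used)
{
    return cap - used;
}

/* The same computation with the intended lower bound made explicit. */
static size_t remaining_checked(size_t cap, size_t used)
{
    return used > cap ? 0 : cap - used;
}
```

Calling `remaining_naive(4, 5)` yields the largest `size_t` value, which no sanitizer for undefined behaviour is obliged to flag, because nothing undefined happened.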

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Mon May 11 11:53:34 2026

From Newsgroup: comp.lang.c

On 10/05/2026 16:47, Bart wrote:

On 10/05/2026 15:58, Scott Lurndal wrote:

Bart <bc@freeuk.com> writes:

On 09/05/2026 17:38, David Brown wrote:

On 09/05/2026 18:16, Bart wrote:

(You can also see from this that /nobody/ likes stdint.h types, even
though standardised from C99 which also introduced 'long long' used
here. That is another bugbear. Oh, I forgot, my criticism is not
valid.)

Don't you realise that when you write things like that, you are only
demonstrating why so many people do not take you seriously? Have you
checked with every C programmer, and every person writing systems that
generate C code, and checked that none of them like the <stdint.h>
types? No? I thought not.

So, what's the figure?

One doesn't understand your question. Is 'figure' some Britishism
in this context? Or do you expect David to provide an accurate
percentage describing the preferences of every C programmer on
the planet (or in orbit, if any of the current station occupants
can program in C :-).

Personally, for my working code, the stdint types are used
extensively.

You know, I could well be right, and nobody does like them, apart of
course from people here. Instead they could just be tolerated.

I doubt whether they are loved, otherwise we'd see those _t suffixes in other languages too because they look so good.

Perhaps try asking why somebody would invent a new type name for uint8_t
at all.

Strawman. Please provide examples of "somebody inventing a new type name
for uint8_t" (post standardization). One swallow doesn't make a
summer, so a single example from some obscure project you found on the
WWW isn't particularly instructive.

You invite people to give examples, but then immediately qualify that by putting restrictions on quantity and popularity so that they can never win!

For other people's benefit:

  typedef uint8_t byte;

(From: https://github.com/arduino/ArduinoCore-avr/blob/master/cores/arduino/Arduino.h)

  typedef int64_t mz_int64;

(From a compression library called "miniz")

  typedef uint32_t Uint32;

(From SDL2 header files)

This I just discovered by chance. It's a small Reddit language project
which here transpiles to C:

self.emit("// Core types");
self.emit("typedef int64_t Int;");
self.emit("typedef int8_t Int8;");
self.emit("typedef int16_t Int16;");
self.emit("typedef int32_t Int32;");
self.emit("typedef int64_t Int64;");
self.emit("typedef uint64_t UInt;");
self.emit("typedef uint8_t UInt8;");
...

You'd think that if transpiling to C anyway, they can tolerate using
"int64_t" in the generated C. But apparently not.

I don't blame them; I do the same:

typedef signed char i8;
typedef short i16;
typedef int i32;
typedef long long int i64;
typedef unsigned char u8;
typedef unsigned short u16;
typedef unsigned int u32;
typedef unsigned long long int u64;

In this case however I don't use any standard headers.
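
Aliases built directly on the basic types, without any standard header, only have the advertised widths on platforms where those basic types happen to match. A minimal sketch of guarding that assumption at compile time follows; it assumes C11 `_Static_assert` and 8-bit bytes, and the function `sum4` is a hypothetical stand-in for generated code.

```c
/* The aliases as quoted above (no standard headers involved). */
typedef signed char            i8;
typedef short                  i16;
typedef int                    i32;
typedef long long int          i64;
typedef unsigned char          u8;
typedef unsigned short         u16;
typedef unsigned int           u32;
typedef unsigned long long int u64;

/* Assumes CHAR_BIT == 8. Compilation fails on a platform where a
   basic type does not have the width its alias promises. */
_Static_assert(sizeof(i8)  == 1, "i8 is not 8 bits");
_Static_assert(sizeof(i16) == 2, "i16 is not 16 bits");
_Static_assert(sizeof(i32) == 4, "i32 is not 32 bits");
_Static_assert(sizeof(i64) == 8, "i64 is not 64 bits");
_Static_assert(sizeof(u64) == sizeof(i64), "u64 and i64 differ in size");

/* Hypothetical generated function using the short names. */
static u64 sum4(u64 a, u64 b, u64 c, u64 d)
{
    return a + b + c + d;
}
```

The checks cost nothing at run time and keep the short names honest if the generated C is ever compiled for a different target.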

I do it because, when needing to check the generated C for original
source that looks like this:

func F(u64 a, b, c, d)u64 = ...

The output C is this:

static u64 fred_f(u64 a, u64 b, u64 c, u64 d) {

(I don't bother combining common types in this case, but I do in some
cases of generated IL since I spend a lot more time with those.)

It is easier to see than:

static unsigned long long int fred_f(unsigned long long int a, unsigned
long long int b, unsigned long long int c, unsigned long long int d) {

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Mon May 11 10:56:27 2026

From Newsgroup: comp.lang.c

In article <10trrkq$1l93k$9@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-10 14:37, Dan Cross wrote:

In article <10tp4o8$1l93k$7@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-09 03:36, Dan Cross wrote:

Maybe, maybe not, depending on the exact hashing function and
the values it uses. Since K&R2 came up elsewhere, consider the
hash function presented on pp 128-129:

(I don't have that version available so the reference doesn't
help me much.)

I mean, I gave you the function; you quoted it. :-)

Erm, no. I referred to something from an earlier K&R release.

Ah, ok.

The algorithm was different from the one you posted,

The version I posted was from K&R2; the version from K&R 1st
Edition is as follows:

```
#define HASHSIZE 100

hash(s)
char *s;
{
    int hashval;

    for (hashval = 0; *s != '\0'; )
        hashval += *s++;
    return (hashval % HASHSIZE);
}
```
(Note that this is historical C. I have reproduced it verbatim,
modulo mistakes in transcription)

and the
modulus was also different; using 100 (2*2*5*5) vs. 101 (this
is a prime) makes a difference. - It seems the newer K&R that
you were referring to used a better modulus than the old book.
Never mind.

No, the newer version uses a _multiplier_. For reference, the
newer algorithm is:

```
#define HASHSIZE 101

unsigned hash(char *s)
{
    unsigned hashval;

    for (hashval = 0; *s != '\0'; s++)
        hashval = *s + 31 * hashval;

    return hashval % HASHSIZE;
}
```
Note the `31 * hashval` term in the loop. 31 is a prime number,
and thus coprime to the hash size (which also happens to be
prime, but that's not _as_ important; see the analysis I linked
to earlier to understand the math).

That is the essential difference between the two.
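
To make the difference concrete, here is a hedged side-by-side sketch of the two schemes. It is not a verbatim copy of either book: it uses `unsigned` arithmetic and `unsigned char` in both, and the names `hash_add`, `hash_mul`, `HASHSIZE_OLD` and `HASHSIZE_NEW` are made up for this illustration.

```c
#define HASHSIZE_OLD 100   /* table size used with the additive hash */
#define HASHSIZE_NEW 101   /* table size used with the multiplier hash */

/* Additive scheme: the result depends only on which characters occur,
   not on their order, so anagrams always collide. */
static unsigned hash_add(const char *s)
{
    unsigned hashval = 0;

    for (; *s != '\0'; s++)
        hashval += (unsigned char)*s;
    return hashval % HASHSIZE_OLD;
}

/* Multiplier scheme: each step scales the running value by 31 before
   adding the next character, so position matters. */
static unsigned hash_mul(const char *s)
{
    unsigned hashval = 0;

    for (; *s != '\0'; s++)
        hashval = (unsigned char)*s + 31 * hashval;
    return hashval % HASHSIZE_NEW;
}
```

With an ASCII execution character set, `hash_add("ab")` and `hash_add("ba")` both give 95, whereas `hash_mul("ab")` gives 75 and `hash_mul("ba")` gives 4.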

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Mon May 11 12:15:35 2026

From Newsgroup: comp.lang.c

On 11/05/2026 02:44, Waldek Hebisch wrote:

Bart <bc@freeuk.com> wrote:

On 11/05/2026 00:11, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:

I've glanced at appendix K.1 and saw nothing relevant there. It's
about exceeding array bounds.

Your false claim was that C "pretends" to be a safe language.

People keep jumping to conclusions without asking for clarification.
This is what I said (quite a few posts back):

C pretends to be a safe language by saying all those naughty things

are UB and should be avoided, at the same time, C compilers can be made
to do all that.

(I see now you quoted this yourself; I could have saved some time!)

The assumption made here is that unsafe-ness arises in C from UB. Then I
suggest that, while the language itself washes its hands of it, it lets
the compiler do the dirty work (as well as pushing the responsibility to
the user, by allowing the compiler to do something that is UB).

You are seriously confused by what other people consider as
"safe language".

Yes. You say that as though I shouldn't be ...

First, I do not think it is possible to
give satisfactory definition of safety, either get the idea
or not. One popular attempt at definition is that language
is safe if no untrapped errors are possible. Of course, this
definition has trouble because then one needs to say what
an error is. Reasonable definition could be that there is an
error if program is doing different thing than intended by
its creator. But as you noted there are errors that
language implementation can not reasonably detect so clearly
attempt above + this definition of error is not satisfactory.
So we need to restrict what we consider to be an error.
When talking about language safety possible (and popular)
approach is restrict errors to things that break language
rules, like using out of bound array indices or overflow
in C signed arithmetic.

Now, if you look at UB, UB in particular means that
implementation is not obliged to detect errors. So
UB in language definition means that language is more
or less unsafe.

I think that your formulation "allowing the compiler to do
something that is UB" is quite misleading. Standard says
that some things are UB. If UB appears in a program, it
is programmer who put it there. Essential part of UB is
that it is programmer responsibility to avoid UB.
Specific compiler may be helpful by detecting UB or
defining some useful behaviour, but in general compiler
is allowed to proceed blindly, trusting that there are
no UB in the source.

Coming back to safety, defining errors as violations of
language rules is not fully satisfactory too. Namely,
using language that "allow anything", like assembler,
there will be no violation of language rules, but clearly
such language does not help in detecting error. So
to meaningfully talk about language safety there must
be rules such that some classes of error lead to
violation of rule and violation must be detected. C
has type rules and violations of type rules will
detect some errors at compile time. But by design C
does not require any error detection at runtime so
clearly is unsafe.

Now, unqualified "safe" is really a fuzzy concept, as
there is no hope of detecting all errors and while
detecting some errors is theoretically possible
cost of checking could be prohibitive. So basically
"safe" boils down to "due diligence": language rules
forbid things that are recognized as likely to be
errors and language uses state of the art methods
to detect or prevent violations of the rules.
Let me add that basically from time where Pascal
were invented it was known how to define a language
rich enough to do most real world task, having rules
which eliminate substantial fraction of errors and
where _all_ violations of language rules are detected.

... but then you do a very good job of demonstrating why anyone could be confused!
But thank you for engaging in the topic and providing some examples.

Assembly language is a good one. Clearly it does have some rules, but if
some program manages to assemble, it doesn't mean it has no bugs,
including dangerous ones.

Other languages will have a line drawn elsewhere, as they have more
rules, stricter typing etc. Some, like Rust, which /people/ sometimes
claim will give you bug-free programs once you managed to get it to
compile, have it near the opposite end.

To get back to C and UB, if that 'safe' line isn't on the boundary
between non-UB and UB, then what does the boundary mean? Is it just deterministic vs. non-deterministic behaviour?

So languages that allow undetected violations of rules
are considered more or less unsafe.

This is back to the other topic as to what makes a practical systems
language.
--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Mon May 11 12:39:24 2026

From Newsgroup: comp.lang.c

On 11/05/2026 03:48, Keith Thompson wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:
[...]

The compiler doesn't "do something that is UB." The compiler
detects that something in a program is undefined behavior and
does something as a result (that "something" may be nothing).

Or the compiler *doesn't* detect that something in the program
has undefined behavior, but assumes that the behavior is defined,
and generates code consistent with that assumption. A big part of
the rationale behind "undefined behavior" is that compilers don't
have to detect it.

An example:

#include <stdio.h>
#include <time.h>
#include <limits.h>

int main(void) {
    int n = time(NULL) > 0 ? INT_MAX : 0;
    printf("n=%d, n+1=%d, ", n, n+1);
    printf("%d %s %d\n",
           n+1,
           n+1 > n ? ">" : n+1 == n ? "==" : "<",
           n);
}

With different compilers and optimization settings, I get any of the following outputs on my system:

n=2147483647, n+1=1, 1 > 2147483647

n=2147483647, n+1=-2147483648, -2147483648 < 2147483647

n=2147483647, n+1=-2147483648, -2147483648 > 2147483647

I'm fairly sure that none of the compilers detect that there will
be undefined behavior at run time. The fact that time(NULL) is
greater than 0 is not something I'd expect a compiler to assume.
(That's why I added that to the program.) Rather, some compilers
assume that the behavior is defined, and therefore that n + 1 must
be greater than n.

I expected an output that looks like that middle line, which is the most intuitive if you accept that integers have a limited capacity and will
wrap, when represented as 32-bit two's complement.

And it turns out this is exactly what is produced by lccwin32, DMC, Tcc,
Pico C, bcc, with or without any optimise flag.

Also by gcc -O0, and clang -O0.

Any optimised gcc code looks like that first line, and any optimised
clang code looks like that second line.

If I add -fwrapv to gcc/clang, then they will produce identical outputs
to those other products, at all optimisation levels.

To me, that consistency and reliability is desirable.
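
For code that has to behave the same without depending on a compiler flag, the overflow can be tested for or sidestepped in the source itself. The following is a minimal sketch under stated assumptions, not something posted in the thread; `add_would_overflow` and `add_wrapping` are hypothetical names.

```c
#include <limits.h>

/* Returns 1 if a + b would overflow int, else 0. The test itself
   never overflows because it only subtracts within range. */
static int add_would_overflow(int a, int b)
{
    return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}

/* Wrapping add done in unsigned arithmetic, which is fully defined.
   Converting an out-of-range result back to int is
   implementation-defined; gcc and clang document it as reduction
   modulo 2^N, which is the assumption made here. */
static int add_wrapping(int a, int b)
{
    return (int)((unsigned)a + (unsigned)b);
}
```

Under that assumption `add_wrapping(INT_MAX, 1)` gives `INT_MIN` at every optimisation level, since no signed overflow occurs for the optimiser to reason about.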

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Mon May 11 12:53:18 2026

From Newsgroup: comp.lang.c

On 11/05/2026 05:28, Chris M. Thomasson wrote:

On 5/10/2026 8:42 AM, Scott Lurndal wrote:

Bart <bc@freeuk.com> writes:

On 09/05/2026 23:47, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Now look at what's involved in splitting a C module into two.

C doesn't have "modules".

You want to be /that/ pedantic?

This is exactly why I said that the C standard is your thing. If
somebody uses a term that doesn't appear in the standard, then it
doesn't exist.

C doesn't have a concept of 'module' per se. Perhaps you're looking
for "translation unit"?

So, what is involved in splitting a ... I don't even know what to call
it - a single .c 'source file'? Well, a lot of messy work.

An experienced C programmer uses independent translation units
without even thinking about it, when the application is
non-trivial. For many reasons, including reusability,
maintainability and collaboration. There are codebases that
have well over a million SLOC.

You are the only programmer who has ever claimed
that an entire application must be contained within a single
translation unit. It sounds like you've never actually worked
with either a team, or a non-trivial application.

I wonder if his system has pre-compiled header support.

SL is talking nonsense.

Because sometimes I use tools that transpile whole programs of dozens of modules into a single C source, for the purpose of compiling into an executable (another single file!), he thinks I advocate writing and
developing projects in such a single file too!

Nobody has a problem with distributing an EXE file as one monolithic file.

But if EXEs are a problem, due to AV, or to mistrust, then the next step
back might be some textual format that could be ASM, IR, or C. Then the
end-user can apply that final step themselves.

I've used both ASM and C, but the latter is preferable as local
optimisations can be applied.

That is not however the original source. Scott Lurndal cannot grasp this
when that file happens to be 'C'.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Mon May 11 13:55:20 2026

From Newsgroup: comp.lang.c

On 11/05/2026 06:58, Janis Papanagnou wrote:

On 2026-05-10 13:44, Bart wrote:

(How likely do you think is it that it's the fault of the hostile
environment and your personality and communication or the level of
expertise has nothing to do with it?)

No, being constantly insulted by people questioning my expertise and knowledge, and bringing up personal matters, has nothing to do with it
at all!

Now I guess people are going to make a big deal out of my using the word 'constantly', and will point that I am wrong (it's only 'sometimes') or
asking whether I'm deliberately lying, or require me to state explicitly
that it is a figurative turn of speech.

It is getting tiresome.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Mon May 11 13:54:49 2026

From Newsgroup: comp.lang.c

In article <10tsfvd$11qhe$4@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 11/05/2026 05:28, Chris M. Thomasson wrote:

On 5/10/2026 8:42 AM, Scott Lurndal wrote:

[snip]
An experienced C programmer uses independent translation units
without even thinking about it, when the application is
non-trivial. For many reasons, including reusability,
maintainability and collaboration. There are codebases that
have well over a million SLOC.

You are the only programmer who has ever claimed
that an entire application must be contained within a single
translation unit. It sounds like you've never actually worked
with either a team, or a non-trivial application.

I wonder if his system has pre-compiled header support.

SL is talking nonsense.

No, he's really not.

Because sometimes I use tools that transpile whole programs of dozens of
modules into a single C source, for the purpose of compiling into an
executable (another single file!), he thinks I advocate writing and
developing projects in such a single file too!

You are the one making a big deal out of the fact that whole
programs are in single source files.

Executable object files (to use the ELF terminology) are a
completely different matter.

Nobody has a problem with distributing an EXE file as one monolithic file.

Actually, many do.

On systems that support dynamically linked object files (for
example `.so` on Linux/illumos/*BSD, `.dylib` on macOS, or
`.DLL` on Windows), it is very common to distribute executable
_programs_ as a binary file that is executed by the user (via
whatever mechanism is appropriate) and any number of shared
objects that are dynamically loaded into the program when run,
either by a runtime linker, or explicitly, under the program's
control (on systems that support that).

If you are only concerned with a single (as you called it)
"monolithic" "EXE" file, then yeah, it's tautologically true
that that is a single file.

But if EXEs are a problem, due to AV, or to mistrust, then the next step
back might be some textual format that could be ASM, IR, or C. Then the
end-user can apply that final step themselves.

I've used both ASM and C, but the latter is preferable as local
optimisations can be applied.

That is not however the original source. Scott Lurndal cannot grasp this
when that file happens to be 'C'.

Sure he can. SQLite does that. It's a well-known technique.

You are moving the goalposts because you were using your own
terminology and got pushbacks, and you seem constitutionally
incapable of accepting when people tell you what you wrote is
ambiguous or incorrect.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Mon May 11 14:45:58 2026

From Newsgroup: comp.lang.c

kalevi@kolttonen.fi (Kalevi Kolttonen) writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

In almost all cases where uint8_t
might be used, unsigned char works just as well.

Why "almost"? Where is the difference if any?

As far as I know, ISO guarantees that
sizeof(unsigned char) is always 1 byte.

On at least one system with a working C compiler,
a byte is 9 bits, not 8. If I wanted an 8-bit datum
on that system, I'd have to use uint8_t.

(Now, I haven't used that system in decades, but it
still exists and powers a large fraction of the
world's airline reservation and operational functions).

And operations on unsigned char are well defined,
including wrap-around. So I fail to see any
difference between unsigned char and uint8_t.

Indeed. Although from my perspective, the use of the
stdint types clearly documents the programmer's
intent, whereas a typedef such as BYTE or WORD
is inherently ambiguous and would require a programmer
to look up the definition of such types in the
application to determine the original programmer's intent.
--- Synchronet 3.22a-Linux NewsLink 1.2

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Mon May 11 08:10:07 2026

From Newsgroup: comp.lang.c

scott@slp53.sl.home (Scott Lurndal) writes:

kalevi@kolttonen.fi (Kalevi Kolttonen) writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

In almost all cases where uint8_t
might be used, unsigned char works just as well.

Why "almost"? Where is the difference if any?

As far as I know, ISO guarantees that
sizeof(unsigned char) is always 1 byte.

On at least one system with a working C compiler,
a byte is 9 bits, not 8. If I wanted an 8-bit datum
on that system, I'd have to use uint8_t.

If a byte is 9 bits (ie, if CHAR_BIT == 9) there cannot
be a uint8_t type. The fixed-width types are not allowed
to have padding bits.
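
On such a platform the portable route is one of the types that always exists, with the 8-bit range enforced by masking. This is a hedged sketch, not code from the thread; the names `octet`, `has_exact_u8` and `octet_add` are invented for illustration.

```c
#include <limits.h>
#include <stdint.h>

/* uint_least8_t is required on every implementation; on a machine
   with 9-bit bytes it is simply wider than 8 bits. */
typedef uint_least8_t octet;

/* Reports whether the exact-width type exists here. UINT8_MAX is
   defined only when uint8_t itself is provided. */
static int has_exact_u8(void)
{
#ifdef UINT8_MAX
    return 1;
#else
    return 0;
#endif
}

/* Addition modulo 256 regardless of how wide octet really is. */
static octet octet_add(octet a, octet b)
{
    return (octet)((a + b) & 0xFFu);
}
```

For example, `octet_add(200, 100)` gives 44 whether or not the underlying type has spare bits.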

(Now, I haven't used that system in decades, but it
still exists and powers a large fraction of the
world's airline reservation and operational functions).

And operations on unsigned char are well defined,
including wrap-around. So I fail to see any
difference between unsigned char and uint8_t.

Indeed. Although from my perspective, the use of the
stdint types clearly documents the programmer's
intent, whereas a typedef such as BYTE or WORD
is inherently ambiguous and would require a programmer
to look up the definition of such types in the
application to determine the original programmer's intent.

BYTE and WORD are poor choices for type names, no doubt
about that. On the other hand, in many or most cases
so are [u]intNN_t; they simultaneously convey both too
little and too much information. There is a certain kind
of programming where the fixed-width types are genuinely
helpful; unfortunately though they are used a lot more
widely than circumstances where they are helpful.
--- Synchronet 3.22a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Mon May 11 15:17:33 2026

From Newsgroup: comp.lang.c

James Kuyper <jameskuyper@alumni.caltech.edu> writes:

On 2026-05-10 20:10, Keith Thompson wrote:
...

Bart, you claimed here that literally *nobody* likes stdint.h types.

I like stdint.h types.

Me too.

Me three. Although "like" and "dislike" are emotions, not logic. Those
types are part of the language, and they should be used when appropriate.
--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Mon May 11 15:19:55 2026

From Newsgroup: comp.lang.c

In article <10tsdom$11qhe$2@dont-email.me>, Bart <bc@freeuk.com> wrote:
[snip]

[... ]Rust, which /people/ sometimes claim will give you
bug-free programs once you managed to get it to compile, [...]

No one who programs seriously ever claims that.

Certainly, no one working on Rust claims that.

Nor does anyone who works on Ada, Eiffel, or any number of other
languages that claim safety properties.

It may be true that some undefined group of "people" make
that claim for Rust (or Ada, or Eiffel, etc) "sometimes". But
anyone who makes that claim seriously is uninformed, and you
should not listen to them.

To get back to C and UB, if that 'safe' line isn't on the boundary
between non-UB and UB, then what does the boundary mean? Is it just
deterministic vs. non-deterministic behaviour?

Again, you need to provide the definition of "safe" that you are
using to try and make that distinction. No one can read your
mind to divine what you are thinking.

This is back to the other topic as to what makes a practical systems
language.

That is a broad topic and certainly beyond the scope of
comp.lang.c.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Mon May 11 15:25:47 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 11/05/2026 05:28, Chris M. Thomasson wrote:

On 5/10/2026 8:42 AM, Scott Lurndal wrote:

Bart <bc@freeuk.com> writes:

On 09/05/2026 23:47, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Now look at what's involved in splitting a C module into two.

C doesn't have "modules".

You want to be /that/ pedantic?

This is exactly why I said that the C standard is your thing. If
somebody uses a term that doesn't appear in the standard, then it
doesn't exist.

C doesn't have a concept of 'module' per se. Perhaps you're looking
for "translation unit"?

So, what is involved in splitting a ... I don't even know what to call
it - a single .c 'source file'? Well, a lot of messy work.

An experienced C programmer uses independent translation units
without even thinking about it, when the application is
non-trivial. For many reasons, including reusability,
maintainability and collaboration. There are codebases that
have well over a million SLOC.

You are the only programmer who has ever claimed
that an entire application must be contained within a single
translation unit. It sounds like you've never actually worked
with either a team, or a non-trivial application.

I wonder if his system has pre-compiled header support.

SL is talking nonsense.

Really.

Because sometimes I use tools that transpile whole programs of dozens of
modules into a single C source, for the purpose of compiling into an
executable (another single file!), he thinks I advocate writing and
developing projects in such a single file too!

Nobody has a problem with distributing an EXE file as one monolithic file.

Actually, many (if not most) of us distribute applications. The
application my CPOE ships includes a fairly small ELF (7MB text)
executable, more than fifty shared objects (DLL in Windows terminology),
manual pages (nroff), several small stand-alone utilities and other
collateral.

A single ELF executable is very seldom shipped stand-alone in the real world.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Mon May 11 16:48:08 2026

From Newsgroup: comp.lang.c

On 11/05/2026 14:54, Dan Cross wrote:

In article <10tsfvd$11qhe$4@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 11/05/2026 05:28, Chris M. Thomasson wrote:

On 5/10/2026 8:42 AM, Scott Lurndal wrote:

[snip]
An experienced C programmer uses independent translation units
without even thinking about it, when the application is
non-trivial. For many reasons, including reusability,
maintainability and collaboration. There are codebases that
have well over a million SLOC.

You are the only programmer who has ever claimed
that an entire application must be contained within a single
translation unit. It sounds like you've never actually worked
with either a team, or a non-trivial application.

I wonder if his system has pre-compiled header support.

SL is talking nonsense.

No, he's really not.

Because sometimes I use tools that transpile whole programs of dozens of
modules into a single C source, for the purpose of compiling into an
executable (another single file!), he thinks I advocate writing and
developing projects in such a single file too!

You are the one making a big deal out of the fact that whole
programs are in single source files.

Only for special purposes such as for distribution or as intermediate files.

But when such a file happens to be C source code, people here seem to
get confused, and think my original program source actually exists as a
single 80,000-line module.

Executable object files (to use the ELF terminology) are a
completely different matter.

Nobody has a problem with distributing an EXE file as one monolithic file.

Actually, many do.

If you are only concerned with a single (as you called it)
"monolithic" "EXE" file, then yeah, it's tautalogically true
that that is a single file.

That's not what I mean by monolithic.

A complete application will consist of one or more EXEs, and each may dynamically link to DLLs, either external libraries or also part of the application.

I'm talking about a single EXE or DLL file, which is created by
compiling dozens or hundreds of individual source files.

Suppose, for some reason, a prebuilt binary isn't practical, what is the alternative? Supply original source code which is, say 100 modules?

Then you get the nightmarish build systems you associate with C and
especially Linux.

Why can't the original source be reduced down to one monolithic file? Advantages:

* You only need supply one file 'prog.c'; not sprawling directories

* The build process then is nearly as simple as compiling hello.c

* A compiler can also do whole-program optimisations

* Where original source is in an obscure language, people don't need a
compiler for that language (another EXE) and can use one they have and trust

That is not however the original source. Scott Lurndal cannot grasp this
when that file happens to be 'C'.

Sure he can. SQLite does that. It's a well-known technique.

SQLite3 is about 100 different source files, which have gone through an
amalgamation process to produce an easy-to-deploy single file. It is not
what the developers work with.

Scott said this:

You are the only programmer who has ever claimed
that an entire application must be contained within a single
translation unit. It sounds like you've never actually worked
with either a team, or a non-trivial application.

Clearly he actually thinks I'm advocating using a single source file for
any kind of project, for actual development rather than a distribution
medium. Either that or he's deliberately spewing misinformation.

You're both clever chaps, and I think you know perfectly well what is happening. So shame on you.

You are moving the goalposts because you were using your own
terminology and got pushbacks, and you seem constitutionally
incapable of accepting when people tell you what you wrote is
ambiguous or incorrect.

I explained the single file thing multiple times. It never seems to get through. Or people don't bother reading my explanations.

In that case this will probably cut no ice either.

-----------------------------------

Below is the list of 77 files that comprise my C compiler. It is written
in my 'M' language, so code files have extension '.m'. Bundled C header
files have extension '.h'. There are a couple of other support files.

cc.m is the lead module that contains the build info.

To build it into an EXE on Windows, I'd normally do this:

c:\cx>mm cc
Compiling cc.m to cc.exe

If I wanted someone else to build from source, and they had a binary of
my M compiler, I'd create a tidy amalgamation of the sources like this:

c:\cx>mm -ma cc
Compiling cc.m to cc.ma # single 611KB 31Kloc source file

If I wanted you to build it, and you were on Linux, I would create a
version as a single C file like this:

c:\cx>mc -linux cc
Compiling cc.m to cc.c # single 2MB 85Kloc C source file

It's bigger as this is poor quality transpiled C, but it works:

root@xx:/mnt/c/cx# gcc cc.c -occ -lm -ldl -fno-strict-aliasing -fwrapv
root@xx:/mnt/c/cx# ./cc -s hello
Compiling hello.c to hello.asm

(Not very useful however, as on Linux it can only generate ASM or
PE-format OBJ, not EXE, and it targets Win64 ABI anyway).

BUT, THIS IS THE IMPORTANT BIT:

* cc.ma is an intermediate, amalgamated source file

* cc.c is an intermediate, generated C source file via transpilation

* Neither of the above is the original source code

* The original source is contained within the 77 files below

==========================
cc.m
pcl.m
pc_api.m
pc_decls.m
pc_diags.m
pc_reduce.m
pc_run.m
pc_runaux.m
pc_tables.m
mc_genmcl.m
mc_auxmcl.m
mc_libmcl.m
mc_stackmcl.m
mc_optim.m
mc_genss.m
mc_decls.m
mc_objdecls.m
mc_writeasm.m
mc_writeexe.m
mx_decls.m
mx_run.m
mx_lib.m
mx_write.m
cc_cli.m
cc_decls.m
cc_tables.m
cc_lex.m
cc_parse.m
cc_genpcl.m
cc_blockpcl.m
cc_libpcl.m
cc_lib.m
cc_support.m
cc_headers.m
cc_show.m
info.txt
assert.h
ctype.h
errno.h
fenv.h
float.h
inttypes.h
stdint.h
limits.h
locale.h
_ansi.h
math.h
setjmp.h
signal.h
stdarg.h
stdbool.h
stddef.h
stdio.h
stdlib.h
_syslist.h
string.h
time.h
utime.h
unistd.h
safelib.h
wchar.h
wctype.h
types.h
stat.h
timeb.h
memory.h
fcntl.h
io.h
direct.h
process.h
malloc.h
conio.h
winsock2.h
_mingw.h
windowsx.h
cc_help.txt
mcc.h


From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Mon May 11 15:58:36 2026

From Newsgroup: comp.lang.c

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

scott@slp53.sl.home (Scott Lurndal) writes:

kalevi@kolttonen.fi (Kalevi Kolttonen) writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

In almost all cases where uint8_t
might be used, unsigned char works just as well.

Why "almost"? Where is the difference if any?

As far as I know, ISO guarantees that
sizeof(unsigned char) is always 1 byte.

On at least one system with a working C compiler,
a byte is 9 bits, not 8. If I wanted an 8-bit datum
on that system, I'd have to use uint8_t.

If a byte is 9 bits (ie, if CHAR_BIT == 9) there cannot
be a uint8_t type. The fixed-width types are not allowed
to have padding bits.

That was a 36-bit system. It could easily create a
uint8_t value from 1/9th of two 72-bit words;
so no padding bits required.

Indeed. Although from my perspective, the use of the
stdint types clearly documents the programmer's
intent, whereas a typedef such as BYTE or WORD
is inherently ambiguous and would require a programmer
to look up the definition of such types in the
application to determine the original programmer's intent.

BYTE and WORD are poor choices for type names, no doubt
about that. On the other hand, in many or most cases
so are [u]intNN_t; they simultaneously convey both too
little and too much information. There is a certain kind
of programming where the fixed-width types are genuinely
helpful; unfortunately though they are used a lot more
widely than circumstances where they are helpful.

The programming I do
(mainly kernel programming, SoC simulation,
firmware) all naturally require the fixed-width types.

For other apps, int, long, float, double are preferred
to INT, LONG, FLOAT, DOUBLE (which seems to be the
way Windows programmers code)[*]

[*] which probably dates back to 16-bit Windows
and their methods of maintaining backward compatibility
across two subsequent (32, 64) x86 processor architectures
plus MIPS et alia.


From Bart@bc@freeuk.com to comp.lang.c on Mon May 11 17:03:05 2026

From Newsgroup: comp.lang.c

On 11/05/2026 16:25, Scott Lurndal wrote:

Bart <bc@freeuk.com> writes:

On 11/05/2026 05:28, Chris M. Thomasson wrote:

On 5/10/2026 8:42 AM, Scott Lurndal wrote:

Bart <bc@freeuk.com> writes:

On 09/05/2026 23:47, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Now look at what's involved in splitting a C module into two.

C doesn't have "modules".

You want to be /that/ pedantic?

This is exactly why I said that the C standard is your thing. If
somebody uses a term that doesn't appear in the standard, then it
doesn't exist.

C doesn't have a concept of 'module' per se. Perhaps you're looking
for "translation unit"?

So, what is involved in splitting a ... I don't even know what to call
it - a single .c 'source file'? Well, a lot of messy work.

An experienced C programmer uses independent translation units
without even thinking about it, when the application is
non-trivial. For many reasons, including reusability,
maintainability and collaboration. There are codebases that
have well over a million SLOC.

You are the only programmer who has ever claimed
that an entire application must be contained within a single
translation unit. It sounds like you've never actually worked
with either a team, or a non-trivial application.

I wonder if his system has pre-compiled header support.

SL is talking nonsense.

Really.

Because sometimes I use tools that transpile whole programs of dozens of
modules into a single C source, for the purpose of compiling into an
executable (another single file!), he thinks I advocate writing and
developing projects in such a single file too!

Nobody has a problem with distributing an EXE file as one monolithic file.

Actually, many (if not most) of us distribute applications. The application my CPOE ships includes a fairly small ELF (7MB text) executable, more than fifty shared objects (DLL in windows terminology), manual pages (nroff), several small stand-alone utilities and other collateral.

A single ELF executable is very seldom shipped stand-alone in the real world.

You're getting the wrong end of the stick again.

An EXE file is not distributed as dozens of different pieces, each a
separate file; it is one file containing headers, tables and multiple
code and data sections.

That you have multiple such files is not disputed.

But given a single EXE or DLL, it is quite possible to have a
representation of that as one ASM, LL or C file, which may need 'as',
'llc' or 'cc' to turn into an executable.

A single ELF executable is very seldom shipped stand-alone in the
real world.

Installers for Windows are usually a single file, EXE or MSI. Sometimes
apps are packaged as a ZIP. Maybe locally they expand into 1000s of
files, but the user sees one file at the point of install.

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Mon May 11 18:11:10 2026

From Newsgroup: comp.lang.c

On 2026-05-11 12:53, Bart wrote:

On 10/05/2026 16:47, Bart wrote:

[...]
This I just discovered by chance. It's a small Reddit language project
which here transpiles to C:

    self.emit("// Core types");
    self.emit("typedef int64_t Int;");
    self.emit("typedef int8_t Int8;");
    self.emit("typedef int16_t Int16;");
    self.emit("typedef int32_t Int32;");
    self.emit("typedef int64_t Int64;");
    self.emit("typedef uint64_t UInt;");
    self.emit("typedef uint8_t UInt8;");
    ...

These are similar to the definitions we had also used during
the early 1990's; the reason was that we had code conventions
that (a) deprecated prefixes and suffixes like '_t', and that
(b) we used type identifiers with first letter capitalized.
The latter was mainly for consistency with our C++ class style
conventions, but the types-header we provided was also for "C".
This detail was a _layout/style convention_ to foster *unified*
looking code across developers and across different projects,
and to guarantee the technical sizes visibly where it matters.
(And that was the whole reason. Nothing about "liking" or so.)
In the absence of any existing standard (like "stdint.h") back
in those days we used the base types, though, like you did below.
If there had been standard types (like above) we'd have
used these of course (as a base for our typedefs). - No magic.
No "liking" or "disliking". Just _conventions_ in the context
of what was existing (and missing) back then. - As in (e.g.)
typedef unsigned long int UInt32; -- only 32 bit back then
We also had things like UChar ('char' had undefined sign), or
Byte (which was generally 8 bit in our contexts), and some
such (IIRC).

Janis

You'd think that if transpiling to C anyway, they can tolerate using "int64_t" in the generated C. But apparently not.

I don't blame them; I do the same:

typedef signed char            i8;
typedef short                  i16;
typedef int                    i32;
typedef long long int          i64;
typedef unsigned char          u8;
typedef unsigned short         u16;
typedef unsigned int           u32;
typedef unsigned long long int u64;

In this case however I don't use any standard headers.
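
As a hedged aside: hand-written aliases like these bake in assumptions about the target (for example, that `int` is 32 bits). A minimal sketch of how those assumptions could be checked at compile time is below. It assumes C11 for `_Static_assert`, and it does pull in `<limits.h>` for `CHAR_BIT`, which the original deliberately avoids. The `BITS` macro is a hypothetical helper, and a size-based check like this says nothing about padding bits.

```c
#include <limits.h>   /* CHAR_BIT; a freestanding header */

/* Hand-written fixed-width aliases, as in the post above.
   The widths are assumptions about the target, not guarantees. */
typedef signed char            i8;
typedef short                  i16;
typedef int                    i32;
typedef long long int          i64;
typedef unsigned char          u8;
typedef unsigned short         u16;
typedef unsigned int           u32;
typedef unsigned long long int u64;

/* Hypothetical helper: storage size of a type in bits. */
#define BITS(T) (sizeof(T) * CHAR_BIT)

/* The build fails here if any assumption does not hold on this target. */
_Static_assert(BITS(i8)  == 8  && BITS(u8)  == 8,  "8-bit aliases are wrong");
_Static_assert(BITS(i16) == 16 && BITS(u16) == 16, "16-bit aliases are wrong");
_Static_assert(BITS(i32) == 32 && BITS(u32) == 32, "32-bit aliases are wrong");
_Static_assert(BITS(i64) == 64 && BITS(u64) == 64, "64-bit aliases are wrong");
```

On a platform where, say, `int` were 16 bits, the translation unit would simply refuse to compile rather than silently misbehave.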

[...]


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Mon May 11 18:12:03 2026

From Newsgroup: comp.lang.c

On 2026-05-11 17:17, Scott Lurndal wrote:

James Kuyper <jameskuyper@alumni.caltech.edu> writes:

On 2026-05-10 20:10, Keith Thompson wrote:
...

Bart, you claimed here that literally *nobody* likes stdint.h types.

I like stdint.h types.

Me too.

Me three.

Make that four, adding me. - Now are we maybe just outliers in Bart's "Statistics of Arbitrary Assumptions and Imaginary Alternative Facts"?

Although "like" and "dislike" are emotions, not logic. Those
types are part of the language, and they should be used when appropriate.

Janis


From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c on Mon May 11 18:23:46 2026

From Newsgroup: comp.lang.c

Am 10.05.2026 um 18:58 schrieb Kalevi Kolttonen:

Bonita Montero <Bonita.Montero@gmail.com> wrote:

Yes, that's a major problem with all 64 bit Unices. Use Windows with
that. On Windows long and int have the same size.

I have used Linux since the summer of 1998 and would never
ever even consider installing Windows. It is so disgusting.

I'd never switch to a LP64-system because it's not so dynamic
and agile as LLP64-systems. They provide a totally new user
experience with that.


From tTh@tth@none.invalid to comp.lang.c on Mon May 11 18:26:11 2026

From Newsgroup: comp.lang.c

On 5/11/26 17:48, Bart wrote:

Clearly he actually thinks I'm advocating using a single source file for
any kind of project, for actual development rather than a distribution medium. Either that or he's deliberately spewing misinformation.

I'm currently working on an application where the main
command is a Bash script, which calls a few binaries in
Fortran, which use two C libraries and call some plugins
written in Awk. Maybe you have a magic recipe for
putting all that mess in a single source file?
--
** **
* tTh des Bourtoulots *
* http://maison.tth.netlib.re/ *
** **

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Mon May 11 17:07:33 2026

From Newsgroup: comp.lang.c

In article <10truhq$tqbj$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

On 11/05/2026 02:31, James Kuyper wrote:

On 2026-05-08 06:43, David Brown wrote:
...

Yes, I have heard that argument before. I am unconvinced that the
"value preserving" choice actually has any real advantages. I also
think it is a misnomer - it implies that "unsigned preserving" would
not preserve values, which is wrong.

Unsigned-preserving rules would convert a signed value which might be
negative to unsigned type more frequently than the value preserving
rules do. Such a conversion is not value-preserving.

If you have a signed value, you have a signed type. Unsigned-preserving
rules are also signed-preserving - smaller unsigned types promote to
bigger unsigned types, while smaller signed types promote to bigger
signed types. I don't think anyone ever suggested smaller signed types
should promote to larger unsigned types.

Perhaps I am being bone-headed here and missing something obvious.
(Given that the C committee put in a lot of effort and came to a
different conclusion, it seems very likely that I'm missing something.)

The C89 rationale document is useful here, specifically section
3.2.1.1.

It describes the tradeoffs between unsigned-preserving and
value-preserving semantics that the committee considered when
making the decision to codify value-preserving behavior. Of
note to this discussion is the following:

|Both schemes give the same answer in the vast majority of
|cases, and both give the same effective result in even more
|cases in implementations with twos complement arithmetic and
|quiet wraparound on signed overflow -- that is, in most current
|implementations.

This suggests the committee felt that it was rare that signed
integer overflow was treated specially by compilers, and that
the equivalent of `-fwrapv` was the dominant case, and would
continue to be in the future. (Oh, those sweet summer
children....)

The text continues with descriptions of operations where the
promotion of `unsigned char` and `unsigned short` values yield
results that the committee dubbed, "questionably signed." That
is, places where interpreting the sign of the result is
ambiguous given the two different semantics.

They highlight that the same ambiguity arises with operations
mixing `unsigned int` and `signed int`, but state that (to use
their words), the "unsigned preserving rules greatly increase
the number of situations where `unsigned int` confronts `signed
int` to yield a questionably signed result, whereas the value
preserving rules minimize such confrontations. Thus, the value
preserving rules were considered to be safer for the novice, or
unwary, programmer."

They do go on to note that this is a, "quiet change", at odds
with contemporary Unix compilers, and say, "This is considered
the most serious semantic change made by the Committee to a
widespread current practice." Indeed.

Unsigned-preserving promotions would, AFAICS, preserve value and
signedness :

unsigned short -> unsigned int
signed short -> signed int

Value-preserving promotions would preserve values too :

unsigned short -> signed int
signed short -> signed int

The unsigned-preserving promotions could also safely be applied even if
short is the same size as int - that is not the case for the "always
promote to signed int" rules.

The situations they were thinking about were things like this:

unsigned short a = 8;
int b = -5;
long c = a * b;

With value-preserving semantics, `c` is 40. On the other hand,
with unsigned-preserving semantics, assuming a 64-bit `long` and
32-bit `int`, `c` is 4294967256; logical enough, but one could
see how that might be surprising for someone unfamiliar with the
language.
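
To make the two outcomes concrete, here is a minimal sketch. It assumes a 32-bit `int` and a 16-bit `unsigned short`, and it returns `long long` rather than `long` so that the large result fits even where `long` is 32 bits. The function names are hypothetical. The second function only emulates the unsigned-preserving rule with an explicit cast, since no current compiler implements that rule.

```c
/* What C actually does (value-preserving): `a` promotes to int because
   int can represent every unsigned short value, so the product is signed. */
long long product_value_preserving(unsigned short a, int b)
{
    return a * b;
}

/* Emulation of the unsigned-preserving rule: `a` is converted to
   unsigned int by hand, which drags `b` to unsigned int as well, so the
   multiplication wraps modulo UINT_MAX + 1. */
long long product_unsigned_preserving(unsigned short a, int b)
{
    return (unsigned int)a * b;
}
```

Under those assumptions, calling both with `a = 8` and `b = -5` gives -40 for the first and 4294967256 (that is, 2^32 - 40) for the second.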

What they do not appear to have anticipated are compiler
developers who would exploit the undefined nature of signed
integer overflow so aggressively that things like taking the
product of two 16-bit `unsigned short` values and assigning it
to a variable of unsigned type might yield unexpected results
(like a saturated product).

And I sincerely believe that they never thought that anyone
would use "undefined behavior" as a cudgel to justify such
behavior, even if a compiler would technically be operating
within the bounds of the standard if it did so. Talk about
being surprising for the novice or unwary....
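
A small sketch of the trap described above, assuming a 32-bit `int` (the function names are hypothetical). In the first function both operands promote to signed `int`, so a product above `INT_MAX` is undefined behavior even though every type the programmer wrote is unsigned. The second forces the arithmetic into an unsigned type before multiplying.

```c
#include <stdint.h>

/* Risky: with a 32-bit int, a and b promote to signed int, and
   65535 * 65535 does not fit, so large inputs are undefined behavior.
   Shown for contrast only; safe for small products. */
uint32_t mul_promoted(uint16_t a, uint16_t b)
{
    return a * b;
}

/* Defined for all inputs: converting one operand to uint32_t first
   makes the multiplication happen in a type that can hold the result. */
uint32_t mul_unsigned(uint16_t a, uint16_t b)
{
    return (uint32_t)a * b;
}
```

With `mul_unsigned(65535, 65535)` the result is 4294836225, which fits in 32 unsigned bits.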

On balance, I agree with you that they should have chosen
unsigned-preserving semantics. Perhaps it would have led to
more situations where `unsigned int` "confronts" a `signed int`
that is negative (like they're about to throw down outside a bar
over a spilled drink or something), but in retrospect, I think
that's relatively easy to explain, while the value preserving
semantics lead to more UB and different questions: from the
novice perspective, it is very reasonable to ask why
`(unsigned short)8 * -5 == -40` but
`(unsigned)8 * -5 == 4294967256`.

- Dan C.


From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Mon May 11 11:05:30 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:
[...]

This I just discovered by chance. It's a small Reddit language project
which here transpiles to C:

self.emit("// Core types");
self.emit("typedef int64_t Int;");
self.emit("typedef int8_t Int8;");
self.emit("typedef int16_t Int16;");
self.emit("typedef int32_t Int32;");
self.emit("typedef int64_t Int64;");
self.emit("typedef uint64_t UInt;");
self.emit("typedef uint8_t UInt8;");
...

You'd think that if transpiling to C anyway, they can tolerate using
"int64_t" in the generated C. But apparently not.

[...]

So what?

Some people like the <stdint.h> types. Some people don't. Everyone
here knows that. Showing us yet another example of someone renaming
them proves nothing.

What is your point?
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Mon May 11 11:12:06 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 11/05/2026 03:48, Keith Thompson wrote:

[...]

With different compilers and optimization settings, I get any of the
following outputs on my system:

n=2147483647, n+1=1, 1 > 2147483647

n=2147483647, n+1=-2147483648, -2147483648 < 2147483647

n=2147483647, n+1=-2147483648, -2147483648 > 2147483647

I'm fairly sure that none of the compilers detect that there will
be undefined behavior at run time. The fact that time(NULL) is
greater than 0 is not something I'd expect a compiler to assume.
(That's why I added that to the program.) Rather, some compilers
assume that the behavior is defined, and therefore that n + 1 must
be greater than n.

I expected an output that looks like that middle line, which is the
most intuitive if you accept that integers have a limited capacity and
will wrap, when represented as 32-bit two's complement.

The program has undefined behavior.

Since you refuse to even read the definition of the term, your
opinions about it and your expectations of how the program should
behave are irrelevant.
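
The program under discussion is not reproduced in this part of the thread. As a hedged sketch of the portable alternative, code that must not depend on what happens at `INT_MAX` can test for the boundary before adding, so the overflowing `n + 1` is never evaluated. The function name `checked_increment` is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores n + 1 in *out and returns true when the addition cannot
   overflow; returns false and leaves *out untouched otherwise. */
bool checked_increment(int n, int *out)
{
    if (n == INT_MAX)
        return false;   /* n + 1 would overflow: never evaluated */
    *out = n + 1;
    return true;
}
```

Because the check happens before the addition, the behavior is defined under the standard regardless of compiler or optimization settings.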

[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Mon May 11 11:20:04 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:
[...]

Scott said this:

You are the only programmer who has ever claimed
that an entire application must be contained within a single
translation unit. It sounds like you've never actually worked
with either a team, or a non-trivial application.

Clearly he actually thinks I'm advocating using a single source file
for any kind of project, for actual development rather than a
distribution medium. Either that or he's deliberately spewing
misinformation.

Apparently Scott misunderstood something you wrote. That's not
at all surprising. You could have calmly and briefly corrected
Scott's error rather than arguing about it at great length.
Something like "No, I don't advocate using a single source file
for actual development" would have been more than sufficient.

[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Mon May 11 18:21:06 2026

From Newsgroup: comp.lang.c

In article <M0nMR.786566$G7x8.651226@fx15.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

scott@slp53.sl.home (Scott Lurndal) writes:

kalevi@kolttonen.fi (Kalevi Kolttonen) writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

In almost all cases where uint8_t
might be used, unsigned char works just as well.

Why "almost"? Where is the difference if any?

As far as I know, ISO guarantees that
sizeof(unsigned char) is always 1 byte.

On at least one system with a working C compiler,
a byte is 9 bits, not 8. If I wanted an 8-bit datum
on that system, I'd have to use uint8_t.

If a byte is 9 bits (ie, if CHAR_BIT == 9) there cannot
be a uint8_t type. The fixed-width types are not allowed
to have padding bits.

That was a 36-bit system. It could easily create a
uint8_t value from 1/9th of two 72-bit words;
so no padding bits required.

I think the issue is the standard's section on, "representation
of types" (sec 6.2.6.1 para 4 in `n3220`), which requires that
anything that's not a `char` type (`(signed|unsigned)? char`)
must be represented by a multiple of `CHAR_BIT` bits.
So if `CHAR_BIT` is 9, then since the exact-width types do not
permit padding bits (sec 7.22, para 1), then `uint8_t` cannot
be defined on such a system since there is no (integer) multiple
of 9 that gives 8.

Granted, that section does not explicitly say that it needs to
be an *integer* multiple of `CHAR_BIT`, but it implies it, and
section 5.2.5.3.2 says that `CHAR_BIT` is the, "number of bits
for smallest object that is not a bit-field (byte)".

So it is not clear to me that the definition of `byte` in the C
standard comports with that of some 36-bit machines, where bytes
can be of variable width; that would have to be some kind of
non-standard extension.
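
As a hedged illustration of the practical consequence: `<stdint.h>` defines `UINT8_MAX` only when the implementation provides `uint8_t`, so portable code can test for it and fall back to `uint_least8_t`, which always exists. The names `octet`, `OCTET_IS_EXACT` and `to_octet` below are hypothetical.

```c
#include <stdint.h>

/* UINT8_MAX is defined exactly when the implementation provides uint8_t. */
#ifdef UINT8_MAX
typedef uint8_t octet;          /* exactly 8 bits, no padding bits */
#define OCTET_IS_EXACT 1
#else
typedef uint_least8_t octet;    /* at least 8 bits; always available */
#define OCTET_IS_EXACT 0
#endif

/* Keep only the low 8 bits, however wide `octet` really is. */
static octet to_octet(unsigned v)
{
    return (octet)(v & 0xFFu);
}
```

On a `CHAR_BIT == 9` implementation the fallback branch would be taken, and the explicit mask is what keeps stored values within 8 bits.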

Indeed. Although from my perspective, the use of the
stdint types clearly documents the programmer's
intent, whereas a typedef such as BYTE or WORD
is inherently ambiguous and would require a programmer
to look up the definition of such types in the
application to determine the original programmer's intent.

BYTE and WORD are poor choices for type names, no doubt
about that. On the other hand, in many or most cases
so are [u]intNN_t; they simultaneously convey both too
little and too much information. There is a certain kind
of programming where the fixed-width types are genuinely
helpful; unfortunately though they are used a lot more
widely than circumstances where they are helpful.

The programming I do
(mainly kernel programming, SoC simulation,
firmware) all naturally require the fixed-width types.

For other apps, int, long, float, double are preferred
to INT, LONG, FLOAT, DOUBLE (which seems to be the
way Windows programmers code)[*]

[*] which probably dates back to 16-bit Windows
and their methods of maintaining backward compatibility
across two subsequent (32, 64) x86 processor architectures
plus MIPS et alia.

Same. "But Doctor, I am Pagliacci!"

It is not worth trying to write the hardware dependent parts of
a kernel in _strictly conforming_ ISO C.

At a minimum, one usually relies on careful use of an ABI for
the target platform because the exact semantics around structure
layout and calling conventions are essential. And, of course,
when dealing with hardware, where the characteristics of device
registers and so forth are generally fixed.

But being gratuitously different from the standard is not useful
either. The exact-width types as a standards-guaranteed way to
get a value of a specified width is a real boon.

- Dan C.


From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Mon May 11 11:22:07 2026

From Newsgroup: comp.lang.c

scott@slp53.sl.home (Scott Lurndal) writes:
[...]

On at least one system with a working C compiler,
a byte is 9 bits, not 8. If I wanted an 8-bit datum
on that system, I'd have to use uint8_t.

(Now, I haven't used that system in decades, but it
still exists and powers a large fraction of the
world's airline reservation and operational functions).

[...]

What system is that?
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From Bart@bc@freeuk.com to comp.lang.c on Mon May 11 19:24:28 2026

From Newsgroup: comp.lang.c

On 11/05/2026 19:05, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

This I just discovered by chance. It's a small Reddit language project
which here transpiles to C:

self.emit("// Core types");
self.emit("typedef int64_t Int;");
self.emit("typedef int8_t Int8;");
self.emit("typedef int16_t Int16;");
self.emit("typedef int32_t Int32;");
self.emit("typedef int64_t Int64;");
self.emit("typedef uint64_t UInt;");
self.emit("typedef uint8_t UInt8;");
...

You'd think that if transpiling to C anyway, they can tolerate using
"int64_t" in the generated C. But apparently not.

[...]

So what?

Some people like the <stdint.h> types. Some people don't. Everyone
here knows that. Showing us yet another example of someone renaming
them proves nothing.

What is your point?

I was asked for multiple examples of somebody defining aliases for
stdint.h types. This was one more, and not cherry-picked either.

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Mon May 11 11:25:07 2026

From Newsgroup: comp.lang.c

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]

BYTE and WORD are poor choices for type names, no doubt
about that.

[...]

WORD is certainly ambiguous (unless, I suppose, it's sufficiently
obvious from the context). But I don't have a problem with BYTE,
or preferably byte, as a type name as long as it really is a byte.

C does have a byte type; it just happens to spell it "unsigned char".
But I don't object to something like

typedef unsigned char byte;

and I've used it myself.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From Bart@bc@freeuk.com to comp.lang.c on Mon May 11 19:30:44 2026

From Newsgroup: comp.lang.c

On 11/05/2026 19:12, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:

On 11/05/2026 03:48, Keith Thompson wrote:

[...]

With different compilers and optimization settings, I get any of the
following outputs on my system:

n=2147483647, n+1=1, 1 > 2147483647

n=2147483647, n+1=-2147483648, -2147483648 < 2147483647

n=2147483647, n+1=-2147483648, -2147483648 > 2147483647

I'm fairly sure that none of the compilers detect that there will
be undefined behavior at run time. The fact that time(NULL) is
greater than 0 is not something I'd expect a compiler to assume.
(That's why I added that to the program.) Rather, some compilers
assume that the behavior is defined, and therefore that n + 1 must
be greater than n.

I expected an output that looks like that middle line, which is the
most intuitive if you accept that integers have a limited capacity and
will wrap, when represented as 32-bit two's complement.

The program has undefined behavior.

Even when -fwrapv is applied?


From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Mon May 11 11:32:35 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:
[...]

The situations they were thinking about were things like this:

unsigned short a = 8;
int b = -5;
long c = a * b;

With value-preserving semantics, `c` is 40.

You mean -40.

On the other hand,
with unsigned-preserving semantics, assuming a 64-bit `long` and
32-bit `int`, `c` is 4294967256; logical enough, but one could
see how that might be surprising for someone unfamiliar with the
language.

[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Mon May 11 11:34:59 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 11/05/2026 19:12, Keith Thompson wrote:

[...]

The program has undefined behavior.

Even when -fwrapv is applied?

I've answered that before. You wouldn't need to ask if you read
the definition of the term.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From Bart@bc@freeuk.com to comp.lang.c on Mon May 11 19:38:27 2026

From Newsgroup: comp.lang.c

On 11/05/2026 19:20, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Scott said this:

You are the only programmer who has ever claimed
that an entire application must be contained within a single
translation unit. It sounds like you've never actually worked
with either a team, or a non-trivial application.

Clearly he actually thinks I'm advocating using a single source file
for any kind of project, for actual development rather than a
distribution medium. Either that or he's deliberately spewing
misinformation.

Apparently Scott misunderstood something you wrote. That's not
at all surprising. You could have calmly and briefly corrected
Scott's error rather than arguing about it at great length.

I wasn't replying to Scott. And in fact this correction has been made
multiple times in the past; it doesn't help.

Something like "No, I don't advocate using a single source file
for actual development" would have been more than sufficient.

I was replying to Dan Cross who took SL's side and made these comments:

DC:

No, he's really not.

You are the one making a big deal out of the fact that whole
programs are in single source files.

Sure he can. SQLite does that. It's a well-known technique.

(Here DC is getting things mixed up)

You are moving the goalposts because you were using your own
terminology and got pushbacks, and you seem constitutionally
incapable of accepting when people tell you what you wrote is
ambiguous or incorrect.

It's difficult to keep calm when people post bullying and patronising
garbage like this.

Still, I believe the post I made was civil, and accurate.


From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Mon May 11 18:44:46 2026

From Newsgroup: comp.lang.c

In article <10tstnn$17jmo$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 11/05/2026 14:54, Dan Cross wrote:

In article <10tsfvd$11qhe$4@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 11/05/2026 05:28, Chris M. Thomasson wrote:

On 5/10/2026 8:42 AM, Scott Lurndal wrote:

[snip]
An experienced C programmer uses independent translation units
without even thinking about it, when the application is
non-trivial. For many reasons, including reusability,
maintainability and collaboration. There are codebases that
have well over a million SLOC.

You are the only programmer who has ever claimed
that an entire application must be contained within a single
translation unit. It sounds like you've never actually worked
with either a team, or a non-trivial application.

I wonder if his system has pre-compiled header support.

SL is talking nonsense.

No, he's really not.

Because sometimes I use tools that transpile whole programs of dozens of
modules into a single C source, for the purpose of compiling into an
executable (another single file!), he thinks I advocate writing and
developing projects in such a single file too!

You are the one making a big deal out of the fact that whole
programs are in single source files.

Only for special purposes such as for distribution or as intermediate files.

What? No, that wasn't the context at all.

But when such a file happens to be C source code, people here seem to
get confused, and think my original program source actually exists as a
single 80,000-line module.

Your original words, from <10tn877$3kg8u$1@dont-email.me>, were:

|However, if I have module M which has local entity F, and split it into
|two modules A and B both of which use F, then I'd have to mark F as
|'global', in whichever module it ends up in.
|
|It can sometimes happen that other modules already share something
|called F, then the compiler reports an ambiguity. But in this case I'd
|rather rename one that resort to using qualifiers.
|
|Now look at what's involved in splitting a C module into two.

When I first read this, I also thought it was describing a
module system a la C++ namespaces or Rust modules or something,
and it was not clear you were talking about source files.

Now you're talking about _distributing_ programs as a single
source file. That has nothing to do with how this whole line
came about.

Executable object files (to use the ELF terminology) are a
completely different matter.

Nobody has a problem with distributing an EXE file as one monolithic file.

Actually, many do.

[If you're going to cut out a big chunk of context, at least try
and annotate that some way.]

If you are only concerned with a single (as you called it)
"monolithic" "EXE" file, then yeah, it's tautologically true
that that is a single file.

That's not what I mean by monolithic.

A complete application will consist of one or more EXEs, and each may
dynamically link to DLLs, either external libraries or also part of the
application.

I'm talking about a single EXE or DLL file, which is created by
compiling dozens or hundreds of individual source files.

Suppose, for some reason, a prebuilt binary isn't practical, what is the
alternative? Supply original source code which is, say 100 modules?

That has nothing to do with your "monolithic EXE" thing.

Then you get the nightmarish build systems you associate with C and
especially Linux.

Why can't the original source be reduced down to one monolithic file?

This is totally orthogonal to the original point of contention.

Advantages:

* You only need supply one file 'prog.c'; not sprawling directories

* The build process then is nearly as simple as compiling hello.c

* A compiler can also do whole-program optimisations

* Where original source is in an obscure language, people don't need a
compiler for that language (another EXE) and can use one they have and trust

Somehow, in your mind, you went from discussing the definition
of a "module" to now talking about taking a program (possibly
spread across multiple source files) in some language, and
transpiling it into a single, distributable, source file.

That has nothing to do with the original point you were taken to
task for (that C had some unspecified notion of "modules", as
quoted above).

You are now somehow twisting things into somehow arguing that
others do not understand this technique (despite it being widely
known) or

That is not however the original source. Scott Lurndal cannot grasp this
when that file happens to be 'C'.

Sure he can. SQLite does that. It's a well-known technique.

SQLite3 is about 100 different source files, which have gone through an
amalgamation process to produce an easy-to-deploy single file. It is not
what the developers work with.

Yes, that is what I said.

Scott said this:

You are the only programmer who has ever claimed
that an entire application must be contained within a single
translation unit. It sounds like you've never actually worked
with either a team, or a non-trivial application.

Clearly he actually thinks I'm advocating using a single source file for
any kind of project, for actual development rather than a distribution
medium. Either that or he's deliberately spewing misinformation.

Yes, because that's the way that you made it sound. Or
something. It wasn't at all clear what you were talking about.

You're both clever chaps, and I think you know perfectly well what is
happening. So shame on you.

Consider that, perhaps, your use of terminology is so muddled
and unclear that we do not, in fact, "know perfectly well what
is happening."

I can't speak for Scott, of course, but from where I am sitting,
you seem to be very uninformed about how these things work
generally, and you're using your own, made-up terminology.
Sometimes, that terminology conflicts with standard terminology,
and confusion results. You seem to think this is people
deliberately trying to misinterpret you.

You are moving the goalposts because you were using your own
terminology and got pushbacks, and you seem constitutionally
incapable of accepting when people tell you what you wrote is
ambiguous or incorrect.

I explained the single file thing multiple times. It never seems to get
through. Or people don't bother reading my explanations.

In that case this will probably cut no ice either.

Were you? It seems like you changed what you were talking
about. It still seems that way. Now you're trying to argue you
were saying something else; perhaps to save face, perhaps you
really believe it. In either event, you're responding to things
that others were not saying.

Below is the list of 77 files that comprise my C compiler.
[snip]

Not relevant.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Mon May 11 11:46:02 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <M0nMR.786566$G7x8.651226@fx15.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

[...]

If a byte is 9 bits (ie, if CHAR_BIT == 9) there cannot
be a uint8_t type. The fixed-width types are not allowed
to have padding bits.

That was a 36-bit system. It could easily create a
uint8_t value from 1/9th of two 72-bit words;
so no padding bits required.

I think the issue is the standard's section on, "representation
of types" (sec 6.2.6.1 para 4 in `n3220`), which requires that
anything that's not a `char` type (`(signed|unsigned)? char`)
must be represented by a multiple of `CHAR_BIT` bits.
So if `CHAR_BIT` is 9, then since the exact-width types do not
permit padding bits (sec 7.22, para 1), then `uint8_t` cannot
be defined on such a system since there is no (integer) multiple
of 9 that gives 8.

Exactly. (We can confidently infer that it must be an integer
multiple because it refers to "the size of an object of that type,
in bytes", and the sizeof operator yields an integer value, and
because it wouldn't make sense otherwise.)

Granted, that section does not explicitly say that it needs to
be an *integer* multiple of `CHAR_BIT`, but it implies it, and
section 5.2.5.3.2 says that `CHAR_BIT` is the, "number of bits
for smallest object that is not a bit-field (byte)".

So it is not clear to me that the definition of `byte` in the C
standard comports with that of some 36-bit machines, where bytes
can be of variable width; that would have to be some kind of
non-standard extension.

It's crystal clear that a C "byte" has a fixed width, and that it's
inconsistent with any kind of variable-width "byte".
You might have, say, a 36-bit machine that can work with 6-bit,
9-bit, or 12-bit "bytes", but a conforming C implementation must
choose a constant value for CHAR_BIT (and it can't be 6).

A C-like implementation that has CHAR_BIT==6 or that supports
variable-width bytes might be useful, but it wouldn't be conforming.
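
To make the point above concrete, here is a minimal sketch of how portable code can find out whether `uint8_t` exists at all. It relies only on the facts that `<stdint.h>` defines `UINT8_MAX` exactly when it defines `uint8_t`, and that `uint_least8_t` is always available. The helper names `has_uint8` and `least8_bits` are made up for illustration.

```c
#include <limits.h>
#include <stdint.h>

/* A conforming implementation always has CHAR_BIT >= 8. */
_Static_assert(CHAR_BIT >= 8, "CHAR_BIT must be at least 8");

/* Hypothetical helper: returns 1 if this implementation provides
   uint8_t, 0 otherwise.  UINT8_MAX is defined by <stdint.h> only
   when the uint8_t typedef exists. */
int has_uint8(void)
{
#ifdef UINT8_MAX
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: storage width in bits of uint_least8_t, which
   every conforming implementation must provide.  With CHAR_BIT == 9
   this would be at least 9, and uint8_t would be absent. */
unsigned least8_bits(void)
{
    return (unsigned)(sizeof(uint_least8_t) * CHAR_BIT);
}
```

On an ordinary 8-bit-byte system `has_uint8()` should return 1 and `least8_bits()` should return 8; code that has to build on a 9-bit-byte system would use `uint_least8_t` and mask to 8 bits itself.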
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Mon May 11 18:50:45 2026

From Newsgroup: comp.lang.c

In article <10tt7n2$1asao$3@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 11/05/2026 19:20, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Scott said this:

You are the only programmer who has ever claimed
that an entire application must be contained within a single
translation unit. It sounds like you've never actually worked
with either a team, or a non-trivial application.

Clearly he actually thinks I'm advocating using a single source file
for any kind of project, for actual development rather than a
distribution medium. Either that or he's deliberately spewing
misinformation.

Apparently Scott misunderstood something you wrote. That's not
at all surprising. You could have calmly and briefly corrected
Scott's error rather than arguing about it at great length.

I wasn't replying to Scott. And in fact this correction has been made
multiple times in the past; it doesn't help.

Something like "No, I don't advocate using a single source file
for actual development" would have been more than sufficient.

I was replying to Dan Cross who took SL's side and made these comments:

DC:

No, he's really not.

You are the one making a big deal out of the fact that whole
programs are in single source files.

Sure he can. SQLite does that. It's a well-known technique.

(Here DC is getting things mixed up)

Lol, no. My point was that Scott surely understands the notion
of _distributing_ a program that is composed of multiple source
files as a single source file. The example was SQLite.

But that seems to have nothing to do with anything. Your
original complaint was about "splitting modules" in C, by which
we all now understand that you mean distributing the contents of
a source file into other source files, but that was not at all
clear at the time.

For some reason you seemed to think that this business of
refactoring code between different source files was hard in C.

You are moving the goalposts because you were using your own
terminology and got pushbacks, and you seem constitutionally
incapable of accepting when people tell you what you wrote is
ambiguous or incorrect.

It's difficult to keep calm when people post bullying and patronising
garbage like this.

No one is bullying you.

Still, I believe the post I made was civil, and accurate.

It was reasonably civil, if whiny. It was not accurate.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Mon May 11 18:56:16 2026

From Newsgroup: comp.lang.c

In article <10tt7c3$1adha$7@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[...]

The situations they were thinking about were things like this:

unsigned short a = 8;
int b = -5;
long c = a * b;

With value-preserving semantics, `c` is 40.

You mean -40.

Sigh: I need more coffee.

Yes, I do, of course. Thanks.
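
For reference, the arithmetic can be checked with a short sketch. It assumes `int` is wider than `unsigned short` (true on common platforms), so the value-preserving rules promote `a` to `int`. The second function only emulates what unsigned-preserving promotion would have computed, by forcing the operand to `unsigned int`. Both function names are invented for this example.

```c
/* Value-preserving promotion: unsigned short -> int (assuming int can
   represent every unsigned short value), so this is int * int. */
long mul_value_preserving(void)
{
    unsigned short a = 8;
    int b = -5;
    long c = a * b;          /* 8 * -5 computed in int */
    return c;
}

/* Emulation of unsigned-preserving promotion: the cast makes the
   multiplication happen in unsigned int, so b is converted to
   unsigned and the product wraps modulo UINT_MAX + 1. */
unsigned mul_unsigned_preserving(void)
{
    unsigned short a = 8;
    int b = -5;
    return (unsigned)a * b;  /* equals 0u - 40u after wrap-around */
}
```

Under the stated assumption the first function returns -40; the second returns a large positive value, which is the kind of surprise the value-preserving rules were meant to avoid.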

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Mon May 11 11:58:36 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 11/05/2026 19:20, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Scott said this:

You are the only programmer who has ever claimed
that an entire application must be contained within a single
translation unit. It sounds like you've never actually worked
with either a team, or a non-trivial application.

Clearly he actually thinks I'm advocating using a single source file
for any kind of project, for actual development rather than a
distribution medium. Either that or he's deliberately spewing
misinformation.

Apparently Scott misunderstood something you wrote. That's not
at all surprising. You could have calmly and briefly corrected
Scott's error rather than arguing about it at great length.

I wasn't replying to Scott. And in fact this correction has been made multiple times in the past; it doesn't help.

[...]

As I recall, this sub-discussion started with you saying something
about the difficulty of splitting C source files, presumably in
the context of refactoring. I have yet to see you say anything
substantive about that.

Do you have anything to say about splitting C source files?
Are there difficulties you've run into that someone might help
you with?
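
For concreteness, the mechanics of such a split might look like the sketch below. It is an illustration only: the file names m.c, a.c, b.c and f.h and the functions `f`, `a_entry` and `b_entry` are all hypothetical, and the three "files" are shown in one block so that it compiles as a single translation unit. A helper that was `static` in the original file loses `static` and gains a declaration in a shared header once two files need it.

```c
/* ---- f.h (hypothetical shared header) ---- */
/* Before the split, m.c contained: static int f(int x) { ... } */
int f(int x);

/* ---- a.c: would start with #include "f.h" ---- */
/* f now has external linkage because b.c needs it too. */
int f(int x)
{
    return x * 2;
}

int a_entry(int x)
{
    return f(x) + 1;
}

/* ---- b.c: would start with #include "f.h" ---- */
/* Uses f through the declaration in f.h; the linker resolves it
   to the definition in a.c. */
int b_entry(int x)
{
    return f(x) - 1;
}
```

If another translation unit already exports a different `f`, the link fails with a duplicate-symbol error, and renaming one of them is the usual fix.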
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Mon May 11 19:04:20 2026

From Newsgroup: comp.lang.c

In article <10tt6sr$1asao$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 11/05/2026 19:05, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

This I just discovered by chance. It's a small Reddit language project
which here transpiles to C:

self.emit("// Core types");
self.emit("typedef int64_t Int;");
self.emit("typedef int8_t Int8;");
self.emit("typedef int16_t Int16;");
self.emit("typedef int32_t Int32;");
self.emit("typedef int64_t Int64;");
self.emit("typedef uint64_t UInt;");
self.emit("typedef uint8_t UInt8;");
...

You'd think that if transpiling to C anyway, they can tolerate using
"int64_t" in the generated C. But apparently not.

[...]

So what?

Some people like the <stdint.h> types. Some people don't. Everyone
here knows that. Showing us yet another example of someone renaming
them proves nothing.

What is your point?

I was asked for multiple examples of somebody defining aliases for
stdint.h types. This was one more, and not cherry-picked either.

No you weren't. You were asked to prove that "nobody" likes
them. That was trivially proven false when the first person
here said that they _did_ like them, so you shifted to, "most
people don't like them." You've been posting random examples
of people creating type definitions around them as "evidence"
for that claim, but every time, you've failed to show that that
is because "they don't like them." When it has been suggested
that there are any number of reasons why a program might do
that (for instance, compatibility with existing code that
predated the exact-width types in ISO C), and that anecdotal
evidence is not actually evidence of fact, you have complained
that the goalposts are moving and that people are picking on
you. They are not.

The fact of the matter is that you don't like them. That's
fine, but beyond that, you have no possible way of knowing
whether a majority of C programmers like them or not.

You set yourself up for failure by making a subjective assertion
based on your own preference and presenting it as fact.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Mon May 11 19:06:42 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> wrote:

On 11/05/2026 02:44, Waldek Hebisch wrote:

Bart <bc@freeuk.com> wrote:

On 11/05/2026 00:11, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:

I've glanced at appendix K.1 and saw nothing relevant there. It's
about exceeding array bounds.

Your false claim was that C "pretends" to be a safe language.

People keep jumping to conclusions without asking for clarification.
This is what I said (quite a few posts back):

C pretends to be a safe language by saying all those naughty things
are UB and should be avoided, at the same time, C compilers can be made
to do all that.

(I see now you quoted this yourself; I could have saved some time!)

The assumption made here is that unsafe-ness arises in C from UB. Then I
suggest that, while the language itself washes its hands of it, it lets
the compiler do the dirty work (as well as pushing the responsibility to
the user, by allowing the compiler to do something that is UB).

You are seriously confused by what other people consider as
"safe language".

Yes. You say that as though I shouldn't be ...

First, I do not think it is possible to
give a satisfactory definition of safety; one either gets the idea
or not. One popular attempt at a definition is that a language
is safe if no untrapped errors are possible. Of course, this
definition has trouble because then one needs to say what
an error is. A reasonable definition could be that there is an
error if the program is doing a different thing than intended by
its creator. But as you noted there are errors that a
language implementation cannot reasonably detect, so clearly
the attempt above plus this definition of error is not satisfactory.
So we need to restrict what we consider to be an error.
When talking about language safety, a possible (and popular)
approach is to restrict errors to things that break language
rules, like using out-of-bounds array indices or overflow
in C signed arithmetic.

Now, if you look at UB, UB in particular means that
implementation is not obliged to detect errors. So
UB in language definition means that language is more
or less unsafe.

I think that your formulation "allowing the compiler to do
something that is UB" is quite misleading. Standard says
that some things are UB. If UB appears in a program, it
is programmer who put it there. Essential part of UB is
that it is programmer responsibility to avoid UB.
Specific compiler may be helpful by detecting UB or
defining some useful behaviour, but in general compiler
is allowed to proceed blindly, trusting that there are
no UB in the source.

Coming back to safety, defining errors as violations of
language rules is not fully satisfactory either. Namely,
using a language that "allows anything", like assembler,
there will be no violation of language rules, but clearly
such a language does not help in detecting errors. So
to meaningfully talk about language safety there must
be rules such that some classes of error lead to
violation of rule and violation must be detected. C
has type rules and violations of type rules will
detect some errors at compile time. But by design C
does not require any error detection at runtime so
clearly is unsafe.
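
As an illustration of a runtime check that C does not require, here is a minimal sketch of a bounds-checked array read. The helper `get_checked` is hypothetical and is not part of any standard library; a plain `a[i]` with `i >= len` would simply be undefined behaviour, with no obligation on the implementation to detect it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: stores a[i] in *out and returns true only when
   i is a valid index for an array of len elements.  The check is done
   by the programmer; the language itself does not mandate it. */
bool get_checked(const int *a, size_t len, size_t i, int *out)
{
    if (a == NULL || out == NULL || i >= len)
        return false;        /* the error is trapped instead of being UB */
    *out = a[i];
    return true;
}
```

A language that is "safe" in the sense discussed here performs the equivalent of this check on every access, or proves at compile time that it is unnecessary.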

Now, unqualified "safe" is really a fuzzy concept, as
there is no hope of detecting all errors and while
detecting some errors is theoretically possible
cost of checking could be prohibitive. So basically
"safe" boils down to "due diligence": language rules
forbid things that are recognized as likely to be
errors and language uses state of the art methods
to detect or prevent violations of the rules.
Let me add that basically from the time when Pascal
was invented it was known how to define a language
rich enough to do most real-world tasks, having rules
which eliminate substantial fraction of errors and
where _all_ violations of language rules are detected.

... but then you do a very good job of demonstrating why anyone could be
confused! But thank you for engaging in the topic and providing some examples.

Assembly language is a good one. Clearly it does have some rules, but if
some program manages to assemble, it doesn't mean it has no bugs,
including dangerous ones.

Other languages will have a line drawn elsewhere, as they have more
rules, stricter typing etc. Some, like Rust, which /people/ sometimes
claim will give you bug-free programs once you managed to get it to
compile, have it near the opposite end.

Rust claims that if you stay within the safe subset, then your program
will have no memory errors. That is it will only access memory that
it allocated and access it as correct type (and a bit more but
I am skipping details). So no things like executing string obtained
from the user as machine code (popular technique for breaking into
systems).

To get back to C and UB, if that 'safe' line isn't on the boundary
between non-UB and UB, then what does the boundary mean? Is it just deterministic vs. non-deterministic behaviour?

As I wrote, safety is about ability to avoid or detect errors.
And there is no well-defined boundary; researchers constantly try
to push the boundary. You can add some fancy hardware to detect
new classes of errors and mandate that all language implementations
use this hardware. People invent new ways of checking things
at runtime. People invent new proof-like methods to make
sure at compile time that some problems will not appear at
runtime. There is a lot of heuristic/statistical approaches
which try to reduce number of errors (without any warranty
that they will eliminate all problems even in some restricted
class).

Concerning determinism, there is a correlation between deterministic
and safe, but they are not the same thing. It is easier to
deal with deterministic systems, which may increase safety.
But other factors are _much_ more important, so deterministic
alone is not warranty of safety. OTOH non-deterministic
behaviour may be desirable or unavoidable and such system
can have strong safety features.

So languages that allow undetected violations of rules
are considered more or less unsafe.

This is back to the other topic as to what makes a practical systems language.

With current state of art, if you need to work with hardware,
then you need unsafe features. Modern tendency is that only
operating system (including device drivers) has "unrestricted"
access to hardware (I put unrestricted in quotes to account
for things like hypervisors). At a higher level it seems that
safe languages allow one to do all needed work. Of course, this
may require more effort from programmers, but that is manageable
and there are indications that on average safe languages
may require less effort. There may be loss of efficiency.
C++ preached safe and "zero runtime cost", but in reality
safety features have some overhead. You seem to be quite
satisfied having half of the speed of optimized C. It seems that
safe languages can deliver that. The battle is about the last
few percent of performance and there are disagreements about whether
those last few percent matter.

Anyway, once you are above OS level languages like SML or
OCaml add strong safety features and offer performance within
a small factor of the performance of optimized C. Languages
based on JVM claim slightly better performance and comparable
safety. Those languages depend on garbage collection which may
introduce unacceptable delays. But there are now methods
to make delays smaller (used in Erlang and Go), and
methods that burn more machine cycles but completely eliminate
delays (parallel garbage collection). Rust got a lot of
good press because it promises memory safety (which previously
needed garbage collector) for manual memory management.

Concerning "system programming", for long time many developers
believed that safety is needed only in very special applications
and that in general purpose systems bugs are tolerable.
The Internet slightly challenged this, highlighting the need for
security. But even after industry got serious about security,
they still considered language safety almost as unneeded
luxury. That changed in recent times. There is one thing
when Joe Random Hacker encrypts disk of user computer and
demands ransom, basically all powers that could change this
considered such things as user problem. But now hackers from
hostile country can break critical systems (shut down
electricity in whole county, destroy electric plants, stop
vital pipeline from operating, etc) and governments got
more serious. I am not sure what is the current state of
relevant regulations, but there were proposals to
mandate use of languages possessing safety features
(in particular memory safety). So, it is possible that
memory safety will be considered a necessary
feature of a practical system language.
--
Waldek Hebisch
--- Synchronet 3.22a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Mon May 11 19:20:27 2026

From Newsgroup: comp.lang.c

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

scott@slp53.sl.home (Scott Lurndal) writes:
[...]

On at least one system with a working C compiler,
a byte is 9 bits, not 8. If I wanted an 8-bit datum
on that system, I'd have to use uint8_t.

(Now, I haven't used that system in decades, but it
still exists and powers a large fraction of the
world's airline reservation and operational functions).

[...]

What system is that?

Univac 2200 (Unisys Clearpath 2200[*])

[*] Mostly emulated, I don't know if any of the CMOS
processors are still running.

--- Synchronet 3.22a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Mon May 11 19:28:29 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10tstnn$17jmo$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 11/05/2026 14:54, Dan Cross wrote:

You're both clever chaps, and I think you know perfectly well what is
happening. So shame on you.

Consider that, perhaps, your use of terminology is so muddled
and unclear that we do not, in fact, "know perfectly well what
is happening."

I can't speak for Scott, of course, but from where I am sitting,
you seem to be very uninformed about how these things work
generally, and you're using your own, made-up terminology.
Sometimes, that terminology conflicts with standard terminology,
and confusion results. You seem to think this is people
deliberately trying to misinterpret you.

Indeed. I misunderstood him, my apologies.

Software distribution is a problem that was solved decades ago.

Whether early shell archives (shar) or tar/cpio,
.rpm/.deb et alia or even windows installers,
it's a problem that's been solved many times;
'shar' is even a single text file.

All of which must, of course, be unpacked before building the
code, although with shar (and .rpm/.deb), the software can
be built as part of the installation process automatically.

Bart seems to be advocating a distribution mechanism where one
feeds the distributed file directly into a compiler without
being required to unpack an archive first.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Mon May 11 19:34:00 2026

From Newsgroup: comp.lang.c

In article <10tt85b$1adha$9@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <M0nMR.786566$G7x8.651226@fx15.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

[...]

If a byte is 9 bits (ie, if CHAR_BIT == 9) there cannot
be a uint8_t type. The fixed-width types are not allowed
to have padding bits.

That was a 36-bit system. It could easily create a
uint8_t value from 1/9th of two 72-bit words;
so no padding bits required.

I think the issue is the standard's section on, "representation
of types" (sec 6.2.6.1 para 4 in `n3220`), which requires that
anything that's not a `char` type (`(signed|unsigned)? char`)
must be represented by a multiple of `CHAR_BIT` bits.
So if `CHAR_BIT` is 9, then since the exact-width types do not
permit padding bits (sec 7.22, para 1), then `uint8_t` cannot
be defined on such a system since there is no (integer) multiple
of 9 that gives 8.

Exactly. (We can confidently infer that it must be an integer
multiple because it refers to "the size of an object of that type,
in bytes", and the sizeof operator yields an integer value, and
because it wouldn't make sense otherwise.)

Granted, that section does not explicitly say that it needs to
be an *integer* multiple of `CHAR_BIT`, but it implies it, and
section 5.2.5.3.2 says that `CHAR_BIT` is the, "number of bits
for smallest object that is not a bit-field (byte)".

So it is not clear to me that the definition of `byte` in the C
standard comports with that of some 36-bit machines, where bytes
can be of variable width; that would have to be some kind of
non-standard extension.

It's crystal clear that a C "byte" has a fixed width,

I'm not sure that's actually true, but am willing to accept it
at face value.

But I take exception with the assertion that it is "crystal
clear". It is a conclusion that is inferred, not explicit,
though it is likely the only possible conclusion one can arrive
at considering the full set of constraints imposed by the
standard as a whole.

One can imagine a system where "bytes" are variable length, but
tagged with their size, addressable, where the minimal width is
7, and `CHAR_BIT` is defined as the maximal allowable byte size,
and `m x CHAR_BIT` permits `m` to be rational. With sufficient
contortions, one _may_ be able to force this round peg of a
non-existent machine into the square hole of standards
conformance, though I **strongly** suspect there is some other
requirement that invalidates the idea (as it rightly should).

Such a machine does not exist. And since thinking about it is
not useful other than as an academic thought exercise, I cannot
motivate myself to go find the disconfirming passages in the
standard. I shall simply trust that they exist, or that in any
event it does not matter, and lose no sleep over the matter.

and that it's
inconsistent with any kind of variable-width "byte".
You might have, say, a 36-bit machine that can work with 6-bit,
9-bit, or 12-bit "bytes",

Or 7 bits. Or mixed within a word. 36-bit machines got pretty
funky.

but a conforming C implementation must
choose a constant value for CHAR_BIT (and it can't be 6).

(nb, because 6 bits is insufficient to represent the characters
in the "basic character set", which has 94 characters in it [as
of N3220], and sec 3.7 defines a byte as an, "addressable unit
of data storage large enough to hold any member of the basic
character set of the execution environment").
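
The arithmetic behind that parenthetical can be sketched in a few lines. The function `bits_needed` is a made-up helper that returns the smallest number of bits able to distinguish `count` different values; the figure of 94 characters is taken from the text above, not independently verified here.

```c
/* Hypothetical helper: smallest b such that 2^b >= count.
   Works by counting the bits of (count - 1); returns 0 for
   count == 0 or count == 1. */
unsigned bits_needed(unsigned long count)
{
    unsigned bits = 0;
    unsigned long v;

    if (count == 0)
        return 0;
    for (v = count - 1; v != 0; v >>= 1)
        bits++;
    return bits;
}
```

With this, `bits_needed(64)` is 6 and `bits_needed(94)` is 7, so a 6-bit byte tops out at 64 distinct values and cannot hold a 94-member character set.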

Curiously, that section does not require it to be fixed; it
arguably should.

A C-like implementation that has CHAR_BIT==6 or that supports
variable-width bytes might be useful, but it wouldn't be conforming.

More fundamentally for conformance for word-oriented 36-bit
machines, bytes are not usually (ever?) directly addressed.
Rather, the containing word is the addressable unit, and a byte
accessed from within the word via special instructions or some
sort of descriptor.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Mon May 11 19:41:58 2026

From Newsgroup: comp.lang.c

In article <x5qMR.617789$9qO5.534585@fx12.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10tstnn$17jmo$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 11/05/2026 14:54, Dan Cross wrote:

You're both clever chaps, and I think you know perfectly well what is
happening. So shame on you.

Consider that, perhaps, your use of terminology is so muddled
and unclear that we do not, in fact, "know perfectly well what
is happening."

I can't speak for Scott, of course, but from where I am sitting,
you seem to be very uninformed about how these things work
generally, and you're using your own, made-up terminology.
Sometimes, that terminology conflicts with standard terminology,
and confusion results. You seem to think this is people
deliberately trying to misinterpret you.

Indeed. I misunderstood him, my apologies.

Software distribution is a problem that was solved decades ago.

Whether early shell archives (shar) or tar/cpio,
.rpm/.deb et alia or even windows installers,
it's a problem that's been solved many times;
'shar' is even a single text file.

All of which must, of course, be unpacked before building the
code, although with shar (and .rpm/.deb), the software can
be built as part of the installation process automatically.

Bart seems to be advocating a distribution mechanism where one
feeds the distributed file directly into a compiler without
being required to unpack an archive first.

I think that's right.

I don't really know how to respond to him other than to say,
"Ok. Sure. Have fun?"

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Mon May 11 19:42:59 2026

From Newsgroup: comp.lang.c

In article <10tt78i$1asao$2@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 11/05/2026 19:12, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:

On 11/05/2026 03:48, Keith Thompson wrote:

[...]

With different compilers and optimization settings, I get any of the
following outputs on my system:

n=2147483647, n+1=1, 1 > 2147483647

n=2147483647, n+1=-2147483648, -2147483648 < 2147483647

n=2147483647, n+1=-2147483648, -2147483648 > 2147483647

I'm fairly sure that none of the compilers detect that there will
be undefined behavior at run time. The fact that time(NULL) is
greater than 0 is not something I'd expect a compiler to assume.
(That's why I added that to the program.) Rather, some compilers
assume that the behavior is defined, and therefore that n + 1 must
be greater than n.

I expected an output that looks like that middle line, which is the
most intuitive if you accept that integers have a limited capacity and
will wrap, when represented as 32-bit two's complement.

The program has undefined behavior.

Even when -fwrapv is applied?

Yes.
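
For anyone following along, here is a minimal sketch of how the question can be sidestepped in portable C: test against INT_MAX before adding, so the overflowing `n + 1` is never evaluated at all. The names `checked_increment` and `checked_increment_demo` are made up for illustration; this is not the program discussed upthread.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Stores n + 1 in *out and returns true, or returns false (leaving *out
   untouched) when n + 1 would not fit in an int. The test happens before
   the addition, so no signed overflow is ever evaluated. */
bool checked_increment(int n, int *out)
{
    if (n == INT_MAX)
        return false;
    *out = n + 1;
    return true;
}

/* Small self-check (hypothetical helper, for illustration only). */
void checked_increment_demo(void)
{
    int r = 0;
    assert(checked_increment(41, &r) && r == 42);
    assert(!checked_increment(INT_MAX, &r) && r == 42);
}
```

GCC and Clang document -fwrapv as making signed overflow wrap, but that is a compiler extension; a check like the one above does not depend on it.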

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Mon May 11 20:52:07 2026

From Newsgroup: comp.lang.c

On 11/05/2026 20:04, Dan Cross wrote:

In article <10tt6sr$1asao$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 11/05/2026 19:05, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

This I just discovered by chance. It's a small Reddit language project
which here transpiles to C:

self.emit("// Core types");
self.emit("typedef int64_t Int;");
self.emit("typedef int8_t Int8;");
self.emit("typedef int16_t Int16;");
self.emit("typedef int32_t Int32;");
self.emit("typedef int64_t Int64;");
self.emit("typedef uint64_t UInt;");
self.emit("typedef uint8_t UInt8;");
...

You'd think that if transpiling to C anyway, they can tolerate using
"int64_t" in the generated C. But apparently not.

[...]

So what?

Some people like the <stdint.h> types. Some people don't. Everyone
here knows that. Showing us yet another example of someone renaming
them proves nothing.

What is your point?

I was asked for multiple examples of somebody defining aliases for
stdint.h types. This was one more, and not cherry-picked either.

No you weren't. You were asked to prove that "nobody" likes
them.

I was asked this:

SL:

Strawman. Please provide examples of "somebody inventing a new type
name for uint8_t" (post standardization). One swallow doesn't make a
summer, so a single example from some obscure project you found on
the WWW isn't particularly instructive.

2 or 3 posts back.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Mon May 11 21:03:44 2026

From Newsgroup: comp.lang.c

On 11/05/2026 19:44, Dan Cross wrote:

In article <10tstnn$17jmo$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 11/05/2026 14:54, Dan Cross wrote:

In article <10tsfvd$11qhe$4@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 11/05/2026 05:28, Chris M. Thomasson wrote:

On 5/10/2026 8:42 AM, Scott Lurndal wrote:

[snip]

********************************
SL:

An experienced C programmer uses independent translation units
without even thinking about it, when the application is
non-trivial. For many reasons, including reusability,
maintainability and collaboration. There are codebases that
have well over a million SLOC.

You are the only programmer who has ever claimed
that an entire application must be contained within a single
translation unit. It sounds like you've never actually worked
with either a team, or a non-trivial application.

CMT:

I wonder if his system has pre-compiled header support.

BC (me):

SL is talking nonsense.

********************************

No, he's really not.

The context here is what I've marked between rows of asterisks above.

Scott Lurndal seems to think I prefer applications to be within one
source file. I said that is nonsense because it isn't true.

Only for special purposes such as for distribution or as intermediate files.

What? No, that wasn't the context at all.

In this case it was, but it depends on past history where I've
advocated /distribution/ of programs, if they can't be binaries, as a
self-contained C source file which has been generated from the original
sources. I first did that in 2014.

Everyone here, not just SL, always seems to think that it means I'm
suggesting developing using such files, despite my explaining it many times.

I assumed your remark ('No, he's really not') was about that. If not
then I misunderstood and I apologise.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Mon May 11 20:04:25 2026

From Newsgroup: comp.lang.c

In article <10ttc16$1cpqk$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 11/05/2026 20:04, Dan Cross wrote:

In article <10tt6sr$1asao$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 11/05/2026 19:05, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

This I just discovered by chance. It's a small Reddit language project
which here transpiles to C:

self.emit("// Core types");
self.emit("typedef int64_t Int;");
self.emit("typedef int8_t Int8;");
self.emit("typedef int16_t Int16;");
self.emit("typedef int32_t Int32;");
self.emit("typedef int64_t Int64;");
self.emit("typedef uint64_t UInt;");
self.emit("typedef uint8_t UInt8;");
...

You'd think that if transpiling to C anyway, they can tolerate using
"int64_t" in the generated C. But apparently not.

[...]

So what?

Some people like the <stdint.h> types. Some people don't. Everyone
here knows that. Showing us yet another example of someone renaming
them proves nothing.

What is your point?

I was asked for multiple examples of somebody defining aliases for
stdint.h types. This was one more, and not cherry-picked either.

No you weren't. You were asked to prove that "nobody" likes
them.

I was asked this:

No. Way back when, in <10tnmk6$3os5b$1@dont-email.me>, you
wrote this, in the context of looking at the generated C output
from the FreeBasic compiler:

|(You can also see from this that /nobody/ likes stdint.h types,
|even though standardised from C99 which also introduced 'long
|long' used here. That is another bugbear. Oh, I forgot, my
|criticism is not not valid.)

The whining aside, David Brown rightly took you to task for this
absurd claim "...that /nobody/ likes stdint.h types".

You countered by demanding to know about unspecified
"figures" (which none of us could possibly know, since you did
not even bother to define what you were referring to), then you
pivoted to saying, essentially, that that was merely hyperbole,
and that you were just saying that _most_ programmers don't like
them, ok, that some programmers don't like them.

SL:

Strawman. Please provide examples of "somebody inventing a new type
name for uint8_t" (post standardization). One swallow doesn't make a
summer, so a single example from some obscure project you found on
the WWW isn't particularly instructive.

2 or 3 posts back.

a) Taken devoid of all the previous context, I can see how you
might be upset that Scott didn't further qualify this,
particularly as you are regularly challenged to be more precise.
In context, however, it's clear he was challenging your
assertion that it is a widespread practice.

b) Note that he said, "post standardization." You have provided
no data about any of the projects you cited and when they
adopted whatever their alternative type names are, or whether or
not they target platforms and/or compilers that are not
standards conforming. For all we know, that was done
pre-standardization of those names, or an important platform is
something that doesn't support `<stdint.h>` for some obscure
reason.

c) None of this has anything to do with the original point, which
was (again) that you made an inherently subjective statement
based on your personal preference, and you stated it as a fact.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Mon May 11 20:08:07 2026

From Newsgroup: comp.lang.c

In article <10ttcmu$1cpqk$2@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 11/05/2026 19:44, Dan Cross wrote:

In article <10tstnn$17jmo$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 11/05/2026 14:54, Dan Cross wrote:

In article <10tsfvd$11qhe$4@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 11/05/2026 05:28, Chris M. Thomasson wrote:

On 5/10/2026 8:42 AM, Scott Lurndal wrote:

[snip]

********************************
SL:

An experienced C programmer uses independent translation units
without even thinking about it, when the application is
non-trivial. For many reasons, including reusability,
maintainability and collaboration. There are codebases that
have well over a million SLOC.

You are the only programmer who has ever claimed
that an entire application must be contained within a single
translation unit. It sounds like you've never actually worked
with either a team, or a non-trivial application.

CMT:

I wonder if his system has pre-compiled header support.

BC (me):

SL is talking nonsense.

********************************

No, he's really not.

The context here is what I've marked between rows of asterisks above.

Scott Lurndal seems to think I prefer applications to be within one
source file. I said that is nonsense because it isn't true.

Honestly, that's the way it came across.

Only for special purposes such as for distribution or as intermediate files.

What? No, that wasn't the context at all.

In this case it was, but it depends on past history where I've
advocated /distribution/ of programs, if they can't be binaries, as a
self-contained C source file which has been generated from the original
sources. I first did that in 2014.

Ok.

Everyone here, not just SL, always seems to think that it means I'm
suggesting developing using such files, despite my explaining it many times.

Again, perhaps consider that it is ambiguous?

I assumed your remark ('No, he's really not') was about that. If not
then I misunderstood and I apologise.

No worries.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Mon May 11 21:29:05 2026

From Newsgroup: comp.lang.c

On 11/05/2026 20:06, Waldek Hebisch wrote:

Bart <bc@freeuk.com> wrote:

This is back to the other topic as to what makes a practical systems
language.

With current state of art, if you need to work with hardware,
then you need unsafe features. Modern tendency is that only
operating system (including device drivers) has "unrestricted"
access to hardware (I put unrestricted in quotes to account
for things like hypervisors). At a higher level it seems that
safe languages allow all the needed work to be done. Of course, this
may require more effort from programmers, but that is manageable
and there are indications that on average safe languages
may require less effort. There may be a loss of efficiency.
C++ preached safe and "zero runtime cost", but in reality
safety features have some overhead. You seem to be quite
satisfied with having half the speed of optimized C.

On benchmarks. In my real programs, the difference is usually narrower.

It seems that
safe languages can deliver that.

/My/ safer language is a scripting one, yet it can do low-level stuff
too. It's safer because of things like bounds-checking, and also because
most explicit pointer use disappears.

However it is also dynamic and interpreted, and that makes it 10-30
times slower than optimised C.

(My current project hopes to close that gap. So far it made it wider!)

The battle is about the last
few percent of performance and there are disagreements about whether
those last few percent matter.

I've played with my friend's laptop (only used for browsers, email etc),
and my stuff ran 70% faster.

So if I really needed the speedup, I can just switch machines.

Anyway, once you are above OS level languages like SML or
OCaml add strong safety features and offer performance within
a small factor of the performance of optimized C. Languages
based on JVM claim slightly better performance and comparable
safety. Those languages depend on garbage collection which may
introduce unacceptable delays. But there are now methods
to make delays smaller (used in Erlang and Go), and
methods that burn more machine cycles but completely eliminate
delays (parallel garbage collection). Rust got a lot of
good press because it promises memory safety (which previously
needed garbage collector) for manual memory management.

Concerning "system programming", for long time many developers
believed that safety is needed only in very special applications
and that in general purpose systems bugs are tolerable.
Internet slightly challenged this, highlighting the need for
security. But even after industry got serious about security,
they still considered language safety almost as unneeded
luxury. That changed in recent times. There is one thing
when Joe Random Hacker encrypts disk of user computer and
demands ransom, basically all powers that could change this
considered such things as user problem. But now hackers from
hostile country can break critical systems (shut down
electricity in whole county, destroy electric plants, stop
vital pipeline from operating, etc) and governments got
more serious.

My first microprocessor machine had 32KB RAM, and very poor, unreliable
storage so that I kept compiler and source code in memory as much as
possible.

However, a bug in the program being run could easily wipe out not only
the compiler but my unsaved source code!

Solution: since it was 2 x 16KB memory banks, I put a write-protect
switch on the half with compiler and source code.

With these hackers, why are critical systems connected to the public
internet in the first place?

--- Synchronet 3.22a-Linux NewsLink 1.2

From David Brown@david.brown@hesbynett.no to comp.lang.c on Mon May 11 22:37:26 2026

From Newsgroup: comp.lang.c

On 11/05/2026 19:07, Dan Cross wrote:

In article <10truhq$tqbj$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

On 11/05/2026 02:31, James Kuyper wrote:

On 2026-05-08 06:43, David Brown wrote:
...

Yes, I have heard that argument before. I am unconvinced that the
"value preserving" choice actually has any real advantages. I also
think it is a misnomer - it implies that "unsigned preserving" would
not preserve values, which is wrong.

Unsigned-preserving rules would convert a signed value which might be
negative to unsigned type more frequently than the value preserving
rules do. Such a conversion is not value-preserving.

If you have a signed value, you have a signed type. Unsigned-preserving
rules are also signed-preserving - smaller unsigned types promote to
bigger unsigned types, while smaller signed types promote to bigger
signed types. I don't think anyone ever suggested smaller signed types
should promote to larger unsigned types.

Perhaps I am being bone-headed here and missing something obvious.
(Given that the C committee put in a lot of effort and came to a
different conclusion, it seems very likely that I'm missing something.)

The C89 rationale document is useful here, specifically section
3.2.1.1.

It describes the tradeoffs between unsigned-preserving and
value-preserving semantics that the committee considered when
making the decision to codify value-preserving behavior. Of
note to this discussion is the following:

|Both schemes give the same answer in the vast majority of
|cases, and both give the same effective result in even more
|cases in implementations with twos complement arithmetic and
|quiet wraparound on signed overflow - that is, in most current
|implementations.

Yes, I've read the rationale here, and I'm still not convinced I
understand their reasoning.

This suggests the committee felt that it was rare that signed
integer overflow was treated specially by compilers, and that
the equivalent of `-fwrapv` was the dominant case, and would
continue to be in the future. (Oh, those sweet summer
children....)

The text continues with descriptions of operations where the
promotion of `unsigned char` and `unsigned short` values yield
results that the committee dubbed, "questionably signed." That
is, places where interpreting the sign of the result is
ambiguous given the two different semantics.

They highlight that the same ambiguity arises with operations
mixing `unsigned int` and `signed int`, but state that (to use
their words), the "unsigned preserving rules greatly increase
the number of situations where `unsigned int` confronts `signed
int` to yield a questionably signed result, whereas the value
preserving rules minimize such confrontations. Thus, the value
preserving rules were considered to be safer for the novice, or
unwary, programmer."

They do go on to note that this is a, "quiet change", at odds
with contemporary Unix compilers, and say, "This is considered
the most serious semantic change made by the Committee to a
widespread current practice." Indeed.

Unsigned-preserving promotions would, AFAICS, preserve value and
signedness :

unsigned short -> unsigned int
signed short -> signed int

Value-preserving promotions would preserve values too :

unsigned short -> signed int
signed short -> signed int

The unsigned-preserving promotions could also safely be applied even if
short is the same size as int - that is not the case for the "always
promote to signed int" rules.

The situations they were thinking about were things like this:

unsigned short a = 8;
int b = -5;
long c = a * b;

With value-preserving semantics, `c` is -40. On the other hand,
with unsigned-preserving semantics, assuming a 64-bit `long` and
32-bit `int`, `c` is 4294967256; logical enough, but one could
see how that might be surprising for someone unfamiliar with the
language.
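
The arithmetic in that example can be checked with a short sketch. It assumes `unsigned short` is narrower than `int` (otherwise both functions behave the same) and that `long long` is wider than `unsigned int`. The function names are made up for illustration, and the second function only emulates unsigned-preserving promotion with an explicit cast, since no conforming compiler implements that rule.

```c
#include <assert.h>
#include <limits.h>

#if USHRT_MAX > INT_MAX
#error "this sketch assumes unsigned short is narrower than int"
#endif

/* What C actually does: a is promoted to (signed) int, so the product
   is computed in int and is simply negative. */
long long product_value_preserving(unsigned short a, int b)
{
    return a * b;
}

/* Emulates the unsigned-preserving rule: a becomes unsigned int, b is
   converted to unsigned int as well, and the product wraps modulo
   UINT_MAX + 1. */
long long product_unsigned_preserving(unsigned short a, int b)
{
    return (unsigned int)a * (unsigned int)b;
}

/* Small self-check (hypothetical helper, for illustration only). */
void promotion_demo(void)
{
    assert(product_value_preserving(8, -5) == -40);
    /* UINT_MAX - 39 is 4294967256 when unsigned int is 32 bits wide. */
    assert(product_unsigned_preserving(8, -5) == (long long)UINT_MAX - 39);
}
```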

Thanks for that example.

Perhaps the main "mistake" (where "mistake" means "I personally think C
would be nicer for my own use if things were different") is that when
mixing operations between signed int and unsigned int, the signed int is
converted to unsigned. I suspect that in real-world code, unsigned int
values that are within the range of signed int are common - and that
negative signed int values are more common than unsigned int values that
are out of range of signed int. Any common type here, unless it is
larger than the two original types, is going to get some things wrong -
but I think that converging on signed int as the common type would be
wrong less often. And if that had been the rule, then
unsigned-preserving promotion would be correct too in examples like yours.

What they do not appear to have anticipated are compiler
developers who would exploit the undefined nature of signed
integer overflow so aggressively that things like taking the
product of two 16-bit `unsigned short` values and assigning it
to a variable of unsigned type might yield unexpected results
(like a saturated product).
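
To make the hazard concrete, here is a sketch of the usual workaround, in the same spirit as the fix quoted near the top of the thread: convert to an unsigned type at least as wide as the result before multiplying, so the product is never formed in signed `int`. It assumes `uint16_t` and `uint32_t` exist on the target; `mul_u16` is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

/* Multiplies two 16-bit values without ever forming a signed product.
   Writing a * b directly would promote both operands to int where int
   is 32 bits wide, and 65535 * 65535 does not fit in a 32-bit int. */
uint32_t mul_u16(uint16_t a, uint16_t b)
{
    return (uint32_t)a * (uint32_t)b;
}

/* Small self-check (hypothetical helper, for illustration only). */
void mul_u16_demo(void)
{
    assert(mul_u16(3, 4) == 12);
    assert(mul_u16(UINT16_MAX, UINT16_MAX) == UINT32_C(4294836225));
}
```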

And I sincerely believe that they never thought that anyone
would use "undefined behavior" as a cudgel to justify such
behavior, even if a compiler would technically be operating
within the bounds of the standard if it did so. Talk about
being surprising for the novice or unwary....

On balance, I agree with you that they should have chosen
unsigned-preserving semantics. Perhaps it would have led to
more situations where `unsigned int` "confronts" a `signed int`
that is negative (like they're about to throw down outside a bar
over a spilled drink or something), but in retrospect, I think
that's relatively easy to explain, while the value preserving
semantics lead to more UB and different questions: from the
novice perspective, it is very reasonable to ask why
`(unsigned short)8 * -5 == -40` but
`(unsigned)8 * -5 == 4294967256`.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From David Brown@david.brown@hesbynett.no to comp.lang.c on Mon May 11 22:45:11 2026

From Newsgroup: comp.lang.c

On 11/05/2026 22:04, Dan Cross wrote:

b) Note that he said, "post standardization." You have provided
no data about any of the projects you cited and when they
adopted whatever their alternative type names are, or whether or
not they target platforms and/or compilers that are not
standards conforming. For all we know, that was done
pre-standardization of those names, or an important platform is
something that doesn't support `<stdint.h>` for some obscure
reason.

It is quite common to see "home-made" sized types in libraries or other
code blocks from before C99. Typically you'll see lists like :

typedef short int int16;
typedef long int int32;
typedef __int64 int64;

or perhaps with type names prefixed by a library indicator of some sort.

And it is not uncommon for such libraries to later be made more portable
for more platforms, and then updated to take advantage of some C99
features. But no one wants to go through the existing code, changing
every "int16" to "int16_t" and so on. Thus the type definitions become

typedef int16_t int16;
typedef int32_t int32;
typedef int64_t int64;

and all the other code works as before.
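
A sketch of what such a compatibility header might look like after the update, with compile-time checks added so that the legacy names cannot silently drift from the widths they promise. The function `scale_sample` stands in for "existing code" and is invented for this example, as is the decision to add the checks.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Legacy names kept alive as aliases for the <stdint.h> types, so that
   existing code using int16/int32/int64 compiles unchanged. */
typedef int16_t int16;
typedef int32_t int32;
typedef int64_t int64;

/* Compile-time checks that the aliases have the advertised widths. */
_Static_assert(sizeof(int16) * CHAR_BIT == 16, "int16 must be 16 bits");
_Static_assert(sizeof(int32) * CHAR_BIT == 32, "int32 must be 32 bits");
_Static_assert(sizeof(int64) * CHAR_BIT == 64, "int64 must be 64 bits");

/* Stand-in for old code that still uses the legacy spelling. */
int32 scale_sample(int16 sample, int16 gain)
{
    return (int32)sample * gain;
}

/* Small self-check (hypothetical helper, for illustration only). */
void legacy_types_demo(void)
{
    assert(scale_sample(-3, 7) == -21);
    assert(scale_sample(INT16_MAX, INT16_MAX) == INT32_C(1073676289));
}
```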

There's no suggestion there that the author likes or dislikes the
<stdint.h> names - merely that the code started off before these names
were in common use.

(I haven't a clue whether this applies to the example Bart dug up, but
it certainly applies in some similar cases.)

--- Synchronet 3.22a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Mon May 11 13:46:09 2026

From Newsgroup: comp.lang.c

On 5/11/2026 11:05 AM, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

This I just discovered by chance. It's a small Reddit language project
which here transpiles to C:

self.emit("// Core types");
self.emit("typedef int64_t Int;");
self.emit("typedef int8_t Int8;");
self.emit("typedef int16_t Int16;");
self.emit("typedef int32_t Int32;");
self.emit("typedef int64_t Int64;");
self.emit("typedef uint64_t UInt;");
self.emit("typedef uint8_t UInt8;");
...

You'd think that if transpiling to C anyway, they can tolerate using
"int64_t" in the generated C. But apparently not.

[...]

So what?

Some people like the <stdint.h> types.

Fwiw, I love it. uintptr_t in particular.

Some people don't. Everyone
here knows that. Showing us yet another example of someone renaming
them proves nothing.

What is your point?

--- Synchronet 3.22a-Linux NewsLink 1.2

From Michael S@already5chosen@yahoo.com to comp.lang.c on Mon May 11 23:48:04 2026

From Newsgroup: comp.lang.c

On Sun, 10 May 2026 20:30:24 -0400
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

On 2026-05-10 20:10, Keith Thompson wrote:
...

Bart, you claimed here that literally *nobody* likes stdint.h types.

I like stdint.h types.

Me too.

I don't just like stdint.h*:
I also hate it when C programmers define their own fixed-width integer
types.

That I wouldn't do myself, but it is o.k.:
typedef int32_t sample_index;
typedef uint32_t sample_value;

That would raise my blood pressure:
typedef int32_t s32;
typedef uint32_t u32;
typedef uint8_t octet;

------------
* PTR macros are something else. Those I hate.
And all those *_fast and *_least types... Not that I hate them, but
it certainly shows a lack of taste.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Mon May 11 13:51:54 2026

From Newsgroup: comp.lang.c

On 5/11/2026 7:45 AM, Scott Lurndal wrote:

kalevi@kolttonen.fi (Kalevi Kolttonen) writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

In almost all cases where uint8_t
might be used, unsigned char works just as well.

Why "almost"? Where is the difference if any?

As far as I know, ISO guarantees that
sizeof(unsigned char) is always 1 byte.

On at least one system with a working C compiler,
a byte is 9 bits, not 8. If I wanted an 8-bit datum
on that system, I'd have to use uint8_t.

(Now, I haven't used that system in decades, but it
still exists and powers a large fraction of the
world's airline reservation and operational functions).

And operations on unsigned char are well defined,
including wrap-around. So I fail to see any
difference between unsigned char and uint8_t.
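
One wrinkle worth a sketch: arithmetic on `unsigned char` is carried out after integer promotion, so the wrap-around only appears when the result is converted back to `unsigned char`. The names below are made up for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Returns c + 1 reduced modulo UCHAR_MAX + 1. The addition itself is
   done in int (or unsigned int) after promotion; the wrap comes from
   the conversion back to unsigned char. */
unsigned char uchar_increment(unsigned char c)
{
    return (unsigned char)(c + 1);
}

/* Small self-check (hypothetical helper, for illustration only). */
void uchar_demo(void)
{
    unsigned char c = UCHAR_MAX;
    assert(c + 1 == UCHAR_MAX + 1);   /* promoted operand: no wrap here */
    assert(uchar_increment(c) == 0);  /* wraps on conversion back */
}
```

The same applies to `uint8_t` wherever it exists, since it is also narrower than `int` on such systems.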

Indeed. Although from my perspective, the use of the
stdint types clearly documents the programmer's
intent, whereas a typedef such as BYTE or WORD
is inherently ambiguous and would require a programmer
to look up the definition of such types in the
application to determine the original programmer's intent.

Gotta love uintptr_t. Still need to see how big it is. Can call that a
word. ;^)
--- Synchronet 3.22a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Mon May 11 13:57:09 2026

From Newsgroup: comp.lang.c

On 5/10/2026 10:43 PM, Janis Papanagnou wrote:

On 2026-05-11 06:21, Chris M. Thomasson wrote:

[ C's characteristics ]

To allow one to shoot themselves in the foot! Both feet. ;^)

To stress that picture...

"C" allows you to shoot yourself in your foot, but if you
manage to shoot in both of your feet with a single bullet
then it's the programmer's fault!

;^D
--- Synchronet 3.22a-Linux NewsLink 1.2

From Michael S@already5chosen@yahoo.com to comp.lang.c on Mon May 11 23:57:14 2026

From Newsgroup: comp.lang.c

On Mon, 11 May 2026 15:58:36 GMT
scott@slp53.sl.home (Scott Lurndal) wrote:

For other apps, int, long, float, double are preferred
to INT, LONG, FLOAT, DOUBLE (which seems to be the
way windows programmers code)[*]

[*] which probably dates back to 16-bit windows
and their methods of maintaining backward compatibility
across two subsequent (32, 64) x86 processor architectures
plus MIPS et alia.

Unfortunately, LONG is not rare in Microsoft's code examples.
I can't be sure that I had never seen INT, but I am sure that it is
far less common.
I am sure that I never encountered FLOAT and DOUBLE.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Mon May 11 14:16:28 2026

From Newsgroup: comp.lang.c

scott@slp53.sl.home (Scott Lurndal) writes:
[...]

Software distribution is a problem that was solved decades ago.

For certain values of "solved".

Whether early shell archives (shar) or tar/cpio,
.rpm/.deb et alia or even windows installers,
it's a problem that's been solved many times;
'shar' is even a single text file.

All of which must, of course, be unpacked before building the
code, although with shar (and .rpm/.deb), the software can
be built as part of the installation process automatically.

Bart seems to be advocating a distribution mechanism where one
feeds the distributed file directly into a compiler without
being required to unpack an archive first.

And that's a valid mechanism. It is, for example, one of
several source distribution mechanisms used by SQLite. But the
"amalgamation" (a) is distributed in compressed form (which unpacks
to 4 source files last time I looked), and (b) is not the form
used for development and maintenance. It's generated from a more
conventional collection of source files in a directory tree.

I don't think anyone had advocated distributing single source files
as the best method in general; that was a simple misunderstanding.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Mon May 11 14:23:38 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10tt85b$1adha$9@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <M0nMR.786566$G7x8.651226@fx15.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

[...]

If a byte is 9 bits (ie, if CHAR_BIT == 9) there cannot
be a uint8_t type. The fixed-width types are not allowed
to have padding bits.

That was a 36-bit system. It could easily create a
uint8_t value from 1/9th of two 72-bit words;
so no padding bits required.

I think the issue is the standard's section on, "representation
of types" (sec 6.2.6.1 para 4 in `n3220`), which requires that
anything that's not a `char` type (`(signed|unsigned)? char`)
must be represented by a multiple of `CHAR_BIT` bits.
So if `CHAR_BIT` is 9, then since the exact-width types do not
permit padding bits (sec 7.22, para 1), then `uint8_t` cannot
be defined on such a system since there is no (integer) multiple
of 9 that gives 8.
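
For code that has to build on such a system anyway, a sketch of the usual fallback: `UINT8_MAX` is defined if and only if `uint8_t` is, so it can be tested in the preprocessor, and `uint_least8_t` plus explicit masking covers the case where no exact 8-bit type exists. The names `octet_val` and `octet_add` are invented for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#ifdef UINT8_MAX
/* uint8_t exists, which is only possible when CHAR_BIT == 8. */
typedef uint8_t octet_val;
#else
/* No exact 8-bit type (for example CHAR_BIT == 9): use the smallest
   type with at least 8 bits and mask wherever wraparound matters. */
typedef uint_least8_t octet_val;
#endif

/* Adds two values modulo 256 regardless of how wide octet_val is. */
octet_val octet_add(octet_val a, octet_val b)
{
    return (octet_val)(((unsigned)a + (unsigned)b) & 0xFFu);
}

/* Small self-check (hypothetical helper, for illustration only). */
void octet_demo(void)
{
    assert(octet_add(200, 100) == 44);
    assert(octet_add(255, 1) == 0);
}
```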

Exactly. (We can confidently infer that it must be an integer
multiple because it refers to "the size of an object of that type,
in bytes", and the sizeof operator yields an integer value, and
because it wouldn't make sense otherwise.)

Granted, that section does not explicitly say that it needs to
be an *integer* multiple of `CHAR_BIT`, but it implies it, and
section 5.2.5.3.2 says that `CHAR_BIT` is the, "number of bits
for smallest object that is not a bit-field (byte)".

So it is not clear to me that the definition of `byte` in the C
standard comports with that of some 36-bit machines, where bytes
can be of variable width; that would have to be some kind of
non-standard extension.

It's crystal clear that a C "byte" has a fixed width,

I'm not sure that's actually true, but am willing to accept it
at face value.

But I take exception with the assertion that it is "crystal
clear". It is a conclusion that is inferred, not explicit,
though it is likely the only possible conclusion one can arrive
at considering the full set of constraints imposed by the
standard as a whole.

CHAR_BIT is the number of bits in a byte, and it's required to
expand to a constant expression with a value of at least 8.

One can imagine a system where "bytes" are variable length, but
tagged with their size, addressable, where the minimal width is
7, and `CHAR_BIT` is defined as the maximal allowable byte size,
and `m x CHAR_BIT` permits `m` to be rational. With sufficient
contortions, one _may_ be able to force this round peg of a
non-existent machine into the square hole of standards
conformance, though I **strongly** suspect there is some other
requirement that invalidates the idea (as it rightly should).

A conforming implementation can certainly provide extensions to
operate on, say, 6-bit or even n-bit quantities that are referred
to as "bytes" in the environment, but they would not be "bytes"
as C defines the term. sizeof will always yield the size of an
object in units of CHAR_BIT-bit bytes.

Such a machine does not exist. And since thinking about it is
not useful other than as an academic thought exercise, I cannot
motivate myself to go find the disconfirming passages in the
standard. I shall simply trust that they exist, or that in any
event it does not matter, and lose no sleep over the matter.

and that it's
inconsistent with any kind of variable-width "byte".
You might have, say, a 36-bit machine that can work with 6-bit,
9-bit, or 12-bit "bytes",

Or 7 bits. Or mixed within a word. 36-bit machines got pretty
funky.

but a conforming C implementation must
choose a constant value for CHAR_BIT (and it can't be 6).

(nb, because 6 bits is insufficient to represent the characters
in the "basic character set", which has 94 characters in it [as
of N3220], and sec 3.7 defines a byte as an, "addressable unit
of data storage large enough to hold any member of the basic
character set of the execution environment").

More directly, because the standard requires CHAR_BIT to be at
least 8. If the only requirement were based on the basic character
set, CHAR_BIT==7 would be valid.

Curiously, that section does not require it to be fixed; it
arguably should.

The requirement might not be in one place, but it's definitely there.

A C-like implementation that has CHAR_BIT==6 or that supports
variable-width bytes might be useful, but it wouldn't be conforming.

More fundamentally for conformance for word-oriented 36-bit
machines, bytes are not usually (ever?) directly addressed.
Rather, the containing word is the addressable unit, and a byte
accessed from within the word via special instructions or some
sort of descriptor.

Right, CHAR_BIT==36 is perfectly valid, as are extensions to
manipulate smaller units.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Mon May 11 14:30:07 2026

From Newsgroup: comp.lang.c

David Brown <david.brown@hesbynett.no> writes:
[...]

Perhaps the main "mistake" (where "mistake" means "I personally think
C would be nicer for my own use if things were different") is that
when mixing operations between signed int and unsigned int, the signed
int is converted to unsigned. I suspect that in real-world code,
unsigned int values that are within the range of signed int are common
- and that negative signed int values are more common than unsigned
int values that are out of range of signed int. Any common type here,
unless it is larger than the two original types, is going to get some
things wrong - but I think that converging on signed int as the common
type would be wrong less often. And if that had been the rule, then
unsigned-preserving promotion would be correct too in examples like
yours.

[...]

If I were designing a new C-like language, I'd probably avoid the
issue of signed-preserving vs. value-preserving altogether. I might
say operations where one operand is signed and the other is unsigned
are not allowed; if you need that, you can cast one of the operands.
The C committee decided to impose a more or less reasonable rule on
all such operations; I might require the programmer to decide what
to do in each case. (There might be an exception for constants,
so that u+1 doesn't require a cast; I haven't thought through the
implications of that.)
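
A minimal sketch of the difference, for illustration only. The function names are hypothetical, and the second function assumes `long long` is wider than `unsigned int`, which holds on common platforms but is not guaranteed by the standard.

```c
#include <assert.h>
#include <stdbool.h>

static_assert(sizeof(long long) > sizeof(unsigned int),
              "sketch assumes long long is wider than unsigned int");

/* What C does today for `s < u`: s is converted to unsigned int first,
   so a negative s becomes a large positive value. The cast is written
   out here only to make the implicit conversion visible. */
static bool less_c_rule(int s, unsigned int u)
{
    return (unsigned int)s < u;
}

/* What a programmer forced to choose might write instead: convert both
   operands to a wider signed type and compare the mathematical values. */
static bool less_by_value(int s, unsigned int u)
{
    return (long long)s < (long long)u;
}
```

With `s == -1` and `u == 1`, the first function returns false and the second returns true.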

I'd also define operations on narrow types, so the promotion rules
become unnecessary.

Of course C can't be changed in this way without breaking tons of
existing code.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From Bart@bc@freeuk.com to comp.lang.c on Mon May 11 23:05:33 2026

From Newsgroup: comp.lang.c

On 11/05/2026 22:16, Keith Thompson wrote:

scott@slp53.sl.home (Scott Lurndal) writes:
[...]

Software distribution is a problem that was solved decades ago.

For certain values of "solved".

Whether early shell archives (shar) or tar/cpio,
.rpm/.deb et alia or even windows installers,
it's a problem that's been solved many times;
'shar' is even a single text file.

All of which must, of course, be unpacked before building the
code, although with shar (and .rpm/.deb), the software can
be built as part of the installation process automatically.

Bart seems to be advocating a distribution mechanism where one
feeds the distributed file directly into a compiler without
being required to unpack an archive first.

And that's a valid mechanism. It is, for example, one of
several source distribution mechanisms used by SQLite. But the
"amalgamation" (a) is distributed in compressed form (which unpacks
to 4 source files last time I looked), and (b) is not the form
used for development and maintenance. It's generated from a more
conventional collection of source files in a directory tree.

It may unpack to 4 source files but the biggest one is sqlite3.c which
is an amalgamation of over 100 .c source files.

I don't think anyone had advocated distributing single source files
as the best method in general; that was a simple misunderstanding.

The misunderstanding was in mistakenly suggesting I'd advocated
maintaining original, maintainable source for any scale of program in
one source file.

About distributing amalgamations, the SQLite site says:

"Combining all the code for SQLite into one big file makes SQLite easier
to deploy - there is just one file to keep track of. And because all
code is in a single translation unit, compilers can do better
inter-procedure and inlining optimization resulting in machine code that
is between 5% and 10% faster."

This sounds like it could have much wider benefits.

In my case, because I have to transpile to C anyway, then for my
whole-program compiler that normally turns the N modules of an
application into 1 EXE file, or 1 DLL file, or 1 OBJ file, or 1 ASM
file, (or 1 IL file when supported), then it makes sense to produce 1 C
source file.

Doing anything else would be much harder, and pointless.


From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Mon May 11 16:13:26 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 11/05/2026 22:16, Keith Thompson wrote:

[...]

I don't think anyone had advocated distributing single source files
as the best method in general; that was a simple misunderstanding.

The misunderstanding was in mistakenly suggesting I'd advocated
maintaining original, maintainable source for any scale of program in
one source file.

Yes, I slightly misstated what the misunderstanding was.

Perhaps we can move on now.

[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From Bart@bc@freeuk.com to comp.lang.misc,comp.lang.c on Tue May 12 02:28:15 2026

From Newsgroup: comp.lang.c

On 10/05/2026 14:05, Dan Cross wrote:

In article <10tpt9j$c3i4$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 10/05/2026 05:39, Janis Papanagnou wrote:

[snip]
What makes you think that I'd need to write an own language given that
there's a plethora of languages of all kinds and paradigms existing.

So where's the one that works like mine?

I mean, Rust does exactly what you were just describing.

Rust could hardly be more different than mine.

And why are there so many new ones still appearing? Most of them you
will not know about.

Consider the possibility that you may be unique in the world in
possessing the combination of requirements and aesthetic
judgement that makes you feel you need a language like yours.

My language fills the same niche that C does.

I don't have much of a problem with the things that C can do, but with
how it does it, its syntax, its ancient baggage, its quirks, its
folklore, its Unix-centric ecosystem, its pointless UBs, its insistence
in working with every oddball processor, its solving every shortcoming
with macros, its adherents who will defend every misfeature to the death...

Maybe the answer is to just create my own language?! I did exactly that,
and didn't have to deal with C for 10-15 years, but you can't get
away from it because it's everywhere.

It is also frustrating looking at C forums and people thinking they are
too stupid to grasp something when it's language that could have been
better.

As for new languages, there are a number of reasons. Most of
them are not particularly relevant here.

At this point, you may consider doing what Keith suggested, and
moving further discussion of your language to comp.lang.misc.

Sure, a pretty much dead group.


From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Mon May 11 18:32:07 2026

From Newsgroup: comp.lang.c

antispam@fricas.org (Waldek Hebisch) writes:

[discussing the notion of "safe" programs]

As I wrote, safety is about ability to avoid or detect errors.

In the functional programming community the usual statement is
"Well-typed programs cannot go wrong." I think a good way of
understanding this is that, if a program stays inside the
safe limits of the language, the program can produce wrong
answers, but it cannot produce meaningless answers.

Of course, that has nothing to do with failing hardware, etc.

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.misc,comp.lang.c on Mon May 11 18:37:05 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:
[...]

I don't have much of a problem with the things that C can do, but with
how it does it, its syntax, its ancient baggage, its quirks, its
folklore, its Unix-centric ecosystem, its pointless UBs, its
insistence in working with every oddball processor, its solving every
shortcoming with macros, its adherents who will defend every
misfeature to the death...

You're mostly wrong about that last point. Many of us spend a
great deal of time and effort here *explaining* how C is defined
and how best to use it.

To explain is not to defend. What will it take for you to understand
that?

[...]

It is also frustrating looking at C forums and people thinking they
are too stupid to grasp something when it's language that could have
been better.

(I'm going to assume I parsed that sentence correctly.)

Nobody has said that C couldn't have been better. But it could
hardly have been more successful. As Dennis Ritchie himself said,
"C is quirky, flawed, and an enormous success."

[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.misc,comp.lang.c on Tue May 12 02:40:31 2026

From Newsgroup: comp.lang.c

In article <10ttvng$1j579$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 10/05/2026 14:05, Dan Cross wrote:

In article <10tpt9j$c3i4$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 10/05/2026 05:39, Janis Papanagnou wrote:

[snip]
What makes you think that I'd need to write an own language given that
there's a plethora of languages of all kinds and paradigms existing.

So where's the one that works like mine?

I mean, Rust does exactly what you were just describing.

Rust could hardly be more different than mine.

You were describing what Rust calls, `include_str!` and
`include_bytes!`. That's what I was referring to.

https://doc.rust-lang.org/std/macro.include_str.html
https://doc.rust-lang.org/std/macro.include_bytes.html

And why are there so many new ones still appearing? Most of them you
will not know about.

Consider the possibility that you may be unique in the world in
possessing the combination of requirements and aesthetic
judgement that makes you feel you need a language like yours.

My language fills the same niche that C does.

I don't have much of a problem with the things that C can do, but with
how it does it, its syntax, its ancient baggage, its quirks, its
folklore, its Unix-centric ecosystem, its pointless UBs, its insistence
in working with every oddball processor, its solving every shortcoming
with macros, its adherents who will defend every misfeature to the death...

Maybe the answer is to just create my own language?! I did exactly that,
and didn't have to deal with C for 10-15 years, but you can't get
away from it because it's everywhere.

So like I said, you may be unique in the world in possessing the
combination of requirements _and aesthetic judgement_ that makes
you feel you need a language exactly like yours.

I think you'll find very few "adherents who will defend every
misfeature to the death."

It is also frustrating looking at C forums and people thinking they are
too stupid to grasp something when it's language that could have been
better.

The problem you keep encountering here, specifically, is that by
your own admission you do not know C, the language, well enough
to accurately understand what would have made it a "language
that could have been better."

As for new languages, there are a number of reasons. Most of
them are not particularly relevant here.

At this point, you may consider doing what Keith suggested, and
moving further discussion of your language to comp.lang.misc.

Sure, a pretty much dead group.

Maybe you could liven it up.

- Dan C.


From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Tue May 12 03:44:18 2026

From Newsgroup: comp.lang.c

In article <868q9ppg4o.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

antispam@fricas.org (Waldek Hebisch) writes:

[discussing the notion of "safe" programs]

As I wrote, safety is about ability to avoid or detect errors.

In the functional programming community the usual statement is
"Well-typed programs cannot go wrong."

This is only concerning _type safety_.

Robin Milner made the, "well-typed programs cannot go wrong"
statement as an informal description of the meaning of a
"Semantic Soundness Theorem", which is described (slightly) more
formally to mean, "a well-typed program is semantically free of
type violation." Roughly speaking, this means that, given a
"phrase" P in the language, and some "basis" B denoting the
current _state_ of a program, then if the inputs to P have
proper types with respect to the semantics defined by the
language, then when P is "applied" to B, the output is a
new basis, B', that will itself be properly typed: B |- P => B'.
That is, if for a given operation the inputs all have the
"expected" types with respect to the semantics of that operation
then the output will also be of the expected type with respect
to those semantics. For example, if I add two integers, I get
an integer, but perhaps I can't add two pointers together,
because the result doesn't make any sense semantically: the
result has no type, which Milner illustrated with an untyped,
unrepresentable, hypothetical object he called "wrong";
semantically meaningless phrases are defined to have no type,
thus matching "wrong". Hence, semantically well-typed programs
could not be "wrong".

Of course, for the semantic soundness theorem to be
meaningful, a language must have formally defined semantics.

SML is an example of such a language: it is formally described
by its grammar and a well-defined operational semantics, but C
(as an extreme counter point) is not: C has no formal semantics:
it is informally defined in terms of an "abstract virtual
machine" that is itself informally described by the C standard.

I think a good way of
understanding this is that, if a program stays inside the
safe limits of the language, the program can produce wrong
answers, but it cannot produce meaningless answers.

No. It simply means that if all inputs to all operations type
check (say, the arguments to a function call are the correct
types with respect to the function's definition), then the
outputs will type check (the return value will be of the correct
type, as defined by the function): the program will be
"well-typed". Type safety, alone, has little bearing on overall
correctness beyond that.

What the "safe limits" of the language are is undefined in this
context, and so itself meaningless. The same goes for what has
meaning in terms of the output of a program. For example, a perfectly
type safe function in SML is: `fun squarerootpos (a:int) = ~1;`

This function, when applied to any datum of type `int`, will
always produce a number of type `int`, but that number has no
meaning (it is always arbitrarily -1, which is not the positive
square root of any integer).

Of course, that has nothing to do with failing hardware, etc.

Type safety is only one aspect of safety. There are others as
well, of course, that can (up to a limit) be checked by the
language: memory safety, concurrency safety (perhaps as defined
by data-race freedom) and so on.

At some point, one must draw a line at what the language can do
on your behalf. If a program starts poking at a debugging
interface (like opening `/dev/mem` on Linux and poking around)
to subvert itself outside of the watchful eye of the language,
then all bets are off: but a general purpose language cannot
possibly guard against all such scenarios.

- Dan C.


From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Mon May 11 20:53:11 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <868q9ppg4o.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

antispam@fricas.org (Waldek Hebisch) writes:

[discussing the notion of "safe" programs]

As I wrote, safety is about ability to avoid or detect errors.

In the functional programming community the usual statement is
"Well-typed programs cannot go wrong."

This is only concerning _type safety_.

I didn't mean to imply anything different.

From David Brown@david.brown@hesbynett.no to comp.lang.c on Tue May 12 08:35:09 2026

From Newsgroup: comp.lang.c

On 11/05/2026 23:30, Keith Thompson wrote:

David Brown <david.brown@hesbynett.no> writes:
[...]

Perhaps the main "mistake" (where "mistake" means "I personally think
C would be nicer for my own use if things were different") is that
when mixing operations between signed int and unsigned int, the signed
int is converted to unsigned. I suspect that in real-world code,
unsigned int values that are within the range of signed int are common
- and that negative signed int values are more common than unsigned
int values that are out of range of signed int. Any common type here,
unless it is larger than the two original types, is going to get some
things wrong - but I think that converging on signed int as the common
type would be wrong less often. And if that had been the rule, then
unsigned-preserving promotion would be correct too in examples like
yours.

[...]

If I were designing a new C-like language, I'd probably avoid the
issue of signed-preserving vs. value-preserving altogether. I might
say operations where one operand is signed and the other is unsigned
are not allowed; if you need that, you can cast one of the operands.

I'd be with you on that.

However, I think you'd quickly run into inconveniences and annoyances
with integer constants - you'd want "x * 2" to work regardless of the
signedness of x's type. I am no Ada expert, and it's OT anyway, but I
believe in Ada the type of integer constants adapts to fit when used
like this - you'd need something similar to make the hypothetical K&B C
language work well. Integer constants would have to be "questionably
signed", not signed or unsigned. (Maybe "adaptively typed" might be a
better term, and include the size of the type as well as the signedness.)

The C committee decided to impose a more or less reasonable rule on
all such operations; I might require the programmer to decide what
to do in each case. (There might be an exception for constants,
so that u+1 doesn't require a cast; I haven't thought through the
implications of that.)

Certainly the rules work - even if I might have preferred something
different, you can learn the rules and write correct code using them.
Lots of people do!

I'd also define operations on narrow types, so the promotion rules
become unnecessary.

<aol> Me too! </aol>

I might start using the _BitInt types, once the versions of gcc I need
for the targets I need have good support for them.

Of course C can't be changed in this way without breaking tons of
existing code.

The curse of popularity.


From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Tue May 12 00:38:52 2026

From Newsgroup: comp.lang.c

David Brown <david.brown@hesbynett.no> writes:

On 11/05/2026 23:30, Keith Thompson wrote:

David Brown <david.brown@hesbynett.no> writes:
[...]

Perhaps the main "mistake" (where "mistake" means "I personally think
C would be nicer for my own use if things were different") is that
when mixing operations between signed int and unsigned int, the signed
int is converted to unsigned. I suspect that in real-world code,
unsigned int values that are within the range of signed int are common
- and that negative signed int values are more common than unsigned
int values that are out of range of signed int. Any common type here,
unless it is larger than the two original types, is going to get some
things wrong - but I think that converging on signed int as the common
type would be wrong less often. And if that had been the rule, then
unsigned-preserving promotion would be correct too in examples like
yours.

[...]
If I were designing a new C-like language, I'd probably avoid the
issue of signed-preserving vs. value-preserving altogether. I might
say operations where one operand is signed and the other is unsigned
are not allowed; if you need that, you can cast one of the operands.

I'd be with you on that.

However, I think you'd quickly run into inconveniences and annoyances
with integer constants - you'd want "x * 2" to work regardless of the
signedness of x's type. I am no Ada expert, and it's OT anyway, but I
believe in Ada the type of integer constants adapts to fit when used
like this - you'd need something similar to make the hypothetical K&B
C language work well. Integer constants would have to be
"questionably signed", not signed or unsigned. (Maybe "adaptively
typed" might be a better term, and include the size of the type as
well as the signedness.)

Right, that could be a problem.

In Ada, the type of an integer literal (and of certain other
constructs, similar to C's integer constant expressions) is
<universal_integer>, an anonymous type that exists only at compile
time and can represent unbounded values. Any value of that type
is implicitly converted to a type determined by the context.

*Maybe* something similar to that could be a sensible approach for
my hypothetical C-like language that will never actually exist.
Or maybe an integer literal could be treated as of type intmax_t
or uintmax_t, depending on the context (similar to how integer
literals are treated in the preprocessor).
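
The preprocessor behaviour mentioned there can be seen with a small sketch. The macro name `PP_SUM_IS_POSITIVE` is made up for illustration. It relies on the C99-and-later rule that `#if` arithmetic is carried out in intmax_t/uintmax_t, which are at least 64 bits wide.

```c
#include <assert.h>

/* In an #if expression, integer arithmetic is done in intmax_t or
   uintmax_t rather than int, so this sum does not overflow even where
   int is only 32 bits wide. */
#if 2147483647 + 1 > 0
#define PP_SUM_IS_POSITIVE 1
#else
#define PP_SUM_IS_POSITIVE 0
#endif

/* Holds on any C99-or-later implementation. */
static_assert(PP_SUM_IS_POSITIVE == 1,
              "#if arithmetic is expected to use intmax_t");
```

The same expression written in ordinary code with `int` operands would overflow on a platform with 32-bit `int`.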

(Aside: I'm using "integer literal" to match the term used in both
Ada and the draft of C2y, rather than "integer constant" as used
in C up to C23.)

[...]

Of course C can't be changed in this way without breaking tons of
existing code.

The curse of popularity.

Indeed.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From David Brown@david.brown@hesbynett.no to comp.lang.c on Tue May 12 10:42:10 2026

From Newsgroup: comp.lang.c

On 11/05/2026 22:48, Michael S wrote:

On Sun, 10 May 2026 20:30:24 -0400
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

On 2026-05-10 20:10, Keith Thompson wrote:
...

Bart, you claimed here that literally *nobody* likes stdint.h types.

I like stdint.h types.

Me too.

I not only like stdint.h*
I also hate when C programmers define their own fixed-width integer
types.

That I wouldn't do myself, but it is o.k.:
typedef int32_t sample_index;
typedef uint32_t sample_value;

That would raise my blood pressure:
typedef int32_t s32;
typedef uint32_t u32;
typedef uint8_t octet;

------------
* PTR macros is something else. Those I hate.

Do you mean the PRI macros, for printf? I can understand disliking
these - I have almost never used them myself. (I use "%lu" and friends,
and let gcc tell me if I've got it wrong. Lazy, perhaps, but good
enough for the portability I usually need.)
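
For comparison, a minimal sketch of the PRI style. The helper name `format_u32` is hypothetical; `PRIu32` and `snprintf` are standard.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: format a uint32_t into buf and return the number
   of characters snprintf reports. */
static int format_u32(char *buf, size_t size, uint32_t value)
{
    assert(buf != NULL);
    /* PRIu32 expands to the right conversion specifier for uint32_t,
       whether that type is unsigned int or unsigned long underneath. */
    return snprintf(buf, size, "%" PRIu32, value);
}
```

Writing "%lu" instead is only correct where uint32_t happens to be `unsigned long`, which is the portability trade-off described above.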

Certainly it seems a bit of an overreaction to "hate" PTRDIFF_MIN, or
INTPTR_MAX!

And all those *_fast and *_least types... Not that I hate them, but
it certainly shows a lack of taste.

These are often the types that better fit what you are actually looking
for in the code - but they are a bit too cumbersome for my tastes.

For "bulk" data - arrays of data where size gets relevant - the "least"
types express the need for a type that can store the range of values you
need, with the least amount of space. And they are highly portable,
working even on systems that don't have uint8_t or uint16_t types. But
of course such systems are very rare, and you usually know if you are
coding for them. So in almost all practical situations, int_least8_t is
identical to int8_t, and similarly for all the other "least" types -
making them of very little use.

The "fast" types are well-suited to local variables - you want the
variable to hold the values you need, but otherwise be as fast as
possible. In some cases, that can lead to faster code. On x86-64 (with
glibc, at least), the "fast" types (from int_fast16_t upwards) are all
64-bit. On 32-bit ARM systems, using 32-bit types can definitely be
faster than using 16-bit types, so using int_fast8_t and int_fast16_t
makes sense if you are writing embedded code that might be used on
8-bit, 16-bit and 32-bit microcontrollers. But I think that in many
cases for local variables
that don't escape, compilers can optimise smaller int types as though
they were the equivalent fast types, meaning again that there is not
much use in bothering with them. So again, the prime use of these types
would be in highly portable code - though more useful than the "least"
types.
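
A minimal sketch of that division of labour: "least" for bulk storage, "fast" for a local accumulator. The function name `sum_samples` is hypothetical, and the caller is assumed to keep the total within the range of int_fast32_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bulk data uses int_least16_t: the smallest type with at least 16 bits,
   available even where int16_t does not exist. The accumulator is a
   local, so it uses a "fast" type of at least 32 bits. */
static int_fast32_t sum_samples(const int_least16_t *samples, size_t count)
{
    assert(samples != NULL || count == 0);

    int_fast32_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += samples[i];   /* caller keeps the sum in range */
    }
    return total;
}
```

On most desktop targets both typedefs resolve to ordinary types, so the practical gain shows up only in code meant to be ported widely.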


From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Tue May 12 13:10:06 2026

From Newsgroup: comp.lang.c

In article <10ttem6$1daks$2@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

On 11/05/2026 19:07, Dan Cross wrote:

[snip]
The C89 rationale document is useful here, specifically section
3.2.1.1.

It describes the tradeoffs between unsigned-preserving and
value-preserving semantics that the committee considered when
making the decision to codify value-preserving behavior. Of
note to this discussion is the following:

|Both schemes give the same answer in the vast majority of
|cases, and both give the same effective result in even more
|cases in implementations with twos complement arithmetic and
|quiet wraparound on signed overflow - that is, in most current
|implementations.

Yes, I've read the rationale here, and I'm still not convinced I
understand their reasoning.

Nor am I.

[snip]
The situations they were thinking about were things like this:

unsigned short a = 8;
int b = -5;
long c = a * b;

With value-preserving semantics, `c` is -40. On the other hand,
with unsigned-preserving semantics, assuming a 64-bit `long` and
32-bit `int`, `c` is 4294967256; logical enough, but one could
see how that might be surprising for someone unfamiliar with the
language.
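
The two outcomes can be reproduced in today's C with a small sketch. The function names are hypothetical. The second function only emulates unsigned-preserving promotion with explicit casts, and the whole thing assumes a 32-bit `int` that is wider than `short`. It returns `long long` rather than `long` so that the result does not depend on `long` being 64 bits.

```c
#include <assert.h>
#include <limits.h>

static_assert(UINT_MAX == 0xFFFFFFFFu && USHRT_MAX < INT_MAX,
              "sketch assumes 32-bit int that is wider than short");

/* Value-preserving (what standard C does): a promotes to int, so the
   multiplication is signed. */
static long long product_value_preserving(unsigned short a, int b)
{
    return a * b;
}

/* Emulation of unsigned-preserving promotion: a becomes unsigned int,
   which drags b to unsigned int as well, and the product wraps
   modulo 2^32 before it is widened. */
static long long product_unsigned_preserving(unsigned short a, int b)
{
    return (unsigned int)a * (unsigned int)b;
}
```

With `a == 8` and `b == -5`, the first function gives -40 and the second gives 4294967256.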

Thanks for that example.

Perhaps the main "mistake" (where "mistake" means "I personally think
C would be nicer for my own use if things were different") is that
when mixing operations between signed int and unsigned int, the signed
int is converted to unsigned. I suspect that in real-world code,
unsigned int values that are within the range of signed int are common
- and that negative signed int values are more common than unsigned
int values that are out of range of signed int. Any common type here,
unless it is larger than the two original types, is going to get some
things wrong - but I think that converging on signed int as the common
type would be wrong less often. And if that had been the rule, then
unsigned-preserving promotion would be correct too in examples like
yours.

If I understand what you're saying -- and correct me if I'm
wrong -- it sounds like you are suggesting sign-preserving
semantics for all types.

I'm sure they must have at least talked about that. Where did
that idea go? I'm speculating, but I think they were trying to
thread a needle here, and felt that redefining the semantics for
types ranked with `int` and higher would be a bridge too far. I
keep saying I had (and still have) a lot of sympathy for the
committee: they were charged with imposing order on an unruly
situation, balancing many competing organizations and interests,
all while preserving compatibility with existing practice and
implementations, and (as they put it) retaining the "character"
of C. This is an unenviable position to be in.

I imagine the committee felt that, by the time the standards
process was in full swing, the ship had sailed on changing the
rules for values of type `int` or types of higher ranks, and
they could only reasonably address promotion of lesser ranked
types to that of `int`. They acknowledged that the
sign-preserving promotion rules were a big semantic difference
from established practice; had they attempted to mandate
sign-preserving rules for arithmetic involving the `int` family
of types, they likely would have faced a serious revolt.

And as they said in the rationale, in _most_ cases, it doesn't
matter; for `int`/`unsigned int` even less so. For instance,
assume a platform with 32-bit `int`. Then the behavior of this
code is implementation-defined, but documented to have the same
predictable result across most conforming compilers:

unsigned int a = 8;
int b = -5;
int c = a * b;

To wit, `b` is converted to `unsigned int` per the rules set
forth in the standard prior to the multiplication; the product
is taken in some ring $Z/2^nZ$ where $n$ is the bit-width of
`unsigned int` (in this example, 32); the product is then
converted to `int` by the assignment, but per the rules for
unsigned-to-signed conversions, the result is
implementation-defined (since the product is outside of the
range of non-negative values representable in a 32-bit `int`).
However, almost all real implementations will
define this using two's complement semantics with no change to
representation, and assign the resulting value to `c`.
This is, surely, by far the most common case.
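
Spelled out step by step, as a sketch rather than anything normative: the function name `product_mixed` is hypothetical, and the final conversion is implementation-defined, so the result noted in the comments is what GCC and Clang document, not what the standard promises.

```c
#include <assert.h>
#include <limits.h>

static_assert(INT_MIN < -INT_MAX, "sketch assumes two's complement int");

/* Mirrors `int c = a * b;` for unsigned int a and int b, with the
   implicit conversions written out. */
static int product_mixed(unsigned int a, int b)
{
    /* Step 1: b is converted to unsigned int; step 2: the product is
       computed modulo 2^n, n being the width of unsigned int. */
    unsigned int wrapped = a * (unsigned int)b;

    /* Step 3: conversion back to int. When the value exceeds INT_MAX
       the result is implementation-defined; GCC and Clang reduce it
       modulo 2^n, which keeps the bit pattern. */
    return (int)wrapped;
}
```

With `a == 8` and `b == -5` this yields -40 on those compilers, matching what most readers would expect from the source text.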

So, for all _practical_ purposes, the interpretation of the
product as signed or unsigned only matters in the handful of
cases listed in the rationale: using the result in a comparison,
right-shifting the result or widening it (in which case
sign-extension matters, now that all the world's a 2s complement
machine) and so on.
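
Two of those cases in a minimal sketch, assuming a 32-bit `unsigned int`. All function names are hypothetical, and the conversion in `product_signed` is implementation-defined (two's complement wraparound on common compilers).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

static_assert(UINT_MAX == 0xFFFFFFFFu, "sketch assumes 32-bit unsigned int");

/* The product as C computes it for unsigned int * int. */
static unsigned int product_unsigned(unsigned int a, int b)
{
    return a * (unsigned int)b;
}

/* The same bits viewed as int (implementation-defined conversion). */
static int product_signed(unsigned int a, int b)
{
    return (int)product_unsigned(a, b);
}

/* Comparison: the two interpretations can disagree. */
static bool exceeds_100_unsigned(unsigned int a, int b)
{
    return product_unsigned(a, b) > 100u;
}

static bool exceeds_100_signed(unsigned int a, int b)
{
    return product_signed(a, b) > 100;
}

/* Right shift of the unsigned product is zero-filling and well defined;
   shifting a negative int instead would be implementation-defined. */
static unsigned int half_unsigned(unsigned int a, int b)
{
    return product_unsigned(a, b) >> 1;
}
```

For `a == 8`, `b == -5` the unsigned comparison is true and the signed one is false, and the unsigned shift gives 2147483628 rather than anything resembling -20.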

And in cases where the compiler permits silent wrapping on
signed overflow, as I firmly believe they expected to be the
near-universal case, they made the argument that it mattered
even less.

Of course, we understand the consequences of these decisions
much better now, 40 years after the fact. But I really don't
think they thought things would unfold the way they have, with
UB taking such a prominent role as a basis for optimization.

- Dan C.


From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Tue May 12 13:36:44 2026

From Newsgroup: comp.lang.c

In article <10tthov$1eenk$3@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

David Brown <david.brown@hesbynett.no> writes:
[...]

Perhaps the main "mistake" (where "mistake" means "I personally think
C would be nicer for my own use if things were different") is that
when mixing operations between signed int and unsigned int, the signed
int is converted to unsigned. I suspect that in real-world code,
unsigned int values that are within the range of signed int are common
- and that negative signed int values are more common than unsigned
int values that are out of range of signed int. Any common type here,
unless it is larger than the two original types, is going to get some
things wrong - but I think that converging on signed int as the common
type would be wrong less often. And if that had been the rule, then
unsigned-preserving promotion would be correct too in examples like
yours.

[...]

If I were designing a new C-like language, I'd probably avoid the
issue of signed-preserving vs. value-preserving altogether. I might
say operations where one operand is signed and the other is unsigned
are not allowed; if you need that, you can cast one of the operands.

Agreed. C grew `unsigned` types relatively late in its design;
it was not part of the language until "Typesetter C" (released
to the world with 7th Edition Unix in early 1979), and was sort
of shoehorned in. Prior to that, `int` was used interchangeably
for either signed or unsigned quantities, depending on context.

I suspect that much of this has to do with the machines they
were constrained with at the time: doing extensive type analysis
on a PDP-11/45 with a few hundred kilobytes of RAM was not
reasonable.

The C committee decided to impose a more or less reasonable rule on
all such operations; I might require the programmer to decide what
to do in each case. (There might be an exception for constants,
so that u+1 doesn't require a cast; I haven't thought through the
implications of that.)

I'd also define operations on narrow types, so the promotion rules
become unnecessary.

Not to keep beating that drum too hard, but this is what Rust
did, and it works well: operations between values of different
type require explicit conversions to a common type, and the
semantics of those conversions are well-defined; signed and
unsigned types are considered different. Overflow (or
underflow) is considered an error, but operations with explicit
wrapping semantics are available.
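
For comparison, here is a rough C sketch of the same two ideas: an addition that reports overflow instead of running into UB, and an explicitly wrapping addition. The helper names `checked_add_int` and `wrapping_add_uint` are made up for illustration; they are not standard functions and not anything Rust or the C library provides under those names.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: add two ints, reporting overflow to the caller
 * instead of performing an addition whose behavior would be undefined. */
bool checked_add_int(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;   /* the sum would not fit in an int */
    *out = a + b;
    return true;
}

/* Hypothetical helper: wrapping addition. Unsigned arithmetic in C is
 * defined to wrap modulo UINT_MAX + 1, so no check is needed. */
unsigned int wrapping_add_uint(unsigned int a, unsigned int b)
{
    return a + b;
}
```

The caller has to pick one or the other at each call site, which is roughly the discipline described above, only without any help from the type system.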

Of course C can't be changed in this way without breaking tons of
existing code.

I think the same argument applied in the 1980s, while work was
underway on what would become the 1989 ANSI standard. One of
C's problems, stemming from its origins in a typeless
language (B) on small machines, is that it is weakly typed. And
it has never been rigorously defined in the formal sense. But C
had become wildly popular, and so fundamental changes to the
type system would have broken a lot of code, and probably killed
the effort.

Much of the confusion that results on this newsgroup and
elsewhere is a direct consequence of those properties: weakly
typed, informally specified, profligate with nullable pointers
that are really just integers in a trenchcoat, and so on. But
I, for one, do not really believe that it could have been any
other way.

And of course, changing it so fundamentally now is unreasonable.
It is the language that it is. The committee has expressed
interest in making improvements so that it is less obtuse, but
anything they do at this point will necessarily be incremental.

Fortunately, there are alternatives for many of the domains
where C has been dominant over the last five decades. Of course
that has tradeoffs as well. But for example, if I were starting
a new project designed to run on bare metal today and had the
freedom to make the decision, I would not choose C as the
implementation language.

- Dan C.


From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Tue May 12 14:05:21 2026

From Newsgroup: comp.lang.c

In article <10tuhmt$1o3bp$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

On 11/05/2026 23:30, Keith Thompson wrote:

David Brown <david.brown@hesbynett.no> writes:
[...]

Perhaps the main "mistake" (where "mistake" means "I personally think
C would be nicer for my own use if things were different") is that
when mixing operations between signed int and unsigned int, the signed
int is converted to unsigned. I suspect that in real-world code,
unsigned int values that are within the range of signed int are common
- and that negative signed int values are more common than unsigned
int values that are out of range of signed int. Any common type here,
unless it is larger than the two original types, is going to get some
things wrong - but I think that converging on signed int as the common
type would be wrong less often. And if that had been the rule, then
unsigned-preserving promotion would be correct too in examples like
yours.

[...]

If I were designing a new C-like language, I'd probably avoid the
issue of signed-preserving vs. value-preserving altogether. I might
say operations where one operand is signed and the other is unsigned
are not allowed; if you need that, you can cast one of the operands.

I'd be with you on that.

However, I think you'd quickly run into inconveniences and annoyances
with integer constants - you'd want "x * 2" to work regardless of the
signedness of x's type. I am no Ada expert, and it's OT anyway, but I
believe in Ada the type of integer constants adapts to fit when used
like this - you'd need something similar to make the hypothetical K&B C
language work well. Integer constants would have to be "questionably
signed", not signed or unsigned. (Maybe "adaptively typed" might be a
better term, and include the size of the type as well as the signedness.)

I think the term you are looking for is "strongly typed". :-)
That is, types are verifiably compatible. In a strongly- and
statically-typed language (that is, one where the types of
objects are known at compile time), it's possible to be both
expressive and precise. There are plenty of examples of such
languages, but the common characteristic is that they (usually)
_infer_ the type of an expression based on the types of the
operands; there are well-known, formally sound, techniques for
doing this.

With respect to literal constants, this would simply mean that
the literal would be considered to be of the inferred type of
the expression it was in: if no such inference could be made
(for instance, the types are fundamentally incompatible), then
the compiler would fail, flagging the type incompatibility as an
error.

So, if this were a fragment of a program in a hypothetical C
dialect that was strongly typed and used type inference,

unsigned int a = 5;
unsigned int c = a * 2;

both `5` and `2` would be inferred to have type `unsigned int`,
since both are representable as unsigned ints. However,

unsigned int c = a * -2;

would be a compile time error, since the resulting type of the
expression must be `unsigned int`, but `-2` is not an unsigned
integer: it would have to be explicitly converted first.
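
For contrast, a small sketch of what standard C does with that last expression today. The function name `times_minus_two` is only for illustration. The constant `-2` has type `int`, and the usual arithmetic conversions quietly turn it into a large `unsigned int` before the multiply, so the code compiles without complaint.

```c
#include <limits.h>

/* In standard C, -2 is an int expression. Because the other operand is
 * unsigned int, -2 is converted to UINT_MAX - 1 and the multiplication
 * is carried out modulo UINT_MAX + 1. */
unsigned int times_minus_two(unsigned int a)
{
    return a * -2;
}
```

With `a` equal to 5 the result is `UINT_MAX - 9`, that is, -10 reduced modulo `UINT_MAX + 1`, rather than a diagnostic.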

The C committee decided to impose a more or less reasonable rule on
all such operations; I might require the programmer to decide what
to do in each case. (There might be an exception for constants,
so that u+1 doesn't require a cast; I haven't thought through the
implications of that.)

Certainly the rules work - even if I might have preferred something
different, you can learn the rules and write correct code using them.
Lots of people do!

Yes, there are many examples of this, so it is obviously true.
However, I don't think there are many large projects written in
C where there isn't undefined behavior lurking somewhere, and
the amount of effort required to learn _all_ the rules of the
language is unnecessarily large.

I think it is fair to say that there are people who wear their
knowledge of the C standard as a badge of honor and look down at
those who desire a simpler language or who do not know the rules
as well. Some of that is fair (we see examples in this group of
some who not only refuse to learn the rules of the language, but
revel in their ignorance).

But that doesn't mean that all of the criticism is wrong, and
the frequency at which it happens that people run into UB is
also an indictment of the language. Put it this way: it may be
the programmer's fault that they relied on UB, but that it is so
evidently hard to learn and internalize the rules is also the
fault of the language. It is not wrong to wish it were better.

I'd also define operations on narrow types, so the promotion rules
become unnecessary.

<aol> Me too! </aol>

I might start using the _BitInt types, once the versions of gcc I need
for the targets I need have good support for them.

Of course C can't be changed in this way without breaking tons of
existing code.

The curse of popularity.

The curse of history!

- Dan C.


From Bart@bc@freeuk.com to comp.lang.misc,comp.lang.c on Tue May 12 15:11:03 2026

From Newsgroup: comp.lang.c

On 12/05/2026 03:40, Dan Cross wrote:

In article <10ttvng$1j579$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 10/05/2026 14:05, Dan Cross wrote:

In article <10tpt9j$c3i4$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 10/05/2026 05:39, Janis Papanagnou wrote:

[snip]
What makes you think that I'd need to write an own language given that
there's a plethora of languages of all kinds and paradigms existing.

So where's the one that works like mine?

I mean, Rust does exactly what you were just describing.

Rust could hardly be more different than mine.

You were describing what Rust calls, `include_str!` and
`include_bytes!`. That's what I was referring to.

https://doc.rust-lang.org/std/macro.include_str.html
https://doc.rust-lang.org/std/macro.include_bytes.html

OK, so some specific features. I don't know Rust, other than when I
first started testing its compiler in 2021, it was one of the slowest
I'd ever tried.

However it is interesting that it includes both 'str' and 'byte'
versions which mirror my 'strinclude/sinclude' and 'binclude'.

C23 now has '#embed', although its operation is much clunkier. To try it
though I needed to download a new version, and used 16.x. I was
interested in how fast it could deal with large embedded data, and tried
this program:

#include <stdio.h>
#include <string.h>

char str[] = {
#embed "big.txt"
,0
};

int main(void) {
    printf("%zu\n", sizeof(str));
    printf("%zu\n", strlen(str));
}

'big.txt' contains 100 million 'A's, not zero-terminated. The ',0' will
do that (I understand attributes can be used to control that).

As for speed, I was pleasantly surprised: compiling this took 5 seconds.

(While #embed notionally produces a token list like 65,65,...., it must
handle it internally far more efficiently, especially within the
intermediate assembly.)

Still, with my language where it looks like this:

[]char str = sinclude("big.txt") # sinclude adds the terminator

it took only one second. (However, I don't use intermediate ASM so it
can be streamlined.)

It is also frustrating looking at C forums and people thinking they are
too stupid to grasp something when it's the language that could have been
better.

The problem you keep encountering here, specifically, is that by
your own admission you do not know C, the language, well enough
to accurately understand what would have made it a "language
that could have been better."

That's like saying I can't compare one car with another, because I don't
understand the internals or rationale of the one I criticise.

I do however understand the tasks I expect them to do, and can speak
about my experiences of using each.

So I don't need to care why that car behaves as it does; doing so will
not make the experience any better!

I brought up a Model T analogy for C the other day, and it is apt.

This actually applies to anybody; they don't need to be involved in
making their own vehicles. But if they are, then they will be in a
position to fix shortcomings.


From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Tue May 12 07:12:00 2026

From Newsgroup: comp.lang.c

Michael S <already5chosen@yahoo.com> writes:

I not just like stdint.h*
I also hate when C programmers define their own fixed-width integer
types.

[...]

That would raise my blood pressure:
typedef int32_t s32;
typedef uint32_t u32;
typedef uint8_t octet;

Can you say what it is about them that you don't like?
Or why you don't like them? Are the reasons the same
in all three cases, or is octet different?

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Tue May 12 14:27:51 2026

From Newsgroup: comp.lang.c

In article <864ikdp9lk.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <868q9ppg4o.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

antispam@fricas.org (Waldek Hebisch) writes:

[discussing the notion of "safe" programs]

As I wrote, safety is about ability to avoid or detect errors.

In the functional programming community the usual statement is
"Well-typed programs cannot go wrong."

This is only concerning _type safety_.

I didn't mean to imply anything different.

Looking at what you wrote:

|I think a good way of understanding this is that, if
|a program stays inside the safe limits of the language,
|the program can produce wrong answers, but it cannot
|produce meaningless answers.

You are wrong.

A well-typed program _can_ produce meaningless answers; those
answers will have a well-defined type, but it is impossible to
say whether the value produced has any meaning with respect to
the program's intended purpose. Moreover, the "safe limits of
the language", whatever those may be, have nothing to do with
it.

- Dan C.


From David Brown@david.brown@hesbynett.no to comp.lang.c on Tue May 12 16:32:40 2026

From Newsgroup: comp.lang.c

On 12/05/2026 16:05, Dan Cross wrote:

In article <10tuhmt$1o3bp$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

On 11/05/2026 23:30, Keith Thompson wrote:

David Brown <david.brown@hesbynett.no> writes:
[...]

Perhaps the main "mistake" (where "mistake" means "I personally think
C would be nicer for my own use if things were different") is that
when mixing operations between signed int and unsigned int, the signed
int is converted to unsigned. I suspect that in real-world code,
unsigned int values that are within the range of signed int are common
- and that negative signed int values are more common than unsigned
int values that are out of range of signed int. Any common type here,
unless it is larger than the two original types, is going to get some
things wrong - but I think that converging on signed int as the common
type would be wrong less often. And if that had been the rule, then
unsigned-preserving promotion would be correct too in examples like
yours.

[...]

If I were designing a new C-like language, I'd probably avoid the
issue of signed-preserving vs. value-preserving altogether. I might
say operations where one operand is signed and the other is unsigned
are not allowed; if you need that, you can cast one of the operands.

I'd be with you on that.

However, I think you'd quickly run into inconveniences and annoyances
with integer constants - you'd want "x * 2" to work regardless of the
signedness of x's type. I am no Ada expert, and it's OT anyway, but I
believe in Ada the type of integer constants adapts to fit when used
like this - you'd need something similar to make the hypothetical K&B C
language work well. Integer constants would have to be "questionably
signed", not signed or unsigned. (Maybe "adaptively typed" might be a
better term, and include the size of the type as well as the signedness.)

I think the term you are looking for is "strongly typed". :-)

Sure - I want this all to be strongly typed, but the question is what
should the type of integer constants / integer literals be? Ada calls
them "universal_integer" type, which might be a good name. (I don't
think there's a need to do too much bikeshedding for a purely
hypothetical language, however.)

That is, types are verifiably compatible. In a strongly- and
statically-typed language (that is, one where the types of
objects are known at compile time), it's possible to be both
expressive and precise. There are plenty of examples of such
languages, but the common characteristic is that they (usually)
_infer_ the type of an expression based on the types of the
operands; there are well-known, formally sound, techniques for
doing this.

Yes. I'd want the hypothetical language to be more strongly typed than C.

With respect to literal constants, this would simply mean that
the literal would be considered to be of the inferred type of
the expression it was in: if no such inference could be made
(for instance, the types are fundamentally incompatible), then
the compiler would fail, flagging the type incompatibility as an
error.

Yes.

So, if this were a fragment of a program in a hypothetical C
dialect that was strongly typed and used type inference,

unsigned int a = 5;
unsigned int c = a * 2;

both `5` and `2` would be inferred to have type `unsigned int`,
since both are representable as unsigned ints. However,

unsigned int c = a * -2;

would be a compile time error, since the resulting type of the
expression must be `unsigned int`, but `-2` is not an unsigned
integer: it would have to be explicitly converted first.

That would be good.

I think there'd be a fair bit of overlap in our personal perfected
versions or dialects of C - but I'm sure there would be differences too.

The C committee decided to impose a more or less reasonable rule on
all such operations; I might require the programmer to decide what
to do in each case. (There might be an exception for constants,
so that u+1 doesn't require a cast; I haven't thought through the
implications of that.)

Certainly the rules work - even if I might have preferred something
different, you can learn the rules and write correct code using them.
Lots of people do!

Yes, there are many examples of this, so it is obviously true.
However, I don't think there are many large projects written in
C where there isn't undefined behavior lurking somewhere, and
the amount of effort required to learn _all_ the rules of the
language is unnecessarily large.

I think it is fair to say that there are people who wear their
knowledge of the C standard as a badge of honor and look down at
those who desire a simpler language or who do not know the rules
as well. Some of that is fair (we see examples in this group of
some who not only refuse to learn the rules of the language, but
revel in their ignorance).

But that doesn't mean that all of the criticism is wrong, and
the frequency at which it happens that people run into UB is
also an indictment of the language. Put it this way: it may be
the programmer's fault that they relied on UB, but that it is so
evidently hard to learn and internalize the rules is also the
fault of the language. It is not wrong to wish it were better.

It is not wrong to wish C were better - with hindsight, there are many
ways in which a slightly different language would have kept the
advantages of C while reducing at least some risks of errors (whether UB
or not).

But I think that a lot of the UB you might find in large projects
would be bugs in the code regardless of how that UB might have been
defined. That is, even if signed integer arithmetic overflow had been
fully defined, you'd still get the wrong answer and the program has a
bug. The same with dereferencing a null pointer, or a buffer overflow,
or using the value of an uninitialised local variable. That is, if you
write your code so that it would have been bug-free in a language that
did not have these UB's, the C code would be the same.

The exceptions here would be cases where a programmer wrongly assumes
something has defined behaviour, and writes code according to that
assumption. Thus if they write code that assumes reading an
uninitialised local variable returns 0, or has an unspecified (but not
undefined) value, or that assumes signed integer overflow is defined as
wrapping - /then/ the C language's UB can surprise them in a way other
languages generally do not. I don't think there are other situations
where you could hit UB while expecting defined behaviour. (But as we
know, there are a few situations where the signed integer overflow can
be hiding unexpectedly, like uint16_t * uint16_t.)
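
A minimal sketch of that last case and one way around it, assuming the common situation of a 32-bit `int`. The name `mul16` is just for illustration. In a plain `a * b` both `uint16_t` operands are promoted to signed `int`, so 0xFFFF * 0xFFFF overflows it; converting one operand to `unsigned int` first keeps the whole multiplication unsigned.

```c
#include <stdint.h>

/* Hypothetical name. The cast makes the multiplication happen in
 * unsigned int, where wraparound is well-defined; the result is then
 * reduced to 16 bits by the conversion back to uint16_t. */
uint16_t mul16(uint16_t a, uint16_t b)
{
    return (uint16_t)((unsigned int)a * b);
}
```

For example, `mul16(0xFFFF, 0xFFFF)` yields 1, the low 16 bits of 0xFFFE0001, with no signed arithmetic involved.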

I'd also define operations on narrow types, so the promotion rules
become unnecessary.

<aol> Me too! </aol>

I might start using the _BitInt types, once the versions of gcc I need
for the targets I need have good support for them.

Of course C can't be changed in this way without breaking tons of
existing code.

The curse of popularity.

The curse of history!

- Dan C.


From Bart@bc@freeuk.com to comp.lang.c on Tue May 12 15:33:18 2026

From Newsgroup: comp.lang.c

On 12/05/2026 15:05, Dan Cross wrote:

In article <10tuhmt$1o3bp$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

On 11/05/2026 23:30, Keith Thompson wrote:

David Brown <david.brown@hesbynett.no> writes:
[...]

Perhaps the main "mistake" (where "mistake" means "I personally think
C would be nicer for my own use if things were different") is that
when mixing operations between signed int and unsigned int, the signed
int is converted to unsigned. I suspect that in real-world code,
unsigned int values that are within the range of signed int are common
- and that negative signed int values are more common than unsigned
int values that are out of range of signed int. Any common type here,
unless it is larger than the two original types, is going to get some
things wrong - but I think that converging on signed int as the common
type would be wrong less often. And if that had been the rule, then
unsigned-preserving promotion would be correct too in examples like
yours.

[...]

If I were designing a new C-like language, I'd probably avoid the
issue of signed-preserving vs. value-preserving altogether. I might
say operations where one operand is signed and the other is unsigned
are not allowed; if you need that, you can cast one of the operands.

I'd be with you on that.

However, I think you'd quickly run into inconveniences and annoyances
with integer constants - you'd want "x * 2" to work regardless of the
signedness of x's type. I am no Ada expert, and it's OT anyway, but I
believe in Ada the type of integer constants adapts to fit when used
like this - you'd need something similar to make the hypothetical K&B C
language work well. Integer constants would have to be "questionably
signed", not signed or unsigned. (Maybe "adaptively typed" might be a
better term, and include the size of the type as well as the signedness.)

I think the term you are looking for is "strongly typed". :-)
That is, types are verifiably compatible. In a strongly- and
statically-typed language (that is, one where the types of
objects are known at compile time), it's possible to be both
expressive and precise. There are plenty of examples of such
languages, but the common characteristic is that they (usually)
_infer_ the type of an expression based on the types of the
operands; there are well-known, formally sound, techniques for
doing this.

With respect to literal constants, this would simply mean that
the literal would be considered to be of the inferred type of
the expression it was in: if no such inference could be made
(for instance, the types are fundamentally incompatible), then
the compiler would fail, flagging the type incompatibility as an
error.

So, if this were a fragment of a program in a hypothetical C
dialect that was strongly typed and used type inference,

unsigned int a = 5;
unsigned int c = a * 2;

both `5` and `2` would be inferred to have type `unsigned int`,
since both are representable as unsigned ints. However,

unsigned int c = a * -2;

would be a compile time error, since the resulting type of the
expression must be `unsigned int`, but `-2` is not an unsigned
integer: it would have to be explicitly converted first.

The C committee decided to impose a more or less reasonable rule on
all such operations; I might require the programmer to decide what
to do in each case. (There might be an exception for constants,
so that u+1 doesn't require a cast; I haven't thought through the
implications of that.)

Certainly the rules work - even if I might have preferred something
different, you can learn the rules and write correct code using them.
Lots of people do!

Yes, there are many examples of this, so it is obviously true.
However, I don't think there are many large projects written in
C where there isn't undefined behavior lurking somewhere, and
the amount of effort required to learn _all_ the rules of the
language is unnecessarily large.

I think it is fair to say that there are people who wear their
knowledge of the C standard as a badge of honor and look down at
those who desire a simpler language or who do not know the rules
as well. Some of that is fair (we see examples in this group of
some who not only refuse to learn the rules of the language, but
revel in their ignorance).

Take for example C's set of operator precedences.

The one for the ?: operator is particularly obscure, so in an expression
like one of these:

a + b ? c - d : e * f
a ? b ? c : d ? e : f : g

then parentheses would be used to make things clearer. (I haven't checked
these are valid, but that is the point; it is hard to see!)
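
For what it's worth, here is a small sketch of how the C grammar groups those two expressions, assuming all operands are plain `int`. The function names are made up for illustration. Each unparenthesised version sits next to a fully parenthesised one that should behave identically.

```c
/* The two expressions exactly as written above, with no parentheses. */
int cond1(int a, int b, int c, int d, int e, int f)
{
    return a + b ? c - d : e * f;
}

int cond2(int a, int b, int c, int d, int e, int f, int g)
{
    return a ? b ? c : d ? e : f : g;
}

/* The same expressions with the grouping made explicit: the arithmetic
 * operators bind tighter than ?:, and nested ?: groups to the right. */
int cond1_explicit(int a, int b, int c, int d, int e, int f)
{
    return (a + b) ? (c - d) : (e * f);
}

int cond2_explicit(int a, int b, int c, int d, int e, int f, int g)
{
    return a ? (b ? c : (d ? e : f)) : g;
}
```

Both originals are valid C; whether anyone can read them at a glance is a separate question.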

But shouldn't people be expected to learn the rules? Why is it OK
to 'revel' in not knowing the basics here, but not when unnecessary UBs
are involved where rules are harder and which depend on runtime inputs?

(In my syntaxes, the ?: equivalent /requires/ parentheses. And some of
those UBs are not UBs. To get back to my car analogy, it's like somebody
refusing to master double-declutching, but in a modern car it is not
necessary.

As for mixing signed and unsigned, I have my own misgivings about that,
and am moving slowly into marginalising unsigned types, but it is
already causing some unintuitive errors in either language.)

From David Brown@david.brown@hesbynett.no to comp.lang.c on Tue May 12 16:46:04 2026

From Newsgroup: comp.lang.c

On 12/05/2026 15:10, Dan Cross wrote:

In article <10ttem6$1daks$2@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

On 11/05/2026 19:07, Dan Cross wrote:

[snip]
The C89 rationale document is useful here, specifically section
3.2.1.1.

It describes the tradeoffs between unsigned-preserving and
value-preserving semantics that the committee considered when
making the decision to codify value-preserving behavior. Of
note to this discussion is the following:

|Both schemes give the same answer in the vast majority of
|cases, and both give the same effective result in even more
|cases in implementations with twos complement arithmetic and
|quiet wraparound on signed overflow -- that is, in most current
|implementations.

Yes, I've read the rationale here, and I'm still not convinced I
understand their reasoning.

Nor am I.

[snip]
The situations they were thinking about were things like this:

unsigned short a = 8;
int b = -5;
long c = a * b;

With value-preserving semantics, `c` is -40. On the other hand,
with unsigned-preserving semantics, assuming a 64-bit `long` and
32-bit `int`, `c` is 4294967256; logical enough, but one could
see how that might be surprising for someone unfamiliar with the
language.
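
A sketch of the two schemes side by side, assuming `int` is wider than `unsigned short`. Standard C only implements the value-preserving rule, so the unsigned-preserving one is emulated here with a cast. The function names are illustrative, and `long long` is used in place of `long` so that the widening step does not depend on `long` being 64 bits.

```c
#include <limits.h>

/* What standard C does (value-preserving): a is promoted to int, so
 * the product is computed as a signed int and then widened. */
long long value_preserving(unsigned short a, int b)
{
    return a * b;
}

/* Emulation of the unsigned-preserving rule: a becomes unsigned int,
 * b is converted to unsigned int as well, and the product wraps
 * modulo UINT_MAX + 1 before it is widened. */
long long unsigned_preserving(unsigned short a, int b)
{
    return (unsigned int)a * b;
}
```

With `a` = 8 and `b` = -5 the first function gives -40; the second gives `UINT_MAX - 39`, which is 4294967256 when `unsigned int` is 32 bits.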

Thanks for that example.

Perhaps the main "mistake" (where "mistake" means "I personally think C
would be nicer for my own use if things were different") is that when
mixing operations between signed int and unsigned int, the signed int is
converted to unsigned. I suspect that in real-world code, unsigned int
values that are within the range of signed int are common - and that
negative signed int values are more common than unsigned int values that
are out of range of signed int. Any common type here, unless it is
larger than the two original types, is going to get some things wrong -
but I think that converging on signed int as the common type would be
wrong less often. And if that had been the rule, then
unsigned-preserving promotion would be correct too in examples like yours.

If I understand what you're saying -- and correct me if I'm
wrong -- it sounds like you are suggesting sign-preserving
semantics for all types.

Yes. (Although I might not have thought through all the consequences of
this - so it's possible that I'll later realise or learn that it would
have been a bad idea.)

I'm sure they must have at least talked about that. Where did
the idea go? I'm speculating, but I think they were trying to
thread a needle here, and felt that redefining the semantics for
types ranked with `int` and higher would be a bridge too far. I
keep saying I had (and still have) a lot of sympathy for the
committee: they were charged with imposing order on an unruly
situation, balancing many competing organizations and interests,
all while preserving compatibility with existing practice and
implementations, and (as they put it) retaining the "character"
of C. This is an unenviable position to be in.

Sounds reasonable.

I imagine the committee felt that, by the time the standards
process was in full swing, the ship had sailed on changing the
rules for values of type `int` or types of higher ranks, and
they could only reasonably address promotion of lesser ranked
types to that of `int`. They acknowledged that the
sign-preserving promotion rules were a big semantic difference
from established practice; had they attempted to mandate
sign-preserving rules for arithmetic involving the `int` family
of types, they likely would have faced a serious revolt.

And as they said in the rationale, in _most_ cases, it doesn't
matter; for `int`/`unsigned int` even less so. For instance,
assume a platform with 32-bit `int`. Then the behavior of this
code is implementation-defined, but documented to have the same
predictable result across most conforming compilers:

I don't know much about early C compilers (other than briefly trying C
on a home computer in my teens, ANSI C was established by the time I
first used C). Did any / many early C compilers guarantee wrapping for
signed integer arithmetic? It is not a guarantee I have seen in any of
the embedded C compiler manuals I have read, though some of these
compilers were far too weakly optimising for it to have made a difference.

unsigned int a = 8;
int b = -5;
int c = a * b;

To wit, `b` is converted to `unsigned int` per the rules set
forth in the standard prior to the multiplication; the product
is taken in some ring $Z/2^nZ$ where $n$ is the bit-width of
`unsigned int` (in this example, 32); the product is then
converted to `int` to initialize `c`, but per the rules for
unsigned-to-signed conversions, the result is
implementation-defined (since the product is outside of the
range of the positive subset of 32-bit numbers in 2s complement
representation). However, almost all real implementations will
define this using twos complement semantics with no change to
representation, and assign the resulting value to `c`.
This is, surely, by far the most common case.

Yes, you end up with the same answer of -40, when "c" is an "int". But
if "c" is "long" (like in your first example), and that is bigger than
"int", the answer is 4294967256 which is almost certainly not what the
programmer intended. If the common type for "a * b" had been signed
int, rather than unsigned int, then you'd get -40 whether "c" is "int"
or "long". And you'd get it more directly, with less IB.
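
To make that concrete, a minimal sketch assuming a 32-bit `int` and a compiler such as gcc or clang that defines out-of-range unsigned-to-signed conversion as two's complement wrapping. The function names are illustrative, and `long long` stands in for a `long` that is wider than `int`.

```c
/* Product computed in unsigned int, then converted to int. The
 * conversion of an out-of-range value is implementation-defined. */
int product_as_int(unsigned int a, int b)
{
    return a * b;
}

/* Same unsigned product, but widened: the large unsigned value is
 * preserved instead of turning back into a negative number. */
long long product_as_wide(unsigned int a, int b)
{
    return a * b;
}

/* Converting to the wider signed type before multiplying keeps the
 * arithmetic signed, which gives the mathematically expected result. */
long long product_signed_first(unsigned int a, int b)
{
    return (long long)a * b;
}
```

With 8 and -5 as arguments, the first and third functions give -40 under those assumptions, while the second gives 4294967256.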

So, for all _practical_ purposes, the interpretation of the
product as signed or unsigned only matters in the handful of
cases listed in the rationale: using the result in a comparison,
right-shifting the result or widening it (in which case
sign-extension matters, now that all the world's a 2s complement
machine) and so on.
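
The comparison case is probably the easiest one to trip over. A small sketch follows, with made-up function names: the first function uses the mixed comparison as C defines it, the second compares by mathematical value.

```c
#include <stdbool.h>

/* Mixed comparison as C defines it: s is converted to unsigned int
 * first, so a negative s compares as a very large value. */
bool less_mixed(int s, unsigned int u)
{
    return s < u;
}

/* Comparison by value: a negative s is smaller than any unsigned
 * value; otherwise s is non-negative and the conversion is exact. */
bool less_by_value(int s, unsigned int u)
{
    return s < 0 || (unsigned int)s < u;
}
```

So `less_mixed(-5, 8)` is false even though -5 is less than 8, while `less_by_value(-5, 8)` is true.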

And in cases where the compiler permits silent wrapping on
signed overflow, as I firmly believe they expected to be the
near-universal case, they made the argument that it mattered
even less.

Of course, we understand the consequences of these decisions
much better now, 40 years after the fact. But I really don't
think they thought things would unfold the way they have, with
UB taking such a prominent role as a basis for optimization.

- Dan C.

Well, as they say, making predictions is hard - especially about the future!


From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Tue May 12 15:19:20 2026

From Newsgroup: comp.lang.c

David Brown <david.brown@hesbynett.no> writes:

On 12/05/2026 15:10, Dan Cross wrote:

And as they said in the rationale, in _most_ cases, it doesn't
matter; for `int`/`unsigned int` even less so. For instance,
assume a platform with 32-bit `int`. Then the behavior of this
code is implementation-defined, but documented to have the same
predictable result across most conforming compilers:

I don't know much about early C compilers (other than briefly trying C
on a home computer in my teens, ANSI C was established by the time I
first used C). Did any / many early C compilers guarantee wrapping for
signed integer arithmetic?

For the early C compiler on the PDP-11, the 'int' type was
16 bits, implicitly signed, and the code generator simply emitted the
available arithmetic instructions.

It was the only C compiler at the time, any guarantees would have
been implicit in the choice of target architecture.

I mostly wrote unix kernel code using the v6 compiler, rather
than writing code that did any heavy math, so whether value was preserved
or sign was preserved wasn't something I, as a kernel programmer,
routinely considered.
--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Tue May 12 15:57:50 2026

From Newsgroup: comp.lang.c

In article <10tvefc$1vmna$2@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

On 12/05/2026 15:10, Dan Cross wrote:

In article <10ttem6$1daks$2@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

[snip]
Perhaps the main "mistake" (where "mistake" means "I personally think C
would be nicer for my own use if things were different") is that when
mixing operations between signed int and unsigned int, the signed int is
converted to unsigned. I suspect that in real-world code, unsigned int
values that are within the range of signed int are common - and that
negative signed int values are more common than unsigned int values that
are out of range of signed int. Any common type here, unless it is
larger than the two original types, is going to get some things wrong -
but I think that converging on signed int as the common type would be
wrong less often. And if that had been the rule, then
unsigned-preserving promotion would be correct too in examples like yours.

If I understand what you're saying -- and correct me if I'm
wrong -- it sounds like you are suggesting sign-preserving
semantics for all types.

Yes. (Although I might not have thought through all the consequences of
this - so it's possible that I'll later realise or learn that it would
have been a bad idea.)

Fair. This is all hypothetical.

[snip]
I imagine the committee felt that, by the time the standards
process was in full swing, the ship had sailed on changing the
rules for values of type `int` or types of higher ranks, and
they could only reasonably address promotion of lesser-ranked
types to that of `int`. They acknowledged that the
sign-preserving promotion rules were a big semantic difference
from established practice; had they attempted to mandate
sign-preserving rules for arithmetic involving the `int` family
of types, they likely would have faced a serious revolt.

And as they said in the rationale, in _most_ cases, it doesn't
matter; for `int`/`unsigned int` even less so. For instance,
assume a platform with 32-bit `int`. Then the behavior of this
code is implementation-defined, but documented to have the same
predictable result across most conforming compilers:

I don't know much about early C compilers (other than briefly trying C
on a home computer in my teens, ANSI C was established by the time I
first used C). Did any / many early C compilers guarantee wrapping for
signed integer arithmetic? It is not a guarantee I have seen in any of
the embedded C compiler manuals I have read, though some of these
compilers were far too weakly optimising for it to have made a difference.

I think "guarantee" is too strong of a word; after all, there
was no standard in which to make a guarantee, but that was how
very early C compilers operated in practice. They were very
primitive, probably in part because of the paucity of the
machine they were developed on, so one really could imagine the
instructions that would be emitted in response to a given line
of code (C's unwarranted reputation as a "high-level assembler"
likely comes from this).

Pre-typesetter C, in particular, was pretty wild, though the
basic skeletal structure of the language as we know it had
mostly settled by then. Still, if one looks at the 6th Edition
Unix kernel source codes, one will frequently find things like
this (excerpted from the DN-11 driver):

```
struct dn {
struct {
char dn_stat;
char dn_reg;
} dn11[3];
}

#define DNADDR 0175200

dnopen(dev, flag)
{
register struct dn *dp;
register int rdev;

rdev = dev.d_minor;
dp = &DNADDR->dn11[rdev];
if (dp->dn_reg&(PWI|DLO))
u.u_error = ENXIO;
else {
DNADDR->dn11[0].dn_stat =| MENABLE;
dp->dn_stat = IENABLE|MENABLE|CRQ;
}
}
```

Notice what the struct member references are made against:
not just a variable with no declared type, but an integer
constant. In early C, all `struct` members shared a single
common namespace, so the language assumed that if it saw
`->member`, the thing on the left side of `->` must be a pointer
to an instance of whatever `struct` definition contained
`member`. On the PDP-11, an integer literal was taken as an
absolute address in the virtual address space of the program, as
defined by the settings in its segmentation registers. In the
kernel, this is basically a physical address.
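
For comparison, here is a rough modern rendering of the same access pattern. It is only a sketch: the `DEMO_*` bit values and the name `dn_open_unit` are made up for illustration (the real driver's definitions are not in the excerpt), and the register block is passed in as a pointer so the code can run on a hosted system instead of poking address 0175200.

```c
#include <stdint.h>

/* Register layout as in the excerpt: two byte-wide registers per unit. */
struct dn_unit {
    volatile uint8_t dn_stat;
    volatile uint8_t dn_reg;
};

struct dn {
    struct dn_unit dn11[3];
};

/* Hypothetical bit values, for illustration only. */
enum {
    DEMO_PWI     = 0x80,
    DEMO_DLO     = 0x10,
    DEMO_IENABLE = 0x40,
    DEMO_MENABLE = 0x04,
    DEMO_CRQ     = 0x01
};

/* On real hardware the base would come from an explicit cast, e.g.
 * (struct dn *)0175200; modern C will not apply "->" to a bare integer.
 * Returns 0 on success, -1 if the unit is out of range or not ready. */
int dn_open_unit(struct dn *base, unsigned unit)
{
    struct dn_unit *dp;

    if (unit >= 3)
        return -1;
    dp = &base->dn11[unit];
    if (dp->dn_reg & (DEMO_PWI | DEMO_DLO))
        return -1;
    base->dn11[0].dn_stat |= DEMO_MENABLE;   /* modern spelling of "=|" */
    dp->dn_stat = DEMO_IENABLE | DEMO_MENABLE | DEMO_CRQ;
    return 0;
}
```

The two visible differences from the 6th Edition code are the explicit pointer type on the left of `->` and `|=` in place of `=|`.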

unsigned int a = 8;
int b = -5;
int c = a * b;

To wit, `b` is converted to `unsigned int` per the rules set
forth in the standard prior to the multiplication; the product
is taken in the ring $Z/2^nZ$ where $n$ is the bit-width of
`unsigned int` (in this example, 32); the product is then
converted to `int` for the initialization of `c`, but per the
rules for unsigned-to-signed conversions, the result is
implementation-defined (since the product is outside the range
of values representable in a 32-bit `int`). However, almost all
real implementations define this conversion using two's
complement semantics with no change to representation, and
store the resulting value in `c`.
This is, surely, by far the most common case.
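
The conversion steps above can be checked with a small sketch. It assumes a 32-bit `int` and a 64-bit `long long`; the function names are just for illustration.

```c
/* b is converted to unsigned int before the multiply, so the product
 * is computed modulo 2^32 (assuming 32-bit unsigned int). */
unsigned int product_as_unsigned(unsigned int a, int b)
{
    return a * b;
}

/* Widening the unsigned product zero-extends it: the sign is gone. */
long long widen_unsigned_product(unsigned int a, int b)
{
    return a * b;
}

/* Converting an operand to the wider signed type first keeps the sign. */
long long widen_signed_product(unsigned int a, int b)
{
    return (long long)a * b;
}
```

With `a = 8` and `b = -5`, the first two functions yield 4294967256 and the third yields -40. Storing the unsigned product into an `int` gives -40 only through the implementation-defined conversion.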

Yes, you end up with the same answer of -40, when "c" is an "int". But
if "c" is "long" (like in your first example), and that is bigger than
"int", the answer is 4294967256 which is almost certainly not what the
programmer intended. If the common type for "a * b" had been signed
int, rather than unsigned int, then you'd get -40 whether "c" is "int"
or "long". And you'd get it more directly, with less IB.

But you'd have more UB, because you'd run into signed overflow
more often (assuming they preserved that as UB in this
hypothetical alternate reality). If, instead, they had defined
the language to have unsigned-preserving semantics and defined
the behavior of unsigned to signed conversion to be the inverse
of signed to unsigned conversion, then you'd get the same result
without the IB.

So, for all _practical_ purposes, the interpretation of the
product as signed or unsigned only matters in the handful of
cases listed in the rationale: using the result in a comparison,
right-shifting the result or widening it (in which case
sign-extension matters, now that all the world's a 2s complement
machine) and so on.

And in cases where the compiler permits silent wrapping on
signed overflow, as I firmly believe they expected to be the
near-universal case, they made the argument that it mattered
even less.

Of course, we understand the consequences of these decisions
much better now, 40 years after the fact. But I really don't
think they thought things would unfold the way they have, with
UB taking such a prominent role as a basis for optimization.

Well, as they say, making predictions is hard - especially about the future!

Lol. Thanks, Steincke.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From David Brown@david.brown@hesbynett.no to comp.lang.c on Tue May 12 19:07:51 2026

From Newsgroup: comp.lang.c

On 12/05/2026 17:57, Dan Cross wrote:

In article <10tvefc$1vmna$2@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

On 12/05/2026 15:10, Dan Cross wrote:

In article <10ttem6$1daks$2@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
[snip]
I imagine the committee felt that, by the time the standards
process was in full swing, the ship had sailed on changing the
rules for values of type `int` or types of higher ranks, and
they could only reasonably address promotion of lesser-ranked
types to that of `int`. They acknowledged that the
sign-preserving promotion rules were a big semantic difference
from established practice; had they attempted to mandate
sign-preserving rules for arithmetic involving the `int` family
of types, they likely would have faced a serious revolt.

And as they said in the rationale, in _most_ cases, it doesn't
matter; for `int`/`unsigned int` even less so. For instance,
assume a platform with 32-bit `int`. Then the behavior of this
code is implementation-defined, but documented to have the same
predictable result across most conforming compilers:

I don't know much about early C compilers (other than briefly trying C
on a home computer in my teens, ANSI C was established by the time I
first used C). Did any / many early C compilers guarantee wrapping for
signed integer arithmetic? It is not a guarantee I have seen in any of
the embedded C compiler manuals I have read, though some of these
compilers were far too weakly optimising for it to have made a difference.

I think "guarantee" is too strong of a word; after all, there
was no standard in which to make a guarantee, but that was how
very early C compilers operated in practice. They were very
primitive, probably in part because of the paucity of the
machine they were developed on, so one really could imagine the
instructions that would be emitted in response to a given line
of code (C's unwarranted reputation as a "high-level assembler"
likely comes from this).

Perhaps "documented" would be better than "guaranteed". I realise that
in many situations, even highly optimising compilers generate signed
integer arithmetic operations that wrap. But to me, it's important what
the documentation says. The C standard says signed integer overflow
is UB - if a C compiler's manual does not document what the compiler
does with overflow, you can't rely on any particular behaviour. But if
the manual says "signed integer overflow follows the target processor's
behaviour" and you know that is wrapping (no traps or other
"interesting" stuff), that's fine. Before the C standard, then of
course the compiler manual (and any referenced documents) would be the
only source of information on the semantics.

Very occasionally, I'll rely on "what happens in practice" - if there is
no good and efficient way to avoid it and I can be sure from testing and examining generated assembly code that everything works as I want.
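
Where relying on documented wrapping (GCC and Clang, for instance, document a `-fwrapv` option) is not possible, the overflow can be tested for before it happens. A minimal sketch, with `add_checked` as an illustrative name rather than a standard function:

```c
#include <limits.h>
#include <stdbool.h>

/* Stores a + b in *sum and returns true if the sum is representable
 * as an int; otherwise returns false and leaves *sum untouched.
 * The comparisons themselves never overflow. */
bool add_checked(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *sum = a + b;
    return true;
}
```

This avoids signed overflow entirely, at the cost of two comparisons per addition.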

Pre-typesetter C, in particular, was pretty wild, though the
basic skeletal structure of the language as we know it had
mostly settled by then. Still, if one looks at the 6th Edition
Unix kernel source codes, one will frequently find things like
this (excerpted from the DN-11 driver):

```
struct dn {
struct {
char dn_stat;
char dn_reg;
} dn11[3];
}

#define DNADDR 0175200

dnopen(dev, flag)
{
register struct dn *dp;
register int rdev;

rdev = dev.d_minor;
dp = &DNADDR->dn11[rdev];
if (dp->dn_reg&(PWI|DLO))
u.u_error = ENXIO;
else {
DNADDR->dn11[0].dn_stat =| MENABLE;
dp->dn_stat = IENABLE|MENABLE|CRQ;
}
}
```

Notice what the struct member references are made against:
not just a variable with no declared type, but an integer
constant. In early C, all `struct` members shared a single
common namespace, so the language assumed that if it saw
`->member`, the thing on the left side of `->` must be a pointer
to an instance of whatever `struct` definition contained
`member`. On the PDP-11, an integer literal was taken as an
absolute address in the virtual address space of the program, as
defined by the settings in its segmentation registers. In the
kernel, this is basically a physical address.

Yes, I knew that's how structs worked before (though I have never had to
work with any code from that time). I notice also it has "=|" rather
than "|=".

And it seems to have been written at a time when space characters still
cost real money :-)

unsigned int a = 8;
int b = -5;
int c = a * b;

To wit, `b` is converted to `unsigned int` per the rules set
forth in the standard prior to the multiplication; the product
is taken in the ring $Z/2^nZ$ where $n$ is the bit-width of
`unsigned int` (in this example, 32); the product is then
converted to `int` for the initialization of `c`, but per the
rules for unsigned-to-signed conversions, the result is
implementation-defined (since the product is outside the range
of values representable in a 32-bit `int`). However, almost all
real implementations define this conversion using two's
complement semantics with no change to representation, and
store the resulting value in `c`.
This is, surely, by far the most common case.

Yes, you end up with the same answer of -40, when "c" is an "int". But
if "c" is "long" (like in your first example), and that is bigger than
"int", the answer is 4294967256 which is almost certainly not what the
programmer intended. If the common type for "a * b" had been signed
int, rather than unsigned int, then you'd get -40 whether "c" is "int"
or "long". And you'd get it more directly, with less IB.

But you'd have more UB, because you'd run into signed overflow
more often (assuming they preserved that as UB in this
hypothetical alternate reality).

Would you get more signed overflow in practice? And in particular,
would you get more signed overflow UB in places where you would not have
a bug in the code anyway? There would certainly be more cases of signed
integer arithmetic, whereas moving to a common unsigned type means more
unsigned integer arithmetic. But I don't see signed integer arithmetic
as a risk of UB in itself - it is only a risk of UB if you are working with
inappropriate values.

I think perhaps this is getting a bit speculative - we can't really give quantitative values for the risk of problems with particular expressions
in existing C code. I believe the conclusion is simply that the C
committee chose the rules that they thought, at the time, gave the most consistent results with the least risk of introducing new problems in
existing code written for a variety of slightly different C dialects.
Four decades later I disagree with some of those decisions, but there's nothing to be done about it now.

If, instead, they had defined
the language to have unsigned-preserving semantics and defined
the behavior of unsigned to signed conversion to be the inverse
of signed to unsigned conversion, then you'd get the same result
without the IB.

So, for all _practical_ purposes, the interpretation of the
product as signed or unsigned only matters in the handful of
cases listed in the rationale: using the result in a comparison,
right-shifting the result or widening it (in which case
sign-extension matters, now that all the world's a 2s complement
machine) and so on.

And in cases where the compiler permits silent wrapping on
signed overflow, as I firmly believe they expected to be the
near-universal case, they made the argument that it mattered
even less.

Of course, we understand the consequences of these decisions
much better now, 40 years after the fact. But I really don't
think they thought things would unfold the way they have, with
UB taking such a prominent role as a basis for optimization.

Well, as they say, making predictions is hard - especially about the future!

Lol. Thanks, Steincke.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Tue May 12 17:27:35 2026

From Newsgroup: comp.lang.c

In article <10tvdm8$1vmna$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

On 12/05/2026 16:05, Dan Cross wrote:

[snip]
I think the term you are looking for is "strongly typed". :-)

Sure - I want this all to be strongly typed, but the question is what
should the type of integer constants / integer literals be? Ada calls
them "universal_integer" type, which might be a good name. (I don't
think there's a need to do too much bikeshedding for a purely
hypothetical language, however.)

It is whatever it is inferred to be by the compiler, during type
analysis.

In this discussion, we are hypothesizing a syntax for numeric
literals that is ambiguous, in that the same literal can be
shared between different types: 2 is 2, in our notation for both
signed and unsigned integers. One might say that such a literal
could be an instance of any in a _set_ of potential types, of
which the value of the literal is a member. The question is
then, how do we select one?

Assuming, say, the Hindley-Milner type inference algorithm, the
process is pretty well-defined: the semantics of the language
define what the types must be for a given "phrase" in the
language, and the compiler tries to unify those with whatever
constructs it finds match those phrases in the program sources.
If that process succeeds, it knows the required type; if it
fails, then the program is in error. Ie, the semantics might
say, "the language provides a way to define functions. Part of
a function's definition is defining the types of its arguments.
If a function is defined to take two integer arguments, then its
arguments must be integers." It sounds tautological, but
consider that the semantics could _also_ include a bunch of
implicit type conversion rules (this is what C does, more or
less). But then it would be weakly typed: one could not look at
the definition of the function and know that the arguments could
not be, say, character data that is implicitly converted to an implementation-defined code point.

Anyway, for literals, the set of types to which one may belong
is a unification input. If unification succeeds but the result
is an equivalence class, then the types are ambiguous and
inference usually fails with the compiler complaining; often the
programmer can fix that by adding an explicit type annotation
somewhere.

That is, types are verifiably compatible. In a strongly- and
statically-typed language (that is, one where the types of
objects are known at compile time), it's possible to be both
expressive and precise. There are plenty of examples of such
languages, but the common characteristic is that they (usually)
_infer_ the type of an expression based on the types of the
operands; there are well-known, formally sound, techniques for
doing this.

Yes. I'd want the hypothetical language to be more strongly typed than C.

Absolutely.

With respect to literal constants, this would simply mean that
the literal would be considered to be of the inferred type of
the expression it was in: if no such inference could be made
(for instance, the types are fundamentally incompatible), then
the compiler would fail, flagging the type incompatibility as an
error.

Yes.

So, if this were a fragment of a program in a hypothetical C
dialect that was strongly typed and used type inference,

unsigned int a = 5;
unsigned int c = a * 2;

both `5` and `2` would be inferred to have type `unsigned int`,
since both are representable as unsigned ints. However,

unsigned int c = a * -2;

would be a compile time error, since the resulting type of the
expression must be `unsigned int`, but `-2` is not an unsigned
integer: it would have to be explicitly converted first.
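
For contrast, this is what today's C does with that second expression. It is a small sketch assuming a 32-bit `unsigned int`, and the function name is illustrative.

```c
/* In standard C, -2 is silently converted to unsigned int (UINT_MAX - 1)
 * before the multiply, and the product wraps modulo 2^32. No diagnostic
 * is required. */
unsigned int times_minus_two(unsigned int a)
{
    return a * -2;
}
```

With `a = 5` this returns 4294967286, i.e. 2^32 - 10, which is exactly the silent conversion the hypothetical dialect would reject.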

That would be good.

I think there'd be a fair bit of overlap in our personal perfected
versions or dialects of C - but I'm sure there would be differences too.

Oh sure.

The C committee decided to impose a more or less reasonable rule on
all such operations; I might require the programmer to decide what
to do in each case. (There might be an exception for constants,
so that u+1 doesn't require a cast; I haven't thought through the
implications of that.)

Certainly the rules work - even if I might have preferred something
different, you can learn the rules and write correct code using them.
Lots of people do!

Yes, there are many examples of this, so it is obviously true.
However, I don't think there are many large projects written in
C where there isn't undefined behavior lurking somewhere, and
the amount of effort required to learn _all_ the rules of the
language is unnecessarily large.

I think it is fair to say that there are people who wear their
knowledge of the C standard as a badge of honor and look down at
those who desire a simpler language or who do not know the rules
as well. Some of that is fair (we see examples in this group of
some who not only refuse to learn the rules of the language, but
revel in their ignorance).

But that doesn't mean that all of the criticism is wrong, and
the frequency at which it happens that people run into UB is
also an indictment of the language. Put it this way: it may be
the programmer's fault that they relied on UB, but that it is so
evidently hard to learn and internalize the rules is also the
fault of the language. It is not wrong to wish it were better.

It is not wrong to wish C were better - with hindsight, there are many
ways in which a slightly different language would have kept the
advantages of C while reducing at least some risks of errors (whether UB
or not).

But I think that a lot of the UB you might find in large projects
would be bugs in the code regardless of how that UB might have been
defined.

I think that most of it is emergent. One is consuming some
library, and one uses a function-like macro that that library
exposes through some header file, and one thinks that everything
is well-defined; at least, it appears so given the context
the macro appears in, but this tickles UB in some way the
programmer may not be aware of. That kind of thing happens
pretty frequently, and UB frequently manifests as spooky action
at a distance.

For example, a few years ago at my previous job, someone
discovered a version of Linux compiled with clang/LLVM behaving
strangely; eventually, it was traced to a linked-list library
that exhibited UB under some obscure condition (I'm afraid I no
longer recall the details), and the compiler taking advantage of
that to remove some crucial check for something. I vaguely
remember the Linux people demanding a knob to force the compiler's
behavior. However, the salient issue to my mind was that no one
saw the UB problem until the code that _used_ that library
started exhibiting errors. It's too easy to make a mess.

That is, even if signed integer arithmetic overflow had been
fully defined, you'd still get the wrong answer and the program has a
bug. The same with dereferencing a null pointer, or a buffer overflow,
or using the value of an uninitialised local variable. That is, if you
write your code so that it would have been bug-free in a language that
did not have these UB's, the C code would be the same.

Sure, though this ignores the issue with large projects that
integrate many parts I mentioned above. Sometimes it's due to
something you did, but it's happening somewhere else because of
the way they used your thing (if that makes any sense).

I would argue those things you mentioned should either
be a) unrepresentable (e.g., provide non-nullable references in
the language), or b) hard errors that are defined to trap unless
wrapping behavior is explicitly requested. Of course, we all
agree that language would not be C. :-)

The exceptions here would be cases where a programmer wrongly assumes
something has defined behaviour, and writes code according to that
assumption. Thus if they write code that assumes reading an
uninitialised local variable returns 0, or has an unspecified (but not
undefined) value, or that assumes signed integer overflow is defined as
wrapping - /then/ the C language's UB can surprise them in a way other
languages generally do not. I don't think there are other situations
where you could hit UB while expecting defined behaviour. (But as we
know, there are a few situations where the signed integer overflow can
be hiding unexpectedly, like uint16_t * uint16_t.)

Yes.

UB often manifests as spooky action at a distance, and the loose
notion of "undefined behavior" that is interpreted as, "lol the
compiler can do whateeeeever, bro...yolo!" is, I think, bad.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Tue May 12 18:09:34 2026

From Newsgroup: comp.lang.c

In article <10tvmp7$23t17$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

On 12/05/2026 17:57, Dan Cross wrote:

I think "guarantee" is too strong of a word; after all, there
was no standard in which to make a guarantee, but that was how
very early C compilers operated in practice. They were very
primitive, probably in part because of the paucity of the
machine they were developed on, so one really could imagine the
instructions that would be emitted in response to a given line
of code (C's unwarranted reputation as a "high-level assembler"
likely comes from this).

Perhaps "documented" would be better than "guaranteed". I realise that
in many situations, even highly optimising compilers generate signed
integer arithmetic operations that wrap. But to me, it's important what
the documentation says. The C standard says signed integer overflow
is UB - if a C compiler's manual does not document what the compiler
does with overflow, you can't rely on any particular behaviour. But if
the manual says "signed integer overflow follows the target processor's
behaviour" and you know that is wrapping (no traps or other
"interesting" stuff), that's fine. Before the C standard, then of
course the compiler manual (and any referenced documents) would be the
only source of information on the semantics.

Sure, you turned around and hollered across the room, asking
Dennis what it did. :-D (I am only half joking.)

There is historical evidence indicating that a few references
were available, but they were mostly informal and internal-only.
The closest to actually describing the behavior of the language
was the C Reference Manual, which was in the printed version of
volume 2 of the Unix Programmer's Manual; in the version that
described 7th Edition Unix, it mentions overflow exactly once,
when describing a bug in the implementation that ran on the
Honeywell 6070 computer, under the GCOS system. It does mention
2's complement in a couple of places, when describing `>>` and
the `char` and `int` types.

K&R 1st Edition says, "The handling of overflow and divide check
in expression evaluation is machine-dependent. All existing
implementations of C ignore integer overflows" and in the
section on arithmetic operators, "The action taken on overflow
or underflow depends on the machine at hand."

One might infer from that that overflow was intended to be
defined as having 2s complement wrapping semantics, but that is
not said explicitly.

Very occasionally, I'll rely on "what happens in practice" - if there is
no good and efficient way to avoid it and I can be sure from testing and
examining generated assembly code that everything works as I want.

Pre-typesetter C, in particular, was pretty wild, though the
basic skeletal structure of the language as we know it had
mostly settled by then. Still, if one looks at the 6th Edition
Unix kernel source codes, one will frequently find things like
this (excerpted from the DN-11 driver):

```
struct dn {
struct {
char dn_stat;
char dn_reg;
} dn11[3];
}

#define DNADDR 0175200

dnopen(dev, flag)
{
register struct dn *dp;
register int rdev;

rdev = dev.d_minor;
dp = &DNADDR->dn11[rdev];
if (dp->dn_reg&(PWI|DLO))
u.u_error = ENXIO;
else {
DNADDR->dn11[0].dn_stat =| MENABLE;
dp->dn_stat = IENABLE|MENABLE|CRQ;
}
}
```

Notice what the struct member references are made against:
not just a variable with no declared type, but an integer
constant. In early C, all `struct` members shared a single
common namespace, so the language assumed that if it saw
`->member`, the thing on the left side of `->` must be a pointer
to an instance of whatever `struct` definition contained
`member`. On the PDP-11, an integer literal was taken as an
absolute address in the virtual address space of the program, as
defined by the settings in its segmentation registers. In the
kernel, this is basically a physical address.

Yes, I knew that's how structs worked before (though I have never had to
work with any code from that time). I notice also it has "=|" rather
than "|=".

Yes. This is in Dennis Ritchie's C history paper; apparently it
was due to something they did in the lexical analyzer in B, on
the PDP-7.

And it seems to have been written at a time when space characters still
cost real money :-)

Heh. They preserved the density of that style in Plan 9, too.

unsigned int a = 8;
int b = -5;
int c = a * b;

To wit, `b` is converted to `unsigned int` per the rules set
forth in the standard prior to the multiplication; the product
is taken in the ring $Z/2^nZ$ where $n$ is the bit-width of
`unsigned int` (in this example, 32); the product is then
converted to `int` for the initialization of `c`, but per the
rules for unsigned-to-signed conversions, the result is
implementation-defined (since the product is outside the range
of values representable in a 32-bit `int`). However, almost all
real implementations define this conversion using two's
complement semantics with no change to representation, and
store the resulting value in `c`.
This is, surely, by far the most common case.

Yes, you end up with the same answer of -40, when "c" is an "int". But
if "c" is "long" (like in your first example), and that is bigger than
"int", the answer is 4294967256 which is almost certainly not what the
programmer intended. If the common type for "a * b" had been signed
int, rather than unsigned int, then you'd get -40 whether "c" is "int"
or "long". And you'd get it more directly, with less IB.

But you'd have more UB, because you'd run into signed overflow
more often (assuming they preserved that as UB in this
hypothetical alternate reality).

Would you get more signed overflow in practice? And in particular,
would you get more signed overflow UB in places where you would not have
a bug in the code anyway? There would certainly be more cases of signed
integer arithmetic, whereas moving to a common unsigned type means more
unsigned integer arithmetic. But I don't see signed integer arithmetic
as a risk of UB in itself - it is only a risk of UB if you are working with
inappropriate values.

I suspect you would, if only because one of the major motivating
factors for using unsigned arithmetic in practice is to have the
full bit-range of the type available. Consider a mask for the
high 20 bits of a uint32_t defined as,

const uint32_t MASK = ~0U * 4096;

In your hypothetical, this is technically UB.
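
Under the actual C rules both operands end up as `unsigned int`, so the multiply wraps and the mask is well-defined. A short sketch, assuming a 32-bit `unsigned int`, with illustrative function names:

```c
#include <stdint.h>

/* ~0U is UINT_MAX; multiplying by 4096 (2^12) wraps modulo 2^32,
 * clearing the low 12 bits and leaving the high 20 set. */
uint32_t high20_mask(void)
{
    return ~0U * 4096;
}

/* The same mask written as a shift of an explicitly 32-bit constant. */
uint32_t high20_mask_shift(void)
{
    return UINT32_C(0xFFFFFFFF) << 12;
}
```

Both functions return 0xFFFFF000 on such a platform.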

Or consider the hashing algorithm from K&R2 as an example: if
the unsigned `hash` value were normalized to a `signed int`
before the multiply and add, then this would definitely overflow
for even short strings. Each iteration of the loop effectively
shifts the hash value left by a little less than 5 bits. As
written, that would overflow a signed 32-bit number by about the
6th or 7th character.
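
How quickly that happens can be checked with a sketch like the following. It uses the K&R2-style step `h = c + 31 * h` without the final modulo by the table size, and the helper `first_signed_overflow` is a made-up name. It does the arithmetic in 64 bits to find the first character at which the exact value no longer fits in a signed 32-bit number.

```c
#include <stdint.h>

/* The hash step in unsigned arithmetic, where wrapping is well-defined. */
unsigned int hash_unsigned(const char *s)
{
    unsigned int h = 0;

    for (; *s != '\0'; s++)
        h = (unsigned char)*s + 31 * h;
    return h;
}

/* Returns the 1-based index of the first character at which the exact
 * value exceeds INT32_MAX, or 0 if it never does. The running value is
 * at most INT32_MAX before each step, so 64 bits cannot overflow. */
int first_signed_overflow(const char *s)
{
    int64_t h = 0;
    int i;

    for (i = 1; *s != '\0'; s++, i++) {
        h = (unsigned char)*s + 31 * h;
        if (h > INT32_MAX)
            return i;
    }
    return 0;
}
```

For ASCII text the answer depends on the character codes: a run of lowercase `a` crosses the limit at the 6th character, a run of uppercase `A` at the 7th.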

I think perhaps this is getting a bit speculative - we can't really give
quantitative values for the risk of problems with particular expressions
in existing C code. I believe the conclusion is simply that the C
committee chose the rules that they thought, at the time, gave the most
consistent results with the least risk of introducing new problems in
existing code written for a variety of slightly different C dialects.

Yes. I think they did a good job given the constraints imposed
on them.

Four decades later I disagree with some of those decisions, but there's
nothing to be done about it now.

100%. I don't blame them for the decisions they made; I'm not
going to gnash my teeth and scream about it. A lot has happened
since, though, and I think it has shown that they got a few of them
wrong.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Tue May 12 18:45:05 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10tvmp7$23t17$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

<snip>

Pre-typesetter C, in particular, was pretty wild, though the
basic skeletal structure of the language as we know it had
mostly settled by then. Still, if one looks at the 6th Edition
Unix kernel source codes, one will frequently find things like
this (excerpted from the DN-11 driver):

```
struct dn {
    struct {
        char dn_stat;
        char dn_reg;
    } dn11[3];
}

#define DNADDR 0175200

dnopen(dev, flag)
{
    register struct dn *dp;
    register int rdev;

    rdev = dev.d_minor;
    dp = &DNADDR->dn11[rdev];
    if (dp->dn_reg&(PWI|DLO))
        u.u_error = ENXIO;
    else {
        DNADDR->dn11[0].dn_stat =| MENABLE;
        dp->dn_stat = IENABLE|MENABLE|CRQ;
    }
}
```

Notice what the struct member references are made against: not
just a variable with no declared type, but an integer constant.
In early C, all `struct` members shared a single common
namespace, so when the language saw `->member`, it assumed the
thing on the left side of `->` must be a pointer to an instance
of whatever `struct` definition contained `member`. On the
PDP-11, an integer literal was taken as an absolute address in
the virtual address space of the program, as defined by the
settings in its segmentation registers. In the kernel, this is
basically a physical address.
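For contrast, here is a rough sketch of how the same register access might be spelled in current C. The struct layout follows the excerpt above; the `MENABLE` value, the `dn_regs` and `dn_enable` names, and the use of `uint8_t` are assumptions made for illustration, not taken from the V6 sources.

```c
#include <stdint.h>

/* Register layout following the excerpt above. */
struct dn_regs {
    volatile uint8_t dn_stat;
    volatile uint8_t dn_reg;
};

struct dn {
    struct dn_regs dn11[3];
};

/* Placeholder bit value: the excerpt does not show the real definition. */
#define MENABLE 0x04u

/* Modern C requires an explicit conversion from integer to pointer, e.g.
 * ((struct dn *)0175200) on the real hardware. Here the base address is a
 * parameter so the sketch can be run against ordinary memory. */
void dn_enable(struct dn *base)
{
    base->dn11[0].dn_stat |= MENABLE;   /* V6 spelled this operator "=|" */
}
```

Passing the address of an ordinary `struct dn` object stands in for the device, which keeps the sketch testable without the hardware.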

Yes, I knew that's how structs worked before (though I have never had to
work with any code from that time). I notice also it has "=|" rather
than "|=".

Yes. This is in Dennis Ritchie's C history paper; apparently it
was due to something they did in the lexical analyzer in B, on
the PDP-7.

And it seems to have been written at a time when space characters still
cost real money :-)

Heh. They preserved the density of that style in Plan 9, too.

The code of that era was generally terse. Here's an interesting
fragment from v6 ls.c:

readdir(dir)
char *dir;
{
    static struct {
        int dinode;
        char dname[14];
    } dentry;
    register char *p;
    register int j;
    register struct lbuf *ep;

    if (fopen(dir, &inf) < 0) {
        printf("%s unreadable\n", dir);
        return;
    }
    tblocks = 0;
    for(;;) {
        p = &dentry;
        for (j=0; j<16; j++)
            *p++ = getc(&inf);
        if (dentry.dinode==0
         || aflg==0 && dentry.dname[0]=='.')
            continue;
        if (dentry.dinode == -1)
            break;
        ep = gstat(makename(dir, dentry.dname), 0);
        if (ep->lnum != -1)
            ep->lnum = dentry.dinode;
        for (j=0; j<14; j++)
            ep->lname[j] = dentry.dname[j];
    }
    close(inf.fdes);
}

As an aside, I think this addresses your question/gripe about
why 'ls' ignored all dot files by default. While the intent
was to hide '.' and '..', ls(1) simply looked at the first
byte. I don't think user-created 'dot-files' were common in the v6
days, and when they did become common, it was _because_ of that
shortcut in V6 ls(1).

It's worth noting that Ken? used "aflg==0" rather than '!aflg' :-)

From Michael S@already5chosen@yahoo.com to comp.lang.c on Tue May 12 22:21:19 2026

From Newsgroup: comp.lang.c

On Tue, 12 May 2026 07:12:00 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

Michael S <already5chosen@yahoo.com> writes:

I don't just like stdint.h*
I also hate when C programmers define their own fixed-width integer
types.

[...]

That would raise my blood pressure:
typedef int32_t s32;
typedef uint32_t u32;
typedef uint8_t octet;

Can you say what it is about them that you don't like?

They increase mental load for a casual reader of the code.
IMHO, for no good reason.

Or why you don't like them? Are the reasons the same
in all three cases, or is octet different?

The same in all three cases.


From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Tue May 12 21:24:29 2026

From Newsgroup: comp.lang.c

In article <RyKMR.12$Dw1.7@fx15.iad>, Scott Lurndal <slp53@pacbell.net> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10tvmp7$23t17$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

[snip]
And it seems to have been written at a time when space characters still
cost real money :-)

Heh. They preserved the density of that style in Plan 9, too.

The code of that era was generally terse. Here's an interesting
fragment from v6 ls.c:

readdir(dir)
char *dir;
{
    static struct {
        int dinode;
        char dname[14];
    } dentry;
    register char *p;
    register int j;
    register struct lbuf *ep;

    if (fopen(dir, &inf) < 0) {
        printf("%s unreadable\n", dir);
        return;
    }
    tblocks = 0;
    for(;;) {
        p = &dentry;
        for (j=0; j<16; j++)
            *p++ = getc(&inf);
        if (dentry.dinode==0
         || aflg==0 && dentry.dname[0]=='.')
            continue;
        if (dentry.dinode == -1)
            break;
        ep = gstat(makename(dir, dentry.dname), 0);
        if (ep->lnum != -1)
            ep->lnum = dentry.dinode;
        for (j=0; j<14; j++)
            ep->lname[j] = dentry.dname[j];
    }
    close(inf.fdes);
}

As an aside, I think this addresses your question/gripe about
why 'ls' ignored all dot files by default. While the intent
was to hide '.' and '..', ls(1) simply looked at the first
byte. I don't think user-created 'dot-files' were common in the v6
days, and when they did become common, it was _because_ of that
shortcut in V6 ls(1).

Heh, yeah. It worked fine on PDP-7 Unix, but they broke it on
the -11 in C. Now running `ls -a` is like lifting up the carpet
and finding where all of those fleas biting you have been coming
from....

It is a gripe of mine. I think Rob Pike posted about the origin
on social media somewhere, though.

It's worth noting that Ken? used "aflg==0" rather than '!aflg' :-)

Ken had some interesting stylistic choices. For instance, he
did a lot of code that looked like this on Plan 9:

if (foo == 0)
if (bar != 0)
if (baz > 2)
something(...);

The first time we ran that through an automated formatter it was
sadness.

- Dan C.


From Bart@bc@freeuk.com to comp.lang.misc,comp.lang.c on Tue May 12 22:32:30 2026

From Newsgroup: comp.lang.c

On 12/05/2026 02:37, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

I don't have much of a problem with the things that C can do, but with
how it does it, its syntax, its ancient baggage, its quirks, its
folklore, its Unix-centric ecosystem, its pointless UBs, its
insistence in working with every oddball processor, its solving every
shortcoming with macros, its adherents who will defend every
misfeature to the death...

You're mostly wrong about that last point. Many of us spend a
great deal of time and effort here *explaining* how C is defined
and how best to use it.

I don't see the connection with my point. I haven't said that people
don't explain things.

But it does seem that every poor feature in C is an invaluable asset to somebody, that must never be fixed.

So the inconvenience of how 'switch' works is excused because
/sometimes/ you need fallthrough, or the one time in a thousand you need Duff's device.

To explain is not to defend. What will it take for you to understand
that?

[...]

It is also frustrating looking at C forums and people thinking they
are too stupid to grasp something when it's language that could have
been better.

(I'm going to assume I parsed that sentence correctly.)

Nobody has said that C couldn't have been better. But it could
hardly have been more successful. As Dennis Ritchie himself said,
"C is quirky, flawed, and an enormous success."

Yeah, it's one of the great mysteries. Even half a century ago, there
were big companies and lots of clever people, who could have cranked out
a suitable systems language of equal capability to C in their sleep, but
with fewer rough edges.

I wonder why they didn't? Maybe they would have been aiming too high
even then? (Instead we got Smalltalk and Ada.)

[...]


From John Ames@commodorejohn@gmail.com to comp.lang.misc,comp.lang.c on Tue May 12 15:28:53 2026

From Newsgroup: comp.lang.c

On Tue, 12 May 2026 22:32:30 +0100
Bart <bc@freeuk.com> wrote:

Even half a century ago, there were big companies and lots of clever
people, who could have cranked out a suitable systems language of
equal capability to C in their sleep, but with fewer rough edges.

I wonder why they didn't?

I am reminded of Seymour Cray's rebuttal to Tom Watson:
"I understand that in the laboratory developing this system there are
only 34 people, 'including the janitor.' Of these, 14 are engineers and
4 are programmers, and only one has a Ph. D., a relatively junior
programmer. To the outsider, the laboratory appeared to be cost
conscious, hard working and highly motivated.
Contrasting this modest effort with our own vast development
activities, I fail to understand why we have lost our industry
leadership position by letting someone else offer the world's most
powerful computer."
"It seems like Mr. Watson has answered his own question."

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Tue May 12 15:35:59 2026

From Newsgroup: comp.lang.c

[Dropping comp.lang.misc, since this is only about C.]

Bart <bc@freeuk.com> writes:

On 12/05/2026 02:37, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

I don't have much of a problem with the things that C can do, but with
how it does it, its syntax, its ancient baggage, its quirks, its
folklore, its Unix-centric ecosystem, its pointless UBs, its
insistence in working with every oddball processor, its solving every
shortcoming with macros, its adherents who will defend every
misfeature to the death...

You're mostly wrong about that last point. Many of us spend a
great deal of time and effort here *explaining* how C is defined
and how best to use it.

I don't see the connection with my point. I haven't said that people
don't explain things.

No, you haven't said that, and I didn't mean to imply that you have.

My point is that when we explain things to you, sometimes at great
length to try to get past your unwillingness to understand them,
you assume that we're defending or advocating them, and you respond
by attacking the thing being explained, or the person explaining it,
or just refusing to accept it.

But it does seem that every poor feature in C is an invaluable asset
to somebody, that must never be fixed.

No. Old-style function declarations and definitions were a poor
feature. They were made obsolescent in C89 if I'm not mistaken, and
removed from the language in C23. Implicit int was a poor feature.
It was removed from the language in C99. Backwards indexing is a
poor feature. It's been marked as obsolescent in the C2y draft.

I'm aware that you're not satisfied with the speed at which poor
features are removed. Note also that not everyone agrees on which
features are poor.

So the inconvenience of how 'switch' works is excused because
/sometimes/ you need fallthrough, or the one time in a thousand you
need Duff's device.

Not at all. "switch" was originally implemented in a way that,
I suspect, was easier for the compiler to implement (basically
a scoped computed goto), and for an audience of programmers who,
to exaggerate slightly, could shout across the room and ask Dennis
Ritchie questions about it. I suspect many or most C programmers
would prefer it to have been designed without default fallthrough.
It stays the way it is because changing it *would break existing
code*. Worse, some seemingly reasonable ways of changing it would
mean that existing code is still valid but with different semantics.

Note that the current method for using multiple cases relies on
implicit fallthrough.

Certainly a "better" switch statement could do that differently,
but it's something that would have to be addressed. And since
the existing switch statement *works*, and can be used reasonably
safely if the programmer exercises a reasonable amount of care,
and since compilers can and do warn about questionable uses, it
hasn't been seen as worth fixing. As far as I know, nobody has
submitted a proposal to change it.

Except that C23 adds a "fallthrough" attribute that, while it
doesn't change the semantics of the switch statement, allows a
programmer to tell the compiler that a fallthrough was intentional.
A compiler can choose to warn about an unmarked fallthrough and
remain silent when it sees the "fallthrough" attribute.
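A short sketch of both uses of fallthrough mentioned above: several case labels sharing one body, and a deliberate fallthrough marked with the C23 attribute. The `FALLTHROUGH` macro and the `classify` function are made up for this illustration; the macro falls back to a no-op when the compiler does not report C23.

```c
/* Use the C23 attribute when available; otherwise expand to a no-op. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
#define FALLTHROUGH [[fallthrough]]
#else
#define FALLTHROUGH ((void)0)
#endif

/* Hypothetical scoring function, used only to exercise the control flow. */
int classify(int c)
{
    int score = 0;

    switch (c) {
    case 'a': case 'e': case 'i': case 'o': case 'u':
        /* Multiple labels on one body: this already relies on fallthrough
         * from one label to the next. */
        score = 1;
        break;
    case 'y':
        score += 10;    /* extra work for 'y' ... */
        FALLTHROUGH;    /* ... then deliberately continue into 'w' */
    case 'w':
        score += 2;
        break;
    default:
        break;
    }
    return score;
}
```

The attribute changes nothing about what the code does; it only gives a compiler that warns about implicit fallthrough a way to tell the intended case from the accidental one.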

We don't, as you seem to believe, fail or refuse to recognize that
C has problems. We don't, as you seem to believe, religiously hold
onto its existing flaws because they Must Be Preserved. We simply
don't feel the need to spend a lot of time talking about the things
we dislike about C, and we recognize that fixing them is often
impractical because it would break existing code, and because it
would take years for any fixes to propagate through the standards
process and then to multiple implementations. (But sometimes it's
worth it; see above.)

Most of us don't whine about it.

To explain is not to defend. What will it take for you to understand
that?
[...]

It is also frustrating looking at C forums and people thinking they
are too stupid to grasp something when it's language that could have
been better.

(I'm going to assume I parsed that sentence correctly.)

Nobody has said that C couldn't have been better. But it could
hardly have been more successful. As Dennis Ritchie himself said,
"C is quirky, flawed, and an enormous success."

Yeah, it's one of the great mysteries. Even half a century ago, there
were big companies and lots of clever people, who could have cranked
out a suitable systems language of equal capability to C in their
sleep, but with fewer rough edges.

I wonder why they didn't? Maybe they would have been aiming too high
even then? (Instead we got Smalltalk and Ada.)

OK, you acknowledge that you don't understand why C is successful.
That's a good start.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Tue May 12 16:00:20 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 12/05/2026 15:05, Dan Cross wrote:

[...]

I think it is fair to say that there are people who wear their
knowledge of the C standard as a badge of honor and look down at
those who desire a simpler language or who do not know the rules
as well. Some of that is fair (we see examples in this group of
some who not only refuse to learn the rules of the language, but
revel in their ignorance).

Take for example C's set of operator precedences.

The one for the ?: operator is particularly obscure, so in an
expression like one of these:

a + b ? c - d : e * f
a ? b ? c : d ? e : f : g

then parentheses would be used to make things clearer. (I haven't
checked these are valid, but that is the point; it is hard to see!)

Some C programmers make it a point to know all the operator
precedences by heart. I don't. I know most of them, but I
occasionally have to look them up. (My method is to look at the
subsection headers in 6.5 "Expressions", and look at the grammar
when I need more detail. Others prefer to use tables.)

But why shouldn't people be expected to learn the rules? Why is it
OK to 'revel' in not knowing the basics here, but not when unnecessary
UBs are involved where rules are harder and which depend on runtime
inputs?

There's nothing wrong with adding parentheses to make an expression
clearer. It doesn't imply an unwillingness to learn the rules,
just consideration for one's audience. Some readers might understand
the unparenthesized version at a glance. Others might have to think
about it for an annoying few seconds, others might have to look it up
or feed it to some tool.
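For reference, both expressions quoted above are valid, and the grammar groups them as shown in the parenthesized versions below. This is a small sketch with made-up function names; each pair returns the same value for every input.

```c
/* First expression as written: the arithmetic operators bind tighter
 * than ?:, so the condition is the whole sum a + b. */
int bare1(int a, int b, int c, int d, int e, int f)
{
    return a + b ? c - d : e * f;
}

int paren1(int a, int b, int c, int d, int e, int f)
{
    return (a + b) ? (c - d) : (e * f);
}

/* Second expression as written: the middle operand of the outer ?: is
 * itself a conditional, and ?: groups right to left. */
int bare2(int a, int b, int c, int d, int e, int f, int g)
{
    return a ? b ? c : d ? e : f : g;
}

int paren2(int a, int b, int c, int d, int e, int f, int g)
{
    return a ? (b ? c : (d ? e : f)) : g;
}
```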

That's not at all comparable to your explicit refusal to even read
the standard's definition of "undefined behavior". (Or were you
being figurative?)

[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Tue May 12 23:14:55 2026

From Newsgroup: comp.lang.c

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Bart <bc@freeuk.com> writes:

On 12/05/2026 15:05, Dan Cross wrote:

[...]

I think it is fair to say that there are people who wear their
knowledge of the C standard as a badge of honor and look down at
those who desire a simpler language or who do not know the rules
as well. Some of that is fair (we see examples in this group of
some who not only refuse to learn the rules of the language, but
revel in their ignorance).

Take for example C's set of operator precedences.

The one for the ?: operator is particularly obscure, so in an
expression like one of these:

a + b ? c - d : e * f
a ? b ? c : d ? e : f : g

then parentheses would be used to make things clearer. (I haven't
checked these are valid, but that is the point; it is hard to see!)

Some C programmers make it a point to know all the operator
precedences by heart. I don't. I know most of them, but I
occasionally have to look them up. (My method is to look at the
subsection headers in 6.5 "Expressions", and look at the grammar
when I need more detail. Others prefer to use tables.)

I generally use parentheses, to make the intent clear.

I also try to avoid code that looks like a submission to
the obfuscated code contest. Something like Duff's device
is clever, but if the next person to maintain the code
has to learn esoterica to support it, a better solution
should be found.

But why shouldn't people be expected to learn the rules? Why is it
OK to 'revel' in not knowing the basics here, but not when unnecessary
UBs are involved where rules are harder and which depend on runtime
inputs?

There's nothing wrong with adding parentheses to make an expression
clearer. It doesn't imply an unwillingness to learn the rules,
just consideration for one's audience.

Indeed. Maintainability is a keystone of quality code.


From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.misc,comp.lang.c on Tue May 12 23:21:59 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 12/05/2026 02:37, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

<snip>

Nobody has said that C couldn't have been better. But it could
hardly have been more successful. As Dennis Ritchie himself said,
"C is quirky, flawed, and an enormous success."

Yeah, it's one of the great mysteries. Even half a century ago, there
were big companies and lots of clever people, who could have cranked out
a suitable systems language of equal capability to C in their sleep, but
with fewer rough edges.

Those clever people _were_ cranking out suitable systems languages
by the bucketful. PL/1, Algol derivatives, proprietary internal
languages (Burroughs SPRITE and BPL languages), HP-3000 SPL (Systems
Programming Language - I used SPL in the late 70s) and
on the academic side, Modula, Ada, Pascal (yes, it could be
a systems programming language, cf. VAX-11 Pascal).

They weren't aiming at your 70's target 8080 processors, although
there was PL/M and a few others for the 8080 at the time.

https://en.wikipedia.org/wiki/PL/M


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.misc,comp.lang.c on Wed May 13 02:49:34 2026

From Newsgroup: comp.lang.c

On 2026-05-13 00:28, John Ames wrote:

On Tue, 12 May 2026 22:32:30 +0100
Bart <bc@freeuk.com> wrote:

Even half a century ago, there were big companies and lots of clever
people, who could have cranked out a suitable systems language of
equal capability to C in their sleep, but with fewer rough edges.

I wonder why they didn't?

I am reminded of Seymour Cray's rebuttal to Tom Watson:

"I understand that in the laboratory developing this system there are
only 34 people, 'including the janitor.' Of these, 14 are engineers and
4 are programmers, and only one has a Ph. D., a relatively junior
programmer. To the outsider, the laboratory appeared to be cost
conscious, hard working and highly motivated.

Contrasting this modest effort with our own vast development
activities, I fail to understand why we have lost our industry
leadership position by letting someone else offer the world's most
powerful computer."

"It seems like Mr. Watson has answered his own question."

Hmm.. - I wonder what the number of people involved is supposed to
tell us here? - There are sophisticated languages where international
committees with many members spent huge efforts, and there are also
extremely ambitious languages that fewer than a handful of experts
designed and developed. - I mean, now concerning "C", does that
in any way make a difference or explain anything concerning the
actual "C" design (with its strengths and with its shortcomings and
inherent deficiencies)?

My (very subjective) impression is that "leadership positions" were
primarily defined by other factors over the past decades, ranging from
commercial market power to thorough marketing activities.

Janis


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.misc,comp.lang.c on Wed May 13 02:53:22 2026

From Newsgroup: comp.lang.c

On 2026-05-13 01:21, Scott Lurndal wrote:

Bart <bc@freeuk.com> writes:

On 12/05/2026 02:37, Keith Thompson wrote:

[...]

Yeah, it's one of the great mysteries. Even half a century ago, there
were big companies and lots of clever people, who could have cranked out
a suitable systems language of equal capability to C in their sleep, but
with fewer rough edges.

Those clever people _were_ cranking out suitable systems languages
by the bucketful. PL/1, Algol derivatives, proprietary internal
languages (Burroughs SPRITE and BPL languages), HP-3000 SPL (Systems
Programming Language - I used SPL in the late 70s) and
on the academic side, Modula, Ada, Pascal (yes, it could be
a systems programming language, cf. VAX-11 Pascal).

[...]

I wonder why you put Ada just in the "academic box".

Janis


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Wed May 13 03:26:44 2026

From Newsgroup: comp.lang.c

On 2026-05-13 00:35, Keith Thompson wrote:

[Dropping comp.lang.misc, since this is only about C.]
Bart <bc@freeuk.com> writes:

[...]

[...]

So the inconvenience of how 'switch' works is excused because
/sometimes/ you need fallthrough, or the one time in a thousand you
need Duff's device.

I don't see any inconvenience in "how it works"; it actually
allows programmers to implement both semantics as needed. And
both semantics were needed, they have been used. (Even if you
think your projection of your preferences and limited uses is
what should constitute the global software development world.)

Not at all. "switch" was originally implemented in a way that,
I suspect, was easier for the compiler to implement (basically
a scoped computed goto), and for an audience of programmers who,
to exaggerate slightly, could shout across the room and ask Dennis
Ritchie questions about it.

Computed 'goto' was actually considered quite "high standards"
back then; quite a few prominent languages provided it. (Hard
to believe given all the options that appeared later.)

I suspect many or most C programmers
would prefer it to have been designed without default fallthrough.

The explicit and clumsy 'break' is what syntactically annoys me,
but it's also no drama, to be clear.

Of course there could be various ways to support it, one way or
the other (fall-through or not), or supporting both even in terse
ways (like in Kornshell), or using existing conditionals for it,
or by inventing something completely different (polymorphism and OO
concepts make disjunctive switches or fall-through behavior almost
obsolete).

It stays the way it is because changing it *would break existing
code*. Worse, some seemingly reasonable ways of changing it would
mean that existing code is still valid but with different semantics.

Indeed. And that's the crucial point. A simple "dislike"-criticism
without acknowledging the practical side effects is pointless.

Note that the current method for using multiple cases relies on
implicit fallthrough.

Certainly a "better" switch statement could do that differently,
but it's something that would have to be addressed. And since
the existing switch statement *works*, and can be used reasonably
safely if the programmer exercises a reasonable amount of care,
and since compilers can and do warn about questionable uses, it
hasn't been seen as worth fixing. As far as I know, nobody has
submitted a proposal to change it.

Except that C23 adds a "fallthrough" attribute that, while it
doesn't change the semantics of the switch statement, allows a
programmer to tell the compiler that a fallthrough was intentional.
A compiler can choose to warn about an unmarked fallthrough and
remain silent when it sees the "fallthrough" attribute.

We considered it generally good style to write /* fall-through */
at such places in our software as an explicit visible hint (and
that is even more "bulky" than the explicit 'break'). I'm thus not
astonished about these new features.

Janis

[...]


From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Tue May 12 18:36:43 2026

From Newsgroup: comp.lang.c

James Kuyper <jameskuyper@alumni.caltech.edu> writes:

On 2026-05-08 06:43, David Brown wrote:
...

Yes, I have heard that argument before. I am unconvinced that the
"value preserving" choice actually has any real advantages. I also
think it is a misnomer - it implies that "unsigned preserving" would
not preserve values, which is wrong.

Unsigned-preserving rules would convert a signed value which might be negative to unsigned type more frequently than the value preserving
rules do.

This statement is wrong. An "unsigned preserving" promotion rule
converts a signed value to a signed value and an unsigned value to
an unsigned value. The value being converted stays the same in both
cases. Both an "unsigned preserving" promotion and a so-called
"value preserving" promotion preserve the value of the operand being
promoted (and converted).
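A small sketch of where the two rules actually differ: not in the value of the promoted operand, which is unchanged either way, but in the type of the result and therefore in the arithmetic that follows. The function names are hypothetical, and `_Generic` needs C11 or later.

```c
#include <limits.h>

/* 1 if int can represent every value of unsigned short. */
int int_holds_all_ushort(void)
{
    return USHRT_MAX <= INT_MAX;
}

/* Returns 1 if an unsigned short operand ends up as (signed) int after
 * promotion, 0 if it ends up as unsigned int. _Generic selects on the type
 * of the expression without evaluating it. */
int ushort_promotes_to_signed(void)
{
    unsigned short us = 1;
    return _Generic(us + 0, int: 1, unsigned int: 0);
}

/* Under the ISO "value preserving" rule this yields -1 whenever int holds
 * all unsigned short values. Under an "unsigned preserving" rule the
 * subtraction would be done in unsigned int and wrap to UINT_MAX. */
long long one_minus_two(void)
{
    unsigned short one = 1;
    return one - 2;
}
```

In both cases the operand `1` is still `1` after promotion; only the type it carries into the subtraction changes.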

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Tue May 12 19:02:42 2026

From Newsgroup: comp.lang.c

scott@slp53.sl.home (Scott Lurndal) writes:

For the early C compiler on the PDP-11, the 'int' type was
16-bits, implicitly signed, and the code generator simply emitted
available arithmetic instructions.

It was the only C compiler at the time, any guarantees would have
been implicit in the choice of target architecture.

I mostly wrote unix kernel code using the v6 compiler, rather than
writing code that did any heavy math, so whether value was
preserved or sign was preserved wasn't something I, as a kernel
programmer, routinely considered.

If int was only 16 bits, I expect promotion considerations didn't
come up very often.

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Tue May 12 19:19:22 2026

From Newsgroup: comp.lang.c

Michael S <already5chosen@yahoo.com> writes:

On Tue, 12 May 2026 07:12:00 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

Michael S <already5chosen@yahoo.com> writes:

I don't just like stdint.h*
I also hate when C programmers define their own fixed-width
integer types.

[...]

That would raise my blood pressure:
typedef int32_t s32;
typedef uint32_t u32;
typedef uint8_t octet;

Can you say what it is about them that you don't like?

They increase mental load for a casual reader of the code.
IMHO, for no good reason.

Or why you don't like them? Are the reasons the same
in all three cases, or is octet different?

The same in all three cases.

Interesting. Thank you for the answers.

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Tue May 12 19:48:40 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

[.. I am cutting 100ish lines as they don't bear on my response ..]

Take for example C's set of operator precedences.

The one for the ?: operator is particularly obscure, so in an
expression like one of these:

a + b ? c - d : e * f
a ? b ? c : d ? e : f : g

then parentheses would be used to make things clearer. (I haven't
checked these are valid, but that is the point; it is hard to see!)

But why shouldn't people be expected to learn the rules? Why is it
OK to 'revel' in not knowing the basics here, but not when unnecessary
UBs are involved where rules are harder and which depend on runtime
inputs?

If you want people to take you seriously, you need to find more
compelling examples. I am both familiar with and comfortable with
the syntax of C expressions, and even I would never write such
expressions as the two shown above. These lines look like they
were written by someone in junior high school (or these days,
probably elementary school). Whether you mean to or not, this
example gives the impression of offering a strawman argument, and
it's only natural for people to react to that by dismissing your
comments, or even dismissing them altogether. Is that what you
want? To be dismissed? Or do you hope to actually communicate
with people? If so I recommend looking for a better framing of
your views and ideas.

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Tue May 12 20:47:59 2026

From Newsgroup: comp.lang.c

scott@slp53.sl.home (Scott Lurndal) writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

scott@slp53.sl.home (Scott Lurndal) writes:

kalevi@kolttonen.fi (Kalevi Kolttonen) writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

In almost all cases where uint8_t
might be used, unsigned char works just as well.

Why "almost"? Where is the difference if any?

As far as I know, ISO guarantees that
sizeof(unsigned char) is always 1 byte.

On at least one system with a working C compiler,
a byte is 9 bits, not 8. If I wanted an 8-bit datum
on that system, I'd have to use uint8_t.

If a byte is 9 bits (ie, if CHAR_BIT == 9) there cannot
be a uint8_t type. The fixed-width types are not allowed
to have padding bits.

That was a 36-bit system. It could easily create a
uint8_t value from 1/9th of two 72-bit words;
so no padding bits required.

That doesn't work in C where CHAR_BIT == 9, which I think other
people have explained. There could be a uint9_t in such an
environment, but not uint8_t because the fixed-width types are
not allowed to have padding bits; all objects other than
bit-fields are mandated to have a whole number of bytes (which in
C is the same size as the character types), so an 8-bit integer
type with no padding bits isn't possible.
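Whether an implementation provides `uint8_t` can be tested at compile time, because `<stdint.h>` defines `UINT8_MAX` only when it also declares the type. A minimal sketch with made-up function names:

```c
#include <limits.h>
#include <stdint.h>

/* Returns the width in bits of uint8_t if the implementation provides it,
 * or 0 if it does not. On an implementation with CHAR_BIT == 9 the #else
 * branch would be the one compiled. */
int uint8_width_or_zero(void)
{
#ifdef UINT8_MAX
    return (int)(sizeof(uint8_t) * CHAR_BIT);
#else
    return 0;
#endif
}

/* unsigned char always exists and is exactly CHAR_BIT bits wide. */
int uchar_width(void)
{
    return CHAR_BIT;
}
```

Code that only needs "the smallest addressable unsigned type" can use `unsigned char` everywhere; code that needs exactly 8 bits can use the `#ifdef` to fail early where that is not available.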

Indeed. Although from my perspective, the use of the
stdint types clearly documents the programmer's
intent, whereas a typedef such as BYTE or WORD
is inherently ambiguous and would require a programmer
to look up the definition of such types in the
application to determine the original programmer's intent.

BYTE and WORD are poor choices for type names, no doubt
about that. On the other hand, in many or most cases
so are [u]intNN_t; they simultaneously convey both too
little and too much information. There is a certain kind
of programming where the fixed-width types are genuinely
helpful; unfortunately though they are used a lot more
widely than circumstances where they are helpful.

The programming I do
(mainly kernel programming, SoC simulation,
firmware) all naturally require the fixed-width types.

Right. Code that interacts very closely with hardware is one of
those cases where the fixed-width types make sense.

For other apps, int, long, float, double are preferred
to INT, LONG, FLOAT, DOUBLE (which seems to be the
way Windows programmers code)[*]

[*] which probably dates back to 16-bit Windows
and their methods of maintaining backward compatibility
across two subsequent (32, 64) x86 processor architectures
plus MIPS et alia.

I wouldn't hold Microsoft Windows code up as an example for
anyone except perhaps as a horror story. :)

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Tue May 12 22:31:12 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[...]

It is not clear to me that `longjmp` out of a non-nested signal
handler is still well-defined as of C11, though it is explicitly
stated to be so in C89.

It seems you are misunderstanding what the standards are saying.
The description of longjmp() says (paraphrasing) that it restores
the environment where the relevant setjmp() was done. There is
in C89 a passage about returning from signal handlers and so
forth, but that is followed by a carveout for nested signal
handlers, which in C89 is undefined behavior. (I assume that
also holds for C90 but I haven't verified that.)

Starting in C99, any mention of interrupts and signal handlers was
removed, along with the carveout. Because there is a definition
for what longjmp() does, the behavior is defined, and there is no
undefined behavior (not counting things like doing a longjmp()
with a jmp_buf that wasn't set up, etc). Removing the mention of
interrupts and signals, and also removing the carveout, only makes
longjmp() more defined, not less.
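For readers who want the basic mechanism in front of them, a minimal sketch of setjmp()/longjmp() in ordinary (non-signal) code follows. It deliberately says nothing about signal handlers; the function names are invented for the example.

```c
#include <setjmp.h>

static jmp_buf env;   /* environment saved by setjmp() */

/* Hypothetical helper standing in for a failure deep in a call chain. */
static void jump_out(void)
{
    longjmp(env, 1);  /* does not return; control resumes at setjmp() */
}

/* Returns 0 if no jump happened, 1 if control came back via longjmp(). */
int run_with_recovery(int should_fail)
{
    /* setjmp() is used as part of a controlling expression, one of the
       contexts the standard permits. It yields 0 on the direct call. */
    if (setjmp(env) != 0)
        return 1;
    if (should_fail)
        jump_out();
    return 0;
}
```

The jmp_buf must have been filled in by a setjmp() whose enclosing function is still active, which is the "set up" condition mentioned above.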

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Wed May 13 11:02:18 2026

From Newsgroup: comp.lang.c

On 2026-05-13 05:47, Tim Rentsch wrote:

scott@slp53.sl.home (Scott Lurndal) writes:

[...]

The programming I do
(mainly kernel programming, SoC simulation,
firmware) all naturally require the fixed-width types.

Right. Code that interacts very closely with hardware is one of
those cases where the fixed-width types make sense.

Another common one - also "low-level" but different - are data types
exchanged through communication protocols.
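A small sketch of that protocol case, assuming a field defined as a 32-bit big-endian integer carried in four octets. The names put_be32 and get_be32 are hypothetical, not taken from any particular library.

```c
#include <stdint.h>

/* Write a 32-bit value into four octets, most significant first. */
void put_be32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)((v >> 24) & 0xFFu);
    out[1] = (unsigned char)((v >> 16) & 0xFFu);
    out[2] = (unsigned char)((v >> 8) & 0xFFu);
    out[3] = (unsigned char)(v & 0xFFu);
}

/* Read the same field back, independent of the host byte order. */
uint32_t get_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

Here the fixed width is part of the wire format itself, so uint32_t states exactly what the protocol requires.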

Janis

[...]


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Wed May 13 12:16:49 2026

From Newsgroup: comp.lang.c

On 2026-05-12 16:33, Bart wrote:

[...]

Take for example C's set of operator precedences.

The one for the ?: operator is particularly obscure, so in an expression like one of these:

    a + b ? c - d : e * f
    a ? b ? c : d ? e : f : g

then parentheses would be used to make things clearer. (I haven't checked these are valid, but that is the point; it is hard to see!)

What has that example to do with ["obscure"] _operator precedence_?

Ternary conditionals are actually expressions that are sensibly
defined in "C" (i.e. concerning their precedence ranking).

     a + b ? c - d
           : e * f

     a ?
         b ? c
           : d ? e
               : f
       : g
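To make the grouping in the two layouts checkable, here is a short sketch that pairs each unparenthesized expression with a fully parenthesized equivalent. The function names are made up for illustration.

```c
/* The first expression as posted, with no parentheses. */
int cond_plain(int a, int b, int c, int d, int e, int f)
{
    return a + b ? c - d : e * f;
}

/* The grouping the C grammar gives it: arithmetic binds tighter than ?: */
int cond_grouped(int a, int b, int c, int d, int e, int f)
{
    return (a + b) ? (c - d) : (e * f);
}

/* The second expression as posted. */
int nested_plain(int a, int b, int c, int d, int e, int f, int g)
{
    return a ? b ? c : d ? e : f : g;
}

/* Its grouping: the middle operand is a whole expression, and the
   third operand associates to the right. */
int nested_grouped(int a, int b, int c, int d, int e, int f, int g)
{
    return a ? (b ? c : (d ? e : f)) : g;
}
```

Each pair returns the same value for every input, so the parentheses only document what the compiler already does.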

For complex expressions you can, as a *responsible* programmer, use
various means to not (not deliberately) write obfuscated expressions;
you can indent code, use parentheses[*], or you can decompose it to
(semantic or technical) identified sub-units.

[*] Parentheses would IMO make your layout in your example above not
in any way better, just yet more overloaded. (So forcing parenthesis
[in "your language"] is certainly addressing the wrong problem here.)

Your complaint, as so often, fails to work on so many levels. It
tells, yet again, more about you than about the "C" language.

But shouldn't people be expected to learn the rules?

Programmers should certainly learn, know, apply, and obey the rules.

(If you don't understand that you may try to transform that truism
to your "car example".)

Janis

PS: There *is* a specific issue in C's operator precedence ranking
but it's not the ternary conditional.

[...]


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Wed May 13 12:34:23 2026

From Newsgroup: comp.lang.c

On 2026-05-12 16:05, Dan Cross wrote:

[...]

Yes, there are many examples of this, so it is obviously true.
However, I don't think there are many large projects written in
C where there isn't undefined behavior lurking somewhere, and
the amount of effort required to learn _all_ the rules of the
language is unnecessarily large.

I think it is fair to say that there are people who wear their
knowledge of the C standard as a badge of honor and look down at
those who desire a simpler language or who do not know the rules
as well. Some of that is fair (we see examples in this group of
some who not only refuse to learn the rules of the language, but
revel in their ignorance).

But that doesn't mean that all of the criticism is wrong, and
the frequency at which it happens that people run into UB is
also an indictment of the language. Put it this way: it may be
the programmer's fault that they relied on UB, but that it is so
evidently hard to learn and internalize the rules is also the
fault of the language. It is not wrong to wish it were better.

I admire how well you put that.

Janis

[...]


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Wed May 13 12:41:58 2026

From Newsgroup: comp.lang.c

On 2026-05-11 22:37, David Brown wrote:

[...]

Perhaps the main "mistake" (where "mistake" means "I personally think C
would be nicer for my own use if things were different") is that when
mixing operations between signed int and unsigned int, the signed int is
converted to unsigned. I suspect that in real-world code, unsigned int
values that are within the range of signed int are common - and that
negative signed int values are more common than unsigned int values that
are out of range of signed int. Any common type here, unless it is
larger than the two original types, is going to get some things wrong -
but I think that converging on signed int as the common type would be
wrong less often. And if that had been the rule, then
unsigned-preserving promotion would be correct too in examples like yours.
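A minimal sketch of the rule being criticised, using a mixed comparison. The two function names are hypothetical; the second shows one common way to get the value-correct answer by hand.

```c
/* Mixed comparison: s is converted to unsigned int before comparing,
   so a negative s becomes a very large value. */
int less_than_mixed(int s, unsigned u)
{
    return s < u;
}

/* Value-correct comparison: handle negative s first, then compare
   two non-negative values as unsigned. */
int less_than_by_value(int s, unsigned u)
{
    return s < 0 || (unsigned)s < u;
}
```

With s = -1 and u = 1 the first function reports "not less" while the second reports "less". GCC and Clang can flag the first form with -Wsign-compare.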

That all matches with my thoughts about the matter.

Janis

[...]


From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Wed May 13 04:07:38 2026

From Newsgroup: comp.lang.c

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]

BYTE and WORD are poor choices for type names, no doubt
about that.

[...]

WORD is certainly ambiguous (unless, I suppose, it's sufficiently
obvious from the context). But I don't have a problem with BYTE,
or preferably byte, as a type name as long as it really is a byte.

C does have a byte type; it just happens to spell it "unsigned char".
But I don't object to something like

typedef unsigned char byte;

and I've used it myself.

BYTE is a poor choice for a type name because it looks like a
macro.

A lower-case version, byte, is a poor choice for a type name,
because it is both confusing and ambiguous.

Confusing, because for a very long time and for a huge segment of
the programming community, the term byte is synonymous with eight
bits, but in C that need not be true.

Ambiguous, because byte could easily mean any of three types.
The C standard library makes things worse by using 'int' for what
are basically characters, augmented with a non-character value --
EOF -- that means "something else". If byte is synonymous with
character, it might also mean 'int'.

In K&R the word "byte" is used not as a type but as a unit of
measure. The C standard defines "byte" as an addressable unit of
storage to hold any member of the basic character set -- not a
type but an amount of memory, which if anything sounds like it
might correspond to the type 'char'. In talking about character
strings, the C standard says a string ends with a byte with all
bits set to zero - another argument that 'byte' should be the
same as 'char'. It's easy to imagine an independently developed
third-party library using 'byte' to mean 'char' - more confusion
and more ambiguity. It's better to avoid the name 'byte' as a
type name altogether.

There is nothing to stop you from writing confusing or ambiguous
code. But just because you have done so doesn't make it a good
idea.

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Wed May 13 13:12:31 2026

From Newsgroup: comp.lang.c

On 2026-05-12 20:09, Dan Cross wrote:

In article <10tvmp7$23t17$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

Would you get more signed overflow in practice? And in particular,
would you get more signed overflow UB in places where you would not have
a bug in the code anyway. There would certainly be more cases of signed
integer arithmetic, whereas moving to a common unsigned type means more
unsigned integer arithmetic. But I don't see signed integer arithmetic
as a risk of UB in itself - it is only a risk UB if you are working with
inappropriate values.

I suspect you would, if only because one of the major motivating
factors for using unsigned arithmetic in practice is to have the
full bit-range of the type available. [...]

Hmm.. - I'm using 'unsigned' typically to express the domain of the
application values (not to "wrest" some more values out of a type).

Janis


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Wed May 13 13:14:50 2026

From Newsgroup: comp.lang.c

On 2026-05-12 20:45, Scott Lurndal wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10tvmp7$23t17$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

[...]

And it seems to have been written at a time when space characters still
cost real money :-)

Heh. They preserved the density of that style in Plan 9, too.

The code of that era was generally terse. [...]

Beyond "C", not as far as my observations go.

Janis


From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Wed May 13 04:27:20 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10tpq7e$a6kp$3@dont-email.me>, Bart <bc@freeuk.com> wrote:

[...]

Apparently, you missed the changes afoot in the committee to do
exactly what everyone has been telling you: deprecate `i[A]` but
preserve `i + A`.

Not deprecate but deem it obsolescent. A very different thing.

From Bart@bc@freeuk.com to comp.lang.c on Wed May 13 12:32:09 2026

From Newsgroup: comp.lang.c

On 13/05/2026 02:26, Janis Papanagnou wrote:

On 2026-05-13 00:35, Keith Thompson wrote:

[Dropping comp.lang.misc, since this is only about C.]
Bart <bc@freeuk.com> writes:

[...]

[...]

So the inconvenience of how 'switch' works is excused because
/sometimes/ you need fallthrough, or the one time in a thousand you
need Duff's device.

I don't see any inconvenience in "how it works"; it actually
allows programmers to implement both semantics as needed. And
both semantics were needed, they have been used. (Even if you
think your projection of your preferences and limited uses is
what should constitute the global software development world.)

Well, no other language (save C++) implements switch like C does.

I'm not sure you appreciate how bizarre it actually is. Here is a piece
of code from a 'Sieve' benchmark:

    for (i=3; i<=size; ++i) {
        if (data[i]) {
            ++count;
            j=i*2;
            while (j<=size) {
                if (data[j])
                    data[j]=0;
                j+=i;
            }
        }
    }

Now, let's wrap a 'switch' around it:

    switch (a)
    case 10:
        for (i=3; i<=size; ++i) {
    case 20:
            if (data[i]) {
    default:
                ++count;
    case 30:
                j=i*2;
                while (j<=size) {
    case 40:
                    if (data[j])
    case 50:
                        data[j]=0;
                    j+=i;
                }
            }
        }

This is perfectly valid C code, if meaningless.

The original code is 4 nested statements, but the switch's 'case' labels
can go literally anywhere within that structure. Even 'default' can go anywhere and be mixed up with the other cases.

Further, if you wanted to apply 'break' to one of those case-blocks, it wouldn't work as it would pertain to one of those nested loops.

I made this point before but it was brushed off. The C authors couldn't
think of an alternate keyword so there remains this conflict.
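The 'break' conflict described here can be shown with a short sketch; the function name and the numbers are invented for the example.

```c
/* 'break' always binds to the innermost enclosing loop or switch. */
int break_target(int a)
{
    int steps = 0;
    switch (a) {
    case 1:
        for (int i = 0; i < 10; i++) {
            if (i == 3)
                break;      /* leaves the for loop only */
            steps++;
        }
        steps += 100;       /* still inside case 1, so this runs */
        break;              /* this one leaves the switch */
    default:
        steps = -1;
    }
    return steps;
}
```

To leave the switch from inside the loop, the usual options are a flag, a goto, or returning from the function.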

BTW switch fallthrough is necessary so that you can do this:

switch (a) {
case 'A': case 'B': case 'C': .... // deal with A/B/C

Without fall-through behavior, it would exit after that case 'A': label.
This is how crude it is.
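A small sketch of the grouping idiom just described, with one deliberate fall-through added for contrast. The function and its return codes are hypothetical.

```c
/* Adjacent case labels share one body because control simply runs on
   from one label to the next statement. */
int classify(int ch)
{
    switch (ch) {
    case 'A': case 'B': case 'C':
        return 1;           /* deal with A/B/C */
    case 'X':
        /* no break: execution continues into case 'Y' */
    case 'Y':
        return 2;
    default:
        return 0;
    }
}
```

C23 adds a [[fallthrough]] attribute for marking the intentional case; in earlier standards a comment like the one above is the common convention.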

Remember my saying people defend its misfeatures to the death? Your post
is a perfect example!

And yes, I know exactly why C switch works this way; I've had to
implement it.

I don't see any inconvenience in "how it works"; it actually
allows programmers to implement both semantics as needed.

Ha, ha, ha! This is exactly my point. 99% of the time, at least, you
want very simple, boring semantics and properly structured syntax, just
as I offer in my languages and others do in their switch/match statements.

And yet, I can still emulate the behaviour of that switch example above
in my language, weird as it is. Example is shown below (if case values
were sequential, it would be even simpler).

/You can do this in C too/, showing you don't need a crazy switch.

(Even if you
think your projection of your preferences and limited uses is
what should constitute the global software development world.)

You think my view is limited, and genuinely think C switch is superior
to how it works in other languages?

Then this is definitely a wind-up.

Not at all. "switch" was originally implemented in a way that,
I suspect, was easier for the compiler to implement (basically
a scoped computed goto), and for an audience of programmers who,
to exaggerate slightly, could shout across the room and ask Dennis
Ritchie questions about it.

Computed 'goto' was actually considered quite "high standards"

Switch is not really computed goto. The latter is more like this (FORTRAN):

GOTO (10, 20, 30, 40, 50)N

Where the numbers are label names, but N must be 1-5.

In my syntax:

goto (N | L10, L20, L30, L40, L50 | Ldefault)

Here a default can be provided, but again N is 1-5, not 10-50 as in my C example. To emulate that C switch via a computed goto, it would look
like this:

goto (N | L10, Ldefault, ... L20, ..., Ldefault, L50 | Ldefault)

It would need 41 elements. In this case with only 5 options, a compiler
will likely generate code for sequential testing.
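The 41-element table can be sketched in standard C as a lookup array, since standard C has no computed goto. This is only an illustration of the dense-table idea, not a claim about what any compiler emits; the enum and function names are made up.

```c
/* Targets of the dispatch; LDEFAULT is 0 so that every table slot not
   listed in the initializer falls back to it. */
enum target { LDEFAULT = 0, L10, L20, L30, L40, L50 };

/* Dense table covering the values 10..50 inclusive: 41 entries,
   of which only five name a real target. */
enum target target_for(int n)
{
    static const unsigned char table[41] = {
        [0] = L10, [10] = L20, [20] = L30, [30] = L40, [40] = L50
    };
    if (n < 10 || n > 50)
        return LDEFAULT;    /* outside the table's range */
    return (enum target)table[n - 10];
}
```

With only five live entries out of 41, a chain of comparisons is the smaller encoding, which matches the remark about sequential testing.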

back then, quite some prominent languages provided these. (Hard
to believe given all the options that appeared later.)

I suspect many or most C programmers
would prefer it to have been designed without default fallthrough.

The explicit and clumsy 'break' is what syntactically annoys me,
but it's also no drama, to be clear.

I disagree, it /IS/ a drama where you have to keep remembering to write it.

It stays the way it is because changing it *would break existing
code*. Worse, some seemingly reasonable ways of changing it would
mean that existing code is still valid but with different semantics.

Indeed. And that's the crucial point. A simple "dislike"-criticism
without acknowledging the practical side effects is pointless.

I understand the problems of changing it in the 21st century rather than
much earlier on. People could simply agree with me that it is a terrible language feature.

It would also have been perfectly possible to leave 'switch' alone and
instead introduce a new kind of statement.

--------------------------
    switch a
    when 10 then L10
    when 20 then L20
    when 30 then L30
    when 40 then L40
    when 50 then L50
    else Ldefault
    end

L10:
    for i:=3 to n do
L20:
        if data[i] then
Ldefault:
            ++count
L30:
            j:=i*2
            while j<=n, j+:=i do
L40:
                if data[j] then
L50:
                    data[j]:=0
                end
            end
        end
    end


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Wed May 13 13:35:20 2026

From Newsgroup: comp.lang.c

On 2026-05-13 13:07, Tim Rentsch wrote:

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]

BYTE and WORD are poor choices for type names, no doubt
about that.

[...]

WORD is certainly ambiguous (unless, I suppose, it's sufficiently
obvious from the context). But I don't have a problem with BYTE,
or preferably byte, as a type name as long as it really is a byte.

[...]

BYTE is a poor choice for a type name because it looks like a
macro.

A lower-case version, byte, is a poor choice for a type name,
because it is both confusing and ambiguous.

Confusing, because for a very long time and for a huge segment of
the programming community, the term byte is synonymous with eight
bits, but in C that need not be true.

Actually, it was more an issue in the "intermediate epoch", when
terminology spread to the non-expert home-users who considered
a byte to be 8 bit on their typical PC systems while not knowing
anything from the professional IT world before (with 6, 7, 9 bit
entities). Nowadays I'd consider it less an issue since these
systems seem to have (mostly?) vanished. There was a reason why
the standards back then introduced and used the term "octet" for
the common 8-bit entities, to avoid ambiguity and misunderstanding.

What's technically defined for the "C" language in the respective
standard documents is an own thing, not necessarily equivalent to
the respective application semantics expressed by some C-program,
although I'd always prefer "octet" for that (and avoid "byte").

Janis

[...]


From Bart@bc@freeuk.com to comp.lang.c on Wed May 13 12:48:39 2026

From Newsgroup: comp.lang.c

On 13/05/2026 03:48, Tim Rentsch wrote:

Bart <bc@freeuk.com> writes:

[.. I am cutting 100ish lines as they don't bear on my response ..]

Take for example C's set of operator precedences.

The one for the ?: operator is particularly obscure, so in an
expression like one of these:

a + b ? c - d : e * f
a ? b ? c : d ? e : f : g

then parentheses would be used to make things clearer. (I haven't
checked these are valid, but that is the point; it is hard to see!)

But shouldn't people be expected to learn the rules? Why is it
OK to 'revel' in not knowing the basics here, but not when unnecessary
UBs are involved where rules are harder and which depend on runtime
inputs?

If you want people to take you seriously, you need to find more
compelling examples. I am both familiar with and comfortable with
the syntax of C expressions, and even I would never write such
expressions as the two shown above.

No? I actually had your posted examples in mind. I can't remember you
using parentheses. I can remember you not being sympathetic to readers
of your code and expected them to be as familiar with precedence as you are.

These lines look like they
were written by someone in junior high school (or these days,
probably elementary school).

The lines are not meant to mean anything, just sequences of terms and operators. You can think of them as exercises where you add parentheses
to make them unambiguous.

A bit like adding punctuation here:

"that that is is that that is not is not is that it it is"

Whether you mean to or not, this
example gives the impression of offering a strawman argument, and
it's only natural for people to react to that by dismissing your
comments, or even dismissing them altogether. Is that what you
want? To be dismissed? Or do you hope to actually communicate
with people? If so I recommend looking for a better framing of
your views and ideas.

Now this is getting silly. Can no one here engage in a civil discussion without reducing to insults and casting aspersions?


From David Brown@david.brown@hesbynett.no to comp.lang.c on Wed May 13 13:54:26 2026

From Newsgroup: comp.lang.c

On 13/05/2026 13:35, Janis Papanagnou wrote:

On 2026-05-13 13:07, Tim Rentsch wrote:

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]

BYTE and WORD are poor choices for type names, no doubt
about that.

[...]

WORD is certainly ambiguous (unless, I suppose, it's sufficiently
obvious from the context). But I don't have a problem with BYTE,
or preferably byte, as a type name as long as it really is a byte.

[...]

BYTE is a poor choice for a type name because it looks like a
macro.

A lower-case version, byte, is a poor choice for a type name,
because it is both confusing and ambiguous.

Confusing, because for a very long time and for a huge segment of
the programming community, the term byte is synonymous with eight
bits, but in C that need not be true.

Actually, it was more an issue in the "intermediate epoch", when
terminology spread to the non-expert home-users who considered
a byte to be 8 bit on their typical PC systems while not knowing
anything from the professional IT world before (with 6, 7, 9 bit
entities). Nowadays I'd consider it less an issue since these
systems seem to have (mostly?) vanished. There was a reason why
the standards back then introduced and used the term "octet" for
the common 8-bit entities, to avoid ambiguity and misunderstanding.

Yes.

What's technically defined for the "C" language in the respective
standard documents is an own thing, not necessarily equivalent to
the respective application semantics expressed by some C-program,
although I'd always prefer "octet" for that (and avoid "byte").

Janis

The computing world standardised on "byte" meaning 8 bits long ago - by
the 1980's the only exceptions of any significance are network standard
documents that use the term "octet", and the C standards (and by
inheritance, the C++ standards) where a "byte" is not necessarily 8
bits. While there are current C compilers where CHAR_BIT is not 8,
information and documentation about them (typically DSP processors)
specifically avoid using the term "byte" to refer to anything other than
8-bit data elements.

The use of "byte" as a type name in C could possibly have been
considered confusing or ambiguous 40 years ago - these days, you'd have
to try very hard to misunderstand it.

(Note that C++ has had std::byte as a type for access to raw memory
since C++17. Why would C programmers be more easily confused than C++ programmers?)


From Bart@bc@freeuk.com to comp.lang.c on Wed May 13 13:20:40 2026

From Newsgroup: comp.lang.c

On 13/05/2026 11:16, Janis Papanagnou wrote:

On 2026-05-12 16:33, Bart wrote:

[...]

Take for example C's set of operator precedences.

The one for the ?: operator is particularly obscure, so in an
expression like one of these:

    a + b ? c - d : e * f
    a ? b ? c : d ? e : f : g

then parentheses would be used to make things clearer. (I haven't
checked these are valid, but that is the point; it is hard to see!)

What has that example to do with ["obscure"] _operator precedence_?

Ternary conditionals are actually expressions that are sensibly
defined in "C" (i.e. concerning their precedence ranking).

     a + b ? c - d
           : e * f

     a ?
         b ? c
           : d ? e
               : f
       : g

For complex expressions you can, as a *responsible* programmer, use
various means to not (not deliberately) write obfuscated expressions;
you can indent code, use parentheses[*], or you can decompose it to
(semantic or technical) identified sub-units.

/I/ might do that; how about everyone else?

A random quote from Hacker News:

"I once asked my college professor about operator precedence in C. He
had been writing C code in industry for decades. 'I have no idea,' he
told me."

From a reply further downthread:

"As a young egotist I would often omit parens in complicated C
expressions. I did this intentionally and in a very self-satisfied way - writing multi-line conditionals and lining them up neatly without parens
with a metaphorical flourish of my pen.

Then one day, chasing a hard-to-find bug, I realised it had happened
because I'd mixed up the precedence of && and || in a long conditional.
I was an idiot. Since then I've made a point of reminding myself that I
know nothing and that there's nothing to be gained from pretending I do,
and putting parens in everywhere."

(https://news.ycombinator.com/item?id=22482223)
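The && / || mix-up mentioned in that quote is easy to reproduce. A minimal sketch, with invented function names:

```c
/* && binds tighter than ||, so this is a || (b && c). */
int or_and_plain(int a, int b, int c)
{
    return a || b && c;
}

/* What someone reading left to right might have intended instead. */
int or_then_and(int a, int b, int c)
{
    return (a || b) && c;
}
```

The two differ whenever a is true and c is false; GCC's -Wparentheses warning exists for exactly this pattern.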

[*] Parentheses would IMO make your layout in your example above not
in any way better,

Yet they are needed in Algol68, apparently your favourite language.

just yet more overloaded. (So forcing parenthesis
[in "your language"] is certainly addressing the wrong problem here.)

Your complaint, as so often, fails to work on so many levels. It
tells, yet again, more about you than about the "C" language.

Yeah, like I'm the only person to have ever complained about this!

Could C operator precedence levels have been done better: Yes or No?

By replying No, you suggest they are absolutely perfect.

By replying Yes, you admit they might have some issues, but it's OK, you
can work around them (you and 100M other people across half a century).

But instead, you decide to insult me.

But shouldn't people be expected to learn the rules?

Programmers should certainly learn, know, apply, and obey the rules.

(If you don't understand that you may try to transform that truism
to your "car example".)

Janis

PS: There *is* a specific issue in C's operator precedence ranking
but it's not the ternary conditional.

There are lots of issues:

* There are too many

* == != have a different precedence from < <= >= >. Why? Which one
is higher? How would you make use of this?

* | & ^ have different levels for reasons that are unclear. Again, why?
What possible advantage does this have?

* Ones like << and >>, which scale values in the same way as * and /,
have a completely different level.

* In particular, << and >> have a lower precedence than +, so that a <<
3 + b is actually parsed as a << (3 + b) rather than (a << 3) + b.
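Two of the items in this list can be checked directly. The sketch below pairs the expression as written with the grouping a reader might have expected; the function names are hypothetical, and the second pair covers the well-known case of & against ==.

```c
/* + binds tighter than <<, so this is a << (3 + b). */
unsigned shift_plain(unsigned a, unsigned b)
{
    return a << 3 + b;
}

/* Scale first, then add. */
unsigned shift_then_add(unsigned a, unsigned b)
{
    return (a << 3) + b;
}

/* == binds tighter than &, so this is x & (mask == want). */
int mask_plain(unsigned x, unsigned mask, unsigned want)
{
    return x & mask == want;
}

/* The usual intent: test the masked bits against a value. */
int mask_then_compare(unsigned x, unsigned mask, unsigned want)
{
    return (x & mask) == want;
}
```

For a = 1, b = 2 the shift pair yields 32 and 10; for x = 6, mask = 2, want = 2 the mask pair yields 0 and 1.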

I don't have a list in front of me so can't tell you off-hand what else there is.

In the case of ?: I find it bizarre that a 3-way operator is classed
amongst the binary operators anyway.

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Wed May 13 05:26:34 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 13/05/2026 03:48, Tim Rentsch wrote:

Bart <bc@freeuk.com> writes:

[.. I am cutting 100ish lines as they don't bear on my response ..]

Take for example C's set of operator precedences.

The one for the ?: operator is particularly obscure, so in an
expression like one of these:

a + b ? c - d : e * f
a ? b ? c : d ? e : f : g

then parentheses would be used to make things clearer. (I haven't
checked these are valid, but that is the point; it is hard to see!)

But shouldn't people be expected to learn the rules? Why is it
OK to 'revel' in not knowing the basics here, but not when unnecessary
UBs are involved where rules are harder and which depend on runtime
inputs?

If you want people to take you seriously, you need to find more
compelling examples. I am both familiar with and comfortable with
the syntax of C expressions, and even I would never write such
expressions as the two shown above.

No? I actually had your posted examples in mind. I can't remember you
using parentheses. I can remember you not being sympathetic to readers
of your code and expected them to be as familiar with precedence as
you are.

These lines look like they
were written by someone in junior high school (or these days,
probably elementary school).

The lines are not meant to mean anything, just sequences of terms and operators. You can think of them as exercises where you add
parentheses to make them unambiguous.

A bit like adding punctuation here:

"that that is is that that is not is not is that it it is"

Whether you mean to or not, this
example gives the impression of offering a strawman argument, and
it's only natural for people to react to that by dismissing your
comments, or even dismissing them altogether. Is that what you
want? To be dismissed? Or do you hope to actually communicate
with people? If so I recommend looking for a better framing of
your views and ideas.

Now this is getting silly. Can no one here engage in a civil
discussion without reducing to insults and casting aspersions?

I'm sorry my comments weren't of more help to you.

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Wed May 13 14:42:03 2026

From Newsgroup: comp.lang.c

On 2026-05-13 13:32, Bart wrote:

On 13/05/2026 02:26, Janis Papanagnou wrote:

On 2026-05-13 00:35, Keith Thompson wrote:

[Dropping comp.lang.misc, since this is only about C.]
Bart <bc@freeuk.com> writes:

[...]

[...]

So the inconvenience of how 'switch' works is excused because
/sometimes/ you need fallthrough, or the one time in a thousand you
need Duff's device.

I don't see any inconvenience in "how it works"; it actually
allows programmers to implement both semantics as needed. And
both semantics were needed, they have been used. (Even if you
think your projection of your preferences and limited uses is
what should constitute the global software development world.)

Well, no other language (save C++) implements switch like C does.

I'm not sure you appreciate how bizarre it actually is. Here is a piece
of code from a 'Sieve' benchmark:

[ code snipped ]

This is perfectly valid C code, if meaningless.

Picking arbitrary samples from the wild is meaningless!

[...]

BTW switch fallthrough is necessary so that you can do this:

    switch (a) {
    case 'A': case 'B': case 'C': .... // deal with A/B/C

Without fall-through behavior, it would exit after that case 'A': label. This is how crude it is.

There's many ways to define syntax of switch-like statements.
"C" has chosen above form and fall-through semantics and providing
'break' to control that. - You don't like that? - Fine. I don't care.

Remember my saying people defend its misfeatures to the death? Your post
is a perfect example!

And you are still wrong. - You have already been told before by
others that this is not some endorsement but an explanation and
the stated fact that it works.

What I said (still quoted below) was:
| The explicit and clumsy 'break' is what syntactically annoys me,
| but it's also no drama, to be clear.

[...]

I don't see any inconvenience in "how it works"; it actually
allows programmers to implement both semantics as needed.

Ha, ha, ha! This is exactly my point. 99% of the time, at least, you
want very simple, boring semantics and properly structured syntax, just
as I offer in my languages and others do in their switch/match statements.

Your languages are meaningless, and any repetition and mention of
them pointless.

It would be (marginally) better if you'd given positive examples
from other, *established* languages instead. (I mentioned, e.g.,
the Kornshell way of supporting such a feature alternatively.)

[...]

(Even if you
think your projection of your preferences and limited uses is
what should constitute the global software development world.)

You think my view is limited, and genuinely think C switch is superior
to how it works in other languages?

You are talking nonsense, and, again, despite having been told
so many times to not put words in others' mouths, or assuming
things where you've regularly been shown that your imaginations
are just wrong! - Tell me; do you understand that? - If so, when
will you stop that?

<meta>
Could it be that you need medical help? Have you ever spoken with
a psychotherapist about how the world reacts towards you? - It may
really help you; he might confirm that we are all malicious people
here, and that you should abstain from contact and communication
with us to keep your sanity and your personal view of life intact.
</meta>

[...]
Then this is definitely a wind-up.

Not at all. "switch" was originally implemented in a way that,
I suspect, was easier for the compiler to implement (basically
a scoped computed goto), and for an audience of programmers who,
to exaggerate slightly, could shout across the room and ask Dennis
Ritchie questions about it.

Computed 'goto' was actually considered quite "high standards"

Switch is not really computed goto.

I was adding a comment on Keith's "computed goto", its valuation
and dissemination.

(If you intended to comment on his valuation reply to him.)

[...]

[...]

The explicit and clumsy 'break' is what syntactically annoys me,
but it's also no drama, to be clear.

I disagree, it /IS/ a drama where you have to keep remembering to write it.

I acknowledge if you say that it is a drama *for you*. (And I'm not
the least astonished about that, to be honest.)

And I understand that if you are used to other languages you may get
confused and "forget" about writing a 'break' if you rarely use "C"
for programming. - "C" is clearly no language designed for you! - I
wonder why you use it (if it is so bad), and why you complain about
it here, instead of just leaving the place and focus on your preferred
interests instead. - Or have you some "preacher stance" and intend to
convince us all about your enlightened wisdom? - Meanwhile you should
have noticed that we're not worthy. Sadly we're still on another level
and that won't change soon, I fear.

It stays the way it is because changing it *would break existing
code*. Worse, some seemingly reasonable ways of changing it would
mean that existing code is still valid but with different semantics.

Indeed. And that's the crucial point. A simple "dislike"-criticism
without acknowledging the practical side effects is pointless.

I understand the problems of changing it in the 21st century rather than much earlier on.

Yet you are whining and complaining and dramatizing the topic.

People could simply agree with me that it is a terrible
language feature.

As I've already expressed, I think there's alternative options that
I'd (syntactically) prefer. - So what?

Why haven't you read and interpreted those statements as agreement?

I certainly don't agree that it's a problem (as it is for you).

Janis

[...]

--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Wed May 13 15:07:35 2026

From Newsgroup: comp.lang.c

On 2026-05-13 13:48, Bart wrote:

Bart <bc@freeuk.com> writes:

The one for the ?: operator is particularly obscure, so in an
expression like one of these:

    a + b ? c - d : e * f
    a ? b ? c : d ? e : f : g

[...]

The lines are not meant to mean anything, just sequences of terms and
operators. You can think of them as exercises where you add parentheses
to make them unambiguous.

You have a misconception. - Above expressions *are* unambiguous!

You may have a fuzzy relation to ambiguous 'if' statements in mind,
where you can have a "dangling else" situation in cases where there's
fewer 'else' branches (than 'if'/"then" branches) in some code. - But
notice that this is not the case with ternary conditional expressions
where you will generally have "saturated" '?' ':' pairs, and thus no
ambiguities, inherently.
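
As an illustration of that point, here is a small C sketch of how the grammar groups the two quoted expressions. The function names are invented for this example. The explicitly parenthesized versions spell out the grouping that the C grammar already fixes, so each pair should return the same value for any inputs.

```c
/* How C groups the two expressions quoted above.
 * Function names are hypothetical, for illustration only. */

/* '+', '-' and '*' all bind tighter than '?:'. */
int first_bare(int a, int b, int c, int d, int e, int f)
{
    return a + b ? c - d : e * f;
}

int first_grouped(int a, int b, int c, int d, int e, int f)
{
    return (a + b) ? (c - d) : (e * f);
}

/* Each '?' pairs with exactly one ':', and nesting associates
 * to the right, so there is only one possible reading. */
int second_bare(int a, int b, int c, int d, int e, int f, int g)
{
    return a ? b ? c : d ? e : f : g;
}

int second_grouped(int a, int b, int c, int d, int e, int f, int g)
{
    return a ? (b ? c : (d ? e : f)) : g;
}
```

Adding the parentheses changes nothing for the compiler; it only helps the human reader.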

Janis

[...]

--- Synchronet 3.22a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.misc,comp.lang.c on Wed May 13 14:15:27 2026

From Newsgroup: comp.lang.c

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 2026-05-13 01:21, Scott Lurndal wrote:

Bart <bc@freeuk.com> writes:

On 12/05/2026 02:37, Keith Thompson wrote:

[...]

Yeah, it's one of the great mysteries. Even half a century ago, there
were big companies and lots of clever people, who could have cranked out
a suitable systems language of equal capability to C in their sleep, but
with fewer rough edges.

Those clever people _were_ cranking out suitable systems languages
by the bucketful. PL/1, Algol derivatives, proprietary internal
languages (Burroughs SPRITE and BPL languages), HP-3000 SPL (Systems
Programming Language - I used SPL in the late 70s) and
on the academic side, modula, ADA, Pascal (yes, it could be
a systems programming language, c.f. VAX-11 Pascal).

[...]

I wonder about why you put Ada just in the "academic box".

Because the first ADA compiler I used came from NYU. :-)

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Wed May 13 14:31:02 2026

From Newsgroup: comp.lang.c

In article <10u1j2h$1l93l$31@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-12 16:33, Bart wrote:

[snip]
But shouldn't people be expected to learn the rules?

Programmers should certainly learn, know, apply, and obey the rules.

(If you don't understand that you may try to transform that truism
to your "car example".)

Programmers _should_ absolutely learn the rules. But in C,
there are many of them, and some of them are deceptively subtle.

_A_ rule that programmers can remember quite easily, however,
is that parentheses generally carry very high precedence, and
so when in doubt, wrapping something in parens can aid
understanding (for the programmer and the maintainer). The key
is to find balance between extreme terseness and extreme
verbosity, both of which can feel obfuscating.

There was a time when I knew and had memorized the precedence of
all operators in C. I remember most, but have forgotten some
that I use less frequently; I suspect many programmers are in
the same (or a similar) situation. If I am writing code and can
not immediately remember the precedence of some operator in some
expression, I apply parentheses.
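
A minimal sketch of the kind of case where this habit pays off: in C, `==` binds tighter than `&`, which surprises many people. The flag name and both function names are hypothetical, chosen only for this example.

```c
#include <stdbool.h>

#define FLAG_READY 0x04u   /* hypothetical flag bit, for illustration */

/* Buggy: '==' binds tighter than '&', so this evaluates
 * status & (FLAG_READY == 0), i.e. status & 0, which is always 0. */
bool not_ready_buggy(unsigned status)
{
    return status & FLAG_READY == 0;
}

/* Intended meaning, made explicit with parentheses. */
bool not_ready(unsigned status)
{
    return (status & FLAG_READY) == 0;
}
```

Many compilers can warn about the first form when warnings are enabled, but the parenthesized form is clear to a reader either way.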

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Wed May 13 14:33:02 2026

From Newsgroup: comp.lang.c

In article <8633zwm5h9.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

scott@slp53.sl.home (Scott Lurndal) writes:

For the early C compiler on the PDP-11, the 'int' type was
16-bits, implicitly signed, and the code generator simply emitted
available arithmetic instructions.

It was the only C compiler at the time, any guarantees would have
been implicit in the choice of target architecture.

I mostly wrote unix kernel code using the v6 compiler, rather than
writing code that did any heavy math, so whether value was
preserved or sign was preserved wasn't something I, as a kernel
programmer, routinely considered.

If int was only 16 bits, I expect promotion considerations didn't
come up very often.

Presumably they came up all the time; `char` was frequently used as a
small integer. But there was no `unsigned` type, so whether it was
promoted to an `int` or `unsigned int` was moot.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Wed May 13 14:40:26 2026

From Newsgroup: comp.lang.c

In article <10u1mav$1l93k$15@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-12 20:09, Dan Cross wrote:

In article <10tvmp7$23t17$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

Would you get more signed overflow in practice? And in particular,
would you get more signed overflow UB in places where you would not have
a bug in the code anyway? There would certainly be more cases of signed
integer arithmetic, whereas moving to a common unsigned type means more
unsigned integer arithmetic. But I don't see signed integer arithmetic
as a risk of UB in itself - it is only a risk of UB if you are working
with inappropriate values.

I suspect you would, if only because one of the major motivating
factors for using unsigned arithmetic in practice is to have the
full bit-range of the type available. [...]

Hmm.. - I'm using 'unsigned' typically to express the domain of the
application values (not to "wrest" some more values out of a type).

It's not about expanding the numeric range. It's about being
able to do bit-level manipulation. I often use unsigned types
to represent values that are consumed by hardware: device
registers, page table entries, addresses of various kinds, etc.

Signed semantics in that context are not helpful. Setting the
"sign" bit doesn't mean the value is "negative": it just means
that that bit is set.
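
A rough sketch of what that looks like in code. The entry layout here is an assumption made up for the example (bit 63 marking a page non-executable), as are all the names; real hardware formats differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: bit 63 of a 64-bit page table entry marks the
 * page as non-executable. The name and position are assumptions. */
#define PTE_NO_EXEC (UINT64_C(1) << 63)

uint64_t pte_set_no_exec(uint64_t pte)
{
    return pte | PTE_NO_EXEC;
}

bool pte_is_no_exec(uint64_t pte)
{
    return (pte & PTE_NO_EXEC) != 0;
}

/* Right shifts of unsigned values always fill with zeros, so the
 * top bit can be moved down without any sign extension. */
unsigned pte_top_bit(uint64_t pte)
{
    return (unsigned)(pte >> 63);
}
```

With a signed 64-bit type, the same shift into bit 63 would overflow, and a right shift of the resulting negative value would be implementation-defined, which is why unsigned types fit this kind of work better.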

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Wed May 13 14:47:55 2026

From Newsgroup: comp.lang.c

In article <867bp8m6ok.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

James Kuyper <jameskuyper@alumni.caltech.edu> writes:

On 2026-05-08 06:43, David Brown wrote:
...

Yes, I have heard that argument before. I am unconvinced that the
"value preserving" choice actually has any real advantages. I also
think it is a misnomer - it implies that "unsigned preserving" would
not preserve values, which is wrong.

Unsigned-preserving rules would convert a signed value which might be
negative to unsigned type more frequently than the value preserving
rules do.

This statement is wrong.

The truth of Kuyper's statement depends on how one interprets
its meaning.

An "unsigned preserving" promotion rule
converts a signed value to a signed value and an unsigned value to
an unsigned value. The value being converted stays the same in both
cases. Both an "unsigned preserving" promotion and a so-called
"value preserving" promotion preserve the value of the operand being
promoted (and converted).

Unsigned-preserving rules mean that unsigned types are promoted
to unsigned types.

But that also means that signed types in expressions involving
those promoted values are converted to unsigned types (including
negative values of the signed type). I took Kuyper's statement
to refer to this conversion.

For example, given:

unsigned char a = 8;
signed short b = -5;

Now consider the type of `b` after promotion in the expression,
`a * b`.
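
To make the difference concrete, here is a sketch that computes the product both ways. The first function is what C actually does today. The second only emulates unsigned-preserving rules with an explicit cast, since no current compiler implements them. It assumes the usual case where `int` can represent every `unsigned char` value; the function names are invented for the example.

```c
#include <limits.h>

/* Actual C ("value preserving"): 'a' and 'b' both promote to int,
 * assuming int can represent every unsigned char value. */
int product_value_preserving(void)
{
    unsigned char a = 8;
    signed short b = -5;
    return a * b;               /* int arithmetic: 8 * -5 */
}

/* Emulating "unsigned preserving": 'a' is forced to unsigned int, so
 * the usual arithmetic conversions convert 'b' to unsigned int too. */
unsigned int product_unsigned_preserving(void)
{
    unsigned char a = 8;
    signed short b = -5;
    return (unsigned int)a * b; /* modular arithmetic, wraps around */
}
```

The first returns -40; the second returns `UINT_MAX - 39`, the same bit pattern on two's complement machines but a very different value if it is later compared or divided.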

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Wed May 13 14:57:05 2026

From Newsgroup: comp.lang.c

In article <10u0k0k$1l93l$30@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

[snip]
Bart <bc@freeuk.com> writes:
[...]

So the inconvenience of how 'switch' works is excused because
/sometimes/ you need fallthrough, or the one time in a thousand you
need Duff's device.

I don't see any inconvenience in "how it works"; it actually
allows programmers to implement both semantics as needed. And
both semantics were needed, they have been used. (Even if you
think your projection of your preferences and limited uses is
what should constitute the global software development world.)

It's easy to get wrong. Other languages accommodate both
semantics using alternation in the selector arm. For example,
one might imagine an hypothetical syntax, something like:

switch (a) {
case 1 || 2 || 3 || 4: whatever();
default: other();
}

...with no `break` to end each `case`.

You couldn't use it to build Duff's Device, but I'm not sure
that even Duff would call that a loss.
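
For comparison, standard C can already express the alternation part by stacking case labels; what it lacks is the implicit end of each arm, so every arm still needs its own `break` or `return`. A minimal sketch, with placeholder helper functions standing in for `whatever()` and `other()`:

```c
/* Hypothetical helpers; only the shape of the switch matters. */
static int whatever(void) { return 1; }
static int other(void)    { return 2; }

int dispatch(int a)
{
    switch (a) {
    case 1:
    case 2:
    case 3:
    case 4:                 /* stacked labels share one body */
        return whatever();
    default:
        return other();
    }
}
```

Here each arm ends in `return`, so nothing falls through by accident; with ordinary statements a `break` would be needed in the same place.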

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Wed May 13 15:12:23 2026

From Newsgroup: comp.lang.c

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 2026-05-13 00:35, Keith Thompson wrote:

Except that C23 adds a "fallthrough" attribute that, while it
doesn't change the semantics of the switch statement, allows a
programmer to tell the compiler that a fallthrough was intentional.
A compiler can choose to warn about an unmarked fallthrough and
remain silent when it sees the "fallthrough" attribute.

We considered it generally good style to write /* fall-through */
at such places in our software as an explicit visible hint (and
that is even more "bulky" than the explicit 'break'). I'm thus not
astonished about these new features.

We use:

# if __GNUC__ >= 7 // 'statement attributes' were new with GCC 7.x
# if defined(__cplusplus) && (__cplusplus >= 201103L) // C++11 or greater
# define XXX_FALLTHROUGH [[gnu::fallthrough]]
# else
# define XXX_FALLTHROUGH __attribute__ ((fallthrough))
# endif
# else // GCC 4.x, 5.x, 6.x, comment only!
# define XXX_FALLTHROUGH /* Fall Through */
# endif

Where 'XXX' is replaced by the app name.

switch (variable) {
case cond1:
    break;
case cond2:
    /* do something */
    XXX_FALLTHROUGH;
default:
    /* do something else */
    break;
}
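
For the C23 attribute mentioned upthread, a similar wrapper might look like the sketch below. This is not the macro from the post above: `MY_FALLTHROUGH` and `steps` are made-up names, and the version check is deliberately conservative (it only uses `[[fallthrough]]` when the compiler reports C23).

```c
/* Pick a fallthrough marker depending on what the compiler supports.
 * MY_FALLTHROUGH is a hypothetical name for this sketch. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
#  define MY_FALLTHROUGH [[fallthrough]]              /* C23 attribute */
#elif defined(__GNUC__) && __GNUC__ >= 7
#  define MY_FALLTHROUGH __attribute__((fallthrough)) /* GNU extension */
#else
#  define MY_FALLTHROUGH ((void)0)                    /* no-op fallback */
#endif

int steps(int level)
{
    int n = 0;
    switch (level) {
    case 2:
        n += 2;
        MY_FALLTHROUGH;     /* intentional: level 2 includes level 1 */
    case 1:
        n += 1;
        break;
    default:
        break;
    }
    return n;
}
```

The macro is written so that the use site always ends in a semicolon, which all three expansions accept.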

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Wed May 13 15:14:24 2026

From Newsgroup: comp.lang.c

In article <86cxyzlfc7.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10tpq7e$a6kp$3@dont-email.me>, Bart <bc@freeuk.com> wrote:

[...]

Apparently, you missed the changes afoot in the committee to do
exactly what everyone has been telling you: deprecate `i[A]` but
preserve `i + A`.

Not deprecate but deem it obsolescent. A very different thing.

https://www.open-std.org/JTC1/SC22/WG14/www/docs/n3517.htm is
the original issue. It links to n3380, available online at
https://open-std.org/JTC1/SC22/WG14/www/docs/n3380.htm.

Note `n3380`, dated 202410, which is accompanied by this
comment: "Do not remove index[array], yet, but deprecate it."
Note also the poll and results from the Minneapolis meeting:
"https://open-std.org/JTC1/SC22/WG14/www/docs/n3380.htm" (10
voted yes, 1 no, 8 abstain; result is "direction").

This is the sense in which I used that word, not the sense in
the standard.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From David Brown@david.brown@hesbynett.no to comp.lang.c on Wed May 13 17:16:53 2026

From Newsgroup: comp.lang.c

On 13/05/2026 16:31, Dan Cross wrote:

In article <10u1j2h$1l93l$31@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-12 16:33, Bart wrote:

[snip]
But shouldn't people be expected to learn the rules?

Programmers should certainly learn, know, apply, and obey the rules.

(If you don't understand that you may try to transform that truism
to your "car example".)

Programmers _should_ absolutely learn the rules. But in C,
there are many of them, and some of them are deceptively subtle.

_A_ rule that programmers can remember quite easily, however,
is that parentheses generally carry very high precedence, and
so when in doubt, wrapping something in parens can aid
understanding (for the programmer and the maintainer). The key
is to find balance between extreme terseness and extreme
verbosity, both of which can feel obfuscating.

There was a time when I knew and had memorized the precedence of
all operators in C. I remember most, but have forgotten some
that I use less frequently; I suspect many programmers are in
the same (or a similar) situation. If I am writing code and can
not immediately remember the precedence of some operator in some
expression, I apply parentheses.

I don't think it is necessary to /learn/ all the rules of a language -
but it is necessary to be aware of them, and to know how well you know
them. It's fine not to be sure of all the precedence rules in a
language (and some languages have many more operators than C, or
stranger precedence rules). You only need to know the ones you rely on
regularly, and the ones you have to read regularly. If you occasionally
come across something different, then you can look it up. There's no
point in filling your head with knowledge that you almost never need.

So there is usually no need to know the precedence rules for mixing
relational operators, shift operators and bitwise and/or operators, or
whatever, if you put parentheses in your own code or split the complex
expression into multiple variables. (With the caveat that you mentioned
earlier that both too few and too many parentheses make code harder to
understand.)

But you might have to understand code written which relies on more of
the details - you need to be aware of what you know, and what you have
to look up, in order to understand the code. The risk comes not from
ignorance of the precedence rules, but from thinking you know them when
you have misremembered them. Self-awareness of your own knowledge,
along with convenient and reliable references, is vital.

--- Synchronet 3.22a-Linux NewsLink 1.2

From James Kuyper@jameskuyper@alumni.caltech.edu to comp.lang.c on Wed May 13 11:17:54 2026

From Newsgroup: comp.lang.c

On 2026-05-11 16:48, Michael S wrote:

On Sun, 10 May 2026 20:30:24 -0400
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

On 2026-05-10 20:10, Keith Thompson wrote:

...

I like stdint.h types.

Me too.

...

And all those *_fast and *_least types... Not that I hate them, but
it certainly shows a lack of taste.

I greatly prefer the *_fast types - in general, they match my design
criteria, except when reading or writing data in a fixed format.
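
A small sketch of that split, under the assumption of a made-up two-field record format: exact-width types describe the fixed layout, while a `*_fast` type is used for the arithmetic done on the values. The struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed-format data: exact-width fields (hypothetical record layout). */
struct record {
    uint16_t id;
    uint16_t count;
};

/* Computation: at least 32 bits, whatever width is fastest here. */
uint_fast32_t sum_counts(const struct record *r, size_t n)
{
    uint_fast32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += r[i].count;
    return total;
}
```

The caller only relies on `uint_fast32_t` holding at least 32 bits; the implementation is free to make it wider.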

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Wed May 13 15:20:48 2026

From Newsgroup: comp.lang.c

In article <10u1emq$1l93k$13@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-13 05:47, Tim Rentsch wrote:

scott@slp53.sl.home (Scott Lurndal) writes:

[...]

The programming I do
(mainly kernel programming, SoC simulation,
firmware) all naturally require the fixed-width types.

Right. Code that interacts very closely with hardware is one of
those cases where the fixed-width types make sense.

Another common one - also "low-level" but different - are data types
exchanged through communication protocols.

Yes, in particular, networking protocols are often described in
terms of "octets", since many protocols date from the era in
which machines with differently sized bytes were still common.
E.g., much of the early work presaging TCP/IP was done on DEC
PDP-10 machines, which were 36-bit, word-oriented computers.

However, when discussing protocols (or hardware peripherals on
the local system, for that matter) it is important to exercise
care with respect to ordering of octets within multi-octet
data. For instance, IP networking "on the wire" uses big-endian
ordering to represent the fields in the IP datagram header,
while a processor might use little-endian natively. Hence, one
must be sensitive to transforming between the two. It may be
easier to leave the packet data in an octet buffer, and extract
the fields one is interested in on the host from that.
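
A minimal sketch of that last approach: keep the packet as an array of octets and assemble each field with shifts, which gives the same result regardless of host byte order. The helper names are invented for the example; the only protocol fact relied on is that the IPv4 Total Length field occupies octets 2 and 3 of the header.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit big-endian ("network order") field from an octet
 * buffer. Works the same on big- and little-endian hosts because it
 * never reinterprets the buffer as a wider type. */
uint16_t get_be16(const uint8_t *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | (unsigned)p[1]);
}

/* Store a 16-bit value in big-endian order. */
void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xffu);
}

/* In an IPv4 header, Total Length occupies octets 2 and 3. */
uint16_t ipv4_total_length(const uint8_t *header)
{
    return get_be16(header + 2);
}
```

For a header beginning `0x45 0x00 0x05 0xdc`, `ipv4_total_length` yields 1500 on any host.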

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Wed May 13 15:47:44 2026

From Newsgroup: comp.lang.c

In article <86lddnlvtr.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[...]

It is not clear to me that `longjmp` out of a non-nested signal
handler is still well-defined as of C11, though it is explicitly
stated to be in C89.

It seems you are misunderstanding what the standards are saying.

You read my post with insufficient care, and failed to
understand what I wrote, and are responding to something I did
not say.

The description of longjmp() says (paraphrasing) that it restores
the environment where the relevant setjmp() was done.

Yes.

There is
in C89 a passage about returning from signal handlers and so
forth, but that is followed by a carveout for nested signal
handlers, which in C89 is undefined behavior. (I assume that
also holds for C90 but I haven't verified that.)

Yes.

Aside: surely it is well-known by now that the language in
C90 is verbatim identical to the language for C89 except for
some bits of the front matter that explain the provenance of the
standard originating from ANSI.

If you know of specific differences, or a reason this is known
to be incorrect, please point it out.

Starting in C99, any mention of interrupts and signal handlers was
removed, along with the carveout.

This is wrong. Section 7.14 of C23 talks about signals and
signal handlers at length.

I never mentioned "interrupts" at all. (Traditionally, Unix
signals, which formed the basis for C signals, are not
interrupts in the conventional sense. Modern systems will
sometimes make use of interprocessor interrupts to hasten their
delivery, however.)

I think you are talking about _only_ the description of
`longjmp`. I am actually talking about the standard considered
in total. I only mentioned "non-nested" signal handler because
C90 was explicit in saying that `longjmp` from a _nested_
signal handler was UB.

Because there is a definition
for what longjmp() does, the behavior is defined, and there is no
undefined behavior (not counting things like doing a longjmp()
with a jmp_buf that wasn't set up, etc). Removing the mention of
interrupts and signals, and also removing the carveout, only makes
longjmp() more defined, not less.

I don't think you understood my statement.

Read section 7.14 of C23 carefully; it is not at all obvious
that a `longjmp` out of a signal handler is not _a priori_ UB.
By my reading, it's the opposite, in fact: I see no way to do
so without invoking UB.

I was asked for an example, beyond the behavior of
`realloc(ptr, 0)` with respect to whether it frees `ptr` if
`ptr` is non-null, where something that was explicitly
guaranteed by an earlier version of the standard was changed to
UB in a later version. This appears to be another example of such
a case.

By all means, correct me if you think I am mistaken, but your
explanation above was based on your own misinterpretation, not
otherwise relevant to the statement I had made, and incorrect
in fact (the standard did _not_ remove mention of signals).

Note, in the case of `longjmp` and signal handlers, I suspect it
doesn't much matter because if one is doing something like that
anyway, one is almost invariably going to be targeting a system
that conforms to a standard like POSIX, which extends ISO C with
stronger guarantees for defined behavior in this specific area.
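
As a rough sketch of what that looks like on a POSIX system, the following uses `sigsetjmp`/`siglongjmp` and `sigaction`, which are POSIX interfaces rather than ISO C. The function and variable names are made up for the example, and the signal is delivered synchronously with `raise` to keep the sketch simple; nothing here is guaranteed by ISO C alone.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* request POSIX declarations */
#endif
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf recovery_point;   /* hypothetical name */

static void on_sigusr1(int signo)
{
    (void)signo;
    siglongjmp(recovery_point, 1);  /* leave the handler non-locally */
}

/* Returns 1 if control came back via the handler's siglongjmp,
 * 0 if the handler returned normally, -1 if setup failed. */
int jump_out_of_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    /* Second argument 1: save the signal mask so that siglongjmp
     * restores it and SIGUSR1 is not left blocked afterwards. */
    if (sigsetjmp(recovery_point, 1) == 0) {
        raise(SIGUSR1);
        return 0;
    }
    return 1;
}
```

Because the mask is saved and restored, the function can be called repeatedly with the same result.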

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Wed May 13 15:52:50 2026

From Newsgroup: comp.lang.c

In article <10u24l5$2oaav$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

On 13/05/2026 16:31, Dan Cross wrote:

In article <10u1j2h$1l93l$31@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-12 16:33, Bart wrote:

[snip]
But shouldn't people be expected to learn the rules?

Programmers should certainly learn, know, apply, and obey the rules.

(If you don't understand that you may try to transform that truism
to your "car example".)

Programmers _should_ absolutely learn the rules. But in C,
there are many of them, and some of them are deceptively subtle.

_A_ rule that programmers can remember quite easily, however,
is that parentheses generally carry very high precedence, and
so when in doubt, wrapping something in parens can aid
understanding (for the programmer and the maintainer). The key
is to find balance between extreme terseness and extreme
verbosity, both of which can feel obfuscating.

There was a time when I knew and had memorized the precedence of
all operators in C. I remember most, but have forgotten some
that I use less frequently; I suspect many programmers are in
the same (or a similar) situation. If I am writing code and can
not immediately remember the precedence of some operator in some
expression, I apply parentheses.

I don't think it is necessary to /learn/ all the rules of a language -
but it is necessary to be aware of them, and to know how well you know
them. It's fine not to be sure of all the precedence rules in a
language (and some languages have many more operators than C, or
stranger precedence rules). You only need to know the ones you rely on
regularly, and the ones you have to read regularly. If you occasionally
come across something different, then you can look it up. There's no
point in filling your head with knowledge that you almost never need.

So there is usually no need to know the precedence rules for mixing
relational operators, shift operators and bitwise and/or operators, or
whatever, if you put parentheses in your own code or split the complex
expression into multiple variables. (With the caveat that you mentioned
earlier that both too few and too many parentheses make code harder to
understand.)

But you might have to understand code written which relies on more of
the details - you need to be aware of what you know, and what you have
to look up, in order to understand the code. The risk comes not from
ignorance of the precedence rules, but from thinking you know them when
you have misremembered them. Self-awareness of your own knowledge,
along with convenient and reliable references, is vital.

Yes, I agree. The key is knowing when it's time to go to look
at a reference.

I like the way you put it.

I might go a bit further and say that it's fine not to know
every rule, but there's a qualitative difference between
acknowledging that and knowing that easy access to a reliable
reference is useful, and steadfastly refusing to learn the rules
because one considers them poor to begin with.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Wed May 13 15:55:28 2026

From Newsgroup: comp.lang.c

In article <10u24gg$ev2$6@reader1.panix.com>,
Dan Cross <cross@spitfire.i.gajendra.net> wrote:

In article <86cxyzlfc7.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10tpq7e$a6kp$3@dont-email.me>, Bart <bc@freeuk.com> wrote:

[...]

Apparently, you missed the changes afoot in the committee to do
exactly what everyone has been telling you: deprecate `i[A]` but
preserve `i + A`.

Not deprecate but deem it obsolescent. A very different thing.

https://www.open-std.org/JTC1/SC22/WG14/www/docs/n3517.htm is
the original issue. It links to n3380, available online at
https://open-std.org/JTC1/SC22/WG14/www/docs/n3380.htm.

Note `n3380`, dated 202410, which is accompanied by this
comment: "Do not remove index[array], yet, but deprecate it."
Note also the poll and results from the Minneapolis meeting:
"https://open-std.org/JTC1/SC22/WG14/www/docs/n3380.htm" (10

Sigh. That is supposed to be the text of the polled question:
"Does WG14 want to deprecate index[array] as in N3360 in C2y?"

voted yes, 1 no, 8 abstain; result is "direction").

This is the sense in which I used that word, not the sense in
the standard.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Wed May 13 10:38:21 2026

From Newsgroup: comp.lang.c

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 2026-05-08 18:23, Tim Rentsch wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

NB: As opposed to other people I've never considered the K&R bible
a good book;

K&R is meant to be an introduction to C, written in an informal and
sometimes tutorial style. [...]

Acknowledged.

[...]

for every other statement it gave it created one more question
that just wasn't answered, so I remember. [...]

This is actually very unlike other languages I learned, where
simple coding situations either work as documented or lead to an
error.

C was typical for its era; that's just how languages were in those
days. In any case this complaint is about the language, not about
the book.

[.. I am focusing on just the book comparison ..]

It was a comment about the books available to learn a language.
(Of course K&R was sufficient to sit down and hack up programs.
But to repeat the main point: "for every other statement it gave
it created one more question that just wasn't answered". YMMV.)

[.. the language chosen for comparison is Simula 67] There's good
tutorials (like "SIMULA BEGIN"), [...]. [Other Simula documents
were mentioned but they aren't relevant to my comments which were
only about The C Programming Language, by Kernighan and Ritchie]

To partly repeat myself, my comments were only about K&R and the
writing in it; it wasn't an assessment of the language.

I think the implied parallel with Simula BEGIN is an apples and
oranges comparison. I wouldn't call Simula BEGIN a tutorial. It's
almost twice as long as K&R. Simula 67 is built around Algol 60 as
a base language; it's safe to assume many or most readers would
already be familiar with it, and so it doesn't need to be explained
as much. Simula is a safe language (or at least it can be - it's
possible there were unsafe implementations), and C is not, and that
means it's easier to define the semantics of Simula than to define
the semantics of C. C has bare pointers, and address arithmetic;
Simula has neither. C is able to deal with bare hardware, including
numeric hardware addresses; Simula doesn't consider such things.

I don't mean to shortchange Simula, which is an amazing and powerful
language. But the semantics of Simula are easier to define than the
semantics of C, with all of its vagaries, and yet K&R does a fair job
of that in roughly half as many pages as Simula BEGIN -- and it's
worth noting that K&R includes both a tutorial and a short reference.

Disclaimer: even though I have done a fair amount of programming in
Simula, it's been a long time since I did so, and a longer time
since I read Simula BEGIN. I am working purely from memory and
haven't had a chance to check against the actual text.

I did cut out a lot of your comments about other Simula documents.
I did that because it looked like an effort to shift the topic of
conversation from talking about the quality of the text to talking
about the language. I understand that you have complaints about C,
and I might even agree with some of them. But my earlier comments
were only about the writing in K&R, not about whether C is a good
language, and I don't want to stray from that into other uncharted
waters.
--- Synchronet 3.22a-Linux NewsLink 1.2

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Wed May 13 11:00:58 2026

From Newsgroup: comp.lang.c

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 2026-05-13 13:07, Tim Rentsch wrote:

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]

BYTE and WORD are poor choices for type names, no doubt
about that.

[...]

WORD is certainly ambiguous (unless, I suppose, it's sufficiently
obvious from the context). But I don't have a problem with BYTE,
or preferably byte, as a type name as long as it really is a byte.

[...]

BYTE is a poor choice for a type name because it looks like a
macro.

A lower-case version, byte, is a poor choice for a type name,
because it is both confusing and ambiguous.

Confusing, because for a very long time and for a huge segment of
the programming community, the term byte is synonymous with eight
bits, but in C that need not be true.

Actually, it was more an issue in the "intermediate epoch", when
terminology spread to the non-expert home-users who considered
a byte to be 8 bit on their typical PC systems while not knowing
anything from the professional IT world before (with 6, 7, 9 bit
entities). Nowadays I'd consider it less an issue since these
systems seem to have (mostly?) vanished. There was a reason why
the standards back then introduced and used the term "octet" for
the common 8-bit entities, to avoid ambiguity and misunderstanding.

What's technically defined for the "C" language in the respective
standard documents is an own thing, not necessarily equivalent to
the respective application semantics expressed by some C-program,
although I'd always prefer "octet" for that (and avoid "byte").

I agree with your comment about preferring "octet", but let me
add to that.

There is another difference worth noting. A byte is a unit of
storage, whereas octet is a measure of information. The word
byte is inherently about memory; the word octet is inherently
about value (eight bits of information). For this reason too
the name 'octet' is a better choice for a type name than 'byte'.
--- Synchronet 3.22a-Linux NewsLink 1.2

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Wed May 13 11:32:53 2026

From Newsgroup: comp.lang.c

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 2026-05-09 00:43, Dan Cross wrote:

[...]

Sure. This was a bit of a contrived example, but you ask a good
question: how often might one want write code like that?

In short, I don't know, but I can think of any number of hash
functions, checksums, etc, that may be implemented using 16-bit
arithmetic, and I can well see programmers wanting to take
advantage of the modular semantics afforded by using unsigned
types to do so. Every day? Probably not. But often enough.

I mentioned it before but it may have got lost in the lots of text
typically exchanged here; for hash functions a modulus based on
powers of two has *bad* _distribution properties_, so it's not
a sensible example or plausible rationale to vindicate modular
arithmetic for the few special cases (m=8, 16, 32, 64, etc.).

To me this sounds like folklore. A well-designed hash function
(and it isn't hard to write one or find one) will have good
distribution properties with any modulus. In many cases there
are good reasons to prefer a power-of-two modulus. So I think
the right takeaway here is make sure you have a good hash
function, and don't settle for some ad hoc thing just thrown
together.
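
A minimal sketch of the point, assuming 32-bit FNV-1a as the stand-in for "a good hash function" (nobody in the thread named a specific one, and whether FNV-1a mixes well enough for a given workload is a separate question). The names `fnv1a32` and `bucket_index` are made up for illustration. With a hash that mixes its input bits, reducing by a power-of-two table size is just a mask:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over a byte buffer (illustrative choice of hash). */
uint32_t fnv1a32(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t h = UINT32_C(2166136261);   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= UINT32_C(16777619);         /* FNV prime; result is reduced mod 2^32 */
    }
    return h;
}

/* Reduce a hash to a bucket index; nbuckets must be a power of two. */
size_t bucket_index(uint32_t h, size_t nbuckets)
{
    assert(nbuckets != 0 && (nbuckets & (nbuckets - 1)) == 0);
    return (size_t)h & (nbuckets - 1);   /* equivalent to h % nbuckets */
}
```

The mask only keeps the low bits, which is why the quality of the hash itself matters more than the choice of modulus.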
--- Synchronet 3.22a-Linux NewsLink 1.2

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Wed May 13 11:37:05 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <864ikdp9lk.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <868q9ppg4o.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

antispam@fricas.org (Waldek Hebisch) writes:

[discussing the notion of "safe" programs]

As I wrote, safety is about ability to avoid or detect errors.

In the functional programming community the usual statement is
"Well-typed programs cannot go wrong."

This is only concerning _type safety_.

I didn't mean to imply anything different.

Looking at what you wrote:

|I think a good way of understanding this is that, if
|a program stays inside the safe limits of the language,
|the program can produce wrong answers, but it cannot
|produce meaningless answers.

You are wrong.

A well-typed program _can_ produce meaningless answers; those
answers will have a well-defined type, but it is impossible to
say whether the value produced has any meaning with respect to
the program's intended purpose. Moreover, the "safe limits of
the language", whatever those may be, have nothing to do with
it.

What you mean by meaningless isn't what I meant by meaningless.
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Wed May 13 11:39:03 2026

From Newsgroup: comp.lang.c

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]

There is another difference worth noting. A byte is a unit of
storage, whereas octet is a measure of information. The word
byte is inherently about memory; the word octet is inherently
about value (eight bits of information). For this reason too
the name 'octet' is a better choice for a type name than 'byte'.

The words "octet" and "byte" mean different things.

If I were to typedef "byte" as unsigned char, it would be because I
want to emphasize the fact that a byte object holds one fundamental
unit of data, not necessarily character data. And I'd probably
use it in a way that doesn't assume it's 8 bits (unless I have a
good reason not to need portability). C's conflation of character
types with "bytes" is IMHO unfortunate; a typedef makes it clearer
what the type is being used for.

I usually just use "unsigned char" and remember that it's one byte
(however many bits that is).
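
A small sketch of the distinction being drawn, assuming nothing beyond `<limits.h>`; the names `byte`, `storage_bits` and `low_octet` are illustrative only. A C byte is a unit of storage of CHAR_BIT bits, while an octet is exactly eight bits of value:

```c
#include <limits.h>
#include <stddef.h>

typedef unsigned char byte;   /* one C byte: CHAR_BIT bits, at least 8 */

/* Bits of storage occupied by an object that is nbytes bytes long. */
size_t storage_bits(size_t nbytes)
{
    return nbytes * CHAR_BIT;
}

/* The low-order octet (exactly eight bits of value) of v,
   independent of how wide a byte is on this implementation. */
unsigned low_octet(unsigned long v)
{
    return (unsigned)(v & 0xFFul);
}
```

Code written this way does not silently assume CHAR_BIT == 8.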
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Wed May 13 11:44:41 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <8633zwm5h9.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

[...]

If int was only 16 bits, I expect promotion considerations didn't
come up very often.

Presumably they came up all the time; `char` was used as a small
integer frequently. But there was no `unsigned` type so
whether it was promoted to an `int` or `unsigned int` was moot.

Very early C didn't have unsigned int, but the signedness of char was effectively implementation-defined. From the 1975 C Reference Manual:

A char object may be used anywhere an int may be. In all
cases the char is converted to an int by propagating its sign
through the upper 8 bits of the resultant integer. This is
consistent with the two's complement representation used for
both characters and integers. (However, the sign-propagation
feature disappears in other implementations.)

In modern terms, the "other implementations" made plain char unsigned.
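
In today's C the same difference can be sketched with the explicitly signed and unsigned character types, assuming int is wider than char. The function names are hypothetical:

```c
#include <limits.h>

/* Widening a character value to int preserves the value, so the
   signedness of the source type decides what the upper bits look like. */
int widen_signed(signed char c)     { return c; }   /* sign is propagated */
int widen_unsigned(unsigned char c) { return c; }   /* upper bits are zero */

/* 1 if plain char behaves as a signed type on this implementation. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Which of the two behaviors plain char follows is still left to the implementation, and `CHAR_MIN` is the portable way to find out.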
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Wed May 13 11:59:24 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <86lddnlvtr.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

[...]

Starting in C99, any mention of interrupts and signal handlers was
removed, along with the carveout.

This is wrong. Section 7.14 of C23 talks about signals and
signal handlers at length.

Obviously, but that's clearly not what Tim meant. His statement
was not wrong in context. (7.14 describes <signal.h>. It's not
plausible that Tim would think that had been removed.)

I never mentioned "interrupts" at all (traditionally, Unix
signals, which formed the basis for C signals, are not
interrupts in the conventional sense. Modern systems will
sometimes make use of interprocessor-interrupts to hasten their
delivery, however).

I think you are talking about _only_ the description of
`longjmp`. I am actually talking about the standard considered
in total. I only mentioned "non-nested" signal handler because
C90 was explicit in saying that `longjmp` from a _nested_
signal handler was UB.

Yes, Tim was clearly talking only about the description of longjmp.
His statement wasn't wrong, just restricted to a certain context.
C90's description of longjmp includes a paragraph about interrupts
and signals. C99 removed that paragraph.

[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Wed May 13 12:28:05 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 13/05/2026 02:26, Janis Papanagnou wrote:

On 2026-05-13 00:35, Keith Thompson wrote:

[Dropping comp.lang.misc, since this is only about C.]
Bart <bc@freeuk.com> writes:

[...]

[...]

So the inconvenience of how 'switch' works is excused because
/sometimes/ you need fallthrough, or the one time in a thousand you
need Duff's device.

I don't see any inconvenience in "how it works"; it actually
allows programmers to implement both semantics as needed. And
both semantics were needed, they have been used. (Even if you
think your projection of your preferences and limited uses is
what should constitute the global software development world.)

Well, no other language (save C++) implements switch like C does.

And Objective-C, and perhaps others. And csh, the so-called
"C shell". It's likely that the only languages that have switch
statements with implicit fallthrough are those derived from or
inspired by C.

I'm not sure you appreciate how bizarre it actually is.

Why would you think that? I do, in fact, appreciate how bizarre
C's switch statement is.

Here is a piece of code from a 'Sieve' benchmark:

[snip]

Now, let's wrap 'switch' around it:

[...]

This is perfectly valid C code, if meaningless.

No language can prevent bad code. Yes, C has some features that make
writing bad code a bit easier.

The original code is 4 nested statements, but the switch's 'case'
labels can go literally anywhere within that structure. Even 'default'
can go anywhere and be mixed up with the other cases.

Yes, you're right, it's bizarre.

Further, if you wanted to apply 'break' to one of those case-blocks,
it wouldn't work as it would pertain to one of those nested loops.

I made this point before but it was brushed off. The C authors
couldn't think of an alternate keyword so there remains this conflict.

Yes, the fact that "break;" is used both for loops and for switch
statements is inconvenient. (csh uses "breaksw" for switch
statements.)
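
A short sketch of that conflict, with a made-up function `count_until_q`. Inside a switch that sits in a loop, "break;" only leaves the switch, so leaving the loop from a case needs something else (here a goto, one of several possible workarounds):

```c
#include <stddef.h>

/* Count the characters of s up to the first 'q', ignoring spaces. */
size_t count_until_q(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        switch (*s) {
        case ' ':
            break;        /* leaves the switch only; the loop goes on */
        case 'q':
            goto out;     /* a "break;" here would NOT leave the loop */
        default:
            n++;
            break;
        }
    }
out:
    return n;
}
```

A flag variable tested in the loop condition, or returning directly from the case, would serve equally well.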

BTW switch fallthrough is necessary so that you can do this:

switch (a) {
case 'A': case 'B': case 'C': .... // deal with A/B/C

Without fall-through behavior, it would exit after that case 'A':
label. This is how crude it is.

Yes, I mentioned that.

Remember my saying people defend its misfeatures to the death? Your
post is a perfect example!

I presume you read my post, but you did not understand it.

I don't like using all-caps for emphasis, but I'll try it here.

TO EXPLAIN IS NOT TO DEFEND.

You stubbornly refuse to understand that.

I did not "defend" C's switch statement. I described how it
worked, and speculated on the historical reasons for why it's
specified the way it is. The only thing I said that might be
considered "defending" it is it *can* be used correctly with a
little discipline, and that it's impractical to change it because
that would break existing code (I loosely think of the "break;"
as part of the syntax of a switch statement. I know it isn't,
but it helps with the discipline needed to use it correctly.)

C's switch statement is bizarre. I do not like the way it is
defined. If I were designing a new language without worrying
about breaking existing code, I would not consider having a switch
statement with default fallthrough.

But I rarely bother to say so, because (a) as long as we use C, we're
stuck with it, (b) it's really not difficult to use it correctly,
and (c) everybody already knows all this.

[...]

Ha, ha, ha! This is exactly my point. 99% of the time, at least, you
want very simple, boring semantics and properly structured syntax,
just as I offer in my languages and others do in their switch/match statements.

Your languages were not restricted by any need to avoid breaking
existing code.

[...]

The explicit and clumsy 'break' is what syntactically annoys me,
but it's also no drama, to be clear.

I disagree, it /IS/ a drama where you have to keep remembering to write it.

Competent C programmers rarely have any difficulty remembering to add
break statements where needed. Yes, it's annoying to have to do that.
Some compilers will warn about fallthrough (and yes, that does imply
that it's a mistake that some programmers make).
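
As a sketch of what such a warning looks for: GCC's `-Wimplicit-fallthrough` complains when control can run from one case into the next without a marker, and at its default level it accepts a comment like the ones below as the marker. C23 adds a `[[fallthrough]];` attribute for the same purpose. The function `stages_run` is hypothetical:

```c
/* Starting at stage `start` (0..2), run that stage and all later ones.
   Returns how many stages ran; the fallthrough here is intentional. */
int stages_run(int start)
{
    int ran = 0;
    switch (start) {
    case 0:
        ran++;
        /* fall through */
    case 1:
        ran++;
        /* fall through */
    case 2:
        ran++;
        break;
    default:
        break;
    }
    return ran;
}
```

With the markers in place, a case that lacks both a marker and a "break;" stands out as a probable mistake.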

It stays the way it is because changing it *would break existing
code*. Worse, some seemingly reasonable ways of changing it would
mean that existing code is still valid but with different semantics.

Indeed. And that's the crucial point. A simple "dislike"-criticism
without acknowledging the practical side effects is pointless.

I understand the problems of changing it in the 21st century rather
than much earlier on. People could simply agree with me that it is a
terrible language feature.

I don't think it's "terrible", but your opinion is valid. I do
agree that it could have been better. But we're stuck with it, and
it's not as bad as you try to demonstrate by writing deliberately
obfuscated code.

It would also have been perfectly possible to leave 'switch' alone and instead introduce a new kind of statement.

Yes, that could have been done (though with a more C-like syntax than
you suggest). But such a new feature would not support anything that
can't already be done (slightly more awkwardly) with the existing
switch statement. Programmers would be faced with an arbitrary
choice of whether to use "switch" or, let's call it "newswitch".
Programmers that use the new feature would have code that can't be
compiled with older compilers. As far as I know, nobody has made
such a proposal to the C committee, and I speculate that any such
proposal would be rejected for lack of utility.

[...]

We know about C's flaws. We know how to use the language in spite
of its flaws -- or we choose to use other languages (and discuss
them in other newsgroups). We don't waste time whining about them.

You know about the difficulties of the switch statement. Have you
*ever* written anything here that would help someone use it
correctly?
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.misc,comp.lang.c on Wed May 13 12:30:36 2026

From Newsgroup: comp.lang.c

scott@slp53.sl.home (Scott Lurndal) writes:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

[...]

I wonder about why you put Ada just in the "academic box".

Because the first ADA compiler I used came from NYU. :-)

It's Ada, not ADA. It's a person's name, not an acronym.

And along with C, it's one of the few languages whose name is a
hexadecimal palindrome.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Wed May 13 12:35:16 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10u0k0k$1l93l$30@dont-email.me>,

[...]

It's easy to get wrong. Other languages accommodate both
semantics using alternation in the selector arm. For example,
one might imagine an hypothetical syntax, something like:

switch (a) {
case 1 || 2 || 3 || 4: whatever();
default: other();
}

...with no `break` to end each `case`.

That's already valid syntax. If C's switch statement were to be
changed, it would have to use something that's currently a syntax
error. Perhaps something like

case 1, case 2, case 3, case 4: whatever();

Oh, I know, we could reuse the "static" keyword!

You couldn't use it to build Duff's Device, but I'm not sure
that even Duff would call that a loss.

A "better" switch statement might have an explicit fallthrough
construct. (bash's "case" statement has this, more or less.)
Or you could use goto (yeah, I know).
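
To make the "already valid syntax" remark concrete, a sketch with a made-up function `which_arm`: the label is an ordinary integer constant expression, and 1 || 2 || 3 || 4 evaluates to 1, so today the label means the same as "case 1:".

```c
/* The label below is legal C today, but it does not mean "1, 2, 3 or 4". */
int which_arm(int a)
{
    switch (a) {
    case 1 || 2 || 3 || 4:   /* logical OR of nonzero constants is 1 */
        return 1;            /* reached only when a == 1 */
    default:
        return 0;
    }
}
```

Giving that spelling a new meaning would silently change existing programs, which is why a replacement would have to be something that is currently a syntax error.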
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Wed May 13 13:11:59 2026

From Newsgroup: comp.lang.c

On 5/13/2026 2:02 AM, Janis Papanagnou wrote:

On 2026-05-13 05:47, Tim Rentsch wrote:

scott@slp53.sl.home (Scott Lurndal) writes:

[...]

The programming I do
(mainly kernel programming, SoC simulation,
firmware) all naturally require the fixed-width types.

Right. Code that interacts very closely with hardware is one of
those cases where the fixed-width types make sense.

Another common one - also "low-level" but different - are data types exchanged through communication protocols.

Big time!
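
A typical shape for that kind of code, as a sketch only: a 32-bit field written to the wire most significant octet first. The helper names `put_be32` and `get_be32` are invented here, and the byte order is an assumption about the protocol:

```c
#include <stdint.h>

/* Store v into out[0..3], most significant octet first. */
void put_be32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)((v >> 24) & 0xFFu);
    out[1] = (unsigned char)((v >> 16) & 0xFFu);
    out[2] = (unsigned char)((v >> 8) & 0xFFu);
    out[3] = (unsigned char)(v & 0xFFu);
}

/* Rebuild the value from four octets in the same order. */
uint32_t get_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

Shifting and masking like this avoids depending on the host's own byte order or on struct layout.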

[...]
--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Wed May 13 20:18:42 2026

From Newsgroup: comp.lang.c

In article <10u2jpk$2t96p$6@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10u0k0k$1l93l$30@dont-email.me>,

[...]

It's easy to get wrong. Other languages accommodate both
semantics using alternation in the selector arm. For example,
one might imagine an hypothetical syntax, something like:

switch (a) {
case 1 || 2 || 3 || 4: whatever();
default: other();
}

...with no `break` to end each `case`.

That's already valid syntax.

It wasn't meant to be taken as a serious suggestion!

If C's switch statement were to be
changed, it would have to use something that's currently a syntax
error. Perhaps something like

case 1, case 2, case 3, case 4: whatever();

Sure, that's better.

Oh, I know, we could reuse the "static" keyword!

Better yet, reuse auto!

(I'm kidding)

You couldn't use it to build Duff's Device, but I'm not sure
that even Duff would call that a loss.

A "better" switch statement might have an explicit fallthrough
construct. (bash's "case" statement has this, more or less.)
Or you could use goto (yeah, I know).

This reinforces the salient bit: one can imagine a better
construct, and there is prior art in other languages showing
that implicit fallthrough is not, strictly speaking, necessary.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.misc,comp.lang.c on Wed May 13 20:20:58 2026

From Newsgroup: comp.lang.c

In article <10u2jgv$2t96p$5@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

scott@slp53.sl.home (Scott Lurndal) writes:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

[...]

I wonder about why you put Ada just in the "academic box".

Because the first ADA compiler I used came from NYU. :-)

It's Ada, not ADA. It's a person's name, not an acronym.

Perhaps Scott was alluding to its design reminding one of the
era in which only upper case characters were available on one's
teletype. :-D

- Dan C.

(I'm kidding again. I actually don't think Ada is a terrible
language.)
--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Wed May 13 20:28:28 2026

From Newsgroup: comp.lang.c

In article <86mry3jgvi.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <864ikdp9lk.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <868q9ppg4o.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

antispam@fricas.org (Waldek Hebisch) writes:

[discussing the notion of "safe" programs]

As I wrote, safety is about ability to avoid or detect errors.

In the functional programming community the usual statement is
"Well-typed programs cannot go wrong."

This is only concerning _type safety_.

I didn't mean to imply anything different.

Looking at what you wrote:

|I think a good way of understanding this is that, if
|a program stays inside the safe limits of the language,
|the program can produce wrong answers, but it cannot
|produce meaningless answers.

You are wrong.

A well-typed program _can_ produce meaningless answers; those
answers will have a well-defined type, but it is impossible to
say whether the value produced has any meaning with respect to
the program's intended purpose. Moreover, the "safe limits of
the language", whatever those may be, have nothing to do with
it.

What you mean by meaningless isn't what I meant by meaningless.

What you wrote was provided with no definition, so it is
impossible to say what you meant.

It was evident in context, however, that what you wrote was not
at all related to what Milner meant by "meaningless" in the
context of his 1978 paper, "A Theory of Type Polymorphism in
Programming" that introduced the phrase.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Wed May 13 20:45:43 2026

From Newsgroup: comp.lang.c

In article <10u2hmc$2t96p$3@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <86lddnlvtr.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

[...]

Starting in C99, any mention of interrupts and signal handlers was
removed, along with the carveout.

This is wrong. Section 7.14 of C23 talks about signals and
signal handlers at length.

Obviously, but that's clearly not what Tim meant.

Sorry, but it wasn't at all clear to me.

His statement
was not wrong in context. (7.14 describes <signal.h>. It's not
plausible that Tim would think that had been removed.)

I disagree. The actual context was whether `longjmp` from a
signal handler is UB or not. His statement was either unrelated
or incorrect.

I never mentioned "interrupts" at all (traditionally, Unix
signals, which formed the basis for C signals, are not
interrupts in the conventional sense. Modern systems will
sometimes make use of interprocessor-interrupts to hasten their
delivery, however).

I think you are talking about _only_ the description of
`longjmp`. I am actually talking about the standard considered
in total. I only mentioned "non-nested" signal handler because
C90 was explicit in saying that `longjmp` from a _nested_
signal handler was UB.

Yes, Tim was clearly talking only about the description of longjmp.
His statement wasn't wrong, just restricted to a certain context.
C90's description of longjmp includes a paragraph about interrupts
and signals. C99 removed that paragraph.

Yes, that is a simple matter of fact.

But by itself it is only tangentially related to the topic at
hand. At best, his response was a non-sequitur. At a minimum,
he failed to properly understand the context before replying.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Wed May 13 21:22:43 2026

From Newsgroup: comp.lang.c

In article <10u2gqp$2t96p$2@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <8633zwm5h9.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

[...]

If int was only 16 bits, I expect promotion considerations didn't
come up very often.

Presumably they came up all the time; `char` was used as a small
integer frequently. But there was no `unsigned` type so
whether it was promoted to an `int` or `unsigned int` was moot.

Very early C didn't have unsigned int, but the signedness of char was effectively implementation-defined. From the 1975 C Reference Manual:

A char object may be used anywhere an int may be. In all
cases the char is converted to an int by propagating its sign
through the upper 8 bits of the resultant integer. This is
consistent with the two's complement representation used for
both characters and integers. (However, the sign-propagation
feature disappears in other implementations.)

In modern terms, the "other implementations" made plain char unsigned.

That's a separate issue, but raises an interesting point when it
comes to the early rationale for the integer promotion rules.

Since it was (and is) implementation-defined whether values of
type `char` would be treated as signed or not, it was (and is)
IB whether the value of a `char` is positive or negative.

With value-preserving promotion semantics, it doesn't matter:
in either event, the result would be a signed int with the same
value as the `char`. With unsigned-preserving, it was IB
whether the resulting value would be of type `signed int` or
`unsigned int`.

I don't know that it particularly matters all that much, but it
does seem like the sort of thing that may have figured into the
committee's decision. It's interesting that it doesn't seem to
be in the rationale in the section covering promotion semantics.
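
A sketch of where the two promotion schemes differ visibly, assuming int can represent every unsigned char value (true on mainstream implementations). The function names are hypothetical. Under the value-preserving rule that C adopted, unsigned char operands are promoted to signed int; the second function forces unsigned arithmetic to show what an unsigned-preserving rule would have produced:

```c
#include <limits.h>

/* Value-preserving: both operands become int, so 1 - 2 is -1. */
int diff_is_negative(unsigned char a, unsigned char b)
{
    return a - b < 0;
}

/* Same subtraction done in unsigned int: it wraps around instead of
   going negative, which is what unsigned-preserving would have given. */
int unsigned_diff_is_large(unsigned char a, unsigned char b)
{
    return (unsigned)a - (unsigned)b > (unsigned)INT_MAX;
}
```

Comparisons and divisions are where the choice shows; for addition and multiplication on two's complement hardware the bit patterns usually coincide.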

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Wed May 13 21:46:28 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10u2jpk$2t96p$6@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10u0k0k$1l93l$30@dont-email.me>,

[...]

It's easy to get wrong. Other languages accommodate both
semantics using alternation in the selector arm. For example,
one might imagine an hypothetical syntax, something like:

switch (a) {
case 1 || 2 || 3 || 4: whatever();
default: other();
}

...with no `break` to end each `case`.

That's already valid syntax.

It wasn't meant to be taken as a serious suggestion!

If C's switch statement were to be
changed, it would have to use something that's currently a syntax
error. Perhaps something like

case 1, case 2, case 3, case 4: whatever();

Sure, that's better.

case 1...4: whatever();

is a typical GCC extension (that we use heavily).
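
A sketch of the extension in use, assuming GCC or a compatible compiler such as Clang (it is not part of ISO C as of C23). The function `classify` is made up. Note that the GCC manual advises writing spaces around the "...", since `1...4` without them may be parsed wrong for integer values:

```c
/* GNU C case ranges next to ordinary stacked labels. */
int classify(int n)
{
    switch (n) {
    case 1 ... 4:        /* matches 1, 2, 3 and 4 */
        return 1;
    case 10:             /* non-contiguous values still need one label each */
    case 20:
    case 30:
        return 2;
    default:
        return 0;
    }
}
```

The range form only helps for contiguous values; the stacked labels in the second arm still rely on fallthrough.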
--- Synchronet 3.22a-Linux NewsLink 1.2

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Wed May 13 15:28:30 2026

From Newsgroup: comp.lang.c

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <86lddnlvtr.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

[...]

Starting in C99, any mention of interrupts and signal handlers was
removed, along with the carveout.

This is wrong. Section 7.14 of C23 talks about signals and
signal handlers at length.

Obviously, but that's clearly not what Tim meant. His statement
was not wrong in context. (7.14 describes <signal.h>. It's not
plausible that Tim would think that had been removed.)

I never mentioned "interrupts" at all (traditionally, Unix
signals, which formed the basis for C signals, are not
interrupts in the conventional sense. Modern systems will
sometimes make use of interprocessor-interrupts to hasten their
delivery, however).

I think you are talking about _only_ the description of
`longjmp`. I am actually talking about the standard considered
in total. I only mentioned "non-nested" signal handler because
C90 was explicit in saying that `longjmp` from a _nested_
signal handler was UB.

Yes, Tim was clearly talking only about the description of longjmp.
His statement wasn't wrong, just restricted to a certain context.
C90's description of longjmp includes a paragraph about interrupts
and signals. C99 removed that paragraph.

Right. Thank you for clarifying.
--- Synchronet 3.22a-Linux NewsLink 1.2

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Wed May 13 15:33:12 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <86lddnlvtr.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[...]

It is not clear to me that `longjmp` out of a non-nested signal
handler is still well-defined as of C11, though it is explicitly
stated to be in C89.

It seems you are misunderstanding what the standards are saying.

You read my post with insufficient care, and failed to
understand what I wrote, and are responding to something I did
not say.

The description of longjmp() says (paraphrasing) that it restores
the environment where the relevant setjmp() was done.

Yes.

There is
in C89 a passage about returning from signal handlers and so
forth, but that is followed by a carveout for nested signal
handlers, which in C89 is undefined behavior. (I assume that
also holds for C90 but I haven't verified that.)

Yes.

Aside: surely it is well known by now that the language in
C90 is verbatim identical to the language for C89 except for
some bits of the front matter that explain the provenance of the
standard originating from ANSI.

If you know of specific differences, or a reason this is known
to be incorrect, please point it out.

Starting in C99, any mention of interrupts and signal handlers was
removed, along with the carveout.

This is wrong. Section 7.14 of C23 talks about signals and
signal handlers at length.

I never mentioned "interrupts" at all (traditionally, Unix
signals, which formed the basis for C signals, are not
interrupts in the conventional sense. Modern systems will
sometimes make use of interprocessor-interrupts to hasten their
delivery, however).

I think you are talking about _only_ the description of
`longjmp`. I am actually talking about the standard considered
in total. I only mentioned "non-nested" signal handler because
C90 was explicit in saying that `longjmp` from a _nested_
signal handler was UB.

Because there is a definition
for what longjmp() does, the behavior is defined, and there is no
undefined behavior (not counting things like doing a longjmp()
with a jmp_buf that wasn't set up, etc). Removing the mention of
interrupts and signals, and also removing the carveout, only makes
longjmp() more defined, not less.

I don't think you understood my statement.

Read section 7.14 of C23 carefully; it is not at all obvious
that a `longjmp` out of a signal handler is not _a priori_ UB.
By my reading, it's the opposite, in fact: I see no way to do
so without invoking UB.

I was asked for an example, beyond the behavior of
`realloc(ptr, 0)` with respect to whether it frees `ptr` if
`ptr` is non-null, where something that was explicitly
guaranteed by an earlier version of the standard was changed to
UB in a later version. This appears another example of such a
case.

By all means, correct me if you think I am mistaken, but your
explanation above was based on your own misinterpretation, not
otherwise relevant to the statement I had made, and incorrect
in fact (the standard did _not_ remove mention of signals).

Note, in the case of `longjmp` and signal handlers, I suspect it
doesn't much matter, because anyone doing something like that is
almost invariably going to be targeting a system that conforms to
a standard like POSIX, which extends ISO C with stronger
guarantees for defined behavior in this specific area.

I replied to Keith Thompson's reply downthread.
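
For anyone skimming this subthread, a minimal sketch of the basic setjmp()/longjmp() mechanics paraphrased above, with no signal handler involved, so it takes no position on the disputed case. The names `env`, `bail_out` and `try_step` are made up for illustration:

```c
#include <setjmp.h>

static jmp_buf env;                 /* saved calling environment */

static void bail_out(int code)
{
    longjmp(env, code);             /* never returns to its caller */
}

/* Returns 0 on the normal path, -1 if control came back via longjmp(). */
int try_step(int code)
{
    if (setjmp(env) != 0)           /* setjmp() used as a controlling expression */
        return -1;                  /* second return from setjmp() */
    if (code != 0)
        bail_out(code);
    return 0;
}
```

Calling longjmp() with a jmp_buf whose setjmp() caller has already returned is undefined, which is one of the "not counting" cases mentioned in the quoted text.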
--- Synchronet 3.22a-Linux NewsLink 1.2

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Wed May 13 15:42:41 2026

From Newsgroup: comp.lang.c

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]

There is another difference worth noting. A byte is a unit of
storage, whereas octet is a measure of information. The word
byte is inherently about memory; the word octet is inherently
about value (eight bits of information). For this reason too
the name 'octet' is a better choice for a type name than 'byte'.

The words "octet" and "byte" mean different things.

If I were to typedef "byte" as unsigned char, it would be because I
want to emphasize the fact that a byte object holds one fundamental
unit of data, not necessarily character data. And I'd probably
use it in a way that doesn't assume it's 8 bits (unless I have a
good reason not to need portability). C's conflation of character
types with "bytes" is IMHO unfortunate; a typedef makes it clearer
what the type is being used for.

It could, if someone happens to be looking at the typedef. More
often than not what is being looked at is a use of the name, and
not the typedef. Readers don't always have time to look up where
the name is defined, and that's why a good choice of name matters.

I usually just use "unsigned char" and remember that it's one byte
(however many bits that is).

I must remember to start using "char unsigned" in preference to
"unsigned char". ;)
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Wed May 13 15:45:29 2026

From Newsgroup: comp.lang.c

scott@slp53.sl.home (Scott Lurndal) writes:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10u2jpk$2t96p$6@kst.eternal-september.org>,

[...]

If C's switch statement were to be
changed, it would have to use something that's currently a syntax
error. Perhaps something like

case 1, case 2, case 3, case 4: whatever();

Sure, that's better.

case 1 ... 4: whatever();

is a typical GCC extension (that we use heavily).

Yes, and the C2y draft adopts that syntax.

(One possible reason it wasn't adopted sooner is that `case 'a'...'z'`
doesn't necessarily work if the letters are not contiguous, for
example in EBCDIC.)

But the issue being discussed was that multiple cases (that may
not be contiguous) depend on the default fallthrough behavior.

case 10:
case 20:
case 30:
whatever();
break;

In a hypothetical C-like language without default fallthrough, it
would make sense to invent a different syntax. For C repeating the
"case" keyword is slightly ugly, but probably not worth fixing.
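
A minimal sketch of the stacked-label idiom described above, in standard C only. The function name and the return codes are made up for illustration; the GCC-style range form (`case 1 ... 4:`) is left out because it only covers contiguous values.

```c
/* Hypothetical classifier: several non-contiguous values share one arm. */
int classify(int code)
{
    switch (code) {
    case 10:
    case 20:
    case 30:
        /* The empty labels above fall through to this shared body. */
        return 1;
    case 7:
        return 2;
    default:
        return 0;
    }
}
```

Here `classify(10)`, `classify(20)` and `classify(30)` all take the same arm without any explicit fallthrough marker, since no statement separates the labels.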
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Wed May 13 16:33:11 2026

From Newsgroup: comp.lang.c

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 2026-05-13 00:35, Keith Thompson wrote:

[Dropping comp.lang.misc, since this is only about C.]
Bart <bc@freeuk.com> writes:

[...]

[...]

So the inconvenience of how 'switch' works is excused because
/sometimes/ you need fallthrough, or the one time in a thousand you
need Duff's device.

I don't see any inconvenience in "how it works"; it actually
allows programmers to implement both semantics as needed. And
both semantics were needed, they have been used. (Even if you
think your projection of your preferences and limited uses is
what should constitute the global software development world.)

I don't use switch() very often. Out of curiosity I looked
through some source code that I worked on recently and checked
all the switch() statements. Out of roughly 10,000 lines of .c
files, there were

10 switch() statements, with
70 "arms" total (either non-trivial 'case's or 'default's)
9 duplicate labels (fallthrough with no extra actions taken)
13 breaks needed

So for this sample 'break;' and no 'break;' were roughly equally
common, and it was useful to have both.
--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Wed May 13 23:56:22 2026

From Newsgroup: comp.lang.c

In article <86ecjfj5xz.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <86lddnlvtr.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[...]

It is not clear to me that `longjmp` out of a non-nested signal
handler is still well-defined as of C11, though it is explicitly
stated to be so in C89.

It seems you are misunderstanding what the standards are saying.

You read my post with insufficient care, and failed to
understand what I wrote, and are responding to something I did
not say.

The description of longjmp() says (paraphrasing) that it restores
the environment where the relevant setjmp() was done.

Yes.

There is
in C89 a passage about returning from signal handlers and so
forth, but that is followed by a carveout for nested signal
handlers, which in C89 is undefined behavior. (I assume that
also holds for C90 but I haven't verified that.)

Yes.

Aside: surely it is well-known by now that the language in
C90 is verbatim identical to the language for C89 except for
some bits of the front matter that explain the provenance of the
standard originating from ANSI.

If you know of specific differences, or a reason this is known
to be incorrect, please point it out.

Starting in C99, any mention of interrupts and signal handlers was
removed, along with the carveout.

This is wrong. Section 7.14 of C23 talks about signals and
signal handlers at length.

I never mentioned "interrupts" at all (traditionally, Unix
signals, which formed the basis for C signals, are not
interrupts in the conventional sense. Modern systems will
sometimes make use of interprocessor-interrupts to hasten their
delivery, however).

I think you are talking about _only_ the description of
`longjmp`. I am actually talking about the standard considered
in total. I only mentioned "non-nested" signal handler because
C90 was explicit in saying that `longjmp` from a _nested_
signal handler was UB.

Because there is a definition
for what longjmp() does, the behavior is defined, and there is no
undefined behavior (not counting things like doing a longjmp()
with a jmp_buf that wasn't set up, etc). Removing the mention of
interrupts and signals, and also removing the carveout, only makes
longjmp() more defined, not less.

I don't think you understood my statement.

Read section 7.14 of C23 carefully; it is not at all obvious
that a `longjmp` out of a signal handler is not _a priori_ UB.
By my reading, it's the opposite, in fact: I see no way to do
so without invoking UB.

I was asked for an example, beyond the behavior of
`realloc(ptr, 0)` with respect to whether it frees `ptr` if
`ptr` is non-null, where something that was explicitly
guaranteed by an earlier version of the standard was changed to
UB in a later version. This appears to be another example of such a
case.

By all means, correct me if you think I am mistaken, but your
explanation above was based on your own misinterpretation, not
otherwise relevant to the statement I had made, and incorrect
in fact (the standard did _not_ remove mention of signals).

Note, in the case of `longjmp` and signal handlers, I suspect it
doesn't much matter, because anyone doing something like that is
almost invariably going to be targeting a system that conforms to
a standard like POSIX, which extends ISO C with stronger guarantees
for defined behavior in this specific area.
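
As a rough illustration of the POSIX route mentioned above (not of strictly conforming ISO C), here is a minimal sketch using `sigsetjmp` and `siglongjmp`. The names `recovery_point`, `on_signal` and `run_with_recovery` are hypothetical, and `SIGUSR1` is an arbitrary choice of signal; the sketch assumes a single-threaded program on a POSIX.1-2008 system.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical jump target shared by the handler and the caller. */
static sigjmp_buf recovery_point;

/* Handler that leaves via siglongjmp instead of returning normally. */
static void on_signal(int signo)
{
    (void)signo;
    siglongjmp(recovery_point, 1);
}

/* Returns 1 if control came back through the handler, 0 if the signal
 * was never delivered, and -1 if the handler could not be installed. */
int run_with_recovery(void)
{
    struct sigaction sa;

    sa.sa_handler = on_signal;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    /* A nonzero second argument asks for the signal mask to be saved,
     * so siglongjmp restores it on the way out of the handler. */
    if (sigsetjmp(recovery_point, 1) != 0)
        return 1;

    raise(SIGUSR1);
    return 0;
}
```

Whether the same pattern written with plain `setjmp`, `longjmp` and `signal` has defined behavior under ISO C alone is exactly the question under discussion; the sketch does not settle it.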

I replied to Keith Thompson's reply downthread.

To what end?

You did not engage with the actual topic (which is, again, that
it appears that `longjmp` from a signal handler is now UB in
strictly conforming C, whereas it was not previously). Neither
your messages on this subthread, nor your response to Keith,
are related to that. I am genuinely confused as to why you
would respond at all, if you do not intend to address the topic.

I am, frankly, baffled as to what you were, and are, trying to
convey.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Thu May 14 02:59:43 2026

From Newsgroup: comp.lang.c

On 2026-05-13 16:31, Dan Cross wrote:

In article <10u1j2h$1l93l$31@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-12 16:33, Bart wrote:

[snip]
But shouldn't people be expected to learn the rules?

Programmers should certainly learn, know, apply, and obey the rules.

(If you don't understand that you may try to transform that truism
to your "car example".)

Programmers _should_ absolutely learn the rules. But in C,
there are many of them, and some of them are deceptively subtle.

We agreed.

_A_ rule that programmers can remember quite easily, however,
is that parentheses generally carry very high precedence, and
so when in doubt, wrapping something in parens can aid
understanding (for the programmer and the maintainer).

I agree.

The key
is to find balance between extreme terseness and extreme
verbosity, both of which can feel obfuscating.

First, don't forget that there was no problem with precedence
existing in Bart's post; it was just an overloaded and badly
formatted composition in an example of ternary conditionals.

Now back to your statement. The point is that precedence rules
vary between programming languages. Folks can usually rely on
the precedence of * and / compared to + and - . But being a
computer scientist there's also other characteristics one can
assume with respect to typical types; but weighed against the
design decisions of the language. For example I can live with
the difference between Pascal's and C's operator precedence, even
though they differ. But it's harder to live with a discrepancy,
a mis-ranking of a class of operators in "C". (I noticed that
already when I read K&R some time around 1985, but I first saw
that "officially" acknowledged not too long ago when someone
posted a link to a paper from, IIRC, some time in the 1990's
written by one of the authors of "C".) - And that discrepancy
detail in C's precedence ranking was actually the only reason
for me looking "regularly" into the precedence table of my K&R.
(The point is that - with the exception of & ^ | - the ranking
makes perfect sense and should be easily usable without doubt
by a concept-knowing programmer. But note that, historically,
a sort of "rationale" can be formulated for the discrepancy to
justify the given choice in context of specifically "C". But
still remember the "official" acknowledgement of an issue here.)

There was a time when I knew and had memorized the precedence of
all operators in C. I remember most, but have forgotten some
that I use less frequently; I suspect many programmers are in
the same (or a similar) situation. If I am writing code and can
not immediately remember the precedence of some operator in some
expression, I apply parentheses.

Depending on the complexity of expressions that is a sensible
approach. (I do that as well where I think that it aids clarity.)

Janis

--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Thu May 14 03:14:04 2026

From Newsgroup: comp.lang.c

On 2026-05-13 17:20, Dan Cross wrote:

In article <10u1emq$1l93k$13@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-13 05:47, Tim Rentsch wrote:

scott@slp53.sl.home (Scott Lurndal) writes:

[...]

The programming I do
(mainly kernel programming, SoC simulation,
firmware) all naturally require the fixed-width types.

Right. Code that interacts very closely with hardware is one of
those cases where the fixed-width types make sense.

Another common one - also "low-level" but different - are data types
exchanged through communication protocols.

Yes, in particular, networking protocols are often described in
terms of "octets", since many protocols date from the era in
which machines with differently sized bytes were still common.
E.g., much of the early work presaging TCP/IP was done on DEC
PDP-10 machines, which were 36-bit, word-oriented computers.

However, when discussing protocols (or hardware peripherals on
the local system, for that matter) it is important to exercise
care with respect to ordering of octets within multi-octet
data. For instance, IP networking "on the wire" uses Big-Endian
ordering to represent the fields in the IP datagram header,
while a processor might use Little-endian natively. Hence, one
must be sensitive to transforming between the two. It may be
easier to leave the packet data in an octet buffer, and extract
the fields one is interested in on the host from that.
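
A minimal sketch of the "leave it in an octet buffer" approach, assuming 8-bit bytes and big-endian fields on the wire. The helper names `load_be16` and `load_be32` are made up for illustration and are not from any particular library.

```c
#include <stdint.h>

/* Read a 16-bit big-endian field starting at p, whatever the host order. */
uint16_t load_be16(const unsigned char *p)
{
    /* Widen to unsigned before shifting so the arithmetic is never done
     * in a signed int. */
    return (uint16_t)(((unsigned)p[0] << 8) | (unsigned)p[1]);
}

/* Read a 32-bit big-endian field starting at p. */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}
```

For instance, with the first octets of an IPv4 header in `buf`, `load_be16(buf + 2)` would yield the total-length field. Because the value is assembled from individual octets, no byte swap or aligned access is needed on the host.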

Well, I was developing software in the ISO/OSI universe, not so
much in the IETF/IP world. Endianness on the protocol level was
inherently no issue with (for example) the ASN.1/BER standards.
The "OSI-libraries" we used did the mapping from/to the machine
format. For our own local (non-OSI) protocols between different
systems we used existing functions (htonl, nltoh, etc.) for the
correct data-mapping.

Janis

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Thu May 14 01:39:13 2026

From Newsgroup: comp.lang.c

In article <10u36pv$1l93k$18@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-13 16:31, Dan Cross wrote:

In article <10u1j2h$1l93l$31@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-12 16:33, Bart wrote:

[snip]
But shouldn't people be expected to learn the rules?

Programmers should certainly learn, know, apply, and obey the rules.

(If you don't understand that you may try to transform that truism
to your "car example".)

Programmers _should_ absolutely learn the rules. But in C,
there are many of them, and some of them are deceptively subtle.

We agreed.

_A_ rule that programmers can remember quite easily, however,
is that parentheses generally carry very high precedence, and
so when in doubt, wrapping something in parens can aid
understanding (for the programmer and the maintainer).

I agree.

The key
is to find balance between extreme terseness and extreme
verbosity, both of which can feel obfuscating.

First, don't forget that there was no problem with precedence
existing in Bart's post; it was just an overloaded and badly
formatted composition in an example of ternary conditionals.

I didn't say that there was. In fact and intent, my post had no
basis in Bart's snippet at all, but in looking at it now, I
think that code is an example of being _too_ terse.

Now back to your statement. The point is that precedence rules
vary between programming languages. Folks can usually rely on
the precedence of * and / compared to + and - .

I can think of at least two languages where you could not, but
yeah, that is usually true.

But being a
computer scientist there's also other characteristics one can
assume with respect to typical types; but weighed against the
design decisions of the language. For example I can live with
the difference between Pascal's and C's operator precedence, even
though they differ. But it's harder to live with a discrepancy,
a mis-ranking of a class of operators in "C". (I noticed that
already when I read K&R some time around 1985, but I first saw
that "officially" acknowledged not too long ago when someone
posted a link to a paper from, IIRC, some time in the 1990's
written by one of the authors of "C".) - And that discrepancy
detail in C's precedence ranking was actually the only reason
for me looking "regularly" into the precedence table of my K&R.
(The point is that - with the exception of & ^ | - the ranking
makes perfect sense and should be easily usable without doubt
by a concept-knowing programmer. But note that, historically,
a sort of "rationale" can be formulated for the discrepancy to
justify the given choice in context of specifically "C". But
still remember the "official" acknowledgement of an issue here.)

I'm not sure what issue you are referring to, but I infer it has
to do with shifts and the bit-wise binary operators. I agree;
those are a mess.
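
To make the & ^ | point concrete, here is a small sketch (function names are hypothetical). In C the bitwise operators bind less tightly than == and the relational operators, so a mask test written without parentheses does not group the way most readers expect. The second function spells out the compiler's grouping explicitly rather than relying on it.

```c
/* Intended test: nonzero when the low bit of x is clear. */
int low_bit_clear(unsigned x)
{
    return (x & 1u) == 0;
}

/* What `x & 1u == 0` actually means: == binds tighter than &, so the
 * expression groups as x & (1u == 0), which is x & 0, which is 0. */
int low_bit_clear_as_parsed(unsigned x)
{
    return x & (1u == 0);
}
```

For `x == 2u` the first function yields 1 and the second yields 0, which is why this is the one spot in the table where adding parentheses is more than a matter of taste.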

There was a time when I knew and had memorized the precedence of
all operators in C. I remember most, but have forgotten some
that I use less frequently; I suspect many programmers are in
the same (or a similar) situation. If I am writing code and can
not immediately remember the precedence of some operator in some
expression, I apply parentheses.

Depending on the complexity of expressions that is a sensible
approach. (I do that as well where I think that it aids clarity.)

Yes. Also, sometimes projects have code standards that must be
obeyed that mandate use (or absence) of parentheses; sometimes a
consensus among more than one programmer is required.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Thu May 14 03:39:38 2026

From Newsgroup: comp.lang.c

On 2026-05-13 20:00, Tim Rentsch wrote:

There is another difference worth noting. A byte is a unit of
storage, whereas octet is a measure of information. The word
byte is inherently about memory; the word octet is inherently
about value (eight bits of information). For this reason too
the name 'octet' is a better choice for a type name than 'byte'.

Well, I have a slightly different view; I suppose it's cultural.

I often see, specifically from the Anglo-American culture, that
they talk about, say, "8 bits"; and this has partly culturally
also spread across the ocean. - Here we try to distinguish the
units and the "metal"; the latter are formally substantives and
written with a capital letter. So we have units of "1 bit" or
"5 bit" entities (no 's' at the end). But seen as "metal" we
speak about "one Bit" or "five Bits" - although it's somewhat
quirky to imagine a thing that is physically "5 Bit", mostly it
is more accurate to say it's an entity of "5 bit" - and similar
with "1 byte". Because we use that also as _unit_ for 8 bit
entities. It gets complicated by us addressing the unit 'bit'
by a name, which is then "Bit". So the more accurate forms for
the _units_ are 5 bit or 8 byte. - As said, we may culturally
see that differently, and colloquially you nowadays also often
hear "5 Bits" or "8 Bytes" (as pluralized substantive), so it's
cumbersome to argue about that. - Only that "byte" is also a
unit (and not necessarily associated with memory) seems to be
our difference in how we view that.

Janis

--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Thu May 14 03:46:25 2026

From Newsgroup: comp.lang.c

On 2026-05-14 00:42, Tim Rentsch wrote:

I must remember to start using "char unsigned" in preference to
"unsigned char". ;)

Despite the smiley I can't really interpret that. So an honest
question; is there a difference in those two, or what do you
want to express by that?
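
On the purely technical side of that question: as far as the language is concerned, type specifiers may appear in any order, so the two spellings name the same type. A small C11 sketch that lets the compiler confirm it (the typedef and function names are hypothetical):

```c
/* Two spellings of the same type; the names are for illustration only. */
typedef unsigned char octet_a;
typedef char unsigned octet_b;

/* _Generic selects on the exact type, with no promotion applied, so this
 * assertion only holds if octet_b really is unsigned char. */
_Static_assert(_Generic((octet_b)0, unsigned char: 1, default: 0),
               "char unsigned should be the same type as unsigned char");

/* A pointer to one typedef can point at an object of the other without a
 * cast, because the types are identical. */
int read_through_other_spelling(void)
{
    octet_a value = 200;
    octet_b *p = &value;
    return *p;
}
```

So any difference between the two spellings is one of style or emphasis, not of meaning to the compiler.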

Janis

--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Thu May 14 03:50:25 2026

From Newsgroup: comp.lang.c

On 2026-05-14 03:14, Janis Papanagnou wrote:

Well, I was developing software in the ISO/OSI universe, not so
much in the IETF/IP world. Endianness on the protocol level was
inherently no issue with (for example) the ASN.1/BER standards.
The "OSI-libraries" we used did the mapping from/to the machine
format. For our own local (non-OSI) protocols between different
systems we used existing functions (htonl, nltoh, etc.) for the

Oops! - htonl, ntohl, etc.

correct data-mapping.

Janis

--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Thu May 14 03:57:30 2026

From Newsgroup: comp.lang.c

On 2026-05-14 03:39, Dan Cross wrote:

In article <10u36pv$1l93k$18@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

Now back to your statement. The point is that precedence rules
vary between programming languages. Folks can usually rely on
the precedence of * and / compared to + and - .

I can think of at least two languages where you could not, but
yeah, that is usually true.

Are you thinking about languages like Algol 68 where you
can explicitly define and re-define operator precedence,
or do you mean languages where they made just bad design
decisions?

Janis

--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Thu May 14 04:30:19 2026

From Newsgroup: comp.lang.c

On 2026-05-13 21:28, Keith Thompson wrote:

[...]

[...]
Yes, the fact that "break;" is used both for loops and for switch
statements is inconvenient. (csh uses "breaksw" for switch
statements.)

If we'd have two distinct keywords (in a language, BTW, that tried
to avoid too many keywords in the first place!) there would be
complaints (and foremost from Bart, for sure); why they decided to
use two keywords to basically do "the same" thing, namely leaving
a control structure.

I don't see how using 'break' in more than one context would be in
any way "inconvenient".

Janis

[...]

--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Thu May 14 04:48:39 2026

From Newsgroup: comp.lang.c

On 2026-05-13 21:35, Keith Thompson wrote:

A "better" switch statement might have an explicit fallthrough
construct. (bash's "case" statement has this, more or less.)

This is actually Kornshell's way (only very lately adopted by Bash).

In these shells we have syntactical forms for _both_ cases; the ';;'
symbol is used to get 'break' semantics, and the ';&' symbol to fall
through.

It's also in other respects more powerful; e.g. in using alternatives
like (a|b|c) as targets, or compare against patterns (a*|*b[uv]) .

Or you could use goto (yeah, I know).

A lot is possible; there's no need to get pathological here. :-)

Janis

--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Thu May 14 04:56:10 2026

From Newsgroup: comp.lang.c

On 2026-05-13 17:12, Scott Lurndal wrote:

We use:

# if __GNUC__ >= 7 // 'statement attributes' were new with GCC 7.x
# if defined(__cplusplus) && (__cplusplus >= 201103L) // C++11 or greater
# define XXX_FALLTHROUGH [[gnu::fallthrough]]
# else
# define XXX_FALLTHROUGH __attribute__ ((fallthrough))
# endif
# else // GCC 4.x, 5.x, 6.x, comment only!
# define XXX_FALLTHROUGH /* Fall Through */
# endif

Where 'XXX' is replaced by the app name.

switch (variable) {
case cond1:
break;
case cond:
do something
XXX_FALLTHROUGH;
default:
do something else
}

Just a note aside; couldn't the XXX be automatically concatenated using
the CPP features? (I seem to recall we've done such things back then.)

I also wonder about the app-specific variants; wouldn't one version for
all apps have sufficed?

Janis

--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Wed May 13 19:58:39 2026

From Newsgroup: comp.lang.c

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 2026-05-13 21:28, Keith Thompson wrote:

[...]

[...]
Yes, the fact that "break;" is used both for loops and for switch
statements is inconvenient. (csh uses "breaksw" for switch
statements.)

If we'd have two distinct keywords (in a language, BTW, that tried
to avoid too many keywords in the first place!) there would be
complaints (and foremost from Bart, for sure); why they decided to
use two keywords to basically do "the same" thing, namely leaving
a control structure.

I don't see how using 'break' in more than one context would be in
any way "inconvenient".

With nested loops, "break" or "continue" always refers to the innermost
loop. With a switch statement inside a loop, "break" refers to the
switch statement, but "continue" refers to the loop.

It's obviously not impossible to deal with, but I find it mildly
annoying.
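
A minimal sketch of that asymmetry (the function name is made up for illustration). Inside a switch nested in a loop, `continue` skips the rest of the loop body, while `break` only leaves the switch and execution carries on with whatever follows it inside the loop.

```c
/* Counts how many iterations reach the statement after the switch. */
int count_reaching_tail(const int *v, int n)
{
    int reached = 0;

    for (int i = 0; i < n; i++) {
        switch (v[i]) {
        case 0:
            continue;   /* applies to the for loop: the tail is skipped */
        default:
            break;      /* leaves only the switch, not the loop */
        }
        reached++;      /* runs after 'break', never after 'continue' */
    }
    return reached;
}
```

With the input {1, 0, -1, 2} the tail runs three times: every element except the zero gets there via `break`.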
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From tTh@tth@none.invalid to comp.lang.c on Thu May 14 07:47:16 2026

From Newsgroup: comp.lang.c

On 5/14/26 03:39, Janis Papanagnou wrote:

although it's somewhat
quirky to imagine a thing that is physically "5 Bit"

Is data on the wire a physical thing ?

https://en.wikipedia.org/wiki/Baudot_code
--
** **
* tTh des Bourtoulots *
* http://maison.tth.netlib.re/ *
** **
--- Synchronet 3.22a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Thu May 14 00:38:58 2026

From Newsgroup: comp.lang.c

On 5/13/2026 3:42 PM, Tim Rentsch wrote:

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]

There is another difference worth noting. A byte is a unit of
storage, whereas octet is a measure of information. The word
byte is inherently about memory; the word octet is inherently
about value (eight bits of information). For this reason too
the name 'octet' is a better choice for a type name than 'byte'.

The words "octet" and "byte" mean different things.

If I were to typedef "byte" as unsigned char, it would be because I
want to emphasize the fact that a byte object holds one fundamental
unit of data, not necessarily character data. And I'd probably
use it in a way that doesn't assume it's 8 bits (unless I have a
good reason not to need portability). C's conflation of character
types with "bytes" is IMHO unfortunate; a typedef makes it clearer
what the type is being used for.

It could, if someone happens to be looking at the typedef. More
often than not what is being looked at is a use of the name, and
not the typedef. Readers don't always have time to look up where
the name is defined, and that's why a good choice of name matters.

I usually just use "unsigned char" and remember that it's one byte
(however many bits that is).

I must remember to start using "char unsigned" in preference to
"unsigned char". ;)

Nothing wrong with unsigned char? right? ;^o
--- Synchronet 3.22a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Thu May 14 00:39:48 2026

From Newsgroup: comp.lang.c

On 5/14/2026 12:38 AM, Chris M. Thomasson wrote:

On 5/13/2026 3:42 PM, Tim Rentsch wrote:

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]

There is another difference worth noting. A byte is a unit of
storage, whereas octet is a measure of information. The word
byte is inherently about memory; the word octet is inherently
about value (eight bits of information). For this reason too
the name 'octet' is a better choice for a type name than 'byte'.

The words "octet" and "byte" mean different things.

If I were to typedef "byte" as unsigned char, it would be because I
want to emphasize the fact that a byte object holds one fundamental
unit of data, not necessarily character data. And I'd probably
use it in a way that doesn't assume it's 8 bits (unless I have a
good reason not to need portability). C's conflation of character
types with "bytes" is IMHO unfortunate; a typedef makes it clearer
what the type is being used for.

It could, if someone happens to be looking at the typedef. More
often than not what is being looked at is a use of the name, and
not the typedef. Readers don't always have time to look up where
the name is defined, and that's why a good choice of name matters.

I usually just use "unsigned char" and remember that it's one byte
(however many bits that is).

I must remember to start using "char unsigned" in preference to
"unsigned char". ;)

Nothing wrong with unsigned char? right? ;^o

I read from right to left here...
--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Thu May 14 09:40:00 2026

From Newsgroup: comp.lang.c

On 2026-05-14 04:58, Keith Thompson wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 2026-05-13 21:28, Keith Thompson wrote:

[...]

[...]
Yes, the fact that "break;" is used both for loops and for switch
statements is inconvenient. (csh uses "breaksw" for switch
statements.)

If we'd have two distinct keywords (in a language, BTW, that tried
to avoid too many keywords in the first place!) there would be
complaints (and foremost from Bart, for sure); why they decided to
use two keywords to basically do "the same" thing, namely leaving
a control structure.

I don't see how using 'break' in more than one context would be in
any way "inconvenient".

With nested loops, "break" or "continue" always refers to the innermost
loop. With a switch statement inside a loop, "break" refers to the
switch statement, but "continue" refers to the loop.

It's obviously not impossible to deal with, but I find it mildly
annoying.

Well, okay, I see what you mean; when including 'continue' into the
consideration it appears inconsistent. - Other languages have more
flexible "continues" and "breaks"; Kornshell, for example, addresses
the limitation you describe, it allows 'break' (and also 'continue')
to target more than just one nesting-level if an appropriate integer
parameter is supplied. (But the Shell also doesn't use break for the
switch statements. - An old tool but not that bad in many respects.)

Janis

--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Thu May 14 09:54:00 2026

From Newsgroup: comp.lang.c

On 2026-05-14 07:47, tTh wrote:

On 5/14/26 03:39, Janis Papanagnou wrote:

although it's somewhat
quirky to imagine a thing that is physically "5 Bit"

Is data on the wire a physical thing ?

These terms are a bit peculiar in the given context, but I'd
say yes, they are sort of a "physical thing". - Certainly so
if you want them differentiated from the abstract unit "bit".

https://en.wikipedia.org/wiki/Baudot_code

Not sure what you want to tell us with that link.

But since you mention the Baudot_code; you certainly remember
the unit 'baud'; it's basically a symbol rate, not a bit-rate,
and only under specific constraints is an equivalence relation
to bit-rate possible. - How would an analog modulated signal of
a symbol be expressed in "bits"? Just BTW.

Janis

--- Synchronet 3.22a-Linux NewsLink 1.2

From David Brown@david.brown@hesbynett.no to comp.lang.c on Thu May 14 11:43:52 2026

From Newsgroup: comp.lang.c

On 13/05/2026 17:52, Dan Cross wrote:

In article <10u24l5$2oaav$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

On 13/05/2026 16:31, Dan Cross wrote:

In article <10u1j2h$1l93l$31@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-12 16:33, Bart wrote:

[snip]
But shouldn't people be expected to learn the rules?

Programmers should certainly learn, know, apply, and obey the rules.

(If you don't understand that you may try to transform that truism
to your "car example".)

Programmers _should_ absolutely learn the rules. But in C,
there are many of them, and some of them are deceptively subtle.

_A_ rule that programmers can remember quite easily, however,
is that parentheses generally carry very high precedence, and
so when in doubt, wrapping something in parens can aid
understanding (for the programmer and the maintainer). The key
is to find balance between extreme terseness and extreme
verbosity, both of which can feel obfuscating.

There was a time when I knew and had memorized the precedence of
all operators in C. I remember most, but have forgotten some
that I use less frequently; I suspect many programmers are in
the same (or a similar) situation. If I am writing code and can
not immediately remember the precedence of some operator in some
expression, I apply parentheses.

I don't think it is necessary to /learn/ all the rules of a language -
but it is necessary to be aware of them, and to know how well you know
them. It's fine not to be sure of all the precedence rules in a
language (and some languages have many more operators than C, or
stranger precedence rules). You only need to know the ones you rely on
regularly, and the ones you have to read regularly. If you occasionally
come across something different, then you can look it up. There's no
point in filling your head with knowledge that you almost never need.

So there is usually no need to know the precedence rules for mixing
relational operators, shift operators and bitwise and/or operators, or
whatever, if you put parentheses in your own code or split the complex
expression into multiple variables. (With the caveat that you mentioned
earlier that both too few and too many parentheses make code harder to
understand.)

But you might have to understand code written which relies on more of
the details - you need to be aware of what you know, and what you have
to look up, in order to understand the code. The risk comes not from
ignorance of the precedence rules, but from thinking you know them when
you have misremembered them. Self-awareness of your own knowledge,
along with convenient and reliable references, is vital.

Yes, I agree. The key is knowing when it's time to go to look
at a reference.

I like the way you put it.

I might go a bit further and say that it's fine not to know
every rule, but there's a qualitative difference between
acknowledging that and knowing that easy access to a reliable
reference is useful, and steadfastly refusing to learn the rules
because one considers them poor to begin with.

Of course.

There is also often a dangerous point in learning anything (not just a
programming language), where you have learned enough to think you have
"grokked" the subject but don't yet realise how little you actually
know. You have to pass that hump in the learning curve as fast as you
can - some people get stuck there, and that's when they start believing
things like "C is portable assembly".

From David Brown@david.brown@hesbynett.no to comp.lang.c on Thu May 14 11:49:39 2026

From Newsgroup: comp.lang.c

On 14/05/2026 03:57, Janis Papanagnou wrote:

On 2026-05-14 03:39, Dan Cross wrote:

In article <10u36pv$1l93k$18@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

Now back to your statement. The point is that precedence rules
vary between programming languages. Folks can usually rely on
the precedence of * and / compared to + and - .

I can think of at least two languages where you could not, but
yeah, that is usually true.

Are you thinking about languages like Algol 68 where you
can explicitly define and re-define operator precedence,
or do you mean languages where they made just bad design
decisions?

Janis

I believe APL does not have operator precedences, though I have never
written more than a one-line program in the language.

And in Forth, operators are all post-fix, so there are no precedences
there either.

But I'm curious which two languages Dan was referring to. (My guess is
that APL was one of them, but I don't know the other.)

From David Brown@david.brown@hesbynett.no to comp.lang.c on Thu May 14 12:08:35 2026

From Newsgroup: comp.lang.c

On 13/05/2026 16:57, Dan Cross wrote:

In article <10u0k0k$1l93l$30@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

[snip]
Bart <bc@freeuk.com> writes:
[...]

So the inconvenience of how 'switch' works is excused because
/sometimes/ you need fallthrough, or the one time in a thousand you
need Duff's device.

I don't see any inconvenience in "how it works"; it actually
allows programmers to implement both semantics as needed. And
both semantics were needed, they have been used. (Even if you
think your projection of your preferences and limited uses is
what should constitute the global software development world.)

It's easy to get wrong. Other languages accommodate both
semantics using alternation in the selector arm. For example,
one might imagine a hypothetical syntax, something like:

switch (a) {
case 1 || 2 || 3 || 4: whatever();
default: other();
}

...with no `break` to end each `case`.

You couldn't use it to build Duff's Device, but I'm not sure
that even Duff would call that a loss.

- Dan C.
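
For comparison, the closest thing standard C offers today is stacking
several case labels on one body. A minimal sketch follows; the function
names are made up for illustration. Note that `case 1 || 2 || 3 || 4:`
already compiles as C, but the label is the constant expression
1 || 2 || 3 || 4, which is just 1.

```c
#include <assert.h>

/* Hypothetical helper: 1 for any of the grouped selectors, else 0. */
int grouped(int a)
{
    switch (a) {
    case 1:             /* stacked labels share the body below */
    case 2:
    case 3:
    case 4:
        return 1;
    default:
        return 0;
    }
}

/* Looks like alternation, but the label evaluates to the constant 1. */
int only_label_one(int a)
{
    switch (a) {
    case 1 || 2 || 3 || 4:
        return 1;
    default:
        return 0;
    }
}

/* Small self-check showing the difference. */
void grouped_demo(void)
{
    assert(grouped(3) == 1);
    assert(only_label_one(1) == 1);
    assert(only_label_one(3) == 0);   /* 3 is not matched */
}
```

Some compilers may warn about the logical operator on constants in the
second function, which is rather the point.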

Anyone curious about how far C's switch statements can be used or
abused, might like to read about "Protothreads" :

<https://en.wikipedia.org/wiki/Protothread>

This is a conglomeration of Duff's Device on steroids with supporting
macros that gives you a limited type of stackless cooperative
multitasking with extremely low overhead. The library has seen real
usage in small embedded systems. Reactions to the underlying
implementation range from thinking it is a hideous abuse of a bad
language design, to elegant and very ingenious.
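
A minimal sketch of the underlying trick, assuming nothing about the
real library: the CO_* macros and the counter example below are
hypothetical names, not the Protothreads API. The resume point is
stored as a line number, and on the next call the switch jumps back
into the middle of the loop.

```c
#include <assert.h>

/* Hypothetical macros in the spirit of the technique (not the real API). */
#define CO_BEGIN(state)   switch (*(state)) { case 0:
#define CO_YIELD(state, value) \
    do { *(state) = __LINE__; return (value); case __LINE__:; } while (0)
#define CO_END(state)     } *(state) = 0; return -1

/* All data that must survive a yield lives here, not in locals. */
struct counter {
    int line;   /* resume point; 0 means "start from the top" */
    int i;      /* loop variable, preserved across calls */
};

/* Yields 0, 1, 2 on successive calls, then -1 and starts over. */
int counter_next(struct counter *c)
{
    CO_BEGIN(&c->line);
    for (c->i = 0; c->i < 3; c->i++) {
        CO_YIELD(&c->line, c->i);
    }
    CO_END(&c->line);
}

/* Usage: each call resumes where the previous one returned. */
void counter_demo(void)
{
    struct counter c = { 0, 0 };
    assert(counter_next(&c) == 0);
    assert(counter_next(&c) == 1);
    assert(counter_next(&c) == 2);
    assert(counter_next(&c) == -1);   /* finished, state reset */
}
```

The case label sits inside both a do-while and a for loop, which is
exactly the Duff's Device property that a structured switch would not
allow.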

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Thu May 14 10:22:00 2026

From Newsgroup: comp.lang.c

In article <10u3a6a$34b89$3@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-14 03:39, Dan Cross wrote:

In article <10u36pv$1l93k$18@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

Now back to your statement. The point is that precedence rules
vary between programming languages. Folks can usually rely on
the precedence of * and / compared to + and - .

I can think of at least two languages where you could not, but
yeah, that is usually true.

Are you thinking about languages like Algol 68 where you
can explicitly define and re-define operator precedence,
or do you mean languages where they made just bad design
decisions?

No. troff and the 7th Edition Unix assembler (the latter has
infix expression syntax for the calculation of immediate values
and so on, but that is parsed left to right, and does not follow
the usual rules of arithmetic).
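
To make "parsed left to right" concrete, here is a small sketch of an
evaluator with no precedence at all. It is an illustration only, not
code from troff or the 7th Edition assembler; eval_left_to_right is a
made-up name, and it handles only non-negative integers with + - * /
and no spaces.

```c
#include <assert.h>
#include <stdlib.h>

/* Evaluate an expression strictly left to right, ignoring precedence. */
long eval_left_to_right(const char *s)
{
    char *end;
    long acc = strtol(s, &end, 10);       /* first operand */

    while (*end != '\0') {
        char op = *end++;                 /* operator character */
        long rhs = strtol(end, &end, 10); /* next operand */

        switch (op) {
        case '+': acc += rhs; break;
        case '-': acc -= rhs; break;
        case '*': acc *= rhs; break;
        case '/':
            if (rhs == 0)
                return acc;               /* give up rather than divide by 0 */
            acc /= rhs;
            break;
        default:
            return acc;                   /* unknown operator: stop */
        }
    }
    return acc;
}

/* Usage: the result differs from ordinary arithmetic. */
void ltr_demo(void)
{
    assert(eval_left_to_right("2+3*4") == 20);  /* (2+3)*4, not 2+(3*4) */
    assert(2 + 3 * 4 == 14);                    /* what C itself computes */
}
```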

- Dan C.

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Thu May 14 10:57:23 2026

From Newsgroup: comp.lang.c

In article <10u45rj$28j9$3@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:

On 14/05/2026 03:57, Janis Papanagnou wrote:

On 2026-05-14 03:39, Dan Cross wrote:

In article <10u36pv$1l93k$18@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

Now back to your statement. The point is that precedence rules
vary between programming languages. Folks can usually rely on
the precedence of * and / compared to + and - .

I can think of at least two languages where you could not, but
yeah, that is usually true.

Are you thinking about languages like Algol 68 where you
can explicitly define and re-define operator precedence,
or do you mean languages where they made just bad design
decisions?

Janis

<OT>

I believe APL does not have operator precedences, though I have never
written more than a one-line program in the language.

And in Forth, operators are all post-fix, so there are no precedences
there either.

But I'm curious which two languages Dan was referring to. (My guess is
that APL was one of them, but I don't know the other.)

APL is an example, yes. It does have some notion of precedence,
but arithmetic expressions do not follow the usual rules of
mathematics.

Precedence is really only a concern for infix notations;
languages like Forth, PostScript, or members of the Lisp family
are not infix languages, so it doesn't matter. In Lisp, for
instance, one may think of a program's S-expressions as a
textual representation of an abstract syntax tree; thus,
programmers work directly in terms of the AST.

Smalltalk is another language that does not have operators
that exhibit precedence in the manner one is likely accustomed
to.

- Dan C.

From Bart@bc@freeuk.com to comp.lang.c on Thu May 14 12:03:30 2026

From Newsgroup: comp.lang.c

On 14/05/2026 03:58, Keith Thompson wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 2026-05-13 21:28, Keith Thompson wrote:

[...]

[...]
Yes, the fact that "break;" is used both for loops and for switch
statements is inconvenient. (csh uses "breaksw" for switch
statements.)

If we'd have two distinct keywords (in a language, BTW, that tried
to avoid too many keywords in the first place!) there would be
complaints (and foremost from Bart, for sure); why they decided to
use two keywords to basically do "the same" thing, namely leaving
a control structure.

I don't see how using 'break' in more than one context would be in
any way "inconvenient".

With nested loops, "break" or "continue" always refers to the innermost
loop. With a switch statement inside a loop, "break" refers to the
switch statement, but "continue" refers to the loop.

It's obviously not impossible to deal with, but I find it mildly
annoying.

Break doing the two jobs is a flaw. 'break' and 'continue' being
inconsistent is a further one:

Suppose you have a loop, and within the loop, you have an if-else-if
chain within which are 'break' and 'continue' statements.

You decide that that if-else-if chain is better off as a switch. But
now, while 'continue' continues to do its job, 'break' silently behaves
differently.
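
A small sketch of that refactoring hazard, with made-up function names.
Both functions are meant to count the items seen before a 0 sentinel,
skipping -1 entries; only the first one does.

```c
#include <assert.h>

/* if-else-if chain: 'break' leaves the for loop. */
int count_with_if(const int *v, int n)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (v[i] == 0)
            break;          /* exits the loop */
        else if (v[i] == -1)
            continue;       /* next iteration */
        count++;
    }
    return count;
}

/* Same chain rewritten as a switch: 'break' now leaves only the switch. */
int count_with_switch(const int *v, int n)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        switch (v[i]) {
        case 0:
            break;          /* exits the switch, NOT the loop */
        case -1:
            continue;       /* still applies to the loop */
        }
        count++;
    }
    return count;
}

/* Usage: the two versions disagree on the same input. */
void break_demo(void)
{
    const int data[] = { 1, -1, 2, 0, 3 };
    assert(count_with_if(data, 5) == 2);
    assert(count_with_switch(data, 5) == 4);
}
```

No diagnostic is required for the second version, so the change in
behaviour goes unnoticed unless a test catches it.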

From Bart@bc@freeuk.com to comp.lang.c on Thu May 14 12:32:22 2026

From Newsgroup: comp.lang.c

On 14/05/2026 01:59, Janis Papanagnou wrote:

On 2026-05-13 16:31, Dan Cross wrote:

In article <10u1j2h$1l93l$31@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 2026-05-12 16:33, Bart wrote:

[snip]
But why shouldn't people be expected to learn the rules?

Programmers should certainly learn, know, apply, and obey the rules.

(If you don't understand that you may try to transform that truism
to your "car example".)

Programmers _should_ absolutely learn the rules. But in C,
there are many of them, and some of them are deceptively subtle.

We agreed.

_A_ rule that programmers can remember quite easily, however,
is that parentheses generally carry very high precedence, and
so when in doubt, wrapping something in parens can aid
understanding (for the programmer and the maintainer).

I agree.

The key
is to find balance between extreme terseness and extreme
verbosity, both of which can feel obfuscating.

First, don't forget that there was no problem with precedence
existing in Bart's post; it was just an overloaded and badly
formatted composition in an example of ternary conditionals.

(The point is that - with the exception of & ^ | - the ranking
makes perfect sense

It doesn't make sense even then; here are the remaining groups for
binary ops from high to low:

(* / %) (+ -) (<< >>) (< <= >= >) (== !=) (&&) (||) (=)

Why are the shift operators at that spot? This causes chaos in
expressions like 'a << 3 + b' which are parsed as 'a << (3 + b)'.
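
A quick sketch to show the difference (function names are made up).
Because + binds tighter than <<, the unparenthesized form shifts by
3 + b rather than adding b afterwards.

```c
#include <assert.h>

/* Parsed as a << (3 + b). */
unsigned shift_unparenthesized(unsigned a, unsigned b)
{
    return a << 3 + b;
}

/* What someone scaling by 8 probably meant. */
unsigned shift_parenthesized(unsigned a, unsigned b)
{
    return (a << 3) + b;
}

/* The multiplication spelling needs no parentheses. */
unsigned times_eight_plus(unsigned a, unsigned b)
{
    return a * 8 + b;
}

/* Usage with a = 1, b = 2. */
void shift_demo(void)
{
    assert(shift_unparenthesized(1, 2) == 32);  /* 1 << 5 */
    assert(shift_parenthesized(1, 2) == 10);    /* 8 + 2 */
    assert(times_eight_plus(1, 2) == 10);
}
```

gcc and clang can flag the first function with -Wparentheses, which is
part of -Wall.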

Why are == and != lower precedence than the other compare operators? In
which circumstances would that be an advantage? This is just a pointless
extra level, as such usage would be so unusual that you'd use
parentheses anyway.

TBF, while other languages may not have as many levels, they also have
questionable choices, because there are no standards.

At best it is generally agreed that there are 3 groups (4 including
assignment), again arranged from high to low:

1 School arithmetic which everyone knows

2 Comparisons

3 Logical (and, or)

4 (Assignment)

These should be intuitive; all that's left is the ordering within group
1 and group 3, and also where these extra ops need to go:

<< >> & | ^

In the case of C, it was also decided that ?: belongs in this chart of
/binary/ operators. (I suppose you can consider each of ? and : as a
binary operator...)

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Thu May 14 05:15:30 2026

From Newsgroup: comp.lang.c

On 5/14/2026 3:08 AM, David Brown wrote:

On 13/05/2026 16:57, Dan Cross wrote:

In article <10u0k0k$1l93l$30@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

[snip]
Bart <bc@freeuk.com> writes:
[...]

So the inconvenience of how 'switch' works is excused because
/sometimes/ you need fallthrough, or the one time in a thousand you
need Duff's device.

I don't see any inconvenience in "how it works"; it actually
allows programmers to implement both semantics as needed. And
both semantics were needed, they have been used. (Even if you
think your projection of your preferences and limited uses is
what should constitute the global software development world.)

It's easy to get wrong. Other languages accommodate both
semantics using alternation in the selector arm. For example,
one might imagine a hypothetical syntax, something like:

    switch (a) {
        case 1 || 2 || 3 || 4: whatever();
        default: other();
    }

...with no `break` to end each `case`.

You couldn't use it to build Duff's Device, but I'm not sure
that even Duff would call that a loss.

        - Dan C.

Anyone curious about how far C's switch statements can be used or
abused, might like to read about "Protothreads" :

<https://en.wikipedia.org/wiki/Protothread>

This is a conglomeration of Duff's Device on steroids with supporting
macros that gives you a limited type of stackless cooperative
multitasking with extremely low overhead. The library has seen real
usage in small embedded systems. Reactions to the underlying
implementation range from thinking it is a hideous abuse of a bad
language design, to elegant and very ingenious.

Need to check it out. Btw have you ever examined the chaos PP lib?

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Thu May 14 06:07:39 2026

From Newsgroup: comp.lang.c

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 2026-05-14 00:42, Tim Rentsch wrote:

I must remember to start using "char unsigned" in preference to
"unsigned char". ;)

Despite the smiley I can't really interpret that. So an honest
question; is there a difference in those two, or what do you
want to express by that?

The emoticon was meant to be a wink; in other words I was joking.
Both "unsigned char" and "char unsigned" are legal, and mean the
same thing as far as the language rules go, but the first one
sounds normal and the second one sounds like Yoda. My only
intention was to make people laugh.

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Thu May 14 06:25:20 2026

From Newsgroup: comp.lang.c

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 2026-05-13 20:00, Tim Rentsch wrote:

There is another difference worth noting. A byte is a unit of
storage, whereas octet is a measure of information. The word
byte is inherently about memory; the word octet is inherently
about value (eight bits of information). For this reason too
the name 'octet' is a better choice for a type name than 'byte'.

Well, I have a slightly different view; I suppose it's cultural.

I often see, specifically from the Anglo-American culture, that
they talk about, say, "8 bits"; and this has partly culturally
also spread across the ocean. - Here we try to distinguish the
units and the "metal"; the latter are formally substantives and
written with a capital letter. So we have units of "1 bit" or
"5 bit" entities (no 's' at the end). But seen as "metal" we
speak about "one Bit" or "five Bits" - although it's somewhat
quirky to imagine a thing that is physically "5 Bit", mostly it
is more accurate to say it's an entity of "5 bit" - and similar
with "1 byte". Because we use that also as _unit_ for 8 bit
entities. It gets complicated by us addressing the unit 'bit'
by a name, which is then "Bit". So the more accurate forms for
the _units_ are 5 bit or 8 byte. - As said, we may culturally
see that differently, and colloquially you nowadays also often
hear "5 Bits" or "8 Bytes" (as pluralized substantive), so it's
cumbersome to argue about that. - Only that "byte" is also a
unit (and not necessarily associated with memory) seems to be
our difference in how we view that.

I don't know if I see what you're getting at here. My writing
follows standard usage in American English. Sometimes the names
of units are capitalized but for the most part they aren't. The
names of units are singular or plural when used as nouns (1 bit,
2 bits), but singular when used as adjectives (16-bit int).
There may be exceptions to those rules, I haven't thought about
it deeply.

My main point is that "byte" and "octet" are talking about
different kinds of things. A computer might have 64k bytes of
RAM, but normally I wouldn't (and I think normally other people
wouldn't) say that a computer has 64k octets of RAM. We might
say a computer has enough RAM to _hold_ 64k octets, but not that
it _has_ 64k octets. There's a semantic incongruity in the
latter case. Do you see what I mean?

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Thu May 14 08:13:22 2026

From Newsgroup: comp.lang.c

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 2026-05-12 20:09, Dan Cross wrote:

[...] one of the major motivating
factors for using unsigned arithmetic in practice is to have the
full bit-range of the type available. [...]

Hmm.. - I'm using 'unsigned' typically to express the domain of the
application values (not to "wrest" some more values out of a type).

I concur. I use unsigned types a lot more often than signed types,
and needing an extra one bit of range is almost never a factor.

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Thu May 14 15:19:05 2026

From Newsgroup: comp.lang.c

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 2026-05-13 17:12, Scott Lurndal wrote:

We use:

# if __GNUC__ >= 7 // 'statement attributes' were new with GCC 7.x
# if defined(__cplusplus) && (__cplusplus >= 201103L) // C++11 or greater
# define XXX_FALLTHROUGH [[gnu::fallthrough]]
# else
# define XXX_FALLTHROUGH __attribute__ ((fallthrough))
# endif
# else // GCC 4.x, 5.x, 6.x, comment only!
# define XXX_FALLTHROUGH /* Fall Through */
# endif

Where 'XXX' is replaced by the app name.

switch (variable) {
case cond1:
break;
case cond:
do something
XXX_FALLTHROUGH
default:
do something else
}

Just a note aside; couldn't the XXX be automatically concatenated using
the CPP features? (I seem to recall we've done such things back then.)

Not sure I understand your question. I used xxx above just
to obscure the name of the proprietary program that includes
the above file.

I also wonder about the app-specific variants; wouldn't one version for
all apps have sufficed?

There is a need to support gcc4 through gcc14 in that project. We've
subsequently raised the lower limit to gcc7. The project was started
in 2012.

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Thu May 14 15:26:37 2026

From Newsgroup: comp.lang.c

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 2026-05-13 20:00, Tim Rentsch wrote:

There is another difference worth noting. A byte is a unit of
storage, whereas octet is a measure of information. The word
byte is inherently about memory; the word octet is inherently
about value (eight bits of information). For this reason too
the name 'octet' is a better choice for a type name than 'byte'.

Well, I have a slightly different view; I suppose it's cultural.

I often see, specifically from the Anglo-American culture, that
they talk about, say, "8 bits";

For many older Americans, 8 bits is a dollar.

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Thu May 14 08:37:20 2026

From Newsgroup: comp.lang.c

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

[..discussing C expression syntax..]

[...] [Remembering precedence in C is difficult because of]
a mis-ranking of a class of operators in "C". (I noticed that
already when I read K&R some time around 1985, but I first saw
that "officially" acknowledged not too long ago when someone
posted a link to a paper from, IIRC, some time in the 1990's
written by one of the authors of "C".) - And that discrepancy
detail in C's precedence ranking was actually the only reason
for me looking "regularly" into the precedence table of my K&R.
(The point is that - with the exception of & ^ | - the ranking
makes perfect sense and should be easily usable without doubt
by a concept-knowing programmer. But note that, historically,
a sort of "rationale" can be formulated for the discrepancy to
justify the given choice in context of specifically "C". But
still remember the "official" acknowledgement of an issue here.)

I think it's easy to remember how expressions in C work with the
help of just a few memory aids:

1. unary operators are always ahead of binary operators, first
those on the right and then those on the left;

2. the bitwise operators form a sandwich enclosing the relational
operators and the equality operators - shift (<<,>>) on top,
and the three kinds of logical operations (&,^,|) underneath;

3. sizeof is greedy with respect to type names: sizeof (int)+1
is (sizeof (int))+1, not sizeof ((int)+1)

Everything else goes where it makes sense.
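
A short sketch of aids 2 and 3 in action; the function names are made
up for illustration. Shifts bind tighter than the comparisons, the
comparisons bind tighter than & ^ |, and sizeof takes the parenthesized
type name and nothing more.

```c
#include <assert.h>
#include <stddef.h>

/* Parsed as x & (1 == 0), i.e. x & 0: always 0. */
int low_bit_clear_unparenthesized(unsigned x)
{
    return x & 1 == 0;
}

/* The intended test needs parentheses. */
int low_bit_clear(unsigned x)
{
    return (x & 1) == 0;
}

/* Shift is above the relational operators: (a << 2) < 8. */
int shifted_is_small(unsigned a)
{
    return a << 2 < 8;
}

/* sizeof (int)+1 is (sizeof (int)) + 1. */
size_t int_size_plus_one(void)
{
    return sizeof (int)+1;
}

/* Usage. */
void sandwich_demo(void)
{
    assert(low_bit_clear_unparenthesized(2) == 0);
    assert(low_bit_clear(2) == 1);
    assert(shifted_is_small(1) == 1);            /* 4 < 8 */
    assert(int_size_plus_one() == sizeof(int) + 1);
}
```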

From Bart@bc@freeuk.com to comp.lang.c on Thu May 14 17:00:47 2026

From Newsgroup: comp.lang.c

On 14/05/2026 16:37, Tim Rentsch wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

[..discussing C expression syntax..]

[...] [Remembering precedence in C is difficult because of]
a mis-ranking of a class of operators in "C". (I noticed that
already when I read K&R some time around 1985, but I first saw
that "officially" acknowledged not too long ago when someone
posted a link to a paper from, IIRC, some time in the 1990's
written by one of the authors of "C".) - And that discrepancy
detail in C's precedence ranking was actually the only reason
for me looking "regularly" into the precedence table of my K&R.
(The point is that - with the exception of & ^ | - the ranking
makes perfect sense and should be easily usable without doubt
by a concept-knowing programmer. But note that, historically,
a sort of "rationale" can be formulated for the discrepancy to
justify the given choice in context of specifically "C". But
still remember the "official" acknowledgement of an issue here.)

I think it's easy to remember how expressions in C work with the
help of just a few memory aids:

1. unary operators are always ahead of binary operators, first
those on the right and then those on the left;

Unary operators aren't the problem. It's a mystery why they need to be
in a table at all. Nobody's going to think that '&a + b' means '&(a + b)'.

3. sizeof is greedy with respect to type names: sizeof (int)+1
is (sizeof (int))+1, not sizeof ((int)+1)

This isn't a problem either: it works like a unary operator.

2. the bitwise operators form a sandwich enclosing the relational
operators and the equality operators - shift (<<,>>) on top,
and the three kinds of logical operations (&,^,|) underneath;

This is where the trouble starts: these make up 6 different levels.

Combinations of & ^ | are rare enough, as bitwise operations, that you'd
use parentheses anyway. They don't need 3 separate levels.

Comparison ones don't need 2 levels.

And shift operators don't really need their own level either. (Since
they scale numbers just like * and /, they can be lumped in with those.
Having 'a * 8 + b' mean the same as 'a << 3 + b' makes sense; currently
they have quite different meanings.)

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Thu May 14 09:44:02 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 14/05/2026 16:37, Tim Rentsch wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

[..discussing C expression syntax..]

[...] [Remembering precedence in C is difficult because of]
a mis-ranking of a class of operators in "C". (I noticed that
already when I read K&R some time around 1985, but I first saw
that "officially" acknowledged not too long ago when someone
posted a link to a paper from, IIRC, some time in the 1990's
written by one of the authors of "C".) - And that discrepancy
detail in C's precedence ranking was actually the only reason
for me looking "regularly" into the precedence table of my K&R.
(The point is that - with the exception of & ^ | - the ranking
makes perfect sense and should be easily usable without doubt
by a concept-knowing programmer. But note that, historically,
a sort of "rationale" can be formulated for the discrepancy to
justify the given choice in context of specifically "C". But
still remember the "official" acknowledgement of an issue here.)

I think it's easy to remember how expressions in C work with the
help of just a few memory aids:

1. unary operators are always ahead of binary operators, first
those on the right and then those on the left;

Unary operators aren't the problem. It's a mystery why they need to be
in a table at all. Nobody's going to think that '&a + b' means '&(a +
b)'.

3. sizeof is greedy with respect to type names: sizeof (int)+1
is (sizeof (int))+1, not sizeof ((int)+1)

This isn't a problem either: it works like a unary operator.

2. the bitwise operators form a sandwich enclosing the relational
operators and the equality operators - shift (<<,>>) on top,
and the three kinds of logical operations (&,^,|) underneath;

This is where the trouble starts: these make up 6 different levels.

Combinations of & ^ | are rare enough, as bitwise operations, that
you'd use parentheses anyway. They don't need 3 separate levels.

Comparison ones don't need 2 levels.

And shift operators don't really need their own level either. (Since
they scale numbers just like * and /, they can be lumped in with
those. Having 'a * 8 + b' mean the same as 'a << 3 + b' makes sense;
currently they have quite different meanings.)

I wasn't trying to help you. I know that's a lost cause.

From David Brown@david.brown@hesbynett.no to comp.lang.c on Thu May 14 18:51:09 2026

From Newsgroup: comp.lang.c

On 14/05/2026 18:00, Bart wrote:

On 14/05/2026 16:37, Tim Rentsch wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

[..discussing C expression syntax..]

[...] [Remembering precedence in C is difficult because of]
a mis-ranking of a class of operators in "C". (I noticed that
already when I read K&R some time around 1985, but I first saw
that "officially" acknowledged not too long ago when someone
posted a link to a paper from, IIRC, some time in the 1990's
written by one of the authors of "C".) - And that discrepancy
detail in C's precedence ranking was actually the only reason
for me looking "regularly" into the precedence table of my K&R.
(The point is that - with the exception of & ^ | - the ranking
makes perfect sense and should be easily usable without doubt
by a concept-knowing programmer. But note that, historically,
a sort of "rationale" can be formulated for the discrepancy to
justify the given choice in context of specifically "C". But
still remember the "official" acknowledgement of an issue here.)

I think it's easy to remember how expressions in C work with the
help of just a few memory aids:

  1. unary operators are always ahead of binary operators, first
     those on the right and then those on the left;

Unary operators aren't the problem. It's a mystery why they need to be
in a table at all. Nobody's going to think that '&a + b' means '&(a + b)'.

Unary operator precedence is certainly important. (*p)++ and *(p++) are
very different things, and *p++ can only mean one of them.
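
A minimal sketch (pointer_demo is a made-up name) showing which one
*p++ means: postfix ++ binds tighter than unary *, so it is *(p++).

```c
#include <assert.h>

void pointer_demo(void)
{
    int a[2] = { 10, 20 };
    int *p = a;

    int v = *p++;       /* *(p++): read a[0], then advance p */
    assert(v == 10);
    assert(p == a + 1); /* the pointer moved */
    assert(a[0] == 10); /* the array is untouched */

    p = a;
    v = (*p)++;         /* increment a[0]; p stays put */
    assert(v == 10);    /* value before the increment */
    assert(p == a);
    assert(a[0] == 11);
}
```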

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Thu May 14 09:55:27 2026

From Newsgroup: comp.lang.c

scott@slp53.sl.home (Scott Lurndal) writes:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 2026-05-13 17:12, Scott Lurndal wrote:

We use:

# if __GNUC__ >= 7 // 'statement attributes' were new with GCC 7.x
# if defined(__cplusplus) && (__cplusplus >= 201103L) // C++11 or higher
# define XXX_FALLTHROUGH [[gnu::fallthrough]]
# else
# define XXX_FALLTHROUGH __attribute__ ((fallthrough))
# endif
# else // GCC 4.x, 5.x, 6.x, comment only!
# define XXX_FALLTHROUGH /* Fall Through */
# endif

Where 'XXX' is replaced by the app name.

switch (variable) {
case cond1:
break;
case cond:
do something
XXX_FALLTHROUGH
default:
do something else
}

Just a note aside; couldn't the XXX be automatically concatenated using
the CPP features? (I seem to recall we've done such things back then.)

Not sure I understand your question. I used xxx above just
to obscure the name of the proprietary program that includes
the above file.

I also wonder about the app-specific variants; wouldn't one version for
all apps have sufficed?

There is a need to support gcc4 through gcc14 in that project. We've
subsequently raised the lower limit to gcc7. The project was started
in 2012.

If instead you use

#define XXX_FALLTHROUGH GOHERE_( __LINE__ )
#define GOHERE_( n ) GOHERE__( n )
#define GOHERE__( n ) goto RIGHT_HYAR_##n; RIGHT_HYAR_##n:

and just give 'XXX_FALLTHROUGH;', how are the results? It
works fine in my tests.
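
For anyone who wants to try it, a small self-contained sketch using the
three macros above. The weight() function and its case values are made
up for the example. One assumption to keep in mind: labels have function
scope, so this form allows at most one XXX_FALLTHROUGH per source line
within a function.

```c
#include <assert.h>

#define XXX_FALLTHROUGH GOHERE_( __LINE__ )
#define GOHERE_( n )    GOHERE__( n )
#define GOHERE__( n )   goto RIGHT_HYAR_##n; RIGHT_HYAR_##n:

/* Hypothetical example: case 1 deliberately falls into default. */
int weight(int kind)
{
    int w = 0;
    switch (kind) {
    case 0:
        break;
    case 1:
        w += 10;
        XXX_FALLTHROUGH;    /* expands to: goto L; L: ; */
    default:
        w += 1;
    }
    return w;
}

/* Usage. */
void fallthrough_demo(void)
{
    assert(weight(0) == 0);
    assert(weight(1) == 11);   /* 10 from case 1, plus 1 from default */
    assert(weight(2) == 1);
}
```

The extra level of macro is what lets __LINE__ expand to a number
before ## pastes it onto the label name.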

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Thu May 14 17:32:10 2026

From Newsgroup: comp.lang.c

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

scott@slp53.sl.home (Scott Lurndal) writes:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 2026-05-13 17:12, Scott Lurndal wrote:

We use:

# if __GNUC__ >= 7 // 'statement attributes' were new with GCC 7.x
# if defined(__cplusplus) && (__cplusplus >= 201103L) // C++11 or higher
# define XXX_FALLTHROUGH [[gnu::fallthrough]]
# else
# define XXX_FALLTHROUGH __attribute__ ((fallthrough))
# endif
# else // GCC 4.x, 5.x, 6.x, comment only!
# define XXX_FALLTHROUGH /* Fall Through */
# endif

Where 'XXX' is replaced by the app name.

switch (variable) {
case cond1:
break;
case cond:
do something
XXX_FALLTHROUGH
default:
do something else
}

Just a note aside; couldn't the XXX be automatically concatenated using
the CPP features? (I seem to recall we've done such things back then.)

Not sure I understand your question. I used xxx above just
to obscure the name of the proprietary program that includes
the above file.

I also wonder about the app-specific variants; wouldn't one version for
all apps have sufficed?

There is a need to support gcc4 through gcc14 in that project. We've
subsequently raised the lower limit to gcc7. The project was started
in 2012.

If instead you use

#define XXX_FALLTHROUGH GOHERE_( __LINE__ )
#define GOHERE_( n ) GOHERE__( n )
#define GOHERE__( n ) goto RIGHT_HYAR_##n; RIGHT_HYAR_##n:

and just give 'XXX_FALLTHROUGH;', how are the results? It
works fine in my tests.

Perhaps, but we are not competing in the obfuscated C competition.

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Thu May 14 17:49:14 2026

From Newsgroup: comp.lang.c

In article <86pl2yi0n3.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 2026-05-13 20:00, Tim Rentsch wrote:

There is another difference worth noting. A byte is a unit of
storage, whereas octet is a measure of information. The word
byte is inherently about memory; the word octet is inherently
about value (eight bits of information). For this reason too
the name 'octet' is a better choice for a type name than 'byte'.

Well, I have a slightly different view; I suppose it's cultural.

I often see, specifically from the Anglo-American culture, that
they talk about, say, "8 bits"; and this has partly culturally
also spread across the ocean. - Here we try to distinguish the
units and the "metal"; the latter are formally substantives and
written with a capital letter. So we have units of "1 bit" or
"5 bit" entities (no 's' at the end). But seen as "metal" we
speak about "one Bit" or "five Bits" - although it's somewhat
quirky to imagine a thing that is physically "5 Bit", mostly it
is more accurate to say it's an entity of "5 bit" - and similar
with "1 byte". Because we use that also as _unit_ for 8 bit
entities. It gets complicated by us addressing the unit 'bit'
by a name, which is then "Bit". So the more accurate forms for
the _units_ are 5 bit or 8 byte. - As said, we may culturally
see that differently, and colloquially you nowadays also often
hear "5 Bits" or "8 Bytes" (as pluralized substantive), so it's
cumbersome to argue about that. - Only that "byte" is also a
unit (and not necessarily associated with memory) seems to be
our difference in how we view that.

I don't know if I see what you're getting at here. My writing
follows standard usage in American English. Sometimes the names
of units are capitalized but for the most part they aren't. The
names of units are singular or plural when used as nouns (1 bit,
2 bits), but singular when used as adjectives (16-bit int).
There may be exceptions to those rules, I haven't thought about
it deeply.

My main point is that "byte" and "octet" are talking about
different kinds of things.

Not really. It has always been understood to refer to the same
kind of thing that "byte" refers to.

The problem was that, at the time the term "octet" was coined,
the size of a byte (measured in bits) varied between different
computers, and sometimes on the same computer. When people
started getting serious about making computers talk to one
another, this became an issue: hence octet to have standard
nomenclature.

A computer might have 64k bytes of
RAM, but normally I wouldn't (and I think normally other people
wouldn't) say that a computer has 64k octets of RAM.

Some would, though it may sound a bit odd.

The term is mostly historical at this point: the vast majority
of systems standardized on 8-bit bytes last century, after IBM
introduced the 360 line of computers with power-of-two widths
for different types of data. 8-bit bytes quickly became
dominant.

We might
say a computer has enough RAM to _hold_ 64k octets, but not that
it _has_ 64k octets. There's a semantic incongruity in the
latter case. Do you see what I mean?

At this point, the term "byte" has been standardized by several
different bodies (IEC, ISO) to be synonymous with octet. The
continued use of "octet" by organizations like the IETF is
mostly a legacy curiosity.

Use of "octet" when referring to legacy systems that had different
byte sizes, or perhaps for some highly specialized devices,
remains. Some hardware engineers may use it when dealing with,
for example, RAM parts that have an extra parity bit per octet,
combining to make a nine-bit byte, but that would be a highly
specialized and thus uncommon use.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Thu May 14 10:53:32 2026

From Newsgroup: comp.lang.c

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

[...] the issue being discussed was that multiple cases (that may
not be contiguous) depend on the default fallthrough behavior.

case 10:
case 20:
case 30:
whatever();
break;

In a hypothetical C-like language without default fallthrough, it
would make sense to invent a different syntax. For C repeating the
"case" keyword is slightly ugly, but probably not worth fixing.

Both gcc and clang, with options -std=c99 -pedantic -Wall -Wextra,
accept the code below and give no diagnostics:

#include <stdio.h>
#include "cases.h"

int
main( int argc, char *argv[] ){

switch( argc-1 ){
cases( 2, 3, 5, 8 ):
printf( " okay - we got %d arguments\n", argc-1 );
printf( " argv[%d] is %s\n", argc-1, argv[argc-1] );
break;

cases( 0, 1, 4, 6, 7, 9 ):
printf( " oops - there were %d arguments\n", argc-1 );
break;

case 10:
printf( " 10 -" );
FALLTHROUGH;
default:
printf( " golly... there were %d arguments\n", argc-1 );
}

printf( "\n" );

return 0;
}

The cases() macro is a fairly straightforward application of variadic
macros, as follows:

#define ARGS_N_(...) \
ARGS_N_X_( __VA_ARGS__, \
09, 08, 07, 06, 05, 04, 03, 02, 01, 00 \
)

#define ARGS_N_X_( dummy, _9, _8, _7, _6, _5, _4, _3, _2, _1, ... ) _1

#define cases(...) casesx_( ARGS_N_( __VA_ARGS__ ), __VA_ARGS__ )
#define casesx_( N, ... ) casesy_( N, __VA_ARGS__ )
#define casesy_( N, ... ) cases_ ## N ## _( __VA_ARGS__ )

#define cases_01_(a) case a
#define cases_02_(a,...) case a : cases_01_( __VA_ARGS__ )
#define cases_03_(a,...) case a : cases_02_( __VA_ARGS__ )
#define cases_04_(a,...) case a : cases_03_( __VA_ARGS__ )
#define cases_05_(a,...) case a : cases_04_( __VA_ARGS__ )
#define cases_06_(a,...) case a : cases_05_( __VA_ARGS__ )
#define cases_07_(a,...) case a : cases_06_( __VA_ARGS__ )
#define cases_08_(a,...) case a : cases_07_( __VA_ARGS__ )
#define cases_09_(a,...) case a : cases_08_( __VA_ARGS__ )

It's easy to see how to extend this definition to allow more cases, if
that is needed.
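
A minimal sketch of the counting step on its own, assuming the hypothetical names COUNT_ARGS and COUNT_ARGS_PICK_ (they are not part of cases.h as posted) and plain decimal tokens rather than the two-digit tokens used for pasting above. Each extra argument pushes the number list one slot to the right, so the token that lands in the _1 slot is the argument count (1 to 9):

```c
/* Hypothetical helper names; same shifting trick as ARGS_N_ above. */
#define COUNT_ARGS(...) \
    COUNT_ARGS_PICK_(__VA_ARGS__, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)
#define COUNT_ARGS_PICK_(dummy, _9, _8, _7, _6, _5, _4, _3, _2, _1, ...) _1

/* Number of case values in the first cases() group of the example. */
int count_first_group(void)
{
    return COUNT_ARGS(2, 3, 5, 8);
}

/* A single argument still works: the trailing 0 fills the "..." slot. */
int count_single(void)
{
    return COUNT_ARGS(42);
}
```

In the posted header the selected token is pasted into a macro name (cases_04_ and so on), which is presumably why it uses two-digit tokens there.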

Personally I would rather have something that works in C99, etc, now,
than to wait for some possible change at some point in the indefinite
future.
--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Thu May 14 18:57:11 2026

From Newsgroup: comp.lang.c

On 14/05/2026 17:44, Tim Rentsch wrote:

Bart <bc@freeuk.com> writes:

On 14/05/2026 16:37, Tim Rentsch wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

[..discussing C expression syntax..]

[...] [Remembering precedence in C is difficult because of]
a mis-ranking of a class of operators in "C". (I noticed that
already when I read K&R some time around 1985, but I first saw
that "officially" acknowledged not too long ago when someone
posted a link to a paper from, IIRC, some time in the 1990's
written by one of the authors of "C".) - And that discrepancy
detail in C's precedence ranking was actually the only reason
for me looking "regularly" into the precedence table of my K&R.
(The point is that - with the exception of & ^ | - the ranking
makes perfectly sense and should be easily usable without doubt
by a concept-knowing programmer. But note that, historically,
a sort of "rationale" can be formulated for the discrepancy to
justify the given choice in context of specifically "C". But
still remember the "official" acknowledgement of an issue here.)

I think it's easy to remember how expressions in C work with the
help of just a few memory aids:

1. unary operators are always ahead of binary operators, first
those on the right and then those on the left;

Unary operators aren't the problem. It's a mystery why they need to be
in a table at all. Nobody's going to think that '&a + b' means '&(a +
b)'.

3. sizeof is greedy with respect to type names: sizeof (int)+1
is (sizeof (int))+1, not sizeof ((int)+1)

This isn't a problem either: it works like a unary operator.
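
A small sketch of the two readings of the sizeof rule quoted above. The function names are made up for illustration:

```c
#include <stddef.h>

/* Parsed as (sizeof (int)) + 1, per the rule quoted above. */
size_t sizeof_then_add(void)
{
    return sizeof (int)+1;
}

/* The other reading, written out explicitly: the size of the
   expression (int)+1, which has type int. */
size_t sizeof_of_cast(void)
{
    return sizeof ((int)+1);
}
```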

2. the bitwise operators form a sandwich enclosing the relational
operators and the equality operators - shift (<<,>>) on top,
and the three kinds of logical operations (&,^,|) underneath;

This is where the trouble starts: these make up 6 different levels.

Combinations of & ^ | are rare enough, as bitwise operations, that
you'd use parentheses anyway. They don't need 3 separate levels.

Comparison ones don't need 2 levels.

And shift operators don't really need their own level either. (Since
they scale numbers just like * and /, they can be lumped in with
those. Having 'a * 8 + b' mean the same as 'a << 3 + b' makes sense;
currently they have quite different meanings.)
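
To make the difference concrete, here is a short sketch with made-up function names. Some compilers warn about the unparenthesized form, which is left as-is on purpose:

```c
/* As C parses it: a << (3 + b), because + binds tighter than <<. */
int shift_as_parsed(int a, int b)
{
    return a << 3 + b;
}

/* With explicit parentheses: shift first, then add. */
int shift_first(int a, int b)
{
    return (a << 3) + b;
}

/* The multiplication form that the shift was meant to mirror. */
int scale_then_add(int a, int b)
{
    return a * 8 + b;
}
```

With a = 1 and b = 2 the first function yields 1 << 5, that is 32, while the other two yield 10.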

I wasn't trying to help you. I know that's a lost cause.

I don't think what you said helped anyone.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Thu May 14 19:19:57 2026

From Newsgroup: comp.lang.c

On 14/05/2026 17:51, David Brown wrote:

On 14/05/2026 18:00, Bart wrote:

On 14/05/2026 16:37, Tim Rentsch wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

[..discussing C expression syntax..]

[...] [Remembering precedence in C is difficult because of]
a mis-ranking of a class of operators in "C". (I noticed that
already when I read K&R some time around 1985, but I first saw
that "officially" acknowledged not too long ago when someone
posted a link to a paper from, IIRC, some time in the 1990's
written by one of the authors of "C".) - And that discrepancy
detail in C's precedence ranking was actually the only reason
for me looking "regularly" into the precedence table of my K&R.
(The point is that - with the exception of & ^ | - the ranking
makes perfectly sense and should be easily usable without doubt
by a concept-knowing programmer. But note that, historically,
a sort of "rationale" can be formulated for the discrepancy to
justify the given choice in context of specifically "C". But
still remember the "official" acknowledgement of an issue here.)

I think it's easy to remember how expressions in C work with the
help of just a few memory aids:

1. unary operators are always ahead of binary operators, first
those on the right and then those on the left;

Unary operators aren't the problem. It's a mystery why they need to be
in a table at all. Nobody's going to think that '&a + b' means '&(a +
b)'.

Unary operator precedence is certainly important. (*p)++ and *(p++) are very different things, and *p++ can only mean one of them.

Yes, but it is nothing to do with the precedences of binary operators**.
This is pretty much universal.

Unary ops in charts share the one precedence level, and have their own
rules when there is a cluster of them around the same term.

That is, start on the one to the immediate right, and work left to
right, then the one on the immediate left and go right to left. If 'a b
c d' are unary operators, then:

a b X c d

is evaluated as a(b(d(c(X)))).

However I couldn't find a valid example in C of successive post-fix
operators, so there will be at most one on the right. In that case,
'right to left' is accurate.

Still, Tim said '/those/' on the right, so I'd be interested if there
was in fact such an example, then I think you would have to evaluate
them 'inside-out' like my example.

By that he may just have meant the '++ --' operators in general, not more
than one in any one example.

(** I don't count '.' as a binary operator.)
--- Synchronet 3.22a-Linux NewsLink 1.2

From David Brown@david.brown@hesbynett.no to comp.lang.c on Thu May 14 20:50:37 2026

From Newsgroup: comp.lang.c

On 14/05/2026 20:19, Bart wrote:

On 14/05/2026 17:51, David Brown wrote:

On 14/05/2026 18:00, Bart wrote:

On 14/05/2026 16:37, Tim Rentsch wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

[..discussing C expression syntax..]

[...] [Remembering precedence in C is difficult because of]
a mis-ranking of a class of operators in "C". (I noticed that
already when I read K&R some time around 1985, but I first saw
that "officially" acknowledged not too long ago when someone
posted a link to a paper from, IIRC, some time in the 1990's
written by one of the authors of "C".) - And that discrepancy
detail in C's precedence ranking was actually the only reason
for me looking "regularly" into the precedence table of my K&R.
(The point is that - with the exception of & ^ | - the ranking
makes perfectly sense and should be easily usable without doubt
by a concept-knowing programmer. But note that, historically,
a sort of "rationale" can be formulated for the discrepancy to
justify the given choice in context of specifically "C". But
still remember the "official" acknowledgement of an issue here.)

I think it's easy to remember how expressions in C work with the
help of just a few memory aids:

1. unary operators are always ahead of binary operators, first
those on the right and then those on the left;

Unary operators aren't the problem. It's a mystery why they need to
be in a table at all. Nobody's going to think that '&a + b' means
'&(a + b)'.

Unary operator precedence is certainly important. (*p)++ and *(p++)
are very different things, and *p++ can only mean one of them.
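
A minimal sketch of the distinction, using made-up function names and a two-element array as test data:

```c
/* *p++ means *(p++): read through p, then advance p. */
int deref_then_advance(void)
{
    int a[2] = { 10, 20 };
    int *p = a;
    int v = *p++;      /* v is a[0]; p now points at a[1] */
    return v + *p;     /* 10 + 20 */
}

/* (*p)++ increments the object p points at; p itself is unchanged. */
int increment_target(void)
{
    int a[2] = { 10, 20 };
    int *p = a;
    (*p)++;            /* a[0] becomes 11 */
    return a[0] + *p;  /* 11 + 11 */
}
```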

Yes, but it is nothing to do with the precedences of binary operators**. This is pretty much universal.

I certainly agree it would be odd if there were binary arithmetic
operators with higher precedence than unary operators. But the "." and
"->" operators are binary operators in C, in that they are referred to
as "operators" and have two operands. I don't think the standard uses
the term "binary operator" as a defined term, however.

Unary ops in charts share the one precedence level, and have their own
rules when there is a cluster of them around the same term.

That is, start on the one to the immediate right, and work left to
right, then the one on the immediate left and go right to left. If 'a b
c d' are unary operators, then:

a b X c d

is evaluated as a(b(d(c(X)))).

Agreed.

However I couldn't find a valid example in C of successive post-fix operators, so there will be at most one on the right. In that case,
'right to left' is accurate.

There are a few ways to combine multiple post-fix operators, especially
with array subscripting. But you can't write things like "x++ --".

xs[i][j]

funcs[i](x)

get_pointer()->

xs[i]++

Likewise there are a few ways to combine pre-fix operators, especially multiple casts. And of course "+ - + - x" is legal, albeit unlikely to
turn up in real code.
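
A compilable sketch of those combinations. The struct, the member name x, the helper functions and the data are all invented for illustration; only the expression shapes come from the list above:

```c
struct point { int x; };               /* hypothetical type */

static struct point origin = { 7 };

static struct point *get_pointer(void) { return &origin; }
static int twice(int v)  { return 2 * v; }
static int negate(int v) { return -v; }

int chained_operators_demo(void)
{
    int grid[2][2] = { { 1, 2 }, { 3, 4 } };
    int (*funcs[2])(int) = { twice, negate };
    int xs[3] = { 10, 20, 30 };
    int sum = 0;

    sum += grid[1][0];        /* subscript after subscript: 3 */
    sum += funcs[0](5);       /* subscript, then call: 10 */
    sum += get_pointer()->x;  /* call, then member access: 7 */
    sum += xs[1]++;           /* subscript, then ++: adds 20 */
    sum += + - + - 4;         /* stacked prefix operators: 4 */
    return sum;               /* 3 + 10 + 7 + 20 + 4 */
}
```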

Still, Tim said '/those/' on the right, so I'd be interested if there
was in fact such an example, then I think you would have to evaluate
them 'inside-out' like my example.

Yes. Your example seemed fine to me.

By that he may just have meant the '++ --' operators in general, not more
than one in any one example.

(** I don't count '.' as a binary operator.)

--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Thu May 14 16:11:43 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 14/05/2026 01:59, Janis Papanagnou wrote:

[...]

(The point is that - with the exception of & ^ | - the ranking
makes perfectly sense

It doesn't make sense even then; here are the remaining groups for
binary ops from high to low:

(* / %) (+ -) (<< >>) (< <= >= >) (== !=) (&&) (||) (=)

Why are the shift operators at that spot? This causes chaos in
expressions like 'a << 3 + b' which are parsed as 'a << (3 + b)'.

I've never heard anyone claim that C's operator precedence rules are
ideal. They aren't. But they can't be changed without breaking
existing code, so there's little point in complaining about it.

Someone could probably make an argument that the existing precedences
are more sensible than any alternatives, but that doesn't really
matter.

Why are == and != lower precedence than the other compare operators?
In which circumstances would that be an advantage? This is just a
pointless extra level, as such usage would be so unusual that you'd
use parentheses anyway.

I suggest that it doesn't matter why. It is what it is. And yes,
I'd add parentheses in the unlikely event that I needed to write
an expression that uses both equality and comparison operators
(unless I were writing deliberately obfuscated code, which I've
been known to do).

TBF, while other languages may not have as many levels, they also have questionable choices, because there are no standards.

Plenty of other languages have standards. I'm not aware of any
correlation between whether a language has a written standard and
how reasonable its operator precedence rules are.

At best it is generally agreed that there are 3 groups (4 including assignment) again arranged from high to low:

Agreed by whom?

1 School arithmetic which everyone knows

2 Comparisons

3 Logical (and, or)

4 (Assignment)

These should be intuitive, all that's left is the ordering within
group 1 and group 3, and also where these extra ops need to go:

<< >> & | ^

In the case of C, it also decided that ?: belongs in this chart of
/binary/ operators. (I suppose you can consider each of ? and : as a
binary operator...)

I don't know what chart you're referring to. There is no such chart
in the C standard; the precedence is defined implicitly by the
grammar rules. K&R2 Table 2.1 is a chart showing the precedence
and associativity of C's operators. It includes unary, binary,
and ternary operators.

?: is a ternary operator. It is not in any sense a binary operator.
If you've seen a table that implies it's a binary operator, that
table is wrong (or you've misinterpreted it).
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Thu May 14 16:33:41 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <86pl2yi0n3.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

[...]

My main point is that "byte" and "octet" are talking about
different kinds of things.

Not really. It has always been understood to refer to the same
kind of thing that "byte" refers to.

I agree, at least for the way I understand the terms. For me,
"octet" and "byte" refer to the same kind of thing. The difference
is that an "octet" is specifically 8 bits, and a "byte" is a
fundamental unit of storage for a given system (commonly 8 bits).
ISO/IEC 2382 happens to agree with me.

The problem was that, at the time the term "octet" was coined,
the size of a byte (measured in bits) varied between different
computers, and sometimes on the same computer. When people
started getting serious about making computers talk to one
another, this became an issue: hence octet to have standard
nomenclature.

A computer might have 64k bytes of
RAM, but normally I wouldn't (and I think normally other people
wouldn't) say that a computer has 64k octets of RAM.

Some would, though it may sound a bit odd.

Agreed. "64k bytes" is certainly more common, but "64k octets"
means essentially the same thing while being more specific.

Also, the "k" suffix formally means 1000, but is often used to mean
1024, which is why we have "Ki", "kibi" to denote a power of two
explicitly.

[...]

At this point, the term "byte" has been standardized by several
different bodies (IEC, ISO) to be synonymous with octet. The
continued use of "octet" by organizations like the IETF is
mostly a legacy curiosity.

Has it? The ISO C and C++ standards certainly do not use "byte"
to mean exactly 8 bits. ISO/IEC 2382 says:

byte

string that consists of a number of bits, treated as a unit, and
usually representing a character or a part of a character

Note 1 to entry: The number of bits in a byte is fixed for a given
data processing system.

Note 2 to entry: The number of bits in a byte is usually 8.

and

octet

8-bit byte

byte that consists of eight bits

<https://www.iso.org/obp/ui/#iso:std:iso-iec:2382:ed-1:v2:en>

The latter implies that you can't have octets on a system with,
say, 16-bit bytes, which doesn't match what I would have expected.
I would think it would be reasonable to say that a system with
16-bit bytes has, say, 32k bytes or 64k octets of memory. But C
doesn't use the word "octet", so this is at best marginally topical.
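
As a rough illustration of that arithmetic (the function names are invented, and the second one simply plugs in CHAR_BIT from <limits.h> for the current implementation):

```c
#include <limits.h>
#include <stddef.h>

/* Octets (8-bit units) needed to cover nbytes bytes of the given width. */
size_t octets_in(size_t nbytes, unsigned bits_per_byte)
{
    return nbytes * bits_per_byte / 8;
}

/* The same conversion using this implementation's byte width. */
size_t octets_on_this_system(size_t nbytes)
{
    return octets_in(nbytes, CHAR_BIT);
}
```

For the 16-bit-byte case, octets_in(32 * 1024, 16) gives 64 * 1024.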

[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Thu May 14 16:40:02 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:
[...]

Unary operators aren't the problem. It's a mystery why they need to be
in a table at all. Nobody's going to think that '&a + b' means '&(a +
b)'.

[...]

It would be silly for an operator precedence table to omit the
operators that "everybody knows". If I had a table that didn't
show *all* the operators, I'd look for a better table (like the
one in K&R2).
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Thu May 14 16:51:22 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 14/05/2026 03:58, Keith Thompson wrote:

[...]

With nested loops, "break" or "continue" always refers to the
innermost loop. With a switch statement inside a loop, "break"
refers to the switch statement, but "continue" refers to the loop.

It's obviously not impossible to deal with, but I find it mildly
annoying.

Break doing the two jobs is a flaw. 'break' and 'continue' being
inconsistent is a further one:

Suppose you have a loop, and within the loop, you have an if-else-if
chain within which are 'break' and 'continue' statements.

You decide that that if-else-if chain is better off as a switch. But
now, while 'continue' continues to do its job, 'break' silently
behaves differently.

Yes, that is a potential problem, one that could have been solved
by not using "break" to terminate a case in a switch statement.
I don't know that I've ever seen the scenario you describe, but
it's certainly possible. I don't often use "continue", but yes,
the asymmetry between "break" and "continue" is a potential problem.

There is of course no real ambiguity, but it is a possible trap
for the unwary.
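
A sketch of the trap, with invented function names and data. Both functions count elements, skip the value -1 with "continue", and are meant to stop at the first 0:

```c
/* if/else version: "break" leaves the for loop. */
int count_if_chain(const int *v, int n)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (v[i] == 0)
            break;
        else if (v[i] == -1)
            continue;
        count++;
    }
    return count;
}

/* Same logic moved into a switch: "continue" still targets the loop,
   but "break" now leaves only the switch, so counting carries on. */
int count_switch(const int *v, int n)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        switch (v[i]) {
        case 0:
            break;
        case -1:
            continue;
        }
        count++;
    }
    return count;
}
```

For the input { 1, -1, 2, 0, 3 } the first version returns 2 and the second returns 4.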

The problem could be largely solved by adding support for named break
and continue, as I've discussed here before. That would also make it
easier to break out of nested loops, even without switch statements.
And it could be done without breaking existing code. (Making a
break within a switch within a loop break out of the loop would
change the behavior of existing code, so it's not likely to happen.)

There, I've agreed with you (again) about a perceived flaw in
the C language. Will you stop accusing me of defending it when I
merely describe it? Or will you continue to be angry that I'm not
as angry about it as you are?
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Thu May 14 16:59:33 2026

From Newsgroup: comp.lang.c

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

[...] the issue being discussed was that multiple cases (that may
not be contiguous) depend on the default fallthrough behavior.

case 10:
case 20:
case 30:
whatever();
break;

In a hypothetical C-like language without default fallthrough, it
would make sense to invent a different syntax. For C repeating the
"case" keyword is slightly ugly, but probably not worth fixing.

Both gcc and clang, with options -std=c99 -pedantic -Wall -Wextra,
accept the code below and give no diagnostics:

#include <stdio.h>
#include "cases.h"

int
main( int argc, char *argv[] ){

switch( argc-1 ){
cases( 2, 3, 5, 8 ):

[SNIP]

The cases() macro is a fairly straightforward application of variadic
macros, as follows:

#define ARGS_N_(...) \
ARGS_N_X_( __VA_ARGS__, \
09, 08, 07, 06, 05, 04, 03, 02, 01, 00 \
)

#define ARGS_N_X_( dummy, _9, _8, _7, _6, _5, _4, _3, _2, _1, ... ) _1

#define cases(...) casesx_( ARGS_N_( __VA_ARGS__ ), __VA_ARGS__ )
#define casesx_( N, ... ) casesy_( N, __VA_ARGS__ )
#define casesy_( N, ... ) cases_ ## N ## _( __VA_ARGS__ )

#define cases_01_(a) case a
#define cases_02_(a,...) case a : cases_01_( __VA_ARGS__ )
#define cases_03_(a,...) case a : cases_02_( __VA_ARGS__ )
#define cases_04_(a,...) case a : cases_03_( __VA_ARGS__ )
#define cases_05_(a,...) case a : cases_04_( __VA_ARGS__ )
#define cases_06_(a,...) case a : cases_05_( __VA_ARGS__ )
#define cases_07_(a,...) case a : cases_06_( __VA_ARGS__ )
#define cases_08_(a,...) case a : cases_07_( __VA_ARGS__ )
#define cases_09_(a,...) case a : cases_08_( __VA_ARGS__ )

It's easy to see how to extend this definition to allow more cases, if
that is needed.

Personally I would rather have something that works in C99, etc, now,
than to wait for some possible change at some point in the indefinite
future.

Personally, I'd much rather just write:

case 2: case 3: case 5: case 8:

I find the default fallthrough behavior mildly annoying, but I don't
have much problem dealing with it. I'm not willing to use elaborate
macros to avoid it or to be able to write slightly terser code.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Fri May 15 01:12:12 2026

From Newsgroup: comp.lang.c

On 15/05/2026 00:11, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:

On 14/05/2026 01:59, Janis Papanagnou wrote:

[...]

(The point is that - with the exception of & ^ | - the ranking
makes perfectly sense

It doesn't make sense even then; here are the remaining groups for
binary ops from high to low:

(* / %) (+ -) (<< >>) (< <= >= >) (== !=) (&&) (||) (=)

Why are the shift operators at that spot? This causes chaos in
expressions like 'a << 3 + b' which are parsed as 'a << (3 + b)'.

I've never heard anyone claim that C's operator precedence rules are
ideal. They aren't. But they can't be changed without breaking
existing code, so there's little point in complaining about it.

I was replying to the claim that they made 'perfect sense' aside from '&
^ |'

TBF, while other languages may not have as many levels, they also have
questionable choices, because there are no standards.

Plenty of other languages have standards.

I mean cross-language or real-world standards for operator precedences.

There are de-facto ones for operators that everyone used like '* / + -',
and also for '**' (exponentiation) even if that is an implied op in mathematical notation.

In the computing world, logical 'and' before logical 'or' seems to be
common.

Those form groups 1 and 3 of my list below. Then common sense demands
that comparisons must be group 2, otherwise:

a + b == 0 might be parsed as a + (b == 0)

a == b && c == d might be parsed as a == (b && c) == d

The fourth group is assignment. If that was anywhere else other than the
lowest level, then:

a = b + c might be parsed as (a = b) + c
a = b && c might be parsed as (a = b) && c
a = b == c might be parsed as (a = b) == c

depending on exactly where. None is desirable. In this case, C does the
right thing anyway so no need to disagree!
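
A quick check of how C actually groups these, with made-up function names. Each function returns the value that ends up in a:

```c
/* a = b + c is a = (b + c). */
int assign_sum(int b, int c)
{
    int a;
    a = b + c;
    return a;
}

/* a = b && c is a = (b && c), so a ends up 0 or 1. */
int assign_and(int b, int c)
{
    int a;
    a = b && c;
    return a;
}

/* a = b == c is a = (b == c), so a ends up 0 or 1. */
int assign_eq(int b, int c)
{
    int a;
    a = b == c;
    return a;
}
```

Under the alternative grouping (a = b) == c, a would simply receive b, so assign_eq(2, 3) returning 0 rather than 2 shows which parse C uses.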

At best it is generally agreed that there are 3 groups (4 including
assignment) again arranged from high to low:

Agreed by whom?

1 School arithmetic which everyone knows

2 Comparisons

3 Logical (and, or)

4 (Assignment)

--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Fri May 15 02:30:33 2026

From Newsgroup: comp.lang.c

On 2026-05-15 01:11, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:

Why are == and != lower precedence than the other compare operators?

To prevent forcing parentheses (which would make C-programs,
which are already inherently overloaded with a plethora of (),
{} and [], yet more unreadable).

In which circumstances would that be an advantage? This is just a
pointless extra level, as such usage would be so unusual that you'd
use parentheses anyway.

Equal and unequal are basically more universal operators than
the other four types of value comparisons.[*] You can observe
a hierarchy of types (reflected also in conditionals) where
at the upper end you want a boolean predicate.

bool x bool -> bool && and || arranged per logic convention
type x type -> bool ==, !=, and then (see [*]) <, <=, >, >=
type x type -> type various types including math convention

And some specific cases in "C", like ++x and x++, which need
some attention.

Unaries are mostly not (or at least rarely) a problem. But...
exponentiation needs attention, especially where unaries are
involved in expressions. (Sadly there's differences here.)

[*] I'm not considering that languages may impose an ordering on
'boolean' values. As universal fundamental abstract type 'true'
and 'false' are unordered and need no ordering. It makes sense
to have classes more differentiated than thrown together and
thereby requiring lots of parentheses to fix things again. It's
a convenience based on common conventions and logical models.

I suggest that it doesn't matter why. It is what it is. And yes,
I'd add parentheses in the unlikely event that I needed to write
an expression that uses both equality and comparison operators
(unless I were writing deliberately obfuscated code, which I've
been known to do).

It's not "obfuscated", IMHO, if you have the conceptual model I
presented in mind. But of course you could also use parentheses
to emphasize things, or use different spacing; I use all means
depending on context and complexity of the expressions.

[...]

These should be intuitive, all that's left is the ordering within
group 1 and group 3, and also where these extra ops need to go:

<< >> & | ^

These are basically in the "value-operators" type x type -> type
class, though the former two operators are asymmetric; they work
like

type1 x type2 -> type1

so they justify a class of their own.[**] It makes sense where "C"
has them defined.

[**] Yes, technically both arguments use the same 'int' type,
but they are semantically in different classes.

The problem is with the latter three, & | ^ ; they are (typically)
used with arithmetic value types, but they are sorted in after the
comparison types.[***] - I cannot do anything about that and want
to use parentheses here, not (not primarily) for clarity, but to
prevent computational errors in the first place.
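
A minimal sketch of the kind of computational error meant here. The FLAG value and function names are invented, and some compilers warn about the unparenthesized form, which is left as-is on purpose:

```c
enum { FLAG = 0x4 };   /* hypothetical bit mask */

/* Parsed as x & (FLAG == 0): the comparison yields 0, so the
   result is always 0, whatever x is. */
int flag_clear_as_parsed(unsigned x)
{
    return x & FLAG == 0;
}

/* What was meant: test whether the FLAG bit is clear in x. */
int flag_clear_intended(unsigned x)
{
    return (x & FLAG) == 0;
}
```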

[***] Obviously the problem arose historically from using boolean
operations through bit-sets occasionally as substitute for 'bool'.
A misfortune of a coincidence of the language "flexibility" with
its fuzziness of not separating 'bool' from 'int', and habits to
use clever "optimized" coding. (I haven't seen such coding more
recently, though, i.e. not during the past about four decades.
I could imagine that it might play some role still in areas like controllers-programming, but don't know, to be frank.)

Janis

[...]

[...]

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Fri May 15 01:31:59 2026

From Newsgroup: comp.lang.c

On 15/05/2026 00:40, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Unary operators aren't the problem. It's a mystery why they need to be
in a table at all. Nobody's going to think that '&a + b' means '&(a +
b)'.

[...]

It would be silly for an operator precedence table to omit the
operators that "everybody knows". If I had a table that didn't
show *all* the operators, I'd look for a better table (like the
one in K&R2).

Do you need to know the precedence of a unary operator (say applied to
any of these terms) in order to correctly parse this:

a op1 b op2 c

?

'a b c' are terms, and 'op1 op2' are operators. You need to know their relative precedences in order to correctly parse this as either '(a op1
b) op2 c' or 'a op1 (b op2 c)'. Any unary ops on those terms don't
affect that.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Fri May 15 02:38:41 2026

From Newsgroup: comp.lang.c

On 2026-05-14 13:32, Bart wrote:

On 14/05/2026 01:59, Janis Papanagnou wrote:

[...]

(The point is that - with the exception of & ^ | - the ranking
makes perfectly sense

It doesn't make sense even then; [...]

I acknowledge that it doesn't make sense for you.

(Some explanations can be found in a recent posting.)

[...]

TBF, while other languages may not have as many levels, they also have questionable choices, because there are no standards.

There are common public conventions and conceptual models from the
computer science that influence the definitions. Design constraints
and goals of specific programming languages may influence details.
Beyond that the languages generally define their precedence rules
(and these are reflected in the respective language standards).

[...]

In the case of C, it also decided that ?: belongs in this chart of
/binary/ operators. (I suppose you can consider each of ? and : as a
binary operator...)

(Here I'm sure you're imagining things or deliberately making them up.)

Janis

--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Fri May 15 02:46:19 2026

From Newsgroup: comp.lang.c

On 2026-05-14 18:00, Bart wrote:

On 14/05/2026 16:37, Tim Rentsch wrote:

[...]
1. unary operators are always ahead of binary operators, first
those on the right and then those on the left;

Unary operators aren't the problem. It's a mystery why they need to be
in a table at all. [...]

Well, they are; see for example these examples of Algol 68 and Awk...

$ genie -p '-2^4'
+16
$ awk 'BEGIN {print -2^4}'
-16

Janis

[...]

--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Thu May 14 17:52:17 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 15/05/2026 00:40, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Unary operators aren't the problem. It's a mystery why they need to be
in a table at all. Nobody's going to think that '&a + b' means '&(a +
b)'.

[...]
It would be silly for an operator precedence table to omit the
operators that "everybody knows". If I had a table that didn't
show *all* the operators, I'd look for a better table (like the
one in K&R2).

Do you need to know the precedence of a unary operator (say applied to
any of these terms) in order to correctly parse this:

a op1 b op2 c

?

Perhaps not, though that's not an actual example. But that's not
my point.

'a b c' are terms, and 'op1 op2' are operators. You need to know their relative precedences in order to correctly parse this as either '(a
op1 b) op2 c' or 'a op1 (b op2 c)'. Any unary ops on those terms don't
affect that.

Again, I see no point in having a C operator precedence
table that doesn't include *all* the operators in the language.
And I don't think I've ever seen such a table. It's much easier
to include everything than to waste time deciding which operators
don't need to be in the table.

Has the presence of unary operators in a precedence table ever
inconvenienced you in any way? Why is this a concern for you?

If you want to publish a table that excludes unary operators,
go ahead.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Fri May 15 02:52:25 2026

From Newsgroup: comp.lang.c

On 2026-05-14 20:50, David Brown wrote:

On 14/05/2026 20:19, Bart wrote:

[...]

I certainly agree it would be odd if there were binary arithmetic
operators with higher precedence than unary operators. [...]

I've just posted an example; exponentiation. It depends on the
language. (Below examples from Algol 68 and Awk...)

$ genie -p '-2^4'
+16
$ awk 'BEGIN{print -2^4}'
-16

But (I think; unless that has changed) "C" has no exponentiation
operator, so it doesn't apply here at least.

But if you start with any new language you have to inspect the
documentation about its operators and their precedence rules.

Janis

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Fri May 15 02:07:35 2026

From Newsgroup: comp.lang.c

On 15/05/2026 01:52, Janis Papanagnou wrote:

On 2026-05-14 20:50, David Brown wrote:

On 14/05/2026 20:19, Bart wrote:

[...]

I certainly agree it would be odd if there were binary arithmetic
operators with higher precedence than unary operators. [...]

I've just posted an example; exponentiation. It depends on the
language. (Below examples from Algol 68 and Awk...)

$ genie -p '-2^4'
+16
$ awk 'BEGIN{print -2^4}'
-16

But (I think; unless that has changed) "C" has no exponentiation
operator, so it doesn't apply here at least.

But if you start with any new language you have to inspect the
documentation about its operators and their precedence rules.

Yes, negation with exponentiation is a special case. Different languages produce different results. Getting the same result as in mathematics is tricky.

It's not really helped by having a table that combines
precedences of different kinds of operator.

And C gets off scot-free because it doesn't have such an operator. You
can choose to write pow(-2, 4) or -pow(2, 4), with care taken as pow
deals with floats.

Other languages can also write (-2)**4 or -(2**4) to force the behaviour.

But again, language syntax is not mathematics.
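
To make the C side concrete, here is a minimal sketch. Since pow works on doubles, it uses a small integer helper instead; `ipow` and the two wrapper functions are hypothetical names invented for this illustration, not standard library functions.

```c
/* Hypothetical helper: integer exponentiation by repeated multiplication.
   Assumes the result fits in a long. */
long ipow(long base, unsigned n)
{
    long result = 1;
    while (n-- > 0)
        result *= base;
    return result;
}

/* With no exponentiation operator, the grouping is always spelled out. */
long negate_then_raise(void) { return ipow(-2, 4); }  /* (-2)^4 */
long raise_then_negate(void) { return -ipow(2, 4); }  /* -(2^4) */
```

negate_then_raise() gives 16 and raise_then_negate() gives -16, which are the two readings shown above for Algol 68 and Awk.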
--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Fri May 15 03:31:38 2026

From Newsgroup: comp.lang.c

On 2026-05-15 01:33, Keith Thompson wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:
[...]

At this point, the term "byte" has been standardized by several
different bodies (IEC, ISO) to be synonymous with octet. The
continued use of "octet" by organizations like the IETF is
mostly a legacy curiosity.

Has it? The ISO C and C++ standards certainly do not use "byte"
to mean exactly 8 bits. ISO/IEC 2382 says:

byte

string that consists of a number of bits, treated as a unit, and
usually representing a character or a part of a character

Note 1 to entry: The number of bits in a byte is fixed for a given
data processing system.

Note 2 to entry: The number of bits in a byte is usually 8.

and

octet

8-bit byte

byte that consists of eight bits

<https://www.iso.org/obp/ui/#iso:std:iso-iec:2382:ed-1:v2:en>

The latter implies that you can't have octets on a system with,
say, 16-bit bytes, which doesn't match what I would have expected.
I would think it would be reasonable to say that a system with
16-bit bytes has, say, 32k bytes or 64k octets of memory. But C
doesn't use the word "octet", so this is at best marginally topical.

I've seen octets used in ITU-T standards (called "recommendations")
earlier than in ISO standards (who often borrowed ITU-T standards
later under their label as an International Standard). Specifically
in context of the ASN.1 definition in the 1980's (IIRC), specifically
in ITU-T X.209 (which got later replaced by X.680, and X.690 for BER).
I'm too lazy to "grep" the tons of the respective ITU-T papers, but
Google(AI) confirms my memories when it says:

| An octet, defined as a sequence of exactly 8 bits, is commonly
| defined in several ITU-T (International Telecommunication Union -
| Telecommunication Standardization Sector) standards, particularly
| within the X-series for Open Systems Interconnection (OSI) and ASN.1
| encoding.
| Key ITU-T standards that define or rely on the definition of an
| octet include:
| * ITU-T Rec. X.680 (ISO/IEC 8824-1): Defines the Abstract
| Syntax Notation One (ASN.1) basic notation, where an "octet"
| is fundamentally defined as an 8-bit unit.
| * ...

Janis

--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Fri May 15 03:39:20 2026

From Newsgroup: comp.lang.c

On 2026-05-15 03:07, Bart wrote:

[...]

It's not really helped by having a table that combines
precedences of different kinds of operator.

It's a necessity to have precedences documented for the users.

And tables are a very useful common, established representation;
one can immediately see "who's first", yet more so to look up
things if in doubt about any detail.

If you're staying with your claim I cannot help you. Presumably
no one can help you. *sigh*

Janis

[...]

--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Fri May 15 03:51:02 2026

From Newsgroup: comp.lang.c

On 2026-05-14 12:08, David Brown wrote:

Anyone curious about how far C's switch statements can be used or
abused, might like to read about "Protothreads" :

<https://en.wikipedia.org/wiki/Protothread>

(I haven't yet read it.)

This is a conglomeration of Duff's Device on steroids with supporting
macros that gives you a limited type of stackless cooperative
multitasking with extremely low overhead. The library has seen real
usage in small embedded systems. Reactions to the underlying
implementation range from thinking it is a hideous abuse of a bad
language design, to elegant and very ingenious.

Here we say "There's a fine line between genius and insanity."
(The bandwidth of the reactions is thus not too astonishing.)
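
For readers who have not seen the trick, the core idea can be sketched in a few lines: a switch jumps back into the middle of a loop, so a function resumes where it last returned. This only illustrates the technique under that assumption; `struct counter` and `counter_next` are hypothetical names and not the Protothreads API.

```c
/* Hypothetical resumable counter: all state lives in the struct,
   so no separate stack is needed between calls. */
struct counter {
    int state;  /* where to resume: 0 = start, 1 = inside the loop */
    int i;      /* loop variable, kept across calls */
};

/* Yields 0, 1, 2 on successive calls, then -1 forever. */
int counter_next(struct counter *c)
{
    switch (c->state) {
    case 0:
        for (c->i = 0; c->i < 3; c->i++) {
            c->state = 1;
            return c->i;   /* "yield" the current value */
    case 1:;               /* the next call resumes here, inside the loop body */
        }
    }
    c->state = 2;          /* finished: no case label matches from now on */
    return -1;
}
```

Starting from a zero-initialised struct, successive calls return 0, 1, 2 and then -1. It works because a case label may sit inside a nested statement, the same property that Duff's Device relies on.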

Janis

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Fri May 15 01:56:51 2026

From Newsgroup: comp.lang.c

In article <10u5m4m$uo0d$3@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <86pl2yi0n3.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

[...]

My main point is that "byte" and "octet" are talking about
different kinds of things.

Not really. It has always been understood to refer to the same
kind of thing that "byte" refers to.

I agree, at least for the way I understand the terms. For me,
"octet" and "byte" refer to the same kind of thing. The difference
is that an "octet" is specifically 8 bits, and a "byte" is a
fundamental unit of storage for a given system (commonly 8 bits).
ISO/IEC 2382 happens to agree with me.

Indeed; Werner Buchholz created the term "byte" while working on
the IBM Stretch computer. This letter, from a 1977 issue of
BYTE magazine (bit on the nose, honestly) claims it was 21 years
old that year: https://archive.org/details/byte-magazine-1977-02/page/n145/mode/1up

The letter says that initially, byte sizes ranged from 1 to 6
bits, but were extended to 8 bits in late 1956. The IBM 360
used the same byte and word sizes as the Stretch, but fixed
bytes at 8 bits.

The problem was that, at the time the term "octet" was coined,
the size of a byte (measured in bits) varied between different
computers, and sometimes on the same computer. When people
started getting serious about making computers talk to one
another, this became an issue: hence octet to have standard
nomenclature.

A computer might have 64k bytes of
RAM, but normally I wouldn't (and I think normally other people
wouldn't) say that a computer has 64k octets of RAM.

Some would, though it may sound a bit odd.

Agreed. "64k bytes" is certainly more common, but "64k octets"
means essentially the same thing while being more specific.

Yes. I understand that "octet" is preferred in French, enough
so that I read it was the, "French word for byte." Not knowing
French, I don't know if that's true.

However, this cute story from Bob Bemer suggests he disliked the
term "byte" and greatly preferred "octet," pushing it when he
was director of software at Bull, in France, in the mid-1960s:
https://web.archive.org/web/20170403130829/http://www.bobbemer.com/BYTE.HTM

Also, the "k" suffix formally means 1000, but is often used to mean
1024, which is why we have "Ki", "kibi" to denote a power of two
explicitly.

Yes. K, M, G, etc, have always been the SI indicators for
powers of 10, not powers of 2. The "Ki", "Mi", "Gi", etc, forms
are (as I understand it) relatively new.

[...]

At this point, the term "byte" has been standardized by several
different bodies (IEC, ISO) to be synonymous with octet. The
continued use of "octet" by organizations like the IETF is
mostly a legacy curiosity.

Has it?

Yes. IEC 80000-13 declares them to be synonyms.

The ISO C and C++ standards certainly do not use "byte"
to mean exactly 8 bits.

Indeed. I don't blame them. I suspect there are some DSP chips
or weird one-off processors with oddball byte sizes, even now.

ISO/IEC 2382 says:

byte

string that consists of a number of bits, treated as a unit, and
usually representing a character or a part of a character

Note 1 to entry: The number of bits in a byte is fixed for a given
data processing system.

Note 2 to entry: The number of bits in a byte is usually 8.

and

octet

8-bit byte

byte that consists of eight bits

<https://www.iso.org/obp/ui/#iso:std:iso-iec:2382:ed-1:v2:en>

The latter implies that you can't have octets on a system with,
say, 16-bit bytes, which doesn't match what I would have expected.

I suspect that ambiguity is unintentional.

I would think it would be reasonable to say that a system with
16-bit bytes has, say, 32k bytes or 64k octets of memory. But C
doesn't use the word "octet", so this is at best marginally topical.

I wonder. For word oriented systems, it was common to describe
memory in terms of words (e.g., "the KL-10B processor with
extended addressing supports a maximum of 4 MW of memory...").
Similarly, even for byte-addressed machines, like the PDP-11,
memory capacities were often described in terms of 16-bit words
("this machine has 256 KW of memory", aka, 512 KB). (Of course,
these machines all predate common use of the "Ki" and "Mi"
units.) Anyway, there is some precedent for using the machine
specific sizes in discussion, though I agree generally that
using octets makes sense in this context.

None of this has much to do with C, though, as you point out.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Thu May 14 19:04:50 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:
[...]

It's not really helped by having a table that combines
precedences of different kinds of operator.

[...]

C has postfix, unary, binary, and ternary operators. You claim to
dislike the fact that all those operators are commonly shown in one
table. I don't know whether you would prefer the postfix and unary
operators to be left out, or shown in one or two separate tables.
The only ternary operator is the conditional operator; its precedence
is *between* the precedence of the "||" and "=" operators, but
apparently you dislike the inclusion of the conditional operator
in tables of operator precedences.
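
As a concrete illustration of where the conditional operator sits: in C it binds less tightly than "||" and more tightly than assignment. The sketch below shows both sides; the function names are made up for this example.

```c
/* Parses as (a || b) ? 10 : 20, because || binds more tightly than ?: */
int cond_vs_or(int a, int b)
{
    return a || b ? 10 : 20;
}

/* Parses as x = (c ? 1 : 2), because ?: binds more tightly than = */
int cond_vs_assign(int c)
{
    int x;
    x = c ? 1 : 2;
    return x;
}
```

If the first expression grouped as a || (b ? 10 : 20), cond_vs_or(1, 0) would give 1 rather than 10. If the second grouped as (x = c) ? 1 : 2, cond_vs_assign(5) would give 5 rather than 1.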

I can think of no reason for your complaint other than that you
like arguing.

If you have an actual point, perhaps you could show us a C operator
precedence table that you like better than Table 2-1 in K&R2,
and explain to us why you think it's better.

Or were you being figurative?
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Thu May 14 19:12:43 2026

From Newsgroup: comp.lang.c

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10u5m4m$uo0d$3@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[...]

At this point, the term "byte" has been standardized by several
different bodies (IEC, ISO) to be synonymous with octet. The
continued use of "octet" by organizations like the IETF is
mostly a legacy curiosity.

Has it?

Yes. IEC 80000-13 declares them to be synonyms.

Interesting. It's odd that ISO/IEC 2382 and ISO/IEC 80000-13
disagree with each other.

<https://www.iso.org/standard/87648.html>
IEC 80000-13:2025 Quantities and units
Part 13: Information science and technology

I'm not going to spend 115 Swiss Francs (currently
146.39 USD) to get a copy.

If you have a copy, can you quote the relevant wording?

[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Fri May 15 02:20:46 2026

From Newsgroup: comp.lang.c

In article <10u5ver$uo0d$11@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10u5m4m$uo0d$3@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[...]

At this point, the term "byte" has been standardized by several
different bodies (IEC, ISO) to be synonymous with octet. The
continued use of "octet" by organizations like the IETF is
mostly a legacy curiosity.

Has it?

Yes. IEC 80000-13 declares them to be synonyms.

Interesting. It's odd that ISO/IEC 2382 and ISO/IEC 80000-13
disagree with each other.

<https://www.iso.org/standard/87648.html>
IEC 80000-13:2025 Quantities and units
Part 13: Information science and technology

I'm not going to spend 115 Swiss Francs (currently
146.39 USD) to get a copy.

If you have a copy, can you quote the relevant wording?

Sure.

|The specified data elements depend on the organization of the
|storage device, for example, binary elements (also called
|"bits"), octets (also called "bytes"), words of a given
|number of bits, blocks. A subscript referring to a specified
|data element can be added to the symbol.
|
| ...
|
|When used to express a storage capacity or an equivalent binary
|storage capacity, the bit and the octet (or byte) may be
|combined with SI prefixes or prefixes for binary multiples.
|
|In English, the name "byte", symbol B, is used as a synonym
|for "octet". Here, "byte" means an eight-bit byte. However,
|"byte" has been used for numbers of bits other than eight. To
|avoid the risk of confusion, it is strongly recommended that
|the name "byte" and the symbol B be used only for eight-bit
|bytes.

Note that they do acknowledge the historical usage, but it's
clear they are defining "byte" and "octet" to mean the same
thing.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From David Brown@david.brown@hesbynett.no to comp.lang.c on Fri May 15 10:27:19 2026

From Newsgroup: comp.lang.c

On 15/05/2026 02:52, Janis Papanagnou wrote:

On 2026-05-14 20:50, David Brown wrote:

On 14/05/2026 20:19, Bart wrote:

[...]

I certainly agree it would be odd if there were binary arithmetic
operators with higher precedence than unary operators. [...]

I've just posted an example; exponentiation. It depends on the
language. (Below examples from Algol 68 and Awk...)

$ genie -p '-2^4'
+16
$ awk 'BEGIN{print -2^4}'
-16

Fair enough. Exponentiation is a binary operator (in languages that
have it) that would - to fit with normal maths usage - have higher
precedence than things like unary minus.

Thanks for that example - even though C does not have such an operator,
it's worth remembering.

But (I think; unless that has changed) "C" has no exponentiation
operator, so it doesn't apply here at least.

But if you start with any new language you have to inspect the
documentation about its operators and their precedence rules.

Janis

--- Synchronet 3.22a-Linux NewsLink 1.2

From David Brown@david.brown@hesbynett.no to comp.lang.c on Fri May 15 10:32:00 2026

From Newsgroup: comp.lang.c

On 15/05/2026 02:31, Bart wrote:

On 15/05/2026 00:40, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Unary operators aren't the problem. It's a mystery why they need to be
in a table at all. Nobody's going to think that '&a + b' means '&(a +
b)'.

[...]

It would be silly for an operator precedence table to omit the
operators that "everybody knows". If I had a table that didn't
show *all* the operators, I'd look for a better table (like the
one in K&R2).

Do you need to know the precedence of a unary operator (say applied to
any of these terms) in order to correctly parse this:

  a op1 b op2 c

?

'a b c' are terms, and 'op1 op2' are operators. You need to know their relative precedences in order to correctly parse this as either '(a op1
b) op2 c' or 'a op1 (b op2 c)'. Any unary ops on those terms don't
affect that.

That argument makes no sense.

You don't need to know where binary "-" fits in the precedence ordering
in order to correctly parse :

a + b / c

However, I doubt if you would be happy with a table of operators that
omitted binary minus.

Yes, it would be possible to draw tables of C operator precedence where
you had separate tables for each type of operator, and then a separate description of how they fit together. But it is a lot simpler, clearer
and easier to use if you have a table that includes them all.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Fri May 15 02:35:51 2026

From Newsgroup: comp.lang.c

David Brown <david.brown@hesbynett.no> writes:

On 15/05/2026 02:31, Bart wrote:

[...]

Do you need to know the precedence of a unary operator (say applied
to any of these terms) in order to correctly parse this:
  a op1 b op2 c
?
'a b c' are terms, and 'op1 op2' are operators. You need to know
their relative precedences in order to correctly parse this as
either '(a op1 b) op2 c' or 'a op1 (b op2 c)'. Any unary ops on
those terms don't affect that.

That argument makes no sense.

I suspect Bart intended that any of the terms a, b, c could include
a unary operator, so for example "a op1 b op2 c" might actually
be "-x op1 !y op2 ~z". And because of that, I guess, you don't
need to worry about the precedence of the unary operators because
they're obvious?

I don't agree with his conclusion at all, but I *think* I see
what he was trying to say.

This is at least partly speculative, and I'm sure Bart can correct
me if I'm mistaken.
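
That reading can be checked in C by picking concrete operators, here op1 = "*" and op2 = "+" (an arbitrary choice for illustration; the function names are made up). The unary operators attach to their own terms before either binary operator applies.

```c
/* Written without parentheses: -x op1 !y op2 ~z with op1 = *, op2 = + */
int bare(int x, int y, int z)
{
    return -x * !y + ~z;
}

/* The same expression with the grouping C's grammar gives it spelled out. */
int grouped(int x, int y, int z)
{
    return ((-x) * (!y)) + (~z);
}
```

Both functions return the same value for any arguments that do not overflow.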

[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From tTh@tth@none.invalid to comp.lang.c on Fri May 15 12:25:52 2026

From Newsgroup: comp.lang.c

On 5/15/26 02:52, Janis Papanagnou wrote:

$ awk 'BEGIN{print -2^4}'
-16

tth@linda:~/Desktop$ awk 'BEGIN{print -2^4}'
-16
tth@linda:~/Desktop$ awk 'BEGIN{print (-2)^4}'
16

Computing is an evil science :)
--
** **
* tTh des Bourtoulots *
* http://maison.tth.netlib.re/ *
** **
--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Fri May 15 11:38:15 2026

From Newsgroup: comp.lang.c

On 15/05/2026 09:32, David Brown wrote:

On 15/05/2026 02:31, Bart wrote:

On 15/05/2026 00:40, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Unary operators aren't the problem. It's a mystery why they need to be
in a table at all. Nobody's going to think that '&a + b' means '&(a +
b)'.

[...]

It would be silly for an operator precedence table to omit the
operators that "everybody knows". If I had a table that didn't
show *all* the operators, I'd look for a better table (like the
one in K&R2).

Do you need to know the precedence of a unary operator (say applied to
any of these terms) in order to correctly parse this:

  a op1 b op2 c

?

'a b c' are terms, and 'op1 op2' are operators. You need to know their
relative precedences in order to correctly parse this as either '(a
op1 b) op2 c' or 'a op1 (b op2 c)'. Any unary ops on those terms don't
affect that.

That argument makes no sense.

You don't need to know where binary "-" fits in the precedence ordering
in order to correctly parse :

  a + b / c

However, I doubt if you would be happy with a table of operators that omitted binary minus.

What do you mean by 'binary "-"' and 'binary minus'? Are they both the operator in "x - y" or did one or both mean the unary negation operator
in "-z"?

In my example, 'a b c' each represent arbitrary terms. These are
examples of such terms:

-x
&x
++x[i]
x(i, j)
(x + y)
x.m--
-(+(-(sizeof(x))))

These include some unary operators. But they don't influence how 'a op1
b op2 c' is parsed.
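
Within a single term, the listed forms rely on postfix operators binding more tightly than prefix ones, which is independent of how op1 and op2 group. A small C sketch of two such terms (the function names are hypothetical):

```c
/* ++x[i] parses as ++(x[i]): the element is incremented, not the pointer. */
int bump_element(int *x, int i)
{
    return ++x[i];
}

/* *p++ parses as *(p++): the old element is read and the pointer advances. */
int read_and_advance(const int **pp)
{
    const int *p = *pp;
    int v = *p++;
    *pp = p;   /* hand the advanced pointer back to the caller */
    return v;
}
```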

At least not in C. Some languages may have special rules so that:

-a**b is parsed as -(a**b)
not a or b is parsed as not (a or b)

Yes, it would be possible to draw tables of C operator precedence where
you had separate tables for each type of operator, and then a separate description of how they fit together. But it is a lot simpler, clearer
and easier to use if you have a table that includes them all.

Disagree: in C, the only thing I've used the precedence table for is for
the relative precedence of op1 and op2 in examples like mine.

It's not even useful inside C compilers; you tend to follow the grammar
rather than have it table-driven using such a chart.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Adam Sampson@ats@offog.org to comp.lang.c on Fri May 15 11:55:18 2026

From Newsgroup: comp.lang.c

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

"switch" was originally implemented in a way that, I suspect, was
easier for the compiler to implement

It would also have been familiar from BCPL. When C was designed, switch
would have been recognised as a direct equivalent of BCPL's SWITCHON
construct:

https://archive.org/details/bcpl_20200522/page/19/mode/2up

There's no equivalent of break in that version of BCPL; if you look at
example code from that era (e.g. the Xerox Alto BCPL manuals), the
convention was to use GOTO at the end of each case with a label after
the block. Later versions of BCPL have ENDCASE which works like break:

https://archive.org/details/DTIC_ADA003599/page/41/mode/2up

(and I wonder whether this was influenced by C).
--
Adam Sampson <ats@offog.org> <http://offog.org/>
--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Fri May 15 11:27:53 2026

From Newsgroup: comp.lang.c

In article <y2av7cpq6w9.fsf@offog.org>, Adam Sampson <ats@offog.org> wrote:

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

"switch" was originally implemented in a way that, I suspect, was
easier for the compiler to implement

It would also have been familiar from BCPL. When C was designed, switch
would have been recognised as a direct equivalent of BCPL's SWITCHON
construct:

https://archive.org/details/bcpl_20200522/page/19/mode/2up

There's no equivalent of break in that version of BCPL; if you look at
example code from that era (e.g. the Xerox Alto BCPL manuals), the
convention was to use GOTO at the end of each case with a label after
the block. Later versions of BCPL have ENDCASE which works like break:

https://archive.org/details/DTIC_ADA003599/page/41/mode/2up

(and I wonder whether this was influenced by C).

It almost certainly was, as C did exert some influence on its
ancestor throughout the 1980s. For instance, when C was written
the only comment indicator in BCPL was `//` (which some
erroneously assume originated in C++). The `/* ... */` syntax
was absent, but _was_ later incorporated, due to its prevalence
brought forth by the popularity of C. C inherited that syntax from
PL/1.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.lang.c on Fri May 15 11:35:51 2026

From Newsgroup: comp.lang.c

In article <10u6t2n$4mai$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 15/05/2026 09:32, David Brown wrote:

On 15/05/2026 02:31, Bart wrote:

On 15/05/2026 00:40, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Unary operators aren't the problem. It's a mystery why they need to be
in a table at all. Nobody's going to think that '&a + b' means '&(a +
b)'.

[...]

It would be silly for an operator precedence table to omit the
operators that "everybody knows". If I had a table that didn't
show *all* the operators, I'd look for a better table (like the
one in K&R2).

Do you need to know the precedence of a unary operator (say applied to
any of these terms) in order to correctly parse this:

  a op1 b op2 c

?

'a b c' are terms, and 'op1 op2' are operators. You need to know their
relative precedences in order to correctly parse this as either '(a
op1 b) op2 c' or 'a op1 (b op2 c)'. Any unary ops on those terms don't
affect that.

That argument makes no sense.

You don't need to know where binary "-" fits in the precedence ordering
in order to correctly parse :

  a + b / c

However, I doubt if you would be happy with a table of operators that
omitted binary minus.

What do you mean by 'binary "-"' and 'binary minus'?

As a binary operator, `-` refers to subtraction. Note that the
expression quoted above does not contain subtraction. Therefore
one does not need to know the precedence of the subtraction
operator to parse that expression.

'binary "-"' and 'binary minus' mean the same thing; in the
former he used a literal `-` character, and in the latter he
substituted the name of the symbol.

Are they both the
operator in "x - y" or did one or both mean the unary negation operator
in "-z"?

It says it right there on the tin, dude. It ain't that hard.

Yes, it would be possible to draw tables of C operator precedence where
you had separate tables for each type of operator, and then a separate
description of how they fit together. But it is a lot simpler, clearer
and easier to use if you have a table that includes them all.

Disagree: in C, the only thing I've used the precedence table for is for
the relative precedence of op1 and op2 in examples like mine.

Not the flex you think it is....

It's not even useful inside C compilers; you tend to follow the grammar
rather than have it table-driven using such a chart.

Well, it is a good thing that there is no table like that in the
C standard, then, but instead a grammar that shows precedence.

Tables like that are meant for quick reference by working
programmers, not as normative sources for compiler implementers.

- Dan C

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Fri May 15 12:43:47 2026

From Newsgroup: comp.lang.c

On 15/05/2026 11:55, Adam Sampson wrote:

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

"switch" was originally implemented in a way that, I suspect, was
easier for the compiler to implement

It would also have been familiar from BCPL. When C was designed, switch
would have been recognised as a direct equivalent of BCPL's SWITCHON construct:

https://archive.org/details/bcpl_20200522/page/19/mode/2up

That mentions labels existing within a block. It doesn't say if labels
can also exist within nested blocks, as happens in C, which allows
switch to be more chaotic.

Actually, in C, you don't even need any block after 'switch'.
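
A minimal sketch of that point: the body of a switch is just a statement, so a single labeled statement is accepted without braces. The function name is made up for this example, and compilers may warn about the style.

```c
/* The switch body here is one labeled statement, with no enclosing block. */
int is_seven(int x)
{
    switch (x)
        case 7: return 1;
    return 0;   /* reached when no label matched and the body was skipped */
}
```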

There's no equivalent of break in that version of BCPL; if you look at example code from that era (e.g. the Xerox Alto BCPL manuals), the
convention was to use GOTO at the end of each case with a label after
the block. Later versions of BCPL have ENDCASE which works like break:

https://archive.org/details/DTIC_ADA003599/page/41/mode/2up

It's not clear what the rules are: if the syntax requires CASE ...
ENDCASE without nesting or overlapping, then this is better formed than C.

But this would make it harder to do CASE 1: CASE 2:... ENDCASE, unless consecutive CASEs are considered one label that is terminated by one ENDCASE.
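
For comparison, this is how the stacked-label case looks in C, where consecutive case labels simply attach to the same statement and one break ends the group. Whether BCPL's ENDCASE follows the same rule is not established here; the function name and values are invented for illustration.

```c
/* Two case labels share one body; a single break leaves the switch. */
int classify(int n)
{
    int kind;
    switch (n) {
    case 1:
    case 2:           /* both labels mark the same statement */
        kind = 10;
        break;        /* one break serves both cases */
    case 3:
        kind = 20;
        break;
    default:
        kind = 0;
        break;
    }
    return kind;
}
```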

(and I wonder whether this was influenced by C).

The previous page shows this form of FOR loop:

for N = E1 to E2 do C

where N is a name.

It seems C didn't copy this, neither did it influence BCPL.

--- Synchronet 3.22a-Linux NewsLink 1.2

From David Brown@david.brown@hesbynett.no to comp.lang.c on Fri May 15 13:44:20 2026

From Newsgroup: comp.lang.c

On 15/05/2026 03:56, Dan Cross wrote:

In article <10u5m4m$uo0d$3@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <86pl2yi0n3.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

[...]

Also, the "k" suffix formally means 1000, but is often used to mean
1024, which is why we have "Ki", "kibi" to denote a power of two
explicitly.

Yes. K, M, G, etc, have always been the SI indicators for
powers of 10, not powers of 2. The "Ki", "Mi", "Gi", etc, forms
are (as I understand it) relatively new.

They are also very rarely used, IME. People who care about these things already know that 1 KB is 1024 bytes, not 1000 bytes. (There was a big
to-do about disk sizes a number of years ago, where disks were marketed
using 1000-based units but users expected 1024-based units.)
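
The size of that gap is easy to compute. The sketch below converts a capacity quoted in 1000-based gigabytes into whole 1024-based gigabytes; the function name and the 500 GB figure are assumptions chosen for illustration.

```c
/* Whole binary gigabytes (2^30 bytes each) contained in a capacity
   quoted in decimal gigabytes (10^9 bytes each). */
unsigned long long decimal_gb_to_whole_gib(unsigned long long gb)
{
    const unsigned long long bytes = gb * 1000000000ULL;
    return bytes / (1ULL << 30);
}
```

A disk sold as 500 GB holds 465 whole units of 2^30 bytes, a shortfall of about 7% against what a 1024-based reading would suggest.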

[...]

At this point, the term "byte" has been standardized by several
different bodies (IEC, ISO) to be synonymous with octet. The
continued use of "octet" by organizations like the IETF is
mostly a legacy curiosity.

Has it?

Yes. IEC 80000-13 declares them to be synonyms.

The ISO C and C++ standards certainly do not use "byte"
to mean exactly 8 bits.

Indeed. I don't blame them. I suspect there are some DSP chips
or weird one-off processors with oddball byte sizes, even now.

There certainly are. But on these systems, IME no one ever uses the
word "byte" to refer to anything other than 8 bits. Sizes are generally
given explicitly, or refer to "unsigned char" (which will be perhaps
12-bit or 16-bit), or sometimes a more generic term like "word" will be
used. When you read the manual for a CHAR_BIT 16 compiler for a DSP (at
least for the two that I have read), you won't see the term "byte"
referring to anything other than 8 bits.
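
To make the distinction concrete in C terms, here is a minimal sketch that converts a size in C bytes (units of `unsigned char`, each `CHAR_BIT` bits wide) into bits and into 8-bit octets. The helper names `bits_in` and `octets_in` are made up for illustration, and overflow for very large sizes is ignored for brevity.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits in n_bytes C bytes; one C byte is CHAR_BIT bits wide. */
static size_t bits_in(size_t n_bytes)
{
    return n_bytes * CHAR_BIT;
}

/* Number of 8-bit octets needed to hold n_bytes C bytes, rounded up. */
static size_t octets_in(size_t n_bytes)
{
    return (n_bytes * CHAR_BIT + 7) / 8;
}
```

On a typical desktop target with CHAR_BIT equal to 8, `octets_in(n)` is just `n`; on a target with CHAR_BIT equal to 16, `octets_in(1)` would be 2.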

I would think it would be reasonable to say that a system with
16-bit bytes has, say, 32k bytes or 64k octets of memory. But C
doesn't use the word "octet", so this is at best marginally topical.

I wonder. For word oriented systems, it was common to describe
memory in terms of words (e.g., "the KL-10B processor with
extended addressing supports a maximum of 4 MW of memory...").

That is the norm in the embedded world. While it is mostly just DSP's
that have char greater than 8 bits, it is not uncommon to have flash or
other program memory that is wider than 8 bits. An AVR microcontroller
might be said to have 16 kW or 32 kB of memory, where each word of flash
is 16 bits. For PIC microcontrollers, kW is common since each word of
program memory might be 12, 14 or 16 bits according to the family.
(These are Harvard architecture devices - data memory is entirely
separate from code memory.)

Similarly, even for byte-addressed machines, like the PDP-11,
memory capacities were often described in terms of 16-bit words
("this machine has 256 KW of memory", aka, 512 KB). [Of course,
these machines all predate common use of the "Ki" and "Mi"
units.] Anyway, there is some precedent for using the machine
specific sizes in discussion, though I agree generally that
using octets makes sense in this context.

None of this has much to do with C, though, as you point out.

- Dan C.

--- Synchronet 3.22a-Linux NewsLink 1.2

From David Brown@david.brown@hesbynett.no to comp.lang.c on Fri May 15 13:58:37 2026

From Newsgroup: comp.lang.c

On 15/05/2026 12:38, Bart wrote:

On 15/05/2026 09:32, David Brown wrote:

On 15/05/2026 02:31, Bart wrote:

On 15/05/2026 00:40, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Unary operators aren't the problem. It's a mystery why they need to be
in a table at all. Nobody's going to think that '&a + b' means '&(a +
b)'.

[...]

It would be silly for an operator precedence table to omit the
operators that "everybody knows". If I had a table that didn't
show *all* the operators, I'd look for a better table (like the
one in K&R2).

Do you need to know the precedence of a unary operator (say applied
to any of these terms) in order to correctly parse this:

    a op1 b op2 c

?

'a b c' are terms, and 'op1 op2' are operators. You need to know
their relative precedences in order to correctly parse this as either
'(a op1 b) op2 c' or 'a op1 (b op2 c)'. Any unary ops on those terms
don't affect that.

That argument makes no sense.

You don't need to know where binary "-" fits in the precedence
ordering in order to correctly parse :

    a + b / c

However, I doubt if you would be happy with a table of operators that
omitted binary minus.

What do you mean by 'binary "-"' and 'binary minus'? Are they both the operator in "x - y" or did one or both mean the unary negation operator
in "-z"?

I meant "binary minus", written either "minus" or "-". If I had meant
unary minus or negation, I would not have written "binary".

In my example, 'a b c' each represent arbitrary terms. These are
examples of such terms:

    -x
    &x
    ++x[i]
    x(i, j)
    (x + y)
    x.m--
    -(+(-(sizeof(x))))

Yes. So?

What you wrote is that if you have an expression where only the binary operators are of relevance or interest, you only need a table of the
binary operators in order to understand the interaction between them.

I pointed out that the same logic applies if you have an expression
where only some binary operators are used - you only need a table with
those binary operators in order to understand the interactions.

There is no benefit in having multiple tables for the normal operators
in a language - it is simpler and clearer to put them all in one table. Isolating the unary operators is no more logical or useful than
isolating the binary minus operator.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Fri May 15 13:05:45 2026

From Newsgroup: comp.lang.c

On 15/05/2026 12:35, Dan Cross wrote:

In article <10u6t2n$4mai$1@dont-email.me>, Bart <bc@freeuk.com> wrote:

On 15/05/2026 09:32, David Brown wrote:

On 15/05/2026 02:31, Bart wrote:

On 15/05/2026 00:40, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:
[...]

Unary operators aren't the problem. It's a mystery why they need to be
in a table at all. Nobody's going to think that '&a + b' means '&(a +
b)'.

[...]

It would be silly for an operator precedence table to omit the
operators that "everybody knows". If I had a table that didn't
show *all* the operators, I'd look for a better table (like the
one in K&R2).

Do you need to know the precedence of a unary operator (say applied to
any of these terms) in order to correctly parse this:

    a op1 b op2 c

?

'a b c' are terms, and 'op1 op2' are operators. You need to know their
relative precedences in order to correctly parse this as either '(a
op1 b) op2 c' or 'a op1 (b op2 c)'. Any unary ops on those terms don't
affect that.

That argument makes no sense.

You don't need to know where binary "-" fits in the precedence ordering
in order to correctly parse :

    a + b / c

However, I doubt if you would be happy with a table of operators that
omitted binary minus.

What do you mean by 'binary "-"' and 'binary minus'?

As a binary operator, `-` refers to subtraction. Note that the
expression quoted above does not contain subtraction. Therefore
one does not need to know the precedence of the subtraction
operator to parse that expression.

'binary "-"' and 'binary minus' mean the same thing; in the
former he used a literal `-` character, and in the latter he
substituted the name of the symbol.

Are they both the
operator in "x - y" or did one or both mean the unary negation operator
in "-z"?

It says it right there on the tin, dude. It ain't that hard.

No need to be impertinent. There was ambiguity in DB's comment:

* I suggested that unary operators (including unary minus) don't
belong in a chart of precedences which is mostly about binary ops

* He says: "I doubt if you would be happy with a table of operators that
omitted binary minus"

I hadn't suggested that the binary or dyadic "-" operator should be
omitted. So I tried to clear things up.

Disagree: in C, the only thing I've used the precedence table for is for
the relative precedence of op1 and op2 in examples like mine.

Not the flex you think it is....

Jesus. Just stop with the putdowns.

I'm pretty sure very, very many people have the same experience: does
"^" come before or after "|"? They look at the table.

How many need to look up whether unary "!" comes before or after "^"? It
just doesn't happen. You can't change the fixed meaning of 'a | !b'
using parentheses.
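
For readers who want to see the cases mentioned above in running code, here is a small sketch. The function names are invented for illustration. Each function deliberately leaves out parentheses so that C's grammar decides the grouping; many compilers will warn about exactly these expressions, which is rather the point.

```c
/* a | b ^ c groups as a | (b ^ c): '^' binds tighter than '|'. */
static unsigned or_xor(unsigned a, unsigned b, unsigned c)
{
    return a | b ^ c;
}

/* a & b == c groups as a & (b == c): '==' binds tighter than '&'. */
static unsigned and_eq(unsigned a, unsigned b, unsigned c)
{
    return a & b == c;
}

/* a | !b can only mean a | (!b): the unary operator attaches to its term. */
static unsigned or_not(unsigned a, unsigned b)
{
    return a | !b;
}
```

With a, b and c all equal to 1, `or_xor` gives 1, whereas the other grouping `(a | b) ^ c` would give 0. With 6, 2, 2, `and_eq` gives 0, whereas `(a & b) == c` would give 1. Those are the cases where a table (or parentheses) matters; `or_not` has only one possible reading.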

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Fri May 15 14:54:01 2026

From Newsgroup: comp.lang.c

On 15/05/2026 12:58, David Brown wrote:

On 15/05/2026 12:38, Bart wrote:

On 15/05/2026 09:32, David Brown wrote:

However, I doubt if you would be happy with a table of operators that
omitted binary minus.

What do you mean by 'binary "-"' and 'binary minus'? Are they both the
operator in "x - y" or did one or both mean the unary negation
operator in "-z"?

I meant "binary minus", written either "minus" or "-". If I had meant
unary minus or negation, I would not have written "binary".

In my example, 'a b c' each represent arbitrary terms. These are
examples of such terms:

    -x
    &x
    ++x[i]
    x(i, j)
    (x + y)
    x.m--
    -(+(-(sizeof(x))))

Yes. So?

See below.

What you wrote is that if you have an expression where only the binary operators are of relevance or interest, you only need a table of the
binary operators in order to understand the interaction between them.

Yes.

I pointed out that the same logic applies if you have an expression
where only some binary operators are used - you only need a table with
those binary operators in order to understand the interactions.

I didn't imply a dedicated table for each possible expression that can
be written in C.

There is one such table for all programs.

Possibly you misunderstood my example. Let me rewrite it as:

Term binop1 Term binop2 Term

Examples of a Term are given above.

Parsing this involves knowing /only/ about binop precedences. Any unary
ops are contained within any of those Terms and don't affect this at all.

There is no benefit in having multiple tables for the normal operators
in a language - it is simpler and clearer to put them all in one table. Isolating the unary operators is no more logical or useful than
isolating the binary minus operator.

I'm saying that unary operators don't need to be in ANY table!

In fact, the C standard doesn't seem to have any such table; precedence
is implied by the grammar, but in the same way as the rest of the syntax.

Precedence tables come up elsewhere, and they invariably mix up binary
ops with unary ops and postfix ops.

The latter include (), [] and ".".

Prefix and Postfix ops together, when clustered around a specific term,
have their own set of rules. They are quite different from binary ops.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Fri May 15 15:00:39 2026

From Newsgroup: comp.lang.c

On 15/05/2026 14:54, Bart wrote:

Prefix and Postfix ops together, when clustered around a specific term,
have their own set of rules. They are quite different from binary ops.

I'm going to bail out here. This is not going anywhere.

Either people don't understand the subject, or are pretending not to, or
just want to have a go.

I have quite a bit of knowledge and practical experience of the subject,
even if people here don't like to admit that, but I'm poor at getting
the point across.

I will reconsider my claim that precedence tables don't need to include anything beyond binary ops, if somebody can give a reference to such a
table in the C standard.

--- Synchronet 3.22a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Fri May 15 15:45:26 2026

From Newsgroup: comp.lang.c

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

scott@slp53.sl.home (Scott Lurndal) writes:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10u2jpk$2t96p$6@kst.eternal-september.org>,

[...]

If C's switch statement were to be
changed, it would have to use something that's currently a syntax
error. Perhaps something like

case 1, case 2, case 3, case 4: whatever();

Sure, that's better.

case 1...4: whatever();

is a typical GCC extension (that we use heavily).

Yes, and the C2y draft adopts that syntax.

(One possible reason it wasn't adopted sooner is that `case 'a'...'z'`
doesn't necessarily work if the letters are not contiguous, for
example in EBCDIC.)

I think that's a bit far-fetched. Regular expressions
have the same EBCDIC related issues (i.e. the discontinuous
nature of the EBCDIC alpha translations); yet, there are
no other defined characters in the gaps between the
alpha groups in EBCDIC, so [a-z] or "case 'a'...'z':" would
probably work just fine in most cases to match lowercase EBCDIC
alpha text.

Worst case, it could be coded as

case 'a'...'i': /* FALLTHROUGH */
case 'j'...'r': /* FALLTHROUGH */
case 's'...'z':
do something;
break;

--- Synchronet 3.22a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Fri May 15 16:01:28 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 15/05/2026 14:54, Bart wrote:

Prefix and Postfix ops together, when clustered around a specific term,
have their own set of rules. They are quite different from binary ops.

I'm going to bail out here. This is not going anywhere.

Either people don't understand the subject, or are pretending not to, or
just want to have a go.

I have quite a bit of knowledge and practical experience of the subject,
even if people here don't like to admit that, but I'm poor at getting
the point across.

I will reconsider my claim that precedence tables don't need to include
anything beyond binary ops, if somebody can give a reference to such a
table in the C standard.

Given that 99.9% of all programmers learn C without ever referring to
the C standard, you're being disingenuous by requiring a reference to
a table in the standard.

I'll point you to page 53 in the K&R Second Edition for a precedence table (2-1)
that includes the unary ops (albeit in a footnote) that notes:

"unary +, - and * have higher precedence than the binary forms"

The text that accompanies the table details, with examples, the
precedence and order of evaluation.
--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Fri May 15 20:17:30 2026

From Newsgroup: comp.lang.c

On 2026-05-15 17:45, Scott Lurndal wrote:

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

[...]

(One possible reason it wasn't adopted sooner is that `case 'a'...'z'`
doesn't necessarily work if the letters are not contiguous, for
example in EBCDIC.)

In case of programming languages they at least *could* define
an _appropriate semantics_ for such ranges, independent of the
numeric code point in the character table. (Hardly for "C", I
suppose, when something like 'a' is just an integer.)

I think that's a bit far-fetched. Regular expressions
have the same EBCDIC related issues (i.e. the discontinuous
nature of the EBCDIC alpha translations); yet, there are
no other defined characters in the gaps between the
alpha groups in EBCDIC, so [a-z] or "case 'a'...'z':" would
probably work just fine in most cases to match lowercase EBCDIC
alpha text.

I wouldn't count on that and use, in case of Regular Expressions,
the character classes; [[:alpha:]], [[:upper:]], and [[:lower:]],
respectively.
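
The closest C counterpart to those character classes is the classification functions in <ctype.h>. As a rough sketch (the two function names here are invented for illustration), compare a raw range test with `islower`:

```c
#include <ctype.h>
#include <stdbool.h>

/* Raw range test: true for every code from 'a' through 'z'.
   On a character set with gaps in the alphabet, that range can
   also cover codes that are not letters. */
static bool in_a_to_z(int ch)
{
    return ch >= 'a' && ch <= 'z';
}

/* Classification via <ctype.h>: independent of how the letters are
   laid out. The cast keeps the argument in unsigned char range, as
   the <ctype.h> functions require. */
static bool is_lower_letter(int ch)
{
    return islower((unsigned char)ch) != 0;
}
```

On an ASCII system in the "C" locale the two agree for every character; the difference only shows up on character sets where 'a' to 'z' is not a solid block of letters.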

Worst case, it could be coded as

case 'a'...'i': /* FALLTHROUGH */
case 'j'...'r': /* FALLTHROUGH */
case 's'...'z':
do something;
break;

Which I'd consider to be very ugly, especially given that here you
hard-code your logic for a specific character code table. - Quite anachronistic, I'd say, if not non-sophisticated.

Curious; is anyone still programming on EBCDIC systems?

Janis

--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Fri May 15 12:23:52 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 15/05/2026 14:54, Bart wrote:

Prefix and Postfix ops together, when clustered around a specific
term, have their own set of rules. They are quite different from
binary ops.

I'm going to bail out here. This is not going anywhere.

Either people don't understand the subject, or are pretending not to,
or just want to have a go.

I have quite a bit of knowledge and practical experience of the
subject, even if people here don't like to admit that, but I'm poor at getting the point across.

I think that most of us found your idea that precedence tables
should exclude unary ops to be so bizarre that we weren't sure you
actually meant it.

If you prefer such tables, that's fine. It apparently works for
you, and it doesn't affect anyone else. If you insist that your
preference is the one and only right way to construct a precedence
table, you're going to get some pushback.

I will reconsider my claim that precedence tables don't need to
include anything beyond binary ops, if somebody can give a reference
to such a table in the C standard.

That's disingenuous. You know, because you've been told several
times in this thread, that there is precedence table in the C
standard. You also know that there is a precedence table, that
includes unary, postfix, binary, and ternary operators, in K&R2.

Your personal preference for a precedence table that excludes unary
and postfix operators is perfectly valid for you. Other people's
preference for a table that includes all the operators is perfectly
valid for them. (The evidence so far suggests that the latter
includes everyone but you.)

You need to understand that your personal preferences, while they may
be perfectly valid, are unusual, and I advise you to stop trying to
pretend that they're some kind of universal truth. You've wasted
a lot of time here arguing that precedence tables *should* omit
unary and postfix operators, when in fact it's nothing more than
your personal preference, one not shared by others.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Fri May 15 12:47:04 2026

From Newsgroup: comp.lang.c

scott@slp53.sl.home (Scott Lurndal) writes:

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

scott@slp53.sl.home (Scott Lurndal) writes:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10u2jpk$2t96p$6@kst.eternal-september.org>,

[...]

If C's switch statement were to be
changed, it would have to use something that's currently a syntax
error. Perhaps something like

case 1, case 2, case 3, case 4: whatever();

Sure, that's better.

case 1...4: whatever();

is a typical GCC extension (that we use heavily).

Yes, and the C2y draft adopts that syntax.

(One possible reason it wasn't adopted sooner is that `case 'a'...'z'`
doesn't necessarily work if the letters are not contiguous, for
example in EBCDIC.)

I think that's a bit far-fetched. Regular expressions
have the same EBCDIC related issues (i.e. the discontinuous
nature of the EBCDIC alpha translations); yet, there are
no other defined characters in the gaps between the
alpha groups in EBCDIC, so [a-z] or "case 'a'...'z':" would
probably work just fine in most cases to match lowercase EBCDIC
alpha text.

I understand there are different versions of EBCDIC. According to
the table in the Wikipedia article, '~' is between 'r' and 's',
'}' is between 'I' and 'J', and '\\' is between 'R' and 'S'.

ISO C has no support for regular expressions.

Worst case, it could be coded as

case 'a'...'i': /* FALLTHROUGH */
case 'j'...'r': /* FALLTHROUGH */
case 's'...'z':
do something;
break;

In fact EBCDIC, though not mentioned by name, was part of the reason for
not supporting case ranges. Quoting the ANSI C Rationale:

Case ranges (of the form lo .. hi) were seriously considered,
but ultimately not adopted in the Standard on the grounds
that it added no new capability, just a problematic coding
convenience. The construct seems to promise more than it could
be mandated to deliver:

- A great deal of code (or jump table space) might be generated
for an innocent-looking case range such as 0 .. 65535.

- The range 'A'..'Z' would specify all the integers between
the character code for A and that for Z. In some common
character sets this range would include non-alphabetic
characters, and in others it might not include all the
alphabetic characters (especially in non-English character
sets).

No serious consideration was given to making the switch more
structured, as in Pascal, out of fear of invalidating working
code.

gcc and C2Y use "..." rather than ".." because it's an existing token,
used in variadic function declarations. `1..4` is actually a
preprocessing number, resulting in a syntax error when it's
converted to an integer constant.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Fri May 15 12:54:09 2026

From Newsgroup: comp.lang.c

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Bart <bc@freeuk.com> writes:

[...]

I will reconsider my claim that precedence tables don't need to
include anything beyond binary ops, if somebody can give a reference
to such a table in the C standard.

That's disingenuous. You know, because you've been told several
times in this thread, that there is precedence table in the C
standard. You also know that there is a precedence table, that
includes unary, postfix, binary, and ternary operators, in K&R2.

Sorry, editing error. I meant that there is *no* precedence table
in the C standard.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Fri May 15 13:15:44 2026

From Newsgroup: comp.lang.c

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
[...]

gcc and C2Y use "..." rather than ".." because it's an existing token,
used in variadic function declarations. `1..4` is actually a
preprocessing number, resulting in a syntax error when it's
converted to an integer constant.

That actually raises a couple of interesting issues.

C2y (N3783) defines:

constant-range-expression:
constant-expression ... constant-expression

with this description:

A *constant range expression* is a special syntactic form that
describes a sequence of contiguously incrementing integer
values ranging from the arithmetic value of the first constant
expression to the second, without listing intermediate values
explicitly. The operator is not a value expression and is only
permitted in specific contexts, such as the operand to a case
label to indicate that a single label matches multiple values.

In fact I'm fairly sure that's the only context in which it can
appear, but I suppose it doesn't hurt to allow for other uses in
the future.

It says the *operator* is not a "value expression", a phrase not used
anywhere else. (Is `...` an operator?) It would be more accurate
to say that a constant-range-expression is not an expression,
and IMHO better to use a different name, such as constant-range.

I had assumed that `case 1...2:` was intended to be valid, but
due to the maximal munch rule `1...2` is a preprocessing number.
Both gcc and clang flag it as a syntax error, whether with "-std=c2y"
(treating it as a standard feature) or without it (treating it as
an extension).

So `case '0'...'1':` is valid, and so is `case foo...bar:` if
foo and bar are constant expressions, but `case 1...2:` is not.
The whitespace is a good idea anyway, but a footnote would be helpful
-- or perhaps a change in the syntax of preprocessing numbers so
they can't contain "...".
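
A minimal sketch of the form that does work, assuming a compiler that accepts case ranges (GCC and Clang do, as an extension). The function name and the return codes are invented for illustration; the only point is the whitespace around `...` when the bounds are integer constants.

```c
/* Classify n into small integer codes using case ranges.
   The spaces around "..." matter: "1...4" would be lexed as a
   single preprocessing number, while "1 ... 4" is three tokens. */
static int size_class(int n)
{
    switch (n) {
    case 0:
        return 0;          /* none */
    case 1 ... 4:
        return 1;          /* few */
    case 5 ... 9:
        return 2;          /* several */
    default:
        return 3;          /* anything else */
    }
}
```

Both bounds of each range are included, so `size_class(4)` is 1 and `size_class(5)` is 2.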
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Fri May 15 21:39:52 2026

From Newsgroup: comp.lang.c

On 15/05/2026 20:23, Keith Thompson wrote:

Bart <bc@freeuk.com> writes:

On 15/05/2026 14:54, Bart wrote:

Prefix and Postfix ops together, when clustered around a specific
term, have their own set of rules. They are quite different from
binary ops.

I'm going to bail out here. This is not going anywhere.

Either people don't understand the subject, or are pretending not to,
or just want to have a go.

I have quite a bit of knowledge and practical experience of the
subject, even if people here don't like to admit that, but I'm poor at
getting the point across.

I think that most of us found your idea that precedence tables
should exclude unary ops to be so bizarre that we weren't sure you
actually meant it.

Well, I find it bizarre that they should! Let's see:

* Infix operators have 13 different precedence levels. Prefix and
Postfix have one each

* You can vary evaluation order of infix expressions with parentheses.
Prefix/Postfix are much more limited (you might force a prefix op to
be done before a postfix op on the same term, but cannot change the
order of each set)

* Half the prefix/postfix aren't even proper operators IMO (see below)

I mean, have you ever needed to look up a precedence level for anything
other than an infix operator?
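
A short illustration of the prefix/postfix point, as a sketch only (the function names are invented). In C, postfix operators bind more tightly than prefix ones, so `*p++` means `*(p++)`; parentheses are what it takes to apply the prefix operator first.

```c
/* *p++ groups as *(p++): fetch the int p points at, then advance p.
   The caller's pointer is updated through pp. */
static int read_and_advance(const int **pp)
{
    const int *p = *pp;
    int value = *p++;      /* postfix ++ applies to p, not to *p */
    *pp = p;
    return value;
}

/* (*p)++ forces the dereference first: the pointed-to int is
   incremented and the pointer itself is unchanged.
   Returns the value before the increment. */
static int bump_target(int *p)
{
    return (*p)++;
}
```

Given `int arr[2] = {10, 20}` and a pointer to `arr[0]`, `read_and_advance` returns 10 and leaves the pointer at `arr[1]`, while `bump_target(&arr[0])` returns 10 and leaves `arr[0]` equal to 11.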

I will reconsider my claim that precedence tables don't need to
include anything beyond binary ops, if somebody can give a reference
to such a table in the C standard.

That's disingenuous. You know, because you've been told several
times in this thread, that there is precedence table in the C
standard.

Whereabouts?

You also know that there is a precedence table, that
includes unary, postfix, binary, and ternary operators, in K&R2.

Usually you're reluctant to discuss anything that isn't mentioned
in the standard.

Your personal preference for a precedence table that excludes unary
and postfix operators is perfectly valid for you. Other people's
preference for a table that includes all the operators is perfectly
valid for them. (The evidence so far suggests that the latter
includes everyone but you.)

This is information about Go (from https://rosettacode.org/wiki/Operator_precedence#Go):

---------------------------------
Precedence Operators

Highest Unary operators: +, -, !, ^, *, &, <-
5 *, /, %, <<, >>, &, &^
4 +, -, |, ^
3 ==, !=, <, <=, >, >=
2 &&
1 ||

Syntactic elements not in the list are not considered operators in Go;
if they present ambiguity in order of evaluation, the ambiguity is
resolved by other rules specific to those elements.
---------------------------------

Notice how much tidier it is than C's dozen or so levels AND EASIER TO
REMEMBER because the choices make sense.

Although this still includes a section on unary, it is one level only,
and is again not relevant for someone seeking clarification on an infix op.

It is sufficient to know that a unary op automatically binds more
tightly. It's not even given a level number.

(My own views are also that '() [] .' etc are syntactic elements, not operators. I also have 5 precedence levels for those sets of binary ops
(but have one more for '**').

My choices are almost exactly like Go's, except that & is lumped with |
and ^, since I consider level 5 ops to scale values, but level 4 ops do
not scale.

So, the way I think is not as far-out as you suggest. I never looked at
Go's operators until today; it is telling that my ideas and Google's
have more or less converged.)

--- Synchronet 3.22a-Linux NewsLink 1.2

From Bart@bc@freeuk.com to comp.lang.c on Fri May 15 21:52:08 2026

From Newsgroup: comp.lang.c

On 13/05/2026 22:46, Scott Lurndal wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10u2jpk$2t96p$6@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10u0k0k$1l93l$30@dont-email.me>,

[...]

It's easy to get wrong. Other languages accommodate both
semantics using alternation in the selector arm. For example,
one might imagine an hypothetical syntax, something like:

switch (a) {
case 1 || 2 || 3 || 4: whatever();
default: other();
}

...with no `break` to end each `case`.

That's already valid syntax.

It wasn't meant to be taken as a serious suggestion!

If C's switch statement were to be
changed, it would have to use something that's currently a syntax
error. Perhaps something like

case 1, case 2, case 3, case 4: whatever();

Sure, that's better.

case 1...4: whatever();

is a typical GCC extension (that we use heavily).

This is a bit of a quandary in C.

C is zero-based; other zero-based languages tend to have open ranges,
with an exclusive upper bound. That means that A..B means A to B-1
inclusive, and 0..N means 0 to N-1.

Some might use a special syntax to make it clearer, such as 0..<N.

On the other hand, an inclusive range is far more intuitive in
general, such as in 'A'...'Z' and Mon...Fri.

I guess one more quirk at this point makes little difference.
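
To put the two conventions side by side, here is a small sketch; the function names are invented, and the case range relies on the GCC/Clang extension. The usual C loop idiom is half-open, while a case range includes both of its bounds. The digits '0' to '9' are a safe example because C guarantees their codes are contiguous.

```c
/* Inclusive range: both '0' and '9' match. */
static int is_digit_char(int ch)
{
    switch (ch) {
    case '0' ... '9':
        return 1;
    default:
        return 0;
    }
}

/* Half-open range: visits lo, lo+1, ..., hi-1, so the count is hi - lo. */
static int count_half_open(int lo, int hi)
{
    int n = 0;
    for (int i = lo; i < hi; i++)
        n++;
    return n;
}
```

So `count_half_open(0, 10)` visits ten values (0 to 9), while the range `'0' ... '9'` also covers ten values but names the last one explicitly.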

--- Synchronet 3.22a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Fri May 15 14:14:52 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:

On 15/05/2026 20:23, Keith Thompson wrote:

[...]

I think that most of us found your idea that precedence tables
should exclude unary ops to be so bizarre that we weren't sure you
actually meant it.

Well, I find it bizarre that they should!

Already acknowledged.

[...]

That's disingenuous. You know, because you've been told several
times in this thread, that there is precedence table in the C
standard.

Whereabouts?

I accidentally omitted a word there. There is *no* precedence
table in the C standard.

[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.22a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Fri May 15 22:16:03 2026

From Newsgroup: comp.lang.c

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

scott@slp53.sl.home (Scott Lurndal) writes:

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

scott@slp53.sl.home (Scott Lurndal) writes:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <10u2jpk$2t96p$6@kst.eternal-september.org>,

[...]

If C's switch statement were to be
changed, it would have to use something that's currently a syntax
error. Perhaps something like

case 1, case 2, case 3, case 4: whatever();

Sure, that's better.

case 1...4: whatever();

is a typical GCC extension (that we use heavily).

Yes, and the C2y draft adopts that syntax.

(One possible reason it wasn't adopted sooner is that `case 'a'...'z'`
doesn't necessarily work if the letters are not contiguous, for
example in EBCDIC.)

I think that's a bit far-fetched. Regular expressions
have the same EBCDIC related issues (i.e. the discontinuous
nature of the EBCDIC alpha translations); yet, there are
no other defined characters in the gaps between the
alpha groups in EBCDIC, so [a-z] or "case 'a'...'z':" would
probably work just fine in most cases to match lowercase EBCDIC
alpha text.

I understand there are different versions of EBCDIC. According to
the table in the Wikipedia article, '~' is between 'r' and 's',
'}' is between 'I' and 'J', and '\\' is between 'R' and 'S'.

Ah yes. I had the Burroughs EBCDIC card handy when I wrote that.

IBM was always fond of doing things their way.

<snip>

In fact EBCDIC, though not mentioned by name, was part of the reason for
not supporting case ranges. Quoting the ANSI C Rationale:

Ah, hadn't seen that. Makes sense.

--- Synchronet 3.22a-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Sat May 16 00:44:39 2026

From Newsgroup: comp.lang.c

On 2026-05-15 22:39, Bart wrote:

[...]

The fewer precedence groups you have the more parentheses you will
have to use in expressions. And vice versa. - The actual choice is
a decision of the respective language designers.

Actually, ranges of about ten levels seem to be not uncommon amongst programming languages to provide a sensible, widely accepted grouping.
Language designers seem to be trying to avoid a flood of unnecessary parentheses in programs. (Note: that may not apply to Lisp people. :-)

Janis

<OT-begin>

This is information about Go [...]

[snip table]

[...]

And here's a precedence table for another language that facilitates
that there's no parenthesis at all necessary to obtain unambiguous
expressions:

   lvl  op
  ---------
     1    !!
     2    !-o
     3    !$
     4    !%
     5    !/
     6    !+
     7    !~
     8    !=
     9    -o!
    10    -o-o
    11    -o$
    12    -o%
    13    -o/
    14    -o+
    15    -o~
    16    -o=
    17    $!
    18    $-o
    19    $$
    20    $%
    21    $/
    22    $+
    23    $~
    24    $=
    25    %!
    26    %-o
    27    %$
    28    %%
    29    %/
    30    %+
    31    %~
    32    %=
    33    /!
    34    /-o
    35    /$
    36    /%
    37    //
    38    /+
    39    /~
    40    /=
    41    +!
    42    +-o
    43    +$
    44    +%
    45    +/
    46    ++
    47    +~
    48    +=
    49    ~!
    50    ~-o
    51    ~$
    52    ~%
    53    ~/
    54    ~+
    55    ~~
    56    ~=
    57    =!
    58    =-o
    59    =$
    60    =%
    61    =/
    62    =+
    63    =~
    64    ==

And here is a variant with only one level of operator precedence,
with all operators listed in a single group[*]:

!! !-o !$ !% !/ !+ !~ != -o! -o-o -o$ -o% -o/ -o+ -o~ -o=
$! $-o $$ $% $/ $+ $~ $= %! %-o %$ %% %/ %+ %~ %=
/! /-o /$ /% // /+ /~ /= +! +-o +$ +% +/ ++ +~ +=
~! ~-o ~$ ~% ~/ ~+ ~~ ~= =! =-o =$ =% =/ =+ =~ ==

where sub-expressions will have to be enclosed in parentheses,
generally.

[*] "All men, all animals, and all operators are equal!"[**]

[**] "But some men, animals, and operators are more equal than others!"

<OT-end>


From Bart@bc@freeuk.com to comp.lang.c on Sat May 16 00:36:35 2026

From Newsgroup: comp.lang.c

On 15/05/2026 23:44, Janis Papanagnou wrote:

On 2026-05-15 22:39, Bart wrote:

[...]

The fewer precedence groups you have the more parentheses you will
have to use in expressions.

Says whom?

The more precedence levels there are, the more parentheses need to be
used because you or your readers can't remember what they are.

With about the right number, small enough that most people will be able
to parse intuitively, parentheses are used to override the default
priorities.

And vice versa. - The actual choice is
a decision of the respective language designers.

Some language designers, especially in the past (and including me)
thought that lots of precedence levels was a Feature. In fact it was a worthless one.

I eventually learnt my lesson, but I was able to adapt my language. That
isn't always the case.

Actually, ranges of about ten levels seem to be not uncommon amongst programming languages to provide a sensible, widely accepted grouping. Language designers seem to be trying to avoid a flood of unnecessary parentheses in programs.

I use 5 levels where C uses ten (from multiply to ||).
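
To make the "ten levels" concrete: in C the binary operators from * down to || sit on ten levels (multiplicative, additive, shift, relational, equality, &, ^, |, &&, ||). A small sketch of the classic trap this produces, where == binds tighter than &. The function names are invented for illustration, and a compiler may warn about the first one.

```c
/* Intended test: "is the low bit of x clear?", written without
   parentheses. Because == binds tighter than &, this parses as
   x & (1 == 0), i.e. x & 0, which is always 0. */
static int low_bit_clear_wrong(unsigned x)
{
    return x & 1 == 0;
}

/* Parentheses override the default grouping and give the intended test. */
static int low_bit_clear_right(unsigned x)
{
    return (x & 1) == 0;
}
```

For x = 2 the first version yields 0 and the second yields 1.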

Parentheses used to override default precedence only occur about once
per 100 lines of code in my codebases. That is very rare.

I can't give reliable figures for C, as measured by my compiler, because
they can be skewed by, for example, macro expansions that inject parentheses everywhere. But then C source is awash with parentheses anyway.

And here's a precedence table for another language, one that ensures
that no parentheses at all are necessary to obtain unambiguous expressions:

   lvl  op
  ---------
     1    !!
     2    !-o
     3    !$

...

    61    =/
    62    =+
    63    =~
    64    ==

And here is a variant with only one level of operator precedence,
with all operators listed in a single group[*]:

   !! !-o !$ !% !/ !+ !~ != -o! -o-o -o$ -o% -o/ -o+ -o~ -o=
   $! $-o $$ $% $/ $+ $~ $= %! %-o %$ %% %/ %+ %~ %=
   /! /-o /$ /% // /+ /~ /= +! +-o +$ +% +/ ++ +~ +=
   ~! ~-o ~$ ~% ~/ ~+ ~~ ~= =! =-o =$ =% =/ =+ =~ ==

OK. I've no idea what your point is, or what all these mean.

The first language I implemented had no operator precedences and needed
no table. Expressions were unambiguous.

It was a lot more usable than this.


From James Kuyper@jameskuyper@alumni.caltech.edu to comp.lang.c on Sun May 17 20:43:51 2026

From Newsgroup: comp.lang.c

On 2026-05-13 13:48, Bart wrote:

Bart <bc@freeuk.com> writes:

The one for the ?: operator is particularly obscure, so in an
expression like one of these:

    a + b ? c - d : e * f
    a ? b ? c : d ? e : f : g

[...]

The lines are not meant to mean anything, just sequences of terms and operators. You can think of them as exercises where you add parentheses
to make them unambiguous.

What I think you mean is that the parentheses help those who are
unfamiliar with the relevant rules understand what those rules already unambiguously require.
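
As a quick check of that reading, here is a sketch in C that pairs each of the two unparenthesized expressions quoted above with the fully parenthesized form the grammar gives it. The function names and any test values are invented for illustration.

```c
/* a + b ? c - d : e * f  groups as  (a + b) ? (c - d) : (e * f),
   because ?: binds more loosely than the arithmetic operators. */
static int cond_plain(int a, int b, int c, int d, int e, int f)
{
    return a + b ? c - d : e * f;
}

static int cond_paren(int a, int b, int c, int d, int e, int f)
{
    return (a + b) ? (c - d) : (e * f);
}

/* a ? b ? c : d ? e : f : g  groups as  a ? (b ? c : (d ? e : f)) : g:
   the middle operand extends to the matching ':', and ?: associates
   to the right. */
static int nested_plain(int a, int b, int c, int d, int e, int f, int g)
{
    return a ? b ? c : d ? e : f : g;
}

static int nested_paren(int a, int b, int c, int d, int e, int f, int g)
{
    return a ? (b ? c : (d ? e : f)) : g;
}
```

Each pair returns the same value for every input; the parentheses only spell out the grouping that is already fixed.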

From James Kuyper@jameskuyper@alumni.caltech.edu to comp.lang.c on Mon May 18 19:48:07 2026

From Newsgroup: comp.lang.c

On 2026-05-14 13:32, Bart wrote:
...

In the case of C, it also decided that ?: belongs in this chart of
/binary/ operators. (I suppose you can consider each of ? and : as a
binary operator...)

When and where was that decided? There's a big difference between
putting ?: in a chart along with binary and unary operators (which
happened in section 2.12 of the 1st edition of K&R) and putting it in a
chart of binary operators. To the best of my knowledge, the latter never happened. Could you please identify where it did?

From Bart@bc@freeuk.com to comp.lang.c on Tue May 19 01:12:06 2026

From Newsgroup: comp.lang.c

On 19/05/2026 00:48, James Kuyper wrote:

On 2026-05-14 13:32, Bart wrote:
...

In the case of C, it also decided that ?: belongs in this chart of
/binary/ operators. (I suppose you can consider each of ? and : as a
binary operator...)

When and where was that decided? There's a big difference between
putting ?: in a chart along with binary and unary operators (which
happened in section 2.12 of the 1st edition of K&R) and putting it in a
chart of binary operators. To the best of my knowledge, the latter never happened. Could you please identify where it did?

If you read my post again, you'll find that 'this chart' most likely
refers to my suggested chart containing those 4 groups I mentioned.

That chart contains a set of binary (and infix) operators that also
appear in K&R2. I'm saying that whoever put that together in K&R2
decided that ?: belonged in this set.

(Personally I don't, as I don't consider a ?:-like feature to be an
operator, but even if it was, a 3-way operator is a poor fit when all
the others are 2-way.)

From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Mon May 18 19:22:08 2026

From Newsgroup: comp.lang.c

Bart <bc@freeuk.com> writes:
[...]

If you read my post again, you'll find that 'this chart' most likely
refers to my suggested chart containing those 4 groups I mentioned.

That chart contains a set of binary (and infix) operators that also
appear in K&R2. I'm saying that whoever put that together in K&R2
decided that ?: belonged in this set.

The ?: operator, which is a ternary operator, belongs in the set
of operators.

You somehow concluded that ?: was being treated as a binary operator.
It very very clearly is not a binary operator. The chart in K&R2
very very clearly does not imply that it is.

(Personally I don't, as I don't consider a ?:-like feature to be an
operator,

It is an operator in C.

but even if it was, a 3-way operator is a poor fit when all
the others are 2-way.)

No, not all the others are 2-way.

Are you now saying that you think ?: *should* be a binary (2-way)
operator? (I'm not going to ask how that would work.)
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
