Hi,
I'm currently working on an academic project that analyzes the speedup gain of
Undefined Behavior Optimizations in C. I argue that UB Optimizations introduce
more risks than actual speedup gains. This happens because, most of the time, the compiler does not have the context to understand the intention of the programmer in code that contains UB. For example, this fast inverse square root
function [1] triggers UB at this line: i = * ( long * ) &y;. A "smart" compiler could delete this line as an optimization because it contains UB.
Don't get me wrong, I'm not saying that the compiler should magically understand situations where the programmer makes UB mistakes and then
repair them. What I'm saying is that the compiler should compile the code
"as is" without making smart optimization transformations that break the intention of the programmer.
To test the theory that the UB Optimizations introduce more risks than speedup gains,
I take OpenBSD (for its focus on security and robustness) and compile it, on one hand, with UB Optimizations turned on and, on the other, with UB Optimizations turned off. If I find that the speedup gain is not so big (I don't yet know what "big" means) then the UB Optimizations don't make sense in the OS. Otherwise, it means that I was wrong, but I will still want to see what security risks they introduce on the respective OS setup.
My current progress is here [2]. I did not start the technical work, ATM I only
have the research proposal. I reached out to you to see if you have any feedback
on this proposal. Is it a manageable goal? Do you see ways in which it can
be improved? Does it suck? etc, etc.
Lucian Popescu
[1] https://en.wikipedia.org/wiki/Fast_inverse_square_root#Overview_of_the_code
[2] https://tildegit.org/lucic71/dissertation/src/branch/master/TSW/tsw.pdf
On 5 Jan 2023 10:05:49 +0000
"Lucian Popescu" <luc...@ctrl-c.club> wrote:
I'm currently working on an academic project that analyzes the speedup gain of
Undefined Behavior Optimizations in C.
To test the theory that the UB Optimizations introduce more risks than speedup gains,
Isn't this comparing apples and oranges ?
(snip) I take OpenBSD (for its focus on security and robustness) and compile it on one hand with UB Optimizations turned on and with UB Optimizations turned off.
I think it would make more sense to test with chess engines : prepare
a collection of chess positions and see how many nodes per second an engine can calculate for a given position depending on how the engine was compiled.
Doing analogous experiments with graphics code or numerical analysis code would also be interesting. But an operating system , I don't know , it seems too general.
On Thursday, January 5, 2023 at 10:13:08 AM UTC-8, Spiros Bousbouras wrote:
On 5 Jan 2023 10:05:49 +0000
"Lucian Popescu" <luc...@ctrl-c.club> wrote:
(snip) I'm currently working on an academic project that analyzes the speedup gain of
Undefined Behavior Optimizations in C.
To test the theory that the UB Optimizations introduce more risks than
speedup gains,
Isn't this comparing apples and oranges ?
Probably.
Most important when debugging, is that you can trust the compiler to
do what you said. That they don't, has always been part of
optimization, but these UB make it worse.
My thought would be all of SPEC
I haven't thought about this so recently. Do compilers give warnings
when they remove such UB code?
You can quantify speed-up, but it is harder to quantify risk.
You might be able to quantify debug time, and how much longer
it takes to debug a program with such behavior.
Most important when debugging, is that you can trust the compiler to
do what you said. That they don't, has always been part of
optimization, but these UB make it worse.
The trouble with undefined behaviour is that, in general, you cannot
trust the compiler to "do what you say" because it cannot know what you
have said.
Now, this would not be a problem at all in C, with short-circuit IF evaluation required, but even now Fortran doesn't do short
circuit IF evaluation. The way Fortran did static allocation in 1977,
there was pretty much no chance that you would access outside
your address space evaluating X(0). There is a chance, though,
that the value 3 is stored there.
[Both Fortran 66 and 77 could do short-circuit evaluation, but it's
not required. I think most compilers did, since it was easy, but
of course you couldn't count on it. -John]
Hi,
I'm currently working on an academic project that analyzes the speedup gain of
Undefined Behavior Optimizations in C. I argue that UB Optimizations introduce
more risks than actual speedup gains. This happens because most of the time the compiler does not have the context to understand the intention of the programmer in code that contains UB. For example this fast inverse square root
function [1] triggers UB at this line: i = * ( long * ) &y;. A "smart" compiler could delete this line as an optimization because it contains UB.
Don't get me wrong, I'm not saying that the compiler should magically understand situations where the programmer makes UB mistakes and then
repair them. What I'm saying is that the compiler should compile the code
"as is" without making smart optimization transformations that break the intention of the programmer.
On 06/01/2023 01:22, gah4 wrote:
On Thursday, January 5, 2023 at 10:13:08 AM UTC-8, Spiros Bousbouras wrote:
On 5 Jan 2023 10:05:49 +0000
"Lucian Popescu" <luc...@ctrl-c.club> wrote:(snip)
I'm currently working on an academic project that analyzes the speedup gain of
Undefined Behavior Optimizations in C.
To test the theory that the UB Optimizations introduce more risks than speedup gains,
Isn't this comparing apples and oranges ?
Probably.
You can quantify speed-up, but it is harder to quantify risk.
You might be able to quantify debug time, and how much longer
it takes to debug a program with such behavior.
Most important when debugging, is that you can trust the compiler to
do what you said. That they don't, has always been part of
optimization, but these UB make it worse.
The trouble with undefined behaviour is that, in general, you cannot
trust the compiler to "do what you say" because it cannot know what you
have said.
On 2023-01-06, David Brown <david.brown@hesbynett.no> wrote:
On 06/01/2023 01:22, gah4 wrote:
Most important when debugging, is that you can trust the compiler to
do what you said. That they don't, has always been part of
optimization, but these UB make it worse.
The trouble with undefined behaviour is that, in general, you cannot
trust the compiler to "do what you say" because it cannot know what you
have said.
That's correct; however, what we should be able to trust is for the
compiler not to make it worse.
The compiler makes it worse when it assumes that the programmer is infallible, and thus makes logical inferences predicated on some
construct never having undefined behavior, using those to guide the translation of other constructs.
This is particularly harmful when the undefined behavior is *de facto*
defined: that is, when the undefinedness aspect is ignored and the
obvious code is emitted, that code will do something characteristic
of the machine.
In C99, the original definition of undefined behavior is this:
behavior, upon use of a nonportable or erroneous program construct or
of erroneous data, for which this International Standard imposes no
requirements.
NOTE: Possible undefined behavior ranges from ignoring the situation
completely with unpredictable results, to behaving during translation
or program execution in a documented manner characteristic of the
environment (with or without the issuance of a diagnostic message), to
terminating a translation or execution (with the issuance of a
diagnostic message).
There is an obvious interpretation of this text which rules out
the too-clever optimizations.
If the compiler optimizes construct Y based on some other construct
X being free of undefined behavior, and that assumption turns
out to be false, that compiler has not conformed to the "NOTE"
part of the treatment of undefined behavior.
Firstly, it has not "ignor[ed] the situation completely". You cannot
assume that X is well-defined, in order to treat Y in some way, and yet
say that you're completely ignoring the situation of X. If the UB
problem in X causes a secondary problem with the way Y was translated,
that is a problem with the assumption that was made about X. That
assumption was made because it's possible for X to be undefined, and
that possibility was explicitly disregarded, which is not the same as completely ignored.
The situation also isn't a documented manner characteristic of
the environment.
It also isn't a termination of translation or execution with or without
a diagnostic message.
The situation doesn't conform to the NOTE.
C compiler writers should take the NOTE more seriously.
It is particularly harmful when programmers think there is such a thing
as "de facto defined". That's an oxymoron. If the behaviour is
defined, it is defined. If it is not defined, it is not defined. If it
is not defined and a programmer makes unwarranted and incorrect
assumptions about what they think it means, then the programmer needs to update his or her understanding of the language. They don't get to
blame the compiler or the compiler writer for not making the same
unfounded assumptions that they did.
So before we decide if UB optimizations are actually allowed by the standard we need to decide what "ignoring the situation completely
with unpredictable results" actually means.
[1] https://port70.net/~nsz/c/c89/rationale/
Lucian
WG14 are aware of UB optimising compilers and could have steered away from this path, but haven't. It's been decades now. The pointer provenance work seeks to apply aliasing rules even more aggressively. GCC and clang are
both pursuing faster codegen via exploiting undefined behaviour.
C, the WG14 ISO defined language, as implemented by the primary open source toolchains, is thus unfit for my purposes.
I'm not clear what use that language has.
C, the typed assembler of ye olde times, is a profoundly useful language.
One just can't use GCC or clang to build it reliably.
It annoys me intensely that the type aliasing rules capture something a
whole program optimising compiler can usually work out for itself anyway,
while preventing me from reading 128-bit integers from the same memory I fetch_add 32-bit integers into.
On Tuesday, January 10, 2023 at 2:00:57 PM UTC-8, David Brown wrote:
(big snip)
It is particularly harmful when programmers think there is such a thing
as "de facto defined". That's an oxymoron. If the behaviour is
defined, it is defined. If it is not defined, it is not defined. If it
is not defined and a programmer makes unwarranted and incorrect
assumptions about what they think it means, then the programmer needs to
update his or her understanding of the language. They don't get to
blame the compiler or the compiler writer for not making the same
unfounded assumptions that they did.
A lot of C code assumes two's complement. I believe Unisys still sells
ones' complement hardware, and so that code might have UB, but often
people know that it will only run on two's complement machines.
Much also assumes ASCII code. Don't tell IBM about that.
If C compilers had a fatal compilation error when they found
suspicious UB, I could live with that. Maybe it means doing
something to convince the compiler, even if it really is still UB.
Remember, many of the data breaches we hear about
(and also the ones we don't) are due to buffer overflow
in C programs. Make it easier, instead of harder, for programmers
to avoid buffer problems.
Much also assumes ASCII code. Don't tell IBM about that.
There is usually no requirement for a given piece of C code to be fully portable to all conforming compilers. Most C code is quite limited in
the scope of its use, and it is quite reasonable to assume two's
complement representation (though /not/ two's complement wrapping on overflow), 8-bit char, ASCII characters, etc. It may be fine to assume
32-bit int, and perhaps little-endian ordering. These are /warranted/ assumptions, not unwarranted ones - they are typically reasonable to
make, and if you want to you can sometimes put compile-time checks so
that if someone does try to use it out of context, they get an error
message.
Jon Chesterfield <jonathanchesterfield@gmail.com> schrieb:
It annoys me intensely that the type aliasing rules capture something a
whole program optimising compiler can usually work out for itself anyway,
Link-time optimization, while highly useful, currently has severe
scaling issues. This is an area where advances would be really
useful (if implemented in production compilers).
Link-time optimization violates ISO C, unfortunately.
ISO C describes C translation in a number of phases. I'm going to use
C99 numbering; it might have changed in C11.
In translation phase 7, syntax and semantic analysis is performed on the tokens coming from the previous phases (preprocessing), and the
translation unit is translated.
Then in translation phase 8, multiple translation units are combined,
and references are resolved.
Link time optimization breaks the layering; it reintroduces semantic
analysis into translation phase 8, where only linking is supposed to be taking place.
When the compiler/linker peeks into translation unit a.o in order to
alter something in b.o, you no longer have translation units.
[I don't see the problem, since there's invariably an "as if" rule
that lets you do whatever you want so long as the results are the same
as if you'd done everything in sequence. If a linktime optimizer looks
at the code, promotes a couple of very busy variables that never have
their addresses taken into global registers, so what? Or if it can
globally tell that two variables are never aliased so it can reuse a
register copy of one after modifying the other, again so what? -John]
On Fri, 6 Jan 2023 10:33:29 -0800 (PST)
gah4 <gah4@u.washington.edu> wrote:
It turns out, though, to have a fair number of statements like:
IF(I.GT.0.AND.X(I).EQ.3) ...
That is, test the subscript and use it in the same IF statement.
Now, this would not be a problem at all in C, with short-circuit IF evaluation required, but even now Fortran doesn't do short
circuit IF evaluation. ...
The analogous C code would be
if (I > 0 && X[I] == 3) ...
C was designed from day one to be a high-level language, not an
assembler of any sort. Limitations of weaker earlier compilers do
not mean the language was supposed to work that way.
I first used a C compiler that optimised on the assumption that UB
didn't happen some 25 years ago. (In particular, it assumed signed
integer arithmetic never overflowed.)
It annoys /me/ intensely that people complain about this sort of thing,
and yet apparently haven't bothered to read the compiler manuals to see
how to get the effects they want. Compile with "-fno-strict-aliasing",
or (better, IMHO) add this to your code:
#pragma GCC optimize ("-fno-strict-aliasing")
Now, if you want to complain that the gcc documentation is not great, (snip)
On Wed, 11 Jan 2023 14:20:49 +0100
David Brown <david.brown@hesbynett.no> wrote:
C was designed from day one to be a high-level language, not an
assembler of any sort. Limitations of weaker earlier compilers does
not mean the language was supposed to work that way.
For those who want an abstract or portable assembler, there exists c9x.me/compile/. I've never used it but at least it aims to be that,
unlike C. I would be curious to know of other analogous projects. I
guess the "register transfer language" of GCC is somewhat analogous.
I first used a C compiler that optimised on the assumption that UB
didn't happen some 25 years ago. (In particular, it assumed signed
integer arithmetic never overflowed.)
I have encountered several times the claim that compilers assume that UB does not happen and I don't understand it. Let's consider 2 examples:
x + 1 > x
in C where x is a signed integer. Compilers will often treat this as
always true with the following reasoning :
- if x does not have the maximum value which fits in its type then the
meaning of the C expressions is the same as their mathematical meaning
so the expression evaluates to true.
- if x has the maximum value which fits in its type then x + 1 is not
defined so any translation (including treating the whole expression as
true) is valid.
There's no assumption that UB (undefined behaviour) will not happen; both possibilities are accounted for.
Another example is
... *some_pointer_object ...
[ some_pointer_object does not get modified in this part of the code and
has not been declared as volatile ]
if (some_pointer_object == NULL) ...
If some_pointer_object is not NULL then the test can be omitted; if it is NULL then the earlier dereference is UB so any translation is valid, including omitting the test.
Again, there's no assumption that UB will not happen.
So the request that C compilers should stop assuming that UB will not
happen seems to me completely misguided. I think what is really meant
is that, in reasoning what a valid translation is, C compilers (or
the authors of the compilers) should not employ the notion of UB. But
then how should UB be translated ? Again there exists the assumption
or claim that there is some intuitively obvious translation and
compilers should go for that. First, I'm not sure that there exists
such a common intuition even among humans and second, even if it does,
how does one go from an intuition to an algorithm C compilers can
use to do translation ? Lots of things are intuitively obvious but
creating an algorithm to duplicate the human intuition is a hard
problem, one which has not been solved in many cases and perhaps even
one which is unsolvable in some cases.
I have encountered several times the claim that compilers assume that UB does not happen and I don't understand it. Lets consider 2 examples :
x + 1 > x
in C where x is a signed integer. Compilers will often treat this as
always true with the following reasoning :
- if x does not have the maximum value which fits in its type then the meaning of the C expressions is the same as their mathematical meaning
so the expression evaluates to true.
- if x has the maximum value which fits in its type then x + 1 is not
defined so any translation (including treating the whole expression as
true) is valid.
There's no assumption that UB (undefined behaviour) will not happen, both possibilities are accounted for.
So if you have :
int x, y;
if (x + 1 > x) y++; // (a)
if (x == INT_MAX) y = 10; // (b)
From your example above, we can see that the compiler can transform (a)
into "y++;" - there is no need for the conditional. But the compiler
can /also/ transform (b) into ";" - it is allowed to reason that if x
/were/ equal to INT_MAX, statement (a) would be undefined behaviour
(even though it was transformed away) and there is no value for x which
would result in "y = 10" being executed without also executing UB.
I believe in a case like this a modern C/C++ compiler reasons that x must be less than the maximum representable value and it generates code accordingly, possibly removing dead code that depends on x being the maximum representable value. If the compiler's assumption that x is less than the maximum is wrong, it's perfectly fine: it's UB, and any "broken" code generated is allowed.
For many years, C allowed for sign-magnitude and ones' complement representation.
Now, C has unsigned int which you can use when you need specific
overflow behavior (except on some Unisys machines).
As C allows for both UB and system-dependent behavior, and it is
hard for people to remember every case of each one, it is unreasonable,
to me, to assume fixed-point overflow is UB.
Take the function
int add (int a, int b)
{
return a+b;
}
on an instruction set architecture which has only 64-bit
arithmetic, such as POWER. This is translated by gcc,
with optimization, to
add 3,3,4
extsw 3,3
blr
(which is an addition followed by a sign extension). The POWER ABI
specifies that all values passed in registers are sign-extended,
so the content of a register has the same value independent of
the width of the signed integer it is being considered as.
So, the compiler would be within its right to _not_ extend the sign
of the result, because it could assume that no overflow occurs.
This, however, would result in a violation of the ABI, so the
compiler puts in the extra instruction just in case. If you
replace int by long in the example above, the sign extension
instruction is not generated.
By comparison, MIPS gcc translates this to
jr $31
addu $2,$4,$5
(note use of the delay slot), so no explicit sign extension is done.
Alexei A. Frounze <alexfrunews@gmail.com> schrieb:
I believe in a case like this a modern C/C++ compiler reasons that x must be
less than the maximum representable value and it generates code according
to this, possibly removing dead code that depends on x being the maximum
representable value. If the compiler's assumption that x is less than the
maximum is wrong, it's perfectly fine, it's UB, any "broken" code generated
is allowed.
There are cases when compilers don't even use this knowledge.
AMD64 specifies zero-extension for both signed
and unsigned ints (and has instructions that generate zero-extended
results). (snip)
64-bit zero extension of %rdi. So gcc apparently assumes that passed
operands are garbage-extended on AMD64. Or maybe gcc is just (snip)