• Undefined Behavior Optimizations in C

    From Lucian Popescu@lucic71@ctrl-c.club to comp.compilers on Thu Jan 5 10:05:49 2023
    From Newsgroup: comp.compilers

    Hi,

    I'm currently working on an academic project that analyzes the speedup gain of Undefined Behavior Optimizations in C. I argue that UB Optimizations introduce more risks than actual speedup gains. This happens because, most of the time, the compiler does not have the context to understand the intention of the programmer in code that contains UB. For example, this fast inverse square root function [1] triggers UB at this line: i = * ( long * ) &y;. A "smart" compiler could delete this line as an optimization because it contains UB.
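    For readers without [1] at hand, here is a sketch of the function (close to the Wikipedia listing; the comments are mine), with the line in question marked and the usual well-defined memcpy alternative noted:

    float Q_rsqrt(float number) {
        long i;
        float x2, y;
        const float threehalfs = 1.5F;

        x2 = number * 0.5F;
        y  = number;
        i  = * ( long * ) &y;                     /* reads the float through a long lvalue: UB (aliasing) */
        i  = 0x5f3759df - ( i >> 1 );             /* the magic-constant bit trick                          */
        y  = * ( float * ) &i;                    /* UB again, in the other direction                      */
        y  = y * ( threehalfs - ( x2 * y * y ) ); /* one Newton iteration                                  */
        return y;
    }

    /* The defined way to express the same bit copy is memcpy into a uint32_t
       (<string.h>, <stdint.h>); compilers generally turn it into the same single move. */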

    Don't get me wrong, I'm not saying that the compiler should magically understand situations where the programmer makes UB mistakes and then
    repair them. What I'm saying is that the compiler should compile the code
    "as is" without making smart optimization transformations that break the intention of the programmer.

    To test the theory that the UB Optimizations introduce more risks than speedup gains, I take OpenBSD (for its focus on security and robustness) and compile it once with UB Optimizations turned on and once with UB Optimizations turned off. If I find that the speedup gain is not that big (I don't know yet what "big" means) then the UB Optimizations don't make sense in the OS. Otherwise, it means that I was wrong, but I will still want to see what security risks they introduce on the respective OS setup.

    My current progress is here [2]. I have not started the technical work yet; ATM I only have the research proposal. I'm reaching out to see if you have any feedback on this proposal. Is it a manageable goal? Do you see ways in which it can be improved? Does it suck? etc, etc.

    Lucian Popescu

    [1] https://en.wikipedia.org/wiki/Fast_inverse_square_root#Overview_of_the_code
    [2] https://tildegit.org/lucic71/dissertation/src/branch/master/TSW/tsw.pdf
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Nuno Lopes@nuno.lopes@tecnico.ulisboa.pt to comp.compilers on Thu Jan 5 10:24:13 2023
    From Newsgroup: comp.compilers

    Hi Lucian,

    I've a fair bit of experience with UB-related optimizations in LLVM. I don't think the speedup and code size reduction (also important) from certain UB exploitation are negligible. I think a better project would be to quantify the impact of each of the UB features.
    As a counter argument to your thesis, see the slides of my presentation "Semantics for Compiler IRs: Undefined Behavior is not Evil!": https://web.ist.utl.pt/nuno.lopes/pres/ub-vmcai19.pptx

    See an example here that gives a 39% speedup in a loop: https://web.ist.utl.pt/nuno.lopes/pres/poison-llvm16.pptx
    More data on how alias analysis loves UB: https://web.ist.utl.pt/nuno.lopes/pres/ub-vmcai19.pptx

    There's a key implementation difficulty in doing your study. While it's feasible to disable most UB exploitation on the clang side (not with compiler switches; you need to change the code), on the LLVM side it is trickier to do a fair comparison, because the optimizations themselves produce UB code, and that's fine; it's just their way of expressing invariants. But then you cannot distinguish between what's an invariant and what's an exploitation of UB in the program.

    Don't get me wrong, I would be very keen to read such a study. But it's a
    lot more work than you think to do it right.

    Do ping me privately if you want more information/pointers.

    Nuno


    -----Original Message-----
    From: Lucian Popescu
    Sent: Thursday, January 5, 2023 10:06 AM
    Subject: Undefined Behavior Optimizations in C

    Hi,

    I'm currently working on an academic project that analyzes the speedup gain
    of Undefined Behavior Optimizations in C. I argue that UB Optimizations introduce more risks than actual speedup gains. This happens because most of the time the compiler does not have the context to understand the intention
    of the programmer in code that contains UB. ...
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Spiros Bousbouras@spibou@gmail.com to comp.compilers on Thu Jan 5 18:06:53 2023
    From Newsgroup: comp.compilers

    On 5 Jan 2023 10:05:49 +0000
    "Lucian Popescu" <lucic71@ctrl-c.club> wrote:
    > Hi,
    >
    > I'm currently working on an academic project that analyzes the speedup gain of
    > Undefined Behavior Optimizations in C. I argue that UB Optimizations introduce
    > more risks than actual speedup gains. This happens because most of the time the compiler does not have the context to understand the intention of the programmer in code that contains UB. For example this fast inverse square root
    > function [1] triggers UB at this line: i = * ( long * ) &y;. A "smart" compiler could delete this line as an optimization because it contains UB.
    >
    > Don't get me wrong, I'm not saying that the compiler should magically understand situations where the programmer makes UB mistakes and then
    > repair them. What I'm saying is that the compiler should compile the code
    > "as is" without making smart optimization transformations that break the intention of the programmer.

    Do you have a rigorous definition of what it means for a compiler to

    compile the code "as is" without making smart optimization
    transformations that break the intention of the programmer.

    ? Without such a definition , I don't think you are making a meaningful distinction and I think it would be hard to impossible to come up with
    such a definition.

    > To test the theory that the UB Optimizations introduce more risks than speedup gains,

    Isn't this comparing apples and oranges ?

    > I take OpenBSD (for its focus on security and robustness) and compile it on one hand with UB Optimizations turned on and with UB Optimizations turned off. If I will find out that the speedup gain is not so big (I don't know what big means at the moment) then the UB Optimizations don't make sense in the OS. Otherwise, it means that I was wrong, but I will still want to see what security risks they introduce on the respective OS setup.

    Given the large variety of tasks that an operating system can be used for ,
    I'm curious to see what you are going to test on OpenBSD to determine speedup gain. I think it would make more sense to test with chess engines : prepare
    a collection of chess positions and see how many nodes per second an engine
    can calculate for a given position depending on how the engine was compiled.

    Doing analogous experiments with graphics code or numerical analysis code
    would also be interesting. But an operating system , I don't know , it seems too general.

    > My current progress is here [2]. I did not start the technical work, ATM I only
    > have the research proposal. I reached you to see if you have any feedback
    > on this proposal. Is it a manageable goal? Do you see ways in which it can
    > be improved? Does it suck? etc, etc.
    >
    > Lucian Popescu
    >
    > [1] https://en.wikipedia.org/wiki/Fast_inverse_square_root#Overview_of_the_code
    > [2] https://tildegit.org/lucic71/dissertation/src/branch/master/TSW/tsw.pdf
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From gah4@gah4@u.washington.edu to comp.compilers on Thu Jan 5 16:22:07 2023
    From Newsgroup: comp.compilers

    On Thursday, January 5, 2023 at 10:13:08 AM UTC-8, Spiros Bousbouras wrote:
    > On 5 Jan 2023 10:05:49 +0000
    > "Lucian Popescu" <luc...@ctrl-c.club> wrote:
    >
    >> I'm currently working on an academic project that analyzes the speedup gain of
    >> Undefined Behavior Optimizations in C.
    (snip)

    >> To test the theory that the UB Optimizations introduce more risks than speedup gains,

    > Isn't this comparing apples and oranges ?

    Probably.

    You can quantify speed-up, but it is harder to quantify risk.

    You might be able to quantify debug time, and how much longer
    it takes to debug a program with such behavior.

    Most important when debugging, is that you can trust the compiler to
    do what you said. That they don't, has always been part of
    optimization, but these UB make it worse.

    >> I take OpenBSD (for its focus on security and robustness) and compile it on one hand with UB Optimizations turned on and with UB Optimizations turned off.
    (snip)

    > I think it would make more sense to test with chess engines : prepare
    > a collection of chess positions and see how many nodes per second an engine can calculate for a given position depending on how the engine was compiled.

    Sounds good to me. Big enough, but not too big. And something that someone always wants to be faster.

    > Doing analogous experiments with graphics code or numerical analysis code would also be interesting. But an operating system , I don't know , it seems too general.

    My thought would be all of SPEC, if you can avoid the funny SPEC rules on
    using the numbers.

    Numerical analysis code sounds interesting, as I don't know how much
    the problem affects floating point code.

    I haven't thought about this so recently. Do compilers give warnings
    when they remove such UB code?
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.compilers on Fri Jan 6 07:47:55 2023
    From Newsgroup: comp.compilers

    "Lucian Popescu" <lucic71@ctrl-c.club> writes:
    > I'm currently working on an academic project that analyzes the speedup gain of
    > Undefined Behavior Optimizations in C.

    Some earlier work:

    @InProceedings{wang+12,
    author = {Xi Wang and Haogang Chen and Alvin Cheung and Zhihao Jia and Nickolai Zeldovich and M. Frans Kaashoek},
    title = {Undefined Behavior: What Happened to My Code?},
    booktitle = {Asia-Pacific Workshop on Systems (APSYS'12)},
    OPTpages = {},
    year = {2012},
    url1 = {http://homes.cs.washington.edu/~akcheung/getFile.php?file=apsys12.pdf},
    url2 = {http://people.csail.mit.edu/nickolai/papers/wang-undef-2012-08-21.pdf},
    OPTannote = {}
    }

    The most interesting part wrt. the topic is section 3.3: They compiled
    SPECint 2006 with gcc and clang, either with the default options, or
    with the behaviour defined by the options -fno-strict-overflow (or
    -fwrapv) -fno-delete-null-pointer-checks -fno-strict-aliasing, which
    disables most of what you call "UB Optimizations" in the compiler
    versions they used (and probably still disables most "UB
    Optimizations"; there have been new -fno-* flags added, corresponding
    to some of the new "UB Optimizations", but I expect that the early "UB
    Optimizations" have the most impact (you would not start with
    low-impact "optimizations")).
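    To make one of these flags concrete, here is a minimal sketch (hypothetical names, not from the paper) of the pattern that -fno-delete-null-pointer-checks preserves; it mirrors the widely discussed Linux tun driver bug:

    #include <stddef.h>

    struct s { int x; };

    int read_field(struct s *p) {
        int v = p->x;      /* dereference first: UB if p is null, so the compiler     */
        if (p == NULL)     /* may infer p != NULL and delete this check as dead code; */
            return -1;     /* -fno-delete-null-pointer-checks keeps it                */
        return v;
    }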

    The results were that in 10 of 12 programs, there was no significant performance difference. In 2 of the 12 programs, there were speed
    differences by factors 1.072, 1.09, 1.063 and 1.118 (depending on
    program and compiler used).

    However, they also found that these speed differences are due to a
    single place in the source code of each of the two programs, and that
    a small change to the source code of each program could eliminate the
    speed difference.

    @InProceedings{ertl15kps,
    author = {M. Anton Ertl},
    title = {What every compiler writer should know about programmers},
    crossref = {kps15},
    pages = {112--133},
    url = {http://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_2015_submission_29.pdf},
    abstract = {In recent years C compiler writers have taken the
    attitude that tested and working production (i.e.,
    \emph{conforming} according to the C standard) C
    programs are buggy if they contain undefined
    behavior, and they feel free to compile these
    programs (except benchmarks) in a way that they no
    longer work. The justification for this attitude is
    that it allows C compilers to optimize better. But
    while these ``optimizations'' provide a factor 1.017
    speedup with Clang-3.1 on SPECint 2006, for
    non-benchmarks it also has a cost: if we go by the
    wishes of the compiler maintainers, we have to
    ``fix'' our working, conforming C programs; this
    requires a substantial effort, and can result in
    bigger and slower code. The effort could otherwise
    be invested in source-level optimizations by
    programmers, which can have a much bigger effect
    (e.g., a factor $>2.7$ for Jon Bentley's traveling
    salesman program). Therefore, optimizations that
    assume that undefined behavior does not exist are a
    bad idea not just for security, but also for the
    performance of non-benchmark programs.}
    }

    @Proceedings{kps15,
    title = {18. Kolloquium Programmiersprachen und Grundlagen der Programmierung (KPS'15)},
    booktitle = {18. Kolloquium Programmiersprachen und Grundlagen der Programmierung (KPS'15)},
    year = {2015},
    key = {KPS '15},
    editor = {Jens Knoop and M. Anton Ertl},
    url = {http://www.complang.tuwien.ac.at/kps2015/proceedings/},
    url-pdf = {http://www.complang.tuwien.ac.at/kps2015/proceedings/kps15.pdf}
    }

    In section 4.2 I use the same way to determine the effectiveness of
    "UB Optimizations" (called "optimizations" (with quotes) in the
    paper), but with a longer list of flags for gcc-5.2.0:

    -fno-aggressive-loop-optimizations -fno-unsafe-loop-optimizations -fno-delete-null-pointer-checks -fno-strict-aliasing
    -fno-strict-overflow -fno-isolate-erroneous-paths-dereference
    -fwrapv

    I used 8 different versions of Jon Bentley's Traveling Salesman
    program translated to C (from Pascal). I found that for clang-3.5 the
    flags made no difference (same binary) for any of the programs, while
    for gcc-5.2.0 they made no difference in the binary for three of the
    programs and no significant speed difference for two more programs.
    For the remaining programs, the "speedups" from "UB Optimizations"
    were by a factor of 1.04, 0.95, and 1.02.

    It contrasts this with the speedup seen from source-level (manual) optimizations: The 8 programs were successive steps in manual
    optimization. The speedup seen here was a factor 2.7 when using
    gcc-5.2.0 -O3 and more for other compilers.

    > I argue that UB Optimizations introduce
    > more risks than actual speedup gains.

    It's hard to quantify the risks.

    My argument has been that it is cheaper and much more effective to
    perform manual optimization than to ensure that a C program contains
    no undefined behaviour.

    > Don't get me wrong, I'm not saying that the compiler should magically
    > understand situations where the programmer makes UB mistakes and then
    > repair them. What I'm saying is that the compiler should compile the code
    > "as is" without making smart optimization transformations that break the
    > intention of the programmer.

    To which the advocates of "UB Optimizations" react by claiming that
    they don't know the intention. So I have written a followup paper
    that discusses this aspect in more depth:

    @InProceedings{ertl17kps,
    author = {M. Anton Ertl},
    title = {The Intended Meaning of \emph{Undefined Behaviour}
    in {C} Programs},
    crossref = {kps17},
    pages = {20--28},
    url = {http://www.complang.tuwien.ac.at/papers/ertl17kps.pdf},
    url2 = {http://www.kps2017.uni-jena.de/proceedings/kps2017_submission_5.pdf},
    abstract = {All significant C programs contain undefined
    behaviour. There are conflicting positions about how
    to deal with that: One position is that all these
    programs are broken and may be compiled to arbitrary
    code. Another position is that tested and working
    programs should continue to work as intended by the
    programmer with future versions of the same C
    compiler. In that context the advocates of the first
    position often claim that they do not know the
    intended meaning of a program with undefined
    behaviour. This paper explores this topic in greater
    depth. The goal is to preserve the behaviour of
    existing, tested programs. It is achieved by letting
    the compiler define a consistent mapping of C
    operations to machine code; and the compiler then
    has to stick to this behaviour during optimizations
    and in future releases.}
    }

    @Proceedings{kps17,
    title = {19. Kolloquium Programmiersprachen und Grundlagen der Programmierung (KPS'17)},
    booktitle = {19. Kolloquium Programmiersprachen und Grundlagen der Programmierung (KPS'17)},
    year = {2017},
    key = {KPS '17},
    editor = {Wolfram Amme and Thomas Heinze},
    url = {http://www.kps2017.uni-jena.de/kps2017_proceedings.html},
    url-pdf = {http://www.kps2017.uni-jena.de/proceedings/kps2017.pdf}
    }

    > My current progress is here [2]. I did not start the technical work, ATM I only
    > have the research proposal. I reached you to see if you have any feedback
    > on this proposal.
    [[2] https://tildegit.org/lucic71/dissertation/src/branch/master/TSW/tsw.pdf]

    I have read Section III and IV of this paper.

    Section III: You cite my 2015 paper above. About SPECInt 2006, I
    cited only Wang et al.'s paper (and compute the overall SPECInt
    difference based on their numbers). For the traveling salesman
    program, the speedup factor of 2.7 (or more, depending on compiler and optimization level) is for source-level (i.e., manual) optimization,
    not "UB Optimization". The numbers for "UB Optimization" are, as
    mentioned above, usually a factor of 1, and in the three cases where
    there is a difference, it's a factor 1.04, 0.95, and 1.02, as
    mentioned above.

    Section IV: You have the beginnings of a plan to evaluate the
    performance; you leave the question of benchmarks open until you know
    what changes in the resulting code are due to "UB Optimization". It's
    not clear what the point of such a benchmark selection would be. It
    certainly would not make the selection more representative, and would
    have the smell of cherry-picking. Also, "UB Optimization" may cause
    changes in thousands of places in the code of OpenBSD, but only a few of
    those places will be in hot code and be relevant for performance.

    What I miss is a plan to quantify the risks. Wang et al. made a
    pretty good case (as far as I am concerned, but the "UB Optimization"
    faithful remain unconvinced) in one of their papers when they analysed
    all the Debian packages.

    One question is: Given Wang et al.'s earlier work, what is the
    original contribution of your intended research?

    I discuss some alternative research topics in Section 5.4 of my 2015
    paper.

    - anton
    --
    M. Anton Ertl
    anton@mips.complang.tuwien.ac.at
    http://www.complang.tuwien.ac.at/anton/
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.compilers on Fri Jan 6 08:41:49 2023
    From Newsgroup: comp.compilers

    gah4 <gah4@u.washington.edu> writes:
    > On Thursday, January 5, 2023 at 10:13:08 AM UTC-8, Spiros Bousbouras wrote:
    >> On 5 Jan 2023 10:05:49 +0000
    >> "Lucian Popescu" <luc...@ctrl-c.club> wrote:
    >
    >>> I'm currently working on an academic project that analyzes the speedup gain of
    >>> Undefined Behavior Optimizations in C.
    > (snip)
    >
    >>> To test the theory that the UB Optimizations introduce more risks than
    >>> speedup gains,
    >
    >> Isn't this comparing apples and oranges ?
    >
    > Probably.

    You can compare apples and oranges: How many can you buy for, say, EUR
    5? What's their nutritional value? etc.

    In a similar way, we can compare the advantages and costs of "UB
    Optimization".

    The advantages claimed by the faithful are speedups (usually
    unquantified, to let the audience imagine high speedups). Wang et
    al. have done some quantification work, as have I; the results were
    that the speedups were small, or sometimes even slowdowns.

    Among the costs is that programmers have to change their programs to
    eliminate standard-undefined behaviour.

    First you have to find that behaviour. The "UB Optimization"
    compilers have added "sanitizers" to let you find it; however,
    sanitizers work at run-time, so they won't find cases where, e.g., the
    "UB Optimization" "optimized" away a bounds-check preventing a buffer
    overflow unless there is a test case that tries to perform this buffer overflow.
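    A minimal sketch of that scenario (hypothetical function, not from any real code base): the guard itself relies on UB, so the compiler may fold it away, and a sanitizer only reports the problem if some test actually passes an overflowing length:

    int bounded_copy(char *dst, const char *src, int len) {
        if (len < 0 || len + 16 < len)   /* intended overflow guard; len + 16 overflowing is UB, */
            return -1;                   /* so the compiler may fold the second test to "false"  */
        for (int i = 0; i < len; i++)
            dst[i] = src[i];
        return 0;
    }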

    And once you have found the behaviour, you have to change it. This
    not only costs developers time, it can also cost performance. E.g.,
    in Section 4.3 of <https://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_2015_submission_29.pdf>,
    I estimate that converting Gforth to C without undefined behaviour
    would cause a slowdown by a factor >3; Gforth is currently compiled
    with as few "UB Optimizations" as possible (by using flags like
    -fwrapv and -fno-... flags). I expect that removing these flags will
    give a speedup by approximately a factor of 1. For those who believe differently, I have repeatedly posted a challenge <2017Aug18.152854@mips.complang.tuwien.ac.at>

    |Write a proof-of-concept Forth interpreter in the language you
    |advocate that runs at least one of bubble-sort, matrix-mult or sieve
    |from bench/forth in
    |<http://www.complang.tuwien.ac.at/forth/bench.zip>

    We can then check whether the language you advocate can compete with a low-level language by benchmarking your interpreter against Gforth.


    We can compare the advantages to the costs of "UB Optimization" in the
    currency of development time:

    How much development time does it cost to achieve as much speedup as
    the UB optimizations give?

    How much development time does it cost to add test cases for
    sanitizing, to run all the different sanitizers, to remove the
    undefined behaviour that you find, and to add manual optimizations
    that compensate any performance regressions that may turn up in that
    process?

    My expectation is that the latter development cost would often be far
    greater (and, in the case of the Forth interpreter, I expect that the performance regression compensation is not possible), but it would be
    nice if we had some empirical results for that.

    > Most important when debugging, is that you can trust the compiler to
    > do what you said. That they don't, has always been part of
    > optimization

    If a compiler produces something that behaves differently, it's not an optimization. One usually considers the I/O (but not timing)
    behaviour of the program when evaluating behaviour. When using a
    single-step debugger, you see execution at a more fine-grained level
    than the optimization was designed for. One approach is to debug the unoptimized program, and enable optimization afterwards. If it's a
    proper optimization, the I/O behaviour will be unchanged.

    > but these UB make it worse.

    "UB optimizations" assume that programs don't perform undefined
    behaviour, and they are transformations that are proper optimizations
    for such programs. The problem is that this assumption is usually
    wrong.

    > My thought would be all of SPEC

    SPEC CPU programs are not representative in many respects:

    First, they are chosen to be standard-compliant; it's not clear how
    SPEC tries to ensure this, but as the 464.h264ref function SATD shows,
    the process is not perfect. Still, a program that uses one of the
    flags for disabling "UB optimization" is going to be rejected by SPEC,
    so SPEC programs cannot be used for evaluating all the costs of "UB optimizations".

    Moreover, the "UB optimization" compilers are tuned for SPEC CPU: They
    compile the undefined behaviour in SPEC CPU programs (e.g., SATD) as
    intended, while they miscompile a non-SPEC CPU program with the same
    undefined behaviour (see gcc bug 66875).

    And given that the compiler maintainers (and others) evaluate the
    performance of the compilers using SPEC CPU, the compiler maintainers
    are motivated to add optimizations that speed up a SPEC CPU program;
    e.g., if an "UB optimization idea" breaks a test case on one of the
    tested programs (for gcc those reportedly nowadays include all
    automatically testable Debian programs, which is good), but benefits a
    SPEC CPU program, the compiler maintainer will be motivated to refine
    the criteria for applying the optimizations until the breakage
    disappears while still applying for the SPEC CPU program; if the "UB optimization" benefits only programs outside the relevant benchmarks,
    the compiler maintainer will be less likely to spend as much time on
    it. As a result, I expect to see more benefits from "UB
    Optimizations" for SPEC CPU than for, say, code that was written after
    the compiler was released (and was not tuned for the compiler).

    > I haven't thought about this so recently. Do compilers give warnings
    > when they remove such UB code?

    No. The faithful claim that it is impossible, but I don't think so:

    The compiler could internally compile the program twice: Once with the
    flags as given, and once with all the flags that the compiler has for
    defining undefined behaviour (-fwrapv and the various
    -fno-... options). If there is a difference in the resulting code,
    produce a warning. That warning could be for something that is a
    proper optimization for this particular program (i.e., a false
    positive), or it could be for miscompilation (a true positive); that's
    up to the developer to sort out.

    Once we have that framework in place, we can look at the false
    positives and refine it. E.g., there is a claim that many false
    positives come from macro evaluations that are possible at compile
    time. In that case, one could add something to the source code that
    suppresses the warning (just like there is "__attribute__((unused))"
    for suppressing warnings about unused variables).

    - anton
    --
    M. Anton Ertl
    anton@mips.complang.tuwien.ac.at
    http://www.complang.tuwien.ac.at/anton/
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.compilers on Fri Jan 6 16:12:25 2023
    From Newsgroup: comp.compilers

    On 06/01/2023 01:22, gah4 wrote:
    > On Thursday, January 5, 2023 at 10:13:08 AM UTC-8, Spiros Bousbouras wrote:
    >> On 5 Jan 2023 10:05:49 +0000
    >> "Lucian Popescu" <luc...@ctrl-c.club> wrote:
    >
    >>> I'm currently working on an academic project that analyzes the speedup gain of
    >>> Undefined Behavior Optimizations in C.
    > (snip)
    >
    >>> To test the theory that the UB Optimizations introduce more risks than
    >>> speedup gains,
    >
    >> Isn't this comparing apples and oranges ?
    >
    > Probably.
    >
    > You can quantify speed-up, but it is harder to quantify risk.
    >
    > You might be able to quantify debug time, and how much longer
    > it takes to debug a program with such behavior.
    >
    > Most important when debugging, is that you can trust the compiler to
    > do what you said. That they don't, has always been part of
    > optimization, but these UB make it worse.

    The trouble with undefined behaviour is that, in general, you cannot
    trust the compiler to "do what you say" because it cannot know what you
    have said.

    A computer language like C is defined by its standard. This says what particular combinations of characters in the source code actually mean.
    If what you write does not fit the specified and documented patterns
    (or the pattern is explicitly labelled "undefined behaviour"), then it
    does not mean anything at all.

    So if you write "give me some prime numbers" as your C code, it means
    nothing and the compiler can't help you. If you write "int * p = 0; int
    x = *p;", it means nothing and the compiler can't help you.

    (Well, the compiler might be able to give helpful error messages!)

    When you write "x = *p;", you are saying to the compiler "It is a fact
    that p is a valid pointer to data of a type compatible with *p, all the
    constraints required for the assignment operation are met, there is no
    partial overlap between x and *p, there are no data races, and no range
    errors converting any floating point values. Given that, act as though
    the value of x is now equal to *p after any required conversions."

    You might /think/ you are saying "read the value at address p, and store
    it in the memory reserved for x".


    When you write "x = (y * 30) / 15;" (for "int" x and y), you might
    /think/ you are asking the compiler to multiply by 30 and then divide by
    15. But you are actually telling it that there would be no overflow if
    it were to multiply y by 30, and thus it can use simple mathematical
    equalities to reduce the expression to "x = y * 2;".
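    A small illustration of that point (mine, not from the post): with the default semantics a compiler may emit the same code for both functions below, while with gcc's -fwrapv it must perform the multiply and divide, and the two can then differ for large y:

    int scaled(int y)  { return (y * 30) / 15; }   /* may be folded to y * 2 when signed overflow is UB */
    int doubled(int y) { return y * 2; }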


    You can /always/ trust the compiler to do what you said, barring
    occasional bugs in the compiler. What you cannot always do is trust the programmer to know what he or she /actually/ said, or to write what he
    or she meant.


    And outside of flags that change the language semantics, such as gcc's
    -fwrapv or -fno-strict-aliasing, enabling or disabling optimisations
    does not change the meaning of the code. It might affect how code is
    generated (and therefore its speed and size), but if the behaviour is
    different then it is because your code did not say what you thought it said.


    And yes, I know debugging optimised code can be difficult. Sometimes
    that means adding extra "volatile" qualifiers, or "asm volatile("" ::: "memory");" fences, in order to have good breakpoint spots or debugging information.

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From gah4@gah4@u.washington.edu to comp.compilers on Fri Jan 6 10:33:29 2023
    From Newsgroup: comp.compilers

    On Friday, January 6, 2023 at 9:03:30 AM UTC-8, David Brown wrote:

    (snip, I wrote)

    >> Most important when debugging, is that you can trust the compiler to
    >> do what you said. That they don't, has always been part of
    >> optimization, but these UB make it worse.
    >
    > The trouble with undefined behaviour is that, in general, you cannot
    > trust the compiler to "do what you say" because it cannot know what you
    > have said.

    Well, yes. But sometimes ...

    This is reminding me, last year I was working on a program that I worked
    on (but didn't write) in 1977. It was written, as close as it could be, in
    standard Fortran 66. (Except that it makes some assumptions about
    the use of values read in, and written out, with A format. Like that you
    can compare such values.)

    In any case, I wanted to compile with bounds check on because it wasn't
    working right. (The version I had was not the exact one from 1977.)

    It turns out, though, to have a fair number of statements like:

    IF(I.GT.0.AND.X(I).EQ.3) ...

    That is, test the subscript and use it in the same IF statement.

    Now, this would not be a problem at all in C, with short-circuit IF
    evaluation required, but even now Fortran doesn't do short
    circuit IF evaluation. The way Fortran did static allocation in 1977,
    there was pretty much no chance that you would access outside
    your address space evaluating X(0). There is a chance, though,
    that the value 3 is stored there.
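    For comparison, a C rendering of that test (hypothetical names, indexing details aside) is safe by definition, because && is required to short-circuit:

    /* x[i] is only read when i > 0 holds; C guarantees the right operand
       of && is not evaluated otherwise. */
    int has_three(const int *x, int i) {
        return i > 0 && x[i] == 3;
    }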

    Pretty much what I say in this case, is that I rely on the system
    memory protection, and don't need the compiler to ignore the
    test on I, which I suspect is one of the UB that C compilers do.

    Anyway, in this case I know exactly what I said, and the compiler
    does, too. And as long as subscript checks are off, it most
    likely works.
    [Both Fortran 66 and 77 could do short-circuit evaluation, but it's
    not required. I think most compilers did, since it was easy, but
    of course you couldn't count on it. -John]
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From gah4@gah4@u.washington.edu to comp.compilers on Fri Jan 6 11:39:42 2023
    From Newsgroup: comp.compilers

    On Friday, January 6, 2023 at 10:49:59 AM UTC-8, gah4 wrote:

    (snip)

    > Now, this would not be a problem at all in C, with short-circuit IF evaluation required, but even now Fortran doesn't do short
    > circuit IF evaluation. The way Fortran did static allocation in 1977,
    > there was pretty much no chance that you would access outside
    > your address space evaluating X(0). There is a chance, though,
    > that the value 3 is stored there.

    (snip)

    > [Both Fortran 66 and 77 could do short-circuit evaluation, but it's
    > not required. I think most compilers did, since it was easy, but
    > of course you couldn't count on it. -John]

    Sorry, yes, you can't count on it.

    But if you can't count on it, then you can't use it in standard programs.

    But now that you mention it, I was compiling this last year with
    gfortran, and ran into those, so it wasn't doing it. At least it wasn't
    doing all of them. I would run the program with bounds check on,
    and see where it failed.

    Reminds me, though, of another one from years ago.

    DO 11 I=1,10
    DO 12 J=1,10
    IF (B(I).LT.0.) GO TO 11
    12 C(J)=SQRT(B(I))
    11 CONTINUE

    comes from an IBM manual explaining the Fortran H optimizations.
    It seems that the optimizer can move the SQRT out of the inner loop,
    but not the test on B(I). This is actually documented, so isn't UB
    anymore. (Well, not IBM UB.)

    That is in an IBM manual from 1973.
    [It's on page 65 of this 1967 edition. I guess they did data flow but not control flow.
    http://bitsavers.org/pdf/ibm/360/fortran/C28-6602-3_OS_FORTRAN_IV_H_Programmers_Guide_1967.pdf
    -John]
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Kaz Kylheku@864-117-4973@kylheku.com to comp.compilers on Mon Jan 9 09:10:24 2023
    From Newsgroup: comp.compilers

    On 2023-01-05, Lucian Popescu <lucic71@ctrl-c.club> wrote:
    > Hi,
    >
    > I'm currently working on an academic project that analyzes the speedup gain of
    > Undefined Behavior Optimizations in C. I argue that UB Optimizations introduce
    > more risks than actual speedup gains. This happens because most of the time the compiler does not have the context to understand the intention of the programmer in code that contains UB. For example this fast inverse square root
    > function [1] triggers UB at this line: i = * ( long * ) &y;. A "smart" compiler could delete this line as an optimization because it contains UB.

    "(Lack of) UB optimizations" are based on these ideas:

    1. The programmer is infallible, and therefore
    2. Any interpretation of the code's meaning which entails undefined
    behavior must be incorrect and unintended; and also
    3. If some construct has no interpretation that is not UB, the
    programmer must be saying that that code is unreachable.

    They are pretty terrible engineering.

    Assuming a lack of undefined behavior in C is simply foolish; it's
    extremely unlikely to be correct for any nontrivial program. A
    compiler processing just tens of thousands of lines of C is likely to
    be incorrect a few times in making a no-UB assumption.

    In regard to 3, there are actually some extensions being proposed to C
    whereby the programmer can insert an expression which means "this code
    is not reached", and the only semantics of that expression is that it
    has undefined behavior. Since the programmer would never intend
    undefined behavior, by (3) above it implies that the construct is not
    reached. The compiler can replace it by an infinite loop, abortive
    exit, reformatting of the disk, or whatever. Or just generate no
    instructions at all.

    I simply cannot get behind the idea of a deliberately introduced
    language construct that has nothing but undefined behavior. I
    understand that it's efficient: the compiler can reason about that not
    being reachable, and not expend any instructions for it that would add
    to the code size. I would rather spend the code size to have it
    stop. Just stick a breakpoint trap instruction in there or whatever.

    Don't get me wrong, I'm not saying that the compiler should magically understand situations where the programmer makes UB mistakes and then
    repair them. What I'm saying is that the compiler should compile the code
    "as is" without making smart optimization transformations that break the intention of the programmer.

    I echo that sentiment. I don't require optimizations of the form
    "assuming expression X is always well-defined, thus can translate some
    other expression Y as follows".

    In C, the programmer is supposed to do a lot of the optimizing.

    In spite of these UB-related shenanigans, new developments in compilers
    like GCC are not improving this basic fact.

    You still have to know the tight coding techniques from 1990 in order
    to get good code out of the latest GCC.

    In the 1990's, I compared code for deleting from a linked list:

    node->next->prev = node->prev;
    node->prev->next = node->next;

    versus:

    node_t *next = node->next;
    node_t *prev = node->prev;

    next->prev = prev;
    prev->next = next;

    I get the same results today as I did in 1990-something: a shorter
    instruction sequence for the code in which I cached the pointers in
    local variables! This is because the compiler suspects that
    the first assignment "node->next->prev = node->prev"
    might or might not have affected the value of "node->prev";
    so that has to be loaded again; it cannot be subject to CSE. Strict
    aliasing reasoning doesn't help because everything has the same node_t *
    type.

    (But, note! If the assignment node->next->prev = node->prev
    affects node->prev, that could only be because node->next == node.
    And in that case, the assignment is a no-op: the cached node->prev
    value could be used to avoid accessing that location again! GCC is not
    yet smart enough to figure this out: and nobody needs it to be.)

    A C compiler just needs to obey the memory accesses the programmer
    expressed, figure out which local variables to keep in registers,
    and do some basic things with loops, inlining and whatnot; and
    have good basic optimizations like control flow stuff (jump threading
    and whatnot), peephole optimizations, CSE, ...

    Approaches like "oh this loop must be infinite because it nothing but increments the dummy variable and tests for it going negative" just have
    no place in engineering; that's just someone dicking around to get their
    name in the git history of the compiler, and stuff their resume.
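    A sketch of the kind of transformation being criticised (illustrative only): because signed overflow is UB, the compiler may assume i never wraps to a negative value, conclude the test always holds, and drop it or treat the loop as infinite:

    void spin(void) {
        for (int i = 0; i >= 0; i++)   /* the only exit is via signed overflow, which is UB */
            ;
    }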

    If you want an optimal instruction sequence for something, there is
    always assembly language.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Kaz Kylheku@864-117-4973@kylheku.com to comp.compilers on Mon Jan 9 10:14:02 2023
    From Newsgroup: comp.compilers

    On 2023-01-06, David Brown <david.brown@hesbynett.no> wrote:
    > On 06/01/2023 01:22, gah4 wrote:
    >> On Thursday, January 5, 2023 at 10:13:08 AM UTC-8, Spiros Bousbouras wrote:
    >>> On 5 Jan 2023 10:05:49 +0000
    >>> "Lucian Popescu" <luc...@ctrl-c.club> wrote:
    >
    >>>> I'm currently working on an academic project that analyzes the speedup gain of
    >>>> Undefined Behavior Optimizations in C.
    >> (snip)
    >
    >>>> To test the theory that the UB Optimizations introduce more risks than
    >>>> speedup gains,
    >
    >>> Isn't this comparing apples and oranges ?
    >
    >> Probably.
    >
    >> You can quantify speed-up, but it is harder to quantify risk.
    >
    >> You might be able to quantify debug time, and how much longer
    >> it takes to debug a program with such behavior.
    >
    >> Most important when debugging, is that you can trust the compiler to
    >> do what you said. That they don't, has always been part of
    >> optimization, but these UB make it worse.
    >
    > The trouble with undefined behaviour is that, in general, you cannot
    > trust the compiler to "do what you say" because it cannot know what you
    > have said.

    That's correct; however, what we should be able to trust is for the
    compiler not to make it worse.

    The compiler makes it worse when it assumes that the programmer is
    infallible, and thus makes logical inferences predicated on some
    construct never having undefined behavior, using those to guide the
    translation of other constructs.

    This is particularly harmful when the undefined behavior is *de facto*
    defined: like that if the undefinedness aspect is ignored, and the
    obvious code is emitted, that code will do something characteristic
    of the machine.

    In C99, the original definition of undefined behavior is this:

    behavior, upon use of a nonportable or erroneous program construct or
    of erroneous data, for which this International Standard imposes no
    requirements.

    NOTE: Possible undefined behavior ranges from ignoring the situation
    completely with unpredictable results, to behaving during translation
    or program execution in a documented manner characteristic of the
    environment (with or without the issuance of a diagnostic message), to
    terminating a translation or execution (with the issuance of a
    diagnostic message).

    There is an obvious interpretation of this text which rules out
    the too-clever optimizations.

    If the compiler optimizes construct Y based on some other construct
    X being free of undefined behavior, and that assumption turns
    out to be false, that compiler has not conformed to the "NOTE"
    part of the treatment of undefined behavior.

    Firstly, it has not "ignor[ed] the situation completely". You cannot
    assume that X is well-defined, in order to treat Y in some way, and yet
    say that you're completely ignoring the situation of X. If the UB
    problem in X causes a secondary problem with the way Y was translated,
    that is a problem with the assumption that was made about X. That
    assumption was made because it's possible for X to be undefined, and
    that possibility was explicitly disregarded, which is not the same as
    completely ignored.

    The situation also isn't a documented manner characteristic of
    the environment.

    It also isn't a termination of translation or execution with or without
    a diagnostic message.

    The situation doesn't conform to the NOTE.

    C compiler writers should take the NOTE more seriously.

    Ignoring the situation completely must mean something like:

    "Do not engage in any reasoning about whether the inputs or other
    conditions are right for the correct execution of the construct. Do not
    handle the bad situations at translation time, and don't emit any code
    to handle them. Just translate the specified semantics, and let the
    translated language and its run-time deal with those potential issues."

    As soon as we assume that, oh, if X is well-defined, we can make a
    certain shortcut in translating this other construct Y, we are no longer "completely ignoring"; we are engaging in reasoning about whether the
    inputs or other conditions are right for the correct execution of X, and insisting that they must be, as a justification for treating Y in a
    certain way.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Jon Chesterfield@jonathanchesterfield@gmail.com to comp.compilers on Tue Jan 10 10:46:32 2023
    From Newsgroup: comp.compilers

    > So before we decide if UB optimizations are actually allowed by the
    > standard we need to decide what "ignoring the situation completely
    > with unpredictable results" actually means.
    >
    > [1] https://port70.net/~nsz/c/c89/rationale/
    >
    > Lucian

    WG14 are aware of UB optimising compilers and could have steered away from
    this path, but haven't. It's been decades now. The pointer provenance work seeks to apply aliasing rules even more aggressively. GCC and clang are
    both pursuing faster codegen via exploiting undefined behaviour.

    C, the WG14 ISO defined language, as implemented by the primary open source toolchains, is thus unfit for my purposes. I'm not clear what use that
    language has.

    C, the typed assembler of ye olde times, is a profoundly useful language.
    One just can't use GCC or clang to build it reliably.

    It annoys me intensely that the type aliasing rules capture something a
    whole program optimising compiler can usually work out for itself anyway,
    while preventing me from reading 128bit integers from the same memory I fetch_add 32bit integers into.

    Jon
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.compilers on Tue Jan 10 17:32:28 2023
    From Newsgroup: comp.compilers

    On 09/01/2023 11:14, Kaz Kylheku wrote:
    > On 2023-01-06, David Brown <david.brown@hesbynett.no> wrote:
    >> On 06/01/2023 01:22, gah4 wrote:
    >
    >>> Most important when debugging, is that you can trust the compiler to
    >>> do what you said. That they don't, has always been part of
    >>> optimization, but these UB make it worse.
    >
    >> The trouble with undefined behaviour is that, in general, you cannot
    >> trust the compiler to "do what you say" because it cannot know what you
    >> have said.
    >
    > That's correct; however, what we should be able to trust is for the
    > compiler not to make it worse.


    Let's be clear here. At the theoretical level, when your code executes undefined behaviour at run time, you have said "I don't care what
    happens now". The compiler is allowed to do /anything/. There is no
    such thing as "making it worse" - you have said you will accept
    absolutely /anything/ from the compiler. No restrictions, no
    limitations - /anything/.

    That is what "undefined behaviour" means. There are /no/ definitions of
    the behaviour, and no limits, expectations, "worse" choices, "better"
    choices.

    At a /practical/ level, compilers can try to be more helpful. They can
    define behaviour for things that are not defined in the language specifications. They can give you warnings or error messages (at
    compile time or run time, such as when using sanitizers). They can
    offer optional extra semantics (like gcc's "-fwrapv" flag). They can
    certainly avoid going out of their way to be specially awkward
    - normally they are simply trying to make the code more efficient when
    it is running correctly.

    But you cannot be sure things won't go unexpectedly bad. You can never
    have guarantees about that. Guarantees require definitions, and you
    don't have that with undefined behaviour.

    Consider this example source code:

    void print_message(bool x) {
        if (x) {
            printf("Great!\n");
        } else {
            printf("Boo!\n");
        }
    }

    void format_harddisk(int disk_number) {
        ...
    }


    This is fine code, with no UB in sight.

    The compiler could compile this to the pseudo-code:

    print_message:
        if ($GPR1 == 0) :
            $GPR1 = "Boo!"
            jump puts

        if ($GPR1 == 1) :
            $GPR1 = "Great!"
            jump puts

    format_harddisk:
        ...


    It knows that on entry, the parameter in GPR1 is either 0 or 1, because
    it is type "bool".

    What happens if you call the function from a different translation unit
    like this :

    extern void print_message(int x);
    print_message(2);

    ?

    This is clearly wrong - clearly undefined behaviour. And the result
    would be a formatted disk. But there is nothing wrong with the
    compiler's generated code. I have seen other occasions when compilers
    have generated code with booleans that appear to be both true and false, or
    neither true nor false, as a result of undefined behaviour setting the underlying memory to something other than 0 or 1, simply because that
    was the result of the most efficient code.

    There are all sorts of other ways a compiler can take advantage of
    knowing a boolean is either 0 or 1, and all sorts of things that can go
    wrong if you've messed up your code and the boolean is /not/ 0 or 1. It
    can use it as an index into an array, or to multiply a value
    (translating "x = f ? a : b;" into "x = f * (a - b) + b;" ), or using it
    as a bit mask, or subtracting 1 and using /that/ as a mask. And once
    you have started getting the wrong values, there is no theoretical limit
    to how bad things can get - it is not the compiler "making things
    worse", it is bugs in the code being outside the specifications for what
    is intended for the program.


    Of course the compiler should not be messing around checking that the
    value you pass is actually 0 or 1. That would be inefficient. And what
    would it do if it was a bad value? Treat it as "false" ? What about
    calling "missile_control(bool cancel_launch)" with a bad value? Should
    it treat it as "true" ? What about calling "missile_control(bool confirm_launch)" with a bad value? Jump to an error handler and stop
    the program with a message? How does that work for your flight controller?

    If you want a hand-holding check everything language, there are plenty
    of them to go around. They're great, and far more appropriate than C
    for many purposes. C, on the other hand, trusts the programmer and
    gives you the most efficient object code it can from the source you give
    it, based on the rules of the language (plus any extra features defined
    by the compiler).

    > The compiler makes it worse when it assumes that the programmer is
    > infallible, and thus makes logical inferences predicated on some
    > construct never having undefined behavior, using those to guide the
    > translation of other constructs.

    It is the programmers' /job/ to write correct code. It is up to them to
    use any and all tools available to do that - static error checking,
    run-time sanitizers, code reviews, unit tests, system tests, etc. That
    also includes using the right language for the task in hand. If the
    code is critical, but too complex to be written bug-free in C, then
    maybe a different language would be better - or maybe a different
    programmer.

    I am not suggesting that I always write bug-free code. But the bugs I
    put in my code are /my/ bugs - /my/ responsibility. And I don't blame
    the compiler for the effects of these bugs.


    > This is particularly harmful when the undefined behavior is *de facto*
    > defined: like that if the undefinedness aspect is ignored, and the
    > obvious code is emitted, that code will do something characteristic
    > of the machine.

    It is particularly harmful when programmers think there is such a thing
    as "de facto defined". That's an oxymoron. If the behaviour is
    defined, it is defined. If it is not defined, it is not defined. If it
    is not defined and a programmer makes unwarranted and incorrect
    assumptions about what they think it means, then the programmer needs to
    update his or her understanding of the language. They don't get to
    blame the compiler or the compiler writer for not making the same
    unfounded assumptions that they did.



    > In C99, the original definition of undefined behavior is this:
    >
    > behavior, upon use of a nonportable or erroneous program construct or
    > of erroneous data, for which this International Standard imposes no
    > requirements.
    >
    > NOTE: Possible undefined behavior ranges from ignoring the situation
    > completely with unpredictable results, to behaving during translation
    > or program execution in a documented manner characteristic of the
    > environment (with or without the issuance of a diagnostic message), to
    > terminating a translation or execution (with the issuance of a
    > diagnostic message).
    >
    > There is an obvious interpretation of this text which rules out
    > the too-clever optimizations.


    I presume you mean "a documented manner characteristic of the
    environment". That would be something like gcc's "-fwrapv" flag. It
    takes something that is undefined in the C standards - signed integer
    overflow - and gives it a /documented/ behaviour. The key here is "documented". Now the behaviour is /defined/ - it is UB as far as the C standards are concerned, but /defined/ behaviour as far as the
    implementation is concerned. And you can rely on it, as long as you use
    a compiler with such a flag and documented behaviour.

    The more general and obvious interpretation of the definition is
    "imposes no requirements" - you have no right to have any expectations
    about the results of the undefined behaviour. In particular, while you
    might have some guesses as to what will happen just at the time, you
    have no idea about knock-on effects.

    > If the compiler optimizes construct Y based on some other construct
    > X being free of undefined behavior, and that assumption turns
    > out to be false, that compiler has not conformed to the "NOTE"
    > part of the treatment of undefined behavior.
    >
    > Firstly, it has not "ignor[ed] the situation completely". You cannot
    > assume that X is well-defined, in order to treat Y in some way, and yet
    > say that you're completely ignoring the situation of X. If the UB
    > problem in X causes a secondary problem with the way Y was translated,
    > that is a problem with the assumption that was made about X. That
    > assumption was made because it's possible for X to be undefined, and
    > that possibility was explicitly disregarded, which is not the same as completely ignored.
    >
    > The situation also isn't a documented manner characteristic of
    > the environment.
    >
    > It also isn't a termination of translation or execution with or without
    > a diagnostic message.
    >
    > The situation doesn't conform to the NOTE.
    >
    > C compiler writers should take the NOTE more seriously.

    The "note" is just a note, not part of the language definitions, and it
    gives some example treatments of undefined behaviour, not an exclusive
    list of options. No, compiler writers do /not/ have to take it
    seriously. If the C standards committee had wanted to impose
    restrictions on what can happen during undefined behaviour, they would
    have given them instead of saying "no requirements". (And the committee
    are smart enough to know that imposing restrictions to UB in general is
    not possible.)

    Of course it would be possible to impose restrictions in particular
    cases. It would be fine to say, for example, that signed integer
    overflow gives an unspecified integer value as a result, rather than
    undefined behaviour. Or that attempting to dereference an invalid
    pointer will result in the memory being written or read regardless. But
    that would be giving definitions to behaviour that is currently
    undefined - you cannot impose meaning on undefined behaviour itself.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From gah4@gah4@u.washington.edu to comp.compilers on Tue Jan 10 15:57:07 2023
    From Newsgroup: comp.compilers

    On Tuesday, January 10, 2023 at 2:00:57 PM UTC-8, David Brown wrote:

    (big snip)

    > It is particularly harmful when programmers think there is such a thing
    > as "de facto defined". That's an oxymoron. If the behaviour is
    > defined, it is defined. If it is not defined, it is not defined. If it
    > is not defined and a programmer makes unwarranted and incorrect
    > assumptions about what they think it means, then the programmer needs to update his or her understanding of the language. They don't get to
    > blame the compiler or the compiler writer for not making the same
    > unfounded assumptions that they did.

    A lot of C code assumes two's complement. I believe Unisys still sells
    ones' complement hardware, and so that code might have UB, but often
    people know that it will only run on two's complement machines.

    Much also assumes ASCII code. Don't tell IBM about that.

    I remember comments in some routines in the IBM PL/I (F)
    library, such as:

    THE OPERATION OF THIS MODULE DEPENDS UPON AN INTERNAL
    REPRESENTATION OF THE EXTERNAL CHARACTER SET WHICH IS
    EQUIVALENT TO THE ONE USED AT ASSEMBLY TIME. THE CODING
    HAS BEEN ARRANGED SO THAT REDEFINITION OF ''CHARACTER''
    CONSTANTS, BY REASSEMBLY, WILL RESULT IN A CORRECT
    MODULE FOR THE NEW DEFINITIONS.

    I have never seen that in C programs.

    As I noted in another thread, Java has a rule that the compiler be
    able to determine that a variable is given a value before it is used.
    It is a fatal compilation error if it can't.

    If C compilers had a fatal compilation error when they found
    suspicious UB, I could live with that. Maybe it means doing
    something to convince the compiler, even if it really is still UB.

    Remember, many of the data breaches we hear about
    (and also the ones we don't) are due to buffer overflows
    in C programs. Make it easier, instead of harder, for programmers
    to avoid buffer problems.

    (snip)
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Thomas Koenig@tkoenig@netcologne.de to comp.compilers on Wed Jan 11 09:34:28 2023
    From Newsgroup: comp.compilers

    Jon Chesterfield <jonathanchesterfield@gmail.com> schrieb:
    So before we decide if UB optimizations are actually allowed by the
    standard we need to decide what "ignoring the situation completely
    with unpredictable results" actually means.

    [1] https://port70.net/~nsz/c/c89/rationale/

    Lucian

    WG14 are aware of UB optimising compilers and could have steered away from this path, but haven't. It's been decades now. The pointer provenance work seeks to apply aliasing rules even more aggressively. GCC and clang are
    both pursuing faster codegen via exploiting undefined behaviour.

    C, the WG14 ISO defined language, as implemented by the primary open source toolchains, is thus unfit for my purposes.

    This whole discussion is a bit weird.

    The C standard, like any programming language standard, is a
    specification. Compiler writers promise to keep their end of the
    bargain by translating code to machine language according to that
    specification (modulo bugs). A programmer who uses the language
    according to that specification can be sure that this is translated
    correctly (again, modulo bugs).

    Compilers are required to document implementation-defined behavior,
    and it is also possible to document undefined behavior; a
    documented instance of undefined behavior is an extension.

    For example, a program which uses #include <unistd.h> (where in
    fact another standard, POSIX, is referenced) or __builtin_clz
    (a gcc extension taken up by other compilers) invokes undefined
    behavior according to the C standard. There is nothing wrong with
    that, provided the programmer sticks to compilers where it is
    documented to be supported.

    There is another class of undefined behaviour, that which is the result
    of an erroneous construct. It appears that some people like to violate
    the specifications (both the C standard and the compiler's
    documentation, or other standards) and expect consistent results
    according to their wishes. Or, to put it slightly differently, they
    assume some sort of unwritten super specification that should apply in
    addition to the other specifications.

    There is in fact no such specification, and it is unclear if people
    who think so could agree on what exactly it should and should
    not contain.

    And people who have such an unspecified specification in mind are
    then hurt, indignant and sometimes downright offensive to people who
    point out that compiler writers follow the written and
    well-established specifications of the standards, and that something
    that happened by accident in earlier revisions of a compiler has
    changed.

    I'm not clear what use that language has.

    Either you have very little knowledge of the software currently in use
    (did you ever hear of a piece of software called the "Linux kernel"?)
    or you are engaging in hyperbole. I strongly suspect the latter.

    Whether C is indeed the right sort of language to write large programs
    in is quite another question, one in which a strong argument against C
    can indeed be made.

    C, the typed assembler of ye olde times, is a profoundly useful language.
    One just can't use GCC or clang to build it reliably.

    You mean without a lot of the optimizations of the last (almost)
    fifty years?

    It annoys me intensely that the type aliasing rules capture something a
    whole program optimising compiler can usually work out for itself anyway,

    Link-time optimization, while highly useful, currently has severe
    scaling issues. This is an area where advances would be really
    useful (if implemented in production compilers).

    while preventing me from reading 128bit integers from the same memory I fetch_add 32bit integers into.

    It is not clear why exactly you want to do this.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.compilers on Wed Jan 11 14:20:49 2023
    From Newsgroup: comp.compilers

    On 10/01/2023 11:46, Jon Chesterfield wrote:
    So before we decide if UB optimizations are actually allowed by the
    standard we need to decide what "ignoring the situation completely
    with unpredictable results" actually means.

    [1] https://port70.net/~nsz/c/c89/rationale/

    Lucian

    WG14 are aware of UB optimising compilers and could have steered away from this path, but haven't. It's been decades now. The pointer provenance work seeks to apply aliasing rules even more aggressively. GCC and clang are
    both pursuing faster codegen via exploiting undefined behaviour.

    C, the WG14 ISO defined language, as implemented by the primary open source toolchains, is thus unfit for my purposes. I'm not clear what use that language has.

    It seems to be very popular, so many people find it fit for their
    purposes. (I certainly find it, along with C++, a good fit for my
    low-level small-systems embedded programming, and I am quite happy with
    "UB optimisations" as you call them.) But some people don't like it,
    which is fair enough. And certainly no one thinks either the language
    or the tools are perfect.

    Some people want a language that is mostly like C, except for certain
    features - and accessing objects in memory using different pointer types
    is a common request. This is why both gcc and clang (and a few other compilers) have a flag that gives you this behaviour
    "-fno-strict-aliasing". I always find it ironic that the compilers that
    some people complain "doesn't do what I want" or "doesn't do what old
    compilers did" are precisely the compilers that give you these options.


    C, the typed assembler of ye olde times, is a profoundly useful language.

    It's a myth. It never existed. There has simply been a steady
    improvement in the optimisation of correct code as compilers have got
    more sophisticated. There are compilers that document and define
    behaviour for certain things that are undefined behaviour in the C
    standards, but I have never heard of a compiler that claims to
    understand the programmers' intentions even when they write incorrect
    code.

    C was designed from day one to be a high-level language, not an
    assembler of any sort. Limitations of weaker earlier compilers do
    not mean the language was supposed to work that way.

    I first used a C compiler that optimised on the assumption that UB
    didn't happen some 25 years ago. (In particular, it assumed signed
    integer arithmetic never overflowed.)

    One just can't use GCC or clang to build it reliably.


    You mean newer tools treat your code bugs in different ways from older
    tools? There's a solution for that.

    It annoys me intensely that the type aliasing rules capture something a
    whole program optimising compiler can usually work out for itself anyway, while preventing me from reading 128bit integers from the same memory I fetch_add 32bit integers into.


    It annoys /me/ intensely that people complain about this sort of thing,
    and yet apparently haven't bothered to read the compiler manuals to see
    how to get the effects they want. Compile with "-fno-strict-aliasing",
    or (better, IMHO) add this to your code:

    #pragma GCC optimize ("-fno-strict-aliasing")

    Now, if you want to complain that the gcc documentation is not great, or
    that flags like this should be documented along with the standards flags
    rather than optimisation flags, I'll happily agree. (I don't know if
    clang does better here.) But don't complain that the compiler is a problem.


    And there are other ways to handle this in gcc. Use "may_alias" types.
    Or use volatile accesses. Or use memcpy(). Or use unions.
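
    For instance, the memcpy() and union variants might look like this (a
    sketch added for illustration, not from the original post; the function
    names are made up):

    #include <stdint.h>
    #include <string.h>

    /* memcpy(): well-defined in standard C; good compilers typically
       fold the copy into a single register move. */
    static inline uint32_t float_bits_memcpy(float f)
    {
        uint32_t u;
        memcpy(&u, &f, sizeof u);
        return u;
    }

    /* union: type punning through a union is explicitly documented as
       supported by gcc, and works in practice with clang as well. */
    static inline uint32_t float_bits_union(float f)
    {
        union { float f; uint32_t u; } pun;
        pun.f = f;
        return pun.u;
    }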


    There are two /real/ problems here. One is that C is not, and never has
    been, the language that some people think it is - and thus they get
    frustrated when they find out their code is not as correct as they
    thought. A second is that there are weak compilers out there that on
    the one hand lull developers into a false understanding of the language
    due to their limited code optimisations, and on the other hand make safe alternatives such as "memcpy" highly inefficient.

    What this means is that different compilers, including gcc and clang,
    are perfectly capable of generating code that efficiently mixes accesses
    of different kinds to the same object. But the details of the code you
    write to get the effects are different - C is not as portable here as it
    should be. For code that needs to work well on multiple toolchains, you quickly end up with a header that has conditional compilation and macros
    that vary depending on the compiler in use. That is ugly and awkward,
    but I know of no better way.
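
    As a rough illustration (a hypothetical sketch, with made-up names; it
    assumes __GNUC__ identifies GCC-compatible compilers), such a header
    might look something like:

    /* portable_load.h - hypothetical sketch */
    #ifndef PORTABLE_LOAD_H
    #define PORTABLE_LOAD_H

    #include <stdint.h>
    #include <string.h>

    #if defined(__GNUC__)
    /* GCC/Clang: a "may_alias" type is exempt from strict-aliasing analysis. */
    typedef uint32_t __attribute__((may_alias)) u32_alias;
    #define LOAD_U32(p) (*(const u32_alias *)(p))
    #else
    /* Fallback: memcpy() is well-defined everywhere, though weaker
       compilers may not fold it into a plain load. */
    static inline uint32_t load_u32_(const void *p)
    {
        uint32_t u;
        memcpy(&u, p, sizeof u);
        return u;
    }
    #define LOAD_U32(p) load_u32_(p)
    #endif

    #endif /* PORTABLE_LOAD_H */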
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.compilers on Wed Jan 11 14:40:30 2023
    From Newsgroup: comp.compilers

    On 11/01/2023 00:57, gah4 wrote:
    On Tuesday, January 10, 2023 at 2:00:57 PM UTC-8, David Brown wrote:

    (big snip)

    It is particularly harmful when programmers think there is such a thing
    as "de facto defined". That's an oxymoron. If the behaviour is
    defined, it is defined. If it is not defined, it is not defined. If it
    is not defined and a programmer makes unwarranted and incorrect
    assumptions about what they think it means, then the programmer needs to
    update his or her understanding of the language. They don't get to
    blame the compiler or the compiler writer for not making the same
    unfounded assumptions that they did.

    A lot of C code assumes two's complement. I believe Unisys still sells
    ones' complement hardware, and so that code might have UB, but often
    people know that it will only run on two's complement machines.

    Much also assumes ASCII code. Don't tell IBM about that.

    There is usually no requirement for a given piece of C code to be fully portable to all conforming compilers. Most C code is quite limited in
    the scope of its use, and it is quite reasonable to assume two's
    complement representation (though /not/ two's complement wrapping on
    overflow), 8-bit char, ASCII characters, etc. It may be fine to assume
    32-bit int, and perhaps little-endian ordering. These are /warranted/ assumptions, not unwarranted ones - they are typically reasonable to
    make, and if you want to you can sometimes put in compile-time checks so
    that if someone does try to use the code out of context, they get an
    error message.
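
    A sketch of the sort of check meant here, assuming a C11 compiler for
    _Static_assert (the specific assertions are only illustrative):

    #include <limits.h>

    _Static_assert(CHAR_BIT == 8, "this code assumes 8-bit char");
    _Static_assert(sizeof(int) == 4, "this code assumes 32-bit int");
    _Static_assert((-1 & 3) == 3, "this code assumes two's complement");
    _Static_assert('A' == 65, "this code assumes an ASCII-based charset");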

    And these are all points that the C standards call
    implementation-defined behaviour. These may vary between compilers or targets, but for a given toolchain and target the behaviour is clear and defined. That is quite different from undefined behaviour.


    If C compilers had a fatal compilation error when they found
    suspicious UB, I could live with that. Maybe it means doing
    something to convince the compiler, even if it really is still UB.


    C compilers often do their best here, as long as you enable warnings and optimisation (needed for the code analysis). I'd prefer it if tools
    like gcc had a lot more warnings by default. However, it is rarely
    possible to find run-time UB at compile time. If a function takes a
    pointer parameter and dereferences it, it is only UB if the function is
    called with an invalid pointer. The compiler will assume that UB does
    not occur, and compile the function optimised with that assumption. If
    the compiler were to warn you about the potential UB in the function,
    your builds would be swamped by false positives.

    You can come a long way by using sanitizers when testing code - then
    your code is augmented by extra checks, and the run-time will stop with
    a fatal error when there are problems. I think such tools are often
    under-used by developers (as are static warnings). Of course no testing
    can prove code is correct, but it is certainly a help.

    Remember, much of the data breaches we hear about
    (and also the ones we don't) are due to buffer overflow
    in C programs. Make it easier, instead of harder, for programmers
    to avoid buffer problems.


    Sure. The prime solution here is to stop using C - use a language that
    gives you more help in such cases. That could be a high level language
    such as Python, a managed language like C# or Java, or a smarter
    language like Rust or C++.

    If you have to use C, then write /better/ C and do better testing. Stop passing raw pointers around independently of the size of the data. Stop
    using fixed size "magic numbers" for the size of buffers when the size
    needed might change later. Refactor your data accesses into small
    inline functions, where you can easily add tests and limits during
    debugging.
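
    A minimal sketch of the kind of accessor meant here (names and sizes
    are made up; the assert() checks are compiled out when NDEBUG is
    defined):

    #include <assert.h>
    #include <stddef.h>

    #define BUF_SIZE 64
    static unsigned char buf[BUF_SIZE];

    /* All accesses go through these small functions, so the bounds check
       lives in exactly one place. */
    static inline unsigned char buf_get(size_t i)
    {
        assert(i < BUF_SIZE);
        return buf[i];
    }

    static inline void buf_set(size_t i, unsigned char v)
    {
        assert(i < BUF_SIZE);
        buf[i] = v;
    }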

    Yes, C makes it easy to write incorrect code - but it is quite possible
    to use it for writing correct code. There are other languages that make
    it easier to write correct code and harder to write incorrect code -
    it's too late for C to change.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From gah4@gah4@u.washington.edu to comp.compilers on Wed Jan 11 16:09:32 2023
    From Newsgroup: comp.compilers

    On Wednesday, January 11, 2023 at 3:13:08 PM UTC-8, David Brown wrote:

    (snip)

    Much also assumes ASCII code. Don't tell IBM about that.

    There is usually no requirement for a given piece of C code to be fully portable to all conforming compilers. Most C code is quite limited in
    the scope of its use, and it is quite reasonable to assume two's
    complement representation (though /not/ two's complement wrapping on overflow), 8-bit char, ASCII characters, etc. It may be fine to assume
    32-bit int, and perhaps little-endian ordering. These are /warranted/ assumptions, not unwarranted ones - they are typically reasonable to
    make, and if you want to you can sometimes put compile-time checks so
    that if someone does try to use it out of context, they get an error
    message.

    As I noted with the Fortran optimization example from 50 years ago,
    this is not a new problem. But IBM documented it (so, as you note, it
    is an extension and not UB), and programmers learn to expect it.

    And yes I ignored the distinction between implementation defined
    and undefined. But that is exactly what happens when actual
    people (as opposed to other programs) write programs.

    People learn what works and what doesn't, and when they need to worry
    about endianness or ASCII or two's complement.

    Much of the implementation-defined, or undefined, behavior of early
    Fortran compilers later got incorporated into the standard. And much
    didn't, but enough people believe it did.

    If compiler writers remember that they are writing for use by actual
    people, who are sometimes imperfect, then it should be fine. Try to
    make the changes not too surprising to users.

    I don't remember the exact examples of surprising UB, but they are out
    there.

    In any case, I am all for the original project, of studying compilers,
    UB, and optimization. If it doesn't really help speed up programs, but
    does slow down debugging, maybe it isn't worth doing.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Kaz Kylheku@864-117-4973@kylheku.com to comp.compilers on Thu Jan 12 05:21:16 2023
    From Newsgroup: comp.compilers

    On 2023-01-11, Thomas Koenig <tkoenig@netcologne.de> wrote:
    Jon Chesterfield <jonathanchesterfield@gmail.com> schrieb:
    It annoys me intensely that the type aliasing rules capture something a
    whole program optimising compiler can usually work out for itself anyway,

    Link-time optimization, while highly useful, currently has severe
    scaling issues. This is an area where advances would be really
    useful (if implemented in production compilers).

    Link-time optimization violates ISO C, unfortunately.

    ISO C describes C translation in a number of phases. I'm going to use
    C99 numbering; it might have changed in C11.

    In translation phase 7, syntax and semantic analysis is performed on the
    tokens coming from the previous phases (preprocessing) and the
    translation unit is translated.

    Then in translation phase 8, multiple translation units are combined,
    and references are resolved.

    Link time optimization breaks the layering; it reintroduces semantic
    analysis into translation phase 8, where only linking is supposed to be
    taking place.

    When the compiler/linker peeks into translation unit a.o in order to
    alter something in b.o, you no longer have translation units.

    You have something similar to the effect of doing

    cat a.c b.c > combined.c
    cc combined.c

    except that such a crude approach combines all internal linkage
    identifiers ("static") into a single scope.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    [I don't see the problem, since there's invariably an "as if" rule
    that lets you do whatever you want so long as the results are the same
    as if you'd done everything in sequence. If a linktime optimizer looks
    at the code, promotes a couple of very busy variables that never have
    their addresses taken into global registers, so what? Or if it can
    globally tell that two variables are never aliased so it can reuse a
    register copy of one after modifying the other, again so what? -John]
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.compilers on Thu Jan 12 12:21:40 2023
    From Newsgroup: comp.compilers

    Kaz Kylheku <864-117-4973@kylheku.com> writes:
    [...]
    Link-time optimization violates ISO C, unfortunately.

    ISO C describes C translation in a number of phases. I'm going to use
    C99 numbering; it might have changed in C11.

    (It didn't.)

    In translation phase 7, syntax and semantic analysis is performed on the tokens coming from the previous phases (preprocessing) and the
    translation unit is translated.

    Then in translation phase 8, multiple translation units are combined,
    and references are resolved.

    Link time optimization breaks the layering; it reintroduces semantic
    analysis into translation phase 8, where only linking is supposed to be taking place.

    When the compiler/linker peeks into translation unit a.o in order to
    alter something in b.o, you no longer have translation units.
    [...]
    [I don't see the problem, since there's invariably an "as if" rule
    that lets you do whatever you want so long as the results are the same
    as if you'd done everything in sequence. If a linktime optimizer looks
    at the code, promotes a couple of very busy variables that never have
    their addresses taken into global registers, so what? Or if it can
    globally tell that two variables are never aliased so it can reuse a
    register copy of one after modifying the other, again so what? -John]

    And in fact the C standard makes this explicit in this case, in a
    footnote in the section defining the translation phases:

    Physical source file multibyte characters are mapped, in an
    implementation- defined manner, to the source character set
    (introducing new-line characters for end-of-line indicators) if
    necessary. Trigraph sequences are replaced by corresponding
    single-character internal representations. Implementations shall
    behave as if these separate phases occur, even though many are
    typically folded together in practice. Source files, translation
    units, and translated translation units need not necessarily be
    stored as files, nor need there be any one-to-one correspondence
    between these entities and any external representation. The
    description is conceptual only, and does not specify any particular
    implementation.

    Of course any link-time optimization that breaks the standard-defined
    semantics makes the implementation non-conforming (the same as for any optimization).

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    Working, but not speaking, for XCOM Labs
    void Void(void) { Void(); } /* The recursive call of the void */
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Thomas Koenig@tkoenig@netcologne.de to comp.compilers on Thu Jan 12 21:50:14 2023
    From Newsgroup: comp.compilers

    Kaz Kylheku <864-117-4973@kylheku.com> schrieb:
    On 2023-01-11, Thomas Koenig <tkoenig@netcologne.de> wrote:
    Jon Chesterfield <jonathanchesterfield@gmail.com> schrieb:
    It annoys me intensely that the type aliasing rules capture something a
    whole program optimising compiler can usually work out for itself anyway,
    Link-time optimization, while highly useful, currently has severe
    scaling issues. This is an area where advances would be really
    useful (if implemented in production compilers).

    Link-time optimization violates ISO C, unfortunately.

    ISO C describes C translation in a number of phases. I'm going to use
    C99 numbering; it might have changed in C11.

    In translation phase 7, syntax and semantic analysis is performed on the tokens coming from the previous phases (preprocessing) and the
    translation unit is translated.

    [...]

    The C standard (like other programming language standards) describes
    behavior, not implementation. A compiler which translates "as if" the
    phases were followed works as well. The same applies to program
    execution, where the concept of the abstract machine is used.

    Regarding compilation phases, this is made clear in N2596,

    5.1.1.2 Translation phases
    The precedence among the syntax rules of translation is specified
    by the following phases.6)

    plus the footnote which explains a bit what is meant by

    6) This requires implementations to behave as if these separate
    phases occur, [...]

    Then in translation phase 8, multiple translation units are combined,
    and references are resolved.

    Link time optimization breaks the layering; it reintroduces semantic
    analysis into translation phase 8, where only linking is supposed to be taking place.

    ... which does not have to be the case.

    When the compiler/linker peeks into translation unit a.o in order to
    alter something in b.o, you no longer have translation units.

    ... which you don't need; 5.1.1.2 shows the "as if" rule quite clearly.

    If LTO changes the semantics of a standard-conforming program, that
    is another matter; that is then simply a bug.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From antispam@antispam@math.uni.wroc.pl to comp.compilers on Fri Jan 13 20:46:28 2023
    From Newsgroup: comp.compilers

    Spiros Bousbouras <spibou@gmail.com> wrote:
    On Fri, 6 Jan 2023 10:33:29 -0800 (PST)
    gah4 <gah4@u.washington.edu> wrote:
    It turns out, though, to have a fair number of statements like:

    IF(I.GT.0.AND.X(I).EQ.3) ...

    That is, test the subscript and use it in the same IF statement.

    Now, this would not be a problem at all in C, with short-circuit IF evaluation required, but even now Fortran doesn't do short
    circuit IF evaluation. ...

    The analogous C code would be
    if (I > 0 && X[I] == 3) ...

    No, Fortran arrays start from 1, C arrays start from 0. And Fortran
    is not obliged to do short-circuit evaluation. So the analogous
    C code would be

    char bv1 = (I >= 0);
    char bv2 = (X[I] == 3);  /* evaluated unconditionally, even when bv1 is false */

    if (bv1 && bv2) ...

    In this code the compiler may throw out 'bv1' (assume that it is true),
    because 'X[I]' is evaluated unconditionally. Of course, a normal C
    programmer would write

    if ((I>=0)&&(X[I] == 3))

    which is fine due to short-circuit evaluation. Reversing the order:

    if ((X[I] == 3)&&(I>=0))

    has the same problem as the first version. Both the first and the
    third version may appear as the output of code generators or due to
    macro expansion, so this is not an entirely theoretical problem. OTOH,
    in Fortran and Pascal the classic teaching was "do not do this".
    This was particularly important in Pascal, as the standard required
    full evaluation.


    --
    Waldek Hebisch
    [You can start Fortran arrays at zero if you want
    DIMENSION FOO(0:99)
    But I agree mostly people use the 1 default -John]
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Kaz Kylheku@864-117-4973@kylheku.com to comp.compilers on Sun Jan 15 04:17:31 2023
    From Newsgroup: comp.compilers

    On 2023-01-12, the moderator wrote:
    [I don't see the problem, since there's invariably an "as if" rule
    that lets you do whatever you want so long as the results are the same
    as if you'd done everything in sequence. If a linktime optimizer looks
    at the code, promotes a couple of very busy variables that never have
    their addresses taken into global registers, so what? Or if it can
    globally tell that two variables are never aliased so it can reuse a
    register copy of one after modifying the other, again so what? -John]

    We can look at this as if the translation units have been rolled back in
    time to translation phase 7, where semantic analysis is now taking place
    again in an alternative future. This time, there is a magic oracle
    present that provides information about all the translation units
    which will be linked to produce the program.

    I'm skeptical whether the GCC and Clang LTO systems always behave like a
    magic oracle that supplies iron-clad truths to translation phase 7.

    And even if so, it seems wrong for the translation phase semantic
    analysis to be looking at information from any other part of a program
    than the translation unit at hand. The standard clearly says what is
    subject to semantic analysis, and it's only the material from the
    translation unit:

    7. White-space characters separating tokens are no longer significant.
    Each preprocessing token is converted into a token. The resulting
    tokens are syntactically and semantically analyzed and translated
    as a translation unit.

    "The resulting tokens" are those from this translation unit, and nowhere
    else. Nothing other than those tokens is to be semantically analyzed.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    [Hey, wait, it's one thing to say LTO might be buggy, another to say
    it's fundamentally impossible. Any optimizer might be buggy. -John]
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Spiros Bousbouras@spibou@gmail.com to comp.compilers on Wed Jan 18 13:14:35 2023
    From Newsgroup: comp.compilers

    On Wed, 11 Jan 2023 14:20:49 +0100
    David Brown <david.brown@hesbynett.no> wrote:
    C was designed from day one to be a high-level language, not an
    assembler of any sort. Limitations of weaker earlier compilers does
    not mean the language was supposed to work that way.

    For those who want an abstract or portable assembler, there exists c9x.me/compile/. I've never used it but at least it aims to be that,
    unlike C. I would be curious to know of other analogous projects. I
    guess the "register transfer language" of GCC is somewhat analogous.

    I first used a C compiler that optimised on the assumption that UB
    didn't happen some 25 years ago. (In particular, it assumed signed
    integer arithmetic never overflowed.)

    I have encountered several times the claim that compilers assume that UB does not happen, and I don't understand it. Let's consider two examples:

    x + 1 > x

    in C where x is a signed integer. Compilers will often treat this as
    always true, with the following reasoning:

    - if x does not have the maximum value which fits in its type, then the
    meaning of the C expression is the same as its mathematical meaning,
    so the expression evaluates to true.

    - if x has the maximum value which fits in its type, then x + 1 is not
    defined, so any translation (including treating the whole expression as
    true) is valid.

    There's no assumption that UB (undefined behaviour) will not happen; both possibilities are accounted for.
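
    In compilable form (a sketch; gcc and clang at -O2 typically fold this
    to a constant, though nothing requires them to):

    int always_true(int x)
    {
        /* For x < INT_MAX this is mathematically true; for x == INT_MAX
           the addition overflows, which is UB, so returning 1 is also a
           valid translation of that case. */
        return x + 1 > x;
    }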

    Another example is

    ... *some_pointer_object ...
    [ some_pointer_object does not get modified in this part of the code and
    has not been declared as volatile ]
    if (some_pointer_object == NULL) ...

    If some_pointer_object is not NULL then the test can be omitted; if it is NULL then the earlier dereference is UB, so any translation is valid, including omitting the test.

    Again, there's no assumption that UB will not happen.
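
    And the second example in compilable form (again a sketch; whether the
    test is really removed depends on the compiler and on flags such as
    gcc's -fdelete-null-pointer-checks, discussed below):

    int first_element(int *p)
    {
        int v = *p;         /* if p is NULL, this dereference is UB       */
        if (p == NULL)      /* so this test may be dropped...             */
            return -1;      /* ...and this branch treated as unreachable  */
        return v;
    }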

    So the request that C compilers should stop assuming that UB will not
    happen seems to me completely misguided. I think what is really meant
    is that, in reasoning about what a valid translation is, C compilers (or
    the authors of the compilers) should not employ the notion of UB. But
    then how should UB be translated? Again there exists the assumption
    or claim that there is some intuitively obvious translation and
    compilers should go for that. First, I'm not sure that there exists
    such a common intuition even among humans and second, even if it does,
    how does one go from an intuition to an algorithm C compilers can
    use to do translation? Lots of things are intuitively obvious, but
    creating an algorithm to duplicate the human intuition is a hard
    problem, one which has not been solved in many cases and perhaps even
    one which is unsolvable in some cases.

    I've seen the suggestion that compilers should describe their behaviour in terms of the assembly generated (possibly some kind of abstract assembly) rather than in higher-level terms. I'm not sure if this is possible and, even if it is,
    I would not find it useful. I tend to think of what I want my code to do in higher-level terms and then bring it down to the level of the language with successive refinements. If parts of C were described in assembly terms then
    it would potentially force me to do at least one more refinement step with no benefit.

    A more productive avenue is for people to give definitions, as precise as possible, to the kinds of UB which have caused them problems, and then try to convince compiler writers to implement such extensions if they don't do so already. In this area even compiler documentation could perhaps improve. For example, from the GCC man page:

    -fdelete-null-pointer-checks
    Use global dataflow analysis to identify and eliminate useless
    checks for null pointers. The compiler assumes that
    dereferencing a null pointer would have halted the program. If
    a pointer is checked after it has already been dereferenced, it
    cannot be null.

    In some environments, this assumption is not true, and programs
    can safely dereference null pointers. Use
    -fno-delete-null-pointer-checks to disable this optimization for
    programs which depend on that behavior.

    The above still doesn't tell me what is supposed to happen when a NULL pointer is dereferenced even with the -fno-delete-null-pointer-checks flag. I'm guessing it's impossible to give a general definition. One can in specific systems, but not in general, so perhaps the above description does the best possible.

    Another example

    -fstrict-overflow
    Allow the compiler to assume strict signed overflow rules,
    depending on the language being compiled. For C (and C++) this
    means that overflow when doing arithmetic with signed numbers is
    undefined, which means that the compiler may assume that it will
    not happen.

    This is poor phrasing, in particular the part "which means that the
    compiler may assume that it will not happen" is redundant. There is no
    reason for the compiler to assume anything about which execution paths will happen during runtime to conclude for example that x + 1 > x can be translated as true. The above quote gives an unnecessarily circuitous
    reasoning as to why the expression can be translated as true. I give a more direct reasoning above.

    It annoys /me/ intensely that people complain about this sort of thing,
    and yet apparently haven't bothered to read the compiler manuals to see
    how to get the effects they want. Compile with "-fno-strict-aliasing",
    or (better, IMHO) add this to your code:

    #pragma GCC optimize ("-fno-strict-aliasing")

    Now, if you want to complain that the gcc documentation is not great,

    Yeah, it would be good if there were a more precise specification of what additional guarantees beyond the C standard this gives. When translating other languages into C, this seems important for implementing object allocation and garbage collection, since relying on the native malloc() and related functions is generally not adequate, at least not if your garbage collector is allowed to move objects.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.compilers on Wed Jan 18 21:14:44 2023
    From Newsgroup: comp.compilers

    On 18/01/2023 14:14, Spiros Bousbouras wrote:
    On Wed, 11 Jan 2023 14:20:49 +0100
    David Brown <david.brown@hesbynett.no> wrote:
    C was designed from day one to be a high-level language, not an
    assembler of any sort. Limitations of weaker earlier compilers does
    not mean the language was supposed to work that way.

    For those who want an abstract or portable assembler , there exists c9x.me/compile/ .I've never used it but at least it aims to be that ,
    unlike C. I would be curious to know of other analogous projects. I
    guess the "register transfer language" of GCC is somewhat analogous.

    I haven't looked at that project - but as a general point, I am
    sceptical of any claims about "portable assembler". If there is
    translation and it is not one-to-one (or very close to that), then you
    don't really have "assembler", even if you have a rather low-level
    language. (And gcc's RTL is an internal format - usually there are
    several optimisation passes done at the RTL level.)


    I first used a C compiler that optimised on the assumption that UB
    didn't happen some 25 years ago. (In particular, it assumed signed
    integer arithmetic never overflowed.)

    I have encountered several times the claim that compilers assume that UB does not happen and I don't understand it. Lets consider 2 examples :

    x + 1 > x

    in C where x is a signed integer. Compilers will often treat this as
    always true with the following reasoning :

    - if x does not have the maximum value which fits in its type then the
    meaning of the C expressions is the same as their mathematical meaning
    so the expression evaluates to true.

    - if x has the maximum value which fits in its type then x + 1 is not
    defined so any translation (including treating the whole expression as
    true) is valid.

    There's no assumption that UB (undefined behaviour) will not happen, both possibilities are accounted for.


    I think I see what you are saying, but I don't make a big distinction
    between "assumes UB does not happen", "assumes you don't care about
    results if UB /does/ happen" and "can make any transformations if UB
    happens".

    One thing that you might view as a distinction is that compilers can
    use their knowledge of UB to affect surrounding code.

    So if you have :

    int x, y;

    if (x + 1 > x) y++; // (a)
    if (x == INT_MAX) y = 10; // (b)

    From your example above, we can see that the compiler can transform (a)
    into "y++;" - there is no need for the conditional. But the compiler
    can /also/ transform (b) into ";" - it is allowed to reason that if x
    /were/ equal to INT_MAX, statement (a) would be undefined behaviour
    (even though it was transformed away) and there is no value for x which
    would result in "y = 10" being executed without also executing UB.

    (A quick check on <https://godbolt.org> shows that gcc does the first transformation, but not the second one.)


    Another example is

    ... *some_pointer_object ...
    [ some_pointer_object does not get modified in this part of the code and
    has not been declared as volatile ]
    if (some_pointer_object == NULL) ...

    If some_pointer_object is not NULL then the test can be omitted ; if it is NULL then the earlier dereference is UB so any translation is valid including omitting the test.

    Again, there's no assumpion that UB will not happen.

    I think that is one way to look at it, but really it comes down to the
    same thing.

    One thing that is worth noting in this context is that compilers like
    gcc and clang translate known undefined behaviour into a special marker.
    You can imagine it as translating "*p = ..." into:

    if (!p) undefined_behaviour();
    *p = ...

    And the builtin function __builtin_unreachable() is translated into
    exactly the same internal marker or tree node type. These compilers do
    not distinguish between "undefined behaviour" and "code flow cannot
    get here".


    So the request that C compilers should stop assuming that UB will not
    happen seems to me completely misguided. I think what is really meant
    is that, in reasoning what a valid translation is, C compilers (or
    the authors of the compilers) should not employ the notion of UB. But
    then how should UB be translated ? Again there exists the assumption
    or claim that there is some intuitively obvious translation and
    compilers should go for that. First, I'm not sure that there exists
    such a common intuition even among humans and second, even if it does
    , how does one go from an intuition to an algorithm C compilers can
    use to do translation ? Lots of things are intuitively obvious but
    creating an algorithm to duplicate the human intuition is a hard
    problem, one which has not been solved in many cases and perhaps even
    one which is unsolvable in some cases.


    I agree entirely with your assessment, with the exception that
    "compilers can and do assume UB doesn't happen" is a valid way to view
    things.

    (I'm snipping the rest, because I fully agree - and it is so well
    written that I've nothing to add!)
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From gah4@gah4@u.washington.edu to comp.compilers on Wed Jan 18 21:10:55 2023
    From Newsgroup: comp.compilers

    On Wednesday, January 18, 2023 at 3:55:49 PM UTC-8, David Brown wrote:

    (snip)

    From your example above, we can see that the compiler can transform (a)
    into "y++;" - there is no need for the conditional. But the compiler
    can /also/ transform (b) into ";" - it is allowed to reason that if x
    /were/ equal to INT_MAX, statement (a) would be undefined behaviour
    (even though it was transformed away) and there is no value for x which
    would result in "y = 10" being executed without also executing UB.

    This is reminding me of some quantum mechanics rules described here:

    https://www.sciencenews.org/wp-content/uploads/2010/11/baseball.pdf

    It is interesting reading for those interested in the physics, but also
    for those who aren't. It doesn't take much physics thought.

    It has to do with what quantum mechanics allows when you don't
    actually measure something.

    And, similarly, compilers shouldn't try to make too many assumptions
    about things not measured.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Alexei A. Frounze@alexfrunews@gmail.com to comp.compilers on Thu Jan 19 21:18:52 2023
    From Newsgroup: comp.compilers

    On Wednesday, January 18, 2023 at 8:35:40 AM UTC-8, Spiros Bousbouras wrote: ...
    I have encountered several times the claim that compilers assume that UB does not happen and I don't understand it. Lets consider 2 examples :

    x + 1 > x

    in C where x is a signed integer. Compilers will often treat this as
    always true with the following reasoning :

    - if x does not have the maximum value which fits in its type then the meaning of the C expressions is the same as their mathematical meaning
    so the expression evaluates to true.

    - if x has the maximum value which fits in its type then x + 1 is not
    defined so any translation (including treating the whole expression as
    true) is valid.

    There's no assumption that UB (undefined behaviour) will not happen, both possibilities are accounted for.

    I believe in a case like this a modern C/C++ compiler reasons that x must be less than the maximum representable value and it generates code according
    to this, possibly removing dead code that depends on x being the maximum representable value. If the compiler's assumption that x is less than the maximum is wrong, that's perfectly fine - it's UB, so any "broken" code generated
    is allowed.

    Alex
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From gah4@gah4@u.washington.edu to comp.compilers on Fri Jan 20 10:45:11 2023
    From Newsgroup: comp.compilers

    On Wednesday, January 18, 2023 at 3:55:49 PM UTC-8, David Brown wrote:

    (snip)

    So if you have :

    int x, y;

    if (x + 1 > x) y++; // (a)
    if (x == INT_MAX) y = 10; // (b)

    From your example above, we can see that the compiler can transform (a)
    into "y++;" - there is no need for the conditional. But the compiler
    can /also/ transform (b) into ";" - it is allowed to reason that if x
    /were/ equal to INT_MAX, statement (a) would be undefined behaviour
    (even though it was transformed away) and there is no value for x which
    would result in "y = 10" being executed without also executing UB.

    I am now wondering both how well compilers do this, and how well
    people do this.

    Note that the only case where, on all machines I use, the y++ is
    not executed, is when x==INT_MAX. Now, it would be completely
    different for:

    if(x + 2 > x) y++; (a)

    or

    if(x == INT_MAX) y--;

    For many years, C allowed for sign-magnitude and ones' complement representation. Fixed-point overflow was, then, at least machine
    dependent, as the different representations overflow differently.
    Some machines have the ability to enable an interrupt on fixed-point
    overflow, but at least for Fortran and C, it is normally disabled.

    Now, C has unsigned int which you can use when you need specific
    overflow behavior (except on some Unisys machines). Fortran does not,
    and so people expect, and depend on, two's complement overflow. (The
    Fortran standard allows for any integer radix greater than one, and
    also for different sign representations. But often enough, people know
    it is two's complement binary.)
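
    For example (a sketch for illustration; unsigned arithmetic in C is
    defined to wrap modulo 2^N, while signed overflow is not defined):

    unsigned int wraps(unsigned int u)
    {
        return u + 1u;      /* well-defined: at UINT_MAX this wraps to 0 */
    }

    int overflows(int s)
    {
        return s + 1;       /* undefined behaviour when s == INT_MAX */
    }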

    As C allows for both UB and system-dependent behavior, and it is
    hard for people to remember every case of each one, it is unreasonable,
    to me, to assume fixed-point overflow is UB.

    Dereference of pointers that might point to the wrong place, maybe.

    The reason for the Fortran example above, mentioning short circuit
    IF statements, was that Fortran programs, at least with IBM compilers,
    could reliably fetch from element 0 of an array. With static allocation,
    and data stored after code, there was no chance of element 0 being
    outside the address space. Yes people shouldn't rely on it,
    but sometimes they did.

    Storing outside an array often enough leads to problems, but fetching
    much less often.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Thomas Koenig@tkoenig@netcologne.de to comp.compilers on Fri Jan 20 20:42:25 2023
    From Newsgroup: comp.compilers

    Alexei A. Frounze <alexfrunews@gmail.com> schrieb:

    I believe in a case like this a modern C/C++ compiler reasons that x must be less than the maximum representable value and it generates code according
    to this, possibly removing dead code that depends on x being the maximum representable value. If the compiler's assumption that x is less than the maximum is wrong, it's perfectly fine, it's UB, any "broken" code generated is allowed.

    There are cases when compilers don't even use this knowledge.

    Take the function

    int add (int a, int b)
    {
    return a+b;
    }

    on an instruction set architecture which has only 64-bit
    arithmetic, such as POWER. This is translated by gcc,
    with optimization, to

    add 3,3,4
    extsw 3,3
    blr

    (which is an addition followed by a sign extension). The POWER ABI
    specifies that all values passed in registers are sign-extended,
    so the content of a register has the same value independent of
    the width of the signed integer it is being considered as.

    So, the compiler would be within its right to _not_ extend the sign
    of the result, because it could assume that no overflow occurs.
    This, however, would result in a violation of the ABI, so the
    compiler puts in the extra instruction just in case. If you
    replace int by long in the example above, the sign extension
    instruction is not generated.

    By comparison, MIPS gcc translates this to

    jr $31
    addu $2,$4,$5

    (note use of the delay slot), so no explicit sign extension is done,
    and the value returned in the register might have a different value
    if interpreted as a 64-bit value.

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.compilers on Fri Jan 20 13:54:36 2023
    From Newsgroup: comp.compilers

    gah4 <gah4@u.washington.edu> writes:
    [snip]
    For many years, C allowed for sign-magnitude and ones' complement representation.

    It still does. The upcoming 2023 ISO C standard mandates 2's-complement
    for signed integer types, but it hasn't been published yet.

    [...]

    Now, C has unsigned int which you can use when you need specific
    overflow behavior (except on some Unisys machines).

    If a Unisys C implementation doesn't behave as the standard requires
    with respect to unsigned overflow, then it's a non-conforming
    implementation. (Non-conforming implementations can still be useful.)

    [...]

    As C allows for both UB and system-dependent behavior, and it is
    hard for people to remember every case of each one, it is unreasonable
    to me, to assume fixed point overflow is UB.

    C has:
    - Undefined behavior (the standard imposes no requirements);
    - Unspecified behavior (the standard provides 2 or more possibilities
    and implementations can choose arbitrarily; an example is the order of
    evaluation of function arguments); and
    - Implementation-defined behavior (unspecified behavior where the
    implementation must document its choice).

    Signed integer overflow has undefined behavior -- not because it is or
    isn't reasonable, but because the standard says so.

    Even in C23, the mandate for 2's-complement representation doesn't
    imply a requirement for 2's-complement behavior on overflow.

    [...]

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    Working, but not speaking, for XCOM Labs
    void Void(void) { Void(); } /* The recursive call of the void */
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.compilers on Sat Jan 21 11:54:43 2023
    From Newsgroup: comp.compilers

    Thomas Koenig <tkoenig@netcologne.de> writes:
    Take the function

    int add (int a, int b)
    {
    return a+b;
    }

    on an instruction set architecture which has only 64-bit
    arithmetic, such as POWER. This is translated by gcc,
    with optimization, to

    add 3,3,4
    extsw 3,3
    blr

    (which is an addition followed by a sign extension). The POWER ABi
    specifies that all values passed in registers are sign-extended,
    so the content of a register has the same value independent of
    the width of the signed integer it is being considered as.

    So, the compiler would be within its right to _not_ extend the sign
    of the result, because it could assume that no overflow occurs.
    This, however, would result in a violation of the ABI, so the
    compiler puts in the extra instruction just in case. If you
    replace int by long in the example above, the sign extension
    instruction is not generated.

    By comparision, MIPS gcc translates this to

    jr $31
    addu $2,$4,$5

    (note use of the delay slot), so no explicit sign extension is done,

    What makes you think so? The definition of ADDU in MIPS IV Rev. 3.2
    is pretty perverse, specifying an undefined result if one of the
    operands is not a sign-extended 32-bit value; but if both operands are
    to the instruction's liking, it produces a sign-extended 32-bit
    result.

    A programming note says:

    |[ADDU] is appropriate for arithmetic which is not signed, such as
    |address arithmetic, or integer arithmetic environments that ignore
    |overflow, such as "C" language arithmetic.

    One interesting aspect is that the Power ABI specifies sign-extension
    rather than garbage-extension for passing around ints. Many other
    ABIs are similar (e.g., the RISC-V ABI specifies sign extension, even
    for unsigned ints), and AMD64 specifies zero-extension for both signed
    and unsigned ints (and has instructions that generate zero-extended
    results).

    - anton
    --
    M. Anton Ertl
    anton@mips.complang.tuwien.ac.at
    http://www.complang.tuwien.ac.at/anton/
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Kaz Kylheku@864-117-4973@kylheku.com to comp.compilers on Sun Jan 22 07:04:26 2023
    From Newsgroup: comp.compilers

    On 2023-01-20, Thomas Koenig <tkoenig@netcologne.de> wrote:
    Alexei A. Frounze <alexfrunews@gmail.com> schrieb:

    I believe in a case like this a modern C/C++ compiler reasons that x must be less than the maximum representable value and it generates code according
    to this, possibly removing dead code that depends on x being the maximum
    representable value. If the compiler's assumption that x is less than the
    maximum is wrong, it's perfectly fine, it's UB, any "broken" code generated is allowed.

    There are cases when compilers don't even use this knowledge.

    Take the function

    int add (int a, int b)
    {
    return a+b;
    }

    on an instruction set architecture which has only 64-bit
    arithmetic, such as POWER. This is translated by gcc,
    with optimization, to

    add 3,3,4
    extsw 3,3
    blr

    (which is an addition followed by a sign extension). The POWER ABi
    specifies that all values passed in registers are sign-extended,
    so the content of a register has the same value independent of
    the width of the signed integer it is being considered as.

    So, the compiler would be within its right to _not_ extend the sign
    of the result, because it could assume that no overflow occurs.

    Indeed. If overflow occurs, then there could be an unexpected result if,
    say, the result of the addition is converted to a 64-bit type.

    Suppose that a 32 to 64 conversion is actually a no-op, because the
    register values are assumed to be sign-extended, like the ABI says.

    Then, say, a and b are positive values that fit into the 32-bit
    range, such that the a + b sum does not; it wraps negative
    in 32 bits.

    The programmer might thus expect (long) (a + b) to be
    that wrapped negative value.

    But since the addition is done in 64 bits, it doesn't overflow;
    there is a positive 64-bit result. If the conversion to long is a
    no-op, a positive value of long will result, as if the expression
    were (long) a + (long) b.
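
    Concretely, the difference might show up in something like this (a
    hypothetical sketch; which result you get depends on whether the
    implementation actually omits the sign extension):

    long widen_sum(int a, int b)     /* e.g. a == b == 0x40000000 */
    {
        /* 32-bit two's complement view: a + b wraps to INT_MIN (that
           wrap is UB); if the addition is really done in 64-bit
           registers and the int -> long conversion is a no-op, the
           result is the large positive value 0x80000000 instead. */
        return (long) (a + b);
    }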

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From anton@anton@mips.complang.tuwien.ac.at (Anton Ertl) to comp.compilers on Sun Jan 22 09:56:22 2023
    From Newsgroup: comp.compilers

    anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
    AMD64 specifies zero-extension for both signed
    and unsigned ints (and has instructions that generate zero-extended
    results).

    Looking at <https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf>, I
    find no such specification. However, compilers certainly behave in
    that way. E.g., for

    int add (int a, int b)
    {
    return a+b;
    }

    gcc generates:

    0: 8d 04 37 lea (%rdi,%rsi,1),%eax
    3: c3 retq

    which zero-extends the result. This certainly rules out an ABI that
    requires sign-extension for signed integers.

    One interesting case is:

    long add (unsigned a, long b)
    {
    return a+b;
    }

    which gcc compiles into

    0: 89 ff mov %edi,%edi
    2: 48 8d 04 37 lea (%rdi,%rsi,1),%rax
    6: c3 retq

    What's the point of the MOV instruction here? It performs a
    64-bit zero extension of %rdi. So gcc apparently assumes that
    passed operands are garbage-extended on AMD64. Or maybe gcc is just
    cautious here. Another test:

    unsigned bar(int x);

    unsigned long foo(long x)
    {
    return bar(x);
    }

    gcc -O compiles this to:

    0: 48 83 ec 08 sub $0x8,%rsp
    4: e8 00 00 00 00 callq 9 <foo+0x9>
    9: 89 c0 mov %eax,%eax
    b: 48 83 c4 08 add $0x8,%rsp
    f: c3 retq

    There is no zero or sign-extension on passing x to bar(), so the value
    is passed garbage-extended. There is a zero extension for converting
    the return value unsigned long, so gcc assumes that the return value
    of bar is not necessarily zero-extended.

    Conclusion: In the System V ABI for AMD64, values are passed around garbage-extended (in the general case).

    - anton
    --
    M. Anton Ertl
    anton@mips.complang.tuwien.ac.at
    http://www.complang.tuwien.ac.at/anton/
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Martin Ward@martin@gkc.org.uk to comp.compilers on Mon Jan 23 17:12:14 2023
    From Newsgroup: comp.compilers

    On 18/01/2023 13:14, Spiros Bousbouras wrote:

    There's no assumption that UB (undefined behaviour) will not happen, both possibilities are accounted for.


    The "assumption that UB will not happen" is shorthand for the idea
    that any optimisation is valid if the optimised code is a refinement
    of the unoptimised code for all initial states such that UB does not
    occur. Equivalently, a proposed optimisation is valid if we represent
    UB as "abort" (a statement which can be refined to anything) and the
    optimised code is a refinement of the unoptimised code for all initial
    states.
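
    Spelled out on the earlier x + 1 > x example (a sketch of the idea,
    with hypothetical function names, not a formal proof):

    #include <limits.h>
    #include <stdlib.h>

    /* Unoptimised code, with the UB modelled explicitly as abort()
       (a statement that may be refined to anything at all): */
    void original(int x, int *y)
    {
        if (x == INT_MAX) abort();   /* models the UB in x + 1 */
        if (x + 1 > x) (*y)++;
    }

    /* Candidate optimisation: */
    void optimised(int x, int *y)
    {
        (void) x;
        (*y)++;
    }

    /* For every initial state in which the abort (i.e. the UB) does not
       occur - that is, x < INT_MAX - both functions increment *y, so
       "optimised" is a valid refinement of "original". */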

    --
    Martin

    Dr Martin Ward | Email: martin@gkc.org.uk | http://www.gkc.org.uk
    G.K.Chesterton site: http://www.gkc.org.uk/gkc | Erdos number: 4
    --- Synchronet 3.21b-Linux NewsLink 1.2