• Regarding assignment to struct

    From Lew Pitcher@21:1/5 to All on Fri May 2 18:34:52 2025
    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.

    I have a project in which these capabilities might come in handy; has
    anyone had experience with assigning to structures, passing them as
    arguments to functions, and/or having a function return a structure?

    Would code like
    struct ab {
        int a;
        char *b;
    } result, function(void);

    if ((result = function()).a == 10) puts(result.b);

    be understandable, or even legal?


    --
    Lew Pitcher
    "In Skills We Trust"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Barry Schwarz@21:1/5 to lew.pitcher@digitalfreehold.ca on Fri May 2 13:35:50 2025
    On Fri, 2 May 2025 18:34:52 -0000 (UTC), Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:

    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.

    I have a project in which these capabilities might come in handy; has
    anyone had experience with assigning to structures, passing them as
    arguments to functions, and/or having a function return a structure?

    Would code like
    struct ab {
    int a;
    char *b;
    } result, function(void);

    if ((result = function()).a == 10) puts(result.b);

    be understandable, or even legal?

    Wouldn't it be quicker and easier to write a simple program to test
    this rather than wait for someone to compose a response? You already
    have 90% of the code written.

    --
    Remove del for email

  • From Waldek Hebisch@21:1/5 to Lew Pitcher on Fri May 2 21:35:31 2025
    Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.

    I have a project in which these capabilities might come in handy; has
    anyone had experience with assigning to structures, passing them as
    arguments to functions, and/or having a function return a structure?

    Typically this is fine. However, in the sdcc-4.2 manual one can find
    the following statement:

    : Deviations from standard compliance:
    : structures and unions cannot be passed as function parameters
    : and cannot be a return value from a function,....

    --
    Waldek Hebisch

  • From Lew Pitcher@21:1/5 to Waldek Hebisch on Sat May 3 01:43:54 2025
    On Fri, 02 May 2025 21:35:31 +0000, Waldek Hebisch wrote:

    Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.

    I have a project in which these capabilities might come in handy; has
    anyone had experience with assigning to structures, passing them as
    arguments to functions, and/or having a function return a structure?

    Typically this is fine. However, in the sdcc-4.2 manual one can find
    the following statement:

    : Deviations from standard compliance:
    : structures and unions cannot be passed as function parameters
    : and cannot be a return value from a function,....

    Not a problem. I don't foresee that the code I'm working on would be
    compiled with a non-standard C compiler (assuming that I've read
    the standards correctly wrt struct pass-by-value).

    --
    Lew Pitcher
    "In Skills We Trust"

  • From Andrey Tarasevich@21:1/5 to Lew Pitcher on Sat May 3 01:14:46 2025
    On Fri 5/2/2025 11:34 AM, Lew Pitcher wrote:
    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.

    Weird. Virtually every C project relies on assignment of structures. Passing-returning structs by value might be more rare (although
    perfectly valid and often appropriate too), but assignment... assignment
    is used by everyone everywhere without even giving it a second thought.

    One dark corner this feature has, is that in C (as opposed to C++) the
    result of an assignment operator is an rvalue, which can easily lead to
    some interesting consequences related to structs with arrays inside.

    --
    Best regards,
    Andrey

  • From David Brown@21:1/5 to Lew Pitcher on Sat May 3 11:46:30 2025
    On 02/05/2025 20:34, Lew Pitcher wrote:
    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.


    I use these features regularly. I have no problem passing structs
    around if that is the convenient way to structure the code.

    Some people mistakenly think that it is very inefficient, in comparison
    to passing around pointers to structs (which is the usual alternative).
    There are circumstances where you might end up with an extra struct and
    an extra copy, but unless you are dealing with very big structs and
    otherwise very fast functions, it's unlikely to be significant. Modern
    ABIs support passing small structs around in registers, and bigger
    structs get passed around using hidden pointers - using the structs in
    your code, rather than pointers to structs, makes the code clearer,
    safer, and gives the optimiser more information for better static
    analysis and code generation.


    I have a project in which these capabilities might come in handy; has
    anyone had experience with assigning to structures, passing them as
    arguments to functions, and/or having a function return a structure?

    Would code like
    struct ab {
    int a;
    char *b;
    } result, function(void);

    if ((result = function()).a == 10) puts(result.b);

    be understandable, or even legal?


    I'd immediately reject any code that mixes declaration of a variable and
    a function in the same declaration. I'd immediately reject any code
    that defines a type and declares a function in one shot. I'd question
    code that defines a type and a variable in one go. But that's my way of
    coding - other people have different rules, and your declarations are legal.

    Personally, I'd have :

    typedef struct {
        int a;
        char * b;
    } ab;

    ab result;

    ab function(void);

    (Obviously "ab" would not be a likely name in real code.)

    Once the type "ab" is defined, I am quite happy making variables of it, assigning them, and using it for function parameters and return types.

  • From Lawrence D'Oliveiro@21:1/5 to Andrey Tarasevich on Sat May 3 22:46:55 2025
    On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:

    Virtually every C project relies on assignment of structures. Passing-returning structs by value might be more rare (although
    perfectly valid and often appropriate too), but assignment...
    assignment is used by everyone everywhere without even giving it a
    second thought.

    There is a caveat, to do with alignment padding: will this always have a defined value?

  • From Richard Damon@21:1/5 to Lew Pitcher on Sat May 3 21:42:37 2025
    On 5/2/25 2:34 PM, Lew Pitcher wrote:
    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.

    I have a project in which these capabilities might come in handy; has
    anyone had experience with assigning to structures, passing them as
    arguments to functions, and/or having a function return a structure?

    Would code like
    struct ab {
    int a;
    char *b;
    } result, function(void);

    if ((result = function()).a == 10) puts(result.b);

    be understandable, or even legal?



    I will say that I have used the feature, but in very limited conditions,
    mostly where the structure is no bigger than one or two typical words.

    It could be a structure with a bitfield to make accesses clearer than
    low level masking, or for a "point" with x and y tied into one object.

    Bigger than that, and you likely want to pass the object by address, not
    by value, passing just a pointer to it.

  • From James Kuyper@21:1/5 to Keith Thompson on Sat May 3 23:38:47 2025
    On 5/3/25 20:37, Keith Thompson wrote:
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:
    Virtually every C project relies on assignment of structures.
    Passing-returning structs by value might be more rare (although
    perfectly valid and often appropriate too), but assignment...
    assignment is used by everyone everywhere without even giving it a
    second thought.

    There is a caveat, to do with alignment padding: will this always have a
    defined value?

    I don't believe so. In a quick look, I don't see anything in
    the standard that explicitly addresses this, but I believe that a
    conforming implementation could implement structure assignment by
    copying the individual members, leaving any padding in the target
    undefined.

    "When a value is stored in an object of structure or union type,
    including in a member object, the bytes of the object representation
    that correspond to any padding bytes take unspecified values.56)"
    (6.2.6.1p6).

    That refers to footnote 56, which says "Thus, for example, structure
    assignment need not copy any padding bits."

    Note that, even when writing to a single member, the representations in
    the padding bytes might be affected. A plausible reason for this to
    happen would be, for example, when a value is written to an 8-bit struct
    field followed by 8 bits of padding on a machine where the word size is
    16 bits. The wording of that clause permits the use of instructions that
    change the contents of an entire word to be used when updating that field.

    Finally, why would you care?

    The fact that an implementation does not have to do the equivalent of
    memcpy() to perform a struct copy means that successful assignment
    cannot be checked by using memcmp().

  • From Michael S@21:1/5 to Richard Damon on Sun May 4 11:01:17 2025
    On Sat, 3 May 2025 21:42:37 -0400
    Richard Damon <richard@damon-family.org> wrote:


    Bigger than that, and you likely want to pass the object by address,
    not by value, passing just a pointer to it.

    That sort of thinking is an example of Knutian premature optimization.

  • From Lawrence D'Oliveiro@21:1/5 to Michael S on Sun May 4 08:34:11 2025
    On Sun, 4 May 2025 11:01:17 +0300, Michael S wrote:

    That sort of thinking is an example of Knutian premature optimization.

    Trying to hold back the optimization tide?

  • From Kenny McCormack@21:1/5 to jameskuyper@alumni.caltech.edu on Sun May 4 09:25:08 2025
    In article <vv6ng8$1410m$1@dont-email.me>,
    James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    ...
    The fact that an implementation does not have to do the equivalent of
    memcpy() to perform a struct copy means that successful assignment
    cannot be checked by using memcmp().

    Which then begs two questions:

    1) Why wouldn't an implementation do it with memcpy()? That is likely
    to be as good or better than any other method, including, especially, a
    member-by-member copy.

    2) Why wouldn't you, the programmer, just use memcpy() instead of
    struct assignment? Yes, I realize there are other cases to consider,
    but in the simple one:

    struct something foo,bar;
    foo = bar;

    memcpy() seems like it would always be easier and more reliable.

    --
    The randomly chosen signature file that would have appeared here is more than 4 lines long. As such, it violates one or more Usenet RFCs. In order to remain in compliance with said RFCs, the actual sig can be found at the following URL:
    http://user.xmission.com/~gazelle/Sigs/Pedantic

  • From David Brown@21:1/5 to Lawrence D'Oliveiro on Sun May 4 14:06:30 2025
    On 04/05/2025 10:34, Lawrence D'Oliveiro wrote:
    On Sun, 4 May 2025 11:01:17 +0300, Michael S wrote:

    That sort of thinking is an example of Knutian premature optimization.

    Trying to hold back the optimization tide?

    I think he meant Knuthian, rather than Knutian :-)

  • From Tim Rentsch@21:1/5 to Andrey Tarasevich on Sun May 4 06:48:13 2025
    Andrey Tarasevich <noone@noone.net> writes:

    On Fri 5/2/2025 11:34 AM, Lew Pitcher wrote:

    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.

    Weird. Virtually every C project relies on assignment of
    structures. Passing-returning structs by value might be more rare
    (although perfectly valid and often appropriate too), but
    assignment... assignment is used by everyone everywhere without even
    giving it a second thought.

    One dark corner this feature has, is that in C (as opposed to C++) the
    result of an assignment operator is an rvalue, which can easily lead
    to some interesting consequences related to structs with arrays
    inside.

    I'm curious to know what interesting consequences you mean here. Do
    you mean something other than cases that have undefined behavior?

  • From Scott Lurndal@21:1/5 to James Kuyper on Sun May 4 14:27:01 2025
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    On 5/3/25 20:37, Keith Thompson wrote:
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:
    Virtually every C project relies on assignment of structures.
    Passing-returning structs by value might be more rare (although
    perfectly valid and often appropriate too), but assignment...
    assignment is used by everyone everywhere without even giving it a
    second thought.

    There is a caveat, to do with alignment padding: will this always have a
    defined value?

    I don't believe so. In a quick look, I don't see anything in
    the standard that explicitly addresses this, but I believe that a
    conforming implementation could implement structure assignment by
    copying the individual members, leaving any padding in the target
    undefined.

    "When a value is stored in an object of structure or union type,
    including in a member object, the bytes of the object representation
    that correspond to any padding bytes take unspecified values.56)"
    (6.2.6.1p6).

    That refers to footnote 56, which says "Thus, for example, structure
    assignment need not copy any padding bits."

    Are there any C implementations in common use that don't just
    use memcpy or an optimized version thereof?

  • From Tim Rentsch@21:1/5 to Lew Pitcher on Sun May 4 07:49:15 2025
    Lew Pitcher <lew.pitcher@digitalfreehold.ca> writes:

    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.

    I have a project in which these capabilities might come in handy; has
    anyone had experience with assigning to structures, passing them as
    arguments to functions, and/or having a function return a structure?

    Would code like
    struct ab {
    int a;
    char *b;
    } result, function(void);

    if ((result = function()).a == 10) puts(result.b);

    be understandable, or even legal?

    The style is unorthodox, but the code is understandable.

    Also it is both legal and well-defined, back to and
    including C90.

  • From David Brown@21:1/5 to Scott Lurndal on Sun May 4 18:45:32 2025
    On 04/05/2025 16:27, Scott Lurndal wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    On 5/3/25 20:37, Keith Thompson wrote:
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:
    Virtually every C project relies on assignment of structures.
    Passing-returning structs by value might be more rare (although
    perfectly valid and often appropriate too), but assignment...
    assignment is used by everyone everywhere without even giving it a
    second thought.

    There is a caveat, to do with alignment padding: will this always have a
    defined value?

    I don't believe so. In a quick look, I don't see anything in
    the standard that explicitly addresses this, but I believe that a
    conforming implementation could implement structure assignment by
    copying the individual members, leaving any padding in the target
    undefined.

    "When a value is stored in an object of structure or union type,
    including in a member object, the bytes of the object representation
    that correspond to any padding bytes take unspecified values.56)"
    (6.2.6.1p6).

    That refers to footnote 56, which says "Thus, for example, structure
    assignment need not copy any padding bits."

    Are there any C implementations in common use that don't just
    use memcpy or an optimized version thereof?


    Sometimes small structs never make it to memory, or are handled by the
    compiler as though they were individual variables (as long as that is
    within "as-if" usage, of course). Copying a struct might merely mean
    the compiler keeps track of the logical copy without actually copying
    any memory. (You could argue that the compiler is still treating it
    like memcpy, as memcpy calls don't always copy something.)

    I think it would be unusual to see a significant difference between a
    struct assignment copy and a memcpy on a compiler that optimises memcpy
    well.

    But on a compiler that does not handle memcpy well, then a struct
    assignment could be inlined while a memcpy could mean an external
    library call with significant overhead.

  • From James Kuyper@21:1/5 to Keith Thompson on Sun May 4 21:08:43 2025
    On 5/4/25 16:20, Keith Thompson wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    On 5/3/25 20:37, Keith Thompson wrote:
    ...
    I don't believe so. In a quick look, I don't see anything in
    the standard that explicitly addresses this, but I believe that a
    conforming implementation could implement structure assignment by
    copying the individual members, leaving any padding in the target
    undefined.
    ...
    Finally, why would you care?

    The fact that an implementation does not have to do the equivalent of
    memcpy() to perform a struct copy means that successful assignment
    cannot be checked by using memcmp().

    Are you referring to checking whether an assignment was performed
    or not, due to uncertainty about what the program has done? If you
    mean doing an assignment and then checking whether it succeeded,
    I can't think of a context where that makes sense.

    Sorry, I didn't explain what I was thinking about in any detail. I've
    seen code that allows a data structure to be modified by one section of
    the code, and then periodically checks each object in that data
    structure (including aggregate objects) to see whether it has been
    modified by using memcmp() versus a saved copy. If so, it updated the
    saved copy, including a timestamp when it was updated. If it weren't for
    the need to keep track of the timestamp, it would always be simpler, and
    not much slower, to always replace the saved copy, whether or not
    there'd been a change.

    I should have made it clear that I basically understand and agree with
    your "why would you care" criticism. But it's part of my nature to look
    for the edge cases where differences that ordinarily don't matter, could matter.

  • From Scott Lurndal@21:1/5 to Keith Thompson on Mon May 5 00:41:03 2025
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    On 5/3/25 20:37, Keith Thompson wrote:
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:
    Virtually every C project relies on assignment of structures.
    Passing-returning structs by value might be more rare (although
    perfectly valid and often appropriate too), but assignment...
    assignment is used by everyone everywhere without even giving it a
    second thought.

    There is a caveat, to do with alignment padding: will this always have a
    defined value?

    I don't believe so. In a quick look, I don't see anything in
    the standard that explicitly addresses this, but I believe that a
    conforming implementation could implement structure assignment by
    copying the individual members, leaving any padding in the target
    undefined.

    "When a value is stored in an object of structure or union type,
    including in a member object, the bytes of the object representation
    that correspond to any padding bytes take unspecified values.56)"
    (6.2.6.1p6).

    That refers to footnote 56, which says "Thus, for example, structure
    assignment need not copy any padding bits."

    Yes, that's what I missed.

    It's interesting that the footnote refers to padding *bits* rather than
    padding *bytes*. I presume this was unintentional.

    Padding bits:

    struct A {
        uint64_t tlen  : 16,
                       : 20,
                 pkind : 6,
                 fsz   : 6,
                 gsz   : 14,
                 g     : 1,
                 ptp   : 1;
    } s;

    There are 20 padding bits in this declaration. Perhaps that's
    what they're referring to?

  • From Andrey Tarasevich@21:1/5 to Tim Rentsch on Sun May 4 22:22:12 2025
    On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:

    One dark corner this feature has, is that in C (as opposed to C++) the
    result of an assignment operator is an rvalue, which can easily lead
    to some interesting consequences related to structs with arrays
    inside.

    I'm curious to know what interesting consequences you mean here. Do
    you mean something other than cases that have undefined behavior?

    I'm referring to the matter of the address identity of the resultant
    rvalue object. At first, "address identity of rvalue" might sound
    strange, but the standard says that there's indeed an object tied to
    such rvalue, and once we start applying array-to-pointer conversion (and
    use `[]` operator), lvalues and addresses quickly come into the picture.

    The standard says in 6.2.4/8:

    "A non-lvalue expression with structure or union type, where the
    structure or union contains a member with array type [...]
    refers to an object with automatic storage duration and temporary
    lifetime. Its lifetime begins when the expression is evaluated and its
    initial value is the value of the expression. Its lifetime ends when the
    evaluation of the containing full expression ends. [...] Such an object
    need not have a unique address."
    https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8

    I'm wondering what the last sentence is intended to mean ("... need not
    have a unique address"). At first sight, the intent seems to be
    obvious: it simply says that such temporary objects might repeatedly
    appear (and disappear) at the same location in storage, which is a
    natural thing to expect.

    But is it, perhaps, intended to also allow such temporaries to have
    addresses identical to regular named objects? It is not immediately
    clear to me.

    And when I make the following experiment with GCC and Clang

    #include <stdio.h>

    struct S { int a[10]; };

    int main(void)
    {
        struct S a, b = { 0 };
        int *pa, *pb, *pc;

        pa = &a.a[5];
        pb = &b.a[5];
        pc = &(a = b).a[5];

        /* %p expects void *, hence the casts */
        printf("%p %p %p\n", (void *)pa, (void *)pb, (void *)pc);
    }

    I consistently get the following output from GCC

    0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544

    And this is what I get from Clang

    0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4

    As you can see, GCC apparently took C++-like approach to this situation.
    The returned "temporary" is not really a separate temporary at all, but actually `a` itself.

    Meanwhile, in Clang all three pointers are different, i.e. Clang decided
    to actually create a separate temporary object for the result of the assignment.

    I have a strong feeling that GCC's behavior is non-conforming. The last sentence of 6.2.4/8 is not supposed to permit "projecting" the resultant temporaries onto existing named objects. I could be wrong...

    --
    Best regards,
    Andrey

  • From Michael S@21:1/5 to Andrey Tarasevich on Mon May 5 11:12:13 2025
    On Sun, 4 May 2025 22:22:12 -0700
    Andrey Tarasevich <noone@noone.net> wrote:

    On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:

    One dark corner this feature has, is that in C (as opposed to C++)
    the result of an assignment operator is an rvalue, which can
    easily lead to some interesting consequences related to structs
    with arrays inside.

    I'm curious to know what interesting consequences you mean here. Do
    you mean something other than cases that have undefined behavior?

    I'm referring to the matter of the address identity of the resultant
    rvalue object. At first, "address identity of rvalue" might sound
    strange, but the standard says that there's indeed an object tied to
    such rvalue, and once we start applying array-to-pointer conversion
    (and use `[]` operator), lvalues and addresses quickly come into the
    picture.

    The standard says in 6.2.4/8:

    "A non-lvalue expression with structure or union type, where the
    structure or union contains a member with array type [...]
    refers to an object with automatic storage duration and temporary
    lifetime. Its lifetime begins when the expression is evaluated and
    its initial value is the value of the expression. Its lifetime ends
    when the evaluation of the containing full expression ends. [...]
    Such an object need not have a unique address."
    https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8

    I'm wondering what the last sentence is intended to mean ("... need not
    have a unique address"). At first sight, the intent seems to be
    obvious: it simply says that such temporary objects might repeatedly
    appear (and disappear) at the same location in storage, which is a
    natural thing to expect.

    But is it, perhaps, intended to also allow such temporaries to have
    addresses identical to regular named objects? It is not immediately
    clear to me.

    And when I make the following experiment with GCC and Clang

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5];
    pb = &b.a[5];
    pc = &(a = b).a[5];

    printf("%p %p %p\n", pa, pb, pc);
    }

    I consistently get the following output from GCC

    0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544

    And this is what I get from Clang

    0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4

    As you can see, GCC apparently took C++-like approach to this
    situation. The returned "temporary" is not really a separate
    temporary at all, but actually `a` itself.

    Meanwhile, in Clang all three pointers are different, i.e. Clang
    decided to actually create a separate temporary object for the result
    of the assignment.

    I have a strong feeling that GCC's behavior is non-conforming. The
    last sentence of 6.2.4/8 is not supposed to permit "projecting" the
    resultant temporaries onto existing named objects. I could be wrong...


    According to my understanding, you are wrong.
    Taking pointer of non-lvalue is UB, so anything compiler does is
    conforming.

  • From Muttley@dastardlyhq.com@21:1/5 to All on Mon May 5 08:50:39 2025
    On Sat, 3 May 2025 11:46:30 +0200
    David Brown <david.brown@hesbynett.no> gabbled:
    On 02/05/2025 20:34, Lew Pitcher wrote:
    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.


    I use these features regularly. I have no problem passing structs
    around if that is the convenient way to structure the code.

    If you want to pass an actual array to a function instead of a pointer to it, embedding it in a structure is the only way to do it.

  • From Andrey Tarasevich@21:1/5 to Michael S on Mon May 5 01:29:47 2025
    On Mon 5/5/2025 1:12 AM, Michael S wrote:

    According to my understanding, you are wrong.
    Taking pointer of non-lvalue is UB, so anything compiler does is
    conforming.


    Er... What? What specifically do you mean by "taking pointers"?

    The whole functionality of `[]` operator in C is based on pointers. This expression

    (a = b).a[5]

    is already doing your "taking pointers of non-lvalue" (if I understood
    you correctly) as part of array-to-pointer conversion. And no, it is not UB.

    This is not UB either

    struct S foo(void) { return (struct S) { 1, 2, 3 }; }
    ...
    int *p;
    p = &foo().a[2], printf("%d\n", *p);

    So, what you are basing your "UB" claim on is not clear to me.

    --
    Best regards,
    Andrey

  • From Michael S@21:1/5 to Andrey Tarasevich on Mon May 5 12:01:45 2025
    On Mon, 5 May 2025 01:29:47 -0700
    Andrey Tarasevich <noone@noone.net> wrote:

    On Mon 5/5/2025 1:12 AM, Michael S wrote:

    According to my understanding, you are wrong.
    Taking pointer of non-lvalue is UB, so anything compiler does is conforming.


    Er... What? What specifically do you mean by "taking pointers"?

    The whole functionality of `[]` operator in C is based on pointers.
    This expression

    (a = b).a[5]


    is already doing your "taking pointers of non-lvalue" (if I
    understood you correctly) as part of array-to-pointer conversion. And
    no, it is not UB.

    This is not UB either

    struct S foo(void) { return (struct S) { 1, 2, 3 }; }
    ...
    int *p;
    p = &foo().a[2], printf("%d\n", *p);



    That is not UB:
    int a5 = (a = b).a[5];

    That is UB:
    int* pa5 = &(a = b).a[5];

    So, what you are basing your "UB" claim on is not clear to me.


    If you read the post of Keith Thompson and it is still not clear to
    you, then I cannot help.

  • From Michael S@21:1/5 to Keith Thompson on Mon May 5 12:03:31 2025
    On Mon, 05 May 2025 01:34:16 -0700
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:


    And more obviously, "%p" requires an argument of type void*, not int*.


    That part of otherwise very good comment is unreasonably pedantic.

  • From Kenny McCormack@21:1/5 to already5chosen@yahoo.com on Mon May 5 11:30:43 2025
    In article <20250505120331.000015b2@yahoo.com>,
    Michael S <already5chosen@yahoo.com> wrote:
    On Mon, 05 May 2025 01:34:16 -0700
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:


    And more obviously, "%p" requires an argument of type void*, not int*.


    That part of otherwise very good comment is unreasonably pedantic.

    That's KT for you. That's his reason for existence.

    Welcome to CLC!

    --
    Alice was something of a handful to her father, Theodore Roosevelt. He was once asked by a visiting dignitary about parenting his spitfire of a daughter and he replied, "I can be President of the United States, or I can control Alice. I cannot possibly do both."

  • From David Brown@21:1/5 to Muttley@dastardlyhq.com on Mon May 5 13:34:45 2025
    On 05/05/2025 10:50, Muttley@dastardlyhq.com wrote:
    On Sat, 3 May 2025 11:46:30 +0200
    David Brown <david.brown@hesbynett.no> gabbled:
    On 02/05/2025 20:34, Lew Pitcher wrote:
    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
       "Structures may be assigned, passed as arguments to functions, and
        returned by functions."

     From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.


    I use these features regularly.  I have no problem passing structs
    around if that is the convenient way to structure the code.

    If you want to pass an actual array to a function instead of a pointer
    to it, embedding it in a structure is the only way to do it.

    Yes.

    (Well, you could embed it in a union if you prefer, but a struct seems
    more likely.)

  • From Tim Rentsch@21:1/5 to Michael S on Mon May 5 07:14:17 2025
    Michael S <already5chosen@yahoo.com> writes:

    On Mon, 5 May 2025 01:29:47 -0700
    Andrey Tarasevich <noone@noone.net> wrote:

    On Mon 5/5/2025 1:12 AM, Michael S wrote:

    According to my understanding, you are wrong.
    Taking pointer of non-lvalue is UB, so anything compiler does is
    conforming.

    Er... What? What specifically do you mean by "taking pointers"?

    The whole functionality of `[]` operator in C is based on pointers.
    This expression

    (a = b).a[5]
    [...]
    is already doing your "taking pointers of non-lvalue" (if I
    understood you correctly) as part of array-to-pointer conversion.
    And no, it is not UB.

    This is not UB either

    struct S foo(void) { return (struct S) { 1, 2, 3 }; }
    ...
    int *p;
    p = &foo().a[2], printf("%d\n", *p);

    That is not UB:
    int a5 = (a = b).a[5];

    That is UB:
    int* pa5 = &(a = b).a[5];

    So, what you are basing your "UB" claim on is not clear to me.

    If you read the post of Keith Thompson and it is still not clear to
    you, then I cannot help.

    Under C11 semantics, both

    int a5 = (a = b).a[5];

    and

    int* pa5 = &(a = b).a[5];

    have well-defined behavior. The undefined behavior of the
    upthread example comes later, only after the statement assigning
    to the pointer (or here, initializing) completes. It isn't
    taking the address with & that has undefined behavior; it is
    using the stored pointer value in a subsequent statement, /after
    the full expression containing the & operator has completed/,
    that results in undefined behavior.

  • From Tim Rentsch@21:1/5 to Michael S on Mon May 5 07:03:40 2025
    Michael S <already5chosen@yahoo.com> writes:

    On Sun, 4 May 2025 22:22:12 -0700
    Andrey Tarasevich <noone@noone.net> wrote:

    On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:

    One dark corner this feature has, is that in C (as opposed to C++)
    the result of an assignment operator is an rvalue, which can
    easily lead to some interesting consequences related to structs
    with arrays inside.

    I'm curious to know what interesting consequences you mean here. Do
    you mean something other than cases that have undefined behavior?

    I'm referring to the matter of the address identity of the resultant
    rvalue object. At first, "address identity of rvalue" might sound
    strange, but the standard says that there's indeed an object tied to
    such rvalue, and once we start applying array-to-pointer conversion
    (and use `[]` operator), lvalues and addresses quickly come into the
    picture.

    The standard says in 6.2.4/8:

    "A non-lvalue expression with structure or union type, where the
    structure or union contains a member with array type [...]
    refers to an object with automatic storage duration and temporary
    lifetime. Its lifetime begins when the expression is evaluated and
    its initial value is the value of the expression. Its lifetime ends
    when the evaluation of the containing full expression ends. [...]
    Such an object need not have a unique address."
    https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8

    I am wondering what the last sentence is intended to mean ("... need not
    have a unique address"). At first sight, the intent seems to be
    obvious: it simply says that such temporary objects might repeatedly
    appear (and disappear) at the same location in storage, which is a
    natural thing to expect.

    But is it, perhaps, intended to also allow such temporaries to have
    addresses identical to regular named objects? It is not immediately
    clear to me.

    And when I make the following experiment with GCC and Clang

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5];
    pb = &b.a[5];
    pc = &(a = b).a[5];

    printf("%p %p %p\n", pa, pb, pc);
    }

    I consistently get the following output from GCC

    0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544

    And this is what I get from Clang

    0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4

    As you can see, GCC apparently took C++-like approach to this
    situation. The returned "temporary" is not really a separate
    temporary at all, but actually `a` itself.

    Meanwhile, in Clang all three pointers are different, i.e. Clang
    decided to actually create a separate temporary object for the result
    of the assignment.

    I have a strong feeling that GCC's behavior is non-conforming. The
    last sentence of 6.2.4/8 is not supposed to permit "projecting" the
    resultant temporaries onto existing named objects. I could be wrong...

    According to my understanding, you are wrong.
    Taking pointer of non-lvalue is UB, so anything compiler does is
    conforming.

    Maybe you are thinking of C90.

    In both C99 and C11, the expression

    (a = b).a[5]

    is an lvalue, so taking its address with & is allowed.

    It's easy to verify this assertion using gcc -std=c99 -pedantic. If
    the given expression were not an lvalue then taking its address with
    & would be a constraint violation, requiring a diagnostic. But no
    diagnostic is produced. (Using clang in place of gcc also produces
    no diagnostic.)

    The behavior under C99 semantics is arguably murky. But under C11
    semantics the behavior is well-defined.

  • From Tim Rentsch@21:1/5 to Keith Thompson on Mon May 5 06:34:49 2025
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Andrey Tarasevich <noone@noone.net> writes:
    [...]

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5];
    pb = &b.a[5];
    pc = &(a = b).a[5];

    printf("%p %p %p\n", pa, pb, pc);
    }

    [...]

    I think that code has undefined behavior.

    Right. [*]

    (a = b) is an rvalue that refers to an object of type struct S with
    temporary lifetime. pc holds the address of a subobject of that
    temporary object. The object reaches the end of its lifetime at the end
    of the evaluation of the full expression. You then print its value.

    Even if the printf() statement were replaced by

    (void)pc;

    the behavior would be undefined, because the pointer held in pc
    becomes indeterminate as soon as the statement containing the
    assignment to pc completes.


    [*] Assuming C11 semantics. At best inadvisable under C99
    semantics, and a constraint violation under C90 semantics.

  • From Tim Rentsch@21:1/5 to Andrey Tarasevich on Mon May 5 07:56:40 2025
    Andrey Tarasevich <noone@noone.net> writes:

    On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:

    One dark corner this feature has, is that in C (as opposed to C++) the
    result of an assignment operator is an rvalue, which can easily lead
    to some interesting consequences related to structs with arrays
    inside.

    I'm curious to know what interesting consequences you mean here. Do
    you mean something other than cases that have undefined behavior?

    I'm referring to the matter of the address identity of the resultant
    rvalue object. At first, "address identity of rvalue" might sound
    strange, but the standard says that there's indeed an object tied to
    such rvalue, and once we start applying array-to-pointer conversion
    (and use `[]` operator), lvalues and addresses quickly come into the
    picture.

    The standard says in 6.2.4/8:

    "A non-lvalue expression with structure or union type, where the
    structure or union contains a member with array type [...]
    refers to an object with automatic storage duration and temporary
    lifetime. Its lifetime begins when the expression is evaluated and its initial value is the value of the expression. Its lifetime ends when
    the evaluation of the containing full expression ends. [...] Such an
    object need not have a unique address."
    https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8

    The last sentence there is not present in N1570. Apparently it was
    introduced later, in C17. (My appreciation to Keith Thompson for
    reporting this.)

    I am wondering what the last sentence is intended to mean ("... need not
    have a unique address"). At first sight, the intent seems to be
    obvious: it simply says that such temporary objects might repeatedly
    appear (and disappear) at the same location in storage, which is a
    natural thing to expect.

    Ahh, I see now what your concern is.

    But is it, perhaps, intended to also allow such temporaries to have
    addresses identical to regular named objects? It is not immediately
    clear to me.

    My reading of the post-C11 standards is that they allow the "new"
    object to overlap with already existing objects, including both
    declared objects and objects whose storage was allocated using
    malloc().

    And when I make the following experiment with GCC and Clang

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5];
    pb = &b.a[5];
    pc = &(a = b).a[5];

    printf("%p %p %p\n", pa, pb, pc);
    }

    I consistently get the following output from GCC

    0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544

    And this is what I get from Clang

    0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4

    As you can see, GCC apparently took C++-like approach to this
    situation. The returned "temporary" is not really a separate temporary
    at all, but actually `a` itself.

    Yeah.

    Meanwhile, in Clang all three pointers are different, i.e. Clang
    decided to actually create a separate temporary object for the result
    of the assignment.

    Which in my reading of the standard is required under C11 rules.
    I have reproduced your results under -std=c11 -pedantic, for both
    gcc and clang.

    I have a strong feeling that GCC's behavior is non-conforming. The
    last sentence of 6.2.4/8 is not supposed to permit "projecting" the
    resultant temporaries onto existing named objects. I could be wrong...

    My judgment is that the behavior under gcc is non-conforming if the
    compilation was done using C11 semantics. Under C17 or later rules
    the gcc behavior is allowed (and may have been what prompted the
    change in C17, but that is just speculation on my part). In any
    case I understand now what you were getting at. Thank you for
    bringing this hazard to the group's attention.

    I hope someone files a bug report for gcc using -std=c11 rules,
    because what gcc does under that setting (along with -pedantic)
    is surely at odds with the plain reading of the C11 standard,
    for the situation being discussed here.

    Editorial comment: here is yet another case where post-C11 changes
    to the C standard seem ill advised, and another reason not to use
    any version of the ISO C standard for C17 or later. And it's
    disappointing that gcc -std=c11 -pedantic strays into the realm of non-conforming behavior.

  • From Andrey Tarasevich@21:1/5 to Michael S on Mon May 5 08:45:09 2025
    On Mon 5/5/2025 2:01 AM, Michael S wrote:
    On Mon, 5 May 2025 01:29:47 -0700
    Andrey Tarasevich <noone@noone.net> wrote:

    On Mon 5/5/2025 1:12 AM, Michael S wrote:

    According to my understanding, you are wrong.
    Taking pointer of non-lvalue is UB, so anything compiler does is
    conforming.


    Er... What? What specifically do you mean by "taking pointers"?

    The whole functionality of `[]` operator in C is based on pointers.
    This expression

    (a = b).a[5]


    is already doing your "taking pointers of non-lvalue" (if I
    understood you correctly) as part of array-to-pointer conversion. And
    no, it is not UB.

    This is not UB either

    struct S foo(void) { return (struct S) { 1, 2, 3 }; }
    ...
    int *p;
    p = &foo().a[2], printf("%d\n", *p);



    That is not UB:
    int a5 = (a = b).a[5];

    That is UB:
    int* pa5 = &(a = b).a[5];

    No, it isn't.

    If you read the post of Keith Thompson and it is still not clear to
    you, then I cannot help.

    The only valid "UB" claim in Keith's post is my printing the value of
    `pc` pointer, which by that time happens to point nowhere, since the
    lifetime of the temporary is over. (And, of course, lack of conversion
    to `void *` is an issue).

    As for the expressions like

    &(a = b).a[5];

    and

    &foo().a[2]

    - these by themselves are perfectly valid. There's no UB in these expressions. (And this is not a debate.)

    Here's a version of the same code that corrects the above distracting issues

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    --
    Best regards,
    Andrey

  • From Andrey Tarasevich@21:1/5 to Keith Thompson on Mon May 5 10:14:40 2025
    On Mon 5/5/2025 1:26 AM, Keith Thompson wrote:

    I am wondering what the last sentence is intended to mean ("... need not
    have a unique address"). At first sight, the intent seems to be
    obvious: it simply says that such temporary objects might repeatedly
    appear (and disappear) at the same location in storage, which is a
    natural thing to expect.

    You snipped this: "Any attempt to modify an object with temporary
    lifetime results in undefined behavior.". Which means, I think,
    that an implementation that shared storage for "such an object"
    with something else probably isn't going to cause problems for any
    code with defined behavior.

    It is going to cause problems, if the code relies on the address
    identity of the object, assuming the standard intends to provide such guarantees.

    Though I can imagine the possibility of code that modifies `a` and
    reads via `pc` within the same full expression.

    That's easy (in the context of declarations from my previous example):

    pc = &(a = b).a[5], a.a[5] = 42, printf("%d\n", *pc);

    As one would expect, this produces different output in GCC and Clang for
    the reasons I already described.

    But unless I've somehow missed it, the "Such an object need not
    have a unique address." wording doesn't appear on that web page or
    in my copy of n1570.pdf. C17 does add these two sentences:

    An object with temporary lifetime behaves as if it were declared
    with the type of its value for the purposes of effective type. Such
    an object need not have a unique address.

    Normally any two objects with overlapping lifetime must have distinct addresses. This addition, I think, gives compilers permission to have temporary lifetime objects overlap with other existing objects, but not
    to have a modification to one object affect the value of the other
    (unless the modification invokes UB, of course).

    If so, that would be extremely underspecified. A mere "such an object
    need not have a unique address" is insufficient to fully convey the
    permission to overlap existing named objects. And that's probably what
    led to difference in interpretation between GCC and Clang.

    Modification of the temporary is "prohibited" (as UB), but modification
    of the overlapped named object is not. The consequences can be quite surprising.

    --
    Best regards,
    Andrey

  • From David Brown@21:1/5 to Tim Rentsch on Mon May 5 20:00:39 2025
    On 05/05/2025 16:56, Tim Rentsch wrote:
    Andrey Tarasevich <noone@noone.net> writes:

    On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:

    One dark corner this feature has, is that in C (as opposed to C++) the
    result of an assignment operator is an rvalue, which can easily lead
    to some interesting consequences related to structs with arrays
    inside.

    I'm curious to know what interesting consequences you mean here. Do
    you mean something other than cases that have undefined behavior?

    I'm referring to the matter of the address identity of the resultant
    rvalue object. At first, "address identity of rvalue" might sound
    strange, but the standard says that there's indeed an object tied to
    such rvalue, and once we start applying array-to-pointer conversion
    (and use `[]` operator), lvalues and addresses quickly come into the
    picture.

    The standard says in 6.2.4/8:

    "A non-lvalue expression with structure or union type, where the
    structure or union contains a member with array type [...]
    refers to an object with automatic storage duration and temporary
    lifetime. Its lifetime begins when the expression is evaluated and its
    initial value is the value of the expression. Its lifetime ends when
    the evaluation of the containing full expression ends. [...] Such an
    object need not have a unique address."
    https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8

    The last sentence there is not present in N1570. Apparently it was introduced later, in C17. (My appreciation to Keith Thompson for
    reporting this.)

    I am wondering what the last sentence is intended to mean ("... need not
    have a unique address"). At first sight, the intent seems to be
    obvious: it simply says that such temporary objects might repeatedly
    appear (and disappear) at the same location in storage, which is a
    natural thing to expect.

    Ahh, I see now what your concern is.

    But is it, perhaps, intended to also allow such temporaries to have
    addresses identical to regular named objects? It is not immediately
    clear to me.

    My reading of the post-C11 standards is that they allow the "new"
    object to overlap with already existing objects, including both
    declared objects and objects whose storage was allocated using
    malloc().

    And when I make the following experiment with GCC and Clang

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5];
    pb = &b.a[5];
    pc = &(a = b).a[5];

    printf("%p %p %p\n", pa, pb, pc);
    }

    I consistently get the following output from GCC

    0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544

    And this is what I get from Clang

    0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4

    As you can see, GCC apparently took C++-like approach to this
    situation. The returned "temporary" is not really a separate temporary
    at all, but actually `a` itself.

    Yeah.

    Meanwhile, in Clang all three pointers are different, i.e. Clang
    decided to actually create a separate temporary object for the result
    of the assignment.

    Which in my reading of the standard is required under C11 rules.
    I have reproduced your results under -std=c11 -pedantic, for both
    gcc and clang.


    Compilers don't have to follow the behaviour specified by the standard
    in a "direct translation" manner in order to be correct and conforming.
    They have to generate code that in the absence of any attempt to execute something with undefined behaviour, will give the same observable
    behaviour as a "direct translation" would.

    The result of the "(a = b)" expression should be a temporary object
    distinct from "a" and "b", with a lifetime extending only to the end of
    the expression assigning to "pc" (prior to C17).

    Is there any way to distinguish between "pc" pointing to an int inside
    this now dead temporary object, and it pointing to an int inside "a",
    without invoking undefined behaviour?

    By the time you are using "pc" to print it, the pointer itself has an indeterminate value - the compiler can quite happily give it the same
    value as "pa", so looking at the pointer in the printf() statement does
    not show a non-conformance.

    Attempting to modify the temporary lifetime object, such as by writing
    "*(pc = &(a = b).a[5]) = 42;", is undefined behaviour.

    It is entirely possible that there /is/ some way to determine that the
    compiler is not making a distinct temporary object while avoiding any
    undefined behaviour or indeterminate values. But I don't think the code
    here does show that - and it is therefore not an example of
    non-conforming behaviour. I think GCC and clang can be viewed as having
    simply picked different ways to generate their indeterminate values.

    I will be happy to change that opinion if someone has a better argument
    or example.


    I have a strong feeling that GCC's behavior is non-conforming. The
    last sentence of 6.2.4/8 is not supposed to permit "projecting" the
    resultant temporaries onto existing named objects. I could be wrong...

    My judgment is that the behavior under gcc is non-conforming if the compilation was done using C11 semantics. Under C17 or later rules
    the gcc behavior is allowed (and may have been what prompted the
    change in C17, but that is just speculation on my part). In any
    case I understand now what you were getting at. Thank you for
    bringing this hazard to the group's attention.

    I hope someone files a bug report for gcc using -std=c11 rules,
    because what gcc does under that setting (along with -pedantic)
    is surely at odds with the plain reading of the C11 standard,
    for the situation being discussed here.

    Editorial comment: here is yet another case where post-C11 changes
    to the C standard seem ill advised, and another reason not to use
    any version of the ISO C standard for C17 or later. And it's
    disappointing that gcc -std=c11 -pedantic strays into the realm of non-conforming behavior.

  • From Michael S@21:1/5 to Andrey Tarasevich on Mon May 5 20:20:38 2025
    On Mon, 5 May 2025 08:45:09 -0700
    Andrey Tarasevich <noone@noone.net> wrote:

    On Mon 5/5/2025 2:01 AM, Michael S wrote:
    On Mon, 5 May 2025 01:29:47 -0700
    Andrey Tarasevich <noone@noone.net> wrote:

    On Mon 5/5/2025 1:12 AM, Michael S wrote:

    According to my understanding, you are wrong.
    Taking pointer of non-lvalue is UB, so anything compiler does is
    conforming.


    Er... What? What specifically do you mean by "taking pointers"?

    The whole functionality of `[]` operator in C is based on pointers.
    This expression

    (a = b).a[5]


    is already doing your "taking pointers of non-lvalue" (if I
    understood you correctly) as part of array-to-pointer conversion.
    And no, it is not UB.

    This is not UB either

    struct S foo(void) { return (struct S) { 1, 2, 3 }; }
    ...
    int *p;
    p = &foo().a[2], printf("%d\n", *p);



    That is not UB:
    int a5 = (a = b).a[5];

    That is UB:
    int* pa5 = &(a = b).a[5];

    No, it isn't.

    If you read the post of Keith Thompson and it is still not clear to
    you, then I cannot help.

    The only valid "UB" claim in Keith's post is my printing the value of
    `pc` pointer, which by that time happens to point nowhere, since the
    lifetime of the temporary is over. (And, of course, lack of
    conversion to `void *` is an issue).

    As for the expressions like

    &(a = b).a[5];

    and

    &foo().a[2]


    Expressions by themselves are valid. But since there is no situation in
    which the value produced by expressions is valid outside of expressions
    the compiler can generate any value it wants, even NULL or value
    completely outside of address space of current process.

    - these by themselves are perfectly valid. There's no UB in these expressions. (And this is not a debate.)

    Here's a version of the same code that corrects the above distracting
    issues

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.


    It's only not UB in the nasal demons sense.
    It's UB in the sense that we can't predict the values of expressions
    like (pa==pc) and (pb==pc). I.e. pc is completely useless. In my book
    it is a form of UB.

  • From Kaz Kylheku@21:1/5 to Keith Thompson on Mon May 5 21:10:07 2025
    On 2025-05-05, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Michael S <already5chosen@yahoo.com> writes:
    On Mon, 05 May 2025 01:34:16 -0700
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    And more obviously, "%p" requires an argument of type void*, not int*.

    That part of otherwise very good comment is unreasonably pedantic.

    I disagree. I suggest it's a bad habit to use "%p" without ensuring,
    by a cast if necessary, that the argument is of type void*.

    In most implementations, it's likely that all pointers have the same
    size and representation and are passed as arguments in the same way,
    but getting the types right means one less thing to worry about.

    If the codebase assumes all data pointers are the same size, bit pattern
    and are treated the same in the calling conventions / ABI, then it
    is probably moot.

    That code is doomed on a platform where the assumption doesn't hold, and
    the printf statements are probably not independently reusable.

    (I mostly put in these casts just to communicate to others that
    an ISO C language lawyer works here, if you happen to need one.)

    Also, it would be amazingly stupid of any such platform not to just
    make those printfs work: to promote variadic arguments of
    pointer-to-object type to a common representation which is the same as
    void *, combined with a matching behavior in the va_arg macro for
    extracting the value back into any pointer-to-object type.

    Mountains of non-standard-conforming code exert tremendous pressure on
    both hardware platforms and the way C implementations are adapted to
    those platforms.


    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

  • From Tim Rentsch@21:1/5 to Keith Thompson on Mon May 5 17:04:06 2025
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Andrey Tarasevich <noone@noone.net> writes:
    [...]

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    I believe it does. [...]

    If you look again carefully, I expect you will reach a
    different conclusion.

  • From Tim Rentsch@21:1/5 to Scott Lurndal on Mon May 5 21:57:47 2025
    scott@slp53.sl.home (Scott Lurndal) writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    On 5/3/25 20:37, Keith Thompson wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:

    Virtually every C project relies on assignment of structures.
    Passing-returning structs by value might be more rare (although
    perfectly valid and often appropriate too), but assignment...
    assignment is used by everyone everywhere without even giving
    it a second thought.

    There is a caveat, to do with alignment padding: will this
    always have a defined value?

    I don't believe so. In a quick look, I don't see anything in
    the standard that explicitly addresses this, but I believe that a
    conforming implementation could implement structure assignment by
    copying the individual members, leaving any padding in the target
    undefined.

    "When a value is stored in an object of structure or union type,
    including in a member object, the bytes of the object
    representation that correspond to any padding bytes take
    unspecified values.56)" (6.2.6.1p6).

    That refers to footnote 56, which says "Thus, for example,
    structure assignment need not copy any padding bits."

    Yes, that's what I missed.

    It's interesting that the footnote refers to padding *bits* rather
    than padding *bytes*. I presume this was unintentional.

    Padding bits:

    struct A {
    uint64_t tlen : 16,
    : 20,
    pkind : 6,
    fsz : 6,
    gsz : 14,
    g : 1,
    ptp : 1;
    } s;

    There are 20 padding bits in this declaration. Perhaps that's
    what they're referring to?

    To me it seems clear that the "padding bits" here is meant to refer
    to all of the following:

    unoccupied bytes between members, due to member alignment
    unoccupied bytes at the end of a structure or union
    bits corresponding to unnamed bit-field members
    unoccupied bits or bytes caused by explicit bit-field alignment
    unoccupied bits or bytes caused by other bit-field alignment

    Any member objects may have their own internal padding bits. Any
    assignment of a struct or union follows the usual rule that any
    padding bits that are part of a target member have unspecified
    values (as long as the member doesn't become a trap representation
    as a result).

    Considering all these parts together, I think it makes sense to say
    that the padding bits of an object are those bits that do not
    participate in determining the abstract value of the object (not
    counting that some combination of padding bits might cause the
    object to become a trap representation, which never happens for
    structs or unions).

    (Yes I know that the term "trap representation" has been changed in
    later versions of the C standard. Please make any needed editorial
    changes internally, without having to post a followup.)

  • From Tim Rentsch@21:1/5 to Michael S on Mon May 5 21:25:16 2025
    Michael S <already5chosen@yahoo.com> writes:

    On Sat, 3 May 2025 21:42:37 -0400
    Richard Damon <richard@damon-family.org> wrote:

    Bigger than that, and you likely want to pass the object by address,
    not by value, passing just a pointer to it.

    That sort of thinking is an example of Knuthian premature optimization.

    I don't agree with this assessment. First, the given suggestion is
    a rule of thumb. By their nature rules of thumb offer heuristics
    that give guidelines likely to yield good results, but not
    guaranteed to do so. Second, a decision about whether to pass a
    struct object or a pointer to said object is often one that is a
    fair amount of work to undo, and so tends to be made early during
    the time period of program development. As such, it is useful to
    follow a guideline likely to give good results, even if not always
    optimal, because on average it will mean less work done overall.

    I second Richard Damon's recommendation, with the understanding that
    it is only a guideline, not an absolute, and as always subject to
    later revision should that turn out to be called for (no pun
    intended).

  • From Tim Rentsch@21:1/5 to Michael S on Mon May 5 22:40:57 2025
    Michael S <already5chosen@yahoo.com> writes:

    On Mon, 05 May 2025 01:34:16 -0700
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

    And more obviously, "%p" requires an argument of type void*, not
    int*.

    That part of otherwise very good comment is unreasonably pedantic.

    I don't have the same reaction. My sense is Keith was just being
    thorough. Speaking for myself his statement wasn't needed, but
    that condition might not hold for other readers. Given that his
    comment is just one not-overly-long sentence, I don't think it's
    too much to ask that readers already familiar with the point
    simply skip over it.

  • From Tim Rentsch@21:1/5 to Michael S on Mon May 5 22:26:22 2025
    Michael S <already5chosen@yahoo.com> writes:

    On Mon, 5 May 2025 08:45:09 -0700
    Andrey Tarasevich <noone@noone.net> wrote:

    On Mon 5/5/2025 2:01 AM, Michael S wrote:

    On Mon, 5 May 2025 01:29:47 -0700
    Andrey Tarasevich <noone@noone.net> wrote:

    On Mon 5/5/2025 1:12 AM, Michael S wrote:

    According to my understanding, you are wrong.
    Taking pointer of non-lvalue is UB, so anything compiler does is
    conforming.

    Er... What? What specifically do you mean by "taking pointers"?

    The whole functionality of `[]` operator in C is based on pointers.
    This expression

    (a = b).a[5]
    [...]
    is already doing your "taking pointers of non-lvalue" (if I
    understood you correctly) as part of array-to-pointer conversion.
    And no, it is not UB.

    This is not UB either

    struct S foo(void) { return (struct S) { 1, 2, 3 }; }
    ...
    int *p;
    p = &foo().a[2], printf("%d\n", *p);

    That is not UB:
    int a5 = (a = b).a[5];

    That is UB:
    int* pa5 = &(a = b).a[5];

    No, it isn't.

    If you read the post of Keith Thompson and it is still not clear to
    you, then I cannot help.

    The only valid "UB" claim in Keith's post is my printing the value of
    `pc` pointer, which by that time happens to point nowhere, since the
    lifetime of the temporary is over. (And, of course, lack of
    conversion to `void *` is an issue).

    As for the expressions like

    &(a = b).a[5];

    and

    &foo().a[2]

    Expressions by themselves are valid. But since there is no situation in
    which the value produced by expressions is valid outside of expressions
    the compiler can generate any value it wants, even NULL or value
    completely outside of address space of current process.

    These expressions produce valid values as long as they are used
    before the end of each full expression containing the given
    expression; within that context they may not produce NULL or a
    value outside of the program's address space.

    - these by themselves are perfectly valid. There's no UB in these
    expressions. (And this is not a debate.)

    Here's a version of the same code that corrects the above distracting
    issues

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    It's only not UB in the nasal demons sense.
    It's UB in the sense that we can't predict the values of expressions
    like (pa==pc) and (pb==pc). I.e. pc is completely useless. In my book
    it is a form of UB.

    The term used in the C standard is "unspecified behavior". If
    this kind of expression is something you don't want to use that
    is understandable, but it would help communication to use the
    appropriate standard-defined term to describe it.

    Essentially all non-trivial programs have unspecified behaviors, and
    plenty of them. Most are benign, some are problematic, but in no
    case does an unspecified behavior, by itself, represent a danger to
    program semantics as severe as executing a construct that has
    undefined behavior.

  • From Tim Rentsch@21:1/5 to Kaz Kylheku on Mon May 5 22:57:14 2025
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2025-05-05, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

    Michael S <already5chosen@yahoo.com> writes:

    On Mon, 05 May 2025 01:34:16 -0700
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

    And more obviously, "%p" requires an argument of type void*, not
    int*.

    That part of otherwise very good comment is unreasonably pedantic.

    I disagree. I suggest it's a bad habit to use "%p" without
    ensuring, by a cast if necessary, that the argument is of type
    void*.

    In most implementations, it's likely that all pointers have the
    same size and representation and are passed as arguments in the
    same way, but getting the types right means one less thing to worry
    about.

    If the codebase assumes all data pointers are the same size, bit
    pattern and are treated the same in the calling conventions / ABI,
    then it is probably moot.

    That code is doomed on a platform where the assumption doesn't
    hold, and the printf statements are probably not independently
    reusable.

    (I mostly put in these casts just to communicate to others that
    an ISO C language lawyer works here, if you happen to need one.)

    Also, it would be amazingly stupid of any such platform not to just
    make those printfs work: to promote variadic arguments of
    pointer-to-object type to a common representation which is the
    same as void *, combined with a matching behavior in the va_arg
    macro for extracting the value back into any pointer-to-object
    type.

    This statement strikes me as would an utterance coming from a
    resident of Fantasyland.

  • From Muttley@DastardlyHQ.org@21:1/5 to All on Tue May 6 07:16:24 2025
    On Mon, 05 May 2025 13:53:10 -0700
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wibbled:
    Muttley@dastardlyhq.com writes:
    [...]
    If you want to pass an actual array to a function instead of a pointer to it,
    embedding it in a structure is the only way to do it.

    Yes, but that's not necessarily useful. An array that's a member

    Depends what you're doing. Passing an array in a structure will copy the array saving you having to do it yourself if you don't want to work on the original version. Obviously that doesn't happen if you just pass a pointer.

  • From David Brown@21:1/5 to Keith Thompson on Tue May 6 11:46:21 2025
    On 05/05/2025 22:53, Keith Thompson wrote:
    Muttley@dastardlyhq.com writes:
    [...]
    If you want to pass an actual array to a function instead of a pointer to it,
    embedding it in a structure is the only way to do it.

    Yes, but that's not necessarily useful. An array that's a member
    of a struct can only be of a constant length (unless it's a flexible
    array member, but that doesn't help). Functions that work with
    arrays typically need to deal with arrays of arbitrary length.


    I regularly use arrays with known fixed sizes. In fact, in my code
    those are absolutely dominant - it is very rare for me to see or use an
    array whose size is /not/ fixed at compile time. Sometimes I will have
    general functions that take parameters that are arrays of arbitrary
    length, but not often.

    So this is very much dependent on the kind of code you are working with,
    and other people will have very different experiences for their own code.

    However, I think it is not unlikely that people will see use of structs
    like:

    struct vector4int { int vs[4]; };

  • From David Brown@21:1/5 to Keith Thompson on Tue May 6 11:35:30 2025
    On 05/05/2025 22:27, Keith Thompson wrote:
    Andrey Tarasevich <noone@noone.net> writes:
    [...]
    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    I believe it does. pc points to an element of an object with
    temporary lifetime. The value of pc is then used after the object
    it points to has reached the end of its lifetime. At that point,
    pc has an indeterminate value.

    N3096 6.2.4p2: "If a pointer value is used in an evaluation after
    the object the pointer points to (or just past) reaches the end of
    its lifetime, the behavior is undefined. The representation of a
    pointer object becomes indeterminate when the object the pointer
    points to (or just past) reaches the end of its lifetime."


    It seems clear to me that "pc" has an indeterminate value after the
    expression assigning, since it points to an object with temporary lifetime.

    And attempting to use the value of an object with automatic storage
    while it has an indeterminate value is undefined behaviour.

    As far as I can see, simply reading the value in "pc" to print it out is
    UB according to the C standards. It is clearly going to be a harmless
    operation on most hardware, but there are processors where pointer
    registers are more complicated than simple linear addresses - they can
    track some kind of segment structure describing the range of a data
    block, or permissions for access to the data, and such structures could
    have been deactivated or deallocated when the temporary lifetime object
    died. Even attempting to read the value of the pointer, without
    dereferencing it, would then cause some kind of fault or trap.

  • From Muttley@DastardlyHQ.org@21:1/5 to All on Tue May 6 10:18:44 2025
    On Tue, 6 May 2025 11:46:21 +0200
    David Brown <david.brown@hesbynett.no> wibbled:
    On 05/05/2025 22:53, Keith Thompson wrote:
    Muttley@dastardlyhq.com writes:
    [...]
    If you want to pass an actual array to a function instead of a pointer to it,
    embedding it in a structure is the only way to do it.

    Yes, but that's not necessarily useful. An array that's a member
    of a struct can only be of a constant length (unless it's a flexible
    array member, but that doesn't help). Functions that work with
    arrays typically need to deal with arrays of arbitrary length.


    I regularly use arrays with known fixed sizes. In fact, in my code
    those are absolutely dominant - it is very rare for me to see or use an
    array whose size is /not/ fixed at compile time. Sometimes I will have

    I do a lot of networking code and with packet structures the arrays are
    almost always of fixed size. Also with arrays the data is inline so a simple
    memcpy() can copy the data from the struct to the output buffer. You can't
    do that if you have pointers in the struct. Ditto a simple cast to char * to
    use it directly as the output.

  • From Michael S@21:1/5 to Keith Thompson on Tue May 6 16:34:56 2025
    On Mon, 05 May 2025 13:53:10 -0700
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

    Muttley@dastardlyhq.com writes:
    [...]
    If you want to pass an actual array to a function instead of a
    pointer to it, embedding it in a structure is the only way to do
    it.

    Yes, but that's not necessarily useful. An array that's a member
    of a struct can only be of a constant length (unless it's a flexible
    array member, but that doesn't help). Functions that work with
    arrays typically need to deal with arrays of arbitrary length.


    It seems, C++ authorities were feeling that the pattern "struct with
    array of constant length as an only member" is very common.
    Otherwise they wouldn't bother to add <array> to their standard library.

  • From Waldek Hebisch@21:1/5 to Keith Thompson on Tue May 6 17:36:36 2025
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Andrey Tarasevich <noone@noone.net> writes:
    [...]
    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    I believe it does. pc points to an element of an object with
    temporary lifetime. The value of pc is then used after the object
    it points to has reached the end of its lifetime. At that point,
    pc has an indeterminate value.

    N3096 6.2.4p2: "If a pointer value is used in an evaluation after
    the object the pointer points to (or just past) reaches the end of
    its lifetime, the behavior is undefined. The representation of a
    pointer object becomes indeterminate when the object the pointer
    points to (or just past) reaches the end of its lifetime."

    Note commas above. Assignment to pc and call to printf are parts
    of a single expression, so use of pc is within lifetime of the
    temporary object.

    --
    Waldek Hebisch

  • From David Brown@21:1/5 to Waldek Hebisch on Tue May 6 20:46:48 2025
    On 06/05/2025 19:36, Waldek Hebisch wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Andrey Tarasevich <noone@noone.net> writes:
    [...]
    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    I believe it does. pc points to an element of an object with
    temporary lifetime. The value of pc is then used after the object
    it points to has reached the end of its lifetime. At that point,
    pc has an indeterminate value.

    N3096 6.2.4p2: "If a pointer value is used in an evaluation after
    the object the pointer points to (or just past) reaches the end of
    its lifetime, the behavior is undefined. The representation of a
    pointer object becomes indeterminate when the object the pointer
    points to (or just past) reaches the end of its lifetime."

    Note commas above. Assignment to pc and call to printf are parts
    of a single expression, so use of pc is within lifetime of the
    temporary object.


    I must admit I had not noticed that detail.

  • From Nick Bowler@21:1/5 to Keith Thompson on Tue May 6 19:06:20 2025
    On Mon, 05 May 2025 13:43:31 -0700, Keith Thompson wrote:
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    Andrey Tarasevich <noone@noone.net> writes:
    [...]

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5];
    pb = &b.a[5];
    pc = &(a = b).a[5];

    printf("%p %p %p\n", pa, pb, pc);
    }

    [...]

    I think that code has undefined behavior.

    Right. [*]
    [...]
    [*] Assuming C11 semantics. At best inadvisable under C99
    semantics, and a constraint violation under C90 semantics.

    What C90 constraint does it violate? Both gcc and clang reject it
    with "-std=c90 -pedantic-errors", with an error message "ISO C90
    forbids subscripting non-lvalue array", but I don't see a relevant
    constraint in the C90 standard.

    I don't know about C90, but in C89 the above code violates the
    constraint on the [] operator that "one of the expressions shall
    have type ``pointer to object type.''" (3.3.2.1, first paragraph)

    C89 (3.2.2.1, third paragraph) only describes conversion of lvalues with
    array type into pointers. No similar rule applies for an expression
    with array type which is not an lvalue, so such expressions are not
    converted to pointers.

    So, given:

    struct { int a[10]; } a, b;
    /* ... */
    (a = b).a[5];

    Since (a = b).a is not an lvalue, it is not converted to a pointer, so
    neither operand of [] has pointer type, so a diagnostic is required.

    I know that C11 introduced "temporary lifetime" to cover cases
    like this. In C99, the wording for the indexing operator implicitly
    assumes that there's an array object; if there isn't, I'd argue the
    behavior is undefined by omission. I'm not aware of any relevant
    change from C90 to C99.

    The rule about conversions from arrays to pointers is different in C99
    (n1124 6.3.2.1, third paragraph) compared to C89. In particular,
    "an lvalue that has type ``array of type'' ..." was changed to
    "an expression that has type ``array of type'' ...".

  • From Scott Lurndal@21:1/5 to David Brown on Tue May 6 19:22:34 2025
    David Brown <david.brown@hesbynett.no> writes:
    On 06/05/2025 19:36, Waldek Hebisch wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Andrey Tarasevich <noone@noone.net> writes:
    [...]
    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    I believe it does. pc points to an element of an object with
    temporary lifetime. The value of pc is then used after the object
    it points to has reached the end of its lifetime. At that point,
    pc has an indeterminate value.

    N3096 6.2.4p2: "If a pointer value is used in an evaluation after
    the object the pointer points to (or just past) reaches the end of
    its lifetime, the behavior is undefined. The representation of a
    pointer object becomes indeterminate when the object the pointer
    points to (or just past) reaches the end of its lifetime."

    Note commas above. Assignment to pc and call to printf are parts
    of a single expression, so use of pc is within lifetime of the
    temporary object.


    I must admit I had not noticed that detail.

    That would get an immediate downcheck during review for exactly
    that reason.

  • From David Brown@21:1/5 to Scott Lurndal on Wed May 7 09:37:57 2025
    On 06/05/2025 21:22, Scott Lurndal wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 06/05/2025 19:36, Waldek Hebisch wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Andrey Tarasevich <noone@noone.net> writes:
    [...]
    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    I believe it does. pc points to an element of an object with
    temporary lifetime. The value of pc is then used after the object
    it points to has reached the end of its lifetime. At that point,
    pc has an indeterminate value.

    N3096 6.2.4p2: "If a pointer value is used in an evaluation after
    the object the pointer points to (or just past) reaches the end of
    its lifetime, the behavior is undefined. The representation of a
    pointer object becomes indeterminate when the object the pointer
    points to (or just past) reaches the end of its lifetime."

    Note commas above. Assignment to pc and call to printf are parts
    of a single expression, so use of pc is within lifetime of the
    temporary object.


    I must admit I had not noticed that detail.

    That would get an immediate downcheck during review for exactly
    that reason.

    Of course. In fact, if someone presented such code for review (and
    assuming I noticed the commas!) I'd have to consider whether it was done
    maliciously, intentionally deceptively, due to incompetence, or
    smart-arse coding. In all my C coding experience, I can't recall ever
    coming across a single situation when I thought the use of the comma
    operator was appropriate in the kind of code I work with.

    Other people, projects, and teams work with different standards,
    different requirements, and different kinds of code - there are a lot of
    things that are common practice in some C coding that are strongly
    rejected in my field (and perhaps vice versa). So I am not suggesting
    that the comma operator is always bad in C - just that it is pretty much
    always bad in my line of work.

    And of course Andrey was using it here to make a specific point in a
    discussion about C details, rather than real-life code.

  • From Nick Bowler@21:1/5 to Keith Thompson on Wed May 7 19:09:40 2025
    On Tue, 06 May 2025 13:21:38 -0700, Keith Thompson wrote:
    Nick Bowler <nbowler@draconx.ca> writes:
    The rule about conversions from arrays to pointers is different in C99
    (n1124 6.3.2.1, third paragraph) compared to C89. In particular,
    "an lvalue that has type ``array of type'' ..." was changed to
    "an expression that has type ``array of type'' ...".
    [...]
    The change from "lvalue" to "expression" was made in C99. I wonder why
    that was done.

    It's not mentioned in the rationale, so we can only guess. But it is
    called out in the list of major changes in the C99 foreword.

    BTW, you have a copy of ANSI C89? Hard or soft copy? Do you know if
    it's still available in some form?

    Hint: look for FIPS 160 on the NIST website. This is the same standard
    as ANSI X3.159-1989 Programming Language - C.

  • From Tim Rentsch@21:1/5 to Nick Bowler on Wed May 7 21:17:17 2025
    Nick Bowler <nbowler@draconx.ca> writes:

    On Tue, 06 May 2025 13:21:38 -0700, Keith Thompson wrote:

    Nick Bowler <nbowler@draconx.ca> writes:

    The rule about conversions from arrays to pointers is different
    in C99 (n1124 6.3.2.1, third paragraph) compared to C89. In
    particular, "an lvalue that has type ``array of type'' ..." was
    changed to "an expression that has type ``array of type'' ...".

    [...]

    The change from "lvalue" to "expression" was made in C99. I
    wonder why that was done.

    It's not mentioned in the rationale, so we can only guess. [...]

    To me it seems obvious. The change in C99 was meant to allow
    access to an array inside a non-lvalue struct. When C99 was
    done the committee didn't realize all the ramifications of
    accessing non-lvalue structs (which apparently has problems
    even for scalar members, not just array members). Later, when
    they did realize the resulting problems, they fixed things up
    in C11.

    See also n1253.htm, by Clark Nelson.

  • From Nick Bowler@21:1/5 to Keith Thompson on Thu May 8 12:58:56 2025
    On Wed, 07 May 2025 14:23:57 -0700, Keith Thompson wrote:
    Nick Bowler <nbowler@draconx.ca> writes:
    On Tue, 06 May 2025 13:21:38 -0700, Keith Thompson wrote:
    The change from "lvalue" to "expression" was made in C99. I wonder why
    that was done.

    It's not mentioned in the rationale, so we can only guess. But it is
    called out in the list of major changes in the C99 foreword.

    I've just looked at the foreword of the C99 standard and the n1256
    draft, and I couldn't find it. Can you quote the precise wording?

    N1256 page xiii. Fourth to last bullet point:

    "-- conversion of array to pointer not limited to lvalues"

  • From Tim Rentsch@21:1/5 to Andrey Tarasevich on Thu May 8 12:45:58 2025
    Andrey Tarasevich <noone@noone.net> writes:

    On Mon 5/5/2025 1:26 AM, Keith Thompson wrote:

    I am wondering what the last sentence is intended to mean ("... need not
    have a unique address"). At first sight, the intent seems to be
    obvious: it simply says that such temporary objects might repeatedly
    appear (and disappear) at the same location in storage, which is a
    natural thing to expect.

    You snipped this: "Any attempt to modify an object with temporary
    lifetime results in undefined behavior.". Which means, I think,
    that an implementation that shared storage for "such an object"
    with something else probably isn't going to cause problems for any
    code with defined behavior.

    It is going to cause problems if the code relies on the address
    identity of the object, assuming the standard intends to provide such
    guarantees.

    Though I can imagine the possibility of code that modifies `a` and
    reads via `pc` within the same full expression.

    That's easy (in the context of declarations from my previous example):

    pc = &(a = b).a[5], a.a[5] = 42, printf("%d\n", *pc);

    As one would expect, this produces different output in GCC and Clang
    for the reasons I already described.

    But unless I've somehow missed it, the "Such an object need not
    have a unique address." wording doesn't appear on that web page or
    in my copy of n1570.pdf. C17 does add these two sentences:

    An object with temporary lifetime behaves as if it were declared
    with the type of its value for the purposes of effective type. Such
    an object need not have a unique address.

    Normally any two objects with overlapping lifetime must have distinct
    addresses. This addition, I think, gives compilers permission to have
    temporary lifetime objects overlap with other existing objects, but not
    to have a modification to one object affect the value of the other
    (unless the modification invokes UB, of course).

    If so, that would be extremely underspecified. A mere "such an object
    need not have a unique address" is insufficient to fully convey the
    permission to overlap existing named objects.

    I don't see why you say that. The statement says objects with
    temporary lifetime need not have a unique address. In the absence
    of any other statement on the subject, this statement admits the
    inference that an object with temporary lifetime might have the same
    address as any other object. Removing the constraint (that the
    addresses of those objects must be distinct from the addresses
    of all other objects), /and doing nothing else/, can only mean that
    the addresses of such objects might match the address of any other
    object in the environment.

    If you think there should be a non-normative footnote explaining
    that point, I expect I would vote in favor of that, but as far
    as normative text goes I don't see any fuzziness about what is
    allowed under the existing wording.

    And that's probably what led to the difference in interpretation
    between GCC and Clang.

    I suspect the implication actually goes the other way. It is
    because what gcc has done (past tense) violates the rules of the C11
    standard that someone had the bright idea that the C standard should
    be changed to allow this stupidity.

    Modification of the temporary is "prohibited" (as UB), but
    modification of the overlapped named object is not. The
    consequences can be quite surprising.

    In my view the problem is not that what is allowed is unclear, but
    that the whole idea of possibly overlapping objects is a crock.
    It's a sad statement on the quality of gcc that it does the wrong
    thing even when -std=c11 and -pedantic are given as compilation
    options. Bleah.

  • From David Brown@21:1/5 to Tim Rentsch on Thu May 8 22:20:02 2025
    On 08/05/2025 21:45, Tim Rentsch wrote:
    Andrey Tarasevich <noone@noone.net> writes:



    And that's probably what led to the difference in interpretation
    between GCC and Clang.

    I suspect the implication actually goes the other way. It is
    because what gcc has done (past tense) violates the rules of the C11
    standard that someone had the bright idea that the C standard should
    be changed to allow this stupidity.

    Modification of the temporary is "prohibited" (as UB), but
    modification of the overlapped named object is not. The
    consequences can be quite surprising.

    In my view the problem is not that what is allowed is unclear, but
    that the whole idea of possibly overlapping objects is a crock.
    It's a sad statement on the quality of gcc that it does the wrong
    thing even when -std=c11 and -pedantic are given as compilation
    options. Bleah.

    While I think it is important that compilers try to follow the C
    standards (at least when you specify conforming modes), are there any
    potential realistic consequences of this?

    Posters here have gone far out of their way to make hypothetical code
    that demonstrates this flaw in gcc without invoking undefined behaviour.
    Is there any risk that anyone would come across this in real code?

    In addition, is it reasonable to suppose that C programmers that have
    not studied the C standards here would be expecting the behaviour of
    gcc, or the behaviour of clang here? Certainly if /I/ saw
    "pc = &(a = b).a[5]" prior to this thread, I would expect the contents
    of the struct "b" to be copied to the memory of the struct "a", and
    "pc" set to point to the member of the array within "a". I would
    expect the code to work as gcc works, and would find clang's behaviour
    completely unexpected. I would be surprised if I were alone in that.

    So to me, it makes sense that the C standard has changed to support a
    more sane approach to such situations. It would be unreasonable to
    change it to guarantee the sensible behaviour - that would mean
    compilers like clang that generated technically correct but surprising
    (to many) code would now be wrong.

    (gcc's behaviour is also more efficient, but of course correctness
    trumps efficiency every time.)

  • From Rosario19@21:1/5 to Michael S on Mon May 12 11:23:02 2025
    On Sun, 4 May 2025 11:01:17 +0300, Michael S wrote:

    On Sat, 3 May 2025 21:42:37 -0400
    Richard Damon <richard@damon-family.org> wrote:


    Bigger than that, and you likely want to pass the object by address,
    not by value, passing just a pointer to it.

    That sort of thinking is an example of Knuthian premature optimization.

    I prefer to pass memory (if it is big enough) by a single address or
    reference.
