Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".
The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."
From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.
I have a project in which these capabilities might come in handy; has
anyone had experience with assigning to structures, passing them as
arguments to functions, and/or having a function return a structure?
Would code like
    struct ab {
        int a;
        char *b;
    } result, function(void);
    if ((result = function()).a == 10) puts(result.b);
be understandable, or even legal?
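For reference, here is a minimal sketch of the pattern in the question as a complete translation unit. Only the declaration and the `if` test come from the post; the body of function(), the string "ten", and the helpers check() and demo() are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

struct ab {
    int a;
    char *b;
};

/* Hypothetical body: the post only declares function(). */
struct ab function(void)
{
    struct ab r = { 10, "ten" };
    return r;                       /* the whole struct is returned by value */
}

/* Returns the string if .a is 10, otherwise NULL. */
char *check(void)
{
    struct ab result;

    /* An assignment expression has the assigned value, so .a applies to it. */
    if ((result = function()).a == 10)
        return result.b;
    return NULL;
}

/* Small self-check; call it from main(). */
void demo(void)
{
    struct ab copy = function();    /* initialization from a returned struct */

    assert(copy.a == 10);
    assert(check() == copy.b);      /* both point at the same string literal */
    puts(check());
}
```

Splitting the assignment and the test into two statements behaves the same and may be easier to read.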
Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".
The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."
From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.
I have a project in which these capabilities might come in handy; has
anyone had experience with assigning to structures, passing them as
arguments to functions, and/or having a function return a structure?
Typically this is fine. However, in sdcc-4.2 manual one can find
the following statement:
: Deviations from standard compliance:
: structures and unions cannot be passed as function parameters
: and cannot be a return value from a function,....
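Where a compiler has that limitation, the usual fallback is to pass pointers and let the caller provide storage for the result. A minimal sketch, assuming a hypothetical struct point and made-up helper names (none of this is taken from the sdcc manual); the by-value version is shown alongside for comparison.

```c
#include <assert.h>

struct point { int x; int y; };

/* By value: needs compiler support for struct parameters and return values. */
struct point point_add(struct point p, struct point q)
{
    struct point r = { p.x + q.x, p.y + q.y };
    return r;
}

/* By pointer: only pointers cross the call; the caller owns the result. */
void point_add_to(struct point *out, const struct point *p,
                  const struct point *q)
{
    out->x = p->x + q->x;
    out->y = p->y + q->y;
}

/* Small self-check; call it from main(). */
void demo_point(void)
{
    struct point a = { 1, 2 }, b = { 3, 4 }, r1, r2;

    r1 = point_add(a, b);
    point_add_to(&r2, &a, &b);

    assert(r1.x == 4 && r1.y == 6);
    assert(r2.x == r1.x && r2.y == r1.y);
}
```

Both forms compute the same result; the pointer form simply avoids depending on the feature the manual lists as unsupported.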
Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".
The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."
From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.
I have a project in which these capabilities might come in handy; has
anyone had experience with assigning to structures, passing them as
arguments to functions, and/or having a function return a structure?
Would code like
struct ab {
int a;
char *b;
} result, function(void);
if ((result = function()).a == 10) puts(result.b);
be understandable, or even legal?
Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment...
assignment is used by everyone everywhere without even giving it a
second thought.
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:
Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment...
assignment is used by everyone everywhere without even giving it a
second thought.
There is a caveat, to do with alignment padding: will this always have a
defined value?
I don't believe so. In a quick look, I don't see anything in
the standard that explicitly addresses this, but I believe that a
conforming implementation could implement structure assignment by
copying the individual members, leaving any padding in the target
undefined.
Finally, why would you care?
Bigger than that, and you likely want to pass the object by address,
not by value, passing just a pointer to it.
That sort of thinking is an example of Knuthian premature optimization.
The fact that an implementation does not have to do the equivalent of
memcpy() to perform a struct copy means that successful assignment
cannot be checked by using memcmp().
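To illustrate the memcmp() point, here is a minimal sketch. The struct rec, its members, and the helper names are hypothetical, and whether the struct has padding at all depends on the ABI. The portable check compares members; a byte-wise comparison may happen to succeed but is not something the standard promises.

```c
#include <assert.h>
#include <string.h>

struct rec {
    char tag;       /* padding bytes may follow, depending on the ABI */
    int  value;
};

/* Member-wise equality: independent of whatever the padding holds. */
int rec_equal(const struct rec *x, const struct rec *y)
{
    return x->tag == y->tag && x->value == y->value;
}

/* Small self-check; call it from main(). */
void demo_rec(void)
{
    struct rec src, dst;

    memset(&src, 0xAA, sizeof src);   /* fill every byte, padding included */
    memset(&dst, 0x55, sizeof dst);
    src.tag = 'x';
    src.value = 42;

    dst = src;                        /* padding in dst is unspecified now */

    assert(rec_equal(&src, &dst));    /* holds on any conforming compiler */
    /* memcmp(&src, &dst, sizeof src) == 0 is likely in practice,
       but it is not guaranteed, so it is deliberately not asserted. */
}
```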
On Sun, 4 May 2025 11:01:17 +0300, Michael S wrote:
That sort of thinking is an example of Knuthian premature optimization.
Trying to hold back the optimization tide?
On Fri 5/2/2025 11:34 AM, Lew Pitcher wrote:
Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".
The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."
From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.
Weird. Virtually every C project relies on assignment of
structures. Passing-returning structs by value might be more rare
(although perfectly valid and often appropriate too), but
assignment... assignment is used by everyone everywhere without even
giving it a second thought.
One dark corner this feature has, is that in C (as opposed to C++) the
result of an assignment operator is an rvalue, which can easily lead
to some interesting consequences related to structs with arrays
inside.
On 5/3/25 20:37, Keith Thompson wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:
Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment...
assignment is used by everyone everywhere without even giving it a
second thought.
There is a caveat, to do with alignment padding: will this always have a
defined value?
I don't believe so. In a quick look, I don't see anything in
the standard that explicitly addresses this, but I believe that a
conforming implementation could implement structure assignment by
copying the individual members, leaving any padding in the target
undefined.
"When a value is stored in an object of structure or union type,
including in a member object, the bytes of the object representation
that correspond to any padding bytes take unspecified values.56)"
(6.2.6.1p6).
That refers to footnote 56, which says "Thus, for example, structure
assignment need not copy any padding bits."
James Kuyper <jameskuyper@alumni.caltech.edu> writes:
On 5/3/25 20:37, Keith Thompson wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:
Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment...
assignment is used by everyone everywhere without even giving it a
second thought.
There is a caveat, to do with alignment padding: will this always have a
defined value?
I don't believe so. In a quick look, I don't see anything in
the standard that explicitly addresses this, but I believe that a
conforming implementation could implement structure assignment by
copying the individual members, leaving any padding in the target
undefined.
"When a value is stored in an object of structure or union type,
including in a member object, the bytes of the object representation
that correspond to any padding bytes take unspecified values.56)"
(6.2.6.1p6).
That refers to footnote 56, which says "Thus, for example, structure
assignment need not copy any padding bits."
Are there any C implementations in common use that don't just
use memcpy or an optimized version thereof?
James Kuyper <jameskuyper@alumni.caltech.edu> writes:...
On 5/3/25 20:37, Keith Thompson wrote:
...I don't believe so. In a quick look, I don't see anything in
the standard that explicitly addresses this, but I believe that a
conforming implementation could implement structure assignment by
copying the individual members, leaving any padding in the target
undefined.
Finally, why would you care?
The fact that an implementation does not have to do the equivalent of
memcpy() to perform a struct copy means that successful assignment
cannot be checked by using memcmp().
Are you referring to checking whether an assignment was performed
or not, due to uncertainty about what the program has done? If you
mean doing an assignment and then checking whether it succeeded,
I can't think of a context where that makes sense.
James Kuyper <jameskuyper@alumni.caltech.edu> writes:
On 5/3/25 20:37, Keith Thompson wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:
Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment...
assignment is used by everyone everywhere without even giving it a
second thought.
There is a caveat, to do with alignment padding: will this always have a
defined value?
I don't believe so. In a quick look, I don't see anything in
the standard that explicitly addresses this, but I believe that a
conforming implementation could implement structure assignment by
copying the individual members, leaving any padding in the target
undefined.
"When a value is stored in an object of structure or union type,
including in a member object, the bytes of the object representation
that correspond to any padding bytes take unspecified values.56)"
(6.2.6.1p6).
That refers to footnote 56, which says "Thus, for example, structure
assignment need not copy any padding bits."
Yes, that's what I missed.
It's interesting that the footnote refers to padding *bits* rather than
padding *bytes*. I presume this was unintentional.
One dark corner this feature has, is that in C (as opposed to C++) the
result of an assignment operator is an rvalue, which can easily lead
to some interesting consequences related to structs with arrays
inside.
I'm curious to know what interesting consequences you mean here. Do
you mean something other than cases that have undefined behavior?
On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:
One dark corner this feature has, is that in C (as opposed to C++)
the result of an assignment operator is an rvalue, which can
easily lead to some interesting consequences related to structs
with arrays inside.
I'm curious to know what interesting consequences you mean here. Do
you mean something other than cases that have undefined behavior?
I'm referring to the matter of the address identity of the resultant
rvalue object. At first, "address identity of rvalue" might sound
strange, but the standard says that there's indeed an object tied to
such rvalue, and once we start applying array-to-pointer conversion
(and use `[]` operator), lvalues and addresses quickly come into the
picture.
The standard says in 6.2.4/8:
"A non-lvalue expression with structure or union type, where the
structure or union contains a member with array type [...]
refers to an object with automatic storage duration and temporary
lifetime. Its lifetime begins when the expression is evaluated and
its initial value is the value of the expression. Its lifetime ends
when the evaluation of the containing full expression ends. [...]
Such an object need not have a unique address."
https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8
I'm wondering what the last sentence is intended to mean ("... need not
have a unique address"). At first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.
But is it, perhaps, intended to also allow such temporaries to have
addresses identical to regular named objects? It is not immediately
clear to me.
And when I make the following experiment with GCC and Clang
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5];
pb = &b.a[5];
pc = &(a = b).a[5];
printf("%p %p %p\n", pa, pb, pc);
}
I consistently get the following output from GCC
0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544
And this is what I get from Clang
0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4
As you can see, GCC apparently took C++-like approach to this
situation. The returned "temporary" is not really a separate
temporary at all, but actually `a` itself.
Meanwhile, in Clang all three pointers are different, i.e. Clang
decided to actually create a separate temporary object for the result
of the assignment.
I have a strong feeling that GCC's behavior is non-conforming. The
last sentence of 6.2.4/8 is not supposed to permit "projecting" the
resultant temporaries onto existing named objects. I could be wrong...
On 02/05/2025 20:34, Lew Pitcher wrote:
Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".
The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."
From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.
I use these features regularly. I have no problem passing structs
around if that is the convenient way to structure the code.
According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.
On Mon 5/5/2025 1:12 AM, Michael S wrote:
According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is conforming.
Er... What? What specifically do you mean by "taking pointers"?
The whole functionality of `[]` operator in C is based on pointers.
This expression
(a = b).a[5]
is already doing your "taking pointers of non-lvalue" (if I
understood you correctly) as part of array-to-pointer conversion. And
no, it is not UB.
This is not UB either
struct S foo(void) { return (struct S) { 1, 2, 3 }; }
...
int *p;
p = &foo().a[2], printf("%d\n", *p);
So, what you are basing your "UB" claim on is not clear to me.
And more obviously, "%p" requires an argument of type void*, not int*.
On Mon, 05 May 2025 01:34:16 -0700
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
And more obviously, "%p" requires an argument of type void*, not int*.
That part of otherwise very good comment is unreasonably pedantic.
On Sat, 3 May 2025 11:46:30 +0200
David Brown <david.brown@hesbynett.no> gabbled:
On 02/05/2025 20:34, Lew Pitcher wrote:
Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".
The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."
From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.
I use these features regularly. I have no problem passing structs
around if that is the convenient way to structure the code.
If you want to pass an actual array to a function instead of a pointer
to it, embedding it in a structure is the only way to do it.
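A minimal sketch of that technique, assuming a hypothetical struct vec4 and function names invented for illustration. The callee gets its own copy of the array, so writes inside the function do not reach the caller's data.

```c
#include <assert.h>

struct vec4 { int v[4]; };

/* Receives a copy of the whole array wrapped in the struct. */
int sum_and_clear(struct vec4 arg)
{
    int sum = 0;

    for (int i = 0; i < 4; i++) {
        sum += arg.v[i];
        arg.v[i] = 0;           /* modifies the local copy only */
    }
    return sum;
}

/* Small self-check; call it from main(). */
void demo_vec4(void)
{
    struct vec4 data = { { 1, 2, 3, 4 } };

    assert(sum_and_clear(data) == 10);
    assert(data.v[0] == 1 && data.v[3] == 4);   /* caller's array unchanged */
}
```

With a plain `int v[4]` parameter the function would receive a pointer instead, and the zeroing loop would clear the caller's array.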
On Mon, 5 May 2025 01:29:47 -0700
Andrey Tarasevich <noone@noone.net> wrote:
On Mon 5/5/2025 1:12 AM, Michael S wrote:
According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.
Er... What? What specifically do you mean by "taking pointers"?
The whole functionality of `[]` operator in C is based on pointers.
This expression
(a = b).a[5]
[...]
is already doing your "taking pointers of non-lvalue" (if I
understood you correctly) as part of array-to-pointer conversion.
And no, it is not UB.
This is not UB either
struct S foo(void) { return (struct S) { 1, 2, 3 }; }
...
int *p;
p = &foo().a[2], printf("%d\n", *p);
That is not UB:
int a5 = (a = b).a[5];
That is UB:
int* pa5 = &(a = b).a[5];
So, what you are basing your "UB" claim on is not clear to me.
If you read the post of Keith Thompson and it is still not clear to
you then I can not help.
On Sun, 4 May 2025 22:22:12 -0700
Andrey Tarasevich <noone@noone.net> wrote:
On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:
One dark corner this feature has, is that in C (as opposed to C++)
the result of an assignment operator is an rvalue, which can
easily lead to some interesting consequences related to structs
with arrays inside.
I'm curious to know what interesting consequences you mean here. Do
you mean something other than cases that have undefined behavior?
I'm referring to the matter of the address identity of the resultant
rvalue object. At first, "address identity of rvalue" might sound
strange, but the standard says that there's indeed an object tied to
such rvalue, and once we start applying array-to-pointer conversion
(and use `[]` operator), lvalues and addresses quickly come into the
picture.
The standard says in 6.2.4/8:
"A non-lvalue expression with structure or union type, where the
structure or union contains a member with array type [...]
refers to an object with automatic storage duration and temporary
lifetime. Its lifetime begins when the expression is evaluated and
its initial value is the value of the expression. Its lifetime ends
when the evaluation of the containing full expression ends. [...]
Such an object need not have a unique address."
https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8
I'm wondering what the last sentence is intended to mean ("... need not
have a unique address"). At first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.
But is it, perhaps, intended to also allow such temporaries to have
addresses identical to regular named objects? It is not immediately
clear to me.
And when I make the following experiment with GCC and Clang
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5];
pb = &b.a[5];
pc = &(a = b).a[5];
printf("%p %p %p\n", pa, pb, pc);
}
I consistently get the following output from GCC
0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544
And this is what I get from Clang
0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4
As you can see, GCC apparently took C++-like approach to this
situation. The returned "temporary" is not really a separate
temporary at all, but actually `a` itself.
Meanwhile, in Clang all three pointers are different, i.e. Clang
decided to actually create a separate temporary object for the result
of the assignment.
I have a strong feeling that GCC's behavior is non-conforming. The
last sentence of 6.2.4/8 is not supposed to permit "projecting" the
resultant temporaries onto existing named objects. I could be wrong...
According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.
Andrey Tarasevich <noone@noone.net> writes:
[...]
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5];
pb = &b.a[5];
pc = &(a = b).a[5];
printf("%p %p %p\n", pa, pb, pc);
}
[...]
I think that code has undefined behavior.
(a = b) is an rvalue that refers to an object of type struct S with
temporary lifetime. pc holds the address of a subobject of that
temporary object. The object reaches the end of its lifetime at the end
of the evaluation of the full expression. You then print its value.
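The lifetime rule can be sketched as follows, reusing struct S from the quoted program and assuming C11 or later. The function names are hypothetical. Reading an element of the temporary inside the same full expression is fine; to keep a pointer around afterwards, point into the named object instead of into the temporary.

```c
#include <assert.h>

struct S { int a[10]; };

/* Reads one element of the assignment result within the same full expression. */
int assign_and_read(struct S *dst, const struct S *src)
{
    int v = (*dst = *src).a[5];   /* the temporary is still alive here */
    return v;
}

/* Assigns first, then takes the address inside the named destination. */
int *assign_and_point(struct S *dst, const struct S *src)
{
    *dst = *src;
    return &dst->a[5];            /* points into *dst, not into a temporary */
}

/* Small self-check; call it from main(). */
void demo_S(void)
{
    struct S a, b = { { 0 } };

    b.a[5] = 7;
    assert(assign_and_read(&a, &b) == 7);
    assert(assign_and_point(&a, &b) == &a.a[5]);
    assert(*assign_and_point(&a, &b) == 7);
}
```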
On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:
One dark corner this feature has, is that in C (as opposed to C++) the
result of an assignment operator is an rvalue, which can easily lead
to some interesting consequences related to structs with arrays
inside.
I'm curious to know what interesting consequences you mean here. Do
you mean something other than cases that have undefined behavior?
I'm referring to the matter of the address identity of the resultant
rvalue object. At first, "address identity of rvalue" might sound
strange, but the standard says that there's indeed an object tied to
such rvalue, and once we start applying array-to-pointer conversion
(and use `[]` operator), lvalues and addresses quickly come into the
picture.
The standard says in 6.2.4/8:
"A non-lvalue expression with structure or union type, where the
structure or union contains a member with array type [...]
refers to an object with automatic storage duration and temporary
lifetime. Its lifetime begins when the expression is evaluated and its
initial value is the value of the expression. Its lifetime ends when
the evaluation of the containing full expression ends. [...] Such an
object need not have a unique address."
https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8
I'm wondering what the last sentence is intended to mean ("... need not
have a unique address"). At first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.
But is it, perhaps, intended to also allow such temporaries to have
addresses identical to regular named objects? It is not immediately
clear to me.
And when I make the following experiment with GCC and Clang
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5];
pb = &b.a[5];
pc = &(a = b).a[5];
printf("%p %p %p\n", pa, pb, pc);
}
I consistently get the following output from GCC
0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544
And this is what I get from Clang
0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4
As you can see, GCC apparently took C++-like approach to this
situation. The returned "temporary" is not really a separate temporary
at all, but actually `a` itself.
Meanwhile, in Clang all three pointers are different, i.e. Clang
decided to actually create a separate temporary object for the result
of the assignment.
I have a strong feeling that GCC's behavior is non-conforming. The
last sentence of 6.2.4/8 is not supposed to permit "projecting" the
resultant temporaries onto existing named objects. I could be wrong...
On Mon, 5 May 2025 01:29:47 -0700
Andrey Tarasevich <noone@noone.net> wrote:
On Mon 5/5/2025 1:12 AM, Michael S wrote:
According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.
Er... What? What specifically do you mean by "taking pointers"?
The whole functionality of `[]` operator in C is based on pointers.
This expression
(a = b).a[5]
is already doing your "taking pointers of non-lvalue" (if I
understood you correctly) as part of array-to-pointer conversion. And
no, it is not UB.
This is not UB either
struct S foo(void) { return (struct S) { 1, 2, 3 }; }
...
int *p;
p = &foo().a[2], printf("%d\n", *p);
That is not UB:
int a5 = (a = b).a[5];
That is UB:
int* pa5 = &(a = b).a[5];
If you read the post of Keith Thompson and it is still not clear to
you then I can not help.
I'm wondering what the last sentence is intended to mean ("... need not
have a unique address"). At first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.
You snipped this: "Any attempt to modify an object with temporary
lifetime results in undefined behavior.". Which means, I think,
that an implementation that shared storage for "such an object"
with something else probably isn't going to cause problems for any
code with defined behavior.
Though I can imagine the possibility of code that modifies `a` and
reads via `pc` within the same full expression.
But unless I've somehow missed it, the "Such an object need not
have a unique address." wording doesn't appear on that web page or
in my copy of n1570.pdf. C17 does add these two sentences:
An object with temporary lifetime behaves as if it were declared
with the type of its value for the purposes of effective type. Such
an object need not have a unique address.
Normally any two objects with overlapping lifetime must have distinct
addresses. This addition, I think, gives compilers permission to have
temporary lifetime objects overlap with other existing objects, but not
to have a modification to one object affect the value of the other
(unless the modification invokes UB, of course).
Andrey Tarasevich <noone@noone.net> writes:
On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:
One dark corner this feature has, is that in C (as opposed to C++) the
result of an assignment operator is an rvalue, which can easily lead
to some interesting consequences related to structs with arrays
inside.
I'm curious to know what interesting consequences you mean here. Do
you mean something other than cases that have undefined behavior?
I'm referring to the matter of the address identity of the resultant
rvalue object. At first, "address identity of rvalue" might sound
strange, but the standard says that there's indeed an object tied to
such rvalue, and once we start applying array-to-pointer conversion
(and use `[]` operator), lvalues and addresses quickly come into the
picture.
The standard says in 6.2.4/8:
"A non-lvalue expression with structure or union type, where the
structure or union contains a member with array type [...]
refers to an object with automatic storage duration and temporary
lifetime. Its lifetime begins when the expression is evaluated and its
initial value is the value of the expression. Its lifetime ends when
the evaluation of the containing full expression ends. [...] Such an
object need not have a unique address."
https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8
The last sentence there is not present in N1570. Apparently it was
introduced later, in C17. (My appreciation to Keith Thompson for
reporting this.)
I'm wondering what the last sentence is intended to mean ("... need not
have a unique address"). At first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.
Ahh, I see now what your concern is.
But is it, perhaps, intended to also allow such temporaries to have
addresses identical to regular named objects? It is not immediately
clear to me.
My reading of the post-C11 standards is that they allow the "new"
object to overlap with already existing objects, including both
declared objects and objects whose storage was allocated using
malloc().
And when I make the following experiment with GCC and Clang
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5];
pb = &b.a[5];
pc = &(a = b).a[5];
printf("%p %p %p\n", pa, pb, pc);
}
I consistently get the following output from GCC
0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544
And this is what I get from Clang
0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4
As you can see, GCC apparently took C++-like approach to this
situation. The returned "temporary" is not really a separate temporary
at all, but actually `a` itself.
Yeah.
Meanwhile, in Clang all three pointers are different, i.e. Clang
decided to actually create a separate temporary object for the result
of the assignment.
Which in my reading of the standard is required under C11 rules.
I have reproduced your results under -std=c11 -pedantic, for both
gcc and clang.
I have a strong feeling that GCC's behavior is non-conforming. The
last sentence of 6.2.4/8 is not supposed to permit "projecting" the
resultant temporaries onto existing named objects. I could be wrong...
My judgment is that the behavior under gcc is non-conforming if the
compilation was done using C11 semantics. Under C17 or later rules
the gcc behavior is allowed (and may have been what prompted the
change in C17, but that is just speculation on my part). In any
case I understand now what you were getting at. Thank you for
bringing this hazard to the group's attention.
I hope someone files a bug report for gcc using -std=c11 rules,
because what gcc does under that setting (along with -pedantic)
is surely at odds with the plain reading of the C11 standard,
for the situation being discussed here.
Editorial comment: here is yet another case where post-C11 changes
to the C standard seem ill advised, and another reason not to use
any version of the ISO C standard for C17 or later. And it's
disappointing that gcc -std=c11 -pedantic strays into the realm of
non-conforming behavior.
On Mon 5/5/2025 2:01 AM, Michael S wrote:
On Mon, 5 May 2025 01:29:47 -0700
Andrey Tarasevich <noone@noone.net> wrote:
On Mon 5/5/2025 1:12 AM, Michael S wrote:
According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.
Er... What? What specifically do you mean by "taking pointers"?
The whole functionality of `[]` operator in C is based on pointers.
This expression
(a = b).a[5]
is already doing your "taking pointers of non-lvalue" (if I
understood you correctly) as part of array-to-pointer conversion.
And no, it is not UB.
This is not UB either
struct S foo(void) { return (struct S) { 1, 2, 3 }; }
...
int *p;
p = &foo().a[2], printf("%d\n", *p);
That is not UB:
int a5 = (a = b).a[5];
That is UB:
int* pa5 = &(a = b).a[5];
No, it isn't.
If you read the post of Keith Thompson and it is still not clear to
you, then I cannot help.
The only valid "UB" claim in Keith's post is my printing the value of
`pc` pointer, which by that time happens to point nowhere, since the
lifetime of the temporary is over. (And, of course, lack of
conversion to `void *` is an issue).
As for the expressions like
&(a = b).a[5];
and
&foo().a[2]
- these by themselves are perfectly valid. There's no UB in these
expressions. (And this is not a debate.)
Here's a version of the same code that corrects the above distracting
issues
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
Michael S <already5chosen@yahoo.com> writes:
On Mon, 05 May 2025 01:34:16 -0700
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
And more obviously, "%p" requires an argument of type void*, not int*.
That part of otherwise very good comment is unreasonably pedantic.
I disagree. I suggest it's a bad habit to use "%p" without ensuring,
by a cast if necessary, that the argument is of type void*.
In most implementations, it's likely that all pointers have the same
size and representation and are passed as arguments in the same way,
but getting the types right means one less thing to worry about.
Andrey Tarasevich <noone@noone.net> writes:
[...]
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
I believe it does. [...]
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
James Kuyper <jameskuyper@alumni.caltech.edu> writes:
On 5/3/25 20:37, Keith Thompson wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:
Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment...
assignment is used by everyone everywhere without even giving
it a second thought.
There is a caveat, to do with alignment padding: will this
always have a defined value?
I don't believe so. In a quick look, I don't see anything in
the standard that explicitly addresses this, but I believe that a
conforming implementation could implement structure assignment by
copying the individual members, leaving any padding in the target
undefined.
"When a value is stored in an object of structure or union type,
including in a member object, the bytes of the object
representation that correspond to any padding bytes take
unspecified values.56)" (6.2.6.1p6).
That refers to footnote 56, which says "Thus, for example,
structure assignment need not copy any padding bits."
Yes, that's what I missed.
It's interesting that the footnote refers to padding *bits* rather
than padding *bytes*. I presume this was unintentional.
Padding bits:
struct A {
uint64_t tlen : 16,
: 20,
pkind : 6,
fsz : 6,
gsz : 14,
g : 1,
ptp : 1;
} s;
There are 20 padding bits in this declaration. Perhaps that's
what they're referring to?
On Sat, 3 May 2025 21:42:37 -0400
Richard Damon <richard@damon-family.org> wrote:
Bigger than that, and you likely want to pass the object by address,
not by value, passing just a pointer to it.
That sort of thinking is an example of Knuthian premature optimization.
On Mon, 05 May 2025 01:34:16 -0700
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
And more obviously, "%p" requires an argument of type void*, not
int*.
That part of otherwise very good comment is unreasonably pedantic.
On Mon, 5 May 2025 08:45:09 -0700
Andrey Tarasevich <noone@noone.net> wrote:
On Mon 5/5/2025 2:01 AM, Michael S wrote:
On Mon, 5 May 2025 01:29:47 -0700
Andrey Tarasevich <noone@noone.net> wrote:
On Mon 5/5/2025 1:12 AM, Michael S wrote:
According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.
Er... What? What specifically do you mean by "taking pointers"?
The whole functionality of `[]` operator in C is based on pointers.
This expression
(a = b).a[5]
[...]
is already doing your "taking pointers of non-lvalue" (if I
understood you correctly) as part of array-to-pointer conversion.
And no, it is not UB.
This is not UB either
struct S foo(void) { return (struct S) { 1, 2, 3 }; }
...
int *p;
p = &foo().a[2], printf("%d\n", *p);
That is not UB:
int a5 = (a = b).a[5];
That is UB:
int* pa5 = &(a = b).a[5];
No, it isn't.
If you read the post of Keith Thompson and it is still not clear to
you, then I cannot help.
The only valid "UB" claim in Keith's post is my printing the value of
`pc` pointer, which by that time happens to point nowhere, since the
lifetime of the temporary is over. (And, of course, lack of
conversion to `void *` is an issue).
As for the expressions like
&(a = b).a[5];
and
&foo().a[2]
The expressions by themselves are valid. But since there is no situation
in which the value produced by these expressions is valid outside of the
expressions, the compiler can generate any value it wants, even NULL or a
value completely outside of the address space of the current process.
- these by themselves are perfectly valid. There's no UB in these
expressions. (And this is not a debate.)
Here's a version of the same code that corrects the above distracting
issues
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
It's only not UB in the nasal demons sense.
It's UB in the sense that we can't predict the values of expressions
like (pa==pc) and (pb==pc), i.e. pc is completely useless. In my book
that is a form of UB.
On 2025-05-05, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Michael S <already5chosen@yahoo.com> writes:
On Mon, 05 May 2025 01:34:16 -0700
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
And more obviously, "%p" requires an argument of type void*, not
int*.
That part of otherwise very good comment is unreasonably pedantic.
I disagree. I suggest it's a bad habit to use "%p" without
ensuring, by a cast if necessary, that the argument is of type
void*.
In most implementations, it's likely that all pointers have the
same size and representation and are passed as arguments in the
same way, but getting the types right means one less thing to worry
about.
If the codebase assumes all data pointers have the same size and bit
pattern and are treated the same in the calling conventions / ABI,
then it is probably moot.
That code is doomed on a platform where the assumption doesn't
hold, and the printf statements are probably not independently
reusable.
(I mostly put in these casts just to communicate to others that
an ISO C language lawyer works here, if you happen to need one.)
Also, it would be amazingly stupid of any such platform not to just
make those printfs work: to promote variadic arguments of
pointer-to-object type to a common representation which is the
same as void *, combined with a matching behavior in the va_arg
macro for extracting the value back into any pointer-to-object
type.
Muttley@dastardlyhq.com writes:
[...]
If you want to pass an actual array to a function instead of a pointer to it,
embedding it in a structure is the only way to do it.
Yes, but that's not necessarily useful. An array that's a member
of a struct can only be of a constant length (unless it's a flexible
array member, but that doesn't help). Functions that work with
arrays typically need to deal with arrays of arbitrary length.
Andrey Tarasevich <noone@noone.net> writes:
[...]
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.
N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."
On 05/05/2025 22:53, Keith Thompson wrote:
Muttley@dastardlyhq.com writes:
[...]
If you want to pass an actual array to a function instead of a pointer to it,
embedding it in a structure is the only way to do it.
Yes, but that's not necessarily useful. An array that's a member
of a struct can only be of a constant length (unless it's a flexible
array member, but that doesn't help). Functions that work with
arrays typically need to deal with arrays of arbitrary length.
I regularly use arrays with known fixed sizes. In fact, in my code
those are absolutely dominant - it is very rare for me to see or use an
array whose size is /not/ fixed at compile time. Sometimes I will have
Muttley@dastardlyhq.com writes:
[...]
If you want to pass an actual array to a function instead of a
pointer to it, embedding it in a structure is the only way to do
it.
Yes, but that's not necessarily useful. An array that's a member
of a struct can only be of a constant length (unless it's a flexible
array member, but that doesn't help). Functions that work with
arrays typically need to deal with arrays of arbitrary length.
Andrey Tarasevich <noone@noone.net> writes:
[...]
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.
N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Andrey Tarasevich <noone@noone.net> writes:
[...]
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.
N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."
Note commas above. Assignment to pc and call to printf are parts
of a single expression, so use of pc is within lifetime of the
temporary object.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:[...]
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
Andrey Tarasevich <noone@noone.net> writes:
[...]
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5];
pb = &b.a[5];
pc = &(a = b).a[5];
printf("%p %p %p\n", pa, pb, pc);
}
[...]
I think that code has undefined behavior.
Right. [*]
[*] Assuming C11 semantics. At best inadvisable under C99
semantics, and a constraint violation under C90 semantics.
What C90 constraint does it violate? Both gcc and clang reject it
with "-std=c90 -pedantic-errors", with an error message "ISO C90
forbids subscripting non-lvalue array", but I don't see a relevant
constraint in the C90 standard.
I know that C11 introduced "temporary lifetime" to cover cases
like this. In C99, the wording for the indexing operator implicitly
assumes that there's an array object; if there isn't, I'd argue the
behavior is undefined by omission. I'm not aware of any relevant
change from C90 to C99.
On 06/05/2025 19:36, Waldek Hebisch wrote:
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Andrey Tarasevich <noone@noone.net> writes:
[...]
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.
N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."
Note commas above. Assignment to pc and call to printf are parts
of a single expression, so use of pc is within lifetime of the
temporary object.
I must admit I had not noticed that detail.
David Brown <david.brown@hesbynett.no> writes:
On 06/05/2025 19:36, Waldek Hebisch wrote:
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Andrey Tarasevich <noone@noone.net> writes:
[...]
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.
N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."
Note commas above. Assignment to pc and call to printf are parts
of a single expression, so use of pc is within lifetime of the
temporary object.
I must admit I had not noticed that detail.
That would get an immediate downcheck during review for exactly
that reason.
Nick Bowler <nbowler@draconx.ca> writes:[...]
The rule about conversions from arrays to pointers is different in C99
(n1124 6.3.2.1, third paragraph) compared to C89. In particular,
"an lvalue that has type ``array of type'' ..." was changed to
"an expression that has type ``array of type'' ...".
The change from "lvalue" to "expression" was made in C99. I wonder why
that was done.
BTW, you have a copy of ANSI C89? Hard or soft copy? Do you know if
it's still available in some form?
On Tue, 06 May 2025 13:21:38 -0700, Keith Thompson wrote:
Nick Bowler <nbowler@draconx.ca> writes:
The rule about conversions from arrays to pointers is different
in C99 (n1124 6.3.2.1, third paragraph) compared to C89. In
particular, "an lvalue that has type ``array of type'' ..." was
changed to "an expression that has type ``array of type'' ...".
[...]
The change from "lvalue" to "expression" was made in C99. I
wonder why that was done.
It's not mentioned in the rationale, so we can only guess. [...]
Nick Bowler <nbowler@draconx.ca> writes:
On Tue, 06 May 2025 13:21:38 -0700, Keith Thompson wrote:
The change from "lvalue" to "expression" was made in C99. I wonder why
that was done.
It's not mentioned in the rationale, so we can only guess. But it is
called out in the list of major changes in the C99 foreword.
I've just looked at the foreword of the C99 standard and the n1256
draft, and I couldn't find it. Can you quote the precise wording?
On Mon 5/5/2025 1:26 AM, Keith Thompson wrote:
I'm wondering what the last sentence is intended to mean ("... need not
have a unique address"). At first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.
You snipped this: "Any attempt to modify an object with temporary
lifetime results in undefined behavior.". Which means, I think,
that an implementation that shared storage for "such an object"
with something else probably isn't going to cause problems for any
code with defined behavior.
It is going to cause problems, if the code relies on the address
identity of the object, assuming the standard intends to provide such guarantees.
Though I can imagine the possibility of code that modifies `a` and
reads via `pc` within the same full expression.
That's easy (in the context of declarations from my previous example):
pc = &(a = b).a[5], a.a[5] = 42, printf("%d\n", *pc);
As one would expect, this produces different output in GCC and Clang
for the reasons I already described.
But unless I've somehow missed it, the "Such an object need not
have a unique address." wording doesn't appear on that web page or
in my copy of n1570.pdf. C17 does add these two sentences:
An object with temporary lifetime behaves as if it were declared
with the type of its value for the purposes of effective type. Such
an object need not have a unique address.
Normally any two objects with overlapping lifetime must have distinct
addresses. This addition, I think, gives compilers permission to have
temporary lifetime objects overlap with other existing objects, but not
to have a modification to one object affect the value of the other
(unless the modification invokes UB, of course).
If so, that would be extremely underspecified. A mere "such an object
need not have a unique address" is insufficient to fully convey the permission to overlap existing named objects.
And that's probably what led to the difference in interpretation
between GCC and Clang.
Modification of the temporary is "prohibited" (as UB), but
modification of the overlapped named object is not. The
consequences can be quite surprising.
Andrey Tarasevich <noone@noone.net> writes:
And that's probably what led to the difference in interpretation
between GCC and Clang.
I suspect the implication actually goes the other way. It is
because what gcc has done (past tense) violates the rules of the C11
standard that someone had the bright idea that the C standard should
be changed to allow this stupidity.
Modification of the temporary is "prohibited" (as UB), but
modification of the overlapped named object is not. The
consequences can be quite surprising.
In my view the problem is not that what is allowed is unclear, but
that the whole idea of possibly overlapping objects is a crock.
It's a sad statement on the quality of gcc that it does the wrong
thing even when -std=c11 and -pedantic are given as compilation
options. Bleah.
On Sat, 3 May 2025 21:42:37 -0400
Richard Damon <richard@damon-family.org> wrote:
Bigger than that, and you likely want to pass the object by address,
not by value, passing just a pointer to it.
That sort of thinking is an example of Knuthian premature optimization.