• UB-free slice

    From highcrew@high.crew3868@fastmail.com to comp.lang.c on Mon Jan 5 22:36:28 2026
    From Newsgroup: comp.lang.c

    Hello,

    Let `struct Buffer` be defined as:

    struct Buffer { unsigned char *bytes; size_t length; };

    Then let's define the following function:

    int buffer_append(struct Buffer *dst, const struct Buffer *src)
    {
        // checks on size etc etc, return non-zero on failure

        memcpy(dst->bytes, src->bytes, src->length);
        dst->bytes += src->length;
        dst->length -= src->length;
        return 0;
    }

    In my understanding, if we call buffer_append(&x, &y) for x.bytes and
    y.bytes pointing to overlapping areas of the same array, we get UB by
    the first two parameters of memcpy being restrict-qualified pointers.

    I suppose the correct thing to do would be to replace memcpy with
    memmove, allegedly taking some performance penalty.
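
For reference, a minimal sketch of the memmove variant described above. It assumes the number of bytes copied is `src->length` and that the elided size check is a plain capacity test; both are assumptions, since the post leaves them out.

```c
#include <stddef.h>
#include <string.h>

struct Buffer { unsigned char *bytes; size_t length; };

/* Copies src into the space dst points at, then advances dst.
   memmove is defined even when the two byte ranges overlap.
   Assumption: the copy size is src->length and the only check
   needed is that dst has enough room left. */
int buffer_append(struct Buffer *dst, const struct Buffer *src)
{
    size_t n = src->length;

    if (n > dst->length)
        return -1;                  /* not enough room left in dst */

    memmove(dst->bytes, src->bytes, n);
    dst->bytes += n;
    dst->length -= n;
    return 0;
}
```

With this version, a call where `src->bytes` and `dst->bytes` point into the same array stays well defined.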

    Assuming that such penalty matters on the scale of things, what is
    the correct/recommended way of handling this situation?
    Would it make sense to define a `struct RBuffer` as the following for
    the purpose?

    struct RBuffer { unsigned char * restrict bytes; size_t length; };

    This is maybe OK with respect to UB, but IMO it would result in a rather
    ugly API. We don't always *need* that bytes pointer to be restricted.

    Alternative ideas? I can think of the following, but I don't like
    it either...

    int buffer_append(struct Buffer *dst, const struct Buffer *src,
                      bool overlapping)
    {
        // if overlapping then memmove else memcpy
    }
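
A sketch of how the flag-based variant could look when filled in. It assumes the copy size is `src->length` and that failure means `dst` is too small; neither detail is given in the post. The caller is trusted to pass `overlapping` correctly: passing `false` for ranges that do overlap is still undefined behavior.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct Buffer { unsigned char *bytes; size_t length; };

/* Assumption: copies src->length bytes and fails if dst is too small. */
int buffer_append(struct Buffer *dst, const struct Buffer *src,
                  bool overlapping)
{
    size_t n = src->length;

    if (n > dst->length)
        return -1;                           /* not enough room in dst */

    if (overlapping)
        memmove(dst->bytes, src->bytes, n);  /* safe for any layout */
    else
        memcpy(dst->bytes, src->bytes, n);   /* caller promises no overlap */

    dst->bytes += n;
    dst->length -= n;
    return 0;
}
```
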
    --
    High Crew

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.lang.c on Tue Jan 6 09:15:00 2026

    On 05/01/2026 22:36, highcrew wrote:
    > Hello,
    >
    > Let `struct Buffer` be defined as:
    >
    >   struct Buffer { unsigned char *bytes; size_t length; };
    >
    > Then let's define the following function:
    >
    >   int buffer_append(struct Buffer *dst, const struct Buffer *src)
    >   {
    >       // checks on size etc etc, return non-zero on failure
    >
    >       memcpy(dst->bytes, src->bytes, src->length);
    >       dst->bytes += src->length;
    >       dst->length -= src->length;
    >       return 0;
    >   }
    >
    > In my understanding, if we call buffer_append(&x, &y) for x.bytes and
    > y.bytes pointing to overlapping areas of the same array, we get UB by
    > the first two parameters of memcpy being restrict-qualified pointers.
    >
    > I suppose the correct thing to do would be to replace memcpy with
    > memmove, allegedly taking some performance penalty.
    >
    > Assuming that such penalty matters on the scale of things, what is
    > the correct/recommended way of handling this situation?

    Use memmove. Your assumption about the performance penalty is
    unwarranted - it should be negligible on any scale of things. And even
    if the performance penalty was big, "correct" is better than "fast"
    every time.

    You have a very strange definition for your "Buffer" type. A "Buffer"
    type that does not keep track of the start of the buffer, but only the
    end point and the space remaining in the memory allocation, is not much
    use as a buffer. It might be useful as a "Buffer_End_Point" type, but
    not "Buffer".
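
One possible shape for a buffer type that does keep track of its start, as a sketch only. The names `TrackedBuffer`, `start`, `used`, `capacity` and `tracked_append` are hypothetical and do not come from the thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: the start pointer never moves, so the whole
   buffer stays reachable after any number of appends. */
struct TrackedBuffer {
    unsigned char *start;    /* first byte of the storage */
    size_t used;             /* bytes written so far */
    size_t capacity;         /* total size of the storage */
};

/* Appends n bytes from data; returns non-zero if they do not fit. */
int tracked_append(struct TrackedBuffer *buf, const void *data, size_t n)
{
    if (n > buf->capacity - buf->used)
        return -1;

    /* memmove, so data may point into buf's own storage. */
    memmove(buf->start + buf->used, data, n);
    buf->used += n;
    return 0;
}
```

The write position is derived as `start + used`, so the cursor-style pointer in the original `struct Buffer` becomes a computed value rather than the only stored one.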

  • From highcrew@high.crew3868@fastmail.com to comp.lang.c on Tue Jan 6 11:25:20 2026

    On 1/6/26 9:15 AM, David Brown wrote:
    > Use memmove. Your assumption about the performance penalty is
    > unwarranted - it should be negligible on any scale of things. And even
    > if the performance penalty was big, "correct" is better than "fast"
    > every time.

    Thanks, that's how I corrected the code.

    > You have a very strange definition for your "Buffer" type. A "Buffer"
    > type that does not keep track of the start of the buffer, but only the
    > end point and the space remaining in the memory allocation, is not much
    > use as a buffer. It might be useful as a "Buffer_End_Point" type, but
    > not "Buffer".

    It is used a bit like a cursor.
    But you are right: it is not properly a buffer, although it can be
    used to reference a buffer.
    Perhaps it should be renamed as `struct Slice`.
    --
    High Crew
  • From Kaz Kylheku@046-301-5902@kylheku.com to comp.lang.c on Tue Jan 6 20:48:51 2026

    On 2026-01-05, highcrew <high.crew3868@fastmail.com> wrote:
    > In my understanding, if we call buffer_append(&x, &y) for x.bytes and
    > y.bytes pointing to overlapping areas of the same array, we get UB by
    > the first two parameters of memcpy being restrict-qualified pointers.

    Nope! The standard doesn't provide a /definition/ of memcpy.

    And restrict qualifiers on pointer parameters in a /declaration/ don't
    mean anything (and need not be repeated in the definition).
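
A small illustration of the point about declarations. The following is valid C99 or later, because qualifiers applied to the parameters themselves are dropped when function types are checked for compatibility. The function name `copy_bytes` is made up for the example.

```c
#include <stddef.h>

/* Declaration with restrict-qualified pointer parameters. */
void copy_bytes(unsigned char *restrict dst,
                const unsigned char *restrict src, size_t n);

/* Definition without restrict. It declares the same function:
   top-level qualifiers on parameters are ignored when the two
   function types are compared. */
void copy_bytes(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}
```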

    memcpy is not required to accept overlapping objects, for the simple
    reason that its description says so.

    That situation existed long before there was a restrict, and has
    implications in the absence of restrict. Various ways of implementing
    a memcpy-like function will produce various unexpected results when
    objects overlap, even without any undefined behavior taking place.
    For instance, if we copy byte-by-byte, from lowest address to highest,
    then we end up writing into memory that our loop is about to read
    from, corrupting the data.

    In order not to place any restrictions on how a memcpy may be
    implemented, whether with the help of restrict pointers, or assembly
    language or whatever else, the standard makes overlapping inputs
    undefined.
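
The byte-by-byte case can be reproduced with a short sketch. The helper names `forward_copy` and `overlap_demo` are hypothetical; the loop has no restrict qualifiers, so running it on overlapping ranges is defined behavior that simply gives an unwanted result. The example copies four bytes two positions up within one array.

```c
#include <stddef.h>
#include <string.h>

/* Naive copy, lowest address first, no restrict qualifiers. */
static void forward_copy(unsigned char *dst, const unsigned char *src,
                         size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Fills both arrays with "abcdef" (no terminator), then copies the
   first four bytes to offset 2, once naively and once with memmove. */
void overlap_demo(unsigned char naive[6], unsigned char moved[6])
{
    memcpy(naive, "abcdef", 6);
    memcpy(moved, "abcdef", 6);

    forward_copy(naive + 2, naive, 4);  /* naive is now "ababab" */
    memmove(moved + 2, moved, 4);       /* moved is now "ababcd" */
}
```

The naive loop overwrites `c` and `d` before it reads them, so the copied data repeats `ab` instead of ending in `cd`.
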
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
  • From Kaz Kylheku@046-301-5902@kylheku.com to comp.lang.c on Tue Jan 6 20:50:42 2026

    On 2026-01-06, Kaz Kylheku <046-301-5902@kylheku.com> wrote:
    > implications in the absence of restrict. Various ways of implementing
    > a memcpy-like function will produce various unexpected results when
    > objects overlap, even without any undefined behavior taking place.
    > For instance, if we copy byte-by-byte, from lowest address to highest,
    > then we end up writing into memory that our loop is about to read
    > from, corrupting the data.

    I mean, of course, when we copy from a lower-addressed object to a
    higher-addressed object which overlaps it.
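
A sketch of a direction-aware copy that avoids the corruption in this case: when the destination lies above the source, it copies from the last byte down. `copy_within` is a hypothetical helper, and it assumes both pointers point into the same array, since comparing unrelated pointers with `<` is itself undefined.

```c
#include <stddef.h>

/* Overlap-safe copy of n bytes within a single array.
   Assumption: dst and src point into the same array object, so the
   relational comparisons below are well defined. */
void copy_within(unsigned char *dst, const unsigned char *src, size_t n)
{
    if (dst < src) {
        /* Destination is lower: copying upward never clobbers
           bytes that are still to be read. */
        for (size_t i = 0; i < n; i++)
            dst[i] = src[i];
    } else if (dst > src) {
        /* Destination is higher: copy from the end downward. */
        for (size_t i = n; i-- > 0; )
            dst[i] = src[i];
    }
    /* dst == src: nothing to do. */
}
```
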
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca