• STOMP

    From =?UTF-8?Q?Arne_Vajh=C3=B8j?=@arne@vajhoej.dk to comp.os.vms on Fri Jul 25 17:30:00 2025
    From Newsgroup: comp.os.vms

    I have updated my STOMP library.

    (for those that does not know STOMP then it is a standard protocol for accessing a message queue)

    https://www.vajhoej.dk/arne/opensource/vms/vmspstomp-src-v0_5.zip

    It consist of 3 things:
    * simple_stomp: a C library, not created by me - I just found it
    on the internet, but now I have modified it quite a bit
    * vms_simple_stomp : a wrapper exposing a VMS API (strings by
    descriptor, integers always by reference)
    * pstomp : a Pascal wrapper to make it easier to use in
    Pascal (build to PEN)

    There are 3 changes from 0.4 to 0.5:
    - The _read function is now supplemented by _sub and _readone
    functions for the case where one want to read more than one
    message (_read sort of work for reading multiple messages from
    a queue but it totally fucks up reading multiple messages from
    a topic)
    - the low level reading now handle the case where a single TCP recv
    receives more than one message, before that would result in
    second message getting lost
    - avoid appending a LF to messages when sending - other libraries does
    not do that

    Arne

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.os.vms on Sun Jul 27 06:07:40 2025
    From Newsgroup: comp.os.vms

    In article <1060t0o$20o4j$1@dont-email.me>,
    Arne Vajh|+j <arne@vajhoej.dk> wrote:
    I have updated my STOMP library.

    (for those that does not know STOMP then it is a standard protocol for >accessing a message queue)

    https://www.vajhoej.dk/arne/opensource/vms/vmspstomp-src-v0_5.zip

    This seems to lack any kind of license; since you mentioned that
    you did not write the STOMP library yourself, where is the
    original, and what exactly are the terms for redistribution?

    It consist of 3 things:
    * simple_stomp: a C library, not created by me - I just found it
    on the internet, but now I have modified it quite a bit

    Oophf. Unfortunately, this code is loaded with bugs and
    footguns.

    Starting with the `helper_write` function.

    First, this function's name is vague. It appears to serialize a
    "frame" into a byte buffer and send it over a socket associated
    with a TCP connection, but on the face of it, the name is not
    descriptive. But let's look at the first few lines of code:

    -static int helper_write(simple_stomp_t *ctx, const char *cmd, const char *header, const char *body)
    -{
    - int status;
    - char buf[MAX_BUF], errmsg[200], *p;
    - p = buf;
    - p += snprintf(p, sizeof(buf) - (p - buf), "%s\n", cmd);

    Ok, here's the first problem. Recall that `snprintf` will
    return the total number of bytes needed to successfully format
    its arguments, not the total number of bytes it copies to the
    destination. So if `cmd` is sufficiently long that it overflows
    `buf`, then you'll set `p` to some address beyond the end of the
    buffer, and your next copy will be out of bounds; the
    computation of the size argument will use wrap-around unsigned
    arithmetic, so will (likely) be very large.

    This may not be a huge problem in practice, though; this
    function is `static`, and `cmd` is always fixed and small.
    Still, this is a bug waiting to happen. Furthermore...

    - if(header != NULL)
    - {
    - p += snprintf(p, sizeof(buf) - (p - buf), "%s\n", header);

    Note that `header` is constructed from an argument supplied by
    the user. However, that argument is always copied into a small
    buffer, prepended with the header name ("destination: ", when
    specified). But the argument may be truncated when copied into
    that intermediate value, and there's no way to detect that.

    - }
    - p += snprintf(p, sizeof(buf) - (p - buf), "\n");
    - if(body != NULL)
    - {
    - p += snprintf(p, sizeof(buf) - (p - buf), "%s", body);
    - }

    In this case, `body` is provided by the user, and certainly may
    be truncated; again, with no way to know.

    So, while this section of code may _happen_ to work correctly
    most of the time, at best it's accidental, and at worst, this
    may lead to data corruption due to silent truncation.

    Then there's the matter of `send`:

    - status = send(ctx->sd, buf, strlen(buf) + 1, 0);
    - if(status < 0)
    - {
    - snprintf(errmsg, sizeof(errmsg), "Error sending to socket: %s", strerror(errno));
    - ctx->eh(errmsg);
    - return 0;
    - }
    - return 1;

    There's no actual guarantee that `send` needs to write the total
    number of bytes you give it in one go. The "VSI TCP/IP Services
    for OpenVMS Sockets API and System Services Programming" manual
    alludes to this, but more specifically, there is no structural
    reason that the socket you're sending on cannot be marked
    non-blocking after it's been connected. So while `send` as
    written may be ok for casual programming or programs where
    access to the socket is tightly controlled, it is not
    particularly robust in a library. It is best to call it in a
    loop.

    Next, let's look at the `helper_read` function:

    -static int helper_read(simple_stomp_t *ctx, char *cmd, char *header, char *body)
    -...
    - while(ctx->ix < sizeof(ctx->buf) && ctx->ix <= strlen(ctx->buf))
    - {
    - ctx->ix += recv(ctx->sd, ctx->buf + ctx->ix, sizeof(ctx->buf) - ctx->ix, 0);
    - }

    First of all, what if `recv` returns -1 to indicate an error?

    Second of all, how do you know that calling `strlen` on
    `ctx->buf` is safe? There's no guarantee that that buffer is
    0-terminated. To understand this, let's break down how it's
    initialized, and then how it's used.

    `ctx->buf[0]` and `ctx->ix` are both set to 0 when the elements
    of `ctx` are initialized during `simple_stomp_init`, but
    critically, the contents of the rest of `ctx->buf` are
    indeterminate. So, consider this loop: on the first iteration,
    `ctx->ix` is zero, and `strlen(ctx->buf)` is 0, so the loop
    conditions are met, and we enter the body; suppose that there
    are 10 bytes available for `recv`, but that none of those are a
    0; so we set ctx->ix to 10, overwrite the 0 in `ctx->buf[0]`,
    and continue on to the next iteration of the loop. But on this
    iteration, we really don't know whether `strlen(ctx->buf)` will
    encounter a 0 byte or not, since we overwrote the only zero we
    *know* was in the buffer with non-zero data.

    I don't see any incidental mitigation for this, as there was for
    the errant `snprintf` calls above; it looks like it just crashes
    if you don't receive (at least) a full frame on each call to
    `recv` and there doesn't happen to already be a 0 in the buffer
    somewhere by accident.

    Taking a step back, it's unstated but one may deduce that frames
    in this STOMP protocol are 0-delimited; that's an unfortunate
    design decision (they should just count then, using something
    similar to Bernstein's netstrings if need be), but it is what it
    is. Anyway, it's clear that this code is using `strlen` as a
    proxy for trying to find the 0 delimiter in the buffer. A far
    more robust mechanism would be to use a function that's designed
    for finding a byte in (potentially) non-string data: `memchr`.
    If you're already keeping track of how much data you have
    accumulated in your buffer, just look for the first zero byte in
    that range: `char *p = memchr(buf, 0, len);` If `p` is
    non-NULL, then you've accumulated an entire frame.

    Third, this loop doesn't handle truncation, which will almost
    certainly lead to a program crash. Suppose that a frame is sent
    to you that is larger than your receive buffer; you'll fill the
    buffer, and so `ctx->ix` will be equal to `sizeof(ctx->buf)` and
    the loop will terminate, but since the received frame was
    truncated, then by definition, the buffer will not contain a
    zero-valued byte. Any subsequent string operations on the
    buffer (such as the `strstr`s to find newlines later on, or the
    unbounded `strcpy` into `body`) will be applied to a
    non-terminated string; clearly a bug.

    Speaking of `strcpy`...this function copies data received from
    the network into fixed-sized buffers, but without validating
    that the data fits into those buffers: there sizes aren't even
    passed as arguments to `helper_read`! These are buffer
    overflows waiting to happen.

    Also, speaking of `strstr`, there's no need to use `strstr` to
    look for a single character in a string. Use `strchr` instead.

    Finally, once the frame has been decoded and the bits copied to
    their respective destinations, there's this code, to move
    whatever frame data may have been received from the network to
    the front of the context buffer for use when receiving the next
    frame:

    - int len = strlen(ctx->buf) + 2;
    - memmove(ctx->buf, ctx->buf + len, ctx->ix - len);
    - ctx->ix -= len;

    First, the return type of `strlen` is `size_t`, not `int`, but
    that's minor. The real issue is that you've got an off-by-one
    error here; `strlen` will not include the terminating NUL at the
    end of the string, so I understand why you want to add something
    to it's return value, but it's also true that, if `s` is a C
    string and `l = strlen(s)`; then `s[l] == 0`, and `s+l+1` points
    to the char immediately _after_ the terminating 0. That is, the
    length of the entire frame, including the NUL terminator, is
    `strlen(ctx->buf) + 1`, not +2. Here, you appear to be
    discarding the first byte of the next frame, though the protocol
    does say that a message can have any number of newlines
    following the NUL byte, so perhaps that's what you are doing.
    Regardles, you should validate that you're not throwing away
    valid data.

    Next up, let's look at `simple_stomp_init`:

    -int simple_stomp_init(simple_stomp_t *ctx, const char *host, int port, error_handler eh)`
    -{
    - ....
    - struct sockaddr local, remote;
    - ....
    - local.sa_family = AF_INET;
    - memset(local.sa_data, 0, sizeof(local.sa_data));
    - status = bind(ctx->sd, &local, sizeof(local));

    You don't need to `bind` here. It's unclear why you do.

    - remote.sa_family = hostinfo->h_addrtype;
    - memcpy(remote.sa_data + 2, hostinfo->h_addr_list[0], hostinfo->h_length);
    - *((short *)remote.sa_data) = htons(port);
    - status = connect(ctx->sd, &remote, sizeof(remote));

    It's totally unclear why you would abuse `struct sockaddr` like
    this: the structure of `sa_data` is meant to be opaque. Just
    use `sockaddr_in` here, and as all of the available
    documentation demonstrates. This sequence then becomes:

    struct sockaddr_in remote;
    ...
    memset(&remote, 0, sizeof(remote));
    remote.sin_family = AF_INET;
    remote.sin_port = ntohs(port);
    memcpy(&remote.sin_addr, hostinfo->h_addr_list[0], hostinfo->h_length);
    status = connect(ctx->sd, (struct sockaddr *)&remote, sizeof(remote));

    See the aforementioned VSI TCP/IP Services manual for details
    and examples.

    Continuing on, we get to this code:

    - if(!helper_read(ctx, cmd, header, body))
    - {
    - snprintf(errmsg, sizeof(errmsg), "Error receiving CONNECTED");
    - ctx->eh(errmsg);
    - return 0;
    - }

    Here, we are attempting to read a frame in response to a message
    we sent the server ("CONNECT"). We deduce that the server is
    supposed to send us a "CONNECTED" frame, but we only verify that
    we receive _a_ frame; the contents of that frame is not
    validated. You should probably check it to make sure it is what
    you expect.

    Note also that we're reading into fixed sized buffers that are
    very small (80 bytes!) and easily overflowed.

    Finally with respect to initialization, this code leaks sockets
    on init failure: after the socket has been created, and even
    connected, if the rest of initialization fails for any reason,
    then we don't close the socket descriptor, and there's no good
    way for calling code to know that it's open; error handling code
    should close it consistently.

    In general, things in this module are poorly named,
    inconsistently formatted, overly verbose, and repetitious (e.g.,
    the `XDEBUG` stuff is easily factored out into a small helper
    function); there's a lot of "accidentally quadratic" code that
    does things like loop over the value of `strlen` for some
    string. I certainly wouldn't trust this in a production
    setting, though it may be fine for hobbyist use or a
    non-critical internal tool.

    * vms_simple_stomp : a wrapper exposing a VMS API (strings by
    descriptor, integers always by reference)

    This code does a lot of extraneous allocation and copying, to
    convert from a string descriptor to a NUL-terminated C-style
    string: presumably the `malloc`'ing is done to ensure there's
    space for the 0-terminator. But with relatively modest
    interface changes, none of that is necessary, and indeed, a lot
    of the copying in the core library is unnecessary as well. The
    trick is to define a local type for a sized string, convert the
    descriptor to an instance of that, and then use that type
    consistently in the core library. One then has a function that
    serializes a frame into a buffer, which can be sent, and one to
    retrieve a raw frame from the context, which is then parsed into
    a suitable representation. Everything can then be stack
    allocated.

    Also, in the `read` cases, even though you have access to
    destination buffer sizes and it's easy enough to plumb them
    through, you did not pass those to the reading routines, so they
    cannot validate their destination buffers or check for
    truncation.

    * pstomp : a Pascal wrapper to make it easier to use in
    Pascal (build to PEN)

    I'm afraid that I did not look at this closely.

    There are 3 changes from 0.4 to 0.5:
    - The _read function is now supplemented by _sub and _readone
    functions for the case where one want to read more than one
    message (_read sort of work for reading multiple messages from
    a queue but it totally fucks up reading multiple messages from
    a topic)
    - the low level reading now handle the case where a single TCP recv
    receives more than one message, before that would result in
    second message getting lost
    - avoid appending a LF to messages when sending - other libraries does
    not do that

    See above. On a general note, you'd be much better off tracking
    this sort of thing in a proper revision control tool; perhaps
    putting that code into (say) a `git` repository.

    I addressed most of these issues in a change I pushed to github
    here: https://github.com/dancrossnyc/stompvms.git

    I have no real means to test this against a network server, but
    it compiles and the simple test exercisor I wrote for the frame
    serdes code runs on Eisner.

    - Dan C.

    PS: the reference for this little protocol: https://stomp.github.io/stomp-specification-1.2.html
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Simon Clubley@clubley@remove_me.eisner.decus.org-Earth.UFP to comp.os.vms on Mon Jul 28 12:43:45 2025
    From Newsgroup: comp.os.vms

    On 2025-07-25, Arne Vajhoj <arne@vajhoej.dk> wrote:
    I have updated my STOMP library.

    (for those that does not know STOMP then it is a standard protocol for accessing a message queue)


    OK. Did anyone else read that subject line and wonder who or what Arne
    was wanting to stomp on ? :-)

    Simon.
    --
    Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
    Walking destinations on a map are further away than they appear.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.os.vms on Mon Jul 28 14:17:03 2025
    From Newsgroup: comp.os.vms

    In article <1067ra1$25emj$1@dont-email.me>,
    Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> wrote:
    On 2025-07-25, Arne Vajhoj <arne@vajhoej.dk> wrote:
    I have updated my STOMP library.

    (for those that does not know STOMP then it is a standard protocol for
    accessing a message queue)


    OK. Did anyone else read that subject line and wonder who or what Arne
    was wanting to stomp on ? :-)

    Clearly not https://stomponline.com

    But a link to https://stomp.github.io would have been helpful.

    - Dan C.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From =?UTF-8?Q?Arne_Vajh=C3=B8j?=@arne@vajhoej.dk to comp.os.vms on Mon Jul 28 18:52:33 2025
    From Newsgroup: comp.os.vms

    On 7/25/2025 5:30 PM, Arne Vajh|+j wrote:
    I have updated my STOMP library.

    (for those that does not know STOMP then it is a standard protocol for accessing a message queue)

    https://www.vajhoej.dk/arne/opensource/vms/vmspstomp-src-v0_5.zip

    There are 3 changes from 0.4 to 0.5:
    - The _read function is now supplemented by _sub and _readone
    -a functions for the case where one want to read more than one
    -a message (_read sort of work for reading multiple messages from
    -a a queue but it totally fucks up reading multiple messages from
    -a a topic)
    - the low level reading now handle the case where a single TCP recv
    -a receives more than one message, before that would result in
    -a second message getting lost
    - avoid appending a LF to messages when sending - other libraries does
    -a not do that

    https://www.vajhoej.dk/arne/opensource/vms/vmspstomp-src-v0_6.zip

    Changes from 0.5 to 0.6:
    - added _unsub functions
    - setting logical/symbol clientid will now cause
    init to send a client-id header

    Both changes require to support durable subscriptions.

    Arne

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.os.vms on Tue Jul 29 02:33:28 2025
    From Newsgroup: comp.os.vms

    In article <1068uvh$2bs8v$1@dont-email.me>,
    Arne Vajh|+j <arne@vajhoej.dk> wrote:
    On 7/25/2025 5:30 PM, Arne Vajh|+j wrote:
    I have updated my STOMP library.

    (for those that does not know STOMP then it is a standard protocol for
    accessing a message queue)

    https://www.vajhoej.dk/arne/opensource/vms/vmspstomp-src-v0_5.zip

    There are 3 changes from 0.4 to 0.5:
    - The _read function is now supplemented by _sub and _readone
    -a functions for the case where one want to read more than one
    -a message (_read sort of work for reading multiple messages from
    -a a queue but it totally fucks up reading multiple messages from
    -a a topic)
    - the low level reading now handle the case where a single TCP recv
    -a receives more than one message, before that would result in
    -a second message getting lost
    - avoid appending a LF to messages when sending - other libraries does
    -a not do that

    https://www.vajhoej.dk/arne/opensource/vms/vmspstomp-src-v0_6.zip

    Changes from 0.5 to 0.6:
    - added _unsub functions
    - setting logical/symbol clientid will now cause
    init to send a client-id header

    Both changes require to support durable subscriptions.

    Sadly, this doesn't address any of the defects I identified in
    your previous version (well, some buffer sizes were enlarged,
    but they still suffer from the basic underlying problems I
    described earlier). This version introduces several more
    problems. However, the delta off of the previous is small, so
    we only really need to look at the diffs.

    Let's look at the first new issue. The main thing added is the, `simple_stomp_unsub` function, which is defined here:

    +int simple_stomp_unsub(simple_stomp_t *ctx, const char *qname)
    +{

    The format of "qname" is undocumented. Below, you look for a
    newline in it; but why would this have a newline embedded in it,
    and why would it matter if it doesn't?

    + char cmd[800], header[800], errmsg[200];
    + char *p = strstr(qname, "\n");

    First of all, as I mentioned last time, using `strstr` to find a
    single character in a string is dubious. Use `strchr` instead;
    that is what it is there for.

    + if(p == NULL)
    + {
    + if(!helper_write(ctx, "UNSUBSCRIBE", NULL, NULL))

    This won't append a header with a queue identifier, which seems
    to violate all versions of the the protocol defined at
    https://stomp.github.io.

    + }
    + else
    + {
    + strcpy(header, p + 1);

    Here, you're doing an (unbounded) string copy into a fixed-sized
    buffer using data you're given from the user. I see no reason
    why this cannot result in a buffer overflow with sufficiently
    long input. You could use `snprintf`, for instance, to avoid
    that here.

    + if(!helper_write(ctx, "UNSUBSCRIBE", header, NULL))

    You do not prepend a header name; it appears as though you are
    expecting the `qname` variable to already have a properly
    formatted header after a newline character; you should really
    document that.

    The only other major change is using `getenv` to look for
    "clientid", and sending that in a header on CONNECT if found:

    + char *clientid = getenv("clientid");

    Generally, altering something like this init function based on
    some value ambiently in the environment is bad practice. Why
    not give this as an explicit paramter, and pass `NULL` if not
    needed? This places the responsbility back onto the caller,
    but the caller is presumably the thing that knows whether it
    needs persistence or not.

    Turning our attention to `vms_simple_stomp.c` and the added
    function there, we see this stanza:

    + char *qname2;
    + qname2 = in_cstr(qname);

    Recall that, internally, `in_cstr` uses malloc. But you do not
    check the return value from `malloc` to see whether it was
    successful or not when you call it there; assuming that malloc
    always succeeds is bad practice. In fact, this was also a
    problem with the previous version, but I neglected to mention it
    the other day.

    - Dan C.

    --- Synchronet 3.21a-Linux NewsLink 1.2