• Tale of a Fork

    From Rainer Weikusat@21:1/5 to All on Tue Jan 7 18:11:34 2025
    The purpose of this post (as with its predecessors although it has been a
    while) is to collect use-cases for fork which don't involve invoking
    exec in the forked process to run another program.

    A program I'm dealing with acts as system backend to a cloud-based
    (Heroku) web UI offering system configuration, monitoring and access
    features. Among other things, it supports interactive shell sessions
    based on an Angular terminal widget running in a browser¹. This program
    is supposed to support interruption-free upgrades in future, ie,
    replacing its running instance with an updated one without disconnecting
    active shell sessions. The following general algorithm will be used for
    that (not yet completely implemented).

    1. Switch from processing network input to buffering it.

    2. Wait for all outstanding requests to complete.

    3. Fork to have the old program running in a new process where it can
    continue to receive and buffer network input and will also keep the
    running state for later restoration.

    4. Exec the updated program in the original process.

    5. After that has completed enough of its initialization (established a
    WebSocket connection to Heroku) to be useful, contact the old program
    running in the forked process to determine the state information
    necessary to create shell &c management objects in the updated
    program's process and do this.

    6. Subscribe to all necessary ActionCable channels and start buffering
    network input.

    7. Contact the old program again to receive the buffered network
    input. After that has been sent, the old program will terminate itself.

    8. Process this input and then switch to processing new input received
    over the ActionCable bus.
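
    The fork and exec part of this (steps 3, 4 and 7) can be sketched
    roughly as follows. This is a minimal illustration under stated
    assumptions, not the actual program: Python stands in for Perl,
    NEW_PROGRAM is a made-up stand-in for the updated program, a socket
    pair is assumed as the channel between old and new program, and the
    continued buffering of network input in the forked process is left
    out. POSIX systems only.

```python
import os
import socket
import sys

# Hypothetical stand-in for the updated program. It is started via exec,
# gets the number of an inherited socket as argument and asks the old
# program for the input that was buffered during the switch-over.
NEW_PROGRAM = r"""
import os, socket, sys
ctl = socket.socket(fileno=int(sys.argv[1]))
ctl.sendall(b"GET")                                  # step 7: contact the old program
data = b"".join(iter(lambda: ctl.recv(4096), b""))   # read until the old program closes
print(os.getpid(), data.decode(), sep=":", end="")
"""


def upgrade(buffered: bytes) -> None:
    """Fork a copy of the old program, then exec the new one. Does not return."""
    old_end, new_end = socket.socketpair()
    if os.fork() == 0:
        # Forked process: the old program lives on here with a copy of its state.
        new_end.close()
        old_end.recv(16)            # block until the new program asks
        old_end.sendall(buffered)   # hand over what was buffered
        old_end.close()
        os._exit(0)                 # the old program terminates itself
    # Original process: replace the program, keep the process (step 4).
    old_end.close()
    new_end.set_inheritable(True)   # Python sockets are close-on-exec by default
    os.execv(sys.executable,
             [sys.executable, "-c", NEW_PROGRAM, str(new_end.fileno())])


def run_demo(buffered: bytes = b"uptime\n"):
    """Run upgrade() in a scratch process and capture what the new program prints."""
    out_r, out_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(out_r)
            os.dup2(out_w, 1)       # stdout of this process and its successor
            upgrade(buffered)
        finally:
            os._exit(1)             # only reached if something above failed
    os.close(out_w)
    with os.fdopen(out_r, "rb") as reader:
        output = reader.read()
    os.waitpid(pid, 0)
    return pid, output.decode()


if __name__ == "__main__":
    pid, output = run_demo()
    print(f"process {pid} now runs the new program, which received {output!r}")
```

    The process ID printed by the new program is the one of the process
    which called upgrade(): the program was replaced, the process was
    not.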

    This also relies on parent/child relations between processes not being
    affected by an exec and on both the forked process and the newly
    executed program in the old process inheriting open file descriptors
    from it.
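
    Both properties can be checked with a small experiment. The
    following is a sketch only, again in Python rather than Perl;
    AFTER_EXEC, fork_then_exec and capture are names invented for this
    example. Note that Python, unlike C, marks newly created descriptors
    close-on-exec, so the descriptor has to be made inheritable
    explicitly before the exec.

```python
import os
import sys

# Runs after the exec. This program never called fork itself, yet wait()
# reports the child forked before the exec, and the descriptor inherited
# across the exec is still readable.
AFTER_EXEC = r"""
import os, sys
fd, expected_child = int(sys.argv[1]), int(sys.argv[2])
message = os.read(fd, 100).decode()
pid, status = os.wait()
print(pid == expected_child, os.WEXITSTATUS(status), message)
"""


def fork_then_exec() -> None:
    """Fork a child, then replace the program in the parent. Does not return."""
    r, w = os.pipe()
    child = os.fork()               # the child gets copies of r and w
    if child == 0:
        os.close(r)
        os.write(w, b"written by the forked process")
        os._exit(7)
    os.close(w)
    # Python creates descriptors close-on-exec (PEP 446); in C they would
    # survive the exec unless FD_CLOEXEC had been set.
    os.set_inheritable(r, True)
    os.execv(sys.executable,
             [sys.executable, "-c", AFTER_EXEC, str(r), str(child)])


def capture(func) -> str:
    """Run func in a scratch process and return everything written to stdout."""
    out_r, out_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(out_r)
            os.dup2(out_w, 1)
            func()
        finally:
            os._exit(1)             # only reached if something above failed
    os.close(out_w)
    with os.fdopen(out_r) as reader:
        output = reader.read()
    os.waitpid(pid, 0)
    return output


if __name__ == "__main__":
    print(capture(fork_then_exec), end="")
```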

    ¹ Using ActionCable and JSON for all data communication --- we've come
    some miles since the Nagle-algorithm was invented to stop telnet
    sessions from causing congestion collapses.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Rainer Weikusat on Fri Jan 17 22:40:54 2025
    Rainer Weikusat <rweikusat@talktalk.net> writes:
    The purpose of this post (as with its predecessors although it has been a
    while) is to collect use-cases for fork which don't involve invoking
    exec in the forked process to run another program.

    [...]

    3. Fork to have the old program running in a new process where it can
    continue to receive and buffer network input and will also keep the
    running state for later restoration.

    This deserves a follow-up: Another feature this program also supports is
    file uploads. As ActionCable does unlimited buffering (read: buffers a
    lot, exact limit unknown) and it's supposed to be possible to cancel
    uploads, flow control is necessary here so that sender and receiver
    operate at least roughly in real-time wrt each other. This uses a
    fixed-window-based algorithm where the sender sends data blocks until it
    has completely filled the window and each data block is acknowledged by
    the receiver after it was actually written to the file. Each ack
    received by the sender increases the available window by one block.
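
    A toy simulation of such a fixed-window scheme might look like
    this. It is a sketch under assumptions and not the code of the
    program: the class and function names are invented, the "file" is a
    byte buffer in memory and the transport is a simple queue which
    delivers one block per round.

```python
from collections import deque


class Sender:
    def __init__(self, blocks, window):
        self.pending = deque(blocks)
        self.window = window            # blocks which may be sent without an ack

    def send_some(self):
        """Send blocks until the window is full or nothing is left."""
        sent = []
        while self.window > 0 and self.pending:
            sent.append(self.pending.popleft())
            self.window -= 1
        return sent

    def on_ack(self):
        self.window += 1                # each ack frees one block of window


class Receiver:
    def __init__(self):
        self.file = bytearray()

    def on_block(self, block):
        self.file += block              # "write to the file" first ...
        return True                     # ... and acknowledge only afterwards


def transfer(blocks, window):
    """Move all blocks to the receiver; return the file and the peak backlog."""
    if window < 1:
        raise ValueError("window must be at least one block")
    sender, receiver = Sender(blocks, window), Receiver()
    in_flight = deque()
    max_in_flight = 0
    while sender.pending or in_flight:
        in_flight.extend(sender.send_some())
        max_in_flight = max(max_in_flight, len(in_flight))
        if receiver.on_block(in_flight.popleft()):
            sender.on_ack()
    return bytes(receiver.file), max_in_flight


if __name__ == "__main__":
    print(transfer([b"a", b"b", b"c", b"d", b"e"], window=2))
```

    However slowly the receiver writes, the number of unacknowledged
    blocks never exceeds the window, which is what keeps a cancelled
    upload from leaving a large backlog behind.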

    ActionCable is essentially a chat protocol following a so-called pub/sub
    model. Clients can subscribe to so-called channels and will then receive
    everything published to such a channel after they have subscribed to
    it. This implies that something must remain subscribed to the channel
    and receive upload data messages while the updated program is still busy
    with initializing itself. Initialization requires

    1. Starting perl.
    2. Loading and compiling a lot of Perl code, roughly 10,000 LOC
    in total.
    3. Resolve the name of the cloud endpoint via DNS.
    4. Establish a TCP connection to it.
    5. Negotiate TLS.
    6. Do a WebSocket handshake to switch to WebSocket.
    7. Receive an ActionCable greeting message.
    8. Do a handshake in order to subscribe to the channel.
    9. For each file upload, do another handshake to subscribe to the
    file upload channel for this upload.

    Step 9 is necessary because ActionCable is essentially a virtual, dumb
    repeater, ie, it sends all messages received for a channel to all
    subscribers, including to the party which originally sent the message.
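
    The consequences of this model can be illustrated with a toy
    channel kept in memory. This is not the ActionCable API, just a
    sketch of the semantics described above; the Channel class and the
    subscriber names are made up for the example.

```python
class Channel:
    """Toy pub/sub channel: every message goes to every current subscriber."""

    def __init__(self):
        self.inboxes = {}

    def subscribe(self, name):
        """Register a subscriber and return the list its messages arrive in."""
        self.inboxes[name] = []
        return self.inboxes[name]

    def publish(self, message):
        # Dumb repeater: no filtering, so the publisher gets its own message too.
        for inbox in self.inboxes.values():
            inbox.append(message)


def demo():
    upload = Channel()
    sender = upload.subscribe("sender")
    old = upload.subscribe("old program")
    upload.publish("block 1")                 # sent while the new program starts up
    new = upload.subscribe("new program")     # has finished initializing only now
    upload.publish("block 2")
    return sender, old, new


if __name__ == "__main__":
    for inbox in demo():
        print(inbox)
```

    The late subscriber never sees "block 1", while the sender receives
    its own messages back.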

    As this takes (for computers) a considerable amount of time, it's vital
    that something remains subscribed to the channel in order to receive
    file data messages the sender sent because it still has window space
    available. That's done by the original program running in the forked
    process. In total, this arrangement works like a charm: Running file
    uploads just continue even though the original program running in the
    original process has meanwhile been completely replaced.

    The fork/exec split may have been invented accidentally (or rather, was
    invented accidentally) because of path-of-least-resistance programming in
    an early UNIX version but it's decidedly a genuine discovery: Two
    independent system primitives which can be combined into a whole which
    is more than just its parts. No wonder that people hate it so much.
