• acquire + sleep + async

    From fir@profesor.fir@gmail.com to comp.lang.c on Wed Jun 10 13:53:59 2026
    From Newsgroup: comp.lang.c

    I'm not experienced with multithreaded programming (I learned a bit of
    it 20 years ago, but decided it was not pleasant enough for me to pursue).

    I only tend to do things that seem reasonable to me, and I have some doubts.

    However, in my limited view of things, I can say the following.

    I.
    The sleep() function
    (which puts a CPU core to sleep for a given number of milliseconds or
    microseconds; by "put to sleep" I also mean that it may just switch the
    execution context to another thread for the given time) I FIND REASONABLE
    (I see no problem with that).

    II.
    An async call of a function (a call that spawns a thread running that
    function, so the main thread continues while the spawned thread also
    executes until it reaches some kind of async end when its job is done)
    I ALSO FIND reasonable (no problem with that).
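
    A minimal sketch of such an async call in C, assuming POSIX threads are
    available. The names async_call, async_start, async_wait and square_worker
    are hypothetical, chosen only for illustration. Note that pthread_create
    asks the operating system for a new thread and a new stack, so in practice
    this is heavier than a line or two of assembly.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical handle for one "async call": the thread plus its data. */
typedef struct {
    pthread_t thread;
    int input;
    int result;
} async_call;

/* The function that runs in the spawned thread. */
static void *square_worker(void *arg)
{
    async_call *call = arg;
    call->result = call->input * call->input;
    return NULL;
}

/* Start the call; the caller continues immediately. Returns 0 on success. */
static int async_start(async_call *call, int input)
{
    call->input = input;
    call->result = 0;
    return pthread_create(&call->thread, NULL, square_worker, call);
}

/* The "async end": block until the spawned thread has finished. */
static int async_wait(async_call *call)
{
    pthread_join(call->thread, NULL);
    return call->result;
}
```

    The caller would run async_start(&call, 7), do other work, and later
    collect the value with async_wait(&call).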

    III.

    There is a third thing. I call it acquire. It is some kind of assembly
    instruction, or pair of assembly instructions, with some guarantees.

    I mean acquire(x) should work like if(x==0) x=TID (thread id or something
    like that), so it is like a conditional move: mov eax, tid; movz x, eax;
    (where movz here would mean: mov if x is 0, otherwise set a CPU flag).

    This should be safe against several cores colliding on this access; I mean
    only one can acquire (I'm not sure whether present CPUs have it; they
    probably should, because it makes such an operation very cheap).

    NOTE: I'm maybe even less sure about this acquire, because theoretically
    such acquireless (and maybe even mostly sleepless) programming is probably
    possible, but I'm not sure whether acquire wouldn't be handy too.
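
    Current CPUs do offer this as an atomic compare-and-swap (on x86 it is the
    lock cmpxchg instruction), and C11 exposes it through <stdatomic.h>. A
    minimal sketch, assuming 0 means "free" and thread ids are non-zero; the
    names owner_t, try_acquire and release are hypothetical:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* 0 means "free"; any other value is the id of the owning thread. */
typedef atomic_int owner_t;

/* Atomically: if (*x == 0) { *x = tid; return true; } else return false.
   tid must be non-zero. */
static bool try_acquire(owner_t *x, int tid)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        x, &expected, tid, memory_order_acquire, memory_order_relaxed);
}

/* Give the variable back; only the current owner should call this. */
static void release(owner_t *x)
{
    atomic_store_explicit(x, 0, memory_order_release);
}
```

    Only one of several threads racing on try_acquire(&x, tid) can see the
    value 0 and replace it; all others get false.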


    THOSE 3 "PRIMITIVES" are, in my opinion, quite A SET for doing multithreaded
    programming "by hand". I also think they should be very lightweight;
    I mean an async call should probably be very light (and probably it is; it
    is a matter of a line of assembly or two, I hope), and the same goes for
    sleep (as in fact sleep and async call are close things).

    So I would hope that it should be just a set of a few assembly opcodes
    (unless I have overlooked something and something heavier is needed).

    The conclusion is what I already said: it seems that those 3 things
    (primitives) allow doing a lot of multithreading without bloated
    libraries, I hope.

    Correct me if I'm wrong.

    (I'm also not saying they are the best way of doing multithreading, but if
    they are lightweight, it seems at least minimally reasonable in some way.)

    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From fir@profesor.fir@gmail.com to comp.lang.c on Wed Jun 10 16:03:28 2026
    From Newsgroup: comp.lang.c

    BTW, if someone would like to implement a common pattern:

    if(acquire(x)) //or acquire(x) {} else {}
    {
        //do something
        release(x);
    }
    else
    {
        sleep(0.01);
        continue;
    }

    then a need for "continue" appears, which I find rather okay.

    In my opinion this

    if(x<10)
    {
        x++; continue;
    }

    probably looks better than while()
    (though the word "continue" is maybe not the most fortunate; it should be
    more like "repeat")
    (though it all should be rethought; maybe there is an opportunity here for
    a better form of loop to be born).
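
    The acquire / release / sleep-and-retry pattern above can be sketched in
    standard C plus POSIX as follows. This is only an illustration under
    assumptions: sleep(0.01) is read as 10 milliseconds, the "continue" is an
    ordinary loop retry, and the names lock_word, shared_counter, sleep_ms,
    worker and run_two_workers are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <time.h>

static atomic_int lock_word;   /* 0 = free, otherwise id of the owner */
static long shared_counter;    /* only touched while lock_word is held */

static void sleep_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

static void *worker(void *arg)
{
    int tid = *(int *)arg;
    for (int done = 0; done < 1000; ) {
        int expected = 0;
        if (atomic_compare_exchange_strong(&lock_word, &expected, tid)) {
            shared_counter++;              /* do something */
            atomic_store(&lock_word, 0);   /* release(x) */
            done++;
        } else {
            sleep_ms(10);                  /* sleep(0.01), then retry */
        }
    }
    return NULL;
}

/* Runs two workers that each add 1000 to the counter under the lock. */
static long run_two_workers(void)
{
    pthread_t a, b;
    int id_a = 1, id_b = 2;
    shared_counter = 0;
    pthread_create(&a, NULL, worker, &id_a);
    pthread_create(&b, NULL, worker, &id_b);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_counter;
}
```

    Because every increment happens while the lock word is held, no update is
    lost, whichever thread wins each race.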
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c on Thu Jun 11 16:44:46 2026
    From Newsgroup: comp.lang.c

    On 11.06.2026 at 13:39, David Brown wrote:

    > There are stackless coroutine libraries for C that work entirely from
    > the pre-processor, with similar limitations to the C++ coroutines (like
    > not being able to yield from normal functions called from the
    > coroutine). There are also stackful coroutine libraries, which are
    > more flexible but have higher overhead.

    Yes, ... from the pre-processor; how comfortable is debugging with
    that? Forget it.

    > C coroutine solutions require more manual coding than C++ for tracking
    > local variables that must be preserved between calls/yields, but on the
    > other hand they don't require the dog's breakfast of boilerplate classes
    > and code that C++ coroutines need.

    In the end it's less work in C++ and the debugging is comfortable, at
    least with VC++, i.e. you can inspect the coroutine frame when the
    coroutine is not running (CLion can't do that).
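
    For readers who have not seen the preprocessor approach, here is a
    minimal sketch of a stackless coroutine in C using the switch-on-__LINE__
    trick. The macro names CO_BEGIN, CO_YIELD, CO_END and the counter_co type
    are hypothetical, not taken from any particular library. It shows both
    points made above: locals that must survive a yield are moved by hand into
    a struct, and a yield can only appear directly in the coroutine function
    (and not inside a nested switch).

```c
#include <assert.h>

/* Hypothetical macros; the coroutine state lives in a caller-owned struct. */
#define CO_BEGIN(co)     switch ((co)->resume_point) { case 0:
#define CO_YIELD(co, v)  do { (co)->resume_point = __LINE__; return (v); \
                              case __LINE__:; } while (0)
#define CO_END(co)       } (co)->resume_point = -1

typedef struct {
    int resume_point;   /* 0 = not started, -1 = finished */
    int i;              /* "local" variable preserved across yields */
    int limit;
} counter_co;

/* Yields 0, 1, ..., limit-1 on successive calls, then -1 forever. */
static int counter_next(counter_co *co)
{
    CO_BEGIN(co);
    for (co->i = 0; co->i < co->limit; co->i++) {
        CO_YIELD(co, co->i);
    }
    CO_END(co);
    return -1;
}
```

    Each call re-enters the function and the switch jumps straight to the
    case label left behind by the last yield.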

    > It is still an astoundingly ignorant claim. If you had said C had not
    > changed much since 1999, it would be reasonable - though C23 has quite a
    > number of new features. And even then it would only be in reference to
    > the C standards - the C ecosystem has changed enormously.

    It has hardly changed if you compare that to the evolution of C++.

  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Thu Jun 11 14:18:09 2026
    From Newsgroup: comp.lang.c

    On 6/10/2026 9:40 PM, Lawrence D’Oliveiro wrote:
    > On Wed, 10 Jun 2026 13:53:59 +0200, fir wrote:
    >
    >> I'm not experienced with multithreaded programming ...
    >
    > It does tend to be error-prone.
    >
    > That’s why it is best reserved for CPU-intensive tasks that can
    > benefit from running a bunch of cores at once.
    >
    > For cases where the bottleneck is in the I/O or the network connection
    > (which is a lot of them), threading is typically unnecessary. Instead,
    > the popular approach nowadays is to use coroutines.

    Not sure why you say that. Threads work perfectly fine with I/O, network
    connections, etc. Do you think sync between processes is easier?

    > <https://developer.mozilla.org/en-US/docs/Learn_web_development/Extensions/Async_JS/Introducing>
    > <https://docs.python.org/3/library/asyncio.html>

  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Thu Jun 11 15:56:20 2026
    From Newsgroup: comp.lang.c

    On 6/11/2026 12:45 AM, Lawrence D’Oliveiro wrote:
    > On Thu, 11 Jun 2026 09:21:13 +0200, Bonita Montero wrote:
    >
    >> C hasn't been improved much since 1973. You still stick with the
    >> same low-level view.
    >
    > Actually, just as you can use threading APIs with C, you can use
    > stackful coroutine libraries as well.
    >
    > Stackless ones are another matter.

    Want to go to the fiber level? I have been there, done that. Fibers
    riding threads... Fine, but you have to know what you are doing.
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Thu Jun 11 16:01:47 2026
    From Newsgroup: comp.lang.c

    On 6/11/2026 1:44 AM, fir wrote:
    > Bonita Montero wrote:
    >> On 11.06.2026 at 09:52, David Brown wrote:
    >>
    >>> There are no coroutines in the C standard. There are a wide variety
    >>> of coroutine implementations in C, with different mixtures of
    >>> features, limitations, efficiencies, portability and requirements. ...
    >>
    >> Real coroutines aren't possible with C since with native coroutines you
    >> need functions that can be suspended and resumed.
    >
    > C lacks some low-level core management, in my opinion; this is more
    > visible in multicore times. But even in the old times there was something
    > like interrupts, which allowed (theoretically, but probably also
    > practically) freezing a branch and resuming it, so some mechanics of
    > multithreading existed even on single-core machines. I remember it was
    > really there on the 8-bit C64.
    >
    >>>> C hasn't been improved much since 1973.
    >>>
    >>> What an astoundingly ignorant claim.
    >>
    >> Compared to C++ that's true.

    Wrt per-core, well, in user mode we can try to use affinity masks for
    that. But! It's not guaranteed. So, per thread is pretty low level; per
    fiber on threads is, well, kind of a different story.
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Fri Jun 12 00:12:48 2026
    From Newsgroup: comp.lang.c

    On Thu, 11 Jun 2026 14:18:09 -0700, Chris M. Thomasson wrote:

    > On 6/10/2026 9:40 PM, Lawrence D’Oliveiro wrote:
    >
    >> For cases where the bottleneck is in the I/O or the network
    >> connection (which is a lot of them), threading is typically
    >> unnecessary. Instead, the popular approach nowadays is to use
    >> coroutines.
    >
    > Not sure why you say that.

    It does tend to be error-prone.

    That’s why it is best reserved for CPU-intensive tasks that can
    benefit from running a bunch of cores at once.
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Fri Jun 12 00:13:38 2026
    From Newsgroup: comp.lang.c

    On Thu, 11 Jun 2026 11:38:23 +0200, Bonita Montero wrote:

    > Coroutines are the most elegant way to handle finite state machines.
    > They need explicit language support and can't be handled exclusively
    > with libraries.

    Stackful coroutines ... I don’t see why not.
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Fri Jun 12 00:14:08 2026
    From Newsgroup: comp.lang.c

    On Thu, 11 Jun 2026 10:28:55 +0200, Bonita Montero wrote:

    > Real coroutines aren't possible with C since with native coroutines
    > you need functions that can be suspended and resumed.

    Just think of them as non-preemptive threads.
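
    A minimal sketch of that view in C, assuming a POSIX system that still
    ships <ucontext.h> (glibc does; the interface is marked obsolescent in
    POSIX). The names co_start, co_resume, co_yield_value, helper and co_body
    are hypothetical. The coroutine has its own stack and gives up control
    only where it explicitly yields, including from inside an ordinary nested
    function call.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx, co_ctx;
static char co_stack[64 * 1024];   /* the coroutine's own stack */
static int yielded_value;
static int co_finished;

/* May be called at any call depth inside the coroutine. */
static void co_yield_value(int v)
{
    yielded_value = v;
    swapcontext(&co_ctx, &main_ctx);   /* switch back to the resumer */
}

static void helper(int base)
{
    co_yield_value(base + 1);   /* yield from a nested, normal function */
    co_yield_value(base + 2);
}

static void co_body(void)
{
    helper(10);
    helper(20);
    co_finished = 1;            /* returning continues at uc_link */
}

static void co_start(void)
{
    getcontext(&co_ctx);
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof co_stack;
    co_ctx.uc_link = &main_ctx;
    co_finished = 0;
    makecontext(&co_ctx, co_body, 0);
}

/* Runs the coroutine until its next yield; returns 0 once it has finished. */
static int co_resume(int *out)
{
    swapcontext(&main_ctx, &co_ctx);
    if (co_finished)
        return 0;
    *out = yielded_value;
    return 1;
}
```

    No scheduler interrupts co_body; control moves only at the swapcontext
    calls, which is what makes it non-preemptive.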
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Fri Jun 12 01:32:35 2026
    From Newsgroup: comp.lang.c

    On Thu, 11 Jun 2026 11:36:12 +0200, Bonita Montero wrote:

    > Coroutines have nothing to do with multithreaded programming, but
    > they can be used to have sth. more lightweight than threading.

    I don’t know what’s “heavyweight” about threading, beyond the stack
    allocations. You have exactly the same thing in (stackful) coroutines.
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Fri Jun 12 01:33:39 2026
    From Newsgroup: comp.lang.c

    On Thu, 11 Jun 2026 16:01:47 -0700, Chris M. Thomasson wrote:

    > ... per fibers on threads is a, well, kind of a different story.

    Are these “fibre” things just some kind of runtime abstraction built
    on top of OS threads?
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Thu Jun 11 21:05:29 2026
    From Newsgroup: comp.lang.c

    On 6/11/2026 6:33 PM, Lawrence D’Oliveiro wrote:
    > On Thu, 11 Jun 2026 16:01:47 -0700, Chris M. Thomasson wrote:
    >
    >> ... per fibers on threads is a, well, kind of a different story.
    >
    > Are these “fibre” things just some kind of runtime abstraction built
    > on top of OS threads?

    Basically. You need to make your own scheduler for the fibers on the
    threads. Fwiw:

    https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-createfiberex
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Thu Jun 11 21:06:08 2026
    From Newsgroup: comp.lang.c

    On 6/11/2026 5:12 PM, Lawrence D’Oliveiro wrote:
    > On Thu, 11 Jun 2026 14:18:09 -0700, Chris M. Thomasson wrote:
    >
    >> On 6/10/2026 9:40 PM, Lawrence D’Oliveiro wrote:
    >>
    >>> For cases where the bottleneck is in the I/O or the network
    >>> connection (which is a lot of them), threading is typically
    >>> unnecessary. Instead, the popular approach nowadays is to use
    >>> coroutines.
    >>
    >> Not sure why you say that.
    >
    > It does tend to be error-prone.

    Well, shit happens. :^)

    > That’s why it is best reserved for CPU-intensive tasks that can
    > benefit from running a bunch of cores at once.
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Thu Jun 11 21:11:13 2026
    From Newsgroup: comp.lang.c

    On 6/11/2026 4:39 AM, David Brown wrote:
    > [...]
    > C coroutine solutions require more manual coding than C++ for tracking
    > local variables that must be preserved between calls/yields, but on the
    > other hand they don't require the dog's breakfast of boilerplate classes
    > and code that C++ coroutines need.
    > [...]

    FWIW, check this shit out. Coroutines can be emulated even in BASIC.
    This is a recursive stack here. It's all manual... :^)


    100 REM ct_vfield_applesoft_basic
    110 HOME
    120 HGR: HCOLOR = 3: VTAB 22
    130 PRINT "ct_vfield_applesoft_basic"
    140 GOSUB 1000
    150 GOSUB 3000
    160 SP = 0
    170 RS(SP, 0) = 0
    180 RS(SP, 1) = -1
    190 RS(SP, 2) = 0
    200 RS(SP, 3) = 1
    210 RS(SP, 4) = 0
    220 GOSUB 8000
    230 V1(1) = 0: V1(2) = 0: V1(3) = 1: V1(4) = 128
    240 GOSUB 6000
    245 PRINT "Chris Thomasson's Koch Complete!"
    250 END

    1000 REM ct_init
    1010 PRINT "ct_init"
    1020 DIM A0(6)
    1030 DIM V0(4)
    1040 DIM V1(4)
    1050 DIM V2(4)
    1060 DIM V3(4)
    1070 DIM V4(4)
    1080 DIM V5(4)
    1090 RN = 3
    1100 DIM RS(RN, 16)
    1110 GOSUB 2000
    1120 RETURN

    2000 REM ct_init_plane
    2010 PRINT "ct_init_plane"
    2020 A0(1) = 279: REM m_plane.m_width
    2030 A0(2) = 191: REM m_plane.m_height
    2040 A0(3) = 0.0126106: REM m_plane.m_xstep
    2050 A0(4) = 0.0126316: REM m_plane.m_ystep
    2060 A0(5) = -1.75288: REM m_plane.m_axes.m_xmin
    2070 A0(6) = 1.2: REM m_plane.m_axes.m_ymax
    2080 RETURN

    3000 REM ct_display_plane
    3010 PRINT "ct_display_plane"
    3020 FOR I0 = 1 TO 6
    3030 PRINT "A0("; I0; ") = " A0(I0)
    3040 NEXT I0
    3050 RETURN

    4000 REM ct_project_point
    4010 REM PRINT "ct_project_point"
    4020 V0(3) = (V0(1) - A0(5)) / A0(3)
    4030 V0(4) = (A0(6) - V0(2)) / A0(4)
    4040 IF V0(3) < 0 THEN V0(3) = INT(V0(3) - .5)
    4050 IF V0(3) >= 0 THEN V0(3) = INT(V0(3) + .5)
    4060 IF V0(4) < 0 THEN V0(4) = INT(V0(4) - .5)
    4070 IF V0(4) >= 0 THEN V0(4) = INT(V0(4) + .5)
    4080 RETURN

    5000 REM ct_plot_point
    5010 REM PRINT "ct_plot_point"
    5020 GOSUB 4000
    5030 IF V0(3) > -1 AND V0(3) <= A0(1) AND V0(4) > -1 AND V0(4) <= A0(2) THEN HPLOT V0(3), V0(4)
    5040 RETURN

    6000 REM ct_plot_circle
    6010 PRINT "ct_plot_circle"
    6020 AB = 6.28318 / V1(4)
    6030 FOR I1 = 0 TO 6.28318 STEP AB
    6040 V0(1) = V1(1) + COS(I1) * V1(3)
    6050 V0(2) = V1(2) + SIN(I1) * V1(3)
    6060 GOSUB 5000
    6070 NEXT I1
    6080 RETURN

    7000 REM ct_plot_line
    7010 PRINT "ct_plot_line"
    7020 V0(1) = V5(1): V0(2) = V5(2)
    7030 GOSUB 4000
    7040 IF V0(3) < 0 THEN V0(3) = 0
    7050 IF V0(3) > A0(1) THEN V0(3) = A0(1)
    7060 IF V0(4) < 0 THEN V0(4) = 0
    7070 IF V0(4) > A0(2) THEN V0(4) = A0(2)
    7080 HPLOT V0(3), V0(4)
    7090 V0(1) = V5(3): V0(2) = V5(4)
    7100 GOSUB 4000
    7110 IF V0(3) < 0 THEN V0(3) = 0
    7120 IF V0(3) > A0(1) THEN V0(3) = A0(1)
    7130 IF V0(4) < 0 THEN V0(4) = 0
    7140 IF V0(4) > A0(2) THEN V0(4) = A0(2)
    7150 HPLOT TO V0(3), V0(4)
    7160 RETURN

    8000 REM ct_koch
    8010 IF RS(SP, 0) >= RN THEN RETURN
    8020 PRINT "ct_koch = "; RS(SP, 0); " "; RS(SP, 1); " "; RS(SP, 2); " "; RS(SP, 3); " "; RS(SP, 4)
    8030 RS(SP, 5) = RS(SP, 3) - RS(SP, 1) : REM difx
    8040 RS(SP, 6) = RS(SP, 4) - RS(SP, 2) : REM dify
    8050 RS(SP, 7) = RS(SP, 1) + RS(SP, 5) / 2 : REM midx
    8060 RS(SP, 8) = RS(SP, 2) + RS(SP, 6) / 2 : REM midy
    8070 RS(SP, 9) = -RS(SP, 6) : REM perpx
    8080 RS(SP, 10) = RS(SP, 5) : REM perpy
    8090 RS(SP, 11) = RS(SP, 7) + RS(SP, 9) / 3 : REM tipx
    8100 RS(SP, 12) = RS(SP, 8) + RS(SP, 10) / 3 : REM tipy
    8110 RS(SP, 13) = RS(SP, 1) + RS(SP, 5) / 3 : REM k0x
    8120 RS(SP, 14) = RS(SP, 2) + RS(SP, 6) / 3 : REM k0y
    8130 RS(SP, 15) = RS(SP, 3) - RS(SP, 5) / 3 : REM k1x
    8140 RS(SP, 16) = RS(SP, 4) - RS(SP, 6) / 3 : REM k1y

    8145 IF RS(SP, 0) < RN - 1 GOTO 8230
    8150 V5(1) = RS(SP, 1): V5(2) = RS(SP, 2): V5(3) = RS(SP, 13): V5(4) = RS(SP, 14)
    8160 GOSUB 7000
    8170 V5(1) = RS(SP, 13): V5(2) = RS(SP, 14): V5(3) = RS(SP, 11): V5(4) = RS(SP, 12)
    8180 GOSUB 7000
    8190 V5(1) = RS(SP, 11): V5(2) = RS(SP, 12): V5(3) = RS(SP, 15): V5(4) = RS(SP, 16)
    8200 GOSUB 7000
    8210 V5(1) = RS(SP, 15): V5(2) = RS(SP, 16): V5(3) = RS(SP, 3): V5(4) = RS(SP, 4)
    8220 GOSUB 7000

    8230 REM line 0
    8240 SP = SP + 1
    8250 RS(SP, 0) = RS(SP - 1, 0) + 1
    8260 RS(SP, 1) = RS(SP - 1, 1)
    8270 RS(SP, 2) = RS(SP - 1, 2)
    8280 RS(SP, 3) = RS(SP - 1, 13)
    8290 RS(SP, 4) = RS(SP - 1, 14)
    8300 GOSUB 8000
    8310 SP = SP - 1
    8320 REM line 1
    8330 SP = SP + 1
    8340 RS(SP, 0) = RS(SP - 1, 0) + 1
    8350 RS(SP, 1) = RS(SP - 1, 13)
    8360 RS(SP, 2) = RS(SP - 1, 14)
    8370 RS(SP, 3) = RS(SP - 1, 11)
    8380 RS(SP, 4) = RS(SP - 1, 12)
    8390 GOSUB 8000
    8400 SP = SP - 1
    8410 REM line 2
    8420 SP = SP + 1
    8430 RS(SP, 0) = RS(SP - 1, 0) + 1
    8440 RS(SP, 1) = RS(SP - 1, 11)
    8450 RS(SP, 2) = RS(SP - 1, 12)
    8460 RS(SP, 3) = RS(SP - 1, 15)
    8470 RS(SP, 4) = RS(SP - 1, 16)
    8480 GOSUB 8000
    8490 SP = SP - 1
    8500 REM line 3
    8510 SP = SP + 1
    8520 RS(SP, 0) = RS(SP - 1, 0) + 1
    8530 RS(SP, 1) = RS(SP - 1, 15)
    8540 RS(SP, 2) = RS(SP - 1, 16)
    8550 RS(SP, 3) = RS(SP - 1, 3)
    8560 RS(SP, 4) = RS(SP - 1, 4)
    8570 GOSUB 8000
    8580 SP = SP - 1
    8590 RETURN
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Thu Jun 11 21:16:48 2026
    From Newsgroup: comp.lang.c

    On 6/11/2026 5:14 PM, Lawrence D’Oliveiro wrote:
    > On Thu, 11 Jun 2026 10:28:55 +0200, Bonita Montero wrote:
    >
    >> Real coroutines aren't possible with C since with native coroutines
    >> you need functions that can be suspended and resumed.
    >
    > Just think of them as non-preemptive threads.

    Windows has a fairly nice way to get at them, that is if you want them
    at all...

    https://learn.microsoft.com/en-us/windows/win32/procthread/fibers

    https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-createfiberex

    https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-convertthreadtofiberex


  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c on Fri Jun 12 06:16:42 2026
    From Newsgroup: comp.lang.c

    On 12.06.2026 at 03:32, Lawrence D’Oliveiro wrote:

    > I don’t know what’s “heavyweight” about threading, beyond the stack
    > allocations. ...

    Yes, the allocation of the stack is very expensive. And the
    context switch between threads is orders of magnitude more expensive
    than switching between two coroutine contexts.

  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Thu Jun 11 21:19:45 2026
    From Newsgroup: comp.lang.c

    On 6/11/2026 9:16 PM, Bonita Montero wrote:
    > On 12.06.2026 at 03:32, Lawrence D’Oliveiro wrote:
    >
    >> I don’t know what’s “heavyweight” about threading, beyond the stack
    >> allocations. ...
    >
    > Yes, the allocation of the stack is very expensive. And the
    > context switch between threads is orders of magnitude more expensive
    > than switching between two coroutine contexts.

    The fibers float along the threads... ;^)

    Anyway, it's been a while since I used them. IIRC it was for a sorting algo.
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c on Fri Jun 12 13:15:59 2026
    From Newsgroup: comp.lang.c

    On 12.06.2026 at 06:19, Chris M. Thomasson wrote:

    > The fibers float along the threads... ;^)

    If you have as many fibers as you would otherwise have coroutines,
    and the number is large, you waste a lot of memory.
    The only difference is that with fibers you can switch the context
    from any function.
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Fri Jun 12 23:52:32 2026
    From Newsgroup: comp.lang.c

    On Fri, 12 Jun 2026 06:16:42 +0200, Bonita Montero wrote:

    > On 12.06.2026 at 03:32, Lawrence D’Oliveiro wrote:
    >
    >> I don’t know what’s “heavyweight” about threading, beyond the stack
    >> allocations. ...
    >
    > Yes, the allocation of the stack is very expensive. And the
    > context switch between threads is orders of magnitude more expensive
    > than switching between two coroutine contexts.

    Why should that be? What extra overhead is there in context-switching
    between preemptive threads, versus non-preemptive ones?
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Fri Jun 12 23:54:57 2026
    From Newsgroup: comp.lang.c

    On Thu, 11 Jun 2026 21:05:29 -0700, Chris M. Thomasson wrote:

    > On 6/11/2026 6:33 PM, Lawrence D’Oliveiro wrote:
    >
    >> Are these “fibre” things just some kind of runtime abstraction
    >> built on top of OS threads?
    >
    > Basically. You need to make your own scheduler for the fiber on the
    > threads.

    Been there, done that. Actually, the OS concerned (VAX/VMS) didn’t
    even support threads at the time: but it had asynchronous I/O and
    timer completion routines, and I was able to build a fully-preemptive
    threading model on top of that.
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Sat Jun 13 00:01:21 2026
    From Newsgroup: comp.lang.c

    On Thu, 11 Jun 2026 21:11:13 -0700, Chris M. Thomasson wrote:

    > FWIW, check this shit out. Coroutines can be emulated even in BASIC.
    > This is a recursive stack here. Its all manual... :^)
    >
    > [code omitted]

    I see a lot of “GOSUB 8000” within the subroutine starting at line
    8000, so that’s a lot of recursion, not coroutines. Plus you’ve got
    the explicit RS array stack for assembling the components of the
    curve. Normally you would have either recursion or an explicit
    stack, since each one subsumes the functions of the other, so there
    shouldn’t be a need for both.

    In other words, I think you’ve got the worst of both worlds? Recursion
    that isn’t enough to solve the problem purely recursively, plus an
    explicit stack that isn’t enough to keep track of the entire stack
    state?
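
    To illustrate the explicit-stack-only alternative, here is a minimal
    sketch in C that subdivides a Koch-style segment with no recursion at
    all. It reuses the subdivision arithmetic of the posted BASIC (thirds
    plus a tip at midpoint + perpendicular/3) and its starting segment from
    (-1, 0) to (1, 0); the names segment, koch_split and koch_count_leaves
    are hypothetical, and instead of plotting it only counts the segments it
    would draw.

```c
#include <assert.h>

#define MAX_DEPTH 8

typedef struct {
    int depth;
    double x0, y0, x1, y1;
} segment;

/* Split one segment into its four Koch sub-segments. */
static void koch_split(const segment *s, segment out[4])
{
    double dx = s->x1 - s->x0, dy = s->y1 - s->y0;
    double k0x = s->x0 + dx / 3, k0y = s->y0 + dy / 3;
    double k1x = s->x1 - dx / 3, k1y = s->y1 - dy / 3;
    double tipx = s->x0 + dx / 2 - dy / 3;   /* midpoint + perp / 3 */
    double tipy = s->y0 + dy / 2 + dx / 3;
    int d = s->depth + 1;

    out[0] = (segment){ d, s->x0, s->y0, k0x, k0y };
    out[1] = (segment){ d, k0x, k0y, tipx, tipy };
    out[2] = (segment){ d, tipx, tipy, k1x, k1y };
    out[3] = (segment){ d, k1x, k1y, s->x1, s->y1 };
}

/* Returns how many leaf segments exist at max_depth, or -1 if out of range. */
static long koch_count_leaves(int max_depth)
{
    segment stack[3 * MAX_DEPTH + 1];   /* each level adds at most 3 entries */
    int sp = 0;
    long leaves = 0;

    if (max_depth < 0 || max_depth > MAX_DEPTH)
        return -1;

    stack[sp++] = (segment){ 0, -1.0, 0.0, 1.0, 0.0 };
    while (sp > 0) {
        segment s = stack[--sp];
        if (s.depth == max_depth) {
            leaves++;                    /* a real program would plot s here */
            continue;
        }
        segment parts[4];
        koch_split(&s, parts);
        for (int i = 3; i >= 0; i--)     /* reverse push: part 0 pops first */
            stack[sp++] = parts[i];
    }
    return leaves;
}
```

    The array holds everything a recursive version would keep in its call
    frames, so no recursive call is needed.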
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c on Sat Jun 13 05:05:23 2026
    From Newsgroup: comp.lang.c

    On 13.06.2026 at 01:52, Lawrence D’Oliveiro wrote:

    > On Fri, 12 Jun 2026 06:16:42 +0200, Bonita Montero wrote:
    >
    >> Yes, the allocation of the stack is very expensive. And the
    >> context switch between threads is orders of magnitude more expensive
    >> than switching between two coroutine contexts.
    >
    > Why should that be? What extra overhead is there in context-switching
    > between preemptive threads, versus non-preemptive ones?

    Because with threading the context switch happens inside the kernel.
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Sat Jun 13 14:52:31 2026
    From Newsgroup: comp.lang.c

    Bonita Montero <Bonita.Montero@gmail.com> writes:
    > On 13.06.2026 at 01:52, Lawrence D’Oliveiro wrote:
    >
    >> On Fri, 12 Jun 2026 06:16:42 +0200, Bonita Montero wrote:
    >>
    >>> Yes, the allocation of the stack is very expensive. And the
    >>> context switch between threads is orders of magnitude more expensive
    >>> than switching between two coroutine contexts.
    >>
    >> Why should that be? What extra overhead is there in context-switching
    >> between preemptive threads, versus non-preemptive ones?
    >
    > Because with threading the context switch happens inside the kernel.

    Some implementations of threading context-switch in the kernel. That's
    not true of all implementations.
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Sat Jun 13 13:46:55 2026
    From Newsgroup: comp.lang.c

    On 6/12/2026 5:01 PM, Lawrence D’Oliveiro wrote:
    > On Thu, 11 Jun 2026 21:11:13 -0700, Chris M. Thomasson wrote:
    >
    >> FWIW, check this shit out. Coroutines can be emulated even in BASIC.
    >> This is a recursive stack here. Its all manual... :^)
    >>
    >> [code omitted]
    >
    > I see a lot of “GOSUB 8000” within the subroutine starting at line
    > 8000, so that’s a lot of recursion, not coroutines. Plus you’ve got
    > the explicit RS array stack for assembling the components of the
    > curve. Normally you would have either recursion or an explicit
    > stack, since each one subsumes the functions of the other, so there
    > shouldn’t be a need for both.
    >
    > In other words, I think you’ve got the worst of both worlds? Recursion
    > that isn’t enough to solve the problem purely recursively, plus an
    > explicit stack that isn’t enough to keep track of the entire stack
    > state?

    It's a recursive Koch fractal using Applesoft BASIC. Run it on an
    emulator. Say, https://www.calormen.com/jsbasic/ ?
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Sun Jun 14 00:32:40 2026
    From Newsgroup: comp.lang.c

    On Sat, 13 Jun 2026 13:46:55 -0700, Chris M. Thomasson wrote:

    Its a recursive koch fractal using AppleSoft basic.

    Yes, that was pretty clear. As was the fact that you were able to get
    it working, not *because of* your choice of BASIC to write it in, but
    *in spite of* that.
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Tue Jun 16 00:12:43 2026
    From Newsgroup: comp.lang.c

    On Sat, 13 Jun 2026 05:05:23 +0200, Bonita Montero wrote:

    > On 13.06.2026 at 01:52, Lawrence D’Oliveiro wrote:
    >
    >> On Fri, 12 Jun 2026 06:16:42 +0200, Bonita Montero wrote:
    >>
    >>> Yes, the allocation of the stack is very expensive. And the
    >>> context switch between threads is orders of magnitude more expensive
    >>> than switching between two coroutine contexts.
    >>
    >> Why should that be? What extra overhead is there in
    >> context-switching between preemptive threads, versus non-preemptive
    >> ones?
    >
    > Because with threading the context switch happens inside the kernel.

    Microsoft Windows problems again?
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Tue Jun 16 12:22:24 2026
    From Newsgroup: comp.lang.c

    On 6/15/2026 5:12 PM, Lawrence D’Oliveiro wrote:
    > On Sat, 13 Jun 2026 05:05:23 +0200, Bonita Montero wrote:
    >
    >> On 13.06.2026 at 01:52, Lawrence D’Oliveiro wrote:
    >>
    >>> On Fri, 12 Jun 2026 06:16:42 +0200, Bonita Montero wrote:
    >>>
    >>>> Yes, the allocation of the stack is very expensive. And the
    >>>> context switch between threads is orders of magnitude more expensive
    >>>> than switching between two coroutine contexts.
    >>>
    >>> Why should that be? What extra overhead is there in
    >>> context-switching between preemptive threads, versus non-preemptive
    >>> ones?
    >>
    >> Because with threading the context switch happens inside the kernel.
    >
    > Microsoft Windows problems again?

    Microsoft Windows problems? What do you mean? Preemptive threads, say
    POSIX threads, are going to have the same issues. Right?
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Tue Jun 16 12:23:32 2026
    From Newsgroup: comp.lang.c

    On 6/13/2026 5:32 PM, Lawrence D’Oliveiro wrote:
    > On Sat, 13 Jun 2026 13:46:55 -0700, Chris M. Thomasson wrote:
    >
    >> Its a recursive koch fractal using AppleSoft basic.
    >
    > Yes, that was pretty clear. As was the fact that you were able to get
    > it working, not *because of* your choice of BASIC to write it in, but
    > *in spite of* that.

    It has current stack space. So, with a little work it should be workable
    for continuations.
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Tue Jun 16 12:25:12 2026
    From Newsgroup: comp.lang.c

    On 6/11/2026 9:06 PM, Chris M. Thomasson wrote:
    > On 6/11/2026 5:12 PM, Lawrence D’Oliveiro wrote:
    >> On Thu, 11 Jun 2026 14:18:09 -0700, Chris M. Thomasson wrote:
    >>
    >>> On 6/10/2026 9:40 PM, Lawrence D’Oliveiro wrote:
    >>>
    >>>> For cases where the bottleneck is in the I/O or the network
    >>>> connection (which is a lot of them), threading is typically
    >>>> unnecessary. Instead, the popular approach nowadays is to use
    >>>> coroutines.
    >>>
    >>> Not sure why you say that.
    >>
    >> It does tend to be error-prone.
    >
    > Well, shit happens. :^)

    Error prone... Well, C is not the lang that takes the corks off the forks:

    (Dirty Rotten Scoundrels (1988) - Dinner With Ruprecht Scene (6/12) | Movieclips)

    https://youtu.be/SKDX-qJaJ08

    rofl!

    >> That’s why it is best reserved for CPU-intensive tasks that can
    >> benefit from running a bunch of cores at once.
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Wed Jun 17 03:07:28 2026
    From Newsgroup: comp.lang.c

    On Tue, 16 Jun 2026 12:23:32 -0700, Chris M. Thomasson wrote:

    On 6/13/2026 5:32 PM, Lawrence DrCOOliveiro wrote:

    On Sat, 13 Jun 2026 13:46:55 -0700, Chris M. Thomasson wrote:

    Its a recursive koch fractal using AppleSoft basic.

    Yes, that was pretty clear. As was the fact that you were able to
    get it working, not *because of* your choice of BASIC to write it
    in, but *in spite of* that.

    It has current stack space.

    But being BASIC, it only has fixed-length arrays to use as the stack,
    doesn’t it?

    So, with a little work it should be workable for continuations.

    I added continuations to my toy PostScript-revival language. I soon
    discovered that having a dedicated stack area was a bad idea. So what
    happens is call frames are allocated on the heap, and chained together
    in various ways: for transferring control for a return, exit, yield or
    stop. An instance of the Continuation class keeps a copy of the
    CallFrame object that was current when it was created, and simply
    makes that current again when it is invoked.
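
    The heap-allocated call frame scheme can be sketched roughly as follows,
    in C++ for concreteness. This is only an illustration under stated
    assumptions, not code from the language described above: the names
    CallFrame, Interp, Continuation and demo_continuation are hypothetical,
    each frame has a single "caller" link instead of the several chains
    mentioned, and the continuation shares the frame through a shared_ptr
    rather than copying it.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical call frame: lives on the heap and links to its caller,
// so frames form a chain instead of occupying one contiguous stack area.
struct CallFrame
{
    std::string name;                   // label used only by the demo
    std::shared_ptr<CallFrame> caller;  // frame to return to
};

// Hypothetical interpreter state: nothing but the current frame.
struct Interp
{
    std::shared_ptr<CallFrame> current;

    // Entering a function allocates a new frame chained to the old one.
    void call( std::string name )
    {
        current = std::make_shared<CallFrame>( CallFrame{ std::move( name ), current } );
    }

    // Returning makes the caller's frame current again.
    void ret()
    {
        assert( current );
        current = current->caller;
    }
};

// A continuation remembers the frame that was current at creation time
// and makes it current again when invoked.
struct Continuation
{
    std::shared_ptr<CallFrame> saved;

    explicit Continuation( const Interp &in ) : saved( in.current ) {}

    void invoke( Interp &in ) const { in.current = saved; }
};

// Demo: capture inside "f", return out of it, then re-enter it.
std::string demo_continuation()
{
    Interp in;
    in.call( "main" );
    in.call( "f" );
    Continuation k( in );  // remembers frame "f"
    in.ret();              // back in "main"; frame "f" stays alive via k
    k.invoke( in );        // frame "f" is current again
    return in.current->name;
}
```

    Because the frame is reference-counted, returning from "f" does not
    destroy it while a continuation still refers to it, which is the
    property a dedicated stack area makes awkward.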
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Wed Jun 17 03:09:04 2026
    From Newsgroup: comp.lang.c

    On Tue, 16 Jun 2026 12:22:24 -0700, Chris M. Thomasson wrote:

    Microsoft Windows problems? What do you mean? preemptive threads say
    POSIX threads are going to have the same issues. Right?

    Well, when talking to a Windows programmer, and they say something you
    know doesn’t sound right, it seems reasonable to conclude that it
    comes from their Windows-specific experience.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Wed Jun 17 20:29:23 2026
    From Newsgroup: comp.lang.c

    On 6/16/2026 8:09 PM, Lawrence D’Oliveiro wrote:
    On Tue, 16 Jun 2026 12:22:24 -0700, Chris M. Thomasson wrote:

    Microsoft Windows problems? What do you mean? preemptive threads say
    POSIX threads are going to have the same issues. Right?

    Well, when talking to a Windows programmer, and they say something you
    know doesn’t sound right, it seems reasonable to conclude that it
    comes from their Windows-specific experience.

    POSIX threads are preemptive and subject to the same scheduling
    concerns... Right?
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Wed Jun 17 20:30:46 2026
    From Newsgroup: comp.lang.c

    On 6/17/2026 8:29 PM, Chris M. Thomasson wrote:
    On 6/16/2026 8:09 PM, Lawrence D’Oliveiro wrote:
    On Tue, 16 Jun 2026 12:22:24 -0700, Chris M. Thomasson wrote:

    Microsoft Windows problems? What do you mean? preemptive threads say
    POSIX threads are going to have the same issues. Right?

    Well, when talking to a Windows programmer, and they say something you
    know doesn’t sound right, it seems reasonable to conclude that it
    comes from their Windows-specific experience.

    POSIX threads are preemptive and subject to the same scheduling
    concerns... Right?

    If preemption were a “Windows thing,” then POSIX mutexes, atomics, and
    memory ordering rules wouldn’t need to exist...? But they do... because
    the same hazards exist on both sides?
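
    A minimal sketch of the kind of hazard meant here, assuming only standard
    C++ (std::thread and std::atomic), so it builds the same way on POSIX and
    Windows. The function name count_with_atomics is made up for the
    illustration. With a plain integer instead of the atomic, the two threads
    could be preempted between the load and the store of an increment and the
    total could come up short on either platform.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Two threads increment one shared counter perThread times each.
// fetch_add is an atomic read-modify-write, so no increment is lost
// even if a thread is preempted in the middle of the loop.
std::uint64_t count_with_atomics( std::uint64_t perThread )
{
    std::atomic<std::uint64_t> counter{ 0 };
    auto work = [&]
    {
        for( std::uint64_t i = 0; i < perThread; ++i )
            counter.fetch_add( 1, std::memory_order_relaxed );
    };
    std::thread a( work ), b( work );
    a.join();  // join() also makes the threads' writes visible here
    b.join();
    return counter.load();
}
```

    Relaxed ordering is enough in this sketch because only the final count
    matters and the joins order it before the read.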
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Thu Jun 18 04:11:55 2026
    From Newsgroup: comp.lang.c

    On Wed, 17 Jun 2026 20:29:23 -0700, Chris M. Thomasson wrote:

    On 6/16/2026 8:09 PM, Lawrence D’Oliveiro wrote:

    On Tue, 16 Jun 2026 12:22:24 -0700, Chris M. Thomasson wrote:

    Microsoft Windows problems? What do you mean? preemptive threads
    say POSIX threads are going to have the same issues. Right?

    Well, when talking to a Windows programmer, and they say something
    you know doesn’t sound right, it seems reasonable to conclude that
    it comes from their Windows-specific experience.

    POSIX threads are preemptive and subject to the same scheduling
    concerns... Right?

    They are lighter-weight than processes. But then, processes are
    cheaper to create on *nix systems than on Dave-Cutler-type
    systems, we know that already.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Thu Jun 18 00:18:18 2026
    From Newsgroup: comp.lang.c

    On 6/17/2026 9:11 PM, Lawrence D’Oliveiro wrote:
    On Wed, 17 Jun 2026 20:29:23 -0700, Chris M. Thomasson wrote:

    On 6/16/2026 8:09 PM, Lawrence D’Oliveiro wrote:
    On Tue, 16 Jun 2026 12:22:24 -0700, Chris M. Thomasson wrote:

    Microsoft Windows problems? What do you mean? preemptive threads
    say POSIX threads are going to have the same issues. Right?
    Well, when talking to a Windows programmer, and they say something
    you know doesn’t sound right, it seems reasonable to conclude that
    it comes from their Windows-specific experience.
    POSIX threads are preemptive and subject to the same scheduling
    concerns... Right?
    They are lighter-weight than processes. But then, processes on *nix
    systems are cheaper to create on *nix systems than on Dave-Cutler-type systems, we know that already.

    Sure... But process creation cost isn’t relevant here?
    I’m talking about preemption semantics, which are the same on POSIX and Windows.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Thu Jun 18 07:23:31 2026
    From Newsgroup: comp.lang.c

    On Thu, 18 Jun 2026 00:18:18 -0700, Chris M. Thomasson wrote:

    I’m talking about preemption semantics, which are the same on POSIX
    and Windows.

    But for some reason the previous poster thinks that doing thread
    preemption in the kernel is somehow more resource-heavy than doing the
    exact same thing in userspace.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c on Thu Jun 18 18:23:28 2026
    From Newsgroup: comp.lang.c

    On 18.06.2026 at 06:11, Lawrence D’Oliveiro wrote:

    They are lighter-weight than processes. But then, processes on *nix
    systems are cheaper to create on *nix systems than on Dave-Cutler-type systems, we know that already.

    Yes, they are, but if performance is your concern better use thread pools.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Thu Jun 18 12:46:19 2026
    From Newsgroup: comp.lang.c

    On 6/18/2026 9:23 AM, Bonita Montero wrote:
    On 18.06.2026 at 06:11, Lawrence D’Oliveiro wrote:

    They are lighter-weight than processes. But then, processes on *nix
    systems are cheaper to create on *nix systems than on Dave-Cutler-type
    systems, we know that already.

    Yes, they are, but if performance is your concern better use thread pools.

    Agreed. Something like xxx_init creates all the system state, thread
    pools, etc., with default settings, say, and then sits in a neutral
    state waiting for a user to poke at it, say by issuing a ConnectEx.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Thu Jun 18 12:47:15 2026
    From Newsgroup: comp.lang.c

    On 6/16/2026 8:07 PM, Lawrence D’Oliveiro wrote:
    On Tue, 16 Jun 2026 12:23:32 -0700, Chris M. Thomasson wrote:

    On 6/13/2026 5:32 PM, Lawrence D’Oliveiro wrote:

    On Sat, 13 Jun 2026 13:46:55 -0700, Chris M. Thomasson wrote:

    Its a recursive koch fractal using AppleSoft basic.

    Yes, that was pretty clear. As was the fact that you were able to
    get it working, not *because of* your choice of BASIC to write it
    in, but *in spite of* that.

    It has current stack space.

    But being BASIC, it only has fixed-length arrays to use as the stack,
    doesn’t it?

    Well yeah. So you create a buffer.



    So, with a little work it should be workable for continuations.

    I added continuations to my toy PostScript-revival language. I soon discovered that having a dedicated stack area was a bad idea. So what
    happens is call frames are allocated on the heap, and chained together
    in various ways: for transferring control for a return, exit, yield or
    stop. An instance of the Continuation class keeps a copy of the
    CallFrame object that was current when it was created, and simply
    makes that current again when it is invoked.

    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Thu Jun 18 12:51:11 2026
    From Newsgroup: comp.lang.c

    On 6/18/2026 12:47 PM, Chris M. Thomasson wrote:
    On 6/16/2026 8:07 PM, Lawrence D’Oliveiro wrote:
    On Tue, 16 Jun 2026 12:23:32 -0700, Chris M. Thomasson wrote:

    On 6/13/2026 5:32 PM, Lawrence D’Oliveiro wrote:

    On Sat, 13 Jun 2026 13:46:55 -0700, Chris M. Thomasson wrote:

    Its a recursive koch fractal using AppleSoft basic.

    Yes, that was pretty clear. As was the fact that you were able to
    get it working, not *because of* your choice of BASIC to write it
    in, but *in spite of* that.

    It has current stack space.

    But being BASIC, it only has fixed-length arrays to use as the stack,
    doesn’t it?

    Well yeah. So you create a buffer.


    Basilica over reserve... ;^)




    So, with a little work it should be workable for continuations.

    I added continuations to my toy PostScript-revival language. I soon
    discovered that having a dedicated stack area was a bad idea. So what
    happens is call frames are allocated on the heap, and chained together
    in various ways: for transferring control for a return, exit, yield or
    stop. An instance of the Continuation class keeps a copy of the
    CallFrame object that was current when it was created, and simply
    makes that current again when it is invoked.


    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Thu Jun 18 23:01:47 2026
    From Newsgroup: comp.lang.c

    On Thu, 18 Jun 2026 18:23:28 +0200, Bonita Montero wrote:

    On 18.06.2026 at 06:11, Lawrence D’Oliveiro wrote:

    They are lighter-weight than processes. But then, processes on *nix
    systems are cheaper to create on *nix systems than on
    Dave-Cutler-type systems, we know that already.

    Yes, they are, but if performance is your concern better use thread
    pools.

    All that does is save the initial creation overhead, not the
    (supposed) context-switching overhead. Which is what you were
    complaining about, wasn’t it?
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Fri Jun 19 13:34:21 2026
    From Newsgroup: comp.lang.c

    On 6/18/2026 4:01 PM, Lawrence D’Oliveiro wrote:
    On Thu, 18 Jun 2026 18:23:28 +0200, Bonita Montero wrote:

    On 18.06.2026 at 06:11, Lawrence D’Oliveiro wrote:

    They are lighter-weight than processes. But then, processes on *nix
    systems are cheaper to create on *nix systems than on
    Dave-Cutler-type systems, we know that already.

    Yes, they are, but if performance is your concern better use thread
    pools.

    All that does is save the initial creation overhead, not the
    (supposed) context-switching overhead. Which is what you were
    complaining about, wasn’t it?

    Not exactly sure what you mean here. Windows and POSIX have basically
    the same overhead with preemptive threads. Syncing between processes is
    a different story. Once you are in a thread pool, you are staying within
    the same virtual memory space, which is why threads are the standard for high-performance concurrency?
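
    For reference, a bare-bones thread pool along these lines might look like
    the sketch below. It is an illustration only, assuming standard C++
    (std::thread, std::mutex, std::condition_variable); the names ThreadPool,
    submit and sum_on_pool are hypothetical and not taken from any library
    mentioned in this thread. The workers are created once, then jobs are
    handed to them through a queue that lives in the shared address space.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool: a fixed set of workers pulling jobs from a queue.
class ThreadPool
{
public:
    explicit ThreadPool( std::size_t nThreads )
    {
        // thread creation cost is paid once, here
        for( std::size_t i = 0; i < nThreads; ++i )
            m_workers.emplace_back( [this] { run(); } );
    }

    ~ThreadPool()
    {
        {
            std::lock_guard<std::mutex> lock( m_mtx );
            m_stop = true;
        }
        m_cv.notify_all();
        for( std::thread &t : m_workers )
            t.join();
    }

    void submit( std::function<void()> job )
    {
        {
            std::lock_guard<std::mutex> lock( m_mtx );
            m_jobs.push( std::move( job ) );
        }
        m_cv.notify_one();
    }

private:
    void run()
    {
        for( ;; )
        {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock( m_mtx );
                m_cv.wait( lock, [this] { return m_stop || !m_jobs.empty(); } );
                if( m_jobs.empty() )
                    return;  // stop requested and queue drained
                job = std::move( m_jobs.front() );
                m_jobs.pop();
            }
            job();  // run outside the lock
        }
    }

    std::mutex m_mtx;
    std::condition_variable m_cv;
    std::queue<std::function<void()>> m_jobs;
    std::vector<std::thread> m_workers;
    bool m_stop = false;
};

// Demo: add up 1..100 with one job per number.
int sum_on_pool()
{
    std::atomic<int> sum{ 0 };
    {
        ThreadPool pool( 4 );
        for( int i = 1; i <= 100; ++i )
            pool.submit( [&sum, i] { sum += i; } );
    }  // destructor drains the queue and joins the workers
    return sum;
}
```

    The pool avoids repeated thread creation, but each hand-off through the
    condition variable can still involve a context switch, which is the
    distinction being argued above.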
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c on Sat Jun 20 11:47:44 2026
    From Newsgroup: comp.lang.c

    On 19.06.2026 at 22:34, Chris M. Thomasson wrote:

    Not exactly sure what you mean here. Windows and POSIX have basically
    the same overhead with preemptive threads. ...

    This measures the overhead of a preempted context switch:

    #if defined(_WIN32)
    #include <Windows.h>
    #elif defined(__unix__)
    #include <pthread.h>
    #endif
    #include <atomic>
    #include <cstdint>
    #include <iostream>
    #include <latch>
    #include <thread>
    #if defined(_MSC_VER)
    #include <intrin.h>
    #elif defined(__GNUC__) || defined(__clang__)
    #include <x86intrin.h>
    #endif

    using namespace std;

    int main()
    {
        constexpr size_t ROUNDS = 1'000'000'000;
        // 16 bytes; aligned to 16 so that atomic_ref may operate on it (x86-64)
        struct alignas( 16 ) id_tsc { uint64_t id, tsc; };
        static id_tsc idTsc( -1, __rdtsc() );
        latch latSync( 2 );
        atomic_uint64_t aSumTsc = 0;
        atomic<size_t> aNChanges = 0;
        auto ctxWait = [&]( uint64_t id )
        {
            // pin both threads to CPU 0 so that they have to preempt each other
    #if defined(_WIN32)
            SetThreadAffinityMask( GetCurrentThread(), 1 );
    #elif defined(__unix__)
            cpu_set_t cpuSet;
            CPU_ZERO_S(sizeof cpuSet, &cpuSet);
            CPU_SET_S(0, sizeof cpuSet, &cpuSet);
            pthread_setaffinity_np( pthread_self(), sizeof cpuSet, &cpuSet );
    #endif
            latSync.arrive_and_wait();
            atomic_ref aIdTsc( idTsc );
            id_tsc ref = aIdTsc.load( memory_order_relaxed ), niu;
            uint64_t sumTsc = 0;
            size_t nChanges = 0;
            for( size_t r = ROUNDS; r; --r )
            {
                niu.id = id;
                niu.tsc = __rdtsc();
                // succeeds as long as the other thread has not run in between
                if( aIdTsc.compare_exchange_strong( ref, niu, memory_order_relaxed,
                    memory_order_relaxed ) ) [[likely]]
                {
                    ref = niu;
                    continue;
                }
                if( ref.id == id || ref.id == -1 )
                    continue;
                // the other thread wrote last: count the TSC distance as one switch
                int64_t dist = niu.tsc - ref.tsc;
                if( dist < 0 )
                    continue;
                sumTsc += dist;
                ++nChanges;
            }
            aSumTsc += sumTsc;
            aNChanges += nChanges;
        };
        jthread spawned( ctxWait, 0 );
        ctxWait( 1 );
        spawned.join(); // wait for the other thread's sums before printing
        cout << (double)aSumTsc / (double)aNChanges << endl;
    }
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Sat Jun 20 22:41:39 2026
    From Newsgroup: comp.lang.c

    On Sat, 20 Jun 2026 11:47:44 +0200, Bonita Montero wrote:

    This measures the overhead of a preempted context switch:

    [code omitted]

    ldo@theon:c++_try> g++ --std=c++26 context_switch_overhead.cpp
    /usr/bin/x86_64-linux-gnu-ld.bfd: /tmp/ccGEPNJM.o: in function `std::__atomic_ref<main::id_tsc, false, false>::load(std::memory_order) const':
    context_switch_overhead.cpp:(.text+0x547): undefined reference to `__atomic_load_16'
    /usr/bin/x86_64-linux-gnu-ld.bfd: /tmp/ccGEPNJM.o: in function `std::__atomic_ref<main::id_tsc, false, false>::compare_exchange_strong(main::id_tsc&, main::id_tsc, std::memory_order, std::memory_order) const':
    context_switch_overhead.cpp:(.text+0x671): undefined reference to `__atomic_compare_exchange_16'
    collect2: error: ld returned 1 exit status

    Got more errors when I didn’t specify a newer C++ --std option, but
    don’t know how to get rid of these ...
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Sat Jun 20 23:29:44 2026
    From Newsgroup: comp.lang.c

    On Sat, 20 Jun 2026 22:41:39 -0000 (UTC), I wrote:

    On Sat, 20 Jun 2026 11:47:44 +0200, Bonita Montero wrote:

    This measures the overhead of a preempted context switch:

    [code omitted]

    ldo@theon:c++_try> g++ --std=c++26 context_switch_overhead.cpp
    /usr/bin/x86_64-linux-gnu-ld.bfd: /tmp/ccGEPNJM.o: in function `std::__atomic_ref<main::id_tsc, false, false>::load(std::memory_order) const':
    context_switch_overhead.cpp:(.text+0x547): undefined reference to `__atomic_load_16'
    /usr/bin/x86_64-linux-gnu-ld.bfd: /tmp/ccGEPNJM.o: in function `std::__atomic_ref<main::id_tsc, false, false>::compare_exchange_strong(main::id_tsc&, main::id_tsc, std::memory_order, std::memory_order) const':
    context_switch_overhead.cpp:(.text+0x671): undefined reference to `__atomic_compare_exchange_16'
    collect2: error: ld returned 1 exit status

    Got more errors when I didn’t specify a newer C++ --std option, but
    don’t know how to get rid of these ...

    Got it!

    ldo@theon:c++_try> g++ --std=c++23 context_switch_overhead.cpp -latomic
    ldo@theon:c++_try> ./a.out
    17858.5
    ldo@theon:c++_try> g++ --std=c++26 context_switch_overhead.cpp -latomic
    ldo@theon:c++_try> ./a.out
    18899.2

    So what do the numbers mean?
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c on Sun Jun 21 09:06:58 2026
    From Newsgroup: comp.lang.c

    On 21.06.2026 at 01:29, Lawrence D’Oliveiro wrote:
    On Sat, 20 Jun 2026 22:41:39 -0000 (UTC), I wrote:

    On Sat, 20 Jun 2026 11:47:44 +0200, Bonita Montero wrote:

    This measures the overhead of a preempted context switch:

    [code omitted]

    ldo@theon:c++_try> g++ --std=c++26 context_switch_overhead.cpp
    /usr/bin/x86_64-linux-gnu-ld.bfd: /tmp/ccGEPNJM.o: in function `std::__atomic_ref<main::id_tsc, false, false>::load(std::memory_order) const':
    context_switch_overhead.cpp:(.text+0x547): undefined reference
    to `__atomic_load_16'
    /usr/bin/x86_64-linux-gnu-ld.bfd: /tmp/ccGEPNJM.o: in function `std::__atomic_ref<main::id_tsc, false, false>::compare_exchange_strong(main::id_tsc&, main::id_tsc, std::memory_order, std::memory_order) const':
    context_switch_overhead.cpp:(.text+0x671): undefined reference
    to `__atomic_compare_exchange_16'
    collect2: error: ld returned 1 exit status

    Got more errors when I didn’t specify a newer C++ --std option, but
    don’t know how to get rid of these ...

    Got it!

    ldo@theon:c++_try> g++ --std=c++23 context_switch_overhead.cpp
    -latomic
    ldo@theon:c++_try> ./a.out
    17858.5
    ldo@theon:c++_try> g++ --std=c++26 context_switch_overhead.cpp
    -latomic
    ldo@theon:c++_try> ./a.out
    18899.2

    So what do the numbers mean?

    I've improved the code. Now it measures the time for a preempted context
    switch in microseconds:

    #if defined(_WIN32)
    #include <Windows.h>
    #elif defined(__unix__)
    #include <pthread.h>
    #endif
    #include <atomic>
    #include <cstdint>
    #include <iostream>
    #include <latch>
    #include <thread>
    #include <chrono>

    using namespace std;
    using namespace chrono;

    static bool unbeatable() noexcept;

    int main()
    {
        constexpr size_t ROUNDS = 100'000'000;
        // 16 bytes; aligned to 16 so that atomic_ref may operate on it (x86-64)
        static struct alignas( 16 ) id_tsc
        {
            uint64_t id, tsc;
        } idTsc;
        latch latSync( 2 );
        atomic_uint64_t aSumTsc = 0;
        atomic<size_t> aNChanges = 0;
        auto ctxWait = [&]( uint64_t id )
        {
            unbeatable();
            latSync.arrive_and_wait();
            atomic_ref aIdTsc( idTsc );
            id_tsc ref = aIdTsc.load( memory_order_relaxed ), niu;
            uint64_t sumTsc = 0;
            size_t nChanges = 0;
            for( size_t r = ROUNDS; r; --r )
            {
                niu.id = id;
                // clock ticks; the output below treats them as nanoseconds
                niu.tsc = (int64_t)high_resolution_clock::now().time_since_epoch().count();
                // succeeds as long as the other thread has not run in between
                if( aIdTsc.compare_exchange_strong( ref, niu, memory_order_relaxed, memory_order_relaxed ) ) [[likely]]
                {
                    ref = niu;
                    continue;
                }
                if( ref.id == id || ref.id == -1 )
                    continue;
                // the other thread wrote last: count the distance as one switch
                int64_t dist = niu.tsc - ref.tsc;
                if( dist < 0 )
                    continue;
                sumTsc += dist;
                ++nChanges;
            }
            aSumTsc += sumTsc;
            aNChanges += nChanges;
        };
        jthread spawned( ctxWait, 0 );
        ctxWait( 1 );
        spawned.join(); // wait for the other thread's sums before printing
        double
            avgNs = (double)aSumTsc / (double)aNChanges,
            avgUs = floor( avgNs / (1.0e3 / 10.0) + 0.5 ) / 10.0;
        cout << aNChanges << " context switches" << endl;
        cout << avgUs << "us per preemted context switch" << endl;
    }

    // pins the calling thread to CPU 0 and raises its priority
    static bool unbeatable() noexcept
    {
    #if defined(_WIN32)
        HANDLE hThread = GetCurrentThread();
        return SetThreadAffinityMask( hThread, 1 )
            && (SetThreadPriority( hThread, THREAD_PRIORITY_HIGHEST )
            || SetThreadPriority( hThread, THREAD_PRIORITY_TIME_CRITICAL ));
    #elif defined(__unix__)
        cpu_set_t cpuSet;
        CPU_ZERO_S(sizeof cpuSet, &cpuSet);
        CPU_SET_S(0, sizeof cpuSet, &cpuSet);
        if( pthread_setaffinity_np( pthread_self(), sizeof cpuSet, &cpuSet ) )
            return false;
        int policy;
        struct sched_param param;
        return pthread_getschedparam( pthread_self(), &policy, &param ) == 0
            && pthread_setschedprio( pthread_self(),
                sched_get_priority_max( policy ) ) == 0;
    #endif
    }

    And this code measures the time for a voluntary context switch (yield)
    in microseconds:

    #if defined(_WIN32)
    #include <Windows.h>
    #elif defined(__unix__)
    #include <pthread.h>
    #endif
    #include <atomic>
    #include <cstdint>
    #include <iostream>
    #include <latch>
    #include <thread>
    #include <chrono>
    #include <cmath>

    using namespace std;
    using namespace chrono;

    static bool unbeatable() noexcept;

    int main()
    {
        constexpr size_t ROUNDS = 1'000'000;
        latch latSync( 2 );
        atomic_uint64_t aSumTsc = 0;
        auto yieldLoop = [&]
        {
            if( !unbeatable() )
                return;
            latSync.arrive_and_wait();
            // both threads share CPU 0, so each yield hands the CPU
            // to the other thread
            auto start = high_resolution_clock::now();
            for( size_t r = ROUNDS; r; --r )
                this_thread::yield();
            aSumTsc += duration_cast<nanoseconds>(high_resolution_clock::now() - start).count();
        };
        jthread spawned( yieldLoop );
        yieldLoop();
        spawned.join();
        double
            avgNs = (double)aSumTsc / (2.0 * (double)ROUNDS),
            avgUs = floor( avgNs / (1.0e3 / 10.0) + 0.5 ) / 10.0;
        cout << avgUs << "us per voluntary context switch" << endl;
    }

    // pins the calling thread to CPU 0 and raises its priority
    static bool unbeatable() noexcept
    {
    #if defined(_WIN32)
        HANDLE hThread = GetCurrentThread();
        return SetThreadAffinityMask( hThread, 1 )
            && SetThreadPriority( hThread, THREAD_PRIORITY_HIGHEST )
            && SetThreadPriority( hThread, THREAD_PRIORITY_TIME_CRITICAL );
    #elif defined(__unix__)
        cpu_set_t cpuSet;
        CPU_ZERO_S(sizeof cpuSet, &cpuSet);
        CPU_SET_S(0, sizeof cpuSet, &cpuSet);
        if( pthread_setaffinity_np( pthread_self(), sizeof cpuSet, &cpuSet ) )
            return false;
        int policy;
        struct sched_param param;
        return pthread_getschedparam( pthread_self(), &policy, &param ) == 0
            && pthread_setschedprio( pthread_self(),
                sched_get_priority_max( policy ) ) == 0;
    #endif
    }
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Sun Jun 21 07:50:47 2026
    From Newsgroup: comp.lang.c

    On Sun, 21 Jun 2026 09:06:58 +0200, Bonita Montero wrote:

    I've improved the code. Now it measures the time for a preempted
    context switch in microseconds:

    Had to patch this, otherwise it couldn’t find a floor(double) function:

    ldo@theon:Bonita Montero progs> diff -u context_switch_overhead_2.cpp{-orig,}
    --- context_switch_overhead_2.cpp-orig 2026-06-21 19:43:13.682113804 +1200
    +++ context_switch_overhead_2.cpp 2026-06-21 19:46:24.228048403 +1200
    @@ -12,6 +12,7 @@
     #include <latch>
     #include <thread>
     #include <chrono>
    +#include <math.h>

     using namespace std;
     using namespace chrono;
    @@ -64,7 +65,7 @@
         avgNs = (double)aSumTsc / (double)aNChanges,
         avgUs = floor( avgNs / (1.0e3 / 10.0) + 0.5 ) / 10.0;
     cout << aNChanges << " context switches" << endl;
    - cout << avgUs << "us per preemted context switch" << endl;
    + cout << avgUs << "us per preempted context switch" << endl;
     }

     static bool unbeatable() noexcept

    Output was:

    470 context switches
    3.6us per preempted context switch

    And this code measures the time for a voluntary context switch
    (yield) in microseconds:

    Output was:

    1.6us per voluntary context switch
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.lang.c on Sun Jun 21 12:22:04 2026
    From Newsgroup: comp.lang.c

    On 21/06/2026 01:29, Lawrence D’Oliveiro wrote:
    On Sat, 20 Jun 2026 22:41:39 -0000 (UTC), I wrote:

    On Sat, 20 Jun 2026 11:47:44 +0200, Bonita Montero wrote:

    This measures the overhead of a preempted context switch:

    [code omitted]

    ldo@theon:c++_try> g++ --std=c++26 context_switch_overhead.cpp
    /usr/bin/x86_64-linux-gnu-ld.bfd: /tmp/ccGEPNJM.o: in function `std::__atomic_ref<main::id_tsc, false, false>::load(std::memory_order) const':
    context_switch_overhead.cpp:(.text+0x547): undefined reference to `__atomic_load_16'
    /usr/bin/x86_64-linux-gnu-ld.bfd: /tmp/ccGEPNJM.o: in function `std::__atomic_ref<main::id_tsc, false, false>::compare_exchange_strong(main::id_tsc&, main::id_tsc, std::memory_order, std::memory_order) const':
    context_switch_overhead.cpp:(.text+0x671): undefined reference to `__atomic_compare_exchange_16'
    collect2: error: ld returned 1 exit status

    Got more errors when I didn’t specify a newer C++ --std option, but
    don’t know how to get rid of these ...

    Got it!

    ldo@theon:c++_try> g++ --std=c++23 context_switch_overhead.cpp -latomic
    ldo@theon:c++_try> ./a.out
    17858.5
    ldo@theon:c++_try> g++ --std=c++26 context_switch_overhead.cpp -latomic
    ldo@theon:c++_try> ./a.out
    18899.2

    So what do the numbers mean?

    When you are compiling without optimisation enabled? Probably nothing,
    unless the time taken is only due to code in libraries that you did not compile.

    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c on Sun Jun 21 13:41:04 2026
    From Newsgroup: comp.lang.c

    On 21.06.2026 at 12:22, David Brown wrote:

    When you are compiling without optimisation enabled? Probably nothing,
    unless the time taken is only due to code in libraries that you did not
    compile.

    The intervals between the thread switches are so long that this
    shouldn't make a large difference. The timings just might become
    somewhat inaccurate.

    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Mon Jun 22 12:15:38 2026
    From Newsgroup: comp.lang.c

    On 6/12/2026 4:15 AM, Bonita Montero wrote:
    On 12.06.2026 at 06:19, Chris M. Thomasson wrote:

    The fibers float along the threads... ;^)

    If you have as many fibers as you otherwise would have coroutines
    and the number is large you waste a lot of memory.
    The only difference is that with fibers you can switch the context
    from any function.

    Don't use them if you don't actually need them. Simple.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Tue Jun 23 11:47:00 2026
    From Newsgroup: comp.lang.c

    On 6/18/2026 12:23 AM, Lawrence D’Oliveiro wrote:
    On Thu, 18 Jun 2026 00:18:18 -0700, Chris M. Thomasson wrote:

    I’m talking about preemption semantics, which are the same on POSIX
    and Windows.

    But for some reason the previous poster thinks that doing thread
    preemption in the kernel is somehow more resource-heavy than doing the
    exact same thing in userspace.

    Those are different things? The kernel knows more than we do... Now,
    making a userland scheduler with, say, fibers is completely different
    from what the kernel is doing...
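
    To make the difference concrete, here is a toy userland scheduler in
    standard C++. It is a sketch under loud assumptions: real fibers keep
    their own stacks and can switch from inside any function, whereas this
    version substitutes plain step functions that "yield" by returning. The
    names Task, run_round_robin, make_task and demo_schedule are made up for
    the example. The point is only that a switch happens exactly where a
    task gives up control; nothing is ever preempted.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// A task runs one step per call and returns true while it has more to do.
using Task = std::function<bool()>;

// Cooperative round-robin: run the task at the front for one step and
// requeue it at the back if it is not finished yet.
void run_round_robin( std::deque<Task> ready )
{
    while( !ready.empty() )
    {
        Task t = std::move( ready.front() );
        ready.pop_front();
        if( t() )
            ready.push_back( std::move( t ) );
    }
}

// Builds a task that appends its tag to the trace once per step.
Task make_task( char tag, int steps, std::string &trace )
{
    return [tag, steps, &trace]() mutable
    {
        trace += tag;
        return --steps > 0;
    };
}

// Demo: task A has three steps, task B has two.
std::string demo_schedule()
{
    std::string trace;
    std::deque<Task> ready;
    ready.push_back( make_task( 'A', 3, trace ) );
    ready.push_back( make_task( 'B', 2, trace ) );
    run_round_robin( std::move( ready ) );
    return trace;  // "ABABA": the tasks interleave only at their yields
}
```

    A kernel scheduler can interrupt a thread at any instruction, which is
    why the earlier benchmarks need atomics, while this loop needs none.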
    --- Synchronet 3.22a-Linux NewsLink 1.2