Forum: Too Lazy BBS

Really beautiful

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Thu Jan 1 06:00:47 2026

From Newsgroup: comp.lang.c++

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

bool xcall_once( xonce_flag &xflag, std::invocable auto callable )
{
    using namespace std;
    xonce_flag::flag_t &flag = xflag.m_flag;
    for( signed char ref = flag.load( memory_order_acquire ); ; )
        if( ref > 0 ) [[likely]]
            return true;
        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_acquire );
        }
        else if( flag.compare_exchange_strong( ref, -1, memory_order_relaxed, memory_order_acquire ) ) [[likely]]
            break;
    bool succ = true;
    try
    {
        if constexpr( requires { (bool)callable(); } )
            succ = (bool)callable();
        else
            callable();
    }
    catch( ... )
    {
        flag.store( 0, memory_order_release );
        flag.notify_one();
        throw;
    }
    flag.store( (char)succ, memory_order_release );
    if( succ )
        flag.notify_all();
    else
        flag.notify_one();
    return succ;
}

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Thu Jan 1 15:34:31 2026

From Newsgroup: comp.lang.c++

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

bool xcall_once( xonce_flag &xflag, std::invocable auto callable )
{
    using namespace std;
    xonce_flag::flag_t &flag = xflag.m_flag;
    for( signed char ref = flag.load( memory_order_acquire ); ; )
        if( ref > 0 ) [[likely]]
            return true;
        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_acquire );
        }
        else if( flag.compare_exchange_strong( ref, -1, memory_order_relaxed, memory_order_acquire ) ) [[likely]]
            break;
    bool succ = true;
    try
    {
        if constexpr( requires { (bool)callable(); } )

what is the API of callable()? What does it return, I notice the cast to (bool)...

            succ = (bool)callable();
        else
            callable();
    }
    catch( ... )
    {
        flag.store( 0, memory_order_release );
        flag.notify_one();
        throw;
    }
    flag.store( (char)succ, memory_order_release );

The cast is interesting to me. Why not make succ a char?

    if( succ )
        flag.notify_all();
    else
        flag.notify_one();
    return succ;
}


From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Thu Jan 1 15:36:26 2026

From Newsgroup: comp.lang.c++

On 1/1/2026 3:34 PM, Chris M. Thomasson wrote:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

[...]

Actually, why use a signed char? Why not a signed word?

From David Brown@david.brown@hesbynett.no to comp.lang.c++ on Fri Jan 2 09:36:28 2026

From Newsgroup: comp.lang.c++

On 02/01/2026 00:34, Chris M. Thomasson wrote:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

bool xcall_once( xonce_flag &xflag, std::invocable auto callable )
{
    using namespace std;
    xonce_flag::flag_t &flag = xflag.m_flag;
    for( signed char ref = flag.load( memory_order_acquire ); ; )
        if( ref > 0 ) [[likely]]
            return true;
        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_acquire );
        }
        else if( flag.compare_exchange_strong( ref, -1, memory_order_relaxed, memory_order_acquire ) ) [[likely]]
            break;
    bool succ = true;
    try
    {
        if constexpr( requires { (bool)callable(); } )

what is the API of callable()? What does it return, I notice the cast to (bool)...

            succ = (bool)callable();
        else
            callable();
    }
    catch( ... )
    {
        flag.store( 0, memory_order_release );
        flag.notify_one();
        throw;
    }
    flag.store( (char)succ, memory_order_release );

The cast is interesting to me. Why not make succ a char?

I guess there is a cast because Bonita has made poor choices of types,
jumbled up their uses and lost track of them. First, "flag_t" is made
as an alias for an atomic "signed char" - and any use of "signed char"
this century is a big red flag to me. Then we have "ref" declared in a
for-loop that is a "signed char" instead of being tied to the flag_t
type (such as by using "auto"), and then we have a bool variable that is
being stored in the atomic signed char via an unnecessary and muddled
cast to plain char.

Perhaps we should wait for the traditional six follow-up posts from the
OP improving on their "beautiful" code with repeated minor and
unidentified changes, then critique the final version.

    if( succ )
        flag.notify_all();
    else
        flag.notify_one();
    return succ;
}


From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Fri Jan 2 14:43:43 2026

From Newsgroup: comp.lang.c++

Am 02.01.2026 um 00:34 schrieb Chris M. Thomasson:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

bool xcall_once( xonce_flag &xflag, std::invocable auto callable )
{
    using namespace std;
    xonce_flag::flag_t &flag = xflag.m_flag;
    for( signed char ref = flag.load( memory_order_acquire ); ; )
        if( ref > 0 ) [[likely]]
            return true;
        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_acquire );
        }
        else if( flag.compare_exchange_strong( ref, -1, memory_order_relaxed, memory_order_acquire ) ) [[likely]]
            break;
    bool succ = true;
    try
    {
        if constexpr( requires { (bool)callable(); } )

what is the API of callable()? What does it return, I notice the cast
to (bool)...

It can return something convertible to bool if it wants to signal success,
or nothing, in which case the unsuccessful case is handled just as with
std::call_once.
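
As a rough illustration of the convention described above (a callable that either returns something convertible to bool or returns nothing), the following sketch isolates that dispatch from the synchronization. The helper name run_and_report is hypothetical, and it uses std::is_constructible_v from C++17 instead of the requires-expression in the posted code, so it should be read as an approximation of the idea, not as the posted implementation.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical helper (not part of the posted code): invoke f and report
// whether it signalled success. A callable whose result cannot be converted
// to bool (including one returning void) is treated as always successful.
template<typename F>
bool run_and_report( F &&f )
{
    using result_t = std::invoke_result_t<F>;
    if constexpr( std::is_constructible_v<bool, result_t> )
        return static_cast<bool>( std::forward<F>( f )() );
    else
    {
        std::forward<F>( f )();
        return true;
    }
}

// Small self-check covering the void, true and false cases.
inline bool dispatch_demo()
{
    bool ran = false;
    bool from_void  = run_and_report( [&] { ran = true; } );
    bool from_true  = run_and_report( [] { return true; } );
    bool from_false = run_and_report( [] { return false; } );
    assert( ran );
    return from_void && from_true && !from_false;
}
```

In a once-style primitive, a false result from such a helper would presumably leave the flag unset so that a later call can retry.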

            succ = (bool)callable();
        else
            callable();
    }
    catch( ... )
    {
        flag.store( 0, memory_order_release );
        flag.notify_one();
        throw;
    }
    flag.store( (char)succ, memory_order_release );

The cast is interesting to me. Why not make succ a char?

The char has three states: > 0 if the initialization was successful, == 0
if there has been no initialization so far, and < 0 if an initialization
is in progress. So I need a signed variable which can handle all three
values, and a signed char is sufficient for that.
As a bool is internally represented as a char-sized value on all current
platforms, it's the easiest way to have an equally sized atomic for the
three states.

    if( succ )
        flag.notify_all();
    else
        flag.notify_one();
    return succ;
}


From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Fri Jan 2 14:45:06 2026

From Newsgroup: comp.lang.c++

Am 02.01.2026 um 00:36 schrieb Chris M. Thomasson:

On 1/1/2026 3:34 PM, Chris M. Thomasson wrote:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

[...]

Actually, why use a signed char? Why not a signed word?

Because I only need three states: > 0, 0 and < 0.


From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Fri Jan 2 14:49:22 2026

From Newsgroup: comp.lang.c++

Am 02.01.2026 um 09:36 schrieb David Brown:

On 02/01/2026 00:34, Chris M. Thomasson wrote:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

bool xcall_once( xonce_flag &xflag, std::invocable auto callable )
{
    using namespace std;
    xonce_flag::flag_t &flag = xflag.m_flag;
    for( signed char ref = flag.load( memory_order_acquire ); ; )
        if( ref > 0 ) [[likely]]
            return true;
        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_acquire );
        }
        else if( flag.compare_exchange_strong( ref, -1, memory_order_relaxed, memory_order_acquire ) ) [[likely]]
            break;
    bool succ = true;
    try
    {
        if constexpr( requires { (bool)callable(); } )

what is the API of callable()? What does it return, I notice the cast
to (bool)...

            succ = (bool)callable();
        else
            callable();
    }
    catch( ... )
    {
        flag.store( 0, memory_order_release );
        flag.notify_one();
        throw;
    }
    flag.store( (char)succ, memory_order_release );

The cast is interesting to me. Why not make succ a char?

I guess there is a cast because Bonita has made poor choices of types,
jumbled up their uses and lost track of them. First, "flag_t" is made
as an alias for an atomic "signed char" - and any use of "signed char"
this century is a big red flag to me. Then we have "ref" declared in a
for-loop that is a "signed char" instead of being tied to the flag_t
type (such as by using "auto"), and then we have a bool variable that
is being stored in the atomic signed char via an unnecessary and
muddled cast to plain char.

That's nonsense. I only need three states, and they all fit into one
signed character.
There's simply no need to make it bigger.

Perhaps we should wait for the traditional six follow-up posts from
the OP improving on their "beautiful" code with repeated minor and unidentified changes, then critique the final version.

The code is perfect since it's not very complicated. I needed a
once_flag that is failable with a bool. With a futex it's the simplest
solution since the common state is just a byte.
If you had understood the code, you wouldn't have talked so much nonsense.

    if( succ )
        flag.notify_all();
    else
        flag.notify_one();
    return succ;
}


From David Brown@david.brown@hesbynett.no to comp.lang.c++ on Fri Jan 2 15:52:05 2026

From Newsgroup: comp.lang.c++

On 02/01/2026 14:49, Bonita Montero wrote:

Am 02.01.2026 um 09:36 schrieb David Brown:

On 02/01/2026 00:34, Chris M. Thomasson wrote:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

bool xcall_once( xonce_flag &xflag, std::invocable auto callable )
{
    using namespace std;
    xonce_flag::flag_t &flag = xflag.m_flag;
    for( signed char ref = flag.load( memory_order_acquire ); ; )
        if( ref > 0 ) [[likely]]
            return true;
        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_acquire );
        }
        else if( flag.compare_exchange_strong( ref, -1, memory_order_relaxed, memory_order_acquire ) ) [[likely]]
            break;
    bool succ = true;
    try
    {
        if constexpr( requires { (bool)callable(); } )

what is the API of callable()? What does it return, I notice the cast
to (bool)...

            succ = (bool)callable();
        else
            callable();
    }
    catch( ... )
    {
        flag.store( 0, memory_order_release );
        flag.notify_one();
        throw;
    }
    flag.store( (char)succ, memory_order_release );

The cast is interesting to me. Why not make succ a char?

I guess there is a cast because Bonita has made poor choices of types,
jumbled up their uses and lost track of them. First, "flag_t" is made
as an alias for an atomic "signed char" - and any use of "signed char"
this century is a big red flag to me. Then we have "ref" declared in a
for-loop that is a "signed char" instead of being tied to the flag_t
type (such as by using "auto"), and then we have a bool variable that
is being stored in the atomic signed char via an unnecessary and
muddled cast to plain char.

That's nonsense. I only need three states, and they all fit into one
signed character.
There's simply no need to make it bigger.

I am not saying you should make it bigger. I am saying you should make
it consistent.

If there are only three logical states for the type, then an enumerated
type would probably make most sense.
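
To make the enumerated-type suggestion concrete, here is a minimal sketch, assuming exactly one value per logical state. The names once_state and once_state_machine are hypothetical and do not appear in the posted code. Waiting and notification are left out so that the sketch stays within C++11; the C++20 wait and notify members of std::atomic could be layered on top.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical names: one enumerator per logical state from the discussion.
enum class once_state : signed char
{
    running = -1,   // an initialization is in progress
    idle    =  0,   // not initialized yet, or the last attempt failed
    done    =  1    // initialization succeeded
};

struct once_state_machine
{
    std::atomic<once_state> state{ once_state::idle };

    // Try to claim the right to run the initializer (idle -> running).
    bool try_begin() noexcept
    {
        once_state expected = once_state::idle;
        return state.compare_exchange_strong( expected, once_state::running,
            std::memory_order_acquire, std::memory_order_acquire );
    }

    // Publish the outcome: running -> done on success, running -> idle on failure.
    void finish( bool success ) noexcept
    {
        state.store( success ? once_state::done : once_state::idle,
            std::memory_order_release );
    }

    bool is_done() const noexcept
    {
        return state.load( std::memory_order_acquire ) == once_state::done;
    }
};

// Single-threaded walk through the transitions.
inline bool state_demo()
{
    once_state_machine m;
    assert( !m.is_done() );
    bool first  = m.try_begin();   // idle -> running, succeeds
    bool second = m.try_begin();   // already running, so the claim fails
    m.finish( false );             // a failed attempt resets to idle
    bool retry  = m.try_begin();   // the flag can be claimed again
    m.finish( true );
    return first && !second && retry && m.is_done() && !m.try_begin();
}
```

The underlying type is still signed char, so the atomic stays one byte wide; what changes is that each state has a name.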

Perhaps we should wait for the traditional six follow-up posts from
the OP improving on their "beautiful" code with repeated minor and
unidentified changes, then critique the final version.

The code is perfect since it's not very complicated. I needed a
once_flag that
is failable with a bool. With a futex it's the simplest solution since
the common
state is just a byte.
If you had understood the code, you wouldn't have talked so much nonsense.

Code with inconsistent typing and an unnecessary cast (to a different inconsistent type) is not "beautiful" or clear. Code where you use
magic numbers to represent states is not beautiful. Code with no specification is not beautiful. Code with meaningless names (what is
the "x" doing in the names?) is not beautiful. Code that has unclear structure (why the "for" loop, when there is no loop? Why the "break"
when there is no loop?), poor layout (learn to use braces to show
structure - human readers care, even if the compiler does not) is not beautiful.

You asked "Do you like that?" - the answer is no. It might be a useful
and working piece of code, though we are left guessing as to what it is supposed to do, but it is not beautiful.

    if( succ )
        flag.notify_all();
    else
        flag.notify_one();
    return succ;
}


From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Fri Jan 2 16:09:05 2026

From Newsgroup: comp.lang.c++

Am 02.01.2026 um 15:52 schrieb David Brown:

On 02/01/2026 14:49, Bonita Montero wrote:

Am 02.01.2026 um 09:36 schrieb David Brown:

On 02/01/2026 00:34, Chris M. Thomasson wrote:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

bool xcall_once( xonce_flag &xflag, std::invocable auto callable )
{
    using namespace std;
    xonce_flag::flag_t &flag = xflag.m_flag;
    for( signed char ref = flag.load( memory_order_acquire ); ; )
        if( ref > 0 ) [[likely]]
            return true;
        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_acquire );
        }
        else if( flag.compare_exchange_strong( ref, -1, memory_order_relaxed, memory_order_acquire ) ) [[likely]]
            break;
    bool succ = true;
    try
    {
        if constexpr( requires { (bool)callable(); } )

what is the API of callable()? What does it return, I notice the
cast to (bool)...

            succ = (bool)callable();
        else
            callable();
    }
    catch( ... )
    {
        flag.store( 0, memory_order_release );
        flag.notify_one();
        throw;
    }
    flag.store( (char)succ, memory_order_release );

The cast is interesting to me. Why not make succ a char?

I guess there is a cast because Bonita has made poor choices of
types, jumbled up their uses and lost track of them. First, "flag_t"
is made as an alias for an atomic "signed char" - and any use of
"signed char" this century is a big red flag to me. Then we have
"ref" declared in a for-loop that is a "signed char" instead of
being tied to the flag_t type (such as by using "auto"), and then we
have a bool variable that is being stored in the atomic signed char
via an unnecessary and muddled cast to plain char.

That's nonsense. I only need three states, and they all fit into one
signed character.
There's simply no need to make it bigger.

I am not saying you should make it bigger. I am saying you should
make it consistent.

It is correct and as efficient as possible.

If there are only three logical states for the type, then an
enumerated type would probably make most sense.

LOL, for three states in one function.
And you can't have enums for a range (< 0, > 0).

Perhaps we should wait for the traditional six follow-up posts from
the OP improving on their "beautiful" code with repeated minor and
unidentified changes, then critique the final version.

The code is perfect since it's not very complicated. I needed a
once_flag that
is failable with a bool. With a futex it's the simplest solution
since the common
state is just a byte.
If you had understood the code, you wouldn't have talked so much
nonsense.

Code with inconsistent typing and an unnecessary cast (to a different
inconsistent type) is not "beautiful" or clear. Code where you use
magic numbers to represent states is not beautiful. Code with no
specification is not beautiful. Code with meaningless names (what is
the "x" doing in the names?) is not beautiful. Code that has unclear
structure (why the "for" loop, when there is no loop? Why the "break"
when there is no loop?), poor layout (learn to use braces to show
structure - human readers care, even if the compiler does not) is not
beautiful.

You feel uncertain with everything that doesn't match your style.
I use the three magic numbers only in a single function; it's obvious
through the loop with a few lines of code.
For someone with an affinity to basic MT synchronization primitives it's
easy.
Code must be more expressive if it becomes larger; but the whole
algorithm is easy.
The retry-for-loop is usual with any synchronization-primitive.
You simply have no taste and the code is too complicated for you; and
you're focussed on minor details instead of understanding the idea.
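
For readers less familiar with the retry loop mentioned above, here is a generic sketch of the pattern, written with an explicit loop body. It is an atomic "fetch max" built on compare_exchange_weak. The helper name atomic_fetch_max is hypothetical, and the example is unrelated to the once flag itself; it only shows the load, compare-exchange, retry shape.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: atomically raise `target` to at least `value` and
// return the value that was observed before any update.
inline int atomic_fetch_max( std::atomic<int> &target, int value ) noexcept
{
    int observed = target.load( std::memory_order_relaxed );
    // Retry until the stored value is already large enough or our
    // compare-exchange wins. On failure, compare_exchange_weak refreshes
    // `observed` with the current value, so no explicit reload is needed.
    while( observed < value &&
        !target.compare_exchange_weak( observed, value,
            std::memory_order_release, std::memory_order_relaxed ) )
    {
        // nothing to do here: the loop condition performs the retry
    }
    return observed;
}

// Single-threaded check of both paths (update and no-op).
inline bool retry_demo()
{
    std::atomic<int> a{ 3 };
    int before_raise = atomic_fetch_max( a, 7 );   // raises 3 to 7
    int before_noop  = atomic_fetch_max( a, 5 );   // 7 is already larger
    assert( a.load() == 7 );
    return before_raise == 3 && before_noop == 7;
}
```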

You asked "Do you like that?" - the answer is no. It might be a
useful and working piece of code, though we are left guessing as to
what it is supposed to do, but it is not beautiful.

    if( succ )
        flag.notify_all();
    else
        flag.notify_one();
    return succ;
}


From David Brown@david.brown@hesbynett.no to comp.lang.c++ on Fri Jan 2 18:45:57 2026

From Newsgroup: comp.lang.c++

On 02/01/2026 16:09, Bonita Montero wrote:

Am 02.01.2026 um 15:52 schrieb David Brown:

On 02/01/2026 14:49, Bonita Montero wrote:

Am 02.01.2026 um 09:36 schrieb David Brown:

On 02/01/2026 00:34, Chris M. Thomasson wrote:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

bool xcall_once( xonce_flag &xflag, std::invocable auto callable )
{
    using namespace std;
    xonce_flag::flag_t &flag = xflag.m_flag;
    for( signed char ref = flag.load( memory_order_acquire ); ; )
        if( ref > 0 ) [[likely]]
            return true;
        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_acquire );
        }
        else if( flag.compare_exchange_strong( ref, -1, memory_order_relaxed, memory_order_acquire ) ) [[likely]]
            break;
    bool succ = true;
    try
    {
        if constexpr( requires { (bool)callable(); } )

what is the API of callable()? What does it return, I notice the
cast to (bool)...

            succ = (bool)callable();
        else
            callable();
    }
    catch( ... )
    {
        flag.store( 0, memory_order_release );
        flag.notify_one();
        throw;
    }
    flag.store( (char)succ, memory_order_release );

The cast is interesting to me. Why not make succ a char?

I guess there is a cast because Bonita has made poor choices of
types, jumbled up their uses and lost track of them. First, "flag_t"
is made as an alias for an atomic "signed char" - and any use of
"signed char" this century is a big red flag to me. Then we have
"ref" declared in a for-loop that is a "signed char" instead of
being tied to the flag_t type (such as by using "auto"), and then we
have a bool variable that is being stored in the atomic signed char
via an unnecessary and muddled cast to plain char.

That's nonsense. I only need three states, and they all fit into one
signed character.
There's simply no need to make it bigger.

I am not saying you should make it bigger. I am saying you should
make it consistent.

It is correct and as efficient as possible.

I am not arguing that - though there is no way to know if it is
"correct" without a specification. There is no specification, so
"correct" means, at most, "it does what the code says". With poorly structured code with weird names, inconsistent types, and no comments or documentation, people would have to work through the code to see what it
does, and how it does it, in order to guess what it is supposed to do.
That's not something I am going to bother doing.

And only after reverse engineering a specification, and then doing a lot
of experimentation and timing measurements on a range of compilers for a
range of target processors and operating systems could anyone reasonably
judge if it is "as efficient as possible". That is /certainly/ not
something worth doing.

But even assuming the code is correct, and efficient, it is still not "beautiful".

If there are only three logical states for the type, then an
enumerated type would probably make most sense.

LOL, for three states in one function.

Yes. Stop claiming your code is "beautiful" or "perfect", and start
writing clearer code. When someone else sees your code and tells you it
is beautiful, /then/ you have achieved something.

And you can't have enums for a range (< 0, > 0).

You said there are only three logical states.

Perhaps we should wait for the traditional six follow-up posts from
the OP improving on their "beautiful" code with repeated minor and
unidentified changes, then critique the final version.

The code is perfect since it's not very complicated. I needed a
once_flag that
is failable with a bool. With a futex it's the simplest solution
since the common
state is just a byte.
If you had understood the code, you wouldn't have talked so much
nonsense.

Code with inconsistent typing and an unnecessary cast (to a different
inconsistent type) is not "beautiful" or clear. Code where you use
magic numbers to represent states is not beautiful. Code with no
specification is not beautiful. Code with meaningless names (what is
the "x" doing in the names?) is not beautiful. Code that has unclear
structure (why the "for" loop, when there is no loop? Why the "break"
when there is no loop?), poor layout (learn to use braces to show
structure - human readers care, even if the compiler does not) is not
beautiful.

You feel uncertain with everything that doesn't match your style.

No, style is a personal thing. There are many aspects of your style
that I dislike, but I am not complaining about those because they are
very subjective (as is "beautiful").

I use the three magic numbers only in a single function; it's obvious through the loop with a few lines of code.

It is lazy and makes the code harder to follow.

For someone with an affinity to basic MT synchronization primitives it's easy.
Code must be more expressive if it becomes larger; but the whole
algorithm is easy.
The retry-for-loop is usual with any synchronization-primitive.
You simply have no taste and the code is too complicated for you; and
you're focussed on minor details instead of understanding the idea.

You claimed the code was "beautiful" and asked if people liked it. I am telling you I don't like it, I don't think it is "beautiful", and giving
my reasons why. If you are not interested in feedback, why did you
bother posting it? Do you think anyone is the slightest bit interested
in using your code? Or do you think we will all swoon over how
wonderful a programmer you are?

Posting code and asking for feedback is a great thing, as long as you
are happy to receive feedback.

You asked "Do you like that?" - the answer is no. It might be a
useful and working piece of code, though we are left guessing as to
what it is supposed to do, but it is not beautiful.


From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Fri Jan 2 19:09:56 2026

From Newsgroup: comp.lang.c++

Am 02.01.2026 um 18:45 schrieb David Brown:

On 02/01/2026 16:09, Bonita Montero wrote:

Am 02.01.2026 um 15:52 schrieb David Brown:

On 02/01/2026 14:49, Bonita Montero wrote:

Am 02.01.2026 um 09:36 schrieb David Brown:

On 02/01/2026 00:34, Chris M. Thomasson wrote:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

bool xcall_once( xonce_flag &xflag, std::invocable auto callable )
{
    using namespace std;
    xonce_flag::flag_t &flag = xflag.m_flag;
    for( signed char ref = flag.load( memory_order_acquire ); ; )
        if( ref > 0 ) [[likely]]
            return true;
        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_acquire );
        }
        else if( flag.compare_exchange_strong( ref, -1, memory_order_relaxed, memory_order_acquire ) ) [[likely]]
            break;
    bool succ = true;
    try
    {
        if constexpr( requires { (bool)callable(); } )

what is the API of callable()? What does it return, I notice the
cast to (bool)...

            succ = (bool)callable();
        else
            callable();
    }
    catch( ... )
    {
        flag.store( 0, memory_order_release );
        flag.notify_one();
        throw;
    }
    flag.store( (char)succ, memory_order_release );

The cast is interesting to me. Why not make succ a char?

I guess there is a cast because Bonita has made poor choices of
types, jumbled up their uses and lost track of them. First,
"flag_t" is made as an alias for an atomic "signed char" - and any
use of "signed char" this century is a big red flag to me. Then we
have "ref" declared in a for-loop that is a "signed char"
instead of being tied to the flag_t type (such as by using
"auto"), and then we have a bool variable that is being stored in
the atomic signed char via an unnecessary and muddled cast to
plain char.

That's nonsense. I only need three states, and they all fit into
one signed character.
There's simply no need to make it bigger.

I am not saying you should make it bigger. I am saying you should
make it consistent.

It is correct and as efficient as possible.

I am not arguing that - though there is no way to know if it is
"correct" without a specification. There is no specification, so
"correct" means, at most, "it does what the code says". With poorly
structured code with weird names, inconsistent types, and no comments
or documentation, people would have to work through the code to see
what it does, and how it does it, in order to guess what it is
supposed to do. That's not something I am going to bother doing.

The code is not well documented, but the rest of what you say about it
is your personal taste.
You're focussed on details without understanding the code itself,
although it's easy to understand if you know how futexes work.

And only after reverse engineering a specification, and then doing a
lot of experimentation and timing measurements on a range of compilers
for a range of target processors and operating systems could anyone
reasonably judge if it is "as efficient as possible". That is
/certainly/ not something worth doing.

But even assuming the code is correct, and efficient, it is still not "beautiful".

That's a matter of personal taste.
And the beauty comes from the algorithm.
Not from the minor details you're focussed on; that's really, really SICK.

If there are only three logical states for the type, then an
enumerated type would probably make most sense.

LOL, for three states in one function.

Yes. Stop claiming your code is "beautiful" or "perfect", and start
writing clearer code. When someone else sees your code and tells you
it is beautiful, /then/ you have achieved something.

And you can't have enums for a range (< 0, > 0).

You said there are only three logical states.

Think ! But two are ranges.

Perhaps we should wait for the traditional six follow-up posts
from the OP improving on their "beautiful" code with repeated
minor and unidentified changes, then critique the final version.

The code is perfect since it's not very complicated. I needed a
once_flag that is failable with a bool. With a futex it's the simplest
solution since the common state is just a byte.
If you had understood the code, you wouldn't have talked so much
nonsense.

Code with inconsistent typing and an unnecessary cast (to a
different inconsistent type) is not "beautiful" or clear. Code where
you use magic numbers to represent states is not beautiful. Code
with no specification is not beautiful. Code with meaningless names
(what is the "x" doing in the names?) is not beautiful. Code that
has unclear structure (why the "for" loop, when there is no loop?
Why the "break" when there is no loop?), poor layout (learn to use
braces to show structure - human readers care, even if the compiler
does not) is not beautiful.

You feel uncertain with everything that doesn't match your style.

No, style is a personal thing. There are many aspects of your style
that I dislike, but I am not complaining about those because they are
very subjective (as is "beautiful").

The beauty comes from the algorithm itself, using futexes.
And you're unable to understand that and you're focussed on details
which are a matter of personal taste.

I use the three magic numbers only in a single function; it's obvious
through the loop with a few lines of code.

It is lazy and makes the code harder to follow.

LOL, the function is 34 lines of code - hard to follow ?
You're arguing like a beginner.

For someone with an affinity to basic MT synchronization primitives
it's easy.
Code must be more expressive if it becomes larger; but the whole
algorithm is easy.
The retry-for-loop is usual with any synchronization-primitive.
You simply have no taste and the code is too complicated for you; and
you're focussed on minor details instead of understanding the idea.

You claimed the code was "beautiful" and asked if people liked it. I
am telling you I don't like it, I don't think it is "beautiful", and
giving my reasons why. If you are not interested in feedback, why did
you bother posting it? Do you think anyone is the slightest bit
interested in using your code? Or do you think we will all swoon over
how wonderful a programmer you are?

You don't understand the algorithm, which doesn't need any documentation
(34 LOCs).
It's the algorithm itself which is beautiful ..

Posting code and asking for feedback is a great thing, as long as you
are happy to receive feedback.

You asked "Do you like that?" - the answer is no. It might be a
useful and working piece of code, though we are left guessing as to
what it is supposed to do, but it is not beautiful.

--- Synchronet 3.21a-Linux NewsLink 1.2

From James Kuyper@jameskuyper@alumni.caltech.edu to comp.lang.c++ on Fri Jan 2 14:03:07 2026

From Newsgroup: comp.lang.c++

On 02/01/2026 16:09, Bonita Montero wrote:
...

And you can't have enums for a range (< 0, > 0).

Huh?

#include <iostream>
enum tristate:int { NEGATIVE = -1, ZERO, POSITIVE};
int main(void)
{
std::cout << NEGATIVE << "\t" << ZERO << "\t" << POSITIVE << std::endl;
}

Output:
-1 0 1
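
As a rough sketch of how the enumerated-type suggestion could map onto the flag: the posted code only ever stores -1, 0, or (char)succ (that is, 0 or 1), so the three values can be given names. The names once_state, idle, running, done, try_claim and mark_done below are hypothetical and do not appear in the thread; the underlying type stays signed char so the flag is still one byte.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical names for the three values the posted code stores (-1, 0, 1).
enum class once_state : signed char { running = -1, idle = 0, done = 1 };

static_assert(sizeof(once_state) == sizeof(signed char));

// Try to move the flag from idle to running.
// Returns true only for the caller that wins the race.
inline bool try_claim(std::atomic<once_state> &flag)
{
    once_state expected = once_state::idle;
    return flag.compare_exchange_strong(expected, once_state::running,
                                        std::memory_order_acquire,
                                        std::memory_order_acquire);
}

// Publish completion; only the thread that claimed the flag should call this.
inline void mark_done(std::atomic<once_state> &flag)
{
    assert(flag.load(std::memory_order_relaxed) == once_state::running);
    flag.store(once_state::done, std::memory_order_release);
}
```

Whether named states read better than the sign tests (ref > 0, ref < 0) is the taste question being argued here; the sketch only shows that the enum form is possible with the same storage.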

--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Fri Jan 2 20:11:48 2026

From Newsgroup: comp.lang.c++

Am 02.01.2026 um 20:03 schrieb James Kuyper:

On 02/01/2026 16:09, Bonita Montero wrote:
...

And you can't have enums for a range (< 0, > 0).

Huh?

#include <iostream>
enum tristate:int { NEGATIVE = -1, ZERO, POSITIVE};
int main(void)
{
std::cout << NEGATIVE << "\t" << ZERO << "\t" << POSITIVE << std::endl;
}

Output:
-1 0 1

< 0 and > 0 are ranges.
And NEGATIVE, ZERO or POSITIVE wouldn't make the code more readable.
Guys, you're still focussed on minor details without understanding the code. That's always this way with nerds.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Fri Jan 2 11:49:39 2026

From Newsgroup: comp.lang.c++

On 1/2/2026 5:45 AM, Bonita Montero wrote:

Am 02.01.2026 um 00:36 schrieb Chris M. Thomasson:

On 1/1/2026 3:34 PM, Chris M. Thomasson wrote:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

[...]

Actually, why use a signed char? Why not a signed word?

Because I only need three states: > 0, 0 and < 0;

But, using a signed word is more natural in a sense? You need to pad up
to and align the xonce_flag on a l2 cache line anyway. You don't want
false sharing on this flag, right?
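
A minimal sketch of the padding idea, assuming a 64-byte cache line. That size is an assumption, not something stated in the thread, and the names assumed_line_size, padded_once_flag and on_separate_lines are hypothetical. Where the standard library provides it, std::hardware_destructive_interference_size from <new> could replace the hard-coded constant.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines. Adjust for the actual target.
inline constexpr std::size_t assumed_line_size = 64;

// A one-byte flag aligned and padded so it occupies a whole line by itself.
struct alignas(assumed_line_size) padded_once_flag
{
    std::atomic<signed char> state{0};
};

static_assert(alignof(padded_once_flag) == assumed_line_size);
static_assert(sizeof(padded_once_flag) == assumed_line_size);

// True if the two flags start at least one assumed line apart.
inline bool on_separate_lines(const padded_once_flag &a, const padded_once_flag &b)
{
    auto pa = reinterpret_cast<std::uintptr_t>(&a);
    auto pb = reinterpret_cast<std::uintptr_t>(&b);
    return (pa > pb ? pa - pb : pb - pa) >= assumed_line_size;
}
```

The cost is 63 bytes of padding per flag, which is the trade-off against keeping the flag at one byte.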
--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Fri Jan 2 12:06:03 2026

From Newsgroup: comp.lang.c++

On 1/2/2026 11:49 AM, Chris M. Thomasson wrote:

On 1/2/2026 5:45 AM, Bonita Montero wrote:

Am 02.01.2026 um 00:36 schrieb Chris M. Thomasson:

On 1/1/2026 3:34 PM, Chris M. Thomasson wrote:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

[...]

Actually, why use a signed char? Why not a signed word?

Because I only need three states: > 0, 0 and < 0;

But, using a signed word is more natural in a sense? You need to pad up
to and align the xonce_flag on a l2 cache line anyway. You don't want
false sharing on this flag, right?

I think your algo with the sync, well, it should work okay. Unless I
missed something. Actually, I don't think you need all of the acquire in
the lock phase. Just one after:

for( signed char ref = flag.load( memory_order_acquire ); ; )
if( ref > 0 ) [[likely]]
return true;
else if( ref < 0 ) [[unlikely]]
{
flag.wait( ref, memory_order_relaxed );
ref = flag.load( memory_order_acquire );
}
else if( flag.compare_exchange_strong( ref, -1,
memory_order_relaxed, memory_order_acquire ) ) [[likely]]
break;

Make it all relaxed, but then add in a single stand alone:

std::atomic_thread_fence(std::memory_order_acquire) after it and before
you call into the callable...

--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Fri Jan 2 21:14:29 2026

From Newsgroup: comp.lang.c++

Am 02.01.2026 um 20:49 schrieb Chris M. Thomasson:

On 1/2/2026 5:45 AM, Bonita Montero wrote:

Am 02.01.2026 um 00:36 schrieb Chris M. Thomasson:

On 1/1/2026 3:34 PM, Chris M. Thomasson wrote:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

[...]

Actually, why use a signed char? Why not a signed word?

Because I only need three states: > 0, 0 and < 0;

But, using a signed word is more natural in a sense? You need to pad
up to and align the xonce_flag on a l2 cache line anyway. You don't
want false sharing on this flag, right?

The flag is written only while the initialization runs.
Otherwise the cache line holding it is shared among the cores.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Fri Jan 2 21:16:27 2026

From Newsgroup: comp.lang.c++

Am 02.01.2026 um 21:06 schrieb Chris M. Thomasson:

On 1/2/2026 11:49 AM, Chris M. Thomasson wrote:

On 1/2/2026 5:45 AM, Bonita Montero wrote:

Am 02.01.2026 um 00:36 schrieb Chris M. Thomasson:

On 1/1/2026 3:34 PM, Chris M. Thomasson wrote:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

[...]

Actually, why use a signed char? Why not a signed word?

Because I only need three states: > 0, 0 and < 0;

But, using a signed word is more natural in a sense? You need to pad
up to and align the xonce_flag on a l2 cache line anyway. You don't
want false sharing on this flag, right?

I think your algo with the sync, well, it should work okay. Unless I
missed something. Actually, I don't think you need all of the acquire
in the lock phase. Just one after:

The loads before the initialization need to be acquires.
Also the implicit load after a failed CMPXCHG.

    for( signed char ref = flag.load( memory_order_acquire ); ; )
        if( ref > 0 ) [[likely]]
            return true;
        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_acquire );
        }
        else if( flag.compare_exchange_strong( ref, -1, memory_order_relaxed, memory_order_acquire ) ) [[likely]]
            break;

Make it all relaxed, but then add in a single stand alone:

No, the write after the flag change need to be ordered afterwards.

std::atomic_thread_fence(std::memory_order_acquire) after it and
before you call into the callable...

Not necessary, it's simpler as I did.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Fri Jan 2 12:52:09 2026

From Newsgroup: comp.lang.c++

On 1/2/2026 12:16 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:06 schrieb Chris M. Thomasson:

On 1/2/2026 11:49 AM, Chris M. Thomasson wrote:

On 1/2/2026 5:45 AM, Bonita Montero wrote:

Am 02.01.2026 um 00:36 schrieb Chris M. Thomasson:

On 1/1/2026 3:34 PM, Chris M. Thomasson wrote:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

[...]

Actually, why use a signed char? Why not a signed word?

Because I only need three states: > 0, 0 and < 0;

But, using a signed word is more natural in a sense? You need to pad
up to and align the xonce_flag on a l2 cache line anyway. You don't
want false sharing on this flag, right?

I think your algo with the sync, well, it should work okay. Unless I
missed something. Actually, I don't think you need all of the acquire
in the lock phase. Just one after:

The loads before the initialization need to be acquires.
Also the implicit load after a failed CMPXCHG.

    for( signed char ref = flag.load( memory_order_acquire ); ; )
        if( ref > 0 ) [[likely]]
            return true;
        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_acquire );
        }
        else if( flag.compare_exchange_strong( ref, -1,
                 memory_order_relaxed, memory_order_acquire ) ) [[likely]]
            break;

Make it all relaxed, but then add in a single stand alone:

No, the write after the flag change need to be ordered afterwards.

I think you can make that logic all relaxed.

std::atomic_thread_fence(std::memory_order_acquire) after it and
before you call into the callable...

Not necessary, it's simpler as I did.

But my way uses a single acquire barrier after the logic has done its
thing. That is simpler and more efficient. Actually you only need one
acquire after that logic, _before_ callable is run, and one release
barrier _after_ the callable is run. Your atomic logic does not depend
on itself, it is only working with flag. It sure seems to be akin to
the following pattern:

atomic_mutex_lock(); // all relaxed
std::atomic_thread_fence(std::memory_order_acquire);

{
// critical_section
}

std::atomic_thread_fence(std::memory_order_relaxed);
atomic_mutex_unlock(); // all relaxed

You only need the actual membars once right before you call into
callable, and once right after it.

Actually, std::atomic_thread_fence is more of a SPARC way to do things
wrt memory order.
--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Fri Jan 2 12:54:15 2026

From Newsgroup: comp.lang.c++

On 1/2/2026 12:14 PM, Bonita Montero wrote:

Am 02.01.2026 um 20:49 schrieb Chris M. Thomasson:

On 1/2/2026 5:45 AM, Bonita Montero wrote:

Am 02.01.2026 um 00:36 schrieb Chris M. Thomasson:

On 1/1/2026 3:34 PM, Chris M. Thomasson wrote:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

[...]

Actually, why use a signed char? Why not a signed word?

Because I only need three states: > 0, 0 and < 0;

But, using a signed word is more natural in a sense? You need to pad
up to and align the xonce_flag on a l2 cache line anyway. You don't
want false sharing on this flag, right?

The flag is written only while the initialization runs.
Otherwise the cache line holding it is shared among the cores.

The flag should be completely isolated from callable? and pad does this.
--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Fri Jan 2 12:56:20 2026

From Newsgroup: comp.lang.c++

On 1/2/2026 12:52 PM, Chris M. Thomasson wrote:

On 1/2/2026 12:16 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:06 schrieb Chris M. Thomasson:

On 1/2/2026 11:49 AM, Chris M. Thomasson wrote:

On 1/2/2026 5:45 AM, Bonita Montero wrote:

Am 02.01.2026 um 00:36 schrieb Chris M. Thomasson:

On 1/1/2026 3:34 PM, Chris M. Thomasson wrote:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

[...]

Actually, why use a signed char? Why not a signed word?

Because I only need three states: > 0, 0 and < 0;

But, using a signed word is more natural in a sense? You need to pad
up to and align the xonce_flag on a l2 cache line anyway. You don't
want false sharing on this flag, right?

I think your algo with the sync, well, it should work okay. Unless I
missed something. Actually, I don't think you need all of the acquire
in the lock phase. Just one after:

The loads before the initialization need to be acquires.
Also the implicit load after a failed CMPXCHG.

    for( signed char ref = flag.load( memory_order_acquire ); ; )
        if( ref > 0 ) [[likely]]
            return true;
        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_acquire );
        }
        else if( flag.compare_exchange_strong( ref, -1,
                 memory_order_relaxed, memory_order_acquire ) ) [[likely]]
            break;

Make it all relaxed, but then add in a single stand alone:

No, the write after the flag change need to be ordered afterwards.

I think you can make that logic all relaxed.

std::atomic_thread_fence(std::memory_order_acquire) after it and
before you call into the callable...

Not necessary, it's simpler as I did.

But my way uses a single acquire barrier after the logic has done its
thing. That is simpler and more efficient. Actually you only need one
acquire after that logic, _before_ callable is run, and one release
barrier _after_ the callable is run. Your atomic logic does not depend
on itself, it is only working with flag. It sure seems to be akin to
the following pattern:

atomic_mutex_lock(); // all relaxed
std::atomic_thread_fence(std::memory_order_acquire);

{
    // critical_section
}

std::atomic_thread_fence(std::memory_order_relaxed);

^^^^^^^^^^^^^^^^

GOD DAMN IT!!!!!!!!!!!!!!!!!!!!! That NEEDS to be:

std::atomic_thread_fence(std::memory_order_release);

Shit. Sorry about that Bonita! Uggg. ;^o

atomic_mutex_unlock(); // all relaxed

You only need the actual membars once right before you call into
callable, and once right after it.

Actually, std::atomic_thread_fence is more of a SPARC way to do things
wrt memory order.
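
The lock/unlock pattern above, with the corrected release fence, can be sketched as a tiny spinlock. This is an illustration only: the class name fenced_spinlock and its members are hypothetical, and a spinlock stands in for the atomic_mutex_lock()/atomic_mutex_unlock() pseudocode, which the thread does not define. All atomic operations on the lock word are relaxed; ordering comes from one acquire fence after the lock is taken and one release fence before it is dropped.

```cpp
#include <atomic>
#include <thread>

// Hypothetical spinlock: relaxed atomics plus standalone fences.
class fenced_spinlock
{
    std::atomic<bool> m_locked{false};

public:
    void lock() noexcept
    {
        // Relaxed read-modify-write: no ordering yet, only mutual exclusion.
        while (m_locked.exchange(true, std::memory_order_relaxed))
            std::this_thread::yield();
        // Acquire fence: later accesses cannot be reordered before the exchange.
        std::atomic_thread_fence(std::memory_order_acquire);
    }

    bool try_lock() noexcept
    {
        if (m_locked.exchange(true, std::memory_order_relaxed))
            return false;
        std::atomic_thread_fence(std::memory_order_acquire);
        return true;
    }

    void unlock() noexcept
    {
        // Release fence: earlier accesses cannot be reordered after the store.
        std::atomic_thread_fence(std::memory_order_release);
        m_locked.store(false, std::memory_order_relaxed);
    }
};
```

Because it provides lock() and unlock(), the class can be used with std::lock_guard. Whether the fence form is cheaper than acquire/release on the operations themselves depends on the target and is not measured here.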

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Fri Jan 2 13:43:02 2026

From Newsgroup: comp.lang.c++

On 1/2/2026 12:56 PM, Chris M. Thomasson wrote:

On 1/2/2026 12:52 PM, Chris M. Thomasson wrote:

On 1/2/2026 12:16 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:06 schrieb Chris M. Thomasson:

On 1/2/2026 11:49 AM, Chris M. Thomasson wrote:

On 1/2/2026 5:45 AM, Bonita Montero wrote:

Am 02.01.2026 um 00:36 schrieb Chris M. Thomasson:

On 1/1/2026 3:34 PM, Chris M. Thomasson wrote:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

[...]

Actually, why use a signed char? Why not a signed word?

Because I only need three states: > 0, 0 and < 0;

But, using a signed word is more natural in a sense? You need to
pad up to and align the xonce_flag on a l2 cache line anyway. You
don't want false sharing on this flag, right?

I think your algo with the sync, well, it should work okay. Unless I
missed something. Actually, I don't think you need all of the
acquire in the lock phase. Just one after:

The loads before the initialization need to be acquires.
Also the implicit load after a failed CMPXCHG.

    for( signed char ref = flag.load( memory_order_acquire ); ; )
        if( ref > 0 ) [[likely]]

You would need a stand alone acquire right here. But, you can keep your
logic all relaxed and only use the membar when you need it.

            return true;

        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_acquire );
        }
        else if( flag.compare_exchange_strong( ref, -1,
                 memory_order_relaxed, memory_order_acquire ) ) [[likely]]
            break;

Make it all relaxed, but then add in a single stand alone:

No, the write after the flag change need to be ordered afterwards.

I think you can make that logic all relaxed.

std::atomic_thread_fence(std::memory_order_acquire) after it and
before you call into the callable...

Not necessary, it's simpler as I did.

But my way uses a single acquire barrier after the logic has done its
thing. That is simpler and more efficient. Actually you only need one
acquire after that logic, _before_ callable is run, and one release
barrier _after_ the callable is run. Your atomic logic does not depend
on itself, it is only working with flag. It sure seems to be akin to
the following pattern:

atomic_mutex_lock(); // all relaxed
std::atomic_thread_fence(std::memory_order_acquire);

{
    // critical_section
}

std::atomic_thread_fence(std::memory_order_relaxed);

^^^^^^^^^^^^^^^^

GOD DAMN IT!!!!!!!!!!!!!!!!!!!!! That NEEDS to be:

std::atomic_thread_fence(std::memory_order_release);

Shit. Sorry about that Bonita! Uggg. ;^o

atomic_mutex_unlock(); // all relaxed

You only need the actual membars once right before you call into
callable, and once right after it.

Actually, std::atomic_thread_fence is more of a SPARC way to do things
wrt memory order.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Fri Jan 2 13:43:54 2026

From Newsgroup: comp.lang.c++

On 1/2/2026 12:16 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:06 schrieb Chris M. Thomasson:

On 1/2/2026 11:49 AM, Chris M. Thomasson wrote:

On 1/2/2026 5:45 AM, Bonita Montero wrote:

Am 02.01.2026 um 00:36 schrieb Chris M. Thomasson:

On 1/1/2026 3:34 PM, Chris M. Thomasson wrote:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

[...]

Actually, why use a signed char? Why not a signed word?

Because I only need three states: > 0, 0 and < 0;

But, using a signed word is more natural in a sense? You need to pad
up to and align the xonce_flag on a l2 cache line anyway. You don't
want false sharing on this flag, right?

I think your algo with the sync, well, it should work okay. Unless I
missed something. Actually, I don't think you need all of the acquire
in the lock phase. Just one after:

The loads before the initialization need to be acquires.
Also the implicit load after a failed CMPXCHG.

    for( signed char ref = flag.load( memory_order_acquire ); ; )
        if( ref > 0 ) [[likely]]
            return true;
        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_acquire );
        }
        else if( flag.compare_exchange_strong( ref, -1,
                 memory_order_relaxed, memory_order_acquire ) ) [[likely]]
            break;

Make it all relaxed, but then add in a single stand alone:

No, the write after the flag change need to be ordered afterwards.

You can keep it all relaxed and only add the membars when you need
them. Right before you return true aka ref > 0, you would need an acquire.

std::atomic_thread_fence(std::memory_order_acquire) after it and
before you call into the callable...

Not necessary, it's simpler as I did.
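
The point about needing an acquire right before returning on the ref > 0 fast path can be illustrated in isolation. This is a sketch under assumptions: lazy_slot, publish and try_read are hypothetical names, and the payload is a plain int standing in for whatever the callable initializes. The flag itself is only accessed with relaxed operations; the fences order the payload accesses around it.

```cpp
#include <atomic>

// Hypothetical slot: a plain payload guarded by a one-byte ready flag.
struct lazy_slot
{
    std::atomic<signed char> ready{0};
    int payload = 0;
};

// Writer side: fill the payload, then publish with release fence + relaxed store.
inline void publish(lazy_slot &slot, int value)
{
    slot.payload = value;
    std::atomic_thread_fence(std::memory_order_release);
    slot.ready.store(1, std::memory_order_relaxed);
}

// Reader fast path: relaxed load, and an acquire fence only once the flag is set.
inline bool try_read(lazy_slot &slot, int &out)
{
    if (slot.ready.load(std::memory_order_relaxed) > 0)
    {
        std::atomic_thread_fence(std::memory_order_acquire);
        out = slot.payload;  // ordered after the writer's payload store
        return true;
    }
    return false;
}
```

Without the acquire fence (or an acquire load), a reader that sees the flag set would have no guarantee of seeing the payload, which is why the fast path cannot be fully relaxed.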

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Fri Jan 2 13:49:32 2026

From Newsgroup: comp.lang.c++

On 12/31/2025 9:00 PM, Bonita Montero wrote:
[...]

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
xonce_flag() noexcept = default;
private:
friend bool xcall_once( xonce_flag &, std::invocable auto );
using flag_t = std::atomic<signed char>;
flag_t m_flag = 0;
};

bool xcall_once( xonce_flag &xflag, std::invocable auto callable )
{
using namespace std;
xonce_flag::flag_t &flag = xflag.m_flag;
for( signed char ref = flag.load( memory_order_relaxed ); ; )
if( ref > 0 ) [[likely]]

std::atomic_thread_fence(std::memory_order_acquire);

return true;
else if( ref < 0 ) [[unlikely]]
{
flag.wait( ref, memory_order_relaxed );
ref = flag.load( memory_order_relaxed );
}
else if( flag.compare_exchange_strong( ref, -1,
memory_order_relaxed, memory_order_relaxed ) ) [[likely]]
break;
bool succ = true;

std::atomic_thread_fence(std::memory_order_acquire);

try
{
if constexpr( requires { (bool)callable(); } )
succ = (bool)callable();
else
callable();
}
catch( ... )
{
std::atomic_thread_fence(std::memory_order_release);

flag.store( 0, memory_order_relaxed );
flag.notify_one();
throw;
}

std::atomic_thread_fence(std::memory_order_release);

flag.store( (char)succ, memory_order_relaxed );
if( succ )
flag.notify_all();
else
flag.notify_one();
return succ;
}
--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Fri Jan 2 14:54:13 2026

From Newsgroup: comp.lang.c++

On 1/2/2026 5:49 AM, Bonita Montero wrote:

Am 02.01.2026 um 09:36 schrieb David Brown:

On 02/01/2026 00:34, Chris M. Thomasson wrote:

On 12/31/2025 9:00 PM, Bonita Montero wrote:

Do you like that ?

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

bool xcall_once( xonce_flag &xflag, std::invocable auto callable )
{
    using namespace std;
    xonce_flag::flag_t &flag = xflag.m_flag;
    for( signed char ref = flag.load( memory_order_acquire ); ; )
        if( ref > 0 ) [[likely]]
            return true;
        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_acquire );
        }
        else if( flag.compare_exchange_strong( ref, -1,
                 memory_order_relaxed, memory_order_acquire ) ) [[likely]]
            break;
    bool succ = true;
    try
    {
        if constexpr( requires { (bool)callable(); } )

what is the API of callable()? What does it return, I notice the cast
to (bool)...

            succ = (bool)callable();
        else
            callable();

[...]

What if callable() returns zero, but zero is meant to denote success?
Are you familiar with the return values of a lot of POSIX API's? return
zero means success. Would that mess up your logic here? What if callable
does not have a return value.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Sat Jan 3 01:32:32 2026

From Newsgroup: comp.lang.c++

Am 02.01.2026 um 21:52 schrieb Chris M. Thomasson:

I think you can make that logic all relaxed.

Yes, with two additional barriers. But it's easier how I do it.

But my way uses a single acquire barrier after the logic has done its
thing. That is simpler and more efficient.

It's not simpler your way; you would need two additional barriers and I
have two implicit barriers at runtime.

You only need the actual membars once right before you call into
callable, and once right after it.

It's simpler how I do that.

Actually, std::atomic_thread_fence is more of a SPARC way to do things
wrt memory order.

SPARC is dead.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Sat Jan 3 01:34:04 2026

From Newsgroup: comp.lang.c++

Am 02.01.2026 um 21:54 schrieb Chris M. Thomasson:

The flag should be completely isolated from callable? and pad does this.

The callable is only a number of references [&] which will be optimized
away.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Sat Jan 3 01:35:07 2026

From Newsgroup: comp.lang.c++

Am 02.01.2026 um 23:54 schrieb Chris M. Thomasson:

What if callable() returns zero, but zero is meant to denote success?
Are you familiar with the return values of a lot of POSIX API's?
return zero means success. Would that mess up your logic here? What if callable does not have a return value.

If the callable returns false the flag remains 0, i.e. uninitialized.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Fri Jan 2 18:57:13 2026

From Newsgroup: comp.lang.c++

On 1/2/2026 4:35 PM, Bonita Montero wrote:

Am 02.01.2026 um 23:54 schrieb Chris M. Thomasson:

What if callable() returns zero, but zero is meant to denote success?
Are you familiar with the return values of a lot of POSIX API's?
return zero means success. Would that mess up your logic here? What if
callable does not have a return value.

If the callable returns false the flag remains 0, i.e. uninitialized.

But, what if "callable" returning 0 means it succeeded? Akin to pthread_mutex_lock() returning 0?

So, a user provided callable needs to return a bool?
--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Fri Jan 2 19:09:57 2026

From Newsgroup: comp.lang.c++

On 1/2/2026 4:32 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:52 schrieb Chris M. Thomasson:

I think you can make that logic all relaxed.

Yes, with two additional barriers. But it's easier how I do it.

My way has the barriers exactly where they are actually needed, and it's
way easier to read.

But my way uses a single acquire barrier after the logic has done its
thing. That is simpler and more efficient.

It's not simpler your way; you would need two additional barriers and I
have two implicit barriers at runtime.

It's better than using the membars in the damn cas wrt C++. One membar
for fail, one membar for success. Yeah. There can be rather major issues
with that...

You only need the actual membars once right before you call into
callable, and once right after it.

It's simpler how I do that.

Actually, not. Well, imvho.

Actually, std::atomic_thread_fence is more of a SPARC way to do things
wrt memory order.

SPARC is dead.

std::atomic_thread_fence can make things oh so much easier.
--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Fri Jan 2 19:18:40 2026

From Newsgroup: comp.lang.c++

On 1/2/2026 4:34 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:54 schrieb Chris M. Thomasson:

The flag should be completely isolated from callable? and pad does this.

The callable is only a number of references [&] which will be optimized away.

callable can call into god knows what... You want your flag to be isolated.
--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Fri Jan 2 19:48:45 2026

From Newsgroup: comp.lang.c++

On 1/2/2026 1:49 PM, Chris M. Thomasson wrote:

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

bool xcall_once( xonce_flag &xflag, std::invocable auto callable )
{
    using namespace std;
    xonce_flag::flag_t &flag = xflag.m_flag;
    for( signed char ref = flag.load( memory_order_relaxed ); ; )
        if( ref > 0 ) [[likely]]

Hummm... I would need to add the braces here. Totally forgot about that.

            std::atomic_thread_fence(std::memory_order_acquire);

            return true;

if( ref > 0 ) [[likely]]
{

std::atomic_thread_fence(std::memory_order_acquire);

return true;
}

Yikes!

        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_relaxed );
        }
        else if( flag.compare_exchange_strong( ref, -1, memory_order_relaxed, memory_order_relaxed ) ) [[likely]]
            break;
    bool succ = true;

    std::atomic_thread_fence(std::memory_order_acquire);

    try
    {
        if constexpr( requires { (bool)callable(); } )
            succ = (bool)callable();
        else
            callable();
    }
    catch( ... )
    {
        std::atomic_thread_fence(std::memory_order_release);

        flag.store( 0, memory_order_relaxed );
        flag.notify_one();
        throw;
    }

    std::atomic_thread_fence(std::memory_order_release);

    flag.store( (char)succ, memory_order_relaxed );
    if( succ )
        flag.notify_all();
    else
        flag.notify_one();
    return succ;
}

--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Sat Jan 3 04:57:00 2026

From Newsgroup: comp.lang.c++

You make the code more complicated for nothing.

Am 02.01.2026 um 22:49 schrieb Chris M. Thomasson:

On 12/31/2025 9:00 PM, Bonita Montero wrote:
[...]

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

bool xcall_once( xonce_flag &xflag, std::invocable auto callable )
{
    using namespace std;
    xonce_flag::flag_t &flag = xflag.m_flag;
    for( signed char ref = flag.load( memory_order_relaxed ); ; )
        if( ref > 0 ) [[likely]]

            std::atomic_thread_fence(std::memory_order_acquire);

            return true;
        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_relaxed );
        }
        else if( flag.compare_exchange_strong( ref, -1, memory_order_relaxed, memory_order_relaxed ) ) [[likely]]
            break;
    bool succ = true;

    std::atomic_thread_fence(std::memory_order_acquire);

    try
    {
        if constexpr( requires { (bool)callable(); } )
            succ = (bool)callable();
        else
            callable();
    }
    catch( ... )
    {
        std::atomic_thread_fence(std::memory_order_release);

        flag.store( 0, memory_order_relaxed );
        flag.notify_one();
        throw;
    }

    std::atomic_thread_fence(std::memory_order_release);

    flag.store( (char)succ, memory_order_relaxed );
    if( succ )
        flag.notify_all();
    else
        flag.notify_one();
    return succ;
}

--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Sat Jan 3 04:58:14 2026

From Newsgroup: comp.lang.c++

Am 03.01.2026 um 03:57 schrieb Chris M. Thomasson:

On 1/2/2026 4:35 PM, Bonita Montero wrote:

Am 02.01.2026 um 23:54 schrieb Chris M. Thomasson:

What if callable() returns zero, but zero is meant to denote
success? Are you familiar with the return values of a lot of POSIX
API's? return zero means success. Would that mess up your logic
here? What if callable does not have a return value.

If the callable returns false the flag remains 0, i.e. uninitialized.

But, what if "callable" returning 0 means it succeeded? Akin to pthread_mutex_lock() returning 0?

It's the same behaviour as if the initialization code of a once_flag
throws an exception; the once_flag remains uninitialized.

So, a user provided callable needs to return a bool?

--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Sat Jan 3 04:59:13 2026

From Newsgroup: comp.lang.c++

Am 03.01.2026 um 04:18 schrieb Chris M. Thomasson:

On 1/2/2026 4:34 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:54 schrieb Chris M. Thomasson:

The flag should be completely isolated from callable? and pad does
this.

The callable is only a number of references [&] which will be
optimized away.

callable can call into god knows what... You want your flag to be
isolated.

My code behaves the same way as with a std::call_once in that sense.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Sat Jan 3 05:00:04 2026

From Newsgroup: comp.lang.c++

Am 03.01.2026 um 04:09 schrieb Chris M. Thomasson:

On 1/2/2026 4:32 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:52 schrieb Chris M. Thomasson:

I think you can make that logic all relaxed. +

Yes, with two additional barriers. But it's easier how I do it.

My way has the barriers exactly where they are actually needed, and
it's way easier to read.

Absolutely not.

But my way uses a single acquire barrier after the logic has done
its thing. That is simpler and more efficient.

It's not simpler your way; you would need two additional barriers and
I have two implicit barriers at runtime.

It's better than using the membars in the damn cas wrt C++. One membar
for fail, one membar for success. Yeah. There can be rather major
issues with that...

Wrong.

You only need the actual membars once right before you call into
callable, and once right after it.

It's simpler how I do that.

Actually, not. Well, imvho.

Actually, std::atomic_thread_fence is more of a SPARC way to do
things wrt memory order.

SPARC is dead.

std::atomic_thread_fence can make things oh so much easier.

No, it makes my code more complicated.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Fri Jan 2 20:02:33 2026

From Newsgroup: comp.lang.c++

On 1/2/2026 8:00 PM, Bonita Montero wrote:

Am 03.01.2026 um 04:09 schrieb Chris M. Thomasson:

On 1/2/2026 4:32 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:52 schrieb Chris M. Thomasson:

I think you can make that logic all relaxed. +

Yes, with two additional barriers. But it's easier how I do it.

My way has the barriers exactly where they are actually needed, and
it's way easier to read.

Absolutely not.

But my way uses a single acquire barrier after the logic has done
its thing. That is simpler and more efficient.

It's not simpler your way; you would need two additional barriers and
I have two implicit barriers at runtime.

It's better than using the membars in the damn cas wrt C++. One membar
for fail, one membar for success. Yeah. There can be rather major
issues with that...

Wrong.

Oh really? How?

You only need the actual membars once right before you call into
callable, and once right after it.

It's simpler how I do that.

Actually, not. Well, imvho.

Actually, std::atomic_thread_fence is more of a SPARC way to do
things wrt memory order.

SPARC is dead.

std::atomic_thread_fence can make things oh so much easier.

No, it makes my code more complicated.

Actually, it does not. Well, imvvho. I love the SPARC way of doing
things wrt memory ordering. Your memory order, afaict, is correct. But,
the stand alone one works and it only executes a membar when it's 100%
needed.
--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Fri Jan 2 20:03:45 2026

From Newsgroup: comp.lang.c++

On 1/2/2026 7:59 PM, Bonita Montero wrote:

Am 03.01.2026 um 04:18 schrieb Chris M. Thomasson:

On 1/2/2026 4:34 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:54 schrieb Chris M. Thomasson:

The flag should be completely isolated from callable? and pad does
this.

The callable is only a number of references [&] which will be
optimized away.

callable can call into god knows what... You want your flag to be
isolated.

My code behaves the same way as with a std::call_once in that sense.

You want your flag to be isolated from callable. Ideally aligned and
padded to an L2 cache line.
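
A minimal sketch of what "aligned and padded" could look like. The 64-byte line size and the names `assumed_line_size`, `padded_once_state` and `demo_stride` are assumptions for illustration; the real line size is platform-specific.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64 bytes is a common cache-line size; adjust for the target.
inline constexpr std::size_t assumed_line_size = 64;

// The alignment puts the flag at the start of its own line, and the
// padding keeps neighbouring objects out of the rest of that line.
struct alignas( assumed_line_size ) padded_once_state
{
    std::atomic<signed char> flag{ 0 };
    char pad[assumed_line_size - sizeof( std::atomic<signed char> )];
};

static_assert( alignof( padded_once_state ) == assumed_line_size );
static_assert( sizeof( padded_once_state ) == assumed_line_size );

// Hypothetical check: returns the distance in bytes between two
// adjacent array elements.
std::size_t demo_stride()
{
    static padded_once_state states[2];
    auto *first = reinterpret_cast<const char *>( &states[0] );
    auto *second = reinterpret_cast<const char *>( &states[1] );
    assert( reinterpret_cast<std::uintptr_t>( first ) % assumed_line_size == 0 );
    return static_cast<std::size_t>( second - first );
}
```

The cost is 63 bytes of padding per flag on the assumed line size.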
--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Fri Jan 2 20:05:14 2026

From Newsgroup: comp.lang.c++

On 1/2/2026 7:57 PM, Bonita Montero wrote:

You make the code more complicated for nothing.

[...]

It makes the memory sync MUCH easier to read, imvho. Also, it's not more complicated, it's more concise.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Sat Jan 3 05:20:23 2026

From Newsgroup: comp.lang.c++

Am 03.01.2026 um 05:03 schrieb Chris M. Thomasson:

On 1/2/2026 7:59 PM, Bonita Montero wrote:

Am 03.01.2026 um 04:18 schrieb Chris M. Thomasson:

On 1/2/2026 4:34 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:54 schrieb Chris M. Thomasson:

The flag should be completely isolated from callable? and pad
does this.

The callable is only a number of references [&] which will be
optimized away.

callable can call into god knows what... You want your flag to be
isolated.

My code behaves the same way as with a std::call_once in that sense.

You want your flag to be isolated from callable. Ideally aligned and
padded to an L2 cache line.

You're really really sick !
The flag is written only a few times until initialization succeeds.
Then it remains in a shared cacheline; so there's no false sharing.
And no need for alignment.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Sat Jan 3 05:21:11 2026

From Newsgroup: comp.lang.c++

Am 03.01.2026 um 05:05 schrieb Chris M. Thomasson:

On 1/2/2026 7:57 PM, Bonita Montero wrote:

You make the code more complicated for nothing.

[...]

It makes the memory sync MUCH easier to read, imvho. Also, it's not
more complicated, it's more concise.

You're really sick. These are 24 lines of code.
If you think it's too hard to read don't program at all.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Fri Jan 2 22:24:11 2026

From Newsgroup: comp.lang.c++

On 1/2/2026 8:21 PM, Bonita Montero wrote:

Am 03.01.2026 um 05:05 schrieb Chris M. Thomasson:

On 1/2/2026 7:57 PM, Bonita Montero wrote:

You make the code more complicated for nothing.

[...]

It makes the memory sync MUCH easier to read, imvho. Also, it's not
more complicated, it's more concise.

You're really sick. These are 24 lines of code.
If you think it's too hard to read don't program at all.

I'm saying your original version with the membars is, as far as I can
tell, correct. Yes, I can read it just fine. The whole (bool)callable()
thing aside for a moment... ;^o

I just wanted to show another way to place the membars. The SPARC style
is neat, and C++ lets us express it cleanly. It's simply easier for me
to think about the protocol when the barriers are spelled out
explicitly. In this layout, the membars are exactly where they need to
be, and all the atomics are relaxed.

Your (bool)callable issue is interesting, by the way.

Anyway, here's the SPARC-style sketch I typed into the newsreader
(forgive any typos). This is the hazard-pointer load pattern. The
storeload membar makes the whole thing easy to reason about:
_________________
ct_tls& tls = ct_get_tls();

for (;;)
{
    void* ptr0 = atomic_load(&anchor);

    if (! ptr0)
    {
        tls.hazard = nullptr;
        return nullptr;
    }

    atomic_store(&tls.hazard, ptr0);

    membar_storeload(); // MEMBAR #StoreLoad | #StoreStore

    void* ptr1 = atomic_load(&anchor);

    if (ptr0 == ptr1)
    {
        // no-op (DEC Alpha (mb) aside for a moment,
        // compiler barrier aside...)
        membar_consume();

        return ptr0;
    }
}
_________________

C++ better NOT use an acquire barrier for my membar_consume()! GRRRRRR!

:^D
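
For readers who want the same pattern in portable C++, here is a rough translation of the pseudo-code above. It is a sketch under assumptions: `node`, `hazard_slot` and `load_with_hazard` are hypothetical names standing in for `ct_tls` and friends, and a `seq_cst` fence plays the role of `MEMBAR #StoreLoad`. The final fence is an acquire fence because standard C++ has no standalone consume fence, so it is stronger than what the post asks for.

```cpp
#include <atomic>
#include <cassert>

struct node { int value; };

// Hypothetical stand-in for the per-thread ct_tls record.
struct hazard_slot
{
    std::atomic<node *> hazard{ nullptr };
};

// Publishes the pointer read from 'anchor' as a hazard pointer and
// returns it once a re-read confirms it was still current.
node *load_with_hazard( std::atomic<node *> &anchor, hazard_slot &tls )
{
    for( ;; )
    {
        node *ptr0 = anchor.load( std::memory_order_relaxed );
        if( !ptr0 )
        {
            tls.hazard.store( nullptr, std::memory_order_relaxed );
            return nullptr;
        }
        tls.hazard.store( ptr0, std::memory_order_relaxed );
        // Role of MEMBAR #StoreLoad: the hazard store has to be ordered
        // before the re-read of the anchor.
        std::atomic_thread_fence( std::memory_order_seq_cst );
        node *ptr1 = anchor.load( std::memory_order_relaxed );
        if( ptr0 == ptr1 )
        {
            // Conservative stand-in for membar_consume().
            std::atomic_thread_fence( std::memory_order_acquire );
            return ptr0;
        }
    }
}
```

The reclaiming side, which would scan the hazard slots before freeing a node, is left out here.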
--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Sat Jan 3 10:08:07 2026

From Newsgroup: comp.lang.c++

Am 03.01.2026 um 07:24 schrieb Chris M. Thomasson:

I just wanted to show another way to place the membars. The SPARC
style is neat, and C++ lets us express it cleanly. It's simply easier
for me to think about the protocol when the barriers are spelled out
explicitly. In this layout, the membars are exactly where they need to
be, and all the atomics are relaxed.

I like my minimalism.
If this were a more complex synchronization algorithm spanning screen
pages of code you might be right, but not with this small amount of code.

Your (bool)callable issue is interesting, by the way.

That's why I wrote that. Otherwise I could have stuck with
std::once_flag.

Anyway, here's the SPARC-style sketch I typed into the newsreader
(forgive any typos). This is the hazard-pointer load pattern. The
storeload membar makes the whole thing easy to reason about:

SPARC is dead.
Neither Oracle nor Fujitsu has officially discontinued these CPUs,
but the last SPARC CPUs came out nine years ago. Fujitsu has moved
its development team to designing new ARM CPUs.

C++ better NOT use an acquire barrier for my membar_consume()! GRRRRRR!

I don't know whether a childish attitude is appropriate for software development.
But at least when it comes to such small details I might be right.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Sat Jan 3 11:22:30 2026

From Newsgroup: comp.lang.c++

On 1/2/2026 8:20 PM, Bonita Montero wrote:

Am 03.01.2026 um 05:03 schrieb Chris M. Thomasson:

On 1/2/2026 7:59 PM, Bonita Montero wrote:

Am 03.01.2026 um 04:18 schrieb Chris M. Thomasson:

On 1/2/2026 4:34 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:54 schrieb Chris M. Thomasson:

The flag should be completely isolated from callable? and pad
does this.

The callable is only a number of references [&] which will be
optimized away.

callable can call into god knows what... You want your flag to be
isolated.

My code behaves the same way as with a std::call_once in that sense.

You want your flag to be isolated from callable. Ideally aligned and
padded to an L2 cache line.

You're really really sick !
The flag is written only a few times until initialization succeeds.
Then it remains in a shared cacheline; so there's no false sharing.
And no need for alignment.

I disagree. Try to get rid of any possibility of false sharing. Strive
for it. It's just good hygiene! :^)
--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Sat Jan 3 11:39:56 2026

From Newsgroup: comp.lang.c++

On 1/3/2026 1:08 AM, Bonita Montero wrote:

Am 03.01.2026 um 07:24 schrieb Chris M. Thomasson:

I just wanted to show another way to place the membars. The SPARC
style is neat, and C++ lets us express it cleanly. It's simply easier
for me to think about the protocol when the barriers are spelled out
explicitly. In this layout, the membars are exactly where they need to
be, and all the atomics are relaxed.

I like my minimalism.
If this were a more complex synchronization algorithm spanning screen
pages of code you might be right, but not with this small amount of code.

To each their own.

Your (bool)callable issue is interesting, by the way.

That's why I wrote that. Otherwise I could have stuck with
std::once_flag.

It's just that (bool)callable is a bit scary to me.

Anyway, here's the SPARC-style sketch I typed into the newsreader
(forgive any typos). This is the hazard-pointer load pattern. The
storeload membar makes the whole thing easy to reason about:

SPARC is dead.

If you say so. I happen to like the way it handled memory order with its MEMBAR instruction.

Neither Oracle nor Fujitsu has officially discontinued these CPUs,
but the last SPARC CPUs came out nine years ago. Fujitsu has moved
its development team to designing new ARM CPUs.

Okay.

C++ better NOT use an acquire barrier for my membar_consume()! GRRRRRR!

I don't know whether a childish attitude is appropriate for software development.
But at least when it comes to such small details I might be right.

Oh my. If a damn compiler puts in a MEMBAR #LoadStore | #LoadLoad for a consume membar, I would be pissed off. You should be pissed off as well.

In a sense, if expecting a compiler to respect the memory model and
avoid unnecessary hardware fences is 'childish,' then I guess the entire
C++ Standards Committee is in preschool. Efficiency isn't a small
detail; it's the whole point

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Sat Jan 3 11:57:11 2026

From Newsgroup: comp.lang.c++

On 1/2/2026 7:58 PM, Bonita Montero wrote:

Am 03.01.2026 um 03:57 schrieb Chris M. Thomasson:

On 1/2/2026 4:35 PM, Bonita Montero wrote:

Am 02.01.2026 um 23:54 schrieb Chris M. Thomasson:

What if callable() returns zero, but zero is meant to denote
success? Are you familiar with the return values of a lot of POSIX
API's? return zero means success. Would that mess up your logic
here? What if callable does not have a return value.

If the callable returns false the flag remains 0, i.e. uninitialized.

But, what if "callable" returning 0 means it succeeded? Akin to
pthread_mutex_lock() returning 0?

It's the same behaviour as if the initialization code of a once_flag
throws an exception; the once_flag remains uninitialized.

So, a user provided callable needs to return a bool?

Okay, but that means your design is implicitly imposing a contract on
the user: the callable must return a bool, and "true" must mean
successful initialization? That's fine if it's documented, but it's not
a general pattern in a sense. Humm...

Plenty of APIs, POSIX being an example, use 0 to mean success.
If someone passes a callable that follows that convention, your logic
treats a successful initialization as a failure and leaves the flag uninitialized. Well, shit happens...

So the question isn't whether your approach works for your specific
use case... It's whether the interface is robust for "arbitrary"
callables? Right now, it isn't unless you require a bool returning
callable with a very specific meaning...

Fair enough?

--- Synchronet 3.21a-Linux NewsLink 1.2

From Richard Damon@Richard@Damon-Family.org to comp.lang.c++ on Sat Jan 3 17:59:19 2026

From Newsgroup: comp.lang.c++

On 1/3/26 2:57 PM, Chris M. Thomasson wrote:

On 1/2/2026 7:58 PM, Bonita Montero wrote:

Am 03.01.2026 um 03:57 schrieb Chris M. Thomasson:

On 1/2/2026 4:35 PM, Bonita Montero wrote:

Am 02.01.2026 um 23:54 schrieb Chris M. Thomasson:

What if callable() returns zero, but zero is meant to denote
success? Are you familiar with the return values of a lot of POSIX
API's? return zero means success. Would that mess up your logic
here? What if callable does not have a return value.

If the callable returns false the flag remains 0, i.e. uninitialized.

But, what if "callable" returning 0 means it succeeded? Akin to
pthread_mutex_lock() returning 0?

It's the same behaviour as if the initialization code of a once_flag
throws an exception; the once_flag remains uninitialized.

So, a user provided callable needs to return a bool?

Okay, but that means your design is implicitly imposing a contract on
the user: the callable must return a bool, and "true" must mean
successful initialization? That's fine if it's documented, but it's not
a general pattern in a sense. Humm...

Plenty of APIs, POSIX being an example, use 0 to mean success.
If someone passes a callable that follows that convention, your logic
treats a successful initialization as a failure and leaves the flag uninitialized. Well, shit happens...

So the question isn't whether your approach works for your specific
use case... It's whether the interface is robust for "arbitrary"
callables? Right now, it isn't unless you require a bool returning
callable with a very specific meaning...

Fair enough?

Just says that if you have a function that doesn't return true on
success, you need to wrap it with a thin wrapper to return true on success.

Not uncommon to need thin shims like this in "generic" interfaces.

If int foo() returns 0 on success, you need a
bool shim_foo() { return 0 == foo(); }
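
The shim can also be written once as a small adaptor instead of one wrapper per function. A sketch, assuming the wrapped callable returns an integer status with 0 meaning success; `zero_is_success` and `posix_style_init` are made-up names:

```cpp
#include <cassert>
#include <utility>

// Wraps a callable that reports success as 0 so that the result
// reports success as true instead.
template<class F>
auto zero_is_success( F f )
{
    return [f = std::move( f )]() mutable -> bool { return f() == 0; };
}

// Hypothetical POSIX-style initializer: 0 on success, an error code otherwise.
int posix_style_init( bool fail )
{
    return fail ? 22 : 0;
}
```

With that, a call such as `xcall_once( flag, zero_is_success( [] { return posix_style_init( false ); } ) )` would treat a 0 return as a successful initialization.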

--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Sun Jan 4 03:32:03 2026

From Newsgroup: comp.lang.c++

Am 03.01.2026 um 20:57 schrieb Chris M. Thomasson:

On 1/2/2026 7:58 PM, Bonita Montero wrote:

Am 03.01.2026 um 03:57 schrieb Chris M. Thomasson:

On 1/2/2026 4:35 PM, Bonita Montero wrote:

Am 02.01.2026 um 23:54 schrieb Chris M. Thomasson:

What if callable() returns zero, but zero is meant to denote
success? Are you familiar with the return values of a lot of POSIX
API's? return zero means success. Would that mess up your logic
here? What if callable does not have a return value.

If the callable returns false the flag remains 0, i.e. uninitialized.

But, what if "callable" returning 0 means it succeeded? Akin to
pthread_mutex_lock() returning 0?

It's the same behaviour as if the initialization code of a once_flag
throws an exception; the once_flag remains uninitialized.

So, a user provided callable needs to return a bool?

Okay, but that means your design is implicitly imposing a contract on
the user: the callable must return a bool, and "true" must mean
successful initialization? That's fine if it's documented, but it's not
a general pattern in a sense. Humm...

It can return a bool, but it doesn't have to (if constexpr( ... )).

Plenty of APIs, POSIX being an example, use 0 to mean success.
If someone passes a callable that follows that convention, your logic
treats a successful initialization as a failure and leaves the flag uninitialized. Well, shit happens...

So the question isn't whether your approach works for your specific
use case... It's whether the interface is robust for "arbitrary"
callables? Right now, it isn't unless you require a bool returning
callable with a very specific meaning...

Fair enough?

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Sat Jan 3 18:47:01 2026

From Newsgroup: comp.lang.c++

On 1/3/2026 2:59 PM, Richard Damon wrote:

On 1/3/26 2:57 PM, Chris M. Thomasson wrote:

On 1/2/2026 7:58 PM, Bonita Montero wrote:

Am 03.01.2026 um 03:57 schrieb Chris M. Thomasson:

On 1/2/2026 4:35 PM, Bonita Montero wrote:

Am 02.01.2026 um 23:54 schrieb Chris M. Thomasson:

What if callable() returns zero, but zero is meant to denote
success? Are you familiar with the return values of a lot of POSIX
API's? return zero means success. Would that mess up your logic
here? What if callable does not have a return value.

If the callable returns false the flag remains 0, i.e. uninitialized.

But, what if "callable" returning 0 means it succeeded? Akin to
pthread_mutex_lock() returning 0?

It's the same behaviour as if the initialization code of a once_flag
throws an exception; the once_flag remains uninitialized.

So, a user provided callable needs to return a bool?

Okay, but that means your design is implicitly imposing a contract on
the user: the callable must return a bool, and "true" must mean
successful initialization? That's fine if it's documented, but it's
not a general pattern in a sense. Humm...

Plenty of APIs, POSIX being an example, use 0 to mean success.
If someone passes a callable that follows that convention, your logic
treats a successful initialization as a failure and leaves the flag
uninitialized. Well, shit happens...

So the question isn't whether your approach works for your specific
use case... It's whether the interface is robust for "arbitrary"
callables? Right now, it isn't unless you require a bool returning
callable with a very specific meaning...

Fair enough?

Just says that if you have a function that doesn't return true on
success, you need to wrap it with a thin wrapper to return true on success.

Not uncommon to need thin shims like this in "generic" interfaces.

If int foo() returns 0 on success, you need a
bool shim_foo() { return 0 == foo(); }

Fine. It just needs to be documented.
--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Sun Jan 4 08:04:34 2026

From Newsgroup: comp.lang.c++

Just says that if you have a function that doesn't return true on
success, you need to wrap it with a thin wrapper to return true on success.

If the return type is bool, use if( fn() ) or if( !fn() ).
If it returns an integer with 0 for success, use if( fn() == 0 ).
The rest is easily readable from the context.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Sun Jan 4 08:06:36 2026

From Newsgroup: comp.lang.c++

Am 03.01.2026 um 20:22 schrieb Chris M. Thomasson:

I disagree. Try to get rid of any possibility of false sharing. Strive
for it. It's just good hygiene! :^)

No, false sharing needs to be avoided if it happens at least sometimes.
Here false sharing might occur just once, when the "once-flag" is
initialized; otherwise the flag / the cacheline remains in shared mode.
The performance impact of completely avoiding false sharing here is
nearly zero.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Sun Jan 4 08:11:44 2026

From Newsgroup: comp.lang.c++

Am 03.01.2026 um 20:39 schrieb Chris M. Thomasson:

Your (bool)callable issue is interesting, by the way.

That's why I wrote that. Otherwise I could have stuck with
std::once_flag.

It's just that (bool)callable is a bit scary to me.

I don't understand you; the interface is easy to understand.
And if you want simpler code inside xcall_once than in call_once and
don't need the boolean return feature, you could just return nothing.

Anyway, here's the SPARC-style sketch I typed into the newsreader
(forgive any typos). This is the hazard-pointer load pattern. The
storeload membar makes the whole thing easy to reason about:

SPARC is dead.

If you say so. I happen to like the way it handled memory order with
its MEMBAR instruction.

I'm using membars in my code in the most efficient way since how I
did it is the simplest way to do that.

I don't know whether a childish attitude is appropriate for software
development.
But at least when it comes to such small details I might be right.

Oh my. If a damn compiler puts in a MEMBAR #LoadStore | #LoadLoad for
a consume membar, I would be pissed off. You should be pissed off as well.

I use as few membars as possible, and where I use them they're
in the right place. I don't follow the minimalism paradigm
all the time, but here it applies.

In a sense, if expecting a compiler to respect the memory model and
avoid unnecessary hardware fences is 'childish,' then I guess the
entire C++ Standards Committee is in preschool. Efficiency isn't a
small detail; it's the whole point

_I_ do that the simplest way, you added superfluous code to make
the code more understandable. With a function that is only 34
lines of code ...

--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Sun Jan 4 08:32:49 2026

From Newsgroup: comp.lang.c++

Am 03.01.2026 um 05:02 schrieb Chris M. Thomasson:

It's better than using the membars in the damn cas wrt C++. One
membar for fail, one membar for success. Yeah. There can be rather
major issues with that...

Wrong.

Oh really? How?

I use the membars correctly and as minimal as possible.

No, it makes my code more complicated.

Actually, it does not. Well, imvvho. I love the SPARC way of doing
things wrt memory ordering. Your memory order, afaict, is correct.
But, the stand alone one works and it only executes a membar when it's
100% needed.

Forget SPARC, it's dead.
And C++ has complete abstraction of any applicable barrier.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Sun Jan 4 14:05:03 2026

From Newsgroup: comp.lang.c++

On 1/3/2026 11:11 PM, Bonita Montero wrote:

Am 03.01.2026 um 20:39 schrieb Chris M. Thomasson:

Your (bool)callable issue is interesting, by the way.

That's why I wrote that. Otherwise I could have stuck with
std::once_flag.

It's just that (bool)callable is a bit scary to me.

I don't understand you; the interface is easy to understand.
And if you want simpler code inside xcall_once than in call_once and
don't need the boolean return feature, you could just return nothing.

You should put a clear comment about (bool)callable?

Anyway, here's the SPARC-style sketch I typed into the newsreader
(forgive any typos). This is the hazard-pointer load pattern. The
storeload membar makes the whole thing easy to reason about:

SPARC is dead.

If you say so. I happen to like the way it handled memory order with
its MEMBAR instruction.

I'm using membars in my code in the most efficient way since how I
did it is the simplest way to do that.

That's fine.

I don't know whether a childish attitude is appropriate for software
development.
But at least when it comes to such small details I might be right.

Your code is fine, (bool)callable aside for a moment. I can read it. I
just wanted to show another way to use stand alone fences. That's all.

Oh my. If a damn compiler puts in a MEMBAR #LoadStore | #LoadLoad for
a consume membar, I would be pissed off. You should be pissed off as
well.

I use as few membars as possible, and where I use them they're
in the right place. I don't follow the minimalism paradigm
all the time, but here it applies.

In a sense, if expecting a compiler to respect the memory model and
avoid unnecessary hardware fences is 'childish,' then I guess the
entire C++ Standards Committee is in preschool. Efficiency isn't a
small detail; it's the whole point

_I_ do that the simplest way, you added superfluous code to make
the code more understandable. With a function that is only 34
lines of code ...

That is another matter. A consume membar. Your code does not use them.
But, if you ever do, well, make sure a rogue compiler is not putting in
an acquire barrier when it does not have to. I should have made another
topic about it. Sorry for that.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Sun Jan 4 14:06:40 2026

From Newsgroup: comp.lang.c++

On 1/3/2026 11:04 PM, Bonita Montero wrote:

Just says that if you have a function that doesn't return true on
success, you need to wrap it with a thin wrapper to return true on success.

If the return type is bool, use if( fn() ) or if( !fn() ).
If it returns an integer with 0 for success, use if( fn() == 0 ).
The rest is easily readable from the context.

Right, but people might like a clear comment about it before reading the
code? Fair enough?
--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Sun Jan 4 14:13:56 2026

From Newsgroup: comp.lang.c++

On 1/3/2026 11:32 PM, Bonita Montero wrote:

Am 03.01.2026 um 05:02 schrieb Chris M. Thomasson:

It's better than using the membars in the damn cas wrt C++. One
membar for fail, one membar for success. Yeah. There can be rather
major issues with that...

Wrong.

Oh really? How?

I use the membars correctly and as minimal as possible.

The membars for CAS success and fail can be a bit sketchy. We were
discussing them well before the C++11 std back in
comp.programming.threads. I wonder if Alex Terekhov is still reading
usenet.
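
One way to sidestep the question of which ordering goes on the success and failure sides of a CAS is to make both relaxed and put a standalone fence on the path that needs it. A minimal sketch; `try_claim` is a hypothetical helper, and the state values 0 and -1 mirror the flag discussed in this thread:

```cpp
#include <atomic>
#include <cassert>

// Tries to move 'state' from 0 (free) to -1 (claimed).
bool try_claim( std::atomic<signed char> &state )
{
    signed char expected = 0;
    if( !state.compare_exchange_strong( expected, -1,
            std::memory_order_relaxed, std::memory_order_relaxed ) )
        return false;   // lost the race: this path has no fence
    // Only the winner pays for the acquire fence.
    std::atomic_thread_fence( std::memory_order_acquire );
    return true;
}
```

Whether this reads better than putting the orderings on the CAS itself is the matter of taste argued over in this thread.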

No, it makes my code more complicated.

Actually, it does not. Well, imvvho. I love the SPARC way of doing
things wrt memory ordering. Your memory order, afaict, is correct.
But, the stand alone one works and it only executes a membar when it's
100% needed.

Forget SPARC, it's dead.

Should the std get rid of stand alone fences?

And C++ has complete abstraction of any applicable barrier.

Even consume?
--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Sun Jan 4 14:19:40 2026

From Newsgroup: comp.lang.c++

On 1/3/2026 11:06 PM, Bonita Montero wrote:

Am 03.01.2026 um 20:22 schrieb Chris M. Thomasson:

I disagree. Try to get rid of any possibility of false sharing. Strive
for it. It's just good hygiene! :^)

No, false sharing needs to be avoided if it happens at least sometimes.
Here false sharing might occur just once when the "once-flag" is
initialized; otherwise the flag / the cacheline remains in shared mode. The
performance impact of completely avoiding false sharing here is nearly
zero.

I would still do it. But, that's just me. It's a bit more than that.
Every atomic RMW, store, load, on your flag. Think of a reservation
granule for systems with LL/SC. A user needs to be careful where they
place those flags. But, well, they can align it themselves. So, whatever.
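
As a sketch of "they can align it themselves": one way a user might keep a flag on its own cache line is an alignas wrapper. The 64-byte line size and the names assumed_line and padded_flag are assumptions for illustration; the real line size (or LL/SC reservation granule) depends on the target.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Assumed cache-line size; 64 bytes is common on x86-64 but not universal.
constexpr std::size_t assumed_line = 64;

// Hypothetical wrapper: alignas aligns the object and, because sizeof is
// always a multiple of the alignment, also pads it to a full line.
struct alignas(assumed_line) padded_flag
{
    std::atomic<signed char> value{0};
};

static_assert(alignof(padded_flag) == assumed_line, "flag is not line-aligned");
static_assert(sizeof(padded_flag) == assumed_line, "flag does not fill its line");
```

With this layout two adjacent padded_flag objects in an array never share a line, at the cost of 63 bytes of padding per flag.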
--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Mon Jan 5 01:00:36 2026

From Newsgroup: comp.lang.c++

Am 04.01.2026 um 23:13 schrieb Chris M. Thomasson:

On 1/3/2026 11:32 PM, Bonita Montero wrote:

Am 03.01.2026 um 05:02 schrieb Chris M. Thomasson:

It's better than using the membars in the damn CAS wrt C++. One
membar for fail, one membar for success. Yeah. There can be rather
major issues with that...

Wrong.

Oh really? How?

I use the membars correctly and as minimal as possible.

The membars for CAS success and fail can be a bit sketchy. We were
discussing them well before the C++11 std back in
comp.programming.threads. I wonder if Alex Terekhov is still reading
usenet.

My code is simple in that sense.

No, it makes my code more complicated.

Actually, it does not. Well, imvvho. I love the SPARC way of doing
things wrt memory ordering. Your memory order, afaict, is correct.
But, the stand alone one works and it only executes a membar when
it's 100% needed.

Forget SPARC, it's dead.

Should the std get rid of stand alone fences?

No, but they're mostly not needed.

And C++ has complete abstraction of any applicable barrier.

Even consume?

Acquire and release consistency is sufficient 95% of the time.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Mon Jan 5 01:01:34 2026

From Newsgroup: comp.lang.c++

Am 04.01.2026 um 23:19 schrieb Chris M. Thomasson:

On 1/3/2026 11:06 PM, Bonita Montero wrote:

Am 03.01.2026 um 20:22 schrieb Chris M. Thomasson:

I disagree. Try to get rid of any possibility of false sharing.
Strive for it. It's just good hygiene! :^)

No, false sharing needs to be avoided if it happens at least sometimes.
Here false sharing might occur just once when the "once-flag" is
initialized; otherwise the flag / the cacheline remains in shared mode. The
performance impact of completely avoiding false sharing here is nearly
zero.

I would still do it. But, that's just me. It's a bit more than that.
Every atomic RMW, store, load, on your flag. Think of a reservation
granule for systems with LL/SC. A user needs to be careful where they
place those flags. But, well, they can align it themselves. So, whatever.

Sorry, the initialization happens only once and after that the flag's
cacheline is read-shared.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Mon Jan 5 01:03:59 2026

From Newsgroup: comp.lang.c++

Am 04.01.2026 um 23:05 schrieb Chris M. Thomasson:

On 1/3/2026 11:11 PM, Bonita Montero wrote:

Am 03.01.2026 um 20:39 schrieb Chris M. Thomasson:

Your (bool)callable issue is interesting, by the way.

That's why I wrote that. Otherwise I could have stuck with
std::once_flag.

It's just that (bool)callable is a bit scary to me.

I don't understand you; the interface is understandable in an easy way.
And if you need simpler code inside xcall_once than in call_once and
not the boolean return feature, you could just return nothing.

You should put a clear comment about (bool)callable?

That's sufficient:

    if constexpr( requires { (bool)callable(); } )

I don't know whether a childish attitude is appropriate for software
development.
But at least when it comes to such small details I might be right.

Your code is fine, (bool)callable aside for a moment. I can read it. I
just wanted to show another way to use stand alone fences. That's all.

I'm using fences as minimalized as possible.

That is another matter. A consume membar. Your code does not use them.

Of course, I'm using it twice. Once after the first flag load and once
after a failed CAS.

But, if you ever do, well, make sure a rogue compiler is not putting
in an acquire barrier when it does not have to. I should have made
another topic about it. Sorry for that.

I'm using the fences properly and as minimal as possible.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Sun Jan 4 16:40:33 2026

From Newsgroup: comp.lang.c++

On 1/4/2026 4:01 PM, Bonita Montero wrote:

Am 04.01.2026 um 23:19 schrieb Chris M. Thomasson:

On 1/3/2026 11:06 PM, Bonita Montero wrote:

Am 03.01.2026 um 20:22 schrieb Chris M. Thomasson:

I disagree. Try to get rid of any possibility of false sharing.
Strive for it. It's just good hygiene! :^)

No, false sharing needs to be avoided if it happens at least sometimes.
Here false sharing might occur just once when the "once-flag" is
initialized; otherwise the flag / the cacheline remains in shared mode. The
performance impact of completely avoiding false sharing here is nearly
zero.

I would still do it. But, that's just me. It's a bit more than that.
Every atomic RMW, store, load, on your flag. Think of a reservation
granule for systems with LL/SC. A user needs to be careful where they
place those flags. But, well, they can align it themselves. So, whatever.

Sorry, the initialization happens only once and after that the flag's
cacheline is read-shared.

What about that CAS in the loop?
--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Sun Jan 4 18:46:58 2026

From Newsgroup: comp.lang.c++

On 1/4/2026 4:03 PM, Bonita Montero wrote:

Am 04.01.2026 um 23:05 schrieb Chris M. Thomasson:

On 1/3/2026 11:11 PM, Bonita Montero wrote:

Am 03.01.2026 um 20:39 schrieb Chris M. Thomasson:

Your (bool)callable issue is interesting, by the way.

That's why I wrote that. Otherwise I could have stuck with
std::once_flag.

It's just that (bool)callable is a bit scary to me.

I don't understand you; the interface is understandable in an easy way.
And if you need simpler code inside xcall_once than in call_once and
not the boolean return feature, you could just return nothing.

You should put a clear comment about (bool)callable?

That's sufficient:

    if constexpr( requires { (bool)callable(); } )

I don't know whether a childish attitude is appropriate for software
development.
But at least when it comes to such small details I might be right.

Your code is fine, (bool)callable aside for a moment. I can read it. I
just wanted to show another way to use stand alone fences. That's all.

I'm using fences as minimalized as possible.

That is another matter. A consume membar. Your code does not use them.

Of course, I'm using it twice. Once after the first flag load and once
after a failed CAS.

But, if you ever do, well, make sure a rogue compiler is not putting
in an acquire barrier when it does not have to. I should have made
another topic about it. Sorry for that.

I'm using the fences properly and as minimal as possible.

I basically agree. But, your code logic had no need for a consume membar.
So, it's a moot point in this context. I only brought it up in case you
ever do need one. NOT an acquire. :^)
--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Sun Jan 4 18:49:00 2026

From Newsgroup: comp.lang.c++

On 1/4/2026 6:46 PM, Chris M. Thomasson wrote:

On 1/4/2026 4:03 PM, Bonita Montero wrote:

Am 04.01.2026 um 23:05 schrieb Chris M. Thomasson:

On 1/3/2026 11:11 PM, Bonita Montero wrote:

Am 03.01.2026 um 20:39 schrieb Chris M. Thomasson:

Your (bool)callable issue is interesting, by the way.

That's why I wrote that. Otherwise I could have stuck with
std::once_flag.

It's just that (bool)callable is a bit scary to me.

I don't understand you; the interface is understandable in an easy way.
And if you need simpler code inside xcall_once than in call_once and
not the boolean return feature, you could just return nothing.

You should put a clear comment about (bool)callable?

That's sufficient:

    if constexpr( requires { (bool)callable(); } )

I don't know whether a childish attitude is appropriate for software
development.
But at least when it comes to such small details I might be right.

Your code is fine, (bool)callable aside for a moment. I can read it. I
just wanted to show another way to use stand alone fences. That's all.

I'm using fences as minimalized as possible.

That is another matter. A consume membar. Your code does not use them.

Of course, I'm using it twice. Once after the first flag load and once
after a failed CAS.

But, if you ever do, well, make sure a rogue compiler is not putting
in an acquire barrier when it does not have to. I should have made
another topic about it. Sorry for that.

I'm using the fences properly and as minimal as possible.

I basically agree. But, your code logic had no need for a consume membar.
So, it's a moot point in this context. I only brought it up in case you
ever do need one. NOT an acquire. :^)

Actually, do you know when to use a consume membar? And why you SHOULD
get "pissed off" if a compiler injects a god damn acquire in there?
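
To make the consume question concrete, here is a small sketch of the classic case: publishing a pointer and then reading only through that pointer. The type config and the names g_config, publish and read_value_or are hypothetical. Note that mainstream compilers typically strengthen memory_order_consume to acquire, which is exactly the behaviour being complained about.

```cpp
#include <atomic>
#include <cassert>

struct config { int value; }; // hypothetical payload

std::atomic<config *> g_config{nullptr};

// Publisher: fill in the object first, then publish the pointer with
// release so the write to p->value is visible to readers of the pointer.
inline void publish(config *p, int v)
{
    p->value = v;
    g_config.store(p, std::memory_order_release);
}

// Reader: the only access that needs ordering is the dereference of the
// loaded pointer, i.e. a data dependency, which is what consume expresses.
// An implementation is allowed to treat this load as acquire instead.
inline int read_value_or(int fallback)
{
    config *p = g_config.load(std::memory_order_consume);
    return p ? p->value : fallback;
}
```

On weakly ordered hardware a true dependency-ordered load needs no barrier instruction at all, whereas an acquire load may, which is the cost difference at stake.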
--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Sun Jan 4 20:05:12 2026

From Newsgroup: comp.lang.c++

On 1/4/2026 4:40 PM, Chris M. Thomasson wrote:

On 1/4/2026 4:01 PM, Bonita Montero wrote:

Am 04.01.2026 um 23:19 schrieb Chris M. Thomasson:

On 1/3/2026 11:06 PM, Bonita Montero wrote:

Am 03.01.2026 um 20:22 schrieb Chris M. Thomasson:

I disagree. Try to get rid of any possibility of false sharing.
Strive for it. It's just good hygiene! :^)

No, false sharing needs to be avoided if it happens at least sometimes.
Here false sharing might occur just once when the "once-flag" is
initialized; otherwise the flag / the cacheline remains in shared mode. The
performance impact of completely avoiding false sharing here is nearly
zero.

I would still do it. But, that's just me. It's a bit more than that.
Every atomic RMW, store, load, on your flag. Think of a reservation
granule for systems with LL/SC. A user needs to be careful where they
place those flags. But, well, they can align it themselves. So,
whatever.

Sorry, the initialization happens only once and after that the flag's
cacheline is read-shared.

What about that CAS in the loop?

It's hitting the cache line.
--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Mon Jan 5 10:18:54 2026

From Newsgroup: comp.lang.c++

Am 05.01.2026 um 01:40 schrieb Chris M. Thomasson:

What about that CAS in the loop?

It's only executed once.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Mon Jan 5 12:31:49 2026

From Newsgroup: comp.lang.c++

On 1/5/2026 1:18 AM, Bonita Montero wrote:

Am 05.01.2026 um 01:40 schrieb Chris M. Thomasson:

What about that CAS in the loop?

It's only executed once.

Notice the loop?
--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Tue Jan 6 06:08:39 2026

From Newsgroup: comp.lang.c++

Am 05.01.2026 um 21:31 schrieb Chris M. Thomasson:

On 1/5/2026 1:18 AM, Bonita Montero wrote:

Am 05.01.2026 um 01:40 schrieb Chris M. Thomasson:

What about that CAS in the loop?

It's only executed once.

Notice the loop?

Yes, but there's usually not so much contention that it
is executed more than once.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Tue Jan 6 16:58:00 2026

From Newsgroup: comp.lang.c++

On 1/5/2026 9:08 PM, Bonita Montero wrote:

Am 05.01.2026 um 21:31 schrieb Chris M. Thomasson:

On 1/5/2026 1:18 AM, Bonita Montero wrote:

Am 05.01.2026 um 01:40 schrieb Chris M. Thomasson:

What about that CAS in the loop?

It's only executed once.

Notice the loop?

Yes, but there's usually not so much contention that it
is executed more than once.

Never know. You are exposing your algo to user code. God knows what it
might do.
--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Wed Jan 7 06:56:55 2026

From Newsgroup: comp.lang.c++

Am 07.01.2026 um 01:58 schrieb Chris M. Thomasson:

Never know. You are exposing your algo to user code. God knows what it
might do.

The xonce_flag flips to true only once, no matter how much user code
is surrounding that flag.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Tue Jan 6 22:14:22 2026

From Newsgroup: comp.lang.c++

On 1/6/2026 9:56 PM, Bonita Montero wrote:

Am 07.01.2026 um 01:58 schrieb Chris M. Thomasson:

Never know. You are exposing your algo to user code. God knows what it
might do.

The xonce_flag flips to true only once, no matter how much user code
is surrounding that flag.

Forget about the flips for a moment. How many times does a LOCK CMPXCHG
hit it?
--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Wed Jan 7 07:46:46 2026

From Newsgroup: comp.lang.c++

Am 07.01.2026 um 07:14 schrieb Chris M. Thomasson:

On 1/6/2026 9:56 PM, Bonita Montero wrote:

Am 07.01.2026 um 01:58 schrieb Chris M. Thomasson:

Never know. You are exposing your algo to user code. God knows what
it might do.

The xonce_flag flips to true only once, no matter how much user code
is surrounding that flag.

Forget about the flips for a moment. How many times does a LOCK CMPXCHG
hit it?

Once in 99.999% of all cases.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Wed Jan 7 12:55:43 2026

From Newsgroup: comp.lang.c++

On 1/6/2026 10:46 PM, Bonita Montero wrote:

Am 07.01.2026 um 07:14 schrieb Chris M. Thomasson:

On 1/6/2026 9:56 PM, Bonita Montero wrote:

Am 07.01.2026 um 01:58 schrieb Chris M. Thomasson:

Never know. You are exposing your algo to user code. God knows what
it might do.

The xonce_flag flips to true only once, no matter how much user code
is surrounding that flag.

Forget about the flips for a moment. How many times does a LOCK CMPXCHG
hit it?

Once in 99.999% of all cases.

What if somebody uses a callable that throws all the time? ;^) Kidding,
but I still would align and pad the flag. But, that's just me. In the name
of "proper" hygiene...
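
The "throws all the time" point can be illustrated with a simplified, hypothetical stand-in for the once-logic that counts compare-exchange attempts. This is not the posted xcall_once: it has no wait/notify and no bool-return handling, it busy-reloads while another thread is running, and it uses acquire for both CAS orderings. The names counting_once and cas_count_with_failures are made up.

```cpp
#include <atomic>
#include <cassert>
#include <stdexcept>

// Simplified once-flag with the same three states as in the thread
// (0 = idle, -1 = running, 1 = done) plus a CAS attempt counter.
struct counting_once
{
    std::atomic<signed char> flag{0};
    std::atomic<int> cas_attempts{0};

    template <typename F>
    void call(F fn)
    {
        for (signed char ref = flag.load(std::memory_order_acquire);;)
        {
            if (ref > 0)
                return; // fast path: already done, no CAS
            if (ref < 0)
            {
                // Someone else is running; just reload (no wait() here).
                ref = flag.load(std::memory_order_acquire);
                continue;
            }
            cas_attempts.fetch_add(1, std::memory_order_relaxed);
            if (flag.compare_exchange_strong(ref, -1,
                                             std::memory_order_acquire,
                                             std::memory_order_acquire))
                break; // we claimed the initialization
        }
        try
        {
            fn();
        }
        catch (...)
        {
            flag.store(0, std::memory_order_release); // allow a retry
            throw;
        }
        flag.store(1, std::memory_order_release);
    }
};

// Single-threaded demo: the callable throws `failures` times, then
// succeeds. Returns how many CAS attempts the flag saw in total.
inline int cas_count_with_failures(int failures)
{
    counting_once once;
    int remaining = failures;
    auto fn = [&remaining] {
        if (remaining > 0)
        {
            --remaining;
            throw std::runtime_error("init failed");
        }
    };
    for (;;)
    {
        try { once.call(fn); break; }
        catch (const std::runtime_error &) { /* retry the initialization */ }
    }
    once.call(fn); // already done: takes the fast path, adds no CAS
    return once.cas_attempts.load();
}
```

In this sketch each failed initialization costs one more CAS on the flag, so a callable that fails N times leads to N + 1 attempts even without contention; once the flag is done, further calls only load it.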
--- Synchronet 3.21a-Linux NewsLink 1.2

From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Thu Jan 8 05:31:08 2026

From Newsgroup: comp.lang.c++

Am 07.01.2026 um 21:55 schrieb Chris M. Thomasson:

What if somebody uses a callable that throws all the time? ;^) Kidding,
but I still would align and pad the flag. But, that's just me. In the name
of "proper" hygiene...

There's no measurable performance difference with a padded flag.

--- Synchronet 3.21a-Linux NewsLink 1.2
