Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
On 2025-12-22 03:48, Michael Sanders wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
No, why would you think so?
On 2025-12-22 12:44, James Kuyper wrote:
On 2025-12-22 03:48, Michael Sanders wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
No, why would you think so?
There's number sequence generators that produce 0 sequences if seeded
with 0. ...
... And maybe the comment in 'man 3 rand', "If no seed value is
provided, the rand() function is automatically seeded with a value
of 1.", may have fostered his doubt.
On 2025-12-22 07:18, Janis Papanagnou wrote:
On 2025-12-22 12:44, James Kuyper wrote:
On 2025-12-22 03:48, Michael Sanders wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
No, why would you think so?
There's number sequence generators that produce 0 sequences if seeded
with 0. ...
The details of how the seed affects the random number sequence are unspecified by the standard. I personally would consider a pseudo-random number generator to be quite defective if there were any seed that
produced a constant output.
[...]
On 2025-12-22 18:13, James Kuyper wrote:
On 2025-12-22 07:18, Janis Papanagnou wrote:
On 2025-12-22 12:44, James Kuyper wrote:
On 2025-12-22 03:48, Michael Sanders wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
No, why would you think so?
There's number sequence generators that produce 0 sequences if
seeded with 0. ...
The details of how the seed affects the random number sequence are unspecified by the standard. I personally would consider a
pseudo-random number generator to be quite defective if there were
any seed that produced a constant output.
I wouldn't have mentioned that if there weren't a whole class of
such functions that exhibit exactly that behavior by design. Have
a look at PN (Pseudo-Noise) generators and LFSRs (Linear Feedback
Shift Registers). These have been designed to produce random noise
(bit patterns with good statistical distribution). With sophisticated
generator polynomials they also produce sequences of maximum period;
say, for N=31, a non-repeating sequence of length 2^N - 1. The one
element that is missing from the sequence is 0 (which reproduces
itself).
Technically you pick bit values from fixed positions of the register
(depending on the generator polynomial) and XOR those bits to shift
the result into the register. Here's an ad hoc example...
#include <stdio.h>
#include <stdint.h>

int main ()
{
    uint32_t init = 0x00000038;
    uint32_t reg = init;
    uint32_t new_bit;
    int count = 0;
    do {
        new_bit = ((reg >> 2) + (reg >> 4) + (reg >> 6) + (reg >> 30)) & 0x1;
        reg <<= 1;
        reg |= new_bit;
        reg &= 0x7fffffff;
        count++;
    } while (reg != init);
    printf ("period: %d\n", count);
}
Janis
[...]
On 2025-12-22 18:13, James Kuyper wrote:
On 2025-12-22 07:18, Janis Papanagnou wrote:
On 2025-12-22 12:44, James Kuyper wrote:
On 2025-12-22 03:48, Michael Sanders wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
No, why would you think so?
There's number sequence generators that produce 0 sequences if seeded
with 0. ...
The details of how the seed affects the random number sequence are
unspecified by the standard. I personally would consider a pseudo-random
number generator to be quite defective if there were any seed that
produced a constant output.
I wouldn't have mentioned that if there weren't a whole class of
such functions that expose exactly that behavior by design. Have
a look for PN-(Pseudo Noise-)generators and LFSR (Linear Feedback
Shift Registers).
#include <stdint.h>
int main ()
{
uint32_t init = 0x00000038;
uint32_t reg = init;
uint32_t new_bit;
int count = 0;
do {
new_bit = ((reg >> 2) + (reg >> 4) + (reg >> 6) + (reg >> 30))
& 0x1;
reg <<= 1;
reg |= new_bit;
... (not POSIX imbecile rand_r() ) ...
[...]
In practice, using LFSR for rand() is not particularly bright idea for different reason: LFSR is a reasonably good PRNG for a single bit, but
not when you want to generate a group of 31 pseudo-random bits. [...]
On 2025-12-22, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
I don't recall having paid attention to this exact material in the
past so it is a "TIL" for me.
#include <stdint.h>
int main ()
{
uint32_t init = 0x00000038;
uint32_t reg = init;
uint32_t new_bit;
int count = 0;
do {
new_bit = ((reg >> 2) + (reg >> 4) + (reg >> 6) + (reg >> 30))
& 0x1;
These could of course be XOR, without it making a difference; the least significant bit in a binary addition is the XOR of the LSB's of the
inputs.
[...]
[...] LFSR is a reasonably good PRNG for a single bit, but
not when you want to generate a group of 31 pseudo-random bits. In
order to get 31 new bits, without predictable repetitions from the
previous value, you would have to do 31 steps. That's slow! The process
can be accelerate by generation of several bits at time via look up
tables, but in order to get decent speed the table has to be rater big
and using big tables in standard library is bad sportsmanship.
It seems that overwhelming majority C RTLs use Linear Congruential Generators, probably because for Stanadard library compactness of both
code and data is considered more important than very high speed (not
that on modern HW LCGs are slow) or superior random properties of
Mersenne Twisters.
[...]
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
I like to just read /dev/urandom when I need a random number. Seem
easier and more portable across Linux & the *BSDs.
Michael Sanders <porkchop@invalid.foo> wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
I like to just read /dev/urandom when I need a random
number. Seem easier and more portable across Linux &
the *BSDs.
int s;
read(fd, &s, sizeof(int));
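For anyone wanting to try this, a self-contained sketch of the idea (the open/close calls, error handling, and the printf are additions, not the poster's code):

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Sketch: seed srand() from /dev/urandom (POSIX-only). */
int main(void)
{
    unsigned s = 0;
    int fd = open("/dev/urandom", O_RDONLY);
    if (fd < 0 || read(fd, &s, sizeof s) != (ssize_t)sizeof s) {
        perror("/dev/urandom");   /* could fall back to a time-based seed here */
        return EXIT_FAILURE;
    }
    close(fd);
    srand(s);
    printf("seed = %u, first value = %d\n", s, rand());
    return 0;
}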
On 2025-12-22 03:48, Michael Sanders wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
No, why would you think so?
There's number sequence generators that produce 0 sequences if seeded
with 0. And maybe the comment in 'man 3 rand', "If no seed value is
provided, the rand() function is automatically seeded with a value
of 1.", may have fostered his doubt.
I like to just read /dev/urandom when I need a random
number. Seem easier and more portable across Linux &
the *BSDs.
int s;
read(fd, &s, sizeof(int));
On Mon, 22 Dec 2025 06:44:42 -0500, James Kuyper wrote:
On 2025-12-22 03:48, Michael Sanders wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
No, why would you think so?
Excuse my delayed reply James (net provider was down most of today).
Well, I guess I did not expect such large differences between
gcc & musl somehow (can't test with clang just yet). I understand the
sequence is deterministic & there would likely still be some differences
with musl, yet it seems I wrongly assumed the sequences would be the same...
#include <stdio.h>
#include <stdlib.h>
int main(void) {
srand(0);
printf("%d\n", rand());
return 0;
}
/*
$ gcc -O2 -o rnd rnd.c && ./rnd
1804289383
$ musl-gcc -static -O2 -o rnd rnd.c && ./rnd
2049033599
*/
On Mon, 22 Dec 2025 13:18:19 +0100, Janis Papanagnou wrote:
There's number sequence generators that produce 0 sequences if seeded
with 0. And maybe the comment in 'man 3 rand', "If no seed value is
provided, the rand() function is automatically seeded with a value
of 1.", may have fostered his doubt.
Janis - naive question for you...
How do I bring up *posix only* man pages using 3?
I see no difference when invoking any of:
man 3 srand
or: man 3 posix srand
or: man posix 3 srand
What am I doing wrong here?
On 2025-12-22 19:45, Michael S wrote:
[...] LFSR is a reasonably good PRNG for a single bit, but
not when you want to generate a group of 31 pseudo-random bits. In
order to get 31 new bits, without predictable repetitions from the
previous value, you would have to do 31 steps. That's slow! The
process can be accelerate by generation of several bits at time via
look up tables, but in order to get decent speed the table has to
be rater big and using big tables in standard library is bad
sportsmanship.
Yes. But mind that the speed also depends on what quality you
need. For example, I used the PN generator to create bit sequences
(as you also suggest). For another application both PN-LFSR and
LCG (which you mention below) were unacceptable; we used a cipher
to create the random data. (If you compare the speed of running
the cipher to a bit shift register, the latter looks really fast.)
It seems that overwhelming majority C RTLs use Linear Congruential Generators, probably because for Stanadard library compactness of
both code and data is considered more important than very high
speed (not that on modern HW LCGs are slow) or superior random
properties of Mersenne Twisters.
For "standard applications" I always used the simple LCGs; simple
and fast. Or whatever the tools or library provided; which were
mostly anyway LCGs.
Janis
[...]
When I need a PRNG I am typically not deeply concerned about the size of
its internal state. On the other hand, I don't want to have to worry about
potentially insufficient randomness of the output (not in the crypto
sense). On the third hand, the vectors that I generate with a PRNG tend to
be big and I don't like to wait, so I do care somewhat about speed.
Those 3 factors together, plus availability, long ago made MT19937-64
into my personal default PRNG of choice.
MT19937-64 is available out of the box(*) in C++. But not in C, unfortunately.
At a higher theoretical level MT is a generalization of the LFSR, but it is
not obvious when one looks at an implementation.
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
On 2025-12-23 08:24, Michael Sanders wrote:
[...]
How do I bring up *posix only* man pages using 3?
Sorry, can't help you here. Maybe someone else can.
On Mon, 22 Dec 2025 13:18:19 +0100, Janis Papanagnou wrote:
There's number sequence generators that produce 0 sequences if seeded
with 0. And maybe the comment in 'man 3 rand', "If no seed value is
provided, the rand() function is automatically seeded with a value
of 1.", may have fostered his doubt.
Janis - naive question for you...
How do I bring up *posix only* man pages using 3?
I see no difference when invoking any of:
man 3 srand
or: man 3 posix srand
or: man posix 3 srand
What I'm doing wrong here?
On 2025-12-23 10:18, Michael S wrote:
When I need PRNG then I am typically not deeply concerned about
size of its internal state. On the other hand, I don't want to care
about potentially insufficient randomness of the output (not in
crypto sense). On the 3rd hand, vectors that I generate with PRNG
tend to be big and I don't like to wait, so I do care somewhat
about speed. Those 3 factors together plus availability long ago
made MT19937-64 into my personal default PRNG of choice.
I've never deepened my knowledge in the direction of MT algorithms.
MT19937-64 is available out of the box(*) in C++. But not in C, unfortunately.
This is really strange given that the name ("Mersenne Twister") is
that prominent.
Looking that up I find at least "C" code for MT19937 in Wikipedia https://de.wikipedia.org/wiki/Mersenne-Twister
It's based on 32 bit logic it seems; interpreting your "MT19937-64"
I assume you're looking for a 64 bit based version?
At higher theoretical level MT is a generalization of LFSR, but it
is not obvious when one looks at implementation.
Well, at least there are the 'mod' operations, all based on powers of 2,
along with all the binary ops, which suggests some (non-arithmetic)
bit-register type of algorithm; but the multiplication with 0x9908b0df
(5 * 513496723) - which I'd suppose would be hard to realize as/with an LFSR -
may suggest some other generator type.
Janis
It is not the compilers that are different, it is the C standard
libraries that are different. gcc is a compiler, not a library, but you
are probably using glibc with it by default. musl is a library, not a compiler. There is no reason to suppose that different C standard
libraries use the same implementation of rand() and srand(), so no reason
to suppose they would give the same sequences - though each on their own will give a deterministic pseudo-random sequence based on their seeds.
If you swap gcc with clang you will get the same results - it will
depend on whether you are linking with glibc, musl, or another C library.
On Debian, Ubuntu, and similar systems, you can install the
"manpages-posix" (section 1) and "manpages-posix-dev" (sections 3
and 7) packages. You can then view the POSIX man page for srand
by typing any of:
man 3posix srand
man -s 3posix srand
man srand.3posix
I'd expect similar packages to be available on (some) other systems.
Manual pages are for the implementation of the operating system.
The current standard version can be viewed here: <https://pubs.opengroup.org/onlinepubs/9799919799/functions/srand.html>
This is the older standard version, still containing rand_r(): <https://pubs.opengroup.org/onlinepubs/9699919799/functions/srand.html>
On Tue, 23 Dec 2025 00:39:49 -0000 (UTC), John McCue wrote:
I like to just read /dev/urandom when I need a random number. Seem
easier and more portable across Linux & the *BSDs.
Not to mention a lot stronger, cryptographically.
On Mon, 22 Dec 2025 13:18:19 +0100, Janis Papanagnou wrote:
There's number sequence generators that produce 0 sequences if seeded
with 0. And maybe the comment in 'man 3 rand', "If no seed value is
provided, the rand() function is automatically seeded with a value
of 1.", may have fostered his doubt.
Janis - naive question for you...
How do I bring up *posix only* man pages using 3?
I see no difference when invoking any of:
man 3 srand
or: man 3 posix srand
or: man posix 3 srand
What I'm doing wrong here?
On 2025-12-23 08:24, Michael Sanders wrote:
On Mon, 22 Dec 2025 13:18:19 +0100, Janis Papanagnou wrote:
There's number sequence generators that produce 0 sequences if seeded
with 0. And maybe the comment in 'man 3 rand', "If no seed value is
provided, the rand() function is automatically seeded with a value
of 1.", may have fostered his doubt.
Janis - naive question for you...
How do I bring up *posix only* man pages using 3?
Sorry, can't help you here. Maybe someone else can.
Myself I only access the Unix man pages as they come,
i.e. using either 'man entry' or 'man section entry'.
The POSIX information is usually textually integrated
in the man pages.
I see no difference when invoking any of:
man 3 srand
That's what I'm doing, and I see, for example,
...
HISTORY
rand()
srand()
SVr4, 4.3BSD, C89, POSIX.1-2001.
On Mon, 22 Dec 2025 18:41:10 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
On 2025-12-22 18:13, James Kuyper wrote:
On 2025-12-22 07:18, Janis Papanagnou wrote:
On 2025-12-22 12:44, James Kuyper wrote:
On 2025-12-22 03:48, Michael Sanders wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
No, why would you think so?
There's number sequence generators that produce 0 sequences if
seeded with 0. ...
The details of how the seed affects the random number sequence are
unspecified by the standard. I personally would consider a
pseudo-random number generator to be quite defective if there were
any seed that produced a constant output.
I wouldn't have mentioned that if there weren't a whole class of
such functions that expose exactly that behavior by design. Have
a look for PN-(Pseudo Noise-)generators and LFSR (Linear Feedback
Shift Registers). These have been defined to produce random noise
(bit pattern with good statistical distribution). With sophisticated
generator polynomials they produce also sequences of maximum period;
say, for N=31 a non-repeating sequence of length 2^N - 1. The one
element that is missing from the sequence is the 0 (that reproduces
itself).
Technically you pick some bit-values from fixed positions (depending
on the generator polynomial) of the register and xor the bits to shift
the result into the register. Here's ad hoc an example...
#include <stdio.h>
#include <stdint.h>
int main ()
{
uint32_t init = 0x00000038;
uint32_t reg = init;
uint32_t new_bit;
int count = 0;
do {
new_bit = ((reg >> 2) + (reg >> 4) + (reg >> 6) + (reg >>
30)) & 0x1;
reg <<= 1;
reg |= new_bit;
reg &= 0x7fffffff;
count++;
} while (reg != init);
printf ("period: %d\n", count);
}
Janis
[...]
Pay attention that the C Standard only requires the same seed to always
produce the same sequence. There is no requirement that different
seeds have to produce different sequences.
So, for the generator in your example, an implementation like the one below
would be fully legal. Personally, I wouldn't even consider it particularly
poor quality:
void srand(unsigned seed) { init = seed | 1; }
[O.T.]
In practice, using an LFSR for rand() is not a particularly bright idea for a
different reason: an LFSR is a reasonably good PRNG for a single bit, but
not when you want to generate a group of 31 pseudo-random bits. In
order to get 31 new bits, without predictable repetitions from the
previous value, you would have to do 31 steps. That's slow! The process
can be accelerated by generating several bits at a time via lookup
tables, but in order to get decent speed the table has to be rather big,
and using big tables in a standard library is bad sportsmanship.
It seems that the overwhelming majority of C RTLs use Linear Congruential
Generators, probably because for a Standard library compactness of both
code and data is considered more important than very high speed (not
that LCGs are slow on modern HW) or the superior randomness properties of
Mersenne Twisters.
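To make both points concrete, here is a hedged sketch (not any real libc's code) of a rand()-like pair built on the LFSR above: the srand() maps many seeds onto the same nonzero state exactly as suggested, and each call has to clock the register 31 times to deliver 31 fresh bits, which is where the slowness comes from. The '+' taps are written as XOR here, which for the low bit is the same thing.

#include <stdint.h>

static uint32_t lfsr_state = 1;

/* Sketch only: many seeds legally collapse onto the same state. */
void my_srand(unsigned seed)
{
    lfsr_state = (seed | 1) & 0x7fffffff;   /* never let the register become 0 */
}

/* One register step per output bit: 31 steps for a 31-bit result. */
int my_rand(void)
{
    uint32_t out = 0;
    for (int i = 0; i < 31; i++) {
        uint32_t new_bit = ((lfsr_state >> 2) ^ (lfsr_state >> 4) ^
                            (lfsr_state >> 6) ^ (lfsr_state >> 30)) & 0x1u;
        lfsr_state = ((lfsr_state << 1) | new_bit) & 0x7fffffff;
        out = (out << 1) | new_bit;
    }
    return (int)out;    /* 0 .. 2^31 - 1, assuming a 32-bit int */
}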
On Tue, 23 Dec 2025 08:25:59 +0100, David Brown wrote:
It is not the compilers that are different, it is the C standard
libraries that are different. gcc is a compiler, not a library, but you
are probably using glibc with it by default. musl is a library, not a
compiler. There is no reason to suppose that different C standard
libraries use the same implementation of rand()and srand(), so no reason
to suppose they would give the same sequences - though each on their own
will give a deterministic pseudo-random sequence based on their seeds.
If you swap gcc with clang you will get the same results - it will
depend on whether you are linking with glibc, musl, or another C library.
Sure enough & thank you David - I appreciate your explanation.
I see where my thinking was off now. You're 100% correct
(I'm still learning as you noticed).
HISTORY
rand()
srand()
SVr4, 4.3BSD, C89, POSIX.1-2001.
Those interfaces were originally documented in the SVID
(System V Interface Definition).
The third edition (1989) states:
"The function rand() uses a multiplicative congruential random-number
generator with a period 2^32 that returns successive pseudo-random
numbers in the range 0 to 32767."
In the FUTURE DIRECTIONS section, it notes:
"The algorithms used in rand() and srand() are obsolete and will
be replaced with algorithms providing better pseudo-random characteristics
in a future issue".
There was never a fourth edition.
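For reference, the sample implementation often quoted from the C standard's description of rand()/srand() is a congruential generator of exactly that flavour; it is reproduced here from memory (and renamed, since real libraries are free to do something else entirely):

/* The classic portable example: RAND_MAX assumed to be 32767. */
static unsigned long int next = 1;

int svid_rand(void)
{
    next = next * 1103515245 + 12345;
    return (unsigned int)(next / 65536) % 32768;
}

void svid_srand(unsigned int seed)
{
    next = seed;
}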
On 2025-12-23, John McCue <jmclnx@gmail.com.invalid> wrote:
Michael Sanders <porkchop@invalid.foo> wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
I like to just read /dev/urandom when I need a random
number. Seem easier and more portable across Linux &
the *BSDs.
int s;
read(fd, &s, sizeof(int));
srand takes an unsigned argument.
unsigned s;
read(fd, &s, sizeof s);
Michael S <already5chosen@yahoo.com> wrote:
On Mon, 22 Dec 2025 18:41:10 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
On 2025-12-22 18:13, James Kuyper wrote:
On 2025-12-22 07:18, Janis Papanagnou wrote:
On 2025-12-22 12:44, James Kuyper wrote:
On 2025-12-22 03:48, Michael Sanders wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
No, why would you think so?
There's number sequence generators that produce 0 sequences if
seeded with 0. ...
The details of how the seed affects the random number sequence
are unspecified by the standard. I personally would consider a
pseudo-random number generator to be quite defective if there
were any seed that produced a constant output.
I wouldn't have mentioned that if there weren't a whole class of
such functions that expose exactly that behavior by design. Have
a look for PN-(Pseudo Noise-)generators and LFSR (Linear Feedback
Shift Registers). These have been defined to produce random noise
(bit pattern with good statistical distribution). With
sophisticated generator polynomials they produce also sequences of
maximum period; say, for N=31 a non-repeating sequence of length
2^N - 1. The one element that is missing from the sequence is the
0 (that reproduces itself).
Technically you pick some bit-values from fixed positions
(depending on the generator polynomial) of the register and xor
the bits to shift the result into the register. Here's ad hoc an
example...
#include <stdio.h>
#include <stdint.h>
int main ()
{
uint32_t init = 0x00000038;
uint32_t reg = init;
uint32_t new_bit;
int count = 0;
do {
new_bit = ((reg >> 2) + (reg >> 4) + (reg >> 6) + (reg >>
30)) & 0x1;
reg <<= 1;
reg |= new_bit;
reg &= 0x7fffffff;
count++;
} while (reg != init);
printf ("period: %d\n", count);
}
Janis
[...]
Pay attention that C Standard only requires for the same seed to
always produces the same sequence. There is no requirement that
different seeds have to produce different sequences.
So, for generator in your example, implementation like below would
be fully legal. Personally, I wouldn't even consider it as
particularly poor quality:
void srand(unsigned seed ) { init = seed | 1;}
[O.T.]
In practice, using LFSR for rand() is not particularly bright idea
for different reason: LFSR is a reasonably good PRNG for a single
bit, but not when you want to generate a group of 31 pseudo-random
bits. In order to get 31 new bits, without predictable repetitions
from the previous value, you would have to do 31 steps. That's
slow! The process can be accelerate by generation of several bits
at time via look up tables, but in order to get decent speed the
table has to be rater big and using big tables in standard library
is bad sportsmanship.
It seems that overwhelming majority C RTLs use Linear Congruential Generators, probably because for Stanadard library compactness of
both code and data is considered more important than very high
speed (not that on modern HW LCGs are slow) or superior random
properties of Mersenne Twisters.
There is a paper "PCG: A Family of Simple Fast Space-Efficient
Statistically Good Algorithms for Random Number Generation"
by M. OrCONeill where she gives a family of algorithms and runs
several statistical tests against known algorithms. Mersenne
Twister does not look good in tests. If you have enough (128) bits
LCGs do pass tests. A bunch of generators with 64-bit state also
passes tests. So the only reason to prefer Mersenne Twister is
that it is implemented in available libraries. Otherwise it is
not so good, have large state and needs more execution time
than alternatives.
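For the curious, the paper's minimal 32-bit generator (PCG-XSH-RR over a 64-bit LCG) is only a few lines. The code below is written from memory of that published minimal example, so treat it as a sketch rather than the reference implementation:

#include <stdint.h>

typedef struct { uint64_t state, inc; } pcg32_t;

/* PCG32 sketch: 64-bit LCG state, xorshift-then-rotate output. */
uint32_t pcg32_next(pcg32_t *rng)
{
    uint64_t old = rng->state;
    rng->state = old * 6364136223846793005ULL + (rng->inc | 1u);
    uint32_t xorshifted = (uint32_t)(((old >> 18u) ^ old) >> 27u);
    uint32_t rot = (uint32_t)(old >> 59u);
    return (xorshifted >> rot) | (xorshifted << ((32u - rot) & 31u));
}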
On Tue, 23 Dec 2025 10:54:23 +0100...
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
On 2025-12-23 10:18, Michael S wrote:
MT19937-64 is available out of the box(*) in C++. But not in C,
unfortunately.
This is really strange given that the name ("Mersenne Twister") is
that prominent.
Looking that up I find at least "C" code for MT19937 in Wikipedia
https://de.wikipedia.org/wiki/Mersenne-Twister
It's based on 32 bit logic it seems; interpreting your "MT19937-64"
I assume you're looking for a 64 bit based version?
"Available out of the box" in this sentence means "part of standard
library".
On 2025-12-23 06:50, Michael S wrote:
On Tue, 23 Dec 2025 10:54:23 +0100...
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
On 2025-12-23 10:18, Michael S wrote:
MT19937-64 is available out of the box(*) in C++. But not in C,
unfortunately.
This is really strange given that the name ("Mersenne Twister") is
that prominent.
Looking that up I find at least "C" code for MT19937 in Wikipedia
https://de.wikipedia.org/wiki/Mersenne-Twister
It's based on 32 bit logic it seems; interpreting your "MT19937-64"
I assume you're looking for a 64 bit based version?
"Available out of the box" in this sentence means "part of standard
library".
Citation, please? I can find neither Mersenne nor "MT19937-64" anywhere
in n5001.pdf, the latest draft version of the C++ standard that I have
access to, which is dated 2024-12-17.
Testing randomness is complicated matter.
On Wed, 24 Dec 2025 00:08:24 +0200, Michael S wrote:
Testing randomness is complicated matter.
Impossible, really, if you define "random" as "Nobody can know what comes
next".
On 2025-12-23 21:02, Lawrence D'Oliveiro wrote:
On Wed, 24 Dec 2025 00:08:24 +0200, Michael S wrote:
Testing randomness is complicated matter.
Impossible, really, if you define "random" as "Nobody can know what
comes next".
The quality of pseudo-random number generators can be measured, but you
need to carefully define what you mean by "quality".
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
Wish there was such a 'device' under Windows...
On Tue, 23 Dec 2025 00:39:49 -0000 (UTC), John McCue wrote:
I like to just read /dev/urandom when I need a random
number. Seem easier and more portable across Linux &
the *BSDs.
int s;
read(fd, &s, sizeof(int));
Thanks John. Wish there was such a 'device' under Windows...
On Tue, 23 Dec 2025 07:25:42 -0000 (UTC)
Michael Sanders <porkchop@invalid.foo> wrote:
On Tue, 23 Dec 2025 00:39:49 -0000 (UTC), John McCue wrote:
I like to just read /dev/urandom when I need a random
number. Seem easier and more portable across Linux &
the *BSDs.
int s;
read(fd, &s, sizeof(int));
Thanks John. Wish there was such a 'device' under Windows...
There is.
Windows XP/Vista/7: https://learn.microsoft.com/en-us/windows/win32/api/wincrypt/nf-wincrypt-cryptgenrandom
Win8 and later: https://learn.microsoft.com/en-us/windows/win32/api/bcrypt/nf-bcrypt-bcryptgenrandom
On Tue, 23 Dec 2025 17:54:05 -0000 (UTC)
antispam@fricas.org (Waldek Hebisch) wrote:
Michael S <already5chosen@yahoo.com> wrote:
On Mon, 22 Dec 2025 18:41:10 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
On 2025-12-22 18:13, James Kuyper wrote:
On 2025-12-22 07:18, Janis Papanagnou wrote:
On 2025-12-22 12:44, James Kuyper wrote:
On 2025-12-22 03:48, Michael Sanders wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
No, why would you think so?
There's number sequence generators that produce 0 sequences if
seeded with 0. ...
The details of how the seed affects the random number sequence
are unspecified by the standard. I personally would consider a
pseudo-random number generator to be quite defective if there
were any seed that produced a constant output.
I wouldn't have mentioned that if there weren't a whole class of
such functions that expose exactly that behavior by design. Have
a look for PN-(Pseudo Noise-)generators and LFSR (Linear Feedback
Shift Registers). These have been defined to produce random noise
(bit pattern with good statistical distribution). With
sophisticated generator polynomials they produce also sequences of
maximum period; say, for N=31 a non-repeating sequence of length
2^N - 1. The one element that is missing from the sequence is the
0 (that reproduces itself).
Technically you pick some bit-values from fixed positions
(depending on the generator polynomial) of the register and xor
the bits to shift the result into the register. Here's ad hoc an
example...
#include <stdio.h>
#include <stdint.h>
int main ()
{
uint32_t init = 0x00000038;
uint32_t reg = init;
uint32_t new_bit;
int count = 0;
do {
new_bit = ((reg >> 2) + (reg >> 4) + (reg >> 6) + (reg >>
30)) & 0x1;
reg <<= 1;
reg |= new_bit;
reg &= 0x7fffffff;
count++;
} while (reg != init);
printf ("period: %d\n", count);
}
Janis
[...]
Pay attention that C Standard only requires for the same seed to
always produces the same sequence. There is no requirement that
different seeds have to produce different sequences.
So, for generator in your example, implementation like below would
be fully legal. Personally, I wouldn't even consider it as
particularly poor quality:
void srand(unsigned seed ) { init = seed | 1;}
[O.T.]
In practice, using LFSR for rand() is not particularly bright idea
for different reason: LFSR is a reasonably good PRNG for a single
bit, but not when you want to generate a group of 31 pseudo-random
bits. In order to get 31 new bits, without predictable repetitions
from the previous value, you would have to do 31 steps. That's
slow! The process can be accelerate by generation of several bits
at time via look up tables, but in order to get decent speed the
table has to be rater big and using big tables in standard library
is bad sportsmanship.
It seems that overwhelming majority C RTLs use Linear Congruential
Generators, probably because for Stanadard library compactness of
both code and data is considered more important than very high
speed (not that on modern HW LCGs are slow) or superior random
properties of Mersenne Twisters.
There is a paper "PCG: A Family of Simple Fast Space-Efficient
Statistically Good Algorithms for Random Number Generation"
by M. O'Neill where she gives a family of algorithms and runs
several statistical tests against known algorithms. Mersenne
Twister does not look good in tests. If you have enough (128) bits
LCGs do pass tests. A bunch of generators with 64-bit state also
passes tests. So the only reason to prefer Mersenne Twister is
that it is implemented in available libraries. Otherwise it is
not so good, have large state and needs more execution time
than alternatives.
I don't know. Testing randomness is a complicated matter.
How can I be sure that L'Ecuyer and Simard's TestU01 suite tests things
that I personally care about and that it does not test things that are
of no interest to me? Especially the latter.
Also, the TestU01 suite is made for generators with 32-bit output.
M. O'Neill used an ad hoc technique to make it applicable to generators
with 64-bit output. Is this technique right? Or maybe it puts 64-bit
PRNGs at an unfair disadvantage?
Besides, I strongly disagree with at least one assertion made by
OrCONeill: "While security-related applications should
use a secure generator, because we cannot always know the future
contexts in which our code will be used, it seems wise for all
applications to avoid generators that make discovering their entire
internal state completely trivial."
No, I know exactly what I am doing; I know that for my
application, easy discovery of the complete state of the PRNG is not a defect.
Anyway, even if I am skeptical about her criticism of popular PRNGs,
intuitively I agree with the constructive part of the article: a
medium-quality PRNG that feeds a medium-quality hash function can
potentially produce a very good, fast PRNG with a rather small internal
state.
On a related note, I think that even a simple counter fed into a
high-quality hash function (not cryptographically high quality, far less
than that) can produce an excellent PRNG with an even smaller internal
state. But not a very fast one. Although the speed depends on the
specifics of the computer used. I can imagine a computer that has a
low-latency Rijndael128 instruction. On such a computer, running a counter
through 3-4 rounds of Rijndael will produce a very good PRNG that is only
2-3 times slower than, for example, LCG 128/64.
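A sketch of that counter-plus-mixer idea, using a splitmix64-style finalizer as the stand-in "hash" (the mixer choice and its constants are mine, not something proposed in the thread): the entire state is one 64-bit counter.

#include <stdint.h>

static uint64_t ctr;    /* the whole PRNG state */

void ctr_seed(uint64_t seed) { ctr = seed; }

/* Each output is just the incremented counter pushed through a mixer. */
uint64_t ctr_next(void)
{
    uint64_t z = (ctr += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}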
Michael S <already5chosen@yahoo.com> wrote:
On Tue, 23 Dec 2025 17:54:05 -0000 (UTC)
antispam@fricas.org (Waldek Hebisch) wrote:
Michael S <already5chosen@yahoo.com> wrote:
On Mon, 22 Dec 2025 18:41:10 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
Also, the TestU01 suit is made for generators with 32-bit output.
M. O'Neill used ad hoc technique to make it applicable to generators
with 64-bit output. Is this technique right? Or may be it put 64-bit
PRNG at unfair disadvantage?
My point of view is that a generator can be used to generate a long
bitstream. Then you can cut the bitstream and get numbers of the
desired size. Good tests should check that such usage leads
to reasonable properties. So the fact that one generator produces
32-bit pieces and another produces 64-bit pieces should be irrelevant
to the test.
Besides, I strongly disagree with at least one assertion made by
OrCONeill: "While security-related applications should
use a secure generator, because we cannot always know the future
contexts in which our code will be used, it seems wise for all
applications to avoid generators that make discovering their entire internal state completely trivial."
No, I know exactly what I am doing/ I know exactly that for my
application easy discovery of complete state of PRNG is not a
defect.
O'Neill is not a prophet; ignore what she says if you think you
know better (which is probably the case above).
Anyway, even if I am skeptical about her criticism of popular PRNGs, intuitively I agree with the constructive part of the article - medium-quality PRNG that feeds medium quality hash function can
potentially produce very good fast PRNG with rather small internal
state.
She seems to care very much about having the minimal possible state.
That may be nice on embedded systems, but in general I would
happily accept a slightly bigger state (say 256 bits). But if
we can get good properties with a very small state, then why not?
After all, looking at the state and updating it takes code, so
a small state helps with having a fast generator.
One theoretical advantage of MT19937 is that it has period of
astronomic proportions. Which means that one instance of PRNG could be [...]
Concerning Mersenne Twister, she is not the only one to criticise it.
My personal opinion is that, given its large state and not-so-simple
update, Mersenne Twister would have to be very, very good to justify
its use. But it fails some tests, so it does not look _better_ than
other generators.
On related note, I think that even simple counter fed into high
quality hash function (not cryptographically high quality, far less
than that) can produce excellent PRNG with even smaller internal
state. But not very fast one. Although the speed depends on
specifics of used computer. I can imagine computer that has
low-latency Rijndael128 instruction. On such computer, running
counter through 3-4 rounds of Rijndael ill produce very good PRNG
that is only 2-3 times slower than, for example, LCG 128/64.
Maybe.
Michael S <already5chosen@yahoo.com> wrote:
On Mon, 22 Dec 2025 18:41:10 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
On 2025-12-22 18:13, James Kuyper wrote:
On 2025-12-22 07:18, Janis Papanagnou wrote:
On 2025-12-22 12:44, James Kuyper wrote:
On 2025-12-22 03:48, Michael Sanders wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
No, why would you think so?
There's number sequence generators that produce 0 sequences if
seeded with 0. ...
The details of how the seed affects the random number sequence are
unspecified by the standard. I personally would consider a
pseudo-random number generator to be quite defective if there were
any seed that produced a constant output.
I wouldn't have mentioned that if there weren't a whole class of
such functions that expose exactly that behavior by design. Have
a look for PN-(Pseudo Noise-)generators and LFSR (Linear Feedback
Shift Registers). These have been defined to produce random noise
(bit pattern with good statistical distribution). With sophisticated
generator polynomials they produce also sequences of maximum period;
say, for N=31 a non-repeating sequence of length 2^N - 1. The one
element that is missing from the sequence is the 0 (that reproduces
itself).
Technically you pick some bit-values from fixed positions (depending
on the generator polynomial) of the register and xor the bits to shift
the result into the register. Here's ad hoc an example...
#include <stdio.h>
#include <stdint.h>
int main ()
{
uint32_t init = 0x00000038;
uint32_t reg = init;
uint32_t new_bit;
int count = 0;
do {
new_bit = ((reg >> 2) + (reg >> 4) + (reg >> 6) + (reg >>
30)) & 0x1;
reg <<= 1;
reg |= new_bit;
reg &= 0x7fffffff;
count++;
} while (reg != init);
printf ("period: %d\n", count);
}
Janis
[...]
Pay attention that C Standard only requires for the same seed to always
produces the same sequence. There is no requirement that different
seeds have to produce different sequences.
So, for generator in your example, implementation like below would be
fully legal. Personally, I wouldn't even consider it as particularly
poor quality:
void srand(unsigned seed ) { init = seed | 1;}
[O.T.]
In practice, using LFSR for rand() is not particularly bright idea for
different reason: LFSR is a reasonably good PRNG for a single bit, but
not when you want to generate a group of 31 pseudo-random bits. In
order to get 31 new bits, without predictable repetitions from the
previous value, you would have to do 31 steps. That's slow! The process
can be accelerate by generation of several bits at time via look up
tables, but in order to get decent speed the table has to be rater big
and using big tables in standard library is bad sportsmanship.
It seems that overwhelming majority C RTLs use Linear Congruential
Generators, probably because for Stanadard library compactness of both
code and data is considered more important than very high speed (not
that on modern HW LCGs are slow) or superior random properties of
Mersenne Twisters.
There is a paper "PCG: A Family of Simple Fast Space-Efficient
Statistically Good Algorithms for Random Number Generation"
by M. O'Neill where she gives a family of algorithms and runs
several statistical tests against known algorithms. Mersenne
Twister does not look good in tests. If you have enough (128) bits
LCGs do pass tests. A bunch of generators with 64-bit state also
passes tests. So the only reason to prefer Mersenne Twister is
that it is implemented in available libraries. Otherwise it is
not so good, have large state and needs more execution time
than alternatives.
Wish there was such a 'device' under Windows...
You should get one if you install WSL2.
Ike Naar <ike@sdf.org> wrote:
On 2025-12-23, John McCue <jmclnx@gmail.com.invalid> wrote:
Michael Sanders <porkchop@invalid.foo> wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
I like to just read /dev/urandom when I need a random
number. Seem easier and more portable across Linux &
the *BSDs.
int s;
read(fd, &s, sizeof(int));
srand takes an unsigned argument.
unsigned s;
read(fd, &s, sizeof s);
I am not quite sure what you are saying about srand(3).
If you decide to read /dev/urandom, there is no need to
call srand(3); the OS maintains the random data itself. So
read(2) will just return a random number of the type
you want, based upon the call.
On Tue, 23 Dec 2025 07:25:42 -0000 (UTC)
Michael Sanders <porkchop@invalid.foo> wrote:
On Tue, 23 Dec 2025 00:39:49 -0000 (UTC), John McCue wrote:
I like to just read /dev/urandom when I need a random
number. Seem easier and more portable across Linux &
the *BSDs.
int s;
read(fd, &s, sizeof(int));
Thanks John. Wish there was such a 'device' under Windows...
There is.
Windows XP/Vista/7: https://learn.microsoft.com/en-us/windows/win32/api/wincrypt/nf-wincrypt-cryptgenrandom
Win8 and later: https://learn.microsoft.com/en-us/windows/win32/api/bcrypt/nf-bcrypt-bcryptgenrandom
On 12/22/2025 12:48 AM, Michael Sanders wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
Forgive me for C++, but this RNG of mine might be useful for detecting
the state of a system:
https://groups.google.com/g/comp.lang.c++/c/7u_rLgQe86k/m/fYU9SnuAFQAJ
Michael Sanders <porkchop@invalid.foo> writes:
On Mon, 22 Dec 2025 13:18:19 +0100, Janis Papanagnou wrote:
There's number sequence generators that produce 0 sequences if seeded
with 0. And maybe the comment in 'man 3 rand', "If no seed value is
provided, the rand() function is automatically seeded with a value
of 1.", may have fostered his doubt.
Janis - naive question for you...
How do I bring up *posix only* man pages using 3?
I see no difference when invoking any of:
man 3 srand
or: man 3 posix srand
or: man posix 3 srand
What I'm doing wrong here?
You're looking in the wrong place.
https://pubs.opengroup.org/onlinepubs/9799919799/
Select <System Interface> in the top left frame,
select (3) in the subsequent bottom left frame and
select the interface name in the bottom left frame. The
manual page will be in the right frame.
https://pubs.opengroup.org/onlinepubs/9799919799/functions/rand.html
On Wed, 24 Dec 2025 10:51:14 +0200, Michael S wrote:
On Tue, 23 Dec 2025 07:25:42 -0000 (UTC)
Michael Sanders <porkchop@invalid.foo> wrote:
On Tue, 23 Dec 2025 00:39:49 -0000 (UTC), John McCue wrote:
I like to just read /dev/urandom when I need a random
number. Seem easier and more portable across Linux &
the *BSDs.
int s;
read(fd, &s, sizeof(int));
Thanks John. Wish there was such a 'device' under Windows...
There is.
Windows XP/Vista/7: https://learn.microsoft.com/en-us/windows/win32/api/wincrypt/nf-wincrypt-cryptgenrandom
Win8 and later: https://learn.microsoft.com/en-us/windows/win32/api/bcrypt/nf-bcrypt-bcryptgenrandom
I was referring to the concept of a device in the same idiom as
BSD/Linux/Apple... something that is just as easy to use.
On Wed, 24 Dec 2025 15:28:24 -0000 (UTC)
Michael Sanders <porkchop@invalid.foo> wrote:
On Wed, 24 Dec 2025 10:51:14 +0200, Michael S wrote:
On Tue, 23 Dec 2025 07:25:42 -0000 (UTC)
Michael Sanders <porkchop@invalid.foo> wrote:
On Tue, 23 Dec 2025 00:39:49 -0000 (UTC), John McCue wrote:
I like to just read /dev/urandom when I need a random
number. Seem easier and more portable across Linux &
the *BSDs.
int s;
read(fd, &s, sizeof(int));
Thanks John. Wish there was such a 'device' under Windows...
There is.
Windows XP/Vista/7:
https://learn.microsoft.com/en-us/windows/win32/api/wincrypt/nf-wincrypt-cryptgenrandom
Win8 and later:
https://learn.microsoft.com/en-us/windows/win32/api/bcrypt/nf-bcrypt-bcryptgenrandom
Was referring to the concept of a device in the same idiom of
BSD/Linux/Apple...
Something that is just as easy to use.
What is not easy about the functions referred to above? You do the same
couple of steps as on Unix: open the device, then read a few bytes from it.
Only the names are different.
Unix:
$ head -c 8 /dev/urandom | od -An | tr -d ' '
4fa2c3d17b9a8f12
Windows:
PS C:\Users\Bob>
$bytes = New-Object byte[] 8
[System.Security.Cryptography.RandomNumberGenerator]::Create().GetBytes($bytes)
[Console]::Write($bytes | ForEach-Object { "{0:x2}" -f $_ })
My aims (mostly just learning my way around C at this point)
are *much* simpler. I needed something that is
seed-able/deterministic/portable, allowing the user a
shot at replaying a round in a silly game I've been
working on every now & again:
int genseed(int seed_in) {
if (seed_in >= 10000000) return seed_in;
unsigned long t = (unsigned long)time(NULL);
unsigned long c = (unsigned long)clock();
return (int)(((t ^ c) % 80000000UL) + 10000000UL);
}
On Wed, 24 Dec 2025 15:28:24 -0000 (UTC)
Michael Sanders <porkchop@invalid.foo> wrote:
On Wed, 24 Dec 2025 10:51:14 +0200, Michael S wrote:
On Tue, 23 Dec 2025 07:25:42 -0000 (UTC)
Michael Sanders <porkchop@invalid.foo> wrote:
On Tue, 23 Dec 2025 00:39:49 -0000 (UTC), John McCue wrote:
I like to just read /dev/urandom when I need a random
number. Seem easier and more portable across Linux &
the *BSDs.
int s;
read(fd, &s, sizeof(int));
Thanks John. Wish there was such a 'device' under Windows...
There is.
Windows XP/Vista/7:
https://learn.microsoft.com/en-us/windows/win32/api/wincrypt/nf-wincrypt-cryptgenrandom
Win8 and later:
https://learn.microsoft.com/en-us/windows/win32/api/bcrypt/nf-bcrypt-bcryptgenrandom
Was referring to the concept of a device in the same idiom of
BSD/Linux/Apple...
Something that is just as easy to use.
What is not easy in the functions referred above? You do the same
couple of steps as on Unix: open device then read few bytes from it.
Only names are different.
On 2025-12-24 17:17, Michael Sanders wrote:
Unix:
$ head -c 8 /dev/urandom | od -An | tr -d ' '
4fa2c3d17b9a8f12
Windows:
PS C:\Users\Bob>
$bytes = New-Object byte[] 8
[System.Security.Cryptography.RandomNumberGenerator]::Create().GetBytes($bytes)
[Console]::Write($bytes | ForEach-Object { "{0:x2}" -f $_ })
Amazing! 8-o
Or rather; frightening! ("The little Shop of Horrors")
A mixture (best/worst) of all; OO, Functional, and Shell?
What is that; "Powershell", or something else?
(I've mostly ignored Windows during the past 20+ years.)
Janis
witness: echo hello world? | | tr 'A-Za-z' 'N-ZA-Mn-za-m'
On Wed, 24 Dec 2025 17:27:32 -0000 (UTC), Michael Sanders wrote:
witness: echo hello world? | | tr 'A-Za-z' 'N-ZA-Mn-za-m'
typo, should be: echo hello world? | tr 'A-Za-z' 'N-ZA-Mn-za-m'
set a= 1000000
set /A ab= %a% * 1000
1000000000
set /A ab= %a% * 2000
2000000000
set /A ab= %a% * 3000
-1294967296
On Wed, 24 Dec 2025 06:16:51 -0000 (UTC), Lawrence D'Oliveiro wrote:
Wish there was such a 'device' under Windows...
You should get one if you install WSL2.
To be fair there is the 'Windows entropy pool' & it's
non-deterministic too, but it's only available via an API.
I don't know. Testing randomness is complicated matter.
I agree that PowerShell is too complicated and too much of a "wannabe real
programming language", which makes it a bad shell scripting language.
Especially so for quick throwaway scripts.
However, I don't quite understand what you find wrong with cmd.exe.
Cryptic? Maybe. But I cannot imagine a shell scripting language which is
not cryptic in some way.
Does it have a few limitations that one would not expect in a shell script
in 2025? Yes.
set a= 1000000
set /A ab= %a% * 1000
1000000000
set /A ab= %a% * 2000
2000000000
set /A ab= %a% * 3000
-1294967296
But cmd.exe language certainly is *not* over-complicated. Rather more
like too primitive.
>nul chcp 65001
set "_MediaInfo=C:\Program Files\MediaInfo\MediaInfo.exe"
>nul chcp 437
--
Not to mention some other quirks. But that's just my opinion.
[...]
On 12/23/2025 11:54 AM, Waldek Hebisch wrote:
Michael S <already5chosen@yahoo.com> wrote:
On Mon, 22 Dec 2025 18:41:10 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
On 2025-12-22 18:13, James Kuyper wrote:
On 2025-12-22 07:18, Janis Papanagnou wrote:
On 2025-12-22 12:44, James Kuyper wrote:
On 2025-12-22 03:48, Michael Sanders wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
No, why would you think so?
There's number sequence generators that produce 0 sequences if
seeded with 0. ...
The details of how the seed affects the random number sequence are
unspecified by the standard. I personally would consider a
pseudo-random number generator to be quite defective if there were
any seed that produced a constant output.
I wouldn't have mentioned that if there weren't a whole class of
such functions that expose exactly that behavior by design. Have
a look for PN-(Pseudo Noise-)generators and LFSR (Linear Feedback
Shift Registers). These have been defined to produce random noise
(bit pattern with good statistical distribution). With sophisticated
generator polynomials they produce also sequences of maximum period;
say, for N=31 a non-repeating sequence of length 2^N - 1. The one
element that is missing from the sequence is the 0 (that reproduces
itself).
Technically you pick some bit-values from fixed positions (depending
on the generator polynomial) of the register and xor the bits to shift
the result into the register. Here's ad hoc an example...
#include <stdio.h>
#include <stdint.h>
int main ()
{
    uint32_t init = 0x00000038;
    uint32_t reg = init;
    uint32_t new_bit;
    int count = 0;
    do {
        new_bit = ((reg >> 2) + (reg >> 4) + (reg >> 6) + (reg >> 30)) & 0x1;
        reg <<= 1;
        reg |= new_bit;
        reg &= 0x7fffffff;
        count++;
    } while (reg != init);
    printf ("period: %d\n", count);
}
Janis
[...]
Pay attention that C Standard only requires for the same seed to always
produces the same sequence. There is no requirement that different
seeds have to produce different sequences.
So, for generator in your example, implementation like below would be
fully legal. Personally, I wouldn't even consider it as particularly
poor quality:
void srand(unsigned seed ) { init = seed | 1;}
[O.T.]
In practice, using LFSR for rand() is not particularly bright idea for
different reason: LFSR is a reasonably good PRNG for a single bit, but
not when you want to generate a group of 31 pseudo-random bits. In
order to get 31 new bits, without predictable repetitions from the
previous value, you would have to do 31 steps. That's slow! The process
can be accelerate by generation of several bits at time via look up
tables, but in order to get decent speed the table has to be rater big
and using big tables in standard library is bad sportsmanship.
It seems that overwhelming majority C RTLs use Linear Congruential
Generators, probably because for Stanadard library compactness of both
code and data is considered more important than very high speed (not
that on modern HW LCGs are slow) or superior random properties of
Mersenne Twisters.
There is a paper "PCG: A Family of Simple Fast Space-Efficient
Statistically Good Algorithms for Random Number Generation"
by M. OrCONeill where she gives a family of algorithms and runs
several statistical tests against known algorithms.-a Mersenne
Twister does not look good in tests.-a If you have enough (128) bits
LCGs do pass tests.-a A bunch of generators with 64-bit state also
passes tests.-a So the only reason to prefer Mersenne Twister is
that it is implemented in available libraries.-a Otherwise it is
not so good, have large state and needs more execution time
than alternatives.
A lot can depend on what one wants as well...
Fast/Simple:
  seed=seed*65521+17;
  val=(seed>>16)&32767;
At first glance, this approach seems random enough, but these types of
RNGs have a kind of repeating pattern that can become obvious, say, if
using them to generate random noise images.
Or, can also work OK (also fast/simple):
  seed=(seed<<1)^(~(seed>>7));
  val=(seed>>8)&32767;
Some people seem to really like using lookup tables.
A 64-bit multiply can potentially be very slow, and multiply in general
isn't always cheap, so it can make sense to avoid it when not necessary.
So, shift-and-XOR is fast, and the above approach is also trivially
extended to 64 bits.
Its randomness can be improved somewhat (at the cost of speed), say:
  seed1=(seed1<<1)^(~(seed1>>13));
  seed2=(seed2<<3)^(~(seed2>>19));
  seed1^=seed2>>23;
  seed2^=seed1>>23;
  val=(seed1>>11)^(seed2>>11);
  val=(val^(val>>17))&32767;
Where seed1 and seed2 are two 64-bit values.
Not much formal testing here, mostly just sort of approaches that seemed
to work OK IME.
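One cheap way to run that "noise image" eyeball test is to dump generator output as a grayscale PGM and look for banding; a throwaway sketch (file name and image size are arbitrary), here feeding it the simple multiply-add generator from above:

#include <stdio.h>
#include <stdint.h>

/* Write a 256x256 grayscale PGM of PRNG output so repeating
   patterns from a weak generator can be spotted by eye. */
int main(void)
{
    uint32_t seed = 12345;
    FILE *f = fopen("noise.pgm", "wb");
    if (!f) return 1;
    fprintf(f, "P5\n256 256\n255\n");
    for (int i = 0; i < 256 * 256; i++) {
        seed = seed * 65521 + 17;
        fputc((seed >> 16) & 0xff, f);   /* high bits as the pixel value */
    }
    fclose(f);
    return 0;
}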
I had also noted that there are ways to do checksums that are a lot faster
and simpler than the more widespread algorithms and also seem to still do
reasonably well at error detection.
Say, for example:
  sum1=1; sum2=1;
  for(i=0; i<szWords; i++)
    { sum1+=data[i]; sum2+=sum1; }
  sum1=((uint32_t)sum1)+(sum1>>32);
  sum2=((uint32_t)sum2)+(sum2>>32);
  csum=sum1^sum2;
Where, sum1/sum2 are 64-bit, and data is interpreted as 32-bit words,
all unsigned.
But, yeah...
On Tue, 23 Dec 2025 02:17:01 -0000 (UTC), Lawrence D'Oliveiro wrote:
On Tue, 23 Dec 2025 00:39:49 -0000 (UTC), John McCue wrote:
I like to just read /dev/urandom when I need a random number. Seem
easier and more portable across Linux & the *BSDs.
Not to mention a lot stronger, cryptographically.
No srand() combined with crypto on my end. Sounds like an invitation
to get hacked from everything I've ever read about mixing the two.
On 2025-12-24 16:41, Michael Sanders wrote:
My aims (mostly just learning my around C at this point)
are *much* more simple. I needed something that is
seed-able/deterministic/portable allowing the user a
shot at replaying a round in a silly game I've been
working on every now & again:
int genseed(int seed_in) {
if (seed_in >= 10000000) return seed_in;
unsigned long t = (unsigned long)time(NULL);
unsigned long c = (unsigned long)clock();
return (int)(((t ^ c) % 80000000UL) + 10000000UL);
}
If you need a portable function across different platforms
you may want to write your own random() function, based
on some simple, proven algorithm. Or borrow a piece of code
from some existing public source-code library.
For "_replaying_ a round in a silly game" across platforms
(or generally) you should not seed it with time() or other
random factors (as shown in your code snippet).
Janis
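A minimal sketch of what "write your own" could look like for this purpose, using the well-known MINSTD constants (my choice, purely for illustration): the same seed then replays identically on glibc, musl, Windows, and so on, which rand() does not guarantee. Note that, much as with the LFSR discussed earlier, 0 is a fixed point of a purely multiplicative LCG, so the seeding step avoids it.

#include <stdint.h>

static uint32_t game_state = 1;

/* Deterministic, platform-independent generator for replays. */
void game_srand(uint32_t seed)
{
    game_state = seed % 2147483647u;
    if (game_state == 0) game_state = 1;   /* 0 would stay 0 forever */
}

uint32_t game_rand(void)    /* returns 1 .. 2^31 - 2 */
{
    game_state = (uint32_t)(((uint64_t)game_state * 48271u) % 2147483647u);
    return game_state;
}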
OK, the seed is working for replaying a game. =)
Pure C, no ncurses etc., built-in help too.
You know this game, don't you Janis?
Simple little project. Screenshot...
<https://drive.google.com/file/d/1dKSjDzmu0mLy76GWrlZUT9HVK72GiP9d/view>
On 12/24/2025 5:22 AM, BGB wrote:
Some people seem to really like using lookup tables.
I don't really get the point of lookup table driven RNGs.
A lot of these ones are, say, a table with 256 spots, and an index.
Each time one generates a random number (usually a byte), it returns the value at that location in the table and advances to the next index.
Sometimes some get clever and use an algorithm to jitter the table index.
I have mostly seen this strategy used in old game engines.
These typically fail the image test as by design they give repeating patterns.
[...]
On Wed, 24 Dec 2025 15:21:11 -0000 (UTC), Michael Sanders wrote:
On Wed, 24 Dec 2025 06:16:51 -0000 (UTC), Lawrence D'Oliveiro wrote:
Wish there was such a 'device' under Windows...
You should get one if you install WSL2.
To be fair there is the 'Windows entropy pool' & its
non-deterministic too but its only available via API.
You begin to see why Microsoft is supporting Linux more and more.
On 2025-12-25 06:09, BGB wrote:
On 12/24/2025 5:22 AM, BGB wrote:
Some people seem to really like using lookup tables.
Of course; they can speed up things significantly. And simplify
the operations for contemporary data sizes (bytes, 64 bit words,
etc.).
I don't really get the point of lookup table driven RNGs.
In e.g. bit-oriented algorithms (e.g. based on LFSR) you can speed
up processing significantly by processing larger quantities (like octets/bytes) and processing/accessing values then byte-wise.[*]
Someone already mentioned that a bit-wise operating PN-generator
for random numbers could make use of such a table driven approach.
(You can extend that principle to larger quantities than bits, e.g.
to gain from larger processor word lengths, especially if you need
those larger entities as the result in the first place.)
A lot of these ones are, say, a table with 256 spots, and an index.
Each time one generates a random number (usually a byte), it returns
the value at that location in the table and advances to the next index.
Sometimes some get clever and use an algorithm to jitter the table index.
I have mostly seen this strategy used in old game engines.
These typically fail the image test as by design they give repeating
patterns.
[...]
Janis
[*] Here's some old "C" example for a CRC-16 using table-lookup... http://random.gridbug.de/ccitt_crc16.c
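The same byte-at-a-time table idea, sketched from scratch rather than
taken from that file: build a 256-entry table for the CCITT polynomial
0x1021 once, then process the input one byte per step instead of one
bit per step.

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

static uint16_t crc16_table[256];

static void crc16_init(void)   /* precompute one entry per possible input byte */
{
    for (unsigned i = 0; i < 256; i++) {
        uint16_t crc = (uint16_t)(i << 8);
        for (int b = 0; b < 8; b++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
        crc16_table[i] = crc;
    }
}

static uint16_t crc16_ccitt(const uint8_t *p, size_t n, uint16_t crc)
{
    while (n--)   /* one table lookup per byte instead of eight bit steps */
        crc = (uint16_t)((crc << 8) ^ crc16_table[((crc >> 8) ^ *p++) & 0xFF]);
    return crc;
}

int main(void)
{
    const char *msg = "123456789";
    crc16_init();
    printf("%04x\n", crc16_ccitt((const uint8_t *)msg, strlen(msg), 0xFFFF));
    return 0;
}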
One entropy-mining process is to use "clock()" or similar and then
spin in a loop for a certain amount of time effectively building a
hash of the values returned by clock. The exact timing when the
values change will tend to carry a certain amount of entropy.
The turbulence of the air/gas inside disk drives is apparently a good
source of randomness.
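A rough sketch of that clock()-spinning idea (the mixing constants are
just FNV-1a; everything here is illustrative and not suitable for
anything security-related):

#include <stdio.h>
#include <stdint.h>
#include <time.h>

static uint64_t mine_entropy(int ticks_to_watch)
{
    uint64_t h = 0xcbf29ce484222325ULL;        /* FNV-1a offset basis */
    clock_t last = clock();
    for (int t = 0; t < ticks_to_watch; t++) {
        uint64_t spins = 0;
        clock_t now;
        while ((now = clock()) == last)        /* busy-wait for the next tick */
            spins++;
        last = now;
        h ^= spins;                            /* the spin count carries the jitter */
        h *= 0x100000001B3ULL;                 /* FNV-1a prime */
    }
    return h;
}

int main(void)
{
    printf("seed: %llu\n", (unsigned long long)mine_entropy(8));
    return 0;
}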
On Thu, 25 Dec 2025 03:07:03 -0600, BGB wrote:
One entropy-mining process is to use "clock()" or similar and then
spin in a loop for a certain amount of time effectively building a
hash of the values returned by clock. The exact timing when the
values change will tend to carry a certain amount of entropy.
The turbulence of the air/gas inside disk drives is apparently a good
source of randomness.
On 12/25/2025 1:31 PM, Lawrence D'Oliveiro wrote:
On Thu, 25 Dec 2025 03:07:03 -0600, BGB wrote:
One entropy-mining process is to use "clock()" or similar and then
spin in a loop for a certain amount of time effectively building a
hash of the values returned by clock. The exact timing when the
values change will tend to carry a certain amount of entropy.
The turbulence of the air/gas inside disk drives is apparently a good
source of randomness.
Yeah, but one doesn't easily have access to this information.
Likewise for access to the low-order bits of CPU thermometers or similar, etc.
For some of my targets, there is also no HDD (typically, everything runs off of SD cards).
FWIW, in my own CPU design, there is actually a hardware RNG where internal signals are basically gathered up and fed around the bus in a special noise channel and used to continuously feed into a hardware RNG for which a value can be read with a special CPU instruction.
But, alas, mainline CPUs lack such a feature.
On Thu, 12/25/2025 4:29 PM, BGB wrote:
On 12/25/2025 1:31 PM, Lawrence D'Oliveiro wrote:
On Thu, 25 Dec 2025 03:07:03 -0600, BGB wrote:
One entropy-mining process is to use "clock()" or similar and then
spin in a loop for a certain amount of time effectively building a
hash of the values returned by clock. The exact timing when the
values change will tend to carry a certain amount of entropy.
The turbulence of the air/gas inside disk drives is apparently a good
source of randomness.
Yeah, but one doesn't easily have access to this information.
Likewise to access from the low order bits of CPU thermometers or similar, etc.
For some of my targets, there is also no HDD (typically, everything runs off of SD cards).
FWIW, in my own CPU design, there is actually a hardware RNG where internal signals are basically gathered up and fed around the bus in a special noise channel and used to continuously feed into a hardware RNG for which a value can be read with a special CPU instruction.
But, alas, mainline CPUs lack such a feature.
How have you concluded such a thing?
My CPU happens to have a random number generator running at 500MB/sec.
And it works on the same principle as other RNGs. It uses one physical process for entropy, and it uses a pseudo random number generator for
the at-speed part (the 500MB/sec).
On mine, there are 16 ring oscillators, with one ring oscillator
having three inverters in a ring. The slowest oscillator has 59 inverters
in a row. (The number of inverters must be an odd number, in order
to ensure the oscillators start OK.) A sampling circuit samples all
sixteen RO and creates a 16 bit number. The 16 bit number is a "seed" to
the pseudo random number generator.
Keywords like: "RDRAND, RDSEED"
https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/white-papers/amd-random-number-generator.pdf
There are other implementations. Intel has more than one method for doing this, historically.
https://www.electronicdesign.com/resources/article/21796238/understanding-intels-ivy-bridge-random-number-generator
The Linux people happen not to like those, but, they exist anyway, chugging away.
And completely unrelated.
https://www.2uo.de/myths-about-urandom/
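For anyone who wants to poke at it from C: the RDRAND route is
available through a compiler intrinsic, roughly like this (needs an x86
CPU with RDRAND and, for GCC/Clang, -mrdrnd; _rdrand64_step() returns 1
on success and 0 when the read should be retried):

#include <immintrin.h>
#include <stdio.h>

int main(void)
{
    unsigned long long v;
    for (int tries = 0; tries < 10; tries++) {
        if (_rdrand64_step(&v)) {          /* hardware delivered 64 random bits */
            printf("%016llx\n", v);
            return 0;
        }
    }
    fprintf(stderr, "RDRAND kept failing\n");
    return 1;
}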
My CPU happens to have a random number generator running at
500MB/sec.
...
The Linux people happen not to like those, but, they exist anyway,
chugging away.
On Thu, 25 Dec 2025 23:25:39 -0500, Paul wrote:
My CPU happens to have a random number generator running at
500MB/sec.
...
The Linux people happen not to like those, but, they exist anyway,
chugging away.
Because it's difficult to see how you can trust them.
Not without thoroughly mashing them through something like this <https://en.wikipedia.org/wiki/Fortuna_(PRNG)>, anyway.
On Fri, 12/26/2025 12:42 AM, Lawrence D'Oliveiro wrote:
On Thu, 25 Dec 2025 23:25:39 -0500, Paul wrote:
My CPU happens to have a random number generator running at
500MB/sec.
...
The Linux people happen not to like those, but, they exist anyway,
chugging away.
Because it's difficult to see how you can trust them.
Not without thoroughly mashing them through something like this
<https://en.wikipedia.org/wiki/Fortuna_(PRNG)>, anyway.
The claim was that x86 processors didn't have anything.
How did you implement the colored squares? Using characters from
an "extended" character set and ANSI controls for the colors?
Looks nice.[*]
Only that it looks as if one gets too much information (compared
to the original Mastermind)! IMO and AFAIK one should *not* get
the _exact_ place of a wrong digit indicated. (The green hints
should all be left aligned, and the [optional] red ones all right
aligned, and the blue ones in between.)
[*] 18 months ago I wrote an optically less appealing command-line
variant to refresh my Algol 68 skills. Though I just noticed that
I hadn't finished it yet; I've only implemented playing modes 1
and 2. ;-)
Enter a value for #places: 4
Enter a value for #colors: 6
Available playing modes:
1 - computer selects, human guesses
2 - human selects, computer guesses
3 - alternate select/guess roles per game
4 - human selects, human guesses
5 - computer selects, computer guesses
0 - leave the game
Choose the playing mode (1-5): 1
A secret color combination to guess has been chosen.
You have to guess it.
Enter a color combination: 1122
Turn 1: 1 1 2 2 -
Enter a color combination: 3345
Turn 2: 3 3 4 5 - @
Enter a color combination: 6444
Turn 3: 6 4 4 4 - @@
Enter a color combination: 6646
Turn 4: 6 6 4 6 - @@@@
You guessed it!
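For what it's worth, the classic feedback rule (count exact-position
matches first, then right-colour-wrong-place matches from what is left,
which is what the left-aligned hints above correspond to) can be
sketched like this; the function name and the 16-colour cap are just
for the demo. For the secret 6 6 4 6 and the guess 6 4 4 4 above it
reports 2 exact matches and 0 colour-only ones:

#include <stdio.h>

/* Classic Mastermind scoring: exact matches first, then colour-only
   matches counted from the positions that were not exact. Assumes
   colour codes are 1..colors with colors <= 15. */
static void score(const int *secret, const int *guess, int places, int colors,
                  int *exact, int *color_only)
{
    int sec_count[16] = {0}, gue_count[16] = {0};
    *exact = 0;
    for (int i = 0; i < places; i++) {
        if (secret[i] == guess[i])
            (*exact)++;
        else {
            sec_count[secret[i]]++;   /* leftover colours in the secret */
            gue_count[guess[i]]++;    /* leftover colours in the guess  */
        }
    }
    *color_only = 0;
    for (int c = 1; c <= colors; c++)
        *color_only += sec_count[c] < gue_count[c] ? sec_count[c] : gue_count[c];
}

int main(void)
{
    int secret[4] = { 6, 6, 4, 6 };
    int guess[4]  = { 6, 4, 4, 4 };
    int e, w;
    score(secret, guess, 4, 6, &e, &w);
    printf("exact: %d, colour-only: %d\n", e, w);   /* prints: exact: 2, colour-only: 0 */
    return 0;
}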
Nothing specific to either cmd.exe or PowerShell, but the Windows
command line is fundamentally broken. This is because it derives from
the CP/M command line model, which in turn was inherited from old-time
DEC operating systems.
On these DEC systems, the command line was a simple string buffer. So
there is this assumption that program invocation is always going to be mediated by some kind of "shell" program, and the concept of one
program directly invoking another is either nonexistent, or only
grudgingly tolerated.
Contrast this with the Unix approach, where the command line is an
array of separate string "words". There is no "shell" that occupies a privileged place in the system; any program can directly invoke any
other, without having to worry about properly escaping any special
characters that might be (mis)interpreted by some "shell".
While arguably a typical C library "rand()" isn't that strong, if one
only has a sequence of its output values, it might still take an
impractical amount of time to brute-force search the entire seed space
for a 64-bit seed.
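Scaled down so it finishes instantly, the brute-force idea looks like
this: record a few outputs, then re-seed and replay every candidate
seed until the outputs match. Here the search space is deliberately
only 16 bits; a real 64-bit space is what makes it impractical.

#include <stdio.h>
#include <stdlib.h>

#define NOBS 8   /* number of observed outputs */

int main(void)
{
    unsigned secret = 0xBEEF;          /* pretend we don't know this seed */
    int observed[NOBS];

    srand(secret);                     /* the "victim" produces some outputs */
    for (int i = 0; i < NOBS; i++)
        observed[i] = rand();

    /* attacker side: replay every candidate seed until the outputs match */
    for (unsigned s = 0; s <= 0xFFFFu; s++) {
        srand(s);
        int match = 1;
        for (int i = 0; i < NOBS && match; i++)
            if (rand() != observed[i])
                match = 0;
        if (match) {
            printf("first seed reproducing the outputs: %u\n", s);
            break;
        }
    }
    return 0;
}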
On Fri, 26 Dec 2025 01:52:15 -0500, Paul wrote:
On Fri, 12/26/2025 12:42 AM, Lawrence D'Oliveiro wrote:
On Thu, 25 Dec 2025 23:25:39 -0500, Paul wrote:
My CPU happens to have a random number generator running at
500MB/sec.
...
The Linux people happen not to like those, but, they exist anyway,
chugging away.
Because it's difficult to see how you can trust them.
Not without thoroughly mashing them through something like this
<https://en.wikipedia.org/wiki/Fortuna_(PRNG)>, anyway.
The claim was, that x86 processors didn't have anything.
Not what I was responding to. I said nothing about such a claim, either
way.
On Wed, 24 Dec 2025 23:35:55 -0600, BGB wrote:
While arguably a typical C library "rand()" isn't that strong, if one
has a number sequence of output random digits, it might still take an
impractical amount of time to brute-force search the entire seed space
for a 64-bit seed.
That is a great point IMO. After reading this I used Gemini to get a guess
at the number of permutations of A-Z; its reply was:
'over 403 quintillion millions' of permutations of A-Z...
Now if we split that list (assuming each line was a randomized permutation of the 26 characters A-Z) & used each line exactly *once*, then destroyed it, we might maintain some privacy. At least till quantum stuff is in everyday use...
Though one possibility could be to increase the strength of stack-
canaries by flagging them with relocs, allowing the program loader to
itself re-jitter the canary values without needing a recompile.
Well, and GCC compiled code has an additional weakness here in that GCC doesn't normally use stack canaries (they seemingly do very little to
give binaries any kind of resistance against buffer overflow exploits).
On Wed, 24 Dec 2025 09:00:50 -0000 (UTC)
antispam@fricas.org (Waldek Hebisch) wrote:
Michael S <already5chosen@yahoo.com> wrote:
On Tue, 23 Dec 2025 17:54:05 -0000 (UTC)
antispam@fricas.org (Waldek Hebisch) wrote:
Michael S <already5chosen@yahoo.com> wrote:
On Mon, 22 Dec 2025 18:41:10 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
Also, the TestU01 suite is made for generators with 32-bit output.
M. O'Neill used an ad hoc technique to make it applicable to
generators with 64-bit output. Is this technique right? Or maybe
it puts 64-bit PRNGs at an unfair disadvantage?
My point of view is that a generator can be used to generate a long
bitstream. Then you can cut the bitstream and get numbers of the
desired size. Good tests should check that such usage leads
to reasonable properties. So the fact that one generator produces
32-bit pieces and another produces 64-bit pieces should be irrelevant
to the test.
What you say is correct in a few use cases. But there are many use
cases (in the field of testing numeric code, probably most of them)
in which "less random" LS bits are acceptable.
Not that I can see why that could be the case for MT19937-64, but it
could apply to one or two of the other 64-bit generators tested by
O'Neill.
Besides, I strongly disagree with at least one assertion made by O'Neill: "While security-related applications should
use a secure generator, because we cannot always know the future
contexts in which our code will be used, it seems wise for all applications to avoid generators that make discovering their
entire internal state completely trivial."
No, I know exactly what I am doing. I know exactly that for my application easy discovery of the complete state of the PRNG is not a
defect.
O'Neill is not a prophet; ignore what she says if you think you
know better (which is probably the case here).
Anyway, even if I am skeptical about her criticism of popular
PRNGs, intuitively I agree with the constructive part of the
article - a medium-quality PRNG that feeds a medium-quality hash
function can potentially produce a very good, fast PRNG with a rather
small internal state.
She seems to care very much about having the minimal possible state.
That may be nice on embedded systems, but in general I would
happily accept a slightly bigger state (say 256 bits). But if
we can get good properties with a very small state, then why not?
After all, reading the state and updating it takes code, so
a small state helps with having a fast generator.
Agreed.
Concerning the Mersenne Twister, she is not the only one to
criticise it. My personal opinion is that, given its large
state and not-so-simple update, the Mersenne Twister would
have to be very, very good to justify its use.
One theoretical advantage of MT19937 is that it has a period of
astronomical proportions. Which means that one instance of the PRNG could be
de-multiplexed into millions or billions of sub-streams with no
detectable degradation in the quality of each sub-stream.
However, I fail to see how de-multiplexing into more than about one
thousand sub-streams can be practical. And for the latter the period does
not need to be astronomical; something like period = 2**96 would be
fully sufficient, with many bits to spare.
So, in theory I agree with the criticism. But in practice I am not
bothered by the size of the MT state.
But it
fails some tests, so it does not look _better_ than other
generators.
It would be interesting to find out which tests failed.
I wonder if the test suite can run faster on a multicore computer. I don't
want to wait 5-6 hours just to find out that the report does not provide
the information that I am looking for.
On a related note, I think that even a simple counter fed into a
high-quality hash function (not cryptographically high quality, far
less than that) can produce an excellent PRNG with an even smaller
internal state. But not a very fast one. Although the speed depends
on the specifics of the computer used. I can imagine a computer that
has a low-latency Rijndael128 instruction. On such a computer, running a
counter through 3-4 rounds of Rijndael will produce a very good PRNG
that is only 2-3 times slower than, for example, an LCG 128/64.
Maybe.
Maybe I'll even test my hypothesis. Eventually. Except that, again, I
am not thrilled by the idea of waiting 6 hours for each result.
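As a concrete (and purely illustrative) instance of the
counter-into-a-hash idea with a non-cryptographic mixer: splitmix64 run
over an incrementing counter, whose entire state is the 64-bit counter
itself. The constants are the well-known splitmix64 ones; this is not
the Rijndael construction discussed above.

#include <stdint.h>
#include <stdio.h>

static uint64_t counter;        /* the entire PRNG state is this 64-bit counter */

static uint64_t counter_prng(void)
{
    uint64_t z = (counter += 0x9E3779B97F4A7C15ULL);   /* advance the counter */
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;       /* splitmix64 mixing */
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

int main(void)
{
    counter = 42;               /* seeding = picking the starting counter */
    for (int i = 0; i < 4; i++)
        printf("%016llx\n", (unsigned long long)counter_prng());
    return 0;
}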
On Wed, 24 Dec 2025 12:12:11 +0200
Michael S <already5chosen@yahoo.com> wrote:
[...]
I reproduced the results of M. O'Neill. Luckily, on semi-modern hardware
(Coffee Lake or EPYC3) and for the PRNGs in question, BigCrush finishes in
2-2.5 hours. Which is a pain, but considerably less so than 5 hours.
mt19937 fails all tests of the scomp_LinearComp() variety (2 in Crush and 2
in BigCrush). It passes all the rest of the tests.
After re-reading O'Neill, I see that she wrote that, but the first time I
didn't pay attention.
I have read the description of scomp_LinearComp(). I can't say that I
understood much, neither the theory nor the parameters; in particular, even
after looking through the code I have no idea about the meaning of the parameter
r.
I am not so sure that Pierre L'Ecuyer himself fully understands this
test, apart from the fact that many moons ago the basic algorithm for
calculation of linear complexity was published in IEEE Transactions on Information Theory. Otherwise, his description would not look so much
like hand waving. As to O'Neill, she likely understands it better than
I do, but that's not a huge achievement :(
My less than scientific feeling about this test is that one part of it
checks whether the generator is an LFSR or in the LFSR family, and if the answer is yes then
the test fails.
So, effectively, mt19937 is punished for what it is rather than for the randomness of the results that it produces.
I made a minimal modification to the mt19937 algorithm, telling it to skip
every 19936th result word. With that modification it easily passes all 3
Crush batteries of Pierre L'Ecuyer.
Do I think that my modified mt19937 is better than the original? No, I
don't. IMHO, the only thing it is better at is passing the batteries of L'Ecuyer.
On related note, I think that even simple counter fed into high
quality hash function (not cryptographically high quality, far
less than that) can produce excellent PRNG with even smaller
internal state. But not very fast one. Although the speed depends
on specifics of used computer. I can imagine computer that has
low-latency Rijndael128 instruction. On such computer, running
counter through 3-4 rounds of Rijndael ill produce very good PRNG
that is only 2-3 times slower than, for example, LCG 128/64.
Maybe.
May be I'd even test my hypothesis. Eventually. Except that, again, I
am not thrilled by idea of waiting 6 hours for each result.
I tested.
It turned out that my hypothesis was wrong. Running a counter through 3
rounds of Rijndael128 is not enough. Running the counter through 4 rounds
is still not enough - it fails 1 test (#86) in the BigCrush battery.
I didn't test 5 rounds, but even if that is enough, which is likely, it
would almost certainly be slower than several other known methods.
All that with a simple 64-bit binary counter as the state variable.
With a 128-bit state and with partial chaining of 64 bits of the Rijndael
output back into part of the state (the other half of the state is still a
counter), passing all the batteries appears very easy. It only takes one
round for chaining and another one for hashing. But under O'Neill's
figures of merit, using a 128-bit PRNG state is considered cheating ;)
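For readers who want to see the mechanics, here is a hedged sketch of
the counter-plus-Rijndael-rounds structure using the AES-NI intrinsic
_mm_aesenc_si128 (needs an AES-NI capable x86 CPU and, with GCC/Clang,
-maes). The two "round keys" are arbitrary constants rather than a key
schedule, and per the measurements above a couple of rounds over a bare
counter is not enough to pass the test batteries - this only shows the
shape of the construction:

#include <immintrin.h>   /* _mm_aesenc_si128 and friends */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    __m128i ctr = _mm_set_epi64x(0, 1);                 /* 128-bit counter (low lane counts) */
    __m128i one = _mm_set_epi64x(0, 1);
    __m128i k1  = _mm_set_epi64x(0x243F6A8885A308D3LL, 0x13198A2E03707344LL);
    __m128i k2  = _mm_set_epi64x(0x452821E638D01377LL, 0x082EFA98EC4E6C89LL);

    for (int i = 0; i < 4; i++) {
        __m128i x = _mm_aesenc_si128(ctr, k1);          /* Rijndael round 1 */
        x = _mm_aesenc_si128(x, k2);                    /* Rijndael round 2 */
        uint64_t out[2];
        _mm_storeu_si128((__m128i *)out, x);
        printf("%016llx %016llx\n",
               (unsigned long long)out[0], (unsigned long long)out[1]);
        ctr = _mm_add_epi64(ctr, one);                  /* advance the counter */
    }
    return 0;
}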
Michael S <already5chosen@yahoo.com> wrote:
[...]
I reproduced the results of M. O'Neill. Luckily, on semi-modern hardware
(Coffee Lake or EPYC3) and for the PRNGs in question, BigCrush finishes
in 2-2.5 hours. Which is a pain, but considerably less so than 5
hours. mt19937 fails all tests of the scomp_LinearComp() variety (2 in
Crush and 2 in BigCrush). It passes all the rest of the tests.
After re-reading O'Neill, I see that she wrote that, but the first time I
didn't pay attention.
I have read the description of scomp_LinearComp(). I can't say that I understood much, neither the theory nor the parameters; in particular, even
after looking through the code I have no idea about the meaning of the
parameter r.
I am not so sure that Pierre L'Ecuyer himself fully understands this test, apart from the fact that many moons ago the basic algorithm for calculation of linear complexity was published in IEEE Transactions
on Information Theory. Otherwise, his description would not look so
much like hand waving. As to O'Neill, she likely understands it better
than I do, but that's not a huge achievement :(
My less than scientific feeling about this test is that one part of
it checks whether the generator is an LFSR or in the LFSR family, and if the answer is
yes then the test fails.
So, effectively, mt19937 is punished for what it is rather than
for the randomness of the results that it produces.
Well, I peeked at the code but did not read the description of the
test. In the code I see mention of Berlekamp-Massey. That is a
well-known algorithm for finding regularities in data; basically it
tries to find a linear recurrence satisfied by the data. One of the
things that I do is look for recurrences, including linear
ones. Up to now I have not run any serious simulations for
such things, but I may wish to do so (and IIUC other people did
run some simulations). When doing simulations I really do
not want to see artifacts due to the PRNG producing a sequence with
different features than a random sequence. So for me, if mt19937
produces a sequence with visible extra linear regularities, that is a
significant failure.
I made a minimal modification to the mt19937 algorithm, telling it to
skip every 19936th result word. With that modification it easily passes
all 3 Crush batteries of Pierre L'Ecuyer.
Do I think that my modified mt19937 is better than the original? No, I
don't. IMHO, the only thing it is better at is passing the batteries of L'Ecuyer.
Your modification is enough to fool the tests. It is not clear
to me if it is enough to fool better regularity finders, so
probably the generator is still defective.
Also, note the basic principle of statistical testing: you should collect
and process the data first, then apply the statistical test _once_. Repeated
testing with tweaked data is likely to produce a false pass. If you
really want to tweak the data, the whole thing should be treated as
one composite test with its own acceptance criteria (which is more
stringent than the separate tests). Given that you used knowledge of the
failing tests to modify the generator, passing the tests after that is a much
weaker claim than passing the tests for a generator without knowledge of
the tests.
Using 64-bit state and generating 32-bit output at each step is exactly
On related note, I think that even simple counter fed into high
quality hash function (not cryptographically high quality, far
less than that) can produce excellent PRNG with even smaller
internal state. But not very fast one. Although the speed
depends on specifics of used computer. I can imagine computer
that has low-latency Rijndael128 instruction. On such
computer, running counter through 3-4 rounds of Rijndael ill
produce very good PRNG that is only 2-3 times slower than, for
example, LCG 128/64.
Maybe.
May be I'd even test my hypothesis. Eventually. Except that,
again, I am not thrilled by idea of waiting 6 hours for each
result.
I tested.
It turned out that my hypothesis was wrong. Running counter through
3 rounds of Rijndael128 is not enough. Running counter through 4
rounds is still not enough - it fails 1 test (#86) in BigCrash
battery. I didn't test 5 rounds, but even if it is enough, which is
likely, it would almost certainly be slower than other several
known methods.
All that with simple 64-bit binary counter as a state variable.
With 128-bit state and with partial chaining of 64 bits of Rijndael
output back into part of state (the other half of state is still a counter), passing all batteries appear very easy. It only takes one
round for chaining and another one for hashing. But under O'Neil's
figures of merit using 128-bit PRNG state considered cheating );
O'Neill writes about the birthday test: if you take values from an
N-element set, with more than sqrt(N) samples you should get
repetitions. Consider N equal to 2 to the power 64. In
heavy use one could generate more than sqrt(N) values.
In a PRNG having 64-bit state and producing the state as the value,
all values are distinct, so the generator would fail such a test.
One could try to fix this by not exposing the state, say producing
only 32 bits in each step.
But on a 64-bit machine it looks
more efficient to use a 128-bit state and produce 64 bits in
each step.
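A compact illustration of that last point - 64 bits of hidden LCG
state, only 32 permuted bits emitted per step - is essentially the
minimal PCG32 recipe (the constants below are the usual ones; this is a
sketch, not an endorsement of any particular generator):

#include <stdint.h>
#include <stdio.h>

static uint64_t state = 0x853c49e6748fea9bULL;   /* 64-bit hidden state */

static uint32_t next32(void)
{
    uint64_t old = state;
    state = old * 6364136223846793005ULL + 1442695040888963407ULL;  /* LCG step */
    /* emit only 32 bits, permuted from the old state */
    uint32_t xorshifted = (uint32_t)(((old >> 18u) ^ old) >> 27u);
    uint32_t rot = (uint32_t)(old >> 59u);
    return (xorshifted >> rot) | (xorshifted << ((32 - rot) & 31));
}

int main(void)
{
    for (int i = 0; i < 4; i++)
        printf("%08x\n", (unsigned)next32());
    return 0;
}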
Yes it's just a variant. I still need to merge/harmonize
this new way (popular with some folks here in the U.S.)
with the older 'proper' way you & I learned it.
Check out <https://jrgraphix.net/r/Unicode/2500-257F>
Lots of useful glyphs there. Assuming the reader's software can
render these, here's some tree glyphs that provide a smart visual representation of branching hierarchical taxonomies.
Another possibility is sixel graphics
<https://www.arewesixelyet.com/>. Seems to be enjoying a mini-revival
at the moment.
On Fri, 26 Dec 2025 08:08:57 -0000 (UTC), Michael Sanders wrote:
Yes it's just a variant. I still need to merge/harmonize
this new way (popular with some folks here in the U.S.)
with the older 'proper' way you & I learned it.
Merged & harmonized...
int main(int argc, char *argv[]) {
// if app named 'moo' play bulls & cows else mastermind
const char *p = strrchr(argv[0], '/');
What if 'argv[0]' is NULL (and argc == 0)?
*ISO C (C17 / C23)*:
C17, 5.1.2.2.1 "Program startup"
The value of argc shall be nonnegative.
argv[argc] shall be a null pointer.
If the value of argc is greater than zero, the array members argv[0]
through argv[argc-1] inclusive shall contain pointers to strings
which are given implementation-defined values.
...
What say you?
Clearly on Windows, there are no guarantees about what argc contains, so
you shouldn't be relying on it.
app.exe foo "bar baz" 123
"app.exe" foo "bar baz" 123
On Tue, 30 Dec 2025 18:42:30 GMT, Scott Lurndal wrote:
What if 'argv[0]' is NULL (and argc == 0)?
Well, seems we have to make a choice, ISO vs. POSIX:
*ISO C (C17 / C23)*:
C17, 5.1.2.2.1 "Program startup"
The value of argc shall be nonnegative.
argv[argc] shall be a null pointer.
*POSIX.1-2017 (and later)*
POSIX execve() specification:
The argument argv is an array of character pointers
to null-terminated strings.
The application shall ensure that argv[0] points to a filename
string that is associated with the process being started.
What say you?
On Wed, 31 Dec 2025 02:01:55 -0000 (UTC), Michael Sanders wrote:
*ISO C (C17 / C23)*:
C17, 5.1.2.2.1 "Program startup"
The value of argc shall be nonnegative.
argv[argc] shall be a null pointer.
If the value of argc is greater than zero, the array members argv[0]
through argv[argc-1] inclusive shall contain pointers to strings
which are given implementation-defined values.
...
What say you?
Clearly on Windows, there are no guarantees about what argc contains, so
you shouldn't be relying on it.
Summary: Some systems guarantee that argc>=1 and argv[0] points to
a valid string, but software that's intended to be portable should
tolerate argc==0 and argv[0]==NULL.
For more information, see
<https://github.com/Keith-S-Thompson/argv0>.
On Wed, 31 Dec 2025 03:10:52 -0000 (UTC), Lawrence D'Oliveiro wrote:
Clearly on Windows, there are no guarantees about what argc contains, so
you shouldn't be relying on it.
Some Windows snippets:
Michael Sanders <porkchop@invalid.foo> writes:
On Tue, 30 Dec 2025 18:42:30 GMT, Scott Lurndal wrote:
What if 'argv[0]' is NULL (and argc == 0)?
Well, seems we have to make a choice, ISO vs. POSIX:
*ISO C (C17 / C23)*:
C17, 5.1.2.2.1 "Program startup"
The value of argc shall be nonnegative.
argv[argc] shall be a null pointer.
[...]
*POSIX.1-2017 (and later)*
POSIX execve() specification:
The argument argv is an array of character pointers
to null-terminated strings.
The application shall ensure that argv[0] points to a filename
string that is associated with the process being started.
[...]
What say you?
It happens that I recently spent some time looking into this.
As you say, POSIX requires argc >= 1, but ISO C only guarantees
argc >= 0.
If argc == 0, a program that assumes argv[0] is non-null
can run into serious problems if that assumption is invalid.
In particular, a program called "pkexec" would try to traverse
arguments starting with argv[1], which logically doesn't
exist if argc==0. Due to the way program arguments are laid
out in memory, argv[1] is also envp[0]. Frivolity ensued.
See <https://nvd.nist.gov/vuln/detail/cve-2021-4034>.
The Linux kernel updated execve to ensure that the invoked program
has argc>=1. It was patched in early 2022. NetBSD still has this vulnerability.
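On a POSIX system one can reproduce the argc == 0 situation directly; a
minimal sketch ("./callee" is a placeholder path, and on post-2022
Linux kernels the callee will still be started with argc >= 1):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char *argv[] = { NULL };                /* no argv[0]: callee sees argc == 0 */
    char *envp[] = { "DEMO=1", NULL };
    execve("./callee", argv, envp);
    perror("execve");                       /* only reached if execve fails */
    return 1;
}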
Summary: Some systems guarantee that argc>=1 and argv[0] points to
a valid string, but software that's intended to be portable should
tolerate argc==0 and argv[0]==NULL.
For more information, see
<https://github.com/Keith-S-Thompson/argv0>.
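The defensive pattern that summary implies is small; a sketch:

#include <stdio.h>

int main(int argc, char **argv)
{
    /* never assume argc >= 1 or that argv[0] is non-null */
    const char *progname = (argc > 0 && argv[0] != NULL && argv[0][0] != '\0')
                           ? argv[0]
                           : "unknown-program";   /* fallback name */
    fprintf(stderr, "%s: starting with argc=%d\n", progname, argc);
    return 0;
}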
On Wed, 31 Dec 2025 02:01:55 -0000 (UTC), Michael Sanders wrote:
How did you come to this conclusion?
*ISO C (C17 / C23)*:
C17, 5.1.2.2.1 "Program startup"
The value of argc shall be nonnegative.
argv[argc] shall be a null pointer.
If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings
which are given implementation-defined values.
...
What say you?
Clearly on Windows, there are no guarantees about what argc contains, so
you shouldn't be relying on it.
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Wed, 31 Dec 2025 02:01:55 -0000 (UTC), Michael Sanders wrote:
*ISO C (C17 / C23)*:
C17, 5.1.2.2.1 "Program startup"
The value of argc shall be nonnegative.
argv[argc] shall be a null pointer.
If the value of argc is greater than zero, the array members argv[0]
through argv[argc-1] inclusive shall contain pointers to strings
which are given implementation-defined values.
...
What say you?
Clearly on Windows, there are no guarantees about what argc contains, so
you shouldn't be relying on it.
That's not clear. Linux (since 2022) guarantees argc>=1.
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Wed, 31 Dec 2025 02:01:55 -0000 (UTC), Michael Sanders wrote:
*ISO C (C17 / C23)*:
C17, 5.1.2.2.1 "Program startup"
The value of argc shall be nonnegative.
argv[argc] shall be a null pointer.
If the value of argc is greater than zero, the array members argv[0]
through argv[argc-1] inclusive shall contain pointers to strings
which are given implementation-defined values.
...
What say you?
Clearly on Windows, there are no guarantees about what argc contains, so
you shouldn't be relying on it.
That's not clear. Linux (since 2022) guarantees argc>=1. I don't
know whether Windows makes a similar guarantee, but it's entirely
plausible that it could.
But not all systems guarantee argc>=1, so any portable code shouldn't
make assumptions beyond what's guaranteed by ISO C.
On Tue, 12/30/2025 10:37 PM, Keith Thompson wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Wed, 31 Dec 2025 02:01:55 -0000 (UTC), Michael Sanders wrote:
*ISO C (C17 / C23)*:
C17, 5.1.2.2.1 "Program startup"
The value of argc shall be nonnegative.
argv[argc] shall be a null pointer.
If the value of argc is greater than zero, the array members argv[0]
through argv[argc-1] inclusive shall contain pointers to strings
which are given implementation-defined values.
...
What say you?
Clearly on Windows, there are no guarantees about what argc contains, so
you shouldn't be relying on it.
That's not clear. Linux (since 2022) guarantees argc>=1. I don't
know whether Windows makes a similar guarantee, but it's entirely
plausible that it could.
But not all systems guarantee argc>=1, so any portable code shouldn't
make assumptions beyond what's guaranteed by ISO C.
The Windows guarantee is with MSVC, according to this answer.
********** CoPilot Question **********
Does the Windows operating system guarantee that argc >= 1 for an EXE program ?
argv[0] usually contains the program absolute path.
********** CoPilot Answer ************
Here's the clear, standards-grounded, Windows-specific answer you're looking for, with citations from the search results.
# **Short Answer**
**Yes. On Windows, when using the Microsoft C runtime (MSVCRT), argc
is always guaranteed to be >= 1**, and **argv[0] always exists**. However, **Windows itself does *not* guarantee what argv[0] contains** -
only that the C runtime *treats* it as the program name.
---
# **What Windows Guarantees vs. What the C Runtime Guarantees**
### **Microsoft C Runtime Guarantee**
Microsoft's documentation explicitly states:
- *"Since the program name is considered an argument, the value of argc is at least one."*
So if your EXE is built with MSVC and uses the standard main or wmain , you can rely on:
- argc >= 1
- argv[0] is a valid null-terminated string
- argv[argc] == NULL
### **MSVC Startup Code Behavior**
MSVC startup code parses the command line and
**always inserts the program name as argv[0] ** (quoted if needed).
---
# **What Windows *Does Not* Guarantee**
Windows itself does **not** enforce that the first token of the command line is the executable name. This is a convention, not a rule:
- * By convention the module name ... is the first argument ... Note that this is
only a convention and is not enforced in any way. *
This means:
- A caller can technically pass *anything* as the first token.
- A process created via CreateProcess can have an arbitrary lpCommandLine .
- argv[0] may not be the absolute path.
- It may not even be a valid path at all.
This is why portable code should not assume argv[0] is the executable path.
---
# **Cross-Platform Note**
The C standard does **not** guarantee that argv[0] contains the program name - only that it exists.
This is echoed in the search results:
- * In short: No... this is more of a common convention than a strict rule. *
---
# **Final Summary**
| Question | Answer |
|------------------------------------------------------|------------------------------------------|
| **Does Windows guarantee argc >= 1 ?** | **Yes - if using MSVC's C runtime.** |
| **Is argv[0] guaranteed to exist?** | **Yes - with MSVC.** |
| **Is argv[0] guaranteed to be the absolute path?** | **No.** |
| **Does Windows enforce any of this?** | **No - it's the C runtime, not the OS.** |
If you need the *actual* executable path, use:
- GetModuleFileName(NULL, ...) (Windows API)
---
https://learn.microsoft.com/en-us/cpp/c-language/argument-description?view=msvc-170&utm_source=copilot.com
https://learn.microsoft.com/en-us/cpp/c-language/parsing-c-command-line-arguments?view=msvc-170&utm_source=copilot.com
https://github-wiki-see.page/m/johnstevenson/winbox-args/wiki/How-Windows-parses-the-command-line?utm_source=copilot.com
https://www.codegenes.net/blog/is-argv-0-name-of-executable-an-accepted-standard-or-just-a-common-convention/?utm_source=copilot.com
When argv[0] Isn't the Executable Name
4.1 Invocation via exec Functions
4.2 Symbolic Links
4.3 Shell Scripts and Aliases
4.4 Debuggers, Emulators, and Special Environments
********** End CoPilot Answer ************
So is that a Yes or No?
My C compiler calls __getmainargs() in msvcrt.dll to get argc/argv.
__getmainargs() is also imported by programs compiled with Tiny C, and also with gcc 14.x from winlibs.com. (I assume it is actually called for the same purpose.)
The specs for __getmainargs() say that the returned argc value is always >= 1.
(I doubt whether msvcrt.dll, which is present because so many programs rely on it, is what is used by MSVC-compiled apps, but you'd have to look inside such an app to check. EXEs inside \windows\system tend to import DLLs with names like "api-ms-win...".)
In any case, it is easy enough to do a check on argc's value in your applications. (And on Windows, if it is 0 and you really need the path, you can get it with GetModuleFileNameA().)
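A minimal sketch of that GetModuleFileNameA() fallback (Windows-only,
purely illustrative):

#include <windows.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    char path[MAX_PATH];
    /* NULL module handle = "the current executable" */
    DWORD n = GetModuleFileNameA(NULL, path, sizeof path);
    if (n == 0 || n >= sizeof path) {
        fprintf(stderr, "GetModuleFileNameA failed\n");
        return 1;
    }
    printf("argv[0]:     %s\n", (argc > 0 && argv[0]) ? argv[0] : "(null)");
    printf("module path: %s\n", path);
    return 0;
}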
On 31/12/2025 17:30, Paul wrote:
I experimented a little with CreateProcess() in the caller (parent) and GetCommandLine() in the callee (child). It seems that [under Windows] it is impossible to pass an empty command line to a child process.
On Tue, 12/30/2025 10:37 PM, Keith Thompson wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Wed, 31 Dec 2025 02:01:55 -0000 (UTC), Michael Sanders wrote:
*ISO C (C17 / C23)*:
C17, 5.1.2.2.1 "Program startup"
The value of argc shall be nonnegative.
argv[argc] shall be a null pointer.
If the value of argc is greater than zero, the array members
argv[0] through argv[argc-1] inclusive shall contain pointers to
strings which are given implementation-defined values.
...
What say you?
Clearly on Windows, there are no guarantees about what argc contains,
so you shouldn't be relying on it.
That's not clear. Linux (since 2022) guarantees argc>=1. I don't
know whether Windows makes a similar guarantee, but it's entirely
plausible that it could.
But not all systems guarantee argc>=1, so any portable code
shouldn't make assumptions beyond what's guaranteed by ISO C.
The Windows guarantee is with MSVC, according to this answer.
[...]
So is that a Yes or No?
My C compiler calls __getmainargs() in msvcrt.dll to get argc/argv.
__getmainargs() is also imported by programs compiled with Tiny C,
and also with gcc 14.x from winlibs.com. (I assume it is actually
called for the same purpose.)
The specs for __getmainargs() say that the returned argc value is
always
>= 1.
(I doubt whether msvcrt.dll, which is present because so many
programs rely on it, is what is used by MSVC-compiled appls, but
you'd have to look inside such an app to check. EXEs inside
\windows\system tend to import DLLs with names like "api-ms-win...".)
In any case, it is easy enough to do a check on argc's value in your applications. (And on Windows, if it is 0 and you really need the
path, you can get it with GetModuleFileNameA().)
... using exec() in the caller sounds like a bad idea. It is just not how
these systems work and not how people write programs on them.
I'd implement caller with spawn(). I suppose that even on POSIX it
is more idiomatic.
On Wed, 31 Dec 2025 03:10:52 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
Clearly on Windows, there are no guarantees about what argc contains, so
you shouldn't be relying on it.
How did you come to this conclusion?
In any case, it is easy enough to do a check on argc's value in your applications. (And on Windows, if it is 0 and you really need the
path, you can get it with GetModuleFileNameA().)
On Wed, 31 Dec 2025 18:42:45 +0000, bart wrote:
In any case, it is easy enough to do a check on argc's value in your
applications. (And on Windows, if it is 0 and you really need the
path, you can get it with GetModuleFileNameA().)
Remember that, on *nix systems, the contents of argv are arbitrary and caller-specified. And none of them need bear any relation to the
actual filename of the invoked executable.
In fact, it is quite common for utilities to behave differently based
on the name, as passed in argv[0], by which they are invoked.
On Tue, 30 Dec 2025 19:35:12 -0800[...]
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
For more information, see
<https://github.com/Keith-S-Thompson/argv0>.
If you are interested in behavior on non-POSIX systems, primarily
Windows, but possibly others as well (e.g. VMS) then using exec() in
caller sounds like a bad idea. It just not how these systems work and
not how people write programs on them.
Even when exec() *appears* to works in some environments (like
msys2) it likely emulated by spawn() followed by exit().
I'd implement caller with spawn(). I suppose that even on POSIX it is
more idiomatic.
On Wed, 31 Dec 2025 03:10:52 -0000 (UTC)[...]
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
Clearly on Windows, there are no guarantees about what argc contains, so
you shouldn't be relying on it.
How did you come to this conclusion?
Keith's test appears to show the opposite - he was not able to convince
the Windows system to call an application with an empty argv list.
Of course, he tried only one way out of many, but knowing how the native
Windows system call works, it appears extremely likely that on Windows
argc < 1 is impossible.
On Wed, 31 Dec 2025 15:29:09 +0200, Michael S wrote:
On Wed, 31 Dec 2025 03:10:52 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
Clearly on Windows, there are no guarantees about what argc contains, so
you shouldn't be relying on it.
How did you come to this conclusion?
The fact that the C spec says so.
Is there any standard on Windows for
how different C compilers are supposed to handle argc/argv?
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:[...]
That's not clear. Linux (since 2022) guarantees argc>=1.
Does it? That seems to be up to the shell, since the exec()
manual pages on the latest Fedora Core release don't indicate
that argv[0] must be initialized or that argc be greater than zero.
# **Cross-Platform Note**
[...]
The C standard does **not** guarantee that argv[0] contains the
program name - only that it exists.
Paul <nospam@needed.invalid> writes:
That isn't quite correct, or is at least misleading. ISO C guarantees
that argv[0] exists, but not that it points to a string. On some
systems, it can be a null pointer.
Michael S <already5chosen@yahoo.com> writes:
On Tue, 30 Dec 2025 19:35:12 -0800[...]
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
For more information, see
<https://github.com/Keith-S-Thompson/argv0>.
If you are interested in behavior on non-POSIX systems, primarily
Windows, but possibly others as well (e.g. VMS) then using exec()
in caller sounds like a bad idea. It just not how these systems
work and not how people write programs on them.
Even when exec() *appears* to works in some environments (like
msys2) it likely emulated by spawn() followed by exit().
I'd implement caller with spawn(). I suppose that even on POSIX it
is more idiomatic.
If I were going to look into the behavior on Windows, I'd probably
want to use Windows native features. (I tried my test on Cygwin,
and the callee wasn't invoked.)
Apparently the Windows way to invoke a program is CreateProcessA().
But it takes the command line as a single string. There might not
be a Windows-native way to exercise the kind of control over argc
and argv provided by POSIX execve().
argv[0] merely returns what was typed on the command line to invoke the application.[...]
So if someone types:
C:\abc> prog
it may run a prog.exe found in, say, c:\programs\myapp, and return the full path as
"c:\programs\myapp\prog.exe".
args[0] will give you only "prog"; good luck with that!
On 1/1/26 12:29 AM, Keith Thompson wrote:
Paul <nospam@needed.invalid> writes:
That isn't quite correct, or is at least misleading. ISO C guarantees
that argv[0] exists, but not that it points to a string. On some
systems, it can contain be a null pointer.
I heard of this before.
Is it just theoretical, or do we have actual systems where
argv[0]==NULL? I never saw it happen in any modern operating system.
On Wed, 31 Dec 2025 15:00:24 -0800[...]
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
If I were going to look into the behavior on Windows, I'd probably
want to use Windows native features. (I tried my test on Cygwin,
and the callee wasn't invoked.)
That's likely because under Windows the callee is named callee.exe.
I didn't try it on Cygwin, but that was the reason for the failure under msys2.
Also, I am not sure if a slash in the name is allowed. Maybe a backslash
is required.
But, there is a difference between argv[0] and GetModuleFileName().
Michael S <already5chosen@yahoo.com> writes:
On Tue, 30 Dec 2025 19:35:12 -0800[...]
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
For more information, see
<https://github.com/Keith-S-Thompson/argv0>.
If you are interested in behavior on non-POSIX systems, primarily
Windows, but possibly others as well (e.g. VMS) then using exec() in
caller sounds like a bad idea. It just not how these systems work and
not how people write programs on them.
Even when exec() *appears* to works in some environments (like
msys2) it likely emulated by spawn() followed by exit().
I'd implement caller with spawn(). I suppose that even on POSIX it is
more idiomatic.
If I were going to look into the behavior on Windows, I'd probably
want to use Windows native features. (I tried my test on Cygwin,
and the callee wasn't invoked.)
Apparently the Windows way to invoke a program is CreateProcessA().
But it takes the command line as a single string. There might not
be a Windows-native way to exercise the kind of control over argc
and argv provided by POSIX execve().
On Windows there is a command line provided to applications via the GetCommandLine() API. This is a single zero-terminated string.
Windows views parsing of the command-line string as being in the
application's domain, I guess.
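A short Windows-only sketch of that: fetch the raw string with
GetCommandLineW() and split it with CommandLineToArgvW() (declared in
shellapi.h, link with Shell32.lib; the returned array is one LocalAlloc
block and is released with LocalFree):

#include <windows.h>
#include <shellapi.h>
#include <wchar.h>
#include <stdio.h>

int main(void)
{
    int argcW = 0;
    LPWSTR *argvW = CommandLineToArgvW(GetCommandLineW(), &argcW);
    if (argvW == NULL) {
        fprintf(stderr, "CommandLineToArgvW failed\n");
        return 1;
    }
    for (int i = 0; i < argcW; i++)
        wprintf(L"argv[%d] = \"%ls\"\n", i, argvW[i]);
    LocalFree(argvW);              /* the whole array is a single allocation */
    return 0;
}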
[...]
It creates executables with a ".exe" suffix,
but plays some tricks so that "foo.exe" also looks like "foo".
Are there any standards for how C argc/argv are supposed to behave on Windows?
On Wed, 31 Dec 2025 22:57:55 +0000, bart wrote:
But, there is a difference between argv[0] and GetModuleFileName().
So the latter cannot be used as a simple substitute for the former, as
you might have led us to believe.
In fact, it is quite common for utilities to behave differently based on the name, as passed in argv[0], by which they are invoked.
(And on Windows, if it is 0 and you really need the path, you can get it with GetModuleFileNameA().)
On Wed, 31 Dec 2025 09:37:08 -0000 (UTC), Lawrence D'Oliveiro wrote:
Are there any standards for how C argc/argv are supposed to behave
on Windows?
Good question, some more ways to open things (that I know of), see
2nd example for 'sort of' argc/argv...
[examples omitted]
On 01/01/2026 01:03, Lawrence D'Oliveiro wrote:
On Wed, 31 Dec 2025 22:57:55 +0000, bart wrote:
But, there is a difference between argv[0] and
GetModuleFileName().
So the latter cannot be used as a simple substitute for the former,
as you might have led us to believe.
It depends on your needs.
All those are at the sending end. But what would C code see at the
receiving end?
On Thu, 1 Jan 2026 07:32:34 -0000 (UTC), Michael Sanders wrote:
The first three cases look very simple.
On Wed, 31 Dec 2025 09:37:08 -0000 (UTC), Lawrence D'Oliveiro wrote:
Are there any standards for how C argc/argv are supposed to behave
on Windows?
Good question, some more ways to open things (that I know of), see
2nd example for 'sort of' argc/argv...
[examples omitted]
All those are at the sending end. But what would C code see at the
receiving end?
On Thu, 1 Jan 2026 14:05:29 +0000, bart wrote:
On 01/01/2026 01:03, Lawrence D'Oliveiro wrote:
On Wed, 31 Dec 2025 22:57:55 +0000, bart wrote:
But, there is a difference between argv[0] and
GetModuleFileName().
So the latter cannot be used as a simple substitute for the former,
as you might have led us to believe.
It depends on your needs.
You neglected to mention that when offering the substitute before
though, didn't you?
On Thu, 1 Jan 2026 19:02:49 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
All those are at the sending end. But what would C code see at the
receiving end?
The first three cases look very simple.
On Thu, 1 Jan 2026 21:53:20 +0200, Michael S wrote:
There is a spec that describes how that works in Microsoft's
On Thu, 1 Jan 2026 19:02:49 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
All those are at the sending end. But what would C code see at the
receiving end?
The first three cases look very simple.
Is there some spec in Windows which describes how that works?
A more interesting and meaningful question is how to do the reverse,
i.e. how to convert an argv[] array into flat form in a way that
guarantees that CommandLineToArgvW() parses it back into the original
form. Is it even possible in the general case, or are there limitations (ignoring, for the sake of brevity, the 2**15-1 size limit)?
Microsoft certainly has reverse conversion implemented, e.g. here: https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/spawnv-wspawnv
Through the years you were told so, by different people, and shown
the spec maybe 100 times. But being the troll you are, you
continue to ask.
Still, for the benefit of more sincere readers and also for myself,
in order to have both pieces in one place: https://learn.microsoft.com/en-us/windows/win32/api/shellapi/nf-shellapi-commandlinetoargvw
https://learn.microsoft.com/en-us/cpp/c-language/parsing-c-command-line-arguments
Experimenting with _spawnv() shows that Microsoft made no effort in the direction of invertible serialization/de-serialization of argv[] lists.
That is, as long as there are no double quotes, everything works as
expected. But when there are double quotes in the original argv[] then
more often than not they can't be passed exactly.
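One way to see what the receiving end actually gets is to ask the
documented parser directly. A minimal sketch (Windows-only; link with
Shell32; note that the CRT's own startup parser is specified separately
and can differ in corner cases):

#include <windows.h>
#include <shellapi.h>
#include <stdio.h>
#include <wchar.h>

int main(void)
{
    int argcw;
    LPWSTR *argvw = CommandLineToArgvW(GetCommandLineW(), &argcw);
    if (argvw == NULL)
        return 1;
    for (int i = 0; i < argcw; i++)
        wprintf(L"argv[%d] = <%ls>\n", i, argvw[i]);
    LocalFree(argvw);              /* the array is a single LocalAlloc block */
    return 0;
}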
On Thu, 1 Jan 2026 23:50:00 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
On Thu, 1 Jan 2026 21:53:20 +0200, Michael S wrote:
On Thu, 1 Jan 2026 19:02:49 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
All those are at the sending end. But what would C code see at the
receiving end?
The first three cases look very simple.
Is there some spec in Windows which describes how that works?
There is a spec that describes how that works in Microsoft's
implementation. That implementation is available free of charge to
other Windows compilers.
If a vendor of a Windows C compiler decided to implement a different
algorithm, nobody could stop him.
Through the years you were told so, by different people, and shown
the spec maybe 100 times. But being the troll you are, you continue
to ask.
Still, for the benefit of more sincere readers and also for myself, in
order to have both pieces in one place: https://learn.microsoft.com/en-us/windows/win32/api/shellapi/nf-shellapi-commandlinetoargvw
https://learn.microsoft.com/en-us/cpp/c-language/parsing-c-command-line-arguments
A more interesting and meaningful question is how to do the reverse,
i.e. how to convert an argv[] array into flat form in a way that
guarantees that CommandLineToArgvW() parses it back into its original form.
Is it even possible in the general case, or do limitations exist
(ignoring, for the sake of brevity, the 2^15-1 size limit)?
Microsoft certainly has the reverse conversion implemented, e.g. here: https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/spawnv-wspawnv
But I am not aware of the command-line serialization part being available
as a library call in isolation from the process-creation part.
I binged around and googled around, but all I was able to find was the
name of the function that performs the work: __acrt_pack_wide_command_line_and_environment
I was not able to find the source code of the function.
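Lacking such a library call, the serialization can be done by hand from
the documented parsing rules. A sketch of the usual quoting scheme (my
reading of the two pages linked above, not Microsoft's code; "child.exe"
is just a placeholder name):

#include <stdio.h>
#include <string.h>

/* Append one argument to 'out', quoted so that CommandLineToArgvW-style
   parsing recovers it exactly.  Sketch only: no bounds checking. */
static void append_quoted(char *out, const char *arg)
{
    if (*arg != '\0' && strpbrk(arg, " \t\"") == NULL) {
        strcat(out, arg);               /* nothing special: copy verbatim */
        return;
    }
    strcat(out, "\"");
    size_t bs = 0;                      /* backslashes not yet emitted */
    for (const char *p = arg; *p; p++) {
        if (*p == '\\') {
            bs++;
        } else if (*p == '"') {
            for (size_t i = 0; i < 2 * bs + 1; i++)   /* double them, then */
                strcat(out, "\\");                    /* escape the quote  */
            strcat(out, "\"");
            bs = 0;
        } else {
            for (size_t i = 0; i < bs; i++)
                strcat(out, "\\");
            bs = 0;
            strncat(out, p, 1);
        }
    }
    for (size_t i = 0; i < 2 * bs; i++) /* backslashes before the closing */
        strcat(out, "\\");              /* quote must be doubled          */
    strcat(out, "\"");
}

int main(void)
{
    char cmdline[1024] = "child.exe";
    const char *args[] = { "plain", "has space", "say \"hi\"", "ends with \\" };
    for (int i = 0; i < 4; i++) {
        strcat(cmdline, " ");
        append_quoted(cmdline, args[i]);
    }
    printf("%s\n", cmdline);
    return 0;
}

Whether this survives a round trip through every historical CRT is
exactly the open question above; it follows the rules as documented for
CommandLineToArgvW.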
[O.T.]
I am sure that 15, 10 or even 5 years ago Google would give me a link to
the source in a second. Or maybe 5 years ago Google already
wouldn't, but Bing still would.
But today both search engines are hopelessly crippled with AI and do not appear to actually search the web. Instead, they try to guess the
answer I likely want to hear.
[/O.T.]
The argc/argv problem seemed easy enough in practice if we only need
to handle the "real" arguments argv[n] with n>0. (Involving CMD.EXE introduced much worse complications, as you might imagine. But
generally I always thought that MS wasn't really interested in
/documenting/ how programmers should do things like this, in the
same way they never bothered explaining exactly how CMD processing
worked. Probably because it was forever changing!... Put another
way, for many years they were really more focussed on admins
clicking buttons in some GUI!)
After years -- decades -- of conditioning its users to be allergic to
the command line, now suddenly the rise of Linux has made command
lines cool again. Leaving Microsoft in an awkward position ...
On 02/01/2026 12:32, Michael S wrote:
On Thu, 1 Jan 2026 23:50:00 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
On Thu, 1 Jan 2026 21:53:20 +0200, Michael S wrote:
On Thu, 1 Jan 2026 19:02:49 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
All those are at the sending end. But what would C code see at
the receiving end?
The first three cases look very simple.
Is there some spec in Windows which describes how that works?
There is a spec that describes how that works in Microsoft's implementation. That implementation is available free of charge to
other Windows compilers.
If a vendor of a Windows C compiler decided to implement a different
algorithm, nobody could stop him.
Through the years you were told so, by different people, and shown
the spec maybe 100 times. But being the troll you are, you
continue to ask.
Still, for the benefit of more sincere readers and also for myself,
in order to have both pieces in one place: https://learn.microsoft.com/en-us/windows/win32/api/shellapi/nf-shellapi-commandlinetoargvw
https://learn.microsoft.com/en-us/cpp/c-language/parsing-c-command-line-arguments
In the long distant past I investigated how MSVC converts a
command-line to its argc/argv input. There was an internal routine in
the CRT startup code that did pretty much what we would expect, and I reverse engineered that for my code (or did I just copy the code?
surely the former!). The MSVC code did not call CommandLineToArgvW
in those days, but reading the description of that api it all sounds
very familiar - the state flags for controlling "quoted/unquoted"
text, even vs odd numbers of backslashes and all that.
I didn't find it difficult to create command-line strings to call C
programs, given what I wanted those programs to see as argv[n] with
n > 0. I think it was just a case of quoting all arguments, then
applying the quoting rules as documented for CommandLineToArgvW to
handle nested quotes/backslashes.
But I can see a sticky problem - the MSVC parsing for argv[0] was
completely separate from the main loop handling other arguments. The
logic was considerably simplified, assuming that argv[0] was the path
for the module being invoked. Since that is expected to be a valid
file system path, the logic did not handle nested quotes etc. I
think the logic was just:
- if 1st char is a DQUOTE, copy chars for argv[0] up to next DQUOTE
or null terminator. (enclosing DQUOTE chars are not included)
- else copy chars for argv[0] up to next whitespace or null
terminator. (all chars are included, I think including DQUOTE should
it occur)
Given this, it would not be possible to create certain argv[0]
strings containing quotes etc., and I understand that the likes of
execve() allow that possibility. So I don't know what should happen
for this case. E.g. I don't see that there is a command-line that gives
argv[0] the string "\" ". This was never a problem for me in
practice.
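A sketch of that simplified argv[0] scan, reconstructed from the
description above (not the actual CRT source):

#include <stdio.h>

/* Extract argv[0] from a raw command line: quoted -> up to the next quote,
   otherwise up to the next whitespace.  Sketch only. */
static void scan_argv0(const char *cmd, char *out, size_t outsize)
{
    size_t n = 0;
    if (*cmd == '"') {
        cmd++;                                    /* skip opening quote */
        while (*cmd && *cmd != '"' && n + 1 < outsize)
            out[n++] = *cmd++;
    } else {
        while (*cmd && *cmd != ' ' && *cmd != '\t' && n + 1 < outsize)
            out[n++] = *cmd++;
    }
    out[n] = '\0';
}

int main(void)
{
    char name[260];
    scan_argv0("\"C:\\Program Files\\tool.exe\" -x foo", name, sizeof name);
    printf("argv[0] = <%s>\n", name);             /* C:\Program Files\tool.exe */
    return 0;
}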
There would always be at least an argv[0] with this logic, so MSVC
ensures argc>0 and argv[0] != NULL. (Of course, MSVC is not
"Windows". Various posters in this thread seem to be asking "what
does /Windows/ do regarding argc/argv?" as though the OS is
responsible for setting them.)
A more interesting and meaningful question is how to do the reverse,
i.e. how to convert an argv[] array into flat form in a way that
guarantees that CommandLineToArgvW() parses it back into its original
form. Is it even possible in the general case, or do limitations exist (ignoring, for the sake of brevity, the 2^15-1 size limit)?
Yes, programmers need this if they need to create a process to invoke
some utility program which will see particular argv parameters.
Users are used to typing in command-lines as a string, e.g. at a
console, so I suppose they don't normally need to think about the
argv[] parsing; they can just build the required command-line and use
that. (But it's a problem in the general case.)
The argc/argv problem seemed easy enough in practice if we only need
to handle the "real" arguments argv[n] with n>0. (Involving CMD.EXE introduced much worse complications, as you might imagine. But
generally I always thought that MS wasn't really interested in
/documenting/ how programmers should do things like this, in the same
way they never bothered explaining exactly how CMD processing worked.
Probably because it was forever changing!... Put another way, for
many years they were really more focussed on admins clicking buttons
in some GUI!)
It's entirely possible that Windows goes beyond the ISO C
requirements and explicitly or implicitly guarantees argc>0.
It's also entirely possible that it doesn't. Do you have any
concrete information one way or the other?
I experimented a bit more (in fact, more like a lot more) with the
test batteries of L'Ecuyer. It led me to the conclusion that an occasional
failure in either the middle or the big battery means nothing.
Sometimes even a crypto-quality PRNG does not pass one or another test.
Then you try to reproduce it and see that with any other seed that you
try the failure does not happen.
All in all, it makes me more suspicious of PRNGs that consistently pass
both batteries with various seeds. I start to see it as a sign of
the PRNG being rigged to pass the tests.
The above does not apply to the scomp_LinearComp() failures of mt19937.
Those failures are very consistent. I just don't consider them
significant for my own use of PRNGs or for any other use of a PRNG that
I have personally encountered.
Overall, the experience strengthened my position that the general wisdom,
previously shared by O'Neill herself, got it right: in the absence of
special considerations people should select mt19937, and especially
mt19937-64, as their default PRNG of choice.
Looking closer, apart from its properties of randomness and apart from
its huge period (which does not matter for me), I started to appreciate
mt19937 for the following properties that I was not aware of before:
- it does not use a multiplier, so it is a good fit for embedded systems
that have no (or very slow) multiplier hardware.
- the algorithm is very SIMD-friendly, so optimized implementations can be
very fast on modern x86 and ARM64 hardware.
The latter property also means that a very fast FPGA implementation is
easily possible, as long as the designer is willing to throw a
moderate amount of logic resources at it.
The above does not mean that the PCG generators of O'Neill have no place.
Intuitively, they appear not bad. But the theory is unproven, an optimized
implementation is likely slower than optimized mt19937, and the claimed
"security" advantages are nonsense, as admitted later by O'Neill herself.
And, as said above, I no longer trust her empirical methodology, based
on the work of L'Ecuyer.
So, PCG generators are a valuable addition to the toolbox, but not good
enough to change my default.
Michael Sanders <porkchop@invalid.foo> wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
I like to just read /dev/urandom when I need a random
number. Seems easier and more portable across Linux &
the *BSDs.
int s;
read(fd, &s, sizeof(int));
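Fleshed out a little (POSIX-only), that fragment might look like the
sketch below; getentropy() or arc4random() would be alternatives on the
BSDs:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    unsigned seed;
    int fd = open("/dev/urandom", O_RDONLY);
    if (fd < 0 || read(fd, &seed, sizeof seed) != (ssize_t)sizeof seed) {
        perror("/dev/urandom");
        return 1;
    }
    close(fd);
    printf("seed: %u\n", seed);         /* e.g. feed this to srand() */
    return 0;
}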
Pay attention that the C Standard only requires that the same seed always
produce the same sequence. There is no requirement that different
seeds have to produce different sequences.
So, for the generator in your example, an implementation like the one below
would be fully legal. Personally, I wouldn't even consider it
particularly poor quality:
void srand(unsigned seed ) { init = seed | 1;}
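For reference, the portable sample that the C standard itself gives for
rand()/srand() is roughly the sketch below (renamed toy_* so it compiles
alongside <stdlib.h>); it simply copies the seed into the state, so 0 is
as good a seed as any:

#include <stdio.h>

static unsigned long toy_next = 1;      /* same initial state as srand(1) */

static int toy_rand(void)               /* RAND_MAX assumed to be 32767 */
{
    toy_next = toy_next * 1103515245 + 12345;
    return (unsigned)(toy_next / 65536) % 32768;
}

static void toy_srand(unsigned seed)
{
    toy_next = seed;
}

int main(void)
{
    toy_srand(0);
    for (int i = 0; i < 5; i++)
        printf("%d ", toy_rand());
    putchar('\n');
    return 0;
}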
There is a paper "PCG: A Family of Simple Fast Space-Efficient
Statistically Good Algorithms for Random Number Generation"
by M. O?Neill where she gives a family of algorithms and runs
several statistical tests against known algorithms. Mersenne
Twister does not look good in tests. If you have enough (128) bits
LCGs do pass tests. A bunch of generators with 64-bit state also
passes tests. So the only reason to prefer Mersenne Twister is
that it is implemented in available libraries. Otherwise it is
not so good, have large state and needs more execution time
than alternatives.
Michael S <already5chosen@yahoo.com> writes:
[regarding rand() and srand()]
Pay attention that the C Standard only requires that the same seed
always produce the same sequence. There is no requirement that
different seeds have to produce different sequences.
So, for the generator in your example, an implementation like the one
below would be fully legal. Personally, I wouldn't even consider it
particularly poor quality:
void srand(unsigned seed ) { init = seed | 1;}
It seems better to do, for example,
void srand(unsigned seed ) { init = seed - !seed;}
which leaves every nonzero seed unchanged and maps only 0 to UINT_MAX,
rather than collapsing every even/odd pair of seeds into one.
On Tue, 23 Dec 2025 17:54:05 -0000 (UTC)
antispam@fricas.org (Waldek Hebisch) wrote:
[...]
There is a paper "PCG: A Family of Simple Fast Space-Efficient
Statistically Good Algorithms for Random Number Generation"
by M. O?Neill where she gives a family of algorithms and runs
several statistical tests against known algorithms. Mersenne
Twister does not look good in tests. If you have enough (128) bits
LCGs do pass tests. A bunch of generators with 64-bit state also
passes tests. So the only reason to prefer Mersenne Twister is
that it is implemented in available libraries. Otherwise it is
not so good, have large state and needs more execution time
than alternatives.
I don't know. Testing randomness is a complicated matter.
How can I be sure that L'Ecuyer and Simard's TestU01 suite tests
things that I personally care about and that it does not test
things that are of no interest to me? Especially the latter.
Also, the TestU01 suite is made for generators with 32-bit output.
M. O'Neill used an ad hoc technique to make it applicable to
generators with 64-bit output. Is this technique right? Or maybe
it puts 64-bit PRNGs at an unfair disadvantage?
Besides, I strongly disagree with at least one assertion made by
O'Neill: "While security-related applications should use a secure
generator, because we cannot always know the future contexts in
which our code will be used, it seems wise for all applications to
avoid generators that make discovering their entire internal state
completely trivial."
No. I know exactly what I am doing; I know that for my
application easy discovery of the complete state of the PRNG is not a
defect.
Anyway, even if I am skeptical about her criticism of popular PRNGs, intuitively I agree with the constructive part of the article - a medium-quality PRNG that feeds a medium-quality hash function can
potentially produce a very good, fast PRNG with a rather small internal
state.
On a related note, I think that even a simple counter fed into a
high-quality hash function (not cryptographically high quality, far
less than that) can produce an excellent PRNG with an even smaller
internal state. But not a very fast one, although the speed
depends on the specifics of the computer used. I can imagine a computer
that has a low-latency Rijndael128 instruction. On such a computer,
running a counter through 3-4 rounds of Rijndael will produce a very
good PRNG that is only 2-3 times slower than, for example, an LCG
128/64.
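A sketch of that counter-through-a-hash idea, using the well-known
splitmix64 finalizer as the (non-cryptographic) mixer; the entire state
is the 64-bit counter:

#include <stdint.h>
#include <stdio.h>

static uint64_t ctr;                               /* whole state: one counter */

static void ctr_seed(uint64_t seed) { ctr = seed; }

static uint64_t ctr_next(void)
{
    uint64_t z = (ctr += 0x9E3779B97F4A7C15ULL);   /* Weyl-style increment */
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;   /* splitmix64 mixing    */
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

int main(void)
{
    ctr_seed(12345);
    for (int i = 0; i < 4; i++)
        printf("%016llx\n", (unsigned long long)ctr_next());
    return 0;
}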
John McCue <jmclnx@gmail.com.invalid> writes:
Michael Sanders <porkchop@invalid.foo> wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
I like to just read /dev/urandom when I need a random
number. Seem easier and more portable across Linux &
the *BSDs.
int s;
read(fd, &s, sizeof(int));
Apples and oranges. Many applications that use random numbers
need a stream of numbers that is deterministic and reproducible,
which /dev/urandom is not.
Michael S <already5chosen@yahoo.com> writes:
On Tue, 23 Dec 2025 17:54:05 -0000 (UTC)
antispam@fricas.org (Waldek Hebisch) wrote:
[...]
There is a paper "PCG: A Family of Simple Fast Space-Efficient
Statistically Good Algorithms for Random Number Generation"
by M. O?Neill where she gives a family of algorithms and runs
several statistical tests against known algorithms. Mersenne
Twister does not look good in tests. If you have enough (128) bits
LCGs do pass tests. A bunch of generators with 64-bit state also
passes tests. So the only reason to prefer Mersenne Twister is
that it is implemented in available libraries. Otherwise it is
not so good, have large state and needs more execution time
than alternatives.
I don't know. Testing randomness is a complicated matter.
How can I be sure that L'Ecuyer and Simard's TestU01 suite tests
things that I personally care about and that it does not test
things that are of no interest to me? Especially the latter.
Do you think any of the tests in the TestU01 suite are actually counter-indicated? As long as you don't think any TestU01 test
makes things worse, there is no reason not to use all of them.
You are always free to disregard tests you don't care about.
Also, the TestU01 suite is made for generators with 32-bit output.
M. O'Neill used an ad hoc technique to make it applicable to
generators with 64-bit output. Is this technique right? Or maybe
it puts 64-bit PRNGs at an unfair disadvantage?
As long as the same mapping is applied to all 64-bit PRNGs under consideration I don't see a problem. The point of the test is to
compare PRNGs, not to compare test methods. If someone thinks a
different set of tests is called for they are free to run them.
Besides, I strongly disagree with at least one assertion made by
O'Neill: "While security-related applications should use a secure generator, because we cannot always know the future contexts in
which our code will be used, it seems wise for all applications to
avoid generators that make discovering their entire internal state completely trivial."
No. I know exactly what I am doing; I know that for my
application easy discovery of the complete state of the PRNG is not a
defect.
You and she are talking about different things. You are talking
about choosing a PRNG to be used only by yourself. She is talking
about choosing a PRNG to be made available to other people without
knowing who they are or what their needs are. In the second case
it's reasonable to raise the bar for the set of criteria that need
to be met.
Anyway, even if I am skeptical about her criticism of popular PRNGs, intuitively I agree with the constructive part of the article - a medium-quality PRNG that feeds a medium-quality hash function can
potentially produce a very good, fast PRNG with a rather small internal
state.
After looking at one of the example PCG generators, I would
describe it as a medium-quality PRNG that feeds a low-quality
hash. The particular combination I looked at produced good
results, but it isn't clear which combinations of PRNG and
hash would do likewise.
On a related note, I think that even a simple counter fed into a
high-quality hash function (not cryptographically high quality, far
less than that) can produce an excellent PRNG with an even smaller
internal state. But not a very fast one, although the speed
depends on the specifics of the computer used. I can imagine a computer
that has a low-latency Rijndael128 instruction. On such a computer,
running a counter through 3-4 rounds of Rijndael will produce a very
good PRNG that is only 2-3 times slower than, for example, an LCG
128/64.
I think the point of her paper where she talks about determining
how much internal state is needed is to measure the efficacy of
the PRNG, not to try to reduce the amount of state needed. Based
on my own experience with various PRNGs I think it's a mistake to
try to minimize the amount of internal state needed. My own rule
of thumb is to allow at least a factor of four: for example, a
PRNG with a 32-bit output should have at least 128 bits of state.
My latest favorite has 256 bits of state to produce 32-bit
outputs (and so might also do well to produce 64-bit outputs, but
I haven't tested that).
[...]
Michael S <already5chosen@yahoo.com> wrote:
I experimented a bit more (in fact, more like a lot more) with the
test batteries of L'Ecuyer. It led me to the conclusion that an occasional failure in either the middle or the big battery means nothing.
Sometimes even a crypto-quality PRNG does not pass one or another
test. Then you try to reproduce it and see that with any other seed
that you try the failure does not happen.
All in all, it makes me more suspicious of PRNGs that consistently pass
both batteries with various seeds. I start to see it as a sign of
the PRNG being rigged to pass the tests.
Well, that depends on the tests and the thresholds in the tests.
Some tests, when fed with a truly random source, will produce
very small variation in the results. With a generous threshold
such a test will essentially never fail for a truly random source.
OTOH, when the expected variation of the results is larger and the
threshold is tight, then a truly random source will fail the
test from time to time.
And if you test long enough you should
be able to estimate the probability of failure and possibly compare
it with the theoretical result if available.
To tell the truth, I do not know what failures of a crypto-quality PRNG
mean. It may mean that the test is of the tight kind that is supposed
to fail from time to time for a truly random source. Or it may mean
that the PRNG improves the crypto part at the cost of statistics. That is,
a non-uniform distribution of the output is not a problem in
crypto applications; for crypto purposes the future output simply
needs to be unpredictable from current and past output alone.
And a slightly non-uniform distribution can increase the probability
of test failure enough that you can observe such failures.
BTW: You mentioned using a counter and a hardware AES128 round.
Given a cheap AES128 round I would try a 128-bit state and an AES128
round as the state update. I do not know if hardware AES128 is
fast enough to make it worthwhile, but using an AES128 round as a
state update should be much better than scrambling a counter.
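A sketch of that idea with the x86-64 AES-NI intrinsic (compile with
-maes); the "round keys" here are arbitrary constants rather than a real
key schedule, and this is an illustration, not a vetted generator:

#include <stdint.h>
#include <stdio.h>
#include <emmintrin.h>
#include <wmmintrin.h>                  /* _mm_aesenc_si128 */

static __m128i st;                      /* 128-bit state */

static void aes_seed(uint64_t seed)
{
    st = _mm_set_epi64x((long long)seed, 0x243F6A8885A308D3LL);
}

static uint64_t aes_next(void)
{
    const __m128i k1 = _mm_set_epi64x(0x13198A2E03707344LL, 0x082EFA98EC4E6C89LL);
    const __m128i k2 = _mm_set_epi64x(0x452821E638D01377LL, 0x0123456789ABCDEFLL);
    st = _mm_aesenc_si128(st, k1);      /* one AES round as the state update */
    st = _mm_aesenc_si128(st, k2);      /* a second round for extra mixing   */
    return (uint64_t)_mm_cvtsi128_si64(st);   /* low 64 bits as the output */
}

int main(void)
{
    aes_seed(1);
    for (int i = 0; i < 4; i++)
        printf("%016llx\n", (unsigned long long)aes_next());
    return 0;
}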
The above does not apply to the scomp_LinearComp() failures of mt19937.
Those failures are very consistent. I just don't consider them
significant for my own use of PRNGs or for any other use of a PRNG
that I have personally encountered.
Overall, the experience strengthened my position that the general wisdom, previously shared by O'Neill herself, got it right: in the absence of special considerations people should select mt19937, and especially mt19937-64, as their default PRNG of choice.
Looking closer, apart from its properties of randomness and apart
from its huge period
A huge period alone is easy. AFAICS matrix variants of LCGs can
easily get quite large periods. I did not test matrix LCGs,
but on statistical tests they should be essentially as good
as multiprecision LCGs, and should be cheaper to implement.
Just to be clear, I mean the equation x_{n+1} = A*x_n + b, where x_n
is a vector representing the n-th state, A is a matrix and b is a
vector. The matrix A may be somewhat sparse, that is, have a limited
number of non-zero entries, and some entries may be simple, like
1 or -1. With a proper choice of A and b the period should be
comparable with the number of available states.
I see no reason to go for very long periods; already 512 bits
of state allow periods which in practice should be indistinguishable
from an infinite period.
(which does not matter for me), I started to appreciate
mt19937 for the following properties that I was not aware of before:
- it does not use a multiplier, so it is a good fit for embedded systems
that have no (or very slow) multiplier hardware.
The biggest MCU with no hardware multiplier that I have has 2 kB of RAM.
I do not want mt19937 on such a machine. 64-bit multiplication
on an 8-bitter with a hardware multiplier may be slow, so a 64-bit LCG
(and improvements based on it) may be slow. But multiplication
nicely spreads out bits, so it is not clear to me whether there is an
equally good cheaper alternative.
If needed I would investigate
matrix LCGs; they may be slightly cheaper.
- the algorithm is very SIMD-friendly, so optimized implementations can
be very fast on modern x86 and ARM64 hardware.
Just the size of the state puts a limit on how fast it can be. And the
size of the state means that it will compete for cache with user data.
BTW: AFAICS a matrix LCG can be SIMD-friendly too.
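As a toy illustration of the x_{n+1} = A*x_n + b recurrence (a sparse A
over 64-bit words, arithmetic mod 2^64 per lane; the entries are
arbitrary and the period has not been analyzed):

#include <stdint.h>
#include <stdio.h>

#define N 4                             /* 4 x 64 = 256 bits of state */

static uint64_t x[N];                   /* state vector x_n */
static const uint64_t b[N] = { 0x9E3779B97F4A7C15ULL, 1, 1, 1 };

static void step(void)                  /* x_{n+1} = A x_n + b */
{
    uint64_t y[N];
    y[0] = 6364136223846793005ULL * x[0] + x[3];   /* the one "real" multiplier */
    y[1] = x[0] + x[1];                            /* entries equal to 1        */
    y[2] = x[1] - x[2];                            /* an entry equal to -1      */
    y[3] = x[2] + x[3];
    for (int i = 0; i < N; i++)
        x[i] = y[i] + b[i];
}

int main(void)
{
    x[0] = 1;                           /* trivial seeding for the demo */
    for (int i = 0; i < 4; i++) {
        step();
        printf("%016llx\n", (unsigned long long)x[0]);
    }
    return 0;
}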
The latter property also means that a very fast FPGA implementation is
easily possible, as long as the designer is willing to throw a
moderate amount of logic resources at it.
The above does not mean that the PCG generators of O'Neill have no
place. Intuitively, they appear not bad. But the theory is
unproven, an optimized implementation is likely slower than optimized
mt19937, and the claimed "security" advantages are nonsense, as admitted
later by O'Neill herself. And, as said above, I no longer trust her empirical methodology, based on the work of L'Ecuyer.
So, PCG generators are a valuable addition to the toolbox, but not
good enough to change my default.
I agree that ATM it is not entirely clear whether PCGs are as good as
suggested by the tests. But I am surprised by your opinion about
speed; I did not analyze either of them deeply, but PCGs are way
simpler, so I would expect them to be faster.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
John McCue <jmclnx@gmail.com.invalid> writes:
Michael Sanders <porkchop@invalid.foo> wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
I like to just read /dev/urandom when I need a random
number. Seem easier and more portable across Linux &
the *BSDs.
int s;
read(fd, &s, sizeof(int));
Apples and oranges. Many applications that use random numbers
need a stream of numbers that is deterministic and reproducible,
which /dev/urandom is not.
And neither is the non-conforming rand() on OpenBSD.
The rand(3) man page on OpenBSD 7.8 says:
Standards insist that this interface return deterministic
results. Unsafe usage is very common, so OpenBSD changed the
subsystem to return non-deterministic results by default.
To satisfy portable code, srand() may be called to initialize
the subsystem. In OpenBSD the seed variable is ignored,
and strong random number results will be provided from
arc4random(3). In other systems, the seed variable primes a
simplistic deterministic algorithm.
It does provide an srand_deterministic() function that behaves the way srand() is supposed to.
Michael S <already5chosen@yahoo.com> wrote:
[...]
Anyway, even if I am skeptical about her criticism of popular PRNGs,
intuitively I agree with the constructive part of the article -
a medium-quality PRNG that feeds a medium-quality hash function can
potentially produce a very good, fast PRNG with a rather small internal
state.
She seems to care very much about having the minimal possible state.
That may be nice on embedded systems, but in general I would
happily accept a slightly bigger state (say 256 bits). But if
we can get good properties with a very small state, then why not?
[...]
Michael S <already5chosen@yahoo.com> wrote:
I experimented a bit more (in fact, more like a lot more) with the
test batteries of L'Ecuyer. It led me to the conclusion that an occasional
failure in either the middle or the big battery means nothing.
Sometimes even a crypto-quality PRNG does not pass one or another test.
Then you try to reproduce it and see that with any other seed that you
try the failure does not happen.
All in all, it makes me more suspicious of PRNGs that consistently pass
both batteries with various seeds. I start to see it as a sign of
the PRNG being rigged to pass the tests.
Well, that depends on the tests and the thresholds in the tests.
Some tests, when fed with a truly random source, will produce
very small variation in the results. With a generous threshold
such a test will essentially never fail for a truly random source.
OTOH, when the expected variation of the results is larger and the
threshold is tight, then a truly random source will fail the
test from time to time. And if you test long enough you should
be able to estimate the probability of failure and possibly compare
it with the theoretical result if available.
On Wed, 07 Jan 2026 13:54:21 -0800, Keith Thompson wrote:
[...]
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Apples and oranges. Many applications that use random numbers
need a stream of numbers that is deterministic and reproducible,
which /dev/urandom is not.
And neither is the non-conforming rand() on OpenBSD.
The rand(1) man page on OpenBSD 7.8 says:
Standards insist that this interface return deterministic
results. Unsafe usage is very common, so OpenBSD changed the
subsystem to return non-deterministic results by default.
To satisfy portable code, srand() may be called to initialize
the subsystem. In OpenBSD the seed variable is ignored,
and strong random number results will be provided from
arc4random(3). In other systems, the seed variable primes a
simplistic deterministic algorithm.
It does provide an srand_deterministic() function that behaves the way
srand() is supposed to.
So then clang would use:
#ifdef __OpenBSD__
srand_deterministic(seed);
#else
srand(seed);
#endif
But I don't know (yet) that gcc does as well under OpenBSD.
Michael Sanders <porkchop@invalid.foo> writes:
On Wed, 07 Jan 2026 13:54:21 -0800, Keith Thompson wrote:
[...]
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Apples and oranges. Many applications that use random numbers
need a stream of numbers that is deterministic and reproducible,
which /dev/urandom is not.
And neither is the non-conforming rand() on OpenBSD.
The rand(1) man page on OpenBSD 7.8 says:
Standards insist that this interface return deterministic
results. Unsafe usage is very common, so OpenBSD changed the
subsystem to return non-deterministic results by default.
To satisfy portable code, srand() may be called to initialize
the subsystem. In OpenBSD the seed variable is ignored,
and strong random number results will be provided from
arc4random(3). In other systems, the seed variable primes a
simplistic deterministic algorithm.
It does provide an srand_deterministic() function that behaves the way
srand() is supposed to.
So then clang would use:
#ifdef __OpenBSD__
srand_deterministic(seed);
#else
srand(seed);
#endif
But I don't know (yet) that gcc does as well under OpenBSD.
I don't know what you mean when you say that clang "would use"
that code.
I'm not aware that either clang or gcc uses random numbers
internally. I don't know why they would.
You could certainly write the above code and compile it with either
gcc or clang (or any other C compiler on OpenBSD). I've confirmed
that gcc on OpenBSD does predefine the symbol __OpenBSD__. There
should be no relevant difference between gcc and clang; random
number generation is implemented in the library, not in the compiler.
If your point is that a programmer using either gcc or clang could
use the above code to get the required deterministic behavior
for rand(), I agree. (Though it shouldn't be necessary; IMHO the
OpenBSD folks made a very bad decision.)
Relatedly, the NetBSD implementation of rand() is conforming, but
of very low quality. The low-order bit alternates between 0 and
1 on successive rand() calls, the two low-order bits repeat with
a cycle of 4, and so on. Larry Jones wrote about it here in 2010:
The even/odd problem was caused at Berkeley by a well meaning
but clueless individual who increased the range of the generator
(which originally matched the sample implementation) by returning
the *entire* internal state rather than just the high-order
bits of it. BSD was very popular, so that defective generator
got around a lot, unfortunately.
And I've just discovered that the OpenBSD rand() returns alternating
odd and even results after a call to srand_deterministic().
It's disturbing that this has never been fixed.
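A quick way to see the effect (on an affected rand() the printed bits
strictly alternate):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    srand(1);                           /* srand_deterministic(1) on OpenBSD */
    for (int i = 0; i < 32; i++)
        putchar('0' + (rand() & 1));    /* low-order bit of each call */
    putchar('\n');
    return 0;
}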
On Thu, 08 Jan 2026 14:44:27 -0800, Keith Thompson wrote:
[...]
Michael Sanders <porkchop@invalid.foo> writes:
So then clang would use:
#ifdef __OpenBSD__
srand_deterministic(seed);
#else
srand(seed);
#endif
But I don't know (yet) that gcc does as well under OpenBSD.
I don't know what you mean when you say that clang "would use"
that code.
I'm not aware that either clang or gcc uses random numbers
internally. I don't know why they would.
Well, I meant the macro itself is (I'm guessing) probably defined
by clang since it's the default compiler.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
John McCue <jmclnx@gmail.com.invalid> writes:
Michael Sanders <porkchop@invalid.foo> wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
I like to just read /dev/urandom when I need a random
number. Seem easier and more portable across Linux &
the *BSDs.
int s;
read(fd, &s, sizeof(int));
Apples and oranges. Many applications that use random numbers
need a stream of numbers that is deterministic and reproducible,
which /dev/urandom is not.
And neither is the non-conforming rand() on OpenBSD.
The rand(1) man page on OpenBSD 7.8 says:
Standards insist that this interface return deterministic
results. Unsafe usage is very common, so OpenBSD changed the
subsystem to return non-deterministic results by default.
To satisfy portable code, srand() may be called to initialize
the subsystem. In OpenBSD the seed variable is ignored,
and strong random number results will be provided from
arc4random(3). In other systems, the seed variable primes a
simplistic deterministic algorithm.
But your original statement implied that clang would *use* that
particular piece of code, which didn't make much sense. Were you
just asking about how the __OpenBSD__ macro is defined, without
reference to srand?
On Thu, 08 Jan 2026 22:46:42 -0800, Keith Thompson wrote:
But your original statement implied that clang would *use* that
particular piece of code, which didn't make much sense. Were you
just asking about how the __OpenBSD__ macro is defined, without
reference to srand?
Well, under OpenBSD I plan on using:
#ifdef __OpenBSD__
srand_deterministic(seed);
#else
srand(seed);
#endif
But what I was asking is whether or not gcc would recognize
the __OpenBSD__ macro (why wouldn't it, I'm assuming), since clang
is the default compiler.
But also about srand()... you've got me really wondering why
OpenBSD would deviate from the standard as they have. I get
that those folks disagree because it's deterministic, but
it's the accepted standard for srand() to be deterministic.
Speaking only for myself here: rather than srand_deterministic()
and an srand() that's not deterministic under OpenBSD, it
would've made more sense to have implemented srand_non_deterministic()
and left srand() alone. That design decision on their part only
muddies the waters in my thinking. Live & learn =)