Forum: Too Lazy BBS

Re: Command Languages Versus Programming Languages

From Muttley@Muttley@DastartdlyHQ.org to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Wed Nov 20 12:27:54 2024

From Newsgroup: comp.unix.programmer

On Wed, 20 Nov 2024 05:46:49 -0600
Ed Morton <mortonspam@gmail.com> boring babbled:

On 11/20/2024 2:21 AM, Muttley@DastartdlyHQ.org wrote:

On Tue, 19 Nov 2024 18:43:48 -0800
merlyn@stonehenge.com (Randal L. Schwartz) boring babbled:

"Lawrence" == Lawrence D'Oliveiro <ldo@nz.invalid> writes:

Lawrence> Perl was the language that made regular expressions
Lawrence> sexy. Because it made them easy to use.

I'm often reminded of this as I've been coding very little in Perl these
days, and a lot more in languages like Dart, where the regex feels like
a clumsy bolt-on rather than a proper first-class citizen.

Regex itself is clumsy beyond simple search and replace patterns. A lot of
stuff I've seen done in regex would have been better done procedurally at the
expense of slightly more code but a LOT more readability.

Definitely. The most relevant statement about regexps is this:

Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.

Very true!

Obviously regexps are very useful and commonplace but if you find you
have to use some online site or other tools to help you write/understand
one or just generally need more than a couple of minutes to
write/understand it then it's time to back off and figure out a better
way to write your code for the sake of whoever has to read it 6 months
later (and usually for robustness too as it's hard to be sure all rainy
day cases are handled correctly in a lengthy and/or complicated regexp).

Edge cases are regex's Achilles' heel, e.g. an expression that only accounted
for 1 -> N chars, not 0 -> N, or matches in the middle but not at the ends.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Wed Nov 20 16:38:24 2024

From Newsgroup: comp.unix.programmer

On 20.11.2024 12:30, Muttley@DastartdlyHQ.org wrote:

On Wed, 20 Nov 2024 11:51:11 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:

On 20.11.2024 09:21, Muttley@DastartdlyHQ.org wrote:

On Tue, 19 Nov 2024 18:43:48 -0800
merlyn@stonehenge.com (Randal L. Schwartz) boring babbled:

I'm often reminded of this as I've been coding very little in Perl these
days, and a lot more in languages like Dart, where the regex feels like
a clumsy bolt-on rather than a proper first-class citizen.

Regex itself is clumsy beyond simple search and replace patterns. A lot of
stuff I've seen done in regex would have been better done procedurally at the
expense of slightly more code but a LOT more readability. Also given it's
effectively a compact language with its own grammar and syntax IMO it should
not be the core part of any language as it can lead to a syntactic mess,
which is what often happens with Perl.

I wouldn't look at it that way. I've seen Regexps as part of languages
usually in well defined syntactical contexts. For example, like strings
are enclosed in "...", Regexps could be seen within /.../ delimiters.
GNU Awk (in recent versions) went towards first class "strongly typed"
Regexps which are then denoted by the @/.../ syntax.

I'm curious what you mean by Regexps presented in a "procedural" form.
Can you give some examples?

Anything that can be done in regex can obviously also be done procedurally.
At the point regex expressions become unwieldy - usually when substitution
variables raise their heads - I prefer procedural code as it's also often
easier to debug.

You haven't even tried to honestly answer my (serious) question.
With your statement above and your hostility below, it rather seems
you have no clue of what I am talking about.

In practice, given that a Regexp conforms to a FSA, any Regexp can be
precompiled and used multiple times. The thing I had used in Java - it

Precompiled regex is no more efficient than precompiled anything; it's all just assembler at the bottom.

The Regexps are a way to specify the words of a regular language;
for pattern matching the expression gets interpreted or compiled; you
specify it, e.g., using strings of characters and meta-characters.
If you have a programming language where that string gets repeatedly interpreted then it's slower than a precompiled Regexp expression.

I give you examples...

(1) DES encryption function

(1a) ciphertext = des_encode (key, plaintext)

(1b) cipher = des (key)
ciphertext = cipher.encode (plaintext)

In case (1) you can either call the DES encryption (decryption) for
any (key, plaintext)-pair in a procedural function as in (1a), or
you can create the key-specific encryption once and encode various
texts with the same cipher object as in (1b).

(2) regexp matching

(2a) location = regexp (pattern, string)

(2b) fsm = regexp (pattern)
location = fsm.match (string)

In case (2) you can either do the match in a string with a pattern
in a procedural form as in (2a) or you can create the FSM for the
given Regexp just once and apply it on various strings as in (2b).

That's what I was talking about.

Only if key (in (1)) or pattern (in (2)) are static or "constant"
that compilation could (but only theoretically) be done in advance
and optimizing system may (or may not) precompile it (both) to
[similar] assembler code. How should that work with regexps or DES?
The optimizing system would need knowledge how to use the library
code (DES, Regexps, ...) to create binary structures based on the
algorithms (key-initialization in DES, FSM-generation in Regexps).
This is [statically] not done.

Otherwise - i.e. the normal, expected case - there's an efficiency
difference to observe between the respective cases of (a) and (b).
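
The difference between forms (2a) and (2b) can be sketched in C with the POSIX <regex.h> interface. This is only an illustration under stated assumptions: the function names match_once and count_matches are made up for this sketch, and POSIX extended syntax is assumed for the pattern.

```c
#include <regex.h>
#include <stddef.h>

/* (2a) One-shot form: the pattern is compiled again on every call.
 * Returns 1 on a match, 0 on no match, -1 if the pattern is invalid. */
int match_once(const char *pattern, const char *string)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    int found = (regexec(&re, string, 0, NULL, 0) == 0);
    regfree(&re);
    return found;
}

/* (2b) Two-step form: compile the pattern once, then apply the compiled
 * matcher to many strings.
 * Returns the number of matching strings, -1 if the pattern is invalid. */
int count_matches(const char *pattern, const char *const *strings, size_t n)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    int count = 0;
    for (size_t i = 0; i < n; i++)
        if (regexec(&re, strings[i], 0, NULL, 0) == 0)
            count++;
    regfree(&re);
    return count;
}
```

Calling match_once in a loop repeats the compilation step for every string, while count_matches pays for it once; how much that matters depends on the regex library and the workload.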

then operate on that same object. (Since there's still typical Regexp
syntax involved I suppose that is not what you meant by "procedural"?)

If you don't know the difference between declarative syntax like regex and procedural syntax then there's not much point continuing this discussion.

Why do you think so, and why are you saying that? - That wasn't and
still isn't the point. - You said upthread

"A lot of stuff I've seen done in regex would have been better done
procedurally at the expense of slightly more code but a LOT more
readability."

and I asked

"I'm curious what you mean by Regexps presented in a "procedural"
form.
Can you give some examples?"

What you wanted to say wasn't clear to me, since you were complaining
about the _Regexp syntax_. So it couldn't be meant to just write
regexp (pattern, string) instead of pattern ~ string
but to somehow(!) transform "pattern", say, like /[0-9]+(ABC)?x*foo/,
to something syntactically "better".
I was interested in that "somehow" (that I emphasized), and in an
example how that would look like in your opinion.
If you're unable to answer that simple question then just take that
simple regexp /[0-9]+(ABC)?x*foo/ example and show us your preferred
procedural variant.

But my expectation is that you cannot provide any reasonable example
anyway.

Personally I think that writing bulky procedural stuff for something
like [0-9]+ can only be much worse, and that further abbreviations
like \d+ are the better direction to go if targeting a good interface.
YMMV.

Janis


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Wed Nov 20 16:53:38 2024

From Newsgroup: comp.unix.programmer

On 20.11.2024 12:46, Ed Morton wrote:

Definitely. The most relevant statement about regexps is this:

Some people, when confronted with a problem, think "I know, I'll use
regular expressions." Now they have two problems.

(Worth a scribbling on a WC wall.)

Obviously regexps are very useful and commonplace but if you find you
have to use some online site or other tools to help you write/understand
one or just generally need more than a couple of minutes to
write/understand it then it's time to back off and figure out a better
way to write your code for the sake of whoever has to read it 6 months
later (and usually for robustness too as it's hard to be sure all rainy
day cases are handled correctly in a lengthy and/or complicated regexp).

Regexps are nothing for newbies.

The inherent fine thing with Regexps is that you can incrementally
compose them[*].[**]

It seems you haven't found a sensible way to work with them?
(And I'm really astonished about that since I know you worked with
Regexps for years if not decades.)

In those cases where Regexps *are* the tool for a specific task -
I don't expect you to use them where they are inappropriate?! -
what would be the better solution[***] then?

Janis

[*] Like the corresponding FSMs.

[**] And you can also decompose them if they are merged in a huge
expression, too large for you to grasp it. (BTW, I'm doing such
decompositions also with other expressions in program code that
are too bulky.)

[***] Can you answer the question that another poster failed to do?


From Muttley@Muttley@DastartdlyHQ.org to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Wed Nov 20 16:38:15 2024

From Newsgroup: comp.unix.programmer

On Wed, 20 Nov 2024 16:38:24 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:

On 20.11.2024 12:30, Muttley@DastartdlyHQ.org wrote:

Anything that can be done in regex can obviously also be done procedurally.
At the point regex expressions become unwieldy - usually when substitution
variables raise their heads - I prefer procedural code as it's also often
easier to debug.

You haven't even tried to honestly answer my (serious) question.

You mean you can't figure out how to do something like string search and replace
procedurally? I'm not going to show you, ask a kid who knows Python or Basic.

With your statement above and your hostility below, it rather seems

If you think my reply was hostile then I suggest you go find a safe space
and cuddle your teddy bear snowflake.


From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Wed Nov 20 17:50:13 2024

From Newsgroup: comp.unix.programmer

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

[...]

Personally I think that writing bulky procedural stuff for something
like [0-9]+ can only be much worse, and that further abbreviations
like \d+ are the better direction to go if targeting a good interface.
YMMV.

Assuming that p is a pointer to the current position in a string, e is a pointer to the end of it (ie, point just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is

while (p < e && *p - '0' < 10) ++p;

That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).

From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Wed Nov 20 17:54:22 2024

From Newsgroup: comp.unix.programmer

Muttley@DastartdlyHQ.org writes:

[...]

With your statement above and your hostility below, it rather seems

If you think my reply was hostile then I suggest you go find a safe space
and cuddle your teddy bear snowflake.

There's surely no reason why anyone could ever think you were inclined
to substitute verbal aggression for arguments.

From John Ames@commodorejohn@gmail.com to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Wed Nov 20 10:03:47 2024

From Newsgroup: comp.unix.programmer

On Wed, 20 Nov 2024 17:54:22 +0000
Rainer Weikusat <rweikusat@talktalk.net> wrote:

There's surely no reason why anyone could ever think you were inclined
to substitute verbal aggression for arguments.

I mean, it's his whole thing - why would he stop now?


From Lawrence D'Oliveiro@ldo@nz.invalid to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Wed Nov 20 21:43:41 2024

From Newsgroup: comp.unix.programmer

On Wed, 20 Nov 2024 12:27:54 -0000 (UTC), Muttley wrote:

Edge cases are regex's Achilles' heel, e.g. an expression that only accounted
for 1 -> N chars, not 0 -> N, or matches in the middle but not at the
ends.

That’s what “^” and “$” are for.

From Muttley@Muttley@DastartdlyHQ.org to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Thu Nov 21 08:13:39 2024

From Newsgroup: comp.unix.programmer

On Wed, 20 Nov 2024 17:54:22 +0000
Rainer Weikusat <rweikusat@talktalk.net> boring babbled:

Muttley@DastartdlyHQ.org writes:

[...]

With your statement above and your hostility below, it rather seems

If you think my reply was hostile then I suggest you go find a safe space
and cuddle your teddy bear snowflake.

There's surely no reason why anyone could ever think you were inclined
to substitute verbal aggression for arguments.

I have zero time for anyone who claims hurt feelings or being slighted as
soon as they're losing an argument.


From Muttley@Muttley@DastartdlyHQ.org to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Thu Nov 21 08:15:41 2024

From Newsgroup: comp.unix.programmer

On Wed, 20 Nov 2024 21:43:41 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> boring babbled:

On Wed, 20 Nov 2024 12:27:54 -0000 (UTC), Muttley wrote:

Edge cases are regex's Achilles' heel, e.g. an expression that only accounted
for 1 -> N chars, not 0 -> N, or matches in the middle but not at the
ends.

That’s what “^” and “$” are for.

Yes, but people forget about those (literal) edge cases.


From Muttley@Muttley@DastartdlyHQ.org to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Thu Nov 21 08:18:06 2024

From Newsgroup: comp.unix.programmer

On Wed, 20 Nov 2024 10:03:47 -0800
John Ames <commodorejohn@gmail.com> boring babbled:

On Wed, 20 Nov 2024 17:54:22 +0000
Rainer Weikusat <rweikusat@talktalk.net> wrote:

There's surely no reason why anyone could ever think you were inclined
to substitute verbal aggression for arguments.

I mean, it's his whole thing - why would he stop now?

What's it like being so wet? Do you get cold easily?


From merlyn@merlyn@stonehenge.com (Randal L. Schwartz) to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Thu Nov 21 05:38:45 2024

From Newsgroup: comp.unix.programmer

"Rainer" == Rainer Weikusat <rweikusat@talktalk.net> writes:

Rainer> ¹ I used to use a JSON parser written in OO-Perl which made
Rainer> extensive use of regexes for that. I've recently replaced that
Rainer> with a C/XS version which - while slightly larger (617 vs 410
Rainer> lines of text) - is over a hundred times faster and conceptually
Rainer> simpler at the same time.

I wonder if that was my famous "JSON parser in a single regex" from https://www.perlmonks.org/?node_id=995856, or from one of the two CPAN
modules that incorporated it.
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Dart/Flutter consulting, Technical writing, Comedy, etc. etc.
Still trying to think of something clever for the fourth line of this .sig

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Thu Nov 21 14:13:37 2024

From Newsgroup: comp.unix.programmer

In article <20241120100347.00005f10@gmail.com>,
John Ames <commodorejohn@gmail.com> wrote:

On Wed, 20 Nov 2024 17:54:22 +0000
Rainer Weikusat <rweikusat@talktalk.net> wrote:

There's surely no reason why anyone could ever think you were inclined
to substitute verbal aggression for arguments.

I mean, it's his whole thing - why would he stop now?

This is the guy who didn't know what a compiler is, right?

- Dan C.


From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Thu Nov 21 14:40:18 2024

From Newsgroup: comp.unix.programmer

In article <875xohbxre.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

[...]

Personally I think that writing bulky procedural stuff for something
like [0-9]+ can only be much worse, and that further abbreviations
like \d+ are the better direction to go if targeting a good interface.
YMMV.

Assuming that p is a pointer to the current position in a string, e is a
pointer to the end of it (ie, point just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is

while (p < e && *p - '0' < 10) ++p;

That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).

It's also not exactly right. `[0-9]+` would match one or more
characters; this possibly matches 0 (ie, if `p` pointed to
something that wasn't a digit).

- Dan C.


From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Thu Nov 21 15:07:42 2024

From Newsgroup: comp.unix.programmer

cross@spitfire.i.gajendra.net (Dan Cross) writes:

Rainer Weikusat <rweikusat@talktalk.net> wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

[...]

Personally I think that writing bulky procedural stuff for something
like [0-9]+ can only be much worse, and that further abbreviations
like \d+ are the better direction to go if targeting a good interface.
YMMV.

Assuming that p is a pointer to the current position in a string, e is a
pointer to the end of it (ie, point just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is

while (p < e && *p - '0' < 10) ++p;

That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).

It's also not exactly right. `[0-9]+` would match one or more
characters; this possibly matches 0 (ie, if `p` pointed to
something that wasn't a digit).

The regex won't match any digits if there aren't any. In this case, the
match will fail. I didn't include the code for handling that because it
seemed pretty pointless for the example.
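
For completeness, the omitted failure handling might look like the sketch below. It is an assumption about how the loop could be wrapped, not the poster's actual code; the name match_digits and the NULL-on-failure convention are invented for the illustration, and an ASCII-compatible character set is assumed.

```c
#include <stddef.h>

/* Procedural stand-in for the regex [0-9]+ tried at position p.
 * Returns a pointer just past the last digit consumed, or NULL when
 * there is no digit at p (the case in which the regex match fails). */
const unsigned char *match_digits(const unsigned char *p,
                                  const unsigned char *e)
{
    const unsigned char *start = p;

    /* Range test done in unsigned arithmetic: anything outside '0'..'9'
     * yields a value of 10 or more. */
    while (p < e && (unsigned)*p - '0' < 10)
        ++p;

    return p > start ? p : NULL;
}
```

A caller would treat NULL as "no match here" and a non-NULL result as the position from which to continue scanning.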

From mas@mas@a4.home to comp.unix.programmer on Thu Nov 21 15:46:42 2024

From Newsgroup: comp.unix.programmer

On 2024-11-20, Rainer Weikusat <rweikusat@talktalk.net> wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

Assuming that p is a pointer to the current position in a string, e is a pointer to the end of it (ie, point just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is

while (p < e && *p - '0' < 10) ++p;

That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).

int
main(int argc, char **argv) {
    unsigned char *p, *e;
    unsigned char mystr[] = "12#45XY ";

    p = mystr;
    e = mystr + sizeof(mystr);
    while (p < e && *p - '0' < 10) ++p;

    size_t xlen = p-mystr;
    printf("digits: '%.*s'\n", (int) xlen, mystr);
    printf("mystr %p p %p e %p\n", mystr, p, e);
    printf("xlen %zd\n", xlen);

    return 0;
}

./a.out
digits: '12#45'
mystr 0x7ffc92ac55ff p 0x7ffc92ac5604 e 0x7ffc92ac5608
xlen 5


From Muttley@Muttley@DastartdlyHQ.org to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Thu Nov 21 16:06:01 2024

From Newsgroup: comp.unix.programmer

On Thu, 21 Nov 2024 14:13:37 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:

In article <20241120100347.00005f10@gmail.com>,
John Ames <commodorejohn@gmail.com> wrote:

On Wed, 20 Nov 2024 17:54:22 +0000
Rainer Weikusat <rweikusat@talktalk.net> wrote:

There's surely no reason why anyone could ever think you were inclined
to substitute verbal aggression for arguments.

I mean, it's his whole thing - why would he stop now?

This is the guy who didn't know what a compiler is, right?

Wrong. Want another go?


From John Ames@commodorejohn@gmail.com to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Thu Nov 21 07:56:48 2024

From Newsgroup: comp.unix.programmer

On Thu, 21 Nov 2024 08:18:06 -0000 (UTC)
Muttley@DastartdlyHQ.org wrote:

What's it like being so wet? Do you get cold easily?

No, I have a soft gray hoodie with a nice fleecy lining that I quite
like. It's very warm for not being too heavy.

Also: *huh?*


From John Ames@commodorejohn@gmail.com to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Thu Nov 21 07:58:06 2024

From Newsgroup: comp.unix.programmer

On Thu, 21 Nov 2024 08:13:39 -0000 (UTC)
Muttley@DastartdlyHQ.org wrote:

I have zero time

I approve, it's a wonderful album!


From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.programmer on Thu Nov 21 16:08:16 2024

From Newsgroup: comp.unix.programmer

mas@a4.home writes:

On 2024-11-20, Rainer Weikusat <rweikusat@talktalk.net> wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

Assuming that p is a pointer to the current position in a string, e is a
pointer to the end of it (ie, point just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is

while (p < e && *p - '0' < 10) ++p;

That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).

int
main(int argc, char **argv) {
    unsigned char *p, *e;
    unsigned char mystr[] = "12#45XY ";

    p = mystr;
    e = mystr + sizeof(mystr);
    while (p < e && *p - '0' < 10) ++p;

    size_t xlen = p-mystr;
    printf("digits: '%.*s'\n", (int) xlen, mystr);
    printf("mystr %p p %p e %p\n", mystr, p, e);
    printf("xlen %zd\n", xlen);

    return 0;
}

./a.out
digits: '12#45'
mystr 0x7ffc92ac55ff p 0x7ffc92ac5604 e 0x7ffc92ac5608
xlen 5

Indeed. Rainer's while loop should be using isdigit(*p).

From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Thu Nov 21 17:01:48 2024

From Newsgroup: comp.unix.programmer

merlyn@stonehenge.com (Randal L. Schwartz) writes:

"Rainer" == Rainer Weikusat <rweikusat@talktalk.net> writes:

Rainer> ¹ I used to use a JSON parser written in OO-Perl which made
Rainer> extensive use of regexes for that. I've recently replaced that
Rainer> with a C/XS version which - while slightly larger (617 vs 410
Rainer> lines of text) - is over a hundred times faster and conceptually
Rainer> simpler at the same time.

I wonder if that was my famous "JSON parser in a single regex" from https://www.perlmonks.org/?node_id=995856, or from one of the two CPAN modules that incorporated it.

No. One of my use-cases is an interactive shell running in a web browser
using ActionCable messages to relay data between the browser and the
shell process on the computer supposed to be accessed in this way. For
this, I absolutely do need \u escapes. I also need this to be fast. Eg,
one of the nice properties of JSON is that the type of a value can be determined by looking at the first character of it. This cries for an implementation based on an array of pointers to 'value parsing routines'
of size 256 and determining the parser routine to use by using the first character as index into this table (which will either yield a pointer to
the correct parser routine or NULL for a syntax error).
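
The table-driven dispatch described above could be sketched roughly as follows. Everything here is hypothetical: the enum, the stub routines and parse_value are stand-ins invented for the illustration (a real parser would build values rather than return a kind), and the caller is assumed to have skipped leading whitespace already.

```c
#include <stddef.h>

/* Hypothetical value kinds returned by the stub routines below. */
enum json_kind { JSON_STRING, JSON_NUMBER, JSON_OBJECT, JSON_ARRAY,
                 JSON_TRUE, JSON_FALSE, JSON_NULL };

typedef enum json_kind (*value_parser)(const char *text);

/* Stubs: each stands in for a routine that would parse one kind of value. */
static enum json_kind parse_string(const char *t) { (void)t; return JSON_STRING; }
static enum json_kind parse_number(const char *t) { (void)t; return JSON_NUMBER; }
static enum json_kind parse_object(const char *t) { (void)t; return JSON_OBJECT; }
static enum json_kind parse_array(const char *t)  { (void)t; return JSON_ARRAY; }
static enum json_kind parse_true(const char *t)   { (void)t; return JSON_TRUE; }
static enum json_kind parse_false(const char *t)  { (void)t; return JSON_FALSE; }
static enum json_kind parse_null(const char *t)   { (void)t; return JSON_NULL; }

/* One slot per possible first byte; slots not listed stay NULL. */
static const value_parser parsers[256] = {
    ['"'] = parse_string,
    ['{'] = parse_object,
    ['['] = parse_array,
    ['t'] = parse_true,
    ['f'] = parse_false,
    ['n'] = parse_null,
    ['-'] = parse_number,
    ['0'] = parse_number, ['1'] = parse_number, ['2'] = parse_number,
    ['3'] = parse_number, ['4'] = parse_number, ['5'] = parse_number,
    ['6'] = parse_number, ['7'] = parse_number, ['8'] = parse_number,
    ['9'] = parse_number,
};

/* Picks the routine by the first byte of text.
 * Returns 0 and stores the kind in *out, or -1 for a syntax error. */
int parse_value(const char *text, enum json_kind *out)
{
    value_parser fn = parsers[(unsigned char)text[0]];
    if (fn == NULL)
        return -1;
    *out = fn(text);
    return 0;
}
```

The lookup costs one array index and one indirect call regardless of how many value types exist, which is the appeal of this layout over a chain of comparisons.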

From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.programmer on Thu Nov 21 17:31:16 2024

From Newsgroup: comp.unix.programmer

scott@slp53.sl.home (Scott Lurndal) writes:

mas@a4.home writes:

On 2024-11-20, Rainer Weikusat <rweikusat@talktalk.net> wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

Assuming that p is a pointer to the current position in a string, e is a
pointer to the end of it (ie, point just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is

while (p < e && *p - '0' < 10) ++p;

That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).

int
main(int argc, char **argv) {
    unsigned char *p, *e;
    unsigned char mystr[] = "12#45XY ";

    p = mystr;
    e = mystr + sizeof(mystr);
    while (p < e && *p - '0' < 10) ++p;

    size_t xlen = p-mystr;
    printf("digits: '%.*s'\n", (int) xlen, mystr);
    printf("mystr %p p %p e %p\n", mystr, p, e);
    printf("xlen %zd\n", xlen);

    return 0;
}

./a.out
digits: '12#45'
mystr 0x7ffc92ac55ff p 0x7ffc92ac5604 e 0x7ffc92ac5608
xlen 5

Indeed. Rainer's while loop should be using isdigit(*p).

I'm even using

c &= ~0x20;
if (c - 'A' < 6) return c - 'A' + 10;

for detecting hex digits (type of c is unsigned). JSON demands UTF8
which implies demanding ASCII (which means that the code point of a
lowercase letter is that of the corresponding uppercase letter but with
the sixth bit additionally set).

Constructs like this are a great insurance policy against people inventing spurious character encodings.
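
Wrapped into a complete function, the hex-digit test might look like this sketch. The decimal-digit branch and the -1 return value are assumptions added to make the fragment self-contained (the decimal test has to come first, because clearing the 0x20 bit would alter '0'..'9'); the name hex_value is invented, and ASCII is assumed as in the post.

```c
/* Returns the value 0..15 of an ASCII hex digit, or -1 for any other byte.
 * All range tests rely on unsigned wrap-around. */
int hex_value(unsigned c)
{
    if (c - '0' < 10)
        return (int)(c - '0');

    c &= ~0x20u;              /* fold 'a'..'f' onto 'A'..'F' */
    if (c - 'A' < 6)
        return (int)(c - 'A' + 10);

    return -1;
}
```

For example, hex_value('f') and hex_value('F') both give 15, while hex_value('g') and hex_value('@') give -1.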

From Nicolas George@nicolas$george@salle-s.org to comp.unix.programmer on Thu Nov 21 17:53:02 2024

From Newsgroup: comp.unix.programmer

Scott Lurndal, in message <QVI%O.179460$Oi5e.162642@fx15.iad>, wrote:

Indeed. Rainer's while loop should be using isdigit(*p).

No it shouldn't, that would be a mistake.

From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.programmer on Thu Nov 21 17:19:51 2024

From Newsgroup: comp.unix.programmer

mas@a4.home writes:

On 2024-11-20, Rainer Weikusat <rweikusat@talktalk.net> wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

Assuming that p is a pointer to the current position in a string, e is a
pointer to the end of it (ie, point just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is

while (p < e && *p - '0' < 10) ++p;

That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).

int
main(int argc, char **argv) {
    unsigned char *p, *e;
    unsigned char mystr[] = "12#45XY ";

    p = mystr;
    e = mystr + sizeof(mystr);
    while (p < e && *p - '0' < 10) ++p;

The code I'm actually using is

while (p < e && (unsigned)*p - '0' < 10)
++p;

I just omitted that when posting this because I mistakenly assumed that
it probably wasn't needed, ;-). You could have pointed this out instead
of trying to dress it up as some sort of mystery problem¹. Especially as
I did mention that using unsigned arithmetic was necessary (should
really be self-evident).
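
Why the cast matters can be shown with two small helpers. They are illustrative only (the names are made up), and they assume ASCII and an int wider than char, which holds on common platforms.

```c
/* Without a cast, the unsigned char operand is promoted to (signed) int
 * before the subtraction, so '#' (35) - '0' (48) gives -13, and -13 < 10
 * is true: the byte is wrongly accepted. */
int looks_like_digit_promoted(unsigned char c)
{
    return c - '0' < 10;
}

/* Converting to unsigned first makes the subtraction wrap around to a
 * very large value for anything below '0', so only '0'..'9' pass. */
int looks_like_digit_unsigned(unsigned char c)
{
    return (unsigned)c - '0' < 10;
}
```

With these, looks_like_digit_promoted('#') is 1 while looks_like_digit_unsigned('#') is 0, which matches the '12#45' output shown in the test program.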

I should really have replied with

[rw@doppelsaurus]/tmp#gcc a.c
a.c: In function 'main':
a.c:12:5: error: unknown type name 'size_t'
12 | size_t xlen = p-mystr;
| ^~~~~~
a.c:1:1: note: 'size_t' is defined in header '<stddef.h>'; did you forget to '#include <stddef.h>'?
+++ |+#include <stddef.h>
1 | int
a.c:13:5: warning: implicit declaration of function 'printf' [-Wimplicit-function-declaration]
13 | printf("digits: '%.*s'\n", (int) xlen, mystr);
| ^~~~~~
a.c:13:5: warning: incompatible implicit declaration of built-in function 'printf'
a.c:1:1: note: include '<stdio.h>' or provide a declaration of 'printf'
+++ |+#include <stdio.h>
1 | int

as code which cannot even be compiled has no runtime behaviour.


From Kaz Kylheku@643-408-1753@kylheku.com to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Thu Nov 21 19:12:03 2024

From Newsgroup: comp.unix.programmer

On 2024-11-20, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 20.11.2024 09:21, Muttley@DastartdlyHQ.org wrote:

Regex itself is clumsy beyond simple search and replace patterns. A lot of
stuff I've seen done in regex would have been better done procedurally at the
expense of slightly more code but a LOT more readability. Also given it's
effectively a compact language with its own grammar and syntax IMO it should
not be the core part of any language as it can lead to a syntactic mess, which
is what often happens with Perl.

I wouldn't look at it that way. I've seen Regexps as part of languages usually in well defined syntactical contexts. For example, like strings
are enclosed in "...", Regexps could be seen within /.../ delimiters.
GNU Awk (in recent versions) went towards first class "strongly typed" Regexps which are then denoted by the @/.../ syntax.

These features solve the problem of regexes being stored as character
strings not being recognized by the language compiler and then having
to be compiled at run-time.

They don't solve all the ergonomics of regexes that Muttley is talking
about.

I'm curious what you mean by Regexps presented in a "procedural" form.
Can you give some examples?

Here is an example: using a regex match to capture a C comment /* ... */
in Lex compared to just recognizing the start sequence /* and handling
the discarding of the comment in the action.

Without non-greedy repetition matching, the regex for a C comment is
quite obtuse. The procedural handling is straightforward: read
characters until you see a * immediately followed by a /.
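
The procedural handling can be sketched in a few lines of C. This is an illustration, not code from any particular lexer: the name skip_comment_body is invented, and the function assumes the opening delimiter has already been consumed by the caller (as a Lex action would after recognizing the start sequence).

```c
#include <stddef.h>

/* p points just past the opening slash-star; e points one past the end
 * of the input. Returns a pointer just past the closing star-slash, or
 * NULL if the comment is never closed. */
const char *skip_comment_body(const char *p, const char *e)
{
    while (e - p >= 2) {
        if (p[0] == '*' && p[1] == '/')
            return p + 2;
        ++p;
    }
    return NULL;
}
```

Runs of stars before the closing slash need no special treatment here, which is the part that tends to make the regex version hard to read.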

In the wild, you see regexes being used for all sorts of stupid stuff,
like checking whether numeric input is in a certain range, rather than converting it to a number and doing an arithmetic check.
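
As a sketch of the arithmetic alternative, a range check can convert the text with the standard strtol and compare numbers. The function name in_range and its 1/0 return convention are assumptions made for this example; note that strtol itself accepts leading whitespace and a sign.

```c
#include <errno.h>
#include <stdlib.h>

/* Returns 1 if text is a complete decimal integer within [lo, hi],
 * otherwise 0 (not a number, trailing junk, overflow, or out of range). */
int in_range(const char *text, long lo, long hi)
{
    char *end;

    errno = 0;
    long v = strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE)
        return 0;

    return v >= lo && v <= hi;
}
```

Changing the permitted range then means changing two numbers, rather than rewriting a pattern that enumerates digit combinations.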
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca

From Lawrence D'Oliveiro@ldo@nz.invalid to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Thu Nov 21 22:05:13 2024

From Newsgroup: comp.unix.programmer

On Thu, 21 Nov 2024 08:15:41 -0000 (UTC), Muttley wrote:

On Wed, 20 Nov 2024 21:43:41 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> boring babbled:

On Wed, 20 Nov 2024 12:27:54 -0000 (UTC), Muttley wrote:

Edge cases are regex's Achilles' heel, e.g. an expression that only
accounted for 1 -> N chars, not 0 -> N, or matches in the middle but
not at the ends.

That’s what “^” and “$” are for.

Yes, but people forget about those (literal) edge cases.

Those of us who are accustomed to using regexes do not.

Another handy one is “\b” for word boundaries.
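
The edge cases discussed in this sub-thread (anchoring, and 0 -> N versus 1 -> N repetition) can be tried out with a small helper around the POSIX <regex.h> calls. The helper name ere_match is made up for this sketch, and only the anchors are shown because "\b" is not part of the POSIX extended syntax assumed here.

```c
#include <regex.h>
#include <stddef.h>

/* Returns 1 if pattern (POSIX extended syntax) matches somewhere in text,
 * 0 if it does not, and -1 if the pattern fails to compile. */
int ere_match(const char *pattern, const char *text)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    int ok = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}
```

Under these assumptions, ere_match("[0-9]+", "abc123def") succeeds because the digits may sit in the middle, while ere_match("^[0-9]+$", "abc123def") fails; likewise "^[0-9]*$" accepts the empty string and "^[0-9]+$" does not.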

From Muttley@Muttley@DastartdlyHQ.org to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 10:09:48 2024

From Newsgroup: comp.unix.programmer

On Thu, 21 Nov 2024 19:12:03 -0000 (UTC)
Kaz Kylheku <643-408-1753@kylheku.com> boring babbled:

On 2024-11-20, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

I'm curious what you mean by Regexps presented in a "procedural" form.
Can you give some examples?

Here is an example: using a regex match to capture a C comment /* ... */
in Lex compared to just recognizing the start sequence /* and handling
the discarding of the comment in the action.

Without non-greedy repetition matching, the regex for a C comment is
quite obtuse. The procedural handling is straightforward: read
characters until you see a * immediately followed by a /.

It's not that simple, I'm afraid, since comments can be commented out.

eg:

// int i; /*
int j;
/*
int k;
*/
++j;

A C99 or C++ compiler would see "int j" and compile it; a regex would
simply remove everything from the first /* to */.

Also the same probably applies to #ifdef's.
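
A rough sketch of a procedural stripper that gets this example right by tracking whether it is inside a // comment or a block comment. The name strip_comments is made up, and the sketch deliberately ignores string literals, character literals, line continuations and the preprocessor, so it is an illustration rather than a complete tool.

```c
/* Copy src to dst with // and block comments removed. dst must have room
 * for strlen(src) + 1 bytes. String and character literals are NOT
 * handled in this sketch. */
void strip_comments(const char *src, char *dst)
{
    enum { CODE, LINE, BLOCK } state = CODE;

    while (*src != '\0') {
        switch (state) {
        case CODE:
            if (src[0] == '/' && src[1] == '/') {
                state = LINE;
                src += 2;
            } else if (src[0] == '/' && src[1] == '*') {
                state = BLOCK;
                src += 2;
            } else {
                *dst++ = *src++;
            }
            break;
        case LINE:
            if (*src == '\n')
                state = CODE; /* the newline itself is copied next turn */
            else
                src++;
            break;
        case BLOCK:
            if (src[0] == '*' && src[1] == '/') {
                state = CODE;
                src += 2;
            } else {
                src++;
            }
            break;
        }
    }
    *dst = '\0';
}
```

Fed the six lines above, this keeps "int j;" and "++j;" and drops "int k;", because the slash-star on the first line is swallowed by the // comment.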


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 12:14:32 2024

From Newsgroup: comp.unix.programmer

On 20.11.2024 18:50, Rainer Weikusat wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

Personally I think that writing bulky procedural stuff for something
like [0-9]+ can only be much worse, and that further abbreviations
like \d+ are the better direction to go if targeting a good interface.
YMMV.

Assuming that p is a pointer to the current position in a string, e is a pointer to the end of it (ie, point just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is

while (p < e && *p - '0' < 10) ++p;

That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).

Okay, I see where you're coming from (and especially in that simple
case).

Personally (and YMMV), even here in this simple case I think that
using pointers is not better but worse - and anyway isn't [in this
form] available in most languages; in other cases (and languages)
such constructs get yet more clumsy, and for my not very complex
example - /[0-9]+(ABC)?x*foo/ - even a "catastrophe" concerning
readability, error-proneness, and maintainability.

If that is what the other poster meant I'm fine with your answer;
there's no need to even consider abandoning regular expressions
in favor of explicitly codified parsing.

Janis

PS: And thanks for answering on behalf of the other poster whom I
see in his followups just continuing his very personal style.


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 12:17:56 2024

From Newsgroup: comp.unix.programmer

On 21.11.2024 20:12, Kaz Kylheku wrote:

[...]

In the wild, you see regexes being used for all sorts of stupid stuff,

No one can prevent folks using features for stupid things. Yes.

like checking whether numeric input is in a certain range, rather than converting it to a number and doing an arithmetic check.

Janis


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 12:47:16 2024

From Newsgroup: comp.unix.programmer

On 21.11.2024 23:05, Lawrence D'Oliveiro wrote:

On Thu, 21 Nov 2024 08:15:41 -0000 (UTC), Muttley wrote:

On Wed, 20 Nov 2024 21:43:41 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> boring babbled:
[...]

That's what "^" and "$" are for.

Yes, but people forget about those (literal) edge cases.

But *only* _literally_ "edge cases". Rather they're simple
and basics of regexp parsers since their beginning.

Those of us who are accustomed to using regexes do not.

It's one of the first things that regexp newbies learn,
I'd say.

Another handy one is "\b" for word boundaries.

I prefer \< and \> (that are quite commonly used) for such
structural things, also \( and \) for allowing references
to matched parts. And I prefer the \alpha regexp pattern
extension forms for things like \d \D \w \W \s \S . (But
that's not only a matter of taste but also a question of
what any regexp parser actually supports.)

Janis


From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 11:56:26 2024

From Newsgroup: comp.unix.programmer

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 20.11.2024 18:50, Rainer Weikusat wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

Personally I think that writing bulky procedural stuff for something
like [0-9]+ can only be much worse, and that further abbreviations
like \d+ are the better direction to go if targeting a good interface.
YMMV.

Assuming that p is a pointer to the current position in a string, e is a
pointer to the end of it (ie, point just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is

while (p < e && *p - '0' < 10) ++p;

That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).

Okay, I see where you're coming from (and especially in that simple
case).

Personally (and YMMV), even here in this simple case I think that
using pointers is not better but worse - and anyway isn't [in this
form] available in most languages;

That's a question of using the proper tool for the job. In C, that's
pointer and pointer arithmetic because it's the simplest way to express something like this.

in other cases (and languages)
such constructs get yet more clumsy, and for my not very complex
example - /[0-9]+(ABC)?x*foo/ - even a "catastrophe" concerning
readability, error-proneness, and maintainability.

Procedural code for matching strings constructed in this way is
certainly much simpler[1] than the equally procedural code for a
programmable automaton capable of interpreting regexes. Your statement
is basically "If we assume that the code interpreting regexes doesn't
exist, regexes need much less code than something equivalent which does
exist." Without this assumption, the picture becomes a different one
altogether.

[1] This doesn't even need a real state machine, just four subroutines
executed in succession (and two of these can share an implementation as
"matching ABC" and "matching foo" are both cases of matching a constant
string).
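
A sketch of what those four steps might look like for /[0-9]+(ABC)?x*foo/, matched at the start of a NUL-terminated string. The helper names (m_digits, m_lit, m_xs, match_pattern) are invented here, and the code relies on the pieces of this particular pattern not overlapping, so no backtracking is needed; that would not hold for arbitrary regexes.

```c
#include <stddef.h>
#include <string.h>

/* [0-9]+ : advance over digits; NULL if there is not at least one. */
static const char *m_digits(const char *p)
{
    const char *s = p;

    while ((unsigned)*p - '0' < 10)
        p++;
    return p > s ? p : NULL;
}

/* Constant string: advance over lit if p starts with it, else NULL. */
static const char *m_lit(const char *p, const char *lit)
{
    size_t n = strlen(lit);

    return strncmp(p, lit, n) == 0 ? p + n : NULL;
}

/* x* : advance over a possibly empty run of 'x'; never fails. */
static const char *m_xs(const char *p)
{
    while (*p == 'x')
        p++;
    return p;
}

/* Match [0-9]+(ABC)?x*foo at the start of s.
 * Returns the end of the match, or NULL if there is none. */
const char *match_pattern(const char *s)
{
    const char *p = m_digits(s), *q;

    if (p == NULL)
        return NULL;
    if ((q = m_lit(p, "ABC")) != NULL) /* (ABC)? is optional */
        p = q;
    p = m_xs(p);
    return m_lit(p, "foo");
}
```

Whether this reads better than the one-line regex is exactly the point under dispute in this thread.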

If that is what the other poster meant I'm fine with your answer;
there's no need to even consider abandoning regular expressions
in favor of explicitly codified parsing.

This depends on the specific problem and the constraints applicable to a solution. For the common case, regexes, if easily available, are an
obvious good solution. But not all cases are common.

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 13:30:34 2024

From Newsgroup: comp.unix.programmer

In article <874j40sk01.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

Rainer Weikusat <rweikusat@talktalk.net> wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

[...]

Personally I think that writing bulky procedural stuff for something
like [0-9]+ can only be much worse, and that further abbreviations
like \d+ are the better direction to go if targeting a good interface.
YMMV.

Assuming that p is a pointer to the current position in a string, e is a
pointer to the end of it (ie, point just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is

while (p < e && *p - '0' < 10) ++p;

That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).

It's also not exactly right. `[0-9]+` would match one or more
characters; this possibly matches 0 (ie, if `p` pointed to
something that wasn't a digit).

The regex won't match any digits if there aren't any. In this case, the
match will fail. I didn't include the code for handling that because it
seemed pretty pointless for the example.

That's rather the point though, isn't it? The program snippet
(modulo the promotion to signed int via the "usual arithmetic
conversions" before the subtraction and comparison giving you
unexpected values; nothing to do with whether `char` is signed
or not) is a snippet that advances a pointer while it points to
a digit, starting at the current pointer position; that is, it
just increments a pointer over a run of digits.

But that's not the same as a regex matcher, which has a semantic
notion of success or failure. I could run your snippet against
a string such as, say, "ZZZZZZ" and it would "succeed" just as
it would against an empty string or a string of one or more
digits. And then there are other matters of context; does the
user intend for the regexp to match the _whole_ string? Or any
portion of the string (a la `grep`)? So, for example, does the
string "aaa1234aaa" match `[0-9]+`? As written, the above
snippet is actually closer to advancing `p` over `^[0-9]*`. One
might differentiate between `*` and `+` after the fact, by
examining `p` against some (presumably saved) source value, but
that's more code.
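
For illustration, the "more code" could be as small as the following sketch. The function names skip_digits and match_digits are invented, and the (unsigned) cast is the one discussed elsewhere in this thread.

```c
#include <stddef.h>

/* Roughly ^[0-9]* : advance over digits; may return p unchanged. */
const char *skip_digits(const char *p, const char *e)
{
    while (p < e && (unsigned)*p - '0' < 10)
        p++;
    return p;
}

/* Roughly ^[0-9]+ : as above, but fail (NULL) if nothing was consumed. */
const char *match_digits(const char *p, const char *e)
{
    const char *q = skip_digits(p, e);

    return q > p ? q : NULL;
}
```

Run against "ZZZZZZ", skip_digits hands back the starting position while match_digits reports failure, which is the difference between `*` and `+`.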

These are just not equivalent. That's not to say that your
snippet is not _useful_ in context, but to pretend that it's the
same as the regular expression is pointlessly reductive.

By the way, something that _would_ match `^[0-9]+$` might be:

term% cat mdp.c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static bool
mdigit(unsigned int c)
{
    return c - '0' < 10;
}

bool
mdp(const char *str, const char *estr)
{
    if (str == NULL || estr == NULL || str == estr)
        return false;
    if (!mdigit(*str))
        return false;
    while (str < estr && mdigit(*str))
        str++;
    return str == estr;
}

bool
probe(const char *s, bool expected)
{
    if (mdp(s, s + strlen(s)) != expected) {
        fprintf(stderr, "test failure: `%s` (expected %s)\n",
            s, expected ? "true" : "false");
        return false;
    }
    return true;
}

int
main(void)
{
    bool success = true;

    success = probe("1234", true) && success;
    success = probe("", false) && success;
    success = probe("ab", false) && success;
    success = probe("0", true) && success;
    success = probe("0123456789", true) && success;
    success = probe("a0123456", false) && success;
    success = probe("0123456b", false) && success;
    success = probe("0123c456", false) && success;
    success = probe("0123#456", false) && success;

    return success ? EXIT_SUCCESS : EXIT_FAILURE;
}
term% cc -Wall -Wextra -Werror -pedantic -std=c11 mdp.c -o mdp
term% ./mdp
term% echo $?
0
term%

Granted the test scaffolding and `#include` boilerplate makes
this appear rather longer than it would be in context, but it's
still not nearly as succinct.

- Dan C.


From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Fri Nov 22 14:14:00 2024

From Newsgroup: comp.unix.programmer

In article <87o728qzbc.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

mas@a4.home writes:

On 2024-11-20, Rainer Weikusat <rweikusat@talktalk.net> wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

Assuming that p is a pointer to the current position in a string, e is a
pointer to the end of it (ie, point just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is

while (p < e && *p - '0' < 10) ++p;

That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).

int
main(int argc, char **argv) {
    unsigned char *p, *e;
    unsigned char mystr[] = "12#45XY ";

    p = mystr;
    e = mystr + sizeof(mystr);
    while (p < e && *p - '0' < 10) ++p;

The code I'm actually using is

while (p < e && (unsigned)*p - '0' < 10)
    ++p;

I just omitted that when posting this because I mistakenly assumed that
it probably wasn't needed, ;-). You could have pointed this out instead
of trying to dress it up as some sort of mystery problem. Especially as
I did mention that using unsigned arithmetic was necessary (should
really be self-evident).

Well, no, not exactly. You said that it was important that the
pointers point to unsigned quantities, but that wasn't the
issue. The issue is that both operands of the `-` are promoted
to _signed_ int before the subtraction (sec 6.3.1.8 of n220),
and so the comparison is done against signed quantities. Your
cast fixes this by forcing promotion of '0' and 10 to unsigned
int before either operation.
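
A small sketch of the difference, assuming an ASCII character set and an int wider than char. The two function names are made up for the comparison.

```c
#include <stdbool.h>

/* c is promoted to int, so for c < '0' the difference is negative
 * and still compares as less than 10. */
bool is_digit_signed(unsigned char c)
{
    return c - '0' < 10;
}

/* With the cast the subtraction is done in unsigned int, so values
 * below '0' wrap around to very large numbers. */
bool is_digit_unsigned(unsigned char c)
{
    return (unsigned)c - '0' < 10;
}
```

For '#' (35 in ASCII) the first version computes -13 < 10 and wrongly says "digit"; the second wraps around and says "not a digit". Both agree on '0' through '9' and on ':'.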

I do agree that the presentation was overly terse and came off
poorly.

- Dan C.


From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.programmer on Fri Nov 22 15:27:43 2024

From Newsgroup: comp.unix.programmer

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <87o728qzbc.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

mas@a4.home writes:

On 2024-11-20, Rainer Weikusat <rweikusat@talktalk.net> wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

Assuming that p is a pointer to the current position in a string, e is a
pointer to the end of it (ie, point just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is

while (p < e && *p - '0' < 10) ++p;

That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).

int
main(int argc, char **argv) {
    unsigned char *p, *e;
    unsigned char mystr[] = "12#45XY ";

    p = mystr;
    e = mystr + sizeof(mystr);
    while (p < e && *p - '0' < 10) ++p;

The code I'm actually using is

while (p < e && (unsigned)*p - '0' < 10)
    ++p;

I just omitted that when posting this because I mistakenly assumed that
it probably wasn't needed, ;-). You could have pointed this out instead
of trying to dress it up as some sort of mystery problem. Especially as
I did mention that using unsigned arithmetic was necessary (should
really be self-evident).

Well, no, not exactly. You said that it was important that the
pointers point to unsigned quantities, but that wasn't the
issue.

The issue here is that I mistakenly assumed the (unsigned) in the code
was a left-over from before the time when I had changed to pointers to
unsigned char to fix a different issue. As C tries really hard to force
signed arithmetic onto people even though this basically never makes any
sense, the type of '0' is int, *p gets promoted to int and hence, the
result of the subtraction will also be an int and the '< 10' condition
will be true for every codepoint numerically less than or equal to '9',
which obviously won't work as intended.

That's a mistake I made which would have warranted being pointed out and
possibly explained instead of posting some broken code together with
some output the broken code certainly never generated due to it being
broken.

From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 15:41:09 2024

From Newsgroup: comp.unix.programmer

cross@spitfire.i.gajendra.net (Dan Cross) writes:

Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

Rainer Weikusat <rweikusat@talktalk.net> wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

[...]

Personally I think that writing bulky procedural stuff for something
like [0-9]+ can only be much worse, and that further abbreviations
like \d+ are the better direction to go if targeting a good interface.
YMMV.

Assuming that p is a pointer to the current position in a string, e is a
pointer to the end of it (ie, point just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is

while (p < e && *p - '0' < 10) ++p;

That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).

It's also not exactly right. `[0-9]+` would match one or more
characters; this possibly matches 0 (ie, if `p` pointed to
something that wasn't a digit).

The regex won't match any digits if there aren't any. In this case, the
match will fail. I didn't include the code for handling that because it
seemed pretty pointless for the example.

That's rather the point though, isn't it? The program snippet
(modulo the promotion to signed int via the "usual arithmetic
conversions" before the subtraction and comparison giving you
unexpected values; nothing to do with whether `char` is signed
or not) is a snippet that advances a pointer while it points to
a digit, starting at the current pointer position; that is, it
just increments a pointer over a run of digits.

That's the core part of matching something equivalent to the regex [0-9]+
and the only part of it which is at least remotely interesting.

But that's not the same as a regex matcher, which has a semantic
notion of success or failure. I could run your snippet against
a string such as, say, "ZZZZZZ" and it would "succeed" just as
it would against an empty string or a string of one or more
digits.

Why do you believe that p being equivalent to the starting position
would be considered a "successful match", considering that this
obviously doesn't make any sense?

[...]

By the way, something that _would_ match `^[0-9]+$` might be:

[too much code]

Something which would match [0-9]+ in its first argument (if any) would
be:

#include "string.h"
#include "stdlib.h"

int main(int argc, char **argv)
{
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;
    if (!c) exit(1);
    return 0;
}

but that's 14 lines of text, 13 of which have absolutely no relation to
the problem of recognizing a digit.

From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 15:52:41 2024

From Newsgroup: comp.unix.programmer

Rainer Weikusat <rweikusat@talktalk.net> writes:

[...]

Something which would match [0-9]+ in its first argument (if any) would
be:

#include "string.h"
#include "stdlib.h"

int main(int argc, char **argv)
{
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;

This needs to be

while (c = *p, c && c - '0' > 9) ++p

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 17:17:46 2024

From Newsgroup: comp.unix.programmer

In article <877c8vtgx6.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[snip]
It's also not exactly right. `[0-9]+` would match one or more
characters; this possibly matches 0 (ie, if `p` pointed to
something that wasn't a digit).

The regex won't match any digits if there aren't any. In this case, the
match will fail. I didn't include the code for handling that because it
seemed pretty pointless for the example.

That's rather the point though, isn't it? The program snippet
(modulo the promotion to signed int via the "usual arithmetic
conversions" before the subtraction and comparison giving you
unexpected values; nothing to do with whether `char` is signed
or not) is a snippet that advances a pointer while it points to
a digit, starting at the current pointer position; that is, it
just increments a pointer over a run of digits.

That's the core part of matching something equivalent to the regex [0-9]+
and the only part of it which is at least remotely interesting.

Not really, no. The interesting thing in this case appears to
be knowing whether or not the match succeeded, but you omitted
that part.

But that's not the same as a regex matcher, which has a semantic
notion of success or failure. I could run your snippet against
a string such as, say, "ZZZZZZ" and it would "succeed" just as
it would against an empty string or a string of one or more
digits.

Why do you believe that p being equivalent to the starting position
would be considered a "successful match", considering that this
obviously doesn't make any sense?

Because absent any surrounding context, there's no indication
that the source is even saved. You'll note that I did mention
that as a means to differentiate later on, but that's not the
snippet you posted.

[...]

By the way, something that _would_ match `^[0-9]+$` might be:

[too much code]

Something which would match [0-9]+ in its first argument (if any) would
be:

#include "string.h"
#include "stdlib.h"

int main(int argc, char **argv)
{
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;
    if (!c) exit(1);
    return 0;
}

but that's 14 lines of text, 13 of which have absolutely no relation to
the problem of recognizing a digit.

This is wrong in many ways. Did you actually test that program?

First of all, why `"string.h"` and not `<string.h>`? Ok, that's
not technically an error, but it's certainly unconventional, and
raises questions that are ultimately a distraction.

Second, suppose that `argc==0` (yes, this can happen under
POSIX).

Third, the loop: why `> 10`? Don't you mean `< 10`? You are
trying to match digits, not non-digits.

Fourth, you exit with failure (`exit(1)`) if `!p` *and* if `!c`
at the end, but `!c` there means you've reached the end of the
string; which should be success.

Fifth and finally, you `return 0;` which is EXIT_SUCCESS, in the
failure case.

Compare:

#include <regex.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
    regex_t reprog;
    int ret;

    if (argc != 2) {
        fprintf(stderr, "Usage: regexp pattern\n");
        return(EXIT_FAILURE);
    }
    (void)regcomp(&reprog, "^[0-9]+$", REG_EXTENDED | REG_NOSUB);
    ret = regexec(&reprog, argv[1], 0, NULL, 0);
    regfree(&reprog);

    return ret == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}

This is only marginally longer, but is correct.

- Dan C.


From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 17:18:26 2024

From Newsgroup: comp.unix.programmer

In article <87zflrs1ti.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

Rainer Weikusat <rweikusat@talktalk.net> writes:

[...]

Something which would match [0-9]+ in its first argument (if any) would
be:

#include "string.h"
#include "stdlib.h"

int main(int argc, char **argv)
{
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;

This needs to be

while (c = *p, c && c - '0' > 9) ++p

No, that's still wrong. Try actually running it.

- Dan C.


From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 17:35:29 2024

From Newsgroup: comp.unix.programmer

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <87zflrs1ti.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

Rainer Weikusat <rweikusat@talktalk.net> writes:

[...]

Something which would match [0-9]+ in its first argument (if any) would
be:

#include "string.h"
#include "stdlib.h"

int main(int argc, char **argv)
{
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;

This needs to be

while (c = *p, c && c - '0' > 9) ++p

No, that's still wrong. Try actually running it.

If you know something that's wrong with that, why not write it instead
of utilizing the claim for pointless (and wrong) snide remarks?

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 17:43:24 2024

From Newsgroup: comp.unix.programmer

In article <87v7wfrx26.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <87zflrs1ti.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

Rainer Weikusat <rweikusat@talktalk.net> writes:

[...]

Something which would match [0-9]+ in its first argument (if any) would
be:

#include "string.h"
#include "stdlib.h"

int main(int argc, char **argv)
{
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;

This needs to be

while (c = *p, c && c - '0' > 9) ++p

No, that's still wrong. Try actually running it.

If you know something that's wrong with that, why not write it instead
of utilizing the claim for pointless (and wrong) snide remarks?

I did, at length, in my other post.

- Dan C.


From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 17:43:59 2024

From Newsgroup: comp.unix.programmer

In article <vhqfrs$bit$1@reader2.panix.com>,
Dan Cross <cross@spitfire.i.gajendra.net> wrote:

In article <87v7wfrx26.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <87zflrs1ti.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

Rainer Weikusat <rweikusat@talktalk.net> writes:

[...]

Something which would match [0-9]+ in its first argument (if any) would
be:

#include "string.h"
#include "stdlib.h"

int main(int argc, char **argv)
{
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;

This needs to be

while (c = *p, c && c - '0' > 9) ++p

No, that's still wrong. Try actually running it.

If you know something that's wrong with that, why not write it instead
of utilizing the claim for pointless (and wrong) snide remarks?

I did, at length, in my other post.

Cf. <vhqebq$c71$1@reader2.panix.com>

- Dan C.


From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 17:48:37 2024

From Newsgroup: comp.unix.programmer

cross@spitfire.i.gajendra.net (Dan Cross) writes:

Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[snip]
It's also not exactly right. `[0-9]+` would match one or more
characters; this possibly matches 0 (ie, if `p` pointed to
something that wasn't a digit).

The regex won't match any digits if there aren't any. In this case, the
match will fail. I didn't include the code for handling that because it
seemed pretty pointless for the example.

That's rather the point though, isn't it? The program snippet
(modulo the promotion to signed int via the "usual arithmetic
conversions" before the subtraction and comparison giving you
unexpected values; nothing to do with whether `char` is signed
or not) is a snippet that advances a pointer while it points to
a digit, starting at the current pointer position; that is, it
just increments a pointer over a run of digits.

That's the core part of matching something equivalent to the regex [0-9]+
and the only part of it which is at least remotely interesting.

Not really, no. The interesting thing in this case appears to
be knowing whether or not the match succeeded, but you omitted
that part.

This is of interest to you as it enables you to base an 'argumentation'
(sarcasm) on arbitrary assumptions you've chosen to make. It's not
something I consider interesting and it's beside the point of the
example I posted.

But that's not the same as a regex matcher, which has a semantic
notion of success or failure. I could run your snippet against
a string such as, say, "ZZZZZZ" and it would "succeed" just as
it would against an empty string or a string of one or more
digits.

Why do you believe that p being equivalent to the starting position
would be considered a "successful match", considering that this
obviously doesn't make any sense?

Because absent any surrounding context, there's no indication
that the source is even saved.

A text usually doesn't contain information about things which aren't
part of its content. I congratulate you on this rather obvious
observation.

[...]

Something which would match [0-9]+ in its first argument (if any) would
be:

#include "string.h"
#include "stdlib.h"

int main(int argc, char **argv)
{
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;
    if (!c) exit(1);
    return 0;
}

but that's 14 lines of text, 13 of which have absolutely no relation to
the problem of recognizing a digit.

This is wrong in many ways. Did you actually test that program?

First of all, why `"string.h"` and not `<string.h>`? Ok, that's
not technically an error, but it's certainly unconventional, and
raises questions that are ultimately a distraction.

Such as your paragraph above.

Second, suppose that `argc==0` (yes, this can happen under
POSIX).

It can happen in case of some piece of functionally hostile software intentionally creating such a situation. Tangential, irrelevant
point. If you break it, you get to keep the parts.

Third, the loop: why `> 10`? Don't you mean `< 10`? You are
trying to match digits, not non-digits.

Mistake I made. The opposite of < 10 is > 9.

Fourth, you exit with failure (`exit(1)`) if `!p` *and* if `!c`
at the end, but `!c` there means you've reached the end of the
string; which should be success.

Mistake you made: [0-9]+ matches if there's at least one digit in the
string. That's why the loop terminates once one was found. In this case,
c cannot be 0.

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 18:12:34 2024

From Newsgroup: comp.unix.programmer

In article <87o727rwga.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

Something which would match [0-9]+ in its first argument (if any) would
be:

#include "string.h"
#include "stdlib.h"

int main(int argc, char **argv)
{
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;
    if (!c) exit(1);
    return 0;
}

but that's 14 lines of text, 13 of which have absolutely no relation to
the problem of recognizing a digit.

This is wrong in many ways. Did you actually test that program?

First of all, why `"string.h"` and not `<string.h>`? Ok, that's
not technically an error, but it's certainly unconventional, and
raises questions that are ultimately a distraction.

Such as your paragraph above.

Second, suppose that `argc==0` (yes, this can happen under
POSIX).

It can happen in case of some piece of functionally hostile software
intentionally creating such a situation. Tangential, irrelevant
point. If you break it, you get to keep the parts.

Third, the loop: why `> 10`? Don't you mean `< 10`? You are
trying to match digits, not non-digits.

Mistake I made. The opposite of < 10 is > 9.

I see. So you want to skip non-digits and exit the first time
you see a digit. Ok, fair enough, though that program has
already been written, and is called `grep`.

Fourth, you exit with failure (`exit(1)`) if `!p` *and* if `!c`
at the end, but `!c` there means you've reached the end of the
string; which should be success.

Mistake you made: [0-9]+ matches if there's at least one digit in the
string. That's why the loop terminates once one was found. In this case,
c cannot be 0.

Ah, you are trying to match `[0-9]` (though you're calling it
`[0-9]+`). Yeah, your program was not at all equivalent to one
I wrote, though this is what you posted in response to mine, so
I assumed you were trying to emulate that behavior (matching
`^[0-9]+$`).

But I see above that you mentioned `[0-9]+`. But as I mentioned
above, really you're just matching any digit, so you may as well
be matching `[0-9]`; again, this is not the same as the actual
regexp, because you are ignoring the semantics of what regular
expressions actually describe.

In any event, this seems simpler than what you posted:

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "Usage: matchd <str>\n");
        return EXIT_FAILURE;
    }

    for (const char *p = argv[1]; *p != '\0'; p++)
        if ('0' <= *p && *p <= '9')
            return EXIT_SUCCESS;

    return EXIT_FAILURE;
}

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Kaz Kylheku@643-408-1753@kylheku.com to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 18:18:04 2024

From Newsgroup: comp.unix.programmer

On 2024-11-22, Muttley@DastartdlyHQ.org <Muttley@DastartdlyHQ.org> wrote:

On Thu, 21 Nov 2024 19:12:03 -0000 (UTC)
Kaz Kylheku <643-408-1753@kylheku.com> boring babbled:

On 2024-11-20, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

I'm curious what you mean by Regexps presented in a "procedural" form.
Can you give some examples?

Here is an example: using a regex match to capture a C comment /* ... */
in Lex compared to just recognizing the start sequence /* and handling
the discarding of the comment in the action.

Without non-greedy repetition matching, the regex for a C comment is
quite obtuse. The procedural handling is straightforward: read
characters until you see a * immediately followed by a /.

It's not that simple I'm afraid since comments can be commented out.

Umm, no.

eg:

// int i; /*

This /* sequence is inside a // comment, and so the machinery that
recognizes /* as the start of a comment would never see it.

Just like "int i;" is in a string literal and so not recognized
as a keyword, whitespace, identifier and semicolon.

int j;
/*
int k;
*/
++j;

A C99 and C++ compiler would see "int j" and compile it, a regex would
simply remove everything from the first /* to */.

No, it won't, because that's not how regexes are used in a lexical
analyzer. At the start of the input, the lexical analyzer faces
the characters "// int i; /*\n". This will trigger the pattern match
for // comments. Essentially that entire sequence through the newline
is treated as a kind of token, equivalent to a space.

Once a token is recognized and removed from the input, it is gone;
no other regular expression can match into it.
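A minimal sketch of that first-match-wins behaviour, written procedurally in C rather than as Lex rules. The name `strip_comments` and its buffer contract are assumptions made for this illustration, not anything posted in the thread; character literals, trigraphs and backslash-newline splicing are deliberately ignored.

```c
#include <string.h>

/* Copies src to dst, replacing each comment with a single space.
 * dst must have room for strlen(src) + 1 bytes.
 * Whatever construct starts first at the current position wins, so a
 * slash-star inside a // comment or a string literal is never seen
 * as a comment opener. */
void strip_comments(const char *src, char *dst)
{
    while (*src != '\0') {
        if (src[0] == '/' && src[1] == '/') {
            /* Line comment: consume up to, not including, the newline. */
            src += strcspn(src, "\n");
            *dst++ = ' ';
        } else if (src[0] == '/' && src[1] == '*') {
            /* Block comment: read until '*' immediately followed by '/'. */
            src += 2;
            while (*src != '\0' && !(src[0] == '*' && src[1] == '/'))
                src++;
            if (*src != '\0')
                src += 2;
            *dst++ = ' ';
        } else if (*src == '"') {
            /* String literal: copied verbatim, escape sequences included. */
            *dst++ = *src++;
            while (*src != '\0' && *src != '"') {
                if (*src == '\\' && src[1] != '\0')
                    *dst++ = *src++;
                *dst++ = *src++;
            }
            if (*src == '"')
                *dst++ = *src++;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

Fed the example from this exchange, the `// int i; /*` line collapses to one space, `int j;` and `++j;` survive, and the `/* int k; */` block collapses to another space.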

Also the same probably applies to #ifdef's.

Lexically analyzing C requires implementing the translation phases
as described in the standard. There are preprocessor phases which
delimit the input into preprocessor tokens (pp-tokens). Comments
are stripped in preprocessing. But logical lines (backslash
continuations) are recognized below comments; i.e. this is one
comment:

// comment \
split \
into \
physical \
lines

A lexical scanner can have an input routine which transparently handles
this low-level detail, so that it doesn't have to deal with the
line continuations in every token pattern.
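Such an input routine might look roughly like the following sketch. The `struct source` type and the `next_char` function are hypothetical names chosen here, and the input is an in-memory string rather than a file to keep the example short.

```c
#include <stddef.h>

/* Hypothetical input source over a NUL-terminated string. */
struct source {
    const char *text;
    size_t pos;
};

/* Returns the next character with every backslash-newline pair
 * removed, or -1 at end of input. Token patterns built on top of
 * this never see a line continuation. */
int next_char(struct source *in)
{
    while (in->text[in->pos] == '\\' && in->text[in->pos + 1] == '\n')
        in->pos += 2;
    if (in->text[in->pos] == '\0')
        return -1;
    return (unsigned char)in->text[in->pos++];
}
```

A scanner that reads a `//` comment through `next_char` simply keeps going until it gets a real newline, so a comment split over several physical lines is consumed as one.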
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
--- Synchronet 3.21d-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 18:14:48 2024

From Newsgroup: comp.unix.programmer

Rainer Weikusat <rweikusat@talktalk.net> writes:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

Rainer Weikusat <rweikusat@talktalk.net> wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

[...]

Personally I think that writing bulky procedural stuff for something
like [0-9]+ can only be much worse, and that further abbreviations
like \d+ are the better direction to go if targeting a good interface.
YMMV.

Assuming that p is a pointer to the current position in a string, e is a
pointer to the end of it (ie, point just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is

while (p < e && *p - '0' < 10) ++p;

That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).

It's also not exactly right. `[0-9]+` would match one or more
characters; this possibly matches 0 (ie, if `p` pointed to
something that wasn't a digit).

The regex won't match any digits if there aren't any. In this case, the
match will fail. I didn't include the code for handling that because it
seemed pretty pointless for the example.

That's rather the point though, isn't it? The program snippet
(modulo the promotion to signed int via the "usual arithmetic
conversions" before the subtraction and comparison giving you
unexpected values; nothing to do with whether `char` is signed
or not) is a snippet that advances a pointer while it points to
a digit, starting at the current pointer position; that is, it
just increments a pointer over a run of digits.

That's the core part of matching something equivalent to the regex [0-9]+
and the only part of it which is at least remotely interesting.

But that's not the same as a regex matcher, which has a semantic
notion of success or failure. I could run your snippet against
a string such as, say, "ZZZZZZ" and it would "succeed" just as
it would against an empty string or a string of one or more
digits.

Why do you believe that p being equivalent to the starting position
would be considered a "successful match", considering that this
obviously doesn't make any sense?
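For illustration, a sketch of the snippet under discussion with both objections addressed. The function names are made up here. The cast is there because `*p - '0'` is evaluated as a signed `int` after promotion, so a byte below `'0'` would otherwise yield a negative value that compares less than 10; the one-or-more requirement becomes an explicit comparison against the starting position.

```c
#include <stddef.h>

/* Advances over a run of ASCII digits in [p, e) and returns the
 * position just past the run. The cast forces an unsigned
 * comparison, so bytes below '0' wrap to large values. */
const unsigned char *skip_digits(const unsigned char *p,
                                 const unsigned char *e)
{
    while (p < e && (unsigned)(*p - '0') < 10)
        ++p;
    return p;
}

/* Length of the [0-9]+ match at the start of [p, e), or 0 if the
 * pattern does not match there. */
size_t match_digits(const unsigned char *p, const unsigned char *e)
{
    return (size_t)(skip_digits(p, e) - p);
}
```

With this split, "ZZZZZZ" and the empty string both give 0 (no match), while "123abc" gives 3.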

[...]

By the way, something that _would_ match `^[0-9]+$` might be:

[too much code]

Something which would match [0-9]+ in its first argument (if any) would
be:

#include "string.h"
#include "stdlib.h"

int main(int argc, char **argv)
{
char *p;
unsigned c;

p = argv[1];
if (!p) exit(1);
while (c = *p, c && c - '0' > 10) ++p;
if (!c) exit(1);
return 0;
}

but that's 14 lines of text, 13 of which have absolutely no relation to
the problem of recognizing a digit.

Personally, I'd use:

$ cat /tmp/a.c
#include <stdint.h>
#include <stdlib.h>   /* declares strtoull */
#include <string.h>

int
main(int argc, const char **argv)
{
    char *cp;
    uint64_t value;

    if (argc < 2) return 1;

    value = strtoull(argv[1], &cp, 10);
    if ((cp == argv[1])
        || (*cp != '\0')) {
        return 1;
    }
    return 0;
}
$ cc -o /tmp/a /tmp/a.c
$ /tmp/a 13254
$ echo $?
0
$ /tmp/a 23v23
$ echo $?
1
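A hedged side note on how close the `strtoull` approach is to `^[0-9]+$`. As far as the C standard describes it, `strtoull` skips leading white space, accepts an optional sign, and reports overflow through `errno` rather than through the end pointer. The sketch below (function names invented for this example) contrasts a check in the style of the program above with a stricter one.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Same test as the program above: something was converted and the
 * conversion consumed the whole string. */
bool loose_check(const char *s)
{
    char *end;
    (void)strtoull(s, &end, 10);
    return end != s && *end == '\0';
}

/* Stricter variant: the first byte must itself be a digit (no white
 * space, no sign) and the value must fit in unsigned long long. */
bool strict_check(const char *s)
{
    char *end;
    if (s[0] < '0' || s[0] > '9')
        return false;
    errno = 0;
    (void)strtoull(s, &end, 10);
    return *end == '\0' && errno != ERANGE;
}
```

Under these assumptions `loose_check` accepts " 42" and "-1" as well as "13254", and it also accepts a digit string too long to represent, because only `errno` signals the overflow.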
--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 18:30:31 2024

From Newsgroup: comp.unix.programmer

In article <VZ30P.4664$YSkc.1894@fx40.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:

scott@slp53.sl.home (Scott Lurndal) writes:

Rainer Weikusat <rweikusat@talktalk.net> writes:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

Rainer Weikusat <rweikusat@talktalk.net> wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

[...]

Personally I think that writing bulky procedural stuff for something
like [0-9]+ can only be much worse, and that further abbreviations
like \d+ are the better direction to go if targeting a good interface.
YMMV.

Assuming that p is a pointer to the current position in a string, e is a
pointer to the end of it (ie, point just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is

while (p < e && *p - '0' < 10) ++p;

That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).

It's also not exactly right. `[0-9]+` would match one or more
characters; this possibly matches 0 (ie, if `p` pointed to
something that wasn't a digit).

The regex won't match any digits if there aren't any. In this case, the
match will fail. I didn't include the code for handling that because it
seemed pretty pointless for the example.

That's rather the point though, isn't it? The program snippet
(modulo the promotion to signed int via the "usual arithmetic
conversions" before the subtraction and comparison giving you
unexpected values; nothing to do with whether `char` is signed
or not) is a snippet that advances a pointer while it points to
a digit, starting at the current pointer position; that is, it
just increments a pointer over a run of digits.

That's the core part of matching something equivalent to the regex [0-9]+
and the only part of it which is at least remotely interesting.

But that's not the same as a regex matcher, which has a semantic
notion of success or failure. I could run your snippet against
a string such as, say, "ZZZZZZ" and it would "succeed" just as
it would against an empty string or a string of one or more
digits.

Why do you believe that p being equivalent to the starting position
would be considered a "successful match", considering that this
obviously doesn't make any sense?

[...]

By the way, something that _would_ match `^[0-9]+$` might be:

[too much code]

Something which would match [0-9]+ in its first argument (if any) would
be:

#include "string.h"
#include "stdlib.h"

int main(int argc, char **argv)
{
char *p;
unsigned c;

p = argv[1];
if (!p) exit(1);
while (c = *p, c && c - '0' > 10) ++p;
if (!c) exit(1);
return 0;
}

but that's 14 lines of text, 13 of which have absolutely no relation to
the problem of recognizing a digit.

Personally, I'd use:

Albeit this is limited to strings of digits that sum to less than
ULONG_MAX...

It's not quite equivalent to his program, which just exits with
success if it sees any input string with a digit in it; yours
is closer to what I wrote, which matches `^[0-9]+$`. His is not
an interesting program and certainly not a recognizable
equivalent of a regular expression matcher in any reasonable sense,
but I think the cognitive dissonance is too strong to get that
across.

- Dan C.

$ cat /tmp/a.c
#include <stdint.h>
#include <string.h>

int
main(int argc, const char **argv)
{
char *cp;
uint64_t value;

if (argc < 2) return 1;

value = strtoull(argv[1], &cp, 10);
if ((cp == argv[1])
|| (*cp != '\0')) {
return 1;
}
return 0;
}
$ cc -o /tmp/a /tmp/a.c
$ /tmp/a 13254
$ echo $?
0
$ /tmp/a 23v23
$ echo $?
1

--- Synchronet 3.21d-Linux NewsLink 1.2

From Kaz Kylheku@643-408-1753@kylheku.com to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 18:19:30 2024

From Newsgroup: comp.unix.programmer

On 2024-11-22, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 21.11.2024 20:12, Kaz Kylheku wrote:

[...]

In the wild, you see regexes being used for all sorts of stupid stuff,

No one can prevent folks using features for stupid things. Yes.

But the thing is that "modern" regular expressions (Perl regex and its
progeny) have features that are designed to exclusively cater to these
folks.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
--- Synchronet 3.21d-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 18:22:45 2024

From Newsgroup: comp.unix.programmer

scott@slp53.sl.home (Scott Lurndal) writes:

Rainer Weikusat <rweikusat@talktalk.net> writes:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

Rainer Weikusat <rweikusat@talktalk.net> wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

[...]

Personally I think that writing bulky procedural stuff for something
like [0-9]+ can only be much worse, and that further abbreviations
like \d+ are the better direction to go if targeting a good interface.
YMMV.

Assuming that p is a pointer to the current position in a string, e is a
pointer to the end of it (ie, point just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is

while (p < e && *p - '0' < 10) ++p;

That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).

It's also not exactly right. `[0-9]+` would match one or more
characters; this possibly matches 0 (ie, if `p` pointed to
something that wasn't a digit).

The regex won't match any digits if there aren't any. In this case, the
match will fail. I didn't include the code for handling that because it
seemed pretty pointless for the example.

That's rather the point though, isn't it? The program snippet
(modulo the promotion to signed int via the "usual arithmetic
conversions" before the subtraction and comparison giving you
unexpected values; nothing to do with whether `char` is signed
or not) is a snippet that advances a pointer while it points to
a digit, starting at the current pointer position; that is, it
just increments a pointer over a run of digits.

That's the core part of matching something equivalent to the regex [0-9]+
and the only part of it which is at least remotely interesting.

But that's not the same as a regex matcher, which has a semantic
notion of success or failure. I could run your snippet against
a string such as, say, "ZZZZZZ" and it would "succeed" just as
it would against an empty string or a string of one or more
digits.

Why do you believe that p being equivalent to the starting position
would be considered a "successful match", considering that this
obviously doesn't make any sense?

[...]

By the way, something that _would_ match `^[0-9]+$` might be:

[too much code]

Something which would match [0-9]+ in its first argument (if any) would
be:

#include "string.h"
#include "stdlib.h"

int main(int argc, char **argv)
{
char *p;
unsigned c;

p = argv[1];
if (!p) exit(1);
while (c = *p, c && c - '0' > 10) ++p;
if (!c) exit(1);
return 0;
}

but that's 14 lines of text, 13 of which have absolutely no relation to
the problem of recognizing a digit.

Personally, I'd use:

Albeit this is limited to strings of digits that sum to less than
ULONG_MAX...

$ cat /tmp/a.c
#include <stdint.h>
#include <string.h>

int
main(int argc, const char **argv)
{
char *cp;
uint64_t value;

if (argc < 2) return 1;

value = strtoull(argv[1], &cp, 10);
if ((cp == argv[1])
|| (*cp != '\0')) {
return 1;
}
return 0;
}
$ cc -o /tmp/a /tmp/a.c
$ /tmp/a 13254
$ echo $?
0
$ /tmp/a 23v23
$ echo $?
1

--- Synchronet 3.21d-Linux NewsLink 1.2

From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 18:48:55 2024

From Newsgroup: comp.unix.programmer

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[...]

In any event, this seems simpler than what you posted:

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
if (argc != 2) {
fprintf(stderr, "Usage: matchd <str>\n");
return EXIT_FAILURE;
}

for (const char *p = argv[1]; *p != '\0'; p++)
if ('0' <= *p && *p <= '9')
return EXIT_SUCCESS;

return EXIT_FAILURE;
}

It's not only 4 lines longer but in just about every individual aspect
syntactically more complicated and more messy and functionally more
clumsy. This is particularly noticeable in the loop

for (const char *p = argv[1]; *p != '\0'; p++)
if ('0' <= *p && *p <= '9')
return EXIT_SUCCESS;

the loop header containing a spuriously qualified variable declaration,
the loop body and half of the termination condition. The other half then
follows as special-case in the otherwise useless loop body.

It looks like a copy of my code with each individual bit redesigned
under the guiding principle of "Can we make this more complicated?", eg,

char **argv

declares an array of pointers (as each pointer in C points to an array)
and

char *argv[]

accomplishes exactly the same but uses both more characters and more
different kinds of characters.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 18:59:43 2024

From Newsgroup: comp.unix.programmer

scott@slp53.sl.home (Scott Lurndal) writes:

Rainer Weikusat <rweikusat@talktalk.net> writes:

[...]

Something which would match [0-9]+ in its first argument (if any) would
be:

#include "string.h"
#include "stdlib.h"

int main(int argc, char **argv)
{
char *p;
unsigned c;

p = argv[1];
if (!p) exit(1);
while (c = *p, c && c - '0' > 10) ++p;
if (!c) exit(1);
return 0;
}

but that's 14 lines of text, 13 of which have absolutely no relation to
the problem of recognizing a digit.

Personally, I'd use:

$ cat /tmp/a.c
#include <stdint.h>
#include <string.h>

int
main(int argc, const char **argv)
{
char *cp;
uint64_t value;

if (argc < 2) return 1;

value = strtoull(argv[1], &cp, 10);
if ((cp == argv[1])
|| (*cp != '\0')) {
return 1;
}
return 0;
}

This will accept a string of digits whose numerical value is <=
ULLONG_MAX, ie, it's basically ^[0-9]+$ with unobvious length and
content limits.

return !strstr(argv[1], "0123456789");

would be a better approximation, just a much more complicated algorithm
than necessary. Even in strictly conforming ISO-C "digitness" of a
character can be determined by a simple calculation instead of some kind
of search loop.
--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 19:05:42 2024

From Newsgroup: comp.unix.programmer

In article <87h67zrtns.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[...]

In any event, this seems simpler than what you posted:

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
if (argc != 2) {
fprintf(stderr, "Usage: matchd <str>\n");
return EXIT_FAILURE;
}

for (const char *p = argv[1]; *p != '\0'; p++)
if ('0' <= *p && *p <= '9')
return EXIT_SUCCESS;

return EXIT_FAILURE;
}

It's not only 4 lines longer but in just about every individual aspect
syntactically more complicated and more messy and functionally more
clumsy.

That's a lot of opinion, and not particularly well-founded
opinion at that, given that your code was incorrect to begin
with.

This is particularly noticeable in the loop

for (const char *p = argv[1]; *p != '\0'; p++)
if ('0' <= *p && *p <= '9')
return EXIT_SUCCESS;

the loop header containing a spuriously qualified variable declaration,

Ibid. Const qualifying a pointer that I'm not going to assign
through is just good hygiene, IMHO.

the loop body and half of the termination condition.

I think you're trying to project a value judgement onto that
loop in order to make it fit a particular world view, but I
think this is an odd way to look at it.

Another way to look at it is that the loop is only concerned
with the iteration over the string, while the body is concerned
with applying some predicate to the element, and doing something
if that predicate evaluates to true.

The other half then
follows as special-case in the otherwise useless loop body.

That's a way to look at it, but I submit that's an outlier point
of view.

It looks like a copy of my code with each individual bit redesigned
under the guiding principle of "Can we make this more complicated?", eg,

Uh, no.

char **argv

declares an array of pointers

No, it declares a pointer to a pointer to char.

(as each pointer in C points to an array)

That's absolutely not true. A pointer in C may refer to
an array, or a scalar. Consider,

char c;
char *p = &c;
char **pp = &p;

For a concrete example of how this works in a real function,
consider the second argument to `strtol` et al in the standard
library.
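A small sketch of that `strtol` case, assuming nothing beyond the standard library; the wrapper `parse_prefix` is a name invented for this example. Here `&end` has type `char **` and points at exactly one `char *` object, not at an array of pointers.

```c
#include <stdlib.h>

/* Parses a leading decimal integer from s and stores the number of
 * bytes consumed in *consumed. */
long parse_prefix(const char *s, size_t *consumed)
{
    char *end;                          /* a single pointer object */
    long value = strtol(s, &end, 10);   /* &end: pointer to that pointer */
    *consumed = (size_t)(end - s);
    return value;
}
```

For "42abc" this yields 42 with two bytes consumed; for "abc" nothing is consumed and the result is 0.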

and

char *argv[]

accomplishes exactly the same but uses both more characters and more
different kinds of characters.

"more characters" is a poor metric.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 19:15:07 2024

From Newsgroup: comp.unix.programmer

In article <87cyinrt5s.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

scott@slp53.sl.home (Scott Lurndal) writes:

Rainer Weikusat <rweikusat@talktalk.net> writes:

[...]

Something which would match [0-9]+ in its first argument (if any) would
be:

#include "string.h"
#include "stdlib.h"

int main(int argc, char **argv)
{
char *p;
unsigned c;

p = argv[1];
if (!p) exit(1);
while (c = *p, c && c - '0' > 10) ++p;
if (!c) exit(1);
return 0;
}

but that's 14 lines of text, 13 of which have absolutely no relation to
the problem of recognizing a digit.

Personally, I'd use:

$ cat /tmp/a.c
#include <stdint.h>
#include <string.h>

int
main(int argc, const char **argv)
{
char *cp;
uint64_t value;

if (argc < 2) return 1;

value = strtoull(argv[1], &cp, 10);
if ((cp == argv[1])
|| (*cp != '\0')) {
return 1;
}
return 0;
}

This will accept a string of digits whose numerical value is <=
ULLONG_MAX, ie, it's basically ^[0-9]+$ with unobvious length and
content limits.

He acknowledged this already.

return !strstr(argv[1], "0123456789");

would be a better approximation,

No it wouldn't. That's not even close. `strstr` looks for an
instance of its second argument in its first, not an instance of
any character in its second argument in its first. Perhaps you
meant something with `strspn` or similar. E.g.,

const char *p = argv[1] + strspn(argv[1], "0123456789");
return *p != '\0';
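Spelled out a little further, as a sketch with invented function names: `strspn` measures the leading run of bytes drawn from a set, and `strpbrk` finds the first byte that belongs to a set. Note that the two-line version above also reports success for an empty string, so the anchored variant below additionally requires a non-empty run.

```c
#include <stdbool.h>
#include <string.h>

/* ^[0-9]+$ : the leading digit run must be non-empty and must reach
 * the terminating NUL. */
bool all_digits(const char *s)
{
    size_t n = strspn(s, "0123456789");
    return n > 0 && s[n] == '\0';
}

/* Unanchored [0-9] : true if any byte of s is a digit. */
bool contains_digit(const char *s)
{
    return strpbrk(s, "0123456789") != NULL;
}
```

`all_digits` corresponds to the anchored pattern and `contains_digit` to what the loop-based programs in this thread compute.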

just a much more complicated algorithm
than necessary. Even in strictly conforming ISO-C "digitness" of a
character can be determined by a simple calculation instead of some kind
of search loop.

Yes, one can do that, but why bother?

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 20:20:06 2024

From Newsgroup: comp.unix.programmer

On 22.11.2024 19:19, Kaz Kylheku wrote:

On 2024-11-22, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 21.11.2024 20:12, Kaz Kylheku wrote:

[...]

In the wild, you see regexes being used for all sorts of stupid stuff,

No one can prevent folks using features for stupid things. Yes.

But the thing is that "modern" regular expressions (Perl regex and its
progeny) have features that are designed to exclusively cater to these
folks.

Which ones are you specifically thinking of?

Since I'm not using Perl I don't know all the Perl RE details. Besides
the basic REs I'm aware of the abbreviations (like '\d') (that I like),
then extensions of Chomsky-3 (like back-references) (that I also like
to have in cases I need them; but one must know what we buy with them),
then the minimum-match (as opposed to matching the longest substring)
(which I think is useful to simplify some types of expressions), and
there was another one that evades my memories, something like context
dependent patterns (also useful), and wasn't there also some syntax to
match subexpression-hierarchies (useful as well), similar to GNU
Awk's gensub() (probably in a more primitive variant there) and also
existing in Kornshell patterns, which also support some more of the
above [Perl-]features, like the abbreviations.

Janis

--- Synchronet 3.21d-Linux NewsLink 1.2

From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 19:24:23 2024

From Newsgroup: comp.unix.programmer

cross@spitfire.i.gajendra.net (Dan Cross) writes:

Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[...]

In any event, this seems simpler than what you posted:

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
if (argc != 2) {
fprintf(stderr, "Usage: matchd <str>\n");
return EXIT_FAILURE;
}

for (const char *p = argv[1]; *p != '\0'; p++)
if ('0' <= *p && *p <= '9')
return EXIT_SUCCESS;

return EXIT_FAILURE;
}

It's not only 4 lines longer but in just about every individual aspect
syntactically more complicated and more messy and functionally more
clumsy.

That's a lot of opinion, and not particularly well-founded
opinion at that, given that your code was incorrect to begin
with.

That's not at all an opinion but an observation. My opinion on this is
that this is either a poor man's attempt at winning an obfuscation
contest or - simpler - exemplary bad code.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 19:26:07 2024

From Newsgroup: comp.unix.programmer

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <87cyinrt5s.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

scott@slp53.sl.home (Scott Lurndal) writes:

Rainer Weikusat <rweikusat@talktalk.net> writes:

[...]

Something which would match [0-9]+ in its first argument (if any) would
be:

#include "string.h"
#include "stdlib.h"

int main(int argc, char **argv)
{
char *p;
unsigned c;

p = argv[1];
if (!p) exit(1);
while (c = *p, c && c - '0' > 10) ++p;
if (!c) exit(1);
return 0;
}

but that's 14 lines of text, 13 of which have absolutely no relation to
the problem of recognizing a digit.

Personally, I'd use:

$ cat /tmp/a.c
#include <stdint.h>
#include <string.h>

int
main(int argc, const char **argv)
{
char *cp;
uint64_t value;

if (argc < 2) return 1;

value = strtoull(argv[1], &cp, 10);
if ((cp == argv[1])
|| (*cp != '\0')) {
return 1;
}
return 0;
}

This will accept a string of digits whose numerical value is <=
ULLONG_MAX, ie, it's basically ^[0-9]+$ with unobvious length and
content limits.

He acknowledged this already.

return !strstr(argv[1], "0123456789");

would be a better approximation,

No it wouldn't. That's not even close. `strstr` looks for an
instance of its second argument in its first, not an instance of
any character in its second argument in its first. Perhaps you
meant something with `strspn` or similar. E.g.,

const char *p = argv[1] + strspn(argv[1], "0123456789");
return *p != '\0';

My bad.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 20:33:24 2024

From Newsgroup: comp.unix.programmer

On 22.11.2024 12:56, Rainer Weikusat wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

On 20.11.2024 18:50, Rainer Weikusat wrote:

[...]
while (p < e && *p - '0' < 10) ++p;

That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).

Okay, I see where you're coming from (and especially in that simple
case).

Personally (and YMMV), even here in this simple case I think that
using pointers is not better but worse - and anyway isn't [in this
form] available in most languages;

That's a question of using the proper tool for the job. In C, that's
pointer and pointer arithmetic because it's the simplest way to express
something like this.

Yes, in "C" you'd use that primitive (error-prone) pointer feature.
That's what I said. And that in other languages it's less terse than
in "C" but equally error-prone if you have to create all the parsing
code yourself (without an existing engine and in a non-standard way).
And if you extend the expression to parse, it's IME much simpler done
in Regex than adjusting the algorithm of the ad hoc procedural code.

in other cases (and languages)
such constructs get yet more clumsy, and for my not very complex
example - /[0-9]+(ABC)?x*foo/ - even a "catastrophe" concerning
readability, error-proneness, and maintainability.

Procedural code for matching strings constructed in this way is
certainly much simpler than the equally procedural code for a
programmable automaton capable of interpreting regexes.

The point is that Regexps and the equivalence to FSA (with guaranteed
runtime complexity) is an [efficient] abstraction with a formalized
syntax; that are huge advantages compared to ad hoc parsing code in C
(or in any other language).
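To make the "existing engine" side concrete, a minimal sketch using the POSIX `regcomp`/`regexec` interface. The wrapper name `regex_search` is made up for this example; the pattern in the usage note is the one quoted above.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* True if text contains a match for the POSIX extended regular
 * expression pattern; false on no match or if the pattern does not
 * compile. */
bool regex_search(const char *pattern, const char *text)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    bool found = regexec(&re, text, 0, NULL, 0) == 0;
    regfree(&re);
    return found;
}
```

For example, `regex_search("[0-9]+(ABC)?x*foo", "12ABCxxfoo")` matches, while the same pattern against "ABCfoo" does not, since no digit precedes the rest. Changing the expression means editing one string rather than restructuring hand-written scanning code.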

Your statement
is basically "If we assume that the code interpreting regexes doesn't
exist, regexes need much less code than something equivalent which does
exist." Without this assumption, the picture becomes a different one
altogether.

I don't speak of assumptions. I speak about the fact that there's a
well-understood model with existing [parsing-]implementations already
available to handle a huge class of algorithms in a standardized way
with a guaranteed runtime-efficiency and in an error-resilient way.

Janis

[...]

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 19:46:31 2024

From Newsgroup: comp.unix.programmer

In article <878qtbrs0o.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[...]

In any event, this seems simpler than what you posted:

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
if (argc != 2) {
fprintf(stderr, "Usage: matchd <str>\n");
return EXIT_FAILURE;
}

for (const char *p = argv[1]; *p != '\0'; p++)
if ('0' <= *p && *p <= '9')
return EXIT_SUCCESS;

return EXIT_FAILURE;
}

It's not only 4 lines longer but in just about every individual aspect
syntactically more complicated and more messy and functionally more
clumsy.

That's a lot of opinion, and not particularly well-founded
opinion at that, given that your code was incorrect to begin
with.

That's not at all an opinion but an observation. My opinion on this is
that this is either a poor man's attempt at winning an obfuscation
contest or - simpler - exemplary bad code.

Opinion (noun)
a view or judgment formed about something, not necessarily based on
fact or knowledge. "I'm writing to voice my opinion on an issue of
little importance"

You mentioned snark earlier. Physician, heal thyself.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 19:51:18 2024

From Newsgroup: comp.unix.programmer

In article <874j3zrrxs.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <87cyinrt5s.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

scott@slp53.sl.home (Scott Lurndal) writes:

Rainer Weikusat <rweikusat@talktalk.net> writes:

[...]

Something which would match [0-9]+ in its first argument (if any) would
be:

#include "string.h"
#include "stdlib.h"

int main(int argc, char **argv)
{
char *p;
unsigned c;

p = argv[1];
if (!p) exit(1);
while (c = *p, c && c - '0' > 10) ++p;
if (!c) exit(1);
return 0;
}

but that's 14 lines of text, 13 of which have absolutely no relation to
the problem of recognizing a digit.
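
For reference, a minimal sketch of the unsigned-range idiom that the loop above relies on. The helper name contains_digit is made up for this illustration and is not from the thread. With unsigned arithmetic, c - '0' is at most 9 exactly when c is a decimal digit, so the bound matters: in ASCII, a "not a digit" test written as "> 10" would, as far as I can tell, also treat ':' as a digit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if s contains at least one decimal digit. */
int contains_digit(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        /* Go through unsigned char so negative char values cannot occur,
           then subtract in unsigned arithmetic: anything below '0' wraps
           around to a huge value and fails the <= 9 test. */
        unsigned c = (unsigned char)*s;
        if (c - '0' <= 9u)
            return 1;
    }
    return 0;
}
```

A caller would use it as contains_digit(argv[1]) after checking that argv[1] is not a null pointer.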

Personally, I'd use:

$ cat /tmp/a.c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

int
main(int argc, const char **argv)
{
char *cp;
uint64_t value;

if (argc < 2) return 1;

value = strtoull(argv[1], &cp, 10);
if ((cp == argv[1])
|| (*cp != '\0')) {
return 1;
}
return 0;
}

This will accept a string of digits whose numerical value is <=
ULLONG_MAX, ie, it's basically ^[0-9]+$ with unobvious length and
content limits.

He acknowledged this already.
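
As an aside, a sketch of how the strtoull approach could be tightened so that it accepts only a plain run of digits. This is an illustration under stated assumptions, not code from the thread: the name is_decimal_ull is hypothetical. It relies on two documented properties of strtoull: it skips leading white space and accepts a sign, and on overflow it still consumes the digits and reports the problem only through errno.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if s is a non-empty run of decimal digits
   whose value fits in unsigned long long, 0 otherwise. */
int is_decimal_ull(const char *s)
{
    char *end;

    assert(s != NULL);
    /* strtoull itself skips leading white space and accepts '+' or '-',
       so insist on a digit up front. */
    if (!isdigit((unsigned char)s[0]))
        return 0;

    errno = 0;
    (void)strtoull(s, &end, 10);
    /* On overflow the digits are still consumed and ULLONG_MAX is
       returned; only errno == ERANGE reveals it. */
    if (errno == ERANGE)
        return 0;
    return *end == '\0';
}
```

The length limit remains, which is the point being made above: the check is tied to the range of the integer type rather than to the pattern.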

return !strstr(argv[1], "0123456789");

would be a better approximation,

No it wouldn't. That's not even close. `strstr` looks for an
instance of its second argument in its first, not an instance of
any character in its second argument in its first. Perhaps you
meant something with `strspn` or similar. E.g.,

const char *p = argv[1] + strspn(argv[1], "0123456789");
return *p != '\0';
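
To make the distinction concrete, a small sketch comparing the three standard functions involved. The wrapper names are hypothetical; strstr, strpbrk and strspn are the standard <string.h> functions. Note that the strspn form corresponds to something like ^[0-9]*$, so the empty string passes.

```c
#include <assert.h>
#include <string.h>

#define DIGITS "0123456789"

/* strstr: is the whole string "0123456789" present as a substring? */
int has_digit_run_literal(const char *s)
{
    assert(s != NULL);
    return strstr(s, DIGITS) != NULL;
}

/* strpbrk: is any single character from DIGITS present?  Roughly [0-9]. */
int has_any_digit(const char *s)
{
    assert(s != NULL);
    return strpbrk(s, DIGITS) != NULL;
}

/* strspn: does the string consist only of characters from DIGITS?
   Roughly ^[0-9]*$, so the empty string passes too. */
int is_all_digits(const char *s)
{
    assert(s != NULL);
    return s[strspn(s, DIGITS)] == '\0';
}
```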

My bad.

You've made a lot of "bad"s in this thread, and been rude about
it to boot, crying foul when someone's pointed out ways that
your code is deficient; claiming offense at what you perceive as
"snark" while dishing the same out in kind, making basic errors
that show you haven't done the barest minimum of testing, and
making statements that show you have, at best, a limited grasp
on the language you're choosing to use.

I'm done being polite. My conclusion is that perhaps you are
not as up on these things as you seem to think that you are.

- Dan C.

From Lawrence D'Oliveiro@ldo@nz.invalid to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Fri Nov 22 20:41:21 2024

From Newsgroup: comp.unix.programmer

On Fri, 22 Nov 2024 12:47:16 +0100, Janis Papanagnou wrote:

On 21.11.2024 23:05, Lawrence D'Oliveiro wrote:

Another handy one is “\b” for word boundaries.

I prefer \< and \> (that are quite commonly used) for such structural
things ...

“\<” only matches the beginning of a word, “\>” only matches the end, “\b” matches both <https://www.gnu.org/software/emacs/manual/html_node/emacs/Regexp-Backslash.html>.

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Fri Nov 22 21:14:28 2024

From Newsgroup: comp.unix.programmer

In article <87cyinthjk.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <87o728qzbc.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

mas@a4.home writes:

On 2024-11-20, Rainer Weikusat <rweikusat@talktalk.net> wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

Assuming that p is a pointer to the current position in a string, e is a
pointer to the end of it (ie, point just past the last byte) and -
that's important - both are pointers to unsigned quantities, the 'bulky'
C equivalent of [0-9]+ is

while (p < e && *p - '0' < 10) ++p;

That's not too bad. And it's really a hell lot faster than a
general-purpose automaton programmed to recognize the same pattern
(which might not matter most of the time, but sometimes, it does).

int
main(int argc, char **argv) {
unsigned char *p, *e;
unsigned char mystr[] = "12#45XY ";

p = mystr;
e = mystr + sizeof(mystr);
while (p < e && *p - '0' < 10) ++p;

The code I'm actually using is

while (p < e && (unsigned)*p - '0' < 10)
++p;

I just omitted that when posting this because I mistakenly assumed that
it probably wasn't needed, ;-). You could have pointed this out instead
of trying to dress it up as some sort of mystery problem. Especially as
I did mention that using unsigned arithmetic was necessary (should
really be self-evident).

Well, no, not exactly. You said that it was important that the
pointers point to unsigned quantities, but that wasn't the
issue.

The issue here is that I mistakenly assumed the (unsigned) in the code
was a left-over from before the time when I had changed to pointers to
unsigned char to fix a different issue.

That you "changed to pointers to unsigned char" is completely
irrelevant. That's not how C works: C will promote those
`unsigned char` values to `signed int` before it does the
subtraction. It does this because `unsigned char` has _rank_
lower than _int_ and the entire range of values is expressible
in a signed int. These are called, "the usual arithmetic
conversions."

I pointed you to the section of the standard that explains this.
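
A minimal sketch of the effect being described, assuming the common case where int can hold every unsigned char value (the static_assert spells that assumption out). The two helper names are hypothetical; the expressions are the two versions of the test quoted above.

```c
#include <assert.h>
#include <limits.h>

/* The sketch assumes unsigned char promotes to int, as on typical systems. */
static_assert(UCHAR_MAX <= INT_MAX, "sketch assumes unsigned char promotes to int");

/* As first posted: c is promoted to int, so the subtraction is signed and
   anything below '0' yields a negative number, which is < 10. */
int looks_like_digit_signed(unsigned char c)
{
    return c - '0' < 10;
}

/* With the cast: the subtraction is done in unsigned int, so values below
   '0' wrap around to something far larger than 10. */
int looks_like_digit_unsigned(unsigned char c)
{
    return (unsigned)c - '0' < 10;
}
```

With the test string from the quoted program, '#' passes the first check and fails the second, which is why the uncast loop runs past it.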

As C tries really hard to force
signed arithmetic onto people despite this basically never making any
sense,

It actually makes a lot of sense in a lot of contexts, and is
usually what people actually want.

You've shown that you, personally, don't have a great command of
the language, so perhaps you shouldn't opine quite so
forcefully.

the type of '0' is int, *p gets promoted to int and hence, the
result of the subtraction will also be an int and the '< 10' condition
will be true for every codepoint numerically less than '9' which
obviously won't work as intended.

Yes, that's how it works.

That's a mistake I made which would have warranted being pointed out and
possibly explained instead of posting some broken code together with
some output the broken code certainly never generated due to it being
broken.

Perhaps if you stopped being so unbearably smug and pretentious
in your judgements people would give you that grace. As it is,
you come across as thin-skinned and terribly insecure, and your
code shows a lack of competence.

- Dan C.

From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.programmer on Fri Nov 22 22:09:44 2024

From Newsgroup: comp.unix.programmer

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[...]

That's a mistake I made which would have warranted being pointed out and
possibly explained instead of posting some broken code together with
some output the broken code certainly never generated due to it being
broken.

Perhaps if you stopped being so unbearably smug and pretentious
in your judgements people would give you that grace. As it is,
you come across as thin-skinned and terribly insecure, and your
code shows a lack of competence.

I suggest you print this and pin it to a place you frequently look
at. This may eventually lead you to a much more realistic assessment of
yourself than you presently seem to have.

From James Kuyper@jameskuyper@alumni.caltech.edu to comp.unix.programmer on Fri Nov 22 17:16:05 2024

From Newsgroup: comp.unix.programmer

On 11/22/24 16:14, Dan Cross wrote:
...

irrelevant. That's not how C works: C will promote those
`unsigned char` values to `signed int` before it does the
subtraction. It does this because `unsigned char` has _rank_
lower than _int_ ...

True.

... and the entire range of values is expressible
in a signed int.

Not necessarily. An implementation is allowed to have UCHAR_MAX >
INT_MAX, in which case unsigned char promotes to unsigned int rather
than int. I'm aware of systems where UCHAR_MAX > LONG_MAX was true:
char, short, int, and long were all 32 bits.
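
One way to check which case applies on a given implementation, sketched with C11 _Generic. The function names are hypothetical. Unary + applies the integer promotions and nothing else, so the type of +uc is the promoted type of unsigned char.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: returns 1 if unsigned char promotes to int on this
   implementation, 0 if it promotes to unsigned int. */
int uchar_promotes_to_int(void)
{
    unsigned char uc = 0;
    return _Generic(+uc, int: 1, unsigned int: 0);
}

/* The rule itself: the promotion goes to int exactly when int can
   represent every unsigned char value. */
void check_promotion_rule(void)
{
    assert(uchar_promotes_to_int() == (UCHAR_MAX <= INT_MAX));
}
```

On mainstream hosted platforms this reports int; the unsigned int branch is only reachable on systems like the 32-bit-char ones mentioned above.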

... These are called, "the usual arithmetic
conversions."

Actually, what you're talking about are the integer promotions. The
first step of the usual arithmetic conversions is to apply the integer promotions to each operand, but then a few other things happen as well.

From James Kuyper@jameskuyper@alumni.caltech.edu to comp.unix.programmer on Fri Nov 22 17:26:59 2024

From Newsgroup: comp.unix.programmer

On 11/22/24 14:05, Dan Cross wrote:

In article <87h67zrtns.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:...

char **argv

declares an array of pointers

No, it declares a pointer to a pointer to char.

Agreed.

(as each pointer in C points to an array)

That's absolutely not true. A pointer in C may refer to
an array, or a scalar. Consider,

char c;
char *p = &c;
char **pp = &p;

Not actually relevant. For purposes of pointer arithmetic, a pointer to
a single object is treated as if it pointed at the first element of a
1-element array of that object's type.
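
A short sketch of what that rule permits, using a lone char as in the example above. The function name is hypothetical. Forming and comparing the one-past-the-end pointer is allowed; dereferencing it is not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: forms the one-past-the-end pointer of a lone char
   object and returns its distance from the object itself. */
ptrdiff_t one_past_distance(void)
{
    char c = 'x';
    char *p = &c;        /* points at a scalar, not at a declared array */
    char *end = p + 1;   /* allowed: c counts as a 1-element array here */

    assert(end != p);    /* end may be compared and subtracted ... */
    return end - p;      /* ... but must not be dereferenced */
}
```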

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Fri Nov 22 22:34:52 2024

From Newsgroup: comp.unix.programmer

In article <e47664d3-7f9b-4e67-aa73-b72c6cc0687a@alumni.caltech.edu>,
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

On 11/22/24 16:14, Dan Cross wrote:
...

irrelevant. That's not how C works: C will promote those
`unsigned char` values to `signed int` before it does the
subtraction. It does this because `unsigned char` has _rank_
lower than _int_ ...

True.

... and the entire range of values is expressible
in a signed int.

Not necessarily. An implementation is allowed to have UCHAR_MAX >
INT_MAX, in which case unsigned char promotes to unsigned int rather
than int. I'm aware of systems where UCHAR_MAX > LONG_MAX was true:
char, short, int, and long were all 32 bits.

Yes, but in this context, that's obviously not the case as he
posted the behavior he saw. I was merely explaining _why_ he
saw that behavior, vis the standard.

... These are called, "the usual arithmetic
conversions."

Actually, what you're talking about are the integer promotions. The
first step of the usual arithmetic conversions is to apply the integer
promotions to each operand, but then a few other things happen as well.

This is correct, but IMHO too far down into the weeds. Consider
section 6.3.1.1 para 3, which notes that, "the integer
promotions are applied only: ... 1. as part of the usual
arithmetic conversions". Since we're talking about operands to
a binary operator, 6.3.1.8 applies. 6.3.1.8 is why converting
one side to unsigned is sufficient to get an unsigned result.
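
The last point can be checked with a small C11 sketch. TYPE_TAG and the tag names are invented for this illustration. It reports the type of the subtraction with and without the cast: once one operand is unsigned int and the other is int, the usual arithmetic conversions convert the int operand, so the result is unsigned int.

```c
#include <assert.h>
#include <limits.h>

/* Type tags returned by the hypothetical TYPE_TAG macro below. */
enum { TAG_OTHER = 0, TAG_INT = 1, TAG_UINT = 2 };

#define TYPE_TAG(expr) \
    _Generic((expr), int: TAG_INT, unsigned int: TAG_UINT, default: TAG_OTHER)

/* Result type of the subtraction for a plain unsigned char operand. */
int tag_plain(void)
{
    unsigned char c = '7';
    return TYPE_TAG(c - '0');
}

/* Result type once one operand has been cast to unsigned. */
int tag_with_cast(void)
{
    unsigned char c = '7';
    return TYPE_TAG((unsigned)c - '0');
}

void check_tags(void)
{
    /* unsigned int - int: the int operand is converted to unsigned int. */
    assert(tag_with_cast() == TAG_UINT);
    /* unsigned char - int: int - int wherever int can hold UCHAR_MAX. */
    assert(tag_plain() == (UCHAR_MAX <= INT_MAX ? TAG_INT : TAG_UINT));
}
```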

- Dan C.

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Fri Nov 22 23:10:09 2024

From Newsgroup: comp.unix.programmer

In article <87cyimapjr.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[...]

That's a mistake I made which would have warranted being pointed out and
possibly explained instead of posting some broken code together with
some output the broken code certainly never generated due to it being
broken.

Perhaps if you stopped being so unbearably smug and pretentious
in your judgements people would give you that grace. As it is,
you come across as thin-skinned and terribly insecure, and your
code shows a lack of competence.

I suggest you print this and pin it to a place you frequently look
at. This may eventually lead you to a much more realistic assessment of
yourself than you presently seem to have.

*shrug* If you don't want to be criticized, don't be wrong so
often. Not my problem if you don't program well.

- Dan C.

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Fri Nov 22 23:06:18 2024

From Newsgroup: comp.unix.programmer

In article <vhr0fj$1bq0o$1@dont-email.me>,
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

On 11/22/24 14:05, Dan Cross wrote:

In article <87h67zrtns.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:...

char **argv

declares an array of pointers

No, it declares a pointer to a pointer to char.

Agreed.

(as each pointer in C points to an array)

That's absolutely not true. A pointer in C may refer to
an array, or a scalar. Consider,

char c;
char *p = &c;
char **pp = &p;

Not actually relevant. For purposes of pointer arithmetic, a pointer to
a single object is treated as if it pointed at the first element of a
1-element array of that object's type.

a) Please stop emailing me this stuff _and_ posting it here. I
have asked you this in the past, and previously you'd said that
it was because you switched news readers. That's fine, but that
was a year ago or more.

b) What you are referring to, from the section on Additive
Operators (6.5.7 in n3220; 6.5.6 in C99) is in reference to
pointer arithmetic; the statement that I was replying to was a
general statement about pointers, independent of issues of
pointer arithmetic. That is, it is not the case that, "each
pointer in C points to an array". The above example, to which
you replied, is a counterpoint to the general statement.

So while what you are saying is true, it doesn't change the
general point.

- Dan C.

From James Kuyper@jameskuyper@alumni.caltech.edu to comp.unix.programmer on Fri Nov 22 22:49:46 2024

From Newsgroup: comp.unix.programmer

On 11/22/24 18:06, Dan Cross wrote:
...

a) Please stop emailing me this stuff _and_ posting it here. I
have asked you this in the past, and previously you'd said that
it was because you switched news readers. That's fine, but that
was a year ago or more.

Actually, Thunderbird made the change to how the "Reply" button works
sometime around April 2021, and I'm still making the mistake of hitting
it when I should hit "Followup". At my age it's hard to change a habit
acquired over a couple of decades of posting to usenet; I'm doing my
best to change it, and still failing frequently.

I'm unlikely to deliberately send you e-mail, so you can easily avoid
the aggravation of dealing with my accidental e-mails by simply
kill-filing me.

b) What you are referring to, from the section on Additive
Operators (6.5.7 in n3220; 6.5.6 in C99) is in reference to
pointer arithmetic; the statement that I was replying to was a
general statement about pointers, independent of issues of
pointer arithmetic. That is, it is not the case that, "each
pointer in C points to an array". The above example, to which
you replied, is a counterpoint to the general statement.

I disagree. It is not independent, because there's nothing you can do
with a pointer itself that has either its value or its validity
dependent upon whether that pointer points at a single object or the
first element of an array of length one.

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.programmer on Sat Nov 23 05:26:04 2024

From Newsgroup: comp.unix.programmer

On 23.11.2024 04:49, James Kuyper wrote:

On 11/22/24 18:06, Dan Cross wrote:
...

a) Please stop emailing me this stuff _and_ posting it here. I
have asked you this in the past, and previously you'd said that
it was because you switched news readers. That's fine, but that
was a year ago or more.

Actually, Thunderbird made the change to how the "Reply" button works sometime around April 2021, and I'm still making the mistake of hitting
it when I should hit "Followup". At my age it's hard to change a habit acquired over a couple of decades of posting to usenet; I'm doing my
best to change it, and still failing frequently.

Been there. I desperately looked for a way to fix or alleviate that
horrendous user-interface misconception. Eventually I was able to
fix it, but unfortunately cannot quite recall how that was done.

Now in my old TB 45.8.0 I have

Followup v Forward ... More v
--------
Reply All
Reply

where Followup is the default and the Reply buttons are a drop-down
menu, while in TB 102.11.0 I have a similar menu like this

Followup v Reply Forward ... More v
-------- ----
Reply All ...
Reply Customize Toolbar

I seem to recall that the leftmost menu items were called something
like *cough* "Smart Reply" (or so), and I indeed see such an entry in
my newer TB version under More -> Customize Toolbar ; maybe it helps
to fix it [on your TB version] (or to find some hints on the Web with
that keyword).

Janis

From James Kuyper@jameskuyper@alumni.caltech.edu to comp.unix.programmer on Fri Nov 22 23:44:49 2024

From Newsgroup: comp.unix.programmer

On 11/22/24 17:34, Dan Cross wrote:

In article <e47664d3-7f9b-4e67-aa73-b72c6cc0687a@alumni.caltech.edu>,
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

On 11/22/24 16:14, Dan Cross wrote:

...

... and the entire range of values is expressible
in a signed int.

Not necessarily. An implementation is allowed to have UCHAR_MAX >
INT_MAX, in which case unsigned char promotes to unsigned int rather
than int. I'm aware of systems where UCHAR_MAX > LONG_MAX was true:
char, short, int, and long were all 32 bits.

Yes, but in this context, that's obviously not the case as he
posted the behavior he saw. I was merely explaining _why_ he
saw that behavior, vis the standard.

Your wording could easily give the false impression, to anyone who
didn't already know better, that promotion of unsigned char to signed
int is required by the standard, rather than it being dependent upon
whether UCHAR_MAX > INT_MAX.

... These are called, "the usual arithmetic
conversions."

Actually, what you're talking about are the integer promotions. The
first step of the usual arithmetic conversions is to apply the integer
promotions to each operand, but then a few other things happen as well.

This is correct, but IMHO too far down into the weeds. Consider
section 6.3.1.1 para 3, which notes that, "the integer
promotions are applied only: ... 1. as part of the usual
arithmetic conversions".

The latest draft of the standard that I have is n3096.pdf, dated
2023-04-01. In that draft, that wording is not present in paragraph 3,
but only footnote 63, which is referenced by paragraph 2. That footnote
does not contain the "1." that is present in your citation. That
footnote goes on to list the other contexts in which integer promotions
can occur: ", to certain argument expressions, to the operands of the
unary +, -, and ~ operators, and to both operands of the shift
operators, as specified by their respective subclauses."
Which version are you quoting? I have copies of most of the draft
standards that are available for free, but none of the final versions of
the standards, since those cost money.

... Since we're talking about operands to
a binary operator, 6.3.1.8 applies. 6.3.1.8 is why converting
one side to unsigned is sufficient to get an unsigned result.

Calling them the usual arithmetic conversions rather than the integer promotions is being unnecessarily vague. Your description only covers
the integer promotions, it doesn't cover any of the other usual
arithmetic conversions.

I'm going to try to come up with an analogy; the best one I could come
up with on the spur of the moment involves the US federal income tax form
1040. It has a section called "Payments". The first four payments are
all amounts that have been withheld from various income sources before
you ever get a chance to spend the money. Most of the other payments are
things that you spent money on that you are entitled to take as a credit against your tax liability.
What you've done is analogous to describing how withholding works, and
even using the term "withheld", and then referring to what you've just described as "payments" rather than "amounts withheld", even though your description doesn't fit the tax credits, which are the other kinds of
payments.

From James Kuyper@jameskuyper@alumni.caltech.edu to comp.unix.programmer on Sat Nov 23 00:04:05 2024

From Newsgroup: comp.unix.programmer

On 11/22/24 23:26, Janis Papanagnou wrote:

On 23.11.2024 04:49, James Kuyper wrote:

...

Actually, Thunderbird made the change to how the "Reply" button works
sometime around April 2021, and I'm still making the mistake of hitting
it when I should hit "Followup". At my age it's hard to change a habit
acquired over a couple of decades of posting to usenet; I'm doing my
best to change it, and still failing frequently.

Been there. I desperately looked for a way to fix or alleviate that horrendous user-interface misconception. Eventually I was able to
fix it, but unfortunately cannot quite recall how that was done.

I seem to recall that I was advised that I could use an older version of Thunderbird to edit the list of buttons that are visible, something that
cannot be done in the latest version. I'm afraid to go back to an older version, because so many of the updates to most of the software I own
consists of security fixes. I don't want to go back to an older, less
secure version of TB.

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.programmer on Sat Nov 23 06:09:15 2024

From Newsgroup: comp.unix.programmer

On 23.11.2024 06:04, James Kuyper wrote:

On 11/22/24 23:26, Janis Papanagnou wrote:

On 23.11.2024 04:49, James Kuyper wrote:

...

Actually, Thunderbird made the change to how the "Reply" button works
sometime around April 2021, and I'm still making the mistake of hitting
it when I should hit "Followup". At my age it's hard to change a habit
acquired over a couple of decades of posting to usenet; I'm doing my
best to change it, and still failing frequently.

Been there. I desperately looked for a way to fix or alleviate that
horrendous user-interface misconception. Eventually I was able to
fix it, but unfortunately cannot quite recall how that was done.

I seem to recall that I was advised that I could use an older version of Thunderbird to edit the list of buttons that are visible, something that cannot be done in the latest version. I'm afraid to go back to an older version, because so many of the updates to most of the software I own consists of security fixes. I don't want to go back to an older, less
secure version of TB.

Understandable. - But you noticed that the "Smart Reply" thing was
a feature of my newer Thunderbird version? - I'd think it should be
there, or isn't it in your version? (What version are you running?)

Janis

From Muttley@Muttley@dastardlyhq.com to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Sat Nov 23 11:40:37 2024

From Newsgroup: comp.unix.programmer

On Fri, 22 Nov 2024 18:18:04 -0000 (UTC)
Kaz Kylheku <643-408-1753@kylheku.com> gabbled:

On 2024-11-22, Muttley@DastartdlyHQ.org <Muttley@DastartdlyHQ.org> wrote:

It's not that simple I'm afraid since comments can be commented out.

Umm, no.

Umm, yes, they can.

eg:

// int i; /*

This /* sequence is inside a // comment, and so the machinery that
recognizes /* as the start of a comment would never see it.

Yes, that's kind of the point. You seem to be arguing against yourself.

A C99 and C++ compiler would see "int j" and compile it, a regex would
simply remove everything from the first /* to */.

No, it won't, because that's not how regexes are used in a lexical

Yes, it will.

Also the same probably applies to #ifdef's.

Lexically analyzing C requires implementing the translation phases
as described in the standard. There are preprocessor phases which
delimit the input into preprocessor tokens (pp-tokens). Comments
are stripped in preprocessing. But logical lines (backslash
continuations) are recognized below comments; i.e. this is one
comment:

Not sure what your point is. A regex cannot be used to parse C comments because it doesn't know C/C++ grammar.
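
Whatever one makes of the regex question, the state tracking both posters allude to can be sketched by hand. The following is a simplified illustration, not a conforming implementation of the translation phases: it ignores backslash-newline splicing and trigraphs, and it drops a block comment outright instead of replacing it with a space. The names strip_comments and strips_to are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies src to dst (which must have room for
   strlen(src) + 1 bytes) with // and block comments removed. */
void strip_comments(const char *src, char *dst)
{
    enum { CODE, LINE_COMMENT, BLOCK_COMMENT, STRING, CHAR_LIT } state = CODE;

    assert(src != NULL && dst != NULL);
    while (*src != '\0') {
        char c = *src;
        switch (state) {
        case CODE:
            if (c == '/' && src[1] == '/') { state = LINE_COMMENT; src += 2; continue; }
            if (c == '/' && src[1] == '*') { state = BLOCK_COMMENT; src += 2; continue; }
            if (c == '"')  state = STRING;
            if (c == '\'') state = CHAR_LIT;
            *dst++ = c;
            break;
        case LINE_COMMENT:
            /* Everything up to the newline is comment, including any "/*". */
            if (c == '\n') { state = CODE; *dst++ = c; }
            break;
        case BLOCK_COMMENT:
            /* A "//" in here is just comment text. */
            if (c == '*' && src[1] == '/') { state = CODE; src += 2; continue; }
            break;
        case STRING:
        case CHAR_LIT:
            *dst++ = c;
            if (c == '\\' && src[1] != '\0')
                *dst++ = *++src;               /* copy the escaped character */
            else if (c == (state == STRING ? '"' : '\''))
                state = CODE;                  /* closing quote */
            break;
        }
        src++;
    }
    *dst = '\0';
}

/* Usage helper: returns 1 if stripping src yields exactly expected. */
int strips_to(const char *src, const char *expected)
{
    char *buf = malloc(strlen(src) + 1);
    int same;

    if (buf == NULL)
        return 0;
    strip_comments(src, buf);
    same = strcmp(buf, expected) == 0;
    free(buf);
    return same;
}
```

The design point is that "/*" only starts a comment when the scanner is in the CODE state, so a "/*" that appears after "//" or inside a string literal is never treated as a comment opener.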

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Sat Nov 23 13:53:09 2024

From Newsgroup: comp.unix.programmer

In article <vhrjcr$1ijr4$1@dont-email.me>,
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

On 11/22/24 18:06, Dan Cross wrote:
...

a) Please stop emailing me this stuff _and_ posting it here. I
have asked you this in the past, and previously you'd said that
it was because you switched news readers. That's fine, but that
was a year ago or more.

Actually, Thunderbird made the change to how the "Reply" button works
sometime around April 2021, and I'm still making the mistake of hitting
it when I should hit "Followup". At my age it's hard to change a habit
acquired over a couple of decades of posting to usenet; I'm doing my
best to change it, and still failing frequently.

Three and a half years is a long time to learn how to use a tool
competently.

I'm unlikely to deliberately send you e-mail, so you can easily avoid
the aggravation of dealing with my accidental e-mails by simply
kill-filing me.

Please don't put the onus to deal with your mistakes on me.

b) What you are referring to, from the section on Additive
Operators (6.5.7 in n3220; 6.5.6 in C99) is in reference to
pointer arithmetic; the statement that I was replying to was a
general statement about pointers, independent of issues of
pointer arithmetic. That is, it is not the case that, "each
pointer in C points to an array". The above example, to which
you replied, is a counterpoint to the general statement.

I disagree. It is not independent, because there's nothing you can do
with a pointer itself that has either its value or its validity
dependent upon whether that pointer points at a single object or the
first element of an array of length one.

You can indirect it without doing arithmetic on it. This isn't
a particularly interesting topic, however. The point was that
the OP's statement was factually incorrect.
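
For completeness, a tiny sketch of indirection without arithmetic, built on the three declarations from the earlier example. The function name is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: writes to a lone char through a pointer to a
   pointer, using indirection only. */
char write_through(void)
{
    char c = 'a';
    char *p = &c;
    char **pp = &p;

    **pp = 'b';          /* two indirections, no pointer arithmetic */
    assert(*p == 'b');
    return c;
}
```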

- Dan C.

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Sat Nov 23 14:05:35 2024

From Newsgroup: comp.unix.programmer

In article <vhrmk1$1ivhr$1@dont-email.me>,
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

On 11/22/24 17:34, Dan Cross wrote:

In article <e47664d3-7f9b-4e67-aa73-b72c6cc0687a@alumni.caltech.edu>,
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

On 11/22/24 16:14, Dan Cross wrote:

...

... and the entire range of values is expressible
in a signed int.

Not necessarily. An implementation is allowed to have UCHAR_MAX >
INT_MAX, in which case unsigned char promotes to unsigned int rather
than int. I'm aware of systems where UCHAR_MAX > LONG_MAX was true:
char, short, int, and long were all 32 bits.

Yes, but in this context, that's obviously not the case as he
posted the behavior he saw. I was merely explaining _why_ he
saw that behavior, vis the standard.

Your wording could easily give the false impression, to anyone who
didn't already know better, that promotion of unsigned char to signed
int is required by the standard, rather than it being dependent upon
whether UCHAR_MAX > INT_MAX.

Actually I'm not sure that it did. Note the part that you
quoted above that says, "the entire range of values is expressible
in a signed int." This implies UCHAR_MAX <= INT_MAX. (By
"values" I meant, "values of that type", not specific values in
any given program).

Regardless, if you wanted to provide more detail, it would be
done more usefully by doing so within the context, "here's why
this is true in this context, but note that other contexts
exist in which this doesn't hold, and here's why..." etc.

... These are called, "the usual arithmetic
conversions."

Actually, what you're talking about are the integer promotions. The
first step of the usual arithmetic conversions is to apply the integer
promotions to each operand, but then a few other things happen as well.

This is correct, but IMHO too far down into the weeds. Consider
section 6.3.1.1 para 3, which notes that, "the integer
promotions are applied only: ... 1. as part of the usual
arithmetic conversions".

The latest draft of the standard that I have is n3096.pdf, dated
2023-04-01.

As I mentioned, I'm looking at n3220. https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf

You may want to look at n3301, which seems to be the latest. https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3301.pdf

In that draft, that wording is not present in paragraph 3,
but only footnote 63, which is referenced by paragraph 2. That footnote
does not contain the "1." that is present in your citation. That
footnote goes on to list the other contexts in which integer promotions
can occur: ", to certain argument expressions, to the operands of the
unary +, -, and ~ operators, and to both operands of the shift
operators, as specified by their respective subclauses."
Which version are you quoting? I have copies of most of the draft
standards that are available for free, but none of the final versions of
the standards, since those cost money.

See above.

... Since we're talking about operands to
a binary operator, 6.3.1.8 applies. 6.3.1.8 is why converting
one side to unsigned is sufficient to get an unsigned result.

Calling them the usual arithmetic conversions rather than the integer
promotions is being unnecessarily vague. Your description only covers
the integer promotions, it doesn't cover any of the other usual
arithmetic conversions.

The integer promotions were the only relevant part in context.
Perhaps it would have allayed your concerns to say, "part of"?

I'm going to try to come up with an analogy; the best one I could come
up with on the spur of the moment involves the US federal income tax form
1040. It has a section called "Payments". The first four payments are
all amounts that have been withheld from various income sources before
you ever get a chance to spend the money. Most of the other payments are
things that you spent money on that you are entitled to take as a credit
against your tax liability.
What you've done is analogous to describing how withholding works, and
even using the term "withheld", and then referring to what you've just
described as "payments" rather than "amounts withheld", even though your
description doesn't fit the tax credits, which are the other kinds of
payments.

Sorry, I think this analogy is poor and unhelpful. See the
above references.

- Dan C.

From James Kuyper@jameskuyper@alumni.caltech.edu to comp.unix.programmer on Sat Nov 23 09:24:10 2024

From Newsgroup: comp.unix.programmer

On 11/23/24 00:09, Janis Papanagnou wrote:

On 23.11.2024 06:04, James Kuyper wrote:

...

I seem to recall that I was advised that I could use an older version of
Thunderbird to edit the list of buttons that are visible, something that
cannot be done in the latest version. I'm afraid to go back to an older
version, because so many of the updates to most of the software I own
consists of security fixes. I don't want to go back to an older, less
secure version of TB.

Understandable. - But you noticed that the "Smart Reply" thing was
a feature of my newer Thunderbird version? - I'd think it should be
there, or isn't it in your version? (What version are you running?)

115.16.0esr

I don't have a "More/Customize Toolbar" option, only "More/Customize",
and when I select it, there's only the following options:

Button Style
Show senders profile picture
Larger Profile Picture
Always show sender's full address
Hide Labels Column
Large Subject
Show all Headers

From James Kuyper@jameskuyper@alumni.caltech.edu to comp.unix.programmer on Sat Nov 23 10:22:19 2024

From Newsgroup: comp.unix.programmer

On 11/23/24 09:05, Dan Cross wrote:

In article <vhrmk1$1ivhr$1@dont-email.me>,
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

...

Your wording could easily give the false impression, to anyone who
didn't already know better, that promotion of unsigned char to signed
int is required by the standard, rather than it being dependent upon
whether UCHAR_MAX > INT_MAX.

Actually I'm not sure that it did. Note the part that you
quoted above that says, "the entire range values is expressible
in a signed int." This implies UCHAR_MAX <= INT_MAX. (By
"values" I meant, "values of that type", not specific values in
any given program).

You paired that assertion with "`unsigned char` has _rank_
lower than _int_", which is a fact guaranteed by the standard, which
could easily give the impression that the comment about the range of
values was either explicitly stated in the standard, or correctly
inferrable from the statement about the ranks.

...

... Since we're talking about operands to
a binary operator, 6.3.1.8 applies. 6.3.1.8 is why converting
one side to unsigned is sufficient to get an unsigned result.

Calling them the usual arithmetic conversions rather than the integer
promotions is being unnecessarily vague. Your description only covers
the integer promotions, it doesn't cover any of the other usual
arithmetic conversions.

The integer promotions were the only relevant part in context.
Perhaps it would have allayed your concerns to say, "part of"?

Partially. However, since you described the integer promotions, and the
description you provided didn't fit any other part of the usual
arithmetic conversions, and the part you described has its own special
name, and the integer promotions can occur without the usual arithmetic
conversions, it still seems to me more appropriate to say that what you
described were the integer promotions.

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Sat Nov 23 16:38:37 2024

From Newsgroup: comp.unix.programmer

This is amazing. Yet another response that was both emailed to
me _and_ posted to USENET.

In article <vhsrvb$1oct2$4@dont-email.me>,
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

[snip]

I'm not particularly interested in the technical content of your
article, which seems like it's trying to force a pedantic point
of interpretation without regard to the current standard (which
you admitted you didn't have a copy of) so I'm not going to
respond to that. Now you've admitted that your issue is that
someone _could_ misinterpret my statement. Ok, but this is
comp.unix.programmer, not comp.lang.c.

But I am interested in this repeated mistake of emailing and
then posting.

In another email sent to me, but curiously NOT posted to USENET,
you said that you were "trying your best" and that if your best
was not good enough, I should take steps on my end to discard
your emails. No. It's not my responsibility to deal with your
mistakes.

I find this troubling. I really don't care how long you've been
programming, whether or not APL was your third language 50 years
ago (appeal to length of experience is, of course, a logical
fallacy) or how long you used the prior version of your tool.
Three and a half years is a long time to learn your way around a
new user interface; excuses related to age and prior experience
just don't cut it. As a former Drill Instructor once told me on
Parris Island, "excuses are like assholes: everybody has one."

It's the height of arrogance to assume that your words are so
important that it's someone else's responsibility to account for
the fact that you don't care to post them correctly. How about,
instead of excuses and asking others to deal with your mistakes,
you take some personal responsibility for using your tools
competently? Sure, mistakes _do_ happen, but with you it seems
to happen more often than not. If you're the one continually
making the mistake, you ought to be owning that, not asking
others to take steps to deal with your failures in this regard.

Here are some ideas for you to consider:

1. If you email someone something you intended to post, why not
follow up? You could, for example, send a follow-up email
("oops, I accidentally sent that via email; sorry about that,
I'll post it instead").
2. Alternatively, you could acknowledge that in your USENET post
with a disclaimer ("this was accidentally mailed to the
respondee").
3. If you email someone when you intended to post, you could
exercise some discipline and simply not post, acknowledging
the mistake while assuming responsibility for it. Perhaps
that would encourage you to learn to use your tools
correctly.
4. If you cannot use your existing newsreader without making
this mistake so frequently, perhaps consider switching to
different software for your USENET consumption; maybe a
program that more accurately tracks your desired user
interface?
5. Perhaps make yourself a checklist of things to check before
responding; #1 on that might be, "am I sending this by email
or posting? Is that really what I want to do?"

I'm sure you can think of others.

And while you clearly have the ability to post whatever you like
to USENET, it is a choice to do so, and it sure seems you lack
the discipline and discretion to do so competently. Perhaps
consider stopping until you develop or refine the necessary
skills to do so without such frequent error.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.programmer on Sat Nov 23 20:14:55 2024

From Newsgroup: comp.unix.programmer

On 23.11.2024 15:24, James Kuyper wrote:

On 11/23/24 00:09, Janis Papanagnou wrote:

On 23.11.2024 06:04, James Kuyper wrote:

...

I seem to recall that I was advised that I could use an older version of
Thunderbird to edit the list of buttons that are visible, something that
cannot be done in the latest version. I'm afraid to go back to an older
version, because so many of the updates to most of the software I own
consists of security fixes. I don't want to go back to an older, less
secure version of TB.

Understandable. - But you noticed that the "Smart Reply" thing was
a feature of my newer Thunderbird version? - I'd think it should be
there, or isn't it in your version? (What version are you running?)

115.16.0esr

That's newer than my "newer" one. - So they changed interface again?!

Both TB versions I am using allowed some fix; none allowed it in an
obvious, acceptable form, let alone did it right in the first place.

I don't have a "More/Customize Toolbar" option, only "More/Customize",
[...]

Since you seem to be annoying someone here it's probably best to search
for a solution or workaround on the Net/Web for your TB version.

For the other (annoyed) poster I would just suggest "self-defense" in
the form of changing the email address to allow filtering. (I'm using
for Internet anyway a special spam-catch email base address, and also
added a "filter component" to that, but that's not supported by all
email systems.) - I'm sure that's not what you prefer; you'll have to
choose your path of least annoyance yourself.

Janis

--- Synchronet 3.21d-Linux NewsLink 1.2

From Ed Morton@mortonspam@gmail.com to comp.unix.shell,comp.unix.programmer,comp.lang.misc on Sat Nov 23 18:17:41 2024

From Newsgroup: comp.unix.programmer

On 11/20/2024 9:53 AM, Janis Papanagnou wrote:

On 20.11.2024 12:46, Ed Morton wrote:

Definitely. The most relevant statement about regexps is this:

Some people, when confronted with a problem, think "I know, I'll use
regular expressions." Now they have two problems.

(Worth a scribbling on a WC wall.)

Obviously regexps are very useful and commonplace but if you find you
have to use some online site or other tools to help you write/understand
one or just generally need more than a couple of minutes to
write/understand it then it's time to back off and figure out a better
way to write your code for the sake of whoever has to read it 6 months
later (and usually for robustness too as it's hard to be sure all rainy
day cases are handled correctly in a lengthy and/or complicated regexp).

Regexps are nothing for newbies.

The inherent fine thing with Regexps is that you can incrementally
compose them[*].[**]

It seems you haven't found a sensible way to work with them?
(And I'm really astonished about that since I know you worked with
Regexps for years if not decades.)

I have no problem working with regexps, I just don't write lengthy or
complicated regexps, just brief, simple BREs or EREs, and I don't
restrict myself to trying to solve problems with a single regexp.

In those cases where Regexps *are* the tool for a specific task -
I don't expect you to use them where they are inappropriate?! -

Right, I don't, but I see many people using them for tasks that could be
done more clearly and robustly if not done with a single regexp.

what would be the better solution[***] then?

It all depends on the problem. For example, if you need to match an
input string that must contain each of a, b, and c in any order then you
could do that in awk with this regexp or similar:

awk '/(a.*(b.*c|c.*b))|(b.*(a.*c|c.*a))|(c.*(a.*b|b.*a))/'

or you could do it with this condition comprised of regexp segments:

awk '/a/ && /b/ && /c/'

I would prefer the second solution as it's more concise and easier to
enhance (try adding "and d" to both).
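
The claim that the two conditions accept the same inputs can be checked by brute force. The following is only a sketch in Python rather than awk (chosen here so it is self-contained); the names SINGLE, has_all_single and has_all_split are illustrative and do not come from the post.

```python
import re
from itertools import product

# The single alternation from the post, spelling out every ordering of a, b, c.
SINGLE = re.compile(r"(a.*(b.*c|c.*b))|(b.*(a.*c|c.*a))|(c.*(a.*b|b.*a))")

def has_all_single(s):
    """One regexp that enumerates all six orderings."""
    return SINGLE.search(s) is not None

def has_all_split(s, needed="abc"):
    """One trivial regexp per required character, combined with a logical AND."""
    return all(re.search(re.escape(ch), s) for ch in needed)

# Compare both predicates on every string of length 0..4 over the alphabet "abcx".
for n in range(5):
    for chars in product("abcx", repeat=n):
        s = "".join(chars)
        assert has_all_single(s) == has_all_split(s), s
```

Extending the split form to "and d" is a one-argument change, has_all_split(s, "abcd"), whereas the single regexp would have to grow from 6 orderings to 24.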

As another example, someone on StackOverflow recently said they had
written the following regexp to isolate the last string before a set of
parens in a line that contains multiple such strings, some of them
nested, and they said it works in python:

^(?:^[^(]+$[^)]+$ $([^(]+)\([^)]+$\))|[^(]+$([^(]+)\([^)]+$,\s([^$]+)\([^)]+$\s$[^$]+\)\)|(?:(?:.*?)$(.*?)\(.*?$\))|(?:[^(]+$([^)]+)$)$

I personally wouldn't consider anything remotely as lengthy or
complicated as that regexp despite their assurances that it works, I'd
use this any-awk script or similar instead:

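{
    rec = $0
    # Repeatedly find an innermost "(...)" group in the working copy.
    while ( match(rec, /\([^()]*\)/) ) {
        # Take the text from $0 so that nested parens are still present.
        tgt = substr($0,RSTART+1,RLENGTH-2)
        # Mask this pair of parens with RS; rec keeps the same length as $0.
        rec = substr(rec,1,RSTART-1) RS substr(rec,RSTART+1,RLENGTH-2) \
              RS substr(rec,RSTART+RLENGTH)
    }
    # Drop any remaining nested groups from the last group found.
    gsub(/ *\([^()]*\) */, "", tgt)
    print tgt
}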

It's a bit more code but, unlike that regexp, anyone assigned to
maintain this code in future can tell what it does with just a little
thought (and maybe adding a debugging print in the loop if they aren't
very familiar with awk), can then be sure it does what is required and
nothing else, and could easily maintain/enhance it if necessary.
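
To see what the loop does step by step, here is a rough Python port of the same idea. It is a sketch only: the function name last_outer_group and the sample inputs are made up for illustration, and it assumes the regexp in the awk script is meant to match one innermost pair of parentheses.

```python
import re

# Matches one innermost parenthesised group: "(" + no parens + ")".
INNERMOST = re.compile(r"\([^()]*\)")

def last_outer_group(line, mask="\n"):
    """Mirror of the awk loop: mask innermost groups until none are left,
    remembering the original text of the most recently matched group."""
    rec = line   # working copy; always the same length as `line`
    tgt = ""
    while True:
        m = INNERMOST.search(rec)
        if m is None:
            break
        start, end = m.span()
        # Read from the untouched line so nested parens are still visible.
        tgt = line[start + 1:end - 1]
        # Replace "(" and ")" with a mask character, one for one.
        rec = rec[:start] + mask + rec[start + 1:end - 1] + mask + rec[end:]
    # Remove nested groups (and adjacent spaces) from the last group found.
    return re.sub(r" *\([^()]*\) *", "", tgt)

print(last_outer_group("foo(bar(baz), qux(quux))"))  # bar, qux
print(last_outer_group("a(b) c(d(e))"))              # d
```

Because each masking step swaps exactly one character for one character, offsets in the working copy stay valid in the original line, which is why the content can be read from the unmodified input.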

Ed.

Janis

[*] Like the corresponding FSMs.

[**] And you can also decompose them if they are merged in a huge
expression, too large for you to grasp it. (BTW, I'm doing such
decompositions also with other expressions in program code that
are too bulky.)

[***] Can you answer the question that another poster failed to do?

--- Synchronet 3.21d-Linux NewsLink 1.2

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.unix.programmer on Fri Dec 27 13:59:47 2024

From Newsgroup: comp.unix.programmer

Dan Cross <cross@spitfire.i.gajendra.net> wrote:

Here are some ideas for you to consider:

Sorry for replying to an old post. I do not even
remember who it is that keeps sending emails when
he intends to send follow-ups to Usenet. However,
I have a good solution.

Thunderbird is open source, maybe even free software.

I am sure it is extremely easy to either remove the
"reply" button or to bind it to the "followup" function.

Problem solved.

br,
KK
--- Synchronet 3.21d-Linux NewsLink 1.2

From gazelle@gazelle@shell.xmission.com (Kenny McCormack) to comp.unix.programmer on Fri Dec 27 14:35:52 2024

From Newsgroup: comp.unix.programmer

In article <vkmbsj$3kvjq$1@dont-email.me>,
Kalevi Kolttonen <kalevi@kolttonen.fi> wrote:
...

Thunderbird is open source, maybe even free software.

I am sure it is extremely easy to either remove the
"reply" button or to bind it to the "followup" function.

I doubt it is any simple matter to find it in the massive source code,
much less to figure out how to fix it, much less figure out how to
successfully re-build (and test!) it after making your changes.

Generally, most complicated GUI software, even if technically Open Source,
is difficult, if not impossible, for ordinary people using ordinary
machines, to re-build from source. Among other things, you will generally
end up spending many hours, if not days, getting all the dependencies
installed.
--
That's the Trump playbook. Every action by Trump or his supporters can
be categorized as one (or more) of:

outrageous, incompetent, or mentally ill.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Richard Kettlewell@invalid@invalid.invalid to comp.unix.programmer on Fri Dec 27 14:56:28 2024

From Newsgroup: comp.unix.programmer

gazelle@shell.xmission.com (Kenny McCormack) writes:

Kalevi Kolttonen <kalevi@kolttonen.fi> wrote:
...

Thunderbird is open source, maybe even free software.

I am sure it is extremely easy to either remove the
"reply" button or to bind it to the "followup" function.

I doubt it is any simple matter to find it in the massive source code,
much less to figure out how to fix it, much less figure out how to
successfully re-build (and test!) it after making your changes.

Generally, most complicated GUI software, even if technically Open
Source, is difficult, if not impossible, for ordinary people using
ordinary machines, to re-build from source. Among other things, you
will generally end up spending many hours, if not days, getting all
the dependencies installed.

On Debian-derived platforms, that's what apt-get build-dep is for.
Source package rebuild is also standardized. It looks like the RH world
has something pretty similar.

So I don't think the need to install build dependencies is likely to be
a real obstacle to rebuilding Thunderbird.
--
https://www.greenend.org.uk/rjk/
--- Synchronet 3.21d-Linux NewsLink 1.2

From Richard Kettlewell@invalid@invalid.invalid to comp.unix.programmer on Fri Dec 27 14:56:52 2024

From Newsgroup: comp.unix.programmer

kalevi@kolttonen.fi (Kalevi Kolttonen) writes:

Dan Cross <cross@spitfire.i.gajendra.net> wrote:

Here are some ideas for you to consider:

Sorry for replying to an old post. I do not even
remember who it is that keeps sending emails when
he intends to send follow-ups to Usenet. However,
I have a good solution.

Thunderbird is open source, maybe even free software.

I am sure it is extremely easy to either remove the
"reply" button or to bind it to the "followup" function.

Go on then?
--
https://www.greenend.org.uk/rjk/
--- Synchronet 3.21d-Linux NewsLink 1.2

From John Ames@commodorejohn@gmail.com to comp.unix.programmer on Fri Dec 27 07:43:17 2024

From Newsgroup: comp.unix.programmer

On Fri, 27 Dec 2024 13:59:47 -0000 (UTC)
kalevi@kolttonen.fi (Kalevi Kolttonen) wrote:

I am sure it is extremely easy to either remove the
"reply" button or to bind it to the "followup" function.

Have you, like, actually *checked* this assertion?

--- Synchronet 3.21d-Linux NewsLink 1.2

From gazelle@gazelle@shell.xmission.com (Kenny McCormack) to comp.unix.programmer on Fri Dec 27 16:14:20 2024

From Newsgroup: comp.unix.programmer

In article <wwvh66p9ntv.fsf@LkoBDZeT.terraraq.uk>,
Richard Kettlewell <invalid@invalid.invalid> wrote:
...

On Debian-derived platforms, that's what apt-get build-dep is for.
Source package rebuild is also standardized. It looks like the RH world
has something pretty similar.

I know all that - and, in theory, it should "just work".

But my experience is that theory and practice diverge.

Now, I may not be the most capable person in the world, probably not even
in the top 10 (or 100). But that's exactly my point. It's just not an
easy task for ordinary people under ordinary circumstances.
--
The randomly chosen signature file that would have appeared here is more than 4 lines long. As such, it violates one or more Usenet RFCs. In order to remain in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/Cancer
--- Synchronet 3.21d-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.programmer on Fri Dec 27 17:39:30 2024

From Newsgroup: comp.unix.programmer

John Ames <commodorejohn@gmail.com> writes:

On Fri, 27 Dec 2024 13:59:47 -0000 (UTC)
kalevi@kolttonen.fi (Kalevi Kolttonen) wrote:

I am sure it is extremely easy to either remove the
"reply" button or to bind it to the "followup" function.

Have you, like, actually *checked* this assertion?

Unlikely, to be sure.

I personally still use xrn, which _is_ completely customizable
via X11 resources.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Salvador Mirzo@smirzo@example.com to comp.unix.programmer on Fri Dec 27 15:07:59 2024

From Newsgroup: comp.unix.programmer

gazelle@shell.xmission.com (Kenny McCormack) writes:

In article <wwvh66p9ntv.fsf@LkoBDZeT.terraraq.uk>,
Richard Kettlewell <invalid@invalid.invalid> wrote:
...

On Debian-derived platforms, that's what apt-get build-dep is for.
Source package rebuild is also standardized. It looks like the RH world
has something pretty similar.

I know all that - and, in theory, it should "just work".

But my experience is that theory and practice diverge.

Now, I may not be the most capable person in the world, probably not even
in the top 10 (or 100). But that's exactly my point. It's just not an
easy task for ordinary people under ordinary circumstances.

If I were not full of tasks right now, I would set up a VM with Debian
and try it out---build-dep for Thunderbird. Just to see if it compiles
successfully without much hacking involved. I am also skeptical of such
things. It usually works on smaller projects; I'd be surprised and
happy to find out that it works with no hacking involved.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Paul@nospam@needed.invalid to comp.unix.programmer on Fri Dec 27 13:11:29 2024

From Newsgroup: comp.unix.programmer

On Fri, 12/27/2024 11:14 AM, Kenny McCormack wrote:

In article <wwvh66p9ntv.fsf@LkoBDZeT.terraraq.uk>,
Richard Kettlewell <invalid@invalid.invalid> wrote:
...

On Debian-derived platforms, that's what apt-get build-dep is for.
Source package rebuild is also standardized. It looks like the RH world
has something pretty similar.

I know all that - and, in theory, it should "just work".

But my experience is that theory and practice diverge.

Now, I may not be the most capable person in the world, probably not even
in the top 10 (or 100). But that's exactly my point. It's just not an
easy task for ordinary people under ordinary circumstances.

One solution is to not put email and USENET news in the same tool.

I hit the reply by accident the other day here, and when I clicked
Send, a dialog whined about "Could not send because..." I have no email
account set up on the thing. Of course it cannot erroneously send
to email, because there is no email. The Reply button then is neutered.
I changed over to Followup, sent to newsgroup, and finished the job.

IDK about capable, but Thunderbird is a huge package, and the
installation may require Rust and some other materials to be
loaded. The description for building, may have started with
a Mercurial (Hg) clone, and then the build is based on that.
The tarball off the site, would be insufficient for an immediate
build. The recipe no longer describes starting with a tarball.

As a consequence, the distro people might not have a "conventional"
setup for Thunderbird. One of the distros, a .deb arrives from
Mozilla in a cardboard box. As a result of the package
being manufactured elsewhere, instead of in-house in the
Buildmeister corral, the source option just might be missing.

Still, even without doing the actual build, it would be fun
to tease the distro and see what it has to offer, and see whether
a patched source magically shows up. It would be good to know
whether in "difficult cases", it actually arrives ready to be
built for you.

For some of these "big" projects, a machine with 32GB of RAM
is recommended. Which helps with the linking phase. Some
of the compiling, the RAM footprint isn't all that large.
The procedure used to have a hideous linkage methodology
at one time, but that got modified a bit. To make a 32-bit
copy of the thing, it is best to build on a 64-bit OS!
If your distro isn't 64-bit, it would be not-possible to finish.

While a lot of distros are 64-bit only, you can *still* find 32-bit ones.
But not for much longer. Building Thunderbird on the one with
the arrow below, that could be a waste of your time. I don't know if
some PAE thing would work or not in 32-bit mode. I'm using a
mirror here, because the main site isn't as easy to get around on.

https://mirror.csclub.uwaterloo.ca/linuxmint/debian/

lmde-6-cinnamon-32bit.iso    22-Sep-2023 17:07    2G    <===
lmde-6-cinnamon-64bit.iso    22-Sep-2023 16:26    3G
sha256sum.txt                01-Feb-2024 16:09    552
sha256sum.txt.gpg            01-Feb-2024 16:10    833

Paul
--- Synchronet 3.21d-Linux NewsLink 1.2

From Paul@nospam@needed.invalid to comp.unix.programmer on Fri Dec 27 13:15:47 2024

From Newsgroup: comp.unix.programmer

On Fri, 12/27/2024 10:43 AM, John Ames wrote:

On Fri, 27 Dec 2024 13:59:47 -0000 (UTC)
kalevi@kolttonen.fi (Kalevi Kolttonen) wrote:

I am sure it is extremely easy to either remove the
"reply" button or to bind it to the "followup" function.

Have you, like, actually *checked* this assertion?

The Followup button on the copy of Thunderbird I'm typing on,
is actually a menu. It's not just a button. This is why
this nonsense happens. It's a graphical trap, a hole in the
back yard for you to fall into while walking past.

Followup V <=== Normally, you are hitting this as if it is a button
Followup
Reply All
Reply <=== If you "smear" or "wipe" the Followup V at
the top, and you're not paying attention, THAT
is when you hit the Reply.

Paul
--- Synchronet 3.21d-Linux NewsLink 1.2

From Kaz Kylheku@643-408-1753@kylheku.com to comp.unix.programmer on Fri Dec 27 19:14:42 2024

From Newsgroup: comp.unix.programmer

On 2024-12-27, Kalevi Kolttonen <kalevi@kolttonen.fi> wrote:

Dan Cross <cross@spitfire.i.gajendra.net> wrote:

Here are some ideas for you to consider:

Sorry for replying to an old post. I do not even
remember who it is that keeps sending emails when
he intends to send follow-ups to Usenet. However,
I have a good solution.

Thunderbird is open source, maybe even free software.

Thunderbird is yet another mail program whose developers think that
Usenet is just like e-mail, and can be bolted onto a mail client.

Just use a fscking newsreader.

I've never sent an e-mail by accident in slrn.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
--- Synchronet 3.21d-Linux NewsLink 1.2

From Lawrence D'Oliveiro@ldo@nz.invalid to comp.unix.programmer on Fri Dec 27 23:09:40 2024

From Newsgroup: comp.unix.programmer

On Fri, 27 Dec 2024 15:07:59 -0300, Salvador Mirzo wrote:

If I were not full of tasks right now, I would set up a VM with Debian
and try it out---build-dep for Thunderbird. Just to see if it compiles
successfully without much hacking involved.

It will certainly work for (re)building the Debian package. Remember,
that's how they build Debian in the first place.

Also, you can try a container rather than a full VM.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Lawrence D'Oliveiro@ldo@nz.invalid to comp.unix.programmer on Fri Dec 27 23:11:32 2024

From Newsgroup: comp.unix.programmer

On Fri, 27 Dec 2024 14:56:28 +0000, Richard Kettlewell wrote:

On Debian-derived platforms, that's what apt-get build-dep is for.

Even if you want your build to be different from the Debian package,
apt-get build-dep will still likely get you 90% of the way to installing
all the needed dependencies.

I used this as a starting point for my Blender build. Now that is a
package with some complex dependencies.
--- Synchronet 3.21d-Linux NewsLink 1.2

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.unix.programmer on Fri Dec 27 23:22:17 2024

From Newsgroup: comp.unix.programmer

Salvador Mirzo <smirzo@example.com> wrote:

If I were not full of tasks right now, I would set up a VM with Debian
and try it out---build-dep for Thunderbird. Just to see if it compiles
successfully without much hacking involved. I am also skeptical of such
things. It usually works on smaller projects; I'd be surprised and
happy to find out that it works with no hacking involved.

No need to be skeptical, we live in modern ages
where things have been made quite convenient for us.
Compiling Thunderbird should be very easy indeed
when we use Linux distro's package management.

I run Fedora Linux 41 xfce spin and I love it. If my
memory serves right, so far I have performed the
following steps:

1) Download the Thunderbird source RPM
    dnf download --source thunderbird

2) Install the source RPM
    rpm -Uvh thunderbird-128.5.2-1.fc41.src.rpm

3) Bump release from 2 to 3
    vi ~/rpmbuild/SPECS/thunderbird.spec

4) Extract the tar.xz
    tar xJf thunderbird-128.5.2esr.source.tar.xz

5) Edit function MsgReplyMessage to contain just "return;"
    vi +/MsgReplyMessage thunderbird-128.5.2/comm/suite/mailnews/content/mailWindowOverlay.js

6) Recreate the tar.xz
    tar cJf thunderbird-128.5.2esr.source.tar.xz thunderbird-128.5.2

7) Install all RPM build dependencies, letting dnf do the heavy lifting
    dnf builddep ~/rpmbuild/SPECS/thunderbird.spec

8) Build Thunderbird binary RPM
    rpmbuild -bb ~/rpmbuild/SPECS/thunderbird.spec

Since Thunderbird is pretty huge, I am guessing that
the build will take some time to complete.

In Finland it is now 01:17 o'clock in the middle of the
night. Unfortunately I have to go to work tomorrow so
I must go to sleep at 2:00. I have no idea when the
build will be complete or whether my JavaScript hack
works or not.

br,
KK
--- Synchronet 3.21d-Linux NewsLink 1.2

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.unix.programmer on Fri Dec 27 23:22:53 2024

From Newsgroup: comp.unix.programmer

Richard Kettlewell <invalid@invalid.invalid> wrote:

kalevi@kolttonen.fi (Kalevi Kolttonen) writes:

Dan Cross <cross@spitfire.i.gajendra.net> wrote:

Here are some ideas for you to consider:

Sorry for replying to an old post. I do not even
remember who it is that keeps sending emails when
he intends to send follow-ups to Usenet. However,
I have a good solution.

Thunderbird is open source, maybe even free software.

I am sure it is extremely easy to either remove the
"reply" button or to bind it to the "followup" function.

Go on then?

Working on it.

br,
KK
--- Synchronet 3.21d-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.programmer on Sat Dec 28 00:38:23 2024

From Newsgroup: comp.unix.programmer

On 27.12.2024 20:14, Kaz Kylheku wrote:

[ Thunderbird issues ]

Thunderbird is yet another mail program whose developers think that
Usenet is just like e-mail, and can be bolted onto a mail client.

I may be using Thunderbird now for two decades or more, don't recall.

That it's a combined client (for email, news, chat, and some more)
with a common [basic] GUI is not the problem. Thunderbird has a lot
of (real, unnecessary) problems that the product managers obviously
failed to understand or see. (I won't even start here to enumerate
them.) And it makes things not easier if versions differ completely
in what can be done to try to fix some of these severe shortcomings.

The state of that product is IMO beyond all hope; it would require a
complete redesign to make its interface follow some basic ergonomic
principles (including all the possible configuration capabilities).
(Mileages may vary.)

I've fixed the Reply-button issue in the two (completely different)
versions I'm mostly using - but don't ask me how I did that! - and
I'm meanwhile thus fine with Thunderbird's newsreader component in
a single mail/news application - as long as I don't follow my urge
to fix anything else in this product.

Janis

[...]

--- Synchronet 3.21d-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.programmer on Sat Dec 28 00:44:10 2024

From Newsgroup: comp.unix.programmer

On 28.12.2024 00:22, Kalevi Kolttonen wrote:

[...]

No need to be skeptical, we live in modern ages
where things have been made quite convenient for us.

LOL. :-)

Compiling Thunderbird should be very easy indeed
when we use Linux distro's package management.

You expect _users_ of tools to use a _development_
environment to fix *inherent* shortcomings of a tool?
(Shortcomings that should not be there in the first
place!)

[ snip suggested 8-step development process ]

Janis

--- Synchronet 3.21d-Linux NewsLink 1.2

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.unix.programmer on Fri Dec 27 23:56:50 2024

From Newsgroup: comp.unix.programmer

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 28.12.2024 00:22, Kalevi Kolttonen wrote:

[...]

No need to be skeptical, we live in modern ages
where things have been made quite convenient for us.

LOL. :-)

My comment above was a reference to the bad old
days when you had to manually download tar.gz packages
and compile them to satisfy dependencies. Now the
builds are super easy with the help of package management.

Compiling Thunderbird should be very easy indeed
when we use Linux distro's package management.

You expect _users_ of tools to use a _development_
environment to fix *inherent* shortcomings of a tool?
(Shortcomings that should not be there in the first
place!)

Why would you think so? This is just one way to
solve the problem. I would never ever use TB
for anything. I have used a basic newsreader called
tin for over 30 years now. It works fine as I have
no need for any fancy features.

Unfortunately I could not complete my experiment.
The build process jammed my Lenovo Thinkpad so bad
that it was completely stuck, probably because
of build parallelism and memory hogging.

$ grep ^proc /proc/cpuinfo
processor : 0
processor : 1
processor : 2
processor : 3
processor : 4
processor : 5
processor : 6
processor : 7
processor : 8
processor : 9
processor : 10
processor : 11

$ grep ^vendor /proc/cpuinfo |head -1
vendor_id : AuthenticAMD

I got 16GB of RAM. Maybe the build parameters in the
spec file are too aggressive for this modest laptop,
but I did not find any "make -j" invocation.

Now I have to sleep and maybe try it again tomorrow.

br,
KK
--- Synchronet 3.21d-Linux NewsLink 1.2

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.unix.programmer on Sat Dec 28 00:11:48 2024

From Newsgroup: comp.unix.programmer

Kalevi Kolttonen <kalevi@kolttonen.fi> wrote:

I got 16GB of RAM. Maybe the build parameters in the
spec file are too aggressive for this modest laptop,
but I did not find any "make -j" invocation.

Now I have to sleep and maybe try it again tomorrow.

Still awake for a while. I looked at the spec file
and replaced:

./mach build -v

with

./mach build -j6 -v

Maybe in the morning we have a complete build.
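
A value like -j6 can be derived rather than guessed by capping the job count by both core count and available memory. The sketch below shows one way to do that; the per-job and reserved memory figures are assumptions picked for illustration, not measurements of the Thunderbird build, and pick_jobs is a made-up helper name.

```python
def pick_jobs(cores, ram_gb, gb_per_job=2.0, reserve_gb=4.0):
    """Return a parallel job count limited by CPU cores and by RAM.

    gb_per_job and reserve_gb are illustrative guesses, not measured values.
    """
    # How many jobs fit in memory after leaving some for the rest of the system.
    by_memory = int((ram_gb - reserve_gb) // gb_per_job)
    # Never exceed the core count, and always run at least one job.
    return max(1, min(cores, by_memory))

# The laptop described above: 12 logical processors, 16 GB of RAM.
print(pick_jobs(12, 16))  # 6
```

With these assumed figures the 12-processor, 16 GB laptop comes out at 6 jobs, which happens to match the value tried above; a heavier link step would call for a larger gb_per_job.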

br,
KK
--- Synchronet 3.21d-Linux NewsLink 1.2

From Salvador Mirzo@smirzo@example.com to comp.unix.programmer on Fri Dec 27 21:22:03 2024

From Newsgroup: comp.unix.programmer

kalevi@kolttonen.fi (Kalevi Kolttonen) writes:

Kalevi Kolttonen <kalevi@kolttonen.fi> wrote:

I got 16GB of RAM. Maybe the build parameters in the
spec file are too aggressive for this modest laptop,
but I did not find any "make -j" invocation.

Now I have to sleep and maybe try it again tomorrow.

Still awake for a while. I looked at the spec file
and replaced:

./mach build -v

with

./mach build -j6 -v

Maybe in the morning we have a complete build.

Thanks for doing the work and keeping us posted. I wish you a good
night and that the compilation completes just fine.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Lawrence D'Oliveiro@ldo@nz.invalid to comp.unix.programmer on Sat Dec 28 02:07:44 2024

From Newsgroup: comp.unix.programmer

On Sat, 28 Dec 2024 00:44:10 +0100, Janis Papanagnou wrote:

On 28.12.2024 00:22, Kalevi Kolttonen wrote:

Compiling Thunderbird should be very easy indeed when we use Linux
distro's package management.

You expect _users_ of tools to use a _development_ environment to fix
*inherent* shortcomings of a tool?

On Linux, there is no "development environment" versus "user environment".
The same packaging tools work with both source code and built binaries.

Platforms like Microsoft and Apple try to build a wall between two
separate modes of using the system; Linux doesn't.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.programmer on Sat Dec 28 19:27:07 2024

From Newsgroup: comp.unix.programmer

On 28.12.2024 00:56, Kalevi Kolttonen wrote:

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

On 28.12.2024 00:22, Kalevi Kolttonen wrote:

[...]

No need to be skeptical, we live in modern ages
where things have been made quite convenient for us.

LOL. :-)

My comment above was a reference to the bad old
days when you had to manually download tar.gz packages
and compile them to satisfy dependencies. Now the
builds are super easy with the help of package management.

I see; you were referring to the way the technical process
works.

Personally I don't think that package managers contribute
a lot since for ordinary users it's the same whether the
package managers install a binary package or a source that
is compiled under the hood. The difference is that source
package needs a development environment (compiler, etc.)
that "ordinary users" might not have installed or may not
want to get installed (just for that).

Compiling Thunderbird should be very easy indeed
when we use Linux distro's package management.

You expect _users_ of tools to use a _development_
environment to fix *inherent* shortcomings of a tool?
(Shortcomings that should not be there in the first
place!)

Why would you think so? This is just one way to
solve the problem. [...]

For a specific type of users. - The description you gave
was describing a development process; that's not something
that ordinary users would typically do (or want to do).

Your problem solving suggestion goes even farther with yet
more inherent issues that users of package managers might
not like (editing sources, bypassing standard installation
of regular updates with an own [temporary] version/branch).

Janis

--- Synchronet 3.21d-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.programmer on Sat Dec 28 19:40:46 2024

From Newsgroup: comp.unix.programmer

On 28.12.2024 03:07, Lawrence D'Oliveiro wrote:

On Sat, 28 Dec 2024 00:44:10 +0100, Janis Papanagnou wrote:

On 28.12.2024 00:22, Kalevi Kolttonen wrote:

Compiling Thunderbird should be very easy indeed when we use Linux
distro's package management.

You expect _users_ of tools to use a _development_ environment to fix
*inherent* shortcomings of a tool?

On Linux, there is no "development environment" versus "user environment".
The same packaging tools work with both source code and built binaries.

You think it's normal that on a Linux installation where, say, no 'cc'
(as prominent example of a development tool) is installed the package
manager would first install ALL the necessary compilers and scripting
languages just to create a binary (as opposed to just installing the
binary)? - I'm not sure I understood what you were saying or aiming at especially in context of what I had been saying.

For the Unix systems I worked with (commercial Unixes, Cygwin, Linux)
there was no development environment guaranteed to be in the default installation.

Janis

--- Synchronet 3.21d-Linux NewsLink 1.2

From James Kuyper@jameskuyper@alumni.caltech.edu to comp.unix.programmer on Sat Dec 28 14:26:22 2024

From Newsgroup: comp.unix.programmer

On 12/27/24 18:44, Janis Papanagnou wrote:

On 28.12.2024 00:22, Kalevi Kolttonen wrote:

[...]

No need to be skeptical, we live in modern ages
where things have been made quite convenient for us.

LOL. :-)

Compiling Thunderbird should be very easy indeed
when we use Linux distro's package management.

You expect _users_ of tools to use a _development_
environment to fix *inherent* shortcomings of a tool?
(Shortcomings that should not be there in the first
place!)

IIRC, this is in reference to my difficulty when Thunderbird changed the
Reply button to mean "Reply" rather than "Followup", and instead added a
new button that is labelled "Followup". I have never complained about
that change - it was an entirely sensible one. I'm just having trouble re-training myself to use the newer, more sensible interface in a few
years after spending a couple of decades using the older, less sensible
one. And I fully appreciate other people's irritation at my difficulty
with re-training.
I wouldn't mind if they reinstated the ability, which existed in older
versions of Thunderbird, to rearrange the list of buttons that are
displayed. I do complain about the removal of that customization
ability. I don't want to go back to those older versions because that
would mean undoing other improvements. I'm especially worried about
undoing security bug fixes.

I don't like the idea of creating my own personal version of Thunderbird
by modifying their source code, because it means I would have to re-do
the build every time they put out a new version. I want quick and easy
upgrades to newer versions, especially security bug fixes, and that
desire conflicts with the desire for customization.
--- Synchronet 3.21d-Linux NewsLink 1.2

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.unix.programmer on Sat Dec 28 19:48:51 2024

From Newsgroup: comp.unix.programmer

Kalevi Kolttonen <kalevi@kolttonen.fi> wrote:

./mach build -v

with

./mach build -j6 -v

Okay, I am finally back at home and it is 21:43
o'clock.

The change above did not help because earlier in the
spec file there is this:

# Require 4 GB of RAM per CPU core
%constrain_build -m 4096

I now have:

%constrain_build -m 1024
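
The comment in the spec file describes a simple rule: the number of
parallel jobs is limited by how much RAM each job is assumed to need.
A minimal sketch of that arithmetic follows. The function name
jobs_for_memory and its parameters are hypothetical, chosen only for
illustration, and this is not a copy of what the %constrain_build
macro does internally.

```shell
# Hypothetical helper (not rpm's own logic): cap parallel jobs by memory.
#   $1 = total RAM in MiB
#   $2 = MiB assumed to be needed per job
#   $3 = number of CPUs
jobs_for_memory() {
    total_mb=$1
    per_job_mb=$2
    ncpus=$3
    jobs=$(( total_mb / per_job_mb ))   # integer division rounds down
    if [ "$jobs" -gt "$ncpus" ]; then
        jobs=$ncpus                     # never more jobs than CPUs
    fi
    if [ "$jobs" -lt 1 ]; then
        jobs=1                          # always allow at least one job
    fi
    echo "$jobs"
}

# Example calls, assuming an 8-CPU machine:
#   jobs_for_memory 16384 4096 8    # prints 4
#   jobs_for_memory 16384 1024 8    # prints 8
```

Under these assumptions, lowering the per-job figure from 4096 to 1024
raises the job count rather than lowering it, so on a machine that is
short of RAM the explicit job limit matters more than this setting.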

But I am still very baffled. I tried to build TB
without any source code modifications to make sure
that the building process with rpmbuild works okay.

Instead of success, my build has terminated with:

RPM build errors:
No patch number 416
No patch number 419
Bad exit status from /var/tmp/rpm-tmp.N27puv (%build)

I cannot understand this because all references to
patches 416 and 419 are commented out in the Fedora
41 spec file. I now completely removed them and will
try again...

br,
KK
--- Synchronet 3.21d-Linux NewsLink 1.2

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.unix.programmer on Sat Dec 28 20:30:07 2024

From Newsgroup: comp.unix.programmer

Kalevi Kolttonen <kalevi@kolttonen.fi> wrote:

I cannot understand this because all references to
patches 416 and 419 are commented out in the Fedora
41 spec file. I now completely removed them and will
try again...

I am having massive problems with having only 16GB of
RAM.

Using 'top', I was able to see that Rust compiler 'rustc'
was hogging something like 11GB of memory, and then
after a while OOM killer got rid of the Rust compiler
process. I am also seeing swapping take place when I
attempt the build.

It is really painful but I guess I have to use
just a single CPU:

./mach build -j1 -v

Because of this, the build takes forever to complete.

br,
KK
--- Synchronet 3.21d-Linux NewsLink 1.2

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.unix.programmer on Sat Dec 28 21:07:19 2024

From Newsgroup: comp.unix.programmer

Kalevi Kolttonen <kalevi@kolttonen.fi> wrote:

It is really painful but I guess I have to use
just a single CPU:

./mach build -j1 -v

Because of this, the build takes forever to complete.

Despite that, rpmbuild was using three CPUS. Now trying
to add this too:

RPM_BUILD_NCPUS=1 rpmbuild -bb ~/rpmbuild/SPECS/thunderbird.spec

The build is in progress again and things look better now:

$ grep MOZ_MAKE_FLAGS /var/tmp/rpm-tmp.5ctL0k
echo "mk_add_options MOZ_MAKE_FLAGS=\"-j1\"" >> .mozconfig

Without explicit RPM_BUILD_NCPUS=1, rpmbuild defaulted to
3.
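
The grep above can be turned into a small check that prints only the
job count found in a generated build script. This is a sketch under
the assumption that the script contains a MOZ_MAKE_FLAGS line of the
form shown above; moz_jobs_from_script is a hypothetical name.

```shell
# Hypothetical helper: read a generated build script on stdin and print
# the N from the -jN value on its MOZ_MAKE_FLAGS line (prints nothing
# if there is no such line).
moz_jobs_from_script() {
    sed -n 's/.*MOZ_MAKE_FLAGS=.*-j\([0-9][0-9]*\).*/\1/p'
}

# Usage, with the temporary script name from the post above:
#   moz_jobs_from_script < /var/tmp/rpm-tmp.5ctL0k
```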

br,
KK
--- Synchronet 3.21d-Linux NewsLink 1.2

From Lawrence D'Oliveiro@ldo@nz.invalid to comp.unix.programmer on Sat Dec 28 23:00:32 2024

From Newsgroup: comp.unix.programmer

On Sat, 28 Dec 2024 19:40:46 +0100, Janis Papanagnou wrote:

You think it's normal that on a Linux installation where, say, no 'cc'
(as prominent example of a development tool) is installed the package
manager would first install ALL the necessary compilers and scripting languages just to create a binary (as opposed to just installing the
binary)?

The discussion has to do with creating your own version of the binary,
rather than using the repo-provided version.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Lawrence D'Oliveiro@ldo@nz.invalid to comp.unix.programmer on Sat Dec 28 23:03:59 2024

From Newsgroup: comp.unix.programmer

On Sat, 28 Dec 2024 19:27:07 +0100, Janis Papanagnou wrote:

Personally I don't think that package managers contribute a lot since
for ordinary users it's the same whether the package managers install a binary package or a source that is compiled under the hood.

Package managers contribute a lot to both tasks. On Linux, there is no "development environment" versus "user environment". The same packaging
tools work with both source code and built binaries.

The difference is that source package needs a development
environment (compiler, etc.) that "ordinary users" might not have
installed or may not want to get installed (just for that).

Platforms like Microsoft and Apple try to build a wall between two
separate "developer" versus "ordinary user" modes of using the system; Linux doesn't.
--- Synchronet 3.21d-Linux NewsLink 1.2

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.unix.programmer on Sat Dec 28 23:32:37 2024

From Newsgroup: comp.unix.programmer

Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

On Sat, 28 Dec 2024 19:40:46 +0100, Janis Papanagnou wrote:

You think it's normal that on a Linux installation where, say, no 'cc'
(as prominent example of a development tool) is installed the package
manager would first install ALL the necessary compilers and scripting
languages just to create a binary (as opposed to just installing the
binary)?

The discussion has to do with creating your own version of the binary, rather than using the repo-provided version.

Right.

Anyway, to be honest, I never realized how bloated Thunderbird is.
The source RPM thunderbird-128.5.2-1.fc41.src.rpm is 690MB and
the main source directory unpacked is:

~/tmp/tb/thunderbird-128.5.2 $ du -sh
4.2G .

Building TB with the help of a pre-made spec file on Fedora is
probably very much easier than doing 'git clone' and trying to
build it from there. Using 'dnf', it was just one command to
download all the dependencies. I suppose the size of the
dependency packages was 260MB in total. It would be a nightmare
having to download them manually and then building them.

Packages are just so handy.
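
For readers who want to repeat this, the usual sequence for rebuilding
from a source RPM can be sketched as a dry run. The function below only
prints the commands and runs nothing; plan_srpm_rebuild is a
hypothetical name, and the assumption is that the source RPM is already
downloaded and that "rpm -i" unpacks it into ~/rpmbuild.

```shell
# Hypothetical dry-run helper: print the commands for rebuilding a
# package from its source RPM, without executing any of them.
#   $1 = path to the .src.rpm
#   $2 = path to the spec file it installs
plan_srpm_rebuild() {
    srpm=$1
    spec=$2
    printf '%s\n' \
        "rpm -i $srpm" \
        "sudo dnf builddep $spec" \
        "rpmbuild -bb $spec"
}

# Usage:
#   plan_srpm_rebuild thunderbird-128.5.2-1.fc41.src.rpm \
#       ~/rpmbuild/SPECS/thunderbird.spec
```

The second printed command is the one step that pulls in all build
dependencies listed in the spec file.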

Fedora and Red Hat have already done the hard work so it
is wise to use their source RPM as a basis for your own
modifications when you are on Fedora or Red Hat Enterprise
Linux.

My single CPU Thunderbird build has now lasted for over two
and a half hours and I have no clue when it will be ready.

This codebase is absolutely massive! I am beginning to
lose patience.

br,
KK
--- Synchronet 3.21d-Linux NewsLink 1.2

From Grant Taylor@gtaylor@tnetconsulting.net to comp.unix.programmer on Sat Dec 28 19:02:48 2024

From Newsgroup: comp.unix.programmer

On 12/28/24 17:32, Kalevi Kolttonen wrote:

This codebase is absolutely massive! I am beginning to lose patience.

I remember when compiling X11 was a good burn-in test for a processor.

Then it was the Linux kernel.

Now it's Thunderbird / Firefox, followed by Chromium, followed by Rust.
--
Grant. . . .
--- Synchronet 3.21d-Linux NewsLink 1.2

From Paul@nospam@needed.invalid to comp.unix.programmer on Sat Dec 28 21:12:38 2024

From Newsgroup: comp.unix.programmer

On Sat, 12/28/2024 6:32 PM, Kalevi Kolttonen wrote:

Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

On Sat, 28 Dec 2024 19:40:46 +0100, Janis Papanagnou wrote:

You think it's normal that on a Linux installation where, say, no 'cc'
(as prominent example of a development tool) is installed the package
manager would first install ALL the necessary compilers and scripting
languages just to create a binary (as opposed to just installing the
binary)?

The discussion has to do with creating your own version of the binary,
rather than using the repo-provided version.

Right.

Anyway, to be honest, I never realized how bloated Thunderbird is.
The source RPM thunderbird-128.5.2-1.fc41.src.rpm is 690MB and
the main source directory unpacked is:

~/tmp/tb/thunderbird-128.5.2 $ du -sh
4.2G .

Building TB with the help of a pre-made spec file on Fedora is
probably very much easier than doing 'git clone' and trying to
build it from there. Using 'dnf', it was just one command to
download all the dependencies. I suppose the size of the
dependency packages was 260MB in total. It would be a nightmare
having to download them manually and then building them.

Packages are just so handy.

Fedora and Red Hat have already done the hard work so it
is wise to use their source RPM as a basis for your own
modifications when you are on Fedora or Red Hat Enterprise
Linux.

My single CPU Thunderbird build has now lasted for over two
and a half hours and I have no clue when it will be ready.

This codebase is absolutely massive! I am beginning to
lose patience.

br,
KK

This is one of the bigger things that you could attempt to build.

In the old days, there would have been almost zero chance
of you finishing the linking phase (XUL linking). Today,
they've fixed up how linking is done, so you might
actually finish the build without incident.

Keep a copy of "top" open, so you can watch the RAM consumption
during linking.
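
If watching "top" by hand is not practical, a rough alternative is to
sample the resident set size from ps and keep the largest value. This
is only a sketch: largest_rss_mib is a hypothetical name, and the usage
line assumes the procps version of ps found on Linux.

```shell
# Hypothetical helper: read RSS values in KiB (one per line) on stdin
# and print the largest one converted to whole MiB.
largest_rss_mib() {
    sort -n | tail -n 1 | awk '{ printf "%d\n", $1 / 1024 }'
}

# Usage on Linux, for all running processes named rustc:
#   ps -C rustc -o rss= | largest_rss_mib
```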

Paul
--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@dastardlyhq.com to comp.unix.programmer on Sun Dec 29 09:50:55 2024

From Newsgroup: comp.unix.programmer

On Sat, 28 Dec 2024 20:30:07 -0000 (UTC)
kalevi@kolttonen.fi (Kalevi Kolttonen) gabbled:

Kalevi Kolttonen <kalevi@kolttonen.fi> wrote:

I cannot understand this because all references to
patches 416 and 419 are commented out in the Fedora
41 spec file. I now completely removed them and will
try again...

I am having massive problems with having only 16GB of
RAM.

Using 'top', I was able to see that Rust compiler 'rustc'
was hogging something like 11GB of memory, and then
after a while OOM killer got rid of the Rust compiler
process. I am also seeing swapping take place when I
attempt the build.

Why TF is the rust compiler involved in the process at all?

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@dastardlyhq.com to comp.unix.programmer on Sun Dec 29 09:54:52 2024

From Newsgroup: comp.unix.programmer

On Sat, 28 Dec 2024 23:32:37 -0000 (UTC)
kalevi@kolttonen.fi (Kalevi Kolttonen) gabbled:

Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

On Sat, 28 Dec 2024 19:40:46 +0100, Janis Papanagnou wrote:

You think it's normal that on a Linux installation where, say, no 'cc'
(as prominent example of a development tool) is installed the package
manager would first install ALL the necessary compilers and scripting
languages just to create a binary (as opposed to just installing the
binary)?

The discussion has to do with creating your own version of the binary,
rather than using the repo-provided version.

Right.

Anyway, to be honest, I never realized how bloated Thunderbird is.
The source RPM thunderbird-128.5.2-1.fc41.src.rpm is 690MB and
the main source directory unpacked is:

~/tmp/tb/thunderbird-128.5.2 $ du -sh
4.2G .

Welcome to Lego brick style programming where the main application devs are incompetent halfwits unable to implement even simple things themselves beyond designing (if that applies to Thunderbird) a GUI so have to import 101 libraries
to do everything for them.

I've written my own newsreader system and while admittedly it's command line only it requires only one 3rd party library which is OpenSSL.

--- Synchronet 3.21d-Linux NewsLink 1.2

From gazelle@gazelle@shell.xmission.com (Kenny McCormack) to comp.unix.programmer on Sun Dec 29 10:33:43 2024

From Newsgroup: comp.unix.programmer

In article <vkr61v$srrk$1@dont-email.me>, <Muttley@dastardlyhq.com> wrote:
...

Why TF is the rust compiler involved in the process at all?

Well, obviously, because parts of TB are (apparently) written in Rust...

Rust seems to be, like Python, trying to ingratiate itself into the basic running of the system, not just be a peripheral "scripting language".
--
"It does a lot of things half well and it's just a garbage heap of ideas that are
mutually exclusive."

- Ken Thompson, on C++ -
--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@dastardlyhq.com to comp.unix.programmer on Sun Dec 29 10:38:28 2024

From Newsgroup: comp.unix.programmer

On Sun, 29 Dec 2024 10:33:43 -0000 (UTC)
gazelle@shell.xmission.com (Kenny McCormack) gabbled:

In article <vkr61v$srrk$1@dont-email.me>, <Muttley@dastardlyhq.com> wrote:
...

Why TF is the rust compiler involved in the process at all?

Well, obviously, because parts of TB are (apparently) written in Rust...

Christ, whose stupid idea was that?

Rust seems to be, like Python, trying to ingratiate itself into the basic
running of the system, not just be a peripheral "scripting language".

Requiring 2 separate compilers to build anything is an absurdity.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Paul@nospam@needed.invalid to comp.unix.programmer on Sun Dec 29 07:39:42 2024

From Newsgroup: comp.unix.programmer

On Sun, 12/29/2024 4:54 AM, Muttley@dastardlyhq.com wrote:

On Sat, 28 Dec 2024 23:32:37 -0000 (UTC)
kalevi@kolttonen.fi (Kalevi Kolttonen) gabbled:

Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

On Sat, 28 Dec 2024 19:40:46 +0100, Janis Papanagnou wrote:

You think it's normal that on a Linux installation where, say, no 'cc'
(as prominent example of a development tool) is installed the package
manager would first install ALL the necessary compilers and scripting
languages just to create a binary (as opposed to just installing the
binary)?

The discussion has to do with creating your own version of the binary,
rather than using the repo-provided version.

Right.

Anyway, to be honest, I never realized how bloated Thunderbird is.
The source RPM thunderbird-128.5.2-1.fc41.src.rpm is 690MB and
the main source directory unpacked is:

~/tmp/tb/thunderbird-128.5.2 $ du -sh
4.2G .

Welcome to Lego brick style programming where the main application devs are incompetent halfwits unable to implement even simple things themselves beyond designing (if that applies to Thunderbird) a GUI so have to import 101 libraries
to do everything for them.

I've written my own newsreader system and while admittedly it's command line only it requires only one 3rd party library which is OpenSSL.

You know that Thunderbird uses the source code of Firefox
to build XUL.so, which is the rendering engine for the interface.

Thunderbird is the demo app for XUL. It's not really
a product, it was partially a creation which was
intended to show how XUL could provide a web render
engine for another software package.

As a result, a relatively small amount of code, implements
News and Email functions. At least 90% of the vast volume
of files in the tarball, is a copy of the Firefox source.

The Thunderbird build tree, even has an option to "just build Firefox".
This is a means of proving the Firefox portion of the tree was
not damaged by staff during tree preparation.

The Thunderbird program, tries to restrict just exactly
how much of Firefox is used for "browsing". If there is a
URL in an email message, Thunderbird would prefer to call
the platform browser (whatever it is) to handle the URL.
But if the Thunderbird staff want to put up an appeal for
donations, using a Mozilla-hosted web page, that part uses
the Firefox code inside Thunderbird, for rendering. This is a
security consideration, an analysis and action on the attack
surface available. Notice that Thunderbird does not do
"Quantum" the way Firefox does, so it doesn't have exactly
the same security precautions. And that's because the
browser portion, is not really intended for "general browsing".

The Thunderbird GUI is a three-pane view. If the render engine
fails, the pane view turns yellow and there is a reference
on the screen "to an XML file". It is that XML file, which
draws the three pane view and populates it with decorations.
When you see that yellow failure condition, that's your
chance to verify exactly how the product works. The graphics
are not drawn with Athena widgets. The graphics are a demo
of what the XUL shared library can do for you.

Is the whole thing obscene ? Yes. You won't find too many
software creations, this distorted. Still, people are using it.
Most people are not aware what is under the hood. It's
a herd of elephants :-)

Paul
--- Synchronet 3.21d-Linux NewsLink 1.2

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.unix.programmer on Sun Dec 29 13:07:35 2024

From Newsgroup: comp.unix.programmer

Muttley@dastardlyhq.com wrote:

Why TF is the rust compiler involved in the process at all?

TB codebase is a mix of C++, Rust and JavaScript.

Without my hack, the build process took four hours to complete
and it produced a working TB. However, with my tiny JavaScript
modification, the build failed.

Because these builds take four hours, I have to admit defeat.
I simply do not have the time to make more modification
attempts.

What is more, James Kuyper said that he does not want to
build his own TB so it was all in vain anyway.

br,
KK
--- Synchronet 3.21d-Linux NewsLink 1.2

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.unix.programmer on Sun Dec 29 14:09:40 2024

From Newsgroup: comp.unix.programmer

Kalevi Kolttonen <kalevi@kolttonen.fi> wrote:

Because these builds take four hours, I have to admit defeat.
I simply do not have the time to make more modification
attempts.

Well, I did give it one more go and this happened:

Dec 29 16:03:10 14-5A-FC-31-E8-67 kernel: [ pid ] uid tgid total_vm rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
Dec 29 16:03:10 14-5A-FC-31-E8-67 kernel: [ 1058] 998 1058 4009 225 32 193 0 73728 160 -900 systemd-oomd
Dec 29 16:03:10 14-5A-FC-31-E8-67 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/session-2.scope,task=rustc,pid=244790,uid=1004
Dec 29 16:03:10 14-5A-FC-31-E8-67 kernel: Out of memory: Killed process 244790 (rustc) total-vm:16524360kB, anon-rss:9185412kB, file-rss:448kB, shmem-rss:0kB, UID:1004 pgtables:30152kB oom_score_adj:0
Dec 29 16:03:13 14-5A-FC-31-E8-67 kernel: oom_reaper: reaped process 244790 (rustc), now anon-rss:212kB, file-rss:448kB, shmem-rss:0kB

That was a single CPU build but it still failed.
My hardware just does not cut it with monstrous
builds like TB.
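
Kernel log lines like the ones above can be reduced to the essentials
with a short filter. This is a sketch that assumes the "Out of memory:
Killed process" line format shown in the log above; oom_summary is a
hypothetical name.

```shell
# Hypothetical helper: read kernel log lines on stdin and, for each
# "Out of memory: Killed process" line, print
#   <process name> <pid> <anon-rss in kB>
oom_summary() {
    sed -n 's/.*Out of memory: Killed process \([0-9]*\) (\([^)]*\)).*anon-rss:\([0-9]*\)kB.*/\2 \1 \3/p'
}

# Usage (journalctl -k shows kernel messages on systemd systems):
#   journalctl -k | oom_summary
```

Fed the log above, this prints "rustc 244790 9185412", that is, rustc
held roughly 8.8 GiB of anonymous memory when it was killed.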

br,
KK
--- Synchronet 3.21d-Linux NewsLink 1.2

From gazelle@gazelle@shell.xmission.com (Kenny McCormack) to comp.unix.programmer on Sun Dec 29 14:32:18 2024

From Newsgroup: comp.unix.programmer

In article <vkrfue$vl1b$1@dont-email.me>, Paul <nospam@needed.invalid> wrote:
...

Is the whole thing obscene ? Yes. You won't find too many
software creations, this distorted. Still, people are using it.
Most people are not aware what is under the hood. It's
a herd of elephants :-)

Just out of curiosity, does all of this apply to the Windows version as well?

I know this thread is mostly about the Linux version, and although I
actually don't use TB at all, I know someone who uses the Windows version.
--
I've learned that people will forget what you said, people will forget
what you did, but people will never forget how you made them feel.

- Maya Angelou -
--- Synchronet 3.21d-Linux NewsLink 1.2

From Eric Pozharski@apple.universe@posteo.net to comp.unix.programmer on Sun Dec 29 17:56:34 2024

From Newsgroup: comp.unix.programmer

with <vkpkn3$ga7s$1@dont-email.me> Kalevi Kolttonen wrote:

Kalevi Kolttonen <kalevi@kolttonen.fi> wrote:

./mach build -v
with
./mach build -j6 -v

Okay, I am finally back at home and it is 21:43 o'clock.

*SKIP* [ 14 lines 1 level deep]

I cannot understand this because all references to patches 416 and 419
are commented out in the Fedora 41 spec file. I now completely removed
them and will try again...

Yay! The joy of building redhat. Expect your build dependencies being inadequate, missing, or plainly wrong. Just saying.

p.s. No, I'm not enjoying your pain. Just to make things clear -- about
two decades ago I've upgraded RH5.1-modified to something RH6 (IIRC)
then added RH7.3 to the mix. Slackware way (rpmbuild wasn't an option
because look-ma-no-interwebs).

p.p.s. Then in The Moment of Clarity I realised that I was making a
mistake. And immediately made another.
--
Torvalds' goal for Linux is very simple: World Domination
Stallman's goal for GNU is even simpler: Freedom
--- Synchronet 3.21d-Linux NewsLink 1.2

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.unix.programmer on Sun Dec 29 18:59:48 2024

From Newsgroup: comp.unix.programmer

Eric Pozharski <apple.universe@posteo.net> wrote:

Yay! The joy of building redhat. Expect your
build dependencies being inadequate, missing,
or plainly wrong. Just saying.

After some minor spec file tweaking, I managed to do
*one* successful TB build, but because Rust compiler can
hog almost 16GB of memory, most of the time I just
cannot build TB using my modest Lenovo laptop. OOM
killer kicks in and destroys the build.

I never could have believed that having 16GB of
RAM and 8GB of swap is not enough for building TB!
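
To see how much memory the kernel really has to work with before
starting such a build, the totals can be read from /proc/meminfo. This
is a sketch that assumes the Linux meminfo format with values in kB;
ram_plus_swap_gib is a hypothetical name.

```shell
# Hypothetical helper: read /proc/meminfo-style text on stdin and print
# MemTotal + SwapTotal in whole GiB (rounded down).
ram_plus_swap_gib() {
    awk '/^(MemTotal|SwapTotal):/ { kb += $2 }
         END { printf "%d\n", kb / 1048576 }'
}

# Usage on Linux:
#   ram_plus_swap_gib < /proc/meminfo
```

The result on a real machine is usually a little below the nominal
figure, because MemTotal excludes memory reserved by firmware and the
kernel.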

br,
KK
--- Synchronet 3.21d-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.programmer on Sun Dec 29 21:10:30 2024

From Newsgroup: comp.unix.programmer

On 28.12.2024 20:26, James Kuyper wrote:

On 12/27/24 18:44, Janis Papanagnou wrote:

On 28.12.2024 00:22, Kalevi Kolttonen wrote:

Compiling Thunderbird should be very easy indeed
when we use Linux distro's package management.

You expect _users_ of tools to use a _development_
environment to fix *inherent* shortcomings of a tool?
(Shortcomings that should not be there in the first
place!)

IIRC, this is in reference to my difficulty when Thunderbird changed the Reply button to mean "Reply" rather than "Followup", and instead added a
new button that is labelled "Followup". I have never complained about
that change - it was an entirely sensible one. I'm just having trouble re-training myself to use the newer, more sensible interface in a few
years after spending a couple of decades using the older, less sensible
one. And I fully appreciate other people's irritation at my difficulty
with re-training.
I wouldn't mind if they reinstated the ability, which existed in older versions of Thunderbird, to rearrange the list of buttons that are
displayed. I do complain about the removal of that customization
ability. I don't want to go back to those older versions because that
would mean undoing other improvements. I'm especially worried about
undoing security bug fixes.

The post didn't contain a reference to your case (but I also had it
still in mind). My reply was based mainly on own experiences with
TB (and with experience in software development, software ergonomics,
and system environments in principle).

I do understand the "re-training" aspect. - Been there. It was so
annoying (to me) that I was desperately seeking a way to fix it on
the user-interface level (and finally [somehow] succeeded in some
[non-obvious] way).

A feature to rearrange buttons (as being present in some former TB
releases) is not something that I'd consider to be a sensible user
interface for application software for several reasons.[*] - Here
we might be disagreeing on what should be part of a user interface
and what should be defined in a sensible way in a predefined form
that matches the application case, and not polluting the interface.

I understand well that you don't want to go back to older versions.
Myself I also don't want to go forward if that means that I have to
buy some change that results in inferior software behavior; but it
happens, sadly.[**]

I don't like the idea of creating my own personal version of Thunderbird
by modifying their source code, because it means I would have to re-do
the build every time they put out a new version. I want quick and easy upgrades to newer versions, especially security bug fixes, and that
desire conflicts with the desire for customization.

Exactly, that's one reason; against a system-wide replacement.[***]

Janis

[*] Given that I meanwhile see tons of followups on this thread and
the [in this NG] well known effect that even small "BTW-statements"
are leading to tapeworm-threads with much heat and little substance
I'll not extend on that here; with minimum software experience and
an open-minded thinking it should be obvious anyway.

[**] That's why I was amused by the other poster's "[...] modern ages
where things have been made quite convenient for us."

[***] You could of course create a separate version in /usr/local and
define your PATH appropriately, but you still would have to keep track
of newer changes, e.g. those security fixes that you are concerned
about.
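
The PATH part of that footnote can be sketched as follows. The
directory name in the usage line is made up for illustration, and
path_with_prefix is a hypothetical helper, not something shipped with
any distribution.

```shell
# Hypothetical helper: print a PATH value in which directory $1 comes
# first, so programs in it shadow system-wide ones of the same name.
#   $1 = directory to put first
#   $2 = current PATH value
path_with_prefix() {
    dir=$1
    path=$2
    case "$path" in
        "$dir"|"$dir":*) echo "$path" ;;        # already first: unchanged
        *)               echo "$dir:$path" ;;   # otherwise prepend
    esac
}

# Usage, with a made-up install directory:
#   PATH=$(path_with_prefix /usr/local/thunderbird-custom/bin "$PATH")
#   export PATH
```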

--- Synchronet 3.21d-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.programmer on Sun Dec 29 21:45:06 2024

From Newsgroup: comp.unix.programmer

On 29.12.2024 11:38, Muttley@dastardlyhq.com wrote:

On Sun, 29 Dec 2024 10:33:43 -0000 (UTC)
gazelle@shell.xmission.com (Kenny McCormack) gabbled:

[...]

Rust seems to be, like Python, trying to ingratiate itself into the basic
running of the system, not just be a peripheral "scripting language".

Requiring 2 separate compilers to build anything is an absurdity.

(Disclaimer: I skipped most of the sub-thread, so if that generalizing
sentence was addressing some peculiar (maybe even TB-related) software specialities you may ignore the rest of my post.)

From my experience it's no "absurdity" but actual (sensible) normality
to use multiple compilers and other software generators in SW-projects.

It seems that depends on the software architecture. It's (IMO) fine to
create libraries that are combined in an "anything" to be compiled with
the (at the time of their creation) most appropriate compiler. It's
also fine if you use a second language as a higher-level intermediate
language. Also if you create the "anything" based on several components
(or subsystems) that are combined. Using separate protocol compilers is
also not uncommon to get the transfer objects and functions. Also using
own compilers for the accompanying parts like documentation is typical.
(All these examples just off the top of my head from some professional
projects that I observed or was engaged with.)

Janis

--- Synchronet 3.21d-Linux NewsLink 1.2

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.programmer on Sun Dec 29 21:55:04 2024

From Newsgroup: comp.unix.programmer

On 29.12.2024 13:39, Paul wrote:

On Sun, 12/29/2024 4:54 AM, Muttley@dastardlyhq.com wrote:

[...]

I've written my own newsreader system and while admittedly it's command line
only it requires only one 3rd party library which is OpenSSL.

That's a fine property. - Too bad one has to write one's own piece
of software to get a more sensibly defined product.

[ snip explanations of some TB internals ]

Very interesting.

Is the whole thing obscene ? Yes. You won't find too many
software creations, this distorted. Still, people are using it.
Most people are not aware what is under the hood. It's
a herd of elephants :-)

You get some impression of what's under the hood if you try to fix
that [supposedly] primitive issue with the Reply-button. But it's
(software design wise) actually even worse than I'd have imagined.

Janis

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.programmer on Sun Dec 29 22:03:19 2024

From Newsgroup: comp.unix.programmer

On 29.12.2024 15:32, Kenny McCormack wrote:

In article <vkrfue$vl1b$1@dont-email.me>, Paul <nospam@needed.invalid> wrote:
...

Is the whole thing obscene ? Yes. You won't find too many
software creations, this distorted. Still, people are using it.
Most people are not aware what is under the hood. It's
a herd of elephants :-)

Just out of curiosity, does all of this apply to the Windows version as well?

I know this thread is mostly about the Linux version, and although I
actually don't use TB at all, I know someone who uses the Windows version.

I've used the Windows version quite some time ago. I can only say from
a user's perspective that it was similar to use; maybe the menus were
organized a bit differently (memories are faint). (Can't say anything
about the/any "under the hood obscenities" on that platform.)

Janis

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.unix.programmer on Sun Dec 29 22:07:03 2024

From Newsgroup: comp.unix.programmer

On 29.12.2024 21:55, Janis Papanagnou wrote:

On Sun, 12/29/2024 4:54 AM, Muttley@dastardlyhq.com wrote:

[...]

I've written my own newsreader system and while admittedly it's command line
only it requires only one 3rd party library which is OpenSSL.

That's a fine property. - Too bad one has to write one's own piece
of software to get a more sensibly defined product.

I forgot to ask; wasn't it an option to use any existing text oriented newsreader (nn, rtin, ...)?

(I recall I liked 'nn' a lot back in those days.)

Janis

From James Kuyper@jameskuyper@alumni.caltech.edu to comp.unix.programmer on Sun Dec 29 16:41:01 2024

From Newsgroup: comp.unix.programmer

On 12/29/24 08:07, Kalevi Kolttonen wrote:
...

Without my hack, the build process took four hours to complete
and it produced a working TB. However, with my tiny JavaScript
modification, the build failed.

Because these builds take four hours, I have to admit defeat.
I simply do not have the time to make more modification
attempts.

What is more, James Kuyper said that he does not want to
build his own TB so it was all in vain anyway.

Your efforts were not entirely wasted - your (lack of) results make me
even less willing to build my own. :-}

From Richard Kettlewell@invalid@invalid.invalid to comp.unix.programmer on Sun Dec 29 23:01:18 2024

From Newsgroup: comp.unix.programmer

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

Muttley@dastardlyhq.com wrote:

gazelle@shell.xmission.com (Kenny McCormack) gabbled:

Rust seems to be, like Python, trying to ingratiate itself into the
basic running of the system, not just be a peripheral "scripting
language".

Requiring 2 separate compilers to build anything is an absurdity.

(Disclaimer: I skipped most of the sub-thread, so if that generalizing sentence was addressing some peculiar (maybe even TB-related) software specialities you may ignore the rest of my post.)

From my experience it's no "absurdity" but actual (sensible) normality
to use multiple compilers and other software generators in SW-projects.

Agreed.

Thunderbird is not a surprising place to find some Rust; Mozilla
sponsored Rust in the hope of escaping the memory safety issues of
C/C++.

It seems that depends on the software architecture. It's (IMO) fine to
create libraries that are combined in an "anything" to be compiled
with the (at the time of their creation) most appropriate
compiler. It's also fine if you use a second language as a
higher-level intermediate language. Also if you create the "anything"
based on several components (or subsystems) that are combined. Using
separate protocol compilers is also not uncommon to get the transfer
objects and functions. Also using own compilers for the accompanying
parts like documentation is typical. (All these examples just off the
top of my head from some professional projects that I observed or was
engaged with.)

Off the top of my head there are at least twelve languages in my current
employer's codebase. More if you count things like documentation markup.
--
https://www.greenend.org.uk/rjk/

From Paul@nospam@needed.invalid to comp.unix.programmer on Sun Dec 29 19:49:10 2024

From Newsgroup: comp.unix.programmer

On Sun, 12/29/2024 9:32 AM, Kenny McCormack wrote:

In article <vkrfue$vl1b$1@dont-email.me>, Paul <nospam@needed.invalid> wrote:
...

Is the whole thing obscene ? Yes. You won't find too many
software creations, this distorted. Still, people are using it.
Most people are not aware what is under the hood. It's
a herd of elephants :-)

Just out of curiosity, does all of this apply to the Windows version as well?

I know this thread is mostly about the Linux version, and although I
actually don't use TB at all, I know someone who uses the Windows version.

It's a FOSS software that compiles on multiple platforms.

Just as Firefox (which is most of the code inside after all),
is FOSS software that compiles on multiple platforms.
There's even a Firefox.dmg for example, for a Mac computer.
I don't keep track of how many platforms it supports.

One way to do this, is to, say, use OpenGL for graphics, as
OpenGL was available in lots of places. But, they don't do
that, not exactly. On Windows, the Google ANGLE library is
used, which implements OpenGL ES on top of Direct3D, in effect an
emulation of OpenGL. And later, Google may have added WebGL or something.
The Firefox graphics runs at 20% speed on Windows, compared to
Linux, and it has something to do with the different means
of getting a working WebGL. There could have been support provided
by graphics card drivers, a more direct path, but they didn't use that.

In fact, the Mozilla graphics designer, is more than a bit annoyed
about just how many graphics standards and APIs that ended up supported.
Any notion of Keeping It Simple, went out the window long ago.
I'm impressed it works as well as it does.

Like the design of the iceberg, the news and email code is
the 10% that floats above the water line. While the huge mass
of cross-platform-ready code underneath for Firefox, does the
rendering.

If you have ever examined the tarball for a copy of Firefox
or Thunderbird, you will develop new respect for it. In the
sense that, somehow, a team of people corralled 400,000 files
of various types and made something that sorta works out of it.
How many projects do you know of, that have 400,000 files in the
tree ? Many of the files are test benches, for detecting
regressions when minor code changes are made.

One day, I was sick of line ending problems, so I made a little
project out of converting (400K files) to something common I could use.
Before doing this, I did a scan with the Linux "file" command first,
to get a declaration of the couple text file formats I was expecting.
When I sorted all the declarations found, there were *100 text file formats*
in the tree. For one particular file, if you change the line endings
in any way, it triggers a bug in the compiler, and you don't get
your build. And that's what I mean by the herd of elephants thing,
there are extensive amounts of excrement down there, and don't
step in it. It's real easy to think you can kick the tree around, when
it doesn't actually accept abuse as a tree.
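The kind of survey described above can be approximated without the "file" command. What follows is a minimal sketch, not the procedure Paul used: the function names `classify_line_endings` and `survey_tree` are made up for illustration, and treating any file containing a NUL byte as binary is a crude assumption. It only counts line-ending styles and deliberately changes nothing on disk.

```python
import os
from collections import Counter


def classify_line_endings(data: bytes) -> str:
    """Return 'crlf', 'lf', 'cr', 'mixed', or 'none' for a byte string."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf      # bare LF only
    cr = data.count(b"\r") - crlf      # bare CR only
    kinds = [name for name, n in (("crlf", crlf), ("lf", lf), ("cr", cr)) if n]
    if not kinds:
        return "none"
    return kinds[0] if len(kinds) == 1 else "mixed"


def survey_tree(root: str) -> Counter:
    """Count regular files under root by line-ending style."""
    counts = Counter()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):   # skip broken symlinks and the like
                continue
            with open(path, "rb") as fh:
                data = fh.read()
            if b"\0" in data:              # crude binary heuristic (assumption)
                counts["binary"] += 1
            else:
                counts[classify_line_endings(data)] += 1
    return counts
```

Calling `survey_tree(".")` in an unpacked source tree returns a `Counter` keyed by style; files reported as "mixed" are probably the ones to leave alone first.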

One day, I used Visual Studio, and a debug build, to single-step
Firefox through a Print routine. As IDE windows opened and closed,
I noticed I had traversed three source files, source files which
modified some common print settings, but not the exact same set of
common settings. It seemed there were three routines running
sequentially, and presumably the last one executing, was the "latest version". The two moribund versions of code, having never been removed.

And that's how you manage 400,000 files in a tree. Careful
where you step!

Paul

From Salvador Mirzo@smirzo@example.com to comp.unix.programmer on Sun Dec 29 22:19:32 2024

From Newsgroup: comp.unix.programmer

kalevi@kolttonen.fi (Kalevi Kolttonen) writes:

Eric Pozharski <apple.universe@posteo.net> wrote:

Yay! The joy of building redhat. Expect your
build dependencies being inadequate, missing,
or plainly wrong. Just saying.

After some minor spec file tweaking, I managed to do
*one* successful TB build, but because Rust compiler can
hog almost 16GB of memory, most of the time I just
cannot build TB using my modest Lenovo laptop. OOM
killer kicks in and destroys the build.

I never could have believed that having 16GB of
RAM and 8GB of swap is not enough for building TB!

You did it. Thanks for sharing the experience.

From Muttley@Muttley@dastardlyhq.com to comp.unix.programmer on Mon Dec 30 09:35:54 2024

From Newsgroup: comp.unix.programmer

On Sun, 29 Dec 2024 21:45:06 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> gabbled:

On 29.12.2024 11:38, Muttley@dastardlyhq.com wrote:

On Sun, 29 Dec 2024 10:33:43 -0000 (UTC)
gazelle@shell.xmission.com (Kenny McCormack) gabbled:

[...]

Rust seems to be, like Python, trying to ingratiate itself into the basic
running of the system, not just be a peripheral "scripting language".

Requiring 2 separate compilers to build anything is an absurdity.

(Disclaimer: I skipped most of the sub-thread, so if that generalizing
sentence was addressing some peculiar (maybe even TB-related) software
specialities you may ignore the rest of my post.)

From my experience it's no "absurdity" but actual (sensible) normality
to use multiple compilers and other software generators in SW-projects.

Umm no, it really isn't. At least not for the actual compilers. Boilerplate code generators sure.

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.unix.programmer on Mon Dec 30 19:31:58 2024

From Newsgroup: comp.unix.programmer

Salvador Mirzo <smirzo@example.com> wrote:

kalevi@kolttonen.fi (Kalevi Kolttonen) writes:

Eric Pozharski <apple.universe@posteo.net> wrote:

Yay! The joy of building redhat. Expect your
build dependencies being inadequate, missing,
or plainly wrong. Just saying.

After some minor spec file tweaking, I managed to do
*one* successful TB build, but because Rust compiler can
hog almost 16GB of memory, most of the time I just
cannot build TB using my modest Lenovo laptop. OOM
killer kicks in and destroys the build.

I never could have believed that having 16GB of
RAM and 8GB of swap is not enough for building TB!

You did it. Thanks for sharing the experience.

With some incredible luck, it worked out *once*. :-)

br,
KK

From Salvador Mirzo@smirzo@example.com to comp.unix.programmer on Mon Dec 30 18:10:22 2024

From Newsgroup: comp.unix.programmer

kalevi@kolttonen.fi (Kalevi Kolttonen) writes:

Salvador Mirzo <smirzo@example.com> wrote:

kalevi@kolttonen.fi (Kalevi Kolttonen) writes:

Eric Pozharski <apple.universe@posteo.net> wrote:

Yay! The joy of building redhat. Expect your
build dependencies being inadequate, missing,
or plainly wrong. Just saying.

After some minor spec file tweaking, I managed to do
*one* successful TB build, but because Rust compiler can
hog almost 16GB of memory, most of the time I just
cannot build TB using my modest Lenovo laptop. OOM
killer kicks in and destroys the build.

I never could have believed that having 16GB of
RAM and 8GB of swap is not enough for building TB!

You did it. Thanks for sharing the experience.

With some incredible luck, it worked out *once*. :-)

That also explains why some people were skeptical here. Even with a sophisticated system to make the compilation succeed, it's still not
without a thrill.

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.unix.programmer on Mon Dec 30 23:11:40 2024

From Newsgroup: comp.unix.programmer

Salvador Mirzo <smirzo@example.com> wrote:

That also explains why some people were skeptical here. Even with a sophisticated system to make the compilation succeed, it's still not
without a thrill.

The modifications I made to the spec file were in fact
quite trivial. The compilation would have succeeded with
32GB of RAM. Skepticism was not really warranted because
anyone with little Fedora/Red Hat experience could have
done what I did.

Others here have access to more powerful machines than
I do, so they can finish this task if they want to.

I am a bit curious whether my JavaScript hack works
or not.

br,
KK

From Paul@nospam@needed.invalid to comp.unix.programmer on Thu Jan 2 03:40:25 2025

From Newsgroup: comp.unix.programmer

On Sun, 12/29/2024 1:59 PM, Kalevi Kolttonen wrote:

Eric Pozharski <apple.universe@posteo.net> wrote:

Yay! The joy of building redhat. Expect your
build dependencies being inadequate, missing,
or plainly wrong. Just saying.

After some minor spec file tweaking, I managed to do
*one* successful TB build, but because Rust compiler can
hog almost 16GB of memory, most of the time I just
cannot build TB using my modest Lenovo laptop. OOM
killer kicks in and destroys the build.

I never could have believed that having 16GB of
RAM and 8GB of swap is not enough for building TB!

br,
KK

Try a chain saw next time :-)

It's one of the first practical tests the machine got.
The RPMBuild phase was awfully slow (it spoiled the fun).
But the compiles and linking behaved well. RPM compression
seems to run on one core.

[Picture] During compile phase...

https://i.postimg.cc/44qRrgxb/Thunderbird-Fedora41-Build-From-Source-via-Mock.gif

I think it's possible the build slows down, the longer it runs.
Like "something" is fragmenting.

Even on this machine, the process does not encourage
interactive operation. It takes too long. Adjusting the
command a bit so it just compiles and links, would be
better, if that's possible.

mock --resultdir=/tmp/results --rootdir=/tmp/mock --rebuild thunderbird-128.5.2-1.fc41.src.rpm | tee /tmp/build_out.txt

I picked Fedora for the job, because it only takes two commands
in a terminal, to do it. In simplified terms...

dnf download --source packagename # Downloading source doesn't need root.
mock --rebuild packagename # User account belongs to "mock" group, doesn't build as root

But I need to do something else to that Mock command,
to get what I want (a "portable" copy of Thunderbird,
there should be a dir created with that sitting in it).

Summary: No question, a bit of RAM helps. Some of the RAM
accounting in Linux is just weird (process resident
seen rising, graph in system monitor remains flat).
I was expecting to see a "hump" while linking, but
the graph was relatively flat and featureless.
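One way to cross-check a system-monitor graph is to ask the kernel for a high-water mark instead of sampling. The sketch below is only an illustration under stated assumptions: `run_and_report_peak_rss` is a hypothetical helper name, and `ru_maxrss` for `RUSAGE_CHILDREN` covers only descendants that have been waited for, so it may not capture everything a containerised build does.

```python
import resource
import subprocess
import sys


def run_and_report_peak_rss(argv):
    """Run argv to completion; return (exit_status, peak_rss_of_children).

    ru_maxrss is the largest resident set size seen among waited-for
    children so far: kilobytes on Linux, bytes on macOS. Being a
    high-water mark, it records a brief spike that a sampled graph
    could miss.
    """
    status = subprocess.run(argv).returncode
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return status, usage.ru_maxrss


if __name__ == "__main__":
    # Harmless stand-in command; substitute the real build command as needed.
    print(run_and_report_peak_rss([sys.executable, "-c", "pass"]))
```

Because the value is per process rather than a sum over the whole process tree, it says how big the hungriest single compiler or linker process became, which is arguably the number that matters for the OOM killer.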

Paul

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.programmer on Thu Jan 2 16:29:48 2025

From Newsgroup: comp.unix.programmer

Paul <nospam@needed.invalid> writes:

On Sun, 12/29/2024 1:59 PM, Kalevi Kolttonen wrote:

Eric Pozharski <apple.universe@posteo.net> wrote:

Yay! The joy of building redhat. Expect your
build dependencies being inadequate, missing,
or plainly wrong. Just saying.

After some minor spec file tweaking, I managed to do
*one* successful TB build, but because Rust compiler can
hog almost 16GB of memory, most of the time I just
cannot build TB using my modest Lenovo laptop. OOM
killer kicks in and destroys the build.

I never could have believed that having 16GB of
RAM and 8GB of swap is not enough for building TB!

br,
KK

Try a chain saw next time :-)

It's one of the first practical tests the machine got.
The RPMBuild phase was awfully slow (it spoiled the fun).
But the compiles and linking behaved well. RPM compression
seems to run on one core.

[Picture] During compile phase...

https://i.postimg.cc/44qRrgxb/Thunderbird-Fedora41-Build-From-Source-via-Mock.gif

I think it's possible the build slows down, the longer it runs.
Like "something" is fragmenting.

Even on this machine, the process does not encourage
interactive operation. It takes too long. Adjusting the
command a bit so it just compiles and links, would be
better, if that's possible.

mock --resultdir=/tmp/results --rootdir=/tmp/mock --rebuild thunderbird-128.5.2-1.fc41.src.rpm | tee /tmp/build_out.txt

I picked Fedora for the job, because it only takes two commands
in a terminal, to do it. In simplified terms...

dnf download --source packagename # Downloading source doesn't need root.
mock --rebuild packagename # User account belongs to "mock" group, doesn't build as root

But I need to do something else to that Mock command,
to get what I want (a "portable" copy of Thunderbird,
there should be a dir created with that sitting in it).

Summary: No question, a bit of RAM helps. Some of the RAM
accounting in Linux is just weird (process resident
seen rising, graph in system monitor remains flat).
I was expecting to see a "hump" while linking, but
the graph was relatively flat and featureless.

Paul

Why would you expect the link step to require a lot of
memory? The linker builds an elf executable from the contents
of ELF object files, one ELF section at a time. It doesn't
construct the entire ELF executable in memory before writing it out.

From Paul@nospam@needed.invalid to comp.unix.programmer on Thu Jan 2 19:36:50 2025

From Newsgroup: comp.unix.programmer

On Thu, 1/2/2025 11:29 AM, Scott Lurndal wrote:

Why would you expect the link step to require a lot of
memory? The linker builds an elf executable from the contents
of ELF object files, one ELF section at a time. It doesn't
construct the entire ELF executable in memory before writing it out.

It's based on experience, not imagination.

I've built Thunderbird on both Windows and Linux.
It was the Windows build that left a bad taste.
Once you repeatedly have build failures during linking,
you are always looking for it.

I've built Thunderbird multiple times over the years.
At one time, there was a nasty "ramp" in memory consumption
visible while I was building in Windows XP, and using
the Visual Studio compiler and linker, as instructed
by the Mozilla build page for Thunderbird. I just follow
the recipe when doing these.

The builds today take a lot more RAM than back then.

[Picture]

https://i.postimg.cc/85bRBYpX/buckets-of-ram.gif

Paul

From Lawrence D'Oliveiro@ldo@nz.invalid to comp.unix.programmer on Fri Jan 3 02:55:03 2025

From Newsgroup: comp.unix.programmer

On Thu, 2 Jan 2025 19:36:50 -0500, Paul wrote:

I've built Thunderbird on both Windows and Linux.
It was the Windows build that left a bad taste.
Once you repeatedly have build failures during linking, you are always looking for it.

Yours is not the only experience. I recall a blog post from the
LibreOffice folks, soon after they forked off from OpenOffice, that they tended to suffer intermittent unexplainable build failures on Windows,
too.

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.programmer on Fri Jan 3 18:15:25 2025

From Newsgroup: comp.unix.programmer

Paul <nospam@needed.invalid> writes:

On Thu, 1/2/2025 11:29 AM, Scott Lurndal wrote:

Why would you expect the link step to require a lot of
memory? The linker builds an elf executable from the contents
of ELF object files, one ELF section at a time. It doesn't
construct the entire ELF executable in memory before writing it out.

It's based on experience, not imagination.

I've built Thunderbird on both Windows and Linux.
It was the Windows build that left a bad taste.
Once you repeatedly have build failures during linking,
you are always looking for it.

Ah, well windows. You need not elaborate.

I've been fortunate to have never built software in a
microsoft environment (aside an optical jukebox driver
for NT3.51 once on a contract job - even then I did
all the editing on unix and just compiled and tested
on the windows box).

From Muttley@Muttley@dastardlyhq.com to comp.unix.programmer on Sat Jan 4 10:12:51 2025

From Newsgroup: comp.unix.programmer

On Fri, 03 Jan 2025 18:15:25 GMT
scott@slp53.sl.home (Scott Lurndal) gabbled:

Paul <nospam@needed.invalid> writes:

On Thu, 1/2/2025 11:29 AM, Scott Lurndal wrote:

Why would you expect the link step to require a lot of
memory? The linker builds an elf executable from the contents
of ELF object files, one ELF section at a time. It doesn't
construct the entire ELF executable in memory before writing it out.

It's based on experience, not imagination.

I've built Thunderbird on both Windows and Linux.
It was the Windows build that left a bad taste.
Once you repeatedly have build failures during linking,
you are always looking for it.

Ah, well windows. You need not elaborate.

I've been fortunate to have never built software in a
microsoft environment (aside an optical jukebox driver
for NT3.51 once on a contract job - even then I did
all the editing on unix and just compiled and tested
on the windows box).

I did a Windows C++ job for a year. I still can't believe how complicated Visual Studio (2017 IIRC) made the most basic things such as setting library and include paths which were buried 2 or 3 levels down in some sub menu not
to mention all the "project" BS which forced a certain structure on to your code filesystem layout which I didn't particularly want. Also the fact that console and GUI apps require a totally different project setup and boiler plate
code from the start is just mind boggling.

From Salvador Mirzo@smirzo@example.com to comp.unix.programmer on Sat Jan 4 08:31:05 2025

From Newsgroup: comp.unix.programmer

Muttley@dastardlyhq.com writes:

On Fri, 03 Jan 2025 18:15:25 GMT
scott@slp53.sl.home (Scott Lurndal) gabbled:

Paul <nospam@needed.invalid> writes:

On Thu, 1/2/2025 11:29 AM, Scott Lurndal wrote:

Why would you expect the link step to require a lot of
memory? The linker builds an elf executable from the contents
of ELF object files, one ELF section at a time. It doesn't
construct the entire ELF executable in memory before writing it out.

It's based on experience, not imagination.

I've built Thunderbird on both Windows and Linux.
It was the Windows build that left a bad taste.
Once you repeatedly have build failures during linking,
you are always looking for it.

Ah, well windows. You need not elaborate.

I've been fortunate to have never built software in a
microsoft environment (aside an optical jukebox driver
for NT3.51 once on a contract job - even then I did
all the editing on unix and just compiled and tested
on the windows box).

I did a Windows C++ job for a year. I still can't believe how complicated Visual Studio (2017 IIRC) made the most basic things such as setting library and include paths which were buried 2 or 3 levels down in some sub menu not to mention all the "project" BS which forced a certain structure on to your code filesystem layout which I didn't particularly want. Also the fact that console and GUI apps require a totally different project setup and boiler plate
code from the start is just mind boggling.

They always try to make things pretty and easy to use, but you end up
with that. I think the only way to tolerate that is to be born and
raised in such thing. Modularization is likely the most important thing
in programming and it's hard to minimally praise Microsoft on
modularization. For instance, is there any Windows software that
handles a TCP connection in an accept-fork-exec fashion?
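For readers unfamiliar with the term, the accept-fork-exec pattern mentioned above can be sketched roughly as follows on a Unix system. This is a minimal illustration, not production code: `serve_one` and `demo` are hypothetical names, the handler program is assumed to be `cat`, and a real server would loop and reap children asynchronously instead of waiting for each one.

```python
import os
import socket


def serve_one(listener: socket.socket, argv: list) -> int:
    """Accept one connection, fork, and exec argv with the socket as stdin/stdout.

    Returns the child's exit code, or -1 if it was killed by a signal.
    """
    conn, _addr = listener.accept()
    pid = os.fork()
    if pid == 0:                       # child process
        try:
            os.dup2(conn.fileno(), 0)  # the socket becomes stdin
            os.dup2(conn.fileno(), 1)  # ... and stdout
            os.execvp(argv[0], argv)   # replace the child with the handler
        finally:
            os._exit(127)              # only reached if exec failed
    conn.close()                       # parent: the child owns the connection now
    _pid, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


def demo() -> bytes:
    """Loopback demo: the handler is `cat`, so the client gets its bytes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))        # port 0: kernel picks a free port
        listener.listen(1)
        with socket.create_connection(listener.getsockname()) as client:
            client.sendall(b"hello\n")
            client.shutdown(socket.SHUT_WR)    # send EOF so `cat` terminates
            assert serve_one(listener, ["cat"]) == 0
            return client.recv(1024)


if __name__ == "__main__":
    print(demo())
```

The point of the pattern is that the handler needs no networking code at all: it reads stdin and writes stdout, and any filter program can be dropped in as a service.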

From Muttley@Muttley@dastardlyhq.com to comp.unix.programmer on Sat Jan 4 11:40:18 2025

From Newsgroup: comp.unix.programmer

On Sat, 04 Jan 2025 08:31:05 -0300
Salvador Mirzo <smirzo@example.com> gabbled:

Muttley@dastardlyhq.com writes:

and include paths which were buried 2 or 3 levels down in some sub menu not
to mention all the "project" BS which forced a certain structure on to your
code filesystem layout which I didn't particularly want. Also the fact that
console and GUI apps require a totally different project setup and boiler plate
code from the start is just mind boggling.

They always try to make things pretty and easy to use, but you end up
with that. I think the only way to tolerate that is to be born and
raised in such thing. Modularization is likely the most important thing
in programming and it's hard to minimally praise Microsoft on
modularization. For instance, is there any Windows software that
handles a TCP connection in an accept-fork-exec fashion?

Win32 can't do fork so that's a no. Also IIRC sockets in windows aren't a
simple file descriptor and can't be multiplexed, at least not with normal
file descriptors so select() and poll() are essentially useless so IIRC you
have to spawn a thread or use windows message or some such overcomplicated
nonsense to wait on them. Hopeless.
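For comparison, the Unix behaviour being alluded to, where one select() call waits on a socket and an ordinary descriptor together, looks roughly like this. It is a small sketch for Unix-like systems only; `ready_descriptors` is a hypothetical name and the pipe merely stands in for "a normal file descriptor".

```python
import os
import select
import socket


def ready_descriptors(timeout: float = 1.0) -> list:
    """Wait on a pipe and a socket with a single select() call (Unix only)."""
    pipe_r, pipe_w = os.pipe()
    sock_a, sock_b = socket.socketpair()
    try:
        os.write(pipe_w, b"from pipe")
        sock_b.sendall(b"from socket")
        # Both are plain integer descriptors here, so one call covers them.
        names = {pipe_r: "pipe", sock_a.fileno(): "socket"}
        readable, _, _ = select.select(list(names), [], [], timeout)
        return sorted(names[fd] for fd in readable)
    finally:
        os.close(pipe_r)
        os.close(pipe_w)
        sock_a.close()
        sock_b.close()


if __name__ == "__main__":
    print(ready_descriptors())
```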

From Lawrence D'Oliveiro@ldo@nz.invalid to comp.unix.programmer on Sat Jan 4 22:13:05 2025

From Newsgroup: comp.unix.programmer

On Sat, 04 Jan 2025 08:31:05 -0300, Salvador Mirzo wrote:

For instance, is there any Windows software that
handles a TCP connection in an accept-fork-exec fashion?

Almost certainly not. Because process creation is an expensive operation
on Windows.

Windows NT was masterminded by Dave Cutler, who was previously responsible
for the VMS OS at his previous employer, DEC. He was a Unix-hater, part of
a bunch of them at DEC. They would instinctively turn away from Unix ways
of doing things, like forking multiple processes. So the systems they
created did not encourage such techniques.

From Salvador Mirzo@smirzo@example.com to comp.unix.programmer on Sat Jan 4 19:17:15 2025

From Newsgroup: comp.unix.programmer

Lawrence D'Oliveiro <ldo@nz.invalid> writes:

On Sat, 04 Jan 2025 08:31:05 -0300, Salvador Mirzo wrote:

For instance, is there any Windows software that
handles a TCP connection in an accept-fork-exec fashion?

Almost certainly not. Because process creation is an expensive operation
on Windows.

Windows NT was masterminded by Dave Cutler, who was previously responsible for the VMS OS at his previous employer, DEC. He was a Unix-hater, part of
a bunch of them at DEC. They would instinctively turn away from Unix ways
of doing things, like forking multiple processes. So the systems they created did not encourage such techniques.

Is that Dave with a YouTube channel?

From Lawrence D'Oliveiro@ldo@nz.invalid to comp.unix.programmer on Sun Jan 5 00:47:19 2025

From Newsgroup: comp.unix.programmer

On Sat, 04 Jan 2025 19:17:15 -0300, Salvador Mirzo wrote:

Lawrence D'Oliveiro <ldo@nz.invalid> writes:

Windows NT was masterminded by Dave Cutler ...

Is that Dave with a YouTube channel?

No, that's a different former Microsoftie, but he has had Cutler on his
channel for an extended interview.

I found it ironic that there was a PiDP-11, I think it was, placed within
arm's reach behind the guy during the entire interview. You know, the
PDP-11 emulator that runs on a Linux-based Raspberry Pi. I wonder if the
Unix-hater ever noticed that ...

From Muttley@Muttley@dastardlyhq.com to comp.unix.programmer on Sun Jan 5 16:40:33 2025

From Newsgroup: comp.unix.programmer

On Sat, 4 Jan 2025 22:13:05 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> gabbled:

On Sat, 04 Jan 2025 08:31:05 -0300, Salvador Mirzo wrote:

For instance, is there any Windows software that
handles a TCP connection in an accept-fork-exec fashion?

Almost certainly not. Because process creation is an expensive operation
on Windows.

Windows NT was masterminded by Dave Cutler, who was previously responsible
for the VMS OS at his previous employer, DEC. He was a Unix-hater, part of
a bunch of them at DEC. They would instinctively turn away from Unix ways
of doing things, like forking multiple processes. So the systems they
created did not encourage such techniques.

Presumably VMS relied heavily on multithreading then like Windows or was a
process expected to do everything itself sequentially?

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.programmer on Sun Jan 5 17:14:14 2025

From Newsgroup: comp.unix.programmer

Muttley@dastardlyhq.com writes:

On Sat, 4 Jan 2025 22:13:05 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> gabbled:

On Sat, 04 Jan 2025 08:31:05 -0300, Salvador Mirzo wrote:

For instance, is there any Windows software that
handles a TCP connection in an accept-fork-exec fashion?

Almost certainly not. Because process creation is an expensive operation
on Windows.

Windows NT was masterminded by Dave Cutler, who was previously responsible
for the VMS OS at his previous employer, DEC. He was a Unix-hater, part of
a bunch of them at DEC. They would instinctively turn away from Unix ways
of doing things, like forking multiple processes. So the systems they
created did not encourage such techniques.

Presumably VMS relied heavily on multithreading then like Windows or was a
process expected to do everything itself sequentially?

The first shared memory multiprocessor VAX was the 11/782. It was
not considered SMP. I worked with four 11/780's sharing a 4MB MA-780
for three years in the early 80s - the shared memory could be used
primarily for inter-process communications (e.g. mailboxes) or one
could install commonly used program read-only 'text' regions in the shared
memory to reduce the memory pressure on each of the 780s. I developed
a DECnet ACP to support transport within the cluster via MA-780.

VMS itself did not leverage threads in the modern sense at that point.

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Sun Jan 5 21:09:55 2025

From Newsgroup: comp.unix.programmer

In article <vlecm0$1465i$1@dont-email.me>, <Muttley@dastardlyhq.com> wrote:

On Sat, 4 Jan 2025 22:13:05 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> gabbled:

On Sat, 04 Jan 2025 08:31:05 -0300, Salvador Mirzo wrote:

For instance, is there any Windows software that
handles a TCP connection in an accept-fork-exec fashion?

Almost certainly not. Because process creation is an expensive operation
on Windows.

Windows NT was masterminded by Dave Cutler, who was previously responsible
for the VMS OS at his previous employer, DEC. He was a Unix-hater, part of
a bunch of them at DEC. They would instinctively turn away from Unix ways
of doing things, like forking multiple processes. So the systems they
created did not encourage such techniques.

Presumably VMS relied heavily on multithreading then like Windows or was a
process expected to do everything itself sequentially?

Many system services on VMS are asynchronous, and the system
architecture provides mechanisms to signal completion; ASTs,
mailboxes, etc. Thus, many programs (not all) on VMS are
written in a callback/closure style.

- Dan C.
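As a loose analogy for the completion-callback style described above, and explicitly not a rendering of any VMS system service, the shape of such a program can be sketched with Python's standard `concurrent.futures` module. The names `read_block`, `on_done` and `run_requests` are hypothetical; the callback merely plays the role that a completion routine such as an AST would play.

```python
from concurrent.futures import ThreadPoolExecutor


def read_block(n: int) -> str:
    """Stand-in for a slow I/O request (hypothetical; not a VMS service)."""
    return f"block {n}"


def run_requests(count: int) -> list:
    """Queue requests and let completion callbacks collect the results."""
    completed = []

    def on_done(future):               # plays the role of a completion routine
        completed.append(future.result())

    with ThreadPoolExecutor(max_workers=2) as pool:
        for n in range(count):
            # Issue the request and return at once; on_done fires on completion.
            pool.submit(read_block, n).add_done_callback(on_done)
    # Leaving the with-block waits for all work, and hence for all callbacks.
    return sorted(completed)


if __name__ == "__main__":
    print(run_requests(3))
```

The caller never blocks on an individual request; the program's logic lives in the callbacks, which is also why this style can become hard to follow as the number of outstanding operations grows.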

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Mon Jan 6 08:36:27 2025

From Newsgroup: comp.unix.programmer

On Sun, 5 Jan 2025 21:09:55 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlecm0$1465i$1@dont-email.me>, <Muttley@dastardlyhq.com> wrote:

On Sat, 4 Jan 2025 22:13:05 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> gabbled:

On Sat, 04 Jan 2025 08:31:05 -0300, Salvador Mirzo wrote:

For instance, is there any Windows software that
handles a TCP connection in an accept-fork-exec fashion?

Almost certainly not. Because process creation is an expensive operation
on Windows.

Windows NT was masterminded by Dave Cutler, who was previously responsible >>>for the VMS OS at his previous employer, DEC. He was a Unix-hater, part of >>>a bunch of them at DEC. They would instinctively turn away from Unix ways >>>of doing things, like forking multiple processes. So the systems they >>>created did not encourage such techniques.

Presumably VMS relied heavily on multithreading then like Windows or was a >>process expected to everything itself sequentially?

Many system services on VMS are asynchronous, and the system
architecture provides mechanisms to signal completion: ASTs,
mailboxes, etc. Thus, many programs (not all) on VMS are
written in a callback/closure style.

I imagine that could become complicated very quickly and presumably relies
on the OS providing the signalling mechanisms for everything you might
want to do - eg waiting for a socket connection (or whatever the DECnet
equivalent was).

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Mon Jan 6 14:08:44 2025

From Newsgroup: comp.unix.programmer

In article <vlg4mb$1hi6d$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Sun, 5 Jan 2025 21:09:55 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlecm0$1465i$1@dont-email.me>, <Muttley@dastardlyhq.com> wrote:

On Sat, 4 Jan 2025 22:13:05 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> gabbled:

On Sat, 04 Jan 2025 08:31:05 -0300, Salvador Mirzo wrote:

For instance, is there any Windows software that
handles a TCP connection in an accept-fork-exec fashion?

Almost certainly not. Because process creation is an expensive operation
on Windows.

Windows NT was masterminded by Dave Cutler, who was previously responsible
for the VMS OS at his previous employer, DEC. He was a Unix-hater, part of
a bunch of them at DEC. They would instinctively turn away from Unix ways
of doing things, like forking multiple processes. So the systems they
created did not encourage such techniques.

Presumably VMS relied heavily on multithreading then like Windows or was a
process expected to do everything itself sequentially?

Many system services on VMS are asynchronous, and the system
architecture provides mechanisms to signal completion: ASTs,
mailboxes, etc. Thus, many programs (not all) on VMS are
written in a callback/closure style.

I imagine that could become complicated very quickly and presumably relies
on the OS providing the signalling mechanisms for everything you might
want to do - eg waiting for a socket connection (or whatever the DECnet
equivalent was).

It's a fairly common way to structure software even today. As I
said, the OS provides asynchronous notification mechanisms (ASTs)
and IPC (mailboxes etc) for signaling operation completion.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Mon Jan 6 14:21:48 2025

From Newsgroup: comp.unix.programmer

On Mon, 6 Jan 2025 14:08:44 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlg4mb$1hi6d$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Sun, 5 Jan 2025 21:09:55 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlecm0$1465i$1@dont-email.me>, <Muttley@dastardlyhq.com> wrote:

On Sat, 4 Jan 2025 22:13:05 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> gabbled:

On Sat, 04 Jan 2025 08:31:05 -0300, Salvador Mirzo wrote:

For instance, is there any Windows software that
handles a TCP connection in an accept-fork-exec fashion?

Almost certainly not. Because process creation is an expensive operation
on Windows.

Windows NT was masterminded by Dave Cutler, who was previously responsible
for the VMS OS at his previous employer, DEC. He was a Unix-hater, part of
a bunch of them at DEC. They would instinctively turn away from Unix ways
of doing things, like forking multiple processes. So the systems they
created did not encourage such techniques.

Presumably VMS relied heavily on multithreading then like Windows or was a
process expected to do everything itself sequentially?

Many system services on VMS are asynchronous, and the system
architecture provides mechanisms to signal completion: ASTs,
mailboxes, etc. Thus, many programs (not all) on VMS are
written in a callback/closure style.

I imagine that could become complicated very quickly and presumably relies
on the OS providing the signalling mechanisms for everything you might
want to do - eg waiting for a socket connection (or whatever the DECnet
equivalent was).

It's a fairly common way to structure software even today. As I

In Windows yes, which frankly is probably not a coincidence. Not so much
in unix unless you're writing a GUI program.

--- Synchronet 3.21d-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.programmer on Mon Jan 6 15:02:49 2025

From Newsgroup: comp.unix.programmer

Muttley@DastardlyHQ.org writes:

On Sun, 5 Jan 2025 21:09:55 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlecm0$1465i$1@dont-email.me>, <Muttley@dastardlyhq.com> wrote:

On Sat, 4 Jan 2025 22:13:05 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> gabbled:

On Sat, 04 Jan 2025 08:31:05 -0300, Salvador Mirzo wrote:

For instance, is there any Windows software that
handles a TCP connection in an accept-fork-exec fashion?

Almost certainly not. Because process creation is an expensive operation
on Windows.

Windows NT was masterminded by Dave Cutler, who was previously responsible
for the VMS OS at his previous employer, DEC. He was a Unix-hater, part of
a bunch of them at DEC. They would instinctively turn away from Unix ways
of doing things, like forking multiple processes. So the systems they
created did not encourage such techniques.

Presumably VMS relied heavily on multithreading then like Windows or was a
process expected to do everything itself sequentially?

Many system services on VMS are asynchronous, and the system
architecture provides mechanisms to signal completion: ASTs,
mailboxes, etc. Thus, many programs (not all) on VMS are
written in a callback/closure style.

I imagine that could become complicated very quickly and presumably relies
on the OS providing the signalling mechanisms for everything you might
want to do - eg waiting for a socket connection (or whatever the DECnet
equivalent was).

Actually, it was straightforward to use and rather elegant. You
could build a nice asynchronous transfer algorithm just using ASTs
and/or event flags.
--- Synchronet 3.21d-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.programmer on Mon Jan 6 15:05:33 2025

From Newsgroup: comp.unix.programmer

Muttley@DastardlyHQ.org writes:

On Mon, 6 Jan 2025 14:08:44 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlg4mb$1hi6d$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Sun, 5 Jan 2025 21:09:55 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlecm0$1465i$1@dont-email.me>, <Muttley@dastardlyhq.com> wrote:

On Sat, 4 Jan 2025 22:13:05 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> gabbled:

On Sat, 04 Jan 2025 08:31:05 -0300, Salvador Mirzo wrote:

For instance, is there any Windows software that
handles a TCP connection in an accept-fork-exec fashion?

Almost certainly not. Because process creation is an expensive operation
on Windows.

Windows NT was masterminded by Dave Cutler, who was previously responsible
for the VMS OS at his previous employer, DEC. He was a Unix-hater, part of
a bunch of them at DEC. They would instinctively turn away from Unix ways
of doing things, like forking multiple processes. So the systems they
created did not encourage such techniques.

Presumably VMS relied heavily on multithreading then like Windows or was a
process expected to do everything itself sequentially?

Many system services on VMS are asynchronous, and the system
architecture provides mechanisms to signal completion: ASTs,
mailboxes, etc. Thus, many programs (not all) on VMS are
written in a callback/closure style.

I imagine that could become complicated very quickly and presumably relies
on the OS providing the signalling mechanisms for everything you might
want to do - eg waiting for a socket connection (or whatever the DECnet
equivalent was).

It's a fairly common way to structure software even today. As I

In Windows yes, which frankly is probably not a coincidence. Not so much
in unix unless you're writing a GUI program.

ASTs and unix signals have similar semantics. It's certainly possible to
use, for example, SIGIO in a similar manner to the VMS AST, where the
AST signals I/O completion and the AST handler initiates a subsequent operation.
--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Mon Jan 6 15:22:51 2025

From Newsgroup: comp.unix.programmer

In article <vlgots$1le5s$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Mon, 6 Jan 2025 14:08:44 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlg4mb$1hi6d$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Sun, 5 Jan 2025 21:09:55 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlecm0$1465i$1@dont-email.me>, <Muttley@dastardlyhq.com> wrote:

On Sat, 4 Jan 2025 22:13:05 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> gabbled:

On Sat, 04 Jan 2025 08:31:05 -0300, Salvador Mirzo wrote:

For instance, is there any Windows software that
handles a TCP connection in an accept-fork-exec fashion?

Almost certainly not. Because process creation is an expensive operation
on Windows.

Windows NT was masterminded by Dave Cutler, who was previously responsible
for the VMS OS at his previous employer, DEC. He was a Unix-hater, part of
a bunch of them at DEC. They would instinctively turn away from Unix ways
of doing things, like forking multiple processes. So the systems they
created did not encourage such techniques.

Presumably VMS relied heavily on multithreading then like Windows or was a
process expected to do everything itself sequentially?

Many system services on VMS are asynchronous, and the system
architecture provides mechanisms to signal completion: ASTs,
mailboxes, etc. Thus, many programs (not all) on VMS are
written in a callback/closure style.

I imagine that could become complicated very quickly and presumably relies
on the OS providing the signalling mechanisms for everything you might
want to do - eg waiting for a socket connection (or whatever the DECnet
equivalent was).

It's a fairly common way to structure software even today. As I

In Windows yes, which frankly is probably not a coincidence. Not so much
in unix unless you're writing a GUI program.

Very much in Unix, actually. The kernel is highly asynchronous
(it must be, to match the hardware), and has been since the
early 1970s. Many user programs similarly.

Historically, many systems have provided direct support for
asynchronous programming on Unix. Going back to the early
commercial Unix days, Masscomp's real-time Unix had ASTs, not
signals, to support asynch IO directly from userspace. More
recently, POSIX.1b and POSIX AIO are widely supported. Polling
interfaces like kqueue and epoll, etc, exist largely to support
multiplexing asynchronous tasks, though not using the callback
model per se (one polls a set of e.g. file descriptors and
dispatches explicitly based on their states as reported by the
polling interface). Most recently, things like io_uring on
Linux are designed specifically to support asynch IO in user
programs.
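
The poll-and-dispatch model described above can be sketched roughly as follows. This uses poll() rather than kqueue or epoll purely because poll() is portable; the helper name drain_ready() and its 16-descriptor limit are assumptions made for illustration, not anything from a real program.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait (up to timeout_ms) for any of the nfds
 * descriptors to become readable, then dispatch explicitly by reading
 * from each ready descriptor. Returns the total number of bytes read,
 * or -1 on error.
 */
long
drain_ready(const int *fds, int nfds, int timeout_ms)
{
    struct pollfd pfds[16];
    char buf[256];
    long total = 0;

    if (nfds < 0 || nfds > 16)
        return -1;
    for (int i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    /* The kernel reports readiness; nothing is called back. */
    if (poll(pfds, (nfds_t)nfds, timeout_ms) < 0)
        return -1;

    /* Explicit dispatch based on the state reported for each fd. */
    for (int i = 0; i < nfds; i++) {
        if (pfds[i].revents & POLLIN) {
            ssize_t n = read(pfds[i].fd, buf, sizeof buf);
            if (n < 0)
                return -1;
            total += n;
        }
    }
    return total;
}
```

A caller would typically run this in a loop, doing other work between calls; the events themselves arrive asynchronously even though the dispatch code is sequential.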

Granted, the Unix system interface is not particularly
asynchronous-friendly, but that is by design. As Doug McIlroy
put it,

|The infrastructure had to be asynchronous. The whole point was
|to surmount that difficult model and keep everyday programming
|simple. User visibility of asynchrony was held to a minimum:
|fork(), signal(), wait(). Signal() was there first and
|foremost to support SIGKILL; it did not purport to provide a
|sound basis for asynchronous IPC.
(https://tuhs.org/mailman3/hyperkitty/list/tuhs@tuhs.org/message/IAAO2MRTMSX3C54YGTNOTIT4FEQA73IR/)

This was fine for writing `cat` and `cp` etc, which were largely
synchronous anyway. Less so for server-style programs or
programs that had to multiplex data from many sources generally.

This early decision to favor the comprehensibility of one class
of program in the system call interface has had reverberations
through time, leading to much complexity. This is one reason
that, for example, the Go runtime has to jump through so many
hoops to mux goroutines onto the underlying OS-provided thread
abstraction; or that the implementation of async executors for
Rust is hard (cf Tokio). In that same message referenced above,
Doug continues:

|The complexity of sigaction() is evidence that asynchrony
|remains untamed 40 years on.

This is unfair. It's evidence that grafting it onto an existing
highly synchronous system interface that was simply not designed
to accommodate it in the first place is very hard, even 50 years
after the fact.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Mon Jan 6 15:55:19 2025

From Newsgroup: comp.unix.programmer

On Mon, 06 Jan 2025 15:05:33 GMT
scott@slp53.sl.home (Scott Lurndal) wibbled:

Muttley@DastardlyHQ.org writes:

In Windows yes, which frankly is probably not a coincidence. Not so much
in unix unless you're writing a GUI program.

ASTs and unix signals have similar semantics. It's certainly possible to
use, for example, SIGIO in a similar manner to the VMS AST, where the
AST signals I/O completion and the AST handler initiates a subsequent
operation.

Unix signals should only be used to set flags that are then read later.
Doing anything complicated in a signal handler is asking for trouble as you
have no idea where the program was when the signal occurred and there can be
all sorts of re-entrancy issues or even deadlocks if using mutexes.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Mon Jan 6 16:00:33 2025

From Newsgroup: comp.unix.programmer

On Mon, 6 Jan 2025 15:22:51 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlgots$1le5s$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

In Windows yes, which frankly is probably not a coincidence. Not so much
in unix unless you're writing a GUI program.

Very much in Unix, actually. The kernel is highly asynchronous
(it must be, to match the hardware), and has been since the
early 1970s. Many user programs similarly.

Historically, many systems have provided direct support for
asynchronous programming on Unix. Going back to the early
commerical Unix days, masscomp's real time Unix had ASTs, not
signals, to support asynch IO directly from userspace. More
recently, POSIX.1b and POSIX AIO are widely supported. Polling
interfaces like kqueue and epoll, etc, exist largely to support

Multiplexing is not asynchronous, it's simply offloading status checking to
the kernel. The program using it is still very much sequential, at least at
that point.

Posix AIO is not asynch in the strict sense, it's more "ok kernel, go do this
and I'll check how you're doing later". Proper asynch where the program
execution path gets bounced around between various callbacks is something
else entirely.

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Mon Jan 6 16:39:49 2025

From Newsgroup: comp.unix.programmer

In article <vlgun1$1minf$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Mon, 6 Jan 2025 15:22:51 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlgots$1le5s$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

In Windows yes, which frankly is probably not a coincidence. Not so much
in unix unless you're writing a GUI program.

Very much in Unix, actually. The kernel is highly asynchronous
(it must be, to match the hardware), and has been since the
early 1970s. Many user programs similarly.

Historically, many systems have provided direct support for
asynchronous programming on Unix. Going back to the early
commerical Unix days, masscomp's real time Unix had ASTs, not
signals, to support asynch IO directly from userspace. More
recently, POSIX.1b and POSIX AIO are widely supported. Polling
interfaces like kqueue and epoll, etc, exist largely to support

Multiplexing is not asynchronous, it's simply offloading status checking to
the kernel.

Of course. It's a means to allow a program to respond to
asynchronous events.

The program using it is still very much sequential, at least at
that point.

But the events are not. That's the point. This allows a
program to initiate a non-blocking IO operation (like, say,
establishing a TCP connection using the sockets API), go do
something else, and check its status later.

Posix AIO is not asynch in the strict sense, it's more "ok kernel, go do this
and I'll check how you're doing later". Proper asynch where the program
execution path gets bounced around between various callbacks is something
else entirely.

The POSIX AIO interface allows the kernel to generate a signal
to inform the program that an IO operation has completed, e.g.,
by setting up the `aio_sigevent` and `SIGEV_SIGNAL`. It doesn't
get much more asynchronous than that.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.programmer on Mon Jan 6 16:46:56 2025

From Newsgroup: comp.unix.programmer

Muttley@DastardlyHQ.org writes:

On Mon, 06 Jan 2025 15:05:33 GMT
scott@slp53.sl.home (Scott Lurndal) wibbled:

Muttley@DastardlyHQ.org writes:

In Windows yes, which frankly is probably not a coincidence. Not so much
in unix unless you're writing a GUI program.

ASTs and unix signals have similar semantics. It's certainly possible to
use, for example, SIGIO in a similar manner to the VMS AST, where the
AST signals I/O completion and the AST handler initiates a subsequent
operation.

Unix signals should only be used to set flags that are then read later.

Your opinion is not widely shared. Note that the POSIX specification
carefully notes which interfaces are not signal-safe.
--- Synchronet 3.21d-Linux NewsLink 1.2

From James Kuyper@jameskuyper@alumni.caltech.edu to comp.unix.programmer on Mon Jan 6 12:42:48 2025

From Newsgroup: comp.unix.programmer

On 1/6/25 11:46, Scott Lurndal wrote:

Muttley@DastardlyHQ.org writes:

...

Unix signals should only be used to set flags that are then read later.

Your opinion is not widely shared. Note that the POSIX specification
carefully notes which interfaces are not signal-safe.

What precisely does "signal-safe" mean? As I understand it, it is
supposed to be safe for a signal to interrupt standard library routines,
but that it's not safe for a signal handler to call most of those
functions. There's just a few exceptions, described below.

The C standard says the following about signal handlers:
"The functions in the standard library are not guaranteed to be
reentrant and may modify objects with static or thread storage duration.
239)" (7.1.4p4)
Footnote 239 says "Thus, a signal handler cannot, in general, call
standard library functions."

"If the signal occurs other than as the result of calling the abort or
raise function, the behavior is undefined if the signal handler refers
to any object with static or thread storage duration that is not a
lock-free atomic object and that is not declared with the constexpr
storage-class specifier other than by assigning a value to an object
declared as volatile sig_atomic_t, or the signal handler calls any
function in the standard library other than
- the abort function,
- the _Exit function,
- the quick_exit function,
- the functions in <stdatomic.h> (except where explicitly stated
  otherwise) when the atomic arguments are lock-free,
- the atomic_is_lock_free function with any atomic argument, or
- the signal function with the first argument equal to the signal
  number corresponding to the signal that caused the invocation of the
  handler.
Furthermore, if such a call to the signal function results in a SIG_ERR
return, the object designated by errno has an indeterminate
representation.310)" (7.14.1.1p5)
Footnote 310 says "If any signal is generated by an asynchronous signal
handler, the behavior is undefined."

"If a signal occurs other than as the result of calling the abort or
raise functions, the behavior is undefined if the signal handler reads
or modifies an atomic object that has an indeterminate representation." (7.17.2p2)

"If a signal occurs other than as the result of calling the abort or
raise functions, the behavior is undefined if the signal handler calls
the atomic_init generic function." (7.17.2.1p4)

The POSIX standard claims that its version of <signal.h> conforms to the
C standard, and as far as I can see, the POSIX standard doesn't say
anything to define the behavior that is undefined by the C standard.

Could you demonstrate how, within the above restrictions, a signal
handler that doesn't cause the program to exit in one fashion or another
could do anything useful other than "set flags that are read later"?
I'm not saying it cannot be done. I claim no expertise in this kind of programming - I never needed to write signal handlers. However, the last
time I considered the matter carefully (which was two or three versions
of the C standard ago) I couldn't figure out how to do much more than
that. At that time I did not consider how POSIX affects the issue, and I
don't know enough about POSIX signals to evaluate that issue.
--- Synchronet 3.21d-Linux NewsLink 1.2

From kalevi@kalevi@kolttonen.fi (Kalevi Kolttonen) to comp.unix.programmer on Mon Jan 6 17:53:20 2025

From Newsgroup: comp.unix.programmer

Muttley@dastardlyhq.org wrote:

Unix signals should only be used to set flags that are then read later.
Doing anything complicated in a signal handler is asking for trouble as you
have no idea where the program was when the signal occurred and there can be
all sorts of re-entrancy issues or even deadlocks if using mutexes.

That is what I have learned, too, but I cannot remember the
source. Maybe one of Richard Stevens' UNIX books.

I am no expert, but I guess if you need to do async programming
on UNIX/Linux userspace, your best bet is to use POSIX Threads.

br,
KK
--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Mon Jan 6 18:16:31 2025

From Newsgroup: comp.unix.programmer

In article <vlh4mo$1nccc$1@dont-email.me>,
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

On 1/6/25 11:46, Scott Lurndal wrote:

Muttley@DastardlyHQ.org writes:

...

Unix signals should only be used to set flags that are then read later.

Your opinion is not widely shared. Note that the POSIX specification
carefully notes which interfaces are not signal-safe.

What precisely does "signal-safe" mean? As I understand it, it is
supposed to be safe for a signal to interrupt standard library routines,
but that it's not safe for a signal handler to call most of those
functions. There's just a few exceptions, described below.

From POSIX 2024, section 2.4 ("Signal Concepts"):

|The following table defines a set of functions and
|function-like macros that shall be async-signal-safe.
|Therefore, applications can call them, without restriction,
|from signal-catching functions. Note that, although there is
|no restriction on the calls themselves, for certain functions
|there are restrictions on subsequent behavior after the
|function is called from a signal-catching function (see longjmp()).
(https://pubs.opengroup.org/onlinepubs/9799919799/functions/V2_chap02.html#tag_16_04)

(The table mentioned above includes over 200 functions.)

The C standard says the following about signal handlers:
"The functions in the standard library are not guaranteed to be
reentrant and may modify objects with static or thread storage duration.
239)" (7.1.4p4)
Footnote 239 says "Thus, a signal handler cannot, in general, call
standard library functions."

This is comp.unix.programmer, not comp.lang.c.

[snip]

"If a signal occurs other than as the result of calling the abort or
raise functions, the behavior is undefined if the signal handler reads
or modifies an atomic object that has an indeterminate representation."
(7.17.2p2)

"If a signal occurs other than as the result of calling the abort or
raise functions, the behavior is undefined if the signal handler calls
the atomic_init generic function." (7.17.2.1p4)

The POSIX standard claims that its version of <signal.h> conforms to the
C standard, and as far as I can see, the POSIX standard doesn't say
anything to define the behavior that is undefined by the C standard.

This is factually incorrect.

Could you demonstrate how, within the above restrictions, a signal
handler that doesn't cause the program to exit in one fashion or another
could do anything useful other than "set flags that are read later"?

Those restrictions don't apply in this context. But a trivial
example:

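    #include <signal.h>
    #include <sys/wait.h>

    /* SIGCHLD handler: reap one exited child. */
    void
    reaper(int signo)
    {
        (void)signo;
        wait(NULL);     /* wait() is on the async-signal-safe list */
    }

    /* ... */
    signal(SIGCHLD, reaper);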

I'm not saying it cannot be done. I claim no expertise in this kind of
programming - I never needed to write signal handlers.

Perhaps read up on the matter before commenting, then?

However, the last
time I considered the matter carefully (which was two or three versions
of the C standard ago) I couldn't figure out how to do much more than
that. At that time I did not consider how POSIX affects the issue, and I
don't know enough about POSIX signals to evaluate that issue.

In strict C, this is correct. But while POSIX includes ISO C as
a subset, it extends allowable behavior in lots of places to
both standardize the behavior of existing programs as well as
make it possible to write useful programs on Unix-style systems.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.programmer on Mon Jan 6 18:24:24 2025

From Newsgroup: comp.unix.programmer

James Kuyper <jameskuyper@alumni.caltech.edu> writes:

On 1/6/25 11:46, Scott Lurndal wrote:

Muttley@DastardlyHQ.org writes:

...

Unix signals should only be used to set flags that are then read later.

Your opinion is not widely shared. Note that the POSIX specification
carefully notes which interfaces are not signal-safe.

What precisely does "signal-safe" mean?

The UNIX standard has meanwhile started to contradict itself on this
topic because of the apparent attempt to follow the (pretty silly) C++
memory model incorporated into C for no particular reason. But the
relevant part of the UNIX definition (still part of the current
definition) used to be

In the presence of signals, all functions defined by this volume
of POSIX.1-2024 shall behave as defined when called from or
interrupted by a signal-catching function, with the exception
that when a signal interrupts an unsafe function or
function-like macro, or equivalent (such as the processing
equivalent to exit() performed after a return from the initial
call to main()), and the signal-catching function calls an
unsafe function or function-like macro, the behavior is
undefined.

This text is below the list of async-signal safe interfaces. It
contradicts the

the behavior is undefined if:

[...]

The signal handler calls any function or function-like macro
defined in this standard other than one of the functions and
macros specified below as being async-signal-safe.

immediately above it.

https://pubs.opengroup.org/onlinepubs/9799919799/functions/V2_chap02.html#tag_16_04

When ignoring the apparent attempt at making signals really impossible
to use, "async signal-safe" is really quite simple: Any handler can do
anything provided it doesn't invoke some kind of not async signal safe
interface. A handler which didn't interrupt an interface that's not
async signal safe can do anything, including calling unsafe functions.

A usual way to achieve the latter is to keep signals blocked except when
calling an async signal safe function, eg, pselect, ppoll or
sigsuspend.

--- Synchronet 3.21d-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.programmer on Mon Jan 6 18:52:28 2025

From Newsgroup: comp.unix.programmer

James Kuyper <jameskuyper@alumni.caltech.edu> writes:

On 1/6/25 11:46, Scott Lurndal wrote:

Muttley@DastardlyHQ.org writes:

...

Unix signals should only be used to set flags that are then read later.

Your opinion is not widely shared. Note that the POSIX specification
carefully notes which interfaces are not signal-safe.

Could you demonstrate how, within the above restrictions, a signal
handler that doesn't cause the program to exit in one fashion or another
could do anything useful other than "set flags that are read later"?
I'm not saying it cannot be done. I claim no expertise in this kind of
programming - I never needed to write signal handlers. However, the last
time I considered the matter carefully (which was two or three versions
of the C standard ago) I couldn't figure out how to do much more than
that. At that time I did not consider how POSIX affects the issue, and I
don't know enough about POSIX signals to evaluate that issue.

Augmenting Dan's well-written response, I'd point out that those
POSIX interfaces (many of which are system calls, not library
functions) are designed to work with signals. One common
technique is to call longjmp() from within a SIGINT handler;
all the application must do is mask the signal if/when executing
in a non-async-signal-safe critical section. Note that a signal
(with a few exceptions) will not interrupt a system call, and when
it does, the system call returns EINTR to the caller without any
other caller-visible side effects (either the system call fully
completes before the signal is delivered, or it has no effect).

--- Synchronet 3.21d-Linux NewsLink 1.2

From Lawrence D'Oliveiro@ldo@nz.invalid to comp.unix.programmer on Mon Jan 6 20:26:15 2025

From Newsgroup: comp.unix.programmer

On Mon, 6 Jan 2025 14:21:48 -0000 (UTC), Muttley wrote:

Not so much in unix unless you're writing a GUI program.

In *nix, select/poll works well when the performance bottleneck resides in
the I/O. async/await lets you linearize the logic of your handlers for
easier comprehension, instead of breaking them up into separate callback
stages.

async/await is also useful in GUI programs, for the same reason.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Lawrence D'Oliveiro@ldo@nz.invalid to comp.unix.programmer on Mon Jan 6 20:27:15 2025

From Newsgroup: comp.unix.programmer

On Mon, 6 Jan 2025 16:00:33 -0000 (UTC), Muttley wrote:

Posix AIO is not asynch in the strict sense , its more "ok kernel, go do
this and I'll check how you're doing later".

POSIX has "real-time signals", which are very similar to VMS ASTs.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Lawrence D'Oliveiro@ldo@nz.invalid to comp.unix.programmer on Mon Jan 6 20:28:43 2025

From Newsgroup: comp.unix.programmer

On Mon, 6 Jan 2025 17:53:20 -0000 (UTC), Kalevi Kolttonen wrote:

I am no expert, but I guess if you need to do async programming on
UNIX/Linux userspace, your best is to use POSIX Threads.

Threads are something you normally want to avoid, unless CPU usage is the bottleneck in your application.

In the case where the limiting factor is I/O or network bandwidth, or a
GUI app waiting for user input, then async/await is a more convenient paradigm.
--- Synchronet 3.21d-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.programmer on Mon Jan 6 20:36:22 2025

From Newsgroup: comp.unix.programmer

Lawrence D'Oliveiro <ldo@nz.invalid> writes:

On Mon, 6 Jan 2025 16:00:33 -0000 (UTC), Muttley wrote:

Posix AIO is not asynch in the strict sense; it's more "ok kernel, go do
this and I'll check how you're doing later".

POSIX has "real-time signals", which are very similar to VMS ASTs.

In any case, lio_listio is sufficiently asynchronous to be useful
in e.g. Oracle.
--- Synchronet 3.21d-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.programmer on Mon Jan 6 20:36:47 2025

From Newsgroup: comp.unix.programmer

Lawrence D'Oliveiro <ldo@nz.invalid> writes:

On Mon, 6 Jan 2025 17:53:20 -0000 (UTC), Kalevi Kolttonen wrote:

I am no expert, but I guess if you need to do async programming on
UNIX/Linux userspace, your best is to use POSIX Threads.

Threads are something you normally want to avoid, unless CPU usage is the
bottleneck in your application.

Complete and utter nonsense.
--- Synchronet 3.21d-Linux NewsLink 1.2

From gazelle@gazelle@shell.xmission.com (Kenny McCormack) to comp.unix.programmer on Mon Jan 6 20:38:28 2025

From Newsgroup: comp.unix.programmer

In article <z9XeP.3$TgDc.2@fx38.iad>, Scott Lurndal <slp53@pacbell.net> wrote:

Lawrence D'Oliveiro <ldo@nz.invalid> writes:

On Mon, 6 Jan 2025 17:53:20 -0000 (UTC), Kalevi Kolttonen wrote:

I am no expert, but I guess if you need to do async programming on
UNIX/Linux userspace, your best is to use POSIX Threads.

Threads are something you normally want to avoid, unless CPU usage is the
bottleneck in your application.

Complete and utter nonsense.

As you would expect, considering the source.
--
Just for a change of pace, this sig is *not* an obscure reference to comp.lang.c...

--- Synchronet 3.21d-Linux NewsLink 1.2

From Nicolas George@nicolas$george@salle-s.org to comp.unix.programmer on Tue Jan 7 00:49:32 2025

From Newsgroup: comp.unix.programmer

Kalevi Kolttonen, in message <vlh5ag$1nruu$1@dont-email.me>, wrote:

I am no expert, but I guess if you need to do async programming
on UNIX/Linux userspace, your best is to use POSIX Threads.

Very common misconception. The communication mechanisms between POSIX
threads and Unix I/O are completely alien to each other: it is not possible
to poll() on a thread condition, nor is it possible to set up a condition
to be woken by data on a file descriptor. As a result, anybody who tries to
use threads to solve problems of I/O concurrency ends up having to implement
a poll() or equivalent loop in each thread, defeating the purpose.

POSIX threads are good to improve on computation concurrency, but they do
not make I/O concurrency simpler, quite the opposite.

The same might not be true for other kinds of threads.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Lawrence D'Oliveiro@ldo@nz.invalid to comp.unix.programmer on Tue Jan 7 02:14:11 2025

From Newsgroup: comp.unix.programmer

On 07 Jan 2025 00:49:32 GMT, Nicolas George wrote:

The communication mechanisms between POSIX
threads and Unix I/O are completely alien to each other: it is not
possible to poll() on a thread condition, nor is it possible to set
up a condition to be woken by data on a file descriptor.

Linux offers signalfd, so you can indeed use poll(2) in a thread to be
woken up by any file descriptor, including a signal one (and that includes POSIX real-time signals).
--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Tue Jan 7 08:34:30 2025

From Newsgroup: comp.unix.programmer

On Mon, 6 Jan 2025 16:39:49 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlgun1$1minf$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Mon, 6 Jan 2025 15:22:51 -0000 (UTC)

Multiplexing is not asynchronous, it's simply offloading status checking to
the kernel.

Of course. It's a means to allow a program to respond to
asynchronous events.

That's not the same as the program itself being asynch.

The program using it is still very much sequential, at least at
that point.

But the events are not. That's the point. This allows a
program to initiate a non-blocking IO operation (like, say,
establishing a TCP connection using the sockets API), go do
something else, and check its status later.

That's not proper asynch, it's still sequential. Proper asynch is when the
program execution path is directly modified by external events. Otherwise
you could claim simply using the standard file I/O system is asynchronous
programming as there's no guarantee that any data has been written to the
disk before write(), fprintf() etc return.

Posix AIO is not asynch in the strict sense; it's more "ok kernel, go do this
and I'll check how you're doing later". Proper asynch where the program
execution path gets bounced around between various callbacks is something
else entirely.

The POSIX AIO interface allows the kernel to generate a signal
to inform the program that an IO operation has completed, e.g.,
by setting up the `aio_sigevent` and `SIGEV_SIGNAL`. It doesn't
get much more asynchronous than that.

Sure, but as I've said before, signals should only set flags to be processed later.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Tue Jan 7 08:36:29 2025

From Newsgroup: comp.unix.programmer

On Mon, 06 Jan 2025 16:46:56 GMT
scott@slp53.sl.home (Scott Lurndal) wibbled:

Muttley@DastardlyHQ.org writes:

On Mon, 06 Jan 2025 15:05:33 GMT
scott@slp53.sl.home (Scott Lurndal) wibbled:

Muttley@DastardlyHQ.org writes:

In Windows yes, which frankly is probably not a coincidence. Not so much
in unix unless you're writing a GUI program.

ASTs and unix signals have similar semantics. It's certainly possible to
use, for example, SIGIO in a similar manner to the VMS AST, where the
AST signals I/O completion and the AST handler initiates a subsequent
operation.

Unix signals should only be used to set flags that are then read later.

Your opinion is not widely shared. Note that the POSIX specification
carefully notes which interfaces are not signal-safe.

ITYF it is VERY widely shared and having a signal safe API function is only step 2 - plenty of the functions in the program itself or 3rd party library functions are probably not re-entrant safe and even if they are, having
code stomp over itself - eg if in the middle of writing a log message then a signal is generated which tried to write a log message itself - is a very
poor way to write code.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Tue Jan 7 08:37:03 2025

From Newsgroup: comp.unix.programmer

On Mon, 6 Jan 2025 17:53:20 -0000 (UTC)
kalevi@kolttonen.fi (Kalevi Kolttonen) wibbled:

Muttley@dastardlyhq.org wrote:

Unix signals should only be used to set flags that are then read later. Doing
anything complicated in a signal handler is asking for trouble as you have
no idea where the program was when the signal occurred and there can be all
sorts of re-entrant issues or even deadlocks if using mutexes.

That is what I have learned, too, but I cannot remember the
source. Maybe one of Richard Stevens' UNIX books.

I am no expert, but I guess if you need to do async programming
on UNIX/Linux userspace, your best is to use POSIX Threads.

Agreed.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Nicolas George@nicolas$george@salle-s.org to comp.unix.programmer on Tue Jan 7 08:59:28 2025

From Newsgroup: comp.unix.programmer

Lawrence D'Oliveiro, in message <vli2lj$1t3lt$8@dont-email.me>, wrote:

Linux offers signalfd, so you can indeed use poll(2) in a thread to be
woken up by any file descriptor, including a signal one (and that includes POSIX real-time signals).

Proving my point that you need to use poll() even when doing threads.
--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Tue Jan 7 13:18:54 2025

From Newsgroup: comp.unix.programmer

In article <vlip2c$24ccb$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Mon, 06 Jan 2025 16:46:56 GMT
scott@slp53.sl.home (Scott Lurndal) wibbled:

Muttley@DastardlyHQ.org writes:

On Mon, 06 Jan 2025 15:05:33 GMT
scott@slp53.sl.home (Scott Lurndal) wibbled:

Muttley@DastardlyHQ.org writes:

In Windows yes, which frankly is probably not a coincidence. Not so much
in unix unless you're writing a GUI program.

ASTs and unix signals have similar semantics. It's certainly possible to
use, for example, SIGIO in a similar manner to the VMS AST, where the
AST signals I/O completion and the AST handler initiates a subsequent
operation.

Unix signals should only be used to set flags that are then read later.

Your opinion is not widely shared. Note that the POSIX specification
carefully notes which interfaces are not signal-safe.

ITYF it is VERY widely shared and having a signal safe API function is only
step 2 - plenty of the functions in the program itself or 3rd party library
functions are probably not re-entrant safe and even if they are, having
code stomp over itself - eg if in the middle of writing a log message then a
signal is generated which tried to write a log message itself - is a very
poor way to write code.

So don't write code that way. It does not follow that the only
thing you can do in a signal handler is set some atomic flag
somewhere.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Tue Jan 7 13:59:27 2025

From Newsgroup: comp.unix.programmer

In article <677c7a1b$0$28501$426a74cc@news.free.fr>,
Nicolas George <nicolas$george@salle-s.org> wrote:

Kalevi Kolttonen, in message <vlh5ag$1nruu$1@dont-email.me>, wrote:

I am no expert, but I guess if you need to do async programming
on UNIX/Linux userspace, your best is to use POSIX Threads.

Very common misconception.

This is correct. Threads are a concurrency mechanism. You use
them to write (more or less) sequential execution flows that run
concurrently with respect to one another. They are not needed
to do "async programming UNIX/Linux userspace". Indeed, in some
sense they're sort of the antithesis of async programming.

The communication mechanisms between POSIX
threads and Unix I/O are completely alien to each other: it is not possible
to poll() on a thread condition, nor is it possible to set up a condition
to be woken by data on a file descriptor. As a result, anybody who tries to
use threads to solve problems of I/O concurrency ends up having to implement
a poll() or equivalent loop in each thread, defeating the purpose.

This, however, does not follow. I don't see why "poll" is
strictly required for IO concurrency.

In addition to signalfd etc, there is no reason you cannot, say,
have a signal handler that broadcasts on a condition variable
after an asynchronous IO operation completes, thus waking up a
thread.

More generally, in the context of a multithreaded program,
(assuming the POSIX threading model) the "condition to be woken
by data on a file descriptor" is that the IO operation on which
the thread was blocked has completed or been interrupted. In a
system with kernel scheduled threads, this simply means that the
thread becomes runnable again, and will be resumed at some
point; this is exactly analogous to the sort of state
transitions a process goes through in a traditional Unix-style
kernel when blocked on IO.

Moreover, nothing prevents a program from, say, sharing a
producer/consumer queue between threads where one extracts
entries and performs IO on them while the other prepares IO
operations to be performed by the worker. For that matter, one
can have multiple IO threads pushing work into a queue that's
executed by a pool of worker threads.
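
A minimal sketch of the producer/consumer arrangement described above, in Python for brevity. Everything here is an assumption made for illustration: the function name `run_pipeline`, the use of a pipe as the "device", and the `None` sentinel that tells the I/O thread to stop. One thread prepares work items and the other performs the (blocking) writes.

```python
import os
import queue
import threading

def run_pipeline(payloads):
    work = queue.Queue()              # shared producer/consumer queue
    read_fd, write_fd = os.pipe()     # stand-in for a real device or socket

    def io_worker():
        # Extract entries and perform the blocking I/O on them.
        while True:
            item = work.get()
            if item is None:          # sentinel: no more work
                break
            os.write(write_fd, item)
        os.close(write_fd)            # lets the reader below see EOF

    worker = threading.Thread(target=io_worker)
    worker.start()

    for payload in payloads:          # "prepare" IO operations
        work.put(payload)
    work.put(None)

    chunks = []
    while True:                       # drain the pipe until EOF
        data = os.read(read_fd, 4096)
        if not data:
            break
        chunks.append(data)
    worker.join()
    os.close(read_fd)
    return b"".join(chunks)
```

Neither thread calls poll() here; the queue and the blocking read/write calls provide all the synchronization this small case needs.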

POSIX threads are good to improve on computation concurrency, but they do
not make I/O concurrency simpler, quite the opposite.

The same might not be true for other kinds of threads.

A nice property of POSIX threads is that, in _most_
implementations, if you go off and initiate some blocking
operation, such as performing (blocking) IO on a file descriptor
of some sort, other runnable threads in the system can continue
executing. Thus, it follows that you can have multiple IO
operations in separate threads pending at one time. In that
sense, one can leverage threads for IO concurrency (indeed, many
programs do just that). Threads can block independently without
impeding execution of the overall program.

Of course, not all POSIX thread implementations are capable of
doing this: consider a purely userspace "green" thread
implementation; executing a single blocking system call blocks
the whole program. In those, in many cases, blocking IO is
emulated at the library level (which translates "normal" IO
operations into their non-blocking counterparts) to give the
illusion of sequential program flow; of course, the simulacrum
is inexact at best, which is why most systems push thread
management up into the kernel.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Tue Jan 7 14:05:41 2025

From Newsgroup: comp.unix.programmer

On Tue, 7 Jan 2025 13:18:54 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlip2c$24ccb$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Mon, 06 Jan 2025 16:46:56 GMT

ITYF it is VERY widely shared and having a signal safe API function is only
step 2 - plenty of the functions in the program itself or 3rd party library
functions are probably not re-entrant safe and even if they are, having
code stomp over itself - eg if in the middle of writing a log message then a
signal is generated which tried to write a log message itself - is a very
poor way to write code.

So don't write code that way. It does not follow that the only
thing you can do in a signal handler is set some atomic flag
somewhere.

Just because you can doesn't mean you should. C lets you do a lot of things that are a Bad Idea.

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Tue Jan 7 14:13:29 2025

From Newsgroup: comp.unix.programmer

In article <vlioum$24bqm$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Mon, 6 Jan 2025 16:39:49 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlgun1$1minf$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Mon, 6 Jan 2025 15:22:51 -0000 (UTC)

Multiplexing is not asynchronous, it's simply offloading status checking to
the kernel.

Of course. It's a means to allow a program to respond to
asynchronous events.

That's not the same as the program itself being asynch.

Isn't it? The point is that the program kicks off multiple
asynchronous operations; the issue comes when figuring out what
to do when they complete. In general, there are only really two
choices here: either poll their completion status, or expect to
be notified by some kind of an event. In a POSIX-y environment, `poll`/`select` etc give you the former; signals give you the
latter.
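
To make the "poll for completion" option concrete, here is a small Python sketch of a non-blocking TCP connect whose outcome is checked later with `select`. The function name `nonblocking_connect`, the timeout value, and the placeholder comment for "other work" are assumptions for illustration only.

```python
import errno
import os
import select
import socket

def nonblocking_connect(address, timeout=5.0):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    # Kick off the operation; it usually reports EINPROGRESS immediately.
    rc = sock.connect_ex(address)
    if rc not in (0, errno.EINPROGRESS):
        sock.close()
        raise OSError(rc, os.strerror(rc))

    # ... the program is free to go and do something else here ...

    # Later: ask the kernel whether the connect has finished.
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        sock.close()
        raise TimeoutError("connect did not complete in time")
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err:
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock
```

The notification option would replace the `select` call with a handler invoked when the operation completes; the initiation step stays the same.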

The program using it is still very much sequential, at least at
that point.

But the events are not. That's the point. This allows a
program to initiate a non-blocking IO operation (like, say,
establishing a TCP connection using the sockets API), go do
something else, and check its status later.

That's not proper asynch, it's still sequential. Proper asynch is when the
program execution path is directly modified by external events. Otherwise
you could claim simply using the standard file I/O system is asynchronous
programming as there's no guarantee that any data has been written to the disk
before write(), fprintf() etc return.

This is conflating multiple things. Most IO operations dealing
with the actual hardware _are_ asynchronous (this is what
McIlroy meant in the quote I posted earlier). The system call
interface gives the program the illusion of those happening
sequentially, but that's not how the devices really work.

It turns out the simple model of early research Unix was
insufficient for handling all sorts of important use cases,
hence why interfaces like `select` and `poll` were added.

Posix AIO is not asynch in the strict sense; it's more "ok kernel, go do this
and I'll check how you're doing later". Proper asynch where the program
execution path gets bounced around between various callbacks is something
else entirely.

The POSIX AIO interface allows the kernel to generate a signal
to inform the program that an IO operation has completed, e.g.,
by setting up the `aio_sigevent` and `SIGEV_SIGNAL`. It doesn't
get much more asynchronous than that.

Sure, but as I've said before, signals should only set flags to be processed
later.

You said that, but that flies in the face of 50 years of
evidence to the contrary and the letter of the standard. This
doesn't mean that you should do arbitrary amounts of work in a
signal handler, but there's no reason that, say, one couldn't
push an event onto a queue and signal a condition variable or
something similar.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Tue Jan 7 14:14:38 2025

From Newsgroup: comp.unix.programmer

In article <vljcbk$27v6l$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Tue, 7 Jan 2025 13:18:54 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlip2c$24ccb$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Mon, 06 Jan 2025 16:46:56 GMT

ITYF it is VERY widely shared and having a signal safe API function is only
step 2 - plenty of the functions in the program itself or 3rd party library
functions are probably not re-entrant safe and even if they are, having
code stomp over itself - eg if in the middle of writing a log message then a
signal is generated which tried to write a log message itself - is a very
poor way to write code.

So don't write code that way. It does not follow that the only
thing you can do in a signal handler is set some atomic flag
somewhere.

Just because you can doesn't mean you should. C lets you do a lot of things
that are a Bad Idea.

I have to ask at this point: have you ever written a concurrent
program under Unix? One that used signals? For that matter,
have you ever written a program that used `fork()` and caught a
`SIGCHLD`?

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.programmer on Tue Jan 7 14:59:48 2025

From Newsgroup: comp.unix.programmer

Nicolas George <nicolas$george@salle-s.org> writes:

Lawrence D'Oliveiro, in message <vli2lj$1t3lt$8@dont-email.me>, wrote:

Linux offers signalfd, so you can indeed use poll(2) in a thread to be
woken up by any file descriptor, including a signal one (and that includes
POSIX real-time signals).

Proving my point that you need to use poll() even when doing threads.

It "proves" no such thing.

An I/O thread can coordinate other threads using
posix condition variables, system V semaphores, pipes, etc.
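
As a rough sketch of that kind of coordination, the Python example below has an I/O thread block in `read` and then wake a waiting thread through a condition variable. The names (`io_thread_notifies`, `received`) and the pipe used as the data source are invented for the example; Python's `threading.Condition` stands in for a POSIX condition variable.

```python
import os
import threading

def io_thread_notifies(payload=b"hello", timeout=5.0):
    read_fd, write_fd = os.pipe()
    cond = threading.Condition()
    received = []                     # shared state protected by cond

    def io_thread():
        data = os.read(read_fd, 4096) # blocks only this thread
        with cond:
            received.append(data)
            cond.notify_all()         # wake whoever waits for the data

    worker = threading.Thread(target=io_thread)
    worker.start()
    os.write(write_fd, payload)       # make the blocked read complete

    with cond:
        # Wait on the condition, not on the file descriptor.
        ok = cond.wait_for(lambda: received, timeout=timeout)
    worker.join()
    os.close(read_fd)
    os.close(write_fd)
    return received[0] if ok else None
```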
--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Tue Jan 7 15:11:31 2025

From Newsgroup: comp.unix.programmer

On Tue, 7 Jan 2025 14:13:29 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlioum$24bqm$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Mon, 6 Jan 2025 16:39:49 -0000 (UTC)

That's not the same as the program itself being asynch.

Isn't it? The point is that the program kicks off multiple

No, it isn't. Using operating system facilities is standard programming, it's
not asynch programming whereby the program execution will automatically
be diverted when an operation completes.

That's not proper asynch, it's still sequential. Proper asynch is when the
program execution path is directly modified by external events. Otherwise
you could claim simply using the standard file I/O system is asynchronous
programming as there's no guarantee that any data has been written to the disk
before write(), fprintf() etc return.

This is conflating multiple things. Most IO operations dealing
with the actual hardware _are_ asynchronous (this is what
McIlroy meant in the quote I posted earlier). The system call
interface gives the program the illusion of those happening
sequentially, but that's not how the devices really work.

And? By your definition it's still asynchronous programming.

Sure, but as I've said before, signals should only set flags to be processed
later.

You said that, but that flies in the face of 50 years of
evidence to the contrary and the letter of the standard. This

Please don't just make stuff up.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Tue Jan 7 15:13:52 2025

From Newsgroup: comp.unix.programmer

On Tue, 7 Jan 2025 14:14:38 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vljcbk$27v6l$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Tue, 7 Jan 2025 13:18:54 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlip2c$24ccb$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Mon, 06 Jan 2025 16:46:56 GMT

ITYF it is VERY widely shared and having a signal safe API function is only
step 2 - plenty of the functions in the program itself or 3rd party library
functions are probably not re-entrant safe and even if they are, having
code stomp over itself - eg if in the middle of writing a log message then a
signal is generated which tried to write a log message itself - is a very
poor way to write code.

So don't write code that way. It does not follow that the only
thing you can do in a signal handler is set some atomic flag
somewhere.

Just because you can doesn't mean you should. C lets you do a lot of things
that are a Bad Idea.

I have to ask at this point: have you ever written a concurrent
program under Unix? One that used signals? For that matter,
have you ever written a program that used `fork()` and caught a
`SIGCHLD`?

Is that supposed to be a serious question?

The only thing that should ever be done in a child exit handler is a wait*()
or set a flag.
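
A minimal Python sketch of a SIGCHLD handler that does nothing except `wait*()`, as described above. The names `pending`, `exit_codes`, `on_sigchld` and `spawn_and_reap` are hypothetical, and blocking SIGCHLD around the `fork()` is one assumed way to make sure the child's PID is recorded before the handler can run.

```python
import os
import signal
import time

pending = set()      # child PIDs we still expect to reap
exit_codes = {}      # pid -> exit status, filled in by the handler

def on_sigchld(signum, frame):
    # Only reap: no logging, no locks, no other work.
    for pid in list(pending):
        done, status = os.waitpid(pid, os.WNOHANG)
        if done == pid:
            pending.discard(pid)
            exit_codes[pid] = (
                os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
            )

def spawn_and_reap(code, timeout=5.0):
    old = signal.signal(signal.SIGCHLD, on_sigchld)
    try:
        # Hold SIGCHLD back until the PID has been recorded.
        signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
        try:
            pid = os.fork()
            if pid == 0:
                os._exit(code)        # child: exit immediately
            pending.add(pid)
        finally:
            signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGCHLD})
        deadline = time.monotonic() + timeout
        while pid in pending and time.monotonic() < deadline:
            time.sleep(0.01)          # main loop; the handler runs in between
        return exit_codes.get(pid)
    finally:
        signal.signal(signal.SIGCHLD, old)
```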

--- Synchronet 3.21d-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.programmer on Tue Jan 7 15:24:11 2025

From Newsgroup: comp.unix.programmer

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <vlioum$24bqm$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

That's not proper asynch, it's still sequential. Proper asynch is when the
program execution path is directly modified by external events. Otherwise
you could claim simply using the standard file I/O system is asynchronous
programming as there's no guarantee that any data has been written to the disk
before write(), fprintf() etc return.

This is conflating multiple things. Most IO operations dealing
with the actual hardware _are_ asynchronous (this is what
McIlroy meant in the quote I posted earlier). The system call
interface gives the program the illusion of those happening
sequentially, but that's not how the devices really work.

Indeed, and it was subsequently recognized that more than
the 'sync'[*] system call was required for applications to
ensure data was successfully written to the underlying
device.

Applications in those days (e.g. fsck) would access the
raw character device using the unbuffered read() and
write() system calls rather than using stdio. A key
characteristic of raw devices was that the hardware DMA would
use the application buffer directly rather than copying
the data to the kernel buffer pool first.

[*] I recall using 'sync;sync;sync' from the Sixth Edition
command line more than once, before rebooting.

Subsequently APIs like fdatasync(2) and open flags
such as O_DIRECT and O_DSYNC were added.
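
For illustration, a short Python sketch of the two approaches just mentioned: an explicit `fdatasync` after writing, or opening with `O_DSYNC` so each write returns only after the data reaches the device. The function name `durable_write` and its `sync_on_write` parameter are assumptions; `os.fdatasync` and `os.O_DSYNC` are the standard-library wrappers available on Linux and similar systems.

```python
import os

def durable_write(path, data, sync_on_write=False):
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if sync_on_write:
        flags |= os.O_DSYNC           # every write() waits for the data
    fd = os.open(path, flags, 0o600)
    try:
        view = memoryview(data)
        while view:                   # write() may be partial; loop until done
            written = os.write(fd, view)
            view = view[written:]
        if not sync_on_write:
            os.fdatasync(fd)          # flush file data (not all metadata)
    finally:
        os.close(fd)
```

Without either step, a successful `write()` only means the data reached the kernel's buffer cache.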

It turns out the simple model of early research Unix was
insufficient for handling all sorts of important use cases,
hence why interfaces like `select` and `poll` were added.

Although to be fair, select(2) originated with BSD and is
a bit less flexible than poll(2).
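
A small side-by-side sketch in Python, whose `select` module wraps both calls. The helper names `ready_with_select` and `ready_with_poll` are made up for the example. One concrete difference: select(2) works on fixed-size descriptor sets (limited by FD_SETSIZE), while poll(2) takes an array of descriptors, each with its own event mask.

```python
import os
import select

def ready_with_select(fd, timeout):
    # select(): three descriptor sets (read, write, exceptional).
    readable, _, _ = select.select([fd], [], [], timeout)
    return bool(readable)

def ready_with_poll(fd, timeout):
    # poll(): register each descriptor with the events of interest.
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    events = poller.poll(int(timeout * 1000))   # timeout in milliseconds
    return any(mask & select.POLLIN for _fd, mask in events)
```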

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Tue Jan 7 15:35:44 2025

From Newsgroup: comp.unix.programmer

In article <vljgbg$28o6f$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Tue, 7 Jan 2025 14:14:38 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vljcbk$27v6l$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Tue, 7 Jan 2025 13:18:54 -0000 (UTC)

cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlip2c$24ccb$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Mon, 06 Jan 2025 16:46:56 GMT
ITYF it is VERY widely shared and having a signal safe API function is only
step 2 - plenty of the functions in the program itself or 3rd party library
functions are probably not re-entrant safe and even if they are, having
code stomp over itself - eg if in the middle of writing a log message then a
signal is generated which tried to write a log message itself - is a very
poor way to write code.

So don't write code that way. It does not follow that the only
thing you can do in a signal handler is set some atomic flag
somewhere.

Just because you can doesn't mean you should. C lets you do a lot of things
that are a Bad Idea.

I have to ask at this point: have you ever written a concurrent
program under Unix? One that used signals? For that matter,
have you ever written a program that used `fork()` and caught a
`SIGCHLD`?

Is that supposed to be a serious question?

Yes.

The only thing that should ever be done in a child exit handler is a wait*()
or set a flag.

I think perhaps you should try to write some complex programs in
the Unix environment before making such a categorical statement.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Tue Jan 7 15:53:57 2025

From Newsgroup: comp.unix.programmer

On Tue, 7 Jan 2025 15:35:44 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vljgbg$28o6f$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Tue, 7 Jan 2025 14:14:38 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vljcbk$27v6l$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Tue, 7 Jan 2025 13:18:54 -0000 (UTC)

cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlip2c$24ccb$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Mon, 06 Jan 2025 16:46:56 GMT
ITYF it is VERY widely shared and having a signal safe API function is only
step 2 - plenty of the functions in the program itself or 3rd party library
functions are probably not re-entrant safe and even if they are, having
code stomp over itself - eg if in the middle of writing a log message then a
signal is generated which tried to write a log message itself - is a very
poor way to write code.

So don't write code that way. It does not follow that the only
thing you can do in a signal handler is set some atomic flag
somewhere.

Just because you can doesn't mean you should. C lets you do a lot of things
that are a Bad Idea.

I have to ask at this point: have you ever written a concurrent
program under Unix? One that used signals? For that matter,
have you ever written a program that used `fork()` and caught a
`SIGCHLD`?

Is that supposed to be a serious question?

Yes.

The only thing that should ever be done in a child exit handler is a wait*()
or set a flag.

I think perhaps you should try to write some complex programs in
the Unix environment before making such a categorical statement.

Don't be patronising. I've probably written more unix software in 30 years
than you've had hot dinners, including a fully featured telnetd and numerous
other servers for work and play. And in the places I've worked, which included
finance/banking, aerospace and government, the advice was almost always NOT
to use signals in the first place unless there was no choice - eg SIGCHLD -
but if you did then do very little in the handler and nothing that could
cause any re-entrancy issues.

I suspect it's you who needs a bit more practice at writing large multi process and multi threaded applications.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Nicolas George@nicolas$george@salle-s.org to comp.unix.programmer on Tue Jan 7 15:54:48 2025

From Newsgroup: comp.unix.programmer

Dan Cross, in message <vljbvv$gl9$1@reader2.panix.com>, wrote:

This, however, does not follow. I don't see why "poll" is
strictly required for IO concurrency.

Well, try to implement anything non-trivial involving I/O concurrency,
including timeouts, clients causing other clients to abort, etc., with
the common denominator of POSIX threads and come back telling us how you
managed that.

I tried, and stopped trying to use threads for I/O concurrency.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Tue Jan 7 15:56:46 2025

From Newsgroup: comp.unix.programmer

On 07 Jan 2025 15:54:48 GMT
Nicolas George <nicolas$george@salle-s.org> wibbled:

Dan Cross, in message <vljbvv$gl9$1@reader2.panix.com>, wrote:

This, however, does not follow. I don't see why "poll" is
strictly required for IO concurrency.

Well, try to implement anything non-trivial involving I/O concurrency,
including timeouts, clients causing other clients to abort, etc., with
the common denominator of POSIX threads and come back telling us how you
managed that.

I tried, and stopped trying to use threads for I/O concurrency.

For some mad reason it seems to be the way to do it in Windows and also Java IIRC.

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Tue Jan 7 16:02:51 2025

From Newsgroup: comp.unix.programmer

In article <vljg72$28nj0$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Tue, 7 Jan 2025 14:13:29 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vlioum$24bqm$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Mon, 6 Jan 2025 16:39:49 -0000 (UTC)

That's not the same as the program itself being asynch.

Isn't it? The point is that the program kicks off multiple

No, it isn't. Using operating system facilities is standard programming, it's
not asynch programming whereby the program execution will automatically
be diverted when an operation completes.

That's not proper asynch, it's still sequential. Proper asynch is when the
program execution path is directly modified by external events. Otherwise
you could claim simply using the standard file I/O system is asynchronous
programming as there's no guarantee that any data has been written to the disk
before write(), fprintf() etc return.

This is conflating multiple things. Most IO operations dealing
with the actual hardware _are_ asynchronous (this is what
McIlroy meant in the quote I posted earlier). The system call
interface gives the program the illusion of those happening
sequentially, but that's not how the devices really work.

And? By your definition it's still asynchronous programming.

In the kernel, it sure is. Unix programmers have been writing
asynchronous programs (using e.g. `fork`) since 1970.

Sure, but as I've said before, signals should only set flags to be processed
later.

You said that, but that flies in the face of 50 years of
evidence to the contrary and the letter of the standard. This

Please don't just make stuff up.

Hmm. I wonder what shell you use, if you use Unix at all.

Here for example is the signal handler for SIGINT in bash: https://git.savannah.gnu.org/cgit/bash.git/tree/sig.c?h=devel#n691

You'll notice that it's rather more complex than just "set a
flag or `wait`."

Here's the handler in `zsh`: https://github.com/zsh-users/zsh/blob/263659acb73d0222e641dfd8d37e48e96582de02/Src/signals.c#L399

You'll again notice that it does rather more than just "set a
flag or wait."

Here's the SIGWINCH handler for good 'ol `script` from
OpenBSD: https://github.com/openbsd/src/blob/6d253f95424ee0054c798f493d12377911cd3668/usr.bin/script/script.c#L224

Wow, that one does ioctl's modifying PTY state and signals an
entire process group. That's definitely a bit more than just
"setting a flag or wait."

Here's an example from Postgres: https://github.com/postgres/postgres/blob/5b291d1c9c09d75982c3270bfa61d4e822087b6a/src/backend/storage/ipc/latch.c#L2269

Wow, that one writes to a pipe. Definitely a bit more than just
"setting a flag or wait."

Those are just a few examples. If one cares to look, one will
find many more in non-trivial programs used in production daily.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Tue Jan 7 16:10:55 2025

From Newsgroup: comp.unix.programmer

In article <vljiml$296n5$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:
On Tue, 7 Jan 2025 15:35:44 -0000 (UTC)

cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vljgbg$28o6f$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:
[snip]

I have to ask at this point: have you ever written a concurrent
program under Unix? One that used signals? For that matter,
have you ever written a program that used `fork()` and caught a
`SIGCHLD`?

Is that supposed to be a serious question?

Yes.

The only thing that should ever be done in a child exit handler is a wait*()
or set a flag.

I think perhaps you should try to write some complex programs in
the Unix environment before making such a categorical statement.

Don't be patronising.

Wow, that's rich coming from you, my guy.

I've probably written more unix software in 30 years
than you've had hot dinners including a fully featured telnetd and numerous
other servers for work and play. And in the places I've worked which included
finance/banking, aerospace and government,

"For play" implies things that could be, or are, open source.
So post a link to code, then.

Bluntly, I don't believe that any of this is true. Your posts
here show a distinct lack of relevant experience and knowledge.

the advice was almost always NOT to
use signals in the first place unless there was no choice - eg SIGCHLD - but if
you did then do very little in the handler and nothing that could cause any
re-entrancy issues.

Earlier you said you should _only_ set a flag in a signal
handler. Then you moved the goal posts to say that you could
call `wait` (presumably after I posted that example). Now
you're just saying that you should do "very little that could
cause any re-entrancy issues", which is something I more or
less said a few posts ago, but again moving those pesky goal
posts when it suits you.

So which is it?

I suspect it's you who needs a bit more practice at writing large multi-process
and multi-threaded applications.

*shrug* Feel free to look up some things that I've written, if
you like. Perhaps you'll learn something.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Rainer Weikusat@rweikusat@talktalk.net to comp.unix.programmer on Tue Jan 7 16:13:41 2025

From Newsgroup: comp.unix.programmer

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[...]

there is no reason you cannot, say, have a signal handler that
broadcasts on a condition variable after an asynchronous IO operation completes, thus waking up a thread.

The pthread_cond_* calls are not async-signal safe and hence, this is
either undefined behaviour (newly introduced with POSIX.1-2024) or
undefined behaviour if the signal handler interrupted something that
isn't async-signal safe (prior to POSIX.1-2024 and still retained in the current text).

However, POSIX semaphores can safely be used for that.
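
A minimal sketch of the semaphore approach mentioned above, assuming an unnamed POSIX semaphore and SIGUSR1 as the triggering signal. The names `wake_sem`, `on_sigusr1`, `waiter` and `semaphore_wakeup_demo` are hypothetical and do not come from the thread. It relies on `sem_post()` being on the POSIX list of async-signal-safe functions, which is what makes it usable where `pthread_cond_signal()` is not.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <string.h>

static sem_t wake_sem;   /* hypothetical name; posted only by the handler */

/* Signal handler: sem_post() is async-signal-safe, so this is permitted. */
static void on_sigusr1(int signo)
{
    int saved_errno = errno;   /* preserve errno for the interrupted code */
    (void)signo;
    sem_post(&wake_sem);
    errno = saved_errno;
}

/* Worker thread: sleeps on the semaphore until the handler posts it. */
static void *waiter(void *arg)
{
    int *woken = arg;
    int rc;
    do {
        rc = sem_wait(&wake_sem);
    } while (rc == -1 && errno == EINTR);   /* retry if interrupted */
    *woken = (rc == 0);
    return NULL;
}

/* Returns 1 if the waiting thread was released via the signal handler,
 * 0 if the wait failed, -1 if setup failed. */
int semaphore_wakeup_demo(void)
{
    struct sigaction sa;
    pthread_t tid;
    int woken = 0;
    int rc;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    if (sem_init(&wake_sem, 0, 0) == -1 ||
        sigaction(SIGUSR1, &sa, NULL) == -1 ||
        pthread_create(&tid, NULL, waiter, &woken) != 0)
        return -1;

    raise(SIGUSR1);            /* handler runs in this thread and posts */
    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    (void)rc;
    sem_destroy(&wake_sem);
    return woken;
}
```

Calling `semaphore_wakeup_demo()` should return 1 on a system with unnamed semaphore support; platforms where `sem_init()` is not implemented would need a named semaphore instead.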
--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Tue Jan 7 16:17:15 2025

From Newsgroup: comp.unix.programmer

In article <677d4e48$0$28053$426a74cc@news.free.fr>,
Nicolas George <nicolas$george@salle-s.org> wrote:

Dan Cross, in message <vljbvv$gl9$1@reader2.panix.com>, wrote:

This, however, does not follow. I don't see why "poll" is
strictly required for IO concurrency.

Well, try to implement anything non-trivial involving I/O concurrency,
including timeouts, clients causing other clients to abort, etc., with
the common denominator of POSIX threads and come back telling us how you
managed that.

Well, this is rather more involved than what you'd originally
said, which was just IO concurrency.

But if you've got a dedicated IO thread with a known tid, I
don't see why you couldn't do this with
`pthread_cond_timedwait` and signals.

But at this point, I'll gladly admit that `poll` et al may be
more convenient, if not required.

I tried, and stopped trying to use threads for I/O concurrency.

I'm not saying it's easy.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Tue Jan 7 16:56:38 2025

From Newsgroup: comp.unix.programmer

On Tue, 7 Jan 2025 16:02:51 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vljg72$28nj0$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:
On Tue, 7 Jan 2025 14:13:29 -0000 (UTC)

cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

This is conflating multiple things. Most IO operations dealing
with the actual hardware _are_ asynchronous (this is what
McIlroy meant in the quote I posted earlier). The system call
interface gives the program the illusion of those happening
sequentially, but that's not how the devices really work.

And? By your definition it's still asynchronous programming.

In the kernel, it sure is. Unix programmers have been writing
asynchronous programs (using e.g. `fork`) since 1970.

That's not what we're discussing here and you know it.

Please don't just make stuff up.

Hmm. I wonder what shell you use, if you use Unix at all.

Stupid comments really are your forte, aren't they.

Here for example is the signal handler for SIGINT in bash: https://git.savannah.gnu.org/cgit/bash.git/tree/sig.c?h=devel#n691

Basically sets flags.

Here's the SIGWINCH handler for good 'ol `script` from
OpenBSD: https://github.com/openbsd/src/blob/6d253f95424ee0054c798f493d12377911cd3668/usr.bin/script/script.c#L224

Not a clever way to do it because an xterm and other terminal progs can indirectly cause a whole load of SIGWINCH to be created if someone is resizing it and only the final one really needs the ioctl call done. Better to set a flag then manually do a call when appropriate.
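
A minimal sketch of the "set a flag, do the ioctl later" approach described above, assuming a single-threaded program with a main loop that periodically calls a check function. The names `winch_pending`, `install_winch_handler` and `consume_winch` are hypothetical. Several SIGWINCH deliveries collapse into one pending flag, so only one `TIOCGWINSZ` query is issued per batch of resizes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/ioctl.h>
#include <termios.h>
#include <unistd.h>

/* Written by the handler, read and cleared by the main loop. */
static volatile sig_atomic_t winch_pending;

/* The handler does nothing except record that a resize happened. */
static void on_sigwinch(int signo)
{
    (void)signo;
    winch_pending = 1;
}

/* Installs the handler; returns 0 on success, -1 on error. */
int install_winch_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigwinch;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGWINCH, &sa, NULL);
}

/* Called from the main loop. Returns 1 if a resize was pending (and
 * queries the window size of fd into *ws), 0 if nothing was pending. */
int consume_winch(int fd, struct winsize *ws)
{
    assert(ws != NULL);
    if (!winch_pending)
        return 0;
    winch_pending = 0;           /* clear first so a later signal is kept */
    memset(ws, 0, sizeof *ws);
    (void)ioctl(fd, TIOCGWINSZ, ws);   /* leaves *ws zeroed if fd is no tty */
    return 1;
}
```

The trade-off is the one raised later in the thread: the flag is only acted on when control returns to wherever `consume_winch()` is called, so code stuck in a long computation will not notice the resize until it gets there.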

Those are just a few examples. If one cares to look, one will
find many more in non-trivial programs used in production daily.

There are always exceptions to every rule. You seem to be so desperate to
win this argument I can only assume your fragile ego has been burst by
someone having the temerity to disagree with you. Tough, suck it up.

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Tue Jan 7 17:01:09 2025

From Newsgroup: comp.unix.programmer

In article <87ttaa7gay.fsf@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rweikusat@talktalk.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

[...]

there is no reason you cannot, say, have a signal handler that
broadcasts on a condition variable after an asynchronous IO operation
completes, thus waking up a thread.

The pthread_cond_* calls are not async-signal safe and hence, this is
either undefined behaviour (newly introduced with POSIX.1-2024) or
undefined behaviour if the signal handler interrupted something that
isn't async-signal safe (prior to POSIX.1-2024 and still retained in the
current text).

You are correct; I was wrong about condvars. From: https://pubs.opengroup.org/onlinepubs/9799919799/functions/pthread_cond_broadcast.html

|It is not safe to use the pthread_cond_signal() function in a
|signal handler that is invoked asynchronously. Even if it were
|safe, there would still be a race between the test of the
|Boolean pthread_cond_wait() that could not be efficiently
|eliminated.
|
|Mutexes and condition variables are thus not suitable for
|releasing a waiting thread by signaling from code running in a
|signal handler.

(Curiously they make no mention of `pthread_cond_broadcast`
here; I suppose the same rationale applies.)

However, POSIX semaphores can safely be used for that.

Another mechanism might be to have a thread blocked in `sigwait`
or `sigtimedwait` and use `pthread_kill` to signal it.
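
A minimal sketch of that `sigwait` arrangement, assuming SIGUSR1 as the wake-up signal; the names `sigwait_thread` and `sigwait_demo` are hypothetical. The signal is blocked before the thread is created so that it stays pending until `sigwait()` accepts it, and no asynchronous handler runs at all.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>

/* Thread body: waits synchronously for SIGUSR1 and records what arrived. */
static void *sigwait_thread(void *arg)
{
    int *received = arg;
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    if (sigwait(&set, received) != 0)
        *received = -1;
    return NULL;
}

/* Returns the signal number accepted by the waiting thread, or -1. */
int sigwait_demo(void)
{
    sigset_t set, old;
    pthread_t tid;
    int received = 0;
    int rc;

    /* Block SIGUSR1 first: the new thread inherits this mask, so the
     * signal is held pending rather than delivered to a handler. */
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    if (pthread_sigmask(SIG_BLOCK, &set, &old) != 0)
        return -1;
    if (pthread_create(&tid, NULL, sigwait_thread, &received) != 0)
        return -1;

    rc = pthread_kill(tid, SIGUSR1);   /* direct the signal at that thread */
    assert(rc == 0);
    (void)rc;
    pthread_join(tid, NULL);
    pthread_sigmask(SIG_SETMASK, &old, NULL);
    return received;
}
```

Because the signal is consumed by `sigwait()` in ordinary thread context, the code that runs afterwards is not restricted to async-signal-safe functions.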

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Tue Jan 7 17:01:49 2025

From Newsgroup: comp.unix.programmer

On Tue, 7 Jan 2025 16:10:55 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vljiml$296n5$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:
On Tue, 7 Jan 2025 15:35:44 -0000 (UTC)

I think perhaps you should try to write some complex programs in
the Unix environment before making such a categorical statement.

Don't be patronising.

Wow, that's rich coming from you, my guy.

"My guy?" LOL, how old are you, 12? :)

I've probably written more unix software in 30 years
than you've had hot dinners including a fully featured telnetd and numerous
other servers for work and play. And in the places I've worked which included
finance/banking, aerospace and government,

"For play" implies things that could be, or are, open source.
So post a link to code, then.

Nope, I like my relative anonymity here and I don't need to prove anything to some twat with a chip on his shoulder getting worked up over technical trivia.

Bluntly, I don't believe that any of this is true. Your posts

Believe what you like, I couldn't give a rats arse.

here show a distinct lack of relevant experience and knowledge.

Whatever you say genius.

*shrug* Feel free to look up some things that I've written, if
you like. Perhaps you'll learn something.

Is this yours?

https://github.com/dancrossnyc

Am I supposed to be impressed?

--- Synchronet 3.21d-Linux NewsLink 1.2

From gazelle@gazelle@shell.xmission.com (Kenny McCormack) to comp.unix.programmer on Tue Jan 7 17:16:29 2025

From Newsgroup: comp.unix.programmer

In article <vljjmf$g76$2@reader2.panix.com>,
Dan Cross <cross@spitfire.i.gajendra.net> wrote:
...

"For play" implies things that could be, or are, open source.
So post a link to code, then.

Bluntly, I don't believe that any of this is true. Your posts
here show a distinct lack of relevant experience and knowledge.

Or dementia. Lot of that going around in the world today.

Many Usenet posters fall into this category. They may have been smart/accomplished once upon a time, but that was years/decades ago, and
now they're just living on memories.

Seriously, I've seen a lot of this over the years, and we should cut this
guy some slack.
--
After 4 years of disastrous screwups, Trump now favors 3 policies that I support:
1) $2K/pp stimulus money. Who doesn't want more money?
2) Water pressure. My shower doesn't work very well; I want Donnie to come fix it.
3) Repeal of Section 230. This will lead to the demise of Face/Twit/Gram. Yey!
--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Tue Jan 7 17:19:56 2025

From Newsgroup: comp.unix.programmer

In article <vljmc6$29tkd$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:
On Tue, 7 Jan 2025 16:02:51 -0000 (UTC)

cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vljg72$28nj0$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:
On Tue, 7 Jan 2025 14:13:29 -0000 (UTC)

cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

This is conflating multiple things. Most IO operations dealing
with the actual hardware _are_ asynchronous (this is what
McIlroy meant in the quote I posted earlier). The system call
interface gives the program the illusion of those happening
sequentially, but that's not how the devices really work.

And? By your definition it's still asynchronous programming.

In the kernel, it sure is. Unix programmers have been writing
asynchronous programs (using e.g. `fork`) since 1970.

That's not what we're discussing here and you know it.

Actually, it is.

Please don't just make stuff up.

Hmm. I wonder what shell you use, if you use Unix at all.

Stupid comments really are your forte, aren't they.

I see that you can't support your argument.

Here for example is the signal handler for SIGINT in bash: https://git.savannah.gnu.org/cgit/bash.git/tree/sig.c?h=devel#n691

Basically sets flags.

Did you actually read and understand any of that code?

Here's the SIGWINCH handler for good 'ol `script` from
OpenBSD: https://github.com/openbsd/src/blob/6d253f95424ee0054c798f493d12377911cd3668/usr.bin/script/script.c#L224

Not a clever way to do it because an xterm and other terminal progs can
indirectly cause a whole load of SIGWINCH to be created if someone is resizing
it and only the final one really needs the ioctl call done. Better to set a
flag then manually do a call when appropriate.

Ok. You may even be right! But tell me: where would you check
those flags?

Regardless, here you are, again, moving the goalposts in the
face of evidence that contradicted your earlier position.

Those are just a few examples. If one cares to look, one will
find many more in non-trivial programs used in production daily.

There are always exceptions to every rule. You seem to be so desperate to
win this argument I can only assume your fragile ego has been burst by
someone having the temerity to disagree with you. Tough, suck it up.

Ah, here we go. The classic attempt at an insult.

Look, you made categorical, definitive statements. Those
statements were factually incorrect. I pointed that out. You
seem to be pretty upset about that and want to argue the
point, no matter how much evidence to the contrary you are
presented with.

Perhaps I am not the one with the fragile ego that needs to suck
it up when disagreed with.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Tue Jan 7 17:23:01 2025

From Newsgroup: comp.unix.programmer

In article <vljmlt$29vt3$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:
On Tue, 7 Jan 2025 16:10:55 -0000 (UTC)

cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vljiml$296n5$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:
On Tue, 7 Jan 2025 15:35:44 -0000 (UTC)

I think perhaps you should try to write some complex programs in
the Unix environment before making such a categorical statement.

Don't be patronising.

Wow, that's rich coming from you, my guy.

"My guy?" LOL, how old are you, 12? :)

I've probably written more unix software in 30 years
than you've had hot dinners including a fully featured telnetd and numerous
other servers for work and play. And in the places I've worked which included
finance/banking, aerospace and government,

"For play" implies things that could be, or are, open source.
So post a link to code, then.

Nope, I like my relative anonymity here and I don't need to prove anything to
some twat with a chip on his shoulder getting worked up over technical trivia.

Ok. So we're just supposed to take your word for it, I guess.
Got it.

Bluntly, I don't believe that any of this is true. Your posts

Believe what you like, I couldn't give a rats arse.

You also have no evidence to back up your claims, it seems.

here show a distinct lack of relevant experience and knowledge.

Whatever you say genius.

*shrug* Feel free to look up some things that I've written, if
you like. Perhaps you'll learn something.

Is this yours?

https://github.com/dancrossnyc

Yup, that's me.

Am I supposed to be impressed?

*shrug* I think my credentials speak for themselves. I really
don't care whether you're impressed or not.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Tue Jan 7 17:31:37 2025

From Newsgroup: comp.unix.programmer

In article <vGbfP.54357$XfF8.7280@fx04.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <vlioum$24bqm$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

That's not proper asynch, it's still sequential. Proper asynch is when the
program execution path is directly modified by external events. Otherwise
you could claim simply using the standard file I/O system is asynchronous
programming as there's no guarantee that any data has been written to the disk
before write(), fprintf() etc return.

This is conflating multiple things. Most IO operations dealing
with the actual hardware _are_ asynchronous (this is what
McIlroy meant in the quote I posted earlier). The system call
interface gives the program the illusion of those happening
sequentially, but that's not how the devices really work.

Indeed, and it was subsequently recognized that more than
the 'sync'[*] system call was required for applications to
ensure data was successfully written to the underlying
device.

Yup.

Applications in those days (e.g. fsck) would access the
raw character device using the unbuffered read() and
write() system calls rather than using stdio. A key
characteristic of raw devices was that the hardware DMA would
use the application buffer directly rather than copying
the data to the kernel buffer pool first.

They still do!

[*] I recall using 'sync;sync;sync' from the Sixth Edition
command line more than once, before rebooting.

I've always felt the triple-sync thing was kind of superstition.

But this makes some sense when one considers that a sync would
trigger additional IO requests that would be scheduled, but
not necessarily completed.

Still, once disk drivers started optimizing write order and
thus divorcing ordering of requests to the device from the
chronological order the requests arrived in, all bets were out
the window.

Subsequently APIs like fdatasync(2) and open flags
such as O_DIRECT and O_DSYNC were added.
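
A minimal sketch of the two interfaces just named, assuming a Linux or other POSIX.1-2008 system. The helper names `write_durably` and `open_dsync` are hypothetical. The first flushes file data explicitly with `fdatasync()` after writing; the second opens the file with `O_DSYNC` so that each `write()` returns only once the data has reached stable storage, as far as the kernel and device report it. `O_DIRECT` is left out because it is Linux-specific and imposes buffer alignment requirements.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes len bytes to path, then asks the kernel to flush the file's data
 * to the underlying device. Returns 0 on success, -1 on error. */
int write_durably(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    int fd;

    assert(path != NULL && buf != NULL);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;
    while (len > 0) {                    /* write() may be partial */
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    if (fdatasync(fd) == -1) {           /* data only, not all metadata */
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Opens path so that every write() behaves as if followed by fdatasync().
 * Returns a file descriptor, or -1 on error. */
int open_dsync(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_DSYNC, 0600);
}
```

Which of the two fits better depends on the write pattern: one `fdatasync()` after a batch of writes usually costs less than paying the synchronous penalty on every `write()`.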

It turns out the simple model of early research Unix was
insufficient for handling all sorts of important use cases,
hence why interfaces like `select` and `poll` were added.

Although to be fair, select(2) originated with BSD and is
a bit less flexible than poll(2).

This is fair.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Tue Jan 7 17:40:18 2025

From Newsgroup: comp.unix.programmer

In article <vljnhd$2mih3$1@news.xmission.com>,
Kenny McCormack <gazelle@shell.xmission.com> wrote:

In article <vljjmf$g76$2@reader2.panix.com>,
Dan Cross <cross@spitfire.i.gajendra.net> wrote:
...

"For play" implies things that could be, or are, open source.
So post a link to code, then.

Bluntly, I don't believe that any of this is true. Your posts
here show a distinct lack of relevant experience and knowledge.

Or dementia. Lot of that going around in the world today.

Many Usenet posters fall into this category. They may have been
smart/accomplished once upon a time, but that was years/decades ago, and
now they're just living on memories.

Seriously, I've seen a lot of this over the years, and we should cut this
guy some slack.

A lot of USENET folks talk the talk without ever having walked
the walk, but you could be right.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.programmer on Tue Jan 7 19:09:53 2025

From Newsgroup: comp.unix.programmer

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <vGbfP.54357$XfF8.7280@fx04.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <vlioum$24bqm$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

Applications in those days (e.g. fsck) would access the
raw character device using the unbuffered read() and
write() system calls rather than using stdio. A key
characteristic of raw devices was that the hardware DMA would
use the application buffer directly rather than copying
the data to the kernel buffer pool first.

They still do!

Well, not on linux. Even O_DIRECT still goes through
the file cache, last I checked.

I submitted a 'raw device' patch to the linux mailing
list in the late 90s while at SGI. It wasn't accepted because
O_DIRECT was considered sufficient, even with the spurious
copy. The overhead of pinning the user space pages was
considered onerous.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Lawrence D'Oliveiro@ldo@nz.invalid to comp.unix.programmer on Wed Jan 8 02:36:01 2025

From Newsgroup: comp.unix.programmer

On Tue, 7 Jan 2025 15:56:46 -0000 (UTC), Muttley wrote:

On 07 Jan 2025 15:54:48 GMT Nicolas George <nicolas$george@salle-s.org> wibbled:

I tried, and stopped trying using threads for I/O concurrency.

For some mad reason it seems to be the way to do it in Windows and also
Java IIRC.

Remember the era: it was the 1990s, when threads were still a new thing to
PC OSes, and they were considered the best way to do everything involving nondeterminism, including GUIs.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Lawrence D'Oliveiro@ldo@nz.invalid to comp.unix.programmer on Wed Jan 8 02:36:56 2025

From Newsgroup: comp.unix.programmer

On 07 Jan 2025 08:59:28 GMT, Nicolas George wrote:

Lawrence D'Oliveiro, in message <vli2lj$1t3lt$8@dont-email.me>, wrote:

Linux offers signalfd, so you can indeed use poll(2) in a thread to be
woken up by any file descriptor, including a signal one (and that
includes POSIX real-time signals).

Proving my point that you need to use poll() even when doing threads.

But you said "it is not possible to poll() on a thread condition", when in
fact such usage is commonplace, as I pointed out.
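
A minimal, Linux-only sketch of the `signalfd` usage under discussion, assuming SIGUSR1 and a single thread for brevity; the function name `signalfd_poll_demo` is hypothetical. The signal is blocked, turned into a readable file descriptor, and then picked up through an ordinary `poll()` call alongside whatever other descriptors a real program would watch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <poll.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the signal number read from the signalfd, or -1 on error. */
int signalfd_poll_demo(void)
{
    sigset_t mask, old;
    struct signalfd_siginfo info;
    struct pollfd pfd;
    int result = -1;
    int sfd;

    sigemptyset(&mask);
    sigaddset(&mask, SIGUSR1);
    /* Block the signal so it is queued for the fd, not delivered. */
    if (sigprocmask(SIG_BLOCK, &mask, &old) == -1)
        return -1;

    sfd = signalfd(-1, &mask, SFD_CLOEXEC);
    if (sfd != -1) {
        kill(getpid(), SIGUSR1);         /* stand-in for an external event */

        pfd.fd = sfd;
        pfd.events = POLLIN;
        pfd.revents = 0;
        if (poll(&pfd, 1, 5000) == 1 && (pfd.revents & POLLIN)) {
            ssize_t n = read(sfd, &info, sizeof info);
            assert(n == -1 || n == (ssize_t)sizeof info);
            if (n == (ssize_t)sizeof info)
                result = (int)info.ssi_signo;
        }
        close(sfd);
    }
    sigprocmask(SIG_SETMASK, &old, NULL);
    return result;
}
```

In a threaded program the same idea applies with `pthread_sigmask()` in place of `sigprocmask()`, with the signal blocked in every thread so that it is only ever consumed through the descriptor.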
--- Synchronet 3.21d-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.programmer on Wed Jan 8 03:23:13 2025

From Newsgroup: comp.unix.programmer

Lawrence D'Oliveiro <ldo@nz.invalid> writes:

On 07 Jan 2025 08:59:28 GMT, Nicolas George wrote:

Lawrence D'Oliveiro, in message <vli2lj$1t3lt$8@dont-email.me>, wrote:

Linux offers signalfd, so you can indeed use poll(2) in a thread to be
woken up by any file descriptor, including a signal one (and that
includes POSIX real-time signals).

Proving my point that you need to use poll() even when doing threads.

But you said "it is not possible to poll() on a thread condition", when in
fact such usage is commonplace, as I pointed out.

It is perfectly possible to poll() on a thread condition. See pipe(2).
--- Synchronet 3.21d-Linux NewsLink 1.2

From Nicolas George@nicolas$george@salle-s.org to comp.unix.programmer on Wed Jan 8 07:52:24 2025

From Newsgroup: comp.unix.programmer

Scott Lurndal, in message <BcmfP.289889$aTp4.50420@fx09.iad>, wrote:

It is perfectly possible to poll() on a thread condition. See pipe(2).

It is possible to write code to wake a thread blocked in poll(). I have not tried to deny this. It is not possible to poll() directly on a thread condition. You have not disproved that.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Wed Jan 8 08:20:42 2025

From Newsgroup: comp.unix.programmer

On Tue, 7 Jan 2025 17:19:56 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vljmc6$29tkd$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:
On Tue, 7 Jan 2025 16:02:51 -0000 (UTC)

In the kernel, it sure is. Unix programmers have been writing
asynchronous programs (using e.g. `fork`) since 1970.

That's not what we're discussing here and you know it.

Actually, it is.

Ah ok, goalposts moved. Why not some straw men while you're at it?

https://git.savannah.gnu.org/cgit/bash.git/tree/sig.c?h=devel#n691

Basically sets flags.

Did you actually read and understand any of that code?

Did you?

Not a clever way to do it because an xterm and other terminal progs can
indirectly cause a whole load of SIGWINCH to be created if someone is resizing
it and only the final one really needs the ioctl call done. Better to set a
flag then manually do a call when appropriate.

Ok. You may even be right! But tell me: where would you check
those flags?

Presumably a genius like you would know most terminal programs have a separate
thread or a multiplex timeout in order to flash the cursor. You work out
the rest.

Regardless, here you are, again, moving the goalposts in the
face of evidence that contradicted your earlier position.

Irony, love it.

There are always exceptions to every rule. You seem to be so desperate to
win this argument I can only assume your fragile ego has been burst by
someone having the temerity to disagree with you. Tough, suck it up.

Ah, here we go. The classic attempt at an insult.

If the shoe fits.

Look, you made categorical, definitive statements. Those
statements were factually incorrect. I pointed that out. You

No, I stated the majority approach to using signals. You disagree, which is
fine, but don't pretend your view is THE view, it isn't.

Perhaps I am not the one with the fragile ego that needs to suck
it up when disagreed with.

Ego size on usenet is almost always correlated with the verbiage of a reply.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Wed Jan 8 08:23:49 2025

From Newsgroup: comp.unix.programmer

On Tue, 7 Jan 2025 17:23:01 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vljmlt$29vt3$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:
On Tue, 7 Jan 2025 16:10:55 -0000 (UTC)

cross@spitfire.i.gajendra.net (Dan Cross) wibbled:
Nope, I like my relative anonymity here and I don't need to prove anything to
some twat with a chip on his shoulder getting worked up over technical trivia.

Ok. So we're just supposed to take your word for it, I guess.
Got it.

In one.

Believe what you like, I couldn't give a rats arse.

You also have no evidence to back up your claims, it seems.

Plenty of evidence out there.

Am I supposed to be impressed?

*shrug* I think my credentials speak for themselves. I really
don't care whether you're impressed or not.

What credentials? You work for some neverheardofit software house and have mostly done a load of forks of other peoples code. Standard dev stuff, nothing special.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Wed Jan 8 08:26:06 2025

From Newsgroup: comp.unix.programmer

On Tue, 07 Jan 2025 19:09:53 GMT
scott@slp53.sl.home (Scott Lurndal) wibbled:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <vGbfP.54357$XfF8.7280@fx04.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <vlioum$24bqm$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

Applications in those days (e.g. fsck) would access the
raw character device using the unbuffered read() and
write() system calls rather than using stdio. A key
characteristic of raw devices was that the hardware DMA would
use the application buffer directly rather than copying
the data to the kernel buffer pool first.

They still do!

Well, not on linux. Even O_DIRECT still goes through
the file cache, last I checked.

I submitted a 'raw device' patch to the linux mailing
list in the late 90s while at SGI. It wasn't accepted because
O_DIRECT was considered sufficient, even with the spurious
copy. The overhead of pinning the user space pages was
considered onerous.

I vaguely remember Oracle complaining about this back in the day meaning
they couldn't guarantee their RDBMS transaction writes on Linux. Not sure
if or when it was solved.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Wed Jan 8 08:27:45 2025

From Newsgroup: comp.unix.programmer

On Wed, 8 Jan 2025 02:36:01 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> wibbled:

On Tue, 7 Jan 2025 15:56:46 -0000 (UTC), Muttley wrote:

On 07 Jan 2025 15:54:48 GMT Nicolas George <nicolas$george@salle-s.org>
wibbled:

I tried, and stopped trying using threads for I/O concurrency.

For some mad reason it seems to be the way to do it in Windows and also
Java IIRC.

Remember the era: it was the 1990s, when threads were still a new thing to
PC OSes, and they were considered the best way to do everything involving
nondeterminism, including GUIs.

Unfortunately there are still far too many programmers around for whom threads
are their go-to hammer no matter what problem they're trying to solve.

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Wed Jan 8 12:19:13 2025

From Newsgroup: comp.unix.programmer

In article <vllcml$2mqb8$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:
On Tue, 7 Jan 2025 17:23:01 -0000 (UTC)

cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vljmlt$29vt3$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:
On Tue, 7 Jan 2025 16:10:55 -0000 (UTC)

cross@spitfire.i.gajendra.net (Dan Cross) wibbled:
Nope, I like my relative anonymity here and I don't need to prove anything to
some twat with a chip on his shoulder getting worked up over technical trivia.

Ok. So we're just supposed to take your word for it, I guess.
Got it.

In one.

Believe what you like, I couldn't give a rats arse.

You also have no evidence to back up your claims, it seems.

Plenty of evidence out there.

Plenty of contradictory evidence, you mean.

Am I supposed to be impressed?

*shrug* I think my credentials speak for themselves. I really
don't care whether you're impressed or not.

What credentials? You work for some neverheardofit software house and have
mostly done a load of forks of other peoples code. Standard dev stuff, nothing
special.

Yeah, ok, sure.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Wed Jan 8 12:21:59 2025

From Newsgroup: comp.unix.programmer

In article <677e2eb8$0$375$426a34cc@news.free.fr>,
Nicolas George <nicolas$george@salle-s.org> wrote:

Scott Lurndal, in message <BcmfP.289889$aTp4.50420@fx09.iad>, wrote:

It is perfectly possible to poll() on a thread condition. See pipe(2).

It is possible to write code to wake a thread blocked in poll(). I have not
tried to deny this. It is not possible to poll() directly on a thread
condition. You have not disproved that.

I think it's important to define what you mean when you write,
"thread condition." What, exactly, is that? Perhaps you mean
a condition variable? If so, that's true, but I fail to see
the relevance: people write multithreaded code that does IO in
multiple threads all the time; there are some techniques that are common
for this (Scott alluded to the so-called "pipe trick", due to
Bernstein) and some that are less common. It may be harder or
easier depending on which techniques you employ, but it's all
doable.
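
A minimal sketch of the "pipe trick" referred to above, assuming one thread blocked in `poll()` and one other thread that wants to wake it; the names `wake_pipe`, `notifier` and `pipe_wakeup_demo` are hypothetical. The read end of a pipe sits in the polled set, and any thread that needs the poller's attention writes a single byte to the write end.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

/* [0] is polled by the IO thread, [1] is written by whoever wakes it. */
static int wake_pipe[2];

/* Any other thread: one byte on the pipe makes the poller's fd readable. */
static void *notifier(void *arg)
{
    char byte = 'w';
    (void)arg;
    if (write(wake_pipe[1], &byte, 1) != 1)
        return NULL;                 /* nothing more to do in this sketch */
    return NULL;
}

/* Returns 1 if poll() was woken by the notifier's byte, 0 if not,
 * -1 if setup failed. */
int pipe_wakeup_demo(void)
{
    struct pollfd pfd;
    pthread_t tid;
    char byte = 0;
    int ready;

    if (pipe(wake_pipe) == -1)
        return -1;
    if (pthread_create(&tid, NULL, notifier, NULL) != 0)
        return -1;

    pfd.fd = wake_pipe[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    ready = poll(&pfd, 1, 5000);     /* a real loop would poll more fds */
    assert(ready <= 1);
    if (ready == 1 && (pfd.revents & POLLIN)) {
        if (read(wake_pipe[0], &byte, 1) != 1)
            byte = 0;
    }

    pthread_join(tid, NULL);
    close(wake_pipe[0]);
    close(wake_pipe[1]);
    return byte == 'w';
}
```

Since `write()` is async-signal-safe, the same pipe can also be written from a signal handler, which is how the technique connects back to the earlier signal discussion.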

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Wed Jan 8 13:00:01 2025

From Newsgroup: comp.unix.programmer

In article <vllcgq$2mphu$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:
On Tue, 7 Jan 2025 17:19:56 -0000 (UTC)

cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vljmc6$29tkd$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:
On Tue, 7 Jan 2025 16:02:51 -0000 (UTC)

In the kernel, it sure is. Unix programmers have been writing
asynchronous programs (using e.g. `fork`) since 1970.

That's not what we're discussing here and you know it.

Actually, it is.

Ah ok, goalposts moved. Why not some straw men while you're at it?

https://git.savannah.gnu.org/cgit/bash.git/tree/sig.c?h=devel#n691

Basically sets flags.

Did you actually read and understand any of that code?

Did you?

Yes. I see that the call chains invoked from that handler wind
up calling things like `malloc`. I guess you couldn't read the
code well enough to see that for yourself.

Not a clever way to do it because an xterm and other terminal progs can indirectly cause a whole load of SIGWINCH to be created if someone is resizing it and only the final one really needs the ioctl call done. Better to set a flag then manually do a call when appropriate.

Ok. You may even be right! But tell me: where would you check
those flags?

Presumably a genius like you would know most terminal programs have a separate thread or a multiplex timeout in order to flash the cursor. You work out the rest.

Right, handwave away those very real concerns. We're talking
about Unix here, not code running on some microcontroller; code
might be sitting in some tight loop doing computation for
arbitrarily long.

Regardless, here you are, again, moving the goalposts in the
face of evidence that contradicted your earlier position.

Irony, love it.

There are always exceptions to every rule. You seem to be so desperate to win this argument I can only assume your fragile ego has been burst by someone having the temerity to disagree with you. Tough, suck it up.

Ah, here we go. The classic attempt at an insult.

If the shoe fits.

Look, you made categorical, definitive statements. Those
statements were factually incorrect. I pointed that out. You

No, I stated the majority approach to using signals. You disagree which is fine, but don't pretend your view is THE view, it isn't.

Bluntly, I don't see any evidence that you are qualified enough
to make any statement regarding the "majority approach" to,
well, just about anything related to the domain, let alone the
industry writ large. On the other hand, despite you assuring us
that we should just take your word for it that you're some kind
of expert, I see plenty of evidence in the form of factually
incorrect statements to conclude that you don't know what you're
about generally.

In other words, no, I'm not just taking your word for it, but
instead trusting the evidence before me.

Perhaps I am not the one with the fragile ego that needs to suck
it up when disagreed with.

Ego size on usenet is almost always correlated with the verbiage of a reply.

I'm not the one who started with dishing out insults at the
first person who disagreed with me.

Feel free to have the last word, but absent some actually
technical point, I'm done with you.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Wed Jan 8 13:36:50 2025

From Newsgroup: comp.unix.programmer

On Wed, 8 Jan 2025 12:19:13 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vllcml$2mqb8$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Tue, 7 Jan 2025 17:23:01 -0000 (UTC)

You also have no evidence to back up your claims, it seems.

Plenty of evidence out there.

Plenty of contradictory evidence, you mean.

Probably for both methods.

What credentials? You work for some neverheardofit software house and have mostly done a load of forks of other people's code. Standard dev stuff, nothing special.

Yeah, ok, sure.

Hacking other people's code doesn't impress me. They've done all the hard work.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Wed Jan 8 13:40:40 2025

From Newsgroup: comp.unix.programmer

On Wed, 8 Jan 2025 13:00:01 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vllcgq$2mphu$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:

On Tue, 7 Jan 2025 17:19:56 -0000 (UTC)

Presumably a genius like you would know most terminal programs have a separate thread or a multiplex timeout in order to flash the cursor. You work out the rest.

Right, handwave away those very real concerns. We're talking
about Unix here, not code running on some microcontroller; code

Oops, awkward fact avoided eh? :)

And you don't think microcontrollers can run linux with X on top?

No, I stated the majority approach to using signals. You disagree which is fine, but don't pretend your view is THE view, it isn't.

Bluntly, I don't see any evidence that you are qualified enough
to make any statement regarding the "majority approach" to,

I don't need to prove myself to argue with some random troll on a newsgroup.
If you can't handle that, that's your problem, not mine.

Ego size on usenet is almost always correlated with the verbiage of a reply.

I'm not the one who started with dishing out insults at the
first person who disagreed with me.

Define insult. However here's one: you're clearly a thin skinned snowflake.

Feel free to have the last word, but absent some actually
technical point, I'm done with you.

Suits me mate.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Nicolas George@nicolas$george@salle-s.org to comp.unix.programmer on Wed Jan 8 14:01:07 2025

From Newsgroup: comp.unix.programmer

Dan Cross, in message <vllql7$sn6$2@reader2.panix.com>, wrote:

I think it's important to define what you mean when you write,
"thread condition." What, exactly, is that? Perhaps you mean
a condition variable?

Yes, of course that is what "thread condition" means in the context of a discussion about POSIX threads.

If so, that's true, but I fail to see
the relevance: people write multithreaded code that does IO in
multiple threads all the time; there are some techniques that are common
for this (Scott alluded to the so-called "pipe trick", due to
Bernstein) and some that are less common.

Yes: there are some techniques that are common to implement I/O concurrency
and that work in the context of threads. You are arguing my point for me:
the threads did not make implementing the I/O concurrency simpler; quite the opposite, they made it harder, as proven by the fact that "techniques" had to be deployed.

POSIX threads do not make I/O concurrency easier, they are not made for
that, they are for performance.
--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Wed Jan 8 14:41:22 2025

From Newsgroup: comp.unix.programmer

In article <677e8523$0$28061$426a34cc@news.free.fr>,
Nicolas George <nicolas$george@salle-s.org> wrote:

Dan Cross, in message <vllql7$sn6$2@reader2.panix.com>, wrote:

I think it's important to define what you mean when you write,
"thread condition." What, exactly, is that? Perhaps you mean
a condition variable?

Yes, of course that is what "thread condition" means in the context of a discussion about POSIX threads.

Not really. A condition variable is a synchronization
primitive; it is not inherently an attribute of a thread. When
one phrases it as "thread condition" one gives the impression
that one is talking about some aspect of the thread itself, such
as its state, or the "condition" that it is in, in a similar way
that one might talk about the condition of a patient in a
doctor's office.

As always, in computing, it's better to be precise.

If so, that's true, but I fail to see
the relevance: people write multithreaded code that does IO in
multiple threads all the time; there are some techniques that are common
for this (Scott alluded to the so-called "pipe trick", due to
Bernstein) and some that are less common.

Yes: there are some techniques that are common to implement I/O concurrency and that work in the context of threads. You are arguing my point for me:
the threads did not make implementing the I/O concurrency simpler; quite the opposite, they made it harder, as proven by the fact that "techniques" had to be deployed.

That's a silly argument. "Techniques" had to be developed for
literally all of this stuff.

Moreover, things like the self pipe trick are independent of
threads. That's a "technique" for avoiding races between signal
delivery and "select" etc. That it can be usefully employed in
a threaded context doesn't say much either for or against
threads.

POSIX threads do not make I/O concurrency easier, they are not made for
that, they are for performance.

This is a specious statement that is not backed up by evidence
and is trivially false (two threads can execute blocking "write"
calls on two file descriptors concurrently).
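
The parenthetical can be sketched in a few lines. This assumes POSIX threads; struct job, writer and write_both are hypothetical names used only for illustration. Each thread issues one ordinary blocking write() on its own descriptor, with no poll(), no non-blocking mode and no state machine.

```c
#include <pthread.h>
#include <string.h>
#include <unistd.h>

struct job {               /* hypothetical helper type */
    int fd;
    const char *msg;
    ssize_t written;
};

static void *writer(void *arg)
{
    struct job *j = arg;
    /* An ordinary blocking write(); if this descriptor stalls, only
       this thread waits, and the other thread keeps going. */
    j->written = write(j->fd, j->msg, strlen(j->msg));
    return NULL;
}

/* Writes msg_a to fd_a and msg_b to fd_b from two threads.
   Returns 0 if both writes completed in full, -1 otherwise. */
int write_both(int fd_a, const char *msg_a, int fd_b, const char *msg_b)
{
    struct job a = { fd_a, msg_a, -1 }, b = { fd_b, msg_b, -1 };
    pthread_t ta, tb;

    if (pthread_create(&ta, NULL, writer, &a) != 0)
        return -1;
    if (pthread_create(&tb, NULL, writer, &b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return (a.written == (ssize_t)strlen(msg_a) &&
            b.written == (ssize_t)strlen(msg_b)) ? 0 : -1;
}
```

This shows only that the concurrency is easy to express; it says nothing about whether it is faster, which depends on the implementation and the machine.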

The assertion that POSIX threads are for "performance" deserves
some citation. POSIX threads might enable one to write parallel
code, thus facilitating higher performance than single-threaded
code, or they might not, depending on the implementation and the
host computer (e.g., if executed on a uniprocessor machine).

Fundamentally, threads are about having multiple control flows
that execute concurrently in a single address space. That's it,
really.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Wed Jan 8 15:05:20 2025

From Newsgroup: comp.unix.programmer

On 08 Jan 2025 14:01:07 GMT
Nicolas George <nicolas$george@salle-s.org> wibbled:

Dan Cross, in message <vllql7$sn6$2@reader2.panix.com>, wrote:

I think it's important to define what you mean when you write,
"thread condition." What, exactly, is that? Perhaps you mean
a condition variable?

Yes, of course that is what "thread condition" means in the context of a discussion about POSIX threads.

No, they're called condition variables, not thread conditions which implies something rather different. Surely an experienced genius like you would know that.

POSIX threads do not make I/O concurrency easier, they are not made for
that, they are for performance.

Not really. They're made for multitasking in situations where multiplexing would be too complicated or impossible and multiprocess would be overkill or too resource intensive. On a single core CPU using threads could actually slow a program down compared to using a single thread.

--- Synchronet 3.21d-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.programmer on Wed Jan 8 16:05:13 2025

From Newsgroup: comp.unix.programmer

Muttley@DastardlyHQ.org writes:

On Wed, 8 Jan 2025 13:00:01 -0000 (UTC)

I don't need to prove myself to argue with some random troll on a newsgroup. If you can't handle that, that's your problem, not mine.

Look in the mirror, nameless troll.

Dan is well known and respected in the Unix world. You, not so much.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.unix.programmer on Wed Jan 8 09:55:28 2025

From Newsgroup: comp.unix.programmer

scott@slp53.sl.home (Scott Lurndal) writes:

Muttley@DastardlyHQ.org writes:

On Wed, 8 Jan 2025 13:00:01 -0000 (UTC)
[...]
I don't need to prove myself to argue with some random troll on a
newsgroup. If you can't handle that, that's your problem, not mine.

Look in the mirror, nameless troll. [...]

If you wouldn't mind a question, I'm wondering why you continue
to take the bait.
--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Wed Jan 8 18:38:27 2025

From Newsgroup: comp.unix.programmer

In article <ZmxfP.284330$Uup4.225209@fx10.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:

Dan is well known and respected in the Unix world. You, not so much.

Thank you, Scott.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Kaz Kylheku@643-408-1753@kylheku.com to comp.unix.programmer on Wed Jan 8 20:27:54 2025

From Newsgroup: comp.unix.programmer

On 2025-01-08, Muttley@DastardlyHQ.org <Muttley@DastardlyHQ.org> wrote:

I don't need to prove myself to argue with some random troll on a newsgroup. If you can't handle that, that's your problem, not mine.

I have to say, that's pretty tone deaf, as a reply to Dan Cross.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
--- Synchronet 3.21d-Linux NewsLink 1.2

From Lawrence D'Oliveiro@ldo@nz.invalid to comp.unix.programmer on Thu Jan 9 04:39:35 2025

From Newsgroup: comp.unix.programmer

On Wed, 8 Jan 2025 08:26:06 -0000 (UTC), Muttley wrote:

I vaguely remember Oracle complaining about this back in the day meaning
they couldn't guarantee their RDBMS transaction writes on Linux.

Blame the hard drive vendors for putting stupid caches on their drives.
It's not a Linux-specific problem.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Salvador Mirzo@smirzo@example.com to comp.unix.programmer on Thu Jan 9 22:27:46 2025

From Newsgroup: comp.unix.programmer

Lawrence D'Oliveiro <ldo@nz.invalid> writes:

On Sat, 04 Jan 2025 19:17:15 -0300, Salvador Mirzo wrote:

Lawrence D'Oliveiro <ldo@nz.invalid> writes:

Windows NT was masterminded by Dave Cutler ...

Is that Dave with a YouTube channel?

No, that's a different former Microsoftie, but he has had Cutler on his channel for an extended interview.

I found it ironic that there was a PiDP-11, I think it was, placed within arm's reach behind the guy during the entire interview. You know, the PDP-11 emulator that runs on a Linux-based Raspberry Pi. I wonder if the Unix-hater ever noticed that ...

LOL. Can you find a link to this interview? Sounded very interesting.
--
Thanks!
--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Wed Jan 15 16:46:39 2025

From Newsgroup: comp.unix.programmer

On Wed, 08 Jan 2025 16:05:13 GMT
scott@slp53.sl.home (Scott Lurndal) wibbled:

Muttley@DastardlyHQ.org writes:

On Wed, 8 Jan 2025 13:00:01 -0000 (UTC)

I don't need to prove myself to argue with some random troll on a newsgroup. If you can't handle that, that's your problem, not mine.

Look in the mirror, nameless troll.

The usual "i don't agree with you so you must be a troll" attitude.

Dan is well known and respected in the Unix world. You, not so much.

Is he? Never heard of him until recently. What exactly has he done that's
so impressive? His github certainly gives no clue. Lots of forks of other people's stuff. BFD.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Wed Jan 15 16:47:38 2025

From Newsgroup: comp.unix.programmer

On Wed, 8 Jan 2025 20:27:54 -0000 (UTC)
Kaz Kylheku <643-408-1753@kylheku.com> wibbled:

On 2025-01-08, Muttley@DastardlyHQ.org <Muttley@DastardlyHQ.org> wrote:

I don't need to prove myself to argue with some random troll on a newsgroup. If you can't handle that, that's your problem, not mine.

I have to say, that's pretty tone deaf, as a reply to Dan Cross.

Sorry if I don't hero worship like you. If I think someone's talking bollocks then I'll call it out.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Kaz Kylheku@643-408-1753@kylheku.com to comp.unix.programmer on Wed Jan 15 20:20:44 2025

From Newsgroup: comp.unix.programmer

On 2025-01-15, Muttley@DastardlyHQ.org <Muttley@DastardlyHQ.org> wrote:

The usual "i don't agree with you so you must be a troll" attitude.

Rather, it is this accusation that is an extremely common trope.
Rarely is it true.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
--- Synchronet 3.21d-Linux NewsLink 1.2

From Kaz Kylheku@643-408-1753@kylheku.com to comp.unix.programmer on Wed Jan 15 20:27:41 2025

From Newsgroup: comp.unix.programmer

On 2025-01-15, Muttley@DastardlyHQ.org <Muttley@DastardlyHQ.org> wrote:

On Wed, 8 Jan 2025 20:27:54 -0000 (UTC)
Kaz Kylheku <643-408-1753@kylheku.com> wibbled:

On 2025-01-08, Muttley@DastardlyHQ.org <Muttley@DastardlyHQ.org> wrote:

I don't need to prove myself to argue with some random troll on a newsgroup.
If you can't handle that, that's your problem, not mine.

I have to say, that's pretty tone deaf, as a reply to Dan Cross.

Sorry if I don't hero worship like you. If I think someone's talking bollocks then I'll call it out.

Glad you understand the concept.

Dan Cross being a random troll in a newsgroup is an example of bollocks,
and it's being called out.

Even if Dan happened to be caught talking bollocks, it still wouldn't
make him a troll, or random. It's simply not factual or rational.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Wed Jan 15 22:55:16 2025

From Newsgroup: comp.unix.programmer

In article <20250115122111.406@kylheku.com>,
Kaz Kylheku <643-408-1753@kylheku.com> wrote:

[snip]
Dan Cross being a random troll in a newsgroup is an example of bollocks,
and it's being called out.

Thanks, Kaz. That's kind of you.

Even if Dan happened to be caught talking bollocks, it still wouldn't
make him a troll, or random. It's simply not factual or rational.

In fairness, most of what I say is probably bollocks. :-D

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Thu Jan 16 09:40:43 2025

From Newsgroup: comp.unix.programmer

On Wed, 15 Jan 2025 20:20:44 -0000 (UTC)
Kaz Kylheku <643-408-1753@kylheku.com> wibbled:

On 2025-01-15, Muttley@DastardlyHQ.org <Muttley@DastardlyHQ.org> wrote:

The usual "i don't agree with you so you must be a troll" attitude.

Rather, it is this accusation that is an extremely common trope.
Rarely is it true.

It's very often true. Calling someone a troll in a newsgroup is a drop-the-mic and walk off because you've lost the argument. Exactly the same as calling someone a racist/xenophobe/whatever IRL when you run out of debating ability.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Thu Jan 16 09:43:13 2025

From Newsgroup: comp.unix.programmer

On Wed, 15 Jan 2025 20:27:41 -0000 (UTC)
Kaz Kylheku <643-408-1753@kylheku.com> wibbled:

On 2025-01-15, Muttley@DastardlyHQ.org <Muttley@DastardlyHQ.org> wrote:

On Wed, 8 Jan 2025 20:27:54 -0000 (UTC)
Kaz Kylheku <643-408-1753@kylheku.com> wibbled:

On 2025-01-08, Muttley@DastardlyHQ.org <Muttley@DastardlyHQ.org> wrote:

I don't need to prove myself to argue with some random troll on a newsgroup.
If you can't handle that, that's your problem, not mine.

I have to say, that's pretty tone deaf, as a reply to Dan Cross.

Sorry if I don't hero worship like you. If I think someone's talking bollocks then I'll call it out.

Glad you understand the concept.

Dan Cross being a random troll in a newsgroup is an example of bollocks,
and it's being called out.

I've rarely seen him post here but keep holding a candle for his magnificence if it makes you feel better.

Even if Dan happened to be caught talking bollocks, it still wouldn't
make him a troll, or random. It's simply not factual or rational.

If someone talks a load of rubbish in reply to someone and also accuses that someone of being clueless instead of actually arguing the point then I'd say that was trolling.

Incidentally no one has said what this wunderkind has actually done that deserves such praise. Given that he looks late 20s in his picture he's certainly no greybeard with a long history of contributions.

--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Thu Jan 16 14:51:40 2025

From Newsgroup: comp.unix.programmer

In article <vmakbg$3eqtn$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:
[snip]

Given that he looks late 20s in his picture he's
certainly no greybeard with a long history of contributions.

I wish I were still in my late 20s. Perhaps my knees would hurt
less.

And if only my hair (and beard) were more colorful, but sadly, I
went almost totally gray in Afghanistan. Perhaps I'll turn out
lucky and eventually have silver hair like my grandmother
had.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Nicolas George@nicolas$george@salle-s.org to comp.unix.programmer on Thu Jan 16 15:01:56 2025

From Newsgroup: comp.unix.programmer

Muttley@DastardlyHQ.org, in message <vmak6r$3eq9o$1@dont-email.me>, wrote:

It's very often true. Calling someone a troll in a newsgroup is a drop-the-mic and walk off because you've lost the argument.

From what I have seen recently, calling you a troll is cutting oneself with Hanlon's razor.
--- Synchronet 3.21d-Linux NewsLink 1.2

From Muttley@Muttley@DastardlyHQ.org to comp.unix.programmer on Thu Jan 16 15:47:50 2025

From Newsgroup: comp.unix.programmer

On Thu, 16 Jan 2025 14:51:40 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

In article <vmakbg$3eqtn$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:
[snip]

Given that he looks late 20s in his picture he's
certainly no greybeard with a long history of contributions.

I wish I were still in my late 20s. Perhaps my knees would hurt
less.

You need to update your photo.

And if only my hair (and beard) were more colorful, but sadly, I
went almost totally gray in Afghanistan. Perhaps I'll turn out

The humblebrag vet stuff isn't going to get you any brownie points with me mate.
2 of my family also served there and they don't act like dicks in online forums.

--- Synchronet 3.21d-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to comp.unix.programmer on Thu Jan 16 15:56:45 2025

From Newsgroup: comp.unix.programmer

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <vmakbg$3eqtn$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:
[snip]

Given that he looks late 20s in his picture he's
certainly no greybeard with a long history of contributions.

I wish I were still in my late 20s. Perhaps my knees would hurt
less.

Weren't you at Murray Hill, before the move to Summit?
--- Synchronet 3.21d-Linux NewsLink 1.2

From cross@cross@spitfire.i.gajendra.net (Dan Cross) to comp.unix.programmer on Thu Jan 16 16:53:05 2025

From Newsgroup: comp.unix.programmer

In article <1%9iP.816684$oR74.514042@fx16.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:

cross@spitfire.i.gajendra.net (Dan Cross) writes:

In article <vmakbg$3eqtn$1@dont-email.me>, <Muttley@DastardlyHQ.org> wrote:
[snip]

Given that he looks late 20s in his picture he's
certainly no greybeard with a long history of contributions.

I wish I were still in my late 20s. Perhaps my knees would hurt
less.

Weren't you at Murray Hill, before the move to Summit?

Ha, no; I used to just go down there to hang around the Unix
room, but never worked for the labs.

- Dan C.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Kaz Kylheku@643-408-1753@kylheku.com to comp.unix.programmer on Thu Jan 16 17:34:10 2025

From Newsgroup: comp.unix.programmer

On 2025-01-16, Muttley@DastardlyHQ.org <Muttley@DastardlyHQ.org> wrote:

On Thu, 16 Jan 2025 14:51:40 -0000 (UTC)
cross@spitfire.i.gajendra.net (Dan Cross) wibbled:

And if only my hair (and beard) were more colorful, but sadly, I
went almost totally gray in Afghanistan. Perhaps I'll turn out

2 of my family also served there and they don't act like dicks in online forums.

1 of your family, on the other hand, ...
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
--- Synchronet 3.21d-Linux NewsLink 1.2
