• Re: Command Languages Versus Programming Languages

    From Sebastian@21:1/5 to Stefan Ram on Tue Aug 6 08:04:35 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In comp.unix.programmer Stefan Ram <ram@zedat.fu-berlin.de> wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote or quoted:
    On 05.04.2024 01:29, Lawrence D'Oliveiro wrote:
    This is where indentation helps. E.g.
    a =
        b ?
            c ? d : e
        : f ?
            g ? h : i
        : j;

    Indentation generally helps.

    Let me give it a try to find how I would indent that!

    b?
        c? d: e:
    f?
        g? h: i:
    j;


    Better:

    a = b ? (c ? d : e) :
        f ? (g ? h : i) :
        j;

    Equivalent Lisp, for comparison:

    (setf a (cond (b (if c d e))
                  (f (if g h i))
                  (t j)))

    And some people claim that ternaries are SO horribly
    unreadable that they should never be used.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Sebastian on Tue Aug 6 23:34:22 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Tue, 6 Aug 2024 08:04:35 -0000 (UTC), Sebastian wrote:

    Better:

    a = b ? (c ? d : e) :
        f ? (g ? h : i) :
        j;

    Better still (fewer confusing parentheses):

    a =
        b ?
            c ? d : e
        : f ?
            g ? h : i
        : j;

    Equivalent Lisp, for comparison:

    (setf a (cond (b (if c d e))
                  (f (if g h i))
                  (t j)))

    You can’t avoid the parentheses, but this, too, can be improved:

    (setf a
        (cond
            (b
                (if c d e)
            )
            (f
                (if g h i)
            )
            (t
                j
            )
        ) ; cond
    )

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Wed Aug 7 13:43:10 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 2024-08-06, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    Equivalent Lisp, for comparison:

    (setf a (cond (b (if c d e))
                  (f (if g h i))
                  (t j)))

    You can’t avoid the parentheses, but this, too, can be improved:

    (setf a
        (cond
            (b
                (if c d e)
            )
            (f
                (if g h i)
            )
            (t
                j
            )
        ) ; cond
    )

    Nobody is ever going to follow your idio(syncra)tic coding preferences
    for Lisp, that wouldn't pass code review in any Lisp shop, and result in patches being rejected in a FOSS setting.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@dastardlyhq.com@21:1/5 to All on Thu Aug 8 07:33:34 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Wed, 7 Aug 2024 13:43:10 -0000 (UTC)
    Kaz Kylheku <643-408-1753@kylheku.com> boringly babbled:
    On 2024-08-06, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    Equivalent Lisp, for comparison:

    (setf a (cond (b (if c d e))
                  (f (if g h i))
                  (t j)))

    You can’t avoid the parentheses, but this, too, can be improved:

    (setf a
        (cond
            (b
                (if c d e)
            )
            (f
                (if g h i)
            )
            (t
                j
            )
        ) ; cond
    )

    Nobody is ever going to follow your idio(syncra)tic coding preferences
    for Lisp, that wouldn't pass code review in any Lisp shop, and result in
    patches being rejected in a FOSS setting.

    I'm not a Lisp dev, but the original looks far more readable to me.
    His definition of improvement seems to be obfuscation.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andreas Eder@21:1/5 to Lawrence D'Oliveiro on Thu Aug 8 17:25:54 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Di 06 Aug 2024 at 23:34, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    On Tue, 6 Aug 2024 08:04:35 -0000 (UTC), Sebastian wrote:

    Better:

    a = b ? (c ? d : e) :
        f ? (g ? h : i) :
        j;

    Better still (fewer confusing parentheses):

    a =
        b ?
            c ? d : e
        : f ?
            g ? h : i
        : j;

    Equivalent Lisp, for comparison:

    (setf a (cond (b (if c d e))
                  (f (if g h i))
                  (t j)))

    You can’t avoid the parentheses, but this, too, can be improved:

    (setf a
        (cond
            (b
                (if c d e)
            )
            (f
                (if g h i)
            )
            (t
                j
            )
        ) ; cond
    )

    Sorry, but that is not an improvement but rather an abomination.

    'Andreas
    --
    ceterum censeo redmondinem esse delendam

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Fri Aug 9 00:07:25 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 2024-08-08, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Thu, 08 Aug 2024 17:25:54 +0200, Andreas Eder wrote:

    Sorry, but that is not an improvement but rather an abomination.

    Aw, diddums.

    Yep! That's a magic word that will supposedly get your
    idio(syncra)tically formatted commit on the approval fast track.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Andreas Eder on Thu Aug 8 23:41:55 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Thu, 08 Aug 2024 17:25:54 +0200, Andreas Eder wrote:

    Sorry, but that is not an improvement but rather an abomination.

    Aw, diddums.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Sebastian on Sun Aug 25 07:48:17 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Sun, 25 Aug 2024 07:32:26 -0000 (UTC), Sebastian wrote:

    In comp.unix.programmer Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    a =
        b ?
            c ? d : e
        : f ?
            g ? h : i
        : j;

    I find this more confusing than the parentheses.

    Not accustomed to looking at source code in 2D? You have to feel your way
    from symbol to symbol like brackets, rather than being able to see overall shapes?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Sebastian@21:1/5 to Lawrence D'Oliveiro on Sun Aug 25 07:32:26 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In comp.unix.programmer Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Tue, 6 Aug 2024 08:04:35 -0000 (UTC), Sebastian wrote:

    Better:

    a = b ? (c ? d : e) :
        f ? (g ? h : i) :
        j;

    Better still (fewer confusing parentheses):

    a =
        b ?
            c ? d : e
        : f ?
            g ? h : i
        : j;

    I find this more confusing than the parentheses.

    Equivalent Lisp, for comparison:

    (setf a (cond (b (if c d e))
                  (f (if g h i))
                  (t j)))

    You can’t avoid the parentheses, but this, too, can be improved:

    (setf a
        (cond
            (b
                (if c d e)
            )
            (f
                (if g h i)
            )
            (t
                j
            )
        ) ; cond
    )

    If you insist on writing Lisp like that, you might as well do
    this:

    (ql:quickload :with-c-syntax)
    (named-readtables:in-readtable with-c-syntax:with-c-syntax-readtable)

    #{
    a = b ? (c ? d : e) :
        f ? (g ? h : i) :
        j;
    #}

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to Muttley@DastartdlyHQ.org on Sat Oct 12 13:53:56 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <vedcjc$3mqn$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
    On Fri, 11 Oct 2024 16:28:03 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    In article <vebi0j$3nhvq$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
    Irrelevant. Lot of interpreters do partial compilation and the JVM does it
    on the fly. A proper compiler writes a standalone binary file to disk.

    Not generally, no. Most compilers these days generate object
    code and then, as a separate step, a linker is invoked to
    combine object files and library archives into an executable
    binary.
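
    (To illustrate that separation with a typical Unix C toolchain - a
    sketch, with invented file names:)

        cc -c main.c                 # compile: main.c -> object file main.o
        cc -c util.c                 # compile: util.c -> object file util.o
        cc -o prog main.o util.o     # separate link step: combine the object
                                     # files (and any libraries) into the
                                     # executable "prog"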

    Ok, the compiler toolchain then. Most people invoke it using a single command,
    the rest is behind the scenes.

    By the way, when many people talk about a "standalone" binary,
    they are referring to something directly executable on hardware,

    For many read a tiny minority.

    without the benefit of an operating system. The Unix kernel is
    an example of such a "standalone binary."

    If you're going to nitpick then I'm afraid you're wrong. Almost all operating
    systems require some kind of bootloader and/or BIOS combination to start them
    up. You can't just point the CPU at the first byte of the binary and off it
    goes particularly in the case of Linux where the kernel requires decompressing
    first.

    Again, not generally, no. Consider an embedded system where the
    program to be executed on, say, a microcontroller is itself
    statically linked at an absolute address and burned into a ROM,
    with the program's entry point at the CPU's reset address. I
    suppose that's not "standalone" if you count a ROM burner as
    part of "loading" it.

    Also, I mentioned Unix, not Linux. The two are different. The
    first version of the Unix kernel started at a fixed location on
    the PDP-7, without a separate loading step (Ken Thompson did
    that manually).

    Of course, this all gets more complex when we start talking
    about modern systems with loading kernel modules and the like.

    Most executable binaries are not standalone.

    Standalone as you are well aware in the sense of doesn't require an interpreter
    or VM to run on the OS and contains CPU machine code.

    So what about a binary that is dynamically linked with a shared
    object? That requires a runtime interpreter nee linker to bind
    its constituent parts together before it's executable. And what
    if it makes a system call? Then it's no longer "standalone", as
    it necessarily relies on the operating system to perform part of
    its function.
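
    (A quick way to see that machinery on a typical ELF/Linux system - the
    exact paths will differ elsewhere:)

        file /bin/ls                         # prints, among other things,
                                             # "interpreter /lib64/ld-linux-x86-64.so.2"
        readelf -l /bin/ls | grep -i interp  # the PT_INTERP program header names
                                             # that runtime link-editor
        ldd /bin/ls                          # the shared objects bound in at run time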

    But that's really neither here nor there; I think you are
    conflating object code with text containing instructions meant
    for direct execution on a CPU with something like a P-code;
    the distinction is kind of silly when you consider that we live
    in a world with CPU simulators that let you boot entire systems
    for architecture A in a program running on architecture B,
    usually in userspace. Why do you think that a compiler that
    generates bytecode for some virtual machine is any different
    from a compiler that generates object code for some CPU?

    You don't seem to be able to recognize that the compilation step
    is separate from execution, and that the same techniques for
    compiler development apply to both hardware and virtual targets.

    Saving to some sort of object image is not a necessary function
    of a compiler.

    Yes it is.

    So you say, but that's not the commonly accepted definition.
    Sorry.

    Where do you get this commonly accepted definition from?

    *shrug* Tanenbaum; Silberschatz; Kaashoek; Roscoe; etc. Where
    did you get your definition?

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Muttley@DastartdlyHQ.org on Sat Oct 12 14:37:24 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Muttley@DastartdlyHQ.org writes:
    On Fri, 11 Oct 2024 20:58:26 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> boring babbled:
    On Fri, 11 Oct 2024 15:15:57 -0000 (UTC), Muttley wrote:

    On Fri, 11 Oct 2024 15:47:06 +0100
    Rainer Weikusat <rweikusat@talktalk.net>:

    The Perl compiler turns Perl source code into a set of (that's a

    Does it produce a standalone binary as output? No, so its an intepreter
    not a compiler.

    There are two parts: the interpreter interprets code generated by the compiler.

    Code generated by a compiler does not require an interpreter.

    Indeed. As far as I know the term, an interpreter is something which
    reads text from a file, parses it and checks it for syntax errors
    and then executes the code as soon as enough of it has been gathered to
    allow for execution of something, i.e., a complete statement. This read,
    check and parse, execute cycle is repeated until the program
    terminates.

    Example for this:

    [rw@doppelsaurus]/tmp#cat a.sh
    ed a.sh <<'TT' >/dev/null 2>&1
    9,$d
    wq
    TT
    echo `expr $i + 0`
    i=`expr $i + 1`
    test $i = 11 && exit
    sed -n '5,8p' a.sh | tee -a a.sh >/dev/null

    This is a script which prints the numbers from 0 to 10 by exploiting
    the property that /bin/sh is an interpreter: the final sed line keeps
    appending copies of lines 5-8 to the end of the file, and the shell,
    which executes each statement as soon as it has read it, keeps finding
    the newly appended lines.

    In contrast to this, a compiler reads the source code completely, parses
    and checks it and then transforms it into some sort of "other
    representation" which can be executed without dealing with the source
    code (text) again. E.g., the Java compiler transforms Java source code
    into Java bytecode which is then usually executed by the JVM. OTOH,
    processors capable of executing Java bytecode directly exist, or at least
    used to exist. ARM CPUs once had an extension for that (Jazelle).
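
    (A concrete illustration of that split, assuming a trivial Hello.java is
    at hand - the file name is made up:)

        javac Hello.java     # compile once: Java source -> Hello.class (bytecode)
        javap -c Hello       # inspect the bytecode the compiler produced
        java Hello           # the JVM executes the bytecode; the .java text
                             # is never looked at again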

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@dastardlyhq.com@21:1/5 to All on Sat Oct 12 14:50:09 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Sat, 12 Oct 2024 13:53:56 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) gabbled:
    In article <vedcjc$3mqn$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
    up. You can't just point the CPU at the first byte of the binary and off it
    goes particularly in the case of Linux where the kernel requires decompressing
    first.

    Again, not generally, no. Consider an embedded system where the
    program to be executed on, say, a microcontroller is itself
    statically linked at an absolute address and burned into a ROM,

    Unlikely to be running *nix in that case.

    with the program's entry point at the CPU's reset address. I
    suppose that's not "standalone" if you count a ROM burner as
    part of "loading" it.

    Now you're just being silly.

    Also, I mentioned Unix, not Linux. The two are different. The

    Are they? Thats debatable these days. I'd say Linux is a lot closer to
    the philosophy of BSD and SYS-V than MacOS which is a certified unix.

    Standalone as you are well aware in the sense of doesn't require an interpreter
    or VM to run on the OS and contains CPU machine code.

    So what about a binary that is dynamically linked with a shared
    object? That requires a runtime interpreter nee linker to bind
    its constituent parts together before it's executable. And what
    if it makes a system call? Then it's no longer "standalone", as
    it necessarily relies on the operating system to perform part of
    its function.

    Standalone in the sense that the opcodes in the binary don't need to be transformed into something else before being loaded by the CPU.

    usually in userspace. Why do you think that a compiler that
    generates bytecode for some virtual machine is any different
    from a compiler that generates object code for some CPU?

    I'd say its a grey area because it isn't full compilation is it, the p-code still requires an interpreter before it'll run.

    You don't seem to be able to recognize that the compilation step

    Compiling is not the same as converting. Is a javascript to C converter a compiler? By your definition it is.

    Where do you get this commonly accepted definition from?

    *shrug* Tanenbaum; Silberschatz; Kaashoek; Roscoe; etc. Where
    did you get your definition?

    Only heard of one of them so mostly irrelevant. Mine come from the name of tools that compile code to a runnable binary.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@dastardlyhq.com@21:1/5 to All on Sat Oct 12 15:51:10 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Sat, 12 Oct 2024 15:32:28 GMT
    scott@slp53.sl.home (Scott Lurndal) gabbled:
    Muttley@dastardlyhq.com writes:
    Standalone in the sense that the opcodes in the binary don't need to be
    transformed into something else before being loaded by the CPU.

    That's a rather unique definition of 'standalone'.

    Is it? As opposed to something that requires a separate program to run any
    of it. That java p-code isn't going to do much without a jvm to translate it into x86 or ARM etc whereas a compiled binary will - after required libraries are loaded with it - just run directly on the CPU.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Muttley@dastardlyhq.com on Sat Oct 12 15:32:28 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Muttley@dastardlyhq.com writes:
    On Sat, 12 Oct 2024 13:53:56 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) gabbled:
    In article <vedcjc$3mqn$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
    up. You can't just point the CPU at the first byte of the binary and off it
    goes particularly in the case of Linux where the kernel requires decompressing
    first.

    Again, not generally, no. Consider an embedded system where the
    program to be executed on, say, a microcontroller is itself
    statically linked at an absolute address and burned into a ROM,

    Unlikely to be running *nix in that case.

    Some do, some don't. Many run zephyr, others various
    commercial embedded RTOS. In any case, they're binaries
    and if properly created, one simply points the cpu to the
    first byte of the binary (or more likely some standard
    PC value that the processor starts fetching from when
    it leaves reset, e.g. the VTOR on Cortex-m7 cores) and off it goes.
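
    (A sketch of how such an image is typically produced with a bare-metal
    ARM toolchain; the linker script and source file names here are
    invented:)

        arm-none-eabi-gcc -nostdlib -T flash.ld -o fw.elf startup.s main.c
                                             # statically link at the absolute
                                             # addresses given by the linker script
        arm-none-eabi-objcopy -O binary fw.elf fw.bin
                                             # raw image, vector table first,
                                             # ready to burn into flash/ROM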



    So what about a binary that is dynamically linked with a shared
    object? That requires a runtime interpreter nee linker to bind
    its constituent parts together before it's executable. And what
    if it makes a system call? Then it's no longer "standalone", as
    it necessarily relies on the operating system to perform part of
    its function.

    Standalone in the sense that the opcodes in the binary don't need to be
    transformed into something else before being loaded by the CPU.

    That's a rather unique definition of 'standalone'.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to Muttley@dastardlyhq.com on Sat Oct 12 16:36:26 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <vee2b1$6vup$1@dont-email.me>, <Muttley@dastardlyhq.com> wrote:
    On Sat, 12 Oct 2024 13:53:56 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) gabbled:
    In article <vedcjc$3mqn$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
    up. You can't just point the CPU at the first byte of the binary and off it
    goes particularly in the case of Linux where the kernel requires decompressing
    first.

    Again, not generally, no. Consider an embedded system where the
    program to be executed on, say, a microcontroller is itself
    statically linked at an absolute address and burned into a ROM,

    Unlikely to be running *nix in that case.

    We're discussing the concept of a "standalone binary"; you seem
    to think that means a binary image emitted by a linker and meant
    to run under a hosted environment, like an operating system. It
    does not.

    with the program's entry point at the CPU's reset address. I
    suppose that's not "standalone" if you count a ROM burner as
    part of "loading" it.

    Now you're just being silly.

    *shrug* Not my problem if you haven't dealt with many embedded
    systems.

    Also, I mentioned Unix, not Linux. The two are different. The

    Are they? Thats debatable these days. I'd say Linux is a lot closer to
    the philosophy of BSD and SYS-V than MacOS which is a certified unix.

    Yes, they are.

    Standalone as you are well aware in the sense of doesn't require an interpreter
    or VM to run on the OS and contains CPU machine code.

    So what about a binary that is dynamically linked with a shared
    object? That requires a runtime interpreter nee linker to bind
    its constituent parts together before it's executable. And what
    if it makes a system call? Then it's no longer "standalone", as
    it necessarily relies on the operating system to perform part of
    its function.

    Standalone in the sense that the opcodes in the binary don't need to be
    transformed into something else before being loaded by the CPU.

    Yeah, no, that's not what anybody serious means when they say
    that.

    usually in userspace. Why do you think that a compiler that
    generates bytecode for some virtual machine is any different
    from a compiler that generates object code for some CPU?

    I'd say its a grey area because it isn't full compilation is it, the p-code
    still requires an interpreter before it'll run.

    Nope.

    You don't seem to be able to recognize that the compilation step

    Compiling is not the same as converting. Is a javascript to C converter a
    compiler? By your definition it is.

    Yes, of course it is. So is the terminfo compiler, and any
    number of other similar things. The first C++ compiler, cfront
    emitted C code, not object code. Was it not a compiler?

    Where do you get this commonly accepted definition from?

    *shrug* Tanenbaum; Silberschatz; Kaashoek; Roscoe; etc. Where
    did you get your definition?

    Only heard of one of them so mostly irrelevant. Mine come from the name of
    tools that compile code to a runnable binary.

    It's very odd that you seek to speak from a position of
    authority when you don't even know who most of the major people
    in the field are.

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Eric Pozharski@21:1/5 to Weikusat on Sat Oct 12 16:39:20 2024
    XPost: comp.unix.shell, comp.unix.programmer

    with <87wmighu4i.fsf@doppelsaurus.mobileactivedefense.com> Rainer
    Weikusat wrote:
    Muttley@DastartdlyHQ.org writes:
    On Wed, 09 Oct 2024 22:25:05 +0100 Rainer Weikusat
    <rweikusat@talktalk.net> boring babbled:
    Bozo User <anthk@disroot.org> writes:
    On 2024-04-07, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Sun, 07 Apr 2024 00:01:43 +0000, Javier wrote:

    *CUT* [ 19 lines 6 levels deep]

    Its syntax is also a horrific mess.
    Which means precisely what?

    You're arguing with Unix Haters Handbook. You've already lost.

    *CUT* [ 8 lines 2 levels deep]

    --
    Torvalds' goal for Linux is very simple: World Domination
    Stallman's goal for GNU is even simpler: Freedom

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Christian Weisgerber@21:1/5 to Rainer Weikusat on Sat Oct 12 17:49:15 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 2024-10-12, Rainer Weikusat <rweikusat@talktalk.net> wrote:

    Indeed. As far as I know the term, an interpreter is something which
    reads text from a file, parses it an checks it for syntax errors
    and then executes the code as soon as enough of it has been gathered to
    allow for execution of something, ie, a complete statement. This read,
    check and parse, execute cycle is repeated until the program
    terminates.

    I don't really want to participate in this discussion, but what
    you're saying there is that all those 1980s home computer BASIC
    interpreters, which read and tokenized a program before execution,
    were actually compilers.

    --
    Christian "naddy" Weisgerber naddy@mips.inka.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Rainer Weikusat on Sat Oct 12 20:50:51 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 12/10/2024 14:37, Rainer Weikusat wrote:
    Muttley@DastartdlyHQ.org writes:
    On Fri, 11 Oct 2024 20:58:26 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> boring babbled:
    On Fri, 11 Oct 2024 15:15:57 -0000 (UTC), Muttley wrote:

    On Fri, 11 Oct 2024 15:47:06 +0100
    Rainer Weikusat <rweikusat@talktalk.net>:

    The Perl compiler turns Perl source code into a set of (that's a

    Does it produce a standalone binary as output? No, so its an intepreter
    not a compiler.

    There are two parts: the interpreter interprets code generated by the compiler.

    Code generated by a compiler does not require an interpreter.

    Indeed. As far as I know the term, an interpreter is something which
    reads text from a file, parses it an checks it for syntax errors
    and then executes the code as soon as enough of it has been gathered to
    allow for execution of something, ie, a complete statement. This read,
    check and parse, execute cycle is repeated until the program
    terminates.

    That would be some very old BASIC, or maybe a shell program where the
    input is a stream of lines, and each must be executed as it is
    entered (you'd never reach end-of-file otherwise!).

    Most interpreters taking input from a file will compile the whole thing
    to some intermediate form first (e.g. to internal bytecode). Then they
    interpret that bytecode.

    Otherwise they would be hopelessly slow.
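
    (One way to look at that intermediate form, assuming perl and python3
    are installed - the one-liners are only for illustration:)

        perl -MO=Terse -e 'print 1 + 2'
                # dump the op tree Perl's compiler built, instead of running it
        python3 -c 'import dis; dis.dis(compile("1 + 2", "<s>", "eval"))'
                # disassemble the bytecode CPython compiled before interpreting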

    In contrast to this, a compiler reads the source code completely, parses
    and checks it and then transforms it into some sort of "other
    representation" which can be executed without dealing with the source
    code (text) again.

    Some compilers (eg. the ones I write) can run programs from source just
    like an interpreted language. The difference here is that it first
    translates the whole program to native code rather than bytecode.
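
    (The Tiny C Compiler's -run mode is a handy illustration of the same
    idea, assuming tcc is installed and hello.c is any ordinary C program:)

        tcc -run hello.c     # compile the whole file to native code in memory
                             # and execute it immediately; no binary on disk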

    Then there are more complicated schemes which mix things up: they start
    off interpreting and turn 'hot' paths into native code via JIT techniques.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Muttley on Sat Oct 12 21:25:17 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Sat, 12 Oct 2024 08:42:17 -0000 (UTC), Muttley wrote:

    Code generated by a compiler does not require an interpreter.

    Something has to implement the rules of the “machine language”. This is
    why we use the term “abstract machine”, to avoid having to distinguish
    between “hardware” and “software”.

    Think: modern CPUs typically have “microcode” and “firmware” associated
    with them. Are those “hardware” or “software”?

    If you want to go down the reductio ad absurdum route then the electrons
    are interpreters too.

    <https://www.americanscientist.org/article/the-computational-universe>
    <https://en.wikipedia.org/wiki/Digital_physics>

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Sun Oct 13 08:19:16 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Sat, 12 Oct 2024 16:39:20 +0000
    Eric Pozharski <apple.universe@posteo.net> boring babbled:
    with <87wmighu4i.fsf@doppelsaurus.mobileactivedefense.com> Rainer
    Weikusat wrote:
    Muttley@DastartdlyHQ.org writes:
    On Wed, 09 Oct 2024 22:25:05 +0100 Rainer Weikusat
    <rweikusat@talktalk.net> boring babbled:
    Bozo User <anthk@disroot.org> writes:
    On 2024-04-07, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Sun, 07 Apr 2024 00:01:43 +0000, Javier wrote:

    *CUT* [ 19 lines 6 levels deep]

    Its syntax is also a horrific mess.
    Which means precisely what?

    You're arguing with Unix Haters Handbook. You've already lost.

    ITYF the people who dislike Perl are the ones who actually like the unix
    way of having simple daisychained tools instead of some lump of a language
    that does everything messily.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Sun Oct 13 08:20:19 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Sat, 12 Oct 2024 17:49:15 -0000 (UTC)
    Christian Weisgerber <naddy@mips.inka.de> boring babbled:
    On 2024-10-12, Rainer Weikusat <rweikusat@talktalk.net> wrote:

    Indeed. As far as I know the term, an interpreter is something which
    reads text from a file, parses it an checks it for syntax errors
    and then executes the code as soon as enough of it has been gathered to
    allow for execution of something, ie, a complete statement. This read,
    check and parse, execute cycle is repeated until the program
    terminates.

    I don't really want to participate in this discussion, but what
    you're saying there is that all those 1980s home computer BASIC
    interpreters, which read and tokenized a program before execution,
    were actually compilers.

    He's painted himself into a corner. Will be interesting to see how he gets
    out of it.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Sun Oct 13 08:22:53 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Sat, 12 Oct 2024 21:25:17 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> boring babbled:
    On Sat, 12 Oct 2024 08:42:17 -0000 (UTC), Muttley wrote:

    Code generated by a compiler does not require an interpreter.

    Something has to implement the rules of the “machine language”. This is
    why we use the term “abstract machine”, to avoid having to distinguish
    between “hardware” and “software”.

    Think: modern CPUs typically have “microcode” and “firmware” associated
    with them. Are those “hardware” or “software”?

    Who cares what happens inside the CPU hardware? It could be a group of pixies with abacuses for all the relevance it has to this argument. Standalone binaries contain machine code that can be directly executed by the CPU.

    If you want to go down the reductio ad absurdum route then the electrons
    are interpreters too.

    <https://www.americanscientist.org/article/the-computational-universe>
    <https://en.wikipedia.org/wiki/Digital_physics>

    Thats heading off into philosophy.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Sun Oct 13 08:18:08 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Sat, 12 Oct 2024 16:36:26 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    In article <vee2b1$6vup$1@dont-email.me>, <Muttley@dastardlyhq.com> wrote:
    Unlikely to be running *nix in that case.

    We're discussing the concept of a "standalone binary"; you seem
    to think that means a binary image emitted by a linker and meant
    to run under a hosted environment, like an operating system. It
    does not.

    It can mean either. Essentially its a binary that contains directly runnable CPU machine code. I'm not sure why you're having such a conceptual struggle understanding this simple concept.

    Now you're just being silly.

    *shrug* Not my problem if you haven't dealt with many embedded
    systems.

    I could bore you with the number I've actually "dealt with" including
    military hardware but whats the point. You've probably programmed the occasional PIC or arduino and think you're an expert.

    Are they? Thats debatable these days. I'd say Linux is a lot closer to
    the philosophy of BSD and SYS-V than MacOS which is a certified unix.

    Yes, they are.

    I disagree. Modern linux reminds me a lot of SunOS and HP-UX from back in
    the day. Not something that can be said for MacOS with its roll-your-own
    Apple specific way of doing pretty much everything.

    Standalone in the sense that the opcodes in the binary don't need to be
    transformed into something else before being loaded by the CPU.

    Yeah, no, that's not what anybody serious means when they say
    that.

    Anybody serious presumably meaning you.

    I'd say its a grey area because it isn't full compilation is it, the p-code
    still requires an interpreter before it'll run.

    Nope.

    Really? So java bytecode will run direct on x86 or ARM will it? Please give some links to this astounding discovery you've made.

    Compiling is not the same as converting. Is a javascript to C converter a
    compiler? By your definition it is.

    Yes, of course it is. So is the terminfo compiler, and any

    So in your mind google translate is a "compiler" for spoken languages is it?

    number of other similar things. The first C++ compiler, cfront
    emitted C code, not object code. Was it not a compiler?

    No, it was a pre-compiler. Just like Oracles PRO*C/C++.

    Only heard of one of them so mostly irrelevant. Mine come from the name of
    tools that compile code to a runnable binary.

    It's very odd that you seek to speak from a position of
    authority when you don't even know who most of the major people
    in the field are.

    I know the important ones. You've dug out some obscure names from google
    that probably only a few CS courses even mention never mind study the work of.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Muttley@DastartdlyHQ.org on Sun Oct 13 14:55:14 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 13.10.2024 10:19, Muttley@DastartdlyHQ.org wrote:
    On Sat, 12 Oct 2024 16:39:20 +0000
    Eric Pozharski <apple.universe@posteo.net> boring babbled:
    [...]

    You're arguing with Unix Haters Handbook. You've already lost.

    ITYF the people who dislike Perl are the ones who actually like the unix
    way of having simple daisychained tools instead of some lump of a language that does everything messily.

    (I think some topics in this black-or-white mix should be sorted.)

    The pipelining mechanism - if you meant that with "daisychained
    tools" - has its limitations. For simple filtering it's okay, but
    once you need some information from the front of the processing
    chain at the end of the chain solutions can get very clumsy, may
    create logical problems, race conditions, etc. Being able to
    memorize some information from the pipe-front processes to be used
    later is one inherent advantage of using a [scripting-]language
    like Perl, Awk, or whatever. (Personally I prefer a language like
    Awk because it's simple and has a clear syntax as opposed to Perl,
    but I understand that other folks might prefer Perl or Python.)
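
    (A small sketch of that "remember something from the front of the
    stream" point, with made-up input:)

        printf 'count 3\nalpha\nbeta\ngamma\n' |
        awk 'NR == 1 { expected = $2; next }   # memorize a header field
             { seen++ }                        # count the remaining lines
             END { printf "expected %s, saw %d\n", expected, seen }'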

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to Muttley@DastartdlyHQ.org on Sun Oct 13 13:43:54 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <vefvo0$k1mm$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
    On Sat, 12 Oct 2024 16:36:26 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    In article <vee2b1$6vup$1@dont-email.me>, <Muttley@dastardlyhq.com> wrote:
    Unlikely to be running *nix in that case.

    We're discussing the concept of a "standalone binary"; you seem
    to think that means a binary image emitted by a linker and meant
    to run under a hosted environment, like an operating system. It
    does not.

    It can mean either. Essentially its a binary that contains directly runnable
    CPU machine code. I'm not sure why you're having such a conceptual struggle
    understanding this simple concept.

    Oh, I understand what you mean; it's your choice of non-standard
    terminology that I object to. Admittedly, Microsoft uses the
    term "standalone binary" to describe one other people might call
    "statically" linked", but you seem to mean any binary that comes
    out of say, invoking the `gcc` or `clang` driver and getting an
    executable object image. And in context, you seem to mean that
    a "compiler" is a program that _only_ generates such artifacts,
    but that's nonsense, and in fact, many compilers simply don't
    work that way. Even LLVM might generate an intermediate
    language, that is then in turn processed to generate object code
    for some particular target; that target might be an ISA for
    which physical silicon exists, or it might not.

    Consider, for example, the MMIX CPU designed by Knuth; a
    compiler may generate code for that, even though there is no CPU
    implementing the MMIX instruction set that I can pop on down to
    Microcenter and buy. Does that make that compiler less of a
    compiler? Or, keeping with the theme of MMIX, I'll bet someone
    has done an HDL implementation of it suitable for loading into
    and running on an FPGA; so is a compiler targeting it now a
    real compiler?

    Or consider x86; most modern x86 processors are really dataflow
    CPUs, and the x86 instruction encoding is just a bytecode that
    is, in fact, interpreted by the real CPU under the hood. So
    where does that fit on your little shrink-to-fit taxonomy? What
    about a compiler like LLVM that can target multiple backends,
    some of which may not actually be hardware (like, say, eBPF).
    Is any compiler that generates an intermediate language not a
    Real compiler, since it's not generating executable object code
    directly? What about a compiler that _only_ outputs an object
    file and defers to an explicitly programmer-driven separate link
    step? The Plan 9 compiler suite works that way, and indeed,
    actual instruction selection is deferred to the linker.
    https://9p.io/sys/doc/compiler.html

    Or consider the APEX compiler for APL, which generated output as
    code in the SISAL programming language; the SISAL compiler, in
    turn, output either C or FORTRAN. This was actually quite
    useful; APL is great at very high-level optimizations ("multiply
    these matrices this way..."), SISAL was great at medium-level
    inter-procedural optimizations, and of course the system C and
    FORTRAN compilers excel at low-level register-level
    optimization. The effect was a program that was highly
    optimized when finally distilled down into an executable image.
    To assert that these weren't compilers is inane.

    Now you're just being silly.

    *shrug* Not my problem if you haven't dealt with many embedded
    systems.

    I could bore you with the number I've actually "dealt with" including
    military hardware but whats the point.

    Weird appeals to experience, with vague and unsupported claims,
    aren't terribly convincing.

    You've probably programmed the
    occasional PIC or arduino and think you're an expert.

    Ok, Internet Guy.

    Are they? Thats debatable these days. I'd say Linux is a lot closer to
    the philosophy of BSD and SYS-V than MacOS which is a certified unix.

    Yes, they are.

    I disagree. Modern linux reminds me a lot of SunOS and HP-UX from back in
    the day.

    Then I can only guess that you never used either SunOS or HP-UX.

    Not something that can be said for MacOS with its roll-your-own
    Apple specific way of doing pretty much everything.

    Well, since we're talking about "standalone binaries" and my
    example was the Unix kernel, I should confess that I was really
    thinking more like the Unix kernel, or perhaps the standalone
    installation program that came on the V7 tape.

    Standalone in the sense that the opcodes in the binary don't need to be
    transformed into something else before being loaded by the CPU.

    Yeah, no, that's not what anybody serious means when they say
    that.

    Anybody serious presumably meaning you.

    Sorry, you've shown no evidence why I should believe your
    assertions, and you've ignored directly disconfirming evidence
    showing that those assertions don't hold generally. If you want
    to define the concept of a "compiler" to be what you've narrowly
    defined it to be, you'll just have to accept that you're in very
    short company and people aren't going to take you particularly
    seriously.

    I'd say its a grey area because it isn't full compilation is it, the p-code
    still requires an interpreter before it'll run.

    Nope.

    Really? So java bytecode will run direct on x86 or ARM will it? Please give
    some links to this astounding discovery you've made.

    Um, ok. https://en.wikipedia.org/wiki/Jazelle

    Again, I bring up my earlier example of a CPU simulator.

    Compiling is not the same as converting. Is a javascript to C converter a
    compiler? By your definition it is.

    Yes, of course it is. So is the terminfo compiler, and any

    So in your mind google translate is a "compiler" for spoken languages is it?

    To quote you above, "now you're just being silly."

    number of other similar things. The first C++ compiler, cfront
    emitted C code, not object code. Was it not a compiler?

    No, it was a pre-compiler. Just like Oracles PRO*C/C++.

    Nope.

    Only heard of one of them so mostly irrelevant. Mine come from the name of
    tools that compile code to a runnable binary.

    It's very odd that you seek to speak from a position of
    authority when you don't even know who most of the major people
    in the field are.

    I know the important ones. You've dug out some obscure names from google
    that probably only a few CS courses even mention never mind study the work of.

    Ok, so you aren't familiar with the current state of the field
    as far as systems go; fair enough.

    In that case, let's just take a look at an authoritative source
    and see what it says. From Chapter 1, "Introduction to
    Compiling", section 1.1 "Compilers", first sentence of
    "Compilers: Principles, Techniques, and Tools" (1st Edition) by
    Aho, Sethi, and Ullman: "Simply stated, a compiler is a program
    that reads a program written in one language -- the _source_
    language -- and translates it into an equivalent program in
    another language -- the _target_ language."

    In the second paragraph, those authors go on to say, "...a
    target language may be another programming language, or the
    machine language of any computer".

    Note "any computer" could also be a kind of virtual machine.
    And of course, if the target language is another programming
    language, that already covers what is under discussion here.
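
    (cfront has already come up; another everyday case where the target
    language is itself a programming language - assuming the TypeScript
    compiler is installed and app.ts is some TypeScript source:)

        tsc app.ts           # compiles TypeScript to app.js; the "target
                             # language" here is JavaScript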

    So it would seem that your definition is not shared by those who
    quite literally wrote the book on compilers.

    Look, I get the desire to want to pin things down into neat
    little categorical buckets, and if in one's own experience a
    "compiler" has only ever meant GCC or perhaps clang (or maybe
    Microsoft's compiler), then I can get where one is coming from.
    But as usual, in its full generality, the world is just messier
    than whatever conceptual boxes you've built up here.

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Muttley@DastartdlyHQ.org on Sun Oct 13 15:02:13 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Muttley@DastartdlyHQ.org writes:
    On Sat, 12 Oct 2024 16:36:26 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    In article <vee2b1$6vup$1@dont-email.me>, <Muttley@dastardlyhq.com> wrote:
    Unlikely to be running *nix in that case.


    *shrug* Not my problem if you haven't dealt with many embedded
    systems.

    I could bore you with the number I've actually "dealt with" including
    military hardware but whats the point. You've probably programmed the
    occasional PIC or arduino and think you're an expert.

    Dan isn't unknown in the field of computer science, particularly
    Unix and Plan 9. You, on the other hand....

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Dan Cross on Sun Oct 13 15:08:32 2024
    XPost: comp.unix.shell, comp.unix.programmer

    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <vefvo0$k1mm$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:

    Really? So java bytecode will run direct on x86 or ARM will it? Please give
    some links to this astounding discovery you've made.

    Um, ok. https://en.wikipedia.org/wiki/Jazelle

    There was also a company a couple of decades ago that
    built an entire processor designed to execute bytecode
    directly - with a coprocessor to handle I/O.

    IIRC, it was Azul. There were a number of others, including
    Sun.

    None of them panned out - JIT's ended up winning that battle.

    Even ARM no longer includes Jazelle extensions in any of their
    mainstream processors.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Sun Oct 13 14:54:13 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Sun, 13 Oct 2024 13:43:54 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    In article <vefvo0$k1mm$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
    On Sat, 12 Oct 2024 16:36:26 -0000 (UTC)
    It can mean either. Essentially its a binary that contains directly runnable
    CPU machine code. I'm not sure why you're having such a conceptual struggle
    understanding this simple concept.

    Oh, I understand what you mean; it's your choice of non-standard
    terminology that I object to. Admittedly, Microsoft uses the

    So what is standard terminology then?

    Or consider x86; most modern x86 processors are really dataflow
    CPUs, and the x86 instruction encoding is just a bytecode that
    is, in fact, interpreted by the real CPU under the hood. So
    where does that fit on your little shrink-to-fit taxonomy? What

    What happens inside the CPU is irrelevant. Its a black box as far as the
    rest of the machine is concerned. As I said in another post, it could be
    pixies with abacuses, doesn't matter.

    [lots of waffle snipped]

    I could bore you with the number I've actually "dealt with" including >>military hardware but whats the point.

    Weird appeals to experience, with vague and unsupported claims,
    aren't terribly convincing.

    So its ok for you to do that but nobody else?

    You've probably programmed the
    occasional PIC or arduino and think you're an expert.

    Ok, Internet Guy.

    I'll take that as a yes. Btw, you're some random guy on the internet too claiming some kind of higher experience.

    I disagree. Modern linux reminds me a lot of SunOS and HP-UX from back in
    the day.

    Then I can only guess that you never used either SunOS or HP-UX.

    "I disagree with you so you must be lying". Whatever.

    Anybody serious presumably meaning you.

    Sorry, you've shown no evidence why I should believe your
    assertions, and you've ignored directly disconfirming evidence

    Likewise.

    Really? So java bytecode will run direct on x86 or ARM will it? Please give
    some links to this astounding discovery you've made.

    Um, ok. https://en.wikipedia.org/wiki/Jazelle

    So its incomplete and has to revert to software for some opcodes. Great.
    FWIW Sun also had a java processor but you still can't run bytecode on
    normal hardware without a JVM.

    So in your mind google translate is a "compiler" for spoken languages is it?

    To quote you above, "now you're just being silly."

    Why, whats the difference? Your definition seems to be any program that can translate from one language to another.

    No, it was a pre-compiler. Just like Oracles PRO*C/C++.

    Nope.

    Yes, they're entirely analogous.

    https://docs.oracle.com/cd/E11882_01/appdev.112/e10825/pc_02prc.htm

    I know the important ones. You've dug out some obscure names from google
    that probably only a few CS courses even mention never mind study the work of.


    Ok, so you aren't familiar with the current state of the field
    as far as systems go; fair enough.

    Who cares about the current state? Has nothing to do with this discussion.

    Aho, Sethi, and Ullman: "Simply stated, a compiler is a program
    that reads a program written in one language -- the _source_
    language -- and translates it into an equivalent program in
    another language -- the _target_ language."

    Thats an opinion, not a fact.

    So it would seem that your definition is not shared by those who
    quite literally wrote the book on compilers.

    Writing the book is not the same as writing the compilers.

    Look, I get the desire to want to pin things down into neat
    little categorical buckets, and if in one's own experience a
    "compiler" has only ever meant GCC or perhaps clang (or maybe
    Microsoft's compiler), then I can get where one is coming from.

    You can add a couple of TI and MPLAB compilers into that list. And obviously Arduinos , whatever its called. Been a while.

    But as usual, in its full generality, the world is just messier
    than whatever conceptual boxes you've built up here.

    There's a difference between accepting there are shades of grey and asserting that a compiler is pretty much any program which translates from one thing to another.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Muttley@DastartdlyHQ.org on Sun Oct 13 17:17:28 2024
    XPost: comp.unix.programmer

    [ X-post list reduced ]

    On 13.10.2024 16:54, Muttley@DastartdlyHQ.org wrote:

    Aho, Sethi, and Ullman: "Simply stated, a compiler is a program
    that reads a program written in one language -- the _source_
    language -- and translates it into an equivalent program in
    another language -- the _target_ language."

    Thats an opinion, not a fact.

    Well, I recall the compiler construction and formal languages books
    from Aho, Hopcroft, Sethi, Ullman, from the 1980's. Excellent books
    and a reference (also in Europe). - So calling the citation merely
    an "opinion" is quite audacious! - If I had to choose an expert
    from the list Muttley, Aho, Hopcroft, Sethi, Ullman, the choice is
    quite obvious. :-)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to Scott Lurndal on Sun Oct 13 15:52:14 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <QnROO.226037$EEm7.111715@fx16.iad>,
    Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <vefvo0$k1mm$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:

    Really? So java bytecode will run direct on x86 or ARM will it? Please give
    some links to this astounding discovery you've made.

    Um, ok. https://en.wikipedia.org/wiki/Jazelle

    There was also a company a couple of decades ago that
    built an entire processor designed to execute bytecode
    directly - with a coprocessor to handle I/O.

    IIRC, it was Azul. There were a number of others, including
    Sun.

    None of them panned out - JIT's ended up winning that battle.

    Even ARM no longer includes Jazelle extensions in any of their
    mainstream processors.

    Sure. But the fact that any of these were going concerns is an
    existence proof that one _can_ take bytecode targeted toward a
    "virtual" machine and execute it on silicon, making the
    distinction a lot more fluid than might be naively assumed, in
    turn exposing the silliness of this argument that centers around
    this weirdly overly-rigid definition of what a "compiler" is.

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to Who on Sun Oct 13 16:02:13 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Sun, 13 Oct 2024 15:30:03 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    In article <vegmul$ne3v$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
    So what is standard terminology then?

    I've already explained this to you.

    No you haven't. Your explanation seems to be "anything that converts from
    one language to another".

    What happens inside the CPU is irrelevant. Its a black box as far as the
    rest of the machine is concerned. As I said in another post, it could be
    pixies with abacuses, doesn't matter.

    So why do you think it's so important that the definition of a

    Who said its important? Its just what most people think of as compilers.

    CPU"? If, as you admit, what the CPU does is highly variable,
    then why do you cling so hard to this meaningless distinction?

    You're the one making a big fuss about it with pages of waffle to back up
    your claim.

    [lots of waffle snipped]

    In other words, you discard anything that doesn't fit with your
    preconceptions. Got it.

    No, I just have better things to do on a sunday than read all that. Keep
    it to the point.

    So its incomplete and has to revert to software for some opcodes. Great.
    FWIW Sun also had a java processor but you still can't run bytecode on
    normal hardware without a JVM.

    Cool. So if I run a program targetting a newer version of an
    ISA is run on an older machine, and that machine lacks a newer
    instruction present in the program, and the CPU generates an
    illegal instruction trap at runtime that the OS catches and
    emulates on the program's behalf, the program was not compiled?

    And again, what about an emulator for a CPU running on a
    different CPU? I can boot 7th Edition Unix on a PDP-11
    emulator on my workstation; does that mean that the 7the
    edition C compiler wasn't a compiler?

    Its all shades of grey. You seem to be getting very worked up about it.
    As I said, most people consider a compiler as something that translates source code to machine code and writes it to a file.

    Why, whats the difference? Your definition seems to be any program that can
    translate from one language to another.

    If you can't see that yourself, then you're either ignorant or
    obstinant. Take your pick.

    So you can't argue the failure of your logic then. Noted.

    Yes, they're entirely analoguous.

    https://docs.oracle.com/cd/E11882_01/appdev.112/e10825/pc_02prc.htm

    Nah, not really.

    Oh nice counter argument, you really sold your POV there.

    Who cares about the current state? Has nothing to do with this discussion.

    In other words, "I don't have an argument, so I'll just lamely
    try to define things until I'm right."

    I'm just defining things the way most people see it, not some ivory tower
    academics. Anyway, life's too short for the rest.

    [tl;dr]

    that a compiler is pretty much any program which translates from one thing to
    another.

    No. It translates one computer _language_ to another computer
    _language_. In the usual case, that's from a textual source

    Machine code isn't a language. Fallen at the first hurdle with that
    definition.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to Muttley@DastartdlyHQ.org on Sun Oct 13 15:30:03 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <vegmul$ne3v$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
    On Sun, 13 Oct 2024 13:43:54 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    In article <vefvo0$k1mm$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
    On Sat, 12 Oct 2024 16:36:26 -0000 (UTC)
    It can mean either. Essentially its a binary that contains directly runnable
    CPU machine code. I'm not sure why you're having such a conceptual struggle
    understanding this simple concept.

    Oh, I understand what you mean; it's your choice of non-standard
    terminology that I object to. Admittedly, Microsoft uses the

    So what is standard terminology then?

    I've already explained this to you.

    Or consider x86; most modern x86 processors are really dataflow
    CPUs, and the x86 instruction encoding is just a bytecode that
    is, in fact, interpreted by the real CPU under the hood. So
    where does that fit on your little shrink-to-fit taxonomy? What

    What happens inside the CPU is irrelevant. Its a black box as far as the
    rest of the machine is concerned. As I said in another post, it could be
    pixies with abacuses, doesn't matter.

    So why do you think it's so important that the definition of a
    compiler means, "generates object code directly runnable on a
    CPU"? If, as you admit, what the CPU does is highly variable,
    then why do you cling so hard to this meaningless distinction?

    [lots of waffle snipped]

    In other words, you discard anything that doesn't fit with your
    preconceptions. Got it.

    Then I can only guess that you never used either SunOS or HP-UX.

    "I disagree with you so you must be lying". Whatever.

    Way to miss the point by fixating on random details. Lawrence,
    is that you?

    Sorry, you've shown no evidence why I should believe your
    assertions, and you've ignored directly disconfirming evidence

    Likewise.

    I've cited evidence written by acknowledged experts in the
    field; have you?

Really? So java bytecode will run direct on x86 or ARM will it? Please give
some links to this astounding discovery you've made.

    Um, ok. https://en.wikipedia.org/wiki/Jazelle

    So its incomplete and has to revert to software for some opcodes. Great.
    FWIW Sun also had a java processor but you still can't run bytecode on
    normal hardware without a JVM.

Cool. So if a program targetting a newer version of an
    ISA is run on an older machine, and that machine lacks a newer
    instruction present in the program, and the CPU generates an
    illegal instruction trap at runtime that the OS catches and
    emulates on the program's behalf, the program was not compiled?

    And again, what about an emulator for a CPU running on a
    different CPU? I can boot 7th Edition Unix on a PDP-11
emulator on my workstation; does that mean that the 7th
    edition C compiler wasn't a compiler?

So in your mind google translate is a "compiler" for spoken languages is it?
    To quote you above, "now you're just being silly."

Why, whats the difference? Your definition seems to be any program that can
translate from one language to another.

    If you can't see that yourself, then you're either ignorant or
obstinate. Take your pick.

    No, it was a pre-compiler. Just like Oracles PRO*C/C++.

    Nope.

Yes, they're entirely analogous.

    https://docs.oracle.com/cd/E11882_01/appdev.112/e10825/pc_02prc.htm

    Nah, not really.

I know the important ones. You've dug out some obscure names from google
that probably only a few CS courses even mention never mind study the work of.


    Ok, so you aren't familiar with the current state of the field
    as far as systems go; fair enough.

    Who cares about the current state? Has nothing to do with this discussion.

    In other words, "I don't have an argument, so I'll just lamely
    try to define things until I'm right."

    Aho, Sethi, and Ullman: "Simply stated, a compiler is a program
    that reads a program written in one language -- the _source_
    language -- and translates it into an equivalent program in
    another language -- the _target_ language."

    Thats an opinion, not a fact.

    So it would seem that your definition is not shared by those who
    quite literally wrote the book on compilers.

    Writing the book is not the same as writing the compilers.

    Well, let's look at the words of someone who wrote the compiler,
    then. Stroustrup writes in his HOPL paper on C++ that cfront
    was, "my original C++ compiler". Quoted from, https://stroustrup.com/hopl-almost-final.pdf (page 6).

    Look, I get the desire to want to pin things down into neat
    little categorical buckets, and if in one's own experience a
    "compiler" has only ever meant GCC or perhaps clang (or maybe
    Microsoft's compiler), then I can get where one is coming from.

You can add a couple of TI and MPLAB compilers into that list. And obviously
Arduinos, whatever its called. Been a while.

    But as usual, in its full generality, the world is just messier
    than whatever conceptual boxes you've built up here.

There's a difference between accepting there are shades of grey and asserting
that a compiler is pretty much any program which translates from one thing to
another.

    No. It translates one computer _language_ to another computer
    _language_. In the usual case, that's from a textual source
    language to a binary machine language, but that needn't be the
    case.
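
To make that concrete, here is a minimal sketch in Python (the toy source
and target languages, and all names, are mine, purely for illustration) of
a program that satisfies that definition without emitting a single machine
instruction: it translates a tiny infix expression language into a
Lisp-like prefix notation.

import ast

OPS = {ast.Add: '+', ast.Sub: '-', ast.Mult: '*', ast.Div: '/'}

def emit(node):
    # walk the parse tree of the source language and return the
    # equivalent program in the target language
    if isinstance(node, ast.Expression):
        return emit(node.body)
    if isinstance(node, ast.BinOp):
        return '(%s %s %s)' % (OPS[type(node.op)], emit(node.left), emit(node.right))
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return repr(node.value)
    raise SyntaxError('construct not supported by this toy compiler')

print(emit(ast.parse('b * (c + d)', mode='eval')))    # prints: (* b (+ c d))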

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Muttley@DastartdlyHQ.org on Sun Oct 13 18:28:32 2024
    XPost: comp.unix.programmer

    [ X-post list reduced ]

    On 13.10.2024 18:02, Muttley@DastartdlyHQ.org wrote:
    On Sun, 13 Oct 2024 15:30:03 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    [...]

    No. It translates one computer _language_ to another computer
    _language_. In the usual case, that's from a textual source

    Machine code isn't a language. Fallen at the first hurdle with that definition.

    Careful (myself included); watch out for the glazed frost!

    You know there's formal definitions for what constitutes languages.

At first glance I don't see why machine code wouldn't qualify as a
    language (either as some specific "mnemonic" representation, or as
    a sequence of integral numbers or other "code" representations).

    What's the problem, in your opinion, with considering machine code
    as a language?

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dan Cross on Sun Oct 13 17:20:40 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 13/10/2024 16:52, Dan Cross wrote:
    In article <QnROO.226037$EEm7.111715@fx16.iad>,
    Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <vefvo0$k1mm$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:

    Really? So java bytecode will run direct on x86 or ARM will it? Please give
    some links to this astounding discovery you've made.

    Um, ok. https://en.wikipedia.org/wiki/Jazelle

    There was also a company a couple of decades ago that
    built an entire processor designed to execute bytecode
    directly - with a coprocessor to handle I/O.

    IIRC, it was Azul. There were a number of others, including
    Sun.

    None of them panned out - JIT's ended up winning that battle.

    Even ARM no longer includes Jazelle extensions in any of their
    mainstream processors.

    Sure. But the fact that any of these were going concerns is an
    existence proof that one _can_ take bytecodes targetted toward a
    "virtual" machine and execute it on silicon,
    making the
    distinction a lot more fluid than might be naively assumed, in
    turn exposing the silliness of this argument that centers around
    this weirdly overly-rigid definition of what a "compiler" is.

    I've implemented numerous compilers and interpreters over the last few
    decades (and have dabbled in emulators).

    To me the distinctions are clear enough because I have to work at the
    sharp end!

    I'm not sure why people want to try and be clever by blurring the roles
    of compiler and interpreter; that's not helpful at all.

    Sure, people can write emulators for machine code, which are a kind of interpreter, or they can implement bytecode in hardware; so what?

    That doesn't really affect what I do. Writing compiler backends for
    actual CPUs is hard work. Generating bytecode is a lot simpler.
    (Especially in my case as I've devised myself, another distinction.
    Compilers usually target someone else's instruction set.)

    If you want one more distinction, it is this: with my compiler, the
    resultant binary is executed by a separate agency: the CPU. Or maybe the
    OS loader will run it through an emulator.

    With my interpreter, then *I* have to write the dispatch routines and
    write code to implement all the instructions.
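
It's roughly this kind of thing; a minimal sketch in Python with a made-up
opcode set, not my actual bytecode:

def run(program):
    # the dispatch loop: *my* code fetches, decodes and executes every
    # instruction; the CPU never sees the interpreted program itself
    stack, pc = [], 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == 'PUSH':
            stack.append(arg)
        elif op == 'ADD':
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == 'MUL':
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == 'PRINT':
            print(stack.pop())
        else:
            raise ValueError('unknown opcode %r' % op)

# (2 + 3) * 4
run([('PUSH', 2), ('PUSH', 3), ('ADD', None),
     ('PUSH', 4), ('MUL', None), ('PRINT', None)])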

    (My compilers generate an intermediate language, a kind of VM, which is
    then processed further into native code.

    But I have also tried interpreting that VM; it just runs 20 times slower
    than native code. That's what interpreting usually means: slow programs.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Muttley@DastartdlyHQ.org on Sun Oct 13 16:31:58 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 2024-10-11, Muttley@DastartdlyHQ.org <Muttley@DastartdlyHQ.org> wrote:
    Irrelevant. Lot of interpreters do partial compilation and the JVM does it
    on the fly. A proper compiler writes a standalone binary file to disk.

    You might want to check those goalposts again. You can easily make a
    "proper compiler" which just writes a canned interpreter executable to
    disk, appending to it the program source code.
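
A rough sketch of the trick in Python; the file names and the marker are
invented for illustration, and a real tool would be more careful:

import os, stat

# the fixed "canned interpreter" written in front of every program: it
# re-reads its own file, takes everything after the marker line, and runs it
STUB = '''#!/usr/bin/env python3
import sys
src = open(sys.argv[0]).read().split("\\n#===PROGRAM===\\n", 1)[1]
exec(compile(src, sys.argv[0], "exec"))
'''

def compile_to_disk(source_path, out_path):
    # the "proper compiler": canned interpreter first, source appended after it
    with open(source_path) as f:
        src = f.read()
    with open(out_path, 'w') as out:
        out.write(STUB + '\n#===PROGRAM===\n' + src)
    os.chmod(out_path, os.stat(out_path).st_mode | stat.S_IXUSR)

# compile_to_disk('hello.py', 'hello')   # afterwards ./hello runs on its own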

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Kaz Kylheku on Sun Oct 13 20:06:12 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 13/10/2024 17:31, Kaz Kylheku wrote:
    On 2024-10-11, Muttley@DastartdlyHQ.org <Muttley@DastartdlyHQ.org> wrote:
Irrelevant. Lot of interpreters do partial compilation and the JVM does it
on the fly. A proper compiler writes a standalone binary file to disk.

    You might want to check those goalposts again. You can easily make a
    "proper compiler" which just writes a canned interpreter executable to
    disk, appending to it the program source code.


    So, an interpreter. The rest is just details of its deployment. In your example, the program being run is just some embedded data.

    Maybe the real question is what is 'hardware', and what is 'software'.
But the answer won't make everyone happy because hardware can be emulated in software.

    (Implementing software in hardware, specifically the bit of software
    that interprets a VM, is less common, and generally harder.)

    I prefer that there is a clear distinction between compiler and
    interpreter, because you immediately know what's what. (Here I'm
    excluding complex JIT products that mix up both.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to bc@freeuk.com on Sun Oct 13 20:29:46 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <vegs0o$nh5t$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 13/10/2024 16:52, Dan Cross wrote:
    In article <QnROO.226037$EEm7.111715@fx16.iad>,
    Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <vefvo0$k1mm$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:

    Really? So java bytecode will run direct on x86 or ARM will it? Please give
    some links to this astounding discovery you've made.

    Um, ok. https://en.wikipedia.org/wiki/Jazelle

    There was also a company a couple of decades ago that
    built an entire processor designed to execute bytecode
    directly - with a coprocessor to handle I/O.

    IIRC, it was Azul. There were a number of others, including
    Sun.

    None of them panned out - JIT's ended up winning that battle.

    Even ARM no longer includes Jazelle extensions in any of their
    mainstream processors.

    Sure. But the fact that any of these were going concerns is an
    existence proof that one _can_ take bytecodes targetted toward a
    "virtual" machine and execute it on silicon,
    making the
    distinction a lot more fluid than might be naively assumed, in
    turn exposing the silliness of this argument that centers around
    this weirdly overly-rigid definition of what a "compiler" is.

I've implemented numerous compilers and interpreters over the last few
decades (and have dabbled in emulators).

    To me the distinctions are clear enough because I have to work at the
    sharp end!

    I'm not sure why people want to try and be clever by blurring the roles
    of compiler and interpreter; that's not helpful at all.

    I'm not saying the two are the same; what I'm saying is that
    this arbitrary criteria that a compiler must emit a fully
executable binary image is not just inadequate, but also wrong,
    as it renders separate compilation impossible. I am further
    saying that there are many different _types_ of compilers,
    including specialized tools that don't emit machine language.

Sure, people can write emulators for machine code, which are a kind of
interpreter, or they can implement bytecode in hardware; so what?

    That's exactly my point.

    That doesn't really affect what I do. Writing compiler backends for
    actual CPUs is hard work. Generating bytecode is a lot simpler.

    That really depends on the bytecode, doesn't it? The JVM is a
    complex beast; MIPS or the unprivileged integer subset of RISC-V
    are pretty simple in comparison.

    (Especially in my case as I've devised myself, another distinction.
    Compilers usually target someone else's instruction set.)

    If you want one more distinction, it is this: with my compiler, the
    resultant binary is executed by a separate agency: the CPU. Or maybe the
    OS loader will run it through an emulator.

    Python has a mode by which it will emit bytecode _files_, which
    can be separately loaded and interpreted; it even has an
    optimizing mode. Is that substantially different?
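
E.g., something like this, assuming a hello.py exists (the exact .pyc path
is version-tagged and lives under __pycache__):

import py_compile, dis

# byte-compile hello.py to a .pyc file on disk; optimize=1 is the -O mode
pyc_path = py_compile.compile('hello.py', optimize=1)
print(pyc_path)

# and the bytecode itself can be inspected, much as with a disassembler:
with open('hello.py') as f:
    dis.dis(compile(f.read(), 'hello.py', 'exec'))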

    With my interpreter, then *I* have to write the dispatch routines and
    write code to implement all the instructions.

    Again, I don't think that anyone disputes that interpreters
    exist. But insisting that they must take a particular shape is
    just wrong.

    (My compilers generate an intermediate language, a kind of VM, which is
    then processed further into native code.

Then by the definition of this pseudonymous guy I've been
    responding to, your compiler is not a "proper compiler", no?

    But I have also tried interpreting that VM; it just runs 20 times slower
    than native code. That's what interpreting usually means: slow programs.)

    Not necessarily. The JVM does pretty good, quite honestly.

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Christian Weisgerber on Sun Oct 13 21:25:51 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Christian Weisgerber <naddy@mips.inka.de> writes:
    On 2024-10-12, Rainer Weikusat <rweikusat@talktalk.net> wrote:
    Indeed. As far as I know the term, an interpreter is something which
reads text from a file, parses it and checks it for syntax errors
    and then executes the code as soon as enough of it has been gathered to
    allow for execution of something, ie, a complete statement. This read,
    check and parse, execute cycle is repeated until the program
    terminates.

    I don't really want to participate in this discussion, but what
    you're saying there is that all those 1980s home computer BASIC
    interpreters, which read and tokenized a program before execution,
    were actually compilers.

    If they contained something which compiled all of the source code prior
to execution in order to transform it into some actually executable
    intermediate representation whose execution didn't require future access
    to the source code and thus, also didn't include checking the source
    code for syntactical correctness, this something can be called a
    compiler and the execution engine some sort of virtual machine which
could in principle execute programs compiled from source code in any
    programming language.

But judging from Wikipedia, Murkysoft Basic stored programs as a linked
list of preprocessed lines and interpreted these, ie, doing string
lookups of keywords from the source code at run time in order to
determine what code to execute. Insofar as I vaguely remember this from
Apple //c BASIC (it has been a while), syntax errors would also be found at
runtime, ie, once execution reached the line with the error. This would
make it an interpreter.

In contrast to this, this somewhat amusing small Perl program:

while (<>) {
    while (length) {
        s/^(\w+)// and print(scalar reverse($1));
        s/^(\W+)// and print($1);
    }
}

    [reads lines from stdin and prints them with each word reversed]

    gets translated into an op tree whose textual representation (perl -MO=Concise,-basic) looks like
    this:

    y <@> leave[1 ref] vKP/REFC ->(end)
    1 <0> enter v ->2
    2 <;> nextstate(main 1 a.pl:1) v:{ ->3
    x <2> leaveloop vKP/2 ->y
    3 <{> enterloop(next->r last->x redo->4) v ->s
    - <1> null vK/1 ->x
    w <|> and(other->4) vK/1 ->x
    v <1> defined sK/1 ->w
    - <1> null sK/2 ->v
    - <1> ex-rv2sv sKRM*/1 ->t
    s <#> gvsv[*_] s ->t
    u <1> readline[t2] sKS/1 ->v
    t <#> gv[*ARGV] s ->u
    - <@> lineseq vKP ->-
    4 <;> nextstate(main 3 a.pl:2) v:{ ->5
    q <2> leaveloop vKP/2 ->r
    5 <{> enterloop(next->m last->q redo->6) v ->n
    - <1> null vK/1 ->q
    p <|> and(other->6) vK/1 ->q
    o <1> length[t4] sK/BOOL,1 ->p
    - <1> ex-rv2sv sK/1 ->o
    n <#> gvsv[*_] s ->o
    - <@> lineseq vKP ->-
    6 <;> nextstate(main 5 a.pl:3) v:{ ->7
    - <1> null vK/1 ->f
    9 <|> and(other->a) vK/1 ->f
    8 </> subst(/"^(\\w+)"/) sK/BOOL ->9
    7 <$> const[PV ""] s ->8
    e <@> print vK ->f
    a <0> pushmark s ->b
    - <1> scalar sK/1 ->e
    d <@> reverse[t6] sK/1 ->e
    b <0> pushmark s ->c
    - <1> ex-rv2sv sK/1 ->d
    c <#> gvsv[*1] s ->d
    f <;> nextstate(main 5 a.pl:4) v:{ ->g
    - <1> null vK/1 ->m
    i <|> and(other->j) vK/1 ->m
    h </> subst(/"^(\\W+)"/) sK/BOOL ->i
    g <$> const[PV ""] s ->h
    l <@> print vK ->m
    j <0> pushmark s ->k
    - <1> ex-rv2sv sK/1 ->l
    k <#> gvsv[*1] s ->l
    m <0> unstack v ->n
    r <0> unstack v ->s

Each line represents a node on this tree and the names refer to builtin
'ops'. In the actual tree, they're pointers to C functions, and execution
happens as a depth-first traversal of this tree, invoking the op functions
from the leaves up to the root, so that each invocation produces the
arguments needed by the op functions residing at a higher level in the tree.
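
As a rough analogy (in Python, not Perl's actual internals), the evaluation
order is the same as in a closure tree like this, where the children have to
produce their values before the parent op can run:

import operator

def const(value):
    return lambda: value

def binop(op, left, right):
    # the "op function" for this node can only run once its children
    # (towards the leaves) have produced their values
    return lambda: op(left(), right())

# tree for (2 + 3) * 4
tree = binop(operator.mul, binop(operator.add, const(2), const(3)), const(4))
print(tree())   # 20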

    Modules for writing this internal representation to a file and loading
    it back from there and even for translating it into C exist. They're
    just not part of the core distribution anymore.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to Muttley@DastartdlyHQ.org on Sun Oct 13 20:15:45 2024
    XPost: comp.unix.shell, comp.unix.programmer

In article <vegqu5$o3ve$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
On Sun, 13 Oct 2024 15:30:03 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
In article <vegmul$ne3v$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
So what is standard terminology then?

    I've already explained this to you.

No you haven't. Your explanation seems to be "anything that converts from one
language to another".

    The context of this specific quote, which you snipped, was your
    insistence on the meaning of the term, "standalone binary."
    There are a number of common terms for what you are describing,
    which is the general term for the executable output artifact
    from a software build, none of which is "standalone binary".

    Common terms are "executable" or "executable file" (that's what
    the ELF standard calls it, for instance), but also "binary",
    "image", etc.

What happens inside the CPU is irrelevant. Its a black box as far as the
rest of the machine is concerned. As I said in another post, it could be
pixies with abacuses, doesn't matter.

    So why do you think it's so important that the definition of a

    Who said its important? Its just what most people think of as compilers.

    Well, you seem to think it's rather important.

    CPU"? If, as you admit, what the CPU does is highly variable,
    then why do you cling so hard to this meaningless distinction?

You're the one making a big fuss about it with pages of waffle to back up
your claim.

    I just don't like misinformation floating around unchallenged.

    You have cited nothing to back up your claims.

So its incomplete and has to revert to software for some opcodes. Great.
FWIW Sun also had a java processor but you still can't run bytecode on
normal hardware without a JVM.

Cool. So if a program targetting a newer version of an
    ISA is run on an older machine, and that machine lacks a newer
    instruction present in the program, and the CPU generates an
    illegal instruction trap at runtime that the OS catches and
    emulates on the program's behalf, the program was not compiled?

    And again, what about an emulator for a CPU running on a
    different CPU? I can boot 7th Edition Unix on a PDP-11
emulator on my workstation; does that mean that the 7th
    edition C compiler wasn't a compiler?

    Its all shades of grey. You seem to be getting very worked up about it.

    Nah, I don't really care, aside from not wanting misinformation
    to stand unchallenged.

As I said, most people consider a compiler as something that translates source
code to machine code and writes it to a file.

    Sure, if you're talking informally and you mention "a compiler"
    most people will know more or less what you're talking about.
    But back in <vebffc$3n6jv$1@dont-email.me> you wrote,

    |Does it produce a standalone binary as output? No, so its an
    |intepreter not a compiler.

    I said that was a bad distinction, to which you replied in <vebi0j$3nhvq$1@dont-email.me>:

    |A proper compiler writes a standalone binary file to disk.

    Except that, well, it doesn't. Even the "proper compilers" that
    you claim familiarity with basically don't do that; as I pointed
    out to you, they generate object files and a driver invokes a
    linker.

    For that matter, the compiler itself may not even generate
    object code, but rather, may generate textual assembly and let a
    separate assembler pass turn _that_ into object code.

    So yeah. What you've defined to be a "proper compiler" isn't
    really what you seem to think that it is.

    [snip]
Who cares about the current state? Has nothing to do with this discussion.
    In other words, "I don't have an argument, so I'll just lamely
    try to define things until I'm right."

Im just defining things the way most people see it, not some ivory tower
academics. Anyway, lifes too short for the rest.

    The people who create the field are the ones who get to make
the definitions, not you.

Machine code isn't a language. Fallen at the first hurdle with that
definition.

    Oh really? Is that why they call it "machine language"? It's
even in the dictionary with "machine code" as a synonym: https://www.merriam-webster.com/dictionary/machine%20language

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to 643-408-1753@kylheku.com on Sun Oct 13 20:30:08 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <20241013093004.251@kylheku.com>,
    Kaz Kylheku <643-408-1753@kylheku.com> wrote:
    On 2024-10-11, Muttley@DastartdlyHQ.org <Muttley@DastartdlyHQ.org> wrote:
    Irrelevant. Lot of interpreters do partial compilation and the JVM does it >> on the fly. A proper compiler writes a standalone binary file to disk.

    You might want to check those goalposts again. You can easily make a
    "proper compiler" which just writes a canned interpreter executable to
    disk, appending to it the program source code.

    Indeed; this is what the Moscow ML compiler does.

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to All on Sun Oct 13 20:33:10 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Sun, 13 Oct 2024 08:22:53 -0000 (UTC), Muttley boring babbled:

    On Sat, 12 Oct 2024 21:25:17 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    On Sat, 12 Oct 2024 08:42:17 -0000 (UTC), Muttley boring babbled:

    Code generated by a compiler does not require an interpreter.

    Something has to implement the rules of the “machine language”. This is >>why we use the term “abstract machine”, to avoid having to distinguish >>between “hardware” and “software”.

    Think: modern CPUs typically have “microcode” and “firmware” associated
    with them. Are those “hardware” or “software”?

    Who cares what happens inside the CPU hardware?

    Because that’s where your “software” runs.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Muttley@DastartdlyHQ.org on Sun Oct 13 21:33:56 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Muttley@DastartdlyHQ.org writes:
    On Sat, 12 Oct 2024 16:39:20 +0000
    Eric Pozharski <apple.universe@posteo.net> boring babbled:
    with <87wmighu4i.fsf@doppelsaurus.mobileactivedefense.com> Rainer
    Weikusat wrote:
    Muttley@DastartdlyHQ.org writes:
    On Wed, 09 Oct 2024 22:25:05 +0100 Rainer Weikusat
    <rweikusat@talktalk.net> boring babbled:
    Bozo User <anthk@disroot.org> writes:
    On 2024-04-07, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Sun, 07 Apr 2024 00:01:43 +0000, Javier wrote:

    *CUT* [ 19 lines 6 levels deep]

    Its syntax is also a horrific mess.
    Which means precisely what?

    You're arguing with Unix Haters Handbook. You've already lost.

    ITYF the people who dislike Perl are the ones who actually like the unix
    way of having simple daisychained tools instead of some lump of a language that does everything messily.

    Perl is a general-purpose programming language, just like C or Java (or
    Python or Javascript or Rust or $whatnot). This means it can be used to implement anything (with some practical limitation for anything) and not
    that it "does everything".

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Muttley on Sun Oct 13 20:34:47 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Sun, 13 Oct 2024 08:19:16 -0000 (UTC), Muttley wrote:

    ITYF the people who dislike Perl are the ones who actually like the unix
    way of having simple daisychained tools instead of some lump of a
    language that does everything messily.

    Not sure how those small tools can work without the support of much bigger lumps like the shell, the compiler/interpreter for those tools and the
    kernel itself.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Muttley on Sun Oct 13 21:09:13 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Sun, 13 Oct 2024 16:02:13 -0000 (UTC), Muttley wrote:

    You explanation seems to be "anything that converts from one
    language to another".

    You would call that a “translator”. That term was used more in the early days, but that’s essentially synonymous with “compiler”.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Janis Papanagnou on Sun Oct 13 21:10:06 2024
    XPost: comp.unix.programmer

    On Sun, 13 Oct 2024 18:28:32 +0200, Janis Papanagnou wrote:

    You know there's formal definitions for what constitutes languages.

    Not really. For example, some have preferred the term “notation” instead
    of “language”.

    Regardless of what you call it, machine code still qualifies.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Muttley on Sun Oct 13 21:08:09 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Sun, 13 Oct 2024 14:54:13 -0000 (UTC), Muttley wrote:

    What happens inside the CPU is irrelevant.

    But that’s where your “software” runs.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Mon Oct 14 01:16:11 2024
    XPost: comp.unix.programmer

    On 13.10.2024 23:10, Lawrence D'Oliveiro wrote:
    On Sun, 13 Oct 2024 18:28:32 +0200, Janis Papanagnou wrote:

    You know there's formal definitions for what constitutes languages.

    Not really. For example, some have preferred the term “notation” instead of “language”.

    A "notation" is not the same as a [formal (or informal)] "language".

    (Frankly, I don't know where you're coming from; mind to explain your
    point if you think it's relevant. - But since you wrote "_some_ have
    preferred" it might anyway have been only an opinion or a historic
    inaccuracy so it's probably not worth expanding on that?)

    I think we should be clear about terminology.

    I was speaking about [formal] languages as introduced by Chomsky and
    used (and extended) by scientists (specifically computer scientists)
    since then. And these formal characteristics of languages and grammars
    are also the base of the books that have been mentioned and recently
    quoted in this sub-thread.

    Regardless of what you call it, machine code still qualifies.

    Glad you agree.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dan Cross on Mon Oct 14 01:20:45 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 13/10/2024 21:29, Dan Cross wrote:
    In article <vegs0o$nh5t$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 13/10/2024 16:52, Dan Cross wrote:
    In article <QnROO.226037$EEm7.111715@fx16.iad>,
    Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <vefvo0$k1mm$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:

    Really? So java bytecode will run direct on x86 or ARM will it? Please give
    some links to this astounding discovery you've made.

    Um, ok. https://en.wikipedia.org/wiki/Jazelle

    There was also a company a couple of decades ago that
    built an entire processor designed to execute bytecode
    directly - with a coprocessor to handle I/O.

    IIRC, it was Azul. There were a number of others, including
    Sun.

    None of them panned out - JIT's ended up winning that battle.

    Even ARM no longer includes Jazelle extensions in any of their
    mainstream processors.

    Sure. But the fact that any of these were going concerns is an
    existence proof that one _can_ take bytecodes targetted toward a
    "virtual" machine and execute it on silicon,
    making the
    distinction a lot more fluid than might be naively assumed, in
    turn exposing the silliness of this argument that centers around
    this weirdly overly-rigid definition of what a "compiler" is.

    I've implemented numerous compilers and interpreters over the last few
    decades (and have dabbled in emulators).

    To me the distinctions are clear enough because I have to work at the
    sharp end!

    I'm not sure why people want to try and be clever by blurring the roles
    of compiler and interpreter; that's not helpful at all.

    I'm not saying the two are the same; what I'm saying is that
    this arbitrary criteria that a compiler must emit a fully
executable binary image is not just inadequate, but also wrong,
    as it renders separate compilation impossible. I am further
    saying that there are many different _types_ of compilers,
    including specialized tools that don't emit machine language.

    Sure, people can write emulators for machine code, which are a kind of
    interpreter, or they can implement bytecode in hardware; so what?

    That's exactly my point.

    So, then what, we do away with the concepts of 'compiler' and
    'interpreter'? Or allow them to be used interchangeably?

Somehow I don't think it is useful to think of gcc as an interpreter for
    C, or CPython as an native code compiler for Python.

    That doesn't really affect what I do. Writing compiler backends for
    actual CPUs is hard work. Generating bytecode is a lot simpler.

    That really depends on the bytecode, doesn't it? The JVM is a
    complex beast;

    Is it? It's not to my taste, but it didn't look too scary to me. Whereas
    modern CPU instruction sets are horrendous. (I normally target x64,
    which is described in 6 large volumes. RISC ones don't look much better,
    eg. RISC V with its dozens of extensions and special types)

    Example of JVM:

    aload index Push a reference from local variable #index

    MIPS or the unprivileged integer subset of RISC-V
    are pretty simple in comparison.

    (Especially in my case as I've devised myself, another distinction.
    Compilers usually target someone else's instruction set.)

    If you want one more distinction, it is this: with my compiler, the
    resultant binary is executed by a separate agency: the CPU. Or maybe the
    OS loader will run it through an emulator.

    Python has a mode by which it will emit bytecode _files_, which
    can be separately loaded and interpreted; it even has an
    optimizing mode. Is that substantially different?

Whether there is a discrete bytecode file is beside the point. (I
    generated such files for many years.)

    You still need software to execute it. Especially for dynamically typed bytecode which doesn't lend itself easily to either hardware
    implementations, or load-time native code translation.


    With my interpreter, then *I* have to write the dispatch routines and
    write code to implement all the instructions.

    Again, I don't think that anyone disputes that interpreters
    exist. But insisting that they must take a particular shape is
    just wrong.

    What shape would that be? Generally they will need some /software/ to
execute the instructions of the program being interpreted, as I said.
    Some JIT products may choose to do on-demand translation to native code.

    Is there anything else? I'd be interested in anything new!

    (My compilers generate an intermediate language, a kind of VM, which is
    then processed further into native code.

Then by the definition of this pseudonymous guy I've been
    responding to, your compiler is not a "proper compiler", no?

    Actually mine is more of a compiler than many, since it directly
    generates native machine code. Others generally stop at ASM code (eg.
    gcc) or OBJ code, and will invoke separate programs to finish the job.

    The intermediate language here is just a step in the process.

    But I have also tried interpreting that VM; it just runs 20 times slower
    than native code. That's what interpreting usually means: slow programs.)

    Not necessarily. The JVM does pretty good, quite honestly.

    But is it actually interpreting? Because if I generated such code for a statically typed language, then I would first translate to native code,
    of any quality, since it's going to be faster than interpreting.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Janis Papanagnou on Mon Oct 14 01:45:49 2024
    XPost: comp.unix.programmer

    On Mon, 14 Oct 2024 01:16:11 +0200, Janis Papanagnou wrote:

    On 13.10.2024 23:10, Lawrence D'Oliveiro wrote:

    On Sun, 13 Oct 2024 18:28:32 +0200, Janis Papanagnou wrote:

    You know there's formal definitions for what constitutes languages.

    Not really. For example, some have preferred the term “notation”
    instead of “language”.

    A "notation" is not the same as a [formal (or informal)] "language".

    (Frankly, I don't know where you're coming from ...

    <https://en.wikipedia.org/wiki/Programming_language>:

    A programming language is a system of notation for writing computer
    programs.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Mon Oct 14 08:25:37 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Sun, 13 Oct 2024 20:15:45 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    In article <vegqu5$o3ve$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:

    [tl;dr]

    The people who create the field are the ones who get to make
the definitions, not you.

    ITYF people in the field as a whole make the definitions.

Machine code isn't a language. Fallen at the first hurdle with that
definition.

    Oh really? Is that why they call it "machine language"? It's
even in the dictionary with "machine code" as a synonym: https://www.merriam-webster.com/dictionary/machine%20language

    Its not a programming language.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Mon Oct 14 08:23:20 2024
    XPost: comp.unix.programmer

    On Sun, 13 Oct 2024 18:28:32 +0200
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:
    [ X-post list reduced ]

    On 13.10.2024 18:02, Muttley@DastartdlyHQ.org wrote:
    On Sun, 13 Oct 2024 15:30:03 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    [...]

    No. It translates one computer _language_ to another computer
    _language_. In the usual case, that's from a textual source

    Machine code isn't a language. Fallen at the first hurdle with that
    definition.

    Careful (myself included); watch out for the glazed frost!

    You know there's formal definitions for what constitutes languages.

At first glance I don't see why machine code wouldn't qualify as a
    language (either as some specific "mnemonic" representation, or as
    a sequence of integral numbers or other "code" representations).
    What's the problem, in your opinion, with considering machine code
    as a language?

    A programming language is an abstraction of machine instructions that is readable by people.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Mon Oct 14 08:28:57 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Sun, 13 Oct 2024 21:33:56 +0100
Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
Muttley@DastartdlyHQ.org writes:
    ITYF the people who dislike Perl are the ones who actually like the unix
    way of having simple daisychained tools instead of some lump of a language >> that does everything messily.

Perl is a general-purpose programming language, just like C or Java (or
Python or Javascript or Rust or $whatnot). This means it can be used to
implement anything (with some practical limitation for anything) and not
    that it "does everything".

It can be, but generally isn't. Its niche tends to be text processing of
    some sort and for that there are better tools IMO. It used to be big in web backend but those days are long gone.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Sebastian@21:1/5 to Lawrence D'Oliveiro on Mon Aug 26 16:13:33 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In comp.unix.programmer Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Sun, 25 Aug 2024 07:32:26 -0000 (UTC), Sebastian wrote:

    In comp.unix.programmer Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    a =
    b ?
    c ? d : e
    : f ?
    g ? h : i
    : j;

    I find this more confusing than the parentheses.

    Not accustomed to looking at source code in 2D? You have to feel your way from symbol to symbol like brackets, rather than being able to see overall shapes?

    Says the guy who finds a few brackets so confusing that he Black-formatted
    a snippet of Lisp code.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Sebastian on Mon Aug 26 21:31:20 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Mon, 26 Aug 2024 16:13:33 -0000 (UTC), Sebastian wrote:

    In comp.unix.programmer Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    On Sun, 25 Aug 2024 07:32:26 -0000 (UTC), Sebastian wrote:

    In comp.unix.programmer Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    a =
    b ?
    c ? d : e
    : f ?
    g ? h : i
    : j;

    I find this more confusing than the parentheses.

    Not accustomed to looking at source code in 2D? You have to feel your
    way from symbol to symbol like brackets, rather than being able to see
    overall shapes?

    Says the guy who finds a few brackets so confusing that he
    Black-formatted a snippet of Lisp code.

    I use the same principles in all my code.

    (And I have no idea about this “Black” thing. I just do my thing.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Sebastian@21:1/5 to Lawrence D'Oliveiro on Tue Aug 27 03:15:16 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In comp.unix.programmer Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Mon, 26 Aug 2024 16:13:33 -0000 (UTC), Sebastian wrote:

    In comp.unix.programmer Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    On Sun, 25 Aug 2024 07:32:26 -0000 (UTC), Sebastian wrote:

    In comp.unix.programmer Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    Says the guy who finds a few brackets so confusing that he
    Black-formatted a snippet of Lisp code.

    I use the same principles in all my code. (And I have no idea about
this “Black” thing. I just do my thing.)

    Black is a Python program that formats Python code
    almost exactly the way you formatted that snippet of Lisp
    code. It's just as ugly in Python as it is in Lisp. Black
    spreads by convincing organizations to mandate its use. It's
    utterly non-configurable on purpose, in order to guarantee
    that eventually, all Python code is made to be as ugly
    and unreadable as possible.
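
For those who haven't seen it, the style in question looks roughly like this,
from memory; the function and argument names are made up, and the exact output
depends on the Black version and the configured line length:

# a call that fits on one line stays on one line:
result = do_thing(alpha, beta, gamma)

# but once it doesn't fit (or carries a trailing comma), every argument is
# exploded onto its own line, with the closing bracket alone on its line:
result = do_thing(
    alpha,
    beta,
    gamma,
)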

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Sebastian on Tue Aug 27 04:44:52 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Tue, 27 Aug 2024 03:15:16 -0000 (UTC), Sebastian wrote:

    In comp.unix.programmer Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    (And I have no idea about this “Black” thing. I just do my thing.)

    Black is a [bla bla bla]

    *Yawn*

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bozo User@21:1/5 to Muttley@dastardlyhq.com on Mon Sep 30 20:04:52 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 2024-08-08, Muttley@dastardlyhq.com <Muttley@dastardlyhq.com> wrote:
    On Wed, 7 Aug 2024 13:43:10 -0000 (UTC)
    Kaz Kylheku <643-408-1753@kylheku.com> boringly babbled:
    On 2024-08-06, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    Equivalent Lisp, for comparison:

    (setf a (cond (b (if c d e))
    (f (if g h i))
    (t j)))

    You can’t avoid the parentheses, but this, too, can be improved:

    (setf a
    (cond
    (b
    (if c d e)
    )
    (f
    (if g h i)
    )
    (t
    j
    )
    ) ; cond
    )

    Nobody is ever going to follow your idio(syncra)tic coding preferences
    for Lisp, that wouldn't pass code review in any Lisp shop, and result in >>patches being rejected in a FOSS setting.

    I'm not a Lisp dev, but the original looks far more readable to me.
    His definition of improvement seems to be obfuscation.



    Cond can do if clauses by itself. Your code is overloaded.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bozo User@21:1/5 to Lawrence D'Oliveiro on Mon Sep 30 20:04:54 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 2024-04-07, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Sun, 07 Apr 2024 00:01:43 +0000, Javier wrote:

    The downside is the loss of performance because of disk access for
    trivial things like 'nfiles=$(ls | wc -l)'.

    Well, you could save one process creation by writing
    “nfiles=$(echo * | wc -l)” instead. But that would still not be strictly correct.

    I suspect disk access times where
    one of the reasons for the development of perl in the early 90s.

Shells were somewhat less powerful in those days. I would describe the
genesis of Perl as “awk on steroids”. Its big party trick was regular
expressions. And I guess combining that with more sophisticated
data-structuring capabilities.

    Perl is more awk+sed+sh in a single language. Basically the killer
of the Unix philosophy in late 90's/early 00's, and for the good.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Bozo User on Mon Sep 30 21:04:24 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Mon, 30 Sep 2024 20:04:54 -0000 (UTC), Bozo User wrote:

    Perl is more awk+sed+sh in a single language. Basically the killer of
the Unix philosophy in late 90's/early 00's, and for the good.

    That’s what Rob Pike said <https://interviews.slashdot.org/story/04/10/18/1153211/rob-pike-responds>:

    Q: “Given the nature of current operating systems and
    applications, do you think the idea of "one tool doing one job
    well" has been abandoned?”
    A: “Those days are dead and gone and the eulogy was delivered by
    Perl.”

    But I’m not sure I agree. Those small, specialized tools always
    required large, monolithic pieces under them to operate: the shell
    itself for shell scripts, the X server for GUI apps, the kernel itself
    for everything. So while the coming of Perl has changed some things,
    it has not made a difference to the modularity of the Unix way.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Bozo User on Mon Sep 30 20:59:16 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Mon, 30 Sep 2024 20:04:53 -0000 (UTC), Bozo User wrote:

    Check Emacs' eshell where you can mix pseudo-sh code with Elisp

    Can you give examples that either support or refute my claim?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Mon Oct 14 11:05:06 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Mon, 14 Oct 2024 11:38:29 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    The simple but flexible OO system, reliable automatic memory management

    For a certain definition of OO. The requirement to have to use $self-> everywhere to denote object method/var access makes it little better than
    doing OO in C. Then there's the whole 2 stage object creation with the "bless" nonsense. Hacky.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Muttley@DastartdlyHQ.org on Mon Oct 14 11:38:29 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Muttley@DastartdlyHQ.org writes:
    On Sun, 13 Oct 2024 21:33:56 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    Muttley@DastartdlyHQ.org writes:
ITYF the people who dislike Perl are the ones who actually like the unix
way of having simple daisychained tools instead of some lump of a language
that does everything messily.

Perl is a general-purpose programming language, just like C or Java (or
Python or Javascript or Rust or $whatnot). This means it can be used to
implement anything (with some practical limitation for anything) and not

It can be, but generally isn't. Its niche tends to be text processing of
    some sort

    It is. That sysadmin-types using it don't use it to create actual
    programs is of no concern for this, because they never do that and this
    use only needs a very small subset of the features of the language. I've
been using it as a system programming language for programs with up to
    21,000 LOC in the main program (and some more thousands in auxiliary
    modules) and it's very well-suited to that.

    The simple but flexible OO system, reliable automatic memory management
    and support for functions/ subroutine as first-class objects make it
    very nice for implementing event-driven, asynchronous "crossbar"
programs connecting various external entities both running locally and
    on other computers on the internet to create complex applications from
    them.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Mon Oct 14 14:13:11 2024
    XPost: comp.unix.programmer

    On 14.10.2024 03:45, Lawrence D'Oliveiro wrote:
    On Mon, 14 Oct 2024 01:16:11 +0200, Janis Papanagnou wrote:
    On 13.10.2024 23:10, Lawrence D'Oliveiro wrote:
    On Sun, 13 Oct 2024 18:28:32 +0200, Janis Papanagnou wrote:

    You know there's formal definitions for what constitutes languages.

    Not really. For example, some have preferred the term “notation”
    instead of “language”.

    A "notation" is not the same as a [formal (or informal)] "language".

    (Frankly, I don't know where you're coming from ...

    <https://en.wikipedia.org/wiki/Programming_language>:

    A programming language is a system of notation for writing computer
    programs.

    Okay, a "system of notation" (not a "notation") is used here to
    _describe_ it. I'm fine with that formulation. Thanks.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Muttley@DastartdlyHQ.org on Mon Oct 14 14:36:38 2024
    XPost: comp.unix.programmer

    On 14.10.2024 10:23, Muttley@DastartdlyHQ.org wrote:
    On Sun, 13 Oct 2024 18:28:32 +0200
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:
    [ X-post list reduced ]

    On 13.10.2024 18:02, Muttley@DastartdlyHQ.org wrote:
    On Sun, 13 Oct 2024 15:30:03 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    [...]

    No. It translates one computer _language_ to another computer
    _language_. In the usual case, that's from a textual source

    Machine code isn't a language. Fallen at the first hurdle with that
    definition.

    Careful (myself included); watch out for the glazed frost!

    You know there's formal definitions for what constitutes languages.

At first glance I don't see why machine code wouldn't qualify as a
    language (either as some specific "mnemonic" representation, or as
    a sequence of integral numbers or other "code" representations).
    What's the problem, in your opinion, with considering machine code
    as a language?

    A programming language is an abstraction of machine instructions that is readable by people.

    Yes, you can explain "programming language" that way.

    The topic that was cited (Aho, et al.) upthread (and what I spoke
    about) was more generally about [formal] "language", the base also
    of programming languages.

    (In early days of computers they programmed in binary, but that is
    just a side note and unnecessary to support the definition of the
    upthread cited text.)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to Muttley@DastartdlyHQ.org on Mon Oct 14 13:38:04 2024
    XPost: comp.unix.shell, comp.unix.programmer

In article <veiki1$14g6h$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
On Sun, 13 Oct 2024 20:15:45 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    Oh really? Is that why they call it "machine language"? It's
even in the dictionary with "machine code" as a synonym: https://www.merriam-webster.com/dictionary/machine%20language

    Its not a programming language.

    That's news to those people who have, and sometimes still do,
    write programs in it.

    But that's not important. If we go back and look at what I
    wrote that you were responding to, it was this statement, about
    what a compiler does, and your claim that I was asserting it
    was translating anything to anything, which I was not:

    |No. It translates one computer _language_ to another computer
    |_language_. In the usual case, that's from a textual source

    Note that I said, "computer language", not "programming
    language". Being a human-readable language is not a requirement
    for a computer language.

    Your claim that "machine language" is not a "language" is simply
    not true. Your claim that a "proper" compiler must take the
    shape you are pushing is also not true.

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Muttley@DastartdlyHQ.org on Mon Oct 14 14:53:49 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Muttley@DastartdlyHQ.org writes:
    On Mon, 14 Oct 2024 13:38:04 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
In article <veiki1$14g6h$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
On Sun, 13 Oct 2024 20:15:45 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    Oh really? Is that why they call it "machine language"? It's
    even in the dictionary with "machine code" as a synonymn: >>>>https://www.merriam-webster.com/dictionary/machine%20language

    Its not a programming language.

    That's news to those people who have, and sometimes still do,
    write programs in it.

    Really? So if its a language you'll be able to understand this then:

    0011101011010101010001110101010010110110001110010100101001010100
    0101001010010010100101010111001010100110100111010101010101010101
    0001110100011101010001001010110011100010101001110010100101100010

    I certainly understand this, even four decades later

    94A605440C00010200010400000110

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Mon Oct 14 14:59:22 2024
    XPost: comp.unix.programmer

    On Mon, 14 Oct 2024 14:58:13 GMT
    scott@slp53.sl.home (Scott Lurndal) boring babbled:
    Muttley@DastartdlyHQ.org writes:
    On Sun, 13 Oct 2024 18:28:32 +0200
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:
    [ X-post list reduced ]


    A programming language is an abstraction of machine instructions that is >>readable by people.

    By that definition, PAL-D is a programming language.

    Any assembler is a programming language, by that definition.

    Where did I say it wasn't? Of course assembler is a programming language.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Muttley@DastartdlyHQ.org on Mon Oct 14 14:58:13 2024
    XPost: comp.unix.programmer

    Muttley@DastartdlyHQ.org writes:
    On Sun, 13 Oct 2024 18:28:32 +0200
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:
    [ X-post list reduced ]


    A programming language is an abstraction of machine instructions that is >readable by people.

    By that definition, PAL-D is a programming language.

    Any assembler is a programming language, by that definition.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Mon Oct 14 14:47:58 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Mon, 14 Oct 2024 13:38:04 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    In article <veiki1$14g6h$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote: >>On Sun, 13 Oct 2024 20:15:45 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    Oh really? Is that why they call it "machine language"? It's
    even in the dictionary with "machine code" as a synonymn: >>>https://www.merriam-webster.com/dictionary/machine%20language

    Its not a programming language.

    That's news to those people who have, and sometimes still do,
    write programs in it.

    Really? So if it's a language you'll be able to understand this then:

    0011101011010101010001110101010010110110001110010100101001010100
    0101001010010010100101010111001010100110100111010101010101010101
    0001110100011101010001001010110011100010101001110010100101100010

    But that's not important. If we go back and look at what I

    Oh right.


    |No. It translates one computer _language_ to another computer
    |_language_. In the usual case, that's from a textual source

    Note that I said, "computer language", not "programming
    language". Being a human-readable language is not a requirement
    for a computer language.

    Oh watch those goalpost moves with pedant set to 11. Presumably you
    think the values of the address lines are a language too.

    Your claim that "machine language" is not a "language" is simply
    not true. Your claim that a "proper" compiler must take the
    shape you are pushing is also not true.

    If you say so.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Muttley@DastartdlyHQ.org on Mon Oct 14 16:04:18 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Muttley@DastartdlyHQ.org writes:
    On Mon, 14 Oct 2024 11:38:29 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    The simple but flexible OO system, reliable automatic memory management

    [...]

    Then there's the whole 2 stage object creation with the "bless"
    nonsense. Hacky.

    I was planning to write a longer reply but killed it. You're obviously
    arguing about something you reject for political reasons even though
    you're not really familiar with it, and you even 'argue' like a
    politician. That is, you stick pejorative labels on stuff you don't
    like to emphasize how disagreeable you believe it to be. IMHO, such a
    method of (pseudo-)discussing anything is completely pointless.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Scott Lurndal on Mon Oct 14 17:27:18 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 14/10/2024 16:53, Scott Lurndal wrote:
    Muttley@DastartdlyHQ.org writes:
    On Mon, 14 Oct 2024 13:38:04 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    In article <veiki1$14g6h$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
    On Sun, 13 Oct 2024 20:15:45 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    Oh really? Is that why they call it "machine language"? It's
    even in the dictionary with "machine code" as a synonymn:
    https://www.merriam-webster.com/dictionary/machine%20language

    Its not a programming language.

    That's news to those people who have, and sometimes still do,
    write programs in it.

    Really? So if its a language you'll be able to understand this then:

    0011101011010101010001110101010010110110001110010100101001010100
    0101001010010010100101010111001010100110100111010101010101010101
    0001110100011101010001001010110011100010101001110010100101100010

    I certainly understand this, even four decades later

    94A605440C00010200010400000110


    In my early days of assembly programming on my ZX Spectrum, I would hand-assemble to machine code, and I knew at least a few of the codes by
    heart. (01 is "ld bc, #xxxx", 18 is "jr", c9 is "ret", etc.) So while
    I rarely wrote machine code directly, it is certainly still a
    programming language - it's a language you can write programs in.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stefan Ram@21:1/5 to Dan Cross on Mon Oct 14 15:19:03 2024
    cross@spitfire.i.gajendra.net (Dan Cross) wrote or quoted:
    Your claim that "machine language" is not a "language" is simply
    not true.

    Machine language is a language.

    (It might not be a /formal/ language when the specification
    is not definite. For example, when one says, "6502", are the
    "undocumented" opcodes a part of this language or not? So, for
    a formal language, you have to make sure that it's definite.)

    Not related to unix. So, not,

    Newsgroups: comp.unix.shell,comp.unix.programmer,comp.lang.misc

    , but,

    Newsgroups: comp.lang.misc

    .

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Mon Oct 14 15:39:19 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Mon, 14 Oct 2024 16:04:18 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled: >Muttley@DastartdlyHQ.org writes:
    On Mon, 14 Oct 2024 11:38:29 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    The simple but flexible OO system, reliable automatic memory management

    [...]

    Then there's the whole 2 stage object creation with the "bless"
    nonsense. Hacky.

    I was planning to write a longer reply but killed it. You're obviously >argueing about something you reject for political reasons despite you're
    not really familiar with it and you even 'argue' like a politician. That
    is, you stick peiorative labels on stuff you don't like to emphasize how >really disagreeable you believe it to be. IMHO, such a method of >(pseudo-)discussing anything is completely pointless.

    Umm, whatever. I was just saying why I didn't like Perl but if you want to
    read some grand motive into it knock yourself out.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Muttley@DastartdlyHQ.org on Mon Oct 14 17:43:59 2024
    XPost: comp.unix.programmer

    [ X-post list reduced ]

    On 14.10.2024 16:47, Muttley@DastartdlyHQ.org wrote:
    On Mon, 14 Oct 2024 13:38:04 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    In article <veiki1$14g6h$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
    On Sun, 13 Oct 2024 20:15:45 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    Oh really? Is that why they call it "machine language"? It's
    even in the dictionary with "machine code" as a synonymn:
    https://www.merriam-webster.com/dictionary/machine%20language

    Its not a programming language.

    That's news to those people who have, and sometimes still do,
    write programs in it.

    Really? So if its a language you'll be able to understand this then:

    0011101011010101010001110101010010110110001110010100101001010100
    0101001010010010100101010111001010100110100111010101010101010101
    0001110100011101010001001010110011100010101001110010100101100010

    For me, it's substantially no different from, e.g., Chinese text.

    You need context information to understand it. But understanding a
    language is not a condition for defining and handling a language.
    If there's context information then people can associate semantical
    meaning with it (and understand it).

    To illustrate (just playing)...

    if then else then if or if and else end if

    Are you able to understand that? On what abstraction level do you
    understand it? Does it make [semantical] sense to you?
    (Note: Using the proper translator and interpreter this is quite
    dangerous code. For the puzzler; it's a coded shell fork-bomb.)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Mon Oct 14 17:55:14 2024
    XPost: comp.unix.programmer

    [ X-post list reduced ]

    On 14.10.2024 17:27, David Brown wrote:
    On 14/10/2024 16:53, Scott Lurndal wrote:
    Muttley@DastartdlyHQ.org writes:

    Really? So if its a language you'll be able to understand this then:

    0011101011010101010001110101010010110110001110010100101001010100
    0101001010010010100101010111001010100110100111010101010101010101
    0001110100011101010001001010110011100010101001110010100101100010

    I certainly understand this, even four decades later

    94A605440C00010200010400000110

    In my early days of assembly programming on my ZX Spectrum, I would hand-assembly to machine code, and I knew at least a few of the codes by heart. (01 is "ld bc, #xxxx", 18 is "jr", c9 is "ret", etc.) So while
    I rarely wrote machine code directly, it is certainly still a
    programming language - it's a language you can write programs in.

    Your post triggered some memories of my own...

    I have an old pocket calculator (Sharp PC-1401) programmable in
    BASIC. When I found out that it supports undocumented features to
    read machine code numbers from memory and write code numbers into
    memory (and call them as subprograms) I coded programs in decimal
    byte sequences. (A pain, for sure, but in earlier computer eras
    that was quite a normal process.)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to Muttley@DastartdlyHQ.org on Mon Oct 14 16:51:06 2024
    In article <vejauu$186ln$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote: >On Mon, 14 Oct 2024 13:38:04 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    [snip]
    |No. It translates one computer _language_ to another computer >>|_language_. In the usual case, that's from a textual source

    Note that I said, "computer language", not "programming
    language". Being a human-readable language is not a requirement
    for a computer language.

    Oh watch those goalpost moves with pedant set to 11. Presumably you
    think the values of the address lines is a language too.

    Dunno what to tell you: pretty sure you're the one who
    asserted I meant something I didn't write.

    Your claim that "machine language" is not a "language" is simply
    not true. Your claim that a "proper" compiler must take the
    shape you are pushing is also not true.

    If you say so.

    Not just me.

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Scott Lurndal on Mon Oct 14 17:23:11 2024
    XPost: comp.unix.programmer

    On 14/10/2024 15:58, Scott Lurndal wrote:
    Muttley@DastartdlyHQ.org writes:
    On Sun, 13 Oct 2024 18:28:32 +0200
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:
    [ X-post list reduced ]


    A programming language is an abstraction of machine instructions that is
    readable by people.

    By that definition, PAL-D is a programming language.

    (I've no idea what PAL-D is in this context.)

    Any assembler is a programming language, by that definition.


    You mean 'assembly'? An assembler (in the software world) is usually a
    program that translates textual assembly code.

    'Compiler' isn't a programming language (although no doubt someone here
    will dredge up some obscure language with exactly that name just to
    prove me wrong).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Muttley on Mon Oct 14 21:04:44 2024
    XPost: comp.unix.programmer

    On Mon, 14 Oct 2024 08:23:20 -0000 (UTC), Muttley wrote:

    A programming language is an abstraction of machine instructions that is readable by people.

    Like converting circuit voltages to human-readable “1” and “0” symbols, perhaps?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Johanne Fairchild@21:1/5 to Lawrence D'Oliveiro on Tue Aug 27 19:56:50 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Tue, 27 Aug 2024 03:15:16 -0000 (UTC), Sebastian wrote:

    In comp.unix.programmer Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    (And I have no idea about this “Black” thing. I just do my thing.)

    Black is a [bla bla bla]

    *Yawn*

    The guy was kindly and politely sharing information with you.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Johanne Fairchild on Tue Aug 27 23:26:50 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Tue, 27 Aug 2024 19:56:50 -0300, Johanne Fairchild wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Tue, 27 Aug 2024 03:15:16 -0000 (UTC), Sebastian wrote:

    In comp.unix.programmer Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    (And I have no idea about this “Black” thing. I just do my thing.)

    Black is a [bla bla bla]

    *Yawn*

    The guy was kindly and politely sharing information with you.

    Didn’t ask, didn’t know, didn’t care.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Johanne Fairchild on Wed Aug 28 00:09:37 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Tue, 27 Aug 2024 21:08:11 -0300, Johanne Fairchild wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Tue, 27 Aug 2024 19:56:50 -0300, Johanne Fairchild wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Tue, 27 Aug 2024 03:15:16 -0000 (UTC), Sebastian wrote:

    In comp.unix.programmer Lawrence D'Oliveiro <ldo@nz.invalid> wrote: >>>>>>
    (And I have no idea about this “Black” thing. I just do my thing.) >>>>>
    Black is a [bla bla bla]

    *Yawn*

    The guy was kindly and politely sharing information with you.

    Didn’t ask, didn’t know, didn’t care.

    Knock yourself out. You have the freedom to disdain.

    I was “disdaining” something even the poster who mentioned it didn’t seem to think much of.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Johanne Fairchild@21:1/5 to Lawrence D'Oliveiro on Tue Aug 27 21:08:11 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Tue, 27 Aug 2024 19:56:50 -0300, Johanne Fairchild wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Tue, 27 Aug 2024 03:15:16 -0000 (UTC), Sebastian wrote:

    In comp.unix.programmer Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    (And I have no idea about this “Black” thing. I just do my thing.) >>>>
    Black is a [bla bla bla]

    *Yawn*

    The guy was kindly and politely sharing information with you.

    Didn’t ask, didn’t know, didn’t care.

    Knock yourself out. You have the freedom to disdain. But this is the USENET---we don't need to ask for anything here. Every post is an
    invitation for conversation (to the interested ones).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Sebastian on Tue Aug 27 18:05:59 2024
    Sebastian <sebastian@here.com.invalid> writes:

    In comp.unix.programmer Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    On Mon, 26 Aug 2024 16:13:33 -0000 (UTC), Sebastian wrote:

    In comp.unix.programmer Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    On Sun, 25 Aug 2024 07:32:26 -0000 (UTC), Sebastian wrote:

    In comp.unix.programmer Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    Says the guy who finds a few brackets so confusing that he
    Black-formatted a snippet of Lisp code.

    I use the same principles in all my code. (And I have no idea about
    this “Black” thing. I just do my thing.)

    Black is a Python program that formats Python code
    almost exactly the way you formatted that snippet of Lisp
    code. It's just as ugly in Python as it is in Lisp. Black
    spreads by convincing organizations to mandate its use. It's
    utterly non-configurable on purpose, in order to guarantee
    that eventually, all Python code is made to be as ugly
    and unreadable as possible.

    Thank you for this short description.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From usuario@21:1/5 to All on Tue Oct 1 20:18:28 2024
    XPost: comp.unix.shell, comp.unix.programmer

    El Mon, 30 Sep 2024 21:04:24 -0000 (UTC), Lawrence D'Oliveiro escribió:

    On Mon, 30 Sep 2024 20:04:54 -0000 (UTC), Bozo User wrote:

    Perl is more awk+sed+sh in a single language. Basically the killer of
    the Unix philophy in late 90's/early 00's, and for the good.

    That’s what Rob Pike said <https://interviews.slashdot.org/story/04/10/18/1153211/rob-pike-
    responds>:

    Q: “Given the nature of current operating systems and applications,
    do you think the idea of "one tool doing one job well" has been
    abandoned?”
    A: “Those days are dead and gone and the eulogy was delivered by
    Perl.”

    But I’m not sure I agree. Those small, specialized tools always required large, monolithic pieces under them to operate: the shell itself for
    shell scripts, the X server for GUI apps, the kernel itself for
    everything. So while the coming of Perl has changed some things,
    it has not made a difference to the modularity of the Unix way.

    The shell could be reduced to just a command launcher with no
    conditionals, with perl doing all the hard work.

    As for X11/X.org, X11 was never very "Unix like" by itself.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From usuario@21:1/5 to All on Tue Oct 1 20:41:11 2024
    XPost: comp.unix.shell, comp.unix.programmer

    El Mon, 30 Sep 2024 20:59:16 -0000 (UTC), Lawrence D'Oliveiro escribió:

    On Mon, 30 Sep 2024 20:04:53 -0000 (UTC), Bozo User wrote:

    Check Emacs' eshell where you can mix pseudo-sh code with Elisp

    Can you give examples that either support or refute my claim?

    With eshell you can do sh like commands and put elisp (literal lisp code) inside a loop:

    https://howardism.org/Technical/Emacs/eshell-why.html

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to usuario on Tue Oct 1 22:22:42 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Tue, 1 Oct 2024 20:41:11 -0000 (UTC), usuario wrote:

    With eshell you can do sh like commands and put elisp (literal lisp
    code) inside a loop:

    https://howardism.org/Technical/Emacs/eshell-why.html

    You have to put quotes around literal strings. That still makes it a “programming language”, not a “command language” (according to my original
    definition).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Wed Oct 2 07:10:32 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Tue, 1 Oct 2024 20:18:28 -0000 (UTC)
    usuario <anthk@disroot.org> boring babbled:
    El Mon, 30 Sep 2024 21:04:24 -0000 (UTC), Lawrence D'Oliveiro escribió:

    On Mon, 30 Sep 2024 20:04:54 -0000 (UTC), Bozo User wrote:

    Perl is more awk+sed+sh in a single language. Basically the killer of
    the Unix philophy in late 90's/early 00's, and for the good.

    That’s what Rob Pike said
    <https://interviews.slashdot.org/story/04/10/18/1153211/rob-pike- >responds>:

    Q: “Given the nature of current operating systems and applications,
    do you think the idea of "one tool doing one job well" has been
    abandoned?”
    A: “Those days are dead and gone and the eulogy was delivered by
    Perl.”

    But I’m not sure I agree. Those small, specialized tools always required >> large, monolithic pieces under them to operate: the shell itself for
    shell scripts, the X server for GUI apps, the kernel itself for
    everything. So while the coming of Perl has changed some things,
    it has not made a difference to the modularity of the Unix way.

    The shell could be changed as just a command launcher with no
    conditionals, while perl doing all the hard work.

    On X11/X.org, X11 was never very "Unix like" by itself.

    An X server is about as minimal as a graphics system can be
    while still remaining usable. I don't see how it could have been
    subdivided any further and still work.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Tue Oct 15 13:27:09 2024
    XPost: comp.unix.programmer

    On 14/10/2024 18:23, Bart wrote:
    On 14/10/2024 15:58, Scott Lurndal wrote:
    Muttley@DastartdlyHQ.org writes:
    On Sun, 13 Oct 2024 18:28:32 +0200
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:
    [ X-post list reduced ]


    A programming language is an abstraction of machine instructions that is >>> readable by people.

    By that definition, PAL-D is a programming language.

    (I've no idea what PAL-D is in this context.)

    Any assembler is a programming language, by that definition.


    You mean 'assembly'? An assembler (in the sofware world) is usually a
    program that translates textual assembly code.


    I took "an assembler" to mean "an assembler language", which is a common alternative way to write "an assembly language". And IMHO, any assembly language /is/ a programming language.

    'Compiler' isn't a programming language (although no doubt someone here
    will dredge up some obscure language with exactly that name just to
    prove me wrong).


    I tried, just to please you, but I couldn't find such a language :-)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to bc@freeuk.com on Tue Oct 15 15:18:21 2024
    XPost: comp.unix.programmer

    [Followup-To: set to comp.lang.misc, -comp.unix.programmer]

    In article <vejghe$192vs$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 14/10/2024 15:58, Scott Lurndal wrote:
    Muttley@DastartdlyHQ.org writes:
    On Sun, 13 Oct 2024 18:28:32 +0200
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:
    [ X-post list reduced ]


    A programming language is an abstraction of machine instructions that is >>> readable by people.

    By that definition, PAL-D is a programming language.

    (I've no idea what PAL-D is in this context.)

    PAL-D is an assembler for the PDP-8 computer. I don't know why
    one wouldn't consider its input a programming language. https://bitsavers.org/pdf/dec/pdp8/handbooks/programmingLanguages_May70.pdf

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From usuario@21:1/5 to All on Wed Oct 2 12:52:03 2024
    XPost: comp.unix.shell, comp.unix.programmer

    El Wed, 2 Oct 2024 07:10:32 -0000 (UTC), Muttley escribió:

    On Tue, 1 Oct 2024 20:18:28 -0000 (UTC)
    usuario <anthk@disroot.org> boring babbled:
    El Mon, 30 Sep 2024 21:04:24 -0000 (UTC), Lawrence D'Oliveiro escribió:

    On Mon, 30 Sep 2024 20:04:54 -0000 (UTC), Bozo User wrote:

    Perl is more awk+sed+sh in a single language. Basically the killer of
    the Unix philophy in late 90's/early 00's, and for the good.

    That’s what Rob Pike said
    <https://interviews.slashdot.org/story/04/10/18/1153211/rob-pike- >>responds>:

    Q: “Given the nature of current operating systems and
    applications,
    do you think the idea of "one tool doing one job well" has been
    abandoned?”
    A: “Those days are dead and gone and the eulogy was delivered by
    Perl.”

    But I’m not sure I agree. Those small, specialized tools always
    required large, monolithic pieces under them to operate: the shell
    itself for shell scripts, the X server for GUI apps, the kernel itself
    for everything. So while the coming of Perl has changed some things,
    it has not made a difference to the modularity of the Unix way.

    The shell could be changed as just a command launcher with no
    conditionals, while perl doing all the hard work.

    On X11/X.org, X11 was never very "Unix like" by itself.

    An X server is about as minimal as you can have a graphics system and
    still make it usable. I don't see how it could have been subdivided any further and still work.

    Check out Blit under Unix V10 and Rio under plan9/9front for a much better Unix-oriented approach (9front/plan9 is basically Unix philosophy 2.0).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Wed Oct 2 16:00:55 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Wed, 2 Oct 2024 12:52:03 -0000 (UTC)
    usuario <anthk@disroot.org> boring babbled:
    El Wed, 2 Oct 2024 07:10:32 -0000 (UTC), Muttley escribió:
    An X server is about as minimal as you can have a graphics system and
    still make it usable. I don't see how it could have been subdivided any
    further and still work.

    Check out Blit under Unix V10 and Rio under plan9/9front for a much better >Unix-oriented (9front/plan9 it's basically Unix philosophy 2.0) approach.

    Don't have the time. Presumably it's a raw frame buffer or similar? If so
    I can't see it being popular. The unix philosophy of breaking workflows up
    into small subsections has its place, but I don't think graphics is that place.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Bozo User on Wed Oct 9 22:25:05 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Bozo User <anthk@disroot.org> writes:
    On 2024-04-07, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Sun, 07 Apr 2024 00:01:43 +0000, Javier wrote:

    The downside is the loss of performance because of disk access for
    trivial things like 'nfiles=$(ls | wc -l)'.

    Well, you could save one process creation by writing
    “nfiles=$(echo * | wc -l)” instead. But that would still not be strictly >> correct.

    I suspect disk access times where
    one of the reasons for the development of perl in the early 90s.

    Shells were somewhat less powerful in those days. I would describe the
    genesis of Perl as “awk on steroids”. Its big party trick was regular
    expressions. And I guess combining that with more sophisticated data-
    structuring capabilities.

    Perl is more awk+sed+sh in a single language. Basically the killer
    of the Unix philophy in late 90's/early 00's, and for the good.

    Perl is a high-level programming language with a rich syntax¹, with
    support for deterministic automatic memory management, functions as
    first-class objects and message-based OO. It's also a virtual machine
    for executing threaded code and a(n optimizing) compiler for translating
    Perl code into the corresponding threaded code.

    ¹ Has recently gained try/catch for exception handling which is IMNSHO a
    great improvement over eval + $@.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Thu Oct 10 08:38:26 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Wed, 09 Oct 2024 22:25:05 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    Bozo User <anthk@disroot.org> writes:
    On 2024-04-07, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Sun, 07 Apr 2024 00:01:43 +0000, Javier wrote:

    The downside is the loss of performance because of disk access for
    trivial things like 'nfiles=$(ls | wc -l)'.

    Well, you could save one process creation by writing
    “nfiles=$(echo * | wc -l)” instead. But that would still not be >strictly
    correct.

    I suspect disk access times where
    one of the reasons for the development of perl in the early 90s.

    Shells were somewhat less powerful in those days. I would describe the
    genesis of Perl as “awk on steroids”. Its big party trick was regular >>> expressions. And I guess combining that with more sophisticated data-
    structuring capabilities.

    Perl is more awk+sed+sh in a single language. Basically the killer
    of the Unix philophy in late 90's/early 00's, and for the good.

    Perl is a high-level programming language with a rich syntax¹, with
    support for deterministic automatic memory management, functions as >first-class objects and message-based OO. It's also a virtual machine
    for executing threaded code and a(n optimizing) compiler for translating
    Perl code into the corresponding threaded code.

    Its syntax is also a horrific mess. Larry took the worst parts of C and shell syntax and mashed them together. It's no surprise Perl has been ditched in favour of Python just about everywhere for new scripting projects. And while
    I hate Python's meaningful whitespace nonsense, I'd use it in preference
    to Perl any day.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Muttley@DastartdlyHQ.org on Thu Oct 10 16:09:49 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Muttley@DastartdlyHQ.org writes:
    On Wed, 09 Oct 2024 22:25:05 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    Bozo User <anthk@disroot.org> writes:
    On 2024-04-07, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Sun, 07 Apr 2024 00:01:43 +0000, Javier wrote:

    The downside is the loss of performance because of disk access for
    trivial things like 'nfiles=$(ls | wc -l)'.

    Well, you could save one process creation by writing
    “nfiles=$(echo * | wc -l)” instead. But that would still not be >strictly
    correct.

    I suspect disk access times where
    one of the reasons for the development of perl in the early 90s.

    Shells were somewhat less powerful in those days. I would describe the >>>> genesis of Perl as “awk on steroids”. Its big party trick was regular
    expressions. And I guess combining that with more sophisticated data-
    structuring capabilities.

    Perl is more awk+sed+sh in a single language. Basically the killer
    of the Unix philophy in late 90's/early 00's, and for the good.

    Perl is a high-level programming language with a rich syntax¹, with >>support for deterministic automatic memory management, functions as >>first-class objects and message-based OO. It's also a virtual machine
    for executing threaded code and a(n optimizing) compiler for translating >>Perl code into the corresponding threaded code.

    Its syntax is also a horrific mess.

    Which means precisely what?

    Its no surprise Perl has been ditched in favour of Python just about everywhere for new scripting projects.

    "I say so and I'm an avid Phython fan?"

    Not much of a reason.

    BTW, I didn't mean to start another entirely pointless language
    war. Just pointing out that referring to a general-purpose programming
    language as "killer of the UNIX philosophy" makes no sense.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Thu Oct 10 15:34:37 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Thu, 10 Oct 2024 16:09:49 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled: >Muttley@DastartdlyHQ.org writes:
    Its syntax is also a horrific mess.

    Which means precisely what?

    Far too much pointless punctuation. An interpreter shouldn't need the vartype signified by $ or @ once it's defined; it should already know. And then there are semantically meaningful underscores (seriously?) and random hacky keywords such as <STDIN>. I could go on.

    Its no surprise Perl has been ditched in favour of Python just about
    everywhere for new scripting projects.

    "I say so and I'm an avid Phython fan?"

    Not much of a reason.

    It shows the general consensus on which is the easier language to work with.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to All on Thu Oct 10 17:55:32 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Muttley@DastartdlyHQ.org ignorantly rambled:
    On Thu, 10 Oct 2024 16:09:49 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    Muttley@DastartdlyHQ.org writes:
    Its syntax is also a horrific mess.

    Which means precisely what?

    Far too much pointless punctuation. An interpreter shouldn't need the vartype signified by $ or @ once its defined, it should already know.

    For the purpose of variable declaration, how's the interpreter going to
    know the type of a variable without being told about it? Obviously, not
    at all.

    Perl has three builtin types, scalars, arrays and hashes and
    each is denoted by a single-letter prefix which effectively creates
    three different variable namespaces, one for each type. That's often convenient, because the same name can be reused for a variable of a
    different type, eg:

    my ($data, @data, %data);

    $data = rand(128);
    @data = ($data, $data + 1);
    %data = map { $_, 15 } @data;

    it's also convenient to type and easy to read due to being concise.

    Outside of declarations, $ and @ really denote access modes/ contexts,
    with $ standing for "a thing" and @ for "a number of things", eg

    $a[0]

    is the first element of the array @a and

    @a[-3 .. -1]

    is a list composed of the three last elements of @a.

    And then there are semantically meaningful underscores (seriously?)

    Similar to the number writing convention in English: 1,600,700, numbers
    in Perl can be annotated with _-separators to make them easier to
    read. Eg, all of these are identical

    1_600_700
    16_007_00
    1_6_0_0_7_0_0_

    But the underscores have no meaning here.

    and random hacky keywords such as <STDIN>.

    <STDIN> isn't a keyword. STDIN is the name of a glob (symbol table
    entry) in the symbol table of the package main. Its most prominent use
    is (as the name may suggest) to provide access to "the standard input
    stream".

    <> is an I/O operator. Its operand must be a file handle, ie, either
    the name of a glob with a file handle associated with it, like STDIN, or a
    scalar variable used to hold a file handle. In scalar context, it reads
    and returns the next line read from this file handle. In list context,
    it returns all lines in the file.

    Eg, this is a poor man's implementation of cat:

    perl -e 'open($fh, $_) and print <$fh> for @ARGV'

     I could go on.

    Please don't enumerate everything else on this planet you also don't
    really understand as that's probably going to become a huge list. ;-)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Rainer Weikusat on Thu Oct 10 19:14:15 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 2024-10-10, Rainer Weikusat <rweikusat@talktalk.net> wrote:
    Muttley@DastartdlyHQ.org ignorantly rambled:
    On Thu, 10 Oct 2024 16:09:49 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled: >>>Muttley@DastartdlyHQ.org writes:
    Its syntax is also a horrific mess.

    Which means precisely what?

    Far too much pointless punctuation. An interpreter shouldn't need the vartype
    signified by $ or @ once its defined, it should already know.

    For the purpose of variable declaration, how's the interpeter going to

    Interpreter? Perl has some kind of compiler in it, right?

    Interpreters for typed languages are possible. The lexical environment
    bindings contain type info, so when the interpreter sees x, it resolves
    it through the environment not only to a location/value, but to type
    info.

    know the type of a variable without being told about it? Obviously, not
    at all.

    But it's not exactly type, because $x means "scalar variable of any
    type" whereas @x is an "array of any type".

    That's quite useless for proper type checking and only causes noise,
    due to having to be repeated.

    Actually typed languages don't use sigils. How is that?

    The type of a name is declared (or else inferred); references to that
    name don't need to repeat that info.
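
    To make that concrete, here's a toy sketch (entirely hypothetical, and
    in Perl only because that's the language under discussion) of an
    environment whose bindings carry type information, so that resolving a
    bare name yields both a value and its type:

        # toy lexical environment: each binding records a type and a value
        my %env = (
            count => { type => 'scalar', value => 42 },
            items => { type => 'array',  value => [ 1, 2, 3 ] },
        );

        sub lookup {
            my ($name) = @_;
            my $binding = $env{$name} or die "unbound name: $name\n";
            return @{$binding}{qw(type value)};   # (type, value) pair
        }

        my ($type, $value) = lookup('items');
        print "items has type $type\n";           # "items has type array"

    No sigil is needed at the use site because the type travels with the
    binding rather than with the name.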

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Kaz Kylheku on Thu Oct 10 21:31:39 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Kaz Kylheku <643-408-1753@kylheku.com> writes:
    On 2024-10-10, Rainer Weikusat <rweikusat@talktalk.net> wrote:
    Muttley@DastartdlyHQ.org ignorantly rambled:
    On Thu, 10 Oct 2024 16:09:49 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled: >>>>Muttley@DastartdlyHQ.org writes:
    Its syntax is also a horrific mess.

    Which means precisely what?

    Far too much pointless punctuation. An interpreter shouldn't need the vartype
    signified by $ or @ once its defined, it should already know.

    For the purpose of variable declaration, how's the interpeter going to

    Interpreter? Perl has some kind of compiler in it, right?

    The Perl compiler turns Perl source code into a set of (that's a
    conjecture of mine) so-called "op trees" whose nodes contain pointers to built-in functions and pointers to "op subtrees" supplying arguments for
    these, and the interpreter/virtual machine then evaluates these op trees
    to run the program.
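
    (Those op trees can actually be looked at with the B::Concise compiler
    backend that ships with perl, e.g.

        perl -MO=Concise -e 'print $x + 1'

    compiles the one-liner and dumps the resulting op tree instead of
    executing it.)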

    [...]

    know the type of a variable without being told about it? Obviously, not
    at all.

    But it's not exactly type, because $x means "scalar variable of any
    type" whereas @x is an "array of any type".

    $x means 'scalar variable'. There's no further differentiation of that at
    the language level, even though there are two kinds of scalar variables at
    the implementation level: scalars whose values are "values" of some sort
    (ie, strings or numbers) and scalars whose values are references to
    something.

    @x is a 1-D array of scalars.

    That's quite useless for proper type checking and only causes noise,
    due to having to be repeated.

    Actually typed languages don't use sigils. How is that?

    The type of a name is declared (or else inferred); references to that
    name don't need to repeat that info.

    The Perl type system is based on using different namespaces for
    different types which means the type of a variable is part of its
    name. This has the advantages that declaration syntax is concise and
    that it's possible to have different kinds of variables with the same
    name. It's also not really specific to Perl as C uses a similar model
    for structure declarations and definitions.

    The obvious disadvantage is that every variable name in Perl and every
    use of a variable in an expression has an additional meta-information character associated with it. The actual rules outside of declarations
    are also more complicated because of the underlying idea that $ would be something like a singular article in a spoken language and @ a plural
    article. This means that elements of arrays and hashes are referred to
    using a $ prefix and not @ or %, eg,

    my @a;
    $a[0] = 1;

    or

    my %h;
    $h{goatonion} = 'winged cauliflower';

    I think that's rather a weird idea than a great one, but it's internally
    consistent and as good (or bad) as any other language idiosyncrasy. It's certainly less confusing than the : as expression separator in
    supposedly punctuation-free Python, which tripped me up numerous times
    when initially starting to write (some) Python code.

    Things only start to get slightly awful when references become
    involved. In Perl 4 (reportedly, I've never used that) a reference was a variable holding the name of another variable, eg

    $b = 1;
    $a = 'b';
    print $$a; # prints 1

    Perl 5 added references as typed pointers with reference counting but
    retained the symbolic reference syntax. For the example above, that would
    be

    $b = 1;
    $a = \$b;
    print $$a; # also prints 1

    Things start to become complicated once references to complex objects
    are involved. Eg,

    @{$$a[0]}

    is the array referred to by the first item of the array $a refers to and

    ${$$a[0]}[0]

    which seriously starts to look like a trench fortification with
    barbed-wire obstacles is a way to refer to the first element of this
    array. The {$$a[0]} is a block returning a reference to an array which
    the surrounding $ and [0] then dereference. The { } could contain
    arbitrary code returning a reference. But for the simple case of dereference-chaining, this is not needed as it's implied for adjacent
    subscripts (for both hashes and arrays), which means the simpler

    $$a[0][0]

    is equivalent to the other expression.
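
    A tiny runnable illustration (data made up) that these spellings pick
    out the same element, including the equivalent arrow notation:

        my $a = [ [ 'x', 'y' ], [ 'z' ] ];  # reference to an array of array refs
        print ${ $$a[0] }[0], "\n";         # 'x', fully spelled out
        print $$a[0][0], "\n";              # 'x', implied dereference between subscripts
        print $a->[0][0], "\n";             # 'x', same thing with explicit arrows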

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Rainer Weikusat on Fri Oct 11 00:09:39 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 10/10/2024 21:31, Rainer Weikusat wrote:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    Things start to become complicated once references to complex objects
    are involved. Eg,

    @{$$a[0]}

    is the array referred to by the first item of the array $a refers to and

    ${$$a[0]}[0]

    which seriously starts to look like a trench fortification with
    barbed-wire obstacles is a way to refer to the first element of this
    array.

    I thought you were defending the language. Now you seem to be agreeing
    with how bad this syntax is.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Rainer Weikusat on Fri Oct 11 00:07:02 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 10/10/2024 17:55, Rainer Weikusat wrote:
    Muttley@DastartdlyHQ.org ignorantly rambled:
    On Thu, 10 Oct 2024 16:09:49 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    Muttley@DastartdlyHQ.org writes:
    Its syntax is also a horrific mess.

    Which means precisely what?

    Far too much pointless punctuation. An interpreter shouldn't need the vartype
    signified by $ or @ once its defined, it should already know.

    For the purpose of variable declaration, how's the interpeter going to
    know the type of a variable without being told about it? Obviously, not
    at all.

    Perl has three builtin types, scalars, arrays and hashes and
    each is denoted by a single-letter prefix which effectively creates
    three different variable namespaces, one for each type. That's often convenient, because the same name can be reused for a variable of a
    different type, eg:

    my ($data, @data, %data);

    Why would you want to do this?


    $data = rand(128);
    @data = ($data, $data + 1);
    %data = map { $_, 15 } @data;

    it's also convenient to type and easy to read due to being concise.

    Adding shifted punctuation at the start of every instance of a variable?
    I don't call that convenient!

    So, $ is scalar, @ is an array, and % is a hash?


    Outside of declarations, $ and @ really denote access modes/ contexts,
    with $ standing for "a thing" and @ for "a number of things", eg

    $a[0]

    is the first element of the array @a and

    Now I'm already lost. 'a' is an array, but it's being used with $? What
    would just this:

    a[0]

    mean by itself?

    @a[-3 .. -1]

    is a list composed of the three last elements of @a.

    Sorry, these prefixes look utterly pointless to me. This stuff works
    perfectly well in other languages without them.

    I can write a[i..j] in mine and I know that it yields a slice.

    What would a[-3 .. -1] give you in Perl without the @? What would $a[-3
    .. -1] mean?

    What happens if you have an array of mixed scalars, arrays and hashes;
    what prefix to use in front of a[i]?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Muttley on Fri Oct 11 01:33:21 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Thu, 10 Oct 2024 15:34:37 -0000 (UTC), Muttley wrote:

    An interpreter shouldn't need the vartype signified by $ or @ once its defined, it should already know.

    You realize those symbols represent, not just different types, but
    different namespaces as well?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Fri Oct 11 08:17:22 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Thu, 10 Oct 2024 17:55:32 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled: >Muttley@DastartdlyHQ.org ignorantly rambled:
    On Thu, 10 Oct 2024 16:09:49 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled: >>>Muttley@DastartdlyHQ.org writes:
    Its syntax is also a horrific mess.

    Which means precisely what?

    Far too much pointless punctuation. An interpreter shouldn't need the vartype

    signified by $ or @ once its defined, it should already know.

    For the purpose of variable declaration, how's the interpeter going to

    Which part of "once it's defined" did you not understand?

    convenient, because the same name can be reused for a variable of a
    different type, eg:

    my ($data, @data, %data);

    $data = rand(128);
    @data = ($data, $data + 1);
    %data = map { $_, 15 } @data;

    What a mess. And I didn't realise perl allowed variables of different types
    to have the same name! Insanity! Another reason never to use it again.

    it's also convenient to type and easy to read due to being concise.

    If you say so. Others may disagree.

    and returns the next line read from this file handle. In list context,
    it returns all lines in the file.

    Eg, this a poor man's implementation of cat:

    perl -e 'open($fh, $_) and print <$fh> for @ARGV'

    Meanwhile in awk: { print }

    Please don't enumerate everything else on this planet you also don't
    really understand as that's probably going to become a huge list. ;-)

    I understand Perl vies with C++ as the most syntactically messy language
    in common use. Unfortunately I have to use the latter; I don't have to use
    the former.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Bart on Fri Oct 11 16:15:14 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Bart <bc@freeuk.com> writes:
    On 10/10/2024 17:55, Rainer Weikusat wrote:
    Muttley@DastartdlyHQ.org ignorantly rambled:
    On Thu, 10 Oct 2024 16:09:49 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    Muttley@DastartdlyHQ.org writes:
    Its syntax is also a horrific mess.

    Which means precisely what?

    Far too much pointless punctuation. An interpreter shouldn't need the vartype
    signified by $ or @ once its defined, it should already know.
    For the purpose of variable declaration, how's the interpeter going
    to
    know the type of a variable without being told about it? Obviously, not
    at all.
    Perl has three builtin types, scalars, arrays and hashes and
    each is denoted by a single-letter prefix which effectively creates
    three different variable namespaces, one for each type. That's often
    convenient, because the same name can be reused for a variable of a
    different type, eg:
    my ($data, @data, %data);

    Why would you want to do this?

    Because it's convenient. It's possible to have a single variable
    denoting something, say $car, and also a variable used to hold a list of
    cars, which could be named @car.
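
    A trivial sketch of exactly that:

        my $car = 'Volvo';                  # one particular car
        my @car = ($car, 'Saab', 'Audi');   # the list of cars
        print "current: $car, all: @car\n"; # prints: current: Volvo, all: Volvo Saab Audi

    Both variables can coexist because the sigil is part of the name.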

    $data = rand(128);
    @data = ($data, $data + 1);
    %data = map { $_, 15 } @data;
    it's also convenient to type and easy to read due to being concise.

    Adding shifted punctuation at the start of every instance of a
    variable? I don't call that convenient!

    It's convenient for declarations, because it's the shortest possible
    syntax which can be used to declare the type of a variable.

    So, $ is scalar, @ is an array, and % is a hash?

    Yes.

    Outside of declarations, $ and @ really denote access modes/ contexts,
    with $ standing for "a thing" and @ for "a number of things", eg
    $a[0]

    is the first element of the array @a and

    Now I'm already lost. 'a' is an array, but it's being used with $?
    What would just this:

    a[0]

    mean by itself?

    Nothing, ie, it's a syntax error.

    @a[-3 .. -1]
    is a list composed of the three last elements of @a.

    Sorry, these prefixes look utterly pointless to me. This stuff works perfectly well in other languages without them.

    I can write a[i..j] in mine and I know that it yields a slice.

    But presumably only if a is actually an array. In Perl, it's a so-called bareword which was historically just a string. In current Perl, it's
    either a filehandle (which can't be indexed) or calling a function named
    a (which also can't be indexed).

    [...]

    What would $a[-3 .. -1] mean?

    $a[0]

    In list context (denoted by @) the .. operator returns a sequence of
    numbers starting from the value on the left and ending with the value on
    the right. But because of the $, it's in scalar context. Then, it
    returns a boolean value which is false until the left operand becomes
    true, then remains true until the right operand becomes true and then
    again becomes false. That's supposed to be used for matching ranges of something. If an operand is a constant expression, it's supposed to
    refer to a line number of the last file which was accessed. Eg,

    perl -ne 'print if 10 .. 15' </var/log/syslog

    prints lines 10 - 15 from standard input (connected to /var/log/syslog).

    For the given example, the value is always false (arithmetically equal
    to 0) because -3 cannot occur as line number of a file.

    What happens if you have an array of mixed scalars, arrays and hashes;
    what prefix to use in front of a[i]?

    Elements of arrays or hashes are always scalars.
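
    To put arrays or hashes inside an array you store references to them,
    and references are themselves scalars, so the prefix stays $. A small
    sketch:

        my @mixed = ( 42, [ 1, 2, 3 ], { colour => 'red' } );
        print $mixed[0], "\n";             # 42, a plain scalar
        print $mixed[1][2], "\n";          # 3, through the array reference
        print $mixed[2]{colour}, "\n";     # red, through the hash reference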

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Kaz Kylheku on Fri Oct 11 15:47:06 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Bart <bc@freeuk.com> writes:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:
    On 2024-10-10, Rainer Weikusat <rweikusat@talktalk.net> wrote:
    Muttley@DastartdlyHQ.org ignorantly rambled:
    On Thu, 10 Oct 2024 16:09:49 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled: >>>>Muttley@DastartdlyHQ.org writes:
    Its syntax is also a horrific mess.

    Which means precisely what?

    Far too much pointless punctuation. An interpreter shouldn't need the vartype
    signified by $ or @ once its defined, it should already know.

    For the purpose of variable declaration, how's the interpeter going to

    Interpreter? Perl has some kind of compiler in it, right?

    The Perl compiler turns Perl source code into a set of (that's a
    conjecture of mine) so-called "op trees" whose nodes contain pointers to built-in functions and pointers to "op subtrees" supplying arguments for
    these and the interpeter/ virtual machine then evaluates these op trees
    to run the program.

    [...]

    know the type of a variable without being told about it? Obviously, not
    at all.

    But it's not exactly type, because $x means "scalar variable of any
    type" whereas @x is an "array of any type".

    $x means 'scalar variable'. There's no furher differentiation of that at
    the language level despite there are two kinds of scalar variables at
    the implentation level, scalars whose values are "values" of some sort
    (ie, strings or numbers) and scalars whose values are references to
    something.

    @x is an 1-D array of scalars.

    That's quite useless for proper type checking and only causes noise,
    due to having to be repeated.

    Actually typed languages don't use sigils. How is that?

    The type of a name is declared (or else inferred); references to that
    name don't need to repeat that info.

    The Perl type system is based on using different namespaces for
    different types which means the type of a variable is part of its
    name. This has the advantages that declaration syntax is concise and
    that it's possible to have different kinds of variables with the same
    name. It's also not really specific to Perl as C uses a similar model
    for structures declarations and definitions.

    The obvious disadvantage is that every variable name in Perl and every
    use of a variable in an expression has and additional meta-information character associated with it. The actual rules outside of declarations
    are also more complicated because of the underlying idea that $ would be something like a singular article in a spoken language an @ a plural
    article. This means that elements of arrays and hashed are referred to
    using a $ prefix and not @ or %, eg,

    my @a;
    $a[0] = 1;

    or

    my %h;
    $h{goatonion} = 'winged cauliflower';

    I think that's rather a weird than a great idea but it's internally
    consistent and as good (or bad) as any other language ideosyncrasy. It's certainly less confusing than the : as expression separator in
    supposedly punctuation-free Python which tripped me up numerous times
    when initially starting to write (some) Python code.

    Things only start to get slightly awful when references become
    involved. In Perl 4 (reportedly, I've never used that) a reference was a variable holding the name of another variable, eg

    $b = 1;
    $a = 'b';
    print $$a; # prints 1

    Perl 5 added references as typed pointers with reference counting but
    retained the symbolic referenc syntax. For the example above, that would
    be

    $b = 1;
    $a = \$b;
    print $$a; # also prints 1

    Thinks start to become complicated once references to complex objects
    are involved. Eg,

    @{$$a[0]}

    is the array referred to by the first item of the array $a refers to and

    ${$$a[0]}[0]

    which seriously starts to look like a trench fortification with
    barbed-wire obstacles is a way to refer to the first element of this
    array. The {$$a[0]} is a block returning a reference to an array which
    the surrounding $ and [0] then dereference. The { } could contain
    arbitrary code returning a reference. But for the simple case of
    dereference-chaining, this is not needed as it's implied for adjacent
    subscripts (for both hashes and arrays), which means the simpler

    $$a[0][0]

    is equivalent to the other expression.
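
    A small self-contained sketch (data invented for illustration) showing
    that both spellings, plus the arrow form, reach the same element:

    my @inner = (42, 43);
    my @outer = (\@inner);     # first element is a reference to @inner
    my $a     = \@outer;       # $a is a reference to @outer

    print ${$$a[0]}[0];        # 42, the barbed-wire spelling
    print $$a[0][0];           # 42, arrow implied between subscripts
    print $a->[0][0];          # 42, explicit arrow form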

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to Muttley@DastartdlyHQ.org on Fri Oct 11 15:45:01 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <vebffc$3n6jv$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
    On Fri, 11 Oct 2024 15:47:06 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    Bart <bc@freeuk.com> writes:
    Interpreter? Perl has some kind of compiler in it, right?

    The Perl compiler turns Perl source code into a set of (that's a

    Does it produce a standalone binary as output? No, so it's an interpreter,
    not a compiler. However unlike the python interpreter it's non-interactive,
    making it an even less attractive option these days.

    That's a bad distinction. There have been "Load and Go"
    compilers in the past that have compiled and linked a program
    directly into memory and executed it immediately after
    compilation. As I recall, the Waterloo FORTRAN compilers on the
    IBM mainframe did, or could do, more or less this.

    Saving to some sort of object image is not a necessary function
    of a compiler.

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Fri Oct 11 15:15:57 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Fri, 11 Oct 2024 15:47:06 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    Bart <bc@freeuk.com> writes:
    Interpreter? Perl has some kind of compiler in it, right?

    The Perl compiler turns Perl source code into a set of (that's a

    Does it produce a standalone binary as output? No, so it's an interpreter,
    not a compiler. However unlike the python interpreter it's non-interactive,
    making it an even less attractive option these days.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to Dan Cross on Fri Oct 11 15:59:15 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Fri, 11 Oct 2024 15:45:01 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    In article <vebffc$3n6jv$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
    On Fri, 11 Oct 2024 15:47:06 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    Bart <bc@freeuk.com> writes:
    Interpreter? Perl has some kind of compiler in it, right?

    The Perl compiler turns Perl source code into a set of (that's a

    Does it produce a standalone binary as output? No, so it's an interpreter,
    not a compiler. However unlike the python interpreter it's non-interactive,
    making it an even less attractive option these days.

    That's a bad distinction. There have been "Load and Go"
    compilers in the past that have compiled and linked a program
    directly into memory and executed it immediately after
    compilation. As I recall, the Waterloo FORTRAN compilers on the
    IBM mainframe did, or could do, more or less this.

    Irrelevant. Lots of interpreters do partial compilation and the JVM does it
    on the fly. A proper compiler writes a standalone binary file to disk.

    Saving to some sort of object image is not a necessary function
    of a compiler.

    Yes it is.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Dan Cross on Fri Oct 11 16:37:02 2024
    XPost: comp.unix.shell, comp.unix.programmer

    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <vebffc$3n6jv$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
    On Fri, 11 Oct 2024 15:47:06 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    Bart <bc@freeuk.com> writes:
    Interpreter? Perl has some kind of compiler in it, right?

    The Perl compiler turns Perl source code into a set of (that's a

    Does it produce a standalone binary as output? No, so it's an interpreter,
    not a compiler. However unlike the python interpreter it's non-interactive,
    making it an even less attractive option these days.

    That's a bad distinction. There have been "Load and Go"
    compilers in the past that have compiled and linked a program
    directly into memory and executed it immediately after
    compilation. As I recall, the Waterloo FORTRAN compilers on the
    IBM mainframe did, or could do, more or less this.

    Indeed, the Burroughs mainframes also had compile&go capabilities.

    HP-3000 had a concept called "pass files" which held the intermediate
    formats between source and executable:
    $ BASICCOMP HANGMAN.BAS
    $ PREP $OLDPASS, $NEWPASS ($OLDPASS and $NEWPASS were implied if omitted).
    $ RUN $OLDPASS

    It also had

    $ BASICGO (compile, prepare (link) and run)
    $ BASICPREP (compile and prepare)

    Similarly for COBOL, FORTRAN and SPL.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to Muttley@DastartdlyHQ.org on Fri Oct 11 16:28:03 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <vebi0j$3nhvq$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
    On Fri, 11 Oct 2024 15:45:01 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    In article <vebffc$3n6jv$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
    On Fri, 11 Oct 2024 15:47:06 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    Bart <bc@freeuk.com> writes:
    Interpreter? Perl has some kind of compiler in it, right?

    The Perl compiler turns Perl source code into a set of (that's a

    Does it produce a standalone binary as output? No, so it's an interpreter,
    not a compiler. However unlike the python interpreter it's non-interactive,
    making it an even less attractive option these days.

    That's a bad distinction. There have been "Load and Go"
    compilers in the past that have compiled and linked a program
    directly into memory and executed it immediately after
    compilation. As I recall, the Waterloo FORTRAN compilers on the
    IBM mainframe did, or could do, more or less this.

    Irrelevant. Lot of interpreters do partial compilation and the JVM does it
    on the fly. A proper compiler writes a standalone binary file to disk.

    Not generally, no. Most compilers these days generate object
    code and then, as a separate step, a linker is invoked to
    combine object files and library archives into an executable
    binary.

    By the way, when many people talk about a "standalone" binary,
    they are referring to something directly executable on hardware,
    without the benefit of an operating system. The Unix kernel is
    an example of such a "standalone binary."

    Most executable binaries are not standalone.

    Saving to some sort of object image is not a necessary function
    of a compiler.

    Yes it is.

    So you say, but that's not the commonly accepted definition.
    Sorry.

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Muttley@DastartdlyHQ.org on Fri Oct 11 19:01:27 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Muttley@DastartdlyHQ.org writes:
    On Fri, 11 Oct 2024 15:47:06 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    Bart <bc@freeuk.com> writes:
    Interpreter? Perl has some kind of compiler in it, right?

    The Perl compiler turns Perl source code into a set of (that's a

    Does it produce a standalone binary as output? No, so it's an interpreter,
    not a compiler. However unlike the python interpreter it's non-interactive,
    making it an even less attractive option these days.

    The perl debugger offers an interactive environment (with line editing support if
    the necessary packages/ modules are available). It can be invoked with a suitable 'script argument' to use it without actually debugging
    something, eg,

    perl -de 0

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Muttley@DastartdlyHQ.org on Fri Oct 11 19:37:46 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Muttley@DastartdlyHQ.org writes:
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:

    [...]

    Eg, this a poor man's implementation of cat:

    perl -e 'open($fh, $_) and print <$fh> for @ARGV'

    Meanwhile in awk: { print }

    perl -peZ

    It's not only shorter than the awk version but it also works: cat
    doesn't abort when some of its arguments don't name existing files.
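
    For readers without perlrun memorized: -p wraps the supplied code in
    an implicit read-print loop, so the one-liner above behaves roughly
    like this longhand sketch ('Z' being nothing but a do-nothing
    placeholder program):

    while (<>) {        # read from each file named in @ARGV, or STDIN
        # the code given to -e would run here
    } continue {
        print;          # print $_ after every iteration
    }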

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Muttley on Fri Oct 11 20:58:26 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Fri, 11 Oct 2024 15:15:57 -0000 (UTC), Muttley wrote:

    On Fri, 11 Oct 2024 15:47:06 +0100
    Rainer Weikusat <rweikusat@talktalk.net>:

    The Perl compiler turns Perl source code into a set of (that's a

    Does it produce a standalone binary as output? No, so it's an interpreter,
    not a compiler.

    There are two parts: the interpreter interprets code generated by the compiler.

    Remember, your CPU is an interpreter, too.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Sat Oct 12 08:39:09 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Fri, 11 Oct 2024 16:28:03 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    In article <vebi0j$3nhvq$1@dont-email.me>, <Muttley@DastartdlyHQ.org> wrote:
    Irrelevant. Lots of interpreters do partial compilation and the JVM does it
    on the fly. A proper compiler writes a standalone binary file to disk.

    Not generally, no. Most compilers these days generate object
    code and then, as a separate step, a linker is invoked to
    combine object files and library archives into an executable
    binary.

    Ok, the compiler toolchain then. Most people invoke it using a single command, the rest is behind the scenes.

    By the way, when many people talk about a "standalone" binary,
    they are referring to something directly executable on hardware,

    For many read a tiny minority.

    without the benefit of an operating system. The Unix kernel is
    an example of such a "standalone binary."

    If you're going to nitpick then I'm afraid you're wrong. Almost all operating
    systems require some kind of bootloader and/or BIOS combination to start them
    up. You can't just point the CPU at the first byte of the binary and off it
    goes, particularly in the case of Linux where the kernel requires
    decompressing first.

    Most executable binaries are not standalone.

    Standalone, as you are well aware, in the sense that it doesn't require an
    interpreter or VM to run on the OS and contains CPU machine code.

    Saving to some sort of object image is not a necessary function
    of a compiler.

    Yes it is.

    So you say, but that's not the commonly accepted definition.
    Sorry.

    Where do you get this commonly accepted definition from?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Sat Oct 12 08:40:46 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Fri, 11 Oct 2024 19:01:27 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    Muttley@DastartdlyHQ.org writes:
    On Fri, 11 Oct 2024 15:47:06 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    Bart <bc@freeuk.com> writes:
    Interpreter? Perl has some kind of compiler in it, right?

    The Perl compiler turns Perl source code into a set of (that's a

    Does it produce a standalone binary as output? No, so it's an interpreter,
    not a compiler. However unlike the python interpreter it's non-interactive,
    making it an even less attractive option these days.

    The perl debugger offers an interactive environment (with line editing
    support if the necessary packages/ modules are available). It can be
    invoked with a suitable 'script argument' to use it without actually debugging
    something, eg,

    perl -de 0

    I didn't know about that, I stand corrected on that point.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Sat Oct 12 08:42:17 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Fri, 11 Oct 2024 20:58:26 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> boring babbled:
    On Fri, 11 Oct 2024 15:15:57 -0000 (UTC), Muttley wrote:

    On Fri, 11 Oct 2024 15:47:06 +0100
    Rainer Weikusat <rweikusat@talktalk.net>:

    The Perl compiler turns Perl source code into a set of (that's a

    Does it produce a standalone binary as output? No, so it's an interpreter,
    not a compiler.

    There are two parts: the interpreter interprets code generated by the compiler.

    Code generated by a compiler does not require an interpreter.



    Remember, your CPU is an interpreter, too.

    If you want to go down the reductio ad absurdum route then the electrons
    are interpreters too.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@dastardlyhq.com@21:1/5 to All on Sat Nov 23 11:40:37 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Fri, 22 Nov 2024 18:18:04 -0000 (UTC)
    Kaz Kylheku <643-408-1753@kylheku.com> gabbled:
    On 2024-11-22, Muttley@DastartdlyHQ.org <Muttley@DastartdlyHQ.org> wrote:
    Its not that simple I'm afraid since comments can be commented out.

    Umm, no.

    Umm, yes, they can.

    eg:

    // int i; /*

    This /* sequence is inside a // comment, and so the machinery that
    recognizes /* as the start of a comment would never see it.

    Yes, that's kind of the point. You seem to be arguing against yourself.

    A C99 and C++ compiler would see "int j" and compile it, a regex would
    simply remove everything from the first /* to */.

    No, it won't, because that's not how regexes are used in a lexical

    Yes, it will.

    Also the same probably applies to #ifdef's.

    Lexically analyzing C requires implementing the translation phases
    as described in the standard. There are preprocessor phases which
    delimit the input into preprocessor tokens (pp-tokens). Comments
    are stripped in preprocessing. But logical lines (backslash
    continuations) are recognized below comments; i.e. this is one
    comment:

    Not sure what your point is. A regex cannot be used to parse C comments
    because it doesn't know the C/C++ grammar.
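
    For what it's worth, a minimal sketch of the pitfall under discussion
    (the input is invented for illustration): a line-oblivious
    substitution that strips /* ... */ happily eats real code when the /*
    only occurs inside a // comment:

    my $src = "// int i; /*\nint j;  /* real comment */\n";
    (my $naive = $src) =~ s{/\*.*?\*/}{}s;  # non-greedy, . matches newline
    print $naive;   # prints just "// int i; " - the declaration of j is gone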

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Sebastian@21:1/5 to Muttley@dastartdlyhq.org on Mon Nov 11 07:31:13 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In comp.unix.programmer Muttley@dastartdlyhq.org wrote:
    On Wed, 09 Oct 2024 22:25:05 +0100
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    Bozo User <anthk@disroot.org> writes:
    On 2024-04-07, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Sun, 07 Apr 2024 00:01:43 +0000, Javier wrote:

    The downside is the loss of performance because of disk access for
    trivial things like 'nfiles=$(ls | wc -l)'.

    Well, you could save one process creation by writing
    ‘nfiles=$(echo * | wc -l)’ instead. But that would still not be strictly
    correct.

    I suspect disk access times were
    one of the reasons for the development of perl in the early 90s.

    Shells were somewhat less powerful in those days. I would describe the
    genesis of Perl as “awk on steroids”. Its big party trick was regular
    expressions. And I guess combining that with more sophisticated data-
    structuring capabilities.

    Perl is more awk+sed+sh in a single language. Basically the killer
    of the Unix philosophy in the late 90's/early 00's, and for the good.

    Perl is a high-level programming language with a rich syntax, with
    support for deterministic automatic memory management, functions as
    first-class objects and message-based OO. It's also a virtual machine
    for executing threaded code and a(n optimizing) compiler for translating
    Perl code into the corresponding threaded code.

    Its syntax is also a horrific mess. Larry took the worst parts of C and shell
    syntax and mashed them together. It's no surprise Perl has been ditched in
    favour of Python just about everywhere for new scripting projects. And while
    I hate Python's meaningful-whitespace nonsense, I'd use it in preference
    to Perl any day.

    I think you've identified the one language that Python is better than.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Mon Nov 11 10:06:40 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Mon, 11 Nov 2024 07:31:13 -0000 (UTC)
    Sebastian <sebastian@here.com.invalid> boring babbled:
    In comp.unix.programmer Muttley@dastartdlyhq.org wrote:
    syntax and mashed them together. It's no surprise Perl has been ditched in
    favour of Python just about everywhere for new scripting projects. And while
    I hate Python's meaningful-whitespace nonsense, I'd use it in preference
    to Perl any day.

    I think you've identified the one language that Python is better than.

    Yes, Python does have a lot of cons as a language. But its syntax lets
    newbies get up to speed quickly and there are a lot of libraries. However it's
    dog slow and inefficient and I'm amazed it's used as a key language for AI
    development - not traditionally a newbie coder area - when in that application
    speed really is essential. Yes it generally calls libraries written in C/C++
    but then why not just write the higher level code in C++ too?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Rainer Weikusat on Sun Nov 24 06:42:59 2024
    Rainer Weikusat <rweikusat@talktalk.net> writes:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    [...]

    Personally I think that writing bulky procedural stuff for
    something like [0-9]+ can only be much worse, and that further
    abbreviations like \d+ are the better direction to go if targeting
    a good interface. YMMV.

    Assuming that p is a pointer to the current position in a string, e
    is a pointer to the end of it (ie, pointing just past the last byte)
    and - that's important - both are pointers to unsigned quantities,
    the 'bulky' C equivalent of [0-9]+ is

    while (p < e && *p - '0' < 10) ++p;

    To force the comparison to be done as unsigned:

    while (p < e && *p - '0' < 10u) ++p;

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Kaz Kylheku on Sun Nov 24 20:08:24 2024
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    Here is an example: using a regex match to capture a C comment /* ... */
    in Lex compared to just recognizing the start sequence /* and handling
    the discarding of the comment in the action.

    Without non-greedy repetition matching, the regex for a C comment is
    quite obtuse. The procedural handling is straightforward: read
    characters until you see a * immediately followed by a /.

    Regular expressions are neither greedy nor non-greedy. One of the
    key points of regular expressions is that they are declarative
    rather than procedural. Any procedural change of behavior overlaid
    on a regular expression is a property of the tool, not the regular
    expression. It's easy to write a regular expression that exactly
    matches a /* ... */ comment and that isn't hard to understand.
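
    For reference, one such expression - the classic one from the lex
    folklore, not quoted from either post - written as a Perl pattern:

    # matches exactly one complete /* ... */ comment, no non-greedy operators
    my $c_comment = qr{ /\* [^*]* \*+ (?: [^/*] [^*]* \*+ )* / }x;

    For example, '/* a * b */' =~ $c_comment matches, while an
    unterminated '/*' does not.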

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Wolfgang Agnes@21:1/5 to Muttley@DastartdlyHQ.org on Mon Nov 11 08:28:51 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Muttley@DastartdlyHQ.org writes:

    On Mon, 11 Nov 2024 07:31:13 -0000 (UTC)
    Sebastian <sebastian@here.com.invalid> boring babbled:
    In comp.unix.programmer Muttley@dastartdlyhq.org wrote:
    syntax and mashed them together. It's no surprise Perl has been ditched in
    favour of Python just about everywhere for new scripting projects. And while
    I hate Python's meaningful-whitespace nonsense, I'd use it in preference
    to Perl any day.

    I think you've identified the one language that Python is better than.

    Yes, Python does have a lot of cons as a language. But its syntax lets newbies get up to speed quickly and there are a lot of libraries. However its dog slow and inefficient and I'm amazed its used as a key language for AI development - not traditionally a newbie coder area - when in that application
    speed really is essential. Yes it generally calls libraries written in C/C++ but then why not just write the higher level code in C++ too?

    You'd have to give up the REPL, for instance.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@dastardlyhq.com@21:1/5 to All on Mon Nov 11 16:21:26 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Mon, 11 Nov 2024 08:28:51 -0300
    Wolfgang Agnes <wagnes@jemoni.to> gabbled:
    Muttley@DastartdlyHQ.org writes:

    On Mon, 11 Nov 2024 07:31:13 -0000 (UTC)
    Sebastian <sebastian@here.com.invalid> boring babbled:
    In comp.unix.programmer Muttley@dastartdlyhq.org wrote:
    syntax and mashed them together. It's no surprise Perl has been ditched in
    favour of Python just about everywhere for new scripting projects. And while
    I hate Python's meaningful-whitespace nonsense, I'd use it in preference
    to Perl any day.

    I think you've identified the one language that Python is better than.

    Yes, Python does have a lot of cons as a language. But its syntax lets
    newbies get up to speed quickly and there are a lot of libraries. However its

    dog slow and inefficient and I'm amazed its used as a key language for AI
    development - not traditionally a newbie coder area - when in that >application
    speed really is essential. Yes it generally calls libraries written in C/C++ >> but then why not just write the higher level code in C++ too?

    You'd have to give up the REPL, for instance.

    Not that big a deal especially if the model takes hours or days to train anyway.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Muttley on Mon Nov 11 20:55:15 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Mon, 11 Nov 2024 10:06:40 -0000 (UTC), Muttley wrote:

    Yes it generally calls libraries written in C/C++
    but then why not just write the higher level code in C++ too?

    Because it’s easier to do higher-level stuff in Python.

    Example: <https://github.com/HamPUG/meetings/tree/master/2018/2018-08-13/ldo-creating-api-bindings-using-ctypes>

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Sebastian on Mon Nov 11 21:24:14 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Mon, 11 Nov 2024 07:31:13 -0000 (UTC), Sebastian wrote:

    In comp.unix.programmer Muttley@dastartdlyhq.org wrote:

    [Perl’s] syntax is also a horrific mess. Larry took the worst parts of
    C and shell syntax and mashed them together.

    I think you've identified the one language that Python is better than.

    In terms of the modern era of high-level programming, Perl was the
    breakthrough language. Before Perl, BASIC was considered to be an example
    of a language with “good” string handling. After Perl, BASIC looked old
    and clunky indeed.

    Perl was the language that made regular expressions sexy. Because it made
    them easy to use.
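
    The kind of thing that made it feel easy (a trivial made-up example):
    matching and capturing happen directly in the language, with no
    library calls needed.

    my ($user, $host) = 'alice@example.com' =~ /(\w+)\@([\w.]+)/;
    print "$user at $host";   # alice at example.com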

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Muttley@DastartdlyHQ.org on Tue Nov 12 10:14:20 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 11.11.2024 11:06, Muttley@DastartdlyHQ.org wrote:

    Yes, Python does have a lot of cons as a language. But its syntax lets newbies get up to speed quickly

    and to abruptly get stopped again due to obscure, misleading, or
    (at best), non-informative error messages

    and there are a lot of libraries. However its
    dog slow and inefficient and I'm amazed its used as a key language for AI

    (and not only there; it's ubiquitous, it seems)

    development - not traditionally a newbie coder area - when in that application
    speed really is essential. Yes it generally calls libraries written in C/C++ but then why not just write the higher level code in C++ too?

    Because of its simpler syntax and less syntactical ballast compared
    to C++?

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Tue Nov 12 10:23:38 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 11.11.2024 22:24, Lawrence D'Oliveiro wrote:
    On Mon, 11 Nov 2024 07:31:13 -0000 (UTC), Sebastian wrote:

    In comp.unix.programmer Muttley@dastartdlyhq.org wrote:

    [Perl’s] syntax is also a horrific mess. Larry took the worst parts of >>> C and shell syntax and mashed them together.

    I think you've identified the one language that Python is better than.

    In terms of the modern era of high-level programming, Perl was the breakthrough language. Before Perl, BASIC was considered to be an example
    of a language with “good” string handling. After Perl, BASIC looked old and clunky indeed.

    I'm not, erm.., a fan of Perl or anything, but comparing it to BASIC
    is way off; Perl is not *that* bad. - N.B.: Of course no one can say
    what "BASIC" actually is given the many variants and dialects. - I'm
    sure you must have some modern variant in mind that might have little
    to do with the various former BASIC dialects (that I happened to use
    in the 1970's; e.g., Wang, Olivetti, Commodore, and a mainframe that
    I don't recall).

    It's more interesting to ask what Perl added compared to BRE/ERE, which
    Unix has provided since its beginning (and long before Perl).


    Perl was the language that made regular expressions sexy. Because it made them easy to use.

    For those of us who used regexps in Unix from the beginning it's not
    as shiny as you'd like us to believe; Unix was supporting Chomsky-3
    Regular Expressions with a syntax that is still used in contemporary
    languages. Perl supports some nice syntactic shortcuts, but also
    patterns that exceed Chomsky-3's; too bad if one doesn't know these
    differences and the complexity degradation that may be bought with them.

    More interesting to me is the fascinating fact that on some non-Unix
    platforms it took decades before regexps got (slooooowly) introduced
    (even in their simplest form).

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Tue Nov 12 09:21:51 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Tue, 12 Nov 2024 10:14:20 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:
    On 11.11.2024 11:06, Muttley@DastartdlyHQ.org wrote:
    and there are a lot of libraries. However its
    dog slow and inefficient and I'm amazed its used as a key language for AI

    (and not only there; it's ubiquitous, it seems)

    Yes, certainly seems to be the case now.

    development - not traditionally a newbie coder area - when in that >application
    speed really is essential. Yes it generally calls libraries written in C/C++ >> but then why not just write the higher level code in C++ too?

    Because of its simpler syntax and less syntactical ballast compared
    to C++?

    When you're dealing with something as complicated and frankly ineffable as
    an AI model I doubt syntactic quirks of the programming language matter that much in comparison. Surely you'd want the fastest implementation possible and in this case it would be C++.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Muttley@DastartdlyHQ.org on Tue Nov 12 10:31:58 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 12.11.2024 10:21, Muttley@DastartdlyHQ.org wrote:
    On Tue, 12 Nov 2024 10:14:20 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:
    On 11.11.2024 11:06, Muttley@DastartdlyHQ.org wrote:
    [ Q: why some prefer Python over C++ ]

    Because of its simpler syntax and less syntactical ballast compared
    to C++?

    When you're dealing with something as complicated and frankly ineffable as
    an AI model I doubt syntactic quirks of the programming language matter that much in comparison.

    Oh, I would look at it differently; in whatever application domain I
    program I want a syntactically clear and well-defined language.

    Surely you'd want the fastest implementation possible and
    in this case it would be C++.

    Speed is one factor (to me), and expressiveness or "modeling power"
    (OO) is another one. I also appreciate consistently defined languages
    and quality of error catching and usefulness of diagnostic messages.
    (There's some more factors, but...)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Tue Nov 12 09:53:59 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Tue, 12 Nov 2024 10:31:58 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:
    On 12.11.2024 10:21, Muttley@DastartdlyHQ.org wrote:
    On Tue, 12 Nov 2024 10:14:20 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:
    On 11.11.2024 11:06, Muttley@DastartdlyHQ.org wrote:
    [ Q: why some prefer Python over C++ ]

    Because of its simpler syntax and less syntactical ballast compared
    to C++?

    When you're dealing with something as complicated and frankly ineffable as >> an AI model I doubt syntactic quirks of the programming language matter that >> much in comparison.

    Oh, I would look at it differently; in whatever application domain I
    program I want a syntactic clear and well defined language.

    In which case I'd go with a statically typed language like C++ every time
    ahead of a dynamic one like python.

    Surely you'd want the fastest implementation possible and
    in this case it would be C++.

    Speed is one factor (to me), and expressiveness or "modeling power"
    (OO) is another one. I also appreciate consistently defined languages
    and quality of error catching and usefulness of diagnostic messages.
    (There's some more factors, but...)

    C++ is undeniably powerful, but I think the majority would agree now that
    its syntax has become an unwieldy mess.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Wolfgang Agnes on Tue Nov 19 06:14:27 2024
    On 12.11.2024 17:50, Wolfgang Agnes wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    [...]

    By Chomsky-3 you mean a grammar of type 3 in the Chomsky hierarchy? And
    that would be ``regular'' language, recognizable by a finite-state
    automaton? If not, could you elaborate on the terminology?

    Yes. I hoped the term was clear enough. If I had used too sloppy
    wording in my ad hoc writing I apologize for the inconvenience.

    My point was about runtime guarantees and complexities (O(N)) of
    Regexp processing, which are also reflected by the FSA model.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Muttley@DastartdlyHQ.org on Tue Nov 12 15:05:00 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 12.11.2024 10:53, Muttley@DastartdlyHQ.org wrote:

    In which case I'd go with a statically typed language like C++ every time ahead of a dynamic one like python.

    Definitely!

    I'm using untyped languages (like Awk) for scripting, though, but
    not for code of considerable scale.

    Incidentally, one of my children recently spoke about their setups;
    they use Fortran with old libraries (hydrodynamic earth processes),
    have the higher level tasks implemented in C++, and they do the
    "job control" of the simulation tasks with Python. - A multi-tier
    architecture. - That sounds not unreasonable to me. (But they had
    built their system based on existing software, so it might have
    been a different decision if they'd have built it from scratch.)


    C++ is undeniably powerful, but I think the majority would agree now that
    its syntax has become an unwieldy mess.

    Yes. And recent standards made it yet worse - When I saw it the
    first time I couldn't believe that this would be possible. ;-)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Tue Nov 12 15:09:23 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Tue, 12 Nov 2024 15:05:00 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:
    On 12.11.2024 10:53, Muttley@DastartdlyHQ.org wrote:
    C++ is undeniably powerful, but I think the majority would agree now that
    its syntax has become an unwieldy mess.

    Yes. And recent standards made it yet worse - When I saw it the
    first time I couldn't believe that this would be possible. ;-)

    Unfortunately these days the C++ steering committee (or whatever it's called)
    simply seem to be using the language to justify their positions and keep
    chucking in "features" that no one asked for or cares about, with the end
    ever learn (or at least remember if they tried).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Janis Papanagnou on Tue Nov 12 14:50:26 2024
    On 12/11/2024 14:05, Janis Papanagnou wrote:
    On 12.11.2024 10:53, Muttley@DastartdlyHQ.org wrote:

    In which case I'd go with a statically typed language like C++ every time
    ahead of a dynamic one like python.

    Definitely!

    I'm using untyped languages (like Awk) for scripting, though, but
    not for code of considerable scale.

    Incidentally, on of my children recently spoke about their setups;
    they use Fortran with old libraries (hydrodynamic earth processes),
    have the higher level tasks implemented in C++, and they do the
    "job control" of the simulation tasks with Python. - A multi-tier architecture. - That sounds not unreasonable to me. (But they had
    built their system based on existing software, so it might have
    been a different decision if they'd have built it from scratch.)


    My last major app (now over 20 years ago), had such a 2-language solution.

    It was a GUI-based low-end 2D/3D CAD app, written in my lower level
    systems language.

    But the app also had an embedded scripting language, which had access to
    the app's environment and users' data.

    That was partly so that users (both OEMs and end-users) could write
    their own scripts. To this end it was moderately successful as OEMs
    could write their own add-on applications (for example, to help design
    lighting rigs).

    But I also used it exclusively for the GUI side of the application:
    menus, dialogs, cursor control, layouts, the simpler file conversions
    (eg. export my data models to 3DS format) while the native code parts
    dealt with the critical parts: the 3D maths, managing the 3D models, the
    display drivers, etc.

    The whole thing was perhaps 150-200Kloc (not including OEM or user
    programs), which was about half static/compiled code and half dynamic/interpreted.

    (One of the original motivations, when it had to run on constrained
    systems, was to allow a lot of the code to exist as standalone scripts,
    which resided on floppy disks, and which were only loaded as needed.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Wolfgang Agnes@21:1/5 to Janis Papanagnou on Tue Nov 12 13:50:58 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    [...]

    Perl was the language that made regular expressions sexy. Because it made
    them easy to use.

    For those of us who used regexps in Unix from the beginning it's not
    that shiny as you want us to buy it; Unix was supporting Chomsky-3
    Regular Expressions with a syntax that is still used in contemporary languages. Perl supports some nice syntactic shortcuts, but also
    patterns that exceed Chomsky-3's; too bad if one doesn't know these differences and any complexity degradation that may be bought with it.

    By Chomsky-3 you mean a grammar of type 3 in the Chomsky hierarchy? And
    that would be ``regular'' language, recognizable by a finite-state
    automaton? If not, could you elaborate on the terminology?

    More interesting to me is the fascinating fact that on some non-Unix platforms it took decades before regexps got (slooooowly) introduced
    (even in its simplest form).

    Such as which platform?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Wolfgang Agnes@21:1/5 to Muttley@DastartdlyHQ.org on Tue Nov 12 13:47:15 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Muttley@DastartdlyHQ.org writes:

    On Tue, 12 Nov 2024 10:14:20 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:

    [...]

    Because of its simpler syntax and less syntactical ballast compared
    to C++?

    When you're dealing with something as complicated and frankly ineffable as
    an AI model I doubt syntactic quirks of the programming language matter that much in comparison. Surely you'd want the fastest implementation possible and in this case it would be C++.

    I really wouldn't be so sure. :)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Bart on Tue Nov 12 20:35:30 2024
    On Tue, 12 Nov 2024 14:50:26 +0000, Bart wrote:

    But the app also had an embedded scripting language, which had access to
    the app's environment and users' data.

    Did you invent your own scripting language? Nowadays you would use
    something ready-made, like Lua, Guile or even Python.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Janis Papanagnou on Tue Nov 12 20:29:02 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Tue, 12 Nov 2024 10:23:38 +0100, Janis Papanagnou wrote:

    On 11.11.2024 22:24, Lawrence D'Oliveiro wrote:

    Perl was the language that made regular expressions sexy. Because it
    made them easy to use.

    ... Unix was supporting Chomsky-3
    Regular Expressions with a syntax that is still used in contemporary languages.

    Not in anything resembling a general-purpose high-level language. That’s
    what Perl pioneered.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Lawrence D'Oliveiro on Tue Nov 12 21:48:39 2024
    On 12/11/2024 20:35, Lawrence D'Oliveiro wrote:
    On Tue, 12 Nov 2024 14:50:26 +0000, Bart wrote:

    But the app also had an embedded scripting language, which had access to
    the app's environment and users' data.

    Did you invent your own scripting language? Nowadays you would use
    something ready-made, like Lua, Guile or even Python.

    At that time (late 80s) I had to invent pretty much everything.

    I still do, language-wise.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Randal L. Schwartz@21:1/5 to All on Tue Nov 19 18:43:48 2024
    XPost: comp.unix.shell, comp.unix.programmer

    "Lawrence" == Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    Lawrence> Perl was the language that made regular expressions
    Lawrence> sexy. Because it made them easy to use.

    I'm often reminded of this as I've been coding very little in Perl these
    days, and a lot more in languages like Dart, where the regex feels like
    a clumsy bolt-on rather than a proper first-class citizen.

    There are times I miss Perl. But not too often any more. :)

    --
    Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/> Perl/Dart/Flutter consulting, Technical writing, Comedy, etc. etc.
    Still trying to think of something clever for the fourth line of this .sig

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Randal L. Schwartz on Wed Nov 20 04:34:31 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Tue, 19 Nov 2024 18:43:48 -0800, Randal L. Schwartz wrote:

    "Lawrence" == Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    Lawrence> Perl was the language that made regular expressions
    Lawrence> sexy. Because it made them easy to use.

    I'm often reminded of this as I've been coding very little in Perl these days, and a lot more in languages like Dart, where the regex feels like
    a clumsy bolt-on rather than a proper first-class citizen.

    Python has regexes as a bolt-on -- a library module, not a core part of
    the language. But I think the way it leverages the core language -- e.g.
    being able to iterate over pattern matches, and collecting information
    about matches in a “Match” object -- keeps it quite useful in a nicely functional way.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Wed Nov 20 08:21:17 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Tue, 19 Nov 2024 18:43:48 -0800
    merlyn@stonehenge.com (Randal L. Schwartz) boring babbled:
    "Lawrence" == Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    Lawrence> Perl was the language that made regular expressions
    Lawrence> sexy. Because it made them easy to use.

    I'm often reminded of this as I've been coding very little in Perl these >days, and a lot more in languages like Dart, where the regex feels like
    a clumsy bolt-on rather than a proper first-class citizen.

    Regex itself is clumsy beyond simple search and replace patterns. A lot of
    stuff I've seen done in regex would have been better done procedurally at the
    expense of slightly more code but a LOT more readability. Also given it's
    effectively a compact language with its own grammar and syntax IMO it should
    not be the core part of any language as it can lead to a syntactic mess, which
    is what often happens with Perl.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Muttley@DastartdlyHQ.org on Wed Nov 20 11:51:11 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 20.11.2024 09:21, Muttley@DastartdlyHQ.org wrote:
    On Tue, 19 Nov 2024 18:43:48 -0800
    merlyn@stonehenge.com (Randal L. Schwartz) boring babbled:

    I'm often reminded of this as I've been coding very little in Perl these
    days, and a lot more in languages like Dart, where the regex feels like
    a clumsy bolt-on rather than a proper first-class citizen.

    Regex itself is clumsy beyond simple search and replace patterns. A lot of stuff I've seen done in regex would have better done procedurally at the expense of slightly more code but a LOT more readability. Also given its effectively a compact language with its own grammar and syntax IMO it should not be the core part of any language as it can lead to a syntatic mess, which is what often happens with Perl.

    I wouldn't look at it that way. I've seen Regexps as part of languages
    usually in well defined syntactical contexts. For example, like strings
    are enclosed in "...", Regexps could be seen within /.../ delimiters.
    GNU Awk (in recent versions) went towards first class "strongly typed"
    Regexps which are then denoted by the @/.../ syntax.

    I'm curious what you mean by Regexps presented in a "procedural" form.
    Can you give some examples?

    Personally I'm fine with the typical lexical meta-symbols in Regexps,
    which resemble the FSA and allow a simple transformation back and forth.

    In practice, given that a Regexp conforms to a FSA, any Regexp can be precompiled and used multiple times. The thing I had used in Java - it
    was a library from Apache, IIRC, not the bulky thing that got included
    later - was easily usable; create a Regexp object by a RE expression,
    then operate on that same object. (Since there's still typical Regexp
    syntax involved I suppose that is not what you meant by "procedural"?)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Wed Nov 20 11:30:44 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Wed, 20 Nov 2024 11:51:11 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:
    On 20.11.2024 09:21, Muttley@DastartdlyHQ.org wrote:
    On Tue, 19 Nov 2024 18:43:48 -0800
    merlyn@stonehenge.com (Randal L. Schwartz) boring babbled:

    I'm often reminded of this as I've been coding very little in Perl these >>> days, and a lot more in languages like Dart, where the regex feels like
    a clumsy bolt-on rather than a proper first-class citizen.

    Regex itself is clumsy beyond simple search and replace patterns. A lot of >> stuff I've seen done in regex would have better done procedurally at the
    expense of slightly more code but a LOT more readability. Also given its
    effectively a compact language with its own grammar and syntax IMO it should >> not be the core part of any language as it can lead to a syntatic mess, >which
    is what often happens with Perl.

    I wouldn't look at it that way. I've seen Regexps as part of languages >usually in well defined syntactical contexts. For example, like strings
    are enclosed in "...", Regexps could be seen within /.../ delimiters.
    GNU Awk (in recent versions) went towards first class "strongly typed" >Regexps which are then denoted by the @/.../ syntax.

    I'm curious what you mean by Regexps presented in a "procedural" form.
    Can you give some examples?

    Anything that can be done in regex can obviously also be done procedurally.
    At the point regex expressions become unwieldy - usually when substitution
    variables raise their heads - I prefer procedural code as it's also often
    easier to debug.

    In practice, given that a Regexp conforms to a FSA, any Regexp can be >precompiled and used multiple times. The thing I had used in Java - it

    Precompiled regex is no more efficient than precompiled anything, it's all
    just assembler at the bottom.

    then operate on that same object. (Since there's still typical Regexp
    syntax involved I suppose that is not what you meant by "procedural"?)

    If you don't know the difference between declarative syntax like regex and
    procedural syntax then there's not much point continuing this discussion.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Wed Nov 20 12:27:54 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Wed, 20 Nov 2024 05:46:49 -0600
    Ed Morton <mortonspam@gmail.com> boring babbled:
    On 11/20/2024 2:21 AM, Muttley@DastartdlyHQ.org wrote:
    On Tue, 19 Nov 2024 18:43:48 -0800
    merlyn@stonehenge.com (Randal L. Schwartz) boring babbled:
    "Lawrence" == Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    Lawrence> Perl was the language that made regular expressions
    Lawrence> sexy. Because it made them easy to use.

    I'm often reminded of this as I've been coding very little in Perl these >>> days, and a lot more in languages like Dart, where the regex feels like
    a clumsy bolt-on rather than a proper first-class citizen.

    Regex itself is clumsy beyond simple search and replace patterns. A lot of >> stuff I've seen done in regex would have better done procedurally at the
    expense of slightly more code but a LOT more readability.

    Definitely. The most relevant statement about regexps is this:

    Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.

    Very true!

    Obviously regexps are very useful and commonplace but if you find you
    have to use some online site or other tools to help you write/understand
    one or just generally need more than a couple of minutes to
    write/understand it then it's time to back off and figure out a better
    way to write your code for the sake of whoever has to read it 6 months
    later (and usually for robustness too as it's hard to be sure all rainy
    day cases are handled correctly in a lengthy and/or complicated regexp).

    Edge cases are regex's Achilles' heel, eg an expression that only accounted
    for 1 -> N chars, not 0 -> N, or matches in the middle but not at the ends.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Muttley@DastartdlyHQ.org on Wed Nov 20 12:21:04 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Muttley@DastartdlyHQ.org writes:
    merlyn@stonehenge.com (Randal L. Schwartz) boring babbled:
    "Lawrence" == Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    Lawrence> Perl was the language that made regular expressions
    Lawrence> sexy. Because it made them easy to use.

    I'm often reminded of this as I've been coding very little in Perl these >>days, and a lot more in languages like Dart, where the regex feels like
    a clumsy bolt-on rather than a proper first-class citizen.

    Regex itself is clumsy beyond simple search and replace patterns. A lot of stuff I've seen done in regex would have better done procedurally at the expense of slightly more code but a LOT more readability. Also given its effectively a compact language with its own grammar and syntax IMO it should not be the core part of any language as it can lead to a syntatic mess, which is what often happens with Perl.

    A mess is something which often happens when people who can't organize
    their thoughts just trudge on nevertheless. They're perfectly capable of accomplishing that in any programming language.

    A real problem with regexes in Perl is that they're pretty slow for simple
    use cases (like lexical analysis) and thus, not suitable for volume data processing outside of throwaway code¹.

    ¹ I used to use a JSON parser written in OO-Perl which made extensive
    use of regexes for that. I've recently replaced that with a C/XS version
    which - while slightly larger (617 vs 410 lines of text) - is over a
    hundred times faster and conceptually simpler at the same time.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Muttley@DastartdlyHQ.org on Wed Nov 20 16:38:24 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 20.11.2024 12:30, Muttley@DastartdlyHQ.org wrote:
    On Wed, 20 Nov 2024 11:51:11 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:
    On 20.11.2024 09:21, Muttley@DastartdlyHQ.org wrote:
    On Tue, 19 Nov 2024 18:43:48 -0800
    merlyn@stonehenge.com (Randal L. Schwartz) boring babbled:

    I'm often reminded of this as I've been coding very little in Perl these >>>> days, and a lot more in languages like Dart, where the regex feels like >>>> a clumsy bolt-on rather than a proper first-class citizen.

    Regex itself is clumsy beyond simple search and replace patterns. A lot of >>> stuff I've seen done in regex would have better done procedurally at the >>> expense of slightly more code but a LOT more readability. Also given its >>> effectively a compact language with its own grammar and syntax IMO it should
    not be the core part of any language as it can lead to a syntatic mess,
    which
    is what often happens with Perl.

    I wouldn't look at it that way. I've seen Regexps as part of languages
    usually in well defined syntactical contexts. For example, like strings
    are enclosed in "...", Regexps could be seen within /.../ delimiters.
    GNU Awk (in recent versions) went towards first class "strongly typed"
    Regexps which are then denoted by the @/.../ syntax.

    I'm curious what you mean by Regexps presented in a "procedural" form.
    Can you give some examples?

    Anything that can be done in regex can obviously also be done procedurally. At the point regex expression become unwieldy - usually when substitution variables raise their heads - I prefer procedural code as its also often easier to debug.

    You haven't even tried to honestly answer my (serious) question.
    With your statement above and your hostility below, it rather seems
    you have no clue of what I am talking about.


    In practice, given that a Regexp conforms to a FSA, any Regexp can be
    precompiled and used multiple times. The thing I had used in Java - it

    Precompiled regex is no more efficient than precompiled anything , its all just assembler at the bottom.

    The Regexps are a way to specify the words of a regular language;
    for pattern matching the expression gets interpreted or compiled; you
    specify it, e.g., using strings of characters and meta-characters.
    If you have a programming language where that string gets repeatedly interpreted then it's slower than a precompiled Regexp expression.

    I give you examples...

    (1) DES encryption function

    (1a) ciphertext = des_encode (key, plaintext)

    (1b) cipher = des (key)
    ciphertext = cipher.encode (plaintext)

    In case (1) you can either call the DES encryption (decryption) for
    any (key, plaintext)-pair in a procedural function as in (1a), or
    you can create the key-specific encryption once and encode various
    texts with the same cipher object as in (1b).

    (2) regexp matching

    (2a) location = regexp (pattern, string)

    (2b) fsm = regexp (pattern)
    location = fsm.match (string)

    In case (2) you can either do the match in a string with a pattern
    in a procedural form as in (2a) or you can create the FSM for the
    given Regexp just once and apply it on various strings as in (2b).

    That's what I was talking about.

    Only if the key (in (1)) or the pattern (in (2)) is static or "constant"
    could that compilation (but only theoretically) be done in advance,
    and an optimizing system may (or may not) precompile both to
    [similar] assembler code. How should that work with regexps or DES?
    The optimizing system would need knowledge of how to use the library
    code (DES, Regexps, ...) to create binary structures based on the
    algorithms (key-initialization in DES, FSM-generation in Regexps).
    This is [statically] not done.

    Otherwise - i.e. the normal, expected case - there's an efficiency
    difference to observe between the respective cases of (a) and (b).
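
    In Perl terms (a sketch added for illustration, reusing the pattern
    that appears further down in this post), that is the difference
    between handing a pattern string around and compiling it once with
    qr//:

    my @lines = ('12ABCxxfoo', 'no match here', '7foo');

    # (2a): the pattern travels as a string and is compiled where used
    my $pattern = '[0-9]+(ABC)?x*foo';
    print "2a: $_\n" for grep { /$pattern/ } @lines;

    # (2b): build the matcher once with qr// and reuse the compiled object
    my $fsm = qr/[0-9]+(ABC)?x*foo/;
    print "2b: $_\n" for grep { /$fsm/ } @lines;

    (Perl caches compiled patterns rather aggressively, so the practical
    difference is smaller there than with the Java library mentioned
    upthread, but the interface distinction is the same.)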


    then operate on that same object. (Since there's still typical Regexp
    syntax involved I suppose that is not what you meant by "procedural"?)

    If you don't know the different between declarative syntax like regex and procedural syntax then there's not much point continuing this discussion.

    Why do you think so, and why are you saying that? - That wasn't and
    still isn't the point. - You said upthread

    "A lot of stuff I've seen done in regex would have better done
    procedurally at the expense of slightly more code but a LOT more
    readability."

    and I asked

    "I'm curious what you mean by Regexps presented in a "procedural"
    form.
    Can you give some examples?"

    What you wanted to say wasn't clear to me, since you were complaining
    about the _Regexp syntax_. So it couldn't be meant to just write
    regexp (pattern, string) instead of pattern ~ string
    but to somehow(!) transform "pattern", say, like /[0-9]+(ABC)?x*foo/,
    to something syntactically "better".
    I was interested in that "somehow" (that I emphasized), and in an
    example of how that would look in your opinion.
    If you're unable to answer that simple question then just take that
    simple regexp /[0-9]+(ABC)?x*foo/ example and show us your preferred
    procedural variant.

    But my expectation is that you cannot provide any reasonable example
    anyway.

    Personally I think that writing bulky procedural stuff for something
    like [0-9]+ can only be much worse, and that further abbreviations
    like \d+ are the better direction to go if targeting a good interface.
    YMMV.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Ed Morton on Wed Nov 20 16:53:38 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 20.11.2024 12:46, Ed Morton wrote:

    Definitely. The most relevant statement about regexps is this:

    Some people, when confronted with a problem, think "I know, I'll use
    regular expressions." Now they have two problems.

    (Worth a scribbling on a WC wall.)


    Obviously regexps are very useful and commonplace but if you find you
    have to use some online site or other tools to help you write/understand
    one or just generally need more than a couple of minutes to
    write/understand it then it's time to back off and figure out a better
    way to write your code for the sake of whoever has to read it 6 months
    later (and usually for robustness too as it's hard to be sure all rainy
    day cases are handled correctly in a lengthy and/or complicated regexp).

    Regexps are not for newbies.

    The inherent fine thing with Regexps is that you can incrementally
    compose them[*].[**]

    It seems you haven't found a sensible way to work with them?
    (And I'm really astonished about that since I know you worked with
    Regexps for years if not decades.)

    In those cases where Regexps *are* the tool for a specific task -
    I don't expect you to use them where they are inappropriate?! -
    what would be the better solution[***] then?

    Janis

    [*] Like the corresponding FSMs.

    [**] And you can also decompose them if they are merged in a huge
    expression, too large for you to grasp it. (BTW, I'm doing such
    decompositions also with other expressions in program code that
    are too bulky.)

    [***] Can you answer the question that another poster failed to do?
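
    To illustrate [*] and [**], a small sketch of such composition in C
    (the fragment names are invented): the pieces are named separately,
    concatenated into one ERE at compile time, and can be read or replaced
    one at a time.

        #include <regex.h>

        #define RE_NUM  "[0-9]+"
        #define RE_TAG  "(ABC)?"
        #define RE_PAD  "x*"
        #define RE_WORD "foo"
        #define RE_ALL  "^" RE_NUM RE_TAG RE_PAD RE_WORD "$"

        /* Returns 0 on success, a regcomp() error code otherwise. */
        int compile_all(regex_t *re)
        {
            return regcomp(re, RE_ALL, REG_EXTENDED | REG_NOSUB);
        }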

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Wed Nov 20 16:38:15 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Wed, 20 Nov 2024 16:38:24 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> boring babbled:
    On 20.11.2024 12:30, Muttley@DastartdlyHQ.org wrote:
    Anything that can be done in regex can obviously also be done procedurally.
    At the point regex expressions become unwieldy - usually when substitution
    variables raise their heads - I prefer procedural code as it's also often
    easier to debug.

    You haven't even tried to honestly answer my (serious) question.

    You mean you can't figure out how to do something like string search and replace
    procedurally? I'm not going to show you, ask a kid who knows Python or Basic.

    With your statement above and your hostility below, it rather seems

    If you think my reply was hostile then I suggest you go find a safe space
    and cuddle your teddy bear snowflake.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Muttley@DastartdlyHQ.org on Wed Nov 20 17:54:22 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Muttley@DastartdlyHQ.org writes:

    [...]

    With your statement above and your hostility below, it rather seems

    If you think my reply was hostile then I suggest you go find a safe space
    and cuddle your teddy bear snowflake.

    There's surely no reason why anyone could ever think you were inclined
    to substitute verbal aggression for arguments.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Janis Papanagnou on Wed Nov 20 17:50:13 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    [...]

    Personally I think that writing bulky procedural stuff for something
    like [0-9]+ can only be much worse, and that further abbreviations
    like \d+ are the better direction to go if targeting a good interface.
    YMMV.

    Assuming that p is a pointer to the current position in a string, e is a pointer to the end of it (ie, point just past the last byte) and -
    that's important - both are pointers to unsigned quantities, the 'bulky'
    C equivalent of [0-9]+ is

    while (p < e && *p - '0' < 10) ++p;

    That's not too bad. And it's really a hell lot faster than a
    general-purpose automaton programmed to recognize the same pattern
    (which might not matter most of the time, but sometimes, it does).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Muttley on Wed Nov 20 21:43:41 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Wed, 20 Nov 2024 12:27:54 -0000 (UTC), Muttley wrote:

    Edge cases are regex's Achilles heel, e.g. an expression that only accounted
    for 1 -> N chars, not 0 -> N, or matches in the middle but not at the
    ends.

    That’s what “^” and “$” are for.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Thu Nov 21 08:13:39 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Wed, 20 Nov 2024 17:54:22 +0000
    Rainer Weikusat <rweikusat@talktalk.net> boring babbled:
    Muttley@DastartdlyHQ.org writes:

    [...]

    With your statement above and your hostility below, it rather seems

    If you think my reply was hostile then I suggest you go find a safe space
    and cuddle your teddy bear snowflake.

    There's surely no reason why anyone could ever think you were inclined
    to substitute verbal aggression for arguments.

    I have zero time for anyone who claims hurt feelings or being slighted as
    soon as they're losing an argument.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Thu Nov 21 08:15:41 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Wed, 20 Nov 2024 21:43:41 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> boring babbled:
    On Wed, 20 Nov 2024 12:27:54 -0000 (UTC), Muttley wrote:

    Edge cases are regex's Achilles heel, e.g. an expression that only accounted
    for 1 -> N chars, not 0 -> N, or matches in the middle but not at the
    ends.

    That’s what “^” and “$” are for.

    Yes, but people forget about those (literal) edge cases.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Thu Nov 21 08:18:06 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Wed, 20 Nov 2024 10:03:47 -0800
    John Ames <commodorejohn@gmail.com> boring babbled:
    On Wed, 20 Nov 2024 17:54:22 +0000
    Rainer Weikusat <rweikusat@talktalk.net> wrote:

    There's surely no reason why anyone could ever think you were inclined
    to substitute verbal aggression for arguments.

    I mean, it's his whole thing - why would he stop now?

    What's it like being so wet? Do you get cold easily?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Randal L. Schwartz@21:1/5 to All on Thu Nov 21 05:38:45 2024
    XPost: comp.unix.shell, comp.unix.programmer

    "Rainer" == Rainer Weikusat <rweikusat@talktalk.net> writes:

    Rainer> ¹ I used to use a JSON parser written in OO-Perl which made
    Rainer> extensive use of regexes for that. I've recently replaced that
    Rainer> with a C/XS version which - while slightly larger (617 vs 410
    Rainer> lines of text) - is over a hundred times faster and conceptually Rainer> simpler at the same time.

    I wonder if that was my famous "JSON parser in a single regex" from https://www.perlmonks.org/?node_id=995856, or from one of the two CPAN
    modules that incorporated it.

    --
    Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
    <merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
    Perl/Dart/Flutter consulting, Technical writing, Comedy, etc. etc.
    Still trying to think of something clever for the fourth line of this .sig

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to commodorejohn@gmail.com on Thu Nov 21 14:13:37 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <20241120100347.00005f10@gmail.com>,
    John Ames <commodorejohn@gmail.com> wrote:
    On Wed, 20 Nov 2024 17:54:22 +0000
    Rainer Weikusat <rweikusat@talktalk.net> wrote:

    There's surely no reason why anyone could ever think you were inclined
    to substitute verbal aggression for arguments.

    I mean, it's his whole thing - why would he stop now?

    This is the guy who didn't know what a compiler is, right?

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to rweikusat@talktalk.net on Thu Nov 21 14:40:18 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <875xohbxre.fsf@doppelsaurus.mobileactivedefense.com>,
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    [...]

    Personally I think that writing bulky procedural stuff for something
    like [0-9]+ can only be much worse, and that further abbreviations
    like \d+ are the better direction to go if targeting a good interface.
    YMMV.

    Assuming that p is a pointer to the current position in a string, e is a >pointer to the end of it (ie, point just past the last byte) and -
    that's important - both are pointers to unsigned quantities, the 'bulky'
    C equivalent of [0-9]+ is

    while (p < e && *p - '0' < 10) ++p;

    That's not too bad. And it's really a hell lot faster than a
    general-purpose automaton programmed to recognize the same pattern
    (which might not matter most of the time, but sometimes, it does).

    It's also not exactly right. `[0-9]+` would match one or more
    characters; this possibly matches 0 (ie, if `p` pointed to
    something that wasn't a digit).

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Dan Cross on Thu Nov 21 15:07:42 2024
    XPost: comp.unix.shell, comp.unix.programmer

    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    [...]

    Personally I think that writing bulky procedural stuff for something
    like [0-9]+ can only be much worse, and that further abbreviations
    like \d+ are the better direction to go if targeting a good interface.
    YMMV.

    Assuming that p is a pointer to the current position in a string, e is a >>pointer to the end of it (ie, point just past the last byte) and -
    that's important - both are pointers to unsigned quantities, the 'bulky'
    C equivalent of [0-9]+ is

    while (p < e && *p - '0' < 10) ++p;

    That's not too bad. And it's really a hell lot faster than a >>general-purpose automaton programmed to recognize the same pattern
    (which might not matter most of the time, but sometimes, it does).

    It's also not exactly right. `[0-9]+` would match one or more
    characters; this possibly matches 0 (ie, if `p` pointed to
    something that wasn't a digit).

    The regex won't match any digits if there aren't any. In this case, the
    match will fail. I didn't include the code for handling that because it
    seemed pretty pointless for the example.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Thu Nov 21 16:06:01 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Thu, 21 Nov 2024 14:13:37 -0000 (UTC)
    cross@spitfire.i.gajendra.net (Dan Cross) boring babbled:
    In article <20241120100347.00005f10@gmail.com>,
    John Ames <commodorejohn@gmail.com> wrote:
    On Wed, 20 Nov 2024 17:54:22 +0000
    Rainer Weikusat <rweikusat@talktalk.net> wrote:

    There's surely no reason why anyone could ever think you were inclined
    to substitute verbal aggression for arguments.

    I mean, it's his whole thing - why would he stop now?

    This is the guy who didn't know what a compiler is, right?

    Wrong. Want another go?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Randal L. Schwartz on Thu Nov 21 17:01:48 2024
    XPost: comp.unix.shell, comp.unix.programmer

    merlyn@stonehenge.com (Randal L. Schwartz) writes:
    "Rainer" == Rainer Weikusat <rweikusat@talktalk.net> writes:
    Rainer> ¹ I used to use a JSON parser written in OO-Perl which made
    Rainer> extensive use of regexes for that. I've recently replaced that Rainer> with a C/XS version which - while slightly larger (617 vs 410
    Rainer> lines of text) - is over a hundred times faster and conceptually Rainer> simpler at the same time.

    I wonder if that was my famous "JSON parser in a single regex" from
    https://www.perlmonks.org/?node_id=995856, or from one of the two CPAN
    modules that incorporated it.

    No. One of my use-cases is an interactive shell running in a web browser
    using ActionCable messages to relay data between the browser and the
    shell process on the computer supposed to be accessed in this way. For
    this, I absolutely do need \u escapes. I also need this to be fast. Eg,
    one of the nice properties of JSON is that the type of a value can be
    determined by looking at the first character of it. This cries for an
    implementation based on an array of pointers to 'value parsing routines'
    of size 256 and determining the parser routine to use by using the first
    character as index into this table (which will either yield a pointer to
    the correct parser routine or NULL for a syntax error).
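
    A minimal sketch of such a dispatch table (the routine names are
    invented and the "parsers" reduced to stubs; only the dispatch idea
    itself is shown):

        #include <stdio.h>

        /* Stub "value parsing routines"; a real parser would consume the
           input and build the value here. */
        static int parse_string(const char *p)  { (void)p; return 's'; }
        static int parse_number(const char *p)  { (void)p; return 'n'; }
        static int parse_object(const char *p)  { (void)p; return 'o'; }
        static int parse_array(const char *p)   { (void)p; return 'a'; }
        static int parse_literal(const char *p) { (void)p; return 'l'; }

        typedef int parse_fn(const char *);
        static parse_fn *dispatch[256];         /* NULL entry = syntax error */

        static void init_dispatch(void)
        {
            dispatch['"'] = parse_string;
            dispatch['-'] = parse_number;
            for (int c = '0'; c <= '9'; ++c) dispatch[c] = parse_number;
            dispatch['{'] = parse_object;
            dispatch['['] = parse_array;
            dispatch['t'] = dispatch['f'] = dispatch['n'] = parse_literal;
        }

        static int parse_value(const char *p)
        {
            parse_fn *fn = dispatch[(unsigned char)*p];
            return fn ? fn(p) : -1;             /* -1: syntax error */
        }

        int main(void)
        {
            init_dispatch();
            printf("%c %c %d\n", parse_value("\"hi\""),
                   parse_value("[1,2]"), parse_value("?oops"));
            return 0;
        }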

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Janis Papanagnou on Thu Nov 21 19:12:03 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 2024-11-20, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 20.11.2024 09:21, Muttley@DastartdlyHQ.org wrote:
    Regex itself is clumsy beyond simple search and replace patterns. A lot of
    stuff I've seen done in regex would have been better done procedurally at
    the expense of slightly more code but a LOT more readability. Also given
    it's effectively a compact language with its own grammar and syntax IMO it
    should not be the core part of any language as it can lead to a syntactic
    mess, which is what often happens with Perl.

    I wouldn't look at it that way. I've seen Regexps as part of languages usually in well defined syntactical contexts. For example, like strings
    are enclosed in "...", Regexps could be seen within /.../ delimiters.
    GNU Awk (in recent versions) went towards first class "strongly typed" Regexps which are then denoted by the @/.../ syntax.

    These features solve the problem of regexes being stored as character
    strings not being recognized by the language compiler and then having
    to be compiled at run-time.

    They don't solve all the ergonomic problems with regexes that Muttley is
    talking about.

    I'm curious what you mean by Regexps presented in a "procedural" form.
    Can you give some examples?

    Here is an example: using a regex match to capture a C comment /* ... */
    in Lex compared to just recognizing the start sequence /* and handling
    the discarding of the comment in the action.

    Without non-greedy repetition matching, the regex for a C comment is
    quite obtuse. The procedural handling is straightforward: read
    characters until you see a * immediately followed by a /.
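
    For example, a sketch of that procedural handling in C (reading from a
    stdio stream after the opening slash-star has already been consumed;
    error handling kept minimal):

        #include <stdio.h>

        /* Discard a C comment whose opening slash-star has been read.
           Returns 0 on success, -1 on EOF (unterminated comment). */
        static int skip_comment(FILE *in)
        {
            int c, prev = 0;

            while ((c = getc(in)) != EOF) {
                if (prev == '*' && c == '/')    /* a * immediately followed by a / */
                    return 0;
                prev = c;
            }
            return -1;
        }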

    In the wild, you see regexes being used for all sorts of stupid stuff,
    like checking whether numeric input is in a certain range, rather than converting it to a number and doing an arithmetic check.
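
    For instance, a range check done by converting first (a sketch using
    strtol(); the bounds are made up for illustration):

        #include <errno.h>
        #include <stdbool.h>
        #include <stdlib.h>

        /* Does s hold a decimal integer in [1, 65535]? */
        static bool in_range(const char *s)
        {
            char *end;

            errno = 0;
            long v = strtol(s, &end, 10);
            return errno == 0 && end != s && *end == '\0'
                && v >= 1 && v <= 65535;
        }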

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Muttley on Thu Nov 21 22:05:13 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Thu, 21 Nov 2024 08:15:41 -0000 (UTC), Muttley wrote:

    On Wed, 20 Nov 2024 21:43:41 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> boring babbled:

    On Wed, 20 Nov 2024 12:27:54 -0000 (UTC), Muttley wrote:

    Edge cases are regex's Achilles heel, e.g. an expression that only
    accounted for 1 -> N chars, not 0 -> N, or matches in the middle but
    not at the ends.

    That’s what “^” and “$” are for.

    Yes, but people forget about those (literal) edge cases.

    Those of us who are accustomed to using regexes do not.

    Another handy one is “\b” for word boundaries.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Muttley@DastartdlyHQ.org@21:1/5 to All on Fri Nov 22 10:09:48 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Thu, 21 Nov 2024 19:12:03 -0000 (UTC)
    Kaz Kylheku <643-408-1753@kylheku.com> boring babbled:
    On 2024-11-20, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    I'm curious what you mean by Regexps presented in a "procedural" form.
    Can you give some examples?

    Here is an example: using a regex match to capture a C comment /* ... */
    in Lex compared to just recognizing the start sequence /* and handling
    the discarding of the comment in the action.

    Without non-greedy repetition matching, the regex for a C comment is
    quite obtuse. The procedural handling is straightforward: read
    characters until you see a * immediately followed by a /.

    It's not that simple I'm afraid since comments can be commented out.

    eg:

    // int i; /*
    int j;
    /*
    int k;
    */
    ++j;

    A C99 and C++ compiler would see "int j" and compile it, a regex would
    simply remove everything from the first /* to */.

    Also the same probably applies to #ifdef's.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Rainer Weikusat on Fri Nov 22 12:14:32 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 20.11.2024 18:50, Rainer Weikusat wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    Personally I think that writing bulky procedural stuff for something
    like [0-9]+ can only be much worse, and that further abbreviations
    like \d+ are the better direction to go if targeting a good interface.
    YMMV.

    Assuming that p is a pointer to the current position in a string, e is a pointer to the end of it (ie, point just past the last byte) and -
    that's important - both are pointers to unsigned quantities, the 'bulky'
    C equivalent of [0-9]+ is

    while (p < e && *p - '0' < 10) ++p;

    That's not too bad. And it's really a hell lot faster than a
    general-purpose automaton programmed to recognize the same pattern
    (which might not matter most of the time, but sometimes, it does).

    Okay, I see where you're coming from (and especially in that simple
    case).

    Personally (and YMMV), even here in this simple case I think that
    using pointers is not better but worse - and anyway isn't [in this
    form] available in most languages; in other cases (and languages)
    such constructs get yet more clumsy, and for my not very complex
    example - /[0-9]+(ABC)?x*foo/ - even a "catastrophe" concerning
    readability, error-proneness, and maintainability.

    If that is what the other poster meant I'm fine with your answer;
    there's no need to even consider abandoning regular expressions
    in favor of explicitly codified parsing.

    Janis

    PS: And thanks for answering on behalf of the other poster whom I
    see in his followups just continuing his very personal style.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Kaz Kylheku on Fri Nov 22 12:17:56 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 21.11.2024 20:12, Kaz Kylheku wrote:
    [...]

    In the wild, you see regexes being used for all sorts of stupid stuff,

    No one can prevent folks using features for stupid things. Yes.

    like checking whether numeric input is in a certain range, rather than converting it to a number and doing an arithmetic check.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Fri Nov 22 12:47:16 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 21.11.2024 23:05, Lawrence D'Oliveiro wrote:
    On Thu, 21 Nov 2024 08:15:41 -0000 (UTC), Muttley wrote:
    On Wed, 20 Nov 2024 21:43:41 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> boring babbled:
    [...]

    That’s what “^” and “$” are for.

    Yes, but people forget about those (literal) edge cases.

    But *only* _literally_ "edge cases". Rather they're simple
    and basics of regexp parsers since their beginning.

    Those of us who are accustomed to using regexes do not.

    It's one of the first things that regexp newbies learn,
    I'd say.


    Another handy one is “\b” for word boundaries.

    I prefer \< and \> (that are quite commonly used) for such
    structural things, also \( and \) for allowing references
    to matched parts. And I prefer the \alpha regexp pattern
    extension forms for things like \d \D \w \W \s \S . (But
    that's not only a matter of taste but also a question of
    what any regexp parser actually supports.)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Janis Papanagnou on Fri Nov 22 11:56:26 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 20.11.2024 18:50, Rainer Weikusat wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    Personally I think that writing bulky procedural stuff for something
    like [0-9]+ can only be much worse, and that further abbreviations
    like \d+ are the better direction to go if targeting a good interface.
    YMMV.

    Assuming that p is a pointer to the current position in a string, e is a
    pointer to the end of it (ie, point just past the last byte) and -
    that's important - both are pointers to unsigned quantities, the 'bulky'
    C equivalent of [0-9]+ is

    while (p < e && *p - '0' < 10) ++p;

    That's not too bad. And it's really a hell lot faster than a
    general-purpose automaton programmed to recognize the same pattern
    (which might not matter most of the time, but sometimes, it does).

    Okay, I see where you're coming from (and especially in that simple
    case).

    Personally (and YMMV), even here in this simple case I think that
    using pointers is not better but worse - and anyway isn't [in this
    form] available in most languages;

    That's a question of using the proper tool for the job. In C, that's
    pointer and pointer arithmetic because it's the simplest way to express something like this.

    in other cases (and languages)
    such constructs get yet more clumsy, and for my not very complex
    example - /[0-9]+(ABC)?x*foo/ - even a "catastrophe" concerning
    readability, error-proneness, and maintainability.

    Procedural code for matching strings constructed in this way is
    certainly much simpler¹ than the equally procedural code for a
    programmable automaton capable of interpreting regexes. Your statement
    is basically "If we assume that the code interpreting regexes doesn't
    exist, regexes need much less code than something equivalent which does
    exist." Without this assumption, the picture becomes a different one altogether.

    ¹ This doesn't even need a real state machine, just four subroutines
    executed in succession (and two of these can share an implementation, as
    "matching ABC" and "matching foo" are both cases of matching a constant
    string).
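
    A sketch along those lines, collapsed into a single routine (it matches
    the pattern as a prefix of a NUL-terminated string and returns a pointer
    past the match, or NULL on failure; the function name is made up):

        #include <stddef.h>
        #include <string.h>

        static const char *match_num_abc_x_foo(const char *p)
        {
            const char *start = p;

            while (*p >= '0' && *p <= '9') ++p;         /* [0-9]+ */
            if (p == start)
                return NULL;                            /* need at least one digit */
            if (strncmp(p, "ABC", 3) == 0) p += 3;      /* (ABC)? */
            while (*p == 'x') ++p;                      /* x* */
            return strncmp(p, "foo", 3) == 0 ? p + 3 : NULL;  /* foo */
        }

    (Greedy matching without backtracking happens to be exact for this
    particular pattern, because a digit, "ABC", an 'x', and "foo" all start
    with different characters.)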

    If that is what the other poster meant I'm fine with your answer;
    there's no need to even consider abandoning regular expressions
    in favor of explicitly codified parsing.

    This depends on the specific problem and the constraints applicable to a solution. For the common case, regexes, if easily available, are an
    obvious good solution. But not all cases are common.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to rweikusat@talktalk.net on Fri Nov 22 13:30:34 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <874j40sk01.fsf@doppelsaurus.mobileactivedefense.com>,
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    [...]

    Personally I think that writing bulky procedural stuff for something
    like [0-9]+ can only be much worse, and that further abbreviations
    like \d+ are the better direction to go if targeting a good interface. >>>> YMMV.

    Assuming that p is a pointer to the current position in a string, e is a >>>pointer to the end of it (ie, point just past the last byte) and -
    that's important - both are pointers to unsigned quantities, the 'bulky' >>>C equivalent of [0-9]+ is

    while (p < e && *p - '0' < 10) ++p;

    That's not too bad. And it's really a hell lot faster than a >>>general-purpose automaton programmed to recognize the same pattern
    (which might not matter most of the time, but sometimes, it does).

    It's also not exactly right. `[0-9]+` would match one or more
    characters; this possibly matches 0 (ie, if `p` pointed to
    something that wasn't a digit).

    The regex won't match any digits if there aren't any. In this case, the
    match will fail. I didn't include the code for handling that because it >seemed pretty pointless for the example.

    That's rather the point though, isn't it? The program snippet
    (modulo the promotion to signed int via the "usual arithmetic
    conversions" before the subtraction and comparison giving you
    unexpected values; nothing to do with whether `char` is signed
    or not) is a snippet that advances a pointer while it points to
    a digit, starting at the current pointer position; that is, it
    just increments a pointer over a run of digits.

    But that's not the same as a regex matcher, which has a semantic
    notion of success or failure. I could run your snippet against
    a string such as, say, "ZZZZZZ" and it would "succeed" just as
    it would against an empty string or a string of one or more
    digits. And then there are other matters of context; does the
    user intend for the regexp to match the _whole_ string? Or any
    portion of the string (a la `grep`)? So, for example, does the
    string "aaa1234aaa" match `[0-9]+`? As written, the above
    snippet is actually closer to advancing `p` over `^[0-9]*`. One
    might differentiate between `*` and `+` after the fact, by
    examining `p` against some (presumably saved) source value, but
    that's more code.

    These are just not equivalent. That's not to say that your
    snippet is not _useful_ in context, but to pretend that it's the
    same as the regular expression is pointlessly reductive.

    By the way, something that _would_ match `^[0-9]+$` might be:

    term% cat mdp.c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static bool
    mdigit(unsigned int c)
    {
        return c - '0' < 10;
    }

    bool
    mdp(const char *str, const char *estr)
    {
        if (str == NULL || estr == NULL || str == estr)
            return false;
        if (!mdigit(*str))
            return false;
        while (str < estr && mdigit(*str))
            str++;
        return str == estr;
    }

    bool
    probe(const char *s, bool expected)
    {
        if (mdp(s, s + strlen(s)) != expected) {
            fprintf(stderr, "test failure: `%s` (expected %s)\n",
                s, expected ? "true" : "false");
            return false;
        }
        return true;
    }

    int
    main(void)
    {
        bool success = true;

        success = probe("1234", true) && success;
        success = probe("", false) && success;
        success = probe("ab", false) && success;
        success = probe("0", true) && success;
        success = probe("0123456789", true) && success;
        success = probe("a0123456", false) && success;
        success = probe("0123456b", false) && success;
        success = probe("0123c456", false) && success;
        success = probe("0123#456", false) && success;

        return success ? EXIT_SUCCESS : EXIT_FAILURE;
    }
    term% cc -Wall -Wextra -Werror -pedantic -std=c11 mdp.c -o mdp
    term% ./mdp
    term% echo $?
    0
    term%

    Granted the test scaffolding and `#include` boilerplate makes
    this appear rather longer than it would be in context, but it's
    still not nearly as succinct.

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Rainer Weikusat on Fri Nov 22 15:52:41 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Rainer Weikusat <rweikusat@talktalk.net> writes:

    [...]


    Something which would match [0-9]+ in its first argument (if any) would
    be:

    #include "string.h"
    #include "stdlib.h"

    int main(int argc, char **argv)
    {
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;

    This needs to be

    while (c = *p, c && c - '0' > 9) ++p

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Dan Cross on Fri Nov 22 15:41:09 2024
    XPost: comp.unix.shell, comp.unix.programmer

    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    [...]

    Personally I think that writing bulky procedural stuff for something >>>>> like [0-9]+ can only be much worse, and that further abbreviations
    like \d+ are the better direction to go if targeting a good interface. >>>>> YMMV.

    Assuming that p is a pointer to the current position in a string, e is a >>>>pointer to the end of it (ie, point just past the last byte) and - >>>>that's important - both are pointers to unsigned quantities, the 'bulky' >>>>C equivalent of [0-9]+ is

    while (p < e && *p - '0' < 10) ++p;

    That's not too bad. And it's really a hell lot faster than a >>>>general-purpose automaton programmed to recognize the same pattern >>>>(which might not matter most of the time, but sometimes, it does).

    It's also not exactly right. `[0-9]+` would match one or more
    characters; this possibly matches 0 (ie, if `p` pointed to
    something that wasn't a digit).

    The regex won't match any digits if there aren't any. In this case, the >>match will fail. I didn't include the code for handling that because it >>seemed pretty pointless for the example.

    That's rather the point though, isn't it? The program snippet
    (modulo the promotion to signed int via the "usual arithmetic
    conversions" before the subtraction and comparison giving you
    unexpected values; nothing to do with whether `char` is signed
    or not) is a snippet that advances a pointer while it points to
    a digit, starting at the current pointer position; that is, it
    just increments a pointer over a run of digits.

    That's the core part of matching something equivalent to the regex [0-9]+
    and the only part of it which is at least remotely interesting.

    But that's not the same as a regex matcher, which has a semantic
    notion of success or failure. I could run your snippet against
    a string such as, say, "ZZZZZZ" and it would "succeed" just as
    it would against an empty string or a string of one or more
    digits.

    Why do you believe that p being equivalent to the starting position
    would be considered a "successful match", considering that this
    obviously doesn't make any sense?

    [...]

    By the way, something that _would_ match `^[0-9]+$` might be:

    [too much code]

    Something which would match [0-9]+ in its first argument (if any) would
    be:

    #include "string.h"
    #include "stdlib.h"

    int main(int argc, char **argv)
    {
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;
    if (!c) exit(1);
    return 0;
    }

    but that's 14 lines of text, 13 of which have absolutely no relation to
    the problem of recognizing a digit.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to rweikusat@talktalk.net on Fri Nov 22 17:18:26 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <87zflrs1ti.fsf@doppelsaurus.mobileactivedefense.com>,
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    Rainer Weikusat <rweikusat@talktalk.net> writes:

    [...]


    Something which would match [0-9]+ in its first argument (if any) would
    be:

    #include "string.h"
    #include "stdlib.h"

    int main(int argc, char **argv)
    {
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;

    This needs to be

    while (c = *p, c && c - '0' > 9) ++p

    No, that's still wrong. Try actually running it.

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Dan Cross on Fri Nov 22 17:35:29 2024
    XPost: comp.unix.shell, comp.unix.programmer

    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <87zflrs1ti.fsf@doppelsaurus.mobileactivedefense.com>,
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    Rainer Weikusat <rweikusat@talktalk.net> writes:

    [...]


    Something which would match [0-9]+ in its first argument (if any) would
    be:

    #include "string.h"
    #include "stdlib.h"

    int main(int argc, char **argv)
    {
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;

    This needs to be

    while (c = *p, c && c - '0' > 9) ++p

    No, that's still wrong. Try actually running it.

    If you know something that's wrong with that, why not write it instead
    of utilizing the claim for pointless (and wrong) snide remarks?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to rweikusat@talktalk.net on Fri Nov 22 17:43:24 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <87v7wfrx26.fsf@doppelsaurus.mobileactivedefense.com>,
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <87zflrs1ti.fsf@doppelsaurus.mobileactivedefense.com>,
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    Rainer Weikusat <rweikusat@talktalk.net> writes:

    [...]


    Something which would match [0-9]+ in its first argument (if any) would >>>> be:

    #include "string.h"
    #include "stdlib.h"

    int main(int argc, char **argv)
    {
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;

    This needs to be

    while (c = *p, c && c - '0' > 9) ++p

    No, that's still wrong. Try actually running it.

    If you know something that's wrong with that, why not write it instead
    of utilizing the claim for pointless (and wrong) snide remarks?

    I did, at length, in my other post.

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to rweikusat@talktalk.net on Fri Nov 22 17:17:46 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <877c8vtgx6.fsf@doppelsaurus.mobileactivedefense.com>,
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    [snip]
    It's also not exactly right. `[0-9]+` would match one or more
    characters; this possibly matches 0 (ie, if `p` pointed to
    something that wasn't a digit).

    The regex won't match any digits if there aren't any. In this case, the >>>match will fail. I didn't include the code for handling that because it >>>seemed pretty pointless for the example.

    That's rather the point though, isn't it? The program snippet
    (modulo the promotion to signed int via the "usual arithmetic
    conversions" before the subtraction and comparison giving you
    unexpected values; nothing to do with whether `char` is signed
    or not) is a snippet that advances a pointer while it points to
    a digit, starting at the current pointer position; that is, it
    just increments a pointer over a run of digits.

    That's the core part of matching someting equivalent to the regex [0-9]+
    and the only part of it is which is at least remotely interesting.

    Not really, no. The interesting thing in this case appears to
    be knowing whether or not the match succeeded, but you omitted
    that part.

    But that's not the same as a regex matcher, which has a semantic
    notion of success or failure. I could run your snippet against
    a string such as, say, "ZZZZZZ" and it would "succeed" just as
    it would against an empty string or a string of one or more
    digits.

    Why do you believe that p being equivalent to the starting position
    would be considered a "successful match", considering that this
    obviously doesn't make any sense?

    Because absent any surrounding context, there's no indication
    that the source is even saved. You'll note that I did mention
    that as a means to differentiate later on, but that's not the
    snippet you posted.

    [...]

    By the way, something that _would_ match `^[0-9]+$` might be:

    [too much code]

    Something which would match [0-9]+ in its first argument (if any) would
    be:

    #include "string.h"
    #include "stdlib.h"

    int main(int argc, char **argv)
    {
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;
    if (!c) exit(1);
    return 0;
    }

    but that's 14 lines of text, 13 of which have absolutely no relation to
    the problem of recognizing a digit.

    This is wrong in many ways. Did you actually test that program?

    First of all, why `"string.h"` and not `<string.h>`? Ok, that's
    not technically an error, but it's certainly unconventional, and
    raises questions that are ultimately a distraction.

    Second, suppose that `argc==0` (yes, this can happen under
    POSIX).

    Third, the loop: why `> 10`? Don't you mean `< 10`? You are
    trying to match digits, not non-digits.

    Fourth, you exit with failure (`exit(1)`) if `!p` *and* if `!c`
    at the end, but `!c` there means you've reached the end of the
    string; which should be success.

    Fifth and finally, you `return 0;` which is EXIT_SUCCESS, in the
    failure case.

    Compare:

    #include <regex.h>
    #include <stddef.h>
    #include <stdio.h>
    #include <stdlib.h>

    int
    main(int argc, char *argv[])
    {
        regex_t reprog;
        int ret;

        if (argc != 2) {
            fprintf(stderr, "Usage: regexp pattern\n");
            return(EXIT_FAILURE);
        }
        (void)regcomp(&reprog, "^[0-9]+$", REG_EXTENDED | REG_NOSUB);
        ret = regexec(&reprog, argv[1], 0, NULL, 0);
        regfree(&reprog);

        return ret == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
    }

    This is only marginally longer, but is correct.

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to Dan Cross on Fri Nov 22 17:43:59 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <vhqfrs$bit$1@reader2.panix.com>,
    Dan Cross <cross@spitfire.i.gajendra.net> wrote:
    In article <87v7wfrx26.fsf@doppelsaurus.mobileactivedefense.com>,
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <87zflrs1ti.fsf@doppelsaurus.mobileactivedefense.com>,
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    Rainer Weikusat <rweikusat@talktalk.net> writes:

    [...]


    Something which would match [0-9]+ in its first argument (if any) would >>>>> be:

    #include "string.h"
    #include "stdlib.h"

    int main(int argc, char **argv)
    {
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;

    This needs to be

    while (c = *p, c && c - '0' > 9) ++p

    No, that's still wrong. Try actually running it.

    If you know something that's wrong with that, why not write it instead
    of utilizing the claim for pointless (and wrong) snide remarks?

    I did, at length, in my other post.

    Cf. <vhqebq$c71$1@reader2.panix.com>

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Dan Cross on Fri Nov 22 17:48:37 2024
    XPost: comp.unix.shell, comp.unix.programmer

    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    [snip]
    It's also not exactly right. `[0-9]+` would match one or more
    characters; this possibly matches 0 (ie, if `p` pointed to
    something that wasn't a digit).

    The regex won't match any digits if there aren't any. In this case, the >>>>match will fail. I didn't include the code for handling that because it >>>>seemed pretty pointless for the example.

    That's rather the point though, isn't it? The program snippet
    (modulo the promotion to signed int via the "usual arithmetic
    conversions" before the subtraction and comparison giving you
    unexpected values; nothing to do with whether `char` is signed
    or not) is a snippet that advances a pointer while it points to
    a digit, starting at the current pointer position; that is, it
    just increments a pointer over a run of digits.

    That's the core part of matching someting equivalent to the regex [0-9]+ >>and the only part of it is which is at least remotely interesting.

    Not really, no. The interesting thing in this case appears to
    be knowing whether or not the match succeeded, but you omitted
    that part.

    This is of interest to you as it enables you to base an 'argumentation'
    (sarcasm) on arbitrary assumptions you've chosen to make. It's not
    something I consider interesting and it's besides the point of the
    example I posted.

    But that's not the same as a regex matcher, which has a semantic
    notion of success or failure. I could run your snippet against
    a string such as, say, "ZZZZZZ" and it would "succeed" just as
    it would against an empty string or a string of one or more
    digits.

    Why do you believe that p being equivalent to the starting position
    would be considered a "successful match", considering that this
    obviously doesn't make any sense?

    Because absent any surrounding context, there's no indication
    that the source is even saved.

    A text usually doesn't contain information about things which aren't
    part of its content. I congratulate you on this rather obvious observation.

    [...]

    Something which would match [0-9]+ in its first argument (if any) would
    be:

    #include "string.h"
    #include "stdlib.h"

    int main(int argc, char **argv)
    {
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;
    if (!c) exit(1);
    return 0;
    }

    but that's 14 lines of text, 13 of which have absolutely no relation to
    the problem of recognizing a digit.

    This is wrong in many ways. Did you actually test that program?

    First of all, why `"string.h"` and not `<string.h>`? Ok, that's
    not technically an error, but it's certainly unconventional, and
    raises questions that are ultimately a distraction.

    Such as your paragraph above.

    Second, suppose that `argc==0` (yes, this can happen under
    POSIX).

    It can happen in case of some piece of functionally hostile software intentionally creating such a situation. Tangential, irrelevant
    point. If you break it, you get to keep the parts.

    Third, the loop: why `> 10`? Don't you mean `< 10`? You are
    trying to match digits, not non-digits.

    Mistake I made. The opposite of < 10 is > 9.

    Fourth, you exit with failure (`exit(1)`) if `!p` *and* if `!c`
    at the end, but `!c` there means you've reached the end of the
    string; which should be success.

    Mistake you made: [0-9]+ matches if there's at least one digit in the
    string. That's why the loop terminates once one was found. In this case,
    c cannot be 0.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to rweikusat@talktalk.net on Fri Nov 22 18:12:34 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <87o727rwga.fsf@doppelsaurus.mobileactivedefense.com>,
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    Something which would match [0-9]+ in its first argument (if any) would >>>be:

    #include "string.h"
    #include "stdlib.h"

    int main(int argc, char **argv)
    {
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;
    if (!c) exit(1);
    return 0;
    }

    but that's 14 lines of text, 13 of which have absolutely no relation to >>>the problem of recognizing a digit.

    This is wrong in many ways. Did you actually test that program?

    First of all, why `"string.h"` and not `<string.h>`? Ok, that's
    not technically an error, but it's certainly unconventional, and
    raises questions that are ultimately a distraction.

    Such as your paragraph above.

    Second, suppose that `argc==0` (yes, this can happen under
    POSIX).

    It can happen in case of some piece of functionally hostile software >intentionally creating such a situation. Tangential, irrelevant
    point. If you break it, you get to keep the parts.

    Third, the loop: why `> 10`? Don't you mean `< 10`? You are
    trying to match digits, not non-digits.

    Mistake I made. The opposite of < 10 is > 9.

    I see. So you want to skip non-digits and exit the first time
    you see a digit. Ok, fair enough, though that program has
    already been written, and is called `grep`.

    Fourth, you exit with failure (`exit(1)`) if `!p` *and* if `!c`
    at the end, but `!c` there means you've reached the end of the
    string; which should be success.

    Mistake you made: [0-9]+ matches if there's at least one digit in the
    string. That's why the loop terminates once one was found. In this case,
    c cannot be 0.

    Ah, you are trying to match `[0-9]` (though you're calling it
    `[0-9]+`). Yeah, your program was not at all equivalent to one
    I wrote, though this is what you posted in response to mine, so
    I assumed you were trying to emulate that behavior (matching
    `^[0-9]+$`).

    But I see above that you mentioned `[0-9]+`. But as I mentioned
    above, really you're just matching any digit, so you may as well
    be matching `[0-9]`; again, this is not the same as the actual
    regexp, because you are ignoring the semantics of what regular
    expressions actually describe.

    In any event, this seems simpler than what you posted:

    #include <stddef.h>
    #include <stdio.h>
    #include <stdlib.h>

    int
    main(int argc, char *argv[])
    {
        if (argc != 2) {
            fprintf(stderr, "Usage: matchd <str>\n");
            return EXIT_FAILURE;
        }

        for (const char *p = argv[1]; *p != '\0'; p++)
            if ('0' <= *p && *p <= '9')
                return EXIT_SUCCESS;

        return EXIT_FAILURE;
    }

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Rainer Weikusat on Fri Nov 22 18:14:48 2024
    XPost: comp.unix.shell, comp.unix.programmer

    Rainer Weikusat <rweikusat@talktalk.net> writes:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    [...]

    Personally I think that writing bulky procedural stuff for something >>>>>> like [0-9]+ can only be much worse, and that further abbreviations >>>>>> like \d+ are the better direction to go if targeting a good interface. >>>>>> YMMV.

    Assuming that p is a pointer to the current position in a string, e is a >>>>>pointer to the end of it (ie, point just past the last byte) and - >>>>>that's important - both are pointers to unsigned quantities, the 'bulky' >>>>>C equivalent of [0-9]+ is

    while (p < e && *p - '0' < 10) ++p;

    That's not too bad. And it's really a hell lot faster than a >>>>>general-purpose automaton programmed to recognize the same pattern >>>>>(which might not matter most of the time, but sometimes, it does).

    It's also not exactly right. `[0-9]+` would match one or more
    characters; this possibly matches 0 (ie, if `p` pointed to
    something that wasn't a digit).

    The regex won't match any digits if there aren't any. In this case, the >>>match will fail. I didn't include the code for handling that because it >>>seemed pretty pointless for the example.

    That's rather the point though, isn't it? The program snippet
    (modulo the promotion to signed int via the "usual arithmetic
    conversions" before the subtraction and comparison giving you
    unexpected values; nothing to do with whether `char` is signed
    or not) is a snippet that advances a pointer while it points to
    a digit, starting at the current pointer position; that is, it
    just increments a pointer over a run of digits.

    That's the core part of matching someting equivalent to the regex [0-9]+
    and the only part of it is which is at least remotely interesting.

    But that's not the same as a regex matcher, which has a semantic
    notion of success or failure. I could run your snippet against
    a string such as, say, "ZZZZZZ" and it would "succeed" just as
    it would against an empty string or a string of one or more
    digits.

    Why do you believe that p being equivalent to the starting position
    would be considered a "successful match", considering that this
    obviously doesn't make any sense?

    [...]

    By the way, something that _would_ match `^[0-9]+$` might be:

    [too much code]

    Something which would match [0-9]+ in its first argument (if any) would
    be:

    #include "string.h"
    #include "stdlib.h"

    int main(int argc, char **argv)
    {
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;
    if (!c) exit(1);
    return 0;
    }

    but that's 14 lines of text, 13 of which have absolutely no relation to
    the problem of recognizing a digit.

    Personally, I'd use:

    $ cat /tmp/a.c
    #include <stdint.h>
    #include <stdlib.h>

    int
    main(int argc, const char **argv)
    {
    char *cp;
    uint64_t value;

    if (argc < 2) return 1;

    value = strtoull(argv[1], &cp, 10);
    if ((cp == argv[1])
    || (*cp != '\0')) {
    return 1;
    }
    return 0;
    }
    $ cc -o /tmp/a /tmp/a.c
    $ /tmp/a 13254
    $ echo $?
    0
    $ /tmp/a 23v23
    $ echo $?
    1

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Janis Papanagnou on Fri Nov 22 18:19:30 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 2024-11-22, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 21.11.2024 20:12, Kaz Kylheku wrote:
    [...]

    In the wild, you see regexes being used for all sorts of stupid stuff,

    No one can prevent folks using features for stupid things. Yes.

    But the thing is that "modern" regular expressions (Perl regex and its
    progeny) have features that are designed to exclusively cater to these
    folks.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Muttley@DastartdlyHQ.org on Fri Nov 22 18:18:04 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 2024-11-22, Muttley@DastartdlyHQ.org <Muttley@DastartdlyHQ.org> wrote:
    On Thu, 21 Nov 2024 19:12:03 -0000 (UTC)
    Kaz Kylheku <643-408-1753@kylheku.com> boring babbled:
    On 2024-11-20, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    I'm curious what you mean by Regexps presented in a "procedural" form.
    Can you give some examples?

    Here is an example: using a regex match to capture a C comment /* ... */
    in Lex compared to just recognizing the start sequence /* and handling
    the discarding of the comment in the action.

    Without non-greedy repetition matching, the regex for a C comment is
    quite obtuse. The procedural handling is straightforward: read
    characters until you see a * immediately followed by a /.

    Its not that simple I'm afraid since comments can be commented out.

    Umm, no.

    eg:

    // int i; /*

    This /* sequence is inside a // comment, and so the machinery that
    recognizes /* as the start of a comment would never see it.

    Just like "int i;" is in a string literal and so not recognized
    as a keyword, whitespace, identifier and semicolon.

    int j;
    /*
    int k;
    */
    ++j;

    A C99 and C++ compiler would see "int j" and compile it, a regex would
    simply remove everything from the first /* to */.

    No, it won't, because that's not how regexes are used in a lexical
    analyzer. At the start of the input, the lexical analyzer faces
    the characters "// int i; /*\n". This will trigger the pattern match
    for // comments. Essentially that entire sequence through the newline
    is treated as a kind of token, equivalent to a space.

    Once a token is recognized and removed from the input, it is gone;
    no other regular expression can match into it.

    Also the same probably applies to #ifdef's.

    Lexically analyzing C requires implementing the translation phases
    as described in the standard. There are preprocessor phases which
    delimit the input into preprocessor tokens (pp-tokens). Comments
    are stripped in preprocessing. But logical lines (backslash
    continuations) are recognized below comments; i.e. this is one
    comment:

    \\ comment \
    split \
    into \
    physical \
    lines

    A lexical scanner can have an input routine which transparently handles
    this low-level detail, so that it doesn't have to deal with the
    line continuations in every token pattern.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Scott Lurndal on Fri Nov 22 18:22:45 2024
    XPost: comp.unix.shell, comp.unix.programmer

    scott@slp53.sl.home (Scott Lurndal) writes:
    Rainer Weikusat <rweikusat@talktalk.net> writes:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    Rainer Weikusat <rweikusat@talktalk.net> wrote: >>>>cross@spitfire.i.gajendra.net (Dan Cross) writes:
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    [...]

    Personally I think that writing bulky procedural stuff for something
    like [0-9]+ can only be much worse, and that further abbreviations
    like \d+ are the better direction to go if targeting a good interface.
    YMMV.

    Assuming that p is a pointer to the current position in a string, e is a
    pointer to the end of it (ie, point just past the last byte) and -
    that's important - both are pointers to unsigned quantities, the 'bulky'
    C equivalent of [0-9]+ is

    while (p < e && *p - '0' < 10) ++p;

    That's not too bad. And it's really a hell lot faster than a
    general-purpose automaton programmed to recognize the same pattern
    (which might not matter most of the time, but sometimes, it does).

    It's also not exactly right. `[0-9]+` would match one or more
    characters; this possibly matches 0 (ie, if `p` pointed to
    something that wasn't a digit).

    The regex won't match any digits if there aren't any. In this case, the
    match will fail. I didn't include the code for handling that because it
    seemed pretty pointless for the example.

    That's rather the point though, isn't it? The program snippet
    (modulo the promotion to signed int via the "usual arithmetic
    conversions" before the subtraction and comparison giving you
    unexpected values; nothing to do with whether `char` is signed
    or not) is a snippet that advances a pointer while it points to
    a digit, starting at the current pointer position; that is, it
    just increments a pointer over a run of digits.

    That's the core part of matching something equivalent to the regex [0-9]+
    and the only part of it which is at least remotely interesting.

    But that's not the same as a regex matcher, which has a semantic
    notion of success or failure. I could run your snippet against
    a string such as, say, "ZZZZZZ" and it would "succeed" just as
    it would against an empty string or a string of one or more
    digits.

    Why do you believe that p being equivalent to the starting position
    would be considered a "successful match", considering that this
    obviously doesn't make any sense?

    [...]

    By the way, something that _would_ match `^[0-9]+$` might be:

    [too much code]

    Something which would match [0-9]+ in its first argument (if any) would
    be:

    #include "string.h"
    #include "stdlib.h"

    int main(int argc, char **argv)
    {
        char *p;
        unsigned c;

        p = argv[1];
        if (!p) exit(1);
        while (c = *p, c && c - '0' > 10) ++p;
        if (!c) exit(1);
        return 0;
    }

    but that's 14 lines of text, 13 of which have absolutely no relation to
    the problem of recognizing a digit.

    Personally, I'd use:

    Albeit this is limited to strings of digits that sum to less than
    ULONG_MAX...



    $ cat /tmp/a.c
    #include <stdint.h>
    #include <stdlib.h>   /* strtoull */
    #include <string.h>

    int
    main(int argc, const char **argv)
    {
        char *cp;
        uint64_t value;

        if (argc < 2) return 1;

        value = strtoull(argv[1], &cp, 10);
        if ((cp == argv[1])
            || (*cp != '\0')) {
            return 1;
        }
        return 0;
    }
    $ cc -o /tmp/a /tmp/a.c
    $ /tmp/a 13254
    $ echo $?
    0
    $ /tmp/a 23v23
    $ echo $?
    1

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to Scott Lurndal on Fri Nov 22 18:30:31 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <VZ30P.4664$YSkc.1894@fx40.iad>,
    Scott Lurndal <slp53@pacbell.net> wrote:
    scott@slp53.sl.home (Scott Lurndal) writes:
    Rainer Weikusat <rweikusat@talktalk.net> writes:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    [...]

    Personally I think that writing bulky procedural stuff for something
    like [0-9]+ can only be much worse, and that further abbreviations
    like \d+ are the better direction to go if targeting a good interface.
    YMMV.

    Assuming that p is a pointer to the current position in a string, e is a
    pointer to the end of it (ie, point just past the last byte) and -
    that's important - both are pointers to unsigned quantities, the 'bulky'
    C equivalent of [0-9]+ is

    while (p < e && *p - '0' < 10) ++p;

    That's not too bad. And it's really a hell lot faster than a
    general-purpose automaton programmed to recognize the same pattern
    (which might not matter most of the time, but sometimes, it does).
    It's also not exactly right. `[0-9]+` would match one or more
    characters; this possibly matches 0 (ie, if `p` pointed to
    something that wasn't a digit).

    The regex won't match any digits if there aren't any. In this case, the
    match will fail. I didn't include the code for handling that because it
    seemed pretty pointless for the example.

    That's rather the point though, isn't it? The program snippet
    (modulo the promotion to signed int via the "usual arithmetic
    conversions" before the subtraction and comparison giving you
    unexpected values; nothing to do with whether `char` is signed
    or not) is a snippet that advances a pointer while it points to
    a digit, starting at the current pointer position; that is, it
    just increments a pointer over a run of digits.

    That's the core part of matching something equivalent to the regex [0-9]+
    and the only part of it which is at least remotely interesting.

    But that's not the same as a regex matcher, which has a semantic
    notion of success or failure. I could run your snippet against
    a string such as, say, "ZZZZZZ" and it would "succeed" just as
    it would against an empty string or a string of one or more
    digits.

    Why do you believe that p being equivalent to the starting position
    would be considered a "successful match", considering that this
    obviously doesn't make any sense?

    [...]

    By the way, something that _would_ match `^[0-9]+$` might be:

    [too much code]

    Something which would match [0-9]+ in its first argument (if any) would
    be:

    #include "string.h"
    #include "stdlib.h"

    int main(int argc, char **argv)
    {
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;
    if (!c) exit(1);
    return 0;
    }

    but that's 14 lines of text, 13 of which have absolutely no relation to
    the problem of recognizing a digit.

    Personally, I'd use:

    Albeit this is limited to strings of digits that sum to less than
    ULONG_MAX...

    It's not quite equivalent to his program, which just exits with
    success if it sees any input string with a digit in it; yours
    is closer to what I wrote, which matches `^[0-9]+$`. His is not
    an interesting program and certainly not a recognizable
    equivalent of a regular expression matcher in any reasonable sense,
    but I think the cognitive dissonance is too strong to get that
    across.

    - Dan C.

    $ cat /tmp/a.c
    #include <stdint.h>
    #include <string.h>

    int
    main(int argc, const char **argv)
    {
    char *cp;
    uint64_t value;

    if (argc < 2) return 1;

    value = strtoull(argv[1], &cp, 10);
    if ((cp == argv[1])
    || (*cp != '\0')) {
    return 1;
    }
    return 0;
    }
    $ cc -o /tmp/a /tmp/a.c
    $ /tmp/a 13254
    $ echo $?
    0
    $ /tmp/a 23v23
    $ echo $?
    1

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Dan Cross on Fri Nov 22 18:48:55 2024
    XPost: comp.unix.shell, comp.unix.programmer

    cross@spitfire.i.gajendra.net (Dan Cross) writes:

    [...]

    In any event, this seems simpler than what you posted:

    #include <stddef.h>
    #include <stdio.h>
    #include <stdlib.h>

    int
    main(int argc, char *argv[])
    {
        if (argc != 2) {
            fprintf(stderr, "Usage: matchd <str>\n");
            return EXIT_FAILURE;
        }

        for (const char *p = argv[1]; *p != '\0'; p++)
            if ('0' <= *p && *p <= '9')
                return EXIT_SUCCESS;

        return EXIT_FAILURE;
    }

    It's not only 4 lines longer but in just about every individual aspect syntactically more complicated and more messy and functionally more
    clumsy. This is particularly noticable in the loop

    for (const char *p = argv[1]; *p != '\0'; p++)
        if ('0' <= *p && *p <= '9')
            return EXIT_SUCCESS;

    the loop header containing a spuriously qualified variable declaration,
    the loop body and half of the termination condition. The other half then follows as special-case in the otherwise useless loop body.

    It looks like a copy of my code with each individual bit redesigned
    under the guiding principle of "Can we make this more complicated?", eg,

    char **argv

    declares an array of pointers (as each pointer in C points to an array)
    and

    char *argv[]

    accomplishes exactly the same but uses both more characters and more
    different kinds of characters.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Scott Lurndal on Fri Nov 22 18:59:43 2024
    XPost: comp.unix.shell, comp.unix.programmer

    scott@slp53.sl.home (Scott Lurndal) writes:
    Rainer Weikusat <rweikusat@talktalk.net> writes:

    [...]

    Something which would match [0-9]+ in its first argument (if any) would
    be:

    #include "string.h"
    #include "stdlib.h"

    int main(int argc, char **argv)
    {
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;
    if (!c) exit(1);
    return 0;
    }

    but that's 14 lines of text, 13 of which have absolutely no relation to
    the problem of recognizing a digit.

    Personally, I'd use:

    $ cat /tmp/a.c
    #include <stdint.h>
    #include <string.h>

    int
    main(int argc, const char **argv)
    {
    char *cp;
    uint64_t value;

    if (argc < 2) return 1;

    value = strtoull(argv[1], &cp, 10);
    if ((cp == argv[1])
    || (*cp != '\0')) {
    return 1;
    }
    return 0;
    }

    This will accept a string of digits whose numerical value is <=
    ULLONG_MAX, ie, it's basically ^[0-9]+$ with unobvious length and
    content limits.

    return !strstr(argv[1], "0123456789");

    would be a better approximation, just a much more complicated algorithm
    than necessary. Even in strictly conforming ISO-C "digitness" of a
    character can be determined by a simple calculation instead of some kind
    of search loop.
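
    For illustration, that calculation as a tiny helper (a sketch, relying
    only on ISO C's guarantee that '0'..'9' are contiguous in the execution
    character set):

    int is_digit_ch(unsigned char c)
    {
        /* for c < '0' the difference is negative and the cast wraps it to
           a large unsigned value, so the comparison fails */
        return (unsigned)(c - '0') < 10;
    }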

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to rweikusat@talktalk.net on Fri Nov 22 19:05:42 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <87h67zrtns.fsf@doppelsaurus.mobileactivedefense.com>,
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:

    [...]

    In any event, this seems simpler than what you posted:

    #include <stddef.h>
    #include <stdio.h>
    #include <stdlib.h>

    int
    main(int argc, char *argv[])
    {
    if (argc != 2) {
    fprintf(stderr, "Usage: matchd <str>\n");
    return EXIT_FAILURE;
    }

    for (const char *p = argv[1]; *p != '\0'; p++)
    if ('0' <= *p && *p <= '9')
    return EXIT_SUCCESS;

    return EXIT_FAILURE;
    }

    It's not only 4 lines longer but in just about every individual aspect
    syntactically more complicated and more messy and functionally more
    clumsy.

    That's a lot of opinion, and not particularly well-founded
    opinion at that, given that your code was incorrect to begin
    with.

    This is particularly noticable in the loop

    for (const char *p = argv[1]; *p != '\0'; p++)
    if ('0' <= *p && *p <= '9')
    return EXIT_SUCCESS;

    the loop header containing a spuriously qualified variable declaration,

    Ibid. Const qualifying a pointer that I'm not going to assign
    through is just good hygiene, IMHO.

    the loop body and half of the termination condition.

    I think you're trying to project a value judgement onto that
    loop in order to make it fit a particular world view, but I
    think this is an odd way to look at it.

    Another way to look at it is that the loop is only concerned
    with the iteration over the string, while the body is concerned
    with applying some predicate to the element, and doing something
    if that predicate evaluates it to true.

    The other half then
    follows as special-case in the otherwise useless loop body.

    That's a way to look at it, but I submit that's an outlier point
    of view.

    It looks like a copy of my code with each individual bit redesigned
    under the guiding principle of "Can we make this more complicated?", eg,

    Uh, no.

    char **argv

    declares an array of pointers

    No, it declares a pointer to a pointer to char.

    (as each pointer in C points to an array)

    That's absolutely not true. A pointer in C may refer to
    an array, or a scalar. Consider,

    char c;
    char *p = &c;
    char **pp = &p;

    For a concrete example of how this works in a real function,
    consider the second argument to `strtol` et al in the standard
    library.
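
    A minimal sketch of that usage (hypothetical example values): the
    char ** parameter receives the address of a single char * object, not
    an array of pointers.

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        const char *s = "42abc";
        char *end;                        /* one pointer object        */
        long v = strtol(s, &end, 10);     /* &end has type char **     */

        printf("value=%ld rest=\"%s\"\n", v, end);  /* value=42 rest="abc" */
        return 0;
    }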

    and

    char *argv[]

    accomplishes exactly the same but uses both more characters and more
    different kinds of characters.

    "more characters" is a poor metric.

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to rweikusat@talktalk.net on Fri Nov 22 19:15:07 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <87cyinrt5s.fsf@doppelsaurus.mobileactivedefense.com>,
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    scott@slp53.sl.home (Scott Lurndal) writes:
    Rainer Weikusat <rweikusat@talktalk.net> writes:

    [...]

    Something which would match [0-9]+ in its first argument (if any) would
    be:

    #include "string.h"
    #include "stdlib.h"

    int main(int argc, char **argv)
    {
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;
    if (!c) exit(1);
    return 0;
    }

    but that's 14 lines of text, 13 of which have absolutely no relation to
    the problem of recognizing a digit.

    Personally, I'd use:

    $ cat /tmp/a.c
    #include <stdint.h>
    #include <string.h>

    int
    main(int argc, const char **argv)
    {
    char *cp;
    uint64_t value;

    if (argc < 2) return 1;

    value = strtoull(argv[1], &cp, 10);
    if ((cp == argv[1])
    || (*cp != '\0')) {
    return 1;
    }
    return 0;
    }

    This will accept a string of digits whose numerical value is <=
    ULLONG_MAX, ie, it's basically ^[0-9]+$ with unobvious length and
    content limits.

    He acknowledged this already.

    return !strstr(argv[1], "0123456789");

    would be a better approximation,

    No it wouldn't. That's not even close. `strstr` looks for an
    instance of its second argument in its first, not an instance of
    any character in its second argument in its first. Perhaps you
    meant something with `strspn` or similar. E.g.,

    const char *p = argv[1] + strspn(argv[1], "0123456789");
    return *p != '\0';
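
    For completeness, a full program built on that strspn idea (a sketch,
    not code from the thread): it exits 0 only when the whole argument is
    one or more digits, i.e. roughly ^[0-9]+$ (the extra check makes the
    empty string fail, since + requires at least one digit).

    #include <string.h>

    int main(int argc, char **argv)
    {
        if (argc != 2 || argv[1][0] == '\0')
            return 1;
        return argv[1][strspn(argv[1], "0123456789")] == '\0' ? 0 : 1;
    }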

    just a much more complicated algorithm
    than necessary. Even in strictly conforming ISO-C "digitness" of a
    character can be determined by a simple calculation instead of some kind
    of search loop.

    Yes, one can do that, but why bother?

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Kaz Kylheku on Fri Nov 22 20:20:06 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 22.11.2024 19:19, Kaz Kylheku wrote:
    On 2024-11-22, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 21.11.2024 20:12, Kaz Kylheku wrote:
    [...]

    In the wild, you see regexes being used for all sorts of stupid stuff,

    No one can prevent folks using features for stupid things. Yes.

    But the thing is that "modern" regular expressions (Perl regex and its progeny) have features that are designed to exclusively cater to these
    folks.

    Which ones are you specifically thinking of?

    Since I'm not using Perl I don't know all the Perl RE details. Besides
    the basic REs I'm aware of the abbreviations (like '\d'), which I like;
    the extensions beyond Chomsky type-3 (like back-references), which I
    also like to have in cases where I need them, though one must know what
    we buy with them; the minimum match (as opposed to matching the longest
    substring), which I think is useful to simplify some types of
    expressions; another one that evades my memory, something like context
    dependent patterns (also useful); and wasn't there also some syntax to
    match subexpression hierarchies (useful as well), similar to GNU Awk's
    gensub() (probably in a more primitive variant there) and also existing
    in Kornshell patterns, which support some more of the above [Perl]
    features, like the abbreviations?

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Dan Cross on Fri Nov 22 19:24:23 2024
    XPost: comp.unix.shell, comp.unix.programmer

    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:

    [...]

    In any event, this seems simpler than what you posted:

    #include <stddef.h>
    #include <stdio.h>
    #include <stdlib.h>

    int
    main(int argc, char *argv[])
    {
    if (argc != 2) {
    fprintf(stderr, "Usage: matchd <str>\n");
    return EXIT_FAILURE;
    }

    for (const char *p = argv[1]; *p != '\0'; p++)
    if ('0' <= *p && *p <= '9')
    return EXIT_SUCCESS;

    return EXIT_FAILURE;
    }

    It's not only 4 lines longer but in just about every individual aspect
    syntactically more complicated and more messy and functionally more
    clumsy.

    That's a lot of opinion, and not particularly well-founded
    opinion at that, given that your code was incorrect to begin
    with.

    That's not at all an opinion but an observation. My opinion on this is
    that this is either a poor man's attempt at winning an obfuscation
    contest or - simpler - exemplary bad code.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Rainer Weikusat on Fri Nov 22 20:33:24 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On 22.11.2024 12:56, Rainer Weikusat wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 20.11.2024 18:50, Rainer Weikusat wrote:
    [...]
    while (p < e && *p - '0' < 10) ++p;

    That's not too bad. And it's really a hell lot faster than a
    general-purpose automaton programmed to recognize the same pattern
    (which might not matter most of the time, but sometimes, it does).

    Okay, I see where you're coming from (and especially in that simple
    case).

    Personally (and YMMV), even here in this simple case I think that
    using pointers is not better but worse - and anyway isn't [in this
    form] available in most languages;

    That's a question of using the proper tool for the job. In C, that's
    pointer and pointer arithmetic because it's the simplest way to express something like this.

    Yes, in "C" you'd use that primitive (error-prone) pointer feature.
    That's what I said. And that in other languages it's less terse than
    in "C" but equally error-prone if you have to create all the parsing
    code yourself (without an existing engine and in a non-standard way).
    And if you extend the expression to parse it's IME much simpler done
    in Regex than adjusting the algorithm of the ad hoc procedural code.


    in other cases (and languages)
    such constructs get yet more clumsy, and for my not very complex
    example - /[0-9]+(ABC)?x*foo/ - even a "catastrophe" concerning
    readability, error-proneness, and maintainability.

    Procedural code for matching strings constructed in this way is
    certainly much simpler¹ than the equally procedural code for a
    programmable automaton capable of interpreting regexes.

    The point is that Regexps, with their equivalence to FSAs (and the
    guaranteed runtime complexity that brings), are an [efficient]
    abstraction with a formalized syntax; those are huge advantages compared
    to ad hoc parsing code in C (or in any other language).

    Your statement
    is basically "If we assume that the code interpreting regexes doesn't
    exist, regexes need much less code than something equivalent which does exist." Without this assumption, the picture becomes a different one altogether.

    I don't speak of assumptions. I speak about the fact that there's a well-understood model with existing [parsing-]implementations already
    available to handle a huge class of algorithms in a standardized way
    with a guaranteed runtime-efficiency and in an error-resilient way.
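
    For concreteness, a minimal sketch (not from this thread) of handing the
    earlier example /[0-9]+(ABC)?x*foo/ to one such standardized engine, the
    POSIX regcomp()/regexec() API, instead of writing the parsing code by
    hand:

    #include <regex.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        regex_t re;

        if (argc != 2) {
            fprintf(stderr, "usage: rematch <string>\n");
            return 2;
        }
        if (regcomp(&re, "[0-9]+(ABC)?x*foo", REG_EXTENDED | REG_NOSUB) != 0) {
            fprintf(stderr, "regcomp failed\n");
            return 2;
        }
        int rc = regexec(&re, argv[1], 0, NULL, 0);   /* 0 means "matched" */
        regfree(&re);
        return rc == 0 ? 0 : 1;
    }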

    Janis

    [...]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Dan Cross on Fri Nov 22 19:26:07 2024
    XPost: comp.unix.shell, comp.unix.programmer

    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <87cyinrt5s.fsf@doppelsaurus.mobileactivedefense.com>,
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    scott@slp53.sl.home (Scott Lurndal) writes:
    Rainer Weikusat <rweikusat@talktalk.net> writes:

    [...]

    Something which would match [0-9]+ in its first argument (if any) would
    be:

    #include "string.h"
    #include "stdlib.h"

    int main(int argc, char **argv)
    {
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;
    if (!c) exit(1);
    return 0;
    }

    but that's 14 lines of text, 13 of which have absolutely no relation to
    the problem of recognizing a digit.

    Personally, I'd use:

    $ cat /tmp/a.c
    #include <stdint.h>
    #include <string.h>

    int
    main(int argc, const char **argv)
    {
    char *cp;
    uint64_t value;

    if (argc < 2) return 1;

    value = strtoull(argv[1], &cp, 10);
    if ((cp == argv[1])
    || (*cp != '\0')) {
    return 1;
    }
    return 0;
    }

    This will accept a string of digits whose numerical value is <=
    ULLONG_MAX, ie, it's basically ^[0-9]+$ with unobvious length and
    content limits.

    He acknowledged this already.

    return !strstr(argv[1], "0123456789");

    would be a better approximation,

    No it wouldn't. That's not even close. `strstr` looks for an
    instance of its second argument in its first, not an instance of
    any character in it's second argument in its first. Perhaps you
    meant something with `strspn` or similar. E.g.,

    const char *p = argv[1] + strspn(argv[1], "0123456789");
    return *p != '\0';

    My bad.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to rweikusat@talktalk.net on Fri Nov 22 19:46:31 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <878qtbrs0o.fsf@doppelsaurus.mobileactivedefense.com>,
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    Rainer Weikusat <rweikusat@talktalk.net> wrote: >>>cross@spitfire.i.gajendra.net (Dan Cross) writes:

    [...]

    In any event, this seems simpler than what you posted:

    #include <stddef.h>
    #include <stdio.h>
    #include <stdlib.h>

    int
    main(int argc, char *argv[])
    {
    if (argc != 2) {
    fprintf(stderr, "Usage: matchd <str>\n");
    return EXIT_FAILURE;
    }

    for (const char *p = argv[1]; *p != '\0'; p++)
    if ('0' <= *p && *p <= '9')
    return EXIT_SUCCESS;

    return EXIT_FAILURE;
    }

    It's not only 4 lines longer but in just about every individual aspect
    syntactically more complicated and more messy and functionally more
    clumsy.

    That's a lot of opinion, and not particularly well-founded
    opinion at that, given that your code was incorrect to begin
    with.

    That's not at all an opinion but an observation. My opinion on this is
    that this is either a poor man's attempt at winning an obfuscation
    contest or - simpler - exemplary bad code.

    Opinion (noun)
    a view or judgment formed about something, not necessarily based on
    fact or knowledge. "I'm writing to voice my opinion on an issue of
    little importance"

    You mentioned snark earlier. Physician, heal thyself.

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to rweikusat@talktalk.net on Fri Nov 22 19:51:18 2024
    XPost: comp.unix.shell, comp.unix.programmer

    In article <874j3zrrxs.fsf@doppelsaurus.mobileactivedefense.com>,
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <87cyinrt5s.fsf@doppelsaurus.mobileactivedefense.com>,
    Rainer Weikusat <rweikusat@talktalk.net> wrote:
    scott@slp53.sl.home (Scott Lurndal) writes:
    Rainer Weikusat <rweikusat@talktalk.net> writes:

    [...]

    Something which would match [0-9]+ in its first argument (if any) would
    be:

    #include "string.h"
    #include "stdlib.h"

    int main(int argc, char **argv)
    {
    char *p;
    unsigned c;

    p = argv[1];
    if (!p) exit(1);
    while (c = *p, c && c - '0' > 10) ++p;
    if (!c) exit(1);
    return 0;
    }

    but that's 14 lines of text, 13 of which have absolutely no relation to
    the problem of recognizing a digit.

    Personally, I'd use:

    $ cat /tmp/a.c
    #include <stdint.h>
    #include <string.h>

    int
    main(int argc, const char **argv)
    {
    char *cp;
    uint64_t value;

    if (argc < 2) return 1;

    value = strtoull(argv[1], &cp, 10);
    if ((cp == argv[1])
    || (*cp != '\0')) {
    return 1;
    }
    return 0;
    }

    This will accept a string of digits whose numerical value is <=
    ULLONG_MAX, ie, it's basically ^[0-9]+$ with unobvious length and
    content limits.

    He acknowledged this already.

    return !strstr(argv[1], "0123456789");

    would be a better approximation,

    No it wouldn't. That's not even close. `strstr` looks for an
    instance of its second argument in its first, not an instance of
    any character in it's second argument in its first. Perhaps you
    meant something with `strspn` or similar. E.g.,

    const char *p = argv[1] + strspn(argv[1], "0123456789");
    return *p != '\0';

    My bad.

    You've made a lot of "bad"s in this thread, and been rude about
    it to boot, crying foul when someone's pointed out ways that
    your code is deficient; claiming offense at what you perceive as
    "snark" while dishing the same out in kind, making basic errors
    that show you haven't done the barest minimum of testing, and
    making statements that show you have, at best, a limited grasp
    on the language you're choosing to use.

    I'm done being polite. My conclusion is that perhaps you are
    not as up on these things as you seem to think that you are.

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Janis Papanagnou on Fri Nov 22 20:41:21 2024
    XPost: comp.unix.shell, comp.unix.programmer

    On Fri, 22 Nov 2024 12:47:16 +0100, Janis Papanagnou wrote:

    On 21.11.2024 23:05, Lawrence D'Oliveiro wrote:

    Another handy one is “\b” for word boundaries.

    I prefer \< and \> (that are quite commonly used) for such structural
    things ...

    “\<” only matches the beginning of a word, “\>” only matches the end, “\b” matches both <https://www.gnu.org/software/emacs/manual/html_node/emacs/Regexp-Backslash.html>.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)