Forum: Too Lazy BBS

A feature I'd like to see in GAWK...

From Kenny McCormack@21:1/5 to All on Mon Jul 15 18:28:31 2024

As we know, AWK in general, and GAWK in particular, has several different
ways of getting data into the program. In addition to the Automatic Input
Loop (the main feature of AWK), there are several variations of "getline".

"getline" can be used with files, or with processes (in 2 different ways!),
or even with network sockets. But the problem with getline is that using
it breaks the Automatic Input Loop. You can't use the standard "pattern/action" paradigm if your input is coming in via "getline". Yes,
there are workarounds and yes we've all gotten used to it, but it is a
shame. For one thing, you can write your program as a shell script, and
use the shell to pipe in the data from a process. But this is ugly. And
not always sufficient.
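
To make the cost concrete, here is a minimal sketch (not from the original post) of the same "non-local df" job done purely with getline from plain awk. The function name nonlocal_getline and the variables local_cmd, all_cmd and seen are made up for illustration. Note how the "header or not-seen" test has to be hand-coded inside a loop instead of being written as a pattern.

```sh
# Sketch only: nonlocal_getline, local_cmd, all_cmd and seen are hypothetical names.
# Defaults are "df -l" and "df"; other commands can be passed for testing.
nonlocal_getline() {
    awk -v local_cmd="${1:-df -l}" -v all_cmd="${2:-df}" '
    BEGIN {
        # First pass: remember every filesystem reported by the local-only command.
        while ((local_cmd | getline line) > 0) {
            split(line, f, " ")
            seen[f[1]]
        }
        close(local_cmd)

        # Second pass: the rule "FNR == 1 || !($1 in x)" has to be hand-coded,
        # because records read via getline never reach the pattern/action rules.
        n = 0
        while ((all_cmd | getline line) > 0) {
            split(line, f, " ")
            if (++n == 1 || !(f[1] in seen))
                print line
        }
        close(all_cmd)
    }'
}
```

Calling nonlocal_getline with no arguments runs "df -l" and "df"; everything happens in BEGIN, so the Automatic Input Loop is never used.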

Now, I have written a GAWK extension to handle this - called "pipeline".
Here is a sample script that uses "pipeline". Note that the Linux "df"
command has a "-l" option to show you only the local filesystems, but what
I usually want is the non-local ones - that's much more interesting. The
only way I can figure how to get that is to run "df" twice and compare the output with and without "-l". Here is my program (non-local-df):

--- Cut Here ---
@load "pipeline"
@include "abort"
# Note: You can ignore the "abort" stuff. It is part of my ecosystem, but
# probably not part of yours.
BEGIN {
    testAbort(ARGC > 1,"This program takes no args!!!",1)
    pipeline("in","df -l")
    while (ARGC < 3)
        ARGV[ARGC++] = "-"
}
ENDFILE { if (ARGIND == 1) pipeline("in","df") }
ARGIND == 1 { x[$1]; next }
FNR == 1 || !($1 in x)
--- Cut Here ---

Needless to say, I'd like to see this sort of functionality built-in.

It seems to me that GAWK has been sort of fishing around lately looking for
new worlds to conquer. Some features have been added lately that seem (to
me anyway) sort of "out of place". namespaces, MPFR arithmetic (apparently, now deprecated), persistent memory (nifty idea, though I don't really see
the practicality - and have not gotten around to testing it - i.e.,
compiling up a new enough version to try it).

I think something like the above would be more in line with the sort of
things I'd like to see in GAWK.

From Mack The Knife@21:1/5 to Kenny McCormack on Tue Jul 16 14:29:10 2024

While this is interesting, it can actually be done very easily from the
shell level, using process substitution:

awk -f foo.awk <(df -l) <(df)
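
The contents of foo.awk are not shown in the thread. A minimal sketch, assuming it mirrors the last two rules of the "pipeline" example, might look like the following. The function name nonlocal_two_files and the array name seen are hypothetical, and the portable NR == FNR test stands in for gawk's ARGIND == 1.

```sh
# Sketch only: nonlocal_two_files and seen are hypothetical names.
# $1 = listing of local filesystems, $2 = listing of all filesystems.
nonlocal_two_files() {
    awk '
    # First input: record each filesystem name, print nothing.
    # (NR == FNR assumes the first input is not empty; df always prints a header.)
    NR == FNR { seen[$1]; next }
    # Second input: keep the header line and anything not seen in the first input.
    FNR == 1 || !($1 in seen)
    ' "$1" "$2"
}
```

In a shell with process substitution (bash, ksh, zsh) this would be invoked as: nonlocal_two_files <(df -l) <(df). The function body itself only needs two readable file names.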

In article <v73pof$3gdp5$2@news.xmission.com>,
Kenny McCormack <gazelle@shell.xmission.com> wrote:

As we know, AWK in general, and GAWK in particular, has several different
ways of getting data into the program. In addition to the Automatic Input
Loop (the main feature of AWK), there are several variations of "getline".

"getline" can be used with files, or with processes (in 2 different ways!),
or even with network sockets. But the problem with getline is that using
it breaks the Automatic Input Loop. You can't use the standard
"pattern/action" paradigm if your input is coming in via "getline". Yes,
there are workarounds and yes we've all gotten used to it, but it is a
shame. For one thing, you can write your program as a shell script, and
use the shell to pipe in the data from a process. But this is ugly. And
not always sufficient.

Now, I have written a GAWK extension to handle this - called "pipeline".
Here is a sample script that uses "pipeline". Note that the Linux "df"
command has a "-l" option to show you only the local filesystems, but what
I usually want is the non-local ones - that's much more interesting. The
only way I can figure how to get that is to run "df" twice and compare the
output with and without "-l". Here is my program (non-local-df):

--- Cut Here ---
@load "pipeline"
@include "abort"
# Note: You can ignore the "abort" stuff. It is part of my ecosystem, but
# probably not part of yours.
BEGIN {
    testAbort(ARGC > 1,"This program takes no args!!!",1)
    pipeline("in","df -l")
    while (ARGC < 3)
        ARGV[ARGC++] = "-"
}
ENDFILE { if (ARGIND == 1) pipeline("in","df") }
ARGIND == 1 { x[$1]; next }
FNR == 1 || !($1 in x)
--- Cut Here ---

Needless to say, I'd like to see this sort of functionality built-in.

It seems to me that GAWK has been sort of fishing around lately looking for
new worlds to conquer. Some features have been added lately that seem (to
me anyway) sort of "out of place". namespaces, MPFR arithmetic (apparently,
now deprecated), persistent memory (nifty idea, though I don't really see
the practicality - and have not gotten around to testing it - i.e.,
compiling up a new enough version to try it).

I think something like the above would be more in line with the sort of
things I'd like to see in GAWK.

--
Adderall, pseudoephed, teleprompter

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Kenny McCormack@21:1/5 to mack@the-knife.org on Tue Jul 16 16:25:28 2024

In article <669683b6$0$713$14726298@news.sunsite.dk>,
Mack The Knife <mack@the-knife.org> wrote:

While this is interesting, it can actually be done very easily from the
shell level, using process substitution:

awk -f foo.awk <(df -l) <(df)

Which, as noted in the OP, is ugly and not AWK, but rather shell.
(As I said, we all know the workarounds - and we all know they are ugly)

And it doesn't work if you have to calculate the value of the process to
run inside the AWK script (which isn't the case with my "df" example, but
is why I used the phrase "not always sufficient").
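
As an illustration of the "not always sufficient" case, here is a minimal sketch (not from the thread) where the command is only known once awk has read a record, so the shell cannot set it up in advance. The function name line_counts is hypothetical, and the quoting of the file name is deliberately naive.

```sh
# Sketch only: line_counts is a hypothetical name.
# Reads file names on stdin, one per line, and prints "name: line-count".
line_counts() {
    awk '
    {
        # The command is built from the current record, i.e. computed inside awk.
        # Naive quoting: breaks on names containing double quotes, $ or backticks.
        cmd = "wc -l < \"" $0 "\""
        if ((cmd | getline n) > 0)
            print $0 ": " (n + 0)    # n + 0 strips any padding wc adds
        close(cmd)
    }'
}
```

Here getline is the only built-in way to read from the computed command, and its output again bypasses the pattern/action rules.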

--
After 4 years of disastrous screwups, Trump now favors 3 policies that I support:
1) $2K/pp stimulus money. Who doesn't want more money?
2) Water pressure. My shower doesn't work very well; I want Donnie to come fix it.
3) Repeal of Section 230. This will lead to the demise of Face/Twit/Gram. Yey!

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Kaz Kylheku@21:1/5 to Kenny McCormack on Tue Jul 16 17:10:25 2024

On 2024-07-15, Kenny McCormack <gazelle@shell.xmission.com> wrote:

--- Cut Here ---
@load "pipeline"
@include "abort"
# Note: You can ignore the "abort" stuff. It is part of my ecosystem, but
# probably not part of yours.
BEGIN {
    testAbort(ARGC > 1,"This program takes no args!!!",1)
    pipeline("in","df -l")
    while (ARGC < 3)
        ARGV[ARGC++] = "-"
}
ENDFILE { if (ARGIND == 1) pipeline("in","df") }
ARGIND == 1 { x[$1]; next }
FNR == 1 || !($1 in x)
--- Cut Here ---

Needless to say, I'd like to see this sort of functionality built-in.

TXR Lisp Awk macro:

(awk (:inputs (open-command "df -l")) (#/tmpfs/ (prn [f 5])))

/run
/dev/shm
/run/lock
/sys/fs/cgroup
/run/user/122
/run/user/500
nil

:inputs arguments can be files, lists of strings, input streams.

(awk (:inputs '("alpha beta" "gamma delta")) (t (prn [f 0])))

alpha
gamma
nil

(awk (:inputs "/etc/hostname") (t (prn [f 0])))

sun-go
nil

nil is the return value of the awk expression. You can control that.
The awk construct establishes a hidden block named awk around
your code.

E.g. return the first tmpfs path from "df -l":

(awk (:inputs (open-command "df -l"))
     (#/tmpfs/ (return-from awk [f 5])))
"/run"

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Arti F. Idiot@21:1/5 to Kenny McCormack on Tue Jul 16 14:05:56 2024

On 7/15/24 12:28 PM, Kenny McCormack wrote:

I think something like the above would be more in line with the sort of things I'd like to see in GAWK.

+1 ; great idea.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Kenny McCormack@21:1/5 to Arti F. Idiot on Wed Jul 17 12:23:00 2024

In article <v76jr4$tu3$1@nnrp.usenet.blueworldhosting.com>,
Arti F. Idiot <addr@is.invalid> wrote:

On 7/15/24 12:28 PM, Kenny McCormack wrote:

I think something like the above would be more in line with the sort of
things I'd like to see in GAWK.

+1 ; great idea.

Well, I think so. The idea is that you shouldn't have to give up the most intrinsic part of AWK (the pattern/action paradigm) just because your input isn't a named (i.e., on the command line) file.

I think of it as "rehabilitating getline". Bringing it back into the fold, rather than exiling it to the sidelines.

Note also that my "pipeline" extension only handles the case of a simple process (either input or output - i.e., like AWK's "getline" and "print"
with "|" redirection). It doesn't handle any of the other variations of getline/print - such as the ones that interface with network sockets. It
would be nice if a built-in approach did those things as well (and better
than my extension does).
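
For reference, the built-in output form mentioned above ("print" with "|" redirection) looks like this in plain awk. This is a minimal sketch, not the extension itself; the function name sorted_by_field and the variables col and sorter are hypothetical.

```sh
# Sketch only: sorted_by_field, col and sorter are hypothetical names.
# Sorts stdin by the given whitespace-separated field (default: field 1).
sorted_by_field() {
    awk -v col="${1:-1}" '
    # The command string is computed at run time from the col variable.
    BEGIN { sorter = "sort -k" col "," col }
    # Every record goes through the normal input loop and is piped to the command.
    { print | sorter }
    # Closing the pipe waits for sort to finish and emit its output.
    END { close(sorter) }'
}
```

On the output side the pattern/action paradigm survives intact; it is only the input side ("cmd | getline") that forces the program out of the Automatic Input Loop.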

--
The randomly chosen signature file that would have appeared here is more than 4 lines long. As such, it violates one or more Usenet RFCs. In order to remain in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/FreeCollege

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Jeremy Brubaker@21:1/5 to Kenny McCormack on Fri Jul 19 14:26:35 2024

On 2024-07-15, Kenny McCormack wrote:

As we know, AWK in general, and GAWK in particular, has several different ways of getting data into the program. In addition to the Automatic Input Loop (the main feature of AWK), there are several variations of "getline".

"getline" can be used with files, or with processes (in 2 different ways!), or even with network sockets. But the problem with getline is that using
it breaks the Automatic Input Loop. You can't use the standard "pattern/action" paradigm if your input is coming in via "getline". Yes, there are workarounds and yes we've all gotten used to it, but it is a
shame. For one thing, you can write your program as a shell script, and
use the shell to pipe in the data from a process. But this is ugly. And
not always sufficient.

Now, I have written a GAWK extension to handle this - called
"pipeline".

That sounds quite useful. I am fairly certain I have wished a feature
like that existed and ended up just wrapping awk with sh but I agree
that's ugly.

Awk is underrated IMHO. Not that json/yaml/etc aren't useful things but frequently when I see them used my first thought is "If you had just
done well-formatted text records I could have parsed this with awk".

--
() www.asciiribbon.org | Jeremy Brubaker
/\ - against html mail | јЬruЬаkе@оrіоnаrtѕ.іо / neonrex on IRC

Even a hawk is an eagle among crows.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)
