As we know, AWK in general, and GAWK in particular, has several different ways of getting data into the program. In addition to the Automatic Input Loop (the main feature of AWK), there are several variations of "getline".
"getline" can be used with files, or with processes (in 2 different ways!), or even with network sockets. But the problem with getline is that using
it breaks the Automatic Input Loop. You can't use the standard "pattern/action" paradigm if your input is coming in via "getline". Yes, there are workarounds and yes we've all gotten used to it, but it is a
shame. For one thing, you can write your program as a shell script, and
use the shell to pipe in the data from a process. But this is ugly. And
not always sufficient.
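To make the complaint concrete, here is a minimal sketch of the getline workaround in plain POSIX awk, wrapped in a shell function. The function name count_tmpfs and the CMD environment variable are made up for illustration; nothing here comes from the "pipeline" extension. What would normally be a one-line pattern/action rule has to be spelled out by hand inside BEGIN:

```shell
# usage: count_tmpfs COMMAND
# COMMAND is any shell command that prints df-like rows (hypothetical helper).
count_tmpfs() {
    CMD="$1" awk '
    BEGIN {
        cmd = ENVIRON["CMD"]
        # No automatic loop here: read, test, and act by hand.
        # "cmd | getline" sets $0 and NF, so $1 is usable, but no
        # pattern/action rule will ever fire for these records.
        while ((cmd | getline) > 0)
            if ($1 == "tmpfs")
                count++
        close(cmd)
        print count + 0
    }'
}
```

With the Automatic Input Loop the same job would be df -l | awk '$1 == "tmpfs" { n++ } END { print n + 0 }', which is exactly the shell-level piping described above.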
Now, I have written a GAWK extension to handle this - called "pipeline".
Here is a sample script that uses "pipeline". Note that the Linux "df" command has a "-l" option to show you only the local filesystems, but what
I usually want is the non-local ones - that's much more interesting. The only way I can figure how to get that is to run "df" twice and compare the output with and without "-l". Here is my program (non-local-df):
--- Cut Here ---
@load "pipeline"
@include "abort"
# Note: You can ignore the "abort" stuff. It is part of my ecosystem, but
# probably not part of yours.
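BEGIN {
    testAbort(ARGC > 1,"This program takes no args!!!",1)
    pipeline("in","df -l")
    # Pad ARGV with two "-" entries so the input loop reads stdin twice.
    while (ARGC < 3)
        ARGV[ARGC++] = "-"
}
# After the first pass, switch the input over to plain "df".
ENDFILE { if (ARGIND == 1) pipeline("in","df") }
# First pass: remember every local filesystem name.
ARGIND == 1 { x[$1]; next }
# Second pass: print the header and anything not seen in the first pass.
FNR == 1 || !($1 in x)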
--- Cut Here ---
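For anyone without the extension, what the script computes can be sketched in plain POSIX awk with two getline loops. This is only an approximation of the logic, not the extension's implementation; the function name non_local_df and the LOCAL_CMD / ALL_CMD variables are assumptions made for the example. The two pattern/action rules from the script turn into explicit loops:

```shell
# usage: non_local_df [LOCAL_CMD [ALL_CMD]]
# Defaults to the two df invocations from the post; both names are hypothetical.
non_local_df() {
    LOCAL_CMD="${1:-df -l}" ALL_CMD="${2:-df}" awk '
    BEGIN {
        local_cmd = ENVIRON["LOCAL_CMD"]
        all_cmd   = ENVIRON["ALL_CMD"]

        # Pass 1: remember the first field of every local row.
        while ((local_cmd | getline) > 0)
            seen[$1]
        close(local_cmd)

        # Pass 2: print the header plus every row not seen in pass 1.
        row = 0
        while ((all_cmd | getline) > 0)
            if (++row == 1 || !($1 in seen))
                print
        close(all_cmd)
    }'
}
```

Called with no arguments it runs "df -l" and "df"; passing two other commands makes it easy to try with canned input.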
Needless to say, I'd like to see this sort of functionality built-in.
It seems to me that GAWK has been sort of fishing around lately looking for new worlds to conquer. Some features have been added lately that seem (to
me anyway) sort of "out of place": namespaces, MPFR arithmetic (apparently now deprecated), persistent memory (nifty idea, though I don't really see
the practicality - and have not gotten around to testing it - i.e.,
compiling up a new enough version to try it).
I think something like the above would be more in line with the sort of things I'd like to see in GAWK.
--
Adderall, pseudoephed, teleprompter
While this is interesting, it can actually be done very easily from the
shell level, using process substitution:
awk -f foo.awk <(df -l) <(df)
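The contents of foo.awk are not given, so the following is only a guess at what it might hold: the two rules from the original script, with NR == FNR standing in for gawk's ARGIND == 1 so that any POSIX awk will do. The function name nonlocal_rows is made up, and the awk call is run through bash -c because process substitution is not available in plain sh:

```shell
# Assumed stand-in for foo.awk (not taken from the thread).
# NR == FNR is true only while the first input is being read; it
# misfires if that input is empty, where gawk's ARGIND == 1 would not.
prog='NR == FNR { seen[$1]; next }   # first input: remember field 1
FNR == 1 || !($1 in seen)            # second input: header plus unseen rows'

# usage: nonlocal_rows [LOCAL_CMD [ALL_CMD]]   (hypothetical helper)
nonlocal_rows() {
    # Inside bash -c: $1 is the awk program, $2 and $3 are the commands.
    bash -c 'awk "$1" <(eval "$2") <(eval "$3")' _ \
        "$prog" "${1:-df -l}" "${2:-df}"
}
```

This keeps the pattern/action style intact, at the cost of needing a shell that supports <( ).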
Needless to say, I'd like to see this sort of functionality built-in.
(awk (:inputs (open-command "df -l")) (#/tmpfs/ (prn [f 5])))
/run
(awk (:inputs '("alpha beta" "gamma delta")) (t (prn [f 0])))
alpha
(awk (:inputs "/etc/hostname") (t (prn [f 0])))
sun-go
(awk (:inputs (open-command "df -l")) (#/tmpfs/ (return-from awk [f 5])))
On 7/15/24 12:28 PM, Kenny McCormack wrote:
I think something like the above would be more in line with the sort of
things I'd like to see in GAWK.
+1 ; great idea.