From Newsgroup: comp.lang.awk
On 11/6/2025 2:35 AM, Luis Mendes wrote:
Hi all,
I've built a small gawk script that is intended to give the user three outputs, all in the END block.
This is running on Linux.
Preferably, I'd like a solution that is not gawk-specific.
The script is invoked as:
find . -name '*.yml' | xargs awk -f script.awk
Schematics of the script:
do some stuff for some lines of each file
END {
    do some stuff
    print "title1"
    for - some cycle
        print "details"
    print "title2"
    for - another cycle
        printf("%12s,%6s\n", k_arr[1], somefunc(k_arr[2])) | "sort -t - k2,2"
    print "title3"
    for - third cycle
        print detail
}
As it is, it prints everything from the first for cycle, the title of the second, the third cycle, and only afterwards the details of the second cycle.
From what I have read, the sort only runs after all of the printfs have been done.
So I thought of using file descriptors to get the prints in order. I modified the printf line to:
printf("%12s,%6s\n", k_arr[1], somefunc(k_arr[2])) | "sort -t -k2,2 | cat>&10"
And adding below:
while ((getline < "/dev/fd/10") >0)
print $0
But I get either "bad file descriptor" or "file descriptor not found".
What should be modified?
Another question: as some file descriptors are already in use, how can I find one that is free to be used?
Thanks,
Luís Mendes
As has already been pointed out, you can solve this specific problem by
just closing the pipeline to `sort` after the loop in which you use it.
FYI, though, a general approach to producing output sorted in various
ways is the Decorate-Sort-Undecorate idiom
(https://rosettacode.org/wiki/Decorate-sort-undecorate_idiom). It also
solves this problem, and it doesn't require you to spawn a subshell from
awk to call sort, so awk can focus on what it does best (manipulating
text) while the shell does what it does best (sequencing calls to
tools). For example (untested):
awk ' # Decorate
    ...
    END {
        do some stuff
        OFS = "-"
        sectNr = 0
        lineNr = 0
        print ++sectNr, ++lineNr, "title1"
        for - some cycle
            print sectNr, ++lineNr, "details"
        print ++sectNr, 0, "title2"
        for - another cycle
            print sectNr, 0, sprintf("%12s,%6s", k_arr[1], somefunc(k_arr[2]))
        lineNr = 0
        print ++sectNr, ++lineNr, "title3"
        for - third cycle
            print sectNr, ++lineNr, "details"
    }
' |
sort -t- -k1,1n -k2,2n -k4,4 | # Sort
cut -d- -f3- # Undecorate
The first awk command decorates the output by prefixing it with:
a) a section number for each section of output you apparently want so we
can initially sort on that to keep the sections in order, and
b) a line number within the first and third sections to keep those lines
in that order when sorted, and
c) the same line number, 0, for all lines within the middle section as
we want to sort that by the subsequent content, not by the original
order of the output lines in that section.
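To make the decoration concrete, here is a minimal sketch of just the first stage with made-up data. The function name decorate and the sample lines are hypothetical, and the dash inside the middle-section content is an assumption so that there is a fourth dash-separated field for the later sort to use.

```shell
# Decorate stage only; the function name and the data are hypothetical.
decorate() {
    awk '
    BEGIN {
        OFS = "-"
        sectNr = 0

        # Section 1: original order matters, so number every line.
        lineNr = 0
        print ++sectNr, ++lineNr, "title1"
        print sectNr, ++lineNr, "first detail"
        print sectNr, ++lineNr, "second detail"

        # Section 2: every line gets line number 0, so a later sort
        # has to fall through to the content to order these lines.
        print ++sectNr, 0, "title2"
        print sectNr, 0, "x-pear"
        print sectNr, 0, "y-apple"
    }'
}

decorate
```

Each output line now starts with a section number and a line number, e.g. the first line is 1-1-title1 and the middle-section lines all start with 2-0-.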
The sort command then sorts by the section number, the line numbers
(which only affect the first and third sections), then the content
(which only affects the middle section).
The cut command then removes the section and line numbers added by the
first awk.
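As a rough illustration of the sort and cut stages, the sketch below feeds some hand-written, deliberately shuffled decorated lines through them. The function names and the data are hypothetical, and LC_ALL=C is only there to make the comparison order predictable. In this sample the title2 line has no fourth field, so its key is empty and it sorts ahead of the other middle-section lines.

```shell
# Hypothetical decorated lines, deliberately out of order.
decorated() {
    printf '%s\n' \
        '3-1-title3' \
        '2-0-x-pear' \
        '1-2-first detail' \
        '2-0-title2' \
        '3-2-last detail' \
        '2-0-y-apple' \
        '1-1-title1'
}

# Sort on section number, then line number, then the 4th dash-separated
# field; afterwards drop the two decoration fields again.
sort_and_undecorate() {
    LC_ALL=C sort -t- -k1,1n -k2,2n -k4,4 |
    cut -d- -f3-
}

decorated | sort_and_undecorate
```

The sections come back in order 1, 2, 3, with the first and third in their original line order and the middle one ordered by the text after its first dash (apple before pear).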
That will work with any awk, sort, and cut (or you can replace cut with
a second awk if you prefer).
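For comparison, the simpler fix mentioned at the start of this reply (closing the pipe to sort) might look something like the sketch below. The data and the function name are made up, and the sort options -t, -k2,2 are my guess at what was intended, not taken from the original script. The explicit fflush() is there because awk may still be holding its own buffered lines when sort writes its output; some awks flush for you when the pipe is opened, but I would not rely on it.

```shell
# Sketch of the close() approach; data and sort options are assumptions.
ordered_report() {
    awk '
    BEGIN {
        # Keep the command in a variable: close() needs the same string.
        cmd = "sort -t, -k2,2"

        print "title1"
        print "details"

        print "title2"
        fflush()            # write out buffered awk output before sort runs
        print "a,2" | cmd
        print "b,1" | cmd
        close(cmd)          # wait for sort to finish; its output lands here

        print "title3"
        print "details"
    }'
}

ordered_report
```

With that, the sorted middle section (b,1 then a,2) appears between title2 and title3 instead of at the very end.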
Ed.