• Which shell and how to get started handling arguments

    From Janis Papanagnou@21:1/5 to James Harris on Mon Apr 15 16:03:34 2024
    On 15.04.2024 14:22, James Harris wrote:
    For someone who is relatively new to Unix shell scripting (me) some
    advice would be more than welcome on where to begin.

    I have two main queries:


    Q1) How can one write a script which is maximally compatible with
    different systems?

    There are various grades of "portable". These days I would - when
    striving for portability - use the _POSIX_ features as base of
    your programming. That means: not Bourne shell. If you don't
    want to learn what's defined in POSIX you should probably use a
    shell that most closely resembles the POSIX subset, maybe 'dash'.
    Or get the book from Bolsky/Korn that has an appendix with the
    feature comparisons Bourne/POSIX/ksh88/ksh93/...

    Personally, I have a less restricted view on "portability". I
    don't want to miss the modern features, specifically those that
    can all be found in the prominent shells (ksh, zsh, bash). From
    those I'd pick what will be available in your systems' contexts.
    On Linux you usually have all these shells available, but in a
    commercial context (from those three shells) you may have only
    ksh available. One "problem" with those shells is that each will
    provide its own features that the other two shells won't support. So
    either you'll have to spend some time learning the differences
    or, when you write your scripts, run them through all three shells
    to let the shell provide the information.
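
    A minimal sketch of "letting the shell provide the information": feed
    the script to every listed shell that happens to be installed, using the
    -n (read but do not execute) option. The function name check_in_shells
    and the list of shells are assumptions made for illustration. Note that
    -n only reports syntax errors; a feature that parses but fails at run
    time will not be caught this way.

```sh
# check_in_shells FILE: syntax-check FILE with each installed shell.
# Hypothetical helper; the shell list is an assumption, adjust to taste.
check_in_shells() {
    script=$1
    status=0
    for sh in ksh zsh bash dash; do
        # Skip shells that are not installed on this system.
        command -v "$sh" >/dev/null 2>&1 || continue
        if "$sh" -n "$script" 2>/dev/null; then
            echo "$sh: syntax ok"
        else
            echo "$sh: syntax error"
            status=1
        fi
    done
    # Non-zero if at least one installed shell rejected the script.
    return "$status"
}
```

    Usage would be something like: check_in_shells ./myscript.sh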

    There are other factors, like execution speed (ksh), consistent new
    [non-standard] concepts (zsh), large community (bash), that may
    influence your decision.

    Personally I use Kornshell which has the richest feature set and
    is the fastest; specifically Martijn Dekker's branch (of the
    original AT&T) "ksh93u+m". I try to use mostly features also
    available in bash and zsh, but wouldn't take that too strictly.


    I am thinking to write in /the language of/ the Bourne shell, if
    feasible, so that it could be run by either the Bourne shell or Bash,
    etc? (Ideally, the shebang line would be #!/bin/sh.)

    Nowadays "/bin/sh" usually refers to a POSIX shell.

    But the '#!' line's purpose is to define any available interpreter.
    Just be sure that you specify the shell language used whenever
    deviating from POSIX features.


    Or is Bash now so universal that there's no point any longer in writing
    for anything else?

    See above.



    Q2) How does one go about handling arguments in preferably a simple but universal way?

    My first idea was to iterate over arguments with such as

    while [ $# -gt 0 ]
    do
        ...
        shift
    done

    and from that to (a) note any switches and (b) build up an array of positional parameters. However, I gather the Bourne shell has no arrays (other than the parameters themselves) so that won't work.

    Arrays can be populated by the argument list in one go, e.g. by

    a=( "$@" )

    (but that may not be what you want).
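
    The array assignment above is not POSIX sh. In plain POSIX sh the
    positional parameters are the only array, but they can be rebuilt in
    place. A minimal sketch, assuming a made-up flag -v and a hypothetical
    function name split_args: each argument is shifted off the front, and
    operands are appended again at the end, so after one full pass "$@"
    holds only the operands in their original order.

```sh
# split_args: note flags and keep operands, using "$@" as the only array.
# Hypothetical example: -v is a made-up flag; everything else is an operand.
split_args() {
    verbose=0
    n=$#                              # original arguments still to inspect
    while [ "$n" -gt 0 ]; do
        arg=$1
        shift
        case $arg in
            -v) verbose=1 ;;              # note the switch
            *)  set -- "$@" "$arg" ;;     # re-append the operand at the end
        esac
        n=$((n - 1))
    done
    # "$@" now holds only the operands, in their original order.
    echo "verbose=$verbose count=$#"
    for operand in "$@"; do
        printf 'operand: %s\n' "$operand"
    done
}
```

    For example, split_args a -v "b c" reports verbose=1 and the two
    operands "a" and "b c", with the embedded space preserved.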


    I read up on getopts

    It's the right tool.

    but from tests it seems to require that switches
    precede arguments rather than allowing them to be specified after, so
    that doesn't seem very good, either.

    That's the usual convention, first come the options (with optional
    arguments), then the non-option arguments. Here's a syntax example

    yagol [-s] [-w width] [-h height] [-g[ngen]] [-d density]
    [-i infile] [-o outfile] [-r random-seed] [-u rule]
    [-k|-t[sec]|-l|-f] [-p|-n|-c] [-a[gen]] [-m[rate]]

    (this one just with options and no further arguments).

    Of course it uses ksh's getopts - but note that ksh's getopts is
    not portable - because it simplifies processing a lot (and it also
    implies a usage and help information).


    Online tutorials show different ways to handle this and few talk about
    which shell to use for this case so I thought I would ask you guys for suggestions.

    My clear getopts favorite (but not only with this feature) is the
    original AT&T Kornshell (in form of above mentioned "u+m" version).


    My requirement just now is, in fact, so simple that I don't need a
    universal way to handle things but ISTM best to start with an approach
    that will scale over time, if there is one.

    This is an excellent thought; it's very typical that you start
    with trivial samples, and then (own and foreign) demands come up
    to extend it, and at some point you have to do some refactoring
    (unless you started with a more general approach). (Above yagol
    example also started with only four options.)


    So any guidance on how to get started would be appreciated!

    Hope it helps. - Feel free to come back with more questions.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Janis Papanagnou on Mon Apr 15 16:34:41 2024
    On 15.04.2024 16:03, Janis Papanagnou wrote:
    On 15.04.2024 14:22, James Harris wrote:

    I read up on getopts

    It's the right tool.

    but from tests it seems to require that switches
    precede arguments rather than allowing them to be specified after, so
    that doesn't seem very good, either.

    That's the usual convention, first come the options (with optional arguments), then the non-option arguments. Here's a syntax example

    yagol [-s] [-w width] [-h height] [-g[ngen]] [-d density]
    [-i infile] [-o outfile] [-r random-seed] [-u rule]
    [-k|-t[sec]|-l|-f] [-p|-n|-c] [-a[gen]] [-m[rate]]

    Please ignore that example (I forgot it's C code using 'getopt()'
    from the GNU C library). The features differ in some ways.

    For ksh's 'getopts', type 'getopts --man' in a ksh terminal to get
    more extensive manual information than you find in 'man ksh'.

    Janis

  • From Christian Weisgerber@21:1/5 to James Harris on Mon Apr 15 13:35:40 2024
    On 2024-04-15, James Harris <james.harris.1@gmail.com> wrote:

    Q1) How can one write a script which is maximally compatible with
    different systems?

    I am thinking to write in /the language of/ the Bourne shell, if
    feasible, so that it could be run by either the Bourne shell or Bash,
    etc? (Ideally, the shebang line would be #!/bin/sh.)

    Yes. POSIX shell, more specifically. That is the easy part. The
    difficult part is that your script will likely call various external
    commands and those have a lot of variation as well.

    Q2) How does one go about handling arguments in preferably a simple but universal way?

    That's too vague...

    I read up on getopts

    If you want to handle option flags, getopts is the way to go.

    but from tests it seems to require that switches precede arguments
    rather than allowing them to be specified after, so that doesn't
    seem very good, either.

    But that's the way Unix commands work. You cannot specify flags
    after the first non-flag argument.

    $ touch foo -l
    $ ls foo -l
    -l foo
    $ ls -l foo -l
    -rw-r--r-- 1 naddy naddy 0 Apr 15 15:28 -l
    -rw-r--r-- 1 naddy naddy 0 Apr 15 15:28 foo

    Apparently GNU implementations deviate from this, which makes for
    a bad surprise and is incompatible with other implementations as
    well as historical practice.

    --
    Christian "naddy" Weisgerber naddy@mips.inka.de

  • From Lew Pitcher@21:1/5 to James Harris on Mon Apr 15 15:06:31 2024
    On Mon, 15 Apr 2024 13:22:14 +0100, James Harris wrote:

    For someone who is relatively new to Unix shell scripting (me) some
    advice would be more than welcome on where to begin.

    I have two main queries:


    Q1) How can one write a script which is maximally compatible with
    different systems?

    As others have said, write your script to the POSIX shell language
    standards. (see
    https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html)

    Most shells support this restricted dialect.



    Q2) How does one go about handling arguments in preferably a simple but universal way?

    The "simple but universal way" is to sequentially parse your argument
    list. But this leads to complications that may not sit well with your
    script design, in that you (the programmer) have to decide whether or
    not you want to impose a specific order on the argument list, and,
    following that decision, how you want to handle "unflagged" arguments.

    Then, there is getopts (which is /not/ a universally-supported extension
    to the shell language), which will handle the argument list for you, but
    with caveats and argument list order decisions that you might not agree
    with.

    For the most part, the "simple but universal" rule is "KISS" (Keep It Simple
    & Sequential), with flags first, and non-flag arguments in a fixed order,
    after the flags.
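
    A minimal sketch of that "flags first, then operands in a fixed order"
    rule, written as a hand-rolled loop in POSIX sh. The flags -v and
    -o FILE and the function name parse_simple are assumptions made for
    illustration only.

```sh
# parse_simple: flags first, then operands in a fixed order.
# Hypothetical flags: -v (switch) and -o FILE (flag with an argument).
parse_simple() {
    verbose=0
    outfile=
    while [ $# -gt 0 ]; do
        case $1 in
            -v) verbose=1 ;;
            -o) [ $# -ge 2 ] || { echo "missing argument for -o" >&2; return 2; }
                outfile=$2
                shift ;;                 # consume the flag's argument
            --) shift; break ;;          # explicit end of flags
            -?*) echo "unknown flag: $1" >&2; return 2 ;;
            *)  break ;;                 # first non-flag argument ends the flags
        esac
        shift
    done
    # Whatever is left in "$@" are the operands, in the order given.
    echo "verbose=$verbose outfile=$outfile operands=$#"
}
```

    Unlike getopts, this sketch does not handle clustered flags such as
    -vo FILE; that is the price of keeping it simple.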


    HTH
    --
    Lew Pitcher
    "In Skills We Trust"

  • From Ben Bacarisse@21:1/5 to James Harris on Mon Apr 15 15:45:25 2024
    James Harris <james.harris.1@gmail.com> writes:

    For someone who is relatively new to Unix shell scripting (me) some advice would be more than welcome on where to begin.

    I have two main queries:


    Q1) How can one write a script which is maximally compatible with different systems?

    Use only the features described for POSIX sh.

    I am thinking to write in /the language of/ the Bourne shell, if feasible,
    so that it could be run by either the Bourne shell or Bash, etc? (Ideally, the shebang line would be #!/bin/sh.)

    The term "Bourne shell" is a little ambiguous. Many people take it to
    mean "POSIX shell" but some people would go further and take it to mean
    an older shell without some of the more recent features, which you
    therefore cannot rely on.

    Or is Bash now so universal that there's no point any longer in writing for anything else?

    That depends on your audience. For Linux users, pretty much yes.

    Q2) How does one go about handling arguments in preferably a simple but universal way?

    My first idea was to iterate over arguments with such as

    while [ $# -gt 0 ]
    do
        ...
        shift
    done

    and from that to (a) note any switches and (b) build up an array of positional parameters. However, I gather the Bourne shell has no arrays (other than the parameters themselves) so that won't work.

    I read up on getopts but from tests it seems to require that switches
    precede arguments rather than allowing them to be specified after, so that doesn't seem very good, either.

    Well that's what most people will be used to. I would want

    command -o out1 file1 -z -i out2 file2

    to use out1 for the first file, out2 for the second and for the -z to
    apply only to the second file.

    If you can accept that this is a reasonable way of working, then you can
    use your previously written loop. Every non-flag argument is just
    processed from inside the loop at the point it is seen.
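
    A minimal sketch of that way of working: flags update the current
    settings, and every non-flag argument is processed on the spot with
    whatever settings are in effect at that point. The flags -o and -z and
    the function name process_stream are assumptions made for illustration;
    the "processing" here is just a printf.

```sh
# process_stream: handle each file at the point it is seen, using the
# flag settings accumulated so far. Hypothetical flags: -o OUT and -z.
process_stream() {
    out=
    zflag=0
    while [ $# -gt 0 ]; do
        case $1 in
            -o) [ $# -ge 2 ] || { echo "missing argument for -o" >&2; return 2; }
                out=$2
                shift ;;                 # consume the argument of -o
            -z) zflag=1 ;;
            *)  printf 'file=%s out=%s z=%s\n' "$1" "$out" "$zflag" ;;
        esac
        shift
    done
}
```

    With this sketch, a -z given after the first file affects only the
    files that follow it.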

    If you have to save them for later, you could consider building a
    string of saved arguments using an "unlikely" separator string:

    #!/bin/sh

    args=""
    # Separator: a newline, "unlikely" to occur inside an argument.
    sep='
    '
    while [ $# -gt 0 ]
    do
        case "$1" in
            -*) echo "flag: $1"
                ;;
            *)  args="$1$sep$args"    # prepend, so the order is reversed
                ;;
        esac
        shift
    done
    while [ -n "${args}" ]
    do
        echo "Arg is '${args%%$sep*}'"
        args="${args#*$sep}"
    done

    This reverses the order. You can preserve the order with slightly
    different string fiddling.
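
    One possible form of that string fiddling, as a minimal sketch: append
    each saved argument plus the separator instead of prepending it. The
    function name collect_args is hypothetical, and the approach still
    assumes that no argument contains a newline.

```sh
# collect_args: report flags immediately, save operands in a
# newline-separated string, then replay them in their original order.
collect_args() {
    args=
    sep='
'
    for a in "$@"; do
        case $a in
            -*) printf 'flag: %s\n' "$a" ;;
            *)  args="$args$a$sep" ;;    # append, so the order is preserved
        esac
    done
    while [ -n "$args" ]; do
        printf "Arg is '%s'\n" "${args%%$sep*}"   # text before the first separator
        args=${args#*$sep}                        # drop that item and its separator
    done
}
```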

    --
    Ben.

  • From Christian Weisgerber@21:1/5 to Lew Pitcher on Mon Apr 15 15:36:09 2024
    On 2024-04-15, Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:

    As others have said, write your script to the POSIX shell language
    standards.

    Then, there is getopts (which is /not/ a universally-supported extension
    to the shell language),

    It is part of POSIX sh.

    --
    Christian "naddy" Weisgerber naddy@mips.inka.de

  • From Kaz Kylheku@21:1/5 to Kenny McCormack on Mon Apr 15 17:57:27 2024
    On 2024-04-15, Kenny McCormack <gazelle@shell.xmission.com> wrote:
    In article <slrnv1qib9.1f89.naddy@lorvorc.mips.inka.de>,
    Christian Weisgerber <naddy@mips.inka.de> wrote:
    On 2024-04-15, Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:

    As others have said, write your script to the POSIX shell language
    standards.

    Then, there is getopts (which is /not/ a universally-supported extension
    to the shell language),

    It is part of POSIX sh.

    It would be useful to know exactly *why* OP wants to stay "as portable as possible". Yes, I know it is against the creed here to question such
    things, but it needs to be done, nevertheless.

    OP probably doesn't have a good feeling for the shell script portability landscape, and just wants their scripts to work on multiple systems,
    with whatever shell is installed by default?

    Wild-assed guess; but he did mention multiple systems.

    I say this as someone who does occasionally program in dash (Debian's
    version of the "POSIX shell" paradigm), just to see and to remind myself about how limited it is. But I would urge OP to think long and hard about whether or not it matters. I find that when I do program in dash, I rather quickly run into the limitations and end up regretting the choice.

    Whereas if you program in Bash, it's like you're mounted on a steed that
    is galloping over vast, open plains of software engineering techniques ...

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

  • From Kenny McCormack@21:1/5 to naddy@mips.inka.de on Mon Apr 15 17:38:34 2024
    In article <slrnv1qib9.1f89.naddy@lorvorc.mips.inka.de>,
    Christian Weisgerber <naddy@mips.inka.de> wrote:
    On 2024-04-15, Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:

    As others have said, write your script to the POSIX shell language
    standards.

    Then, there is getopts (which is /not/ a universally-supported extension
    to the shell language),

    It is part of POSIX sh.

    It would be useful to know exactly *why* OP wants to stay "as portable as possible". Yes, I know it is against the creed here to question such
    things, but it needs to be done, nevertheless.

    I say this as someone who does occasionally program in dash (Debian's
    version of the "POSIX shell" paradigm), just to see and to remind myself
    about how limited it is. But I would urge OP to think long and hard about whether or not it matters. I find that when I do program in dash, I rather quickly run into the limitations and end up regretting the choice.

    For example, many things exist in both bash (my preferred shell programming language, just in case such had not been made clear by now) and in "POSIX" shell, but have limited functionality in the POSIX version. So, if you are used to the full functionality you get in the bash version, you can get
    bitten if you assume that the "POSIX" version has the same functionality.
    I mention this specifically because you mentioned "getopts" - which does
    indeed exist in both bash and dash, but I'd be willing to bet, has limited functionality in the dash version. I never use "getopts", so I don't know
    this for a fact; I am just speculating. I know it *is* true for some other shell commands/functions.

    In general, I just don't find it worth the bother to limit myself to a
    crippled shell.

    But, mind you, it is possible, though unlikely, that OP actually has a good/valid reason for doing so. Mostly, I think it is just virtue
    signalling.

    --
    When I was growing up we called them "retards", but that's not PC anymore.
    Now, we just call them "Trump Voters".

    The question is, of course, how much longer it will be until that term is also un-PC.

  • From Lew Pitcher@21:1/5 to Christian Weisgerber on Mon Apr 15 20:37:15 2024
    On Mon, 15 Apr 2024 15:36:09 +0000, Christian Weisgerber wrote:

    On 2024-04-15, Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:

    As others have said, write your script to the POSIX shell language
    standards.

    Then, there is getopts (which is /not/ a universally-supported extension
    to the shell language),

    It is part of POSIX sh.

    I did not know that. I've learned something new today.
    Thanks :-)

    --
    Lew Pitcher
    "In Skills We Trust"

  • From Helmut Waitzmann@21:1/5 to All on Mon Apr 15 23:03:52 2024
    Christian Weisgerber <naddy@mips.inka.de>:
    On 2024-04-15, James Harris <james.harris.1@gmail.com> wrote:

    Q1) How can one write a script which is maximally compatible
    with different systems?


    I am thinking to write in /the language of/ the Bourne shell,
    if feasible, so that it could be run by either the Bourne shell
    or Bash, etc? (Ideally, the shebang line would be #!/bin/sh.)


    Yes. POSIX shell, more specifically. That is the easy part. The
    difficult part is that your script will likely call various external
    commands and those have a lot of variation as well.

    Q2) How does one go about handling arguments in preferably a
    simple but universal way?


    That's too vague...


    I read up on getopts


    If you want to handle option flags, getopts is the way to go.


    but from tests it seems to require that switches precede
    arguments rather than allowing them to be specified after, so
    that doesn't seem very good, either.


    But that's the way Unix commands work. You cannot specify flags
    after the first non-flag argument.

    $ touch foo -l
    $ ls foo -l
    -l foo
    $ ls -l foo -l
    -rw-r--r-- 1 naddy naddy 0 Apr 15 15:28 -l
    -rw-r--r-- 1 naddy naddy 0 Apr 15 15:28 foo

    Apparently GNU implementations deviate from this, which makes for
    a bad surprise and is incompatible with other implementations as
    well as historical practice.


    To handle this deviation, always put an end‐of‐flags marker
    ("--", as specified by POSIX) before the first non‐flag argument,
    then even the GNU implementations will well‐behave, i. e. behave
    as specified by POSIX:


    Compare (using GNU ls) with Christian's well‐behaved "ls":


    touch -- foo -l

    $ ls foo -l
    -rw------- 1 helmut helmut 0 Apr 15 15:28 foo

    which deviates from POSIX,



    $ ls -l foo -l
    -rw------- 1 helmut helmut 0 Apr 15 15:28 foo

    which deviates from POSIX,



    $ ls -- foo -l
    -l
    foo

    which behaves as specified by POSIX,



    $ ls -l -- foo -l
    -rw------- 1 helmut helmut 0 Apr 15 15:28 -l
    -rw------- 1 helmut helmut 0 Apr 15 15:28 foo

    which behaves as specified by POSIX.

  • From Helmut Waitzmann@21:1/5 to All on Mon Apr 15 23:49:33 2024
    James Harris <james.harris.1@gmail.com>:

    I read up on getopts but from tests it seems to require that
    switches precede arguments


    Yes, that's true.


    rather than allowing them to be specified after, so that doesn't
    seem very good, either.


    The problem with specifying options after non‐option arguments is
    that non‐option arguments may take any form:  They even may start
    with an "-", that is, look like options even when they aren't
    meant to be used as options.


    So, if you have some command "some_command" with one non‐option
    argument "-a" followed by an option "-b"


    $ some_command -a -b


    and parse the arguments from left to right, then there is no way
    for "some_command" to determine that "-a" is to be taken as a
    non‐option argument while "-b" is to be taken as an option.


    Whereas, when "some_command" expects options before any
    non‐option argument and recognizes the end‐of‐options marker
    ("--") then there is no ambiguity:


    $ some_command -b -- -a


    "-b" is an option, while "-a" (because the series of options is
    terminated by the end‐of‐options marker) is the (first)
    non‐option argument "-a".


    That's the way how the POSIX‐shell builtin "getopts" works.
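
    A minimal getopts sketch in POSIX sh that shows this behaviour. The
    options -b (a switch) and -o FILE, and the function name parse_opts, are
    assumptions made for illustration. The leading ":" in the option string
    selects silent error reporting, so the script prints its own messages.

```sh
# parse_opts: POSIX getopts sketch with hypothetical options -b and -o FILE.
parse_opts() {
    bflag=0
    outfile=
    OPTIND=1                      # reset so the function can be called again
    while getopts :bo: opt; do
        case $opt in
            b)  bflag=1 ;;
            o)  outfile=$OPTARG ;;
            :)  echo "option -$OPTARG requires an argument" >&2; return 2 ;;
            \?) echo "unknown option -$OPTARG" >&2; return 2 ;;
        esac
    done
    shift $((OPTIND - 1))         # drop the parsed options and a "--" if present
    echo "b=$bflag o=$outfile operands=$*"
}
```

    Calling parse_opts -b -- -a treats "-a" as an operand, and in
    parse_opts -o out file1 -b the trailing "-b" is an operand too, because
    option parsing stops at "file1".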

  • From Lawrence D'Oliveiro@21:1/5 to James Harris on Mon Apr 15 22:20:34 2024
    On Mon, 15 Apr 2024 13:22:14 +0100, James Harris wrote:

    I am thinking to write in /the language of/ the Bourne shell, if
    feasible, so that it could be run by either the Bourne shell or Bash,
    etc? (Ideally, the shebang line would be #!/bin/sh.)

    There is such a thing as a standardized “POSIX shell”. On Debian, for example, /bin/sh will launch Dash, which is a minimal POSIX-compliant
    shell.

    It’s certainly a safe, boring choice. ;)

    Or is Bash now so universal that there's no point any longer in writing
    for anything else?

    This is where we get into “Unix®” the trade mark, versus “Unix” as an informal description of a collection of traditional OS behaviour.

    I say this because the only currently “Unix®” trade mark licensee still seeing any significant use is Apple’s macOS, and that does not offer
    Bash--at least, not any reasonably recent version. This is for ideological reasons or something.

    So if you are targeting “Unix” in the latter sense, then Bash is quite widespread, yes.

    I read up on getopts but from tests it seems to require that switches
    precede arguments rather than allowing them to be specified after, so
    that doesn't seem very good, either.

    One reason for that convention is that it is possible for file/directory
    names to begin with “-”. To minimize the confusion this causes, there is
    an additional common convention among command-line tools that a plain “--”
    option means “don’t look for any more options after this”. That is to say,
    treat the remaining items as file names (or whatever else the program does
    with them), even if they begin with “-”.

  • From Kaz Kylheku@21:1/5 to Helmut Waitzmann on Tue Apr 16 01:14:38 2024
    On 2024-04-15, Helmut Waitzmann <nn.throttle@xoxy.net> wrote:
    Compare (using GNU ls) with Christians well‐behaving "ls":


    touch -- foo -l

    $ ls foo -l
    -rw------- 1 helmut helmut 0 Apr 15 15:28 foo

    touch deviates in the first place; omit the -- and you get

    $ touch foo -l
    touch: invalid option -- 'l'

    That's crazy. foo is a non-option argument, so the options
    have ended at that point.

    I see where it is documented in "2 Common options" (Coreutils manual):

    Normally options and operands can appear in any order, and programs act
    as if all the options appear before any operands. For example, ‘sort -r
    passwd -t :’ acts like ‘sort -r -t : passwd’, since ‘:’ is an
    option-argument of -t. However, if the POSIXLY_CORRECT environment
    variable is set, options must appear before operands, unless otherwise
    specified for a particular command.

    It is disingenuous to call it "POSIXly correct", because in fact the
    POSIX rules are how everyone understands it and how other implementors
    of utilities implement it. (Does anyone else do this crazy thing?)

    If all the vendors feature a given extension, so that it is portable,
    but POSIX refuses to adopt it, then, sure: the mode which takes the
    extension away can be flippantly called "POSIXly correct".

    Also the claim "options must appear before operands [in POSIX]" is
    misleading, because "must" is usually interpreted as an imposed
    requirement, which can be violated and diagnosed. But in fact it is
    *logically* impossible for options to appear elsewhere because arguments
    that look like options placed in the non-option part of the command line
    are operands. It's the logical "must", not the requirements "must".

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

  • From Janis Papanagnou@21:1/5 to Keith Thompson on Tue Apr 16 10:19:19 2024
    On 15.04.2024 23:31, Keith Thompson wrote:

    Bash has an option that tells it to (attempt to) restrict itself to
    POSIX semantics:

    Starting Bash with the '--posix' command-line option or executing
    'set -o posix' while Bash is running will cause Bash to conform more
    closely to the POSIX standard by changing the behavior to match that
    specified by POSIX in areas where the Bash default differs.

    I haven't used this option myself, and I don't know just how closely it actually conforms to POSIX.

    I'm less familiar with ksh and zsh, but they probably have similar
    options. At least the "MirBSD Korn shell" has "set -o posix".

    The branch ksh93u+m has it...

    $ set -o | grep -i posix
    posix off

    But original ksh93u+ doesn't show such an option.

    Janis

  • From Christian Weisgerber@21:1/5 to Keith Thompson on Tue Apr 16 11:11:16 2024
    On 2024-04-15, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

    Bash has an option that tells it to (attempt to) restrict itself to
    POSIX semantics:

    No, it does not:

    Starting Bash with the '--posix' command-line option or executing
    'set -o posix' while Bash is running will cause Bash to conform more
    closely to the POSIX standard by changing the behavior to match that
    specified by POSIX in areas where the Bash default differs.
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    This only tweaks bash's behavior where it otherwise differs from
    POSIX. It does not disable the myriad extensions.
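
    A quick way to see this for oneself, as a minimal sketch: array syntax
    is a bash extension that is not part of POSIX sh, yet it is still
    accepted in POSIX mode. The function name posix_mode_keeps_arrays is
    hypothetical, and the check is skipped when bash is not installed.

```sh
# posix_mode_keeps_arrays: run a bash-only construct under "bash --posix".
posix_mode_keeps_arrays() {
    command -v bash >/dev/null 2>&1 || { echo "bash not installed"; return 0; }
    # Arrays are not POSIX, but POSIX mode does not reject them.
    bash --posix -c 'a=(one two three); echo "${a[1]}"'
}
```

    So a script that runs cleanly under "bash --posix" may still fail under
    a stricter shell such as dash.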

    There is a tool ShellCheck that among other things can be used to
    warn about unportable code in shell scripts.
    https://www.shellcheck.net/

    I haven't used it myself yet. It is written in Haskell, so it
    suffers itself from portability concerns.

    --
    Christian "naddy" Weisgerber naddy@mips.inka.de

  • From Kenny McCormack@21:1/5 to Keith.S.Thompson+u@gmail.com on Tue Apr 16 19:59:03 2024
    In article <87y19dqgsh.fsf@nosuchdomain.example.com>,
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Christian Weisgerber <naddy@mips.inka.de> writes:
    On 2024-04-15, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Bash has an option that tells it to (attempt to) restrict itself to
    POSIX semantics:

    No, it does not:

    Starting Bash with the '--posix' command-line option or executing
    'set -o posix' while Bash is running will cause Bash to conform more
    closely to the POSIX standard by changing the behavior to match that
    specified by POSIX in areas where the Bash default differs.
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    This only tweaks bash's behavior where it otherwise differs from
    POSIX. It does not disable the myriad extensions.

    I stand corrected.

    ISTR (which is to say, I can't prove it or point to an example at the
    moment), that there were some systems under some circumstances where if
    bash was copied/linked as "sh" (and then run as "sh" instead of "bash"),
    then it did indeed behave like a plain "POSIX" shell (i.e., extensions were disabled).

    It is to be noted that bash is an evolving (i.e., changing) program and
    there is no written "standard" for it - like Perl, it is just whatever its
    current maintainers make it out to be at any particular moment. This
    makes it hard to make the kind of hard-and-fast statements about it that
    people in newsgroups like this one like so much to do.

    --
    Elect a clown, expect a circus.

  • From Helmut Waitzmann@21:1/5 to All on Tue Apr 16 22:23:46 2024
    Kaz Kylheku <643-408-1753@kylheku.com>:
    On 2024-04-15, Helmut Waitzmann <nn.throttle@xoxy.net> wrote:
    Compare (using GNU ls) with Christians well‐behaving "ls":


    touch -- foo -l

    $ ls foo -l
    -rw------- 1 helmut helmut 0 Apr 15 15:28 foo

    touch deviates in the first place; omit the -- and you get


    $ touch foo -l
    touch: invalid option -- 'l'

    That's crazy. foo is a non-option argument, so the options
    have ended at that point.


    Yes, that's the same problem as with GNU "ls".  (I decided to
    silently avoid it by making use of the end‐of‐option marker
    without commenting it.)


    I see where it is documented in "2 Common options" (Coreutils
    manual):


    Normally options and operands can appear in any order, and
    programs act as if all the options appear before any operands.
    For example, ‘sort -r passwd -t :’ acts like ‘sort -r -t :
    passwd’, since ‘:’ is an option-argument of -t. However, if
    the POSIXLY_CORRECT environment variable is set, options must
    appear before operands, unless otherwise specified for a
    particular command.


    If I understand the last sentence correctly, setting the
    POSIXLY_CORRECT environment variable forces the GNU utilities to
    stop option processing before the first non‐option argument
    (i. e. an argument beginning not with a "-") as if that argument
    had been preceded by the end‐of‐option argument "--".


    It is disingenuous to call it "POSIXly correct", because in fact
    the POSIX rules are how everyone understands it and how other
    implementors of utilities implement it. (Does anyone else do
    this crazy thing?)


    If all the vendors feature a given extension, so that it is
    portable, but POSIX refuses to adopt it, then, sure: the mode
    which takes the extension away can be flippantly called "POSIXly
    correct".


    Maybe my knowledge of the English language is not good enough to
    understand you correctly.  Under the premise, that the
    POSIXLY_CORRECT environment variable has been set, do you see the
    GNU utilities behaving in any different way from what is the
    behavior specified by POSIX?


    Also the claim "options must appear before operands [in POSIX]"
    is misleading, because "must" is usually interpreted as an
    imposed requirement, which can be violated and diagnosed. But in
    fact it is *logically* impossible for options to appear
    elsewhere because arguments that look like options placed in the
    non-option part of the command line are operands. It's the
    logical "must", not the requirements "must".


    The GNU manual does not say "in POSIX" but says "if the
    POSIXLY_CORRECT environment variable is set", but otherwise I
    agree with you.  This is why I recommend not making use of the
    GNU behavior of looking for options right of the first non‐option
    operand when neither the end‐of‐option marker is used nor the
    POSIXLY_CORRECT environment variable has been set, but rather
    always put any options (if present) first, then – regardless of
    whether any options are actually present – in any case supply the
    end‐of‐option marker ("--") and finally any non‐option operands. 
    This way both POSIX utilities and GNU utilities will behave the
    same, even if the POSIXLY_CORRECT environment variable happens
    not to be set.
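
The "options, then --, then operands" layout recommended above might look like this in a script. This is only a sketch: the file names and the variable names are invented for the demonstration, and mktemp -d is assumed to be available although POSIX does not require it.

```sh
# Work in a scratch directory so the demo cannot touch real files.
old_pwd=$PWD
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# One file name that looks like an option, one that does not.
touch -- '-v' 'plain.txt'

# Options first, then "--", then operands: "-v" is taken as a file name.
rm -f -- '-v' 'plain.txt'

# Record whether anything was left behind.
if [ -e ./-v ] || [ -e ./plain.txt ]; then
    status=leftover
else
    status=clean
fi
printf '%s\n' "$status"

cd "$old_pwd" || exit 1
rmdir "$workdir"
```

Without the "--", both touch and rm would try to interpret '-v' as an option instead of creating or removing the file.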

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Christian Weisgerber on Tue Apr 16 20:45:50 2024
    On 2024-04-16, Christian Weisgerber <naddy@mips.inka.de> wrote:
    On 2024-04-15, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

    Bash has an option that tells it to (attempt to) restrict itself to
    POSIX semantics:

    No, it does not:

    Starting Bash with the '--posix' command-line option or executing
    'set -o posix' while Bash is running will cause Bash to conform more
    closely to the POSIX standard by changing the behavior to match that
    specified by POSIX in areas where the Bash default differs.
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    This only tweaks bash's behavior where it otherwise differs from
    POSIX. It does not disable the myriad extensions.

    "set -o posix" disabling conforming extensions would be a GCC-grade
    stupidity.
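
That posix mode leaves extensions alone can be checked directly. A small sketch, assuming a bash binary is on PATH; the variable name out is arbitrary:

```sh
# Arrays and [[ ]] are bash extensions, yet they keep working under --posix.
out=$(bash --posix -c '
    arr=(a b c)
    if [[ ${arr[1]} == b ]]; then
        echo "extensions still active"
    fi
')
printf '%s\n' "$out"
```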

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

  • From Christian Weisgerber@21:1/5 to Kenny McCormack on Tue Apr 16 21:57:34 2024
    On 2024-04-16, Kenny McCormack <gazelle@shell.xmission.com> wrote:

    ISTR (which is to say, I can't prove it or point to an example at the
    moment), that there were some systems under some circumstances where if
    bash was copied/linked as "sh" (and then run as "sh" instead of "bash"),
    then it did indeed behave like a plain "POSIX" shell (i.e., extensions
    were disabled).

    From the man page:
    If bash is invoked with the name sh, it tries to mimic the startup
    behavior of historical versions of sh as closely as possible, while
    conforming to the POSIX standard as well. [...]
    When invoked as sh, bash enters posix mode after the startup files are
    read.

    However, it is possible to disable a lot of features at build time.
    In particular, configuring bash with --enable-minimal-config produces
    a much reduced feature set:

    dnl a minimal configuration turns everything off, but features can be
    dnl added individually
    if test $opt_minimal_config = yes; then
        opt_job_control=no opt_alias=no opt_readline=no
        opt_history=no opt_bang_history=no opt_dirstack=no
        opt_restricted=no opt_process_subst=no opt_prompt_decoding=no
        opt_select=no opt_help=no opt_array_variables=no opt_dparen_arith=no
        opt_brace_expansion=no opt_disabled_builtins=no opt_command_timing=no
        opt_extended_glob=no opt_cond_command=no opt_arith_for_command=no
        opt_net_redirs=no opt_progcomp=no opt_separate_help=no
        opt_multibyte=yes opt_cond_regexp=no opt_coproc=no
        opt_casemod_attrs=no opt_casemod_expansions=no opt_extglob_default=no
        opt_translatable_strings=no
        opt_globascii_default=yes
    fi
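
To see what the man page excerpt means in practice, one can run bash under the name "sh" and look at the result. A rough sketch, assuming a bash with the default (not minimal) configuration is on PATH; the temporary directory and the variable names are only for illustration:

```sh
bash_path=$(command -v bash) || exit 1
tmp=$(mktemp -d) || exit 1

# Make bash reachable under the name "sh".
ln -s "$bash_path" "$tmp/sh"

# Report whether posix mode is on, then use an array anyway.
result=$("$tmp/sh" -c '
    case $SHELLOPTS in
        *posix*) printf "posix " ;;
        *)       printf "default " ;;
    esac
    a=(x y)
    printf "%s\n" "${a[1]}"
')
printf '%s\n' "$result"

rm -rf "$tmp"
```

On a default build this should report posix mode and still accept the array syntax; a build made with --enable-minimal-config (opt_array_variables=no) would be expected to reject the array assignment.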

    --
    Christian "naddy" Weisgerber naddy@mips.inka.de

  • From Lawrence D'Oliveiro@21:1/5 to Christian Weisgerber on Wed Apr 17 00:52:26 2024
    On Tue, 16 Apr 2024 11:11:16 -0000 (UTC), Christian Weisgerber wrote:

    On 2024-04-15, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

    There is a tool ...

    Or just use a basic POSIX shell. Such things exist, you know.

    ... ShellCheck that among other things can be used to warn
    about unportable code in shell scripts. https://www.shellcheck.net/

    I don’t see any mention of “unportability”, only about “bugs”.

    Just for fun, I tried this:

    #!/bin/bash

    collect_expand()
    {
        local -n arr="$2"
        arr=()
        coproc expander { find . -maxdepth 1 -name "$1" -print0; }
        # must ensure while-loop runs in this process
        while read -u ${expander[0]} -rd '' line; do
            arr[${#arr[*]}]="$line"
        done
        wait $expander_PID
    } # collect_expand

    #+
    # Mainline
    #-

    test_filenames=('file 1.dat' $'file number\n2.dat' $'file\t3 dat')
    # test multiple spaces in a row and newlines, among any other odd
    # things you can think of!

    tmpdir=$(mktemp -d -p '' collect-work.XXXXXXXXXX)
    echo "tmpdir =" $(printf %q "$tmpdir")

    cd "$tmpdir"
    for f in "${test_filenames[@]}"; do
        echo "create" $(printf %q "$f")
        touch "$f"
    done

    collect_expand '*dat' found_filenames
    for f in "${found_filenames[@]}"; do
        echo "found" $(printf %q "$f")
    done

    cd
    rm -rfv "$tmpdir"

    and it came back with

    Line 9:
    while read -u ${expander[0]} -rd '' line; do
    ^-- SC2086 (info): Double quote to prevent globbing and word splitting.

    Did you mean: (apply this, apply all SC2086)
    while read -u "${expander[0]}" -rd '' line; do

    Line 12:
    wait $expander_PID
    ^-- SC2154 (warning): expander_PID is referenced but not assigned.
    ^-- SC2086 (info): Double quote to prevent globbing and word splitting.

    Did you mean: (apply this, apply all SC2086)
    wait "$expander_PID"

    Line 24:
    echo "tmpdir =" $(printf %q "$tmpdir")
    ^-- SC2046 (warning): Quote this to prevent word splitting.

    Line 26:
    cd "$tmpdir"
    ^-- SC2164 (warning): Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

    Did you mean: (apply this, apply all SC2164)
    cd "$tmpdir" || exit

    Line 28:
    echo "create" $(printf %q "$f")
    ^-- SC2046 (warning): Quote this to prevent word splitting.

    Line 33:
    for f in "${found_filenames[@]}"; do
    ^-- SC2154 (warning): found_filenames is referenced but not assigned.

    Line 34:
    echo "found" $(printf %q "$f")
    ^-- SC2046 (warning): Quote this to prevent word splitting.

    Line 37:
    cd
    ^-- SC2164 (warning): Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

    Did you mean: (apply this, apply all SC2164)
    cd || exit

    I would have to count every single one of those messages as spurious.

  • From Kenny McCormack@21:1/5 to ldo@nz.invalid on Wed Apr 17 09:02:36 2024
    In article <uvn6ga$17j5g$3@dont-email.me>,
    Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Tue, 16 Apr 2024 11:11:16 -0000 (UTC), Christian Weisgerber wrote:

    On 2024-04-15, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

    There is a tool ...

    Or just use a basic POSIX shell. Such things exist, you know.

    ... ShellCheck that among other things can be used to warn
    about unportable code in shell scripts. https://www.shellcheck.net/

    I don’t see any mention of “unportability”, only about “bugs”.

    ShellCheck reads the #! line and figures out which flavor of shell you
    are using, but this can be overridden by a command line switch. So,
    presumably, you could tell it to parse your bash script as if it were
    plain old "POSIX" (i.e., crippled) sh.

    Just for fun, I tried this:
    ...
    I would have to count every single one of those messages as spurious.

    Yes, ShellCheck complains about a lot of things, and most of its complaints
    can and should be ignored. I still find it interesting and useful, but you
    have to take (most of) what it says with a (big) grain of salt.
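
The dialect override mentioned above can be tried as follows. This is a sketch under assumptions: it uses ShellCheck's -s (--shell) option, the three-line sample script is invented for the test, and the block falls back to a note when shellcheck is not installed.

```sh
# Write a tiny bash script that uses an array (not defined by POSIX sh).
script=$(mktemp) || exit 1
cat >"$script" <<'EOF'
#!/bin/bash
arr=(one two)
echo "${arr[0]}"
EOF

if command -v shellcheck >/dev/null 2>&1; then
    # -s sh overrides the dialect that would be inferred from the #! line.
    # shellcheck exits non-zero when it has findings, hence the "|| true".
    report=$(shellcheck -s sh "$script" 2>&1) || true
else
    report="shellcheck not installed; nothing checked"
fi
printf '%s\n' "$report"

rm -f "$script"
```

Checked as sh, the array lines should draw portability complaints that do not appear when the same file is checked as bash.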

    --
    The randomly chosen signature file that would have appeared here is more
    than 4 lines long. As such, it violates one or more Usenet RFCs. In order
    to remain in compliance with said RFCs, the actual sig can be found at the
    following URL:
    http://user.xmission.com/~gazelle/Sigs/WeekendAwayFromHome

  • From Christian Weisgerber@21:1/5 to Lawrence D'Oliveiro on Wed Apr 17 14:23:35 2024
    On 2024-04-17, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    ... ShellCheck that among other things can be used to warn
    about unportable code in shell scripts. https://www.shellcheck.net/

    I don’t see any mention of “unportability”, only about “bugs”.

    https://www.shellcheck.net/wiki/
    The long list of items also has just short of sixty about things
    that are undefined in POSIX sh.

    --
    Christian "naddy" Weisgerber naddy@mips.inka.de
