• Arbitrary characters in filenames (was Re: Default PATH setting ...)

    From Janis Papanagnou@21:1/5 to Kaz Kylheku on Mon Jan 27 20:06:42 2025
    On 27.01.2025 19:30, Kaz Kylheku wrote:

    I don't suspect this problem intersects with the issue we are talking
    about, but it's hard to be sure about a negative without doing a bunch
    of work.

    (It's worth a subject switch.)


    The ability to use a Unicode slash in a filename on POSIX systems,
    thanks to UTF-8, is mostly a good thing, in my view.

    I recall an old discussion on the topic, many many years ago.

    I generally avoid, for example, using spaces (and yet more so any
    exotic control characters) in filenames. - It always struck me as
    if it would produce more hassle than [principle] gain. (Thinking
    of 'Backspace', 'Delete', 'CR', 'NL', or even 'Bell', etc.)

    I also think that a fileNAME should not be a fileNOVEL or carry all
    or part of a file's meta-data. For me it would certainly be okay to
    have the 'printable' characters available. - I suppose (given how
    excessive the filenames in regular use are) that many people will
    see that differently.

    Back in those days someone pointed out that it's actually helpful
    to have only a few restrictions ('\0' and '/') on characters; it
    makes it possible to support "non-ASCII file systems" based on that
    underlying primitive design. - That's certainly a valid point.
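
    A rough sketch of that primitive rule, in Python. The helper name
    is made up for illustration, and it checks only the two byte
    restrictions mentioned above; real systems also impose length
    limits and treat "." and ".." specially.

```python
def is_valid_component(name: bytes) -> bool:
    """Check one filename component against the two byte-level rules
    discussed here: no NUL byte and no ASCII slash (hypothetical helper)."""
    return len(name) > 0 and b"\0" not in name and b"/" not in name


# A name in a non-Latin script passes, because UTF-8 encodes every
# non-ASCII character using only bytes >= 0x80.
cyrillic = "файл.txt".encode("utf-8")

# A lookalike slash (U+2215) passes too: its UTF-8 encoding contains
# no 0x2F byte, so the kernel sees an ordinary name character.
lookalike = "a\u2215b".encode("utf-8")

# A real ASCII slash or a NUL byte is rejected.
with_slash = b"a/b"
with_nul = b"a\0b"
```

    The rule is stated on bytes rather than characters, which is why
    it works unchanged for any encoding that leaves 0x00 and 0x2F alone.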

    The point (made upthread) about the non-ASCII slash character gives
    me doubts, though. Wouldn't exploits like the one you constructed in
    the "literal '~'" topic also be possible with "fake" slashes?

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Janis Papanagnou on Mon Jan 27 19:26:43 2025
    On 2025-01-27, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    Back in those days someone pointed out that it's actually helpful
    to have only a few restrictions ('\0' and '/') on characters; it
    makes it possible to support "non-ASCII file systems" based on that
    underlying primitive design. - That's certainly a valid point.

    Today, billions of people around the world have files named using their
    native scripts.

    The point (made upthread) about the non-ASCII slash character gives
    me doubts, though. Wouldn't exploits like the one you constructed in
    the "literal '~'" topic also be possible with "fake" slashes?

    Sure; say the user adds "/home/foo/bin" to their PATH, but somehow
    their editor flips the slashes to the Unicode U+2215 (DIVISION SLASH)
    character. Then the entry is just one relative path component, which
    is susceptible to hijack.

    The user would have to somehow not notice that their /home/foo/bin
    PATH element is not actually working: programs in that directory are
    not being found.
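
    To make the "one relative path component" point concrete, here is a
    small Python sketch. The function name is invented for illustration;
    it only mimics how a POSIX system splits a path on the ASCII slash
    and does not touch the file system.

```python
import posixpath

DIVISION_SLASH = "\u2215"  # looks like "/", but is not a path separator


def path_entry_report(entry):
    """Describe how one PATH entry is interpreted (hypothetical helper)."""
    return {
        # Only a leading ASCII "/" makes a path absolute.
        "absolute": posixpath.isabs(entry),
        # Only ASCII "/" separates components.
        "components": [c for c in entry.split("/") if c],
    }


real = "/home/foo/bin"
fake = real.replace("/", DIVISION_SLASH)

real_report = path_entry_report(real)
fake_report = path_entry_report(fake)

print(real_report)
print(fake_report)
```

    The lookalike entry comes out as non-absolute with a single
    component, so it would be resolved relative to whatever directory
    the user happens to be in.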

    If the shell did something silly, namely map Unicode slashes to
    ASCII equivalents when doing its own processing of PATH, then that
    user would be fooled into thinking that the path component is
    correct.
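
    The opposite of that silly mapping would be to report such entries
    rather than quietly normalize them. A minimal Python sketch follows,
    assuming a colon-separated PATH; the function name and the short
    list of lookalike characters are illustrative choices, not anything
    a real shell is known to do.

```python
import posixpath

# A few characters that resemble "/" (illustrative, not exhaustive).
SLASH_LOOKALIKES = {
    "\u2215": "DIVISION SLASH",
    "\u2044": "FRACTION SLASH",
    "\uff0f": "FULLWIDTH SOLIDUS",
}


def audit_path(path_value):
    """Return (entry, reason) pairs for suspicious PATH entries."""
    findings = []
    for entry in path_value.split(":"):
        if entry == "":
            # A zero-length entry stands for the current directory.
            findings.append((entry, "empty entry (current directory)"))
            continue
        if not posixpath.isabs(entry):
            findings.append((entry, "relative entry"))
        for ch, name in SLASH_LOOKALIKES.items():
            if ch in entry:
                findings.append((entry, f"contains U+{ord(ch):04X} {name}"))
    return findings


for entry, reason in audit_path("/usr/bin:\u2215home\u2215foo\u2215bin"):
    print(f"{entry!r}: {reason}")
```

    Flagging instead of mapping keeps the user's view of PATH in line
    with what the system will actually search.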

    That's kind of what Bash is doing with the tilde; for its own purposes,
    it's turning, in the leading position, a dumb tilde into a smart tilde,
    which we can almost regard as different characters.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
