• [gentoo-user] Cat a binary = system crash?

    From Mitchell Dorrell@21:1/5 to All on Fri May 9 04:20:01 2025
    This is not a bug report and I'm not really seeking assistance, I'm just inviting discussion because... this shouldn't be able to happen, right?

    Earlier today, I opened a terminal using urxvt, then initiated an SSH connection to a remote machine. On the remote machine, I ran a command
    roughly like this (but as a one-liner):

    for d in path1 path2 path3; do
    files=$(find $d -not -type d -exec readlink -f {} \; | sort -u);
    for f in $files; do
    cat $f | tr ' ' '\n' | pipe_through_sed_and_grep_etc;
    done;
    done

    ... which caused grep to mention finding some matches inside binary data
    via stdin. After (insufficiently) adding to the pipeline to filter the
    output down to just the matching strings, I added '-a' to the grep
    commands, hit enter, briefly saw some junk printed to the terminal, and
    then my screen went black and I noticed that my power LED was dark.

    There are 468 null bytes in /var/log/messages at the crash time.
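
    For anyone who wants to check their own log for the same symptom, here is a minimal sketch of how such a count can be obtained. The helper name `count_nul` is hypothetical (not part of any package); the log path is the one mentioned above, and reading it may require elevated privileges.

```shell
# count_nul FILE: print the number of NUL (0x00) bytes in FILE.
# Hypothetical helper name, used here only for illustration.
count_nul() {
    # tr -cd deletes every byte that is NOT a NUL; wc -c counts what remains.
    LC_ALL=C tr -cd '\000' < "$1" | wc -c
}
```

    Usage would look like `count_nul /var/log/messages`. A run of NUL bytes at the end of a log is commonly seen after an unclean power loss, so it is consistent with a hard crash but does not identify the cause.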

    Neither urxvt, nor bash, nor ssh were running as root, and I'm pretty
    sure I had rebooted since my last @world update, so there shouldn't be
    any outdated libraries in play.

    Userspace applications shouldn't be able to crash the system, right?

    I haven't tried to reproduce it yet. I'm in no hurry to deliberately
    crash my daily-driver, but since I know the bug report might be
    important, I'll try it anyway when I can.

    As an aside, this laptop has been having trouble resuming from
    hibernation, but that's been a problem for a while now. I doubt it's
    related.

    Any thoughts?
    -MD



    Machine details:

    uname -rp: 6.14.4-gentoo-dist AMD Ryzen 7 7840U w/ Radeon 780M Graphics
    Portage profile: default/linux/amd64/23.0/split-usr/desktop
    Global USE flags:
    "dist-kernel pulseaudio sqlite vaapi vdpau xinerama -gtk -qt5 -wayland" VIDEO_CARDS="amdgpu radeonsi radeon" (Why all three? I don't remember.)

    Packages of interest:
    sys-kernel/gentoo-kernel-6.14.4 USE="-initramfs"
    x11-base/xorg-server-21.1.16
    x11-drivers/xf86-video-amdgpu-23.0.0
    x11-drivers/xf86-video-ati-22.0.0
    x11-terms/rxvt-unicode-9.31-r3 USE="24-bit-color"
    app-shells/bash-5.2_p37
    net-misc/openssh-9.9_p2-r3

    NOTE: I apply an extra (maybe no-longer-needed) patch to rxvt-unicode: "0001-Revert-rxvt-unicode-screen.C-to-rxvt-unicode-9.30-st.patch"
    (found here: https://bugs.archlinux.org/task/77062)

    Based on Xorg.log, only the amdgpu driver is being loaded, not the ati.

    Loaded graphics-related kernel modules: amdgpu, amdxcp, amdxdna, cec, drm_buddy, drm_client_lib, drm_display_helper, drm_exec, drm_kms_helper, drm_panel_backlight_quirks, drm_shmem_helper, drm_suballoc_helper, drm_ttm_helper, gpu_sched, i2c_algo_bit, ttm, video, wmi, wmi_bmof

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Nate Eldredge@21:1/5 to All on Fri May 9 04:30:01 2025
    On May 8, 2025, at 20:12, Mitchell Dorrell <mwd@psc.edu> wrote:

    This is not a bug report and I'm not really seeking assistance, I'm just inviting discussion because... this shouldn't be able to happen, right?

    Right.

    Unless you can reproduce it, I don't think we can reject the "null hypothesis" that the crash was caused by something unrelated (e.g. hardware problem) that just coincidentally happened to occur during this particular task.

  • From Eli Schwartz@21:1/5 to Mitchell Dorrell on Fri May 9 05:00:01 2025

    On 5/8/25 10:12 PM, Mitchell Dorrell wrote:
    This is not a bug report and I'm not really seeking assistance, I'm just inviting discussion because... this shouldn't be able to happen, right?

    Earlier today, I opened a terminal using urxvt, then initiated an SSH connection to a remote machine. On the remote machine, I ran a command roughly like this (but as a one-liner):

    for d in path1 path2 path3; do
    files=$(find $d -not -type d -exec readlink -f {} \; | sort -u);
    for f in $files; do
    cat $f | tr ' ' '\n' | pipe_through_sed_and_grep_etc;
    done;
    done

    ... which caused grep to mention finding some matches inside binary data
    via stdin. After (insufficiently) adding to the pipeline to filter the
    output down to just the matching strings, I added '-a' to the grep
    commands, hit enter, briefly saw some junk printed to the terminal, and
    then my screen went black and I noticed that my power LED was dark.

    There are 468 null bytes in /var/log/messages at the crash time.

    Neither urxvt, nor bash, nor ssh were running as root, and I'm pretty
    sure I had rebooted since my last @world update, so there shouldn't be
    any outdated libraries in play.

    Userspace applications shouldn't be able to crash the system, right?


    I would say that this is an almost fallacious way to look at things,
    honestly. urxvt is a userspace application, so it "can't" crash the
    system, no matter what I do with it... right? Even if I run `sudo /usr/sbin/crashsystem`, it's running in a userspace application, what
    can it do really?

    Userspace applications have to make use of kernel facilities for
    everything they do, such as displaying graphics on the screen. A not-entirely-uncommon cause of system crashes is bugs being triggered in
    a GPU driver.

    That's deeply trusted code running at a higher permission level than
    merely sudo. Of course, it "should" be designed not to mishandle bad
    data, and for the most part, drivers do a good job of that. But things
    happen. It's a valid possibility. :)
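
    Independently of what caused this particular crash, raw binary data can be kept away from the terminal emulator by filtering it before display. A minimal sketch, assuming only POSIX `tr`; `printable_only` is a hypothetical helper name, not an existing tool:

```shell
# printable_only: copy stdin to stdout, dropping every byte except
# tab, newline, carriage return, and printable ASCII (space through ~).
# Hypothetical helper; escape (0x1b), NUL, and other control bytes are removed,
# so the terminal never receives raw escape sequences from binary data.
printable_only() {
    LC_ALL=C tr -cd '\11\12\15\40-\176'
}
```

    It would go at the very end of a pipeline, e.g. `grep -a pattern somefile | printable_only`. Note that it also strips non-ASCII text such as UTF-8, which is a deliberate simplification here.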


    --
    Eli Schwartz

  • From Mitchell Dorrell@21:1/5 to eschwartz@gentoo.org on Fri May 9 06:50:01 2025
    On Thu, May 8, 2025, 22:23 Nate Eldredge <nate@thatsmathematics.com> wrote:

    Unless you can reproduce it, I don't think we can reject the "null
    hypothesis" that the crash was caused by something unrelated (e.g. hardware problem) that just coincidentally happened to occur during this particular task.


    I fully agree. I haven't decided whether I'm hoping I can reproduce it, or whether I'm hoping I can't.

    On Thu, May 8, 2025, 22:59 Eli Schwartz <eschwartz@gentoo.org> wrote:

    I would say that this is an almost fallacious way to look at things, honestly. urxvt is a userspace application, so it "can't" crash the system, no matter what I do with it... right? Even if I run `sudo /usr/sbin/crashsystem`, it's running in a userspace application, what can
    it do really?


    I disagree. Without a setuid binary or my password, it would be a
    major problem if a userspace application were allowed to crash the system. If
    a buggy application can do so accidentally, then a malicious application
    can do so deliberately.

    Userspace applications have to make use of kernel facilities for everything
    they do, such as displaying graphics on the screen. A not-entirely-uncommon cause of system crashes is bugs being triggered in a GPU driver.


    Yes, I forgot to mention that. I specifically included the details about
    the GPU driver and kernel modules because I'm guessing that urxvt (or maybe Xorg) triggered a bug in the GPU driver. I suppose terminal beeps could
    trigger a bug in an audio driver, but audio drivers always seemed more
    stable to me.

    -MD



  • From Mitchell Dorrell@21:1/5 to All on Wed May 14 04:50:01 2025
    Brief follow up:

    Since the other day, my system has been showing other symptoms of a buggy/unstable GPU driver, but I will still attempt to reproduce the crash (perhaps this weekend) for good measure.

    -MD
