• Bug#1091521: cron: /usr/sbin/CRON deadlock if libpam-fscrypt is install

    From Pawel Jasiak@21:1/5 to All on Sat Dec 28 00:00:01 2024
    Package: cron
    Version: 3.0pl1-162
    Severity: normal
    X-Debbugs-Cc: debian+bugs@jasiak.dev

    Dear Maintainer,

    * What led up to the situation?

    Installation of libpam-fscrypt and the following crontab after a few
    hours:

    * * * * * touch /foo
    * * * * * touch /foo
    # ... repeat more times to increase probability of deadlock
    * * * * * touch /foo


    * What was the outcome of this action?

    # ps auxf

    root 432 0.0 0.2 6608 2732 ? Ss 22:14 0:00 /usr/sbin/cron -f
    root 4823 0.0 1.4 1090900 14112 ? Sl 22:21 0:00 \_ /usr/sbin/CRON -f
    root 5174 0.0 1.2 960148 12016 ? Ss 22:21 0:00 | \_ /usr/sbin/CRON -f
    root 8095 0.0 1.4 1091156 14180 ? Sl 22:25 0:00 \_ /usr/sbin/CRON -f
    root 8337 0.0 1.2 1025940 12032 ? Ss 22:25 0:00 | \_ /usr/sbin/CRON -f
    root 12228 0.0 1.4 1090904 13868 ? Sl 22:30 0:00 \_ /usr/sbin/CRON -f
    root 12519 0.0 1.2 1025688 12036 ? Ss 22:30 0:00 \_ /usr/sbin/CRON -f


    /usr/sbin/CRON processes are deadlocked (in pam_fscrypt.so) and
    resources are not released until manual intervention.


    * What outcome did you expect instead?

    I expected that touch /foo will be executed and CRON processes will
    exit.

    * More debugging:

    My understanding of ithe relevant cron control flow:
    * [cron] decides that it's time to execute entry X
    * [cron] fork is called
    * [child] prepares environment
    * [child] fork is called
    * [child] calls wait and waits for [grandchild] death
    * [grandchild] continues to prepare environment
    * [grandchild] execve final command

    Let's focus on the following pids: 4823 [child], 5174 [grandchild]:

    # cat /proc/4823/task/*/stack
    [<0>] do_wait+0x160/0x2f0
    [<0>] kernel_wait4+0xb4/0x160
    [<0>] __do_sys_wait4+0xa2/0xb0
    [<0>] do_syscall_64+0x55/0xb0
    [<0>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8

    [<0>] futex_wait_queue+0x60/0x90
    [<0>] futex_wait+0x185/0x270
    [<0>] do_futex+0x106/0x1b0
    [<0>] __x64_sys_futex+0x8e/0x1d0
    [<0>] do_syscall_64+0x55/0xb0
    [<0>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8

    [<0>] futex_wait_queue+0x60/0x90
    [<0>] futex_wait+0x185/0x270
    [<0>] do_futex+0x106/0x1b0
    [<0>] __x64_sys_futex+0x8e/0x1d0
    [<0>] do_syscall_64+0x55/0xb0
    [<0>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8

    [<0>] futex_wait_queue+0x60/0x90
    [<0>] futex_wait+0x185/0x270
    [<0>] do_futex+0x106/0x1b0
    [<0>] __x64_sys_futex+0x8e/0x1d0
    [<0>] do_syscall_64+0x55/0xb0
    [<0>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8

    [<0>] futex_wait_queue+0x60/0x90
    [<0>] futex_wait+0x185/0x270
    [<0>] do_futex+0x106/0x1b0
    [<0>] __x64_sys_futex+0x8e/0x1d0
    [<0>] do_syscall_64+0x55/0xb0
    [<0>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8

    [<0>] futex_wait_queue+0x60/0x90
    [<0>] futex_wait+0x185/0x270
    [<0>] do_futex+0x106/0x1b0
    [<0>] __x64_sys_futex+0x8e/0x1d0
    [<0>] do_syscall_64+0x55/0xb0
    [<0>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8

    Based on source code futex (and threads that are calling futex) are not
    part of cron codebase.

    # gdb -p 4823
    (gdb) thread apply all bt

    Thread 6 (Thread 0x7f619ffff6c0 (LWP 5167) "cron"):
    #0 0x00007f61d20bbea3 in ?? () from /lib/x86_64-linux-gnu/security/pam_fscrypt.so
    #1 0x00007f61d20867f6 in ?? () from /lib/x86_64-linux-gnu/security/pam_fscrypt.so
    #2 0x00007f61d240df58 in ?? () from /lib/x86_64-linux-gnu/security/pam_fscrypt.so
    #3 0x0000000000000080 in ?? ()
    #4 0x0000000000000000 in ?? ()

    Thread 5 (Thread 0x7f61a4ffd6c0 (LWP 5162) "cron"):
    #0 0x00007f61d20bbea3 in ?? () from /lib/x86_64-linux-gnu/security/pam_fscrypt.so
    #1 0x00007f61d20867f6 in ?? () from /lib/x86_64-linux-gnu/security/pam_fscrypt.so
    #2 0x000000c000038d48 in ?? ()
    #3 0x0000000000000080 in ?? ()
    #4 0x0000000000000000 in ?? ()

    Thread 4 (Thread 0x7f61a57fe6c0 (LWP 5085) "cron"):
    #0 0x00007f61d20bbea3 in ?? () from /lib/x86_64-linux-gnu/security/pam_fscrypt.so
    #1 0x00007f61d20867f6 in ?? () from /lib/x86_64-linux-gnu/security/pam_fscrypt.so
    #2 0x000000c000038948 in ?? ()
    #3 0x0000000000000080 in ?? ()
    #4 0x0000000000000000 in ?? ()

    Thread 3 (Thread 0x7f61a5fff6c0 (LWP 5084) "cron"):
    #0 0x00007f61d20bbea3 in ?? () from /lib/x86_64-linux-gnu/security/pam_fscrypt.so
    #1 0x00007f61d2086871 in ?? () from /lib/x86_64-linux-gnu/security/pam_fscrypt.so
    #2 0x00007f61d23df9b8 in ?? () from /lib/x86_64-linux-gnu/security/pam_fscrypt.so
    #3 0x0000000000000080 in ?? ()
    #4 0x00007f61a5ffec38 in ?? ()
    #5 0x0000000000000000 in ?? ()

    Thread 2 (Thread 0x7f61d1fff6c0 (LWP 4967) "cron"):
    #0 0x00007f61d20bbea3 in ?? () from /lib/x86_64-linux-gnu/security/pam_fscrypt.so
    #1 0x00007f61d20867f6 in ?? () from /lib/x86_64-linux-gnu/security/pam_fscrypt.so
    #2 0x00007f61d23df608 in ?? () from /lib/x86_64-linux-gnu/security/pam_fscrypt.so
    #3 0x0000000000000080 in ?? ()
    #4 0x0000000000000000 in ?? ()

    Thread 1 (Thread 0x7f61d2985840 (LWP 4823) "cron"):
    #0 0x00007f61d2b30bd7 in __GI___wait4 (pid=-1, stat_loc=0x7ffe3ad2aae4, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
    #1 0x000055a12c53bbda in ?? ()
    #2 0x000055a12c53c675 in ?? ()
    #3 0x000055a12c53b678 in ?? ()
    #4 0x000055a12c538d05 in ?? ()
    #5 0x00007f61d2a8424a in __libc_start_call_main (main=main@entry=0x55a12c538860, argc=argc@entry=2, argv=argv@entry=0x7ffe3ad2c588) at ../sysdeps/nptl/libc_start_call_main.h:58
    #6 0x00007f61d2a84305 in __libc_start_main_impl (main=0x55a12c538860, argc=2, argv=0x7ffe3ad2c588, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffe3ad2c578) at ../csu/libc-start.c:360
    #7 0x000055a12c538df1 in ?? ()

    gdb confirms they are part of pam_fscrypt.so. I guess it's a golang
    runtime.

    # cat /proc/5174/task/*/stack
    [<0>] futex_wait_queue+0x60/0x90
    [<0>] futex_wait+0x185/0x270
    [<0>] do_futex+0x106/0x1b0
    [<0>] __x64_sys_futex+0x8e/0x1d0
    [<0>] do_syscall_64+0x55/0xb0
    [<0>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8

    # gdb -p 5174
    Thread 1 (Thread 0x7f61d2985840 (LWP 5174) "cron"):
    #0 0x00007f61d20bbea3 in ?? () from /lib/x86_64-linux-gnu/security/pam_fscrypt.so
    #1 0x00007f61d20867f6 in ?? () from /lib/x86_64-linux-gnu/security/pam_fscrypt.so
    #2 0x000000c000038148 in ?? ()
    #3 0x0000000000000080 in ?? ()
    #4 0x0000000000000000 in ?? ()

    grandchild has only one thread and it also waits in pam_fscrypt.so.

    For the remaining deadlocked CRON processes results are the same.

    In do_command.c both child [4823] and grandchild [5174] enters pam. My
    theory is that on first enter to pam we start golang runtime and on
    second enter if we forked when child runtime was in "bad" state
    grandchild will just deadlock waiting for action from missing threads
    (the child process is created with a single thread - the one that
    called fork()).

    The easy workaround that works for me is to modify pam config for cron (/etc/pam.d/cron):

    # The PAM configuration file for the cron daemon

    @include common-auth # this line needs to be replaced with content of
    # common-auth without pam_fscrypt.so

    # Sets the loginuid process attribute
    session required pam_loginuid.so

    # Read environment variables from pam_env's default files, /etc/environment
    # and /etc/security/pam_env.conf.
    session required pam_env.so

    # In addition, read system locale information
    session required pam_env.so envfile=/etc/default/locale

    @include common-account
    @include common-session-noninteractive

    # Sets up user limits, please define limits for cron tasks
    # through /etc/security/limits.conf
    session required pam_limits.so

    The same issue exist in sid - this report was prepared on clean stable installation.

    As an alternative I tried with cronie where everything works fine -
    I didn't find pam enter in grandchild.

    Tested on:
    stable:
    cron version: 3.0pl1-162
    libpam-fscrypt version: 0.3.3-1+b6
    sid:
    cron version: 3.0pl1-189
    libpam-fscrypt version: 0.3.5-1
    cronie: 1.7.2-3


    -- Package-specific info:
    --- EDITOR:


    --- /usr/bin/editor:
    /usr/bin/nano

    --- /usr/bin/crontab:
    -rwxr-sr-x 1 root crontab 43648 Mar 2 2023 /usr/bin/crontab

    --- /var/spool/cron:
    drwxr-xr-x 3 root root 4096 Dec 20 19:38 /var/spool/cron

    --- /var/spool/cron/crontabs:
    drwx-wx--T 2 root crontab 4096 Dec 27 22:30 /var/spool/cron/crontabs

    --- /etc/cron.d:
    drwxr-xr-x 2 root root 4096 Dec 20 19:49 /etc/cron.d

    --- /etc/cron.daily:
    drwxr-xr-x 2 root root 4096 Dec 20 19:52 /etc/cron.daily

    --- /etc/cron.hourly:
    drwxr-xr-x 2 root root 4096 Dec 20 19:38 /etc/cron.hourly

    --- /etc/cron.monthly:
    drwxr-xr-x 2 root root 4096 Dec 20 19:38 /etc/cron.monthly

    --- /etc/cron.weekly:
    drwxr-xr-x 2 root root 4096 Dec 20 19:52 /etc/cron.weekly


    -- System Information:
    Debian Release: 12.8
    APT prefers stable-updates
    APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable')
    Architecture: amd64 (x86_64)

    Kernel: Linux 6.1.0-28-amd64 (SMP w/2 CPU threads; PREEMPT)
    Locale: LANG=en_US.UTF-8, LC_CTY