• Re: ksh - issue with (non-existing) jobs

    From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Sun Jan 26 13:13:37 2025
    On 26.01.2025 03:17, Lawrence D'Oliveiro wrote:
    On Sun, 26 Jan 2025 01:25:22 +0100, Janis Papanagnou wrote:

    I'd guess, though, that it wouldn't have helped ...

    Well, if you explicitly waited for a job it couldn’t find, then you would
    be right. But given it is capable of waiting in general for any job to
    terminate,

    Not sure what you mean here.

    You initiate jobs, say, by something like 'a_command &', the command
    gets registered in a shell table. And at any time you can 'wait' for
    it. The job information (with its actual execution state) stays in
    the job table at least until <Enter> is hit, since the information
    about, e.g., its termination must still be reported.

    it might take a different path through the code that is not
    afflicted by the same state confusion.

    Yes, that can of course be the case - I haven't inspected the source.

    My guess is based just on the typical software architectures. Usually
    you have one place to maintain the jobs, a [shell internal] job table,
    and functions to check existence (registration) and get any required
    attributes for any shell command that works with the registered shell
    jobs. So if the respective shell commands see arguments like %1 they
    would certainly look up those jobs and their attributes in that table
    with that key. It's most likely that 'kill %1' and 'wait %1' would
    both do such a table lookup, certainly _before_ they take the actions
    on the identified job.
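
The lookup-before-action behaviour guessed at above can be observed with 'wait', at least in a healthy shell. This is a minimal sketch, assuming a POSIX-style shell; PID 1 is used only as an example of a process that is never a child of the current shell, and the variable names are illustrative. It does not reproduce the corrupted state discussed in this thread.

```shell
# Start a background job; the shell records it and exposes its PID in $!.
sleep 1 &
pid=$!

# A known job: 'wait' finds it and returns the job's exit status.
wait "$pid"
known_status=$?

# An unknown PID: the lookup fails before any waiting happens.
# POSIX specifies exit status 127 for this case.
wait 1 2>/dev/null
unknown_status=$?

echo "known: $known_status, unknown: $unknown_status"
```

How ksh93u+m implements the lookup internally is not shown here; the sketch only illustrates the externally visible result.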

    If you have another scenario in mind, please elaborate.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to All on Sat Jan 25 16:12:43 2025
    I just noticed in a ksh window that 'jobs' indicates two active jobs.

    [2] + Running a_command $( find -type f ... )
    [1] - Running a_command $( find -type f ... )

    $ kill %1 %2
    kill: %1: no such job
    kill: %2: no such job

    $ pkill a_command # doesn't change the status.

    New jobs consequently get job numbers starting with %3.

    Somehow the ksh instance seems to have missed the job termination,
    and the job management tables seem to have got corrupted.

    Any idea what's going on and how to get rid of those undead jobs?
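
One way to check whether a process behind a listed job still exists is 'kill -0', which sends no signal and only reports whether the target can be signalled. The following is a sketch under assumptions: it uses a throwaway 'sleep' job rather than the a_command from the post, tracks the PID via $! (in an interactive shell 'jobs -l' would show the PIDs), and the variable names are made up for illustration.

```shell
# Throwaway background job standing in for a real command.
sleep 30 &
pid=$!

# kill -0 delivers nothing; success means the process exists.
if kill -0 "$pid" 2>/dev/null; then alive_before=yes; else alive_before=no; fi

# Terminate the job and let the shell reap it.
kill "$pid"
wait "$pid" 2>/dev/null
term_status=$?   # greater than 128 when the job died from a signal

# After reaping, the PID should no longer be signallable.
if kill -0 "$pid" 2>/dev/null; then alive_after=yes; else alive_after=no; fi

echo "before: $alive_before, after: $alive_after, status: $term_status"
```

If 'jobs' still lists an entry whose PID fails the 'kill -0' check, the table entry is stale; that would match the symptom described, though it says nothing about the cause.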

    I'm running Version AJM 93u+m/1.0.8 2024-01-01

    Janis

  • From Lawrence D'Oliveiro@21:1/5 to Janis Papanagnou on Sun Jan 26 00:05:30 2025
    On Sat, 25 Jan 2025 16:12:43 +0100, Janis Papanagnou wrote:

    Any idea what's going on and how to get rid of those undead jobs?

    Does the “wait” command help?

  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Sun Jan 26 01:25:22 2025
    On 26.01.2025 01:05, Lawrence D'Oliveiro wrote:
    On Sat, 25 Jan 2025 16:12:43 +0100, Janis Papanagnou wrote:

    Any idea what's going on and how to get rid of those undead jobs?

    Does the “wait” command help?

    Good idea. - But I cannot check any more; I meanwhile closed and
    re-opened the shell window instance to get rid of it.

    I'd guess, though, that it wouldn't have helped; 'wait' is (like
    'kill') a built-in in my ksh, and 'kill' reported "no such job",
    so I suppose a very early job-existence test already failed. If
    that happens again I'll try to remember to test 'wait' as well.

    Still curious how that happened. - Must be a bug in my ksh93u+m.
    But obviously rare; I don't recall whether I've seen that before.

    Janis

  • From Lawrence D'Oliveiro@21:1/5 to Janis Papanagnou on Sun Jan 26 02:17:01 2025
    On Sun, 26 Jan 2025 01:25:22 +0100, Janis Papanagnou wrote:

    I'd guess, though, that it wouldn't have helped ...

    Well, if you explicitly waited for a job it couldn’t find, then you would
    be right. But given it is capable of waiting in general for any job to
    terminate, it might take a different path through the code that is not
    afflicted by the same state confusion.

  • From Lawrence D'Oliveiro@21:1/5 to Janis Papanagnou on Sun Jan 26 21:58:27 2025
    On Sun, 26 Jan 2025 13:13:37 +0100, Janis Papanagnou wrote:

    You initiate jobs, say, by something like 'a_command &', the command
    gets registered in a shell table. And at any time you can 'wait' for it.

    The “wait” command doesn’t require you to specify which job to wait for.
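
For reference, this is how 'wait' without operands behaves in a healthy shell. It is a minimal sketch assuming a POSIX-style shell; the subshell jobs and variable names are illustrative only, and nothing here shows what ksh93u+m would do with stale job-table entries.

```shell
# Two background jobs with different, non-zero exit statuses.
( sleep 1; exit 3 ) &
( sleep 1; exit 5 ) &

# No operands: wait for every child the shell knows about.
# POSIX specifies exit status 0 here, whatever the children returned.
wait
all_status=$?

# Nothing left to wait for: returns immediately, again with status 0.
wait
none_status=$?

echo "all: $all_status, none: $none_status"
```

Because no job operand is parsed, this form does not need to resolve a %n job spec, which is the "different path" being suggested; whether it would also clear the undead entries remains untested in this thread.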

  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Sun Jan 26 23:15:55 2025
    On 26.01.2025 22:58, Lawrence D'Oliveiro wrote:
    On Sun, 26 Jan 2025 13:13:37 +0100, Janis Papanagnou wrote:

    You initiate jobs, say, by something like 'a_command &', the command
    gets registered in a shell table. And at any time you can 'wait' for it.

    The “wait” command doesn’t require you to specify which job to wait for.

    Yes, so what? - Mind that there's no process/job running any more.

    If you don't name any specific job, what will that then lead to in
    case there's no job to wait for, but there are still job entries in
    the shell-internal jobs table? If there's no signal or information
    that the shell can act on? - I'm not sure what you think would be
    gained then by a call of 'wait' (without arguments).

    I think it's a shell-internal job-table maintenance problem, since
    it says (in case of 'kill') "no such job" but still shows them with
    the 'jobs' command.

    I'm anyway just curious how that inconsistent state might have got
    created. - And whether there's some indication of a bug that could
    get fixed.

    Janis

  • From Lawrence D'Oliveiro@21:1/5 to Janis Papanagnou on Mon Jan 27 00:01:28 2025
    On Sun, 26 Jan 2025 23:15:55 +0100, Janis Papanagnou wrote:

    On 26.01.2025 22:58, Lawrence D'Oliveiro wrote:

    The “wait” command doesn’t require you to specify which job to wait
    for.

    Yes, so what? - Mind that there's no process/job running any more.

    Given it is capable of waiting in general for any job to terminate, it
    might take a different path through the code that is not afflicted by the
    same state confusion.
