-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Subshell job lost when suspended #390
Comments
Data point: I can reproduce this bug for any external command, but not for built-ins. |
This is the relevant code for how ksh handles SIGTSTP in subshells: Lines 544 to 578 in a38e804
I'll note that at lines 565-566 ksh forks subshells when handling ^Z. Perhaps the problem is that ksh isn't forking the subshell in the correct place. |
I've tried to fork the subshell in every other remotely plausible place I can think of in fault.c and xec.c, but I can't get anything to work properly. So far, the only functional workaround I've found is to not allow virtual subshells in the interactive main shell environment at all. Subshells of subshells can be virtual again. Since this only affects the interactive shell, performance isn't as much of a concern, and it does fix the problem. --- a/src/cmd/ksh93/sh/subshell.c
+++ b/src/cmd/ksh93/sh/subshell.c
@@ -660,6 +660,9 @@ Sfio_t *sh_subshell(Shell_t *shp,Shnode_t *t, volatile int flags, int comsub)
sh_subfork();
}
#endif /* _lib_fchdir */
+ /* Virtual subshells are not safe to suspend (^Z) in the interactive main shell. */
+ if(sh_isstate(SH_INTERACTIVE) && !sh.subshare)
+ sh_subfork();
sh_offstate(SH_INTERACTIVE);
sh_exec(t,flags);
} Even with that, things still break horribly if you use a |
I've noticed that, for command substitutions, other shells (bash, dash, mksh, yash, zsh) simply block Hmm. zsh is also buggy -- if you press ^Z during that, it gets stuck until you press ^C. |
New patch, please try: --- a/src/cmd/ksh93/sh/subshell.c
+++ b/src/cmd/ksh93/sh/subshell.c
@@ -660,6 +660,16 @@ Sfio_t *sh_subshell(Shell_t *shp,Shnode_t *t, volatile int flags, int comsub)
sh_subfork();
}
#endif /* _lib_fchdir */
+#ifdef SIGTSTP
+ /* Virtual subshells are not safe to suspend (^Z, SIGTSTP) in the interactive main shell. */
+ if(sh_isstate(SH_INTERACTIVE))
+ {
+ if(comsub)
+ sigblock(SIGTSTP);
+ else
+ sh_subfork();
+ }
+#endif
sh_offstate(SH_INTERACTIVE);
sh_exec(t,flags);
}
@@ -686,6 +696,10 @@ Sfio_t *sh_subshell(Shell_t *shp,Shnode_t *t, volatile int flags, int comsub)
nv_restore(sp);
if(comsub)
{
+#ifdef SIGTSTP
+ if(savst.states & sh_state(SH_INTERACTIVE))
+ sigrelease(SIGTSTP);
+#endif
/* re-enable job control */
if(!sp->nofork)
sh_offstate(SH_NOFORK);
--- a/src/cmd/ksh93/sh/fault.c
+++ b/src/cmd/ksh93/sh/fault.c
@@ -547,7 +547,7 @@ void sh_exit(register int xno)
sh_offstate(SH_STOPOK);
shp->trapnote = 0;
shp->forked = 1;
- if(!shp->subshell && (sig=sh_fork(shp,0,NIL(int*))))
+ if(sh_isstate(SH_INTERACTIVE) && (sig=sh_fork(shp,0,NIL(int*))))
{
job.curpgid = 0;
job.parent = (pid_t)-1;
@@ -564,7 +564,7 @@ void sh_exit(register int xno)
{
if(shp->subshell)
sh_subfork();
- /* child process, put to sleep */
+ /* script or child process; put to sleep */
sh_offstate(SH_STOPOK);
sh_offstate(SH_MONITOR);
shp->sigflag[SIGTSTP] = 0; edit: version 2 of patch: also update handling in sh_fault(). edit 2: probably safest to leave that |
Related commit: 48ba696 – the fix above depends on that. |
The patch in #390 (comment) works well on my system. It it unfortunate the top level subshell has to be forked in interactive shells, although it probably will not cause a significant performance loss since it doesn't affect command substitutions or nested subshells. That said, is it necessary for the fix to be active in # With '(sleep 3)' added to ~/.kshrc
ksh $ arch/linux.i386-64/bin/ksh
^Z[2] + Stopped <command unknown>
ksh $ fg
<command unknown>
# The fg command above effectively kills the PS1.get discipline function
$ |
Good point. We should exclude |
Profile scripts are executed here but the Also, that Patch version three, please test: --- a/src/cmd/ksh93/sh/main.c
+++ b/src/cmd/ksh93/sh/main.c
@@ -160,9 +160,14 @@ int sh_main(int ac, char *av[], Shinit_f userinit)
else
sh_onoption(SH_BRACEEXPAND);
#endif
- if((beenhere++)==0)
+ if(!beenhere)
{
+ beenhere++;
sh_onstate(SH_PROFILE);
+#ifdef SIGTSTP
+ /* disallow suspending profile scripts */
+ sigblock(SIGTSTP);
+#endif
if(shp->gd->ppid==1)
shp->login_sh++;
if(shp->login_sh >= 2)
@@ -236,6 +241,9 @@ int sh_main(int ac, char *av[], Shinit_f userinit)
sh_source(shp, iop, e_suidprofile);
}
shp->st.cmdname = error_info.id = command;
+#ifdef SIGTSTP
+ sigrelease(SIGTSTP);
+#endif
sh_offstate(SH_PROFILE);
if(rshflag)
sh_onoption(SH_RESTRICTED);
diff --git a/src/cmd/ksh93/sh/subshell.c b/src/cmd/ksh93/sh/subshell.c
index 81ab3ae0..c5af58e8 100644
--- a/src/cmd/ksh93/sh/subshell.c
+++ b/src/cmd/ksh93/sh/subshell.c
@@ -660,6 +660,16 @@ Sfio_t *sh_subshell(Shell_t *shp,Shnode_t *t, volatile int flags, int comsub)
sh_subfork();
}
#endif /* _lib_fchdir */
+#ifdef SIGTSTP
+ /* Virtual subshells are not safe to suspend (^Z, SIGTSTP) in the interactive main shell. */
+ if(sh_isstate(SH_INTERACTIVE))
+ {
+ if(comsub)
+ sigblock(SIGTSTP);
+ else
+ sh_subfork();
+ }
+#endif
sh_offstate(SH_INTERACTIVE);
sh_exec(t,flags);
}
@@ -686,6 +696,10 @@ Sfio_t *sh_subshell(Shell_t *shp,Shnode_t *t, volatile int flags, int comsub)
nv_restore(sp);
if(comsub)
{
+#ifdef SIGTSTP
+ if(savst.states & sh_state(SH_INTERACTIVE))
+ sigrelease(SIGTSTP);
+#endif
/* re-enable job control */
if(!sp->nofork)
sh_offstate(SH_NOFORK);
--- a/src/cmd/ksh93/sh/fault.c
+++ b/src/cmd/ksh93/sh/fault.c
@@ -547,7 +547,7 @@ void sh_exit(register int xno)
sh_offstate(SH_STOPOK);
shp->trapnote = 0;
shp->forked = 1;
- if(!shp->subshell && (sig=sh_fork(shp,0,NIL(int*))))
+ if(sh_isstate(SH_INTERACTIVE) && (sig=sh_fork(shp,0,NIL(int*))))
{
job.curpgid = 0;
job.parent = (pid_t)-1;
@@ -564,7 +564,7 @@ void sh_exit(register int xno)
{
if(shp->subshell)
sh_subfork();
- /* child process, put to sleep */
+ /* script or child process; put to sleep */
sh_offstate(SH_STOPOK);
sh_offstate(SH_MONITOR);
shp->sigflag[SIGTSTP] = 0; |
The above patch makes the situation with $ cat /tmp/env
set -o | grep interactive
sleep 3
$ ENV=/tmp/env arch/linux.i386-64/bin/ksh
interactive on
^ZMemory fault(coredump) EDIT: I should also note that top level subshells fork in the kshrc file after applying either of the patches for this issue: $ cat /tmp/env
echo "Main shell pid: $$"
(echo "Subshell PID: ${.sh.pid}")
true
$ ENV=/tmp/env ksh # Unpatched
Main shell pid: 290991
Subshell PID: 290991
$ ENV=/tmp/env arch/*/bin/ksh # After patch
Main shell pid: 290993
Subshell PID: 290994 |
The cause of that particular problem is these two lines in Lines 388 to 389 in ac0a3a5
eval , and the interactive state should never be on in subshells (reminder that this applies to the interactive state, not the interactive option).
Now why it crashes, I've no idea. :( |
Removing those two lines does allow subshells to avoid forking in kshrc, but it introduces another memory fault (not the same as the sigblock fault, but has similar results with the same reproducer). I've made this patch that instead just checks for --- a/src/cmd/ksh93/sh/subshell.c
+++ b/src/cmd/ksh93/sh/subshell.c
@@ -666,7 +666,7 @@ Sfio_t *sh_subshell(Shell_t *shp,Shnode_t *t, volatile int flags, int comsub)
{
if(comsub)
sigblock(SIGTSTP);
- else
+ else if(!sh_isstate(SH_PROFILE))
sh_subfork();
}
#endif |
I have a new patch that uses diff --git a/src/cmd/ksh93/sh/fault.c b/src/cmd/ksh93/sh/fault.c
index 92146f40..1ba4a38b 100644
--- a/src/cmd/ksh93/sh/fault.c
+++ b/src/cmd/ksh93/sh/fault.c
@@ -547,7 +547,7 @@ void sh_exit(register int xno)
sh_offstate(SH_STOPOK);
shp->trapnote = 0;
shp->forked = 1;
- if(!shp->subshell && (sig=sh_fork(shp,0,NIL(int*))))
+ if(sh_isstate(SH_INTERACTIVE) && (sig=sh_fork(shp,0,NIL(int*))))
{
job.curpgid = 0;
job.parent = (pid_t)-1;
@@ -564,7 +564,7 @@ void sh_exit(register int xno)
{
if(shp->subshell)
sh_subfork();
- /* child process, put to sleep */
+ /* script or child process; put to sleep */
sh_offstate(SH_STOPOK);
sh_offstate(SH_MONITOR);
shp->sigflag[SIGTSTP] = 0;
diff --git a/src/cmd/ksh93/sh/main.c b/src/cmd/ksh93/sh/main.c
index 0ce2d81a..1d732629 100644
--- a/src/cmd/ksh93/sh/main.c
+++ b/src/cmd/ksh93/sh/main.c
@@ -163,6 +163,9 @@ int sh_main(int ac, char *av[], Shinit_f userinit)
if((beenhere++)==0)
{
sh_onstate(SH_PROFILE);
+#ifdef SIGTSTP
+ shp->sigflag[SIGTSTP] |= SH_SIGIGNORE;
+#endif
if(shp->gd->ppid==1)
shp->login_sh++;
if(shp->login_sh >= 2)
@@ -236,6 +239,9 @@ int sh_main(int ac, char *av[], Shinit_f userinit)
sh_source(shp, iop, e_suidprofile);
}
shp->st.cmdname = error_info.id = command;
+#ifdef SIGTSTP
+ shp->sigflag[SIGTSTP] &= ~(SH_SIGIGNORE);
+#endif
sh_offstate(SH_PROFILE);
if(rshflag)
sh_onoption(SH_RESTRICTED);
diff --git a/src/cmd/ksh93/sh/subshell.c b/src/cmd/ksh93/sh/subshell.c
index 81ab3ae0..f76a6168 100644
--- a/src/cmd/ksh93/sh/subshell.c
+++ b/src/cmd/ksh93/sh/subshell.c
@@ -660,6 +660,16 @@ Sfio_t *sh_subshell(Shell_t *shp,Shnode_t *t, volatile int flags, int comsub)
sh_subfork();
}
#endif /* _lib_fchdir */
+#ifdef SIGTSTP
+ /* Virtual subshells are not safe to suspend (^Z, SIGTSTP) in the interactive main shell. */
+ if(sh_isstate(SH_INTERACTIVE))
+ {
+ if(comsub)
+ sigblock(SIGTSTP);
+ else if(!sh_isstate(SH_PROFILE))
+ sh_subfork();
+ }
+#endif
sh_offstate(SH_INTERACTIVE);
sh_exec(t,flags);
}
@@ -686,6 +696,10 @@ Sfio_t *sh_subshell(Shell_t *shp,Shnode_t *t, volatile int flags, int comsub)
nv_restore(sp);
if(comsub)
{
+#ifdef SIGTSTP
+ if(savst.states & sh_state(SH_INTERACTIVE))
+ sigrelease(SIGTSTP);
+#endif
/* re-enable job control */
if(!sp->nofork)
sh_offstate(SH_NOFORK); |
Nice. That works on my end. What's more, in my testing, using The parser should have absolutely no reason to care about that, and it may cause the interactive state to be set in subshells which is wrong, so I do want those lines gone. If that is found to cause breakage elsewhere, then that is breakage that AT&T was papering over (as was their usual MO), and we should fix it properly instead. |
Turning on the
No job control output should be printed for a background process invoked from a subshell, not even after |
Reproducer: $ (sleep 1& echo done) done $ (eval "echo hi"; sleep 1& echo done) hi [1] 30587 done No job control output should be printed for a background process invoked from a subshell, not even after 'eval'. The cause: sh_parse() turns on the shell's interactive state bit (sh_state(SH_INTERACTIVE)) if the interactive shell option is on. This is incorrect. The parser should have no involvement with shell interactivity in principle because that's not its domain. Not only that, the parser may need to run in a subshell, e.g. when executing traps or 'eval' commands (as above). By definition, a subshell can never be interactive. We already fixed many bugs related to job control and the shell's interactive state. Even if these two lines previously papered over some breakage, I can't find any now after simply removing them. If any is found later, then it'll need to be fixed properly instead. Related: #390
Reproducer: run vi in a subshell: $ (vi) vi opens; now press Ctrl+Z to suspend. The output is as expected: [2] + Stopped (vi) …but the exit status is 18 (SIGTSTP's signal number) instead of 0. Now do: $ fg (vi) $ The exit status is 18 again, vi is not resumed, and the job is lost. You have to find vi's pid manually using ps and kill it. Forking all non-command substitution subshells invoked from the interactive main shell is the only reliable and effective fix I've found. I've tried to fork the subshell conditionally in every other remotely plausible place I can think of in fault.c and xec.c, but I can't get anything to work properly. If anyone can get this to work without forking as much (or at all), please do submit a patch or PR that supersedes this fix. At least subshells of subshells don't need to fork, so the performance impact can be limited. Plus, it's not as if most people need maximum speed on the interactive command line. Scripts (including login/profile scripts) are not affected at all. Command substitutions can be handled differently. My testing shows that all shells except ksh93 simply block SIGTSTP (the ^Z signal) while they run. We should do the same, so they don't need to fork. NOTE for any backporters: the subshell.c and fault.c changes depend on commits 35b0262 and 48ba696 to work correctly. src/cmd/ksh93/sh/subshell.c: sh_subshell(): - If the interactive shell state bit is on, then before executing the subshell's code: - for command substitutions, block SIGTSTP; - for other subshells, fork. - For command substitutions, release SIGTSTP if the interactive shell state bit was on upon invoking the subshell. src/cmd/ksh93/sh/fault.c: - Instead of checking for a virtual subshell, check the shell's interactive state bit to decide whether to handle SIGTSTP, as that is only turned on in the interactive main shell. src/cmd/ksh93/sh/main.c: sh_main(): - To avoid bugs, ignore SIGTSTP while running profile scripts. Blocking it doesn't work because delaying it until after sigrelease() will cause a crash. Thanks to @JohnoKing for this. - While we're here, prevent a possible overflow of the 'beenhere' static char variable by only incrementing it once. Co-authored-by: Johnothan King <[email protected]> Resolves: #390
Reproducer: $ (sleep 1& echo done) done $ (eval "echo hi"; sleep 1& echo done) hi [1] 30587 done No job control output should be printed for a background process invoked from a subshell, not even after 'eval'. The cause: sh_parse() turns on the shell's interactive state bit (sh_state(SH_INTERACTIVE)) if the interactive shell option is on. This is incorrect. The parser should have no involvement with shell interactivity in principle because that's not its domain. Not only that, the parser may need to run in a subshell, e.g. when executing traps or 'eval' commands (as above). By definition, a subshell can never be interactive. We already fixed many bugs related to job control and the shell's interactive state. Even if these two lines previously papered over some breakage, I can't find any now after simply removing them. If any is found later, then it'll need to be fixed properly instead. Related: #390
Reproducer: run
vi
in a subshell:vi opens; now press Ctrl+Z to suspend. The output is as expected:
...but the exit status is 18 (SIGTSTP's signal number). That should have the 9th bit set.
Now
The exit status is 18 again, vi is not resumed, and the job is lost. You have to find vi's pid manually using
ps
and kill it.A workaround is to fork the subshell in the first step using
ulimit -t unlimited
.At least in 93u+ 2012-08-01 it's even worse – no job ever registers at all. In ksh2020, the behaviour is identical to 93u+m.
The text was updated successfully, but these errors were encountered: