Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sleep 10 | read hangs interactive shell after ^Z #750

Open
McDutchie opened this issue May 19, 2024 · 3 comments
Open

sleep 10 | read hangs interactive shell after ^Z #750

McDutchie opened this issue May 19, 2024 · 3 comments
Labels
bug Something is not working

Comments

@McDutchie
Copy link

Found on comp.unix.shell:

From: Christian Weisgerber <[email protected]>
Newsgroups: comp.unix.shell
Subject: ksh93: pipelines vs. job control
Date: Mon, 13 May 2024 14:35:32 -0000 (UTC)
Message-ID: <[email protected]>
User-Agent: slrn/1.0.3 (FreeBSD)

... or "how to accidentally hang your ksh93 session".

Bash has this shell option:

    lastpipe
            If set, and job control is not active, the shell runs
            the last command of a pipeline not executed in the
            background in the current shell environment.

The bit about job control sounds like a weird stipulation, but makes
sense once you think about it.  I wonder how ksh93 handles that.
Famously, the AT&T ksh executes the last command of a pipeline in
the current shell:

  $ x=foo
  $ { /bin/sleep 10; echo bar; } | read x
  $ echo $x
  bar

However, a background pipeline runs in a subshell, as documented
in the man page:

  $ x=foo
  $ { /bin/sleep 10; echo bar; } | read x &
  [1]     78726
  $ 
  [1] +  Done                    { /bin/sleep 10; echo bar; } | read x &
  $ echo $x  
  foo

But we are in job control environment.  I can start a pipeline in
the foreground, suspend it, and put it into the background.  ksh93
can't know about that in advance.

  $ { /bin/sleep 10; echo bar; } | read x  
  ^Z

... and now the pipeline is suspended...

    PID  PGID STAT   COMMAND
  71496 71496 S      - ksh93 ksh93
  62980 62980 T+     `-- ksh93 ksh93
   3778 62980 T+p      `-- /bin/sleep 10

... and you're stuck.  There's no shell prompt, so you can't bg or fg
anything, intr (^C) or quit (^\) don't help, and there's no tty
control character to continue a suspended process (group).

You need to take radical measures like hanging up, or sending a
SIGCONT or SIGKILL from a different terminal.

I can't quite tell if this behavior constitutes a bug, but it seems
to follow from the design decision to not run the last command of
a pipeline in a subshell.  And it can trap the unwary.


A more minimal example:

  $ /bin/sleep 10 | read
  ^Z

  PID  PGID STAT   COMMAND
  30796 30796 S      - ksh93 ksh93
  31981 31981 T+p    `-- /bin/sleep 10

-- 
Christian "naddy" Weisgerber                          [email protected]
@lzaoral
Copy link

lzaoral commented Jul 25, 2024

Thank you for the pointers in #763 (comment), @McDutchie! I'd say it is a different bug than #750, tough. The following line seems to cause the problem:

int n = (int)t;

t is a double but inf cannot be properly represented in int so the assignment of n will evaluate -1 on my machine when debugged in GDB:

...
(gdb) p t
$6 = inf
(gdb) p n
$7 = -1
...

This conversion is actually not valid in C as shown by this experiment

#include <math.h>
#include <stdio.h>

int main(void) {
    double d = INFINITY;
    printf("double = %f, int = %d\n", d, (int) d);
}
$ clang -fsanitize=undefined inf.c && ./a.out
inf.c:6:42: runtime error: inf is outside the range of representable values of type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior inf.c:6:42
double = inf, int = 2147483647

and the C99 standard itself:

6.3.1.4 Real floating and integer

When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.50)

@McDutchie
Copy link
Author

Thanks for that. Looks like sleep inf was simply unimplemented and accidentally appeared to work. Also, since the Tv_t members are of type uint32_t, n should match that.

Here is a patch that should fix undefined behaviour and implement sleep inf on C99 and up. Please test:

diff --git a/src/cmd/ksh93/bltins/sleep.c b/src/cmd/ksh93/bltins/sleep.c
index e849ef44b..a899e8279 100644
--- a/src/cmd/ksh93/bltins/sleep.c
+++ b/src/cmd/ksh93/bltins/sleep.c
@@ -29,6 +29,7 @@
 #include	<error.h>
 #include	<errno.h>
 #include	<tmx.h>
+#include	<ast_float.h>
 #include	"builtins.h"
 #include	"FEATURE/time"
 #include	"FEATURE/poll"
@@ -140,9 +141,27 @@ skip:
  */
 void sh_delay(double t, int sflag)
 {
-	int n = (int)t;
+	uint32_t n;
 	Tv_t ts, tx;
 
+#if _lib_fpclassify
+	switch (fpclassify(t))
+	{
+	case FP_NAN:
+		errormsg(SH_DICT,ERROR_exit(1),e_number,"NaN");
+		UNREACHABLE();
+	case FP_INFINITE:
+		ts.tv_sec = 0xFFFFFFFF;  /* uint32_t max */
+		ts.tv_nsec = 0;
+		while (1)
+		{
+			tvsleep(&ts, NULL);
+			if ((sh.trapnote & (SH_SIGSET | SH_SIGTRAP)) || sflag)
+				return;
+		}
+	}
+#endif
+	n = (uint32_t)t;
 	ts.tv_sec = n;
 	ts.tv_nsec = 1000000000 * (t - (double)n);
 	while(tvsleep(&ts, &tx) < 0)

@lzaoral
Copy link

lzaoral commented Jul 29, 2024

Thank you, @McDutchie! I confirm that your patch fixes the issue described in #763 (comment). clock_nanosleep is now called only once and with correct values!

$ strace arch/linux.arm64-64/bin/ksh -c 'sleep inf'
...
rt_sigaction(SIGWINCH, {sa_handler=0x44ee5c, sa_mask=[], sa_flags=SA_INTERRUPT}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGALRM, {sa_handler=0x44ee5c, sa_mask=[], sa_flags=SA_INTERRUPT}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=4294967295, tv_nsec=0}, 

McDutchie added a commit that referenced this issue Nov 17, 2024
The 'sleep' built-in takes a floating-point duration value, so
passing the special float values Inf (infinite) and NaN (not a
number) is also possible. However, the code did not handle these
values specially, resulting in undefined behaviour when the whole
and fractional parts were typecast to integers in sh_delay().

src/cmd/ksh93/bltins/sleep.c:
- b_sleep(): Trigger a "bad number" error if NaN is passed.
- sh_delay(): Handle Inf separately with an infinite loop, which
  can be interrupted as appropriate by a signal.

Discussion:
#750 (comment)
McDutchie added a commit that referenced this issue Nov 17, 2024
The 'sleep' built-in takes a floating-point duration value, so
passing the special float values Inf (infinite) and NaN (not a
number) is also possible. However, the code did not handle these
values specially, resulting in undefined behaviour when the whole
and fractional parts were typecast to integers in sh_delay().

src/cmd/ksh93/bltins/sleep.c:
- b_sleep(): Trigger a "bad number" error if NaN is passed.
- sh_delay(): Handle Inf separately with an infinite loop, which
  can be interrupted as appropriate by a signal.

Discussion:
#750 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is not working
Projects
None yet
Development

No branches or pull requests

2 participants