From c258a04f7abebc97a9dad22b4ab7846264137cd6 Mon Sep 17 00:00:00 2001 From: Johnothan King Date: Tue, 16 Jun 2020 14:44:02 -0700 Subject: [PATCH] Implement a portable fix for SIGCHLD crashes (#18) As previously reported in rhbz#1112306 (https://bugzilla.redhat.com/show_bug.cgi?id=1112306), ksh may crash when receiving SIGCHLD because GCC's optimizer will fail to generate `addl` and `sub` instructions to increment and decrement `job.in_critical` in the `job_subsave` function. This bug *does* occur in GCC 10 with `-O2`, but not `-O1`; it doesn't appear this bug has been fixed. As a reference, here is the relevant debug assembly output of `job_subsave` when KSH is compiled with `CCFLAGS` set to `-g -O1`: 0000000000034c97 : void *job_subsave(void) { 34c97: 53 push %rbx struct back_save *bp = new_of(struct back_save,0); 34c98: bf 18 00 00 00 mov $0x18,%edi 34c9d: e8 34 4a 0a 00 callq d96d6 <_ast_malloc> 34ca2: 48 89 c3 mov %rax,%rbx job_lock(); 34ca5: 83 05 3c 50 13 00 01 addl $0x1,0x13503c(%rip) # 169ce8 *bp = bck; 34cac: 66 0f 6f 05 4c 5a 13 movdqa 0x135a4c(%rip),%xmm0 # 16a700 34cb3: 00 34cb4: 0f 11 00 movups %xmm0,(%rax) 34cb7: 48 8b 05 52 5a 13 00 mov 0x135a52(%rip),%rax # 16a710 34cbe: 48 89 43 10 mov %rax,0x10(%rbx) bp->prev = bck.prev; 34cc2: 48 8b 05 47 5a 13 00 mov 0x135a47(%rip),%rax # 16a710 34cc9: 48 89 43 10 mov %rax,0x10(%rbx) bck.count = 0; 34ccd: c7 05 29 5a 13 00 00 movl $0x0,0x135a29(%rip) # 16a700 34cd4: 00 00 00 bck.list = 0; 34cd7: 48 c7 05 26 5a 13 00 movq $0x0,0x135a26(%rip) # 16a708 34cde: 00 00 00 00 bck.prev = bp; 34ce2: 48 89 1d 27 5a 13 00 mov %rbx,0x135a27(%rip) # 16a710 job_unlock(); 34ce9: 8b 05 f9 4f 13 00 mov 0x134ff9(%rip),%eax # 169ce8 34cef: 83 e8 01 sub $0x1,%eax 34cf2: 89 05 f0 4f 13 00 mov %eax,0x134ff0(%rip) # 169ce8 34cf8: 75 2b jne 34d25 34cfa: 8b 3d ec 4f 13 00 mov 0x134fec(%rip),%edi # 169cec 34d00: 85 ff test %edi,%edi 34d02: 74 21 je 34d25 34d04: c7 05 da 4f 13 00 01 movl $0x1,0x134fda(%rip) # 169ce8 When `-O2` is used instead of `-O1`, the `addl` and `sub` instructions for incrementing and decrementing the lock are removed. GCC instead generates a broken `mov` instruction for `job_lock` and removes the initial `sub` instruction in job_unlock (this is also seen in Red Hat's bug report): job_lock(); *bp = bck; 37d7c: 66 0f 6f 05 7c 79 14 movdqa 0x14797c(%rip),%xmm0 # 17f700 37d83: 00 struct back_save *bp = new_of(struct back_save,0); 37d84: 49 89 c4 mov %rax,%r12 job_lock(); 37d87: 8b 05 5b 6f 14 00 mov 0x146f5b(%rip),%eax # 17ece8 ... job_unlock(); 37dc6: 89 05 1c 6f 14 00 mov %eax,0x146f1c(%rip) # 17ece8 37dcc: 85 c0 test %eax,%eax 37dce: 75 2b jne 37dfb The original patch works around this bug by using the legacy `__sync_fetch_and_add/sub` GCC builtins. This forces GCC to generate instructions that change the lock with `lock addl`, `lock xadd` and `lock subl`: job_lock(); 37d9f: f0 83 05 41 6f 14 00 lock addl $0x1,0x146f41(%rip) # 17ece8 37da6: 01 ... job_unlock(); 37deb: f0 0f c1 05 f5 6e 14 lock xadd %eax,0x146ef5(%rip) # 17ece8 37df2: 00 37df3: 83 f8 01 cmp $0x1,%eax 37df6: 74 08 je 37e00 ... 37e25: 74 11 je 37e38 37e27: f0 83 2d b9 6e 14 00 lock subl $0x1,0x146eb9(%rip) # 17ece8 While this does work, it isn't portable. This patch implements a different workaround for this compiler bug. If `job_lock` is put at the beginning of `job_subsave`, GCC will generate the required `addl` and `sub` instructions: job_lock(); 37d67: 83 05 7a 5f 14 00 01 addl $0x1,0x145f7a(%rip) # 17dce8 ... job_unlock(); 37dbb: 83 e8 01 sub $0x1,%eax 37dbe: 89 05 24 5f 14 00 mov %eax,0x145f24(%rip) # 17dce8 It is odd that moving a single line of code fixes this problem, although GCC _should_ have generated these instructions from the original code anyway. I'll note that this isn't the only way to get these instructions to generate. The problem also seems to go away when inserting almost anything else inside of the code for `job_subsave`. This is just a simple workaround for a strange compiler bug. --- src/cmd/ksh93/sh/jobs.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/src/cmd/ksh93/sh/jobs.c b/src/cmd/ksh93/sh/jobs.c index 9b852b0e65c4..d9fcf1f2a971 100644 --- a/src/cmd/ksh93/sh/jobs.c +++ b/src/cmd/ksh93/sh/jobs.c @@ -1973,8 +1973,14 @@ static int job_chksave(register pid_t pid) void *job_subsave(void) { - struct back_save *bp = new_of(struct back_save,0); + /* + * We must make a lock first before doing anything else, + * otherwise GCC will remove the job locking mechanism + * as a result of compiler optimization. + */ job_lock(); + + struct back_save *bp = new_of(struct back_save,0); *bp = bck; bp->prev = bck.prev; bck.count = 0;