Issue with printf %Lb "\0200" in UTF-8 locales #14

stephane-chazelas · 2016-04-27T21:12:30Z

printf %Lb "\0200"

In UTF-8 locales, this seems to print random areas of memory.

In ksh93u on Debian amd64 (from package):

$ ksh -c 'printf %Lb "\0200"' | wc -c
18564
$ ksh -c 'printf %Lb "\0200"' | wc -c
18972

With ksh93v- (built from beta git branch), it seems to enter some infinite loop in:

#0  ast_mbrchar (w=0x7ffc6404c664 L"", s=0x25b781d23d41 "", n=16, q=0x7ffc6404c7d0) at src/lib/libast/comp/setlocale.c:2188
#1  0x0000000000574583 in sfvprintf (f=0x841ec0 <_Sfstdout>, form=0x25b781d23d33 "", args=0x7ffc64051838) at src/lib/libast/sfio/sfvprintf.c:744
#2  0x0000000000566b67 in sfprintf (f=0x841ec0 <_Sfstdout>, form=0x5bddef "%!") at src/lib/libast/sfio/sfprintf.c:48
#3  0x0000000000492ec3 in b_print (argc=-1, argv=0x25b781d23c10, context=0x7ffc64051ab0) at src/cmd/ksh93/bltins/print.c:350
#4  0x00000000004925ea in b_printf (argc=3, argv=0x25b781d23c00, context=0x8433f0 <sh+1392>) at src/cmd/ksh93/bltins/print.c:150
#5  0x0000000000472692 in sh_exec (shp=0x7ffc6404c664, t=0x25b781d23d41, flags=5) at src/cmd/ksh93/sh/xec.c:1387
#6  0x0000000000416cad in exfile (shp=0x7ffc6404c664, iop=0x25b781d23d41, fno=16) at src/cmd/ksh93/sh/main.c:610
#7  0x0000000000416065 in sh_main (ac=3, av=0x7ffc640522e8, userinit=0x0) at src/cmd/ksh93/sh/main.c:382
#8  0x0000000000415192 in main (argc=3, argv=0x7ffc640522e8) at src/cmd/ksh93/sh/pmain.c:45

floam · 2016-09-07T09:42:27Z

I ran into this. It actually starts right after \0176, at \0177 (0x7F).

floam · 2017-09-30T10:24:55Z

I believe Apple fixed this with: https://opensource.apple.com/source/ksh/ksh-25/patches/src__lib__libast__sfio__sfvprintf.c.diff.auto.html

siteshwar · 2018-03-23T14:43:49Z

The patch by Apple seems to work around a bug that is present somewhere else in the code, but I will take it anyway. Thanks, @floam, for pointing to the fix. I have opened a pull request for it.

krader1961 · 2018-03-23T23:05:12Z

Note that the "fix" is really a hack that only affects printf: it ignores invalid UTF-8 sequences when printf iterates over the string to be printed. The real problem is that invalid UTF-8 sequences aren't handled consistently, by being either ignored or replaced with the U+FFFD replacement character code point. If you switch to the C locale the example works fine. I'll bet $100 there are other places in the code which do not correctly handle invalid UTF-8 strings.
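To make the "replace with U+FFFD" option concrete, here is a minimal sketch in C. It is not code from ksh or libast: the function names utf8_valid_len() and utf8_sanitize() are hypothetical, and emitting one U+FFFD per invalid byte is just one of several reasonable policies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: length of the valid UTF-8 sequence starting at s
 * (with n bytes available), or 0 if the bytes there are not valid UTF-8. */
size_t utf8_valid_len(const unsigned char *s, size_t n)
{
    size_t len, i;
    unsigned long cp, min;

    if (n == 0)
        return 0;
    if (s[0] < 0x80)
        return 1;                                   /* plain ASCII */
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; cp = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; cp = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; cp = s[0] & 0x07; min = 0x10000; }
    else
        return 0;               /* stray continuation or invalid lead byte */

    if (n < len)
        return 0;               /* sequence is truncated */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;           /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    if (cp < min)
        return 0;               /* overlong encoding */
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;               /* UTF-16 surrogate */
    if (cp > 0x10FFFF)
        return 0;               /* beyond the Unicode range */
    return len;
}

/* Hypothetical helper: copy in[0..n) to out, writing U+FFFD (EF BF BD)
 * for each invalid byte. Returns the number of bytes written and stops
 * early rather than writing past cap. */
size_t utf8_sanitize(const unsigned char *in, size_t n,
                     unsigned char *out, size_t cap)
{
    static const unsigned char repl[3] = { 0xEF, 0xBF, 0xBD };
    size_t i = 0, o = 0, k;

    assert(in != NULL && out != NULL);
    while (i < n) {
        size_t len = utf8_valid_len(in + i, n - i);
        const unsigned char *src = len ? in + i : repl;
        size_t cnt = len ? len : sizeof repl;

        if (o + cnt > cap)
            break;
        for (k = 0; k < cnt; k++)
            out[o++] = src[k];
        i += len ? len : 1;     /* always advance, so the loop terminates */
    }
    return o;
}
```

With the three input bytes 'a', 0x80, 'b' (the \0200 from this issue between two ASCII letters), utf8_sanitize() writes five bytes: 'a', the three-byte encoding of U+FFFD, and 'b'. The important property is that the scan always advances by at least one byte, so an invalid sequence can neither stall the loop nor skew later offsets.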

Running 'printf %Lb "\0200"' drops ksh into an infinite loop. This commit fixes it. Thanks to Apple for this patch. Resolves: #14

ksh segfaults in job_chksave after receiving SIGCHLD
https://bugs.launchpad.net/ubuntu/+source/ksh/+bug/1697501

Eric Desrochers wrote on 2017-06-12:

[Impact]

* The compiler optimization dropped parts of the ksh job locking mechanism from the binary code. As a consequence, ksh could terminate unexpectedly with a segmentation fault after it received the SIGCHLD signal.

[Test Case]

Unfortunately, there is no clear and easy way to reproduce the segfault.

* But the original reporter of this bug can randomly reproduce the problem using an in-house ksh script that only works inside his infrastructure, as follows: "ksh <in-house-script.ksh>". Once in a while, ksh will then segfault as follows:

(gdb) bt
#0 job_chksave (pid=pid@entry=19003) at /build/ksh-6IEHIC/ksh-93u+20120801/src/cmd/ksh93/sh/jobs.c:1948
#1 0x00000000004282ab in job_reap (sig=17) at /build/ksh-6IEHIC/ksh-93u+20120801/src/cmd/ksh93/sh/jobs.c:428
#2 <signal handler called>
...

[Regression Potential]

* Regression risk: low/none expected. The package has been intensively tested by a user who runs over 18M ksh scripts a day on each of their clusters.
[...]
* The fix was written by RH and has been proven to work for them for the last 3 years.
* A test package including the RH fix has been intensively tested and verified (pre-SRU) by an affected user, with positive feedback, using a reproducer that segfaults without the RH patch.
* Test package (pre-SRU) feedback: https://bugs.launchpad.net/ubuntu/xenial/+source/ksh/+bug/1697501/comments/7

[Other Info]

* Details about the RH bug:
- https://bugzilla.redhat.com/show_bug.cgi?id=1123467
- https://bugzilla.redhat.com/show_bug.cgi?id=1112306
- https://access.redhat.com/solutions/1253243
- http://rhn.redhat.com/errata/RHBA-2014-1015.html
- ksh.spec
* Fri Jul 25 2014 Michal Hlavinka <email address hidden> - 20120801-10.8
* job locking mechanism did not survive compiler optimization (#1123467)
- patch
* ksh-20120801-locking.patch

Debian bug: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=867181

[Original Description]

# gdb
[New LWP 3882]
Core was generated by `/bin/ksh <KSH_SCRIPT>.ksh'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 job_chksave (pid=pid@entry=19385) at /build/ksh-6IEHIC/ksh-93u+20120801/src/cmd/ksh93/sh/jobs.c:1948
1948 if(jp->pid==pid)
(gdb) p *jp
Cannot access memory at address 0xb
(gdb) p *jp->pid
Cannot access memory at address 0x13
(gdb) p pid
$2 = 19385
(gdb) p *jpold
$1 = {next = 0xb, pid = -604008960, exitval = 11124}

The struct is corrupted at some point, judging by the values of the next, pid and exitval struct members, which are not valid data.

# assembly code
=> 0x0000000000427159 <+41>: cmp %edi,0x8(%rdx)
(gdb) p $edi ## pid variable
$1 = 19385
(gdb) p *($rdx + 8) ## jp->pid struct
Cannot access memory at address 0x13

-- ksh is segfaulting because it can't access struct "jp" ($rdx) and thus cannot dereference the struct member "jp->pid" ($rdx + 8) at line src/cmd/ksh93/sh/jobs.c:1948 when checking whether jp->pid is equal to the pid ($edi) variable.

The more I think about it, the more it seems obvious that commit 07cc71b (PR att#14) is quite simply a workaround for a GCC optimiser bug, and (who knows?) possibly an old, long-fixed one, as the bug report is years old.

The commit also caused ksh to fail to build on HP-UX B.11.11 with GCC 4.2.3 (hosted at polarhome.com), because it doesn't have __sync_fetch_and_add() and __sync_fetch_and_sub(). It may fail on other systems. The GCC documentation says these are legacy: https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html

HELP WANTED: what I would like best is if someone could come up with some way of detecting this optimiser bug and then error out with a message along the lines of "please upgrade your broken compiler". It would probably need to be a new iffe test.

Meanwhile, let's try it this way for a while and see what happens:

src/cmd/ksh93/include/jobs.h:
- Restore original ksh version of job_lock()/job_unlock() macros.
- Use the workaround version only if the compiler has the builtins __sync_fetch_and_add() and __sync_fetch_and_sub().
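The "use the workaround only if the compiler has the builtins" idea can be sketched roughly as follows. This is an illustration, not the contents of jobs.h: DEMO_HAS_SYNC_BUILTINS, demo_in_critical, demo_lock(), demo_unlock() and demo_depth() are all hypothetical names, and the preprocessor check via __has_builtin is an assumption standing in for a configure-time probe such as an iffe test. The real job_unlock() also does more than decrement a counter.

```c
#include <assert.h>

/* Hypothetical feature macro. Compilers that do not provide __has_builtin
 * simply take the fallback branch below. */
#if defined(__has_builtin)
#  if __has_builtin(__sync_fetch_and_add) && __has_builtin(__sync_fetch_and_sub)
#    define DEMO_HAS_SYNC_BUILTINS 1
#  endif
#endif
#ifndef DEMO_HAS_SYNC_BUILTINS
#  define DEMO_HAS_SYNC_BUILTINS 0
#endif

/* Hypothetical nesting counter standing in for the job lock state. */
static volatile int demo_in_critical;

#if DEMO_HAS_SYNC_BUILTINS
/* Workaround flavour: atomic builtins that the optimiser cannot drop. */
#  define demo_lock()   ((void)__sync_fetch_and_add(&demo_in_critical, 1))
#  define demo_unlock() ((void)__sync_fetch_and_sub(&demo_in_critical, 1))
#else
/* Original flavour: plain increment and decrement. */
#  define demo_lock()   ((void)(demo_in_critical++))
#  define demo_unlock() ((void)(demo_in_critical--))
#endif

/* Current nesting depth of the demo lock. */
int demo_depth(void)
{
    assert(demo_in_critical >= 0);
    return demo_in_critical;
}
```

Either branch gives the same observable behaviour: two demo_lock() calls followed by one demo_unlock() leave demo_depth() at 1. The difference is only in whether the build depends on the legacy __sync builtins being present.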

In ksh93u- 2010-08-11, the mbwidth() macro was changed so that it returns -1 for a control character (which has no width) or an invalid multibyte character. But the uses of mbwidth() in sfvprintf.c were not updated to check for this. As a result, byte offsets were corrupted, causing something like 'printf %Lb "\0200"' to intermittently output garbage.

src/lib/libast/sfio/sfvprintf.c:
- Add missing checks for negative mbwidth() result.
- Gets rid of all the #ifdef mbwidth directives, as we are working with a known version of libast.

This fix extends and somewhat refactors a patch from Apple: https://opensource.apple.com/source/ksh/ksh-25/patches/src__lib__libast__sfio__sfvprintf.c.diff.auto.html

Related: att#14
Resolves: #544
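The failure mode described in this commit message, adding a width that may be -1 without checking it, can be illustrated with a small self-contained sketch. It does not use libast: demo_width() is a hypothetical stand-in for mbwidth() that works on single bytes, and the two column counters are illustrative, not the code in sfvprintf.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a width lookup such as mbwidth(): returns 1
 * for printable ASCII and -1 for control characters and for bytes that
 * are not valid on their own (0x80 and above). */
int demo_width(unsigned char c)
{
    return (c >= 0x20 && c < 0x7f) ? 1 : -1;
}

/* Buggy pattern: the width is added without checking its sign, so every
 * control or invalid byte silently subtracts from the running total. */
long columns_unchecked(const unsigned char *s, size_t n)
{
    long cols = 0;
    size_t i;

    assert(s != NULL);
    for (i = 0; i < n; i++)
        cols += demo_width(s[i]);
    return cols;
}

/* Fixed pattern: a negative width contributes nothing to the total. */
long columns_checked(const unsigned char *s, size_t n)
{
    long cols = 0;
    size_t i;

    assert(s != NULL);
    for (i = 0; i < n; i++) {
        int w = demo_width(s[i]);
        if (w > 0)
            cols += w;
    }
    return cols;
}
```

For the four bytes 'a', 0x80, 0x7f, 'b', columns_checked() returns 2, while columns_unchecked() returns 0 because the two -1 results cancel out the two printable characters. When such a total is later used as an offset or length, as the commit message describes, the result is corrupted output.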

dannyweldon added the bug label Mar 5, 2017

siteshwar mentioned this issue Nov 30, 2017

path.sh test fails on macOS with a call to abort() #169

Closed

siteshwar mentioned this issue Mar 23, 2018

Fix an infinite loop in sfvprintf() function #446

Merged

siteshwar closed this as completed in #446 Mar 24, 2018

siteshwar added a commit that referenced this issue Mar 24, 2018

Fix an infinite loop in sfvprintf() function

550afb3

JohnoKing mentioned this issue Sep 27, 2022

printf %Lb "\0200" prints random areas of memory ksh93/ksh#544

Closed
