ksh93: "$*" joins positional parameters on the first byte of $IFS instead of first character #13
On the release version (2012-08-01), expanding "$*" joins the positional parameters on the first byte of $IFS instead of its first character. This appears to be fixed on the current beta (2014-12-24), at least as compiled on Linux. On the release version, the behaviour is as follows:
Note the missing quote after […]. Even using a subshell doesn't avoid the corruption (yay non-forking subshells). However, setting and then unsetting […]
libexec/modernish/cap/BUG_MULTIBIFS.t:
- Added. We're on a UTF-8 locale and the shell supports UTF-8 characters in general (i.e. we don't have BUG_MULTIBYTE) -- however, using multibyte characters as IFS field delimiters still doesn't work. For example, "$*" joins positional parameters on the first byte of $IFS instead of the first character. Found on ksh93 and mksh. Ref.: att/ast#13
  (On ksh93, only "$*" is affected; on mksh, multibyte IFS characters don't work in any context. I'm not bothering with separate bug tests; if multibyte IFS characters are broken for "$*" they shouldn't be used at all.)

README.md:
- document it
Closes att#13. Previously, the `varsub` method used for the macro expansion of `$param`, `${param}`, and `${param op word}` would incorrectly expand the internal field separator (IFS) if it was a multibyte character. This was due to truncation based on the incorrect assumption that the IFS would never be larger than a single byte. This change fixes this issue by carefully tracking the number of bytes that should be persisted in the IFS case and ensuring that all bytes are written during expansion and substitution.
This commit fixes BUG_MULTIBIFS, which had two bug reports in the ksh2020 branch. The modernish regression test suite now only reports eight test failures.

src/cmd/ksh93/sh/macro.c:
- Backport Eric Scrivner's fix for multibyte IFS characters (slightly modified for compatibility with C89). Explanation from att#737:
    Previously, the varsub method used for the macro expansion of $param, ${param}, and ${param op word} would incorrectly expand the internal field separator (IFS) if it was a multibyte character. This was due to truncation based on the incorrect assumption that the IFS would never be larger than a single byte. This change fixes this issue by carefully tracking the number of bytes that should be persisted in the IFS case and ensuring that all bytes are written during expansion and substitution.
  Bug report: att#13
- Fixed another bug that caused multibyte characters with the same initial byte to be treated as the same character by the IFS. This bug was occurring because the first byte of a multibyte character wasn't being written to the stack when the IFS delimiter had the same initial byte:
    $ IFS=£
    $ v='§'
    $ set -- $v
    $ v="${1-}"
    $ echo "$v" | hd
    # The first byte should be c2, but it isn't due to the bug
    00000000  a7 0a  |..|
    00000002
  Bug report: att#1372

src/cmd/ksh93/tests/variables.sh:
- Add (reworked) regression tests from ksh2020 for the multibyte IFS bugs.
- Add a regression test for att#1372 based on the reproducer.
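The att#1372 reproducer above can be wrapped in a small script for checking a given shell. This is only a sketch: the helper name `first_field` is hypothetical and not taken from the commit or its regression tests, and it assumes the script is saved as UTF-8 and run in a UTF-8 locale. Per the commit message, a shell without the bug should keep the leading `c2` byte of `§`.

```shell
#!/bin/sh
# first_field SEP VALUE (hypothetical helper): split VALUE on SEP through an
# unquoted expansion and print the first resulting field. The function body
# is a subshell, so the caller's IFS, options and positional parameters are
# left alone.
first_field() (
    IFS=$1
    set -f        # disable globbing for the unquoted expansion below
    set -- $2     # field splitting happens here
    printf '%s' "${1-}"
)

# £ is c2 a3 and § is c2 a7 in UTF-8: same initial byte, different characters.
# Dump the bytes of the first field so a dropped c2 byte is visible.
first_field '£' '§' | od -An -tx1
```

Dumping bytes with `od` rather than printing the character keeps the result readable even when the terminal cannot render a lone continuation byte.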
Add support for multibyte characters to $IFS

This commit fixes BUG_MULTIBIFS, which had two bug reports in the ksh2020 branch.

src/cmd/ksh93/sh/macro.c:
- Backport Eric Scrivner's fix for multibyte IFS characters (slightly modified for compatibility with C89). Explanation from att#737:
    Previously, the varsub method used for the macro expansion of $param, ${param}, and ${param op word} would incorrectly expand the internal field separator (IFS) if it was a multibyte character. This was due to truncation based on the incorrect assumption that the IFS would never be larger than a single byte. This change fixes this issue by carefully tracking the number of bytes that should be persisted in the IFS case and ensuring that all bytes are written during expansion and substitution.
  Bug report: att#13
- Fixed another bug that caused multibyte characters with the same initial byte to be treated as the same character by the IFS. This bug was occurring because the first byte of a multibyte character wasn't being written to the stack when the IFS delimiter had the same initial byte:
    $ IFS=£
    $ v='§'
    $ set -- $v
    $ v="${1-}"
    $ echo "$v" | hd
    # The first byte should be c2, but it isn't due to the bug
    00000000  a7 0a  |..|
    00000002
  Bug report: att#1372

src/cmd/ksh93/tests/variables.sh:
- Add (reworked) regression tests from ksh2020 for the multibyte IFS bugs.
- Add a regression test for att#1372 based on the reproducer.
The following is quoted from Marcin Cieślak [*]:
    When running under FreeBSD /bin/sh (and not ksh) we get spurious file named '=' created in the root. This is because the "checksh" function runs /bin/sh -c '(( .sh.version >= 20111111 ))' which produces a "=" file with /bin/sh as a side effect.

This bug was reported in att#13, but was closed in error. I was still getting the "=" file to generate on FreeBSD.

bin/package, src/cmd/INIT/package.sh:
- Fix the creation of a spurious '=' file by making sure /bin/sh has support for (( ... )) arithmetic.

.gitignore:
- Remove the '=' file entry since it no longer has a purpose.

[*]: https://bsd.network/@saper/103196289917156347
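For illustration, here is one way such a guard could look. It is a sketch and not necessarily what the commit does: the function names `has_arith` and `checksh_safe` are hypothetical. The idea is that a shell without `(( ... ))` parses it as nested subshells, so `>=` becomes a redirection to a file named `=`; probing first with an expression that contains no `>` cannot create a file.

```shell
#!/bin/sh
# has_arith SHELL (hypothetical): succeed if SHELL understands (( ... )).
# The probe has no '>' in it, so a shell that treats "((" as two nested
# subshells just fails on an unknown command instead of redirecting output.
has_arith() {
    "$1" -c '(( 1 ))' >/dev/null 2>&1
}

# checksh_safe SHELL (hypothetical): run the ksh version check only after
# the arithmetic probe has passed.
checksh_safe() {
    has_arith "$1" &&
    "$1" -c '(( .sh.version >= 20111111 ))' >/dev/null 2>&1
}

if checksh_safe /bin/sh; then
    echo "/bin/sh looks like a recent enough ksh93"
else
    echo "/bin/sh is not a suitable ksh93"
fi
```

The probe relies on no command named `1` being on the PATH, which seems a safe assumption in practice.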
Expected
:é:
(3a c3 a9 3a 0a)
POSIX says it must be the first character, not the first byte: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_05_02
bash and zsh use the first character.
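A byte-level check along these lines can be used to compare shells. It is a sketch under assumptions: the original reproducer is not preserved here, so the choice of `é` as IFS and `:` `:` as positional parameters is inferred from the expected output above, and the helper names `hex_bytes` and `join_with_ifs` are made up for the example. It also assumes a UTF-8 locale and a UTF-8 encoded script.

```shell
#!/bin/sh
# hex_bytes STRING (hypothetical helper): print the bytes of STRING as
# space-separated hex pairs, e.g. "3a c3 a9 3a".
hex_bytes() {
    printf '%s' "$1" | od -An -tx1 | tr '\n' ' ' | tr -s ' ' | sed 's/^ //; s/ $//'
}

# join_with_ifs SEP ARGS... (hypothetical helper): join ARGS via "$*" with
# IFS set to SEP. The subshell body keeps the caller's IFS untouched.
join_with_ifs() (
    IFS=$1
    shift
    printf '%s' "$*"
)

# Joining ":" and ":" on é should give ":é:", i.e. 3a c3 a9 3a. A shell
# that joins on the first byte of IFS yields a truncated sequence instead.
got=$(hex_bytes "$(join_with_ifs 'é' : :)")
if [ "$got" = "3a c3 a9 3a" ]; then
    echo "joined on the first character of IFS"
else
    echo "possible BUG_MULTIBIFS, got bytes: $got"
fi
```

Comparing hex bytes instead of the printed string avoids depending on how the terminal displays an incomplete multibyte sequence.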