Add support for multibyte characters to $IFS #92
Merged
This pull request fixes BUG_MULTIBIFS, which had two bug reports in the ksh2020 branch. The modernish regression test suite now only reports eight test failures.
Backport Eric Scrivner's fix for multibyte IFS characters (slightly modified for compatibility with C89). Explanation from att/ast#737 ("Fix expansion of multibyte IFS characters"):
Bug report: att/ast#13 ("ksh93: "$*" joins positional parameters on the first byte of $IFS instead of first character")
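To make the reported behavior concrete, here is a minimal C89-style sketch of joining fields on the first character of an IFS string rather than on its first byte. It is not the code from macro.c: `utf8_char_len` and `join_on_first_char` are hypothetical helper names, the output goes to a plain buffer instead of the AST stack, and the character length comes from a hand-rolled UTF-8 check rather than the locale-aware multibyte routines ksh uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: byte length of the UTF-8 character starting at s.
 * Invalid or truncated sequences are treated as a single byte. */
static size_t utf8_char_len(const char *s)
{
    unsigned char c = (unsigned char)s[0];
    size_t len, i;

    if (c < 0x80)
        return 1;
    else if ((c & 0xE0) == 0xC0)
        len = 2;
    else if ((c & 0xF0) == 0xE0)
        len = 3;
    else if ((c & 0xF8) == 0xF0)
        len = 4;
    else
        return 1;
    /* Every trailing byte must have the form 10xxxxxx. */
    for (i = 1; i < len; i++)
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            return 1;
    return len;
}

/* Hypothetical helper: join argv[0..argc-1] into out, separated by the
 * first CHARACTER of ifs (no separator if ifs is empty).
 * Returns 0 on success, -1 if out is too small. */
static int join_on_first_char(char *out, size_t outsz,
                              const char *const *argv, int argc,
                              const char *ifs)
{
    size_t seplen = ifs[0] ? utf8_char_len(ifs) : 0;
    size_t used = 0;
    int i;

    assert(out != NULL && outsz > 0);
    for (i = 0; i < argc; i++) {
        size_t n = strlen(argv[i]);
        size_t need = n + (i > 0 ? seplen : 0);

        if (used + need + 1 > outsz)
            return -1;
        if (i > 0) {
            /* Copy the whole character; copying only ifs[0] would leave
             * a dangling lead byte in the output. */
            memcpy(out + used, ifs, seplen);
            used += seplen;
        }
        memcpy(out + used, argv[i], n);
        used += n;
    }
    out[used] = '\0';
    return 0;
}
```

With the fields `a`, `b`, `c` and an IFS whose first character is the two-byte sequence `\xC3\xA9`, the joined result contains both bytes between each pair of fields. A join on `ifs[0]` alone would emit only `\xC3`, which is not valid UTF-8.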
Fixed another bug that caused multibyte characters with the same initial byte to be treated as the same character during IFS handling. The bug occurred because the initial byte wasn't written to the stack when the IFS delimiter and a multibyte character shared the same initial byte:
The code in question was skipping past the initial byte with `continue`:
ksh/src/cmd/ksh93/sh/macro.c, lines 2403 to 2406 in 8c16f38
This problem is fixed by putting the initial byte on the stack with `sfputc` before the `continue`.
Bug report: att/ast#1372 ("IFS: UTF-8 support is incomplete")
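The shape of this second fix can be illustrated with a small standalone sketch. It is an assumption-laden model, not the macro.c code: `utf8_char_len` and `delims_to_spaces` are hypothetical names, a plain output buffer stands in for the stack, and a direct array store stands in for `sfputc`. The function scans byte by byte and replaces each occurrence of a multibyte delimiter with a space. When a byte matches the delimiter's lead byte but the full character differs, the lead byte is written before the `continue`; dropping that write would lose the byte, so two characters sharing a lead byte would no longer be distinguishable in the output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: byte length of the UTF-8 character starting at s.
 * Invalid or truncated sequences are treated as a single byte. */
static size_t utf8_char_len(const char *s)
{
    unsigned char c = (unsigned char)s[0];
    size_t len, i;

    if (c < 0x80)
        return 1;
    else if ((c & 0xE0) == 0xC0)
        len = 2;
    else if ((c & 0xF0) == 0xE0)
        len = 3;
    else if ((c & 0xF8) == 0xF0)
        len = 4;
    else
        return 1;
    for (i = 1; i < len; i++)
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            return 1;
    return len;
}

/* Hypothetical helper: copy in to out, replacing each occurrence of the
 * first character of delim with a single space.
 * Returns 0 on success, -1 if out is too small. */
static int delims_to_spaces(char *out, size_t outsz,
                            const char *in, const char *delim)
{
    size_t dlen = utf8_char_len(delim);
    size_t used = 0;
    const char *p = in;

    assert(out != NULL && outsz > 0 && delim[0] != '\0');
    while (*p) {
        if (used + 2 > outsz)   /* room for one byte plus the final NUL */
            return -1;
        if (*p == delim[0]) {
            if (strncmp(p, delim, dlen) == 0) {
                /* Full delimiter character matched. */
                out[used++] = ' ';
                p += dlen;
                continue;
            }
            /* Same lead byte, different character: emit the lead byte
             * before continuing. Skipping this write is the bug pattern
             * described above. */
            out[used++] = *p++;
            continue;
        }
        out[used++] = *p++;
    }
    out[used] = '\0';
    return 0;
}
```

With the delimiter `\xC3\xA9` and an input that also contains `\xC3\xA3`, the first sequence becomes a space while the second is copied through intact, including its `\xC3` lead byte.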