From ceae1e44c8ffedb9dfc2fc4eb3432438991717f8 Mon Sep 17 00:00:00 2001
From: Martijn Dekker
Date: Thu, 30 Mar 2023 02:35:10 +0100
Subject: [PATCH] Make anchored empty pattern match in ${parameter/pattern/string}

@ormaaj wrote:
> This is useful for compatibility and I think it's supposed to
> work anyway.
>
> Working in ksh93 v- and u+m:
>
> ${x/~(E)^/y}
>
> Non-working:
>
> ${x/#/y}
> ${x/#~(K)/y}

The current ksh93 behaviour, though unique, /is/ logical. Unlike an
empty regular expression, an empty shell pattern matches nothing.
The non-working reproducers tell ksh to match an empty glob pattern
at #, the beginning of the string (the ~(K) signifying a shell/glob
pattern, which is also the default). Since empty shell patterns
never match, no substitution is done.

Having said that, mksh, bash, yash, and zsh have all special-cased
${var/#/prefix} and ${var/%/suffix}. I think we should do the same,
because the strictly logical behaviour is not meeting user
expectations and the special-casing is useful, particularly in
vector expansions such as ${@/#/prefix} (which will now prefix each
positional parameter).

This change is a potential corner-case incompatibility, so it is
applied only to the dev branch for the future ksh 93u+m/1.1.

(The ${x/#~(K)/y} case still doesn't work after this commit, but
I'm okay with that; it explicitly circumvents the special-casing
and basically says "yes, I absolutely want to match the string
against an empty glob pattern". So that case is not going to break
user expectations.)

src/cmd/ksh93/sh/macro.c:
- When calling strngrpmatch(3) (in the '%' case, via substring()),
  if pattern is empty, pass an ERE anchor (^ or $ prefixed by ~(E))
  to match the beginning or end of the string, as appropriate.
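
For illustration only (a sketch, not part of this patch): the vector
use case described above could look roughly like the following.
decorate_args is a hypothetical helper name invented for this example,
and the sketch assumes a shell that special-cases the anchored empty
pattern (mksh, bash, yash, zsh, or ksh 93u+m/1.1 with this change).

```shell
# Hypothetical helper, not part of the patch: wrap every remaining
# argument in a prefix ($1) and a suffix ($2) using anchored empty
# patterns in ${parameter/pattern/string}.
decorate_args() {
	prefix=$1
	suffix=$2
	shift 2
	# '#' with an empty pattern anchors at the start of each parameter
	set -- "${@/#/$prefix}"
	# '%' with an empty pattern anchors at the end of each parameter
	set -- "${@/%/$suffix}"
	printf '%s\n' "$@"
}

decorate_args '<' '>' one two three
```

On a shell without the special-casing, such as ksh 93u+m/1.0 as
described above, the empty pattern matches nothing and the arguments
would be printed unchanged.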

Resolves: https://github.com/ksh93/ksh/issues/558
---
 NEWS                             |  6 ++++++
 src/cmd/ksh93/COMPATIBILITY      |  6 ++++++
 src/cmd/ksh93/sh/macro.c         | 11 +++++++++--
 src/cmd/ksh93/tests/substring.sh | 19 +++++++++++++++++++
 4 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index 2b8ebc2fb4ad..52b2441cbb98 100644
--- a/NEWS
+++ b/NEWS
@@ -7,6 +7,12 @@ Uppercase BUG_* IDs are shell bug IDs as used by the Modernish shell library.
 - Fixed a bug in 'printf -v' (added on 2021-11-18) where using the %B format
   specifier would overwrite any data already written to the variable.
 
+- [1.1 change] In the ${parameter/pattern/string} search-and-replace
+  expansion, an anchored empty pattern will now match the beginning or the
+  end of the string, so that ${parameter/#/string} will prefix the string to
+  the parameter value and ${parameter/%/string} will append it. This change
+  brings ksh 93u+m/1.1 into line with mksh, bash, yash and zsh.
+
 2023-03-26:
 
 - Fixed an intermittent crash in 'command -x' experienced on some arm/arm64
diff --git a/src/cmd/ksh93/COMPATIBILITY b/src/cmd/ksh93/COMPATIBILITY
index 570574ffb103..907d6eb454ac 100644
--- a/src/cmd/ksh93/COMPATIBILITY
+++ b/src/cmd/ksh93/COMPATIBILITY
@@ -8,6 +8,12 @@ For more details, see the NEWS file and for complete details, see the git log.
 	from the environment. Any script that depends on this will need to be
 	changed to typeset the expected attributes itself.
 
+2.	In the ${parameter/pattern/string} search-and-replace expansion, an
+	anchored empty pattern will now match the beginning or the end of the
+	string, so that ${parameter/#/string} will prefix the string to the
+	parameter value and ${parameter/%/string} will append it. This change
+	brings ksh 93u+m/1.1 into line with mksh, bash, yash and zsh.
+
 ____________________________________________________________________________
 
 		ksh 93u+m vs. ksh 93u+
diff --git a/src/cmd/ksh93/sh/macro.c b/src/cmd/ksh93/sh/macro.c
index e8071f978562..4513be322715 100644
--- a/src/cmd/ksh93/sh/macro.c
+++ b/src/cmd/ksh93/sh/macro.c
@@ -1883,9 +1883,16 @@ static int varsub(Mac_t *mp)
 	oldv = v;
 	nmatch_prev = nmatch;
 	if(c=='%')
-		nmatch=substring(v,tsize,pattern,match,flag&STR_MAXIMAL);
+		nmatch = substring(v, tsize,
+				*pattern ? pattern : "~(E)$",
+				match,
+				flag & STR_MAXIMAL);
 	else
-		nmatch=strngrpmatch(v,vsize,pattern,(ssize_t*)match,elementsof(match)/2,flag|STR_INT);
+		nmatch = strngrpmatch(v, vsize,
+				*pattern ? pattern : "~(E)^",
+				(ssize_t*)match,
+				elementsof(match) / 2,
+				flag | STR_INT);
 	if(nmatch && repstr && !mp->macsub)
 		sh_setmatch(v,vsize,nmatch,match,index++);
 	if(nmatch)
diff --git a/src/cmd/ksh93/tests/substring.sh b/src/cmd/ksh93/tests/substring.sh
index 85d94918df28..8359d86013f7 100755
--- a/src/cmd/ksh93/tests/substring.sh
+++ b/src/cmd/ksh93/tests/substring.sh
@@ -744,5 +744,24 @@ got=${x//~(E:(a)|b)/<\1>}
 exp='<>'
 [[ $got == "$exp" ]] || err_exit "back-reference (got $(printf %q "$got"), expected $(printf %q "$exp"))"
 
+# ======
+# On ksh 93u+m/1.1+, anchored empty pattern should match in replacement, e.g. "${@/#/replacement}"
+# https://github.com/ksh93/ksh/issues/558
+case ${.sh.version} in
+*\ 93u+m/1.0.* )
+	;;
+*\ 93u+m/* )
+	set one two three
+	exp=Xone/Xtwo/Xthree
+	got=$(IFS=/; echo "${*/#/X}")
+	[[ $got == "$exp" ]] || err_exit "#-anchored empty pattern vector replacement" \
+		"(got $(printf %q "$got"), expected $(printf %q "$exp"))"
+	exp=oneX/twoX/threeX
+	got=$(IFS=/; echo "${*/%/X}")
+	[[ $got == "$exp" ]] || err_exit "%-anchored empty pattern vector replacement" \
+		"(got $(printf %q "$got"), expected $(printf %q "$exp"))"
+	;;
+esac
+
 # ======
 exit $((Errors<125?Errors:125))