Hey. This issue was originally found by myself, then further analysed by Harald van Dijk and posted by him to the busybox mailing list at: http://lists.busybox.net/pipermail/busybox/2022-November/090036.html I thought it would be a good idea to open a bug here to keep the issue tracked. Cheers, Chris.
Just for convenience and the records, a verbatim copy of Harald's aforementioned mail: ----------------- Hi, Over on the dash mailing list, Christoph Anton Mitterer reported a segfault in dash when dealing with invalid substitutions: <https://lore.kernel.org/dash/fe9be702c676b43bba8e0136a94583f919a05a66.camel@scientia.org/T/#t> busybox ash being based on dash, despite the segmentation fault not triggering there with the original script, it can be triggered in busybox ash with a different script as well. I am reporting this here as requested in that thread. The below is with a build with 'make defconfig', except CONFIG_DEBUG=y and CONFIG_DEBUG_PESSIMIZE=y. $ gdb --args ./busybox_unstripped sh -c 'f() { echo ${PWD-${PWD!}}; }; f' GNU gdb (GDB) 12.1 Copyright (C) 2022 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <https://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from ./busybox_unstripped... (gdb) run Starting program: /home/harald/busybox/busybox_unstripped sh -c f\(\)\ \{\ echo\ \$\{PWD-\$\{PWD\!\}\}\;\ \}\;\ f [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". Program received signal SIGSEGV, Segmentation fault. 0x00348eab in argstr (p=0x1 <error: Cannot access memory at address 0x1>, flag=1089) at shell/ash.c:6819 6819 if (*p == '~') (gdb) This happens because invalid substitutions (${PWD!}) are encoded using a null byte, but function definitions treat node text as C-style strings terminated by the first null byte, so we end up accessing the duplicated node text past the end of the buffer: (gdb) b shell/ash.c:9148 Breakpoint 1 at 0x34ca03: file shell/ash.c, line 9148. (gdb) run Starting program: /home/harald/busybox/busybox_unstripped sh -c f\(\)\ \{\ echo\ \$\{PWD-\$\{PWD\!\}\}\;\ \}\;\ f [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". Breakpoint 1, copynode (n=0x3923ec) at shell/ash.c:9148 9148 new->narg.text = nodeckstrdup(n->narg.text); (gdb) cont Continuing. Breakpoint 1, copynode (n=0x39240c) at shell/ash.c:9148 9148 new->narg.text = nodeckstrdup(n->narg.text); (gdb) p n->narg.text $1 = 0x3923fc "\202\002PWD=\202" (gdb) step nodeckstrdup (s=0xffffd1c8 "") at shell/ash.c:9059 9059 { (gdb) next 9060 funcstring_end -= SHELL_ALIGN(strlen(s) + 1); (gdb) 9061 return strcpy(funcstring_end, s); (gdb) 9062 } Here, \202 is CTLVAR. We can see that n->narg.text ends in CTLVAR followed by a null byte, and it is copied using strlen() and strcpy(), so any bytes after that null byte will be left out. There are two possible ways of fixing this, depending on the intended behaviour. Nothing has yet been said on the list to definitively know what the dash intended behaviour here is, but regardless, busybox may choose to act now. 1: If the intended behaviour is to raise an error: --- a/shell/ash.c +++ b/shell/ash.c @@ -7465,9 +7465,6 @@ varvalue(char *name, int varflags, int flags, int quoted) int discard = (subtype == VSPLUS || subtype == VSLENGTH) | (flags & EXP_DISCARD); if (!subtype) { - if (discard) - return -1; - ifsfree(); raise_error_syntax("bad substitution"); } This preserves the copynode() behaviour of cutting off the word, but it is okay as now a null byte is guaranteed to terminate the expansion. 2: If the intended behaviour is to ignore the invalid substitution as long as it is skipped: --- a/shell/ash.c +++ b/shell/ash.c @@ -12981,6 +12981,8 @@ parsesub: { synstack->dblquote = newsyn != BASESYNTAX; } + if (subtype == 0) + subtype = VSNUL; ((unsigned char *)stackblock())[typeloc] = subtype; if (subtype != VSNORMAL) { synstack->varnest++; This encodes invalid substitutions using VSNUL, which when masked with VSTYPE will result in 0 like before, but does not result in all bits zero, so does not terminate the string. I know my mail client will have mangled the formatting. Apologies for that. I am not expecting either of these patches to be applied as is anyway, at this point I am more interested in getting the busybox views on whether either fix is wanted now already (before dash acts), and if so, which one. Especially the second one will likely have opportunities to clean up and reduce code size by making sure subtype is already set to VSNUL at this point, rather than 0, meaning it does not need to be patched up here. Cheers, Harald van Dijk
I shall add, that a patch for this issue has recently been posted for dash on their mailing list at: https://lore.kernel.org/dash/Y47ZlpwkQy+jiule@gondor.apana.org.au/ and is scheduled for being merged in their git. It seems that the klibc (which contains an ash/dash based sh implementation that also suffers from this) upstream is likely going to use that patch as well. So that might be of interest for BusyBox, too. Cheers, Chris.
Oh and just one more thing for the records, since dash/klibc have no proper bugtracker, I had reported bugs in against the respective Debian packages: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1024635 https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1024735
The first patch seems to have been buggy, a v2 was posted: https://lore.kernel.org/dash/Y5BTWr28NgVMm8UG@gondor.apana.org.au/