Bug 235

Summary: ash: incorrect word splitting with read builtin
Product: Busybox Reporter: Harald van Dijk <truedfx>
Component: Standard Compliance Assignee: unassigned
Status: RESOLVED FIXED    
Severity: enhancement CC: busybox-cvs
Priority: P5    
Version: 1.13.x   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Host: Target:
Build:
Attachments: Fix
Fix for the problem described in comment #6

Description Harald van Dijk 2009-03-30 17:31:16 UTC
The read builtin in ash performs word splitting differently depending on whether the first character in $IFS is a space.

printf 'a\t\tb\tc\n' | busybox ash -c 'IFS=$(printf "\t") read a b c; echo ".$a. .$b. .$c."'
.a. .. .b	c.
printf 'a\t\tb\tc\n' | busybox ash -c 'IFS=$(printf " \t") read a b c; echo ".$a. .$b. .$c."'
.a. .b. .c.
printf 'a,,b,c\n' | busybox ash -c 'IFS="," read a b c; echo ".$a. .$b. .$c."'
.a. .. .b,c.
printf 'a,,b,c\n' | busybox ash -c 'IFS=" ," read a b c; echo ".$a. .$b. .$c."'
.a. .b. .c.

This isn't right. Whether consecutive separator characters collapse into a single field separator depends on whether those characters are themselves whitespace, not on whether a space is present in $IFS. The current behaviour differs from how IFS is handled during ordinary word splitting, from what SUSv3 specifies, and from other shells, so I'm going to assume it is unintentional.

For reference, here's bash's behaviour:

printf 'a\t\tb\tc\n' | bash -c 'IFS=$(printf "\t") read a b c; echo ".$a. .$b. .$c."'
.a. .b. .c.
printf 'a\t\tb\tc\n' | bash -c 'IFS=$(printf " \t") read a b c; echo ".$a. .$b. .$c."'
.a. .b. .c.
printf 'a,,b,c\n' | bash -c 'IFS="," read a b c; echo ".$a. .$b. .$c."'
.a. .. .b,c.
printf 'a,,b,c\n' | bash -c 'IFS=" ," read a b c; echo ".$a. .$b. .$c."'
.a. .. .b,c.
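
The four cases above can be collected into a small regression check. This is only a sketch: split3 and SHELL_UNDER_TEST are hypothetical names, not part of busybox or its test suite, and the expected values in the comments are the bash results quoted above.

```shell
#!/bin/sh
# Sketch of a regression check for the cases in this report.
# split3 and SHELL_UNDER_TEST are hypothetical names (assumptions).

tab=$(printf '\t')

# split3 IFS_VALUE INPUT_LINE
# Pipes INPUT_LINE into `read a b c` in the shell under test, with IFS set
# to IFS_VALUE for that read only, and prints the three resulting fields.
split3() {
    printf '%s\n' "$2" |
        IFS_VALUE=$1 ${SHELL_UNDER_TEST:-sh} -c \
            'IFS=$IFS_VALUE read a b c; echo ".$a. .$b. .$c."'
}

split3 "$tab"  "a${tab}${tab}b${tab}c"   # expected: .a. .b. .c.
split3 " $tab" "a${tab}${tab}b${tab}c"   # expected: .a. .b. .c.
split3 ","     "a,,b,c"                  # expected: .a. .. .b,c.
split3 " ,"    "a,,b,c"                  # expected: .a. .. .b,c.
```

SHELL_UNDER_TEST is expanded unquoted on purpose, so a value such as 'busybox ash' is split into a command and its argument. If it is unset, the check runs against whatever sh is on the PATH.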
Comment 2 Denys Vlasenko 2009-03-31 19:17:14 UTC
Created attachment 223
Fix

Try the attached patch.
Comment 3 Harald van Dijk 2009-03-31 19:48:03 UTC
Nice, I played with it a bit and it didn't break. It gives the results I expect.
Comment 4 Mike Frysinger 2009-03-31 20:20:44 UTC
same bug probably exists in hush ;)
Comment 5 Harald van Dijk 2009-04-01 04:20:12 UTC
hush's read simply doesn't do word splitting at all.
Comment 6 Gene Ruud 2010-01-04 13:03:27 UTC
According to http://www.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_05 (provided I interpret it correctly) the following

printf '\t,\ta\t,\tb\tc' | ash -c 'IFS=$(printf " \t,") read a b c d; echo ".$a. .$b. .$c. .$d."'

should result in: .. .a. .b. .c. However, in version 1.15.3 it results in: .a. .b. .c. ..

The fix patch from this ticket introduced the bug. A patch with a fix and some test cases is attached.
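
The case in this comment can be checked the same way. Again a sketch only: split4 and SHELL_UNDER_TEST are hypothetical names, and the expected value is the one this comment derives from the standard (leading IFS whitespace is ignored, then the first comma delimits an empty field). Unlike the command above, the input here ends with a newline so that read does not hit end-of-file mid-line.

```shell
#!/bin/sh
# Sketch of a check for the leading-separator case.
# split4 and SHELL_UNDER_TEST are hypothetical names (assumptions).

tab=$(printf '\t')

# split4 IFS_VALUE INPUT_LINE
# Like the three-variable case, but reads into four variables so that the
# empty leading field is visible in the output.
split4() {
    printf '%s\n' "$2" |
        IFS_VALUE=$1 ${SHELL_UNDER_TEST:-sh} -c \
            'IFS=$IFS_VALUE read a b c d; echo ".$a. .$b. .$c. .$d."'
}

# IFS is space, tab, comma; the input is <tab>,<tab>a<tab>,<tab>b<tab>c
split4 " ${tab}," "${tab},${tab}a${tab},${tab}b${tab}c"   # expected: .. .a. .b. .c.
```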
Comment 7 Gene Ruud 2010-01-04 13:10:20 UTC
Created attachment 879
Fix for the problem described in comment #6
Comment 8 Denys Vlasenko 2010-01-08 14:45:50 UTC
Fixed in git, thanks!