Bug 235 - ash: incorrect word splitting with read builtin
Status: RESOLVED FIXED
Alias: None
Product: Busybox
Classification: Unclassified
Component: Standard Compliance
Version: 1.13.x
Hardware: PC Linux
Importance: P5 enhancement
Target Milestone: ---
Assignee: unassigned
 
Reported: 2009-03-30 17:31 UTC by Harald van Dijk
Modified: 2010-01-30 23:14 UTC

Attachments
Fix (3.12 KB, patch)
2009-03-31 19:17 UTC, Denys Vlasenko
Fix for the problem described in comment #6 (1.61 KB, patch)
2010-01-04 13:10 UTC, Gene Ruud

Description Harald van Dijk 2009-03-30 17:31:16 UTC
The read builtin in ash performs word splitting differently depending on whether the first character in $IFS is a space.

printf 'a\t\tb\tc\n' | busybox ash -c 'IFS=$(printf "\t") read a b c; echo ".$a. .$b. .$c."'
.a. .. .b	c.
printf 'a\t\tb\tc\n' | busybox ash -c 'IFS=$(printf " \t") read a b c; echo ".$a. .$b. .$c."'
.a. .b. .c.
printf 'a,,b,c\n' | busybox ash -c 'IFS="," read a b c; echo ".$a. .$b. .$c."'
.a. .. .b,c.
printf 'a,,b,c\n' | busybox ash -c 'IFS=" ," read a b c; echo ".$a. .$b. .$c."'
.a. .b. .c.

This isn't right. Whether consecutive separator characters collapse into a single field separator depends on whether those characters are themselves whitespace, not on whether a space is present in $IFS. The current behaviour differs from how IFS is handled during ordinary word splitting, from what SUSv3 specifies, and from other shells, so I'm going to assume it is unintentional.

For reference, here's bash's behaviour:

printf 'a\t\tb\tc\n' | bash -c 'IFS=$(printf "\t") read a b c; echo ".$a. .$b. .$c."'
.a. .b. .c.
printf 'a\t\tb\tc\n' | bash -c 'IFS=$(printf " \t") read a b c; echo ".$a. .$b. .$c."'
.a. .b. .c.
printf 'a,,b,c\n' | bash -c 'IFS="," read a b c; echo ".$a. .$b. .$c."'
.a. .. .b,c.
printf 'a,,b,c\n' | bash -c 'IFS=" ," read a b c; echo ".$a. .$b. .$c."'
.a. .. .b,c.
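
To make the comparison with ordinary word splitting concrete, here is a minimal sketch. The helper name split_fields is hypothetical and not part of busybox or bash; it only splits an unquoted expansion under a given IFS and prints each resulting field in brackets. Under the rule described above, a conforming shell should collapse the repeated tabs (IFS whitespace) but keep the empty field between the two commas (non-whitespace separators).

```shell
# Hypothetical helper: split string $2 with IFS set to $1 and print
# every resulting field wrapped in brackets on a single line.
split_fields() {
    (
        set -f          # disable globbing so only field splitting applies
        IFS=$1
        set -- $2       # unquoted on purpose: this is the word splitting step
        for f in "$@"; do
            printf '[%s]' "$f"
        done
        printf '\n'
    )
}

tab=$(printf '\t')
split_fields "$tab" "$(printf 'a\t\tb\tc')"   # prints [a][b][c]
split_fields ","    "a,,b,c"                  # prints [a][][b][c]
split_fields " ,"   "a,,b,c"                  # prints [a][][b][c]
```

The subshell keeps the IFS and set -f changes from leaking into the caller, so the helper can be pasted into an interactive session without side effects.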
Comment 2 Denys Vlasenko 2009-03-31 19:17:14 UTC
Created attachment 223
Fix

Try the attached patch.
Comment 3 Harald van Dijk 2009-03-31 19:48:03 UTC
Nice, I played with it a bit and it didn't break. It gives the results I expect.
Comment 4 Mike Frysinger 2009-03-31 20:20:44 UTC
same bug probably exists in hush ;)
Comment 5 Harald van Dijk 2009-04-01 04:20:12 UTC
hush's read simply doesn't do word splitting at all.
Comment 6 Gene Ruud 2010-01-04 13:03:27 UTC
According to http://www.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_05 (provided I interpret it correctly) the following

printf '\t,\ta\t,\tb\tc' | ash -c 'IFS=$(printf " \t,") read a b c d; echo ".$a. .$b. .$c. .$d."'

should result in:
.. .a. .b. .c.
In version 1.15.3, however, it results in:
.a. .b. .c. ..

The fix patch from this ticket introduced the bug. A patch with a fix and some test cases is attached.
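
The expectation in this comment can be turned into a small check. This is only a sketch: read_fields is a hypothetical helper and is not taken from the attached patch. It assumes the reading of the field splitting rules given above, namely that leading IFS whitespace is ignored, so the first comma terminates an empty first field. Unlike the command above, it appends a newline to the input so that read returns success.

```shell
# Hypothetical helper: feed line $2 to read with IFS set to $1 and
# print the four resulting variables in the format used in this report.
read_fields() {
    printf '%s\n' "$2" | {
        IFS=$1 read -r a b c d
        echo ".$a. .$b. .$c. .$d."
    }
}

ifs=$(printf ' \t,')
line=$(printf '\t,\ta\t,\tb\tc')
read_fields "$ifs" "$line"   # expected under the reading above: .. .a. .b. .c.
```

Replacing the shell that runs this snippet with the ash build under test gives a quick way to see whether it still shifts the empty first field to the end.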
Comment 7 Gene Ruud 2010-01-04 13:10:20 UTC
Created attachment 879
Fix for the problem described in comment #6
Comment 8 Denys Vlasenko 2010-01-08 14:45:50 UTC
Fixed in git, thanks!