Bug 5108 - awk uses updated FS too early
Status: RESOLVED FIXED
Alias: None
Product: Busybox
Classification: Unclassified
Component: Standard Compliance
Version: 1.19.x
Hardware: PC Linux
Importance: P5 minor
Target Milestone: ---
Assignee: unassigned
 
Reported: 2012-04-16 20:24 UTC by dubiousjim
Modified: 2016-01-03 22:09 UTC

Description dubiousjim 2012-04-16 20:24:08 UTC
BusyBox 1.19.3, built against uClibc 0.9.32, on i686 Linux

The POSIX-2008 standard says that changing FS should have no effect on the current input line, but only on the next one. The language is:

> Before the first reference to a field in the record is evaluated, the record
> shall be split into fields, according to the rules in Regular Expressions,
> using the value of FS that was current at the time the record was read. 

(Supposedly, this requirement was introduced as far back as 1996.) BusyBox awk doesn't conform to it: if no fields have been referenced yet when FS is changed, the new value of FS is also used to split the current line.


For example, POSIX requires that this:
$ printf 'a:b c:d\ne:f g:h' | awk '{FS=":"; print $1}'
should print:
a:b
e

and that is indeed the behavior of gawk. But BusyBox awk prints:
a
e
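
For scripts that actually want the new separator to apply to the record being processed, two approaches should give the same result regardless of when an implementation splits fields. The sketch below is only an illustration: the function names are hypothetical, and it assumes some awk is available on PATH. The first sets FS before any record is read; the second relies on the common idiom of assigning $0 to itself, which makes awk re-split the current record with the FS in effect at that moment.

```shell
#!/bin/sh
# Hypothetical helper names; assumes an awk binary is on PATH.

# Set FS before the first record is read, so every record is split on ":".
split_from_start() {
    printf 'a:b c:d\ne:f g:h\n' | awk -F: '{ print $1 }'
}

# Change FS inside the action, then assign $0 to itself to force a
# re-split of the current record using the new FS.
resplit_current() {
    printf 'a:b c:d\ne:f g:h\n' | awk '{ FS = ":"; $0 = $0; print $1 }'
}

split_from_start
resplit_current
```

Both functions print "a" and then "e", because neither depends on whether the record is split at read time or at first field reference.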
Comment 1 Denys Vlasenko 2012-07-10 23:28:19 UTC
Fixed in git:

commit df8066a78ccd9b899244145f6be0171957a41a1e
Author: Denys Vlasenko <vda.linux@googlemail.com>
Date:   Wed Jul 11 01:27:15 2012 +0200

    awk: fix FS assignment behavior
Comment 2 Michael Tokarev 2013-03-01 07:50:19 UTC
Hm.  With this change (forcibly splitting the line into fields before the assignment to FS), we have another variation of this issue.  I'm not sure how important it is.

 echo a:b c:d | busybox awk '{FS=":"; FS=" "; print $2}'

With the patch applied it prints "b c", but it looks like it should print "c:d".

Admittedly, assigning to FS several times in a row like this isn't very practical.  But I can easily imagine code that does so conditionally:

 {
    FS="a";
    if (condition1) { FS="b"; }
    if (condition2) { FS="c"; }
    ...
    print $1
 }
Comment 3 Denys Vlasenko 2016-01-03 22:09:17 UTC
Works for me with current git:

$ echo a:b c:d | ./busybox awk '{FS=":"; FS=" "; print $2}'
c:d
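
The two cases discussed in this report could be folded into a small regression check along these lines. This is a sketch, not part of the BusyBox test suite: the AWK variable, the function names, and the report format are all assumptions made for illustration. The expected values are the ones given in the description and in the output above.

```shell
#!/bin/sh
# AWK is an assumed variable naming the awk under test,
# e.g. AWK="./busybox awk"; it defaults to the awk found on PATH.
AWK=${AWK:-awk}

# Case from the description: FS is changed once, before any field reference.
fs_single() {
    printf 'a:b c:d\ne:f g:h\n' | $AWK '{ FS = ":"; print $1 }'
}

# Case from comment 2: FS is changed twice before the first field reference.
fs_double() {
    echo 'a:b c:d' | $AWK '{ FS = ":"; FS = " "; print $2 }'
}

# report NAME ACTUAL EXPECTED: print ok or FAIL without aborting the run.
report() {
    if [ "$2" = "$3" ]; then
        echo "ok   $1"
    else
        echo "FAIL $1: got '$2'"
    fi
}

report fs_single "$(fs_single)" "$(printf 'a:b\ne')"
report fs_double "$(fs_double)" "c:d"
```

$AWK is deliberately left unquoted so that a value such as "./busybox awk" is split into the command and its applet name.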