Bug 12531 - awk: backslashes not parsed in EREs passed from variables or "string literals"
Status: NEW
Alias: None
Product: Busybox
Classification: Unclassified
Component: Standard Compliance
Version: unspecified
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: unassigned
 
Reported: 2020-02-03 21:12 UTC by Martijn Dekker
Modified: 2020-02-03 21:12 UTC



Attachments
my .config, as required (27.61 KB, application/octet-stream)
2020-02-03 21:12 UTC, Martijn Dekker

Description Martijn Dekker 2020-02-03 21:12:15 UTC
Created attachment 8356
my .config, as required

In busybox awk, match(), sub() and gsub() don't parse C-style backslash-escaped special characters in EREs passed from variables or "string literals" (as opposed to /ERE literals/, for which busybox awk behaves correctly).

Below are a couple of test cases. (Note: the double-quoted string literal removes one level of backslash escaping; the ERE parsing in match(), sub() and gsub() should remove another.)


$ echo $'abc\tdef' | awk '{ ere="\\t"; gsub(ere, "TAB"); print; }'
abc     def

Expected output (as on onetrueawk, gawk, mawk, Solaris awk):
abcTABdef


$ awk 'BEGIN { print !match("\n", "^\\n$"); }'
1

Expected output (as on onetrueawk, gawk, mawk, Solaris awk):
0
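
The two escaping levels from the note above can be made visible with a small sketch. It assumes a POSIX shell and an awk on the PATH. The function names show_levels and literal_case are made up for illustration and are not part of the report. printf is used instead of the $'...' quoting so the sketch also runs in shells that lack that extension.

```shell
# Hypothetical helpers for illustration; not part of the original report.

# Level 1: the awk string literal "\\t" is reduced to two characters,
# a backslash followed by "t". Prints the length, then the string itself.
show_levels() {
    awk 'BEGIN { ere = "\\t"; print length(ere); print ere }'
}

# The same substitution as the first test case, written with an
# /ERE literal/, the form the report describes as working in busybox awk.
literal_case() {
    printf 'abc\tdef\n' | awk '{ gsub(/\t/, "TAB"); print }'
}

show_levels
literal_case
```

The first function prints 2 and then \t, which shows that the string reaching gsub() still contains a backslash escape. The second prints abcTABdef. An awk that applies the second level of parsing to strings should treat ere = "\\t" the same way as /\t/.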

Reference to standard:
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html#tag_20_06_13_04

Regular Expressions: ..."The awk utility shall make use of the extended regular expression notation (see XBD Extended Regular Expressions) except that it shall allow the use of C-language conventions for escaping special characters within the EREs, as specified in the table in XBD File Format Notation ( '\\', '\a', '\b', '\f', '\n', '\r', '\t', '\v' ) and the following table; these escape sequences shall be recognized both inside and outside bracket expressions."...
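
As a rough way to check an awk against the escape list quoted above, the following sketch passes each sequence to match() as a string and reports whether it matched the corresponding character. It is an assumption-laden illustration, not an official conformance test: the function name check_ere_escapes is hypothetical, the awk is assumed to support all seven escapes in string literals, and '\\' is left out because a doubled backslash is already an ordinary ERE escape.

```shell
# Hypothetical compliance check; the function name is made up for illustration.
# For every escape sequence in the quoted list, build the character with an awk
# string literal and the ERE as a string holding backslash + letter, then
# report whether match() treats that ERE as the character.
check_ere_escapes() {
    awk 'BEGIN {
        n = split("a b f n r t v", name, " ")
        ch["a"] = "\a"; ch["b"] = "\b"; ch["f"] = "\f"; ch["n"] = "\n"
        ch["r"] = "\r"; ch["t"] = "\t"; ch["v"] = "\v"
        for (i = 1; i <= n; i++) {
            ere = "^\\" name[i] "$"      # string value: ^, backslash, letter, $
            verdict = match(ch[name[i]], ere) ? "ok" : "FAIL"
            printf "\\%s %s\n", name[i], verdict
        }
    }'
}

check_ere_escapes
```

One line is printed per escape sequence. An awk that follows the quoted text would be expected to report ok on every line, while an awk with the behaviour described in this report would be expected to report FAIL.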

RATIONALE: ..."Historical implementations of awk have long supported <backslash>-escape sequences as an extension to extended regular expressions, and this extension has been retained despite inconsistency with other utilities."...