In an attempt to transfer a binary file to the box over the terminal connection I converted the binary to an ASCII file with decimal ASCII codes in it. However, awk fails to recover the content due to it losing zeros (and maybe other characters):

In BusyBox v1.31.0 zeros are lost:
> # (echo 0; echo 1; echo 2) | awk '{ printf("%c",$0); }' | hexdump
> 0000000 0201
> 0000002

As a comparison, on FreeBSD all bytes are recovered:
> $ (echo 0; echo 1; echo 2) | awk '{ printf("%c",$0); }' | hd
> 00000000  00 01 02  |...|
> 00000003

There is no practical benefit to losing zeros. This can only hurt operations in an embedded system where BusyBox is typically used.
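Until awk handles this, the decoding step could be done with the shell's printf instead. The following is only a sketch: `decode_bytes` is a hypothetical helper name, and it assumes the shell provides a POSIX-style `printf` that accepts `\ooo` octal escapes in its format string (including `\000`). It reads one decimal byte value per line and writes the raw bytes.

```shell
# Hypothetical helper: read one decimal byte value (0-255) per line on
# stdin and write the corresponding raw bytes to stdout.
decode_bytes() {
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while read -r code || [ -n "$code" ]; do
        # The inner printf turns the decimal value into three octal digits;
        # the outer printf interprets the resulting \ooo escape.
        printf "\\$(printf '%03o' "$code")"
    done
}
```

It would be used like `decode_bytes < payload.txt > payload.bin`, where both file names are placeholders. Spawning a command substitution per byte is slow, so this is only reasonable for small files.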
Fixing printf seems easy enough, but the same issue is also present in sprintf, which would require more substantial changes to get right.
One thing worth noting is that the current behaviour is, strictly speaking, POSIX compliant.
(In reply to wolf+busybox from comment #3)
Interesting. BSD awk also claims to be POSIX compliant:
> The awk utility is compliant with the IEEE Std 1003.1-2008 (“POSIX.1”)
> specification, except awk does not support {n,m} pattern matching.

But I would favor practicality over technical standard compliance.
(In reply to Yuri from comment #4)
> Interesting. BSD awk also claims to be POSIX compliant:

POSIX (on awk) just states that when printf should print \000, the behaviour is undefined. So both implementations are compliant.

> But I would favor practicality over technical standard compliance.

Sure, why not. At least for printf it is very easy to resolve. sprintf would require larger changes. And it could be confusing for printf to allow \000 and sprintf to not allow it. I'm curious what conclusion the BusyBox people will reach regarding this one.
Does awk allow zeros in strings, and handle such strings properly? STL's std::string, for example, does allow zeros in the middle of strings and handles them properly. Shells, by contrast, don't allow zeros in strings. If awk generally doesn't handle zeros properly, these become two separate issues. One is to print a zero to the output, the other is to fix zero handling in strings. IMO, just fix them one by one.
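The split between "writing a zero to the output" and "holding a zero in a string" can be seen in the shell itself. This is a small sketch with made-up function names: the first function sends a NUL straight through a pipe, the second stores the same data in a variable first. The pipe always delivers all three bytes. What the variable keeps depends on the shell, but common Bourne-style shells drop the NUL (bash also prints a warning to stderr).

```shell
# Count the bytes that arrive when a NUL goes straight through a pipe.
nul_through_pipe() {
    printf 'a\000b' | wc -c | tr -d ' '
}

# Count the bytes that arrive when the same data is first stored in a
# shell variable via command substitution.
nul_through_variable() {
    s=$(printf 'a\000b')
    printf '%s' "$s" | wc -c | tr -d ' '
}
```

Comparing the two counts shows whether a given shell can carry the zero in a string. The same distinction applies to awk's printf (output path) versus sprintf (string path).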