Bug 7538 - Incorrect character count in Unicode
Summary: Incorrect character count in Unicode
Status: RESOLVED DUPLICATE of bug 6356
Alias: None
Product: Busybox
Classification: Unclassified
Component: Other
Version: unspecified
Hardware: PC Linux
Importance: P5 minor
Target Milestone: ---
Assignee: unassigned
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-10-17 11:56 UTC by Mad Deer
Modified: 2016-02-18 07:02 UTC
CC List: 1 user

See Also:
Host:
Target:
Build:


Attachments

Description Mad Deer 2014-10-17 11:56:38 UTC
My locale is LANG=en_US.UTF-8, but some tools in BusyBox (all versions) do not handle Unicode correctly. For example, awk:

$ echo тест | busybox awk '{ print length($0) }' 
8
$ echo test | busybox awk '{ print length($0) }' 
4
$ echo тест | awk '{ print length($0) }' # the distribution's standard awk
4 
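
The 8 is what a byte count would give: each Cyrillic letter in "тест" occupies two bytes in UTF-8, so the output is consistent with busybox awk measuring bytes rather than characters. That is an inference from the numbers, not something confirmed in this report. A minimal Python sketch of the arithmetic (the helper name `counts` is made up for illustration):

```python
def counts(text):
    """Return (characters, UTF-8 bytes) for text."""
    return len(text), len(text.encode("utf-8"))

# Compare the two test words used in the awk example.
for word in ("тест", "test"):
    chars, nbytes = counts(word)
    print(word, chars, nbytes)
```

For "тест" this gives 4 characters and 8 bytes; for "test" both counts are 4, which is why the ASCII case looks correct.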

The same problem occurs in sed:
$ echo -ne "тест" | sed -e :a -e "s/^.\{0,10\}$/& /;ta" | wc -m 
11
$ echo -ne "тест" | busybox  sed -e :a -e "s/^.\{0,10\}$/& /;ta" | wc -m 
7
$ echo -ne "test" | busybox sed -e :a -e "s/^.\{0,10\}$/& /;ta" | wc -m 
11
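
The three sed results fit the same explanation. The script appends a space for as long as the line matches 0 to 10 instances of `.`, so it stops at a measured length of 11. If `.` matches one byte instead of one character, "тест" already measures 8 and receives only 3 spaces, which `wc -m` then reports as 7 characters. The sketch below simulates that loop in Python. It assumes byte-wise matching in busybox sed, which this report does not confirm, and the names `pad` and `utf8_len` are hypothetical:

```python
def pad(line, length_of, limit=10):
    """Append spaces while the measured length is at most `limit`,
    mirroring the loop  :a; s/^.\{0,10\}$/& /; ta  from the report."""
    while length_of(line) <= limit:
        line += " "
    return line

def utf8_len(text):
    """Length of text in UTF-8 bytes."""
    return len(text.encode("utf-8"))

# (word, how the tool is assumed to measure length)
cases = (("тест", len), ("тест", utf8_len), ("test", utf8_len))
for word, measure in cases:
    # len() of the result plays the role of `wc -m` (character count).
    print(len(pad(word, measure)))
```

The loop prints 11, 7 and 11, matching the three `wc -m` results above in order.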
Comment 1 Mike Frysinger 2016-02-18 07:02:21 UTC

*** This bug has been marked as a duplicate of bug 6356 ***