Bug 41 - busybox line editing is not UTF-8 compatible
Status: RESOLVED FIXED
Alias: None
Product: Busybox
Classification: Unclassified
Component: Standard Compliance
Version: unspecified
Hardware: PC Linux
Importance: P5 minor
Target Milestone: ---
Assignee: unassigned
URL: https://bugs.maemo.org/show_bug.cgi?i...
Keywords:
Depends on:
Blocks:
 
Reported: 2009-01-13 16:21 UTC by Andre Klapper
Modified: 2009-07-16 00:29 UTC

See Also:
Host:
Target:
Build:


Attachments
Fix (already applied to 1.15.x git tree) (10.82 KB, patch)
2009-07-11 22:26 UTC, Denys Vlasenko

Description Andre Klapper 2009-01-13 16:21:05 UTC
Forwarding from https://bugs.maemo.org/show_bug.cgi?id=2918 .

SOFTWARE VERSION:
Busybox 1.10

STEPS TO REPRODUCE THE PROBLEM:
Start busybox, type some non-ascii text and then Ctrl+A.

EXPECTED OUTCOME:
The cursor goes to the first character on the line

ACTUAL OUTCOME:
It goes one character beyond the first.
This happens because the busybox shell (msh) is ANSI-only and not fully UTF-8 compatible.
Also mentioned at
http://sources.busybox.net/index.py/trunk/busybox/TODO?revision=23494&view=markup
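
To illustrate why byte-oriented line editing misplaces the cursor on non-ASCII input, here is a minimal sketch in C. It is not taken from busybox; the function names `utf8_char_count` and `naive_cursor_columns` are hypothetical, and the input is assumed to be valid UTF-8. An editor that treats each byte as one screen column will count more columns than the terminal actually drew whenever the line contains multi-byte characters.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count UTF-8 characters (code points) in a
 * NUL-terminated string. Continuation bytes always have the bit
 * pattern 10xxxxxx, so every byte that is NOT a continuation byte
 * starts a new character. Assumes valid UTF-8 input. */
size_t utf8_char_count(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* Hypothetical model of a byte-oriented editor: it assumes one byte
 * equals one column, so its idea of the line width is just strlen(). */
size_t naive_cursor_columns(const char *s)
{
    return strlen(s);
}
```

For the string "h\xc3\xa9llo" ("héllo" in UTF-8), the first function counts 5 characters while the second counts 6 bytes. A cursor-movement routine working from the byte count would be off by one column per two-byte character on the line.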

REPRODUCIBILITY:
always

SCREENSHOT IN MAEMO:
https://bugs.maemo.org/attachment.cgi?id=885&action=view

WORKAROUND IN MAEMO:
Set the current encoding to "Latin (ISO 8859 - 1)" in Maemo's osso-xterm menu

OTHER COMMENTS:
This was http://bugs.busybox.net/view.php?id=4784 .
Thanks for completely erasing any valid bug reports by "migrating" to Bugzilla.</sarcasm>
Comment 1 Andre Klapper 2009-02-04 15:38:46 UTC
Anybody alive at all? Haven't seen *any* comments on *any* Busybox bugs filed here at all...
Comment 2 Denys Vlasenko 2009-03-03 12:01:57 UTC
msh is not likely to be fixed in that regard; it abuses the 7th bit in characters for its evil purposes.

There has been no active work on msh in months. ash and hush are more active. It's tentatively planned to improve hush to the point where it surpasses msh.

Of course, if someone starts hacking on msh and fixes its problems, the work will be gladly accepted.
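
As a rough illustration of the remark about msh using the high bit of each character: the sketch below is an assumption about the general technique, not msh's actual code, and `strip_high_bit` is a hypothetical name. If a shell reserves bit 7 (value 0x80) of each byte as an internal marker and later masks it off, every byte of a multi-byte UTF-8 sequence is damaged, because all such bytes have that bit set.

```c
#include <stddef.h>

/* Hypothetical model of a shell that reserves bit 7 (0x80) of each
 * byte as an internal flag and clears it before using the text.
 * This is harmless for 7-bit ASCII but destroys UTF-8 sequences. */
void strip_high_bit(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)((unsigned char)*s & 0x7F);
}
```

Applied to the two bytes 0xC3 0xA9 (UTF-8 for "é"), this yields 0x43 0x29, that is, the ASCII text "C)". The same loss happens in a single-byte encoding such as ISO 8859-1, which matches the later note that msh mishandles chars >127 even in "1 byte is one char" mode.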
Comment 3 Andre Klapper 2009-03-03 20:31:56 UTC
Maemo ships ash, not msh.
Sorry for not adding that info in the initial comment.

Maemo Version 4.1 (Diablo) / Busybox 1.6.1:
   CONFIG_FEATURE_SH_IS_ASH=y
   # CONFIG_FEATURE_SH_IS_HUSH is not set
   # CONFIG_FEATURE_SH_IS_LASH is not set
   # CONFIG_FEATURE_SH_IS_MSH is not set
   # CONFIG_FEATURE_SH_IS_NONE is not set
   CONFIG_ASH=y

Maemo Version 5.0alpha (Fremantle) / Busybox 1.10.2:
   CONFIG_FEATURE_SH_IS_ASH=y
   # CONFIG_FEATURE_SH_IS_HUSH is not set
   # CONFIG_FEATURE_SH_IS_MSH is not set
   # CONFIG_FEATURE_SH_IS_NONE is not set
   CONFIG_ASH=y
Comment 4 Andre Klapper 2009-03-04 15:53:10 UTC
"Furthermore, the issue is in libbb/lineedit.c which is linked into all four sh
variants (and fdisk, although maemo doesn't ship that and it doesn't need to
handle non-ASCII input anyway).  Fix one, fix all."
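
For a sense of what a shared fix in a line editor could involve, here is a minimal sketch of moving the cursor left by one character in a UTF-8 byte buffer. This is an illustration only: `utf8_prev_char` is a hypothetical helper, it is not the code in libbb/lineedit.c, and it is not necessarily the approach taken by the eventual patch. It assumes the buffer holds valid UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper: given a byte offset `pos` into `buf`, return
 * the byte offset where the previous character starts. Steps back one
 * byte, then keeps stepping back over continuation bytes (10xxxxxx)
 * until it reaches a lead byte or the start of the buffer. */
size_t utf8_prev_char(const char *buf, size_t pos)
{
    if (pos == 0)
        return 0;
    pos--;
    while (pos > 0 && ((unsigned char)buf[pos] & 0xC0) == 0x80)
        pos--;
    return pos;
}
```

With the buffer "a\xc3\xa9" ("aé"), starting from the end (offset 3) the helper returns 1, the start of "é", and calling it again from offset 1 returns 0. A byte-oriented editor would instead stop at offset 2, in the middle of the character.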
Comment 5 Andre Klapper 2009-03-10 18:52:47 UTC
From https://bugs.maemo.org/show_bug.cgi?id=2918#c17 :

FYI, this is from the busybox TODO file:
> The low hanging fruit is UTF-8 character set support.  We should do this.
> (Vodz pointed out the shell's cmdedit as needing work here.  What else?)

and there is some work along those lines in ftp://ftp.simtreas.ru/pub/my/bb/
(although it's based on an ancient bb version).
Comment 6 Denys Vlasenko 2009-03-20 14:19:13 UTC
You are right, lineedit is not UTF-8 capable. I am renaming this bug.

If you want a separate bug about msh not being able to handle chars >127 (even in "1 byte is one char" mode), please create a new bug with an example script that msh mishandles.

This is a known problem in msh.
Comment 7 Denys Vlasenko 2009-07-11 22:26:29 UTC
Created attachment 453
Fix (already applied to 1.15.x git tree)

Please test current git. If you can't, try applying this patch to 1.14.x and test that.