Bug 839

Summary: cal produces garbled week-day names when locale is Japan or Taiwan
Product: Busybox Reporter: kuo <cookie>
Component: Other Assignee: unassigned
Status: RESOLVED FIXED    
Severity: major CC: busybox-cvs
Priority: P3 Keywords: FIXME
Version: 1.16.x   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Host: Target:
Build:
Attachments: cal produces garbled week-day names
Fix
cal produces correct week-day names when locale is ja_JP.utf8 or zh_TW.utf8

Description kuo 2010-01-03 09:55:42 UTC
Dear Sir/Madam,
     cal produces garbled week-day names when the locale is zh_TW.utf8 or ja_JP.utf8, although it works well when the locale is C. Please refer to the upper-left window in the attached PNG file for details.
     After tracing the code, I found that the bug is located in line 126 of coreutils/cal.c of version 1.16.0.git. Copying only the first 2 bytes of buf with strncpy seems to be inappropriate when the locale is zh_TW.utf8 or ja_JP.utf8.

/* code segment: from line 124 to line 126 of coreutils/cal.c of v1.16.0.git */
zero_tm.tm_wday = i;
strftime(buf, sizeof(buf), "%a", &zero_tm);
strncpy(day_headings + i * (3+julian) + julian, buf, 2);
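
For illustration, here is a minimal sketch of why a 2-byte copy breaks in these locales. It assumes that "%a" expands to a single CJK character such as 日, which is 3 bytes in UTF-8 (E6 97 A5). The helpers utf8_seq_len and cut_splits_char are hypothetical names and are not part of cal.c; they only check whether a byte cut-off lands inside a multibyte sequence.

```c
#include <stddef.h>

/* Hypothetical helper, not BusyBox code.
 * Returns the length in bytes of the UTF-8 sequence that starts with
 * lead byte c, or 0 if c cannot start a sequence.
 * Only the lead byte is inspected; continuation bytes are not validated. */
static size_t utf8_seq_len(unsigned char c)
{
	if (c < 0x80)
		return 1;               /* plain ASCII */
	if ((c & 0xE0) == 0xC0)
		return 2;
	if ((c & 0xF0) == 0xE0)
		return 3;               /* most CJK characters land here */
	if ((c & 0xF8) == 0xF0)
		return 4;
	return 0;                       /* continuation byte or invalid lead */
}

/* Hypothetical helper, not BusyBox code.
 * Returns 1 if keeping only the first n bytes of s would cut a
 * multibyte character in the middle (or s is malformed), else 0. */
static int cut_splits_char(const char *s, size_t n)
{
	size_t pos = 0;

	while (pos < n && s[pos] != '\0') {
		size_t len = utf8_seq_len((unsigned char)s[pos]);

		if (len == 0)
			return 1;       /* malformed input */
		if (pos + len > n)
			return 1;       /* character straddles the cut */
		pos += len;
	}
	return 0;
}
```

With these helpers, cut_splits_char("\xe6\x97\xa5", 2) reports a split, while cut_splits_char("Su", 2) does not, which is consistent with cal working in the C locale and failing in the UTF-8 CJK locales.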

     By the way, my OS is Quirky Linux v0.0.2, which is a variant of Puppy Linux.

Best regards,
CHIN-YUAN KUO
Jan. 3, 2010
Comment 1 kuo 2010-01-03 10:06:02 UTC
Created attachment 877 [details]
cal produces garbled week-day names

When locale is zh_TW.utf8 or ja_JP.utf8, cal produces garbled week-day names.
In addition, this attachment shows version of BusyBox, output of date and a portion of output of locale.
Comment 2 Denys Vlasenko 2010-01-05 11:30:54 UTC
I see this:

# LANG=zh_TW.utf8 cal
      一月 2010     
日 一 二 三 四 五 六
                1  2
 3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31

What do you see with "standard" cal?

My machine does not have the right fonts to show those glyphs, so I have a question: do week day names take 2 char positions per hieroglyph?

The responsible code in cal.c is:

        i = 0;
        do {
                zero_tm.tm_mon = i;
                /* full month name according to locale */
                strftime(buf, sizeof(buf), "%B", &zero_tm);
                month_names[i] = xstrdup(buf);

                if (i < 7) {
                        zero_tm.tm_wday = i;
                        /* abbreviated weekday name according to locale */
                        strftime(buf, sizeof(buf), "%a", &zero_tm);
====>===BAD======>      strncpy(day_headings + i * (3+julian) + julian, buf, 2);
                }
        } while (++i < 12);

Arrow indicates code which simply remembers 2 BYTES from every weekday name. In UTF-8 a CJK character takes 3 bytes, so this cuts the first character in the middle.

Just converting it to take two *unicode chars* instead would not be so hard, but I need to know whether these unicode chars (in this case more like hieroglyphs) are twice as wide as ASCII chars. If yes, we'd need the code to account for this.
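
To make the idea concrete, here is a rough sketch of copying whole characters up to a display-column budget instead of a byte count. It is not the committed fix. copy_columns and its width_of callback are hypothetical names; on POSIX systems wcwidth() from <wchar.h> is one candidate for the callback, and passing NULL falls back to one column per character. The sketch also assumes the process locale matches the encoding of src, since mbrtowc() decodes according to the current locale.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not BusyBox code.
 * Copies whole multibyte characters from src into dst until adding the
 * next one would exceed max_cols display columns or overflow dst.
 * width_of reports the column width of one character; NULL means
 * "assume one column per character".
 * dst is always NUL-terminated (dst_size must be at least 1).
 * Returns the number of display columns used. */
static int copy_columns(char *dst, size_t dst_size, const char *src,
			int max_cols, int (*width_of)(wchar_t))
{
	mbstate_t st;
	size_t out = 0;
	int cols = 0;

	memset(&st, 0, sizeof(st));
	while (*src != '\0') {
		wchar_t wc;
		size_t len = mbrtowc(&wc, src, strlen(src), &st);
		int w;

		if (len == 0 || len == (size_t)-1 || len == (size_t)-2)
			break;          /* NUL, invalid or incomplete sequence */
		w = width_of ? width_of(wc) : 1;
		if (w < 0)
			w = 1;          /* non-printable: count as one column */
		if (cols + w > max_cols || out + len >= dst_size)
			break;          /* no room for this character */
		memcpy(dst + out, src, len);
		out += len;
		src += len;
		cols += w;
	}
	dst[out] = '\0';
	return cols;
}
```

With a budget of 2 columns, "Sunday" would be cut to "Su" as before, while a double-width character would fill the whole budget with a single glyph, so the column padding of the heading would have to be computed from the returned column count rather than from the byte length.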
Comment 3 kuo 2010-01-16 10:54:22 UTC
Dear Sir,
     I am sorry for my reply being late.
     The unicode chars are twice as wide as ASCII chars.
     By the way, there are many GPL TTFs containing Chinese chars.  For example, FireFly TTF (http://cle.linux.org.tw/fonts/FireFly/fireflysung-1.3.0.tar.gz) is one of them.

(In reply to comment #2)
Comment 4 Denys Vlasenko 2010-01-24 06:49:35 UTC
Created attachment 973 [details]
Fix

This fix is committed to git.
Comment 5 Denys Vlasenko 2010-01-27 23:40:07 UTC
1.16.0 is released and it has this fixed. Please test it and let me know if it does not work.

(Double check that you enabled unicode support in .config)
Comment 6 kuo 2010-01-30 16:07:30 UTC
Created attachment 1015 [details]
cal produces correct week-day names when locale is ja_JP.utf8 or zh_TW.utf8
Comment 7 Denys Vlasenko 2010-01-30 23:18:21 UTC
Fixed in 1.16.0