Bug 2041

Summary: [1.16.2] "sed" segmentation fault
Product: Busybox Reporter: my.somewhat.lengthy.loginname
Component: OtherAssignee: unassigned
Status: RESOLVED INVALID    
Severity: major CC: busybox-cvs
Priority: P5    
Version: 1.16.x   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Host: Target:
Build:
Attachments: Changes to sed.c, as a patch (see comment 5)
objdump output (comment 10)
sed.c, sed.i, sed.s, and output (comment 13)
LOG file (comment 13)

Description my.somewhat.lengthy.loginname 2010-06-13 00:41:13 UTC
Command: echo 1234 | sed 's/23//'

Expected output: 14

Actual output: Segmentation fault

Busybox 1.16.2, as downloaded from busybox.net, and self-compiled. Running under the built-in ash shell. Current environment: kernel 2.6.34, glibc 2.11.2, gcc 4.5.0.

Other info: I have seen this come and go over the 1.16.x series -- with and without the various patches applied, and with slightly older versions of kernel / glibc / gcc.
Comment 1 Denys Vlasenko 2010-06-13 12:52:27 UTC
Can you identify where exactly it segfaults in sed.c?
Comment 2 my.somewhat.lengthy.loginname 2010-06-13 16:18:51 UTC
(In reply to comment #1)
> Can you identify where exactly it segfaults in sed.c?

Alas not. Here is was I tried to get more details; no. 3 is perhaps the most informative part.

(1) The "production" executable (compiled with my CFLAGS, stripped, etc.) says:

busybox[782]: segfault at 0 ip 08061e09 sp bf97ac88 error 4 in busybox[8048000+187000]
Segmentation fault

(2) Then I unset all compiler-related environment flags and switched on "Build BusyBox with extra Debugging symbols" in the "make menuconfig" dialogue, before I compiled again. The error message from busybox_unstripped was similar; interestingly the "ip" value was identical. I believe that "ip" here means "instruction pointer," so I looked that up in busybox_unstripped.map. The closest is

 .text          0x08061e00       0xe6 /usr/lib/libc.a(getc.o)
                0x08061e00                getc
                0x08061e00                _IO_getc
                0x08061e00                fgetc
 *fill*         0x08061ee6        0xa 90909090

(3) Next I switched on "Disable compiler optimizations" in the "make menuconfig" dialogue, and compiled again. This time there was no segfault; instead I got the expected output.

(4) I also tried my hands at GNU debugger, but it kept complaining about extra command-line arguments.
Comment 3 Denys Vlasenko 2010-06-18 01:30:29 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > Can you identify where exactly it segfaults in sed.c?
> 
> Alas not. Here is was I tried to get more details; no. 3 is perhaps the most
> informative part.
> 
> (1) The "production" executable (compiled with my CFLAGS, stripped, etc.) says:
> 
> busybox[782]: segfault at 0 ip 08061e09 sp bf97ac88 error 4 in
> busybox[8048000+187000]
> Segmentation fault

Please run it under ulimit -c 999999999, load the coredump into gdb, produce backtrace and see where that fgetc comes from. Post the backtrace here.
Comment 4 my.somewhat.lengthy.loginname 2010-06-18 02:40:59 UTC
(In reply to comment #3)

> Please run it under ulimit -c 999999999, load the coredump into gdb, produce
> backtrace and see where that fgetc comes from. Post the backtrace here.

root [/tmp/busybox-1.16.2] ./busybox_unstripped ash
root [/tmp/busybox-1.16.2] ./busybox_unstripped echo 1234 | ./busybox_unstripped sed 's/23//'
Segmentation fault (core dumped)
root [/tmp/busybox-1.16.2] gdb ./busybox_unstripped -c core
GNU gdb (GDB) 7.1
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /tmp/busybox-1.16.2/busybox_unstripped...done.

warning: core file may not match specified executable file.
[New Thread 5983]
Core was generated by `./busybox_unstripped sed s/23//'.
Program terminated with signal 11, Segmentation fault.
#0  0x08061e09 in getc ()
(gdb) bt
#0  0x08061e09 in getc ()
#1  0x081742f0 in bb_get_chunk_with_continuation (file=0x0, end=0xbfae8758, lineno=0x0)
    at libbb/get_line_from_file.c:34
#2  0x0816aed1 in get_next_line (
    gets_char=0xbfae879e "\034\nx\364\034\b@\211\256\277 \210\256\277") at editors/sed.c:769
#3  0x0816b00b in process_files () at editors/sed.c:873
#4  0x0816ba1c in sed_main (argc=2, argv=0xbfae8940) at editors/sed.c:1323
#5  0x080fe982 in run_applet_no_and_exit (applet_no=220, argv=0xbfae8938) at libbb/appletlib.c:746
#6  0x080fe9a3 in run_applet_and_exit (name=0xbfae8f11 "sed", argv=0xbfae8938)
    at libbb/appletlib.c:753
#7  0x080feb90 in busybox_main (name=<value optimized out>, argv=0xbfae8934)
    at libbb/appletlib.c:718
#8  run_applet_and_exit (name=<value optimized out>, argv=0xbfae8934) at libbb/appletlib.c:755
#9  0x080fec14 in main (argc=3, argv=0xbfae8934) at libbb/appletlib.c:808
(gdb)
Comment 5 Denys Vlasenko 2010-06-19 15:46:40 UTC
Apparently here we somehow get fp == NULL

                FILE *fp = G.input_file_list[G.current_input_file];
                /* Read line up to a newline or NUL byte, inclusive,
                 * return malloc'ed char[]. length of the chunk read
                 * is stored in len. NULL if EOF/error */
                temp = bb_get_chunk_from_file(fp, &len);

Can you add

bb_error_msg("input_file_list[%d]:%p", G.current_input_file, fp)

just after FILE *fp = ... assignment?

Also, find this line:

G.input_file_list[G.input_file_count++] = file;

and add 

bb_error_msg("storing input_file_list[%d]=%p",
      G.input_file_count - 1, file);

after it.

Rebuild, let it crash, and post the output.
Comment 6 my.somewhat.lengthy.loginname 2010-06-19 16:41:24 UTC
Created attachment 2047 [details]
Changes to sed.c, as a patch (see comment 5)

The changes that I made to sed.c in response to comment 5, as a patch. To rule out that I made a mistake in the process.
Comment 7 my.somewhat.lengthy.loginname 2010-06-19 16:42:35 UTC
And here is the output:

root [/tmp/busybox-1.16.2-new] ./busybox_unstripped ash
root [/tmp/busybox-1.16.2-new] ./busybox_unstripped echo 1234 | ./busybox_unstripped sed 's/23//'
sed: storing input_file_list[0]=0x81cc3c0
sed: input_file_list[0]:(nil)
Segmentation fault (core dumped)
Comment 8 Denys Vlasenko 2010-06-19 17:10:21 UTC
As you see, input_file_list[0] somehow magically changed from non-NULL to NULL.

Please add the line:

bb_error_msg("line %d: input_file_list:%p input_file_list[0]:%p", __LINE__, G.input_file_list, G.input_file_list[0]);

in various places in sed.c and find the exact place where it happens. Then we can figure out _how_ it happens.
Comment 9 my.somewhat.lengthy.loginname 2010-06-20 01:27:22 UTC
Here is the original add_input_file() function:

static void add_input_file(FILE *file)
{
	G.input_file_list = xrealloc_vector(G.input_file_list, 2, G.input_file_count);
	G.input_file_list[G.input_file_count++] = file;
}


and here is a functional replacement that I made, enriched with console messages:

static void add_input_file(FILE *file)
{
	bb_error_msg("file: %p", file);

	G.input_file_list = xrealloc_vector(G.input_file_list, 2, G.input_file_count);
	G.input_file_list[G.input_file_count] = file;

	bb_error_msg("G.input_file_list[G.input_file_count]: %p", G.input_file_list[G.input_file_count]);
	bb_error_msg("G.input_file_list[G.input_file_count]: %p", G.input_file_list[G.input_file_count]);

	G.input_file_count++;
}

The replacement function checks, *twice* in a row, whether the file pointer was correctly stored. The result is:

root [/tmp/busybox-1.16.2] ./busybox_unstripped echo 1234 | ./busybox_unstripped sed 's/23//'
sed: file: 0x81cc3c0
sed: G.input_file_list[G.input_file_count]: 0x81cc3c0
sed: G.input_file_list[G.input_file_count]: (nil)
Segmentation fault
root [/tmp/busybox-1.16.2]
Comment 10 Denys Vlasenko 2010-06-20 01:55:38 UTC
(In reply to comment #9)
and here is a functional replacement that I made, enriched with console
> messages:
> 
> static void add_input_file(FILE *file)
> {
>     bb_error_msg("file: %p", file);
> 
>     G.input_file_list = xrealloc_vector(G.input_file_list, 2,
> G.input_file_count);
>     G.input_file_list[G.input_file_count] = file;
> 
>     bb_error_msg("G.input_file_list[G.input_file_count]: %p",
> G.input_file_list[G.input_file_count]);
>     bb_error_msg("G.input_file_list[G.input_file_count]: %p",
> G.input_file_list[G.input_file_count]);
> 
>     G.input_file_count++;
> }
> 
> The replacement function checks, *twice* in a row, whether the file pointer was
> correctly stored. The result is:
> 
> root [/tmp/busybox-1.16.2] ./busybox_unstripped echo 1234 |
> ./busybox_unstripped sed 's/23//'
> sed: file: 0x81cc3c0
> sed: G.input_file_list[G.input_file_count]: 0x81cc3c0
> sed: G.input_file_list[G.input_file_count]: (nil)
> Segmentation fault

Looks like gcc bug. Let's see the assembly. Please run

objdymp -drsx editors/sed.o >sed.disasm

on a modified file (one with your debug printouts) and attach sed.disasm to this bug.

and attach sed.disasm
Comment 11 my.somewhat.lengthy.loginname 2010-06-20 02:14:54 UTC
Created attachment 2053 [details]
objdump output (comment 10)

The output of "objdump -drsx editors/sed.o >sed.disasm" (comment 10), bz2'ed as per Bugzilla's request
Comment 12 my.somewhat.lengthy.loginname 2010-06-20 07:01:03 UTC
00000000 <add_input_file>:

	push   %ebx

	mov    %eax,%ebx	; EBX = EAX = file pointer,
				; as passed to the function

; bb_error_msg("file: %p", file);
	push   %eax		; EAX: file pointer
	push   $0x128		; location of text string
	call   bb_error_msg

; G.input_file_list = xrealloc_vector( etc. )
	mov    0x14,%ecx	; 0x14: G.input_file_count
	mov    $0x402,%edx	; why $0x402?
	mov    0x1c,%eax	; 0x1c: G.input_file_list
	call   xrealloc_vector_helper
	mov    %eax,0x1c	; 0x1c: G.input_file_list

; G.input_file_list[G.input_file_count] = file;

	; ?

	; pretty please?

	; that's right, it simply does not do that

; First bb_error_msg on the stored file pointer
	; GCC does not actually look it up in the G structure,
	; but uses the copy in EBX. The output is the
	; function parameter.
	push   %ebx		; EBX: file pointer
	push   $0x131		; location of text string
	call   bb_error_msg

; Second bb_error_msg on the stored file pointer
	; This time GCC does look it up in the G structure.
	; Because nothing was ever stored there, the output is (nil).
	mov    0x1c,%eax	; 0x1c: G.input_file_list
	mov    0x14,%edx	; 0x14: G.input_file_count
	pushl  (%eax,%edx,4)	; must be GAS for [eax + edx*4]
	push   $0x131		; location of text string
	call   bb_error_msg

; G.input_file_count++
	incl   0x14		; 0x14: G.input_file_count

	add    $0x18,%esp
	pop    %ebx
	ret
Comment 13 Denys Vlasenko 2010-06-20 12:23:25 UTC
Right. "gcc bug" theory got a +.

Let's make sure that preprocessed source shows the same bug.

* save sed.c
* do "make editors/sed.i"
* rename sed.i to sed.c
* build busybox with "make V=1 2>&1 | tee LOG"
* confirm the bug is still there
* run "make editors/sed.s"

Now you have:
* a self-contained sed.c which exhibits the bug (no include files needed)
* the exact gcc command line which is needed to compile it so that the generated code is wrong
* the resulting buggy assembly in sed.s
* output of "gcc -v"

Please put all this information into this bug. Then you perhaps need to create a bug in gcc bugzilla.
Comment 14 my.somewhat.lengthy.loginname 2010-06-20 21:11:26 UTC
Created attachment 2059 [details]
sed.c, sed.i, sed.s, and output (comment 13)

Most materials from comment 13, in a .tar.bz2 archive. The LOG file comes extra, it made the archive too big for Bugzilla.
Comment 15 my.somewhat.lengthy.loginname 2010-06-20 21:13:52 UTC
Created attachment 2065 [details]
LOG file (comment 13)

The LOG file as described in comment 13, as a BZ2 archive.
Comment 16 Harry Liebel 2010-08-13 10:00:42 UTC
This has been reported as a problem with gcc 4.5.0.

Look here:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43987

To fix the problem with gcc 4.5.0 use option "-fno-tree-pta".