| Summary: | Busybox randomly crashes on ARM920T platform | ||
|---|---|---|---|
| Product: | Busybox | Reporter: | Ivan Castell Rovira <al004140> |
| Component: | Other | Assignee: | unassigned |
| Status: | RESOLVED INVALID | ||
| Severity: | major | CC: | al004140, busybox-cvs |
| Priority: | P5 | ||
| Version: | 1.17.x | ||
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | Linux | ||
| Host: | i686-pc-linux-gnu | Target: | arm-unknown-linux-gnueabi- |
| Build: | |||
Not enough information to work on this report. |
Hi everybody! I don't know where is exactly the problem, so after searching on google for a whole week without any success, I thought publishing that issue in busybox bugtrack. This issue is related with that pieces of software: - busybox-1.17.3 - dynamic linker ld-2.6.1.so - ARM platform ARM920T - cross-compiler arm-linux-gcc version 4.2.2 - linux kernel 2.6.30.4 This is the test it is crashing. It crashes on the line executing piped ls, grep and cut busybox applets. $ cat test_crash.sh #! /bin/sh ulimit -c unlimited export LD_DEBUG=reloc export LD_DEBUG_OUTPUT=/tmp/ld_logs.out while [ 1 ]; do ls -l /bin/ 2>> /tmp/err | grep pa 2>> /tmp/err | cut -f 1 -d " " 2> /tmp/err # LINE OF CRASH if [ -f /home/root/core ]; then echo core generated! break; fi rm /tmp/ld_logs.out* done The test executes fine for hundreds of times (sometimes more than half an hour). But randomly it crashes, generating a coredump file. Analisis with gdb of the core dumped (I have cross-compiled gdb to run it on the target platform) $ gdb /bin/busybox --core ./core --quiet warning: core file may not match specified executable file. Core was generated by `grep pa USER=root HOME=/home/root OLDPWD=/home/root PS1=\u@\h:\w\$ ENV=/home/r'. Program terminated with signal 11, Segmentation fault. [New process 21169] #0 0x4015d63c in _start () from /lib/ld-linux.so.3 (gdb) bt #0 0x4015d63c in _start () from /lib/ld-linux.so.3 #1 0x000838cc in tryexec (cmd=Cannot access memory at address 0xbefe7f14) at shell/ash.c:7279 Cannot access memory at address 0xbefe7f2c (gdb) info registers r0 0xfffffff2 4294967282 r1 0x1733cc 1520588 r2 0x1733f4 1520628 r3 0x1 1 r4 0x172008 1515528 r5 0x0 0 r6 0x0 0 r7 0xb 11 r8 0x0 0 r9 0x0 0 r10 0x40024000 1073889280 r11 0x0 0 r12 0x4015d630 1075172912 sp 0xbefe7f08 0xbefe7f08 lr 0x838cc 538828 pc 0x4015d63c 0x4015d63c <_start> fps 0x0 0 cpsr 0x60000010 1610612752 Dynamic linker is pointing to 2.6.1 $ ls -l /lib/ld-linux.so.3 lrwxrwxrwx 1 root root 11 Jan 17 2011 /lib/ld-linux.so.3 -> ld-2.6.1.so I modified busybox source including some simple printf lines to see variable contents at the time of the crash. 7244 static void 7245 tryexec(IF_FEATURE_SH_STANDALONE(int applet_no,) char *cmd, char **argv, char **envp) 7246 { 7247 int repeated = 0; 7248 7249 #if ENABLE_FEATURE_SH_STANDALONE 7250 if (applet_no >= 0) { 7251 if (APPLET_IS_NOEXEC(applet_no)) { 7252 clearenv(); 7253 while (*envp) 7254 putenv(*envp++); 7255 run_applet_no_and_exit(applet_no, argv); 7256 } 7257 /* re-exec ourselves with the new arguments */ 7258 execve(bb_busybox_exec_path, argv, envp); 7259 /* If they called chroot or otherwise made the binary no longer 7260 * executable, fall through */ 7261 } 7262 #endif 7263 7264 repeat: 7265 #ifdef SYSV 7266 do { 7267 // if (cmd != NULL) printf("cmd = <>%s<>\n", cmd); else printf("cmd is NULL!"); 7268 7269 execve(cmd, argv, envp); 7270 } while (errno == EINTR); 7271 #else 7272 if (cmd != NULL) printf("cmd = <>%s<>\n", cmd); else printf("cmd is NULL!"); 7273 printf("cmd=%p\n", cmd); 7274 printf("argv=%p\n", argv); 7275 printf("envp=%p\n", envp); 7276 printf("tryexec=%p\n", tryexec); 7277 printf("execve=%p\n", execve); 7278 7279 execve(cmd, argv, envp); 7280 #endif Just before crashing, my test script printed this addresses on stdout: cmdname = <>/usr/bin/cut<> cmd = <>/usr/bin/cut<> cmd=0x17357c argv=0x1733fc envp=0x17342c tryexec=0x83840 execve=0xcca8 Segmentation fault (core dumped) core generated! As you can see, cmd is pointing to "/usr/bin/cut", so I don't understand why is gdb showing me this error: "cmd=Cannot access memory at address 0xbefe7f14) at shell/ash.c:7279" The dissasembled busybox binary referencing the address of crash (0x000838cc) $ arm-linux-objdump -d busybox 00083840 <tryexec>: 83840: e52de004 push {lr} ; (str lr, [sp, #-4]!) 83844: e24dd024 sub sp, sp, #36 ; 0x24 83848: e58d000c str r0, [sp, #12] 8384c: e58d1008 str r1, [sp, #8] 83850: e58d2004 str r2, [sp, #4] 83854: e3a03000 mov r3, #0 ; 0x0 83858: e58d3014 str r3, [sp, #20] 8385c: e59d300c ldr r3, [sp, #12] 83860: e3530000 cmp r3, #0 ; 0x0 83864: 0a000003 beq 83878 <tryexec+0x38> 83868: e59f0188 ldr r0, [pc, #392] ; 839f8 <tryexec+0x1b8> 8386c: e59d100c ldr r1, [sp, #12] 83870: ebfe2485 bl ca8c <_init+0x9b0> 83874: ea000001 b 83880 <tryexec+0x40> 83878: e59f017c ldr r0, [pc, #380] ; 839fc <tryexec+0x1bc> 8387c: ebfe2482 bl ca8c <_init+0x9b0> 83880: e59f0178 ldr r0, [pc, #376] ; 83a00 <tryexec+0x1c0> 83884: e59d100c ldr r1, [sp, #12] 83888: ebfe247f bl ca8c <_init+0x9b0> 8388c: e59f0170 ldr r0, [pc, #368] ; 83a04 <tryexec+0x1c4> 83890: e59d1008 ldr r1, [sp, #8] 83894: ebfe247c bl ca8c <_init+0x9b0> 83898: e59f0168 ldr r0, [pc, #360] ; 83a08 <tryexec+0x1c8> 8389c: e59d1004 ldr r1, [sp, #4] 838a0: ebfe2479 bl ca8c <_init+0x9b0> 838a4: e59f0160 ldr r0, [pc, #352] ; 83a0c <tryexec+0x1cc> 838a8: e59f1160 ldr r1, [pc, #352] ; 83a10 <tryexec+0x1d0> 838ac: ebfe2476 bl ca8c <_init+0x9b0> 838b0: e59f015c ldr r0, [pc, #348] ; 83a14 <tryexec+0x1d4> 838b4: e59f115c ldr r1, [pc, #348] ; 83a18 <tryexec+0x1d8> 838b8: ebfe2473 bl ca8c <_init+0x9b0> 838bc: e59d000c ldr r0, [sp, #12] 838c0: e59d1008 ldr r1, [sp, #8] 838c4: e59d2004 ldr r2, [sp, #4] 838c8: ebfe24f6 bl cca8 <_init+0xbcc> 838cc: e59d3014 ldr r3, [sp, #20] 838d0: e3530000 cmp r3, #0 ; 0x0 838d4: 0a000002 beq 838e4 <tryexec+0xa4> Also, the test gathers some information related with the dynamic linker: $ ls -l /tmp/ -rw-r--r-- 1 root root 0 Aug 20 01:49 err -rw-r--r-- 1 root root 724 Aug 20 01:49 ld_logs.out.21167 -rw-r--r-- 1 root root 523 Aug 20 01:49 ld_logs.out.21168 -rw-r--r-- 1 root root 728 Aug 20 01:49 ld_logs.out.21170 And these are the contents of that files: $ cat /tmp/ld_logs.out.211* 21167: 21167: relocation processing: /lib/libc.so.6 (lazy) 21167: 21167: relocation processing: /lib/libm.so.6 (lazy) 21167: 21167: relocation processing: rm (lazy) 21167: 21167: relocation processing: /lib/ld-linux.so.3 21167: 21167: calling init: /lib/libc.so.6 21167: 21167: 21167: calling init: /lib/libm.so.6 21167: 21167: 21167: initialize program: rm 21167: 21167: 21167: transferring control: rm 21167: 21167: 21167: calling fini: rm [0] 21167: 21167: 21167: calling fini: /lib/libm.so.6 [0] 21167: 21167: 21167: calling fini: /lib/libc.so.6 [0] 21167: 21168: 21168: relocation processing: /lib/libc.so.6 (lazy) 21168: 21168: relocation processing: /lib/libm.so.6 (lazy) 21168: 21168: relocation processing: ls (lazy) 21168: 21168: relocation processing: /lib/ld-linux.so.3 21168: 21168: calling init: /lib/libc.so.6 21168: 21168: 21168: calling init: /lib/libm.so.6 21168: 21168: 21168: initialize program: ls 21168: 21168: 21168: transferring control: ls 21168: 21170: 21170: relocation processing: /lib/libc.so.6 (lazy) 21170: 21170: relocation processing: /lib/libm.so.6 (lazy) 21170: 21170: relocation processing: cut (lazy) 21170: 21170: relocation processing: /lib/ld-linux.so.3 21170: 21170: calling init: /lib/libc.so.6 21170: 21170: 21170: calling init: /lib/libm.so.6 21170: 21170: 21170: initialize program: cut 21170: 21170: 21170: transferring control: cut 21170: 21170: 21170: calling fini: cut [0] 21170: 21170: 21170: calling fini: /lib/libm.so.6 [0] 21170: 21170: 21170: calling fini: /lib/libc.so.6 [0] 21170: I would really appreciate whatever advice to continue tracing that issue by myself. I really need to know why is this problem happening, but have no idea how can I continue tracing that issue. If you can help me, that would be very nice. Thanks a lot in advance!! Best regards, -- Ivan