| Summary: | ash hangs after wait returns EINTR | ||
|---|---|---|---|
| Product: | Busybox | Reporter: | martin.brueckner |
| Component: | Other | Assignee: | unassigned |
| Status: | RESOLVED INVALID | ||
| Severity: | normal | CC: | busybox-cvs |
| Priority: | P5 | ||
| Version: | 1.31.x | ||
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | Linux | ||
| Host: | Target: | ||
| Build: | |||
Running ./pid with PID 512 wait4 returns with EINTR and then wait is called again and again returning ECHILD. ash does not recover from this.
Other shells like bash or zsh seems to handle this differently. They don't hang at PID 512.
ioctl(0, SNDCTL_TMR_START or TCSETS, {B134 -opost isig icanon echo ...}) = 0
rt_sigaction(SIGWINCH, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, NULL, 8) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=NULL) = 512
setpgid(512, 512) = 0
wait4(-1, My process ID : 512
My parent's ID: 143
[{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WSTOPPED, NULL) = 512
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=512, si_uid=0, si_status=0, si_utime=0, si_stime=7} ---
rt_sigreturn({mask=[]}) = -1 EINTR (Interrupted system call)
wait4(-1, 0x7fc08b98, WSTOPPED, NULL) = -1 ECHILD (No child processes)
wait4(-1, 0x7fc08b98, WSTOPPED, NULL) = -1 ECHILD (No child processes)
The above looks like the following. wait4 syscall returns 512:
the "WSTOPPED, NULL) = 512" part clearly shows that syscall does NOT return EINTR, ERESTARTSYS or the like. It returns 512.
However, SIGCHLD is simultaneously delivered, therefore on return from wait4 kernel constructs the signal frame on stack and runs the signal handler
(in case of ash, it's signal_handler() function).
When it returns, (signal frame on stack is constructed so that) it returns to a stub in C library which executes rt_sigreturn syscall, which *should* restore registers and stack to the state it was before signal was delivered. However, we see "= -1 EINTR" there. If signal handler was invoked immediately on exit from wait4 syscall, this means we see a bug: rt_sigreturn did not correctly restore the syscall return register.
(This may not be so as there is no way to see when signal was delivered, it could be delivered a microsecond later later - however, as your other experiment shows, _usually_ SIGCHLD is delivered immediately at wait4.)
The bug may be caused by the fact that ERESTARTSYS constant *is* equal to 512. So rt_sigreturn code may be buggy and might be misinterpreting the value of 512 somewhere as ERESTARTSYS.
Actually, this seems to be what happens in linux/arch/nios2/kernel/signal.c
static int do_signal(struct pt_regs *regs)
{
unsigned int retval = 0, continue_addr = 0, restart_addr = 0;
int restart = 0;
struct ksignal ksig;
current->thread.kregs = regs;
/*
* If we were from a system call, check for system call restarting...
*/
if (regs->orig_r2 >= 0) {
continue_addr = regs->ea;
restart_addr = continue_addr - 4;
retval = regs->r2;
/*
* Prepare for system call restart. We do this here so that a
* debugger will see the already changed PC.
*/
switch (retval) {
case ERESTART_RESTARTBLOCK:
restart = -2;
case ERESTARTNOHAND:
case ERESTARTSYS:
case ERESTARTNOINTR:
restart++;
regs->r2 = regs->orig_r2;
regs->r7 = regs->orig_r7;
regs->ea = restart_addr;
break;
}
}
if (get_signal(&ksig)) {
/* handler */
if (unlikely(restart && regs->ea == restart_addr)) {
if (retval == ERESTARTNOHAND ||
retval == ERESTART_RESTARTBLOCK ||
(retval == ERESTARTSYS
&& !(ksig.ka.sa.sa_flags & SA_RESTART))) {
regs->r2 = EINTR;
regs->r7 = 1;
regs->ea = continue_addr;
}
}
handle_signal(&ksig, regs);
return 0;
}
Yes, looks like the bug is here:
/*
* If we were from a system call, check for system call restarting...
*/
if (regs->orig_r2 >= 0) {
continue_addr = regs->ea;
restart_addr = continue_addr - 4;
retval = regs->r2;
/*
* Prepare for system call restart. We do this here so that a
* debugger will see the already changed PC.
*/
switch (retval) {
case ERESTART_RESTARTBLOCK:
restart = -2;
case ERESTARTNOHAND:
case ERESTARTSYS:
case ERESTARTNOINTR:
restart++;
regs->r2 = regs->orig_r2;
regs->r7 = regs->orig_r7;
regs->ea = restart_addr;
break;
}
}
The above code examines r2, the "errno" register on the exit from syscall, but it DOES NOT check that it is indeed an error!
linux/arch/nios2/kernel/entry.S says:
/* Execute the system call */
callr r1
/* If the syscall returns a negative result:
* Set r7 to 1 to indicate error,
* Negate r2 to get a positive error code
* If the syscall returns zero or a positive value:
* Set r7 to 0.
* The sigreturn system calls will skip the code below by
* adding to register ra. To avoid destroying registers
*/
translate_rc_and_ret:
movi r1, 0
bge r2, zero, 3f
sub r2, zero, r2
movi r1, 1
3:
IOW: the code in do_signal() should have checked that r7 == 1 before assuming that r2 == 512 means "ERESTARTSYS".
(In reply to Denys Vlasenko from comment #2) Thanks Denys for your very helpful analysis. I am on vacation and couldn't reply earlier. When I am back at work I can try to patch the kernel to check r7==1. I am unsure where to put the check for r7==1. Is it after "case ERESTARTNOINTR" like this? if (regs->orig_r2 >= 0) { continue_addr = regs->ea; restart_addr = continue_addr - 4; retval = regs->r2; /* * Prepare for system call restart. We do this here so that a * debugger will see the already changed PC. */ switch (retval) { case ERESTART_RESTARTBLOCK: restart = -2; case ERESTARTNOHAND: case ERESTARTSYS: case ERESTARTNOINTR: if (regs->r7 == 0) break; // or is it regs->orig_r7 ? restart++; regs->r2 = regs->orig_r2; regs->r7 = regs->orig_r7; regs->ea = restart_addr; break; } |
I am running busybox on a NIOS2 Softcore CPU. As shell I am using ash. With process IDs below 512 everything runs smoothly. But when it comes to PID 512 ash hangs in an endless loop. Starting a new ash shell afterwards (PID > 512) works. One has to kill the first shell with -9. ash in the endless loop keeps the CPU at 100%. To debug this issue I let ash run with strace. Additionally I wrote a small program which prints its own and the parent's PID. # ./pid My process ID : 495 My parent's ID: 137 # Having ash running with strace and starting ./pid gives me this output: ioctl(0, SNDCTL_TMR_START or TCSETS, {B38400 opost isig icanon echo ...}) = 0 rt_sigaction(SIGWINCH, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, NULL, 8) = 0 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=NULL) = 518 setpgid(518, 518) = 0 wait4(-1, My process ID : 518 My parent's ID: 517 [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WSTOPPED, NULL) = 518 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=518, si_uid=0, si_status=0, si_utime=1, si_stime=6} --- rt_sigreturn({mask=[]}) = 518 ioctl(10, TIOCSPGRP, [517]) = 0 wait4(-1, 0x7fdadc28, WNOHANG|WSTOPPED, NULL) = -1 ECHILD (No child processes) ioctl(0, TCGETS, {B38400 opost isig icanon echo ...}) = 0 ioctl(0, SNDCTL_TMR_START or TCSETS, {B38400 opost -isig -icanon -echo ...}) = 0 Running ./pid with PID 512 wait4 returns with EINTR and then wait is called again and again returning ECHILD. ash does not recover from this. Other shells like bash or zsh seems to handle this differently. They don't hang at PID 512. ioctl(0, SNDCTL_TMR_START or TCSETS, {B134 -opost isig icanon echo ...}) = 0 rt_sigaction(SIGWINCH, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, NULL, 8) = 0 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=NULL) = 512 setpgid(512, 512) = 0 wait4(-1, My process ID : 512 My parent's ID: 143 [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WSTOPPED, NULL) = 512 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=512, si_uid=0, si_status=0, si_utime=0, si_stime=7} --- rt_sigreturn({mask=[]}) = -1 EINTR (Interrupted system call) wait4(-1, 0x7fc08b98, WSTOPPED, NULL) = -1 ECHILD (No child processes) wait4(-1, 0x7fc08b98, WSTOPPED, NULL) = -1 ECHILD (No child processes) wait4(-1, 0x7fc08b98, WSTOPPED, NULL) = -1 ECHILD (No child processes) wait4(-1, 0x7fc08b98, WSTOPPED, NULL) = -1 ECHILD (No child processes) wait4(-1, 0x7fc08b98, WSTOPPED, NULL) = -1 ECHILD (No child processes) wait4(-1, 0x7fc08b98, WSTOPPED, NULL) = -1 ECHILD (No child processes) ... (endless loop)