| Summary: | Services started with start-stop-daemon applet not properly destroyed when stopped | ||
|---|---|---|---|
| Product: | Busybox | Reporter: | Ivan Castell Rovira <al004140> |
| Component: | Other | Assignee: | unassigned |
| Status: | RESOLVED INVALID | ||
| Severity: | normal | CC: | busybox-cvs |
| Priority: | P5 | ||
| Version: | 1.32.x | ||
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | Linux | ||
| Host: | Target: | ||
| Build: | |||
| Attachments: | busybox config file used | ||
I discovered that this bug only happens when booting the rootfs with root=ram. When booting from the eMMC device (root=/dev/mmcblk1p3), it doesn't happen. It is the same rootfs.ext4 filesystem generated by buildroot.
(In reply to Ivan Castell Rovira from comment #0)
> Sending a SIGTERM or a SIGKILL to any of these PIDs doesn't help:
> # kill -TERM 14710
> # kill -KILL 14710
> # ps fax|grep loop.sh
> 14710 root 0:00 [loop.sh]
> 20555 root 0:00 [loop.sh]
> 20633 root 0:00 [loop.sh]
> 20670 root 0:00 [loop.sh]
> 20707 root 0:00 [loop.sh]
> 27604 root 0:00 grep loop.sh
This behavior (of seemingly not dying from SIGKILL) is a typical symptom of the dead process not being reaped by the parent: zombie processes can still be signaled (kill does not return any errors).
Please show pstree -p, so we can see what the parent of these dead processes is.
(In reply to Denys Vlasenko from comment #2)
Hello and thanks for your answer. Sorry for the delay, but I had to prepare the setup again. Below are more details and the pstree output you requested. Hope it helps!
# busybox
BusyBox v1.27.2 (2020-11-09 12:48:50 UTC) multi-call binary.
BusyBox is copyrighted by many authors between 1998-2015.
Licensed under GPLv2. See source distribution for detailed copyright notices.
Usage: busybox [function [arguments]...]
   or: busybox --list[-full]
   or: busybox --install [-s] [DIR]
   or: function [arguments]...
BusyBox is a multi-call binary that combines many common Unix utilities into a single executable. Most people will create a link to busybox for each function they wish to use and BusyBox will act like whatever it was invoked as.
Currently defined functions: [, [[, addgroup, adduser, ar, arp, arping, ash, awk, basename, blkid, bunzip2, bzcat, cat, chattr, chgrp, chmod, chown, chroot, chrt, chvt, cksum, clear, cmp, cp, cpio, crond, crontab, cut, date, dc, dd, deallocvt, delgroup, deluser, devmem, df, diff, dirname, dmesg, dnsd, dnsdomainname, dos2unix, du, dumpkmap, echo, egrep, eject, env, ether-wake, expr, factor, fallocate, false, fbset, fdflush, fdformat, fdisk, fgrep, find, flock, fold, free, freeramdisk, fsck, fsfreeze, fstrim, fuser, getopt, getty, grep, gunzip, gzip, halt, hdparm, head, hexdump, hostid, hostname, hwclock, i2cdetect, i2cdump, i2cget, i2cset, id, ifconfig, ifdown, ifup, inetd, init, insmod, install, ip, ipaddr, ipcrm, ipcs, iplink, ipneigh, iproute, iprule, iptunnel, kill, killall, killall5, klogd, last, less, link, linux32, linux64, linuxrc, ln, loadfont, loadkmap, logger, login, logname, losetup, ls, lsattr, lsmod, lsof, lspci, lsscsi, lsusb, lzcat, lzma, lzopcat, makedevs, md5sum, mdev, mesg, microcom, mkdir, mkdosfs, mke2fs, mkfifo, mknod, mkpasswd, mkswap, mktemp, modprobe, more, mount, mountpoint, mt, mv, nameif, netstat, nice, nl, nohup, nproc, nslookup, od, openvt, partprobe, passwd, paste, patch, pidof, ping, pipe_progress, pivot_root, poweroff, printenv, printf, ps, pwd, rdate, readlink, readprofile, realpath, reboot, renice, reset, resize, rm, rmdir, rmmod, route, run-parts, runlevel, sed, seq, setarch, setconsole, setkeycodes, setlogcons, setpriv, setserial, setsid, sh, sha1sum, sha256sum, sha3sum, sha512sum, shred, sleep, sort, start-stop-daemon, stat, strings, stty, su, sulogin, svc, swapoff, swapon, switch_root, sync, sysctl, syslogd, tail, tar, tee, telnet, test, tftp, time, top, touch, tr, traceroute, true, truncate, tty, ubirename, udhcpc, uevent, umount, uname, uniq, unix2dos, unlink, unlzma, unlzop, unxz, unzip, uptime, usleep, uudecode, uuencode, vconfig, vi, vlock, w, watch, watchdog, wc, wget, which, who, whoami, xargs, xxd, xz, xzcat, yes, zcat 
# start-stop-daemon -S -b -q -m -p /var/run/my.pid --exec /root/loop.sh
# start-stop-daemon -K -p /var/run/my.pid
stopped process in pidfile '/var/run/my.pid' (pid 397)
# start-stop-daemon -S -b -q -m -p /var/run/my.pid --exec /root/loop.sh
# start-stop-daemon -K -p /var/run/my.pid
stopped process in pidfile '/var/run/my.pid' (pid 402)
# start-stop-daemon -S -b -q -m -p /var/run/my.pid --exec /root/loop.sh
# start-stop-daemon -K -p /var/run/my.pid
stopped process in pidfile '/var/run/my.pid' (pid 407)
# start-stop-daemon -S -b -q -m -p /var/run/my.pid --exec /root/loop.sh
# start-stop-daemon -K -p /var/run/my.pid
stopped process in pidfile '/var/run/my.pid' (pid 412)
# start-stop-daemon -S -b -q -m -p /var/run/my.pid --exec /root/loop.sh
# start-stop-daemon -K -p /var/run/my.pid
stopped process in pidfile '/var/run/my.pid' (pid 417)
# ps fax|grep loop.sh
397 root [loop.sh]
402 root [loop.sh]
407 root [loop.sh]
412 root [loop.sh]
417 root [loop.sh]
421 root grep loop.sh
# pstree -p
swapper/0(1)-+-dbus-daemon(183)
             |-klogd(148)
             |-loop.sh(397)
             |-loop.sh(402)
             |-loop.sh(407)
             |-loop.sh(412)
             |-loop.sh(417)
             |-macipsrv(198)
             |-monitor_eth_con(203)
             |-sleep(374)
             |-sleep(384)
             |-sleep(398)
             |-sleep(403)
             |-sleep(408)
             |-sleep(413)
             |-sleep(418)
             |-sshd(268)
             |-start-stop-daem(143)
             |-start-stop-daem(146)
             |-start-stop-daem(193)
             |-start-stop-daem(196)
             |-start-stop-daem(200)
             |-start-stop-daem(396)
             |-start-stop-daem(401)
             |-start-stop-daem(406)
             |-start-stop-daem(411)
             |-start-stop-daem(416)
             |-syslogd(145)
             |-udevd(151)
             |-udhcpc(195)
             `-udhcpc(248)
# ps fax
PID USER COMMAND
1 root [swapper/0]
2 root [kthreadd]
3 root [ksoftirqd/0]
4 root [kworker/0:0]
5 root [kworker/0:0H]
6 root [kworker/u2:0]
7 root [rcu_preempt]
8 root [rcu_sched]
9 root [rcu_bh]
10 root [migration/0]
11 root [lru-add-drain]
12 root [cpuhp/0]
13 root [kdevtmpfs]
14 root [netns]
15 root [oom_reaper]
16 root [writeback]
17 root [kcompactd0]
18 root [crypto]
19 root [bioset]
20 root [kblockd]
21 root [kworker/0:1]
22 root [cfg80211]
23 root [watchdogd]
24 root [kswapd0]
25 root [vmstat]
70 root [bioset]
71 root [bioset]
72 root [bioset]
73 root [bioset]
74 root [bioset]
75 root [bioset]
76 root [bioset]
77 root [bioset]
78 root [bioset]
79 root [bioset]
80 root [bioset]
81 root [bioset]
82 root [bioset]
83 root [bioset]
84 root [bioset]
85 root [bioset]
86 root [bioset]
87 root [bioset]
88 root [bioset]
89 root [bioset]
90 root [bioset]
91 root [bioset]
92 root [bioset]
93 root [bioset]
94 root [kworker/u2:1]
96 root [cfinteractive]
97 root [irq/53-mmc0]
98 root [irq/54-mmc1]
99 root [kworker/0:2]
100 root [kworker/0:3]
101 root [ipv6_addrconf]
102 root [bioset]
103 root [krfcommd]
104 root [mmcqd/1]
105 root [bioset]
106 root [mmcqd/1boot0]
107 root [bioset]
108 root [mmcqd/1boot1]
109 root [bioset]
110 root [mmcqd/1rpmb]
111 root [irq/36-imx_ther]
112 root [jbd2/ram0-8]
113 root [ext4-rsv-conver]
114 root {linuxrc} init
122 root [kworker/0:1H]
123 root [jbd2/mmcblk1p5-]
124 root [ext4-rsv-conver]
125 root [jbd2/mmcblk1p6-]
126 root [ext4-rsv-conver]
143 root [start-stop-daem]
146 root [start-stop-daem]
148 root /sbin/klogd -n
151 root /sbin/udevd -d
183 dbus dbus-daemon --system
193 root [start-stop-daem]
195 root [udhcpc]
196 root [start-stop-daem]
198 root [macipsrv]
200 root [start-stop-daem]
203 root [monitor_eth_con]
217 root [kworker/u2:2]
221 root -sh
374 root [sleep]
384 root [sleep]
396 root [start-stop-daem]
397 root [loop.sh]
398 root sleep 1000
401 root [start-stop-daem]
402 root [loop.sh]
403 root sleep 1000
406 root [start-stop-daem]
407 root [loop.sh]
408 root sleep 1000
411 root [start-stop-daem]
412 root [loop.sh]
413 root sleep 1000
416 root [start-stop-daem]
417 root [loop.sh]
418 root sleep 1000
423 root ps fax
> # ps fax
> PID USER COMMAND
> 1 root [swapper/0]
Your init process is kernel's idle thread.
How did that happen??
Normally, kernel's idle thread has pid 0 (not 1 as it is seen on your system) and is not visible to userspace (/proc/0 does not exist).
Of course, kernel's idle thread is not doing anything useful, in particular, it does not reap dead children. Thus you see the zombies.
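To make the diagnosis above easier to repeat, here is a minimal sketch that lists zombie processes together with their parent PID by reading /proc/<pid>/status. The function name list_zombies and its optional argument (an alternative proc root, useful only for testing) are made up for this illustration and are not part of busybox.

```shell
#!/bin/sh
# Hypothetical helper: print "PID PPID NAME" for every zombie process.
# $1 is an optional alternative to /proc (assumption, for testing only).
list_zombies() {
    proc_root="${1:-/proc}"
    for status in "$proc_root"/[0-9]*/status; do
        # A process may exit between the glob expansion and the read.
        [ -r "$status" ] || continue
        awk '
            $1 == "Name:"  { name = $2 }
            $1 == "State:" { state = $2 }
            $1 == "Pid:"   { pid = $2 }
            $1 == "PPid:"  { ppid = $2 }
            END { if (state == "Z") printf "%s %s %s\n", pid, ppid, name }
        ' "$status" 2>/dev/null
    done
}
```

Running list_zombies with no argument on the affected board should print the dead loop.sh entries, and the second column tells you which process is failing to reap them.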
As explained in a previous post, this issue only happens when the board is booted with root=ram. The same rootfs booted from the root=/dev/mmcblk1p3 partition works fine:
# cat /proc/cmdline
console=ttymxc0,115200 root=/dev/mmcblk1p3 rootwait ro
# ps fax
PID USER COMMAND
1 root init
2 root [kthreadd]
3 root [ksoftirqd/0]
4 root [kworker/0:0]
5 root [kworker/0:0H]
6 root [kworker/u2:0]
7 root [rcu_preempt]
8 root [rcu_sched]
9 root [rcu_bh]
10 root [migration/0]
11 root [lru-add-drain]
12 root [cpuhp/0]
13 root [kdevtmpfs]
14 root [netns]
In case it helps with debugging, here are all the relevant files involved in the initial steps of mounting the rootfs:
# ls -l /sbin/init
lrwxrwxrwx 1 root root 14 Nov 9 12:49 /sbin/init -> ../bin/busybox
# cat /proc/cmdline
console=ttymxc0,115200 root=ram initrd=0x82000000,50M rootfstype=ext4 rw
# cat /etc/fstab
# /etc/fstab: static file system information.
#
# <file system> <mount pt> <type> <options> <dump> <pass>
/dev/root / ext4 auto,ro 0 0
proc /proc proc defaults 0 0
devpts /dev/pts devpts defaults,gid=5,mode=620 0 0
tmpfs /dev/shm tmpfs mode=0777 0 0
tmpfs /tmp tmpfs defaults 0 0
sysfs /sys sysfs defaults 0 0
debugfs /sys/kernel/debug/ debugfs defaults 0 0
# cat /etc/inittab
# /etc/inittab
# Startup the system
::sysinit:/bin/mount -o remount,rw /
::sysinit:/etc/init.d/rcS
::sysinit:/bin/mount -o remount,ro /
# Put a getty on the serial port
ttymxc0::respawn:/sbin/getty -L ttymxc0 0 vt100 # GENERIC_SERIAL
# Stuff to do before rebooting
::shutdown:/etc/init.d/rcK
::shutdown:/bin/umount -a -r
# cat /etc/init.d/rcS
#!/bin/sh
mount_overlay() {
mkdir -p /tmp/overlays$1 /tmp/overlays-work$1
mount -t overlay overlay -o lowerdir=$1,upperdir=/tmp/overlays$1,workdir=/tmp/overlays-work$1 $1
}
. /etc/profile
# Flasher manually mounts devtmpfs
mount -t devtmpfs none /dev 2> /dev/null || true
mkdir -p /dev/pts /dev/shm
mount -a
echo "[$IMAGE_SIDE] Mounting manually devtmpfs ... $(get_status)"
# Mount some overlays
mount_overlay /etc
mount_overlay /root
mount_overlay /usr
mount_overlay /var
mount_overlay /run
Hope all this helps to discover what is wrong. Please remember the rootfs is built using the buildroot tool. Thanks.
(In reply to Ivan Castell Rovira from comment #5)
> As explained in a previous post, this issue only happens when the board is booted with root=ram. The same rootfs booted from the root=/dev/mmcblk1p3 partition works fine
Well, when you run with root=ram, where is your init process (pid 1)? It's not in the pstree output - instead, you see "swapper/0", kernel's idle thread. This does not look right. You need to investigate why this happens.
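As a starting point for that investigation, the sketch below reports what is actually running as pid 1. It relies on the fact that kernel threads have an empty /proc/<pid>/cmdline, which is also why ps shows them in square brackets. The function name describe_pid1 and its optional proc-root argument are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: describe pid 1 and return non-zero when it does
# not look like a userspace process.
# $1 is an optional alternative to /proc (assumption, for testing only).
describe_pid1() {
    proc_root="${1:-/proc}"
    comm=$(cat "$proc_root/1/comm" 2>/dev/null)
    # cmdline is NUL-separated; turn NULs into spaces and trim the tail.
    cmdline=$(cat "$proc_root/1/cmdline" 2>/dev/null | tr '\0' ' ' | sed 's/ *$//')
    if [ -n "$cmdline" ]; then
        echo "pid 1: $comm (userspace: $cmdline)"
    else
        echo "pid 1: $comm (empty cmdline, likely a kernel thread)"
        return 1
    fi
}
```

Comparing the result under both boot methods, next to the output of cat /proc/cmdline, would show whether the busybox init ever becomes pid 1 in the root=ram case.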
Created attachment 8561
busybox config file used
Tested on an ARM platform, but I suppose this happens on all platforms.
First of all, create this simple loop.sh script for the test:
# cat /root/loop.sh
while [ 1 ]
do
    sleep 1000
done
A service is started with start-stop-daemon as shown:
# start-stop-daemon -S -b -q -m -p /var/run/my.pid --exec /root/loop.sh
When checking for the service running in the background, pid 14710 is found:
# ps fax | grep loop.sh
14710 root 0:00 {loop.sh} /bin/sh /root/loop.sh
14793 root 0:00 grep loop.sh
Then the service is stopped with start-stop-daemon:
# start-stop-daemon -K -p /var/run/my.pid
The PID 14710 entry is still listed, now as a process named [loop.sh]:
# ps fax | grep loop.sh
14710 root 0:00 [loop.sh]
15563 root 0:00 grep loop.sh
If I repeat this start/stop cycle several times, I get this nasty output:
# ps fax | grep loop.sh
14710 root 0:00 [loop.sh]
20555 root 0:00 [loop.sh]
20633 root 0:00 [loop.sh]
20670 root 0:00 [loop.sh]
20707 root 0:00 [loop.sh]
20775 root 0:00 grep loop.sh
Sending a SIGTERM or a SIGKILL to any of these PIDs doesn't help:
# kill -TERM 14710
# kill -KILL 14710
# ps fax|grep loop.sh
14710 root 0:00 [loop.sh]
20555 root 0:00 [loop.sh]
20633 root 0:00 [loop.sh]
20670 root 0:00 [loop.sh]
20707 root 0:00 [loop.sh]
27604 root 0:00 grep loop.sh
This has consequences for SysV init services, as some of them wait until the PID is fully removed.
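To illustrate the last point, this is a sketch of the kind of wait loop a SysV-style stop script might use. It is not taken from any particular init script; the function name wait_for_exit and the timeout argument are made up for this example. Because kill -0 still succeeds on a zombie, a loop like this keeps waiting until the timeout whenever the dead process is never reaped.

```shell
#!/bin/sh
# Hypothetical helper: wait until PID $1 disappears, giving up after
# $2 seconds (default 10). Returns 0 if the PID is gone, 1 on timeout.
wait_for_exit() {
    pid="$1"
    remaining="${2:-10}"
    # kill -0 sends no signal; it only checks that the PID can be signaled.
    while kill -0 "$pid" 2>/dev/null; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

A stop script could call wait_for_exit "$(cat /var/run/my.pid)" 5 after start-stop-daemon -K; on the affected root=ram boot one would expect it to hit the timeout for every stopped service.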