| Summary: | Wrong PID written by start-stop-daemon -S -b -m -p | ||
|---|---|---|---|
| Product: | Busybox | Reporter: | mwadsten |
| Component: | Other | Assignee: | unassigned |
| Status: | NEW --- | ||
| Severity: | normal | CC: | busybox-cvs |
| Priority: | P5 | ||
| Version: | 1.33.x | ||
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | Linux | ||
| Host: | Target: | ||
| Build: | |||
| Attachments: | .config file used in build | ||
Description
mwadsten
2021-06-15 16:32:38 UTC
(In reply to mwadsten from comment #0)
> In any case, it is technically incorrect to use vfork the way
> that debianutils/start-stop-daemon.c does, because there is no
> defined guarantee that the result of getpid() in the grandchild process is correct.

Can you expand on this? Where does it say that getpid() after vfork() is not guaranteed to return the correct value?

(In reply to mwadsten from comment #0)
> This might be a compatibility issue with the 2.6.35 kernel, or whatever
> version of glibc is present on this platform. I am unable to reproduce
> this with a host build of the same code, running under WSL2
> (Ubuntu 20.04; Linux 5.4.72). I had a coworker try my reproduction on another,
> newer Linux-based system (kernel 5.x?), and we were unable to reproduce
> it there either. (That system also uses musl instead of glibc.)

I vaguely remember that glibc used to have a pid caching (mis)feature: to save a few microseconds, it remembered the result of the first getpid() call and avoided a syscall if getpid() was called again. Try

strace busybox lsof 2>&1 | grep getpid

to see whether "pid caching" is happening. If yes, you'll see only one getpid() call.

Looks like you remembered correctly: glibc 2.25 (released 2017-02-05) removed the getpid caching behavior. https://sourceware.org/glibc/wiki/Release/2.25#pid_cache_removal

Do you know of any other recent busybox updates which rely on newer glibc versions (or other parts of the system) that we should be aware of? I don't see any callouts like "busybox now requires glibc 1.x+" on busybox.net.

(In reply to mwadsten from comment #3)
> Do you know of any other recent busybox updates which rely on newer glibc versions (or other parts of the system) that we should be aware of?

None. Busybox should work okay with the pid cache as well: it does not use a direct clone syscall or anything like that. I assume you are seeing a glibc bug with respect to the pid cache (maybe it's not invalidated on vfork).
strace busybox lsof 2>&1 | grep getpid is actually giving an empty response on this system. I did it without grep and I definitely don't see any getpid calls. I also don't see getpid if I do strace -F start-stop-daemon ... (not even from the resolution of $$). Maybe that IS a sign that this glibc is caching the getpid result (and not reaching a syscall).

If I'm reading the features.h header correctly, it's glibc 2.12.

Given that the glibc 2.25 release notes explicitly say "it was deemed safer to remove the cache than to potentially return a wrong answer" in regards to removing the PID cache, I assume that is the root cause here. In that case, would you agree that we can characterize this as a busybox compatibility issue with glibc < 2.25?

This is rather weird. I suggest you dig deeper, like trying
int main() { return getpid(); }
under strace and gdb.