Bug 2341 - Forked new process with locked malloc mutex
Summary: Forked new process with locked malloc mutex
Status: NEW
Alias: None
Product: uClibc
Classification: Unclassified
Component: Threads
Version: 0.9.32
Hardware: PC Linux
Importance: P5 normal
Target Milestone: ---
Assignee: unassigned
Reported: 2010-08-05 13:17 UTC by Vladimir Sorokin
Modified: 2015-04-16 09:20 UTC

Attachments
Patch for relocking malloc mutex in libc_fork (1.02 KB, patch)
2010-08-05 13:17 UTC, Vladimir Sorokin
test case to reproduce the deadlock behaviour using setenv() (377 bytes, application/octet-stream)
2010-12-03 20:46 UTC, Robin Haberkorn
patch to apply in addition to the malloc-fork patch to resolve setenv() 'deadlocks' (2.74 KB, patch)
2010-12-03 21:03 UTC, Robin Haberkorn

Description Vladimir Sorokin 2010-08-05 13:17:59 UTC
Created attachment 2293
Patch for relocking malloc mutex in libc_fork

This happens in multithreaded applications. If one thread calls fork() while another thread is working with memory (calling malloc(), free(), etc.), the new process starts with the malloc mutex locked (libc/stdlib/malloc-standard/malloc.c: __malloc_lock). The thread that held the lock does not exist in the child, so nothing there ever unlocks it. In linuxthreads.old, ptfork.c locks all malloc mutexes (which ones depends on the malloc subsystem type) before the fork and unlocks them afterwards. glibc uses code like the following:
/* prepare handler: take the malloc lock before fork() */
static void
ptmalloc_lock_all (void)
{
    ...
    __pthread_mutex_lock(&__malloc_lock);
    ...
}

/* parent handler: release the lock after fork() */
static void
ptmalloc_unlock_all (void)
{
    ...
    __pthread_mutex_unlock(&__malloc_lock);
    ...
}

/* child handler: reinitialize the lock in the new process */
static void
ptmalloc_unlock_all2 (void)
{
    ...
    __pthread_mutex_init(&__malloc_lock, ...);
    ...
}

void
ptmalloc_init (void)
{
    ...
    __pthread_atfork(ptmalloc_lock_all, ptmalloc_unlock_all, ptmalloc_unlock_all2);
    ...
}

But on my system (x86_64), using pthread_atfork causes a segmentation fault.
So I modified the libc_fork function directly (patch attached).

P.S.: Several other mutexes may also be forked incorrectly. This needs to be checked.
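
For illustration, the prepare/parent/child pattern in the listing above can be sketched at application level with the public pthread_atfork() call. This is only a sketch, not the attached patch: demo_lock, the three handlers and fork_with_handlers() are hypothetical names, demo_lock merely stands in for __malloc_lock, and the child handler unlocks the mutex instead of reinitializing it as ptmalloc_unlock_all2 does. No second thread is started here, to keep the example short.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical application-level lock standing in for __malloc_lock. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the forking thread before fork(): wait until no one holds the lock. */
static void demo_prepare(void) { pthread_mutex_lock(&demo_lock); }

/* Runs in the parent after fork(): release the lock taken in demo_prepare. */
static void demo_parent(void) { pthread_mutex_unlock(&demo_lock); }

/* Runs in the child after fork(): the child's only thread is the copy of the
 * forking thread, which owns the lock, so it can release it. */
static void demo_child(void) { pthread_mutex_unlock(&demo_lock); }

/* Forks once with the handlers installed.
 * Returns 0 if the child could take demo_lock, 1 if not, -1 on error. */
int fork_with_handlers(void)
{
    static int registered = 0;
    int status;
    pid_t pid;

    if (!registered) {
        if (pthread_atfork(demo_prepare, demo_parent, demo_child) != 0)
            return -1;
        registered = 1;
    }

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: the lock must be free here, whatever other threads
         * of the parent were doing at the time of the fork. */
        _exit(pthread_mutex_trylock(&demo_lock) == 0 ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

Because the prepare handler blocks until the lock is free, the fork can only happen while no other thread is inside the protected section.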
Comment 1 Robin Haberkorn 2010-12-03 20:46:39 UTC
Created attachment 2761
test case to reproduce the deadlock behaviour using setenv()

The same after-fork deadlock happens with setenv() as well.

First, setenv() calls malloc(), so calling setenv() in the child process can hang if the parent had malloc() calls in flight at the time of forking. Second, even with the malloc() patch applied, setenv()'s own mutex can cause deadlocks.
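
The inherited-lock state behind these hangs can be shown with any mutex, not only the ones inside malloc() or setenv(). The following is a rough sketch and not the attached test case: held_lock, holder() and child_inherits_locked_mutex() are hypothetical names, and the child uses pthread_mutex_trylock() instead of a blocking lock so that it reports the state rather than hanging.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock standing in for a libc-internal mutex. */
static pthread_mutex_t held_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t lock_taken;    /* posted once the holder thread owns the lock */
static sem_t release_lock;  /* posted when the holder may unlock again */

/* Second thread: holds the lock across the fork in the main thread. */
static void *holder(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&held_lock);
    sem_post(&lock_taken);
    sem_wait(&release_lock);
    pthread_mutex_unlock(&held_lock);
    return NULL;
}

/* Returns 1 if the child found the mutex locked, 0 if not, -1 on error. */
int child_inherits_locked_mutex(void)
{
    pthread_t tid;
    pid_t pid;
    int status;

    if (sem_init(&lock_taken, 0, 0) != 0 || sem_init(&release_lock, 0, 0) != 0)
        return -1;
    if (pthread_create(&tid, NULL, holder, NULL) != 0)
        return -1;
    sem_wait(&lock_taken);          /* the other thread now owns the lock */

    pid = fork();
    if (pid == 0) {
        /* Child: the holder thread was not copied, so no one will ever
         * unlock held_lock here. A blocking lock would hang forever.
         * (trylock is used for demonstration only.) */
        _exit(pthread_mutex_trylock(&held_lock) == EBUSY ? 1 : 0);
    }

    /* Parent: let the holder finish, then collect the child's answer. */
    sem_post(&release_lock);
    pthread_join(tid, NULL);
    if (pid < 0 || waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In the parent the lock is released normally once the holder thread continues; only the child's copy stays locked.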
Comment 2 Robin Haberkorn 2010-12-03 21:03:14 UTC
Created attachment 2767
patch to apply in addition to the malloc-fork patch to resolve setenv() 'deadlocks'

Patch for setenv()/fork() in a similar spirit to patch 2293 (based on a different uClibc git checkout, though). This is only a quick fix and might not be completely correct.

I'd like to raise another point: can this be considered a bug at all?
glibc's malloc implementation makes malloc() safe after forking by using atfork handlers. However, POSIX.1-2001 states that, for the very reason that causes these 'deadlocks', a child created from a multi-threaded process should call only async-signal-safe functions (which are guaranteed not to use any threading primitives) after fork() and before exec().

See: http://www.opengroup.org/onlinepubs/000095399/functions/fork.html

So am I missing some specification that uClibc wants to support and that states this has to work, or is it just a nice-to-have glibc-compatibility feature?
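
For comparison, the POSIX restriction cited above can be followed by preparing everything in the parent and calling only async-signal-safe functions in the child. This is a minimal sketch under that assumption; spawn_and_wait() is a hypothetical helper, and the environment is passed to execve() directly instead of calling setenv() after the fork.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `path` with the given argv and envp and returns
 * its exit status, 127 if the exec failed, or -1 on other errors.
 * argv and envp are built by the caller BEFORE the fork, so the child never
 * needs malloc() or setenv(). */
int spawn_and_wait(const char *path, char *const argv[], char *const envp[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: only async-signal-safe calls from here on. */
        execve(path, argv, envp);
        _exit(127);                 /* reached only if execve() failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would build the child's environment up front, for example as an array such as { "DEMO_VAR=1", NULL }, and pass it as envp.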
Comment 3 Bernhard Reutner-Fischer 2011-02-09 18:11:53 UTC
Hi,

Offhand, I think this is a nice-to-have compat feature (which should be nullified with NPTL).
Please double-check current master on i486 or any other arch that already has NPTL support (which excludes x86_64 ATM).
Comment 4 Timo Teräs 2011-03-26 19:11:17 UTC
This problem is present on all NPTL builds too.

An alternate patch (which I consider cleaner) is at:
http://lists.uclibc.org/pipermail/uclibc/2011-March/045117.html
Comment 5 Roman I Khimov 2015-04-06 07:11:50 UTC
Why was this one changed to busybox? It's not a busybox bug, it's a uclibc bug.