Discussion:
[musl] Deadlock when calling fflush/fclose in multiple threads
d***@kooiot.com
2018-11-02 05:11:00 UTC
Permalink
Hi,

We got deadlock on fflush/fclose with musl-1.1.19 (openwrt 18.06). Actually we using lua's popen in mutiple threads, following is gdb trace.

I am new to musl libc source code, fflush(NULL) will call __ofl_lock and then try to lock and flush every stream, fclose will lock the stream and then __ofl_lock. The question is the fflush/fclose api thread-safe? What i have got from man document is that linux fflush/fclose is thread-safe api.

(gdb) info thread
Id Target Id Frame
* 1 LWP 2772 "skynet" __cp_end () at src/thread/i386/syscall_cp.s:31
2 LWP 2776 "skynet" __cp_end () at src/thread/i386/syscall_cp.s:31
3 LWP 2777 "skynet" __cp_end () at src/thread/i386/syscall_cp.s:31
4 LWP 2778 "skynet" 0xb7f95ad9 in __kernel_vsyscall ()
5 LWP 2779 "skynet" __cp_end () at src/thread/i386/syscall_cp.s:31
6 LWP 2780 "skynet" 0xb7f95ad9 in __kernel_vsyscall ()
7 LWP 2781 "skynet" 0xb7f95ad9 in __kernel_vsyscall ()
8 LWP 2782 "skynet" __cp_end () at src/thread/i386/syscall_cp.s:31
(gdb) thread 6
[Switching to thread 6 (LWP 2780)]
#0 0xb7f95ad9 in __kernel_vsyscall ()
(gdb) bt
#0 0xb7f95ad9 in __kernel_vsyscall ()
#1 0xb7fb1314 in __vsyscall () at src/internal/i386/syscall.s:19
#2 0xb7fe29a9 in __syscall4 (a4=0, a3=2781, a2=128, a1=136262812, n=240) at ./arch/i386/syscall_arch.h:37
#3 __wait (addr=0x81f349c, waiters=0x81f34a0, val=2781, priv=128) at src/thread/__wait.c:13
#4 0xb7fd7a52 in __lockfile (f=0x81f3450) at src/stdio/__lockfile.c:10
#5 0xb7fd829d in fflush (f=0x81f3450) at src/stdio/fflush.c:13
#6 0x0807129e in io_popen (L=0x81e5f04) at liolib.c:282
#7 0x0805a064 in luaD_precall (L=0x81e5f04, func=0x81e638c, nresults=2) at ldo.c:434
#8 0x080661f0 in luaV_execute (L=0x81e5f04) at lvm.c:1146
#9 0x08059dc1 in unroll (L=0x81e5f04, ud=0x0) at ldo.c:556
#10 0x080596bb in luaD_rawrunprotected (L=0x81e5f04, f=0x805a250 <resume>, ud=0xb778a5f8) at ldo.c:142
#11 0x0805a46d in lua_resume (L=0x81e5f04, from=0x81d8bd4, nargs=<optimized out>) at ldo.c:664
#12 0x0806f252 in auxresume (L=***@entry=0x81d8bd4, co=***@entry=0x81e5f04, narg=3) at lcorolib.c:39
#13 0x0806f5a5 in luaB_coresume (L=0x81d8bd4) at lcorolib.c:60
#14 0xb7732ca9 in timing_resume (L=0x81d8bd4) at lualib-src/lua-profile.c:134
#15 0x0805a064 in luaD_precall (L=0x81d8bd4, func=0x81e4144, nresults=-1) at ldo.c:434
#16 0x08066686 in luaV_execute (L=0x81d8bd4) at lvm.c:1162
#17 0x0805a348 in luaD_call (L=0x81d8bd4, func=0x81e4084, nResults=-1) at ldo.c:499
#18 0x0805a3ac in luaD_callnoyield (L=0x81d8bd4, func=0x81e4084, nResults=-1) at ldo.c:509
#19 0x080560e8 in f_call (L=0x81d8bd4, ud=0xb778a8e8) at lapi.c:944
#20 0x080596bb in luaD_rawrunprotected (L=0x81d8bd4, f=0x80560d0 <f_call>, ud=0xb778a8e8) at ldo.c:142
#21 0x0805a656 in luaD_pcall (L=0x81d8bd4, func=0x80560d0 <f_call>, u=0xb778a8e8, old_top=132, ef=0) at ldo.c:729
#22 0x08057969 in lua_pcallk (L=0x81d8bd4, nargs=5, nresults=-1, errfunc=0, ctx=0, k=0x806f070 <finishpcall>) at lapi.c:970
#23 0x0806f116 in luaB_pcall (L=0x81d8bd4) at lbaselib.c:424
#24 0x0805a064 in luaD_precall (L=0x81d8bd4, func=0x81e406c, nresults=2) at ldo.c:434
#25 0x080661f0 in luaV_execute (L=0x81d8bd4) at lvm.c:1146
#26 0x0805a348 in luaD_call (L=0x81d8bd4, func=0x81e4024, nResults=0) at ldo.c:499
#27 0x0805a3ac in luaD_callnoyield (L=0x81d8bd4, func=0x81e4024, nResults=0) at ldo.c:509
#28 0x080560e8 in f_call (L=0x81d8bd4, ud=0xb778abb8) at lapi.c:944
#29 0x080596bb in luaD_rawrunprotected (L=0x81d8bd4, f=0x80560d0 <f_call>, ud=0xb778abb8) at ldo.c:142
#30 0x0805a656 in luaD_pcall (L=0x81d8bd4, func=0x80560d0 <f_call>, u=0xb778abb8, old_top=36, ef=12) at ldo.c:729
#31 0x08057969 in lua_pcallk (L=0x81d8bd4, nargs=5, nresults=0, errfunc=1, ctx=0, k=0x0) at lapi.c:970
#32 0xb772978e in _cb (context=0x81da400, ud=0x81d8bd4, type=1, session=2, source=3, msg=0x81b1080, sz=2) at lualib-src/lua-skynet.c:75
#33 0x0804ecb1 in dispatch_message (ctx=***@entry=0x81da400, msg=***@entry=0xb778ac80) at skynet-src/skynet_server.c:274
#34 0x0804f97a in skynet_context_message_dispatch (sm=0x815b2a0, q=0x81da470, weight=-1) at skynet-src/skynet_server.c:334
#35 0x080501a9 in thread_worker (p=0xbffffb9c) at skynet-src/skynet_start.c:163
#36 0xb7fe3fc5 in start (p=0xb778ad54) at src/thread/pthread_create.c:150
#37 0xb7fe2b3a in __clone () at src/thread/i386/clone.s:36
#38 0x00000000 in ?? ()
(gdb) thread 7
[Switching to thread 7 (LWP 2781)]
#0 0xb7f95ad9 in __kernel_vsyscall ()
(gdb) bt
#0 0xb7f95ad9 in __kernel_vsyscall ()
#1 0xb7fb1314 in __vsyscall () at src/internal/i386/syscall.s:19
#2 0xb7fe278b in __syscall3 (a3=-2147483646, a2=128, a1=-1207963496, n=240) at ./arch/i386/syscall_arch.h:30
#3 __futexwait (priv=128, val=-2147483646, addr=0xb7fff098 <ofl_lock>) at ./src/internal/pthread_impl.h:150
#4 __lock (l=0xb7fff098 <ofl_lock>) at src/thread/__lock.c:42
#5 0xb7fd9f30 in __ofl_lock () at src/stdio/ofl.c:9
#6 0xb7fd8144 in fclose (f=0x81f3450) at src/stdio/fclose.c:17
#7 0xb7fda4e2 in pclose (f=0x81f3450) at src/stdio/pclose.c:9
#8 0x0807130e in io_pclose (L=0x81f3034) at liolib.c:273
#9 0x0807090a in aux_close (L=0x81f3034) at liolib.c:205
#10 0x0805a064 in luaD_precall (L=0x81f3034, func=0x81f312c, nresults=0) at ldo.c:434
#11 0x080661f0 in luaV_execute (L=0x81f3034) at lvm.c:1146
#12 0x080596bb in luaD_rawrunprotected (L=0x81f3034, f=0x805a250 <resume>, ud=0xb77748c8) at ldo.c:142
#13 0x0805a46d in lua_resume (L=0x81f3034, from=0x81e6654, nargs=<optimized out>) at ldo.c:664
#14 0x0806f252 in auxresume (L=***@entry=0x81e6654, co=***@entry=0x81f3034, narg=0) at lcorolib.c:39
#15 0x0806f5a5 in luaB_coresume (L=0x81e6654) at lcorolib.c:60
#16 0xb7732ca9 in timing_resume (L=0x81e6654) at lualib-src/lua-profile.c:134
#17 0x0805a064 in luaD_precall (L=0x81e6654, func=0x81f09e4, nresults=-1) at ldo.c:434
#18 0x08066686 in luaV_execute (L=0x81e6654) at lvm.c:1162
#19 0x0805a348 in luaD_call (L=0x81e6654, func=0x81f0924, nResults=0) at ldo.c:499
#20 0x0805a3ac in luaD_callnoyield (L=0x81e6654, func=0x81f0924, nResults=0) at ldo.c:509
#21 0x080560e8 in f_call (L=0x81e6654, ud=0xb7774bb8) at lapi.c:944
#22 0x080596bb in luaD_rawrunprotected (L=0x81e6654, f=0x80560d0 <f_call>, ud=0xb7774bb8) at ldo.c:142
#23 0x0805a656 in luaD_pcall (L=0x81e6654, func=0x80560d0 <f_call>, u=0xb7774bb8, old_top=36, ef=12) at ldo.c:729
#24 0x08057969 in lua_pcallk (L=0x81e6654, nargs=5, nresults=0, errfunc=1, ctx=0, k=0x0) at lapi.c:970
#25 0xb772978e in _cb (context=0x81e6e00, ud=0x81e6654, type=1, session=1, source=0, msg=0x0, sz=0) at lualib-src/lua-skynet.c:75
#26 0x0804ecb1 in dispatch_message (ctx=***@entry=0x81e6e00, msg=***@entry=0xb7774c80) at skynet-src/skynet_server.c:274
#27 0x0804f97a in skynet_context_message_dispatch (sm=0x815b2c0, q=0x81e6ec0, weight=-1) at skynet-src/skynet_server.c:334
#28 0x080501a9 in thread_worker (p=0xbffffba8) at skynet-src/skynet_start.c:163
#29 0xb7fe3fc5 in start (p=0xb7774d54) at src/thread/pthread_create.c:150
#30 0xb7fe2b3a in __clone () at src/thread/i386/clone.s:36
#31 0x00000000 in ?? ()
(gdb)



***@kooiot.com
Rich Felker
2018-11-02 14:29:15 UTC
Permalink
Post by d***@kooiot.com
Hi,
We got deadlock on fflush/fclose with musl-1.1.19 (openwrt 18.06).
Actually we using lua's popen in mutiple threads, following is gdb
trace.
I am new to musl libc source code, fflush(NULL) will call __ofl_lock
and then try to lock and flush every stream, fclose will lock the
stream and then __ofl_lock. The question is the fflush/fclose api
thread-safe? What i have got from man document is that linux
fflush/fclose is thread-safe api.
Your analysis is exactly correct. Calling fflush(NULL) frequently (or
at all) is a really bad idea because of how it scales and how
serializing it is, but it is valid, and the deadlock is a bug in musl.

The current placement of the ofl update seems to have been based on
minimizing how serializing fclose is, and on avoiding taking the
global lock for F_PERM (stdin/out/err) FILEs (which is largely a
useless optimization since the operation can happen at most 3 times).
Just moving it above the FLOCK (and making it not conditional on
F_PERM, to avoid data races) would solve this, but there's a deeper
bug here too.

By removing the FILE being closed from the open file list (and
unlocking the open file list, without which the removal can't be seen)
before it's flushed and closed, fclose creates a race window where
fflush(NULL) or exit() from another thread can complete without this
file being flushed, potentially causing data loss.

I think we just have to move the __ofl_lock to the top of the
function, before FLOCK, and the __ofl_unlock to after the
fflush/close. Unfortunately this makes fclose much more serializing
than it was before, but I don't see any way to avoid it.

Rich
John Starks
2018-11-02 15:15:51 UTC
Permalink
Post by Rich Felker
I think we just have to move the __ofl_lock to the top of the
function, before FLOCK, and the __ofl_unlock to after the
fflush/close. Unfortunately this makes fclose much more serializing
than it was before, but I don't see any way to avoid it.
Perhaps you could keep a global count of FILEs that are still being flushed after having been removed from the list. fflush could perform a futex wait on this becoming 0.
Rich Felker
2018-11-02 15:33:52 UTC
Permalink
Post by John Starks
Post by Rich Felker
I think we just have to move the __ofl_lock to the top of the
function, before FLOCK, and the __ofl_unlock to after the
fflush/close. Unfortunately this makes fclose much more serializing
than it was before, but I don't see any way to avoid it.
Perhaps you could keep a global count of FILEs that are still being
flushed after having been removed from the list. fflush could
perform a futex wait on this becoming 0.
I think such an approach is plausible, but involves the kind of
complex and error-prone direct use of atomics I'm actively trying to
eliminate. The same could be done without low level hacks via clever
use of rwlocks or a mutex+condvar pair, but all of these involve
namespace-safety issues and a lot more code than should be introduced
into minimal static programs using stdio.

For what it's worth, the only consumers of the open file list that can
be executed more than once are fflush(NULL), fclose, and __ofl_add
(used by fopen, etc.).

Rich
John Starks
2018-11-02 15:37:18 UTC
Permalink
Post by Rich Felker
I think such an approach is plausible, but involves the kind of
complex and error-prone direct use of atomics I'm actively trying to
eliminate. The same could be done without low level hacks via clever
use of rwlocks or a mutex+condvar pair, but all of these involve
namespace-safety issues and a lot more code than should be introduced
into minimal static programs using stdio.
For what it's worth, the only consumers of the open file list that can
be executed more than once are fflush(NULL), fclose, and __ofl_add
(used by fopen, etc.).
Rich
Yeah, I can see the desire to avoid additional complexity. It would be a shame, though, for fopen or fclose of a local file to block behind flushing during an fclose of a file on a network file system, for example.
Rich Felker
2018-11-02 15:46:11 UTC
Permalink
Post by John Starks
Post by Rich Felker
I think such an approach is plausible, but involves the kind of
complex and error-prone direct use of atomics I'm actively trying to
eliminate. The same could be done without low level hacks via clever
use of rwlocks or a mutex+condvar pair, but all of these involve
namespace-safety issues and a lot more code than should be introduced
into minimal static programs using stdio.
For what it's worth, the only consumers of the open file list that can
be executed more than once are fflush(NULL), fclose, and __ofl_add
(used by fopen, etc.).
Rich
Yeah, I can see the desire to avoid additional complexity. It would
be a shame, though, for fopen or fclose of a local file to block
behind flushing during an fclose of a file on a network file system,
for example.
Indeed. Actually it seems to be a show-stopping bug, but I have a
better solution. See the other branch of this thread.

Rich
Rich Felker
2018-11-02 15:45:28 UTC
Permalink
Post by Rich Felker
By removing the FILE being closed from the open file list (and
unlocking the open file list, without which the removal can't be seen)
before it's flushed and closed, fclose creates a race window where
fflush(NULL) or exit() from another thread can complete without this
file being flushed, potentially causing data loss.
I think we just have to move the __ofl_lock to the top of the
function, before FLOCK, and the __ofl_unlock to after the
fflush/close. Unfortunately this makes fclose much more serializing
than it was before, but I don't see any way to avoid it.
Another possibility seems to be moving the ofl lock and after the
fflush, close, and FUNLOCK, but before the free. This leaves a 'dead'
FILE in the open file list momentarily, but the only things that can
act on it are pthread_create's init_file_lock, __stdio_exit's
close_file, and fflush(NULL), and none of these can have any side
effects except on a FILE with buffered data (which the FILE being
closed can't have at this point).

I think I like this solution better, and I think it's necessary to do
something other than the above-quoted idea; holding the ofl lock
during flush can itself cause deadlock, since the flush could block
and forward progress of whatever has it blocked (e.g. other end of
socket or pipe) could depend on forward progress of fclose, fopen,
etc. in another thread. Also, in light of having added support for
application-provided buffers with setvbuf, even on regular files the
flush operation could take a long time.

Rich
d***@kooiot.com
2018-11-02 15:59:13 UTC
Permalink
In our case, fflush(NULL) is called in lua's popen. free/df/loadavg will be executed in popen to collect device status in multiple threads, for each five seconds. And the case for me to reproduce this bug is calling popen In infinite loop in two threads, the deadlock happens within one second in our application.

This deadlock happens on x86-64 openwrt (virtual machine), and we haven't meet this deadlock in nxp imx6ul(arm-cortex-a9) and tplink 702n(mips24_kc). I was thought it might caused by 64bits, and we used glibc for those amd64 vms. We started port our application to Raspberry 3Bfrom last month, we got same deaklock(the executed commands become zombie). And i got chance to loop at this issue again.



来自 魅族 PRO 6s

-------- 原始邮件 --------
发件人Rich Felker <***@libc.org>
时闎2018幎11月2日 22:29
收件人***@kooiot.com
抄送musl <***@lists.openwall.com>
䞻题Re: [musl] Deadlock when calling fflush/fclose in multiple threads
Post by Rich Felker
Post by d***@kooiot.com
Hi,
We got deadlock on fflush/fclose with musl-1.1.19 (openwrt 18.06).
Actually we using lua's popen in mutiple threads, following is gdb
trace.
I am new to musl libc source code, fflush(NULL) will call __ofl_lock
and then try to lock and flush every stream, fclose will lock the
stream and then __ofl_lock. The question is the fflush/fclose api
thread-safe? What i have got from man document is that linux
fflush/fclose is thread-safe api.
Your analysis is exactly correct. Calling fflush(NULL) frequently (or
at all) is a really bad idea because of how it scales and how
serializing it is, but it is valid, and the deadlock is a bug in musl.
The current placement of the ofl update seems to have been based on
minimizing how serializing fclose is, and on avoiding taking the
global lock for F_PERM (stdin/out/err) FILEs (which is largely a
useless optimization since the operation can happen at most 3 times).
Just moving it above the FLOCK (and making it not conditional on
F_PERM, to avoid data races) would solve this, but there's a deeper
bug here too.
By removing the FILE being closed from the open file list (and
unlocking the open file list, without which the removal can't be seen)
before it's flushed and closed, fclose creates a race window where
fflush(NULL) or exit() from another thread can complete without this
file being flushed, potentially causing data loss.
I think we just have to move the __ofl_lock to the top of the
function, before FLOCK, and the __ofl_unlock to after the
fflush/close. Unfortunately this makes fclose much more serializing
than it was before, but I don't see any way to avoid it.
Rich
Loading...