In a stress-test of usrsctp (the same test as was attached to #709) I saw a deadlock between usrsctp_close and usrsctp_conninput. Looking at the code, I suspect this could happen for the kernel implementation as well.
The issue is that sctp_common_input_processing acquires (in sctp_findassociation_addr) stcb->tcb_mtx, then, through the call stack sctp_process_data -> sctp_process_a_data_chunk -> sctp_add_to_readq, tries to acquire inp->inp_mtx. Meanwhile, sctp_close acquires inp->inp_mtx, then, in sctp_inpcb_free, tries to acquire stcb->tcb_mtx.
(Note: the line numbers shown in the crash are from #710, but nothing in that PR should have affected this deadlock.)
Excerpted gdb info:
(gdb) info threads
Id Target Id Frame
* 1 Thread 0xffffa8923020 (LWP 2157812) "crash_repro" futex_wait (private=0, expected=2, futex_word=0xffff8c015c70)
at ../sysdeps/nptl/futex-internal.h:146
...
13 Thread 0xffffa253f120 (LWP 2165461) "crash_repro" futex_wait (private=0, expected=2, futex_word=0xffff8c02e4e8)
at ../sysdeps/nptl/futex-internal.h:146
(gdb) bt
#0 futex_wait (private=0, expected=2, futex_word=0xffff8c015c70) at ../sysdeps/nptl/futex-internal.h:146
#1 __GI___lll_lock_wait (futex=futex@entry=0xffff8c015c70, private=private@entry=0) at ./nptl/lowlevellock.c:49
#2 0x0000ffffa868070c in lll_mutex_lock_optimized (mutex=0xffff8c015c70) at ./nptl/pthread_mutex_lock.c:48
#3 ___pthread_mutex_lock (mutex=mutex@entry=0xffff8c015c70) at ./nptl/pthread_mutex_lock.c:93
#4 0x0000ffffa88b0aa4 in sctp_inpcb_free (inp=inp@entry=0xffff8c02e140, immediate=immediate@entry=1, from=from@entry=1)
at ../../usrsctplib/netinet/sctp_pcb.c:4083
#5 0x0000ffffa88b854c in sctp_close (so=so@entry=0xffff8c02b1e0) at ../../usrsctplib/netinet/sctp_usrreq.c:891
#6 0x0000ffffa8863a7c in sofree (so=0xffff8c02b1e0) at ../../usrsctplib/user_socket.c:287
#7 0x0000ffffa8867aa8 in usrsctp_close (so=<optimized out>) at ../../usrsctplib/user_socket.c:2005
#8 0x0000aaaab6c020c8 in close_socket (o=0xaaaada80e3e0) at crash_repro.c:164
#9 run_test (close_ns=close_ns@entry=198272357) at crash_repro.c:245
#10 0x0000aaaab6c014a8 in main () at crash_repro.c:284
Unfortunately, it looks like this was a consequence of #710 after all: unexpectedly, the library appears to depend on the socket's reference count not dropping to zero during sctp_common_input_processing.