Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deadlock when using ZRTP as transport in pjsip_2.5-trunk #31

Open
efendizadeh opened this issue Mar 31, 2016 · 5 comments
Open

Deadlock when using ZRTP as transport in pjsip_2.5-trunk #31

efendizadeh opened this issue Mar 31, 2016 · 5 comments

Comments

@efendizadeh
Copy link

First of all thank you for great work.
We are using pjsip in connection with ZRTP4PJ+ZRTPCPP for delivering secure calls. In development we faced with deadlock problem. After some investigation we found following situation:

As you can see from attached logs (start from thread 20). Pjsip initiates sending of rtp and passes it to zrtp transport over transport_send_rtp function for further processing. Via zrtp4pj API zrtpcpp processEvent(event_t *ev) method (zrtp/ZrtpStateClass.cpp Line 81). In processEvent method synchEnter is called, which locks zrtp mutex (let’s call it mutex_1). After that we returning back to pjsip transport_send_rtp where we have internal pjsip mutex lock (let’s call it mutex_2). At the same in thread 15, Pjsip, before calling srtp_rtp_cb, locks mutex_2, then process goes to start zrtp transport and we come in processEvent method's synchEnter call, where mutex_1 is already locked. Sample diagram attached.

After some investigation we decided that, internal zrtp synchronisation is not needed any more, pjsip is responsible for packet processing synchronisation now. We removed synchEnter and synchLeave and have tested most cases. At this time flight is success :) And the question is that, is this approach is correct?

Thread 15
2 pthread_mutex_lock ()
3 pj_mutex_lock at /src/pj/os_core_unix.c:1265
4 zrtp_synchEnter at /zsrtp/transport_zrtp.c:768
5 ZrtpCallbackWrapper::synchEnter() at /zsrtp/zrtp/zrtp/ZrtpCallbackWrapper.cpp:121
6 ZRtp::synchEnter() at /zsrtp/zrtp/zrtp/ZRtp.cpp:2568
7 ZrtpStateClass::processEvent(Event*) at /zsrtp/zrtp/zrtp/ZrtpStateClass.cpp:86
8 ZRtp::startZrtpEngine() at /zsrtp/zrtp/zrtp/ZRtp.cpp:253
9 ::zrtp_startZrtpEngine(ZrtpContext *) at /zsrtp/zrtp/zrtp/ZrtpCWrapper.cpp:119
10 pjmedia_transport_zrtp_startZrtp at /zsrtp/transport_zrtp.c:859
11 transport_rtp_cb at /zsrtp/transport_zrtp.c:987
12 srtp_rtp_cb at /src/pjmedia/transport_srtp.c:971
13 ice_on_rx_data at /src/pjmedia/transport_ice.c:1779
14 ice_rx_data at /src/pjnath/ice_strans.c:1467
15 pj_ice_sess_on_rx_pkt at /src/pjnath/ice_session.c:3020
16 stun_on_rx_data at /src/pjnath/ice_strans.c:1508
17 on_data_recvfrom at /src/pjnath/stun_sock.c:957
18 ioqueue_on_read_complete at /src/pj/activesock.c:498
19 ioqueue_dispatch_read_event at /src/pj/ioqueue_common_abs.c:605
20 pj_ioqueue_poll at /src/pj/ioqueue_select.c:981
21 pjsip_endpt_handle_events2 at /src/pjsip/sip_endpoint.c:741
22 pjsua_handle_events at /src/pjsua-lib/pjsua_core.c:1858
23 worker_thread at /src/pjsua-lib/pjsua_core.c:696
24 thread_main at /src/pj/os_core_unix.c:541
25 _pthread_body ()
26 _pthread_start ()
27 thread_start ()

Thread 20
2 pthread_mutex_lock_slow ()
3 pj_mutex_lock at /src/pj/os_core_unix.c:1265
4 pj_lock_acquire at /src/pj/lock.c:180
5 grp_lock_acquire at /src/pj/lock.c:290
6 pj_grp_lock_acquire at /src/pj/lock.c:478
7 pj_ice_strans_sendto at /src/pjnath/ice_strans.c:1247
8 transport_send_rtp at /src/pjmedia/transport_ice.c:1655
9 pjmedia_transport_send_rtp at /include/pjmedia/transport.h:678
10 transport_send_rtp at /src/pjmedia/transport_srtp.c:853
11 pjmedia_transport_send_rtp at /include/pjmedia/transport.h:678
12 zrtp_sendDataZRTP at /zsrtp/transport_zrtp.c:461
13 ZrtpCallbackWrapper::sendDataZRTP(unsigned char const
, int) at /zsrtp/zrtp/zrtp/ZrtpCallbackWrapper.cpp:37
14 ZRtp::sendPacketZRTP(ZrtpPacketBase_) at /zsrtp/zrtp/zrtp/ZRtp.cpp:2577
15 ZrtpStateClass::evInitial() at /zsrtp/zrtp/zrtp/ZrtpStateClass.cpp:166
16 ZrtpStates::processEvent(ZrtpStateClass&) at /zsrtp/zrtp/zrtp/libzrtpcpp/ZrtpStates.h:68
17 ZrtpStateClass::processEvent(Event_) at /zsrtp/zrtp/zrtp/ZrtpStateClass.cpp:151
18 ZRtp::startZrtpEngine() at /zsrtp/zrtp/zrtp/ZRtp.cpp:253
19 ::zrtp_startZrtpEngine(ZrtpContext ) at /zsrtp/zrtp/zrtp/ZrtpCWrapper.cpp:119
20 pjmedia_transport_zrtp_startZrtp at /zsrtp/transport_zrtp.c:859
21 transport_send_rtp at /zsrtp/transport_zrtp.c:1154
22 pjmedia_transport_send_rtp at /include/pjmedia/transport.h:678
23 put_frame_imp at /src/pjmedia/stream.c:1382
24 put_frame at /src/pjmedia/stream.c:1511
25 pjmedia_port_put_frame at /src/pjmedia/port.c:115
26 write_port at /src/pjmedia/conference.c:1760
27 get_frame at /src/pjmedia/conference.c:1998
28 pjmedia_port_get_frame at /src/pjmedia/port.c:98
29 resample_get_frame at /src/pjmedia/resample_port.c:182
30 pjmedia_port_get_frame at /src/pjmedia/port.c:98
31 play_cb at /src/pjmedia/sound_port.c:88
32 output_renderer at /src/pjmedia-audiodev/coreaudio_dev.m:1073
33 AUInputElement::PullInput(unsigned long&, AudioTimeStamp const&, unsigned long, unsigned long) ()
34 AUInputFormatConverter2::InputProc(OpaqueAudioConverter_, unsigned long_, AudioBufferList_, AudioStreamPacketDescription
, void) ()
35 AudioConverterChain::CallInputProc(unsigned long) ()
36 AudioConverterChain::FillBufferFromInputProc(unsigned long_, CABufferList_) ()
37 BufferedAudioConverter::GetInputBytes(unsigned long, unsigned long&, CABufferList const_&) ()
38 CBRConverter::RenderOutput(CABufferList_, unsigned long, unsigned long&, AudioStreamPacketDescription_) ()
39 BufferedAudioConverter::FillBuffer(unsigned long&, AudioBufferList&, AudioStreamPacketDescription_) ()
40 AudioConverterChain::RenderOutput(CABufferList_, unsigned long, unsigned long&, AudioStreamPacketDescription_) ()
41 BufferedAudioConverter::FillBuffer(unsigned long&, AudioBufferList&, AudioStreamPacketDescription_) ()
42 AudioConverterFillComplexBuffer ()
43 AUInputFormatConverter2::PullAndConvertInput(AudioTimeStamp const&, unsigned long&, AudioBufferList&, AudioStreamPacketDescription_, bool&) ()
44 AUConverterBase::RenderBus(unsigned long&, AudioTimeStamp const&, unsigned long, unsigned long) ()
45 AUBase::DoRenderBus(unsigned long&, AudioTimeStamp const&, unsigned long, AUOutputElement_, unsigned long, AudioBufferList&) ()
46 AUBase::DoRender(unsigned long&, AudioTimeStamp const&, unsigned long, unsigned long, AudioBufferList&) ()
47 AURemoteIO::PerformIO(unsigned long, unsigned int, unsigned int, AudioTimeStamp const&, AudioTimeStamp const&, AudioBufferList const_, AudioBufferList_, int&) ()
48 AURIOCallbackReceiver_PerformIO ()
49 XPerformIO ()
50 mshMIGPerform ()
51 MSHMIGDispatchMessage ()
52 AURemoteIO::IOThread::Run() ()
53 AURemoteIO::IOThread::Entry(void
) ()
54 CAPThread::Entry(CAPThread*) ()
55 _pthread_body ()
56 _pthread_start ()
57 thread_start ()

thread diagram

@wernerd
Copy link
Owner

wernerd commented Mar 31, 2016

The ZRTP intercal sync makes sure that the state engine handle only one packet
at a time for a stream. Stream means: one RTP in/out data flow, for example s
end/receive audio.

Looking at the stack trace I see that bot threads perform a "zrtp_startZrtpEngine(..). Why
this? Only one thread should start a ZRTP stack for one stream . If ZRTP should work for
another stream, for example adding Video to the call, the you need to have a second
ZRTP instance.

Werner

Am 31.03.2016 um 14:59 schrieb Jamil:

First of all thank you for great work.
We are using pjsip in connection with ZRTP4PJ+ZRTPCPP for delivering secure calls. In development we faced with deadlock problem. After some investigation we found following situation:

As you can see from attached logs (start from thread 20). Pjsip initiates sending of rtp and passes it to zrtp transport over transport_send_rtp function for further processing. Via zrtp4pj API zrtpcpp processEvent(event_t *ev) method (zrtp/ZrtpStateClass.cpp Line 81). In processEvent method synchEnter is called, which locks zrtp mutex (let’s call it mutex_1). After that we returning back to pjsip transport_send_rtp where we have internal pjsip mutex lock (let’s call it mutex_2). At the same in thread 15, Pjsip, before calling srtp_rtp_cb, locks mutex_2, then process goes to start zrtp transport and we come in processEvent method's synchEnter call, where mutex_1 is already locked. Sample diagram attached.

After some investigation we decided that, internal zrtp synchronisation is not needed any more, pjsip is responsible for packet processing synchronisation now. We removed synchEnter and synchLeave and have tested most cases. At this time flight is success :) And the question is that, is this approach is correct?

Thread 15
2 pthread_mutex_lock ()
3 pj_mutex_lock at /src/pj/os_core_unix.c:1265
4 zrtp_synchEnter at /zsrtp/transport_zrtp.c:768
5 ZrtpCallbackWrapper::synchEnter() at /zsrtp/zrtp/zrtp/ZrtpCallbackWrapper.cpp:121
6 ZRtp::synchEnter() at /zsrtp/zrtp/zrtp/ZRtp.cpp:2568
7 ZrtpStateClass::processEvent(Event*) at /zsrtp/zrtp/zrtp/ZrtpStateClass.cpp:86
8 ZRtp::startZrtpEngine() at /zsrtp/zrtp/zrtp/ZRtp.cpp:253
9 ::zrtp_startZrtpEngine(ZrtpContext *) at /zsrtp/zrtp/zrtp/ZrtpCWrapper.cpp:119
10 pjmedia_transport_zrtp_startZrtp at /zsrtp/transport_zrtp.c:859
11 transport_rtp_cb at /zsrtp/transport_zrtp.c:987
12 srtp_rtp_cb at /src/pjmedia/transport_srtp.c:971
13 ice_on_rx_data at /src/pjmedia/transport_ice.c:1779
14 ice_rx_data at /src/pjnath/ice_strans.c:1467
15 pj_ice_sess_on_rx_pkt at /src/pjnath/ice_session.c:3020
16 stun_on_rx_data at /src/pjnath/ice_strans.c:1508
17 on_data_recvfrom at /src/pjnath/stun_sock.c:957
18 ioqueue_on_read_complete at /src/pj/activesock.c:498
19 ioqueue_dispatch_read_event at /src/pj/ioqueue_common_abs.c:605
20 pj_ioqueue_poll at /src/pj/ioqueue_select.c:981
21 pjsip_endpt_handle_events2 at /src/pjsip/sip_endpoint.c:741
22 pjsua_handle_events at /src/pjsua-lib/pjsua_core.c:1858
23 worker_thread at /src/pjsua-lib/pjsua_core.c:696
24 thread_main at /src/pj/os_core_unix.c:541
25 _pthread_body ()
26 _pthread_start ()
27 thread_start ()

Thread 20
2 _pthread_mutex_lock_slow ()
3 pj_mutex_lock at /src/pj/os_core_unix.c:1265
4 pj_lock_acquire at /src/pj/lock.c:180
5 grp_lock_acquire at /src/pj/lock.c:290
6 pj_grp_lock_acquire at /src/pj/lock.c:478
7 pj_ice_strans_sendto at /src/pjnath/ice_strans.c:1247
8 transport_send_rtp at /src/pjmedia/transport_ice.c:1655
9 pjmedia_transport_send_rtp at /include/pjmedia/transport.h:678
10 transport_send_rtp at /src/pjmedia/transport_srtp.c:853
11 pjmedia_transport_send_rtp at /include/pjmedia/transport.h:678
12 zrtp_sendDataZRTP at /zsrtp/transport_zrtp.c:461
13 ZrtpCallbackWrapper::sendDataZRTP(unsigned char const/, int) at /zsrtp/zrtp/zrtp/ZrtpCallbackWrapper.cpp:37
14 ZRtp::sendPacketZRTP(ZrtpPacketBase/) at /zsrtp/zrtp/zrtp/ZRtp.cpp:2577
15 ZrtpStateClass::evInitial() at /zsrtp/zrtp/zrtp/ZrtpStateClass.cpp:166
16 ZrtpStates::processEvent(ZrtpStateClass&) at /zsrtp/zrtp/zrtp/libzrtpcpp/ZrtpStates.h:68
17 ZrtpStateClass::processEvent(Event/) at /zsrtp/zrtp/zrtp/ZrtpStateClass.cpp:151
18 ZRtp::startZrtpEngine() at /zsrtp/zrtp/zrtp/ZRtp.cpp:253
19 ::zrtp_startZrtpEngine(ZrtpContext *) at /zsrtp/zrtp/zrtp/ZrtpCWrapper.cpp:119
20 pjmedia_transport_zrtp_startZrtp at /zsrtp/transport_zrtp.c:859
21 transport_send_rtp at /zsrtp/transport_zrtp.c:1154
22 pjmedia_transport_send_rtp at /include/pjmedia/transport.h:678
23 put_frame_imp at /src/pjmedia/stream.c:1382
24 put_frame at /src/pjmedia/stream.c:1511
25 pjmedia_port_put_frame at /src/pjmedia/port.c:115
26 write_port at /src/pjmedia/conference.c:1760
27 get_frame at /src/pjmedia/conference.c:1998
28 pjmedia_port_get_frame at /src/pjmedia/port.c:98
29 resample_get_frame at /src/pjmedia/resample_port.c:182
30 pjmedia_port_get_frame at /src/pjmedia/port.c:98
31 play_cb at /src/pjmedia/sound_port.c:88
32 output_renderer at /src/pjmedia-audiodev/coreaudio_dev.m:1073
33 AUInputElement::PullInput(unsigned long&, AudioTimeStamp const&, unsigned long, unsigned long) ()
34 AUInputFormatConverter2::InputProc(OpaqueAudioConverter/, unsigned long/, AudioBufferList/, AudioStreamPacketDescription*/, void/) ()
35 AudioConverterChain::CallInputProc(unsigned long) ()
36 AudioConverterChain::FillBufferFromInputProc(unsigned long/, CABufferList/) ()
37 BufferedAudioConverter::GetInputBytes(unsigned long, unsigned long&, CABufferList const/&) ()
38 CBRConverter::RenderOutput(CABufferList/, unsigned long, unsigned long&, AudioStreamPacketDescription/) ()
39 BufferedAudioConverter::FillBuffer(unsigned long&, AudioBufferList&, AudioStreamPacketDescription/) ()
40 AudioConverterChain::RenderOutput(CABufferList/, unsigned long, unsigned long&, AudioStreamPacketDescription/) ()
41 BufferedAudioConverter::FillBuffer(unsigned long&, AudioBufferList&, AudioStreamPacketDescription/) ()
42 AudioConverterFillComplexBuffer ()
43 AUInputFormatConverter2::PullAndConvertInput(AudioTimeStamp const&, unsigned long&, AudioBufferList&, AudioStreamPacketDescription/, bool&) ()
44 AUConverterBase::RenderBus(unsigned long&, AudioTimeStamp const&, unsigned long, unsigned long) ()
45 AUBase::DoRenderBus(unsigned long&, AudioTimeStamp const&, unsigned long, AUOutputElement/, unsigned long, AudioBufferList&) ()
46 AUBase::DoRender(unsigned long&, AudioTimeStamp const&, unsigned long, unsigned long, AudioBufferList&) ()
47 AURemoteIO::PerformIO(unsigned long, unsigned int, unsigned int, AudioTimeStamp const&, AudioTimeStamp const&, AudioBufferList const/, AudioBufferList/, int&) ()
48 AURIOCallbackReceiver_PerformIO ()
49 _XPerformIO ()
50 mshMIGPerform ()
51 MSHMIGDispatchMessage ()
52 AURemoteIO::IOThread::Run() ()
53 AURemoteIO::IOThread::Entry(void/) ()
54 CAPThread::Entry(CAPThread*) ()
55 _pthread_body ()
56 _pthread_start ()
57 thread_start ()

thread diagram https://cloud.githubusercontent.com/assets/7306284/14176319/0bf35ffa-f759-11e5-9bf6-98ba92ac93ae.png


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub #31

Werner Dittmann
email: [email protected]
cell: +49 173 44 37 659
PGP key: 82EF5E8B

@efendizadeh
Copy link
Author

Situation seems like following:
We are not manipulating directly pjsip threads. In this situation on outgoing call transport_send_rtp and transport_rtp_cb functions calls by pjsip in most case on the same time by different threads. In connection of this pjmedia_transport_zrtp_startZrtp in ZRTP4PJ (transport_zrtp.c) calls more than one time at the same time. It happens regarding to zrtp->started not covered by sync primitives (mutex).

For solving our problem we are thinking about possible solution to add synchronisation to zrtp->started checking and setting calls. As an architect and developer of this great software we need your advise.

Thanks in advance.

@aylesm
Copy link

aylesm commented May 30, 2017

@efendizadeh did you get a solution for this? I'm seeing the same deadlock.

@efendizadeh
Copy link
Author

@aylesm see attached patch
deadlock_patch.txt

@aylesm
Copy link

aylesm commented May 30, 2017

@efendizadeh thanks. Looks like synchEnter/Leave are only called in that function. So commenting out the callback impl has the same effect as your patch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants