Bug 2060 - Amarok can hang when network stream is interrupted
Summary: Amarok can hang when network stream is interrupted
Status: NEW
Alias: None
Product: TDE
Classification: Unclassified
Component: non-core programs
Version: R14.0.x [Trinity]
Hardware: Other Linux
Importance: P5 normal
Assignee: Timothy Pearson
URL:
Depends on:
Blocks:
 
Reported: 2014-05-28 23:50 CDT by Timothy Pearson
Modified: 2018-05-27 10:48 CDT
CC List: 3 users

See Also:
Compiler Version:
TDE Version String:
Application Version:
Application Name:


Description Timothy Pearson 2014-05-28 23:50:04 CDT
Amarok occasionally hangs when playing a network stream and the stream is interrupted (e.g. bad network connection, source glitch, etc.). The entire UI freezes and Amarok has to be terminated manually. The hung process does not consume any measurable CPU time.

The backtrace of the hung process does not reveal much:
0x00007f84a4fc7a43 in __GI___poll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87
87      ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
(gdb) bt
#0  0x00007f84a4fc7a43 in __GI___poll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00007f84a0b82ff6 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2  0x00007f84a0b83124 in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3  0x00007f84a5dc18d8 in TQEventLoop::processEvents (this=0x23be820, flags=<optimized out>) at kernel/qeventloop_x11_glib.cpp:279
#4  0x00007f84a5df0599 in TQEventLoop::enterLoop (this=0x23be820) at kernel/qeventloop.cpp:227
#5  0x00007f84a5df0529 in TQEventLoop::exec (this=0x23be820) at kernel/qeventloop.cpp:174
#6  0x00000000004013a4 in main (argc=3, argv=0x7ffffc88e9b8) at /build/buildd/amarok-trinity-14.0.0-r247/./amarok/src/main.cpp:116
(gdb)
Comment 1 Michele Calgaro 2014-05-29 06:55:32 CDT
Yep, I have seen this happen occasionally on my computer as well.
Comment 2 Timothy Pearson 2014-11-09 23:30:11 CST
I'm having trouble reproducing this issue on demand, though I am pretty sure it still exists.

Next time a freeze is encountered please attach to the amarokapp process with gdb and run:
thread apply all bt

then post the results here.  Let's see whose Amarok instance goes down first. ;-)
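
For anyone who would rather capture the dump in one step, the request above can be wrapped in a small helper. This is only a sketch: `gdb_backtrace_command` is a hypothetical name, the PID still has to be looked up by hand (for example with `pidof amarokapp`), and it relies on gdb's standard `-p`, `-batch` and `-ex` options.

```python
import shlex


def gdb_backtrace_command(pid):
    """Build a gdb invocation that attaches to `pid`, prints a backtrace
    for every thread, and exits again (batch mode detaches on exit)."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


if __name__ == "__main__":
    # 12345 is a placeholder PID, not a real amarokapp process.
    print(shlex.join(gdb_backtrace_command(12345)))
```

The returned list can be passed directly to `subprocess.run` once the real PID of the frozen process is known.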
Comment 3 Michele Calgaro 2014-11-10 02:58:19 CST
OK, will do. I haven't noticed it recently, though, partly because I now use VLC more than Amarok.
But if it happens, I will let you know.
Comment 4 Timothy Pearson 2015-03-10 12:51:37 CDT
Thread 6 (Thread 0x7f497a77e700 (LWP 27474)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:215
#1  0x00007f49788c01b1 in ?? () from /usr/lib/libxine.so.2
#2  0x00007f4986512e9a in start_thread (arg=0x7f497a77e700) at pthread_create.c:308
#3  0x00007f49895002ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#4  0x0000000000000000 in ?? ()

Thread 5 (Thread 0x7f4973fff700 (LWP 27476)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00007f49788cff81 in ?? () from /usr/lib/libxine.so.2
#2  0x00007f49788d286c in ?? () from /usr/lib/libxine.so.2
#3  0x00007f4986512e9a in start_thread (arg=0x7f4973fff700) at pthread_create.c:308
#4  0x00007f49895002ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#5  0x0000000000000000 in ?? ()

Thread 4 (Thread 0x7f4973631700 (LWP 27477)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00007f49788c44fb in ?? () from /usr/lib/libxine.so.2
#2  0x00007f49788cb73d in ?? () from /usr/lib/libxine.so.2
#3  0x00007f4986512e9a in start_thread (arg=0x7f4973631700) at pthread_create.c:308
#4  0x00007f49895002ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#5  0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7f4972e30700 (LWP 27478)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00007f49788d438b in xine_event_wait () from /usr/lib/libxine.so.2
#2  0x00007f49788d442e in ?? () from /usr/lib/libxine.so.2
#3  0x00007f4986512e9a in start_thread (arg=0x7f4972e30700) at pthread_create.c:308
#4  0x00007f49895002ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#5  0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7f496794f700 (LWP 1842)):
#0  0x00007f49894f9653 in select () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007f49788df53a in _x_io_select () from /usr/lib/libxine.so.2
#2  0x00007f49788df636 in ?? () from /usr/lib/libxine.so.2
#3  0x00007f49685c5661 in ?? () from /usr/lib/xine/plugins/2.0/xineplug_inp_http.so
#4  0x00007f49685c5847 in ?? () from /usr/lib/xine/plugins/2.0/xineplug_inp_http.so
#5  0x00007f49788e0c9e in ?? () from /usr/lib/libxine.so.2
#6  0x00007f4971c26db0 in ?? () from /usr/lib/xine/plugins/2.0/xineplug_xiph.so
#7  0x00007f4971c2a208 in ?? () from /usr/lib/xine/plugins/2.0/xineplug_xiph.so
#8  0x00007f49788d9352 in ?? () from /usr/lib/libxine.so.2
#9  0x00007f4986512e9a in start_thread (arg=0x7f496794f700) at pthread_create.c:308
#10 0x00007f49895002ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#11 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f498b363780 (LWP 27441)):
#0  0x00007f49894f4933 in __GI___poll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00007f4985053ff6 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2  0x00007f4985054124 in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3  0x00007f498a328b58 in TQEventLoop::processEvents (this=0x15caf50, flags=<optimized out>) at kernel/qeventloop_x11_glib.cpp:279
#4  0x00007f498a3580b9 in TQEventLoop::enterLoop (this=0x15caf50) at kernel/qeventloop.cpp:227
#5  0x00007f498a358049 in TQEventLoop::exec (this=0x15caf50) at kernel/qeventloop.cpp:174
#6  0x00000000004013a4 in main (argc=1, argv=0x7ffff189f488) at /build/buildd/amarok-trinity-14.0.0-r275/./amarok/src/main.cpp:116
Comment 5 Timothy Pearson 2015-03-10 12:52:17 CDT
I'm starting to think this is a Xine hang from the gdb output.
Comment 6 Michele Calgaro 2015-03-10 20:11:27 CDT
At first sight it looks like a deadlock: 4 threads waiting on pthread_cond_timedwait are very suspicious.
Any chance you can install libxine dbg symbols and check what mutex each thread is waiting on?
Also have you found any way to reproduce this systematically? I could try on my system as well (not that at this time I would be able to *work* on this anyway :-( - as you know)
Comment 7 Timothy Pearson 2015-03-10 22:52:46 CDT
(In reply to Michele Calgaro from comment #6)
> At first sight it looks like a deadlock: 4 threads waiting on
> pthread_cond_timedwait are very suspicious.
> Any chance you can install libxine dbg symbols and check what mutex each
> thread is waiting on?
> Also have you found any way to reproduce this systematically? I could try on
> my system as well (not that at this time I would be able to *work* on this
> anyway :-( - as you know)

Yes, I have installed the debug symbols and am waiting for another hang.  The only time I have ever noticed the hang is when my ISP is having issues, which translates to very high lag spikes with intermittent heavy packet loss.  Not sure if you might be able to simulate something similar, e.g. by using Wi-Fi next to an operating microwave oven. ;-)

In any case I won't really have any time either for the next few months, so we'll probably get the requisite data from my system before either of us can look at it.
Comment 8 Michele Calgaro 2015-03-11 00:39:51 CDT
Ok, let's wait for the next hang then. I should try that idea with the microwave and the wifi :-)
Comment 9 Timothy Pearson 2015-03-12 13:13:56 CDT
This was fast:
Thread 6 (Thread 0x7f9c92702700 (LWP 7504)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:215
#1  0x00007f9c908441b1 in metronom_sync_loop (this_gen=0x31d91f0) at metronom.c:871
#2  0x00007f9c9e496e9a in start_thread (arg=0x7f9c92702700) at pthread_create.c:308
#3  0x00007f9ca14842ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#4  0x0000000000000000 in ?? ()

Thread 5 (Thread 0x7f9c8bfff700 (LWP 7506)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00007f9c90853f81 in fifo_peek_int (fifo=0x31dbc10, blocking=1) at audio_out.c:358
#2  0x00007f9c9085686c in fifo_peek (fifo=0x31dbc10) at audio_out.c:398
#3  ao_loop (this_gen=0x32117a0) at audio_out.c:1023
#4  0x00007f9c9e496e9a in start_thread (arg=0x7f9c8bfff700) at pthread_create.c:308
#5  0x00007f9ca14842ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()

Thread 4 (Thread 0x7f9c8b631700 (LWP 7507)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00007f9c908484fb in fifo_buffer_get (fifo=0x337c190) at buffer.c:236
#2  0x00007f9c9084f73d in audio_decoder_loop (stream_gen=0x33704e0) at audio_decoder.c:67
#3  0x00007f9c9e496e9a in start_thread (arg=0x7f9c8b631700) at pthread_create.c:308
#4  0x00007f9ca14842ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#5  0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7f9c8ae30700 (LWP 7508)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00007f9c9085838b in xine_event_wait (queue=0x3387fe0) at events.c:56
#2  0x00007f9c9085842e in listener_loop (queue_gen=0x3387fe0) at events.c:219
#3  0x00007f9c9e496e9a in start_thread (arg=0x7f9c8ae30700) at pthread_create.c:308
#4  0x00007f9ca14842ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#5  0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7f9c89e2e700 (LWP 7622)):
#0  0x00007f9ca147d653 in select () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007f9c9086353a in _x_io_select (stream=0x33704e0, fd=22, state=<optimized out>, timeout_msec=<optimized out>) at io_helper.c:279
#2  0x00007f9c90863636 in xio_rw_abort (stream=0x33704e0, fd=22, cmd=2, buf_gen=0x34a6261, todo=8500) at io_helper.c:355
#3  0x00007f9c7fda8661 in http_plugin_read_int (this=0x3422c10,
    buf=0x34a6261 "\242\276\347\240N\356/\246\202$\275\215\267\027\003\240\232\337.\345m8o:\201\066\r\220\257>L\t\321K\377ބ\020\372\017\252R\241\304O\355\377\177[ߗ_\200\027\002$\376~\v", total=8500) at input_http.c:343
#4  0x00007f9c7fda8847 in http_plugin_read (this_gen=0x3422c10, buf_gen=0x34a6261, nlen=<optimized out>) at input_http.c:413
#5  0x00007f9c90864c9e in cache_plugin_read (this_gen=0x3430240, buf_gen=0x34a6261, len=8500) at input_cache.c:151
#6  0x00007f9c89425db0 in read_ogg_packet (this=0x33d53a0) at xine_ogg_demuxer.c:242
#7  0x00007f9c89429208 in demux_ogg_send_chunk (this_gen=0x33d53a0) at xine_ogg_demuxer.c:1571
#8  0x00007f9c9085d352 in demux_loop (stream_gen=0x33704e0) at demux.c:342
#9  0x00007f9c9e496e9a in start_thread (arg=0x7f9c89e2e700) at pthread_create.c:308
#10 0x00007f9ca14842ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#11 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f9ca32e7780 (LWP 7482)):
#0  0x00007f9ca1478933 in __GI___poll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00007f9c9cfd7ff6 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2  0x00007f9c9cfd8124 in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3  0x00007f9ca22acb58 in TQEventLoop::processEvents (this=0x263dd60, flags=<optimized out>) at kernel/qeventloop_x11_glib.cpp:279
#4  0x00007f9ca22dc0b9 in TQEventLoop::enterLoop (this=0x263dd60) at kernel/qeventloop.cpp:227
#5  0x00007f9ca22dc049 in TQEventLoop::exec (this=0x263dd60) at kernel/qeventloop.cpp:174
#6  0x00000000004013a4 in main (argc=1, argv=0x7fff51d08148) at /build/buildd/amarok-trinity-14.0.0-r275/./amarok/src/main.cpp:116

Looks like it's hung in thread 2 waiting for I/O.
Comment 10 Michele Calgaro 2015-03-14 23:42:02 CDT
>Looks like it's hung in thread 2 waiting for I/O.
Uhm.... I am a little perplexed.
Looks like thread 1 and thread 2 are waiting for events (probably the arrival of input data), would be interesting to know the value of the timeout parameter (whether it is infinite or not).
IMO there is something else going on. For example if Amarok can't read from a source, after a while it comes up with an error message saying the source file/stream is not available. I would expect a similar behavior in case an input stream suddenly becomes unavailable. Usually before the error message is displayed, the GUI is frozen for a while.
I guess every time some bytes are received a timeout counter is reset, so if a stream is coming in very very slowly, the GUI freeze period may become very very long. The next time you experience the problem, can you try waiting for a long time to see if Amarok resumes after a while?
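
The guess above (a timeout that restarts whenever a few bytes arrive) can be illustrated with a small sketch. This is not xine's code and says nothing certain about how `_x_io_select` behaves; `read_all`, `trickle`, `per_read_timeout` and `overall_timeout` are hypothetical names used only to show why a slow trickle of data can keep a per-read timeout from ever firing, and how an overall deadline bounds the wait.

```python
import select
import socket
import threading
import time


def read_all(sock, total, per_read_timeout, overall_timeout=None):
    """Read `total` bytes from `sock`.

    per_read_timeout restarts after every chunk received; overall_timeout,
    if given, caps the duration of the whole call.
    Returns (data, timed_out).
    """
    deadline = None
    if overall_timeout is not None:
        deadline = time.monotonic() + overall_timeout
    data = b""
    while len(data) < total:
        wait = per_read_timeout
        if deadline is not None:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return data, True
            wait = min(wait, remaining)
        readable, _, _ = select.select([sock], [], [], wait)
        if not readable:
            return data, True  # nothing arrived within `wait` seconds
        chunk = sock.recv(total - len(data))
        if not chunk:
            break  # peer closed the connection
        data += chunk
    return data, False


def trickle(sock, count, delay):
    """Send `count` single bytes, sleeping `delay` seconds before each."""
    for _ in range(count):
        time.sleep(delay)
        sock.sendall(b"x")
```

With a sender that delivers one byte every 0.2 s, `read_all(sock, 4, per_read_timeout=0.5)` does not time out even though the whole read takes longer than 0.5 s, while adding `overall_timeout=0.3` makes it give up early with a partial result.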

IMO, we need to rework the way Amarok handles different tasks. The GUI should run on a separate thread from the one(s) receiving and playing data. If I remember correctly Amarok uses its own thread implementation. Perhaps it is time we remove that and switch to the "new" TQt threads.
What do you think?
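
As a rough illustration of the idea only (plain Python threads, not TQt threads or Amarok's own thread implementation), the sketch below keeps a toy event loop ticking while a worker is stuck in a slow read. `blocking_reader` and `run_event_loop` are hypothetical names, and the sleep merely stands in for a stalled network read.

```python
import queue
import threading
import time


def blocking_reader(results, stall):
    """Stand-in for a stream read that stalls for `stall` seconds."""
    time.sleep(stall)
    results.put(b"audio data")


def run_event_loop(results, tick=0.01, max_ticks=1000):
    """Toy event loop that never blocks on the reader.

    Returns (ticks_before_data, data); data is None if nothing arrived
    within max_ticks iterations.
    """
    ticks = 0
    while ticks < max_ticks:
        try:
            return ticks, results.get_nowait()
        except queue.Empty:
            ticks += 1        # a real UI would process events and repaint here
            time.sleep(tick)
    return ticks, None
```

The loop polls the queue with `get_nowait`, so a stalled reader delays only the arrival of data, not the handling of UI events.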
Comment 11 Timothy Pearson 2015-03-19 14:38:17 CDT
(In reply to Michele Calgaro from comment #10)
> >Looks like it's hung in thread 2 waiting for I/O.
> Uhm.... I am a little perplexed.
> Looks like thread 1 and thread 2 are waiting for events (probably the
> arrival of input data), would be interesting to know the value of the
> timeout parameter (whether it is infinite or not).
> IMO there is something else going on. For example if Amarok can't read from
> a source, after a while it comes up with an error message saying the source
> file/stream is not available. I would expect a similar behavior in case an
> input stream suddenly becomes unavailable. Usually before the error message
> is displayed, the GUI is frozen for a while.
> I guess every time some bytes are received a timeout counter is reset, so if
> a stream is coming in very very slowly, the GUI freeze period may become
> very very long. The next time you experience the problem, can you try
> waiting for a long time to see if Amarok resumes after a while?

Yes, it does--sometimes.  BTW at least this isn't TDE's fault, it's an old KDE bug:
https://bugs.kde.org/show_bug.cgi?id=136675

> IMO, we need to rework the way Amarok handles different tasks. The GUI
> should run on a separate thread from the one(s) receiving and playing data.
> If I remember correctly Amarok uses its own thread implementation. Perhaps
> it is time we remove that and switch to the "new" TQt threads.
> What do you think?

Worth a shot, but we can't separate Xine's thread from the GUI thread due to the way Xine and Xorg are designed.