2.x: Possible deadlock when using observeOn(Scheduler, boolean) #6146
Description
RxJava version: 2.1.11
Java: 1.8.0_181
I'm encountering an intermittent deadlock in a rather long Flowable, and I believe I've pinpointed it to an observeOn(...) call (I reached this conclusion through a series of log statements). I haven't been able to step through the test when the deadlock occurs, because it happens only about once every 30 to 40 executions, and each execution takes about a minute. I've managed to reproduce the deadlock about a dozen times, adding more logging each time to figure out where things get stuck.
Flowable<SomeType> flow = // Lots of stuff upstream
flow.doOnNext(x -> /* log 1 */)
    .doOnComplete(() -> /* log 2 */)
    .observeOn(Schedulers.io(), true)
    .doOnNext(x -> /* log 3 */)
    .doOnComplete(() -> /* log 4 */)
    // Lots more downstream
In the test case where I see the occasional deadlock, I expect only one item to be emitted through this part of the Flowable. Log statements 1 and 2 show that the item reaches the observeOn(...) and that the upstream has completed, but logs 3 and 4 are never reached. (I forgot to add a doOnError(...) to make sure an exception isn't sneaking through and holding things up elsewhere, but I'm fairly confident there aren't any uncaught exceptions. I've added a doOnError(...) and am re-running my test now to make sure; I'll update my post once I have results.)
Given this logging pattern, I believe the observeOn(...) is locking up somehow. What's really strange is that everything works fine most of the time.
None of the downstream operators should be attempting to dispose the Flowable early either. I believe my terminal operator is a blockingGet() on a Single, no timeout or anything.
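Since the terminal blocking call has no timeout, one way to catch the hang in the act is to wrap it in a small watchdog. The sketch below uses only the JDK. `HangWatchdog` and its `call` method are hypothetical names invented for illustration, not RxJava API. It runs a blocking call on the current thread and fires a callback if the call has not returned within a limit, without interrupting or cancelling the call itself.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of RxJava): reports when a blocking call takes suspiciously long. */
final class HangWatchdog {

    /**
     * Runs blockingCall on the calling thread. If it has not returned after the given
     * limit, onSuspectedHang runs once on a daemon timer thread; the blocking call
     * itself is left untouched and keeps waiting.
     */
    static <T> T call(Callable<T> blockingCall, long limit, TimeUnit unit, Runnable onSuspectedHang)
            throws Exception {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "hang-watchdog");
            t.setDaemon(true); // never keeps the JVM alive on its own
            return t;
        });
        ScheduledFuture<?> alarm = timer.schedule(onSuspectedHang, limit, unit);
        try {
            return blockingCall.call();
        } finally {
            // The call returned (or threw): disarm the alarm and release the timer thread.
            alarm.cancel(false);
            timer.shutdownNow();
        }
    }
}
```

In the test, the terminal call could then look something like `HangWatchdog.call(() -> single.blockingGet(), 2, TimeUnit.MINUTES, () -> /* log a thread dump */)`, where `single` stands in for the actual Single and the two-minute limit is an arbitrary assumption based on the roughly one-minute run time. A thread dump taken at that moment (for example via `Thread.getAllStackTraces()` or `jstack`) would show what the IO scheduler workers are doing while logs 3 and 4 are missing.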
I also logged the total number of threads in my JVM to see if I'm leaking threads somewhere, but I'm only at 49 when the deadlock occurs. (I'm using the IO scheduler, which I believe is backed by an unbounded pool, so I can't imagine I would be running out of worker threads.) I do have other Flowables doing unrelated tasks in the background, and all of them use the IO scheduler. The upstream and downstream of the Flowable in my test also use the IO scheduler, but the deadlock always seems to happen here.
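A total thread count says little about what the scheduler workers are actually doing, so a per-state breakdown may be more telling. The following JDK-only sketch groups live threads by name prefix and state, and renders their stack traces. `ThreadSummary` is a hypothetical helper, not RxJava API. The prefix `RxCachedThreadScheduler-` used in the usage note is the name RxJava 2 gives its IO scheduler threads; treat it as an assumption and check it against the thread names in your own logs.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical helper (not part of RxJava): summarizes live JVM threads by name prefix. */
final class ThreadSummary {

    /** Counts live threads whose name starts with namePrefix, grouped by thread state. */
    static Map<Thread.State, Integer> countByState(String namePrefix) {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getName().startsWith(namePrefix)) {
                counts.merge(t.getState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Renders name, state, and stack trace of every live thread whose name starts with namePrefix. */
    static String dump(String namePrefix) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (!t.getName().startsWith(namePrefix)) {
                continue;
            }
            sb.append('"').append(t.getName()).append("\" ").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }
}
```

Logging `ThreadSummary.countByState("RxCachedThreadScheduler-")` and `ThreadSummary.dump("RxCachedThreadScheduler-")` at the moment the test stalls would show whether the IO workers are idle or blocked on something, and where in the stack they are stuck.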
I realize that's not a lot of info to go off of, but I figured I'd ask the experts in case there's something glaring that I'm missing, or if there's something else I can do to figure out what's going on.
Thanks in advance!