-
Notifications
You must be signed in to change notification settings - Fork 7.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stacktraces and subscribeOn/observeOn #3521
Comments
Taking a stacktrace has a 100x overhead compared to a pass-through. This comes up from time to time but I don't see why one wants to see a stacktrace from another thread when the exception happened on the scheduler's thread. If the exception was thrown before Could you give some examples where (I'm guessing) the code crashed after The only thing I can think of is the following code: Observable.<Func0>just(null).observeOn(Schedulers.computation()).map(r -> r.call()).subscribe(...) This will crash on the computation scheduler with NPE, but the problem is in just(). How would the stack trace from the main thread inside the |
api.fetchRequestResult()
.compose(parse()) // a crash is here
.subscribeOn(io())
.observeOn(mainThread())
.subscribe(it -> whatever); If i have the call to |
I'm guessing it crashes in response to a value, right? Why not add doOnNext() with a conditional breakpoint and move it along the pipeline to see where the bad value appears? |
Becaue I can't put a breakpoint without user's permission on a device that is located 2000 miles away? |
I also don't care about the performance overhead in this situation because inability to fix the crash has bigger impact on my app over a 0.01 ms slowdown during a network request. |
Do you have highly branching code or do you use custom |
Unit tests do not cover all possible situations, sometimes crashes happen.
|
I'm still having difficulty imagining a situation where a remote app crashes and you need the stacktrace all the way to the equivalent of main(String[] args) on Android. Maybe an NPE is an uncommon case. What kind of crashes do you experience? |
I see the value of good stacktraces because investigating concurrency problems is usually a hard and time-consuming task. But // Internally we've started implementing a very similar approach for non-rx concurrency and probably will do the same for RxJava Schedulers via schedulers hook soon. Would be nice to see it in the RxJava. @akarnokd if you afraid of overhead — this setting is dynamic, so developers will usually enable it only for debug builds or even at runtime (many projects have Developer Settings menu in the app for such options). Real life example of hard-to-debug crash — data bus built via As far as I remember from the recordings of @benjchristensen talks about RxJava he had some thoughts on improving debugging of RxJava code, @benjchristensen what do you think about subject? |
But can you know what bad data is before posting it on the event bus? If so,why not add verification to sources of the events via doOnNext() and crash the lambda? |
Because you can not verify all cases of bad data. |
RxJavaPlugin let's you decorate the |
I don't think that plugins can help here - I want to have stack traces for some observables. If we inject a plugin we will be forced to always have that traces and lose the ability for recursive chain calls because of memory leaks. |
@akarnokd No, I don't use RxJava as a bus. The problem is not tied to any specific use case, it is a general RxJava issue. |
You can create a custom scheduler, do the decoration and apply it at places you desire. |
@konmik I am familiar with this pain. The stacktrace of an exception that takes place in an observable chain that's been scheduled does not have any context of where in the code base it was composed. It only shows a worker that's scheduled the subscription or onNext behavior. Is there any specific information that we could add to the exception that would help? |
Here is a simple example. There is no mention of my class or source code at all. public static void main(String[] args) {
Observable.empty()
.observeOn(Schedulers.io())
.toBlocking()
.first();
} Output:
|
@stealthcode Sure, as I already said, we can take stacktrace right before placing an action into worker and when an exception occurs there, we can rethrow the raised exception with included saved stacktrace. |
I had an idea for a tool that might help mitigate this problem. I think the desired output would be something like this:
This could happen by having a tool which statically analyzed the user application's source code (at application start time) and looked for any callsite to any rx.Observable operator instance or static methods. This would log a few things to a static application callsite buffer. Things like "callsite-id", "line number", "column number", etc. Then at runtime, scheduling operators would hold on to and pass through their callsite id to their downstream subscriber. When an exception is encountered the operator would lookup the appropriate callsite ids and get the line #s then append that information into the thrown Exception's message. @Blesh for context (applicable for RxJS too). |
@stealthcode I think that both - stacktrace and static analysis would be nice to have. However, RxJava does not currently have static analysis, so it can only be implemented as a separate library. |
+1 |
Here is the solution I came up with. Just add public class OperatorTraceOnError<T> implements Observable.Operator<T, T> {
private final StackTraceElement[] trace = new Throwable().getStackTrace();
public static <T> OperatorTraceOnError<T> traceOnError() {
return new OperatorTraceOnError<>();
}
@Override
public Subscriber<? super T> call(Subscriber<? super T> child) {
Subscriber<T> parent = new Subscriber<T>() {
@Override
public void onCompleted() {
child.onCompleted();
}
@Override
public void onError(Throwable throwable) {
child.onError(new TracedException(throwable, trace));
}
@Override
public void onNext(T t) {
child.onNext(t);
}
};
child.add(parent);
return parent;
}
private static class TracedException extends RuntimeException {
public TracedException(Throwable throwable, StackTraceElement[] trace) {
super(throwable);
setStackTrace(trace);
}
}
} P.S. It will not help with the case of the bloking operator error because the exception happens after |
You could also use the RxJavaDebug plugin. This adds logging to all operators for virtually all relevant lifecycle steps. |
@konmik, thanks a lot for sharing the |
|
I extended your
The main differences are:
Usage: What do you think? |
@Petikoch I like the idea of having an immutable throwable (my current solution) because it will survive any use case. The idea of checking thread id is nice but I afraid it will not work in some cases (If an exception happened in the background and then returned to the main thread to die here.) From the other hand, |
@konmik, interesting topic, isn't it? :-) Thanks for your feedback. |
I'm using @konmik 's |
I was thinking about and had the idea of only the capturing of the originating stack track one per worker when the first action is scheduled rather than every schedule call. This would amortize the cost over many actions. I would think most operators schedule on their worker instance in a fairly consistent way. It wouldn't be 100% but better than nothing. |
@akarnokd you know RxJava perfectly, you probably do always everything as it should be done (unit testing and everything). What REALLY happen out there is that lot of people use RxJava without knowing every little bit of it or do not perfectly test everything. The fact the libraries like this exist: https://github.com/android10/frodo to let you print what the flow of an Observable in the log or people asking for ways to have more meaningful stacktrace shouldn't be discarded like you are doing (closing this issue / refusing to acknowledge this is an area where RxJava can improve). I'm pretty sure someone like you that know RxJava so well could come up with lot of better solutions then the one currently available to help debugging and tracing whats going on and it's a shame you just refuse to do it because for you they are not useful. PS: please don't miss out on the choice of name of the library I linked ("frodo") and the image associated to it... Look at the gif in that page where Frodo hugs Sam almost crying. That's not just funny... It's inspired on the feeling you have finding that library when you are straggling understanding what's going on on a complex chain of Observables. |
@konmik closed this issue as he found a way to support his streams by capturing the stacktrace via an operator. Several months later, as part of a reworked plugin system the assembly tracking, which is very similar to @konmik's solution, has been added to RxJava. It can be enabled at runtime after which all new streams get extra tracking information but it's not cheap and should be used as a diagnostic option. We are a foundation for fluent reactive programming on the Android and JVM platforms with Java 6+ support and we try our best to support other libraries that wish to build upon us. However, we are limited by the Java platform as well as how exceptions and stacktraces work.
Nobody is stopping you or anybody to deepen his/her understanding of RxJava internals and come up with a solution, which if works, doesn't break binary compatibility and doesn't penalize those who don't need it, we are happy to accept PR(s) of it. |
@akarnokd sorry I didn't noticed the issue has been closed by @konmik. I still think an operator like the one used by @konmik or modified by @Petikoch would be nice to have in RxJava. Something that you can add on the single operator to obtain an audit of THAT operator. the reset() addition you cited is nice and also useful for testing but is a global on/off switch not suitable for when you have a bug in production you need to test on a specific observable. I didn't want to sound accusing, I just think you are way better with RxJava that most of us and that sometimes you are a bit too quick to dismiss request like that because from your point of view are just wasting time while it would make RxJava easier for everyone else. You just told me:
gee, as if deepening the understanding to such level was a thing you can achieve in half a day - it's not, not for me at least. I've read most of your blog which took way more then half-day and still didn't understood everything in it, nor would I feel comfortable I did everything right writing an operator or creating an Observable from scratch (Observable.create()). Thanks anyway. If I'll ever achieve THAT much knowledge and find a way to ease things out for the beginner I'll surely write a PR ;) |
I don't want to sound too harsh but this is an open-source project and we (I) put in a lot of effort to keep RxJava up-to-date and running. As a project lead, I also have the duty to keep RxJava intact but often people want local solutions that disrupt or break other parts of RxJava they usually can't see. In addition, sometimes issues are too vague, very difficult or outright impossible to solve given our constraints. The problem mentioned in this issue is typical for asynchronous programming; you'd have the same issues if you had ExecutorServices, queues and concurrent datastructures the classical and manual way with just Java itself. I'd guess you wouldn't run to Oracle complaining about your program not working as you expected despite all the fundamental building blocks Java gives you work correctly. The second source of such issues is that people come from different prior asynchronous experiences and tend to project those old practices onto RxJava where they likely won't work or apply. Anecdotes say the a "aha" moment when suddenly it clicks; but I believe this is only for its declarative nature. There is another "aha" moment when you embrace it as "everything is a dataflow". In similar stacktrace issues as this, either the user code throws at which point you have the exception point to your source code (you have to mentally filter out all irrelevant entries) or some operator signalled an exception rightfully following its spec (like single()) but you don't fully understand your sequence that lead to it. There is a no-zero chance you run into an operator bug but those are confirmed and fixed quite reliably I think. As you mentioned such failures tend to be within production/private code and not enough information is provided to work out a solution that can also be unit tested within RxJava itself. If you have a specific problem, please open an issue with all the details, unit tests to see if we can help you or not. |
@akarnokd You are right! :) as soon as I stumble into something hard to debug I'll do what you suggest. Thank you. |
Since this is the top Google result for |
https://github.com/akaita/RxJava2Debug is similar. I'm not able to get more useful stacks when I run these types of extensions alongside my project, though, and I wish I knew why. In a use case like networking, it's very important to be able to work backwards from an error in a response to the code which issued the request--and if that request happened in a callback for some previous response, what request initiated the previous response, etc... There's no amount of upfront validation or unit testing that will substitute for the behaviors you actually see when you put all the components of a system together and run it in a production-like environment. This is somewhat easier to do when you use synchronous networking, or async networking in .NET which manipulates async stacks when debugging so that you can see what initiated a task when you're debugging the completion of the task. Now, Rx is a very different paradigm from async/await, and so this stack rewriting solution (which resembles the assembly tracking extensions for RxJava) doesn't always map that well onto it. There is a great deal of value that could come from an improvement to the debugging experience here that is stable and performant enough for production use. I'm experimenting with the tracing operator posted a while back by @konmik and it's definitely an improvement--planning to experiment more and see what the experience is like in more complex tracing scenarios. |
RxJava2Debug didn't add any useful information for me either...not sure what I'm doing wrong. Traceur was very helpful, but doesn't seem to be working for me with Crashlytics (T-Spoon/Traceur#8). Mainly trying to solve the problem of often not having useful stacktraces for crash reports involving RxJava code, which causes difficulty in reproducing and fixing those bugs. RxJavaAssemblyTracking doesn't appear to add any extra info to my stacktraces (but it does look useful for debugging during development), so I'm still hunting for a solution that can be used in production code. @akarnokd I was wondering what your thoughts are on an approach like this? Seems to provide more useful crash logs without the large performance overhead...although at the cost of more manual effort. |
Depends on what type of error you want to track down. Null returns could be using my function tagging code. Otherwise, that looks a way but I'd add a tag string to |
Thanks for the quick response! We're mostly having difficulties with exceptions thrown by Retrofit or APIs of third party libraries that we're wrapping with RxJava...the stacktrace never seems to go back far enough to help us figure out the source. |
I think the "network call failed, where was it initiated?" use case is pretty well addressed by the breadcrumb approach you suggest @starkej2 👍 |
Another option: https://github.com/halfhp/rxtracer Other approaches were either too verbose in their implementation for my taste or simply did not work. RxTracer uses assembly tracking to supply the original exception with a more useful stack trace, so exception type, properties etc. are preserved. It's used by invoking |
This is still one of the top search results, so I'm leaving here the solution for RxJava 3: https://github.com/akarnokd/RxJavaExtensions#debug-support |
This is hardly a solution. Helps only while developing or if you have a reproducible bug. Doesn't help if you get an obscure crash in production that you do not know how to reproduce. |
@danielesegato Could you give concrete example when akarnokd's RxJavaAssemblyTracking was not helpful? |
@AlexShafir It comes with performance drawbacks. It's a global on/off switch. When you have a bug in production you can't reproduce, it's not an on/off bug and the stacktrace you get is not helpful the only option you have is turn that flag on in production for everybody and wait (hope) for the bug to happen. This is a common issue with any framework of this type. Kotlin Coroutines have similar issues. My knowledge of RxJava is far greater than my first comments in this issue and I've just learned to write better code with it. I hardly end up in any issue like this nowadays. I remember i would have love to turn the behavior on for just a couple of observables (so the performance issue would be limited to that). Nowadays it's more an annoyance than anything, would be nice if i could see a better stacktrace all the time but I don't wanna give up performance for it. |
RxJava in not nice when it comes to stacktraces.
The worst thing is
observeOn
that schedules execution on different threads. Every time we useobserveOn
we get our stacktrace from the process root, all the call history gets erased.The idea is to make schedulers save the caller's stack, wrap action calls and attach the stacktrace to each exception that has been thrown by scheduled actions.
Pros:
traceable exceptions
Cons:
Performance cost
Leaks because of recursion
I think that we can have
Schedulers.io(boolean trace)
alternative that will save the stacktrace, so we could call itSchedules.io(DEBUG)
to turn stacktracing off on production for performance critical parts.How do you guys find this idea?
The text was updated successfully, but these errors were encountered: