Crash when returning from background - SIGTRAP(TRAP_BRKPT) #160
Comments
Thanks for reporting this issue @adamup928. I'm working on cleaning up the persistence layer for the mutation queue (which @larryonoff is also working on), including better SQLite connection discipline. I'll make sure to include this issue in the list of things I'm considering.
Thanks, @palpatim, I appreciate the update.
tl;dr: I can't repro, but have a hypothesis.

Discussion

Unfortunately I'm not able to repro this in any of my tests or test environments, but we can make some progress just looking at the stack trace & the code. It looks like the crash is happening during the delta sync upon resuming from background. The delta query response handler attempts to call

My hypothesis is that we're hitting an error condition returning from background, possibly something like

You may already know, but to provide context for future readers: SQLite allows multiple connections to a given database file. It coordinates access at the OS level so that any number of processes may read the database concurrently, but only one may write. In order to write, we have to obtain an exclusive lock on the database, during which time no other process can access it, even to read. If the lock lasts longer than the busy timeout (which we set to 100ms), then the update fails and the busy handler is invoked. We don't specify a busy handler, so that means the update simply fails.

Because our single database file is used for all three of mutation queue, subscription metadata, and record caching, it is possible that those three largely unrelated processes may be executing updates that take longer than the busy timeout, thus causing the error.

If that's the case, one mitigation would be to isolate each database-accessing system to use a separate database file. That would also allow us to provide granularity for people who want to use, say, a persistent record cache without persisting a mutation queue. That wouldn't solve the problem if the issue were related to activity in a single module, but it would help overall contention.

I will put together a change that causes these modules to use separate DB files, and we'll see if it helps your case.
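For future readers, the busy-timeout behavior described in the comment above can be reproduced outside the SDK. The following is a minimal sketch using Python's standard `sqlite3` module, not the SDK's Swift code. The file name `shared.db`, the table, and the function name are assumptions made for illustration; only the 100 ms timeout comes from the comment.

```python
import os
import sqlite3
import tempfile


def demo_busy_timeout(timeout_seconds=0.1):
    """Return what a second writer sees while a first connection holds the write lock."""
    # Hypothetical database file shared by two connections.
    path = os.path.join(tempfile.mkdtemp(), "shared.db")

    # isolation_level=None gives autocommit mode, so transactions are explicit.
    holder = sqlite3.connect(path, timeout=timeout_seconds, isolation_level=None)
    holder.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, body TEXT)")

    # The first connection takes the write lock and keeps it open.
    holder.execute("BEGIN IMMEDIATE")
    holder.execute("INSERT INTO records (body) VALUES ('from holder')")

    # The second connection waits up to timeout_seconds for the lock, then fails.
    waiter = sqlite3.connect(path, timeout=timeout_seconds, isolation_level=None)
    try:
        waiter.execute("INSERT INTO records (body) VALUES ('from waiter')")
        outcome = "write succeeded"
    except sqlite3.OperationalError as error:
        outcome = str(error)
    finally:
        holder.execute("COMMIT")
        holder.close()
        waiter.close()
    return outcome


if __name__ == "__main__":
    print(demo_busy_timeout())
```

With no busy handler installed, the second write fails once the timeout expires, which matches the "update simply fails" case in the hypothesis.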
@palpatim our team also hits a deadlock quite often. As far as I've investigated, it occurs when offline mutations are being loaded from the cache while someone performs a mutation, so the deadlock happens in SQLite.swift.
@palpatim - Thanks for the detailed update. Your solution to split the database makes sense. If I can reproduce the issue consistently, I'll post a sample project.
@larryonoff Thanks for calling out the deadlock. Part of my work to split the databases will be to audit our mutation queue's database writes to ensure we're serializing them appropriately. @adamup928 A sample project would be great if you're able to get a consistent repro, thanks!
@palpatim I investigated the issue a bit. SQLite.swift works in a serial manner, so I thought about putting all of its work on an async DispatchQueue, but I'm not sure that's correct.
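The idea of funneling all database work through one serial queue can be sketched as follows. This is an illustration in Python, where a single worker thread and a `queue.Queue` play the role a serial DispatchQueue would play in Swift; it is not how SQLite.swift or the SDK is implemented. The `SerialWriter` class, its methods, and the `mutations` table used in the usage note are hypothetical names.

```python
import queue
import sqlite3
import threading


class SerialWriter:
    """Runs every write on one worker thread that owns the only connection."""

    def __init__(self, path):
        self._jobs = queue.Queue()
        self._thread = threading.Thread(target=self._run, args=(path,), daemon=True)
        self._thread.start()

    def _run(self, path):
        # The connection is created and used only on the worker thread.
        db = sqlite3.connect(path)
        while True:
            job = self._jobs.get()
            if job is None:  # sentinel posted by close()
                break
            sql, params, done = job
            try:
                with db:  # commits on success, rolls back on error
                    db.execute(sql, params)
            except sqlite3.Error as error:
                done["error"] = error
            finally:
                done["event"].set()
        db.close()

    def submit(self, sql, params=()):
        """Enqueue a write; callers wait on the returned event if they need the result."""
        done = {"event": threading.Event(), "error": None}
        self._jobs.put((sql, params, done))
        return done

    def close(self):
        self._jobs.put(None)
        self._thread.join()
```

Callers on any thread would use `writer.submit("INSERT INTO mutations (payload) VALUES (?)", ("m1",))` and wait on the returned event. Because writes are executed one at a time in arrival order, two writers never compete for the SQLite lock through this path.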
Wanted to add this additional stack trace that has the
Thanks for the extra symbolication. This does point to a different area of the code--I had assumed this was hitting the error attempting to force-unwrap in the FailableIterator, but this points to the force-unwrap of the Statement's

I'll dig into this a bit to see if this changes my thinking about whether separating the DBs will help.
I think this ends up being basically the same path--the force unwrap in question is the one attempting to get the statement handle (rather than force-

Some other questions, which are basically grasping at straws at this point:
Thanks for your investigation and fast response, @palpatim. To answer your questions:
Hope this helps!
Thanks for the clarifications. At this point, I think separating the DB files is our best bet since the stack traces aren't displaying anything that looks like a deadlock or invalid SQLite access pattern.
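To illustrate why separate database files reduce contention, here is a minimal sketch in Python's standard `sqlite3` module. It is not the change that was made in the SDK. The three file names mirror the subsystems named earlier in the thread, but the names, the `items` table, and both functions are assumptions made for illustration.

```python
import os
import sqlite3
import tempfile

# Hypothetical per-subsystem database names, one file each.
SUBSYSTEMS = ("mutation_queue", "subscription_metadata", "record_cache")


def open_isolated_databases(directory, timeout_seconds=0.1):
    """Open one connection per subsystem, each backed by its own file."""
    connections = {}
    for name in SUBSYSTEMS:
        db = sqlite3.connect(
            os.path.join(directory, name + ".db"),
            timeout=timeout_seconds,
            isolation_level=None,  # autocommit; transactions are explicit
        )
        db.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, body TEXT)")
        connections[name] = db
    return connections


def write_while_other_subsystem_is_locked(directory):
    """Hold a write lock on one file and write to another; return the cache row count."""
    dbs = open_isolated_databases(directory)
    # The mutation queue holds its write lock for the duration of this block.
    dbs["mutation_queue"].execute("BEGIN IMMEDIATE")
    dbs["mutation_queue"].execute("INSERT INTO items (body) VALUES ('pending mutation')")
    # The record cache lives in a different file, so this write does not wait.
    dbs["record_cache"].execute("INSERT INTO items (body) VALUES ('cached record')")
    dbs["mutation_queue"].execute("COMMIT")
    count = dbs["record_cache"].execute("SELECT COUNT(*) FROM items").fetchone()[0]
    for db in dbs.values():
        db.close()
    return count


if __name__ == "__main__":
    print(write_while_other_subsystem_is_locked(tempfile.mkdtemp()))
```

SQLite locks are per database file, so a long write in one subsystem no longer forces the others to wait out the busy timeout. As noted earlier in the thread, this does not help when the contention is inside a single module.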
PR #171 is merged to master and should go out with the next release (which I'm working on getting done by end of week).
Thanks, @palpatim. That's a lot of work, and it's greatly appreciated!
This is released in 2.10.0. Please let us know if you see any issues; otherwise, we'll let this issue close out automatically.
Describe the bug
When returning from the background, we experience intermittent crashes related to the AppSync SDK.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Returning from the background should not result in a crash.
Environment (please complete the following information):
Device Information (please complete the following information):
Additional context
Here is a symbolicated stack trace:
Cause: Crash due to signal: SIGTRAP(TRAP_BRKPT) at 10bf4a1a4