Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent iOS 15 crashes by not polling isLowPowerModeEnabled #613

Merged

Conversation

earltedly
Copy link
Contributor

@earltedly earltedly commented Sep 28, 2021

What and why?

As described in #609, iOS 15 has introduced a race condition around ProcessInfo.isLowPowerModeEnabled:

The conditions under which this can occur are:

  1. Register to observer: NSProcessInfoPowerStateDidChangeNotification
  2. In the observation block, query ProcessInfo.isLowPowerModeEnabled
  3. Later on another piece of code calls ProcessInfo.isLowPowerModeEnabled
  4. Sometimes that will trigger a change in the power state which will be broadcast to observers before returning
  5. The code in (2) is called which re-enters isLowPowerModeEnabled
  6. Thread deadlock over an os_unfair_lock occurs and the app crashes

How?

We introduce a LowPowerModeMonitor which:

  1. Listens to NSProcessInfoPowerStateDidChangeNotification
  2. On any updates it queries ProcessInfo.isLowPowerModeEnabled and caches the return value

MobileDevice retains an instance of this in the local scope and queries it instead of ProcessInfo locally.

This prevents us from constantly polling for new values and thus takes us out of the call stack for re-entry.

Review checklist

  • Feature or bugfix MUST have appropriate tests (unit, integration)
  • Make sure each commit and the PR mention the Issue number or JIRA reference

Notes

This is a change in the implementation details rather than behaviour, so I've not added any additional tests

@earltedly earltedly requested a review from a team as a code owner September 28, 2021 14:31
Copy link
Contributor

@buranmert buranmert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey @earltedly 👋
thanks so much for the PR, we are really happy to have your contribution! 🥳

switching from polling to listening to notifications seems to be a step in the right direction, but i guess there is still a problem with isLowPowerModeEnabled property. i left a comment below 👇

Sources/Datadog/Core/System/MobileDevice.swift Outdated Show resolved Hide resolved
@@ -99,3 +100,29 @@ internal class MobileDevice {
}
}
}

private class LowPowerModeMonitor {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey guys 👋

Quick question, does this class needs to be open to subclassing? Why not putting it as final?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fairly sure this is implicit for a private class, however always best to be certain so I've added final too 👍

… they change rather than polling for them on every log submission
@earltedly earltedly force-pushed the V5IOSCS-1929-isLowPowerMode-iOS15-crash branch from 34b2d76 to 65f6ae9 Compare September 29, 2021 13:31
@earltedly
Copy link
Contributor Author

Thanks for the review 🎉 . I think I've addressed everything, but please let me know if anything else is needed.

Copy link
Contributor

@buranmert buranmert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once again thanks @earltedly 🙏
unfortunately CI failed due to linter errors, i left suggestions to fix them.
once these trivial errors are fixed, CI will probably pass and we can merge the PR 🚀

note: you can see linter output as warnings in Xcode and/or by running ./tools/lint/run_linter.sh in Terminal

Sources/Datadog/Core/System/MobileDevice.swift Outdated Show resolved Hide resolved
Sources/Datadog/Core/System/MobileDevice.swift Outdated Show resolved Hide resolved
Sources/Datadog/Core/System/MobileDevice.swift Outdated Show resolved Hide resolved
Sources/Datadog/Core/System/MobileDevice.swift Outdated Show resolved Hide resolved
Sources/Datadog/Core/System/MobileDevice.swift Outdated Show resolved Hide resolved
Sources/Datadog/Core/System/MobileDevice.swift Outdated Show resolved Hide resolved
Sources/Datadog/Core/System/MobileDevice.swift Outdated Show resolved Hide resolved
Comment on lines +121 to +124
guard let processInfo = notification.object as? ProcessInfo else {
return
}
self?.publisher.publishAsync(processInfo.isLowPowerModeEnabled)
Copy link

@yousefhamza yousefhamza Sep 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @earltedly 👋

Can you help me understand how accessing processInfo like this, prevents against calling the deadlock?

I'm confused because the object that posts the notification will be the same object that we query for isLowPowerModeEnabled which should cause the same issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @yousefhamza 👋

DatadogSDK currently polls processInfo.isLowPowerModeEnabled regularly, once in every 5 seconds. this PR switches from polling to notification-based readings: SDK will read processInfo.isLowPowerModeEnabled only when its value changes.
needless to say, as the author feel free to explain further @earltedly :)

assuming that it changes less often than once in every 5 seconds, this should reduce the reads and number of crashes if not solves the root cause (it's very likely that Apple needs to fix the root cause).

publishAsync is just to make this property thread-safe as it's written in main and is read in a background thread.

if your app experiences similar crashes too, it'd be very helpful to hear your stats if possible 👂
how often does this happen in your case? do you have the exact same stack traces? etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @yousefhamza 👋

It's just as @buranmert, at a worst case this will greatly reduce the number of times DatadogSDK access processInfo and thus reduces the potential for hitting this bug.

We can also hope that processInfo itself only notifies when the state has recently changed and that immediately querying it will be unlikely to cause processInfo to immediately find it has changed again (and thus re-broadcast within the same callstack).

Also, it would be great to hear if you're experiencing this too and in what kind of numbers.

Copy link
Member

@ncreated ncreated Sep 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add a bit more context - in worst case scenario, Datadog SDK might read processInfo.isLowPowerModeEnabled from 6 simultaneous threads. Given that we suspect it for not being thread-safe on iOS 15 it could lead to to a crash.

With the proposed change, the value of processInfo.isLowPowerModeEnabled would be read and published from a single thread and with the use of ValuePublisher it can be later accessed from any number of simultaneous threads in a thread-safe manner.

This should solve the crash originating from our SDK. It doesn't though address the problem of some other code (e.g. application code) reading shared ProcessInfo.processInfo.isLowPowerModeEnabled at the same time as we do. I think we can't do anything with this as the problem seems to be on the Apple side. I will fill radar when I manage to reproduce this issue with minimal code on iOS 15 device (@earltedly already did 🏅).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the radar: FB9661108. Filing a duplicate would be great as it increases the priority within Apple's triage system.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@buranmert @earltedly @ncreated Thanks for the very thorough responses 🙏

What I understand is it will help with this fix because:

  1. It reduces how many times isLowPowerModeenabled is queried
  2. Querying it after it has changed will give a much lower chance that it will change again while querying
  3. Reading it from a single thread will reduce the chances of multiple reads (which can cause firing a notification) at the same time, while making it thread safe.

I'm working on Instabug SDK and this issue came into my radar because we use ProcessInfo.processInfo.isLowPowerModeEnabled, we are also prone to this crash too. however, we didn't get any reports of it yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yousefhamza yes, your reasoning about this change is correct 👍 with the additional note that our MobileDevice is a shared component, instantiated only once from SDK init - so we will have only one active subscriber to LPM-change notification.

Thanks for more details, and I hope you will manage to prevent this crash in Instabug SDK 🙂🤞!

@earltedly
Copy link
Contributor Author

@buranmert Thanks for the the CR and fix it suggestions, I've applied them all so 🤞 it will pass now :)

Copy link
Member

@maxep maxep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👌

@ncreated ncreated merged commit f44f351 into DataDog:master Oct 4, 2021
ncreated added a commit that referenced this pull request Oct 4, 2021
…S15-crash

Prevent iOS 15 crashes by not polling isLowPowerModeEnabled

(cherry picked from commit f44f351)
@earltedly earltedly deleted the V5IOSCS-1929-isLowPowerMode-iOS15-crash branch October 5, 2021 14:48
@earltedly earltedly restored the V5IOSCS-1929-isLowPowerMode-iOS15-crash branch October 5, 2021 14:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants