-
Notifications
You must be signed in to change notification settings - Fork 129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RUMM-1026 Use sysctl to gather a more accurate App Launch Time #381
RUMM-1026 Use sysctl to gather a more accurate App Launch Time #381
Conversation
…r a more accurate App Launch Time
@@ -16,10 +16,7 @@ internal protocol LaunchTimeProviderType { | |||
|
|||
internal class LaunchTimeProvider: LaunchTimeProviderType { | |||
var launchTime: TimeInterval? { | |||
// Following dynamic dispatch synchronisation mutes TSAN issues when running tests from CLI. | |||
objc_sync_enter(self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't reproduce the TSAN issues locally, but I believe that if we don't involve any Objc objects (and call a C function) in that call paths there can't be any dynamic dispatches involved to begin with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable 👍. My suspicion behind this TSAN issue was indeed around the dynamic dispatch with this call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually had to come back to using objc_sync_enter()
/objc_sync_exit()
. I could not reproduce locally the test failure, but based on Bitrise logs, TSAN reported a potential data race on the call to AppLaunchTime()
where a write happened according to TSAN. When reporting, TSAN crashes on a SIGABRT
.
The only write in that C function might come from pthread_rwlock_rdlock(&rwLock);
which in itself must be thread safe (otherwise nothing can be). So anyway, this is back here, it might be extraneous but let's trust TSAN for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really smart approach, we might reuse this on Android too !
// Knowing how this value is collected and that it is written only once, the use of | ||
// `atomic` property is enough for thread safety. It guarantees that neither partial | ||
// nor garbage value will be returned. | ||
@property (atomic) NSTimeInterval timeToApplicationBecomeActive; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe atomic
is already the default for a @property
. That being said, I don't think this was necessary due the inherent sequencing of setting/getting this property throughout the App lifecycle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it looks like LaunchTimeProviderTests.testThreadSafety()
disagrees with my assumptions 🤔
see: https://app.bitrise.io/build/87853c6c2bfa5825#?tab=log
Thanks @xgouchet for the swift review! I'll keep that PR open a little bit longer to get @buranmert and @ncreated's opinion on this change :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i didn't know sysctl
gives that info, TIL 👍
} else { | ||
// Fallback to less accurate delta with DD's framework load time | ||
_processStartTime = _frameworkLoadTime; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we know documented cases where res != 0
?
this is quite a longshot but if res
is practically always 0 and if we can find a way to execute code pre-main in Swift, we can call sysctl
from Swift and remove this ObjC code. eventually, we can also remove _Datadog_Private
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per the documentation it's only going to be 0 or -1:
Upon successful completion, the value 0 is returned; otherwise the
value -1 is returned and the global variable errno is set to indicate the
error.
In general, we should be able to call sysctl()
at any time on the Swift side, but to get close to a relevant app launch time without the Objc +(void)load
trick we should either ask for explicit calls from our users (which could be easily overlooked), or integrate this mechanism into the Datadog/singleton setup that is hopefully done as part of the UIApplication.delegate
application(...:didFinishLaunching...)
by our users.
An obvious downside of that is spreading the whole app launch time state across our SDK, along with potential discrepancies if our assumptions of Datadog's setup are wrong.
// This is called at the `_Datadog_Private` load time, keep the work minimal | ||
NSTimeInterval now = CFAbsoluteTimeGetCurrent(); | ||
// Schedule a call for the first turn of run loop before the main nib/storyboard is loaded: | ||
[NSRunLoop.mainRunLoop performInModes:@[NSRunLoopCommonModes] block:^{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some piece of offline feedback @buranmert shared with me:
This run loop callback is specifically there as a mean to call markEndOfUIApplicationMain
at the right time. This is a value that we will not use yet. So maybe it would be better to take note of this solution for later and get rid of this block (i.e. just alloc
/init
and call scheduleUIApplicationDidFinishLaunchingMarking
inline).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO: if we don't use it → remove it as it's just a dead code. Do I miss something 🤔?
[[NSNotificationCenter defaultCenter] removeObserver:self]; | ||
- (void)scheduleUIApplicationDidFinishLaunchingMarking { | ||
id __block token = [NSNotificationCenter.defaultCenter | ||
addObserverForName:UIApplicationDidFinishLaunchingNotification |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've been using UIApplicationDidBecomeActiveNotification
before, but now we use UIApplicationDidFinishLaunchingNotification
- if they do not mark the same moment in time, this changes the definition of RUM "launch time" for iOS.
How is UIApplicationDidFinishLaunchingNotification
different than the first UIApplicationDidBecomeActiveNotification
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After the UIApplication.delegate
is done with the callbacks and UIWindow
is made key and visible, the sequence of notifications will be UIApplicationDidFinishLaunchingNotification
then UIApplicationDidBecomeActiveNotification
. So I believe they happen close to each other (depending on potential observers and the work they do).
That being said UIApplicationDidBecomeActiveNotification
is a repeating notification that is meant for the whole app lifecycle (although by removing the observer this is only fires initially) and UIApplicationDidFinishLaunchingNotification
fits with the semantic of App Launch Time better.
I don't think it really changes the definition of RUM "launch time" but I'm ok using UIApplicationDidBecomeActiveNotification
if that is safer as it's still related to the app first appearance anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it 👍. I also did some digging around and realised that "finished launching" relates to cold start, whereas "did become active" is for returning from background. Both should work fine and mean the same in our implementation 👌.
@property (atomic) NSTimeInterval timeToApplicationBecomeActive; | ||
// Note that the following properties will be set early enough in the App lifecycle to avoid any race conditions without any specific synchronization means. | ||
static NSTimeInterval TimeToApplicationFinishedLaunching = 0.0; | ||
static NSTimeInterval TimeToLoadApplicationRootUI = 0.0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any usages of this value - we only set it, but never read / process. Do we need it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I've sent this along for reference in case we consider this additional step eventually, but I'm going to clean this up in a new commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea to use the process start time 👌. I left few questions and suggestions. Main blocker for me is the presence of code that adds complexity but is not used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG 🎸!
…ApplicationDidBecomeActive
// https://developer.apple.com/documentation/uikit/app_and_environment/responding_to_the_launch_of_your_app/about_the_app_launch_sequence | ||
|
||
// A Read-Write lock to allow concurrent reads of TimeToApplicationDidBecomeActive, unless the initial (and only) write is locking it. | ||
static pthread_rwlock_t rwLock; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't it work with objc_sync_enter/exit
in LaunchTimeProvider
? now we have a global lock staying around, not great :/
What and why?
ObjcAppLaunchHandler
is the existing mechanism that observes early steps of the App Launch sequence to provide a Launch Time for further reporting byRUMViewScope
intoRUMActionEvent.Action
for thestart_application
event.This currently relies on Objective-C's runtime initialisation which calls classes'
+(void)load
method as soon as they are loaded into the runtime. This happens before theUIApplication.sharedInstance
is instantiated, and this provides an early point in time regarding the launch sequence.The limitations are that the "earliness" is not guaranteed and depends on many factors, so it might lead to not accounting for some launch time.
This is why we aim to use a more accurate and stable source of information for the App Launch time.
Note that in that PR, we don't change the existing definition of App Launch Time as we still include the time to load the main UI, create/setup the app delegate.
How?
In that PR, we refactor the pair
ObjcAppLaunchHandler
/AppLaunchTimer
into a singleAppLaunchHandler
class that becomes an implementation detail for a C functionAppLaunchTime()
.The
AppLaunchTimer
class still implements and reacts to+(void)load
to set up some observation onUIApplication
.Note the
AppLaunchTimer
is only kept alive by the capturing blocks until it has computed the times.More specifically,
AppLaunchTimer
callssysctl()
to get akinfo_proc
for the current process (getpid()
) and especially looks atp_starttime
.Review checklist
Feature or bugfix MUST have appropriate tests (unit, integration)