-
Notifications
You must be signed in to change notification settings - Fork 130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 1655100 - Capture device logs to diagnose intermittent iOS integration test failure #1133
Conversation
@@ -43,6 +43,11 @@ class DeletionRequestPingTest: XCTestCase { | |||
app.launchArguments = ["USE_MOCK_SERVER", "\(port)"] | |||
app.launch() | |||
|
|||
// Sleep for a 100ms to prevent an unlikely race condition that, while | |||
// not practically acheiveable in real-world situations, does appear to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this race condition? Can you describe what's racing against what and why that's unlikely to happen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was about to ask the same.
If we are running into a race condition here we should try to plug that as well (even if that happens in a followup and we do this hack for now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 'race' was more hypothesis than anything concrete. I could never get this to reproduce locally so the only thing I could think of that might affect this is that the test toggles the telemetry enabled switch as soon as the app is launched which could potentially still be during Glean's initialization. The race condition is I believe here, if the app beats Glean to here with toggling the switch before init is complete, then no deletion-request ping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 'race' was more hypothesis than anything concrete. I could never get this to reproduce locally so the only thing I could think of that might affect this is that the test toggles the telemetry enabled switch as soon as the app is launched which could potentially still be during Glean's initialization.
Shouldn't this be covered by #1046?
The race condition is I believe here, if the app beats Glean to here with toggling the switch before init is complete, then no deletion-request ping.
I believe we should probably let this race here then, and add more instrumentation (= log) to test the theories instead of suppressing it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As said in a private chat looking at this again (and oh, I wrote that code), that should probably be an atomic to prevent a possible race I can think of.
Otherwise I agree: Can we add logging and see if we trigger it again on CI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no issues with that, it looks like we already log this as an error inside of glean but I don't see any indication of this in the test logs. The test logs don't appear to capture device logging.
73c17f2
to
e939c6f
Compare
Okay, I've added a step that should capture the simulator device logs when we run the iOS integration tests. Running the iOS tests to ensure it works, so not ready for review yet. |
faf919a
to
a4db70c
Compare
.circleci/config.yml
Outdated
@@ -636,9 +636,15 @@ jobs: | |||
name: Run sample app tests | |||
command: | | |||
bash bin/run-ios-sample-app-test.sh | |||
SIMULATOR_UDID=$(xcrun instruments -s devices | grep 'iPhone 11 (13' | awk -F '[][]' '{print $2}') | |||
xcrun simctl spawn $SIMULATOR_UDID log collect --output ~/project/ios-integration-test.logarchive 2>/dev/null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL if there's only one simulator running, this works:
xcrun simctl spawn $SIMULATOR_UDID log collect --output ~/project/ios-integration-test.logarchive 2>/dev/null | |
xcrun simctl spawn booted log collect --output ~/project/ios-integration-test.logarchive 2>/dev/null |
Then we don't even need the UUID
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth a shot, I was just getting ready to try fastlane scan
Okay, with @badboy's help I think we have this where we are actually saving the logs in CI in such a way that we can see the Glean logs that we are interested in. |
I defer to @badboy for the review of this, seems it seems very iOS specific. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 -- this should get a changelog notice as we're changing how logging works a bit (not a major change for devs using it, but let's still call it out).
We should keep the associated bug open so that we take a look at the next time we see that intermittent (I leave a comment).
I'll also open a followup bug to change the Rust logging on iOS.
Okay, changelog entry added and I'll leave the bug open for now (I'll untake it and make it a P2 for now) |
Updated: This adds device logging in a way that we can capture them from the simulator used in testing on CI.