Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RUMM-640 Add API for clearing out all data #644

Merged

Conversation

ncreated
Copy link
Member

What and why?

📦 This PR adds Datadog.clearAllData() API.

/// Clears all data that has not already been sent to Datadog servers.
public static func clearAllData()

It deletes all authorised (.granted) and unauthorised (.pending) data (not yet uploaded) buffered on device. This includes data for Logging, Tracing, RUM and Internal Monitoring features.

How?

Because all our features use the same abstraction, I simply added DataOrchestrator type to FeatureStorage. Unlike FilesOrchestrator, DataOrchestrator manages data in multiple folders (we use it for managing authorised and unauthorised directories). The public API calls DataOrchestrator.deleteAllFiles() for each active feature.

Review checklist

  • Feature or bugfix MUST have appropriate tests (unit, integration)
  • Make sure each commit and the PR mention the Issue number or JIRA reference

@ncreated ncreated self-assigned this Oct 19, 2021
@ncreated ncreated requested a review from a team as a code owner October 19, 2021 11:10
Comment on lines +29 to +36
#if DD_SDK_COMPILED_FOR_TESTING
func markAllFilesAsReadable() {
queue.sync {
authorizedFilesOrchestrator.ignoreFilesAgeWhenReading = true
unauthorizedFilesOrchestrator.ignoreFilesAgeWhenReading = true
}
}
#endif
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, this was part of FileReader as it had the access to filesOrchestrator (not accessible anywhere else). As now, DataOrchestrator manages files orchestrators, it seems better to move this logic here. This allowed cleaning up FileReader so it no longer exposes filesOrchestrator internally.

Comment on lines 52 to 69
func testGivenDefaultWriteConditions_whenFileCanNotBeUsedMoreTimes_itCreatesNewFile() throws {
let orchestrator = configureOrchestrator(using: RelativeDateProvider(advancingBySeconds: 0.001))
var previousFile: WritableFile = try orchestrator.getWritableFile(writeSize: 1) // first use

var previousFile: WritableFile = try orchestrator.getWritableFile(writeSize: 1) // first use of a new file
var nextFile: WritableFile

// use file maximum number of times
for _ in (0 ..< performance.maxObjectsInFile).dropLast() { // skip first use
nextFile = try orchestrator.getWritableFile(writeSize: 1)
XCTAssertEqual(nextFile.name, previousFile.name) // assert it uses same file
for _ in (0..<5) {
for _ in (0 ..< performance.maxObjectsInFile).dropLast() { // skip first use
nextFile = try orchestrator.getWritableFile(writeSize: 1)
XCTAssertEqual(nextFile.name, previousFile.name, "It must reuse the file when number of events is below the limit")
previousFile = nextFile
}

nextFile = try orchestrator.getWritableFile(writeSize: 1) // first use of a new file
XCTAssertNotEqual(nextFile.name, previousFile.name, "It must obtain a new file when number of events exceeds the limit")
previousFile = nextFile
}

// next time it returns different file
nextFile = try orchestrator.getWritableFile(writeSize: 1)
XCTAssertNotEqual(nextFile.name, previousFile.name)
}
Copy link
Member Author

@ncreated ncreated Oct 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to this PR directly, but I found this FilesOrchestrator test very weak so I enhanced it. Now it tests that "new file is always started after writing 500 events", instead of "new file is started after writing 500 events". Difference is that now it adds logical coverage for this line which is crucial for the storage & upload performance.

@ncreated ncreated added the needs-docs To mark PRs which need documentation update label Oct 19, 2021
@@ -134,6 +134,11 @@ public class Datadog {
instance?.consentProvider.changeConsent(to: trackingConsent)
}

/// Clears all data that has not already been sent to Datadog servers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it might worth mentioning that this method is async and the files are not deleted once it returns.?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question 👌. Actually, initially I made this API sync to guarantee "no data" after it returns (by sacrificing the performance ofc), but then I realised that it makes no difference from the user standpoint. The SDK synchronizes all data collection internally so any event collected before clearAllData() will be deleted and any event collected after will be stored. That said, adding the information on async execution would be purely informative and could be considered a breaking change if we need to use sync for some reason in the future. WDYT @buranmert @maxep ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question indeed 🤔 Hypothetically, if I do:

log(something) // <- will be deleted
clearAllData()
log(something) // <- will be stored

Because we are using serial readWriteQueue queues in each feature, am I correct?
If so, I don't think we have to specify it's async, the clearAllData will be doing what it's expected to do in a optimised manner.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this snippet is correct - the underlying readWriteQueue is serial and it guarantees the order of operations. Just in case we ever need to change its behaviour, I'd avoid expressing implementation detail in the comment.

@ncreated ncreated merged commit 7bcb52a into master Oct 20, 2021
@ncreated ncreated deleted the ncreated/RUMM-640-add-API-for-clearing-out-all-buffered-data branch October 25, 2021 12:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-docs To mark PRs which need documentation update
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants