src: mark generated snapshot_data as const #45786

Merged: 1 commit, Dec 10, 2022

Conversation

addaleax
Member

@addaleax addaleax commented Dec 8, 2022

This renders the mutex protecting it unnecessary, since mutexes only need to protect concurrent accesses to mutable data.
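
As a rough illustration of the reasoning (not the actual Node.js code), the sketch below reads a `const` object from several threads with no lock. `kSnapshotData`, `GetSnapshotData` and `SumFromThreads` are hypothetical names made up for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical stand-in for generated, immutable snapshot data.
static const std::array<int, 4> kSnapshotData = {1, 2, 3, 4};

// No lock is needed in the accessor: the object is const and is never
// written after initialization, so concurrent reads cannot race.
const std::array<int, 4>* GetSnapshotData() { return &kSnapshotData; }

// Reads the shared data from several threads at once.
// Each thread writes only to its own slot of `sums`.
std::vector<int> SumFromThreads(std::size_t thread_count) {
  std::vector<int> sums(thread_count, 0);
  std::vector<std::thread> threads;
  for (std::size_t i = 0; i < thread_count; ++i) {
    threads.emplace_back([&sums, i] {
      const auto* data = GetSnapshotData();
      sums[i] = std::accumulate(data->begin(), data->end(), 0);
    });
  }
  for (auto& t : threads) t.join();
  return sums;
}
```

The argument only holds while nothing writes to the data; a single writer anywhere would bring the need for synchronization back.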

@addaleax addaleax requested a review from joyeecheung December 8, 2022 13:43
@nodejs-github-bot
Collaborator

Review requested:

  • @nodejs/startup

@addaleax addaleax added the fast-track and request-ci labels Dec 8, 2022
@nodejs-github-bot nodejs-github-bot added the c++ and needs-ci labels Dec 8, 2022
@github-actions
Contributor

github-actions bot commented Dec 8, 2022

Fast-track has been requested by @addaleax. Please 👍 to approve.

@github-actions github-actions bot removed the request-ci label Dec 8, 2022
@anonrig anonrig added the author ready label Dec 8, 2022
Member

@joyeecheung joyeecheung left a comment

We actually have a const cast here

@@ -1053,8 +1052,6 @@ static void ResetContextSettingsBeforeSnapshot(Local<Context> context) {
context->AllowCodeGenerationFromStrings(true);
}

Mutex SnapshotBuilder::snapshot_data_mutex_;
Member

@joyeecheung joyeecheung Dec 8, 2022

We actually have a const_cast at L1064, though I think that V8 doesn't actually need params->snapshot_blob to be non-const. Can you change v8::Isolate::CreateParams::snapshot_blob to a const pointer too and see if it works? If so, we should send a patch upstream first to get rid of the const cast, so that they don't start to actually mutate the blob without us knowing about it.

Member

Uh-oh, it appears V8 is const-casting too. 😅

: SerializedData(const_cast<byte*>(snapshot.begin()), snapshot.length()) {

Member Author

Hm yeah… our const_cast is technically invalid then as well, right? But yeah, agreed that the right thing to do here is to send a patch upstream. I’ll try to put one together and link it here.

I think what you’re bringing up is actually an argument to make this change either way, though. V8 really can’t mutate the blob, since it needs to be re-usable across Isolates (both concurrently and sequentially). If V8 were to mutate the blob, we would need to learn about that; and an easy way to do that is to make the actual data const so that attempting to mutate it crashes.

Member Author

@addaleax addaleax Dec 8, 2022

Actually, V8’s StartupData is just a const char* data + int raw_size. That can’t really be mutated by V8 anyway.

Member

@joyeecheung joyeecheung Dec 8, 2022

Yeah, it looks like V8 isn't actually mutating the data, just reusing the data structure that is also used for serialization. Perhaps some additional field in SerializedData that can be used to DCHECK mutation in non-const methods would be enough.
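
A minimal sketch of that idea, assuming a simplified wrapper class: `SerializedBytes`, `FromConst` and `read_only_` are hypothetical names for illustration and do not reflect V8's actual `SerializedData`, and a plain `assert` stands in for `DCHECK`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical wrapper, loosely modeled on the idea discussed above.
class SerializedBytes {
 public:
  // Wraps a mutable buffer; writes through this object are allowed.
  SerializedBytes(std::uint8_t* data, std::size_t size)
      : data_(data), size_(size), read_only_(false) {}

  // Wraps const input. The const_cast is only acceptable because every
  // mutating method checks read_only_ before writing.
  static SerializedBytes FromConst(const std::uint8_t* data,
                                   std::size_t size) {
    SerializedBytes result(const_cast<std::uint8_t*>(data), size);
    result.read_only_ = true;
    return result;
  }

  std::uint8_t At(std::size_t i) const {
    assert(i < size_);
    return data_[i];
  }

  // Non-const method: asserts that the wrapped data may be mutated.
  void Set(std::size_t i, std::uint8_t value) {
    assert(!read_only_ && "attempted to mutate a read-only blob");
    assert(i < size_);
    data_[i] = value;
  }

  bool read_only() const { return read_only_; }
  std::size_t size() const { return size_; }

 private:
  std::uint8_t* data_;
  std::size_t size_;
  bool read_only_;
};
```

Such a check only fires in builds where assertions are enabled, which matches the debug-only nature of a DCHECK.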

Member

If V8 were to mutate the blob, we would need to learn about that; and an easy way to do that is to make the actual data const so that attempting to mutate it crashes.

Isn't mutating const-casted data undefined behavior, so it's not certain that it would crash (or that it would crash reliably)? That's my main concern, because we might see crashes that can't easily be linked to the mutation, depending on how the mutation happens. It'd probably be safer to have some DCHECK against the mutation, either in V8 or on our side, where we DCHECK a verification of the snapshot checksum when we initialize a new isolate with the snapshot data.
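
The checksum variant could look roughly like the sketch below. It assumes FNV-1a as the checksum purely for brevity (not the checksum V8 or Node.js actually uses), and `SnapshotBlob`, `VerifySnapshot` and `InitializeFromSnapshot` are hypothetical names; `assert` stands in for `DCHECK`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// FNV-1a hash, chosen only because it is short. This is an assumption
// for illustration, not the checksum used by V8 or Node.js.
inline std::uint32_t Fnv1a(const char* data, std::size_t size) {
  std::uint32_t hash = 2166136261u;
  for (std::size_t i = 0; i < size; ++i) {
    hash ^= static_cast<std::uint8_t>(data[i]);
    hash *= 16777619u;
  }
  return hash;
}

// Hypothetical blob descriptor with a checksum recorded at build time.
struct SnapshotBlob {
  const char* data;
  std::size_t size;
  std::uint32_t expected_checksum;
};

// Returns true if the blob still matches its recorded checksum.
inline bool VerifySnapshot(const SnapshotBlob& blob) {
  return Fnv1a(blob.data, blob.size) == blob.expected_checksum;
}

// Hypothetical hook that would run before a new isolate is created.
inline void InitializeFromSnapshot(const SnapshotBlob& blob) {
  assert(VerifySnapshot(blob) && "snapshot blob was mutated");
  // ... hand the blob to the engine here ...
}
```

Unlike relying on a crash, a mismatch here points directly at the blob, at the cost of hashing the data on each initialization in debug builds.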

Member

@joyeecheung joyeecheung Dec 8, 2022

V8 really can’t mutate the blob, since it needs to be re-usable across Isolates (both concurrently and sequentially).

I think you are talking about the default snapshot, not the one passed in via Isolate::CreateParams? V8 also has a mutex to guard its default snapshot read from disk:

base::MutexGuard lock_guard(external_startup_data_mutex.Pointer());

(Not entirely sure if that's strictly necessary for V8 either, but I think that's why we also have a mutex in our own code.)

Member

On second thought, why is the mutex used in an accessor (both in our code and in V8)? It probably should've been somewhere surrounding Isolate::Initialize() instead to serve that purpose.

Member

@joyeecheung joyeecheung Dec 8, 2022

Oh I see, the V8 mutex is guarding against refreshing the snapshot while it's being read for isolate initialization, which isn't a thing for us because we don't refresh our embedded snapshot blob in any way. The one we read from disk is in a separate location. So we really don't need this mutex anyway (and if we really want to avoid a race caused by V8 mutating the blob, we should have a mutex surrounding Isolate::Initialize() instead). But with the V8 CL, if V8 mutates the const-pointed snapshot in Isolate::Initialize(), it's their bug, not ours, so we should just do this anyway.
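
To make the accessor point concrete, here is a hedged sketch of the two lock placements. `Blob`, `GetBlobLockedAccessor`, `InitializeFrom`, `InitializeWithLock` and `RefreshBlob` are all hypothetical; `InitializeFrom` merely stands in for whatever reads the blob during `Isolate::Initialize()`.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical shared blob that other code may refresh at any time.
struct Blob {
  std::string bytes;
};

static std::mutex blob_mutex;
static Blob shared_blob{"snapshot-v1"};

// Lock in the accessor: the lock is released when the function returns,
// so the caller ends up reading through the pointer without protection.
const Blob* GetBlobLockedAccessor() {
  std::lock_guard<std::mutex> lock(blob_mutex);
  return &shared_blob;
}

// Stand-in for the work that reads the blob during initialization.
std::size_t InitializeFrom(const Blob& blob) { return blob.bytes.size(); }

// Lock around the use: the blob cannot be refreshed while it is read.
std::size_t InitializeWithLock() {
  std::lock_guard<std::mutex> lock(blob_mutex);
  return InitializeFrom(shared_blob);
}

// Writer that refreshes the blob under the same mutex.
void RefreshBlob(std::string bytes) {
  std::lock_guard<std::mutex> lock(blob_mutex);
  shared_blob.bytes = std::move(bytes);
}
```

If nothing like `RefreshBlob` exists, as is the case for the embedded blob described above, neither lock placement is needed.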

Member Author

Isn't mutating const-casted data undefined behavior, so it's not certain that it would crash (or it would crash reliably)?

From a language perspective, yeah, it's UB. I was honestly expecting that marking this as const would put it in the .rodata section of the binary; looking at the compiled result, it seems like it's a bit too complex for that.

In any case, the V8 CL is merged now. If you'd like me to cherry-pick the Isolate::CreateParams change and remove the const_cast, I'm happy to do that as well.

DCHECK a verification of the snapshot checksum

Just to avoid ambiguity, this PR does not affect the const-ness of the actual snapshot data; that's already a const char* anyway (which also does actually end up in .rodata). If V8 does const_cast on that, then yeah, as you point out, it's on them to make sure they do it properly.

@joyeecheung joyeecheung removed the fast-track label Dec 8, 2022
Member

@joyeecheung joyeecheung left a comment

Blocking until we get rid of the const cast

@aduh95 aduh95 added the commit-queue label Dec 10, 2022
@nodejs-github-bot nodejs-github-bot removed the commit-queue label Dec 10, 2022
@nodejs-github-bot nodejs-github-bot merged commit 94d23f5 into nodejs:main Dec 10, 2022
@nodejs-github-bot
Collaborator

Landed in 94d23f5

@addaleax addaleax deleted the snapshot-data-const-mutex branch December 10, 2022 18:05
ErickWendel pushed a commit to ErickWendel/node that referenced this pull request Dec 12, 2022
This renders the mutex protecting it unnecessary, since mutexes
only need to protect concurrent accesses to mutable data.

PR-URL: nodejs#45786
Reviewed-By: Colin Ihrig
Reviewed-By: Yagiz Nizipli
Reviewed-By: Joyee Cheung
Reviewed-By: Minwoo Jung
targos pushed a commit that referenced this pull request Dec 12, 2022
targos pushed a commit that referenced this pull request Dec 13, 2022
danielleadams pushed a commit that referenced this pull request Dec 30, 2022
danielleadams pushed a commit that referenced this pull request Dec 30, 2022
danielleadams pushed a commit that referenced this pull request Jan 3, 2023
danielleadams pushed a commit that referenced this pull request Jan 4, 2023
danielleadams pushed a commit that referenced this pull request Jan 5, 2023