Split searchable snapshot into multiple repo operations #116918
Conversation
Each operation on a snapshot repository uses the same `Repository`, `BlobStore`, etc. instances throughout, in order to avoid the complexity arising from handling metadata updates that occur while an operation is running. Today we model the entire lifetime of a searchable snapshot shard as a single repository operation since there should be no metadata updates that matter in this context (other than those that are handled dynamically via other mechanisms) and some metadata updates might be positively harmful to a searchable snapshot shard. It turns out that there are some undocumented legacy settings which _do_ matter to searchable snapshots, and which are still in use, so with this commit we move to a finer-grained model of repository operations within a searchable snapshot.
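The finer-grained model can be pictured with a small sketch. All names below are illustrative stand-ins, not the actual Elasticsearch classes: instead of capturing one `Repository` instance for the shard's whole lifetime, each repository interaction re-resolves the current instance, so a replaced repository is picked up on the next operation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical registry: the real lookup lives in RepositoriesService.
class RepositoryRegistry {
    private final Map<String, Object> repositoriesByName = new ConcurrentHashMap<>();

    void put(String name, Object repository) {
        repositoriesByName.put(name, repository);
    }

    // Per-operation lookup: callers see whatever instance is current *now*,
    // rather than a reference captured once at shard startup.
    Supplier<Object> supplierFor(String name) {
        return () -> {
            Object repo = repositoriesByName.get(name);
            if (repo == null) {
                throw new IllegalStateException("repository [" + name + "] missing");
            }
            return repo;
        };
    }
}
```

If the registry swaps in a new instance (e.g. after a settings update), the next call through the supplier observes it without any long-lived reference going stale.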
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

Hi @DaveCTurner, I've created a changelog YAML for you.
```java
if (repositoryMetadata.name().equals(request.name())) {
    final RepositoryMetadata newRepositoryMetadata = new RepositoryMetadata(
        request.name(),
        repositoryMetadata.uuid(),
```
Here we copy the UUID from the previous repository instance rather than using `_na_`. The next time we load the `RepositoryData` we update the metadata if needed:
elasticsearch/server/src/main/java/org/elasticsearch/repositories/blobstore/BlobStoreRepository.java (lines 2465 to 2475 in cef1b54):

```java
if (loaded.getUuid().equals(metadata.uuid())) {
    listener.onResponse(loaded);
} else {
    // someone switched the repo contents out from under us
    RepositoriesService.updateRepositoryUuidInMetadata(
        clusterService,
        metadata.name(),
        loaded,
        new ThreadedActionListener<>(threadPool.generic(), listener.map(v -> loaded))
    );
}
```
We could conceivably be stricter here, see #109936, but it doesn't seem necessary today. Instead note that in `RepositorySupplier` we'll notice the change in UUID and look for a different repository with a matching UUID before eventually throwing a `RepositoryMissingException`.
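The fallback just described can be sketched roughly as follows. This is a hypothetical simplification, not the real `RepositorySupplier` code: resolve by name first, and if the name no longer carries the expected UUID, scan for a repository whose UUID does match before giving up.

```java
import java.util.List;
import java.util.NoSuchElementException;

// Illustrative stand-in for RepositoryMetadata.
record RepoMeta(String name, String uuid) {}

class RepositorySupplierSketch {
    // Find the repository we should keep using: prefer a name+UUID match,
    // then fall back to any repository with the expected UUID (it may have
    // been re-registered under a new name), else fail.
    static RepoMeta resolve(List<RepoMeta> repos, String name, String expectedUuid) {
        for (RepoMeta r : repos) {
            if (r.name().equals(name) && r.uuid().equals(expectedUuid)) {
                return r;
            }
        }
        for (RepoMeta r : repos) {
            if (r.uuid().equals(expectedUuid)) {
                return r; // renamed, but same underlying contents
            }
        }
        throw new NoSuchElementException("no repository with uuid " + expectedUuid);
    }
}
```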
I understand this is an optimisation for when the UUID does not change, which should be most cases. When the UUID does change, are you saying that:

1. Copying the UUID only delays the need to load the `RepositoryData`.
2. The `RepositoryData` will be loaded before any writing (e.g. `createSnapshot`) can happen, so the UUID will become consistent.

It is not clear to me why a cached `RepositoryData` would not be loaded in step 2, further delaying the UUID consistency update. The change somehow feels like it doesn't belong here, but I may be too paranoid.
If we're updating settings that fundamentally change the underlying repository then `org.elasticsearch.repositories.RepositoriesService#applyClusterState` will create a brand-new `Repository` instance to replace the existing one (i.e. `org.elasticsearch.repositories.RepositoriesService#canUpdateInPlace` will return `false`) and this new instance will have no cached `RepositoryData`.
It's kind of an optimization but also kind of vital for the behaviour here. If we don't do this then we can't see that the new `Repository` instance is the one we should use for searchable snapshot operations in future (at least not without blocking some thread somewhere while waiting for the new UUID to be loaded).
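A rough sketch of the in-place-update decision being described. The settings fields and class names here are invented for illustration; the real check is `RepositoriesService#canUpdateInPlace`: if only dynamic settings changed, the instance (and its cached `RepositoryData`) survives; if a fundamental setting changed, a fresh instance with no cache replaces it.

```java
// Hypothetical settings shape; real repositories carry type-specific settings.
record RepositorySettings(String type, String bucket, int chunkSizeMb) {}

class UpdateDecision {
    // A change of repository type or backing bucket swaps out the underlying
    // store entirely, so the Repository must be rebuilt; other (dynamic)
    // settings such as chunk size can be applied to the existing instance.
    static boolean canUpdateInPlace(RepositorySettings current, RepositorySettings updated) {
        return current.type().equals(updated.type())
            && current.bucket().equals(updated.bucket());
    }
}
```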
> create a brand-new `Repository`

Thanks! This is important information that I originally missed.

> If we don't do this then we can't see that the new Repository instance is the one we should use for searchable snapshot operations in future (at least not without blocking some thread somewhere while waiting for the new UUID to be loaded).

I see the point now. I guess that means searchable snapshot actions do not always load repo data as the first step? If the repo UUID did change, does that mean it would take a while before the searchable-snapshot code realises it? Would that lead to any issue?
> I guess that means searchable snapshot actions do not always load repo data as the first step?

Searchable snapshot actions essentially never load the `RepositoryData`. They already know how to find the shard data within the blob store (from the `index.store.snapshot.index_uuid` and `index.store.snapshot.snapshot_uuid` settings in the index metadata, and the shard ID). If the repo switches out from underneath them then they'll get exceptions indicating that the blobs they need are no longer found.
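The key property is that the shard's blob location is derivable from index settings alone, with no `RepositoryData` round-trip. The path layout below is purely illustrative of that idea (it is not the actual blob store layout):

```java
// Hypothetical: a searchable snapshot shard can compute its container path
// directly from values it already has in the index metadata.
class ShardBlobPath {
    static String shardContainerPath(String indexUuid, int shardId) {
        return "indices/" + indexUuid + "/" + shardId;
    }
}
```

If the repository contents are swapped out, this path simply stops resolving to real blobs, which surfaces as "blob not found" style exceptions rather than a metadata inconsistency.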
LGTM. It would be good to get a second review too.
LGTM

I guess the changes may introduce a minor performance overhead due to looking up the repository and `shardContainer`, but that seems to be in the nature of the requirement.
```java
static {
    // these thread names must be aligned with those in :server
    assert CACHE_FETCH_ASYNC_THREAD_POOL_NAME.equals(BlobStoreRepository.SEARCHABLE_SNAPSHOTS_CACHE_FETCH_ASYNC_THREAD_NAME);
    assert CACHE_PREWARMING_THREAD_POOL_NAME.equals(BlobStoreRepository.SEARCHABLE_SNAPSHOTS_CACHE_PREWARMING_THREAD_NAME);
}
```
Nit: We can also actively assign them to be equal, e.g.:

```java
CACHE_FETCH_ASYNC_THREAD_POOL_NAME = BlobStoreRepository.SEARCHABLE_SNAPSHOTS_CACHE_FETCH_ASYNC_THREAD_NAME;
```
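The nit can be illustrated concretely. The class names below are stand-ins for the real ones: defining one constant in terms of the other keeps the two names aligned by construction, so no static-initializer assertion is needed.

```java
// Stand-in for the :server side constant.
class BlobStoreRepositoryNames {
    static final String SEARCHABLE_SNAPSHOTS_CACHE_FETCH_ASYNC_THREAD_NAME =
        "searchable_snapshots_cache_fetch_async";
}

// Stand-in for the searchable-snapshots plugin side: aliasing instead of
// asserting means the names cannot drift apart.
class SearchableSnapshotsThreadPools {
    static final String CACHE_FETCH_ASYNC_THREAD_POOL_NAME =
        BlobStoreRepositoryNames.SEARCHABLE_SNAPSHOTS_CACHE_FETCH_ASYNC_THREAD_NAME;
}
```

The trade-off is a compile-time dependency from the plugin on the `:server` constant, which the assert-based approach avoids.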
```java
for (final Repository repository : repositoriesByName.values()) {
    if (repository.getMetadata().uuid().equals(repositoryUuid)) {
        repositoryNameHint = repository.getMetadata().name();
```
We could use a debug message here to indicate that the actual repository name does not match?
```java
private synchronized BlobContainer refreshAndGet() {
    final LastKnownState lastKnownState = this.lastKnownState;
```
Is this local variable necessary? We are inside `synchronized` and this is the only place where `this.lastKnownState` can be updated?
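A minimal sketch of the point being raised, assuming every write to the field is synchronized on the same monitor (the class below is invented for illustration): inside the `synchronized` method the field cannot change mid-call, so the local copy is a readability aid and a guard against repeated field reads rather than a correctness requirement.

```java
class Cache {
    private Object lastKnownState;

    synchronized Object refreshAndGet(Object fresh) {
        // Under the stated assumption this snapshot is stable for the whole
        // method body; reading this.lastKnownState again would see the same value.
        final Object observed = this.lastKnownState;
        if (observed == null || !observed.equals(fresh)) {
            this.lastKnownState = fresh;
            return fresh;
        }
        return observed;
    }
}
```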
```java
    return cluster.getHttpAddresses();
}

public void testReloadCredentialsFromKeystore() throws IOException {
```
We may want to skip this test when running in FIPS mode.
```java
    currentRepository.shardContainer(indexId, shardId)
);
this.lastKnownState = new LastKnownState(currentRepository, newContainer);
return newContainer;
```
Similarly, I wonder whether logging could be useful here.

Yeah, but in practice if we're asking for the
💔 Backport failed. The backport operation could not be completed due to the following error:

You can use sqren/backport to manually backport by running
Backport of elastic#116918 to 8.x

Backport of elastic#116918 to 8.16
Backports:
Unfortunately I can't see a way to backport this to 7.17 safely; the test framework is vastly different there and lacks many of the features we need.
Backport of #116918 to 8.x

* Split searchable snapshot into multiple repo operations: backport of #116918 to 8.16.
* Add end-to-end test for reloading S3 credentials: we don't seem to have a test that completely verifies that an S3 repository can reload credentials from an updated keystore; this commit adds such a test. Backport of #116762 to 8.16.