[Fix][Tiered Storage] Eagerly Delete Offloaded Segments On Topic Deletion #15914
Lgtm.
Great
Even though we should delete the offloaded data on topic deletion, this changes the default behavior. We'd better have a proposal to discuss it. Another point is that it can't prevent orphan ledgers, because the offloaded data deletion is async.
We need more eyes. @merlimat @codelipenghui @315157973 @Jason918 @zymap @horizonzy
I disagree that undocumented, silently orphaned data (with no tools or automated processes to detect it and clean it up) is a "default behavior" anyone ever expected or wanted. It looks like a bug that just happened.
In case of truncate internalTrimLedgers runs with
In case I missed some place where internalTrimLedgers completes the promise before the data deletion is done, I'll fix that, as long as we agree on the overall approach.
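To make the async concern concrete, here is a minimal sketch of a promise that completes only after every data deletion has finished. `WaitForAllDeletions` and `DataDeleter` are hypothetical names for illustration, not Pulsar or BookKeeper APIs; the only real API used is `CompletableFuture.allOf`.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

class WaitForAllDeletions {
    // Hypothetical deleter: the returned future completes when one ledger's data is gone.
    interface DataDeleter {
        CompletableFuture<Void> delete(long ledgerId);
    }

    // The returned promise completes only after every deletion has completed,
    // and completes exceptionally if any of them failed.
    static CompletableFuture<Void> deleteAll(List<Long> ledgerIds, DataDeleter deleter) {
        CompletableFuture<?>[] pending = ledgerIds.stream()
                .map(deleter::delete)
                .toArray(CompletableFuture[]::new);
        return CompletableFuture.allOf(pending);
    }
}
```

With this shape, a caller chaining on the returned future cannot observe completion while a deletion is still in flight.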
// Truncate to ensure the offloaded data is not orphaned.
// Also ensures the BK ledgers are deleted and not just scheduled for deletion
CompletableFuture<Void> truncateFuture = ledger.asyncTruncate();
truncateFuture.whenComplete((ignore, exc) -> {
How about moving this logic to ledger.asyncDelete()? It would cover more situations.
@horizonzy moved it into ManagedLedgerImpl
Overall LGTM.
My only concern is the case where the offloaded data is used by other systems, for example a Hive table: the topic deletion will delete the table, and other systems won't be able to get the data out of the table.
Another risk: if the ledger deletion succeeds but the deletion of some ledgers from BookKeeper fails, the topic metadata won't be deleted. Some ledgers will then have been deleted from storage while their metadata can still be found in the topic metadata, and consumers will fail to fetch the data. We are writing a proposal to solve this issue.
@@ -2687,6 +2687,22 @@ public void deleteLedgerFailed(ManagedLedgerException e, Object ctx) {

    @Override
    public void asyncDelete(final DeleteLedgerCallback callback, final Object ctx) {
        // Truncate to ensure the offloaded data is not orphaned.
        // Also ensures the BK ledgers are deleted and not just scheduled for deletion
        CompletableFuture<Void> truncateFuture = asyncTruncate();
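For readers following the thread, the truncate-then-delete ordering in this hunk can be sketched in isolation. This is only an illustration under stated assumptions: `Ledger`, `FakeLedger`, and `deleteMetadata` are hypothetical stand-ins, not the actual `ManagedLedgerImpl` code, and the sketch uses `thenCompose` where the patch uses `whenComplete`.

```java
import java.util.concurrent.CompletableFuture;

class TruncateThenDelete {
    // Hypothetical stand-in for a managed ledger; not the Pulsar API.
    interface Ledger {
        CompletableFuture<Void> asyncTruncate();
        CompletableFuture<Void> deleteMetadata();
    }

    // Metadata is deleted only after truncation succeeded;
    // a truncation failure fails the whole delete and leaves the metadata in place.
    static CompletableFuture<Void> delete(Ledger ledger) {
        return ledger.asyncTruncate().thenCompose(ignore -> ledger.deleteMetadata());
    }

    // In-memory fake used only to exercise the sketch.
    static final class FakeLedger implements Ledger {
        final boolean truncateFails;
        boolean metadataDeleted = false;

        FakeLedger(boolean truncateFails) {
            this.truncateFails = truncateFails;
        }

        @Override
        public CompletableFuture<Void> asyncTruncate() {
            return truncateFails
                    ? CompletableFuture.failedFuture(new IllegalStateException("truncate failed"))
                    : CompletableFuture.completedFuture(null);
        }

        @Override
        public CompletableFuture<Void> deleteMetadata() {
            metadataDeleted = true;
            return CompletableFuture.completedFuture(null);
        }
    }
}
```

The point of the ordering is that a failed truncation leaves the metadata intact, so the remaining data can still be located and the delete retried.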
If we use asyncTruncate to trigger deletion of the storage data, maybe we should modify this code:
pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java
Lines 2430 to 2434 in ce24db1
if (!factory.isMetadataServiceAvailable()) {
    // Defer trimming of ledger if we cannot connect to metadata service
    promise.complete(null);
    return;
}
When the metadata service is not available, the promise should complete with an exception.
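A minimal sketch of what that suggestion could look like, assuming a simplified `trim` method. `TrimGuard` and `MetadataUnavailableException` are hypothetical names and do not come from the Pulsar code base.

```java
import java.util.concurrent.CompletableFuture;

class TrimGuard {
    // Hypothetical exception type for the sketch.
    static final class MetadataUnavailableException extends RuntimeException {
        MetadataUnavailableException(String msg) {
            super(msg);
        }
    }

    // Signal failure instead of silently completing, so that a caller relying on
    // the trim (for example a delete) does not treat a skipped trim as success.
    static void trim(boolean metadataServiceAvailable, CompletableFuture<Void> promise) {
        if (!metadataServiceAvailable) {
            promise.completeExceptionally(
                    new MetadataUnavailableException("metadata service unavailable, trim skipped"));
            return;
        }
        // ... the actual trimming would happen here ...
        promise.complete(null);
    }
}
```

Whether periodic background trimming should also see an exception here, or only the delete path, is a design choice this sketch does not settle.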
The PR had no activity for 30 days; marked with the Stale label.
75c7192 to a912495
@eolivelli @hangc0276 please take another look.
        DeleteLedgerCallback callback, Object ctx) {
    final CompletableFuture<Map<Long, MLDataFormats.ManagedLedgerInfo.LedgerInfo>>
            ledgerInfosFuture = new CompletableFuture<>();
    store.getManagedLedgerInfo(managedLedgerName, false, null,
Why do we read ManagedLedgerInfo from the store instead of just using info from the parameter list?
info is an org.apache.bookkeeper.mledger.ManagedLedgerInfo; the store returns MLDataFormats.ManagedLedgerInfo.LedgerInfo, which has some additional info and is used in OffloadUtils.
@dlg99 This still confuses me. All data in ManagedLedgerInfo comes directly from MLDataFormats.ManagedLedgerInfo.LedgerInfo. I think it's better to just sync all the info to ManagedLedgerInfo.
Unfortunately, ManagedLedgerInfo is part of the public REST API.
For instance, here we return it (JSON-encoded) to the client:
pulsar/pulsar-broker/src/main/java/org/apache/pulsar/broker/admin/impl/PersistentTopicsBase.java
Line 1262 in 957a16b
PartitionedManagedLedgerInfo partitionedManagedLedgerInfo = new PartitionedManagedLedgerInfo();
We should not add everything to it.
We have this bad problem in Pulsar that we aren't always aware of what is leaking to the public APIs.
I prefer to keep the patch in this form.
And if we want to change ManagedLedgerInfo we can do it in a follow up work.
As this patch is fixing some kind of "bad problem" (because we are not deleting data that should have been deleted, that has some legal impact in some countries), this patch should be cherry-picked to active branches.
I won't add API changes in a patch that will be ported
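The separation argued for above can be illustrated with a small sketch: an internal record that carries offload details, and a public DTO that exposes only selected fields. All class and field names here are hypothetical and do not mirror the real `ManagedLedgerInfo` or `MLDataFormats` types.

```java
import java.util.Map;

class InfoSeparation {
    // Hypothetical internal record: carries offload details needed for cleanup.
    static final class InternalLedgerInfo {
        final long ledgerId;
        final long size;
        final Map<String, String> offloadDriverMetadata; // internal-only detail

        InternalLedgerInfo(long ledgerId, long size, Map<String, String> offloadDriverMetadata) {
            this.ledgerId = ledgerId;
            this.size = size;
            this.offloadDriverMetadata = offloadDriverMetadata;
        }
    }

    // Hypothetical public DTO: only the fields meant to be serialized for clients.
    static final class PublicLedgerInfo {
        final long ledgerId;
        final long size;

        PublicLedgerInfo(long ledgerId, long size) {
            this.ledgerId = ledgerId;
            this.size = size;
        }
    }

    // Explicit mapping keeps internal fields from leaking into the public API by accident.
    static PublicLedgerInfo toPublic(InternalLedgerInfo in) {
        return new PublicLedgerInfo(in.ledgerId, in.size);
    }
}
```

Adding a field to the internal type then has no effect on what clients see unless the mapping is changed deliberately.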
Unfortunately, ManagedLedgerInfo is part of the public REST API
Sorry, I missed this.
I prefer to keep the patch in this form. And if we want to change ManagedLedgerInfo we can do it in a follow up work.
As this patch is fixing some kind of "bad problem" (because we are not deleting data that should have been deleted, that has some legal impact in some countries), this patch should be cherry-picked to active branches. I won't add API changes in a patch that will be ported
Makes sense to me; this patch LGTM.
a912495 to 7c63673
@Jason918 I addressed your comments
LGTM
…pic Deletion (apache#15914)" This reverts commit 9026d19.
Tests are now failing due to #17736, because "trim" cannot happen if a ManagedLedger is "fenced", and we set "fenced" in "delete".
…tion (apache#15914) * Truncate topic before deletion to avoid orphaned offloaded ledgers * CR feedback (cherry picked from commit 9026d19)
…tion (apache#17915) (#152)
(cherry picked from commit 0854032)
Fixes apache#9962
### Motivation
Offloaded ledgers can be orphaned on topic deletion.
This is a redo of apache#15914, which conflicted with the concurrently merged apache#17736 and thus resulted in apache#17889.
apache#17736 decided not to allow managed ledger trimming for fenced mledgers, because in many cases fencing indicates a problem that should stop all operations on the mledger. At the same time, fencing is used before deletion starts, so trimming added to the deletion process cannot proceed.
After discussion with @eolivelli I introduced a new state, FencedForDeletion, which acts as the Fenced state except for trimming/deletion purposes.
### Modifications
Topic to be truncated before deletion to delete offloaded ledgers properly and fail if truncation fails.
### Verifying this change
local fork tests: dlg99#1
- [ ] Make sure that the change passes the CI checks.
This change added integration tests
### Does this pull request potentially affect one of the following parts:
*If `yes` was chosen, please highlight the changes*
Nothing changed in the options but admin CLI will implicitly run truncate before topic delete.
- Dependencies (does it add or upgrade a dependency): (yes / no)
- The public API: (yes / no)
- The schema: (yes / no / don't know)
- The default values of configurations: (yes / no)
- The wire protocol: (yes / no)
- The rest endpoints: (yes / no)
- The admin cli options: (yes / no)
- Anything that affects deployment: (yes / no / don't know)
### Documentation
Check the box below or label this PR directly.
Need to update docs?
- [ ] `doc-required` (Your PR needs to update docs and you will update later)
- [x] `doc-not-needed` (Please explain why)
- [ ] `doc` (Your PR contains doc changes)
- [ ] `doc-complete` (Docs have been already added)
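The state distinction described in the Motivation above can be sketched as a small enum with two predicates. This is an illustration only: apart from the names Fenced and FencedForDeletion, everything here (the `Open` state, `isFenced`, `canTrim`) is an assumption and not the actual ManagedLedgerImpl state machine.

```java
class LedgerStates {
    // Hypothetical, reduced set of states; the real managed ledger has more.
    enum State { Open, Fenced, FencedForDeletion }

    // Regular operations are refused in either fenced state.
    static boolean isFenced(State s) {
        return s == State.Fenced || s == State.FencedForDeletion;
    }

    // Trimming is refused when Fenced (fencing may indicate a problem),
    // but allowed when the fence was set by the delete path itself.
    static boolean canTrim(State s) {
        return s == State.Open || s == State.FencedForDeletion;
    }
}
```

Under these assumptions the delete path can fence the ledger against new operations and still run the truncation it depends on.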
@dlg99 could you please cherry-pick this PR to branch-2.9? Thanks.
@dlg99 hi, I move this PR to |
Remove the |
Fixes #9962
Motivation
Offloaded ledgers can be orphaned on topic deletion.
Modifications
Topic to be truncated before deletion to delete offloaded ledgers properly and fail if truncation fails.
Verifying this change
This change added integration tests
Does this pull request potentially affect one of the following parts:
If yes was chosen, please highlight the changes
Nothing changed in the options but admin CLI will implicitly run truncate before topic delete.
Documentation
Check the box below or label this PR directly.
Need to update docs?
doc-required (Your PR needs to update docs and you will update later)
doc-not-needed (Please explain why)
doc (Your PR contains doc changes)
doc-complete (Docs have been already added)