-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
close writer before soft deletion of searchindex #25
base: master
Are you sure you want to change the base?
Conversation
6c00965
to
f15dce5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable to me (with some very rusty Scala knowledge). I have a few comments on the approach, but I'll defer to Bob for a final opinion.
case e: IOException => | ||
warn("Error while closing writer", e) | ||
ctx.args.writer.close() | ||
case e: IOException => warn("Error while closing writer", e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though we have the correct close-then-rename order, do you think it's worth including the suggested approach from the Lucene 4.6.1 docs, which was something like:
try {
writer.close();
} finally {
if (IndexWriter.isLocked(directory)) {
IndexWriter.unlock(directory);
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion. I can add this.
val manager = self | ||
node.spawn((_) => { | ||
openTimer.time { | ||
IndexService.start(node, ctx.args.config, path, "standard") match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not a bit wasteful? (Imagine a bulk deletion case: it dramatically increases the intensity of the work done.)
It seems to me that here we could:
- Verify it's not in LRU
- Directly execute the rename operation on the appropriate directory in this actor
(It looks to me like IndexService.softDelete()
is already passed almost all of the context it requires, and it wouldn't be too difficult to call it from here either.)
A downside here would be additional error handling (e.g., if the directory doesn't exist, etc.). Another would be that we might increase the time taken to execute the 'soft_delete
operations in this actor, and delay legitimate open operations, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of logic in IndexService.softDelete()
can be moved to IndexManagerService.scala
. Just one line ctx.args.writer.close()
in https://github.com/cloudant-labs/clouseau/pull/25/files#diff-2ed27527cbacd7a17e06d19270f99182R185, this still needs IndexService.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"soft" delete should be handled the same as "hard" delete. A message to be sent from dreyfus to the IndexService actor which, on receipt, closes the index and renames its files then calls exit on its own pid. This ensures the correct order of processing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not duplicate part of the OpenIndexMsg logic here. I note also that this doesn't put the Index in the LRU, so could potentially lead us to attempt to open the index multiple times.
@@ -34,6 +34,7 @@ case class RenamePathMsg(dbName: String) | |||
case class CleanupDbMsg(dbName: String, activeSigs: List[String]) | |||
case class DiskSizeMsg(path: String) | |||
case class CloseLRUByPathMsg(path: String) | |||
case class SoftDeleteMsg(path: String) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should delete RenamePathMsg above.
val manager = self | ||
node.spawn((_) => { | ||
openTimer.time { | ||
IndexService.start(node, ctx.args.config, path, "standard") match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"soft" delete should be handled the same as "hard" delete. A message to be sent from dreyfus to the IndexService actor which, on receipt, closes the index and renames its files then calls exit on its own pid. This ensures the correct order of processing.
cd53287
to
105b805
Compare
rename(srcDir, destDir) | ||
case SoftDeletePathMsg(path: String) => | ||
logger.info("Soft-deleting " + path) | ||
val pattern = Pattern.compile(path + "/([0-9a-f]+)$") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this a different pattern to the one we use in CleanupDbMsg?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for CleanupDbMsg
, the passed is pure database name while database shardname is passed for SoftDeletePathMsg, like shards/00000000-ffffffff/foo.1234567890
. So the regexp is different for these two messages.
softDelete(file, path, includePattern) | ||
} | ||
|
||
logger.info(fileOrDir.getAbsolutePath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this line.
@@ -220,7 +227,16 @@ class IndexService(ctx: ServiceContext[IndexServiceArgs]) extends Service(ctx) w | |||
ctx.args.writer.rollback() | |||
} catch { | |||
case e: AlreadyClosedException => 'ignored | |||
case e: IOException => warn("Error while closing writer", e) | |||
case e: IOException => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the right pattern. rollback() is already a close() (but without the commit), so the fallback to a failed rollback() is just the isLocked/unlock portion (the finally clause).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, let me remove close() line.
@@ -29,4 +35,33 @@ object Utils { | |||
new BytesRef(string) | |||
} | |||
|
|||
def rename(rootDir: File, dbName: String, sig: String) { | |||
val logger = Logger.getLogger("clouseau.utils") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the rootDir here is always the same, so don't pass it in, simply fetch it within this method. This will make it clearer which directory we are renaming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks that there is noctx
in Utils. So I keep this for now.
hi Bob @rnewson thanks for your comments. I addressed them. For now, I select the approach of only closing process with pid, and then move |
d59a2ff
to
61b636c
Compare
61b636c
to
bbf7bbe
Compare
Overview
This fix is to refactor the logic for soft-deletion of search index. In the past, there is case where write.lock was stayed in search index directory and can't be moved. It is due to the fact that the index was not closed gracefully before moving/renaming search index. With this fix, the search index writer can be closed, and then search index can be moved to .deleted directory smoothly.
Testing recommendations
Related Issues or Pull Requests
apache/couchdb#2130
Checklist
rel/overlay/etc/default.ini