Support cluster purge #7
chguocloudant commented on Nov 30, 2016
- add 2 APIs(set_purge_seq, get_purge_seq).
- add a test case to test the new APIs.
Where is purgeSeq persisted to the index?
@@ -70,6 +70,7 @@ class IndexService(ctx: ServiceContext[IndexServiceArgs]) extends Service(ctx) w
var reader = DirectoryReader.open(ctx.args.writer, true)
var updateSeq = getCommittedSeq
var pendingSeq = updateSeq
var purgeSeq = 0L
Why is the default value 0? Like `updateSeq`, it should be read from the commitData on disk.
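The point about initialising from disk can be sketched as follows. This is a minimal illustration, not clouseau's actual code: the commit user data is modelled as a plain `Map[String, String]`, and the helper name `CommitDataReader` and the key names `update_seq` and `purge_seq` are assumptions made for the example.

```scala
// Hypothetical helper, not clouseau's actual code: the commit user data is
// modelled as a plain Map[String, String] instead of a Lucene commit point.
object CommitDataReader {
  // Returns the stored sequence, or 0 when the key is absent (for example a
  // new index, or one written before a purge sequence was recorded).
  def readSeq(commitData: Map[String, String], key: String): Long =
    commitData.get(key).map(_.toLong).getOrElse(0L)
}

object CommitDataExample {
  // Assumed key names, for illustration only.
  val onDisk: Map[String, String] = Map("update_seq" -> "42", "purge_seq" -> "7")

  // Initialise from what was last committed rather than from a literal 0L.
  val purgeSeq: Long = CommitDataReader.readSeq(onDisk, "purge_seq")

  // An index with no commit data still falls back to 0.
  val freshIndex: Long = CommitDataReader.readSeq(Map.empty, "purge_seq")
}
```

With this shape, 0 is only the fallback for an index that has never recorded a purge sequence, which mirrors how the comment describes `updateSeq` being handled.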
@@ -199,6 +206,26 @@ class IndexService(ctx: ServiceContext[IndexServiceArgs]) extends Service(ctx) w
}
}

private def commitPurgeSeq(newSeq: Long) {
This method does not set the committing flag and so could allow concurrent attempts to commit, leading to a broken index (one with the wrong values).
This work should instead be done in the preceding `commit` function.
You mean we can commit `purgeSeq` and `updateSeq` with the `commit` function?
We can change the `commit` function to take 2 arguments to support it.
// `type` is a reserved word in Scala, so the second parameter is named seqType
private def commit(newSeq: Long, seqType: String) {
if (seqType == "updateSeq") {
...
} else {
...
}
}
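An alternative to branching on a type tag is to pass both sequences to one `commit` call and guard it with a committing flag, which is the concern raised in the review comment above. The sketch below is only an illustration under stated assumptions: `SeqCommitter` is a hypothetical stand-in for the index service, the commit data is a plain map, the key names are invented, and an `AtomicBoolean` stands in for whatever flag the real service uses.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical stand-in for the index service state; not clouseau's class.
final class SeqCommitter {
  private val committing = new AtomicBoolean(false)

  @volatile var updateSeq: Long = 0L
  @volatile var purgeSeq: Long = 0L
  // Stand-in for the user data written alongside an index commit.
  @volatile var commitData: Map[String, String] = Map.empty

  // Persists both sequences in one commit. Returns false, without touching
  // any state, when another commit is already in flight.
  def commit(newUpdateSeq: Long, newPurgeSeq: Long): Boolean = {
    if (!committing.compareAndSet(false, true)) {
      false
    } else {
      try {
        commitData = Map(
          "update_seq" -> newUpdateSeq.toString, // assumed key name
          "purge_seq" -> newPurgeSeq.toString    // assumed key name
        )
        // The in-memory committed values only move once the data is written.
        updateSeq = newUpdateSeq
        purgeSeq = newPurgeSeq
        true
      } finally {
        committing.set(false)
      }
    }
  }
}
```

Writing both values in the same commit means the index can never record an update sequence from one commit and a purge sequence from another.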
- add 2 APIs(set_purge_seq, get_purge_seq).
- keep purge_seq in clouseau.
- adjust the commit logic to update purge seq.
- update the test code.
BugzID: 68280
86f9452 to a8c3c8e
Approved, nicely done. Please don't merge until we have the matching dreyfus work at the same level in case it changes what we need here.
@@ -569,7 +580,8 @@ class IndexService(ctx: ServiceContext[IndexServiceArgs]) extends Service(ctx) w
('doc_count, reader.numDocs),
('doc_del_count, reader.numDeletedDocs),
('pending_seq, pendingSeq),
('committed_seq, getCommittedSeq)
('committed_seq, getCommittedSeq),
('purge_seq, pendingPurgeSeq)
Should be `purgeSeq` here, not `pendingPurgeSeq`.
OK, I will change it later.
case SetPurgeSeqMsg(newSeq: Long) =>
pendingPurgeSeq = newSeq
logger.debug("Pending purge sequence is now %d".format(newSeq))
commit(pendingSeq, pendingPurgeSeq)
Hi @rnewson.
I trigger a commit whenever the purgeSeq is updated. Will it cost too much?
And I think the var `pendingPurgeSeq` can be removed if the `commit` is allowed. Right?
hm, good question. I'd rather we always commit asynchronously and on the established interval. For users that purge a lot (and there will be a lot of them if this work is extended to allow 'hard' deletes) they'll be committing very frequently and this will hurt performance.
This means the purge seq message could be processed but not actioned, when clouseau crashes before committing, so we'll need to ensure that dreyfus will retry (it should, since it reads the purge seq from the index, but let's be sure).
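The cost argument can be illustrated with a small sketch in which purge messages only update a pending value and the commit happens on a periodic tick. Everything here is hypothetical: `IntervalCommitter`, `onSetPurgeSeq`, and `onTick` are names invented for the example, and the commit itself is reduced to a counter.

```scala
// Hypothetical sketch: purge messages are cheap, commits happen per tick.
final class IntervalCommitter(initialPurgeSeq: Long) {
  private var pendingPurgeSeq: Long = initialPurgeSeq
  private var purgeSeq: Long = initialPurgeSeq
  private var commits: Int = 0

  // Handling a purge message records the new value and nothing else.
  def onSetPurgeSeq(newSeq: Long): Unit = { pendingPurgeSeq = newSeq }

  // Called on the established commit interval. Commits only when something
  // changed, and reports whether a commit was performed.
  def onTick(): Boolean =
    if (pendingPurgeSeq == purgeSeq) {
      false
    } else {
      purgeSeq = pendingPurgeSeq // stand-in for the expensive index commit
      commits += 1
      true
    }

  def committedPurgeSeq: Long = purgeSeq
  def commitCount: Int = commits
}
```

Three purges between two ticks cost a single commit in this model, whereas committing inside the message handler would cost three.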
Hi @rnewson.
If we commit purgeSeq asynchronously and users also purge a lot, dreyfus will retry to delete the index again in clouseau.
For example, if dreyfus gets a purgeSeq(6) from couchdb and an IdxPurgeSeq(5) from clouseau, it will purge the index from 5 and set purgeSeq to 6.
If clouseau crashes before committing, dreyfus will get purgeSeq(5) again and will purge the index from 5 again.
But I am not sure which is more expensive, delete index or commit?
we can safely assume clouseau crashes are rare from our production experience.
commit is expensive, the most expensive operation there is in Lucene, I think.
So the choice is between an expensive commit for every purge or a less expensive delete on retry for every clouseau crash.
We should also not say "delete index" when we really mean "delete document". :)
OK. I think I can modify it later.
Let me make sure: we can remove the value `pendingPurgeSeq`, right? If dreyfus calls `get_purge_seq`, clouseau will return `purgeSeq` directly (this value may not be committed yet).
I don't see how we can remove pending. we need to return the committed purge seq when asked and need to retain the pending value for when we commit.
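The need for both values can be shown with the 5-to-6 example from earlier in the thread. This is a sketch under assumptions, not the service's code: `PurgeSeqState` is a hypothetical model in which `getPurgeSeq` returns only the committed value, and a crash is simulated by rebuilding the state from the last value written to disk.

```scala
// Hypothetical model of the pending/committed split discussed above.
final class PurgeSeqState(committedOnDisk: Long) {
  private var purgeSeq: Long = committedOnDisk
  private var pendingPurgeSeq: Long = committedOnDisk

  // Records the requested value; the committed value is untouched.
  def setPurgeSeq(newSeq: Long): Unit = { pendingPurgeSeq = newSeq }

  // Callers are only ever told what has been committed.
  def getPurgeSeq: Long = purgeSeq

  // Promotes pending to committed and returns what is now on disk.
  def commit(): Long = { purgeSeq = pendingPurgeSeq; purgeSeq }
}

object PurgeRetryExample {
  // Crash after the message was processed but before the commit:
  // the restarted service reports 5, so the caller retries from 5.
  def crashBeforeCommit(): Long = {
    val disk = 5L
    val beforeCrash = new PurgeSeqState(disk)
    beforeCrash.setPurgeSeq(6L) // processed, not yet committed
    new PurgeSeqState(disk).getPurgeSeq
  }

  // Commit completed before the restart: the new value survives.
  def commitThenRestart(): Long = {
    val beforeRestart = new PurgeSeqState(5L)
    beforeRestart.setPurgeSeq(6L)
    val disk = beforeRestart.commit()
    new PurgeSeqState(disk).getPurgeSeq
  }
}
```

If only one variable were kept, a caller could be told 6 before the commit and then 5 after a crash, which is the inconsistency the pending value avoids.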
- adjust the commit logic to update purge seq.
BugzID: 68280
@@ -117,6 +120,10 @@ class IndexService(ctx: ServiceContext[IndexServiceArgs]) extends Service(ctx) w
pendingSeq = newSeq
logger.debug("Pending sequence is now %d".format(newSeq))
'ok
case SetPurgeSeqMsg(newPurgeSeq: Long) =>
purgeSeq = newPurgeSeq
This should be `pendingPurgeSeq = newPurgeSeq`. `purgeSeq` must only change after it's been committed.
Got it. I will modify it.
- use pendingPurgeSeq in set_purge_seq and get_purge_seq.
- update purgeSeq when pendingPurgeSeq is committed.
BugzId: 68280
+1 but let's merge once the matching dreyfus PR is ready.
@@ -117,6 +121,10 @@ class IndexService(ctx: ServiceContext[IndexServiceArgs]) extends Service(ctx) w
pendingSeq = newSeq
logger.debug("Pending sequence is now %d".format(newSeq))
'ok
case SetPurgeSeqMsg(newPurgeSeq: Long) =>
pendingPurgeSeq = newPurgeSeq
logger.debug("purge sequence is now %d".format(newPurgeSeq))
should say 'Pending purge sequence is now %d'
OK, got it. I will modify it later.