-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
block: Handle gracefully extra duplicated chunk produced by TSDB issue. #375
Conversation
Looks good! The tests are failing, but it looks like it's due to the changes you made to some function signatures, nothing critical. I just have a few minor comments. |
cmd/thanos/bucket.go
Outdated
for i, bid := range *verifyIDWhitelist { | ||
id, err := ulid.Parse(bid) | ||
if err != nil { | ||
return errors.Wrap(err, "not valid ULID found in --id-whitelist flag") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"not valid" → "invalid"
pkg/block/index.go
Outdated
|
||
n := i.OutsideChunks - (i.CompleteOutsideChunks + i.Issue347OutsideChunks) | ||
if n > 0 { | ||
errMsg += fmt.Sprintf("found %d chunks non-completely outside the block time range.", n) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"non-completely" → "partially"
pkg/block/index.go
Outdated
// GatherIndexIssueStats returns useful counters as well as outsider chunks (chunks outside of block time range) that | ||
// helps to assess index health. | ||
// It considers https://github.com/prometheus/tsdb/issues/347 as something that Thanos can handle. | ||
// See Stats.AcceptableOutsiders for details. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you renamed Stats.AcceptableOutsiders
to Stats.Issue347OutsideChunks
.
pkg/block/index.go
Outdated
if c.MinTime > maxTime || c.MaxTime < minTime { | ||
stats.CompleteOutsiders++ | ||
stats.CompleteOutsideChunks++ | ||
} else if c.MinTime == maxTime && c.MaxTime > maxTime { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if it can actually happen, but in theory a chunk with a single value could have c.MinTime == c.MaxTime
, and if it's also at the very end of the block (c.MinTime == maxTime
), it wouldn't be picked up by this condition.
I think you just need to test if c.MinTime == maxTime
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can agree with it, but single sample chunk is very sad chunk (: But can happen probably.
return true, nil | ||
} | ||
return false, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see why you would collect separate stats for outside chunks and complete outside chunks, but when it comes to repairing blocks, they're really the same thing.
I mean if you have two intervals, block = [a, b)
and chunk = [c, d]
, they don't overlap at all even if b == c
. So you could have a single ignore function:
func IgnoreOutsideChunk(mint, maxt int64, _ *chunks.Meta, curr *chunks.Meta) (bool, error) {
if curr.MinTime >= maxt || curr.MaxTime < mint {
return true, nil
}
return false, nil
}
If you don't merge those two functions, then I think you should change the condition in IgnoreIssue347OutsideChunk()
to
if curr.MinTime == maxt {
return true, nil
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea will do latter. Can agree that "complete" and "near" is kind of the same -> we can kill them just fine, but there is no upstream issue causing "complete". with issue347 we know what is that root cause and we know that it will not be fixed until it lands and land into Prometheus AND Prometheus go released. This will be quite a long time. With complete
and duplicates
even though they safe to repair they only occurred once Thanos had an issue, so the only explicit manual command should be fixing these, nothing automated.
pkg/compact/compact.go
Outdated
return false, errors.Wrapf(err, "gather index issues for block %s", pdir) | ||
} | ||
|
||
if err = stats.CriticalErr(); err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably want to use if err := ...
here, since you're in a different scope. Same for a few more if
conditions below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also err != nil
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ups
cmd/thanos/bucket.go
Outdated
for _, bid := range whilelistIDs { | ||
if id.Compare(bid) == 0 { | ||
found = true | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return true
here and get rid of found
variable
pkg/block/index.go
Outdated
ExactSum int | ||
if i.OutOfOrderSeries > 0 { | ||
errMsg += fmt.Sprintf( | ||
"%d/%d series have an average of %.3f out-of-order chunks. "+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the sentence should be "x, y, z" as opposed to "x. y. z". You should probably have errMsg []string
and join it with ", " in the end.
pkg/block/index.go
Outdated
repl = append(repl, c) | ||
continue | ||
if ignore { | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would labelling outer loop with OUTER_LOOP:
and continue OUTER_LOOP
work here?
pkg/block/index.go
Outdated
return false, nil | ||
} | ||
|
||
if curr.MinTime > prev.MaxTime { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you have curr.Mintime = prev.MaxTime
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no. Chunks never overlaps. Blocks overlaps with miliesecond always but means [t1, t2) (TSDB known inconsistency). But good question.
pkg/block/index.go
Outdated
) | ||
|
||
for _, ignoreChkFn := range ignoreChkFns { | ||
ignore, err = ignoreChkFn(mint, maxt, last, &c) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some ignore check functions will fail if last is nil, is this ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes - it is all handled.
pkg/compact/compact.go
Outdated
@@ -532,35 +534,35 @@ func (cg *Group) areBlocksOverlapping(include *block.Meta, excludeDirs ...string | |||
return nil | |||
} | |||
|
|||
func (cg *Group) compact(ctx context.Context, dir string, comp tsdb.Compactor) (compID ulid.ULID, err error) { | |||
func (cg *Group) compact(ctx context.Context, dir string, comp tsdb.Compactor) (nothingToCompact bool, err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't seem to be utilizing named returns
pkg/compact/compact.go
Outdated
return false, errors.Wrapf(err, "gather index issues for block %s", pdir) | ||
} | ||
|
||
if err = stats.CriticalErr(); err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also err != nil
?
pkg/compact/compact.go
Outdated
return false, errors.Wrapf(err, "deleting old block %s failed. You need to delete this block manually", id) | ||
} | ||
|
||
return false, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we continue with execution here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can remove this return
as we would want to continue with the rest of the pdir := range plan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall, i think it would be good to make the Compact
section of code more clear ... also changes as requested by @BenoitKnecht + @stefan-improbable
cmd/thanos/bucket.go
Outdated
|
||
idMatcher = func(id ulid.ULID) bool { | ||
found := false | ||
for _, bid := range whilelistIDs { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought - would it be better to do this using a map? map[ulid.ULID]struct{}
then see if the ulid is present in the map? Not a big deal anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you cannot simply do map[ulid.ULID]struct{}
(ulid.ULID is a byte slice) but we cannot do map[string] - good point!
cmd/thanos/compact.go
Outdated
// We keep going through the outer loop until no group has any work left. | ||
if id != (ulid.ULID{}) { | ||
if !nothingToCompact { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit - !nothingToCompact
reads weird would prefer !completed
or !compactionCompleted
pkg/compact/compact.go
Outdated
@@ -440,23 +442,23 @@ func (cg *Group) Resolution() int64 { | |||
|
|||
// Compact plans and runs a single compaction against the group. The compacted result | |||
// is uploaded into the bucket the blocks were retrieved from. | |||
func (cg *Group) Compact(ctx context.Context, dir string, comp tsdb.Compactor) (ulid.ULID, error) { | |||
func (cg *Group) Compact(ctx context.Context, dir string, comp tsdb.Compactor) (bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to add to comment what the bool represents. Will returns true if you should continue compaction on subsequent groups.
pkg/compact/compact.go
Outdated
if err != nil { | ||
cg.compactionFailures.Inc() | ||
} | ||
cg.compactions.Inc() | ||
|
||
return id, err | ||
return continueCompact, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Abit confused here ... continueCompaction
here turns into nothingToCompact
in thanos/compact.go
and in the inner compact
below you return nothingToCompact
as true
if its done ... so this then gets assinged to continueCompact
... maybe im missing something but the vars are abit confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just change it all to continueCompact
? Think this would read a lot better and help keep it simple.
pkg/compact/compact.go
Outdated
} | ||
|
||
if err = stats.Issue347OutsideChunksErr(); err != nil { | ||
level.Info(cg.logger).Log("msg", "stopping compaction and repairing invalid plan block", "err", err.Error(), "broken_block_id", pdir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit - sentence
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually looks like this meets convention so ignore 👍
pkg/compact/compact.go
Outdated
return false, errors.Wrapf(err, "deleting old block %s failed. You need to delete this block manually", id) | ||
} | ||
|
||
return false, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can remove this return
as we would want to continue with the rest of the pdir := range plan
90a2b8d
to
90228b3
Compare
Thanks all for the detailed review! Apologise for so many nits.. That always happens on 2 AM PRs ^^ Addressed all PTAL again @BenoitKnecht @stefan-improbable @domgreen On green CI, before landing, I am going to test the fixed compaction flow on our testing cluster. |
pkg/block/index.go
Outdated
// Issue347OutsideChunksErr returns error if stats indicates issue347 block issue, that is repaired explicitly before compaction (on plan block). | ||
func (i Stats) Issue347OutsideChunksErr() error { | ||
if i.Issue347OutsideChunks > 0 { | ||
return errors.Errorf("Found %d chunks outside the block time range introduced by https://github.com/prometheus/tsdb/issues/347", i.Issue347OutsideChunks) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: found with small f in the beginning due to style guide
pkg/block/index.go
Outdated
var errMsg string | ||
|
||
if err := i.CriticalErr(); err != nil { | ||
errMsg += err.Error() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use strings.join ", " similarly to CriticalErr
@Bplotka is it possible to unit test compaction and the block error detection/repair logic? |
90228b3
to
1dda32e
Compare
} | ||
return errors.Wrap(err, "compaction") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we change this section so we do not indent the normal flow?
id, err := g.Compact(ctx, compactDir, comp)
if err != nil {
if compact.IsIssue347Error(err) {
err = compact.RepairIssue347(ctx, logger, bkt, err)
if err == nil {
done = false
continue
}
}
return errors.Wrap(err, "compaction")
}
// If the returned ID has a zero value, the group had no blocks to be compacted.
// We keep going through the outer loop until no group has any work left.
if id != (ulid.ULID{}) {
done = false
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but that will increase the indentation deepness. It's a tradeoff. What do you prefer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer less indentation on standard flow but it is not a blocker, just a preference.
@stefan-improbable yes, and we have these. I added compaction upstream test though: prometheus-junkyard/tsdb#349 (we use vanilla TSDB compactor internally) |
@stefan-improbable actually we have even e2e tests, sorry forgot abou these: https://github.com/improbable-eng/thanos/blob/master/pkg/compact/compact_e2e_test.go |
1dda32e
to
40fccef
Compare
PTAL again -> Added notion of |
f35be41
to
76c8191
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I like the addition of |
76c8191
to
52faa10
Compare
I am going to leave it for a day on our beta cluster, but seems to be good to go 👍 |
pkg/block/block.go
Outdated
IndexFilename = "index" | ||
// ChunksDirname is the known dir name for chunks with compressed samples. | ||
ChunksDirname = "chunks" | ||
// Real upload source of the block. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Source is ...
, comment should start with object name due to govet or go lint conventions.
pkg/block/index.go
Outdated
ExactSum int | ||
if i.OutOfOrderSeries > 0 { | ||
errMsg = append(errMsg, fmt.Sprintf( | ||
"%d/%d series have an average of %.3f out-of-order chunks. "+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: comment should be "%d/%d series have an ... chunks: %.3f of these ... time range)"
(replace first period with colon and don't sentence in period so that Join(", ")
works properly.
pkg/block/index.go
Outdated
} | ||
|
||
if len(errMsg) > 0 { | ||
return errors.New(strings.Join(errMsg, ",")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: replace ","
with ", "
pkg/block/index.go
Outdated
CompleteOutsiders int | ||
n := i.OutsideChunks - (i.CompleteOutsideChunks + i.Issue347OutsideChunks) | ||
if n > 0 { | ||
errMsg = append(errMsg, fmt.Sprintf("found %d chunks non-completely outside the block time range.", n)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no period at the end of error message
pkg/block/index.go
Outdated
} | ||
|
||
if i.CompleteOutsideChunks > 0 { | ||
errMsg = append(errMsg, fmt.Sprintf("found %d chunks completely outside the block time range.", i.CompleteOutsideChunks)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: no period at the end of error message
pkg/block/index.go
Outdated
return errors.Errorf("No chunks are out of order, but found some outsider blocks. (Blocks that outside of block time range): %d. Complete: %d", | ||
i.Outsiders, i.CompleteOutsiders) | ||
if len(errMsg) > 0 { | ||
return errors.New(strings.Join(errMsg, ",")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: replace ","
with ", "
cb := crc32.Checksum(curr.Chunk.Bytes(), castagnoli) | ||
|
||
if ca != cb { | ||
return false, errors.Errorf("non-sequential chunks not equal: %x and %x", ca, cb) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Printing ca
and cb
is not useful (or is it)?. Printing (last.MinTime, last.MaxTime), (curr.MinTime, curr.MaxTime) is probably better. Or is there a way to indentify bad chunks so that users can debug them further?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is enough. You have block ID so you can go through chunks and see on your own.
// - compactor created blocks | ||
// NOTE: It is not safe to miss "old" block (even that it is newly created) in sync step. Compactor needs to aware of ALL old blocks. | ||
// TODO(bplotka): https://github.com/improbable-eng/thanos/issues/377 | ||
if ulid.Now()-id.Time() < uint64(c.syncDelay/time.Millisecond) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't you use time.Now().UTC()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of what? We want to compare ulids which are not same as Time.Now() (they are not that precise)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant using time.Now().UTC() - id.Time()
. The functions that "conveniently" call another functions are just introducing an unnecessary indirection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree here. time.Now().UTC()
will not work.
we would endup doing ulid.Timestamp(time.Now().UTC())
which is rewritting function that is exactly for this. so would vote for leaving ulid.Now()
pkg/compact/compact.go
Outdated
// avoid races when a block is only partially uploaded. This relates only to level 1 blocks. | ||
// NOTE: It is not safe to miss compacted block in sync step. Compactor needs to aware of ALL old blocks. | ||
if meta.Compaction.Level == 1 && ulid.Now()-id.Time() < uint64(c.syncDelay/time.Millisecond) { | ||
// avoid races when a block is only partially uploaded. This relates to all blocks despite: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: despite is probably incorrect word here
TSDB issue: prometheus-junkyard/tsdb#347 How we handle it? - segregate this issue in special stat entry in verifier. - auto-fix broken plan block before thanos compaction. - adding repair job to run offline batch repair for indivdual blocks. NOTE: At this point we have no power of fixing the bug when someone uses local compaction ): Fixes #354 Signed-off-by: Bartek Plotka <[email protected]>
52faa10
to
4ff9c61
Compare
Use iterator for local Series call
TSDB issue: prometheus-junkyard/tsdb#347
How we handle it?
NOTE: At this point we have no power of fixing the bug when someone uses local compaction ):
Fixes #354
PTAL @BenoitKnecht
Signed-off-by: Bartek Plotka [email protected]