WIP: [Rhythm] Block-builder consumption loop #4480

Draft
wants to merge 11 commits into base: main-rhythm

Conversation

@mdisibio (Contributor) commented Dec 19, 2024

What this PR does:
This is an alternate block-builder consumption loop that I think has benefits, and I would like to get feedback on it.

The current loop can be thought of as top-down: it calculates the total time range that the block builder is lagging, splits it into smaller sections (e.g. 5 minutes), and consumes/flushes/commits each section.

This new loop is bottom-up: while there is more data, start at the last commit and consume/flush/commit another chunk of data (e.g. 5 minutes).
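
Roughly, the new loop has this shape. This is a minimal sketch only; `lastCommittedOffset`, `consumeSection`, `commitOffset`, and `ConsumeCycleDuration` are hypothetical names used for illustration, not the actual code in this PR:

```go
// consumePartition drains one partition in self-contained sections: each
// iteration starts from the last commit, consumes roughly one cycle of
// records, flushes a block, and commits before looking for more data.
func (b *BlockBuilder) consumePartition(ctx context.Context, partition int32) error {
	for {
		// Start wherever the consumer group last committed (hypothetical helper).
		start, err := b.lastCommittedOffset(ctx, partition)
		if err != nil {
			return err
		}

		// Consume ~one cycle (e.g. 5 minutes) of records and flush them as a
		// block; returns the next offset to commit and whether more data remains.
		next, more, err := b.consumeSection(ctx, partition, start, b.cfg.ConsumeCycleDuration)
		if err != nil {
			return err
		}

		if err := b.commitOffset(ctx, partition, next); err != nil {
			return err
		}

		if !more {
			return nil // caught up; wait for the next poll interval
		}
	}
}
```

Because each pass begins from the committed offset, a crash mid-cycle simply replays the uncommitted section on the next pass.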

The benefits are:

  • Less state: each consume/flush/commit cycle is independent; it requires no knowledge of the overall state of the queue and carries no side effects from previous loops.
  • Fewer Kafka APIs are involved (e.g. no CalculateGroupLag).
  • No custom commit metadata is required.
  • I think the core loop is simpler and will make it easier to iterate. For example, there is a TODO to round-robin when the block-builder is assigned multiple partitions that are all lagging; I think this is more complex to attempt in the current design.

The drawbacks are:

  • Mainly around how we want to measure "lag". The current metric is the number of messages, but I think finding a way to express lag as a length of time (i.e. "the block-builder is 15 minutes behind") is more useful. However, you have to read a record to determine that, so this needs more work.

TODO

  • This is a draft and needs more finishing touches before it can be merged. What I want feedback on is the overall loop structure; logs/cleanup/finishing touches will be added before any merge.
  • Depends on the test updates in [WIP] [Rhythm] Block builder test updates #4510, which set up support for consumer groups.

Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@javiermolinar (Contributor) left a comment:
Looks good!

modules/blockbuilder/blockbuilder.go (outdated, resolved)
return false, err
}

lastCommit, ok := commits.Lookup(topic, partition)
Contributor:
Suggested change:
-	lastCommit, ok := commits.Lookup(topic, partition)
+	lastCommit, exists := commits.Lookup(topic, partition)
+	if exists && lastCommit.At >= 0 {
+		startOffset = startOffset.At(lastCommit.At)
+	} else {
+		startOffset = kgo.NewOffset().AtStart()
+	}

https://pkg.go.dev/github.com/twmb/franz-go/pkg/[email protected]#OffsetResponses.Lookup

@mdisibio (Author):
I think ok is more idiomatic, and the lib is reading from a map internally anyway.
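
For reference, a minimal sketch of how the resolved offset could be fed to franz-go, assuming a kadm admin client and direct partition assignment (`group`, `topic`, and `partition` are placeholders; this is not the PR's actual code):

```go
import (
	"context"

	"github.com/twmb/franz-go/pkg/kadm"
	"github.com/twmb/franz-go/pkg/kgo"
)

// startOffsetFor resumes at the group's last committed offset when one
// exists, otherwise at the start of the partition.
func startOffsetFor(ctx context.Context, adm *kadm.Client, group, topic string, partition int32) (kgo.Offset, error) {
	commits, err := adm.FetchOffsets(ctx, group)
	if err != nil {
		return kgo.NewOffset(), err
	}
	if lastCommit, ok := commits.Lookup(topic, partition); ok && lastCommit.At >= 0 {
		return kgo.NewOffset().At(lastCommit.At), nil
	}
	return kgo.NewOffset().AtStart(), nil
}
```

The resulting kgo.Offset can then be handed to the client via the kgo.ConsumePartitions option for that topic/partition.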

}

err := b.pushTraces(rec.Key, rec.Value, writer)
if err != nil {
Contributor:
What will happen if something is wrong with the WAL? I guess it will enter a loop.
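
One hedged option (not necessarily what this PR should do) is to bound the retries around the WAL append so a persistent WAL failure surfaces as an error to the outer loop instead of spinning forever. The attempt count, backoff, and error-handling shape below are illustrative, and `time`/`fmt` are assumed to be imported:

```go
// Retry the WAL append a few times with a small backoff, then give up and
// surface the error so the caller can decide (log, abort the cycle, etc.).
const maxAttempts = 3

var err error
for attempt := 1; attempt <= maxAttempts; attempt++ {
	if err = b.pushTraces(rec.Key, rec.Value, writer); err == nil {
		break
	}
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-time.After(time.Duration(attempt) * time.Second):
	}
}
if err != nil {
	return fmt.Errorf("pushing traces to WAL after %d attempts: %w", maxAttempts, err)
}
```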

@mapno (Member) left a comment:
Nice job on reducing the number of API calls. The way we calculate cycle sections is complex and should be simplified.

I have some concerns about losing visibility into the state of partitions. Before, it was easy to see exactly how many pending messages there were and what was being consumed; I feel we lose that a bit with this PR.
I believe lag can still be periodically polled with fewer lines than currently, and we'd keep that functionality. We might be able to delete the fallback time entirely.

)

var (
	metricPartitionLag = promauto.NewGaugeVec(prometheus.GaugeOpts{
		Namespace: "tempo",
		Subsystem: "block_builder",
-		Name:      "partition_lag",
		Help:      "Lag of a partition.",
+		Name:      "partition_lag_s",
Member:
Suggested change:
-	Name:      "partition_lag_s",
+	Name:      "partition_lag_seconds",

https://prometheus.io/docs/practices/naming/#metric-names

Member:
IMO the number of pending records is also relevant. I think we should keep the original metric and add one for time.

@mdisibio (Author):
👍 Re-added, and it is polled in a separate goroutine.

Comment on lines 288 to 293
// Determine begin and end time range, which is -/+ cycle duration.
// But don't exceed the given overall end time.
begin = rec.Timestamp.Add(-dur)
if rec.Timestamp.Add(dur).Before(end) {
	end = rec.Timestamp.Add(dur)
}
Member:
This feels like a strange side-effect of the writer being nil. Cycle initialisation could be consolidated in one place instead.

@mdisibio (Author):
This is the cycle initialization. Swapped to use an init bool which should be clearer.
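
For illustration, the init-bool shape looks roughly like this (a sketch only; `begin`, `end`, `writer`, `records`, and `dur` are assumed to be declared by the enclosing function, as in the snippets above):

```go
init := false

for _, rec := range records {
	if !init {
		// First record of the section: derive the cycle's time range from its
		// timestamp (+/- the cycle duration) without exceeding the overall end
		// time, and create the section writer.
		begin = rec.Timestamp.Add(-dur)
		if rec.Timestamp.Add(dur).Before(end) {
			end = rec.Timestamp.Add(dur)
		}
		writer = newPartitionSectionWriter(b.logger, int64(partition), rec.Offset, b.cfg.BlockConfig, b.overrides, b.wal, b.enc)
		init = true
	}

	// ... push rec to the writer ...
}
```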

writer = newPartitionSectionWriter(b.logger, int64(partition), rec.Offset, b.cfg.BlockConfig, b.overrides, b.wal, b.enc)
}

if rec.Timestamp.Before(begin) || rec.Timestamp.After(end) {
@mapno (Member) commented Jan 3, 2025:
Unsure if we should break for records being too old. The record timestamp is set by the producer (the distributor in this case); it's not based on the trace's time.

If a record is so old as to fall outside the cycle's start time, I think we'd want to put it in the current block anyway. Otherwise it could impact consumption too much, creating too-small blocks.

@mdisibio (Author):
I was thinking that if a record is too old (older than the cycle time, i.e. minutes behind), then it likely contains old traces too. But I don't have strong opinions. Removed the check.

@mdisibio (Author) commented Jan 6, 2025:

> lag can still be periodically polled with fewer lines than currently and we'd keep that functionality. We might be able to delete the fallback time entirely.

Good call, re-added the original metric, polled it in a separate goroutine, and removed the fallback logic.
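
For reference, the polling goroutine could look roughly like this. This is a sketch only; the kadm client, the `Topic`/`ConsumerGroup` config fields, the `assignedPartitions` field, the interval, and the partition label are assumptions for illustration, not necessarily what landed in the PR (usual imports of `context`, `time`, `strconv`, and `kadm` assumed):

```go
// pollLag updates the partition-lag gauge on a ticker, independently of the
// consume/flush/commit loop, so visibility doesn't depend on cycle progress.
func (b *BlockBuilder) pollLag(ctx context.Context, adm *kadm.Client) {
	ticker := time.NewTicker(15 * time.Second) // illustrative interval
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}

		ends, err := adm.ListEndOffsets(ctx, b.cfg.Topic)
		if err != nil {
			continue // would be logged in real code
		}
		commits, err := adm.FetchOffsets(ctx, b.cfg.ConsumerGroup)
		if err != nil {
			continue
		}

		for _, p := range b.assignedPartitions { // hypothetical field
			var lag int64
			if hi, ok := ends.Lookup(b.cfg.Topic, p); ok {
				if c, ok := commits.Lookup(b.cfg.Topic, p); ok && c.At >= 0 {
					lag = hi.Offset - c.At
				} else {
					lag = hi.Offset
				}
			}
			metricPartitionLag.WithLabelValues(strconv.Itoa(int(p))).Set(float64(lag))
		}
	}
}
```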
