Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enforce max trace size on trace by id path #2935

Merged
merged 11 commits into from
Sep 22, 2023

Conversation

joe-elliott
Copy link
Member

@joe-elliott joe-elliott commented Sep 14, 2023

Fixes #2794. Enforces max trace size on the query path by checking it in 4 places:

  • As it's read from the parquet file at the block level (querier and ingester)
  • As it's combined in the querier
  • As it's combined in the ingester
  • As it's combined in the query frontend

Additional changes

  • Fixes an undocumented bug where errors were being dropped querying ingesters
  • Fixes e2e test timestamps on spans
  • Changelog

@joe-elliott joe-elliott changed the title WIP: Enforce max trace size on trace by id path Enforce max trace size on trace by id path Sep 20, 2023
Copy link
Member

@mapno mapno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@joe-elliott
Copy link
Member Author

@mapno I have done a bit of restructuring:

  • pushed the error generation into the combiner
  • return 400 from the querier instead of 500 where possible
  • extend tests to confirm the above

500s from the queriers will be retried by the frontend. 400s will be piped up to the trace sharding layer and fail quickly. this is better, but still not perfect.

making these changes makes me realize our query path needs an operational facelift. i intend to work on that next.

Copy link
Member

@mapno mapno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coalescing the check into the combiner is nice. Left some comments.

Makefile Outdated
@@ -160,7 +160,7 @@ lint:

.PHONY: docker-component # Not intended to be used directly
docker-component: check-component exe
docker build -t grafana/$(COMPONENT) --build-arg=TARGETARCH=$(GOARCH) -f ./cmd/$(COMPONENT)/Dockerfile .
docker build -t grafana/$(COMPONENT) --load --build-arg=TARGETARCH=$(GOARCH) -f ./cmd/$(COMPONENT)/Dockerfile .
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know if the docker we use in Drone supports buildx? Asking because --load is an option only in buildx, it's not supported in vanilla docker build.

I spent a bit of time searching but didn't get a clear answer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm shocked this passed CI. a recent docker update forced buildx on me and i haven't been able to revert. i have to add this param to get docker to build to my local repo. good eyes. Will remove!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that nothing in the GH CI triggers docker builds

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think e2e tests do. don't they build a local grafana/tempo:latest and then use that for the tests?

Comment on lines +254 to +260
if maxTraceSizeBytes > 0 {
estimatedSize := estimateMarshalledSizeFromTrace(tr)
if estimatedSize > maxTraceSizeBytes {
return nil, errors.Errorf("trace exceeds max size in the block. size: %d, max: %d", estimatedSize, maxTraceSizeBytes)
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Size check in backendBlock is missing in vParquet and vParquet2

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm ok with this. we are purposefully not backporting all changes to all previous parquet versions for sanity.

@joe-elliott joe-elliott merged commit 498e365 into grafana:main Sep 22, 2023
@savar savar mentioned this pull request Feb 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enforce limitations on parquet trace by id queries
2 participants