Prototype(symbolization): Add symbolization in Pyroscope read path #3799
base: main
It would be nice to have some benchmarks of symbolizing different numbers of locations and different file sizes. I think that would help us pick the right place and architecture for this.
// Save the debuginfo to a temporary file
tempDir := os.TempDir()
filePath := filepath.Join(tempDir, fmt.Sprintf("%s.elf", buildID))
nit: this may cause a path traversal. Somebody could specify buildID as ../../.../, so we should sanitize the user-provided data in buildID.
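A minimal sketch of one way to do that, assuming build IDs are always hex-encoded (as GNU build IDs usually are). The names validateBuildID and debugFilePath and the 128-character cap are hypothetical and not part of this PR:

```go
package main

import (
	"errors"
	"path/filepath"
)

// errInvalidBuildID is returned for anything that is not plain hex.
var errInvalidBuildID = errors.New("invalid build ID")

// validateBuildID accepts only non-empty hexadecimal strings of bounded
// length, so separators such as "/" or ".." can never reach the file system.
// The length cap is an arbitrary assumption.
func validateBuildID(id string) error {
	if len(id) == 0 || len(id) > 128 {
		return errInvalidBuildID
	}
	for i := 0; i < len(id); i++ {
		c := id[i]
		isHex := (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')
		if !isHex {
			return errInvalidBuildID
		}
	}
	return nil
}

// debugFilePath builds the on-disk path only after the build ID is validated.
func debugFilePath(dir, buildID string) (string, error) {
	if err := validateBuildID(buildID); err != nil {
		return "", err
	}
	return filepath.Join(dir, buildID+".elf"), nil
}
```

An allow-list like this is easier to reason about than trying to strip dangerous sequences from the input.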
I'd also add that, generally speaking, we should avoid file operations as much as possible. If it's possible to do this in memory, I'd do it in memory. If the file is too large to fit in memory, I'd see if we could stream it.
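A rough sketch of the in-memory variant, assuming the debuginfo arrives as an io.Reader (for example an HTTP response body). The helper names and the size limit are hypothetical; elf.NewFile from the standard library accepts any io.ReaderAt, so no temporary file is needed:

```go
package main

import (
	"bytes"
	"debug/elf"
	"errors"
	"io"
)

var errTooLarge = errors.New("debug info exceeds in-memory limit")

// readLimited reads at most maxBytes from r and fails if more data is
// available, so an oversized file is rejected instead of exhausting memory.
func readLimited(r io.Reader, maxBytes int64) ([]byte, error) {
	data, err := io.ReadAll(io.LimitReader(r, maxBytes+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > maxBytes {
		return nil, errTooLarge
	}
	return data, nil
}

// openELFInMemory parses an ELF image directly from a byte slice.
func openELFInMemory(data []byte) (*elf.File, error) {
	return elf.NewFile(bytes.NewReader(data))
}
```

Files above the limit would still need a streaming or on-disk path, which this sketch does not cover.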
efdde88 to 6b009d3
if r.Symbolizer == nil {
	return false
}
if len(loc.Line) == 0 {
I guess it's time to move this logic (unsymbolized fallback)
pyroscope/pkg/ingester/otlp/convert.go
Line 166 in 466aaa8
gl.Line = append(gl.Line, &googleProfile.Line{
Good work, Marc! I'm excited to see some experimental results 🚀
I think we can implement a slightly more optimized version for production use:
sequenceDiagram
autonumber
participant QF as Query Frontend
participant M as Metastore
participant QB as Query Backend
participant SYM as Symbolizer
QF ->>+M: Query Metadata
Note left of M: Build identifiers are returned<br> along with the metadata records
M ->>-QF:
par
QF ->>+SYM: Request for symbolication
Note left of SYM: Prepare symbols for<br>the objects requested
and
QF ->>+QB: Data retrieval and aggregation
Note left of QB: The main data path<br>Might be serverless
end
QB ->>-QF: Data in pprof format
Note over QF: Because of the truncation,<br> only a limited set of locations<br>make it here (16K by default)
QF --)SYM: Location addresses
SYM ->>-QF: Symbols
QF ->>QF: Flame graph rendering
Even without a parallel pipeline and dedicated symbolication service, we could implement something like this:
sequenceDiagram
autonumber
participant QF as Query Frontend
participant M as Metastore
participant QB as Query Backend
participant SYM as Symbols
QF ->>+M: Query Metadata
Note left of M: No build identifiers are returned
M ->>-QF:
QF ->>+QB: Data retrieval and aggregation
Note left of QB: The main data path<br>Might be serverless
QB ->>-QF: Data in pprof format
Note over QF: Because of the truncation,<br> only a limited set of locations<br>make it here (16K by default)
QF ->>+SYM: Fetch symbols
SYM ->>-QF: Symbols
Note over QF: In terms of the added latency,<br>this approach is not worse than<br>block level symbolication
QF ->>QF: Flame graph rendering
I think we should avoid symbolization at the block level if the symbols are not already present in the block itself. Otherwise, this approach leads to excessive processing, increased latency, and higher resource usage. Please keep in mind that a query may span many thousands of blocks.
I won't delve too deeply into how we fetch and process ELF/DWARF files, but I strongly doubt we can bypass the need for an intermediate representation optimized for our access patterns. Additionally, we need a solution to prevent concurrent access to the debuginfod service.
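If "concurrent access" is read as several queries requesting the same debug file at the same time, one option is to collapse those requests into a single upstream call per build ID. The sketch below is a hand-rolled, simplified version of that idea; fetchGroup and its Do method are hypothetical names, and a panic inside fetch is not handled:

```go
package main

import "sync"

// call tracks one in-flight fetch and its result.
type call struct {
	wg   sync.WaitGroup
	data []byte
	err  error
}

// fetchGroup collapses concurrent fetches of the same build ID into a single
// upstream request; late callers wait for the first caller's result.
type fetchGroup struct {
	mu    sync.Mutex
	calls map[string]*call
}

// Do runs fetch for buildID unless a fetch for that ID is already in flight,
// in which case it waits and returns the in-flight result.
func (g *fetchGroup) Do(buildID string, fetch func() ([]byte, error)) ([]byte, error) {
	g.mu.Lock()
	if g.calls == nil {
		g.calls = make(map[string]*call)
	}
	if c, ok := g.calls[buildID]; ok {
		g.mu.Unlock()
		c.wg.Wait()
		return c.data, c.err
	}
	c := &call{}
	c.wg.Add(1)
	g.calls[buildID] = c
	g.mu.Unlock()

	c.data, c.err = fetch()
	c.wg.Done()

	// Forget the finished call so a later request triggers a fresh fetch.
	g.mu.Lock()
	delete(g.calls, buildID)
	g.mu.Unlock()
	return c.data, c.err
}
```

This only deduplicates identical requests. Limiting the total number of parallel requests to debuginfod would need a separate mechanism, such as a semaphore.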
@@ -19,6 +21,13 @@ func buildTree(
	appender *SampleAppender,
	maxNodes int64,
) (*model.Tree, error) {
	// Try debuginfod symbolization first
	if symbols != nil && symbols.Symbolizer != nil {
		if err := symbolizeLocations(ctx, symbols); err != nil {
Note that symbols here include all locations of the partition. The profile we want to symbolize typically includes only a subset of them.
What we should also leverage is truncation. After pruning insignificant nodes, only a limited number of locations are preserved (this could reduce the number from millions to thousands). This is also helpful because some of the mappings will be excluded entirely.
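To illustrate the point, here is a small sketch that collects only the locations still referenced after truncation. It assumes the truncated profile can be viewed as a list of stack traces of location IDs; referencedLocations is a hypothetical helper, not an existing function:

```go
package main

// referencedLocations returns the distinct location IDs that appear in the
// given stack traces, in first-seen order. Only these locations would need
// to be symbolized, not every location of the partition.
func referencedLocations(stacks [][]uint64) []uint64 {
	seen := make(map[uint64]struct{})
	var ids []uint64
	for _, stack := range stacks {
		for _, id := range stack {
			if _, ok := seen[id]; ok {
				continue
			}
			seen[id] = struct{}{}
			ids = append(ids, id)
		}
	}
	return ids
}
```

Mappings that no surviving location refers to never show up in the result, so their debug files do not need to be fetched at all.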
// Find all locations needing symbolization
for i, loc := range symbols.Locations {
Just an observation (I'm not suggesting implementing it here): it is much more performant and simpler to sort locations by mapping ID and then split the slice into groups with a single pass. There should be no mappings with distinct IDs and matching BuildID.
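Roughly what I mean, as a sketch with a simplified stand-in for the location type (the real one has more fields). groupByMapping is a hypothetical name, and it sorts a copy so the shared input is left untouched:

```go
package main

import "sort"

// location is a simplified, hypothetical stand-in for the real location type.
type location struct {
	ID        uint64
	MappingID uint32
	Address   uint64
}

// groupByMapping sorts a copy of locs by mapping ID and then splits it into
// per-mapping groups in a single pass over the sorted slice.
func groupByMapping(locs []location) [][]location {
	sorted := append([]location(nil), locs...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].MappingID < sorted[j].MappingID
	})
	var groups [][]location
	start := 0
	for i := 1; i <= len(sorted); i++ {
		// Close the current group at the end of the slice or when the
		// mapping ID changes.
		if i == len(sorted) || sorted[i].MappingID != sorted[start].MappingID {
			groups = append(groups, sorted[start:i])
			start = i
		}
	}
	return groups
}
```

Each group is a sub-slice of the sorted copy, so no per-group allocation or map lookup is needed.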
locIdx := locs[i].idx

// Clear the existing lines for the location
symbols.Locations[locIdx].Line = nil
symbols are read only (as they are shared if multiple queries are accessing the same dataset simultaneously).
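One possible shape for this, sketched with simplified stand-in types: symbolization results go into a per-query overlay instead of the shared slice. queryOverlay and its methods are hypothetical and only illustrate the idea:

```go
package main

// line and location are simplified, hypothetical stand-ins for the real types.
type line struct {
	Function string
	Line     int64
}

type location struct {
	Address uint64
	Line    []line
}

// queryOverlay keeps symbolization results private to one query.
type queryOverlay struct {
	shared []location     // shared between queries; never written here
	lines  map[int][]line // per-query results, keyed by location index
}

func newQueryOverlay(shared []location) *queryOverlay {
	return &queryOverlay{shared: shared, lines: make(map[int][]line)}
}

// SetLines records symbolized lines for a location without touching the
// shared data.
func (o *queryOverlay) SetLines(idx int, l []line) { o.lines[idx] = l }

// Lines returns the per-query lines if present, else the shared ones.
func (o *queryOverlay) Lines(idx int) []line {
	if l, ok := o.lines[idx]; ok {
		return l
	}
	return o.shared[idx].Line
}
```

Concurrent queries over the same dataset can then each hold their own overlay without any locking on the shared symbols.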
// Find all locations needing symbolization
for i, loc := range symbols.Locations {
	if mapping := &symbols.Mappings[loc.MappingId]; symbols.needsDebuginfodSymbolization(&loc, mapping) {
I believe we should check it mapping-wide. If a mapping has symbols, all locations referring to it have them, and vice versa. If the mapping flags are not set correctly, we should fix it.
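A sketch of the mapping-wide check, using simplified stand-in types. It assumes a HasFunctions-style flag on the mapping and that MappingID indexes the mappings slice, as in the snippet above; the function names are hypothetical:

```go
package main

// mapping and location are simplified, hypothetical stand-ins.
type mapping struct {
	BuildID      string
	HasFunctions bool
}

type location struct {
	MappingID uint32
	Address   uint64
}

// unsymbolizedMappings decides once per mapping whether it needs
// symbolization: it has no function info but does carry a build ID.
func unsymbolizedMappings(mappings []mapping) []bool {
	need := make([]bool, len(mappings))
	for i, m := range mappings {
		need[i] = !m.HasFunctions && m.BuildID != ""
	}
	return need
}

// locationsToSymbolize returns the indices of locations whose mapping needs
// symbols; the per-location work is a single slice lookup.
func locationsToSymbolize(locs []location, need []bool) []int {
	var idx []int
	for i, loc := range locs {
		if int(loc.MappingID) < len(need) && need[loc.MappingID] {
			idx = append(idx, i)
		}
	}
	return idx
}
```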
var sym *symbolizer.Symbolizer
if config.DebuginfodURL != "" {
	sym = symbolizer.NewSymbolizer(
		symbolizer.NewDebuginfodClient(config.DebuginfodURL),
	)
}

q := QueryBackend{
	config:        config,
	logger:        logger,
	reg:           reg,
	backendClient: backendClient,
	blockReader:   blockReader,
	symbolizer:    sym,
}

// Pass symbolizer to BlockReader if it's the right type
if br, ok := blockReader.(*BlockReader); ok {
	br.symbolizer = sym
}
If we are going to merge it at some point, I'd suggest implementing dependency injection differently.
symbolizer.Symbolizer does not belong to the queryContext.
symbolizer.Symbolizer should be passed to querybackend.NewBlockReader in initQueryBackend at initialization.
querybackend.NewBlockReader should accept symbolizer.Symbolizer as an argument (might be an interface).
symbolizer.Symbolizer should be injected to symbols via *symdb.Reader here:
for _, ds := range md.Datasets {
	dataset := block.NewDataset(ds, object)
	dataset.Symbols().SetSymbolizer(b.symbolizer)
	qcs = append(qcs, newQueryContext(ctx, b.log, r, agg, dataset))
}
https://github.com/grafana/pyroscope/blob/main/pkg/phlaredb/symdb/block_reader.go#L419-L427:
func (p *partition) Symbols() *Symbols {
	return &Symbols{
		Stacktraces: p,
		Locations:   p.locations.slice(),
		Mappings:    p.mappings.slice(),
		Functions:   p.functions.slice(),
		Strings:     p.strings.slice(),
	}
}
*partition has access to *symdb.Reader that should have Symbolizer (interface defined in the symdb package as it is the main/only consumer).
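To make the wiring concrete, here is a self-contained sketch. All types are simplified stand-ins (Reader for *symdb.Reader, BlockReader for the query backend's block reader), and the Symbolize method signature is an assumption rather than the actual API:

```go
package main

import "context"

// Symbolizer is a hypothetical interface, defined next to its consumer.
type Symbolizer interface {
	Symbolize(ctx context.Context, buildID string, addrs []uint64) ([]string, error)
}

// nopSymbolizer is a placeholder implementation that resolves nothing.
type nopSymbolizer struct{}

func (nopSymbolizer) Symbolize(context.Context, string, []uint64) ([]string, error) {
	return nil, nil
}

// Reader stands in for *symdb.Reader and owns the symbolizer.
type Reader struct{ symbolizer Symbolizer }

func NewReader(s Symbolizer) *Reader { return &Reader{symbolizer: s} }

// Symbols stands in for the per-partition view handed to queries.
type Symbols struct{ Symbolizer Symbolizer }

// Symbols passes the reader's symbolizer along, so no setter is needed.
func (r *Reader) Symbols() *Symbols { return &Symbols{Symbolizer: r.symbolizer} }

// BlockReader stands in for the query backend's block reader.
type BlockReader struct{ symbolizer Symbolizer }

// NewBlockReader receives the symbolizer once, at initialization.
func NewBlockReader(s Symbolizer) *BlockReader { return &BlockReader{symbolizer: s} }

// openReader hands the symbolizer down when a block's symbols are opened.
func (b *BlockReader) openReader() *Reader { return NewReader(b.symbolizer) }
```

With this shape the type assertion on blockReader and the SetSymbolizer call both go away, and a nil Symbolizer simply means symbolization is disabled.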
I have not looked into the code yet, but I've tried to run it locally and it looks like it's trying to load a lot of unnecessary debug files. I ran the eBPF profiler with no on-target symbolization and also ran a simple [...]. I then queried only one executable [...] and saw 268 GET requests, with 13 requests to [...]. Other than that it works \M/ Can't wait to run it in dev.
Context
This PR introduces a prototype implementation for DWARF symbolization of unsymbolized profiles in the Pyroscope read path. While this approach may introduce additional latency and requires further optimization, it enables us to gather performance statistics by running it alongside the eBPF collector in development environments.
All development has been done with V2 architecture support in mind
Key features
Configuration Example
Missing points