Feature request: cache hit / miss metric #4445

Closed
azr opened this issue Nov 27, 2023 · 2 comments · Fixed by #4464

Comments

@azr
Contributor

azr commented Nov 27, 2023

Hello there,

I'd like to add a cache hit / miss metric.
The only option I currently see without touching the code would be to pipe log lines and increment counters on /^CACHED/ or /^DONE/ regexp matches.
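
Roughly, that log-scraping approach could look like this (just a sketch; the exact ^CACHED / ^DONE line prefixes of the progress output are an assumption on my side):

	// Sketch only: count cache hits/misses by scanning buildctl's progress
	// output piped in on stdin. The ^CACHED / ^DONE prefixes are assumptions.
	package main

	import (
		"bufio"
		"fmt"
		"os"
		"regexp"
	)

	func main() {
		cachedRe := regexp.MustCompile(`^CACHED`)
		doneRe := regexp.MustCompile(`^DONE`)

		var hits, misses int
		sc := bufio.NewScanner(os.Stdin)
		for sc.Scan() {
			line := sc.Text()
			switch {
			case cachedRe.MatchString(line):
				hits++
			case doneRe.MatchString(line):
				misses++
			}
		}
		fmt.Printf("cache_hits=%d cache_misses=%d\n", hits, misses)
	}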

Would you have any pointers for a different/better way to do this? I will be looking into this and may try to implement a few solutions, if possible.

@tonistiigi
Member

How do you wish to expose such a metric? The history API record does have a counter for cached/total steps.
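
Something along these lines could read those counters (a rough, untested sketch; it assumes the client's ControlClient() accessor, the ListenBuildHistory RPC, and the NumCachedSteps/NumTotalSteps fields on BuildHistoryRecord, so double-check against the current api/services/control definitions):

	// Rough sketch (untested): read cached/total step counters from the
	// buildkitd history API. API names are assumptions; verify them against
	// api/services/control before relying on this.
	package main

	import (
		"context"
		"fmt"
		"io"
		"log"

		controlapi "github.com/moby/buildkit/api/services/control"
		"github.com/moby/buildkit/client"
	)

	func main() {
		ctx := context.Background()
		c, err := client.New(ctx, "unix:///run/buildkit/buildkitd.sock")
		if err != nil {
			log.Fatal(err)
		}
		defer c.Close()

		// EarlyExit is assumed to return the existing records and close the
		// stream instead of waiting for new builds.
		stream, err := c.ControlClient().ListenBuildHistory(ctx, &controlapi.BuildHistoryRequest{
			EarlyExit: true,
		})
		if err != nil {
			log.Fatal(err)
		}
		for {
			ev, err := stream.Recv()
			if err == io.EOF {
				break
			}
			if err != nil {
				log.Fatal(err)
			}
			if rec := ev.Record; rec != nil {
				fmt.Printf("%s cached=%d total=%d\n", rec.Ref, rec.NumCachedSteps, rec.NumTotalSteps)
			}
		}
	}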

@azr
Contributor Author

azr commented Dec 1, 2023

How do you wish to expose such a metric?

I'm not entirely sure of the best approach. I currently have a WIP where, in cmd/buildctl/build.go:L296, I Tee the progress writer and, at the end, dump a JSON struct to stderr, like so:

	if clicontext.String("cache_metrics") == "json" {
		metricsCh := make(chan *client.SolveStatus)
		pw = progresswriter.Tee(pw, metricsCh)
		fromRegexp := regexp.MustCompile(`^\[.*\] FROM`)
		meg.Go(func() error {
			type vtxInfo struct {
				cached    bool
				completed bool
				from      bool
				name      string
			}
			// Track per-vertex state from the solve status stream.
			vtxMap := make(map[digest.Digest]*vtxInfo)
			for st := range metricsCh {
				for _, vtx := range st.Vertexes {
					if _, ok := vtxMap[vtx.Digest]; !ok {
						vtxMap[vtx.Digest] = &vtxInfo{
							name: vtx.Name,
						}
					}
					if fromRegexp.MatchString(vtx.Name) {
						vtxMap[vtx.Digest].from = true
					}
					if vtx.Cached {
						vtxMap[vtx.Digest].cached = true
					}
					if vtx.Completed != nil {
						vtxMap[vtx.Digest].completed = true
					}
				}
			}
			// Wait for the build to finish before aggregating.
			eg.Wait()
			metrics := struct {
				Total            int64 `json:"total"`
				Completed        int64 `json:"completed"`
				UserTotal        int64 `json:"user_total"`
				UserCached       int64 `json:"user_cached"`
				UserCacheable    int64 `json:"user_cacheable"`
				UserCompleted    int64 `json:"user_completed"`
				From             int64 `json:"from"`
				Miss             int64 `json:"miss"`
				ClientDurationMS int64 `json:"client_duration_ms"`
			}{
				ClientDurationMS: time.Since(startTime).Milliseconds(),
			}
			for _, vtx := range vtxMap {
				metrics.Total++
				if vtx.completed {
					metrics.Completed++
				}
				// Skip internal, auth, cache-import and export vertexes; they
				// are not user-defined steps.
				if strings.HasPrefix(vtx.name, "[internal]") ||
					strings.HasPrefix(vtx.name, "[auth]") ||
					strings.HasPrefix(vtx.name, "importing cache") ||
					strings.HasPrefix(vtx.name, "exporting ") {
					continue
				}
				metrics.UserTotal++
				metrics.UserCacheable++
				if vtx.cached {
					metrics.UserCached++
				} else if !vtx.from {
					metrics.Miss++
					fmt.Printf("cache miss: %s\n", vtx.name)
				}
				if vtx.completed {
					metrics.UserCompleted++
				}
				if vtx.from {
					// FROM steps are counted separately and excluded from the
					// cacheable set.
					metrics.From++
					metrics.UserCacheable--
				}
			}
			dt, _ := json.Marshal(metrics)
			fmt.Fprintln(os.Stderr, string(dt))
			return nil
		})
	}
	// ...

This makes it easy to grab that last line and dump it into a metrics sink; it might be a good idea to log it to a separate file instead.
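
For the sink side, something like this could pick that line up (sketch only; it simply keeps the last stderr line that parses as JSON, which would be the metrics dump from the WIP above):

	// Sketch of the sink side: pipe buildctl's stderr into this; it keeps the
	// last line that parses as a JSON object (the metrics dump) and prints it.
	package main

	import (
		"bufio"
		"encoding/json"
		"fmt"
		"os"
	)

	func main() {
		var last string
		sc := bufio.NewScanner(os.Stdin)
		for sc.Scan() {
			line := sc.Text()
			var m map[string]any
			if json.Unmarshal([]byte(line), &m) == nil {
				last = line // candidate metrics line
			}
		}
		if last != "" {
			fmt.Println(last) // replace with a push to your metrics sink
		}
	}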

How do you feel about something like this?

Could this be its own command as well?
