
Optimistically return metrics from the cache (if present) #418

Open
nbrachet wants to merge 3 commits into master

Conversation

nbrachet

Optimistically return metrics from the cache (if present) before building a metric to add to the cache.

nbrachet (Author) commented Dec 4, 2020

@jupp0r any advice on how to address this jenkins error:

no coveralls.io token specified and no travis job id found

It doesn't seem related to this PR?

jupp0r (Owner) commented Dec 4, 2020

> @jupp0r any advice on how to address this jenkins error

Ignore it, it's a known issue that I thought I had fixed.

gjasny (Collaborator) commented Dec 4, 2020

Hi,

Could you please add a test for the expected behavior?

Thanks,
Gregor

nbrachet (Author) commented Dec 4, 2020

I've been thinking about adding a test, but I'm struggling to come up with anything.
This is a transparent change, and I can't come up with a way to externally check whether a new unique_ptr was created or not.

PS: I'm not super happy about the code duplication either, but again I couldn't come up with a better approach.

danevandyck

Is the idea that hashing the labels is less expensive than the call to make_unique? That seems doubtful, but I'm not sure. Regardless, it seems like the labels should only be hashed once if the metric is missing.

nbrachet (Author) commented Dec 8, 2020

The idea is that the vast majority of the time Add() is called, it will return the metric from the cache. The first call with a new set of labels will indeed be more expensive (2 hashes + 1 make_unique), but that cost is amortized over time since subsequent calls are cheaper (1 hash, with no make_unique and no ~unique_ptr).

As an illustration, consider the following pseudo-code:

 Family<Counter> counter{"total_requests", "Count all requests"};

 void handle_request(...) {
    auto vhost = <get virtual host>;
    counter.Add({{"vhost", vhost}}).Increment();
    ...
 }

The first request for each vhost initializes the cache (i.e. Family<Counter>::metrics_, Family<Counter>::labels_, and Family<Counter>::labels_reverse_lookup_); after that, all subsequent requests simply retrieve the Counter from there, saving a call to make_unique<Counter> and the matching ~unique_ptr<Counter>.
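
To illustrate what a cache hit saves, here is a sketch of that fast path (an editorial illustration, not the PR diff: the member names are the ones listed above, detail::hash_labels is the existing label-hashing helper, and the exact bookkeeping on the miss path is an assumption):

  // Sketch only. On a cache hit, Add() returns the existing metric
  // without constructing (or later destroying) a temporary unique_ptr.
  template <typename T>
  template <typename... Args>
  T& Family<T>::Add(const std::map<std::string, std::string>& labels,
                    Args&&... args) {
    auto hash = detail::hash_labels(labels);
    std::lock_guard<std::mutex> lock{mutex_};
    auto it = metrics_.find(hash);
    if (it != metrics_.end()) return *(it->second);  // hit: no allocation
    // miss: the first call with these labels builds and caches the metric
    auto metric = detail::make_unique<T>(std::forward<Args>(args)...);
    auto& ref = *metric;
    labels_.emplace(hash, labels);
    labels_reverse_lookup_.emplace(&ref, hash);
    metrics_.emplace(hash, std::move(metric));
    return ref;
  }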

danevandyck

I get it. How about implementing a helper that accepts a pre-computed hash value, thereby avoiding the new first-lookup penalty and the duplication mentioned earlier?

nbrachet (Author) commented Dec 9, 2020

How about this?

Comment on lines +112 to +114
metrics_iterator iter = FindMetric(labels);
if (iter->second) return *(iter->second);
return Add(iter, detail::make_unique<T>(args...));


You're acquiring the lock twice here. The helpers (e.g. FindMetric and the Add overload) are private and should assume that the lock is held.

nbrachet (Author):

Same comment as before: the occasional double-locking is nothing in the long run, whereas grabbing the lock before computing the hash is always detrimental.
But that really opens the door to the fix I prefer: move the implementation of Add to the header file. Much simpler, but I figure less acceptable.


I don't think consolidating lock acquisition is incompatible with hashing outside the lock. As long as the private methods remain in the source file, the goal of preventing instantiation outside of Counter/Gauge/... is still achieved. So:

  T& Add(const std::map<std::string, std::string>& labels, Args&&... args) {
    auto hash = detail::hash_labels(labels);
    std::lock_guard<std::mutex> lock{mutex_};
    metrics_iterator iter = FindMetric(hash);
    if (iter->second) return *(iter->second);
    return Add(iter, detail::make_unique<T>(std::forward<Args>(args)...));
  }

where both private methods (FindMetric and Add) use a pre-computed hash.

gjasny (Collaborator):

Taking the lock twice leads to a race condition where the unique_ptr is null in between.

My goal when hiding the implementation was to avoid circular includes, as well as to limit the exposed symbols so that we have at least a chance to provide patches while maintaining the SONAME.
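
To make that window concrete, here is a sketch of the two-critical-section shape being discussed (an editorial illustration of the hazard, not the PR's exact code; mutex_, FindMetric, and the private Add overload follow the diff above):

  T& Add(const std::map<std::string, std::string>& labels, Args&&... args) {
    metrics_iterator iter;
    {
      std::lock_guard<std::mutex> lock{mutex_};
      iter = FindMetric(labels);  // may insert a {hash, nullptr} entry
    }
    // <-- window: another thread calling Add() with the same labels can
    //     observe the null unique_ptr here and construct a second T
    if (iter->second) return *(iter->second);
    std::lock_guard<std::mutex> lock{mutex_};
    return Add(iter, detail::make_unique<T>(std::forward<Args>(args)...));
  }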

nbrachet (Author):

> Taking the lock twice leads to a race condition where the unique_ptr is null in between.

If by "race condition" you mean "there is a possibility that 2 (or more) unique_ptrs are created", that is correct.

> My goal when hiding the implementation was to avoid circular includes, as well as to limit the exposed symbols so that we have at least a chance to provide patches while maintaining the SONAME.

Right. This is why the patch is structured this way: to maintain the same isolation of the implementation. The runtime cost is what we've discussed here: double-hashing and double-locking until the cache is populated. After that there is no more double-hashing or double-locking, and no more extraneous new/delete (the original intent of this patch).
I think it's a worthy trade-off.

gjasny (Collaborator):

On my forgeAI-nick/optimistic-family-add branch I moved the lock into the inlined template. That way we don't double-lock.
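
Roughly, the shape described there (a sketch under the assumption that the private helpers stay in family.cc and expect the mutex to be held; not the verbatim branch contents):

  // family.h -- the inlined template member hashes outside the lock,
  // then holds mutex_ exactly once across the whole lookup-or-insert:
  template <typename T>
  template <typename... Args>
  T& Family<T>::Add(const std::map<std::string, std::string>& labels,
                    Args&&... args) {
    auto hash = detail::hash_labels(labels);
    std::lock_guard<std::mutex> lock{mutex_};
    auto iter = FindMetric(hash);
    if (iter->second) return *(iter->second);
    return Add(iter, detail::make_unique<T>(std::forward<Args>(args)...));
  }

  // family.cc -- the private helpers do no locking of their own and
  // expect mutex_ to be held by the caller:
  //   metrics_iterator FindMetric(std::size_t hash);
  //   T& Add(metrics_iterator iter, std::unique_ptr<T> metric);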

gjasny (Collaborator) commented Dec 26, 2020

Could you please reset your branch to "our" forgeAI-nick/optimistic-family-add?

gjasny requested a review from jupp0r on December 26, 2020 at 17:04