Precalculated attribute set hashes (open-telemetry#1407)
The hash of an `AttributeSet` is expensive to compute, because every key and value in the attribute set must be hashed. The `ValueMap` uses this hash to look up whether we are already aggregating a time series for this set of attributes. Since the hashmap lookup occurs inside a mutex lock, no other counter can execute its `add()` call while the hash is being calculated, which creates contention in high-throughput scenarios.
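To illustrate the contention pattern, here is a minimal sketch (hypothetical names, not the actual `ValueMap` code) of a mutex-guarded hashmap keyed by attribute sets; note that the full attribute-set hash is computed by `entry()` while the lock is held:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Stand-in for the SDK's `AttributeSet`; hypothetical, for illustration only.
#[derive(Hash, PartialEq, Eq)]
struct Attrs(Vec<(String, String)>);

struct ValueMapSketch {
    trackers: Mutex<HashMap<Attrs, u64>>,
}

impl ValueMapSketch {
    fn add(&self, attrs: Attrs, value: u64) {
        let mut trackers = self.trackers.lock().unwrap();
        // `entry` hashes every key and value of `attrs` right here, while the
        // lock is held, so all other `add()` callers wait on the mutex.
        *trackers.entry(attrs).or_insert(0) += value;
    }
}
```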

This PR calculates and caches the hash at creation time. This improves throughput because the hash is computed by the thread creating the `AttributeSet`, outside of any mutex lock, so hashes can be computed in parallel and the time spent inside the lock is reduced. The larger the attribute sets used for time series, the greater the benefit of the reduced lock times.
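A generic sketch of the technique (a hypothetical wrapper, not the SDK's code; the actual change is in the diff below): hash the contents once at construction and replay the cached value whenever `Hash` is invoked.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical wrapper illustrating the precomputed-hash pattern.
struct CachedHash<T> {
    value: T,
    hash: u64, // 8 extra bytes per instance
}

impl<T: Hash> CachedHash<T> {
    fn new(value: T) -> Self {
        // Hash once, on the constructing thread, outside any lock.
        let mut hasher = DefaultHasher::new();
        value.hash(&mut hasher);
        let hash = hasher.finish();
        CachedHash { value, hash }
    }
}

impl<T> Hash for CachedHash<T> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // O(1) regardless of how large `value` is.
        state.write_u64(self.hash);
    }
}
```

For the `HashMap` contract to hold, equal values must produce equal cached hashes; the change below guarantees this by sorting the attributes before hashing them.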

The stress test results of this change for different thread counts are:

| Thread Count | Main      | PR        |
| ------------ | --------- | --------- |
| 2            | 3,376,040 | 3,310,920 |
| 3            | 5,908,640 | 5,807,240 |
| 4            | 3,382,040 | 8,094,960 |
| 5            | 1,212,640 | 9,086,520 |
| 6            | 1,225,280 | 6,595,600 |

The non-precomputed version starts seeing contention at 4 threads, and throughput drops substantially after that, while the precomputed version doesn't see contention until 6 threads — and even then it retains 5-6x more throughput due to the reduced locking times.

While these benchmarks may not be "realistic" (most applications do more work in between counter updates), they do show better parallelism and the opportunity to reduce lock contention, at the cost of only 8 bytes per time series (8 bytes × the 2,000-series cardinality cap = 16KB of additional memory at maximum cardinality).
KallDrexx authored Nov 28, 2023
1 parent 897e70a commit c0104d3
Showing 1 changed file with 24 additions and 8 deletions.

opentelemetry-sdk/src/attributes/set.rs
```diff
@@ -1,3 +1,4 @@
+use std::collections::hash_map::DefaultHasher;
 use std::collections::HashSet;
 use std::{
     cmp::Ordering,
@@ -104,13 +105,13 @@ impl Eq for HashKeyValue {}
 ///
 /// This must implement [Hash], [PartialEq], and [Eq] so it may be used as
 /// HashMap keys and other de-duplication methods.
-#[derive(Clone, Default, Debug, Hash, PartialEq, Eq)]
-pub struct AttributeSet(Vec<HashKeyValue>);
+#[derive(Clone, Default, Debug, PartialEq, Eq)]
+pub struct AttributeSet(Vec<HashKeyValue>, u64);
 
 impl From<&[KeyValue]> for AttributeSet {
     fn from(values: &[KeyValue]) -> Self {
         let mut seen_keys = HashSet::with_capacity(values.len());
-        let mut vec = values
+        let vec = values
             .iter()
             .rev()
             .filter_map(|kv| {
@@ -121,25 +122,34 @@ impl From<&[KeyValue]> for AttributeSet {
                 }
             })
             .collect::<Vec<_>>();
-        vec.sort_unstable();
 
-        AttributeSet(vec)
+        AttributeSet::new(vec)
     }
 }
 
 impl From<&Resource> for AttributeSet {
     fn from(values: &Resource) -> Self {
-        let mut vec = values
+        let vec = values
             .iter()
             .map(|(key, value)| HashKeyValue(KeyValue::new(key.clone(), value.clone())))
             .collect::<Vec<_>>();
-        vec.sort_unstable();
 
-        AttributeSet(vec)
+        AttributeSet::new(vec)
     }
 }
 
 impl AttributeSet {
+    fn new(mut values: Vec<HashKeyValue>) -> Self {
+        values.sort_unstable();
+        let mut hasher = DefaultHasher::new();
+        values.iter().fold(&mut hasher, |mut hasher, item| {
+            item.hash(&mut hasher);
+            hasher
+        });
+
+        AttributeSet(values, hasher.finish())
+    }
+
     /// Returns the number of elements in the set.
     pub fn len(&self) -> usize {
         self.0.len()
@@ -163,3 +173,9 @@ impl AttributeSet {
         self.0.iter().map(|kv| (&kv.0.key, &kv.0.value))
     }
 }
+
+impl Hash for AttributeSet {
+    fn hash<H: Hasher>(&self, state: &mut H) {
+        state.write_u64(self.1)
+    }
+}
```
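As a usage sketch (hedged: assumes `AttributeSet` and `KeyValue` are in scope, which depends on crate visibility), sets built from the same pairs in different orders are equal and carry the same cached hash, so the `Hash`/`Eq` contract that `HashMap` relies on is preserved:

```rust
use std::collections::HashMap;

// Hypothetical usage; assumes `AttributeSet` and `KeyValue` are in scope.
fn main() {
    let pairs1 = [KeyValue::new("b", 1), KeyValue::new("a", 2)];
    let pairs2 = [KeyValue::new("a", 2), KeyValue::new("b", 1)];
    let a = AttributeSet::from(&pairs1[..]);
    let b = AttributeSet::from(&pairs2[..]);
    assert_eq!(a, b); // values are sorted on construction, so order is irrelevant

    let mut series: HashMap<AttributeSet, u64> = HashMap::new();
    series.insert(a, 0);
    // This lookup hashes only the cached u64, not every key/value pair.
    assert!(series.contains_key(&b));
}
```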
