fix(k8sprocessor): race condition when getting Pod data #938

Merged
swiatekm merged 1 commit into main from fix/k8sprocessor/update-race
Feb 7, 2023

@swiatekm swiatekm commented Feb 7, 2023

We had a race condition: GetPod updated Pod metadata while holding only a read lock. When two batches of data arrived from the same Pod in quick succession, we'd get a panic due to concurrent map writes.

I fixed this by changing the interface to only return an attributes map and explicitly making that a copy of what's in the Pod attributes.
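
For illustration, a minimal sketch of the copy-on-read idea described above. The names `podStore`, `newPodStore`, `attributesCopy` and `setAttribute` are hypothetical and are not the processor's actual types; the point is only that readers get their own map, built while the lock is held, and never touch the shared one afterwards.

```go
package main

import "sync"

// podStore is a hypothetical stand-in for the processor's pod cache.
type podStore struct {
	mu   sync.RWMutex
	pods map[string]map[string]string // pod identifier -> attributes
}

// newPodStore returns an empty store with its map initialized.
func newPodStore() *podStore {
	return &podStore{pods: make(map[string]map[string]string)}
}

// attributesCopy returns a copy of the stored attributes. The copy is made
// while the read lock is held, so callers may modify the result freely.
func (s *podStore) attributesCopy(id string) (map[string]string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()

	attrs, ok := s.pods[id]
	if !ok {
		return nil, false
	}
	out := make(map[string]string, len(attrs))
	for k, v := range attrs {
		out[k] = v
	}
	return out, true
}

// setAttribute mutates the shared attributes, which requires the write lock.
func (s *podStore) setAttribute(id, key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()

	if s.pods[id] == nil {
		s.pods[id] = make(map[string]string)
	}
	s.pods[id][key] = value
}
```

With this shape, the only code path that writes to the shared map holds the write lock, and anything a caller does to the returned map stays local to that caller.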

@github-actions github-actions bot added the go label Feb 7, 2023
@swiatekm swiatekm force-pushed the fix/k8sprocessor/update-race branch from a6c6817 to a742efd Compare February 7, 2023 10:12
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Feb 7, 2023
Comment on lines 266 to 285
func (c *WatchClient) GetPodAttributes(identifier PodIdentifier) (map[string]string, bool) {
	pod, ok := c.getPod(identifier)
	if !ok {
		return nil, false
	}
	// Build a fresh map so callers never share pod.Attributes.
	attributes := make(map[string]string, len(pod.Attributes))
	for key, value := range pod.Attributes {
		attributes[key] = value
	}
	// Owner metadata is merged into the copy, not into the Pod itself.
	for key, value := range c.getPodOwnerMetadataAttributes(pod) {
		attributes[key] = value
	}
	return attributes, ok
}

Contributor

Could we pass a target map to this function so we avoid double copying?

Author

I'd rather not unless we can show a performance increase. Doing it this way is easier to understand, and I'd be surprised if copying like 16 strings twice makes any significant difference.

Either way, for this fix I'd rather keep things simple; we can revisit this later.
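
For reference if this gets revisited, here is a rough sketch of what the target-map variant suggested above might look like. `attrCache` and `copyInto` are hypothetical names chosen for the example, not existing code in the processor, and no performance benefit is claimed here.

```go
package main

import "sync"

// attrCache is a hypothetical cache of per-pod attributes.
type attrCache struct {
	mu    sync.RWMutex
	attrs map[string]map[string]string // pod identifier -> attributes
}

// copyInto writes the pod's attributes directly into dst while holding the
// read lock, so no intermediate map is allocated. It reports whether the
// pod was found; dst is left untouched on a miss.
func (c *attrCache) copyInto(id string, dst map[string]string) bool {
	c.mu.RLock()
	defer c.mu.RUnlock()

	src, ok := c.attrs[id]
	if !ok {
		return false
	}
	for k, v := range src {
		dst[k] = v
	}
	return true
}
```

The trade-off is that the caller now has to supply and own the destination map, which makes the interface a little harder to read than one that simply returns a copy.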

@swiatekm swiatekm force-pushed the fix/k8sprocessor/update-race branch from a742efd to e5dac4f Compare February 7, 2023 11:48
@swiatekm swiatekm marked this pull request as ready for review February 7, 2023 11:49
@swiatekm swiatekm requested a review from a team as a code owner February 7, 2023 11:49
@swiatekm swiatekm requested a review from sumo-drosiek February 7, 2023 11:52
@swiatekm swiatekm force-pushed the fix/k8sprocessor/update-race branch from e5dac4f to 8fa9147 Compare February 7, 2023 12:11
swiatekm commented Feb 7, 2023

I refactored the locking a bit and I believe it's now safe. We should rethink how it's meant to work in general though. The code acts internally as if passing around a *Pod is safe, but in reality the attributes of that Pod may be modified at any time. The way locking is done in the internal methods should reflect this.
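
One possible direction for that rethink, sketched under assumptions: keep the lock next to the data it guards, so that holding a pointer never implies the attributes can be read without synchronization. `trackedPod`, `set`, `snapshot` and `hammer` are hypothetical names for this example only, and this is not how the processor is currently structured.

```go
package main

import "sync"

// trackedPod is a hypothetical Pod wrapper whose attributes can only be
// reached through methods that take the lock.
type trackedPod struct {
	mu         sync.RWMutex
	attributes map[string]string
}

// set updates one attribute under the write lock.
func (p *trackedPod) set(key, value string) {
	p.mu.Lock()
	defer p.mu.Unlock()

	if p.attributes == nil {
		p.attributes = make(map[string]string)
	}
	p.attributes[key] = value
}

// snapshot returns a copy of the attributes taken under the read lock.
func (p *trackedPod) snapshot() map[string]string {
	p.mu.RLock()
	defer p.mu.RUnlock()

	out := make(map[string]string, len(p.attributes))
	for k, v := range p.attributes {
		out[k] = v
	}
	return out
}

// hammer starts the given number of concurrent writers and readers against
// the same pod and waits for all of them to finish.
func hammer(p *trackedPod, workers int) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(2)
		go func() {
			defer wg.Done()
			p.set("k8s.pod.name", "web-0")
		}()
		go func() {
			defer wg.Done()
			_ = p.snapshot()
		}()
	}
	wg.Wait()
}
```

Running something like `hammer` under `go test -race` is one way to check that writers and readers no longer touch the map concurrently.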

ownerAttributes := c.getPodOwnerMetadataAttributes(pod)

// we need to take a lock here because pod.Attributes may be modified concurrently
// TODO: clean up the locking in these functions

Contributor

Which functions?

Author

This one and the ones it calls. I made that clearer in the comment.

@swiatekm swiatekm force-pushed the fix/k8sprocessor/update-race branch from 8fa9147 to b28fda2 Compare February 7, 2023 12:56
@swiatekm swiatekm enabled auto-merge (rebase) February 7, 2023 13:10
@swiatekm swiatekm merged commit 7909c61 into main Feb 7, 2023
@swiatekm swiatekm deleted the fix/k8sprocessor/update-race branch February 7, 2023 13:16