
Target Allocator not allocating all available targets #1037

Closed
jaronoff97 opened this issue Aug 16, 2022 · 0 comments · Fixed by #1039
Labels
area:target-allocator Issues for target-allocator

Comments

@jaronoff97
Contributor

Following from #1030, I noticed a discrepancy between the number of targets allocated to collectors and the number of targets discovered by the target allocator. You can see below how the collectors are allocated a total of only 76 targets, despite 142 targets being discovered.
[Screenshot (Aug 16, 2022): target allocator metrics showing 76 targets allocated across collectors vs. 142 targets discovered]

This finding helps explain some of the behavior previously reported about the target allocator here. From my investigation, I discovered the following:

Once targets are discovered, they are added to the allocator's targetsWaiting map like so:

func (allocator *Allocator) SetWaitingTargets(targets []TargetItem) {
	// Dump old data
	allocator.m.Lock()
	defer allocator.m.Unlock()
	allocator.targetsWaiting = make(map[string]TargetItem, len(targets))
	// Set new data
	for _, i := range targets {
		allocator.targetsWaiting[i.JobName+i.TargetURL] = i
	}
}

The key for this map is the JobName concatenated with the TargetURL ... so what happens when you have multiple targets with the same IP and port but different endpoint names? This is exactly the scenario my team was running into:

lightstep-collector-collector                               ClusterIP   10.35.242.209   <none>        8888/TCP,4317/TCP
lightstep-collector-collector-monitoring                    ClusterIP   10.35.241.103   <none>        8888/TCP
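
To make the collision concrete, here is a minimal, runnable sketch. The TargetItem struct is simplified for illustration, and the PortName field is an assumption standing in for the endpoint/port name that in practice lives in the target's discovered labels:

package main

import "fmt"

// Simplified stand-in for the allocator's TargetItem; PortName is
// illustrative only and is deliberately not part of the map key.
type TargetItem struct {
	JobName   string
	TargetURL string
	PortName  string
}

func main() {
	// Two distinct targets: same pod IP and port (8888), different
	// services and port names, as in the endpoints shown below.
	targets := []TargetItem{
		{JobName: "collector", TargetURL: "10.32.2.16:8888", PortName: "monitoring"},
		{JobName: "collector", TargetURL: "10.32.2.16:8888", PortName: "metrics"},
	}
	targetsWaiting := make(map[string]TargetItem, len(targets))
	for _, i := range targets {
		// Both items produce the key "collector10.32.2.16:8888",
		// so whichever is processed last silently wins.
		targetsWaiting[i.JobName+i.TargetURL] = i
	}
	fmt.Println(len(targetsWaiting)) // prints 1, not 2
}

Which of the two survives depends purely on processing order, which is why the set of dropped targets can vary between runs.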

The problem is that the endpoint name isn't included in the key, only the target URL, which is of the form ip:port. Observing the endpoints for these services:

⫸ k get endpoints lightstep-collector-collector-monitoring -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  name: lightstep-collector-collector-monitoring
  namespace: opentelemetry
subsets:
- addresses:
  - ip: 10.32.2.16
  - ip: 10.32.2.17
  - ip: 10.32.2.18
  ports:
  - name: monitoring
    port: 8888
    protocol: TCP
⫸ k get endpoints lightstep-collector-collector -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  name: lightstep-collector-collector
  namespace: opentelemetry
subsets:
- addresses:
  - ip: 10.32.2.16
  - ip: 10.32.2.17
  - ip: 10.32.2.18
  ports:
  - appProtocol: grpc
    name: otlp-grpc
    port: 4317
    protocol: TCP
  - name: metrics
    port: 8888
    protocol: TCP

Here we see the collision: each ip:port combination appears under both services, distinguished only by the port name. Depending on the order in which targets are discovered from the kube API, the target allocator may or may not drop the desired targets.

I am currently working on a solution to address this.
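
For reference, a minimal sketch of one possible direction, purely illustrative (the actual fix landed in #1039 and may differ): build the key from the job name, the target URL, and the target's full, sorted label set, so targets sharing an ip:port but differing in port name stay distinct. The Labels field here is an assumption made for this sketch:

package main

import (
	"fmt"
	"sort"
	"strings"
)

// Hypothetical TargetItem that also carries its discovered labels;
// the Labels field is an assumption for this sketch.
type TargetItem struct {
	JobName   string
	TargetURL string
	Labels    map[string]string
}

// targetKey builds a collision-free key from the job name, target URL,
// and the full label set in sorted order (sorting keeps the key
// deterministic regardless of map iteration order).
func targetKey(item TargetItem) string {
	keys := make([]string, 0, len(item.Labels))
	for k := range item.Labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	b.WriteString(item.JobName)
	b.WriteString(item.TargetURL)
	for _, k := range keys {
		b.WriteString(";" + k + "=" + item.Labels[k])
	}
	return b.String()
}

func main() {
	a := TargetItem{"collector", "10.32.2.16:8888",
		map[string]string{"__meta_kubernetes_endpoint_port_name": "monitoring"}}
	b := TargetItem{"collector", "10.32.2.16:8888",
		map[string]string{"__meta_kubernetes_endpoint_port_name": "metrics"}}
	fmt.Println(targetKey(a) != targetKey(b)) // true — no collision
}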
