
Target Allocator not allocating all available targets #1037

Closed
jaronoff97 opened this issue Aug 16, 2022 · 0 comments · Fixed by #1039
Labels
area:target-allocator Issues for target-allocator

Comments

@jaronoff97
Contributor

Following from #1030, I noticed a discrepancy between the number of targets allocated to collectors and the number of targets discovered by the target allocator. You can see below how the collectors are allocated a total of only 76 targets, despite 142 targets being discovered.
[Screenshot (Aug 16, 2022): target allocator metrics showing 76 targets allocated across collectors vs. 142 targets discovered]

This finding helps explain some of the behavior previously reported about the target allocator here. From my investigation, I discovered the following:

Once targets are discovered, they are added to the allocator's targetsWaiting map like so:

func (allocator *Allocator) SetWaitingTargets(targets []TargetItem) {
	// Dump old data
	allocator.m.Lock()
	defer allocator.m.Unlock()
	allocator.targetsWaiting = make(map[string]TargetItem, len(targets))
	// Set new data
	for _, i := range targets {
		allocator.targetsWaiting[i.JobName+i.TargetURL] = i
	}
}

The key for this map is the JobName concatenated with the TargetURL ... so what happens when you have multiple targets with the same IP and port but different endpoint names? This is exactly the scenario my team was running into:

lightstep-collector-collector                               ClusterIP   10.35.242.209   <none>        8888/TCP,4317/TCP
lightstep-collector-collector-monitoring                    ClusterIP   10.35.241.103   <none>        8888/TCP
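
To make the collision concrete, here is a minimal, runnable sketch. The TargetItem struct is simplified for illustration, and the PortName field is an assumption standing in for the endpoint/port name that in practice lives in the target's discovered labels:

package main

import "fmt"

// Simplified stand-in for the allocator's TargetItem; PortName is
// illustrative only and is deliberately not part of the map key.
type TargetItem struct {
	JobName   string
	TargetURL string
	PortName  string
}

func main() {
	// Two distinct targets: same pod IP and port (8888), different
	// services and port names, as in the endpoints shown below.
	targets := []TargetItem{
		{JobName: "collector", TargetURL: "10.32.2.16:8888", PortName: "monitoring"},
		{JobName: "collector", TargetURL: "10.32.2.16:8888", PortName: "metrics"},
	}
	targetsWaiting := make(map[string]TargetItem, len(targets))
	for _, i := range targets {
		// Both items produce the key "collector10.32.2.16:8888",
		// so whichever is processed last silently wins.
		targetsWaiting[i.JobName+i.TargetURL] = i
	}
	fmt.Println(len(targetsWaiting)) // prints 1, not 2
}

Which of the two survives depends purely on processing order, which is why the set of dropped targets can vary between runs.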

The problem is that the endpoint name isn't included in the key, only the target URL, which is of the form ip:port. Observing the endpoints for these services:

⫸ k get endpoints lightstep-collector-collector-monitoring -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  name: lightstep-collector-collector-monitoring
  namespace: opentelemetry
subsets:
- addresses:
  - ip: 10.32.2.16
  - ip: 10.32.2.17
  - ip: 10.32.2.18
  ports:
  - name: monitoring
    port: 8888
    protocol: TCP
⫸ k get endpoints lightstep-collector-collector -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  name: lightstep-collector-collector
  namespace: opentelemetry
subsets:
- addresses:
  - ip: 10.32.2.16
  - ip: 10.32.2.17
  - ip: 10.32.2.18
  ports:
  - appProtocol: grpc
    name: otlp-grpc
    port: 4317
    protocol: TCP
  - name: metrics
    port: 8888
    protocol: TCP

Here we see the collision: each ip:port combination appears under both services, distinguished only by the port name. Depending on the order in which targets are discovered from the kube API, the target allocator may or may not drop the desired targets.

I am currently working on a solution to address this.
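
For reference, a minimal sketch of one possible direction, purely illustrative (the actual fix landed in #1039 and may differ): build the key from the job name, the target URL, and the target's full, sorted label set, so targets sharing an ip:port but differing in port name stay distinct. The Labels field here is an assumption made for this sketch:

package main

import (
	"fmt"
	"sort"
	"strings"
)

// Hypothetical TargetItem that also carries its discovered labels;
// the Labels field is an assumption for this sketch.
type TargetItem struct {
	JobName   string
	TargetURL string
	Labels    map[string]string
}

// targetKey builds a collision-free key from the job name, target URL,
// and the full label set in sorted order (sorting keeps the key
// deterministic regardless of map iteration order).
func targetKey(item TargetItem) string {
	keys := make([]string, 0, len(item.Labels))
	for k := range item.Labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	b.WriteString(item.JobName)
	b.WriteString(item.TargetURL)
	for _, k := range keys {
		b.WriteString(";" + k + "=" + item.Labels[k])
	}
	return b.String()
}

func main() {
	a := TargetItem{"collector", "10.32.2.16:8888",
		map[string]string{"__meta_kubernetes_endpoint_port_name": "monitoring"}}
	b := TargetItem{"collector", "10.32.2.16:8888",
		map[string]string{"__meta_kubernetes_endpoint_port_name": "metrics"}}
	fmt.Println(targetKey(a) != targetKey(b)) // true — no collision
}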
