Skip to content
This repository has been archived by the owner on May 23, 2024. It is now read-only.

Add resolved udp connection type, continually resolve dns names in background #520

Merged
merged 25 commits into from
Jul 9, 2020

Conversation

terev
Copy link
Contributor

@terev terev commented Jun 26, 2020

Which problem is this PR solving?

Short description of the changes

  • This pr adds a new type that decorates a UDPConn
  • For instances where an agent address is specified as a hostname the udp transport will create an instance of this type.
  • This type continually attempts to resolve a new address for the host, if a new address is resolved, and the dial succeeds the currently held connection is atomically swapped with a new one

@codecov
Copy link

codecov bot commented Jun 26, 2020

Codecov Report

Merging #520 into master will increase coverage by 0.51%.
The diff coverage is 88.74%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #520      +/-   ##
==========================================
+ Coverage   88.75%   89.27%   +0.51%     
==========================================
  Files          60       61       +1     
  Lines        3790     3915     +125     
==========================================
+ Hits         3364     3495     +131     
+ Misses        309      294      -15     
- Partials      117      126       +9     
Impacted Files Coverage Δ
config/config_env.go 96.80% <63.63%> (-3.21%) ⬇️
utils/udp_client.go 65.38% <78.78%> (+65.38%) ⬆️
utils/reconnecting_udp_conn.go 92.50% <92.50%> (ø)
config/config.go 95.51% <100.00%> (+0.21%) ⬆️
transport_udp.go 97.59% <100.00%> (+0.29%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b8ed773...da03468. Read the comment docs.

Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the patch!

utils/resolved_udp_conn.go Outdated Show resolved Hide resolved
utils/resolved_udp_conn.go Outdated Show resolved Hide resolved
@@ -142,17 +142,62 @@
version = "v0.0.5"

[[projects]]
digest = "1:0496f0e99014b7fd0a560c539f51d0882731137b85494142f47e550e4657176a"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there something specific required by upgrading the lock file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this digest changed because i use the testify/mock sub package for my tests

utils/resolved_udp_conn.go Outdated Show resolved Hide resolved
utils/resolved_udp_conn.go Outdated Show resolved Hide resolved
utils/resolved_udp_conn.go Outdated Show resolved Hide resolved
utils/resolved_udp_conn.go Outdated Show resolved Hide resolved
utils/resolved_udp_conn.go Outdated Show resolved Hide resolved
utils/resolved_udp_conn.go Outdated Show resolved Hide resolved
transport_udp.go Outdated
@@ -59,7 +60,7 @@ type udpSender struct {

// NewUDPTransport creates a reporter that submits spans to jaeger-agent.
// TODO: (breaking change) move to transport/ package.
func NewUDPTransport(hostPort string, maxPacketSize int) (Transport, error) {
func NewUDPTransport(hostPort string, maxPacketSize int, logger log.Logger) (Transport, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a breaking change. I would rather add another function like below and delegate to it from this one:

type UDPTransportParams struct {
    HostPort string
    MaxPacketSize int
    Logger log.Logger
}

func NewUDPTransportWithParams(params UDPTransportParams)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly for NewAgentClientUDP

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made this change. I'm defaulting to log.StdLogger if no logger is passed (as is the case in NewUDPTransport and NewAgentClientUDP)

@terev terev requested a review from yurishkuro June 29, 2020 19:31
Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

few more comments, thanks

transport_udp.go Outdated Show resolved Hide resolved
utils/resolved_udp_conn.go Outdated Show resolved Hide resolved
utils/resolved_udp_conn.go Outdated Show resolved Hide resolved
utils/resolved_udp_conn.go Outdated Show resolved Hide resolved
utils/resolved_udp_conn.go Outdated Show resolved Hide resolved
utils/resolved_udp_conn.go Outdated Show resolved Hide resolved
utils/resolved_udp_conn.go Outdated Show resolved Hide resolved
utils/udp_client.go Outdated Show resolved Hide resolved
Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

almost there, most comments in the tests

transport_udp.go Show resolved Hide resolved
utils/resolved_udp_conn.go Outdated Show resolved Hide resolved
utils/resolved_udp_conn.go Outdated Show resolved Hide resolved
utils/resolved_udp_conn_test.go Outdated Show resolved Hide resolved
Port: 34322,
}).
Return(clientConn, nil).
Once()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: these seem repeated from previous test, could they be combined?

resolved, dialer := mockResolverAndDialer(hostPort, clientConn)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it may be hard to combine these into a function because each test has a different set of calls being mocked

On("ResolveUDPAddr", "udp", hostPort).
Return(nil, fmt.Errorf("failed to resolve"))

timer := time.AfterFunc(time.Millisecond*100, func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, this could create flaky test in CI due to short time interval. Could we use something more deterministic instead, like fail the resolve on connection creation, and then do a check-sleep loop until the connection is established from the background timer loop?
e.g.

for i:=0; i < 100; i++ {
    lock
    connected := (conn.conn != nil)
    unlock
    if connected { break }
    time.Sleep(10ms)
}
lock; defer unlock
assert.NotNil(t, conn.conn)

utils/resolved_udp_conn_test.go Outdated Show resolved Hide resolved
assert.GreaterOrEqual(t, 65000*2, bufferBytes)
}

expectedString := "yo this is a test"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems to repeat validation from the previous test, please refactor into a function

connUDP, err := net.DialUDP(destAddr.Network(), nil, destAddr)
if err != nil {
return nil, err
if ip := net.ParseIP(host); ip == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add tests for this logic too? This is where the most test coverage is missing

}

// attempt to resolve and dial new address in case that's the problem, if resolve and dial succeeds, try write again
if reconnErr := c.attemptResolveAndDial(); reconnErr == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this part is missing test coverage. I think one of the tests is capable of reproducing this condition.

https://codecov.io/gh/jaegertracing/jaeger-client-go/compare/66c008c3d6ad856cac92a0af53186efbffa8e6a5...aaf0c9b420e28648794f01cdc4c3ace3f9c74220/diff

terev added 4 commits July 2, 2020 23:01
Signed-off-by: Trevor Foster <[email protected]>
Signed-off-by: Trevor Foster <[email protected]>
Signed-off-by: Trevor Foster <[email protected]>
@terev terev requested a review from yurishkuro July 3, 2020 03:33
Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, a few more comments, main one about the design from user POV.

utils/udp_client.go Outdated Show resolved Hide resolved
utils/resolved_udp_conn.go Outdated Show resolved Hide resolved
utils/udp_client.go Outdated Show resolved Hide resolved
utils/resolved_udp_conn.go Outdated Show resolved Hide resolved
@terev
Copy link
Contributor Author

terev commented Jul 6, 2020

@yurishkuro what do you think of me changing the implementation to only re-resolve on emit instead of in the background? The re-resolve would occur before transmit and only if outside of a duration since last resolve, or on error.

My thought is this would make sure resolves only happen if something is actually being emitted. Less random dns requests in the background if there's no spans to emit.

One downside i see with this is that the dns query would extend the duration of the emit call (but probably not a lot?).

@yurishkuro
Copy link
Member

We don't want to re-resolve on every emit, that's going to affect performance. I think the point of time-based re-resolve is that if the name does change then the client will reconnect within the given interval. Since there's no acks from UDP server I don't see what else can be done.

@terev
Copy link
Contributor Author

terev commented Jul 6, 2020

Yes sorry I didn't mean every emit. Was thinking of it as if the resolved addr has a ttl, and if an emit occurs after the ttl, the reconnect is done. But yeah the resolve and syscall to create the socket could affect performance so i'll continue with the current strategy.

@terev terev requested a review from yurishkuro July 7, 2020 07:03
terev added 2 commits July 7, 2020 13:10
Signed-off-by: Trevor Foster <[email protected]>
Signed-off-by: Trevor Foster <[email protected]>
Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look great! Thanks for sticking with it. I left just a few small comments.

transport/http_test.go Outdated Show resolved Hide resolved
utils/reconnecting_udp_conn_test.go Outdated Show resolved Hide resolved
constants.go Outdated Show resolved Hide resolved
utils/reconnecting_udp_conn_test.go Outdated Show resolved Hide resolved
all assertions of udp write to succeed

Signed-off-by: Trevor Foster <[email protected]>
@terev terev requested a review from yurishkuro July 8, 2020 05:22
Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@yurishkuro yurishkuro merged commit 704ce28 into jaegertracing:master Jul 9, 2020
@yurishkuro
Copy link
Member

Thank you @terev ! 🎉🎉🎉

@terev
Copy link
Contributor Author

terev commented Jul 9, 2020

@yurishkuro Will I need to make this change in the open telemetry exporter as well or will this client eventually support that interface?

@yurishkuro
Copy link
Member

I am not sure if OpenTelemetry supports UDP emission at all. The ticket for OTEL Java SDK was closed today, don't know about Go.

@terev
Copy link
Contributor Author

terev commented Jul 9, 2020

Looks like it does still support udp emission to the agent https://github.com/open-telemetry/opentelemetry-go/blob/master/exporters/trace/jaeger/agent.go

@yurishkuro
Copy link
Member

then it would need a similar change

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
2 participants