Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only delete conntrack entries for orig-src and reply-src. #1500

Merged
merged 2 commits into from
Jul 11, 2017
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 15 additions & 4 deletions conntrack/conntrack.go
Original file line number Diff line number Diff line change
Expand Up @@ -22,11 +22,22 @@ import (
log "github.com/Sirupsen/logrus"
)

var conntrackDirections = []string{
// For TCP/UDP, each conntrack entry holds two copies of the tuple
// (src addr, dst addr, src port, dst port). One copy for the original direction and one copy for
// the reply direction. This is how the kernel handles NAT: by looking up the tuple for a packet
// by its original tuple and mapping onto the corresponding reply direction tuple (or vice versa).
// The reply tuple is calculated when the original outgoing packet is processed (and possibly
// NATted).
//
// When we delete conntrack entries by IP address, we need to specify which element of the tuple
// to look in. This slice holds the flags corresponding to the fields we care about. Since we're
// deleting entries for local workload endpoints, either the endpoint originated the traffic, or it
// received the traffic and replied to it. In the originating case, the "original source" will be
// set to the endpoint's IP; in the other case, the "reply source". Hence, it's sufficient to only
// look in those two fields.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is very helpful, but I think it may need even a bit more in order to explain why only looking at orig-src and reply-src is watertight for our purposes.

The concerns are, I think that:

  • There could be a (partial) flow with orig-dst==<endpoint IP> but no reply-src, if the endpoint chooses not to respond at all. At least with UDP that would be quite easy to arrange; probably more tricky with TCP.
  • Is there any way to get a flow with reply-dst==<endpoint IP>!=orig-src ?

I think the possible answers are that:

  • It isn't possible to get conntrack state at all for such cases, because they fall outside the definition of what conntrack considers to be a flow; and/or
  • There can be conntrack state for such cases, but it naturally expires so quickly that we don't care about Felix deleting it here. (Hypothesis:) Conntrack state is only long-lived for complete flows with reply-src==orig-dst and reply-dst==orig-src, and it's only that long-lived state that we are concerned about deleting here.

Feel free to cut and paste if any of this is right or useful.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the conntrack tuples are produced as a result of the NAT calculation; it doesn't matter that there hasn't yet been a reply; the reply direction would still exist and have the expected reply source IP in it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see now, I had completely misunderstood. (So this exchange was very useful!)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To double-check my new understanding, then:

  • If orig-dst is different from reply-src, that would only be because of a DNAT affecting connection to the local workload, with orig-dst being the service or floating IP and reply-src being the endpoint IP. Since we are deleting by endpoint IP, reply-src is the only useful lookup.

  • If reply-dst is different from orig-src, that would only be because of an SNAT affecting outgoing connections from the local workload, with orig-src being the endpoint IP - and again, because we are deleting by endpoint IP, orig-src is the only useful lookup.

That all seems watertight to me, so I'll approve now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's my understanding. I can't think of any other cases.

var deleteDirections = []string{
"--orig-src",
"--orig-dst",
"--reply-src",
"--reply-dst",
}

const numRetries = 3
Expand Down Expand Up @@ -65,7 +76,7 @@ func (c Conntrack) RemoveConntrackFlows(ipVersion uint8, ipAddr net.IP) {
log.WithField("version", ipVersion).Panic("Unknown IP version")
}
log.WithField("ip", ipAddr).Info("Removing conntrack flows")
for _, direction := range conntrackDirections {
for _, direction := range deleteDirections {
logCxt := log.WithFields(log.Fields{"ip": ipAddr, "direction": direction})
// Retry a few times because the conntrack command seems to fail at random.
for retry := 0; retry <= numRetries; retry += 1 {
Expand Down
18 changes: 0 additions & 18 deletions conntrack/conntrack_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -35,18 +35,14 @@ var _ = Describe("Conntrack", func() {
conntrack.RemoveConntrackFlows(4, net.ParseIP("10.0.0.1"))
Expect(cmdRec.cmdArgs).To(Equal([][]string{
[]string{"--family", "ipv4", "--delete", "--orig-src", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--orig-dst", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--reply-src", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--reply-dst", "10.0.0.1"},
}))
})
It("IPv6: Should remove all directions", func() {
conntrack.RemoveConntrackFlows(6, net.ParseIP("fe80::beef"))
Expect(cmdRec.cmdArgs).To(Equal([][]string{
[]string{"--family", "ipv6", "--delete", "--orig-src", "fe80::beef"},
[]string{"--family", "ipv6", "--delete", "--orig-dst", "fe80::beef"},
[]string{"--family", "ipv6", "--delete", "--reply-src", "fe80::beef"},
[]string{"--family", "ipv6", "--delete", "--reply-dst", "fe80::beef"},
}))
})
It("should panic on unknown IP version", func() {
Expand All @@ -62,9 +58,7 @@ var _ = Describe("Conntrack", func() {
conntrack.RemoveConntrackFlows(4, net.ParseIP("10.0.0.1"))
Expect(cmdRec.cmdArgs).To(Equal([][]string{
[]string{"--family", "ipv4", "--delete", "--orig-src", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--orig-dst", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--reply-src", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--reply-dst", "10.0.0.1"},
}))
})
})
Expand All @@ -78,9 +72,7 @@ var _ = Describe("Conntrack", func() {
Expect(cmdRec.cmdArgs).To(Equal([][]string{
[]string{"--family", "ipv4", "--delete", "--orig-src", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--orig-src", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--orig-dst", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--reply-src", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--reply-dst", "10.0.0.1"},
}))
})
})
Expand All @@ -97,20 +89,10 @@ var _ = Describe("Conntrack", func() {
[]string{"--family", "ipv4", "--delete", "--orig-src", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--orig-src", "10.0.0.1"},

[]string{"--family", "ipv4", "--delete", "--orig-dst", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--orig-dst", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--orig-dst", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--orig-dst", "10.0.0.1"},

[]string{"--family", "ipv4", "--delete", "--reply-src", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--reply-src", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--reply-src", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--reply-src", "10.0.0.1"},

[]string{"--family", "ipv4", "--delete", "--reply-dst", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--reply-dst", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--reply-dst", "10.0.0.1"},
[]string{"--family", "ipv4", "--delete", "--reply-dst", "10.0.0.1"},
}))
})
})
Expand Down