-
Notifications
You must be signed in to change notification settings - Fork 112
agent: Simplify tearing down the network #225
Conversation
1a653c9
to
de7a8c0
Compare
network.go
Outdated
for _, route := range s.network.routes { | ||
if err := s.removeRoute(netHandle, route); err != nil { | ||
routeList = append(routeList, *route) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sboeuf Where else in the code are we updating this list?
Why not just store objects instead of pointers in the list? We can always modify the objects itself when a slice is passed to a function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This involves a bigger change regarding how things are handled if we move from pointer to object. I am fine with any solution but I want some feedback before to go ahead and implement it.
@bergwolf @jodh-intel @egernst @grahamwhaley PTAL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because a slice is not really a good data structure for dynamical element removal, - you cannot avoid data copy if you want to remove within a loop.
OTOH, if we add an ID to pb.Route to uniquely identify a route for a sandbox, we can use a map for it and it is perfectly safe to do
for _, r := range routes {
delete(routes, r.ID)
}
The same applies to pb.Interface where we can use interface.Name to uniquely identify an interface and thus use a map instead of a slice to store it.
Codecov Report
@@ Coverage Diff @@
## master #225 +/- ##
=========================================
+ Coverage 40.88% 41.2% +0.32%
=========================================
Files 12 12
Lines 1815 1803 -12
=========================================
+ Hits 742 743 +1
+ Misses 970 958 -12
+ Partials 103 102 -1
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
@bergwolf @grahamwhaley PTAL |
network.go
Outdated
@@ -25,10 +25,10 @@ import ( | |||
// related information. | |||
type network struct { | |||
ifacesLock sync.Mutex | |||
ifaces []*pb.Interface | |||
ifaces []pb.Interface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I mentioned in https://github.com/kata-containers/agent/pull/225/files#r184835917, we can use pb.Interface.Name to uniquely identify an interface and thus use a map instead of a slice to store it. Then it is perfectly safe to call delete in a loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either way is fine with me. Do you really want that we introduce maps here, or you're okay with this way ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please change to use a map for it. It has a few advantages:
- suitable for delete within a loop
- ensuring each interface is unique
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, it works in this case because the name is already what we use to identify what we need to remove.
grpc.go
Outdated
@@ -892,10 +892,6 @@ func (a *agentGRPC) DestroySandbox(ctx context.Context, req *pb.DestroySandboxRe | |||
} | |||
a.sandbox.Unlock() | |||
|
|||
if err := a.sandbox.removeNetwork(); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the same logic, can we simplify DestroySandbox
by removing removeMounts
and teardownSharedPidNs
as well? Assuming that we do not reuse a sandbox after destroying it, the only thing must being done here is necessary pre-condition check and syscall.Sync().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to discuss about this, but if it's done, it's gonna be done in a follow up PR. This PR is more about the network.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, agreed.
grpc.go
Outdated
@@ -892,10 +892,6 @@ func (a *agentGRPC) DestroySandbox(ctx context.Context, req *pb.DestroySandboxRe | |||
} | |||
a.sandbox.Unlock() | |||
|
|||
if err := a.sandbox.removeNetwork(); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, agreed.
network.go
Outdated
@@ -25,10 +25,10 @@ import ( | |||
// related information. | |||
type network struct { | |||
ifacesLock sync.Mutex | |||
ifaces []*pb.Interface | |||
ifaces []pb.Interface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please change to use a map for it. It has a few advantages:
- suitable for delete within a loop
- ensuring each interface is unique
network.go
Outdated
|
||
routesLock sync.Mutex | ||
routes []*pb.Route | ||
routes []pb.Route |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to use a map here as well, by introducing a name/id field for each route rule. Then we can delete the rule in a loop safely. But it's ok if you want to defer it to a later PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is in the scope for another PR here, because we don't have such thing as route ID, and we actually check for the entire route fields to detect if this is the same route.
By storing the network information such as routes as pointers, we end up having issues if we try to remove some elements from the list as at the same time this list is parsed through by a loop. For interfaces, we can use maps instead, because we have a way to keep track of them based on their names. Maps are safer and allow to keep pointers in this case. In both cases, routes and interfaces list can be modified properly while we loop over every items. Fixes kata-containers#226 Signed-off-by: Sebastien Boeuf <[email protected]>
@bergwolf PR updated PTAL |
@egernst PTAL |
@egernst PR updated. |
@sboeuf - I think for paranoia sake it would be wise to keep just the interface removal part (I don't see need for removing routes/dns). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we still remove the interfaces? In the event that the VM disconnects the network first (not necessarily the host side), I'd rather explicitly shut it down so we know when the packets stop sending out of the container...
@egernst I have updated the PR. I am no longer removing the code removing the interfaces. |
@sboeuf - yes, please remove that as well. LGTM otherwise - I'm happy with merging one you make this update. |
@egernst done ! |
The network routes will always be removed when the associated interfaces are going to be removed. Same thing for DNS. It is safer to leave this to each interface removal. Fixes kata-containers#230 Signed-off-by: Sebastien Boeuf <[email protected]>
@grahamwhaley PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
This PR addresses the issues related to the way the agent currently tears down the network.
First it fixes an issue due to a modification of a list containing pointers instead of copies, while at the same time, this list is being parsed.
Then, with the idea of simplifying the way the network is removed, it introduces a commit removing the network removal code that was called when the sandbox was destroyed. Indeed, such a case means the VM is going to be shut down, leading to the removal of the network by the VM anyway.
Fixes #226
Fixes #230
Signed-off-by: Sebastien Boeuf [email protected]