Skip to content
This repository has been archived by the owner on May 12, 2021. It is now read-only.

agent: Simplify tearing down the network #225

Merged
merged 2 commits into from
May 1, 2018

Conversation

sboeuf
Copy link

@sboeuf sboeuf commented Apr 26, 2018

This PR addresses the issues related to the way the agent currently tears down the network.
First it fixes an issue due to a modification of a list containing pointers instead of copies, while at the same time, this list is being parsed.

Then, with the idea of simplifying the way the network is removed, it introduces a commit removing the network removal code that was called when the sandbox was destroyed. Indeed, such a case means the VM is going to be shut down, leading to the removal of the network by the VM anyway.

Fixes #226
Fixes #230

Signed-off-by: Sebastien Boeuf [email protected]

@sboeuf sboeuf added the review label Apr 26, 2018
@sboeuf sboeuf force-pushed the fix_k8s_routes branch 2 times, most recently from 1a653c9 to de7a8c0 Compare April 26, 2018 22:14
network.go Outdated
for _, route := range s.network.routes {
if err := s.removeRoute(netHandle, route); err != nil {
routeList = append(routeList, *route)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sboeuf Where else in the code are we updating this list?
Why not just store objects instead of pointers in the list? We can always modify the objects itself when a slice is passed to a function.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This involves a bigger change regarding how things are handled if we move from pointer to object. I am fine with any solution but I want some feedback before to go ahead and implement it.
@bergwolf @jodh-intel @egernst @grahamwhaley PTAL

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because a slice is not really a good data structure for dynamical element removal, - you cannot avoid data copy if you want to remove within a loop.

OTOH, if we add an ID to pb.Route to uniquely identify a route for a sandbox, we can use a map for it and it is perfectly safe to do

for _, r := range routes {
     delete(routes, r.ID)
}

The same applies to pb.Interface where we can use interface.Name to uniquely identify an interface and thus use a map instead of a slice to store it.

@codecov
Copy link

codecov bot commented Apr 27, 2018

Codecov Report

Merging #225 into master will increase coverage by 0.32%.
The diff coverage is 50%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #225      +/-   ##
=========================================
+ Coverage   40.88%   41.2%   +0.32%     
=========================================
  Files          12      12              
  Lines        1815    1803      -12     
=========================================
+ Hits          742     743       +1     
+ Misses        970     958      -12     
+ Partials      103     102       -1
Impacted Files Coverage Δ
grpc.go 20.64% <0%> (-0.05%) ⬇️
network.go 49.16% <66.66%> (+2.92%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 74e720e...4bcacdc. Read the comment docs.

@sboeuf sboeuf changed the title agent: network: Don't loop over a list being potentially modified agent: Simplify tearing down the network Apr 27, 2018
@sboeuf
Copy link
Author

sboeuf commented Apr 27, 2018

@egernst @amshinde I have simplified the network removal, PTAL.

Copy link
Member

@amshinde amshinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@sboeuf
Copy link
Author

sboeuf commented Apr 28, 2018

@bergwolf @grahamwhaley PTAL

network.go Outdated
@@ -25,10 +25,10 @@ import (
// related information.
type network struct {
ifacesLock sync.Mutex
ifaces []*pb.Interface
ifaces []pb.Interface
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned in https://github.com/kata-containers/agent/pull/225/files#r184835917, we can use pb.Interface.Name to uniquely identify an interface and thus use a map instead of a slice to store it. Then it is perfectly safe to call delete in a loop.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way is fine with me. Do you really want that we introduce maps here, or you're okay with this way ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change to use a map for it. It has a few advantages:

  • suitable for delete within a loop
  • ensuring each interface is unique

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, it works in this case because the name is already what we use to identify what we need to remove.

grpc.go Outdated
@@ -892,10 +892,6 @@ func (a *agentGRPC) DestroySandbox(ctx context.Context, req *pb.DestroySandboxRe
}
a.sandbox.Unlock()

if err := a.sandbox.removeNetwork(); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the same logic, can we simplify DestroySandbox by removing removeMounts and teardownSharedPidNs as well? Assuming that we do not reuse a sandbox after destroying it, the only thing must being done here is necessary pre-condition check and syscall.Sync().

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to discuss about this, but if it's done, it's gonna be done in a follow up PR. This PR is more about the network.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agreed.

grpc.go Outdated
@@ -892,10 +892,6 @@ func (a *agentGRPC) DestroySandbox(ctx context.Context, req *pb.DestroySandboxRe
}
a.sandbox.Unlock()

if err := a.sandbox.removeNetwork(); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agreed.

network.go Outdated
@@ -25,10 +25,10 @@ import (
// related information.
type network struct {
ifacesLock sync.Mutex
ifaces []*pb.Interface
ifaces []pb.Interface
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change to use a map for it. It has a few advantages:

  • suitable for delete within a loop
  • ensuring each interface is unique

network.go Outdated

routesLock sync.Mutex
routes []*pb.Route
routes []pb.Route
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to use a map here as well, by introducing a name/id field for each route rule. Then we can delete the rule in a loop safely. But it's ok if you want to defer it to a later PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is in the scope for another PR here, because we don't have such thing as route ID, and we actually check for the entire route fields to detect if this is the same route.

By storing the network information such as routes as pointers, we
end up having issues if we try to remove some elements from the list
as at the same time this list is parsed through by a loop.

For interfaces, we can use maps instead, because we have a way to
keep track of them based on their names. Maps are safer and allow
to keep pointers in this case.

In both cases, routes and interfaces list can be modified properly
while we loop over every items.

Fixes kata-containers#226

Signed-off-by: Sebastien Boeuf <[email protected]>
@sboeuf
Copy link
Author

sboeuf commented Apr 30, 2018

@bergwolf PR updated PTAL

@sboeuf
Copy link
Author

sboeuf commented Apr 30, 2018

@egernst PTAL

@sboeuf
Copy link
Author

sboeuf commented Apr 30, 2018

@egernst PR updated.

@egernst
Copy link
Member

egernst commented Apr 30, 2018

@sboeuf - I think for paranoia sake it would be wise to keep just the interface removal part (I don't see need for removing routes/dns).

Copy link
Member

@egernst egernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we still remove the interfaces? In the event that the VM disconnects the network first (not necessarily the host side), I'd rather explicitly shut it down so we know when the packets stop sending out of the container...

@sboeuf
Copy link
Author

sboeuf commented Apr 30, 2018

@egernst I have updated the PR. I am no longer removing the code removing the interfaces.
Do you want me to remove the code removing DNS ?

@egernst
Copy link
Member

egernst commented Apr 30, 2018

@sboeuf - yes, please remove that as well. LGTM otherwise - I'm happy with merging one you make this update.

@sboeuf
Copy link
Author

sboeuf commented Apr 30, 2018

@egernst done !

@sboeuf sboeuf force-pushed the fix_k8s_routes branch from 013072d to 254aa22 Compare May 1, 2018 04:01
The network routes will always be removed when the associated
interfaces are going to be removed. Same thing for DNS.  It is
safer to leave this to each interface removal.

Fixes kata-containers#230

Signed-off-by: Sebastien Boeuf <[email protected]>
@sboeuf sboeuf force-pushed the fix_k8s_routes branch from 254aa22 to 4bcacdc Compare May 1, 2018 04:18
@sboeuf
Copy link
Author

sboeuf commented May 1, 2018

@grahamwhaley PTAL

Copy link
Contributor

@grahamwhaley grahamwhaley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@grahamwhaley grahamwhaley merged commit c1af46a into kata-containers:master May 1, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
5 participants