check that StartTransientUnit/StopUnit succeeds #2331
@kolyshkin PTAL
613c712 to 28fc3f3
Thank you for working on that! Left a couple of nits.
Can you also add the same code (wait for the operation to complete, check it's "done", return an error if not) to the Destroy() method? Currently it does not even wait.
Oh, finally, could you please wait until #2299 is merged and then rebase? It is mostly code refactoring in there.
@lifubang have you tested that the fix works? I have described the repro in detail here: containers/crun#331 (comment); it should take 10-15 minutes to check (mostly waiting for vagrant up).
OK
I have tested it in VMware.
bb47d24 to c0b435d
I added it in #2329.
Needs rebase
c0b435d to 3edc556
77d1669 to b47cad6
2aa85d6 to a5d5d89
libcontainer/cgroups/systemd/v2.go
Outdated
select {
case s := <-statusChan:
	if DbusJobCompletionResultType(s) != DbusJobDone {
		logrus.Warnf("error stopping unit `%s`: got `%s`. Continuing...", unitName, s)
I think s/stopping/removing/ is better.
done!
const (
	DbusJobDone     DbusJobCompletionResultType = "done"
	DbusJobCanceled DbusJobCompletionResultType = "canceled"
Are you using any of those (I mean, except for DbusJobDone)?
Removed now!
@@ -11,6 +11,26 @@ import (
	"github.com/opencontainers/runc/libcontainer/configs"
)

// Refer to https://github.com/coreos/go-systemd/blob/5a0db84d3dc459ccdc6ffcc44b1c452bf9f171cb/dbus/methods.go#L78-L101
// or https://godoc.org/github.com/coreos/go-systemd/dbus#Conn.StartUnit
These two are the same: one is the source code, the other is the godoc extracted from it.
And we only need "done"; let's not overcomplicate things.
Done
Do we need to add this in v1 as well?
If yes, I'll add a const string DbusJobDone.
8fcd201 to ac366bc
@kolyshkin this code can also fix #2344.
libcontainer/cgroups/systemd/v2.go
Outdated
// Please refer to https://godoc.org/github.com/coreos/go-systemd/dbus#Conn.StartUnit
if s != "done" {
	logrus.Warnf("error removing unit `%s`: got `%s`. Continuing...", unitName, s)
}
Can the channel be closed here?
I think we should not close the channel, because if someone sends a string to the closed channel, it will cause a goroutine panic. That said, I guess nobody would send anything to this channel.
I checked the github.com/coreos/go-systemd/dbus code.
It looks like once it sends a string over the channel, it forgets about it. So yes, we should close the channel, unless we hit the timeout!
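A minimal sketch of the rule discussed above: close the channel only after a result has been received, and leave it open on timeout so that a late send cannot panic. The helper name waitForJob and the constant demoTimeout are hypothetical and are not taken from the PR; the channel is filled by hand here instead of by go-systemd.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// demoTimeout is a hypothetical value used only for this illustration.
const demoTimeout = 50 * time.Millisecond

// waitForJob (hypothetical name) waits for a single job result on statusChan.
// It assumes the sender delivers exactly one value and then drops the channel.
func waitForJob(statusChan chan string, timeout time.Duration) error {
	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case s := <-statusChan:
		// The only expected value has arrived, so nobody should send again
		// and closing the channel is safe.
		close(statusChan)
		if s != "done" {
			return fmt.Errorf("job finished with result %q", s)
		}
		return nil
	case <-timer.C:
		// Do not close here: the sender may still deliver its result later,
		// and a send on a closed channel panics.
		return errors.New("timed out waiting for job result")
	}
}

func main() {
	finished := make(chan string, 1)
	finished <- "done"
	fmt.Println("finished job:", waitForJob(finished, demoTimeout))

	silent := make(chan string, 1)
	fmt.Println("silent job:", waitForJob(silent, demoTimeout))
}
```

The channel is buffered with capacity 1 so that a late result after a timeout does not block the sender.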
finished now
So, I guess this code should be moved to
0da17ed to 59912cd
Both v1 and v2 are using the same
LGTM too early :)
@@ -99,3 +102,51 @@ func isUnitExists(err error) bool {
	}
	return false
}

func startJob(unitName string, properties []systemdDbus.Property) error {
s/startJob/startUnit/
Job is a very misleading term here.
	return nil
}

func stopJob(unitName string) error {
s/stopJob/stopUnit/
Nit wrt function naming. Otherwise, it's perfecto!
59912cd to 2b91a2d
needs rebase
2b91a2d to b0fffaa
@kolyshkin LGTY?
Signed-off-by: lifubang <[email protected]>
b0fffaa to bfa1b2a
LGTM
fix #2313 (related to #2310) and #2344
To fix #2313
We can check whether StartTransientUnit succeeds or not with the value from the channel. If we get "failed", we should run ResetFailedUnit and return the error.
The error "ERRO[0000] cannot detect unified path" relates to #2329.
To fix #2344
We should wait until StopUnit succeeds.
Signed-off-by: lifubang [email protected]
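The approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in this PR: the unitManager interface and fakeManager type are hypothetical stand-ins with simplified signatures (the real go-systemd StartTransientUnit also takes a properties slice and returns a job ID), and no timeout is applied while waiting.

```go
package main

import "fmt"

// unitManager is a hypothetical, trimmed-down stand-in for the systemd
// D-Bus connection; the real go-systemd methods take more parameters.
type unitManager interface {
	StartTransientUnit(name, mode string, ch chan<- string) error
	StopUnit(name, mode string, ch chan<- string) error
	ResetFailedUnit(name string) error
}

// startUnit starts the unit and checks the job result from the channel.
func startUnit(m unitManager, name string) error {
	statusChan := make(chan string, 1)
	if err := m.StartTransientUnit(name, "replace", statusChan); err != nil {
		return err
	}
	if s := <-statusChan; s != "done" {
		// Clear the failed state before reporting the error. Ignoring the
		// reset error is a choice made for this sketch only.
		_ = m.ResetFailedUnit(name)
		return fmt.Errorf("error creating systemd unit %q: got %q", name, s)
	}
	return nil
}

// stopUnit waits until the stop job reports its result.
func stopUnit(m unitManager, name string) error {
	statusChan := make(chan string, 1)
	if err := m.StopUnit(name, "replace", statusChan); err != nil {
		return err
	}
	if s := <-statusChan; s != "done" {
		return fmt.Errorf("error stopping systemd unit %q: got %q", name, s)
	}
	return nil
}

// fakeManager is a test double that reports a fixed job result.
type fakeManager struct {
	result     string
	resetCalls int
}

func (f *fakeManager) StartTransientUnit(_, _ string, ch chan<- string) error {
	ch <- f.result
	return nil
}

func (f *fakeManager) StopUnit(_, _ string, ch chan<- string) error {
	ch <- f.result
	return nil
}

func (f *fakeManager) ResetFailedUnit(string) error {
	f.resetCalls++
	return nil
}

func main() {
	good := &fakeManager{result: "done"}
	fmt.Println("start (done):", startUnit(good, "demo.scope"))

	bad := &fakeManager{result: "failed"}
	fmt.Println("start (failed):", startUnit(bad, "demo.scope"))
	fmt.Println("reset calls:", bad.resetCalls)
}
```

Hiding the connection behind a small interface keeps the sketch runnable without a systemd instance; in real code the calls would go to the go-systemd dbus connection.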