-
Notifications
You must be signed in to change notification settings - Fork 616
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
config: make docker health check configurable #1624
Conversation
agent/tcs/handler/types.go
Outdated
@@ -50,3 +51,10 @@ func (params *TelemetrySessionParams) time() ttime.Time { | |||
}) | |||
return params._time | |||
} | |||
|
|||
func (params *TelemetrySessionParams) isMetricsDisabled() (bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function name makes me a little bit confusing. It is called isMetricsDisabled, but when DisableMetrics flag is true, the function will not return true. Do you want to say MetricsSessionDisabled or something? Maybe change the function name and add some description as comments here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair. changed to isContainerHealthMetricsDisabled
since checking against both DisabledMetrics && DisableDockerHealthCheck
result: true, | ||
err: nil, | ||
description: "both telemetry and health metrics were disable should return true", | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it in golang, for && operator, if the first one is false, it will not proceed to check the value of second parameter? Since seems you miss a case, when DisableMetrics is false and DisableDockerHealthCheck is true.
gomock.Any()).AnyTimes().Return([]string{}, nil) | ||
|
||
ctx, cancel := context.WithCancel(context.TODO()) | ||
// Cancel the context to cancel async routines |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
without it we end up with flakey tests that have run away go routines. maybe you've seen some errors similar to "goroutine ended after test" kind of thing. please correct me if i'm wrong though.
capabilities, err := agent.capabilities() | ||
require.NoError(t, err) | ||
|
||
capMap := make(map[string]bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we can use assert.NotContains instead of these
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh cool, changed.
agent/tcs/handler/handler.go
Outdated
return | ||
} | ||
if ok { | ||
seelog.Warnf("Metrics were disabled, not start the telemetry session") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: start -> starting
@@ -73,6 +73,19 @@ func (*mockStatsEngine) GetTaskHealthMetrics() (*ecstcs.HealthMetadata, []*ecstc | |||
return nil, nil, nil | |||
} | |||
|
|||
// TestDisableMetrics tests the StartMetricsSession will return immediately if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just wondering what is the result of breaking the logic we test here? it seems that in that case we will just not return immediately, but i'm not sure whether the test will fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hrm yea this is a weird one, this looks like it needs a StartMetricsSession
refactor to be able to have more sensible testing. we can track this as tech debt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add a TODO for this
ok, err := tc.param.isMetricsDisabled() | ||
if tc.err != nil { | ||
assert.Error(t, err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
else we should assert no error?
|
||
func (params *TelemetrySessionParams) isMetricsDisabled() (bool, error) { | ||
if params.Cfg != nil { | ||
return params.Cfg.DisableMetrics && params.Cfg.DisableDockerHealthCheck, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for understanding, is DisableMetrics
something that we have introduced previously but never used? Would it be better to add a description in the PR summary for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DisableMetrics
maps to ECS_DISABLE_METRICS
and is already covered in the README. this PR is to add a toggle for the health checks, which relies on the underlying tcs/metrics subsystem.
thanks for taking a pass at this, will make changes and need to update the test for this actually as well. |
72f1713
to
877a310
Compare
@@ -73,6 +73,19 @@ func (*mockStatsEngine) GetTaskHealthMetrics() (*ecstcs.HealthMetadata, []*ecstc | |||
return nil, nil, nil | |||
} | |||
|
|||
// TestDisableMetrics tests the StartMetricsSession will return immediately if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add a TODO for this
windows ftest failing:
|
merging this to |
Summary
carried over from #1522
Implementation details
Testing
make release
)go build -out amazon-ecs-agent.exe ./agent
)make test
) passgo test -timeout=25s ./agent/...
) passmake run-integ-tests
) pass.\scripts\run-integ-tests.ps1
) passmake run-functional-tests
) pass.\scripts\run-functional-tests.ps1
) passNew tests cover the changes:
Description for the changelog
Licensing
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.