fix: continue monitoring if unhandled Exception is thrown #676

Merged
crystall-bitquill merged 1 commit into aws:main from issue-675
Oct 26, 2023

Conversation

crystall-bitquill
Contributor

@crystall-bitquill crystall-bitquill commented Oct 10, 2023

Summary

Add additional logging for EFM (Enhanced Failure Monitoring).

Description

  • Adds additional logging in the MonitorImpl class for previously unhandled exceptions and for scenarios where monitoring has stopped but the startMonitoring method was called again.
  • Changes the monitoring loop to continue running if an unhandled exception is thrown.

Related to #675

Additional Reviewers

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

} finally {
  if (this.monitoringConn != null) {
    try {
      this.monitoringConn.close();
    } catch (final SQLException ex) {
      // ignore
      LOGGER.warning(
Contributor

Hmm I'm not sure this message adds extra clarity.

@@ -94,6 +94,9 @@ public MonitorImpl(

@Override
public void startMonitoring(final MonitorConnectionContext context) {
  if (this.stopped) {
    LOGGER.warning(() -> Messages.get("MonitorImpl.monitorIsStopped"));
Contributor

@sergiyvamz sergiyvamz Oct 10, 2023

Can you add monitoring host name to the message?
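
A minimal sketch of how the guard could carry the host name in its warning. Everything here is hypothetical and simplified: `StoppedGuardSketch`, `pendingContexts`, and the boolean return value are illustrative names, the message is built with plain string concatenation rather than `Messages.get`, and the early return is an assumption of the sketch (the excerpt above only shows the warning being logged).

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.logging.Logger;

// Hypothetical, simplified stand-in for a monitor; not the project's MonitorImpl.
class StoppedGuardSketch {
  private static final Logger LOGGER =
      Logger.getLogger(StoppedGuardSketch.class.getName());

  private final String host;
  private final Queue<Object> newContexts = new ConcurrentLinkedQueue<>();
  private volatile boolean stopped = false;

  StoppedGuardSketch(final String host) {
    this.host = host;
  }

  void stop() {
    this.stopped = true;
  }

  // Returns true if the context was queued, false if the monitor had already stopped.
  boolean startMonitoring(final Object context) {
    if (this.stopped) {
      // Including the host makes it clear which monitor rejected the context.
      LOGGER.warning(() -> "Monitoring has already stopped for node " + this.host + ".");
      return false; // assumption: the sketch refuses contexts nobody will consume
    }
    this.newContexts.add(context);
    return true;
  }

  int pendingContexts() {
    return this.newContexts.size();
  }
}
```

With a guard like this, a context submitted after `stop()` is reported with its host and never reaches the queue, so the queue cannot grow without a consumer.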

@davecramer
Contributor

Do we have an issue to refer to that prompted this change?

@crystall-bitquill
Contributor Author

> Do we have an issue to refer to that prompted this change?

Yes, #675. I've updated the PR description.

@@ -188,6 +188,9 @@ MonitorThreadContainer.emptyNodeKeys=Provided node keys are empty.

# Monitor Impl
MonitorImpl.contextNullWarning=Parameter 'context' should not be null.
MonitorImpl.interruptedExceptionDuringMonitoring=Monitoring thread for node {0} was interrupted: {1}
MonitorImpl.exceptionDuringMonitoring=Unhandled exception in monitoring thread for node {0}: {1}
MonitorImpl.monitorIsStopped=Monitoring has already stopped for node {0}.
Contributor

NIT: Monitor was or is already stopped?
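
The `{0}` and `{1}` placeholders in the properties above follow the `java.text.MessageFormat` pattern syntax. As a rough illustration, assuming the message helper passes the pattern and its arguments to `MessageFormat` (an assumption; the helper's implementation is not shown in this PR), the substitution works like this. `MessageFormatSketch` and the host name are hypothetical.

```java
import java.text.MessageFormat;

// Hypothetical helper showing how {0}/{1} placeholders are filled in.
class MessageFormatSketch {
  static String format(final String pattern, final Object... args) {
    return MessageFormat.format(pattern, args);
  }

  public static void main(final String[] args) {
    final String stopped = "Monitoring has already stopped for node {0}.";
    // prints: Monitoring has already stopped for node db-host-1.
    System.out.println(format(stopped, "db-host-1"));

    final String unhandled = "Unhandled exception in monitoring thread for node {0}: {1}";
    // prints: Unhandled exception in monitoring thread for node db-host-1: boom
    System.out.println(format(unhandled, "db-host-1", "boom"));
  }
}
```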

      () -> Messages.get(
          "MonitorImpl.interruptedExceptionDuringMonitoring",
          new Object[] {this.hostSpec.getHost(), intEx.getMessage()}));
} catch (final Exception ex) {
Contributor

Are we expecting any other kind of exception? (We did not catch it beforehand.)

Contributor Author

We're adding extra logging in case an exception we hadn't accounted for stops this method and goes undetected. This was prompted by the OOM error that was noticed after too many context objects had been added to the newContexts queue.

          new Object[] {this.hostSpec.getHost(), intEx.getMessage()}));
} catch (final Exception ex) {
  // do nothing; exit thread
  LOGGER.warning(
Contributor

I think we need to change the code so that such unhandled exceptions are logged but the monitoring thread/loop keeps running.
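
One possible shape for that behaviour, as a minimal sketch rather than the actual MonitorImpl change: catch unexpected exceptions inside the loop body so the loop survives them, and let only `InterruptedException` end the loop. `ResilientLoopSketch`, `Step`, and `runLoop` are hypothetical names, and the bounded iteration count exists only so the example terminates.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch of a loop that logs unexpected exceptions and keeps going.
final class ResilientLoopSketch {
  private static final Logger LOGGER =
      Logger.getLogger(ResilientLoopSketch.class.getName());

  // One unit of monitoring work; may fail in ways the loop did not anticipate.
  interface Step {
    void run(int iteration) throws Exception;
  }

  // Runs up to maxIterations steps and returns how many failed with an
  // unexpected exception. Stops early only if a step is interrupted.
  static int runLoop(final Step step, final int maxIterations) {
    int failures = 0;
    try {
      for (int i = 0; i < maxIterations; i++) {
        try {
          step.run(i);
        } catch (final InterruptedException intEx) {
          throw intEx; // interruption is the signal to exit the loop
        } catch (final Exception ex) {
          failures++;
          // Log and continue: one bad iteration should not kill the monitor.
          LOGGER.log(Level.WARNING, "Unhandled exception in monitoring loop; continuing", ex);
        }
      }
    } catch (final InterruptedException intEx) {
      Thread.currentThread().interrupt(); // restore the flag for the caller
      LOGGER.warning("Monitoring loop was interrupted: " + intEx.getMessage());
    }
    return failures;
  }
}
```

The inner `try` sits inside the loop on purpose. With the `catch` outside the loop, the first unexpected exception would still end monitoring, which is the silent failure this PR sets out to avoid.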

@crystall-bitquill crystall-bitquill changed the title from "chore: add additional logging for efm" to "fix: continue monitoring unless InterruptedException is thrown" Oct 17, 2023

    this.activeContexts.add(monitorContext);
    break;
  }
  synchronized (monitorContext) {
Contributor

Does this really have to be synchronized? It is a local variable

Contributor

It's a local variable, but we add it to queues in MonitorImpl, which runs in a separate thread. The monitoring thread may change the internal state of the context. The idea was to synchronize on it to avoid multi-threading collisions. It seems this intent isn't properly implemented here. We need to address it in a separate PR.
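
To illustrate the general point (not the project's actual classes): a reference held in a local variable still points to a shared object once it has been handed to another thread through a queue, so both threads need to lock on the same object when they touch its state. `SharedContextSketch`, `Context`, and `failureCount` are hypothetical names for this sketch.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: a "local" object becomes shared once it is queued.
final class SharedContextSketch {
  // Stand-in for a monitoring context with mutable internal state.
  static final class Context {
    private int failureCount = 0;
  }

  static int demo() {
    final Context ctx = new Context(); // local variable in the caller...
    final BlockingQueue<Context> queue = new LinkedBlockingQueue<>();
    queue.add(ctx); // ...but now reachable from the monitoring thread

    final Thread monitor = new Thread(() -> {
      try {
        final Context shared = queue.take();
        for (int i = 0; i < 1000; i++) {
          synchronized (shared) { // same lock object as the caller uses
            shared.failureCount++;
          }
        }
      } catch (final InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    monitor.start();

    for (int i = 0; i < 1000; i++) {
      synchronized (ctx) {
        ctx.failureCount++;
      }
    }

    try {
      monitor.join();
    } catch (final InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    synchronized (ctx) {
      return ctx.failureCount; // 2000: no increments are lost
    }
  }
}
```

Without the two `synchronized` blocks the increments could interleave and some updates could be lost. The lock only helps if every thread that reads or writes the state synchronizes on the same object.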

Contributor

Why not fix it in this PR?

Contributor Author

After some investigation, it seems like the current implementation is working and a fix isn't required.

@crystall-bitquill crystall-bitquill changed the title from "fix: continue monitoring unless InterruptedException is thrown" to "fix: continue monitoring if unhandled Exception is thrown" Oct 24, 2023
chore: add additional logging for efm
@crystall-bitquill crystall-bitquill merged commit 87ec172 into aws:main Oct 26, 2023
5 checks passed
@crystall-bitquill crystall-bitquill deleted the issue-675 branch October 26, 2023 20:49
4 participants