Increase resiliency of logging alerts #389
Comments
In the past, we used generic exceptions to record errors to New Relic. This creates custom exceptions for producing and consuming so that we can query New Relic on custom exceptions instead of relying on log messages. edx/edx-arch-experiments#389
I decided against changing the exception handling in |
We did complete this ticket as written (I think), but it turns out that we are still using the logging alerts. Those alerts should key off a tag, or something else other than the error message, so that they keep firing when the message text changes. We may want to create a new ticket for this. Note: I'm also closing this ticket, which was marked as Done but not closed.
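To illustrate the "tag" idea, here is a minimal sketch using only the standard `logging` module. The attribute name `alert_tag`, the constant `PRODUCING_ERROR_TAG`, and the helper `log_producing_error` are hypothetical names chosen for this example, not anything that exists in the codebase. Whether such an attribute reaches New Relic as a queryable field depends on how log forwarding is configured, which is an assumption here.

```python
import logging


class ListHandler(logging.Handler):
    """Collects log records in memory so the attached tag can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("event_bus_sketch")
logger.setLevel(logging.ERROR)
logger.propagate = False  # keep the demo records out of the root logger
handler = ListHandler()
logger.addHandler(handler)

# Hypothetical stable identifier that an alert could match on.
PRODUCING_ERROR_TAG = "event_bus_producing_error"


def log_producing_error(message):
    """Log an error whose wording may change while the tag stays fixed."""
    # Keys passed via `extra` become attributes on the LogRecord.
    logger.error(message, extra={"alert_tag": PRODUCING_ERROR_TAG})
```

An alert written against the tag would not need updating if the message wording changed again, as long as the tag itself is treated as a stable contract.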
We had an issue where we were using the following query for an event bus alert:

SELECT * FROM Log WHERE message RLIKE r'Error producing event to event bus.*' LIMIT MAX

The error message had presumably since changed to "Error delivering message to Kafka event bus", so the query no longer matched. This stopped alerts from firing.

A/C:
- The producing alert should query `TransactionError` off of an `error.class` that is custom for producing errors.
- The consuming alert should query `TransactionError` off of an `error.class` that is custom for event consumption errors.

Implementation Details:
- Add a `ProducingException` (or better name) exception class.
- Update `record_producing_error` to reraise the previous error as a `ProducingException` exception.
- `poll_indefinitely` has a call to `record_exception` that should be refactored to use `record_producing_error` instead.
- Add a `ConsumingException` (or better name) exception class.
- Update `record_event_consuming_error` to reraise the previous error as a `ConsumingException` error.
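The implementation details above could be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: the `(error, context)` signatures are guesses, and `record_exception` here is a local stand-in that appends to a list instead of calling a monitoring backend. The point is only that re-raising with `raise ... from error` gives the recorded exception a dedicated class while keeping the original error as its cause.

```python
import sys

# Stand-in for what a monitoring backend would receive (hypothetical).
RECORDED = []


class ProducingException(Exception):
    """Marks an error that occurred while producing an event."""


class ConsumingException(Exception):
    """Marks an error that occurred while consuming an event."""


def record_exception():
    """Hypothetical stand-in: record the exception currently being handled."""
    exc_type, exc, _tb = sys.exc_info()
    cause = type(exc.__cause__).__name__ if exc.__cause__ else None
    RECORDED.append((exc_type.__name__, str(exc), cause))


def record_producing_error(error, context):
    """Re-raise `error` as a ProducingException, then record it."""
    try:
        raise ProducingException(context) from error
    except ProducingException:
        record_exception()


def record_event_consuming_error(error, context):
    """Re-raise `error` as a ConsumingException, then record it."""
    try:
        raise ConsumingException(context) from error
    except ConsumingException:
        record_exception()
```

With something like this in place, an alert can filter on the exception class rather than on a message regex. The exact `error.class` value to match depends on how the monitoring agent names the class, so it should be checked against real recorded data before the alert is written.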