Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: the participant_id column throws NumericValueOutOfRange error #1897

Closed
3 of 4 tasks
cris-oddball opened this issue Jul 30, 2024 · 8 comments
Closed
3 of 4 tasks

Comments

@cris-oddball
Copy link

cris-oddball commented Jul 30, 2024

Description

While validating the lambda dependency upgrade ticket in Dev and exercising all lambdas, we started seeing catastrophic alerts on the MPI Lookup lambda due to a participant_id that throws a NumericValueOutOfRange error.

  • Ticket is understood, and QA has been contacted (if the ticket has a QA label).

The root cause is seen in this log and traceback:

[DEBUG] 2024-07-30T16:35:48.355Z 893d5bd3-58aa-55ed-964b-04f040ee50c8 Persisting item with participant_id: 9100787210, va_profile_id: 7105

[ERROR] NumericValueOutOfRange: integer out of range

Traceback (most recent call last):
File "./python/lib/python3.10/site-packages/datadog_lambda/wrapper.py", line 232, in call
File "/var/task/bip_msg_mpi_lookup_lambda.py", line 99, in bip_msg_mpi_lookup_handler
def bip_msg_mpi_lookup_handler(event: dict, context=None) -> dict:
File "./python/lib/python3.10/site-packages/ddtrace/contrib/aws_lambda/patch.py", line 120, in call
File "/var/task/bip_msg_mpi_lookup_lambda.py", line 128, in bip_msg_mpi_lookup_handler
return_value = asyncio.run(bip_msg_mpi_lookup_async_handler(records, db_connection))
File "/var/lang/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/var/lang/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "./python/lib/python3.10/site-packages/ddtrace/contrib/asyncio/patch.py", line 50, in traced_coro
File "/var/task/bip_msg_mpi_lookup_lambda.py", line 185, in bip_msg_mpi_lookup_async_handler
persist_row(participant_id, va_profile_id, db_connection)
File "/var/task/bip_msg_mpi_lookup_lambda.py", line 290, in persist_row
c.execute(VA_PROFILE_UPDATE_QUERY, (participant_id, va_profile_id))
File "/opt/python/lib/python3.10/site-packages/ddtrace/contrib/dbapi/init.py", line 156, in execute
return self._trace_method(
File "./python/lib/python3.10/site-packages/ddtrace/contrib/psycopg/cursor.py", line 14, in _trace_method
File "/opt/python/lib/python3.10/site-packages/ddtrace/contrib/dbapi/init.py", line 125, in _trace_method
return method(*args, **kwargs)

Additionally, this is apparently being retried, which should not happen.
Screenshot 2024-07-30 at 11.06.27 AM.jpg

Steps to Reproduce

  1. Run the ETL lambda in Dev, which will trigger the MPI lambda
  2. Observe the logs as shown above (the exception and the retries)

Impact/Urgency

HIGH - we are expected to launch in August and this bug will prevent certain veterans with participant-ids that are out of range from receiving sms notifications.

Expected Behavior

  • When the ETL lambda is run and triggers the MPI lambda, upon an insert attempt, Postgres will accept the size of the participant_id and not throw an out of range error
  • If an out of range error is thrown, it is properly handled and not retried This will be addressed in #175

QA Considerations

  • work with Dev on validating in dev and perf

Additional Info & Resources

Integer in Postgres is -2147483648 to +2147483647
This column should be a bigint - see documentation
As should the VA Profile ID

Line where the code should likely be changed is here.

@npmartin-oddball
Copy link

@MackHalliday
Copy link

MackHalliday commented Aug 2, 2024

elt collects va profile -> gives to MPI looking -> gets back large VA profile id

DEV Does have test data
PERF does not have test data at the moment -> Can we generate more data ? " kyle wants the vaprofle column to also be bigint since he just learned that the vaprofile IDs can get that high.i think the only way to test this is to log into the jumpbox and try to insert a number that big in both columns."

@cris-oddball
Copy link
Author

elt collects va profile -> gives to MPI looking -> gets back large VA profile id

it's elt collects va profile -> gives to MPI looking -> gets back large participant_id

regardless, both need to be bigint (vaprofile id and participant id)

@MackHalliday
Copy link

Friday

  • Add migration, issues with running migrations local. Seem to be unran migrations
  • Looked at Kafka repo for throwing error for

Today

  • Put in PR today

@MackHalliday
Copy link

Rolling over to next sprint

This was pulled in on Friday. It's rolling over because it's slow to test the migrations and also we decided to delete data if the data is not backwards compatible.

@MackHalliday
Copy link

Yesterday

  • Worked on downgrade functionality and using the DB Downgrade Github Action
  • Updated documentation on DB Downgrade Github Action

Today

  • Submitted PR and awaiting PR approvals

@cris-oddball
Copy link
Author

PR approved and merged.

Will run the MPI Lookup when deployed to Perf just to ensure that no regressions were introduced.

@cris-oddball
Copy link
Author

Regression passed in Perf and MPI Lookup Lambda functions as expected. Closing ticket and sending to Prod.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants