Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade api to Python 3.12 #465

Open
4 of 8 tasks
sastels opened this issue Nov 25, 2024 · 21 comments
Open
4 of 8 tasks

Upgrade api to Python 3.12 #465

sastels opened this issue Nov 25, 2024 · 21 comments
Assignees
Labels
Reliability Task related to reliability. Security | Sécurité Tech Debt An issue targeting an identified technical debt

Comments

@sastels
Copy link

sastels commented Nov 25, 2024

Description

As a software developer working on GCNotify stack,
I want the Python stack to be upgraded at latest supported version,
So that I can leverage the latest features of the language and better/safer code.

As a system ops working on GCNotify stack,
I want the Python stack to be upgraded at latest supported version,
So that I can leverage the latest performance improvement
And process notifications faster.

WHY are we building?

To get latest performance improvements, features and security updates of the Python language/runtime.

WHAT are we building?

Upgrade the stack to the latest possible version of Python. A few considerations:

  • It is possible that we cannot upgrade to the latest of the latest (3.12 at the time of this writing for the stable version), as this might be blocked by a dependency library that did not release a version supporting the latest. For example, we might have to upgrade from version 3.10 to version 3.11, and wait a bit longer to migrate to version 3.12.
  • The lambdas might have additional considerations that might limit the possible maximum Python version we can use, such as supported runtime and also monitoring libraries that need to support that environment (such as New Relic). Consider doing this work in Upgrade lambda and New Relic to latest supported version #354 .

VALUE created by our solution

Better performance, security and reliability.

Acceptance Criteria

  • Api running on Python 3.12.
  • test locally
  • test on dev (k8s and lambda)
  • document in tricks and tips how we can deploy branches to dev
  • Run rollercoaster tests
  • Run load test, compare to current 5500 emails / min peak

QA Steps

  • Run smoke tests
  • bug bash on staging
@sastels sastels added Reliability Task related to reliability. Security | Sécurité Tech Debt An issue targeting an identified technical debt labels Nov 25, 2024
@sastels sastels self-assigned this Nov 25, 2024
@sastels sastels changed the title Copy of Upgrade document-download-api to Python 3.12 Upgrade api to Python 3.12 Nov 25, 2024
@sastels
Copy link
Author

sastels commented Nov 26, 2024

Upgraded to 3.12.7.
cds-snc/notification-api#2362

@sastels
Copy link
Author

sastels commented Nov 27, 2024

A couple annual limits tests now failing :/

@sastels
Copy link
Author

sastels commented Nov 27, 2024

figured them out! There was a redis call in utils that now fails if there is no annual limit data in redis for a service, which happens in the test.

@sastels
Copy link
Author

sastels commented Nov 28, 2024

Working locally (devcontainer) with other apps.

@sastels
Copy link
Author

sastels commented Dec 3, 2024

ready for review but will first get admin in
tested in dev k8s but the main branch lambda was being flakey so didn't test 3.12 lambda api in dev.

@sastels
Copy link
Author

sastels commented Dec 4, 2024

tested in dev k8s and all looked good
dev api lambda not connected to api gateway so couldn't test that.
Will see if we can fix the api lambda and test this image there tomorrow.

@sastels
Copy link
Author

sastels commented Dec 5, 2024

@sastels
Copy link
Author

sastels commented Dec 5, 2024

ok, so the lambda isn't happy :/
There's an official New Relic lambda layer
arn:aws:lambda:ca-central-1:451483290750:layer:NewRelicPython312:22

that the lambda_staging.yml GitHub action is downloading (for 3.10 we were rolling our own version because there wasn't an official version at the time). This GitHub actions works and a lambda docker image is uploaded to our ECR. However when the lambda starts it crashes:

[NR_EXT] New Relic Lambda Extension starting up
[NR_EXT] Initializing version 2.3.14 of the New Relic Lambda Extension...
[NR_EXT] error occurred while making registration request: 403 Forbidden
...

We can test the failing lambda version here. New that the working versions of the lambda do not crash for this console test.

In New Relic we can see the that the bad version is connecting to New Relic here so it appears that the connection to New Relic is working.

@sastels
Copy link
Author

sastels commented Dec 6, 2024

The documentation for how to build the dockerfile isn't quite the same as how we do it - going to see if using their approach makes a difference...

@sastels
Copy link
Author

sastels commented Dec 6, 2024

Is this relevant to the complaint about registration?
https://github.com/cds-snc/notification-api/pull/1421/files
(update: NOPE)

Trying to get the main branch Python 3.10 working with the instructions from the documentation and still having registration issues...

testing in dev. uploading images to my own temp ecr and deploying them to api lambda. Can just run a "api gateway http" test packet through it to see the crash.

@sastels
Copy link
Author

sastels commented Dec 9, 2024

using the code in main with the official New Relic 3.10 layer (replaced the one in the repo with this by running the get_newrelic_layer.sh script whith the arn for the 3.10 layer)

docker image builds fine.

@sastels
Copy link
Author

sastels commented Dec 9, 2024

ok. Looking at
https://docs.aws.amazon.com/lambda/latest/dg/runtimes-extensions-api.html
what's happening is that the New Relic extension is failing to register with the Lambda to receive events.

@sastels
Copy link
Author

sastels commented Dec 10, 2024

wait, it works on Debian? what is happening?

@sastels
Copy link
Author

sastels commented Dec 10, 2024

Ben had a good suggestion - try debian-slim

@sastels
Copy link
Author

sastels commented Dec 11, 2024

debian slim 'd 3.10 and 3.12 both work and are smaller images than the alpine 3.10 one 🎉

Going to perf test the slim 3.10 vs current 3.10 today, then if that's all good (and works!) get the 3.12 ready for the same.

@ben851
Copy link
Contributor

ben851 commented Dec 12, 2024

Blocked by incidents yesterday.

@sastels
Copy link
Author

sastels commented Dec 13, 2024

  • Perf testing new 3.10 debian-slim on staging. This test was probably a bit silly because we did not change the base images in the celery pods, JUST the api lambda base image for now.
    • upload csv of 50000 emails
    • 5280 emails / min base (before merge)
    • 5244 emails / min for the debian-slim.
  • Also repeatedly hit the api status and saw the invocations appear in New Relic
  • smoke tests passing

So I think the debian-slim is fine. Will proceed with getting testing a 3.12 image on dev and then staging.

@sastels
Copy link
Author

sastels commented Dec 13, 2024

  • Made a utils PR to merge in the app team's branch.
  • Using this new utils branch in the updated the api 312 PR.
  • New 312 api lambda image builds locally and runs in dev. Tested in the AWS lambda console with an api gateway packet that asks for the home page. Successfully gets the api status line out.
  • clicked the Test button repeatedly and can see the invocations in New Relic.

I believe this PR is ready for review and test in staging 🎉

@sastels
Copy link
Author

sastels commented Dec 16, 2024

Won't test this in staging until after the annual limits bug bash

@sastels
Copy link
Author

sastels commented Dec 17, 2024

Will test in staging after we do a perf and rollarcoaster test for the helm changes.

@sastels
Copy link
Author

sastels commented Dec 19, 2024

blocked by code freeze

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Reliability Task related to reliability. Security | Sécurité Tech Debt An issue targeting an identified technical debt
Projects
None yet
Development

No branches or pull requests

2 participants