This repo contains the infrastructure to bootstrap our AWS accounts and deploy an instance of the UKHSA Data Dashboard app.
The tooling and scripts in this repo are tested with Linux and Mac. If you're using Windows these may work with WSL2 🤞.
There are a few steps needed before you can get started:
- Set up an SSH key for GitHub
- Clone this repo
- Install tools
- Configure AWS SSO
- Log in to the GitHub CLI
- Enable multi-platform Docker builds
Please follow the instructions here to set up an SSH key for GitHub.
Open a terminal and run the following commands:
git clone git@github.com:UKHSA-Internal/data-dashboard-infra.git
cd data-dashboard-infra
We use Homebrew to manage our software dependencies. If you don't already have it installed, run:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Then run the following commands to install the software dependencies:
brew install awscli
brew install --cask docker
brew install gh
brew install jq
brew install [email protected]
brew install tfenv
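Once the packages are installed, a quick check that the commands are on your `PATH` can save debugging later. The following is a minimal sketch, not part of the repo's tooling: `missing_tools` is a hypothetical helper, and the command names are assumed to be the ones the Homebrew packages above provide (Python is left out because the command name depends on the installed version).

```shell
# Print the name of every command in the argument list that is not on PATH.
# missing_tools is a hypothetical helper, not part of uhd.sh.
missing_tools() {
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Command names assumed to be provided by the packages installed above.
missing="$(missing_tools aws docker gh jq tfenv)"
if [ -n "$missing" ]; then
  printf 'Missing tools:\n%s\n' "$missing"
else
  echo "All required tools found"
fi
```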
You need to sign in to AWS and configure your profiles. You can do this either via the AWS CLI or by editing your config files directly. For UKHSA engineers we recommend editing your config files directly.
To sign in to AWS and configure your profiles via the AWS CLI:
aws configure sso
Follow the prompts and configure the accounts / roles with the following profile names. When prompted for the region, enter `eu-west-2`.
Account | Role | Profile Name |
---|---|---|
Development | Developer | uhd-dev |
Tooling | Developer | uhd-tools |
Due to a bug in the AWS Terraform provider (hashicorp/terraform-provider-aws#28263), the following manual post-configuration steps are needed:
- Open your `.aws/config` file
- Remove the `sso_session` parameter from the profile
- Add the `sso_start_url` and `sso_region` to the profile
Example:
[profile foo]
sso_region = eu-west-2
sso_start_url = https://bar.awsapps.com/start
sso_account_id = 999999999
sso_role_name = Baz
region = eu-west-2
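To confirm that no profile still carries the `sso_session` parameter, you could scan the config file. This is a rough sketch under stated assumptions: `profiles_with_sso_session` is a hypothetical helper (not part of `uhd.sh`), and it assumes the standard config layout of `[section]` headers followed by `key = value` lines.

```shell
# List every config section that still contains an sso_session setting.
# profiles_with_sso_session is a hypothetical helper, not part of uhd.sh.
profiles_with_sso_session() {
  awk '
    /^\[/ { section = $0 }                        # remember the current section header
    /^[ \t]*sso_session[ \t]*=/ { print section } # report sections that set sso_session
  ' "$1"
}

# AWS_CONFIG_FILE overrides the default config location when it is set.
config_file="${AWS_CONFIG_FILE:-$HOME/.aws/config}"
if [ -f "$config_file" ]; then
  profiles_with_sso_session "$config_file"
fi
```

An empty result means no profile needs the manual fix.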
You will also need to add the `assumed-role` profiles the CLI is expecting for both the `dev` and `tools` accounts:
[profile foo/assumed-role]
role_arn = arn:aws:iam::foo:role/Developer
source_profile = foo
region = eu-west-2
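If you prefer to generate these blocks rather than type them, something like the following could work. It is only a sketch: `assumed_role_profile` is a hypothetical helper, the account IDs are placeholders, and it assumes the `uhd-dev` and `uhd-tools` profiles follow the same pattern as the `foo` example above. Check the output against the Confluence instructions before pasting it into your config.

```shell
# Print an assumed-role profile block for a given source profile and account ID.
# assumed_role_profile is a hypothetical helper, not part of uhd.sh.
assumed_role_profile() {
  local profile="$1"
  local account_id="$2"
  printf '[profile %s/assumed-role]\n' "$profile"
  printf 'role_arn = arn:aws:iam::%s:role/Developer\n' "$account_id"
  printf 'source_profile = %s\n' "$profile"
  printf 'region = eu-west-2\n'
}

# Placeholder account IDs: replace them with the real ones.
assumed_role_profile uhd-dev 111111111111
assumed_role_profile uhd-tools 222222222222
```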
The `~/.aws/config` should be updated with the profile names we use. Please follow the instructions in Confluence.
We use the GitHub CLI to check out pull request branches. To enable this feature you must log in to the GitHub CLI:
gh auth login
We use `docker buildx` to enable us to produce `arm64` images from `x86_64` platforms.
You'll need to enable it if you haven't already:
docker buildx create --use
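To see whether your machine builds `arm64` images natively or relies on buildx to cross-build, you can inspect the architecture reported by `uname -m`. This is an illustrative sketch: `is_native_arm64` is a hypothetical helper, and it assumes ARM hosts report either `arm64` (macOS) or `aarch64` (Linux).

```shell
# Succeed when the given machine name is a 64-bit ARM architecture.
# is_native_arm64 is a hypothetical helper, not part of uhd.sh.
is_native_arm64() {
  case "$1" in
    arm64|aarch64) return 0 ;;
    *) return 1 ;;
  esac
}

if is_native_arm64 "$(uname -m)"; then
  echo "Host is arm64: images build natively"
else
  echo "Host is not arm64: images are cross-built via buildx"
fi
```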
Source our CLI tool:
source uhd.sh
Assume the Developer role in our `tools` and `dev` accounts:
uhd aws login
Then test that you can query `whoami`:
uhd aws whoami
We use Terraform to manage the resources we deploy to AWS.
The Terraform code is split into two layers:
- `10-account`: account-level resources. We deploy one of each of these resources in each AWS account.
- `20-app`: application resources. This is the infrastructure to run an instance of the application. We deploy multiple instances of these resources into each AWS account.
Terraform must be initialized on a new machine. To initialize for all layers:
uhd terraform init
Or to initialize a specific layer:
uhd terraform init <layer>
For example:
uhd terraform init 10-account
To run `terraform plan` for the application layer in your dev environment:
uhd terraform plan
Or to `plan` for a specific layer and environment:
uhd terraform plan:layer <layer> <env>
For example:
uhd terraform plan:layer 20-app foo
To run `terraform apply` for the application layer in your dev environment:
uhd terraform apply
Or to `apply` for a specific layer and environment:
uhd terraform apply:layer <layer> <env>
For example:
uhd terraform apply:layer 20-app foo
Until we finalize our strategy for ECR, you'll need to pull the latest container images and push them to your ECR:
uhd docker update <account> <env>
For example:
uhd docker update dev 12345678
Note that when pushing to the ECR in the next account, you will be logged into the ECR for that account automatically as part of the `uhd docker update <account> <env>` step.
Otherwise, should you wish to log in to the ECR within a specific account which is not the tools account:
uhd docker ecr:login <account>
For example:
uhd docker ecr:login dev
Once your infrastructure is deployed, you'll need to bootstrap your environment. This will set the API key, CMS admin user password, and seed your database with content and metrics.
These commands must be run from the `dev` account.
Open a new terminal window and log in to AWS:
source uhd.sh
uhd aws login uhd-dev
Run the bootstrap job:
uhd ecs run bootstrap-env
Make a note of the task ID, which is the last part of the task ARN. In the example below it is `e3a5dc3fc35546c6980fabe45bc59fe6`:
arn:aws:ecs:eu-west-2:123456789012:task/uhd-12345678-cluster/e3a5dc3fc35546c6980fabe45bc59fe6
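Rather than copying the task ID by hand, you can strip it from the ARN with shell parameter expansion. The sketch below is illustrative only: both helpers are hypothetical, and the environment extraction assumes the cluster is always named `uhd-<env>-cluster`, which is inferred from the example ARN rather than confirmed.

```shell
# Print the task ID: everything after the last "/" in the task ARN.
task_id_from_arn() {
  printf '%s\n' "${1##*/}"
}

# Print the environment name, assuming the cluster is named uhd-<env>-cluster.
env_from_arn() {
  local cluster="${1%/*}"      # drop the trailing /<task id>
  cluster="${cluster##*/}"     # keep only the cluster name
  local env_id="${cluster#uhd-}"
  printf '%s\n' "${env_id%-cluster}"
}

task_arn="arn:aws:ecs:eu-west-2:123456789012:task/uhd-12345678-cluster/e3a5dc3fc35546c6980fabe45bc59fe6"
task_id_from_arn "$task_arn"   # prints e3a5dc3fc35546c6980fabe45bc59fe6
env_from_arn "$task_arn"       # prints 12345678
```

The two values are the arguments the log tailing command expects.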
You can then tail the logs from the task. It may take a minute or two for the task to start:
uhd ecs logs <env> <task id>
For example:
uhd ecs logs 12345678 d20ee493b97143f293ae6ebb5f7b7c0a
Your environment should now be set up. The final step is to restart your services:
uhd ecs restart-services
The update command can be used to run all the previous steps together to simplify updating your environment. This command requires you to be logged in to both the `tools` and `dev` accounts at the same time. This can be done with the following command:
uhd aws login
Now that you are logged in to both accounts, you can run the update command:
uhd update
The update command will perform the following tasks:
- Switch to the tools account
- Run `terraform init`
- Run `terraform apply`
- Run `Docker ECR login`
- Run `Docker pull` to grab the latest images from the tools account
- Run `Docker push` to deploy the latest images to your environment
- Switch to the dev account
- Restart ECS services
- Restart Lambda functions
- Switch back to the tools account
We cache very aggressively in the app to maximize performance. The trade-off is that, at the moment, we must flush the caches whenever we change CMS content or metric data. We have three caches:
- A Redis cache which sits between the private API and the database
- A CloudFront cache which sits in front of the public API load balancer
- A CloudFront cache which sits in front of the front end load balancer
Depending on what has changed, there are a couple of options:
For CMS content changes, both the Redis and front end CloudFront caches must be flushed.
First sign in to AWS and switch to the `dev` account:
uhd aws login
uhd aws use uhd-dev
Then flush the caches:
uhd cache flush-redis
uhd cache flush-front-end
uhd cache fill-front-end
All caches must be flushed for metric data changes:
uhd cache flush-redis
uhd cache flush-front-end
uhd cache flush-public-api
uhd cache fill-front-end
uhd cache fill-public-api
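The two sequences above can be summarised as a lookup from change type to commands. This is a sketch for illustration only: `cache_commands` is a hypothetical helper that prints the commands rather than running them, and it assumes the first sequence applies to CMS content changes and the second to metric data changes.

```shell
# Print the cache commands to run for a given kind of change.
# cache_commands is a hypothetical helper: it only prints, it does not run anything.
cache_commands() {
  case "$1" in
    cms)
      # Assumed: CMS content changes need the Redis and front end caches flushed.
      printf '%s\n' \
        "uhd cache flush-redis" \
        "uhd cache flush-front-end" \
        "uhd cache fill-front-end"
      ;;
    metrics)
      # Metric data changes need all three caches flushed.
      printf '%s\n' \
        "uhd cache flush-redis" \
        "uhd cache flush-front-end" \
        "uhd cache flush-public-api" \
        "uhd cache fill-front-end" \
        "uhd cache fill-public-api"
      ;;
    *)
      echo "unknown change type: $1" >&2
      return 1
      ;;
  esac
}

cache_commands metrics
```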
Flushing the caches one by one, and waiting for each to finish before starting the next, is tedious. To flush them all in one command:
uhd cache flush
There are a few steps to test a feature branch in your dev environment:
- Clone all repos
- Pull the latest code
- Deploy the latest infra
- Cut a custom image and push it (for API and front end changes)
- Apply infra changes (for infra changes)
- Test it
Firstly, clone all the repos if you don't have them already. Our tooling expects the other repos to be cloned as siblings of this repo.
uhd gh clone
If you have been working on other tickets, it's recommended to switch all branches back to `main` and pull the latest code:
uhd gh main
If for some reason you don't want to do that, at least pull the latest infra:
git checkout main && git pull
This will pull the latest prod images, and update your env to use the latest infra:
uhd aws login
uhd update
Only use this step if you're testing API or front end changes.
Now we can check out the branch for the pull request. The pattern is:
uhd gh co [repo] [pull request number | url | branch]
For example:
uhd gh co api 123
Next, build and push a custom image:
uhd docker build [repo]
For example:
uhd docker build api
And finally restart the ECS services:
uhd aws use uhd-dev
uhd ecs restart-services
Flush the cache to reflect front end changes:
uhd cache flush
Only use this step if you're testing infra changes.
First, check out the branch for the pull request. For example:
uhd gh co infra 123
If you would like to preview the infra changes that Terraform will make:
uhd terraform plan
Then apply the changes:
uhd terraform apply
You can now commence testing the pull request in your dev environment.
To run `terraform destroy` for the application layer in your dev environment:
uhd terraform destroy
Or to `destroy` for a specific layer and environment:
uhd terraform destroy:layer <layer> <env>
For example:
uhd terraform destroy:layer 20-app foo
Note that production-grade environments, which are set via the `use_prod_sizing` variable, have deletion protection enabled on the main db cluster.
To destroy the infra for these environments, you must switch `use_prod_sizing` `off` for that environment prior to running the above.
Caution
Refreshing Databases should only be performed on your DEV AWS instances. Do not perform a refresh on the named pre-production environments.
When you are actively developing features or testing on your deployed development environment, it may be necessary to drop and rebuild your databases. At the time of writing, the two databases we use are:
- App DB - named something like `uhd-<environmentId>-aurora-db-app`
- Feature Flag DB - named something like `uhd-<environmentId>-aurora-db-feature-flags`
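To work out which names to look for in the console, you could expand the pattern for your own environment. This is a small sketch: `db_names` is a hypothetical helper, `12345678` is a placeholder environment ID, and because the names are only described as "something like" the pattern, the real identifiers may differ.

```shell
# Print the expected database names for an environment ID.
# db_names is a hypothetical helper; the naming pattern is approximate.
db_names() {
  printf 'uhd-%s-aurora-db-app\n' "$1"
  printf 'uhd-%s-aurora-db-feature-flags\n' "$1"
}

db_names 12345678
```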
To drop these DBs, select the DB instance in the AWS console and choose "Delete" from the "Actions" menu.
- You will need to delete the DB instance before you can delete the DB cluster.
Once the DBs have been deleted then you can create new ones.
To spin up new versions of the databases, first create the DBs and the infrastructure, then populate the DBs.
- Assume the tools role:
  uhd aws use uhd-tools
- Create the new DBs and update the secrets in Secrets Manager:
  uhd terraform apply
- Assume the dev role:
  uhd aws use uhd-dev
- Restart your services:
  uhd ecs restart-services
- Populate the new DBs:
  uhd ecs run bootstrap-env
When you run `uhd terraform destroy`, all secrets for that environment are scheduled for deletion by default.
This does not delete the secrets immediately, so you will need to delete the secrets associated with the environment separately. This can be done with the following command:
uhd secrets delete-all-secrets <env>
This command must be run from the `dev` account.
These repos contain the app source code:
This repo contains the infra for the part of the ETL pipeline which sits within AWS: