Staff device DNS DHCP disaster recovery

This repo contains an interactive script which can be used to roll back a corrupt config file for the DNS or DHCP services.

Prerequisites

AWS Vault configured for the corrupted environment
jq, for slicing, filtering, mapping and transforming structured data

Recovering from a Disaster

In the event that Grafana has alerted on a disaster scenario, find the correct section and follow the steps provided.

Infrastructure deployment

To be able to follow this guide, you need to have the following already:

AWS Vault set up.
Access to MoJ AWS SSO.

🎉 TIP
You may configure your AWS Vault to use AWS SSO. A step-by-step guide can be found in our team documentation site.

Prepare the variables

Clone this repo to a local directory.

Initialize your Terraform

make init

If you have not run this before, the first run cleans the .terraform directory and creates a new .env file with values retrieved from AWS SSM Parameter Store.

Then run the above command again.

Switch to an isolated workspace

If you do not have a Terraform workspace created already, use the command below to create a new workspace.

Create Terraform workspace

aws-vault exec <shared-services-aws-vault-profile> -- terraform workspace new "YOUR_UNIQUE_WORKSPACE_NAME"

This should create a new workspace and select that new workspace at the same time.

If you already have a workspace, use the commands below to select the right one before continuing.

View Terraform workspace list

aws-vault exec <shared-services-aws-vault-profile> -- terraform workspace list

Select a Terraform workspace

aws-vault exec <shared-services-aws-vault-profile> -- terraform workspace select "YOUR_WORKSPACE_NAME"
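
The create and select steps above can be combined into one helper. This is a minimal sketch, not part of this repo: the function name use_workspace is hypothetical, and it assumes that a failed `terraform workspace select` means the workspace does not exist yet.

```shell
# Select the named Terraform workspace, creating it if it does not exist yet.
# $1: aws-vault profile name, $2: workspace name (both supplied by you).
# Assumption: "workspace select" fails only because the workspace is missing.
use_workspace() {
  profile="$1"
  workspace="$2"
  aws-vault exec "$profile" -- terraform workspace select "$workspace" 2>/dev/null ||
    aws-vault exec "$profile" -- terraform workspace new "$workspace"
}
```

After sourcing the function, call it as `use_workspace <shared-services-aws-vault-profile> "YOUR_WORKSPACE_NAME"`.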

Finally spin up your own Infra

make apply

Restore database

In the event that the RDS instance (staff-device-production-dhcp-db and/or staff-device-production-dhcp-admin-db) needs to be restored (e.g. due to data loss or instance failure), it can be restored using the daily automatic snapshots. This can be done using the AWS console or the AWS CLI.

To restore the database from a snapshot, follow the steps below:

Go to the RDS console in AWS.
Navigate to the 'System' tab of the Snapshots window.
Select the latest snapshot to restore, e.g. rds:staff-device-production-dhcp-db-2025-01-06-22-36, and click 'Restore snapshot' in the Actions dropdown.
Enter the details required, such as the DB identifier and security group IDs.
Click 'Restore DB instance'.
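
The same restore can be sketched with the AWS CLI. The function names below are hypothetical and not part of this repo, and the new instance identifier, subnet group and security group are values you must supply. Note that restoring from a snapshot creates a new DB instance rather than overwriting the existing one.

```shell
# Print the identifier of the most recent automated snapshot for an instance.
# $1: DB instance identifier, e.g. staff-device-production-dhcp-db
latest_snapshot() {
  aws rds describe-db-snapshots \
    --db-instance-identifier "$1" \
    --snapshot-type automated \
    --query 'sort_by(DBSnapshots, &SnapshotCreateTime)[-1].DBSnapshotIdentifier' \
    --output text
}

# Restore a snapshot into a NEW DB instance.
# $1: snapshot identifier
# $2: identifier for the new instance (a name of your choosing)
# $3: DB subnet group name
# $4: VPC security group ID
restore_snapshot() {
  aws rds restore-db-instance-from-db-snapshot \
    --db-snapshot-identifier "$1" \
    --db-instance-identifier "$2" \
    --db-subnet-group-name "$3" \
    --vpc-security-group-ids "$4"
}
```

Run these in a shell that already holds credentials for the affected environment, for example one started through AWS Vault.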

Corrupt Config

Identify the broken service (dns/dhcp) and environment (development/pre-production/production)
Run:
1. aws-vault exec CORRUPT_ENVIRONMENT_VAULT_PROFILE_NAME -- make restore-dns-dhcp-config
2. At the prompt, enter the environment name (development/pre-production/production)
3. At the second prompt, enter the corrupt service name (dns/dhcp)
4. You will be given an output of the last five published configs with their VersionId and LastModified
5. Copy the VersionId of the config you wish to restore to
6. At the final prompt, paste the VersionId
7. The script will exit with the following message: Successfully rolled back dhcp to version: VersionId
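
For orientation, the steps above resemble a rollback of a versioned S3 object. The sketch below is an assumption about how such a rollback can be done by hand, not a copy of what the script runs: the function names are hypothetical, and the bucket and key must be looked up for your environment.

```shell
# List the five most recent versions of a config object.
# $1: bucket name, $2: object key (both assumptions; find the real values
# for your environment before running).
list_config_versions() {
  aws s3api list-object-versions \
    --bucket "$1" \
    --prefix "$2" \
    --query 'Versions[:5].[VersionId, LastModified]' \
    --output table
}

# Roll back by copying an older version over the current object.
# $1: bucket name, $2: object key, $3: VersionId to restore
restore_config_version() {
  aws s3api copy-object \
    --bucket "$1" \
    --key "$2" \
    --copy-source "$1/$2?versionId=$3"
}
```

Copying an older version this way writes it as a new current version, so the intermediate versions stay available if the rollback itself needs to be undone.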

Corrupt Container

Identify the broken service (dns/dhcp) and environment (development/pre-production/production)
Run:
1. aws-vault exec CORRUPT_ENVIRONMENT_VAULT_PROFILE_NAME -- make restore-service-container
2. At the prompt, enter the environment name (development/pre-production/production)
3. At the second prompt, enter the corrupt service name (dns/dhcp)
4. You will be given an output of the last five pushed containers with their imageDigest and imagePushedAt
5. Copy the imageDigest of the container you wish to re-tag as latest
6. At the final prompt, paste the imageDigest
7. The script will exit with the following message: Successfully re-tagged image: imageDigest as latest
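
Re-tagging an image by digest can also be done by hand with the AWS CLI. The sketch below follows the general ECR approach of fetching the image manifest and pushing it back under a new tag. It is an assumption about the technique, not a copy of the script, and the function names and repository name are hypothetical.

```shell
# List the five most recently pushed images in a repository.
# $1: ECR repository name (an assumption; look up the real name first)
list_recent_images() {
  aws ecr describe-images \
    --repository-name "$1" \
    --query 'reverse(sort_by(imageDetails, &imagePushedAt))[:5].[imageDigest, imagePushedAt]' \
    --output table
}

# Re-tag an existing image as latest without pulling or pushing layers.
# $1: ECR repository name, $2: imageDigest (sha256:...)
retag_as_latest() {
  # Fetch the manifest of the image identified by its digest.
  manifest=$(aws ecr batch-get-image \
    --repository-name "$1" \
    --image-ids imageDigest="$2" \
    --query 'images[0].imageManifest' \
    --output text) || return 1
  # Push the same manifest back under the tag "latest".
  aws ecr put-image \
    --repository-name "$1" \
    --image-tag latest \
    --image-manifest "$manifest"
}
```

Running services may need a new deployment before they pick up the re-tagged image.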

Corrupt Admin Container

Run:
1. aws-vault exec CORRUPT_ENVIRONMENT_VAULT_PROFILE_NAME -- make restore-admin-container
2. At the prompt, enter the environment name (development/pre-production/production)
3. You will be given an output of the last five pushed containers with their imageDigest and imagePushedAt
4. Copy the imageDigest of the container you wish to re-tag as latest
5. At the final prompt, paste the imageDigest
6. The script will exit with the following message: Successfully re-tagged image: imageDigest as latest
