Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachprod: fix default backup schedule creation on start #118368

Conversation

renatolabs
Copy link
Contributor

For a while now, roachprod has created a default backup schedule on cluster creation when --schedule-backups is passed. This is also the default in clusters created to run roachtests.

When we fixed an issue with starting external-process tenants in 3715eb5, however, we unadvertently changed the order of two operations performed by roachprod on cluster start: setting the default cluster settings, and creating the default backup schedule.

As a consequence, the command used to create the backup schedule fails because, at that point, we haven't configured a license key yet.

To make matters worse, there was a bug in the error handling of createFixedBackupSchedule that prevented errors from being reported to the user; these errors were being swallowed and went unnoticed for a few months.

In this commit, we fix the error checking in that function, and also officially remove code that partially supported creating backups or admin users in tenants. Currently, roachprod will run that part of the setup for the system tenant. In the future, we might revisit this and also create a backup schedule and admin users for application tenants.

Epic: none

Release note: None

For a while now, `roachprod` has created a default backup schedule on
cluster creation when `--schedule-backups` is passed. This is also the
default in clusters created to run roachtests.

When we fixed an issue with starting external-process tenants in
3715eb5, however, we unadvertently changed the order of two operations
performed by roachprod on cluster start: setting the default cluster
settings, and creating the default backup schedule.

As a consequence, the command used to create the backup schedule fails
because, at that point, we haven't configured a license key yet.

To make matters worse, there was a bug in the error handling of
`createFixedBackupSchedule` that prevented errors from being reported
to the user; these errors were being swallowed and went unnoticed for
a few months.

In this commit, we fix the error checking in that function, and also
officially remove code that partially supported creating backups or
admin users in tenants. Currently, roachprod will run that part of the
setup for the system tenant. In the future, we might revisit this and
also create a backup schedule and admin users for application
tenants.

Epic: none

Release note: None
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@renatolabs
Copy link
Contributor Author

Running a 0.1 build on this branch to make sure it didn't break any roachtests.

@renatolabs renatolabs marked this pull request as ready for review January 29, 2024 21:15
@renatolabs renatolabs requested a review from a team as a code owner January 29, 2024 21:15
@renatolabs renatolabs requested review from herkolategan and srosenberg and removed request for a team January 29, 2024 21:15
@renatolabs
Copy link
Contributor Author

Run above looks normal; failures are unrelated and also happen on master.

@renatolabs renatolabs added backport-23.1.x Flags PRs that need to be backported to 23.1 backport-23.2.x Flags PRs that need to be backported to 23.2. labels Jan 30, 2024
@renatolabs
Copy link
Contributor Author

TFTR!

bors r=herkolategan

@craig craig bot merged commit 38dd16a into cockroachdb:master Jan 30, 2024
9 checks passed
@craig
Copy link
Contributor

craig bot commented Jan 30, 2024

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.1.x Flags PRs that need to be backported to 23.1 backport-23.2.x Flags PRs that need to be backported to 23.2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants