Distributed backup scheduling #1

Open
hgfischer opened this issue Oct 17, 2017 · 0 comments

CaOps should be able to receive an API call from any node that triggers a cluster-wide backup. This requires:

  • The backup to be as synchronized as possible across nodes: so that network delays do not desynchronize the agents, the API will always schedule the backup for the nearest rounded time (see the sketch after this list).
  • Under normal usage, all Cassandra nodes must be up and running; no nodes may be joining or leaving.
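
A minimal sketch of the rounding idea, assuming a hypothetical 5-minute scheduling window (the issue does not fix an interval): every agent rounds the current time up to the next window boundary, so concurrent API calls hitting different nodes converge on the same start time.

```go
package main

import (
	"fmt"
	"time"
)

// nextRoundedTime rounds "now" up to the next interval boundary so that every
// agent receiving the API call within the same window computes the same
// backup start time. The 5-minute interval used below is only an assumption.
func nextRoundedTime(now time.Time, interval time.Duration) time.Time {
	return now.Truncate(interval).Add(interval)
}

func main() {
	now := time.Now().UTC()
	start := nextRoundedTime(now, 5*time.Minute)
	fmt.Printf("API call at %s -> backup scheduled for %s\n",
		now.Format(time.RFC3339), start.Format(time.RFC3339))
}
```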

Since we are not sure how reliable Serf is on big clusters, we might need to implement something else to keep the scheduling consistent. Two options I can think of are:

  • Using Cassandra itself, with a table whose reads and writes use a consistency level covering all nodes. This appears to be the easiest solution, but it has many trade-offs; the main one is that this table would behave like a queue, which is an anti-pattern in Cassandra (see the sketch after this list).

  • Using HashiCorp's Raft library, so each CaOps agent keeps its own persisted state, which can also be controlled more precisely for this particular use case.
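
For the first option, a rough sketch of what the Cassandra-backed schedule could look like, using the gocql driver. The keyspace, table, and column names are hypothetical (the issue defines no schema), the 5-minute slot matches the rounding assumption above, and the lightweight transaction (IF NOT EXISTS) is my addition so that only one agent claims a given slot; the issue itself only mentions the consistency level.

```go
package main

import (
	"fmt"
	"log"
	"time"

	"github.com/gocql/gocql"
)

func main() {
	// Hypothetical schema, not defined in the issue:
	//   CREATE TABLE caops.backup_schedule (slot timestamp PRIMARY KEY, requested_by text);
	cluster := gocql.NewCluster("127.0.0.1")
	cluster.Keyspace = "caops"
	// Read/write at the strongest consistency so every node sees the same schedule.
	cluster.Consistency = gocql.All

	session, err := cluster.CreateSession()
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	// Round to the next 5-minute boundary, matching the rounding sketch above.
	slot := time.Now().UTC().Truncate(5 * time.Minute).Add(5 * time.Minute)

	// Lightweight transaction: only the first agent to claim the slot wins, so a
	// burst of API calls on different nodes still yields one cluster-wide backup.
	previous := map[string]interface{}{}
	applied, err := session.Query(
		`INSERT INTO backup_schedule (slot, requested_by) VALUES (?, ?) IF NOT EXISTS`,
		slot, "node-a",
	).MapScanCAS(previous)
	if err != nil {
		log.Fatal(err)
	}
	if applied {
		fmt.Println("this agent scheduled the backup for", slot)
	} else {
		fmt.Println("backup already scheduled by", previous["requested_by"])
	}
}
```

This also illustrates the queue-like trade-off mentioned above: the table accumulates one row per slot and every agent races on the same partition keys, which is exactly the access pattern Cassandra handles poorly.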
