cloud: Automatically add locality flag when using Helm to deploy CDB in K8 #23940

tlvenn opened this issue Mar 16, 2018 · 13 comments

@tlvenn (Contributor) commented Mar 16, 2018

There are two well-known node labels that could be used to dynamically build the locality information passed to CockroachDB on startup:

  • failure-domain.beta.kubernetes.io/region
  • failure-domain.beta.kubernetes.io/zone

We could probably set the country as well, if the node exposes failure-domain.beta.kubernetes.io/country or if we detect the cloud provider and then derive the country from its region name.
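
To make it concrete, here is a minimal sketch of the kind of startup wrapper this could produce (a hypothetical script; it assumes REGION and ZONE have already been resolved from the node's failure-domain labels, and the --join addresses just mirror the stock StatefulSet config):

```bash
#!/usr/bin/env bash
# Sketch only: REGION and ZONE are assumed to have been resolved from the
# node's failure-domain labels before this script runs.
set -euo pipefail

exec /cockroach/cockroach start \
  --insecure \
  --locality="region=${REGION},zone=${ZONE}" \
  --join=cockroachdb-0.cockroachdb,cockroachdb-1.cockroachdb,cockroachdb-2.cockroachdb
```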

@a-robinson, what do you think?

Jira issue: CRDB-5797

@a-robinson (Contributor)

That's a reasonable idea, yeah. Is there any reason we wouldn't want to just use all available failure-domain key-value pairs rather than only picking out specific ones?

One concern I have is that we may not want to use failure domain tags that could change for any given node when it's rescheduled. For example, if node 1 runs in zone=a but then its machine goes down for maintenance and it gets rescheduled to zone=b, we may have to do a lot of rebalancing as a result of the change. It'd be best to only use failure domain tags that match the scope of the persistent volume being used by the node -- i.e. if a PV can't be moved from region to region then using a region tag is good, but if it can be moved from zone to zone then using a zone tag might not be worth it.

@tlvenn (Contributor, Author) commented Mar 29, 2018

> That's a reasonable idea, yeah. Is there any reason we wouldn't want to just use all available failure-domain key-value pairs rather than only picking out specific ones?

I don't think so, but can we order them properly in all cases?

> One concern I have is that we may not want to use failure domain tags that could change for any given node when it's rescheduled. For example, if node 1 runs in zone=a but then its machine goes down for maintenance and it gets rescheduled to zone=b, we may have to do a lot of rebalancing as a result of the change. It'd be best to only use failure domain tags that match the scope of the persistent volume being used by the node -- i.e. if a PV can't be moved from region to region then using a region tag is good, but if it can be moved from zone to zone then using a zone tag might not be worth it.

Yeah, totally agree on this.

@a-robinson (Contributor)

> I don't think so, but can we order them properly in all cases?

Great point.

I doubt this is going to have much value until it's less work to get a multi-region cockroach cluster running in kubernetes, so I'm not going to work on it immediately, but if you want to play around with it, contributions are welcome.

@a-robinson (Contributor)

@tlvenn Do you have any pointers to examples of kubernetes configs that do this? Unfortunately node labels do not appear to be accessible via the downward API, so we'd have to do some work inside the pod to talk to the Kubernetes API and retrieve the node labels from it, all before starting cockroach.

I'm starting to play around with multi-region clusters on kubernetes so getting this automatically would be awesome, but it'd be great if we could do so without having to insert a bunch of glue code.
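
For reference, the kind of glue I'd like to avoid looks roughly like the following init-step sketch. It assumes NODE_NAME is injected via the downward API (fieldRef: spec.nodeName), that kubectl is available in the init image, and that the pod's service account is allowed to get nodes:

```bash
#!/usr/bin/env bash
# Sketch only: resolve the node's failure-domain labels before starting cockroach.
# NODE_NAME is assumed to come from the downward API (fieldRef: spec.nodeName).
set -euo pipefail

REGION=$(kubectl get node "${NODE_NAME}" \
  -o go-template='{{index .metadata.labels "failure-domain.beta.kubernetes.io/region"}}')
ZONE=$(kubectl get node "${NODE_NAME}" \
  -o go-template='{{index .metadata.labels "failure-domain.beta.kubernetes.io/zone"}}')

# Hand the derived flag to the main container, e.g. through a shared emptyDir volume.
echo "--locality=region=${REGION},zone=${ZONE}" > /locality/flag
```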

@a-robinson (Contributor)

So based on upstream discussions we would indeed have to write some code to retrieve them from the API (and expand our RBAC scopes in order to allow such retrievals). It may be made easier in future Kubernetes releases, but that will be a ways off.
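
For completeness, the extra RBAC scope would amount to read access on node objects for the pod's service account, something like the following (the role names are hypothetical, and default:cockroachdb assumes the service account from our standard configs):

```bash
# Hypothetical names: grant the cockroachdb service account read-only access
# to nodes so an init step can look up the failure-domain labels.
kubectl create clusterrole crdb-node-reader --verb=get,list --resource=nodes

kubectl create clusterrolebinding crdb-node-reader \
  --clusterrole=crdb-node-reader \
  --serviceaccount=default:cockroachdb
```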

@a-robinson (Contributor)

One example: Yolean/kubernetes-kafka#41

@solsson commented Apr 8, 2018

Anyone here interested in collaborating on kubernetes/kubernetes#62078 (comment)?

@a-robinson (Contributor)

What do you have in mind, @solsson? I can't see us putting an init container for this into our default configuration - it adds extra mental overhead for people to understand how our deployment works, would require additional RBAC privileges (including adding RBAC privileges to our insecure deployment, which currently needs none), and would only really be useful right now for multi-zone Kubernetes clusters.

For multi-region cockroach deployments that span Kubernetes clusters, manually specifying the --locality flag in the config file for each Kubernetes cluster isn't a big deal.

@solsson commented Apr 13, 2018

Actually, I did have RBAC and extra mental overhead in mind :) It's up to you to balance priority, complexity, and resources for your use case, so I'll take it as a no for now.

@a-robinson added the A-orchestration (Relating to orchestration systems like Kubernetes), C-enhancement (Solution expected to add code/behavior + preserve backward-compat; pg compat issues are exception), and O-community (Originated from the community) labels on Apr 30, 2018
@github-actions (bot) commented Jun 7, 2021

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
5 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

@Bessonov commented Jun 7, 2021

I think this is still an issue.

@github-actions (bot)

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

@Bessonov

> a comment will keep it active
