[RFC] Migrating kubernetes components to use client-go #35159

Closed · caesarxuchao opened this issue Oct 19, 2016 · 16 comments

caesarxuchao (Member) commented Oct 19, 2016

Background

We are migrating components other than kube-apiserver in the main repository to use client-go. The end goal is to remove most of pkg/client from the main repository and to develop clients in client-go. We believe this will increase development velocity in the long run.

As a first step, we will migrate components to use the copy of client-go located in the staging area of the main repository.

Goal

The migration involves large changes, so we want a manageable path:

  • We want to break the migration into multiple PRs of a size reasonable for review.
  • We want to reduce the operational cost the k8s community needs to bear during the migration.

[Abandoned] Approach 1: migrating one component at a time

For example, first convert kube-controller-manager (#29934) to use client-go, then convert kubelet, and so on.

The problem: components migrated to client-go will use the versioned types defined in client-go, e.g., client-go#v1.Pod (client-go is the repo name, v1 is the package name), while the remaining components keep using the unversioned types defined in the main repository, e.g., kubernetes#api.Pod. The difference in types creates a dilemma for the packages shared by both groups of components, e.g., pkg/volume. We’ve considered two workarounds, described in the subsections below.
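For illustration only, here is a minimal Go sketch of the type clash; the import paths and the helper are placeholders rather than code from the tree (the client-go vendoring path in particular changed across releases):

```go
package volumeutil // hypothetical shared package, e.g. something under pkg/volume

import (
	clientv1 "k8s.io/client-go/pkg/api/v1" // versioned types vendored in client-go (path is era-dependent)
	api "k8s.io/kubernetes/pkg/api"        // internal (unversioned) types
)

// PodIsReady is an existing helper written against the internal type and
// used by components that have not been migrated.
func PodIsReady(pod *api.Pod) bool {
	for _, c := range pod.Status.Conditions {
		if c.Type == api.PodReady {
			return c.Status == api.ConditionTrue
		}
	}
	return false
}

// A migrated component holds *clientv1.Pod, so it cannot call PodIsReady;
// approach 1.1 below duplicates the helper against the versioned type instead.
func PodIsReadyVersioned(pod *clientv1.Pod) bool {
	for _, c := range pod.Status.Conditions {
		if c.Type == clientv1.PodReady {
			return c.Status == clientv1.ConditionTrue
		}
	}
	return false
}
```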

[Abandoned] Approach 1.1: Make two copies of the shared dependencies

We will take this approach if the community can bear the disadvantages.

We make two copies of the shared dependencies: one keeps using kubernetes#api.Pod, the other uses client-go#v1.Pod. We will provide a script to do the copy. I experimented with migrating only kube-controller-manager; it turned out we would need to maintain copies of at least these packages:

  • pkg/util/node (1 file)
  • pkg/util/system (one liner)
  • pkg/volume (147 files) @saad-ali
  • pkg/serviceaccount (5 files)
  • pkg/fieldpath (3 files)
  • pkg/controller/deployment/util (2 files) @janetkuo

Disadvantages:

  • We need to add a verify script to force developers who make changes to the original packages to run the update script to keep the copies in sync.

[Abandoned] Approach 1.2: Converting before calling shared packages

We convert client-go#v1.Pod to kubernetes#api.Pod before passing the object to shared packages. The conversion requires two steps: (i) use the unsafe package to forcefully cast client-go#v1.Pod to kubernetes#v1.Pod; (ii) use api.Scheme to convert to kubernetes#api.Pod. A sketch follows below.
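For illustration only, a minimal sketch of this two-step conversion; the import paths, the helper name, and the exact api.Scheme.Convert signature are assumptions that varied across releases at the time:

```go
package conversionsketch

import (
	"unsafe"

	clientv1 "k8s.io/client-go/pkg/api/v1" // versioned types vendored in client-go (path is era-dependent)
	api "k8s.io/kubernetes/pkg/api"        // internal types and api.Scheme
	kubev1 "k8s.io/kubernetes/pkg/api/v1"  // versioned types in the main repo
)

// toInternalPod converts a client-go v1.Pod into an internal api.Pod.
func toInternalPod(in *clientv1.Pod) (*api.Pod, error) {
	// Step i: the two v1.Pod structs are generated from the same source and share
	// an identical memory layout, so an unsafe cast avoids a deep copy.
	asMainRepoV1 := (*kubev1.Pod)(unsafe.Pointer(in))

	// Step ii: let the scheme convert the versioned object to the internal type.
	// (The Convert signature has differed across releases; a nil context is assumed here.)
	out := &api.Pod{}
	if err := api.Scheme.Convert(asMainRepoV1, out, nil); err != nil {
		return nil, err
	}
	return out, nil
}
```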

Disadvantages:

  • Converting to api.Pod will apply the defaults. This may cause problems when a controller creates API objects.
  • Conversion is inefficient.

[Abandoned] Approach 2: migrating in a developing branch

We’ll take this approach if the community finds the operational cost of the first approach too high. We will create a development branch, accumulate PRs and reviews there, and finally merge it into the master branch once the migration is done for all binaries.

Disadvantages:

  • The development branch will be in an uncompilable state until most components are migrated.
  • The final merge may be challenging. I will try to reduce the workload by:
    • capturing most rewrites in a script
    • finishing the migration during the 1.5 code freeze

[Picked] Approach 3: migrating to the versioned clients during the code freeze

Details are in this comment (the Oct 31, 2016 update below).

Please let us know if you can bear the cost of approach 1.1.

@kubernetes/sig-api-machinery @smarterclayton @deads2k @liggitt

@caesarxuchao caesarxuchao added the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Oct 19, 2016
@caesarxuchao caesarxuchao self-assigned this Oct 19, 2016
saad-ali (Member) commented:

CC @kubernetes/sig-storage

smarterclayton (Contributor) commented:

Have we evaluated whether all those components can use the versioned objects (i.e. have we tested selective refactors to verify there are no additional problems that might surface)?

caesarxuchao (Member, Author) commented:

I converted some controllers during the 1.2 or 1.3 release and they passed all unit and e2e tests; I only needed to fix some issues in the test framework.

I'm experimenting locally with approach 1.1 to convert kube-controller-manager; I'll report the results.

I took a look at kubelet; it seems we can convert it.

deads2k (Contributor) commented Oct 20, 2016

I'm still more concerned about silently dropping fields we don't know about than I am about the conversion cost, particularly since that cost is amortized over all controllers (shared informers), and I was actually fairly convinced by the idea of an unsafe cast where it is safe.

Using external clients in controllers gives the appearance of resolving the field-stomping problem without actually addressing it. I've seen @lavalamp's plan for handling it and I think it can work. Is that moving ahead concurrently with this?

caesarxuchao (Member, Author) commented:

@deads2k, moving to versioned (or, in your terms, "external") clients will NOT solve the field-stomping problem. Adding a new field can be a compatible API change that does not bump the API version, so API versioning does not prevent an old client from dropping the new field. Could you point me to @lavalamp's proposal?

liggitt (Member) commented Oct 20, 2016

at least with the current structure, we know that for a given build, all the controllers agree with the api server on the latest version of things. it sounds like this could result in a mish-mash of different controllers coded against different client versions in the same build.

caesarxuchao (Member, Author) commented:

@liggitt a version of client-go only supports one version of the clientset, and a clientset includes only one version of each API group, so currently we won't have the problem you described.

deads2k (Contributor) commented Oct 20, 2016

Could you point me to @lavalamp's proposal?

#34508 (comment) item 2.

caesarxuchao (Member, Author) commented:

Thanks @deads2k. Still, I think moving to client-go will proceed in parallel with that effort.

liggitt (Member) commented Oct 20, 2016

one version of each group of API

one apiVersion (e.g. "v1") or one released version (e.g. "client-go-1.4.0")?

if all controllers have to move to a new version together, what is the benefit of making them speak an external version, rather than an internal version?

how will we support upgrades where we want one controller to speak to a new API in a new version (like scheduledjobs in batch/v2alpha1), but existing controllers still need to speak to the old version for compatibility with existing masters during a rolling update? doesn't this cut off our ability to have sparse versions?

caesarxuchao (Member, Author) commented Oct 20, 2016

one apiVersion (e.g. "v1") or one released version (e.g. "client-go-1.4.0")?

One apiVersion per group.

if all controllers have to move to a new version together, what is the benefit of making them speak an external version, rather than an internal version?

This almost convinced me we should support multiple apiVersions of a group in a clientset.

it sounds like this could result in a mish-mash of different controllers coded against different client versions in the same build.

Then what do you mean by client versions here? If it means apiVersions, does it contradict your other question?

how will we support upgrades where we want one controller to speak to a new API in a new version (like scheduledjobs in batch/v2alpha1), but existing controllers still need to speak to the old version for compatibility with existing masters during a rolling update?

I thought we upgraded controllers and the api-server at the same time?

[edit] I do see the problem with supporting only one version: we cannot convert the job controller to use batch/v2alpha1 because that version might be disabled, while the scheduledjob controller needs to use v2alpha1.

caesarxuchao (Member, Author) commented:

how will we support upgrades where we want one controller to speak to a new API in a new version (like scheduledjobs in batch/v2alpha1), but existing controllers still need to speak to the old version for compatibility with existing masters during a rolling update

According to the current design (https://github.com/fgrzadkowski/kubernetes/blob/038418fb14568bfda8e334ae315501c5ea504e0a/docs/design/ha_master.md#upgrades), controllers only talk to the local api server, so it's not a problem:

Requests from scheduler (and controllers) go to a local apiserver via localhost interface, so both components will be in the same version.

smarterclayton (Contributor) commented:

That's going to go away very soon, so we shouldn't depend on it.

Some other options:

  • move the controllers out of kubernetes/kubernetes and copy the libraries as we go, moving them slowly (this is similar to your option 1.1)
  • move kubelet out of kubernetes/kubernetes
  • move kubectl out of kubernetes/kubernetes

We could potentially pick off libraries we want to reuse into their own repos, or let the clients vendor them.


caesarxuchao (Member, Author) commented:

@smarterclayton I think the options you gave have the same operational cost as option 1.1. With controllers/kubelet in their own repos, we still have duplicate packages for different API versions; they are just in different repos. We also still need a mechanism to keep them in sync.

I think I can go with option 2. Its cost is that the rebase could be huge. I can try to finish the work during the 1.5 code freeze and get it merged as the first PR after the freeze.

@caesarxuchao caesarxuchao changed the title [RFC] Migrating kubernetes components to use client-go in a managable way [RFC] Migrating kubernetes components to use client-go Oct 31, 2016
caesarxuchao (Member, Author) commented Oct 31, 2016

Background

Currently, client-go is populated in two steps:

  • In the main repo, someone runs /staging/copy.sh to update /staging
  • A robot copies /staging literally to client-go

This process will continue for a while, until we have the proper test infra. Developers cannot send PRs to client-go until then. Hence, we are actually migrating the main repo to use the /staging area.

Problem

After the migration, the development experience will be poor. Imagine a developer finds a problem with the clients: he will find the code in /staging, but he cannot change it directly; he needs to find where the code is copied from, update that code, and then run copy.sh.

Solution

During the code freeze, we migrate the main repo to use the versioned clients inside the main repo. After the freeze, we gradually migrate to client-go. The benefits are:

smarterclayton (Contributor) commented:

SGTM

k8s-github-robot pushed a commit that referenced this issue Nov 7, 2016
Automatic merge from submit-queue

[RFC] Prepare for deprecating NodeLegacyHostIP

Ref #9267 (comment)

*What this PR does*
- Add comments saying "LegacyHostIP" will be deprecated in 1.7;
- Add v1.NodeLegacyHostIP to be consistent with the internal API (useful for client-go migration #35159)
- Let cloudproviders who used to only set LegacyHostIP set the IP as both InternalIP and ExternalIP
- Master used to ssh tunnel to a node's ExternalIP or LegacyHostIP to do the [healthz check](https://github.com/kubernetes/kubernetes/blame/master/pkg/master/master.go#L328-L332). OTOH, on-prem kubelets only [set](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet_node_status.go#L430-L431) LegacyHostIP or InternalIP. In order to deprecate LegacyHostIP in 1.7, I let the healthz check use InternalIP if ExternalIP is not available. (The healthz check is the only consumer of LegacyHostIP in k8s.)

@liggitt @justinsb @bgrant0607 

```release-note
LegacyHostIP will be deprecated in 1.7.
```
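For context, a hedged sketch of the healthz-address fallback described in the PR; the helper name, selection details, and import path are illustrative, not the PR's actual code:

```go
package healthzsketch

import v1 "k8s.io/kubernetes/pkg/api/v1" // versioned node types (path is era-dependent)

// pickHealthzAddress prefers a node's ExternalIP and falls back to InternalIP,
// so the master's healthz check no longer needs LegacyHostIP.
func pickHealthzAddress(node *v1.Node) (string, bool) {
	var internal string
	for _, addr := range node.Status.Addresses {
		switch addr.Type {
		case v1.NodeExternalIP:
			return addr.Address, true // prefer the external IP when present
		case v1.NodeInternalIP:
			if internal == "" {
				internal = addr.Address
			}
		}
	}
	if internal != "" {
		return internal, true // on-prem nodes may only report an InternalIP
	}
	return "", false
}
```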