[RFC] Migrating kubernetes components to use client-go #35159

Closed · caesarxuchao opened this issue Oct 19, 2016 · 16 comments

caesarxuchao (Member) commented Oct 19, 2016

Background

We are migrating components other than kube-apiserver in the main repository to use client-go. The end goal is to remove most of pkg/client from the main repository and to develop clients in client-go. We believe this will increase development velocity in the long run.

As a first step, we will migrate components to use the copy of client-go located in the staging area of the main repository.

Goal

The migration involves large changes, so we want a manageable path:

  • We want to break the migration into multiple PRs of a size reasonable for review.
  • We want to reduce the operational cost the k8s community needs to bear during the migration.

[Abandoned] Approach 1: migrating one component at a time

For example, first convert kube-controller-manager (#29934) to use client-go, then convert kubelet, and so on.

The problem: components migrated to client-go will use the versioned types defined in client-go, e.g., client-go#v1.Pod (client-go is the repo name, v1 is the package name), while the remaining components keep using the unversioned types defined in the main repository, e.g., kubernetes#api.Pod. The difference in types creates a dilemma for the packages shared by both groups of components, e.g., pkg/volume. We’ve considered two workarounds, described in the subsections below.
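For illustration only, here is a minimal Go sketch of the type clash; the import paths and the helper are placeholders rather than code from the tree (the client-go vendoring path in particular changed across releases):

```go
package volumeutil // hypothetical shared package, e.g. something under pkg/volume

import (
	clientv1 "k8s.io/client-go/pkg/api/v1" // versioned types vendored in client-go (path is era-dependent)
	api "k8s.io/kubernetes/pkg/api"        // internal (unversioned) types
)

// PodIsReady is an existing helper written against the internal type and
// used by components that have not been migrated.
func PodIsReady(pod *api.Pod) bool {
	for _, c := range pod.Status.Conditions {
		if c.Type == api.PodReady {
			return c.Status == api.ConditionTrue
		}
	}
	return false
}

// A migrated component holds *clientv1.Pod, so it cannot call PodIsReady;
// approach 1.1 below duplicates the helper against the versioned type instead.
func PodIsReadyVersioned(pod *clientv1.Pod) bool {
	for _, c := range pod.Status.Conditions {
		if c.Type == clientv1.PodReady {
			return c.Status == clientv1.ConditionTrue
		}
	}
	return false
}
```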

[Abandoned] Approach 1.1: Make two copies of the shared dependencies

We will take this approach if the community can bear the disadvantages.

We make two copies of the shared dependencies: one keeps using kubernetes#api.Pod, the other uses client-go#v1.Pod. We will provide a script to do the copy. I experimented with migrating only kube-controller-manager; it turned out we would need to maintain copies of at least these packages:

  • pkg/util/node (1 file)
  • pkg/util/system (one liner)
  • pkg/volume (147 files) @saad-ali
  • pkg/serviceaccount (5 files)
  • pkg/fieldpath (3 files)
  • pkg/controller/deployment/util (2 files) @janetkuo

Disadvantages:

  • We need to add a verify script to force developers who make changes to the original packages to run the update script to keep the copies in sync.

[Abandoned] Approach 1.2: Converting before calling shared packages

We convert client-go#v1.Pod to kubernetes#api.Pod before passing the object to shared packages. The conversion requires two steps: (i) use the unsafe package to forcefully cast client-go#v1.Pod to kubernetes#v1.Pod; (ii) use api.Scheme to convert to kubernetes#api.Pod. A sketch follows below.
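For illustration only, a minimal sketch of this two-step conversion; the import paths, the helper name, and the exact api.Scheme.Convert signature are assumptions that varied across releases at the time:

```go
package conversionsketch

import (
	"unsafe"

	clientv1 "k8s.io/client-go/pkg/api/v1" // versioned types vendored in client-go (path is era-dependent)
	api "k8s.io/kubernetes/pkg/api"        // internal types and api.Scheme
	kubev1 "k8s.io/kubernetes/pkg/api/v1"  // versioned types in the main repo
)

// toInternalPod converts a client-go v1.Pod into an internal api.Pod.
func toInternalPod(in *clientv1.Pod) (*api.Pod, error) {
	// Step i: the two v1.Pod structs are generated from the same source and share
	// an identical memory layout, so an unsafe cast avoids a deep copy.
	asMainRepoV1 := (*kubev1.Pod)(unsafe.Pointer(in))

	// Step ii: let the scheme convert the versioned object to the internal type.
	// (The Convert signature has differed across releases; a nil context is assumed here.)
	out := &api.Pod{}
	if err := api.Scheme.Convert(asMainRepoV1, out, nil); err != nil {
		return nil, err
	}
	return out, nil
}
```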

Disadvantages:

  • Converting to api.Pod will apply the defaults. This may cause problems when a controller creates API objects.
  • Conversion is inefficient.

[Abandoned] Approach 2: migrating in a developing branch

We’ll take this approach if the community finds the operational cost of the first approach too high. We will create a development branch, accumulate PRs and reviews there, and finally merge it into the master branch once the migration is done for all binaries.

Disadvantages:

  • The development branch will be in an uncompilable state until most components are migrated.
  • The final merge may be challenging. I will try to reduce the workload by:
    • capturing most rewrites in a script
    • finishing the migration during the 1.5 code freeze

[Picked] Approach 3: migrating to the versioned clients during the code freeze

Details are in this comment (the Oct 31, 2016 update below).

Please let us know if you can bear the cost of approach 1.1.

@kubernetes/sig-api-machinery @smarterclayton @deads2k @liggitt

@caesarxuchao caesarxuchao added the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Oct 19, 2016
@caesarxuchao caesarxuchao self-assigned this Oct 19, 2016
saad-ali (Member) commented:

CC @kubernetes/sig-storage

smarterclayton (Contributor) commented:

Have we evaluated whether all those components can use the versioned objects (i.e. have we tested selective refactors to verify there are no additional problems that might surface)?

caesarxuchao (Member, Author) commented:

I converted some controllers during the 1.2 or 1.3 release and they passed all unit and e2e tests; I only needed to fix some issues in the test framework.

I'm experimenting locally with approach 1.1 to convert kube-controller-manager; I'll report the results.

I took a look at kubelet; it seems we can convert it.

deads2k (Contributor) commented Oct 20, 2016

I'm still more concerned about silently dropping fields we don't know about than I am about the conversion cost, particularly since that cost is amortized over all controllers (shared informers), and I was actually fairly convinced by the idea of an unsafe cast where it is safe.

Using external clients in controllers gives the appearance of resolving the field-stomping problem without actually addressing it. I've seen @lavalamp's plan for handling it and I think it can work. Is that moving ahead concurrently with this?

caesarxuchao (Member, Author) commented:

@deads2k, moving to versioned (or, in your terms, "external") clients will NOT solve the field-stomping problem. Adding a new field can be a compatible API change that does not bump the API version, so API versioning does not prevent an old client from dropping the new field. Could you point me to @lavalamp's proposal?

liggitt (Member) commented Oct 20, 2016

at least with the current structure, we know that for a given build, all the controllers agree with the api server on the latest version of things. it sounds like this could result in a mish-mash of different controllers coded against different client versions in the same build.

caesarxuchao (Member, Author) commented:

@liggitt a version of client-go only supports one version of the clientset, and a clientset includes only one version of each API group, so currently we won't have the problem you described.

deads2k (Contributor) commented Oct 20, 2016

Could you point me to @lavalamp's proposal?

#34508 (comment) item 2.

caesarxuchao (Member, Author) commented:

Thanks @deads2k. Still, I think moving to client-go will proceed in parallel with that effort.

liggitt (Member) commented Oct 20, 2016

one version of each group of API

one apiVersion (e.g. "v1") or one released version (e.g. "client-go-1.4.0")?

if all controllers have to move to a new version together, what is the benefit of making them speak an external version, rather than an internal version?

how will we support upgrades where we want one controller to speak to a new API in a new version (like scheduledjobs in batch/v2alpha1), but existing controllers still need to speak to the old version for compatibility with existing masters during a rolling update? doesn't this cut off our ability to have sparse versions?

caesarxuchao (Member, Author) commented Oct 20, 2016

one apiVersion (e.g. "v1") or one released version (e.g. "client-go-1.4.0")?

One apiVersion per group.

if all controllers have to move to a new version together, what is the benefit of making them speak an external version, rather than an internal version?

This almost convinced me we should support multiple apiVersions of a group in a clientset.

it sounds like this could result in a mish-mash of different controllers coded against different client versions in the same build.

Then what do you mean by client versions here? If it means apiVersions, does it contradict your other question?

how will we support upgrades where we want one controller to speak to a new API in a new version (like scheduledjobs in batch/v2alpha1), but existing controllers still need to speak to the old version for compatibility with existing masters during a rolling update?

I thought we upgraded controllers and the api-server at the same time?

[edit] I do see the problem with supporting only one version: we cannot convert the job controller to use batch/v2alpha1 because that version might be disabled, while the scheduledjob controller needs to use v2alpha1.

caesarxuchao (Member, Author) commented:

how will we support upgrades where we want one controller to speak to a new API in a new version (like scheduledjobs in batch/v2alpha1), but existing controllers still need to speak to the old version for compatibility with existing masters during a rolling update

According to the current design (https://github.com/fgrzadkowski/kubernetes/blob/038418fb14568bfda8e334ae315501c5ea504e0a/docs/design/ha_master.md#upgrades), controllers only talk to the local api server, so it's not a problem:

Requests from scheduler (and controllers) go to a local apiserver via localhost interface, so both components will be in the same version.

smarterclayton (Contributor) commented:

That's going to go away very soon, so we shouldn't depend on it.

Some other options:

  • move the controllers out of kubernetes/kubernetes and copy the libraries as we go, moving them slowly (this is similar to your option 1.1)
  • move kubelet out of kubernetes/kubernetes
  • move kubectl out of kubernetes/kubernetes

We could potentially pick off libraries we want to reuse into their own repos, or let the clients vendor them.


caesarxuchao (Member, Author) commented:

@smarterclayton I think the options you gave have the same operational cost as option 1.1. With controllers/kubelet in their own repos, we still have duplicate packages for different API versions; they are just in different repos. We also still need a mechanism to keep them in sync.

I think I can go with option 2. Its cost is that the rebase could be huge. I can try to finish the work during the 1.5 code freeze and get it merged as the first PR after the freeze.

@caesarxuchao caesarxuchao changed the title [RFC] Migrating kubernetes components to use client-go in a managable way [RFC] Migrating kubernetes components to use client-go Oct 31, 2016
caesarxuchao (Member, Author) commented Oct 31, 2016

Background

Currently, client-go is populated in two steps:

  • In the main repo, someone runs /staging/copy.sh to update /staging
  • A robot copies /staging literally to client-go

This process will continue for a while, until we have the proper test infra. Developers cannot send PRs to client-go until then. Hence, we are actually migrating the main repo to use the /staging area.

Problem

After the migration, the development experience will be poor. Imagine a developer finds a problem with the clients: he will find the code in /staging, but he cannot change it directly; he needs to find where the code is copied from, update that code, and then run copy.sh.

Solution

During the code freeze, we migrate the main repo to use the versioned clients inside the main repo. After the freeze, we gradually migrate to client-go. The benefits are:

smarterclayton (Contributor) commented:

SGTM

k8s-github-robot pushed a commit that referenced this issue Nov 7, 2016
Automatic merge from submit-queue

[RFC] Prepare for deprecating NodeLegacyHostIP

Ref #9267 (comment)

*What this PR does*
- Add comments saying "LegacyHostIP" will be deprecated in 1.7;
- Add v1.NodeLegacyHostIP to be consistent with the internal API (useful for client-go migration #35159)
- Let cloudproviders who used to only set LegacyHostIP set the IP as both InternalIP and ExternalIP
- Master used to ssh tunnel to a node's ExternalIP or LegacyHostIP to do the [healthz check](https://github.com/kubernetes/kubernetes/blame/master/pkg/master/master.go#L328-L332). OTOH, on-prem kubelets only [set](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet_node_status.go#L430-L431) LegacyHostIP or InternalIP. In order to deprecate LegacyHostIP in 1.7, I let the healthz check use InternalIP if ExternalIP is not available. (The healthz check is the only consumer of LegacyHostIP in k8s.)

@liggitt @justinsb @bgrant0607 

```release-note
LegacyHostIP will be deprecated in 1.7.
```
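For context, a hedged sketch of the healthz-address fallback described in the PR; the helper name, selection details, and import path are illustrative, not the PR's actual code:

```go
package healthzsketch

import v1 "k8s.io/kubernetes/pkg/api/v1" // versioned node types (path is era-dependent)

// pickHealthzAddress prefers a node's ExternalIP and falls back to InternalIP,
// so the master's healthz check no longer needs LegacyHostIP.
func pickHealthzAddress(node *v1.Node) (string, bool) {
	var internal string
	for _, addr := range node.Status.Addresses {
		switch addr.Type {
		case v1.NodeExternalIP:
			return addr.Address, true // prefer the external IP when present
		case v1.NodeInternalIP:
			if internal == "" {
				internal = addr.Address
			}
		}
	}
	if internal != "" {
		return internal, true // on-prem nodes may only report an InternalIP
	}
	return "", false
}
```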