Add OpenShift KFDef #4

Merged



@vpavlin vpavlin commented Dec 10, 2019

Which issue is resolved by this Pull Request:
Avoid running oc commands when deploying Kubeflow on OpenShift

Description of your changes:
The proper way to extend the list of deployment targets for Kubeflow is to provide a KFDef configuration. So far we have deployed KF to OpenShift by running a long series of oc commands, which diverges from the normal KF deployment process.

This PR aligns the KF on OpenShift deployment with the standard KF deployment process.

NOTE: It is still a work in progress, so the KFDef file only deploys a few components, but we can merge it soon and extend it with other PRs to make testing and deploying simpler.

How to try

Check out the PR

sed -i 's#uri: .*#uri: '"$PWD"'#' ./kfdef/kfctl_openshift.yaml
kfctl build --file=kfdef/kfctl_openshift.yaml
kfctl apply --file=./kfdef/kfctl_openshift.yaml

You should see the non-commented components in kfctl_openshift.yaml running successfully.
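
A minimal sketch of what the sed step in "How to try" does, run against a throwaway stand-in file rather than the real kfctl_openshift.yaml. The YAML fragment, the file name kfdef_sample.yaml, and the placeholder path are assumptions made up for illustration; only the substitution itself comes from the description above. It assumes GNU sed (BSD sed needs a different -i syntax).

```shell
#!/usr/bin/env bash
# Sketch: show the effect of the uri rewrite on a hypothetical stand-in file.
# The YAML below is NOT the real kfctl_openshift.yaml.
set -eu

workdir="$(mktemp -d)"
sample="$workdir/kfdef_sample.yaml"

cat > "$sample" <<'EOF'
repos:
- name: manifests
  uri: /some/other/machine/manifests
EOF

# Same substitution as in the PR description, with $PWD quoted so that a
# path containing spaces stays intact. A path containing '#' or '&' would
# still need escaping before being used as a sed replacement.
sed -i 's#uri: .*#uri: '"$PWD"'#' "$sample"

# Read the value back to confirm it now points at the current directory.
rewritten_uri="$(sed -n 's/^ *uri: //p' "$sample")"
echo "uri is now: $rewritten_uri"
```

Because the pattern starts at "uri:", the indentation in front of the key is left untouched, and every line containing "uri: " in the file is rewritten.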

@crobby

crobby commented Dec 16, 2019

Trying this out: when I do kfctl build --file=kfdef/kfctl_openshift.yaml, I get...

couldn't generate KfApp: (kubeflow.error): Code 500 with message: could not sync cache. Error: (kubeflow.error): Code 400 with message: couldn't download URI /home/vpavlin/devel/github.com/vpavlin/manifests Error stat /home/vpavlin/devel/github.com/vpavlin/manifests: no such file or directory

@crobby

crobby commented Dec 16, 2019

Looks like it has a reference to your local filesystem:

uri: /home/vpavlin/devel/github.com/vpavlin/manifests
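
A pre-flight check along these lines would catch a stale local path before kfctl build does. This is only a sketch: check_local_uris is a hypothetical helper, not part of kfctl or this PR, and it only looks at lines whose key is literally `uri:` with an absolute path as the value.

```shell
# Hypothetical helper: fail early if a local `uri:` in a KFDef-style file
# points at a directory that does not exist on this machine.
check_local_uris() {
  file="$1"
  status=0
  # Extract the value of every `uri:` line; only absolute paths are checked,
  # so remote URIs (https://...) are ignored.
  while IFS= read -r uri; do
    case "$uri" in
      /*)
        if [ ! -d "$uri" ]; then
          echo "missing local path: $uri" >&2
          status=1
        fi
        ;;
    esac
  done <<EOF
$(sed -n 's/^[[:space:]]*uri:[[:space:]]*//p' "$file")
EOF
  return "$status"
}

# Usage (commented out because it needs the checked-out repository):
# check_local_uris kfdef/kfctl_openshift.yaml && kfctl build --file=kfdef/kfctl_openshift.yaml
```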

@crobby

crobby commented Dec 16, 2019

Changed the uri to my local filesystem and did:
kfctl build --file=/kfctl_openshift.yaml
kfctl apply --file=/kfctl_openshift.yaml
Seems to be doing something now, but I am seeing the following warning:

WARN[0047] Encountered error during apply: (kubeflow.error): Code 500 with message: Apply.Run Error unable to recognize "/tmp/kout122983657": no matches for kind "Application" in version "app.k8s.io/v1beta1" filename="kustomize/kustomize.go:183"

@vpavlin
Author

vpavlin commented Dec 17, 2019

Ah, right

uri: /home/vpavlin/devel/github.com/vpavlin/manifests

I updated the description to account for this

WARN[0047] Encountered error during apply: (kubeflow.error): Code 500 with message: Apply.Run Error unable to recognize "/tmp/kout122983657": no matches for kind "Application" in version "app.k8s.io/v1beta1" filename="kustomize/kustomize.go:183"

Damn :) Yeah, this is because of the application overlays. I think we can remove those from the OpenShift KFDef, as we do not really have a use for that CR.

@crobby

crobby commented Dec 17, 2019

Still seeing the following with the latest version and a fresh cluster.
WARN[0042] Encountered error during apply: (kubeflow.error): Code 500 with message: Apply.Run Error unable to recognize "/tmp/kout757664611": no matches for kind "Application" in version "app.k8s.io/v1beta1" filename="kustomize/kustomize.go:183"
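
The "no matches for kind" warning means the API server has no registered resource for that kind and version. One way to confirm this, sketched below, is to look for the Application CRD directly. The helper name has_application_crd is hypothetical; the sketch assumes the oc client is installed and logged in (kubectl accepts the same arguments), and it only checks that the CRD exists, not which versions it serves.

```shell
# Sketch: report whether the Application CRD is registered on the cluster.
# has_application_crd is a hypothetical helper; `oc` must be logged in.
has_application_crd() {
  oc get crd applications.app.k8s.io >/dev/null 2>&1
}

# Only query the cluster when the oc client is actually installed.
if command -v oc >/dev/null 2>&1; then
  if has_application_crd; then
    echo "Application CRD found"
  else
    echo "Application CRD missing, resources of kind Application will be rejected"
  fi
fi
```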

@crobby

crobby commented Jan 2, 2020

Back to trying to get this working: I actually went ahead and added in the application-crd stuff (since other things will require it; tf-serving does) and I got farther than before. But now I wind up seeing: Deployment.apps "metadata-ui" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app":"metadata-ui", "kustomize.component":"metadata"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable]
Not sure what is happening just yet.

@crobby

crobby commented Jan 2, 2020

Hmm...I think I just answered my own question...sort of. It seems the second run against the existing deployment caused the issue. I blew away the existing deployment and it seems to have worked. That fixes my issue, but it might be a problem if others run into a "partially successful" install like the one I was dealing with at first. If I re-run against my fully successful install, it seems to be fine.

@crobby

crobby commented Jan 2, 2020

Seems like metadata-db is still failing: log message from the pod is:

mkdir: cannot create directory '/var/lib/mysql': Permission denied

@crobby

crobby commented Jan 3, 2020

Ok, after further review, my problem with metadata-db is related to my use of CRC and this issue: crc-org/crc#814. The cheap workaround is to change the metadata-db deployment to use emptyDir: {} instead of persistentVolumeClaim. That seems to yield a filesystem that allows the database to start up.
A deployment of OpenShift elsewhere, like AWS, should not have this issue.

@vpavlin
Author

vpavlin commented Jan 8, 2020

We will now be creating 2 SCCs for the OpenShift deployment:

  1. kubeflow-anyuid-istio - for Istio-related SAs (I guess this will be made obsolete by Integrating RH Service Mesh with Kubeflow #8)
  2. kubeflow-anyuid-$(NAMESPACE) - for SAs in the KF namespace, so that KF can be deployed to a namespace that is not named kubeflow, and potentially into multiple namespaces at the same time

I also added an SA metadatadb for metadata-db, so that we don't need anyuid for the default SA.
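
To illustrate the naming scheme in the list above, the sketch below derives the two SCC names for a given namespace. It assumes $(NAMESPACE) is replaced verbatim by the target namespace; scc_names_for is a hypothetical helper and the namespace my-team is made up. The cluster lookup only runs when the oc client is installed.

```shell
# Sketch: derive the two SCC names described above for a given namespace.
# ASSUMPTION: $(NAMESPACE) is substituted verbatim with the namespace name.
scc_names_for() {
  namespace="$1"
  echo "kubeflow-anyuid-istio"
  echo "kubeflow-anyuid-${namespace}"
}

# Two installs in different namespaces share the Istio SCC but get their
# own namespace-scoped SCC name.
scc_names_for kubeflow
scc_names_for my-team

# Optionally confirm the SCCs exist on a live cluster.
if command -v oc >/dev/null 2>&1; then
  for scc in $(scc_names_for kubeflow); do
    oc get scc "$scc" >/dev/null 2>&1 || echo "SCC not found: $scc"
  done
fi
```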

@crobby

crobby commented Jan 8, 2020

Ok, I was able to run this and everything I poked at seems happy. (the only thing I tweaked besides the path was the Volume definition for metadata-db due to the CRC bug).
Ideally, we come up with a solution so that your personal directory structure doesn't exist in what's committed, but I think this is in a good/merge-able place for our current purposes.

@vpavlin vpavlin force-pushed the openshift/kfdef branch 3 times, most recently from 6f7d4f6 to 256b461 January 10, 2020 12:36
@vpavlin
Author

vpavlin commented Jan 10, 2020

Just hit this issue: kubeflow/kubeflow#4642

It should not block us from merging, but it is something to keep in mind, as the repo download will probably not work when people try it out with the proper URI.

@vpavlin vpavlin changed the title from "WIP: Add OpenShift KFDef" to "Add OpenShift KFDef" Jan 10, 2020
@crobby

crobby commented Jan 10, 2020

New update lgtm. I will merge this.

@crobby crobby self-requested a review January 10, 2020 14:50
@crobby crobby merged commit 96e8502 into opendatahub-io:v0.7-branch-openshift Jan 10, 2020