Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Convert Flyte deployment from Kustomize to Helm! #299

Closed
3 of 23 tasks
wild-endeavor opened this issue Apr 28, 2020 · 13 comments
Closed
3 of 23 tasks

[Feature] Convert Flyte deployment from Kustomize to Helm! #299

wild-endeavor opened this issue Apr 28, 2020 · 13 comments
Assignees
Labels
enhancement New feature or request good first issue Good for newcomers helm help wanted Extra attention is needed
Milestone

Comments

@wild-endeavor
Copy link
Contributor

wild-endeavor commented Apr 28, 2020

Motivation: Why do you think this is important?
#298 is incomplete. Finish it.

Goal: What should the final outcome look like, ideally?
In no particular order:

  • Documentation
  • Create a separate RDS instance for data catalog, or figure out how to add a second database in the Admin RDS instance.
  • Test that things actually work with gRPC clients through the ALB and fix if they don't.
  • Refactor the terraform around the IAM webhook bits into a separate submodule
  • Figure out a nicer way to get the output of terraform into the kustomize overlays.
  • Figure out a nice or at least easier way for users to plug in their own information like aws account number into the kustomize overlays.
  • Move the db password into a TF input variable.
  • Create node groups and separate out the Flyte components to run on a separate set of servers.
  • Create sample service accounts for users to allow the webhook to work (service account needs to be patched) and see if it's possible to get Admin to run that patch as part of the cluster resource manager.
  • Figure out and document how to get the K8s dashboard running.

Describe alternatives you've considered
NA

Flyte component

  • Overall
  • Flyte Setup and Installation scripts
  • Flyte Documentation
  • Flyte communication (slack/email etc)
  • FlytePropeller
  • FlyteIDL (Flyte specification language)
  • Flytekit (Python SDK)
  • FlyteAdmin (Control Plane service)
  • FlytePlugins
  • DataCatalog
  • FlyteStdlib (common libraries)
  • FlyteConsole (UI)
  • Other

Additional context
NA
Is this a blocker for you to adopt Flyte
NA

@wild-endeavor wild-endeavor added enhancement New feature or request untriaged This issues has not yet been looked at by the Maintainers labels Apr 28, 2020
@rstanevich
Copy link
Contributor

discussed with @wild-endeavor in Slack

- Test that things actually work with gRPC clients through the ALB and fix if they don’t.

From our experience with AWS - gRPC unfortunately doesn’t work with ALB. ALB will handle HTTP traffic, but it doesn't allow even register workflow in Flyte.
So, a few months we used Contour Ingress (like in flyte sandbox) attached to ELB only for Flyte.
And recently we connected Flyte ingress to cluster Istio IngressGateway attached to ELB.
As far as I know it should work correctly with Nginx Ingress too.

- Figure out a nice or at least easier way for users to plug in their own information like aws account number into the kustomize overlays:

We are running Flyte on 2 different environments: like staging and production. We have equivalent configuration structure but different values for DB name, S3 bucket, SQS, Cloudwatch, Vault Secrets, Flyte DNS name and etc.
And it was not convenient enough to have a lot of copy paste in configuration. We run Flyte on staging with end2end tests and our tests workflow to make sure that our configuration will be consistent for production environment.
So, we are not big fans of Helm but recently I though about it how it can help us. And, actually, there are a lot of pros for this: in Helm’s values.yaml we are able to set if we’d like using scheduler/notifications or enable/disable spark webhook, we can set nodeSelectors and tolerations for flyte pods, put account and region in one place.
But we’d like to still use updates from lyft/flyte github repo in our kustomization, so we’ve implemented this 'dirty' solution =)
All variable values in overlays manifests we replaced with placeholders, and it looks like we generate own flyte_generated.yaml but with placeholder. The peace of our flyte_generated/_template.yaml:

...
    scheduler:
      eventScheduler:
        scheme: aws
        region: "${AWS_REGION}"
        scheduleRole: "arn:aws:iam::${AWS_ACCOUNT_ID}:role/${AWS_SCHEDULER_ROLE}"
        targetName: "arn:aws:sqs:${AWS_REGION}:${AWS_ACCOUNT_ID}:${AWS_SCHEDULER_SQS_NAME}"
        scheduleNamePrefix: ${AWS_SCHEDULER_CLOUDWATCH_PREFIX}
      workflowExecutor:
        scheme: aws
        region: "${AWS_REGION}"
        scheduleQueueName: "${AWS_SCHEDULER_SQS_NAME}"
        accountId: "${AWS_ACCOUNT_ID}"
        reconnectAttempts: 10
        reconnectDelaySeconds: 30
    notifications:
      type: aws
      region: "${AWS_REGION}"
      publisher:
        topicName: "arn:aws:sns:${AWS_REGION}:${AWS_ACCOUNT_ID}:${AWS_NOTIFICATIONS_SNS_NAME}"
      processor:
        queueName: "${AWS_NOTIFICATIONS_SQS_NAME}"
        accountId: "${AWS_ACCOUNT_ID}"
      emailer:
        sender:  "${AWS_NOTIFICATIONS_SENDER}"
...

And after that we populate these placeholders from staging.ini or prod.ini files using envsubst. Maybe it can look strange - we found it convenient for this specific case.
*.ini files just contains infrastructire parameters:

...
AWS_ACCOUNT_ID=123454342
AWS_REGION=us-east-1
AWS_S3_BUCKET=s3bucket-for-flyte-staging
AWS_NOTIFICATIONS_SNS_NAME=flyte-staging-notifications-topic
AWS_NOTIFICATIONS_SQS_NAME=flyte-staging-notifications-queue
[email protected]
AWS_SCHEDULER_ROLE=flyte_cron_scheduler_role
AWS_SCHEDULER_SQS_NAME=flyte-staging-cron-scheduler-queue
...

My 2 cents about:

  • Scheduling and Notifications:
    The Flyte also has nice implementation of cron scheduling.
    But when we just started it requires some time to figure out and setup. So maybe in future it would be good having terraform definition here for SQS/SNS/Cloudwatch. By the way, in Confluence we added article Flyte Integration: Scheduler and Notifications which contains some notes about thes features and definitions. At least, if we'd have the scheme like in the article it would be easier to setup this first time :)

  • Secrets management (for Flyte and its Workflows)
    Hashicorp Vault and its plugin https://github.com/hashicorp/vault-k8s allows to deliver secrets in workflow's container. This configurations are described in Pod annotations.
    And Johny Burns gave an idea to use annotations argument in LaunchPlan - since it became very convenient approach for our Flyte workflows.

@kumare3 kumare3 added good first issue Good for newcomers help wanted Extra attention is needed labels Jul 6, 2020
@rstanevich
Copy link
Contributor

Added PR with Helm chart for Flyte #550
My team has found using Helm as template engine is easy to configure and maintain Flyte installation.
For now we are running it on development cluster and going to configure the main internal Flyte installation with this.

@kumare3
Copy link
Contributor

kumare3 commented Oct 15, 2020

@rstanevich this is awesome stuff, This is what one of the community members sent me https://www.youtube.com/watch?v=pRG47EQ5OAg&t=761s. Its a combination approach of Kustomize + Helm Charts for great flexibility

@kumare3 kumare3 linked a pull request Oct 15, 2020 that will close this issue
@rstanevich
Copy link
Contributor

Hi @kumare3, thanks for sharing information about this tool!

So, as I see replicatedhq/ship is a good find when we'd like using some existing OSS Helm chart (e.g. Prometheus), but it does not allow changing some components (e.g. serviceType or additional initContainers if there are no placeholders for this in template). And in this case, they suggest patching the generated Kubernetes manifest from the Helm chart with the desired modifications or even adding new components.
But on the other hand, if these changes are necessary, the Helm chart in the repo can be updated for the new release version.

This may be wrong, but as far as I understand it makes sense using ship on the "customer" side when you want to use Helm as a templating engine, but you need to add some new custom components to the installation - e.g. in my case this can be adding CustomResources for Flyte Ingress configuration with Istio.

Then yes, it gives more flexibility. But lt useful for small and very custom patches.

@kumare3
Copy link
Contributor

kumare3 commented Oct 20, 2020

@rstanevich hmm interesting. Thank you for this. Is it ok if I spend next week digging into this and getting myself acquainted with Helm. Right now I do not feel that I can make an educated decision

@rstanevich
Copy link
Contributor

Thank you @kumare3
So, we installed our main Flyte installation with Helm chart a few days ago.
The main benefit I see - it is the configuration in only one file values-<cluster-name>.yaml
The main con - If templates don't define some applications - you can only add it in the different installation or using the tools like ship. The good example is spark-history-server. You should define it in the Helm chart templates to generate deployment manifest with Helm.

@kumare3
Copy link
Contributor

kumare3 commented Jan 29, 2021

@rstanevich I am convinced that we need to use helm. @sbrunk has volunteered to help with it. I will also work with him to complete this. I think that is the best way to deploy Flyte

@kumare3 kumare3 changed the title [Feature] Finish EKS archetype [Feature] Convert Flyte deployment from Kustomize to Helm! Jan 29, 2021
@kumare3 kumare3 added this to the 0.11.0 milestone Jan 29, 2021
@kumare3 kumare3 removed the untriaged This issues has not yet been looked at by the Maintainers label Feb 11, 2021
@kumare3 kumare3 modified the milestones: 0.11.0, 0.12.0 Feb 16, 2021
@EngHabu
Copy link
Contributor

EngHabu commented Apr 1, 2021

@sbrunk would it make sense to plan to release this on stages? maybe the sandbox config first... etc.?

@EngHabu EngHabu modified the milestones: 0.12.0, 0.13.0 Apr 1, 2021
@sbrunk
Copy link
Member

sbrunk commented Apr 1, 2021

@EngHabu I think the sandbox config is in good shape already. See #784. EKS should also be pretty close but I can't test it.

But GKE needs more work and perhaps we also need extra configs like the headless and test overlays in kustomize. So perhaps get sandbox and EKS ready and do the others in a subsequent PR?

@EngHabu
Copy link
Contributor

EngHabu commented Apr 1, 2021

Is it ready to be tested? I can help with that...
How will we maintain kustomize and helm moving forward? will they be able to share most of the base manifests? or are you proposing we delete kustomize altogether?

@sbrunk
Copy link
Member

sbrunk commented Apr 6, 2021

Yes it's ready to be tested, please give it a try.

I don't think it's possible to share manifests between kustomize and helm because the syntax is not compatible. Helm uses its own templating language over yaml while kustomize works on raw yaml & patches.

The question about whether to maintain both chart and kustomize overlays or not is something I can't really answer alone. I guess it depends on peoples requirements, maintenance costs so we need to discuss the trade-offs first.

@kumare3
Copy link
Contributor

kumare3 commented Apr 14, 2021

@kumare3 kumare3 modified the milestones: 0.14.0, 0.15.0 Jun 6, 2021
@kumare3
Copy link
Contributor

kumare3 commented Jun 6, 2021

This is now merged and part of -0.14.0 release. We will still keep this feature in advanced beta.

@kumare3 kumare3 closed this as completed Jun 6, 2021
@sbrunk sbrunk added the helm label Jun 15, 2021
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Dec 6, 2022
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Dec 6, 2022
Signed-off-by: Flyte-Bot <[email protected]>

Co-authored-by: flyte-bot <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Dec 20, 2022
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Aug 9, 2023
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Aug 21, 2023
Signed-off-by: Flyte-Bot <[email protected]>

Co-authored-by: flyte-bot <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue Aug 21, 2023
troychiu pushed a commit that referenced this issue Jul 8, 2024
…#299)

* Using InMmeory token cache for admin clientset in propeller

* pass cache to auth interceptor
pvditt pushed a commit that referenced this issue Aug 2, 2024
…#299)

* Using InMmeory token cache for admin clientset in propeller

* pass cache to auth interceptor

Signed-off-by: Paul Dittamo <[email protected]>
wild-endeavor pushed a commit that referenced this issue Aug 20, 2024
pmahindrakar-oss added a commit that referenced this issue Sep 9, 2024
…#299) (#5621)

Signed-off-by: Paul Dittamo <[email protected]>
Co-authored-by: Prafulla Mahindrakar <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers helm help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants