Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Spark pod template overrides #4183

Merged
merged 7 commits into from
Oct 11, 2023
Merged

Conversation

andrewwdye
Copy link
Contributor

@andrewwdye andrewwdye commented Oct 6, 2023

Tracking issue

#4105

Describe your changes

Create driver, executor, and spark application spec from optional pod template. Uses ToK8sPodSpec as a shared solution for merging defaults with container or pod template target overrides. Driver and executor specs will share the same pod template values. See note below on this choice.

Why use the same pod template for driver and executor?
As part of #4179 we had a discussion about whether or not to use the task-level pod template, or introduce role-specific pod templates for driver and executor. For now this change adopts the task-level pod template pattern used by other k8s plugins. We may end up moving this and others to role-specific templates in the future.

Test plan

Unit tests
I refactored, extended, and renamed the existing container based Spark unit test (TestBuildResourceContainer). I added a new TestBuildResourcePodTemplate test to cover most of the interface.

Sanity check diff of driver and executor pods locally

  1. Setup and build single binary
  2. Add spark to enabled-plugins to the config following this guide
  3. Add spark role, role binding, and service account template files from this guide to $HOME/.flyte/cluster-resource-templates/
  4. Register and run a spark workflow following this guide
  5. Compare driver and executor pods between master and this branch

Verify behavior of specifying @task(pod_template=...) locally
Modified the my_spark example to use a pod template

@task(
    ...
    pod_template=PodTemplate(
        primary_container_name="primary",
        labels={"lKeyA": "lValA", "lKeyB": "lValB"},
        annotations={"aKeyA": "aValA", "aKeyB": "aValB"},
        pod_spec=V1PodSpec(
            containers=[
                V1Container(
                    name="primary",
                    image=custom_image,
                    command="echo",
                    args=["wow"],
                ),
            ],
            tolerations=[
                V1Toleration(
                    key="x/custom",
                    operator="Equal",
                    value="foo",
                    effect="NoSchedule",
                ),
            ],
        ),
    ),
)

And verified that the resulting pod had tolerations set

❯ kubectl get pod -n flytesnacks-development a68zfbjq8r74mjb9q5tq-n0-0-driver -o yaml | grep "tolerations:" -A 12
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  - effect: NoSchedule
    key: x/custom
    operator: Equal
    value: foo

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

@codecov
Copy link

codecov bot commented Oct 6, 2023

Codecov Report

Attention: 27 lines in your changes are missing coverage. Please review.

Comparison is base (26228bd) 59.46% compared to head (336fc8f) 59.26%.
Report is 1 commits behind head on master.

❗ Current head 336fc8f differs from pull request most recent head cb5994d. Consider uploading reports for the commit cb5994d to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4183      +/-   ##
==========================================
- Coverage   59.46%   59.26%   -0.20%     
==========================================
  Files         638      552      -86     
  Lines       54375    39769   -14606     
==========================================
- Hits        32336    23571    -8765     
+ Misses      19490    13867    -5623     
+ Partials     2549     2331     -218     
Flag Coverage Δ
unittests ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
flyteplugins/go/tasks/plugins/k8s/dask/dask.go 80.93% <100.00%> (+2.13%) ⬆️
...ns/go/tasks/pluginmachinery/flytek8s/pod_helper.go 70.00% <0.00%> (-4.11%) ⬇️
...asks/pluginmachinery/flytek8s/non_interruptible.go 0.00% <0.00%> (ø)
flyteplugins/go/tasks/plugins/k8s/spark/spark.go 76.87% <87.23%> (-0.49%) ⬇️

... and 572 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@andrewwdye andrewwdye force-pushed the spark-pod-templates-plugins branch from ad57897 to 4fe1ae8 Compare October 10, 2023 03:19
@andrewwdye andrewwdye changed the base branch from spark-pod-templates-idl to master October 10, 2023 03:21
@andrewwdye andrewwdye force-pushed the spark-pod-templates-plugins branch from 4fe1ae8 to b326791 Compare October 10, 2023 03:22
@andrewwdye andrewwdye requested a review from hamersaw October 10, 2023 03:30
@andrewwdye andrewwdye changed the title Implement Spark pod template tolerations Implement Spark pod template overrides Oct 10, 2023
Copy link
Contributor

@hamersaw hamersaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few nit questions, otherwise looks good. Thanks!

flyteplugins/go/tasks/plugins/k8s/spark/spark.go Outdated Show resolved Hide resolved
flyteplugins/go/tasks/plugins/k8s/spark/spark.go Outdated Show resolved Hide resolved
Signed-off-by: Andrew Dye <[email protected]>
@andrewwdye andrewwdye force-pushed the spark-pod-templates-plugins branch 2 times, most recently from cf09a48 to c5303f0 Compare October 10, 2023 17:45
@andrewwdye andrewwdye force-pushed the spark-pod-templates-plugins branch from c5303f0 to 6fc00ea Compare October 10, 2023 17:48
Signed-off-by: Andrew Dye <[email protected]>
Copy link
Contributor

@jeevb jeevb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Looks like this plugin might now be primed for GPU (and GPU affinity) support. A task for a later PR. Thanks!

jeevb
jeevb previously approved these changes Oct 10, 2023
hamersaw
hamersaw previously approved these changes Oct 10, 2023
@andrewwdye andrewwdye dismissed stale reviews from hamersaw and jeevb via de2e42a October 10, 2023 23:28
@andrewwdye andrewwdye requested a review from jeevb October 11, 2023 22:54
@andrewwdye andrewwdye merged commit 0ca2d22 into master Oct 11, 2023
@andrewwdye andrewwdye deleted the spark-pod-templates-plugins branch October 11, 2023 23:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants