Update Kubeflow Installation with Standalone Mode #3724

Merged: 26 commits, May 29, 2024
Commits
e4afb10
Update Kubeflow Installation with Standalone Components
andreyvelich Apr 28, 2024
72ef593
Add helpful message for manifests installation
andreyvelich Apr 29, 2024
4110705
Add explanation for Kubeflow Platform and Kubeflow Standalone Components
andreyvelich May 8, 2024
7420e03
Move Kubeflow explanation to introduction guide
andreyvelich May 10, 2024
26b262a
Add Spark Operator
andreyvelich May 10, 2024
6cf8f38
Add links to the introduction
andreyvelich May 10, 2024
15ed4c7
Remove Manifests WG link
andreyvelich May 13, 2024
a022b34
Modify table column
andreyvelich May 14, 2024
f9d5a52
Order components alphabeticaly
andreyvelich May 14, 2024
8fbad11
Update introduction
andreyvelich May 14, 2024
29391f8
Update Kubeflow intro
andreyvelich May 15, 2024
db5c542
Fix KFP install link
andreyvelich May 20, 2024
5c80be7
Review comments
andreyvelich May 20, 2024
78f683e
Add Kubeflow Platform Header
andreyvelich May 21, 2024
35ea615
Modify headers and text
andreyvelich May 21, 2024
a438b5a
Add Kubeflow Notebooks to Kubeflow Platform tools
andreyvelich May 22, 2024
2495604
Modify the install headers
andreyvelich May 22, 2024
cf7c50b
Update what are Kubeflow Standalone Components
andreyvelich May 22, 2024
224a638
Change to Standalone Kubeflow Components
andreyvelich May 22, 2024
fb14fb1
Add link to Kubeflow Platform
andreyvelich May 22, 2024
e18ac08
Rename Raw Manifests to Kubeflow Manifests
andreyvelich May 23, 2024
7df289c
Change H3
andreyvelich May 24, 2024
90df53d
Rename to quick and easy
andreyvelich May 25, 2024
63c3a61
Remove new line in Introduction
andreyvelich May 25, 2024
b0de894
Modify the install standalone section
andreyvelich May 27, 2024
7e7b91c
Fix install guide
andreyvelich May 28, 2024
186 changes: 160 additions & 26 deletions content/en/docs/started/installing-kubeflow.md
@@ -5,30 +5,161 @@ weight = 20

+++

## What is Kubeflow?
This guide describes how to install standalone Kubeflow components or Kubeflow Platform using packaged
distributions or Kubeflow manifests.

Kubeflow is an end-to-end Machine Learning (ML) platform for Kubernetes. It provides components for each stage in the ML lifecycle, from exploration through to training and deployment.
Operators can choose what is best for their users; there is no requirement to deploy every component.
Read [the introduction guide](/docs/started/introduction) to learn more about Kubeflow, standalone
Kubeflow components and Kubeflow Platform.

Learn more about Kubeflow in the [Introduction](/docs/started/introduction/) and
[Architecture](/docs/started/architecture/) pages.
## Installation Methods

## How to install Kubeflow?
You can install Kubeflow using one of these methods:

Anywhere you are running Kubernetes, you should be able to run Kubeflow.
There are two primary ways to install Kubeflow:
- [**Standalone Kubeflow Components**](#standalone-kubeflow-components)
- [**Kubeflow Platform**](#kubeflow-platform)

1. [**Packaged Distributions**](#packaged-distributions-of-kubeflow)
1. [**Raw Manifests**](#raw-kubeflow-manifests) <sup>(advanced users)</sup>
## Standalone Kubeflow Components

<a id="packaged-distributions"></a>
<a id="install-a-packaged-kubeflow-distribution"></a>
Some components in the [Kubeflow ecosystem](/docs/started/architecture/#conceptual-overview) may be
deployed as standalone services, without the need to install the full Kubeflow platform. You might
integrate these services as part of your existing AI/ML platform or use them independently.

## Packaged Distributions of Kubeflow
These components are a quick and easy method to get started with the Kubeflow ecosystem. They
provide flexibility to users who may not require the capabilities of a full Kubeflow Platform.

The following table lists Kubeflow components that may be deployed in a standalone mode. It also
lists their associated GitHub repository and
corresponding [ML lifecycle stage](/docs/started/architecture/#kubeflow-components-in-the-ml-lifecycle).

<div class="table-responsive distributions-table">
Member

We need to discuss the ordering of this table.

My proposal is we go by popularity/stars:

  1. Kubeflow Pipelines
  2. Kubeflow Spark Operator
  3. Kubeflow Training Operator
  4. Kubeflow Katib
  5. Kubeflow MPI Operator
  6. Kubeflow Model Registry (we should wait until the first public release)
  7. KServe (should be last, because it's an external add-on)

Member Author

What is the user value of ordering them by popularity, and how are we going to track component popularity in the future?
E.g. from my point of view, ML Lifecycle makes more sense since the order will always be the same, and we can link this table with Kubeflow ML Lifecycle: #3728.
If we don't like that, we can order them alphabetically.
@thesuperzapper @kubeflow/kubeflow-steering-committee @juliusvonkohout Thoughts?

Member

Lifecycle or alphabetical ordering is low maintenance.

Member

Agree on alphabetical ordering - politically correct and low maintenance.

<table class="table table-bordered">
<thead>
<tr>
<th>Component</th>
<th>ML Lifecycle Stage</th>
<th>Source Code</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<a href="https://kserve.github.io/website/master/admin/serverless/serverless">
KServe
</a>
</td>
<td>
Model Serving
</td>
<td>
<a href="https://github.com/kserve/kserve">
<code>kserve/kserve</code>
</a>
</td>
</tr>
<tr>
<td>
<a href="/docs/components/katib/installation/#installing-katib">
Kubeflow Katib
</a>
</td>
<td>
Model Optimization and AutoML
</td>
<td>
<a href="https://github.com/kubeflow/katib">
<code>kubeflow/katib</code>
</a>
</td>
</tr>
<tr>
<td>
<a href="/docs/components/model-registry/installation/#installing-model-registry">
Kubeflow Model Registry
</a>
</td>
<td>
Model Registry
</td>
<td>
<a href="https://github.com/kubeflow/model-registry">
<code>kubeflow/model-registry</code>
</a>
</td>
</tr>
<tr>
<td>
<a href="/docs/components/training/user-guides/mpi/#installation">
Kubeflow MPI Operator
</a>
</td>
<td>
All-Reduce Model Training
</td>
<td>
<a href="https://github.com/kubeflow/mpi-operator">
<code>kubeflow/mpi-operator</code>
</a>
</td>
</tr>
<tr>
<td>
<a href="/docs/components/pipelines/v2/installation/quickstart/">
Kubeflow Pipelines
</a>
</td>
<td>
ML Workflows and Schedules
</td>
<td>
<a href="https://github.com/kubeflow/pipelines">
<code>kubeflow/pipelines</code>
</a>
</td>
</tr>
<tr>
<td>
<a href="https://github.com/kubeflow/spark-operator/tree/master?tab=readme-ov-file#installation">
Kubeflow Spark Operator
</a>
</td>
<td>
Data Preparation
</td>
<td>
<a href="https://github.com/kubeflow/spark-operator">
<code>kubeflow/spark-operator</code>
</a>
</td>
</tr>
<tr>
<td>
<a href="/docs/components/training/installation/#installing-training-operator">
Kubeflow Training Operator
</a>
</td>
<td>
Model Training and Fine-Tuning
</td>
<td>
<a href="https://github.com/kubeflow/training-operator">
<code>kubeflow/training-operator</code>
</a>
</td>
</tr>
</tbody>
</table>
</div>
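
The alphabetical ordering agreed on in the review thread can be checked mechanically. The following is a minimal sketch, not part of the PR: the names `TABLE_ORDER` and `is_alphabetical` are hypothetical, and the list is copied by hand from the table above. It assumes the intended ordering ignores case, which is what the table reflects ("Model Registry" precedes "MPI Operator").

```python
# Component names in the order they appear in the table above.
TABLE_ORDER = [
    "KServe",
    "Kubeflow Katib",
    "Kubeflow Model Registry",
    "Kubeflow MPI Operator",
    "Kubeflow Pipelines",
    "Kubeflow Spark Operator",
    "Kubeflow Training Operator",
]


def is_alphabetical(names):
    """Return True if names are sorted alphabetically, ignoring case."""
    return list(names) == sorted(names, key=str.casefold)


if __name__ == "__main__":
    print(is_alphabetical(TABLE_ORDER))
```

A plain `sorted(TABLE_ORDER)` compares code points, so uppercase letters sort before lowercase ones and "Kubeflow MPI Operator" would land ahead of "Kubeflow Model Registry". Sorting with `str.casefold` as the key avoids that.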

## Kubeflow Platform

You can use one of the following methods to install the [Kubeflow Platform](/docs/started/introduction/#what-is-kubeflow-platform)
and get the full suite of Kubeflow components bundled together with additional tools.

Member

We should add an H2 heading which separates the "Kubeflow Platform" section:

Suggested change
## Kubeflow Platform
When deployed as a platform, Kubeflow provides a comprehensive set of tools for the entire ML lifecycle.
The key difference from standalone components is the [Kubeflow Central Dashboard](/docs/components/central-dash/overview/), which provides a multi-user interface for the entire platform.

Member Author

@thesuperzapper Do you want to move the Raw Manifest + Packaged Distribution installation under a Kubeflow Platform sub-section?
We don't need to explain what Kubeflow Platform is, since it is already explained here: https://deploy-preview-3724--competent-brattain-de2d6d.netlify.app/docs/started/introduction/#what-is-kubeflow-platform-.

Member

@andreyvelich we can mark this one as resolved, since we already did this in another commit.

However, we should still discuss what the "intro" text for this heading is, see https://github.com/kubeflow/website/pull/3724/files#r1608938065

### Packaged Distributions

Packaged distributions are maintained by various organizations and typically aim to provide
a simplified installation and management experience for Kubeflow. Some distributions can be
deployed on [all certified Kubernetes distributions](https://kubernetes.io/partners/#conformance),
a simplified installation and management experience for your **Kubeflow Platform**. Some distributions
can be deployed on [all certified Kubernetes distributions](https://kubernetes.io/partners/#conformance),
while others target a specific platform (e.g. EKS or GKE).

{{% alert title="Note" color="warning" %}}
@@ -200,12 +331,16 @@ The following table lists distributions which are <em>maintained</em> by their r
</table>
</div>

## Raw Kubeflow Manifests
### Kubeflow Manifests

The raw Kubeflow Manifests are aggregated by the [Manifests Working Group](https://github.com/kubeflow/community/tree/master/wg-manifests)
and are intended to be used as the **base of packaged distributions**.
The Kubeflow Manifests are aggregated by the Manifests Working Group and are intended to be
used as the **base of packaged distributions**.

Advanced users may choose to install the manifests for a specific Kubeflow version by following the
Kubeflow Manifests contain all Kubeflow Components, Kubeflow Central Dashboard, and other Kubeflow
applications that comprise the **Kubeflow Platform**. This installation is helpful when you want to
try out the end-to-end Kubeflow Platform capabilities.

Users may choose to install the manifests for a specific Kubeflow version by following the
instructions in the `README` of the [`kubeflow/manifests`](https://github.com/kubeflow/manifests) repository.

- [**Kubeflow 1.8:**](/docs/releases/kubeflow-1.8/)
Member

- [**Kubeflow 1.8:**](/docs/releases/kubeflow-1.8/)
  - [`v1.8-branch`](https://github.com/kubeflow/manifests/tree/v1.8-branch#installation) <sup>(development branch)</sup>
  - [`v1.8.0`](https://github.com/kubeflow/manifests/tree/v1.8.0#installation)
- [**Kubeflow 1.9:**](/docs/releases/kubeflow-1.9/)
  - [`v1.9-branch`](https://github.com/kubeflow/manifests/tree/v1.9-branch#installation) <sup>(development branch)</sup>
  - [`v1.9.0`](https://github.com/kubeflow/manifests/tree/v1.9.0#installation)

I think we should either reference the master or 1.9 branch, but 1.7 is not supported anymore.

Member Author

@juliusvonkohout Can we update it once we release Kubeflow 1.9?

Member

We should not link to unreleased versions.
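
The version list discussed in this thread follows a regular pattern: each entry links to `https://github.com/kubeflow/manifests/tree/<ref>#installation`. As a rough sketch of that pattern (the helper names `manifest_link` and `release_entries` are hypothetical and not part of the website tooling), the entries could be generated from a version and an explicit list of released tags, so that a tag only appears once someone has added it deliberately:

```python
MANIFESTS_REPO = "https://github.com/kubeflow/manifests"


def manifest_link(ref, development=False):
    """Build one nested Markdown list entry linking to the README's installation section."""
    entry = f"[`{ref}`]({MANIFESTS_REPO}/tree/{ref}#installation)"
    if development:
        entry += " <sup>(development branch)</sup>"
    return f"  - {entry}"


def release_entries(version, released_tags):
    """Build the Markdown lines for one Kubeflow version.

    released_tags is supplied by the caller; this sketch does not check
    whether a tag actually exists in the repository.
    """
    lines = [f"- [**Kubeflow {version}:**](/docs/releases/kubeflow-{version}/)"]
    lines.append(manifest_link(f"v{version}-branch", development=True))
    lines.extend(manifest_link(tag) for tag in released_tags)
    return lines


if __name__ == "__main__":
    print("\n".join(release_entries("1.8", ["v1.8.0"])))
```

Calling `release_entries("1.8", ["v1.8.0"])` reproduces the three Kubeflow 1.8 lines quoted in the comment above. Whether an unreleased version should be listed at all remains a decision for the maintainers.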

@@ -217,16 +352,15 @@ instructions in the `README` of the [`kubeflow/manifests`](https://github.com/ku

{{% alert title="Warning" color="warning" %}}
Kubeflow is a complex system with many components and dependencies.
Using the raw manifests requires a deep understanding of Kubernetes, Istio, and Kubeflow itself.
Using the Kubeflow manifests requires a deep understanding of Kubernetes, Istio, and Kubeflow itself.

When using the raw manifests, the Kubeflow community is not able to provide support for environment-specific issues or custom configurations.
If you need support, please consider using a [packaged distribution](#packaged-distributions-of-kubeflow).
When using the Kubeflow manifests, the community is not able to provide support for environment-specific issues or custom configurations.
If you need support, please consider using a [packaged distribution](#packaged-distributions).
Nevertheless, we welcome contributions and bug reports very much.
{{% /alert %}}
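
The link in this alert changes from `#packaged-distributions-of-kubeflow` to `#packaged-distributions` because the heading it targets was renamed. The sketch below shows how a heading text maps to an anchor. It is a simplified approximation of GitHub-style heading IDs, which the site's Markdown renderer is assumed (not confirmed by this PR) to use; `heading_anchor` is a hypothetical helper, not a Hugo API.

```python
import re


def heading_anchor(heading):
    """Approximate a GitHub-style anchor for a Markdown heading.

    Simplified assumption: lowercase the text, drop punctuation other than
    hyphens and underscores, and turn each whitespace character into a hyphen.
    """
    text = heading.strip().lower()
    text = re.sub(r"[^\w\s-]", "", text)  # remove punctuation
    return re.sub(r"\s", "-", text)       # one hyphen per whitespace character


if __name__ == "__main__":
    for title in (
        "Packaged Distributions of Kubeflow",  # old heading
        "Packaged Distributions",              # new heading
        "Kubeflow Manifests",
        "Standalone Kubeflow Components",
    ):
        print(f"{title!r} -> #{heading_anchor(title)}")
```

Under these assumptions, renaming a heading changes its anchor, so every in-page and cross-page link to the old anchor needs to be updated in the same change.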

<a id="next-steps"></a>
Member

@juliusvonkohout May 22, 2024

{{% alert title="Warning" color="warning" %}}
Kubeflow is a complex system with many components and dependencies.
Using the Kubeflow manifests requires some understanding of Kubernetes, Istio, and Kubeflow itself.
The Kubeflow community support for Kubeflow manifests is best-effort for environment-specific issues or custom configurations.
Nevertheless, we welcome contributions and bug reports very much.
{{% /alert %}}

@andreyvelich I was not able to make a direct code suggestion here, so I am just pasting the full alert.


## Next steps

- Review the Kubeflow <a href="/docs/components/">component documentation</a>
- Explore the <a href="/docs/components/pipelines/sdk/">Kubeflow Pipelines SDK</a>
- Review our [introduction to Kubeflow](/docs/started/introduction/).
- Explore the [architecture of Kubeflow](/docs/started/architecture).
- Learn more about the [components of Kubeflow](/docs/components/).
76 changes: 45 additions & 31 deletions content/en/docs/started/introduction.md
@@ -4,11 +4,44 @@ description = "An introduction to Kubeflow"
weight = 1
+++

The Kubeflow project is dedicated to making deployments of machine learning (ML)
workflows on Kubernetes simple, portable and scalable. Our goal is not to
recreate other services, but to provide a straightforward way to deploy
best-of-breed open-source systems for ML to diverse infrastructures. Anywhere
you are running Kubernetes, you should be able to run Kubeflow.
## What is Kubeflow

Kubeflow is a community and ecosystem of open-source projects to address each stage in the
machine learning (ML) lifecycle. It makes ML on Kubernetes simple, portable, and scalable.
The goal of Kubeflow is to facilitate the orchestration of Kubernetes ML workloads and to empower
users to deploy best-in-class open-source tools on any Cloud infrastructure.

Whether you’re a researcher, data scientist, ML engineer, or a team of developers, Kubeflow offers
modular and scalable tools that cater to all aspects of the ML lifecycle: from building ML models to
deploying them to production for AI applications.

## What are Standalone Kubeflow Components

The Kubeflow ecosystem is composed of multiple open-source projects that address different aspects
of the ML lifecycle. Many of these projects are designed to be usable both within the
Kubeflow Platform and independently. These Kubeflow components can be installed standalone on a
Kubernetes cluster. This provides flexibility to users who may not require the full Kubeflow Platform
capabilities but wish to leverage specific ML functionalities such as model training or model serving.

## What is Kubeflow Platform

The Kubeflow Platform refers to the full suite of Kubeflow components bundled together with
additional integration and management tools. Using Kubeflow as a platform means deploying a
comprehensive ML toolkit for the entire ML lifecycle.

In addition to the standalone Kubeflow components, the Kubeflow Platform includes:

- [Kubeflow Notebooks](/docs/components/notebooks/overview) for interactive data exploration and
model development.
- [Central Dashboard](/docs/components/central-dash/overview/) for easy navigation and management
with [Kubeflow Profiles](/docs/components/central-dash/profiles/) for access control.
- Additional tooling for data management (PVC Viewer), visualization (TensorBoards), and more.

The Kubeflow Platform can be installed via
[Packaged Distributions](/docs/started/installing-kubeflow/#packaged-distributions) or
[Kubeflow Manifests](/docs/started/installing-kubeflow/#kubeflow-manifests).

## Getting started with Kubeflow

The following diagram shows the main Kubeflow components that cover each step of the ML lifecycle
on top of Kubernetes.
@@ -17,8 +50,6 @@ on top of Kubernetes.
alt="Kubeflow overview"
class="mt-3 mb-3">

## Getting started with Kubeflow

Read the [architecture overview](/docs/started/architecture/) for an
introduction to the architecture of Kubeflow and to see how you can use Kubeflow
to manage your ML workflow.
@@ -30,28 +61,6 @@ Watch the following video which provides an introduction to Kubeflow.

{{< youtube id="cTZArDgbIWw" title="Introduction to Kubeflow">}}

## What is Kubeflow?

Kubeflow is _the machine learning toolkit for Kubernetes_.

To use Kubeflow, the basic workflow is:

- Download and run the Kubeflow deployment binary.
- Customize the resulting configuration files.
- Run the specified script to deploy your containers to your specific
environment.

You can adapt the configuration to choose the platforms and services that you
want to use for each stage of the ML workflow:

1. data preparation
2. model training
3. prediction serving
4. service management

You can choose to deploy your Kubernetes workloads locally, on-premises, or to
a cloud environment.

## The Kubeflow mission

Our goal is to make scaling machine learning (ML) models and deploying them to
@@ -85,12 +94,17 @@ To see what's coming up in future versions of Kubeflow, refer to the [Kubeflow r
The following components also have roadmaps:

- [Kubeflow Pipelines](https://github.com/kubeflow/pipelines/blob/master/ROADMAP.md)
- [KF Serving](https://github.com/kubeflow/kfserving/blob/master/ROADMAP.md)
- [KServe](https://github.com/kserve/kserve/blob/master/ROADMAP.md)
- [Katib](https://github.com/kubeflow/katib/blob/master/ROADMAP.md)
- [Training Operator](https://github.com/kubeflow/common/blob/master/ROADMAP.md)
- [Training Operator](https://github.com/kubeflow/training-operator/blob/master/docs/roadmap.md)

## Getting involved

There are many ways to contribute to Kubeflow, and we welcome contributions!

Read the [contributor's guide](/docs/about/contributing/) to get started on the code, and learn about the community on the [community page](/docs/about/community/).

## Next Steps

- Follow [the installation guide](/docs/started/installing-kubeflow) to deploy standalone
Kubeflow components or Kubeflow Platform.