docs: add finetuner docs #771

Merged · 26 commits · Jul 20, 2022
2 changes: 2 additions & 0 deletions docs/conf.py
@@ -80,6 +80,8 @@
html_show_sourcelink = False
html_favicon = '_static/favicon.png'

intersphinx_mapping = {'docarray': ('https://docarray.jina.ai/', None), 'finetuner': ('https://finetuner.jina.ai/', None)}
> **Member (review comment):** perfect, good catch


latex_documents = [(master_doc, f'{slug}.tex', project, author, 'manual')]
man_pages = [(master_doc, slug, project, [author], 1)]
texinfo_documents = [
1 change: 0 additions & 1 deletion docs/index.md
@@ -178,7 +178,6 @@ It means the client and the server are now connected. Well done!
user-guides/client
user-guides/server
user-guides/faq

```

```{toctree}
187 changes: 187 additions & 0 deletions docs/user-guides/finetuner.md
@@ -0,0 +1,187 @@
(Finetuner)=
# Fine-tune Models

Although CLIP-as-service provides a list of pre-trained models, you can also fine-tune your own models.
This guide shows you how to use [Finetuner](https://finetuner.jina.ai) to fine-tune models and use them in CLIP-as-service.

For installation and basic usage of Finetuner, please refer to the [Finetuner documentation](https://finetuner.jina.ai).
You can also [learn more about fine-tuning CLIP](https://finetuner.jina.ai/tasks/text-to-image/).

## Prepare Training Data

Finetuner accepts training data and evaluation data in the form of {class}`~docarray.array.document.DocumentArray`.
The training data for CLIP is a list of (text, image) pairs.
Each pair is stored in a {class}`~docarray.document.Document` which wraps two [`chunks`](https://docarray.jina.ai/fundamentals/document/nested/) with `image` and `text` modalities, respectively.
You can push the resulting {class}`~docarray.array.document.DocumentArray` to the cloud using the {meth}`~docarray.array.document.DocumentArray.push` method.

We use the [fashion captioning dataset](https://github.com/xuewyang/Fashion_Captioning) as a sample dataset in this tutorial.
The following are examples of descriptions and image URLs from the dataset, together with a preview of each image.

| Description | Image URL | Preview |
|---------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------|
| subtly futuristic and edgy this liquid metal cuff bracelet is shaped from sculptural rectangular link | [https://n.nordstrommedia.com/id/sr3/<br/>58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg](https://n.nordstrommedia.com/id/sr3/58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg) | <img src="https://n.nordstrommedia.com/id/sr3/58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg?raw=true" width=100px> |
| high quality leather construction defines a hearty boot one-piece on a tough lug sole | [https://n.nordstrommedia.com/id/sr3/<br/>21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg](https://n.nordstrommedia.com/id/sr3/21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg) | <img src="https://n.nordstrommedia.com/id/sr3/21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg?raw=true" width=100px> |
| this shimmering tricot knit tote is traced with decorative whipstitching and diamond cut chain the two hallmark of the falabella line | [https://n.nordstrommedia.com/id/sr3/<br/>1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg](https://n.nordstrommedia.com/id/sr3/1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg) | <img src="https://n.nordstrommedia.com/id/sr3/1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg?raw=true" width=100px> |
| ... | ... | ... |

You can use the following script to transform the first three entries of the dataset to a {class}`~docarray.array.document.DocumentArray` and push it to the cloud using the name `fashion-sample`.

```python
from docarray import Document, DocumentArray

train_da = DocumentArray(
    [
        Document(
            chunks=[
                Document(
                    content='subtly futuristic and edgy this liquid metal cuff bracelet is shaped from sculptural rectangular link',
                    modality='text',
                ),
                Document(
                    uri='https://n.nordstrommedia.com/id/sr3/58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg',
                    modality='image',
                ),
            ],
        ),
        Document(
            chunks=[
                Document(
                    content='high quality leather construction defines a hearty boot one-piece on a tough lug sole',
                    modality='text',
                ),
                Document(
                    uri='https://n.nordstrommedia.com/id/sr3/21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg',
                    modality='image',
                ),
            ],
        ),
        Document(
            chunks=[
                Document(
                    content='this shimmering tricot knit tote is traced with decorative whipstitching and diamond cut chain the two hallmark of the falabella line',
                    modality='text',
                ),
                Document(
                    uri='https://n.nordstrommedia.com/id/sr3/1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg',
                    modality='image',
                ),
            ],
        ),
    ]
)
train_da.push('fashion-sample')
```

The full dataset has been converted into `clip-fashion-train-data` and `clip-fashion-eval-data` and pushed to the cloud, so it can be used directly in Finetuner.
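
If you want to inspect the prepared data before training, you can pull it back as a {class}`~docarray.array.document.DocumentArray`. A minimal sketch, assuming the named datasets are accessible from your account:

```python
from docarray import DocumentArray

# pull the prepared training data from the cloud and print a quick summary
train_da = DocumentArray.pull('clip-fashion-train-data')
train_da.summary()
```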

## Start Finetuner

You can now create and run a fine-tuning job after logging in to the Jina ecosystem.

```python
import finetuner

finetuner.login()
run = finetuner.fit(
    model='openai/clip-vit-base-patch32',
    run_name='clip-fashion',
    train_data='clip-fashion-train-data',
    eval_data='clip-fashion-eval-data',  # optional
    epochs=5,
    learning_rate=1e-5,
    loss='CLIPLoss',
    cpu=False,
)
```

Review thread on `train_data`:

> **Member:** @bwanglzu will finetuner accept a docarray instance (rather than a name in Hubble) as the dataset input?
>
> **bwanglzu (Member, Jul 19, 2022):** Yes, both are supported! It can be the name of a pushed DocumentArray as a string, or a DocumentArray Python instance. @numb3r3
>
> **Member:** Pretty cool.

Review thread on `cpu=False`:

> **numb3r3 (Member, Jul 19, 2022):** @bwanglzu Just want to confirm: will `cpu=False` work by default?
>
> **bwanglzu (Member, Jul 19, 2022):** By default, Finetuner uses CPU; if you want to use GPU, you have to set `cpu=False`. @numb3r3
>
> **Member:** Let me clarify my question. What I really want to ask is: "Does Finetuner currently allow training a model on GPU for free?"

After the job has started, you can use {meth}`~finetuner.run.Run.status` to check its status.

```python
import finetuner

finetuner.login()
run = finetuner.get_run('clip-fashion')
print(run.status())
```

When the status is `FINISHED`, you can download the tuned model to your local machine.

```python
import finetuner

finetuner.login()
run = finetuner.get_run('clip-fashion')
run.save_artifact('clip-model')
```

You should now have a zip file named `clip-fashion.zip`, containing the tuned model, under the folder `clip-model`.
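
If you prefer the command line, unpacking the archive is a one-liner. A minimal sketch, assuming the folder and file names used above:

```bash
# extract the downloaded artifact into the current directory
unzip clip-model/clip-fashion.zip
```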

## Use the Model

After unzipping the model downloaded in the previous step, you will get a folder with the following structure:

```text
.
└── clip-fashion/
    ├── config.yml
    ├── metadata.yml
    ├── metrics.yml
    └── models/
        ├── clip-text/
        │   ├── metadata.yml
        │   └── model.onnx
        ├── clip-vision/
        │   ├── metadata.yml
        │   └── model.onnx
        └── input-map.yml
```

Since the tuned model generated by Finetuner contains richer information, such as metadata and config, we now transform it into the simpler structure used by CLIP-as-service.

* First, create a new folder named `clip-fashion-cas` (or a name of your choice). It will hold the models used by CLIP-as-service.

* Next, copy the textual model `clip-fashion/models/clip-text/model.onnx` into `clip-fashion-cas` and rename it to `textual.onnx`.

* Similarly, copy the visual model `clip-fashion/models/clip-vision/model.onnx` into `clip-fashion-cas` and rename it to `visual.onnx` (see the shell sketch below).
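
These steps can also be scripted. A minimal shell sketch, assuming the folder names used in this guide:

```bash
# create the target folder and copy/rename the two ONNX models
mkdir -p clip-fashion-cas
cp clip-fashion/models/clip-text/model.onnx clip-fashion-cas/textual.onnx
cp clip-fashion/models/clip-vision/model.onnx clip-fashion-cas/visual.onnx
```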

This is the expected structure of `clip-fashion-cas`:

```text
.
└── clip-fashion-cas/
    ├── textual.onnx
    └── visual.onnx
```

To use the fine-tuned model, create a custom YAML file `finetuned_clip.yml` like the one below. Learn more about [Flow YAML configuration](https://docs.jina.ai/fundamentals/flow/yaml-spec/) and [`clip_server` YAML configuration](https://clip-as-service.jina.ai/user-guides/server/#yaml-config).

```yaml
jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_o
    uses:
      jtype: CLIPEncoder
      metas:
        py_modules:
          - clip_server.executors.clip_onnx
      with:
        name: ViT-B/32
        model_path: 'clip-fashion-cas' # path to clip-fashion-cas
    replicas: 1
```

```{warning}
Note that Finetuner currently only supports the ViT-B/32 CLIP model. The model name should match the fine-tuned model, or you will get incorrect output.
```

You can now start the `clip_server` with the fine-tuned model to get a performance boost:

```bash
python -m clip_server finetuned_clip.yml
```
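
Once the server is running, you can query it with the regular CLIP-as-service client. A quick sanity-check sketch, assuming the server listens on the port configured in the YAML above:

```python
from clip_client import Client

# connect to the locally running clip_server and encode a few fashion queries
c = Client('grpc://0.0.0.0:51000')
r = c.encode(['a silver cuff bracelet', 'a leather boot on a lug sole'])
print(r.shape)  # (2, 512) for a ViT-B/32 based model
```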

That's it, enjoy 🚀
53 changes: 49 additions & 4 deletions docs/user-guides/server.md
@@ -75,6 +75,23 @@ Open AI has released 9 models so far. `ViT-B/32` is used as default model in all
| ViT-L/14 | ✅ | ✅ | ❌ | 768 | 933 | 3.66 | 2.04 |
| ViT-L/14@336px | ✅ | ✅ | ❌ | 768 | 934 | 3.74 | 2.23 |

### Use custom model

You can also use your own model in the ONNX runtime by specifying the model name and the path to the model directory in the YAML file.
The model directory should have the following structure:

```text
.
└── custom-model/
    ├── textual.onnx
    └── visual.onnx
```

You may wonder how to produce a model with this structure.
Fortunately, you can simply use [Finetuner](https://finetuner.jina.ai) to fine-tune your model on a custom dataset.
[Finetuner](https://finetuner.jina.ai) is a cloud service that makes fine-tuning simple and fast.
By moving the process into the cloud, [Finetuner](https://finetuner.jina.ai) handles all related complexity and infrastructure, making models performant and production-ready.
{ref}`Click here for detailed instructions<Finetuner>`.

## YAML config

@@ -230,11 +247,11 @@ executors:

For all backends, you can set the following parameters via `with`:

| Parameter               | Description                                                                                                             |
|-------------------------|-------------------------------------------------------------------------------------------------------------------------|
| `name`                  | Model weights, default is `ViT-B/32`. Supports all OpenAI released pretrained models.                                    |
| `num_worker_preprocess` | The number of CPU workers for image & text preprocessing, default 4.                                                     |
| `minibatch_size`        | The size of a minibatch for CPU preprocessing and GPU encoding, default 64. Reduce it if you encounter OOM on the GPU.   |
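
For example, a minimal sketch of setting these shared parameters in the Flow YAML (the executor module path `executors/clip_torch.py` matches the PyTorch example shown further down):

```yaml
executors:
  - name: clip_t
    uses:
      jtype: CLIPEncoder
      with:
        name: ViT-B/32
        num_worker_preprocess: 4
        minibatch_size: 32
      metas:
        py_modules:
          - executors/clip_torch.py
```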

There are also runtime-specific parameters listed below:

@@ -252,6 +269,7 @@ There are also runtime-specific parameters listed below:
| Parameter | Description |
|-----------|--------------------------------------------------------------------------------------------------------------------------------|
| `device` | `cuda` or `cpu`. Default is `None`, meaning auto-detect. |
| `model_path` | The path to the custom CLIP model directory, default `None`. |

````

@@ -278,6 +296,33 @@ executors:
- executors/clip_torch.py
```

To use a custom model in the ONNX runtime, one can do:

```{code-block} yaml
---
emphasize-lines: 9-11
---

jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_o
    uses:
      jtype: CLIPEncoder
      with:
        name: ViT-B/32
        model_path: 'custom-model'
      metas:
        py_modules:
          - executors/clip_onnx.py
```

```{warning}
The model name should match the fine-tuned model, or you will get incorrect output.
```
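
You can then start the server with this Flow file in the same way as with the bundled configurations. A sketch, assuming the YAML above is saved as `custom_flow.yml` (a hypothetical file name):

```bash
python -m clip_server custom_flow.yml
```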

### Executor config

The full list of configs for Executor can be found via `jina executor --help`. The most important one is probably `replicas`, which **allows you to run multiple CLIP models in parallel** to achieve horizontal scaling.