docs: add finetuner docs #771

Merged
merged 26 commits on Jul 20, 2022
2 changes: 2 additions & 0 deletions docs/conf.py
@@ -80,6 +80,8 @@
html_show_sourcelink = False
html_favicon = '_static/favicon.png'

intersphinx_mapping = {'docarray': ('https://docarray.jina.ai/', None), 'finetuner': ('https://finetuner.jina.ai/', None)}
> Member: perfect, good catch


latex_documents = [(master_doc, f'{slug}.tex', project, author, 'manual')]
man_pages = [(master_doc, slug, project, [author], 1)]
texinfo_documents = [
1 change: 0 additions & 1 deletion docs/index.md
@@ -178,7 +178,6 @@ It means the client and the server are now connected. Well done!
user-guides/client
user-guides/server
user-guides/faq

```

```{toctree}
186 changes: 186 additions & 0 deletions docs/user-guides/finetuner.md
@@ -0,0 +1,186 @@
(Finetuner)=
# Fine-tune Models

Although CLIP-as-service provides a list of pre-trained models, you can also fine-tune your own models.
This guide will show you how to use [Finetuner](https://finetuner.jina.ai) to fine-tune models and use them in CLIP-as-service.

For installation and basic usage of Finetuner, please refer to the [Finetuner documentation](https://finetuner.jina.ai).
You can also [learn more details about fine-tuning CLIP](https://finetuner.jina.ai/tasks/text-to-image/).

## Prepare Training Data

Finetuner accepts training data and evaluation data in the form of {class}`~docarray.array.document.DocumentArray`.
The training data for CLIP is a list of (text, image) pairs.
Each pair is stored in a {class}`~docarray.document.Document` that wraps two [`chunks`](https://docarray.jina.ai/fundamentals/document/nested/) with `image` and `text` modalities, respectively.
You can push the resulting {class}`~docarray.array.document.DocumentArray` to the cloud using the {meth}`~docarray.array.document.DocumentArray.push` method.

We use the [fashion captioning dataset](https://github.com/xuewyang/Fashion_Captioning) as a sample dataset in this tutorial.
The following are examples of descriptions and image URLs from the dataset:

| Description | Image URL |
|---------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| subtly futuristic and edgy this liquid metal cuff bracelet is shaped from sculptural rectangular link | [https://n.nordstrommedia.com/id/sr3/<br/>58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg](https://n.nordstrommedia.com/id/sr3/58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg) |
| high quality leather construction defines a hearty boot one-piece on a tough lug sole | [https://n.nordstrommedia.com/id/sr3/<br/>21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg](https://n.nordstrommedia.com/id/sr3/21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg) |
| this shimmering tricot knit tote is traced with decorative whipstitching and diamond cut chain the two hallmark of the falabella line | [https://n.nordstrommedia.com/id/sr3/<br/>1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg](https://n.nordstrommedia.com/id/sr3/1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg) |
| ... | ... |

> Member: could we display the image, rather than simply print the URL?
> Member (Author): copyright?
> Member (Author): I also kept the original image URLs since they are from the dataset.

You can use the following script to transform the first three entries of the dataset into a {class}`~docarray.array.document.DocumentArray` and push it to the cloud under the name `fashion-sample`.

```python
from docarray import Document, DocumentArray

train_da = DocumentArray(
    [
        Document(
            chunks=[
                Document(
                    content='subtly futuristic and edgy this liquid metal cuff bracelet is shaped from sculptural rectangular link',
                    modality='text',
                ),
                Document(
                    uri='https://n.nordstrommedia.com/id/sr3/58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg',
                    modality='image',
                ),
            ],
        ),
        Document(
            chunks=[
                Document(
                    content='high quality leather construction defines a hearty boot one-piece on a tough lug sole',
                    modality='text',
                ),
                Document(
                    uri='https://n.nordstrommedia.com/id/sr3/21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg',
                    modality='image',
                ),
            ],
        ),
        Document(
            chunks=[
                Document(
                    content='this shimmering tricot knit tote is traced with decorative whipstitching and diamond cut chain the two hallmark of the falabella line',
                    modality='text',
                ),
                Document(
                    uri='https://n.nordstrommedia.com/id/sr3/1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg',
                    modality='image',
                ),
            ],
        ),
    ]
)
train_da.push('fashion-sample')
```
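To double-check what was pushed, you can pull the {class}`~docarray.array.document.DocumentArray` back and inspect it. A minimal sketch using the `fashion-sample` dataset pushed above (it assumes you are logged in to the same Jina account):

```python
from docarray import DocumentArray

# Pull the DocumentArray pushed above and print each (text, image) pair.
da = DocumentArray.pull('fashion-sample')
print(f'{len(da)} pairs')
for doc in da:
    for chunk in doc.chunks:
        print(chunk.modality, chunk.content or chunk.uri)
```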

The full dataset has been converted into `clip-fashion-train-data` and `clip-fashion-eval-data` and pushed to the cloud, so it can be used directly in Finetuner.

## Start Finetuner

You may now create and run a fine-tuning job after logging in to the Jina ecosystem.

```python
import finetuner

finetuner.login()
run = finetuner.fit(
    model='openai/clip-vit-base-patch32',
    run_name='clip-fashion',
    train_data='clip-fashion-train-data',
    eval_data='clip-fashion-eval-data',  # optional
    epochs=5,
    learning_rate=1e-5,
    loss='CLIPLoss',
    cpu=False,
)
```

> Member: @bwanglzu will finetuner accept a docarray instance (rather than a name in hubble) as the dataset input?
> bwanglzu (Member), Jul 19, 2022: yes, both are supported! It can be a string of the DocumentArray name, or a DocumentArray Python instance. @numb3r3
> Member: pretty cool
> numb3r3 (Member), Jul 19, 2022: @bwanglzu Just want to confirm, will `cpu=False` work by default?
> bwanglzu (Member), Jul 19, 2022: by default, Finetuner uses CPU; if you want to use GPU, you have to set `cpu=False`. @numb3r3
> Member: Let me clarify my question. What I really want to ask is "Does Finetuner currently allow training a model on GPU for free?"
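As the review thread above notes, `train_data` (and `eval_data`) can also be a `DocumentArray` instance instead of a dataset name on the cloud. A minimal sketch, reusing the `fashion-sample` data prepared earlier (the run name here is hypothetical):

```python
import finetuner
from docarray import DocumentArray

finetuner.login()
train_da = DocumentArray.pull('fashion-sample')  # or any DocumentArray built in memory
run = finetuner.fit(
    model='openai/clip-vit-base-patch32',
    run_name='clip-fashion-inline-data',  # hypothetical run name
    train_data=train_da,  # a DocumentArray instance instead of a dataset name
    epochs=5,
    learning_rate=1e-5,
    loss='CLIPLoss',
)
```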

After the job has started, you can use {meth}`~finetuner.run.Run.status` to check its status.

```python
import finetuner

finetuner.login()
run = finetuner.get_run('clip-fashion')
print(run.status())
```
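If you prefer to block until training completes, you can poll the status in a loop. A rough sketch (it only assumes that the printed status eventually contains `FINISHED` or `FAILED`):

```python
import time

import finetuner

finetuner.login()
run = finetuner.get_run('clip-fashion')

# Poll every 30 seconds until the run reaches a terminal state.
while True:
    status = str(run.status())
    print(status)
    if 'FINISHED' in status or 'FAILED' in status:
        break
    time.sleep(30)
```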

When the status is `FINISHED`, you can download the tuned model to your local machine.

```python
import finetuner

finetuner.login()
run = finetuner.get_run('clip-fashion')
run.save_artifact('clip-model')
```

You should now have a zip file named `clip-fashion.zip`, containing the tuned model, under the folder `clip-model`.
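If you prefer to unzip the artifact programmatically, a minimal sketch (the paths follow the folder and file names mentioned above):

```python
import zipfile

# Extract clip-model/clip-fashion.zip into the current directory,
# producing the clip-fashion/ folder described in the next section.
with zipfile.ZipFile('clip-model/clip-fashion.zip') as zf:
    zf.extractall('.')
```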

## Use the Model

After unzipping the archive from the previous step, you will get a folder with the following structure:

```text
.
└── clip-fashion/
    ├── config.yml
    ├── metadata.yml
    ├── metrics.yml
    └── models/
        ├── clip-text/
        │   ├── metadata.yml
        │   └── model.onnx
        ├── clip-vision/
        │   ├── metadata.yml
        │   └── model.onnx
        └── input-map.yml
```

Since the tuned model generated by Finetuner contains extra information such as metadata and config, we now transform it into the simpler structure used by CLIP-as-service.

* First, create a new folder named `clip-fashion-cas` (or any name of your choice). It will store the models to be used in CLIP-as-service.

* Second, copy the textual model `clip-fashion/models/clip-text/model.onnx` into the folder `clip-fashion-cas` and rename it to `textual.onnx`.

* Similarly, copy the visual model `clip-fashion/models/clip-vision/model.onnx` into the folder `clip-fashion-cas` and rename it to `visual.onnx` (a scripted version of these steps is sketched below).

This is the expected structure of `clip-fashion-cas`:

```text
.
└── clip-fashion-cas/
    ├── textual.onnx
    └── visual.onnx
```
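The copy-and-rename steps above can also be scripted. A minimal sketch, assuming the `clip-fashion` folder from the previous step sits in the current directory:

```python
import shutil
from pathlib import Path

src = Path('clip-fashion/models')
dst = Path('clip-fashion-cas')
dst.mkdir(exist_ok=True)

# Copy and rename the ONNX models into the layout expected by clip_server.
shutil.copy(src / 'clip-text' / 'model.onnx', dst / 'textual.onnx')
shutil.copy(src / 'clip-vision' / 'model.onnx', dst / 'visual.onnx')
```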

To use the fine-tuned model, create a custom YAML file `finetuned_clip.yml` like the one below. Learn more about [Flow YAML configuration](https://docs.jina.ai/fundamentals/flow/yaml-spec/) and [`clip_server` YAML configuration](https://clip-as-service.jina.ai/user-guides/server/#yaml-config).

```yaml
jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_o
    uses:
      jtype: CLIPEncoder
      metas:
        py_modules:
          - clip_server.executors.clip_onnx
      with:
        name: ViT-B/32
        model_path: 'clip-fashion-cas' # path to clip-fashion-cas
    replicas: 1
```

```{warning}
Note that Finetuner currently only supports the ViT-B/32 CLIP model. The model name should match the fine-tuned model, or you will get incorrect output.
```

You can now start the `clip_server` with the fine-tuned model to get a performance boost:

```bash
python -m clip_server finetuned_clip.yml
```
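Once the server is up, you can query it with the regular `clip_client` API; a quick sketch (the port matches the YAML config above):

```python
from clip_client import Client

# Connect to the server started from finetuned_clip.yml and encode a query.
c = Client('grpc://0.0.0.0:51000')
embeddings = c.encode(['subtly futuristic liquid metal cuff bracelet'])
print(embeddings.shape)
```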

That's it, enjoy 🚀
23 changes: 19 additions & 4 deletions docs/user-guides/server.md
@@ -76,6 +76,20 @@ Open AI has released 9 models so far. `ViT-B/32` is used as default model in all
| ViT-L/14@336px | ✅ | ✅ | ❌ | 768 | 934 | 3.74 | 2.23 |


You can also use your own model with the ONNX runtime by specifying the model name and the path to the model directory in the YAML file.
The model directory should have the same format as below:

```text
.
└── custom-model/
    ├── textual.onnx
    └── visual.onnx
```
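Before pointing the server at a custom model directory, you may want to verify that both ONNX files load. A minimal sketch, assuming `onnxruntime` is installed locally:

```python
import onnxruntime as ort

# Quick sanity check that both exported ONNX graphs load cleanly on CPU.
for path in ('custom-model/textual.onnx', 'custom-model/visual.onnx'):
    session = ort.InferenceSession(path, providers=['CPUExecutionProvider'])
    print(path, '->', [inp.name for inp in session.get_inputs()])
```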

```{tip}
You can use Finetuner to fine-tune your model. {ref}`Click here for detailed instructions<Finetuner>`.
```

## YAML config

You may notice that there is a YAML file in our last ONNX example. All configurations are stored in this file. In fact, `python -m clip_server` does **not support** any other argument besides a YAML file. So it is the only source of truth for your configs.

@@ -230,11 +244,11 @@ executors:
Expand Down Expand Up @@ -230,11 +244,11 @@ executors:

For all backends, you can set the following parameters via `with`:

| Parameter | Description |
|-----------|--------------------------------------------------------------------------------------------------------------------------------|
| `name` | Model weights, default is `ViT-B/32`. Support all OpenAI released pretrained models. |
| Parameter | Description |
|-------------------------|--------------------------------------------------------------------------------------------------------------------------------|
| `name` | Model weights, default is `ViT-B/32`. Support all OpenAI released pretrained models. |
| `num_worker_preprocess` | The number of CPU workers for image & text preprocessing, default 4. |
| `minibatch_size` | The size of a minibatch for CPU preprocessing and GPU encoding, default 64. Reduce the size of it if you encounter OOM on GPU. |
| `minibatch_size` | The size of a minibatch for CPU preprocessing and GPU encoding, default 64. Reduce the size of it if you encounter OOM on GPU. |

There are also runtime-specific parameters listed below:

@@ -252,6 +266,7 @@ There are also runtime-specific parameters listed below:
| Parameter | Description |
|-----------|--------------------------------------------------------------------------------------------------------------------------------|
| `device` | `cuda` or `cpu`. Default is `None`, which means auto-detect. |
| `model_path` | The path to custom CLIP model, default `None`. |

````

Expand Down