
docs: add finetuner docs #771

Merged · 26 commits · Jul 20, 2022
2 changes: 1 addition & 1 deletion docs/index.md
@@ -177,8 +177,8 @@ It means the client and the server are now connected. Well done!

user-guides/client
user-guides/server
user-guides/finetuner
user-guides/faq

```

```{toctree}
186 changes: 186 additions & 0 deletions docs/user-guides/finetuner.md
@@ -0,0 +1,186 @@
# Fine-tune Models

Although CLIP-as-service provides you with a list of pre-trained models, you can also fine-tune your own models.

This guide will show you how to use [Finetuner](https://finetuner.jina.ai) to fine-tune models and use them in CLIP-as-service.

For installation and basic usage of Finetuner, please refer to the [Finetuner documentation](https://finetuner.jina.ai).
You can also [learn more details about fine-tuning CLIP](https://finetuner.jina.ai/tasks/text-to-image/).

## Prepare Training Data

Finetuner accepts training data and evaluation data in the form of [`DocumentArray`](https://docarray.jina.ai/fundamentals/documentarray/).
The training data for CLIP is a list of (text, image) pairs.
Each pair is stored in a [`Document`](https://docarray.jina.ai/fundamentals/document/) that wraps two [`chunks`](https://docarray.jina.ai/fundamentals/document/nested/), one with the `text` modality and one with the `image` modality.
You can push the resulting [`DocumentArray`](https://docarray.jina.ai/fundamentals/documentarray/) to the cloud using the [`.push`](https://docarray.jina.ai/api/docarray.array.document/?highlight=push#docarray.array.document.DocumentArray.push) method.
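
For instance, a single (text, image) pair can be constructed like this (the description and image URL below are hypothetical placeholders):

```python
from docarray import Document

# One training pair: a parent Document wrapping a text chunk and an image chunk.
pair = Document(
    chunks=[
        Document(content='an example product description', modality='text'),
        Document(uri='https://example.com/product.jpeg', modality='image'),
    ]
)
```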

We use the [fashion captioning dataset](https://github.com/xuewyang/Fashion_Captioning) as a sample dataset in this tutorial.
The following are examples of descriptions and image URLs from the dataset:

| Description | Image URL |
|---------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| subtly futuristic and edgy this liquid metal cuff bracelet is shaped from sculptural rectangular link | [https://n.nordstrommedia.com/id/sr3/<br/>58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg](https://n.nordstrommedia.com/id/sr3/58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg) |
| high quality leather construction defines a hearty boot one-piece on a tough lug sole | [https://n.nordstrommedia.com/id/sr3/<br/>21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg](https://n.nordstrommedia.com/id/sr3/21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg) |
| this shimmering tricot knit tote is traced with decorative whipstitching and diamond cut chain the two hallmark of the falabella line | [https://n.nordstrommedia.com/id/sr3/<br/>1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg](https://n.nordstrommedia.com/id/sr3/1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg) |
| ... | ... |

You can use the following script to transform the first three entries of the dataset into a [`DocumentArray`](https://docarray.jina.ai/fundamentals/documentarray/) and push it to the cloud under the name `fashion-sample`.

```python
from docarray import Document, DocumentArray

train_da = DocumentArray(
    [
        Document(
            chunks=[
                Document(
                    content='subtly futuristic and edgy this liquid metal cuff bracelet is shaped from sculptural rectangular link',
                    modality='text',
                ),
                Document(
                    uri='https://n.nordstrommedia.com/id/sr3/58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg',
                    modality='image',
                ),
            ],
        ),
        Document(
            chunks=[
                Document(
                    content='high quality leather construction defines a hearty boot one-piece on a tough lug sole',
                    modality='text',
                ),
                Document(
                    uri='https://n.nordstrommedia.com/id/sr3/21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg',
                    modality='image',
                ),
            ],
        ),
        Document(
            chunks=[
                Document(
                    content='this shimmering tricot knit tote is traced with decorative whipstitching and diamond cut chain the two hallmark of the falabella line',
                    modality='text',
                ),
                Document(
                    uri='https://n.nordstrommedia.com/id/sr3/1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg',
                    modality='image',
                ),
            ],
        ),
    ]
)
train_da.push('fashion-sample')
```
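
To verify the upload, you can pull the dataset back from the cloud and inspect it; a quick check using DocArray's `pull`:

```python
from docarray import DocumentArray

# Pull the dataset we just pushed and print an overview of its contents.
da = DocumentArray.pull('fashion-sample')
da.summary()
```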

The full dataset has been converted to `clip-fashion-train-data` and `clip-fashion-eval-data` and pushed to the cloud, so you can use them directly in Finetuner by providing the dataset names.


## Start Finetuner

You can now create and run a fine-tuning job after logging in to the Jina ecosystem.

```python
import finetuner

finetuner.login()
run = finetuner.fit(
    model='openai/clip-vit-base-patch32',
    run_name='clip-fashion',
    train_data='clip-fashion-train-data',  # also accepts a DocumentArray instance
    eval_data='clip-fashion-eval-data',  # optional
    epochs=5,
    learning_rate=1e-5,
    loss='CLIPLoss',
    cpu=False,  # Finetuner trains on CPU by default; set cpu=False to use a GPU
)
```

After the job has started, you can use [`.status`](https://finetuner.jina.ai/api/finetuner.run/#finetuner.run.Run.status) to check its status.

```python
import finetuner

finetuner.login()
run = finetuner.get_run('clip-fashion')
print(run.status())
```
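
If you prefer to block until training completes, a simple polling loop works; this is a minimal sketch, and the exact return shape of `run.status()` may vary across Finetuner versions, so adapt the check to your output:

```python
import time

import finetuner

finetuner.login()
run = finetuner.get_run('clip-fashion')

# Poll the run state until Finetuner reports it as finished.
while 'FINISHED' not in str(run.status()):
    time.sleep(60)
print('Run finished, the model is ready to download.')
```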

When the status is `FINISHED`, you can download the tuned model to your local machine.

```python
import finetuner

finetuner.login()
run = finetuner.get_run('clip-fashion')
run.save_artifact('clip-model')
```

You should now have a zip file named `clip-fashion.zip` containing the tuned model under the folder `clip-model`.
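
If you like, you can extract the archive programmatically; a minimal sketch using Python's standard library and the paths above:

```python
import zipfile

# Extract clip-fashion.zip into the clip-model folder.
with zipfile.ZipFile('clip-model/clip-fashion.zip') as zf:
    zf.extractall('clip-model')
```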

## Use the Model

After unzipping the model you downloaded in the previous step, you will get a folder with the following structure:

```text
.
└── clip-fashion/
    ├── config.yml
    ├── metadata.yml
    ├── metrics.yml
    └── models/
        ├── clip-text/
        │   ├── metadata.yml
        │   └── model.onnx
        ├── clip-vision/
        │   ├── metadata.yml
        │   └── model.onnx
        └── input-map.yml
```

Since the tuned model generated by Finetuner contains richer information, such as metadata and config, we now transform it into the simpler structure used by CLIP-as-service.

* First, create a new folder named `clip-fashion-cas` (or any name you like). This will store the models used by CLIP-as-service.

* Second, copy the text model `clip-fashion/models/clip-text/model.onnx` into `clip-fashion-cas` and rename it to `textual.onnx`.

* Similarly, copy the vision model `clip-fashion/models/clip-vision/model.onnx` into `clip-fashion-cas` and rename it to `visual.onnx`.

This is the expected structure of `clip-fashion-cas`:

```text
.
└── clip-fashion-cas/
    ├── textual.onnx
    └── visual.onnx
```
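
If you prefer to script the two copy steps above, here is a minimal sketch using the standard library; the paths follow the folder layout shown earlier:

```python
import os
import shutil

src = 'clip-fashion/models'
dst = 'clip-fashion-cas'
os.makedirs(dst, exist_ok=True)

# Copy the ONNX models into the simpler layout expected by CLIP-as-service.
shutil.copy(os.path.join(src, 'clip-text', 'model.onnx'), os.path.join(dst, 'textual.onnx'))
shutil.copy(os.path.join(src, 'clip-vision', 'model.onnx'), os.path.join(dst, 'visual.onnx'))
```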

To use the fine-tuned model, create a custom YAML file `finetuned_clip.yml` like the one below. Learn more about [Flow YAML configuration](https://docs.jina.ai/fundamentals/flow/yaml-spec/) and [`clip_server` YAML configuration](https://clip-as-service.jina.ai/user-guides/server/#yaml-config).

```yaml
jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_o
    uses:
      jtype: CLIPEncoder
      metas:
        py_modules:
          - clip_server.executors.clip_onnx
      with:
        name: ViT-B/32
        model_path: 'clip-fashion-cas' # path to clip-fashion-cas
    replicas: 1
```

```{warning}
Note that Finetuner currently only supports the ViT-B/32 CLIP model. The model name should match the fine-tuned model, or you will get incorrect output.
```

You can now start the `clip_server` with the fine-tuned model to get a performance boost:

```bash
python -m clip_server finetuned_clip.yml
```
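
Once the server is up, you can query it with the standard `clip_client` API; a minimal sketch, assuming the port 51000 configured above:

```python
from clip_client import Client

# Connect to the server started with the fine-tuned model.
c = Client('grpc://0.0.0.0:51000')

# Encode a sentence and an image URL with the fine-tuned encoders.
vecs = c.encode(
    [
        'a liquid metal cuff bracelet',
        'https://n.nordstrommedia.com/id/sr3/58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg',
    ]
)
print(vecs.shape)  # (2, 512) for ViT-B/32
```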

That's it, enjoy 🚀