(Finetuner)=
# Fine-tune Models
Although CLIP-as-service provides a list of pre-trained models, you can also fine-tune your own models.
This guide shows you how to use [Finetuner](https://finetuner.jina.ai) to fine-tune models and use them in CLIP-as-service.

For installation and basic usage of Finetuner, please refer to the [Finetuner documentation](https://finetuner.jina.ai).
You can also [learn more about fine-tuning CLIP](https://finetuner.jina.ai/tasks/text-to-image/).
## Prepare Training Data

Finetuner accepts training data and evaluation data in the form of a {class}`~docarray.array.document.DocumentArray`.
The training data for CLIP is a list of (text, image) pairs.
Each pair is stored in a {class}`~docarray.document.Document` which wraps two [`chunks`](https://docarray.jina.ai/fundamentals/document/nested/) with `image` and `text` modalities respectively.
You can push the resulting {class}`~docarray.array.document.DocumentArray` to the cloud using the {meth}`~docarray.array.document.DocumentArray.push` method.

We use the [fashion captioning dataset](https://github.com/xuewyang/Fashion_Captioning) as a sample dataset in this tutorial.
The following are examples of descriptions and image URLs from the dataset, along with a preview of each image.
| Description | Image URL | Preview |
|---|---|---|
| subtly futuristic and edgy this liquid metal cuff bracelet is shaped from sculptural rectangular link | [https://n.nordstrommedia.com/id/sr3/<br/>58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg](https://n.nordstrommedia.com/id/sr3/58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg) | <img src="https://n.nordstrommedia.com/id/sr3/58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg?raw=true" width=100px> |
| high quality leather construction defines a hearty boot one-piece on a tough lug sole | [https://n.nordstrommedia.com/id/sr3/<br/>21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg](https://n.nordstrommedia.com/id/sr3/21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg) | <img src="https://n.nordstrommedia.com/id/sr3/21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg?raw=true" width=100px> |
| this shimmering tricot knit tote is traced with decorative whipstitching and diamond cut chain the two hallmark of the falabella line | [https://n.nordstrommedia.com/id/sr3/<br/>1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg](https://n.nordstrommedia.com/id/sr3/1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg) | <img src="https://n.nordstrommedia.com/id/sr3/1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg?raw=true" width=100px> |
| ... | ... | ... |
You can use the following script to transform the first three entries of the dataset into a {class}`~docarray.array.document.DocumentArray` and push it to the cloud under the name `fashion-sample`.
```python
from docarray import Document, DocumentArray

train_da = DocumentArray(
    [
        Document(
            chunks=[
                Document(
                    content='subtly futuristic and edgy this liquid metal cuff bracelet is shaped from sculptural rectangular link',
                    modality='text',
                ),
                Document(
                    uri='https://n.nordstrommedia.com/id/sr3/58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg',
                    modality='image',
                ),
            ],
        ),
        Document(
            chunks=[
                Document(
                    content='high quality leather construction defines a hearty boot one-piece on a tough lug sole',
                    modality='text',
                ),
                Document(
                    uri='https://n.nordstrommedia.com/id/sr3/21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg',
                    modality='image',
                ),
            ],
        ),
        Document(
            chunks=[
                Document(
                    content='this shimmering tricot knit tote is traced with decorative whipstitching and diamond cut chain the two hallmark of the falabella line',
                    modality='text',
                ),
                Document(
                    uri='https://n.nordstrommedia.com/id/sr3/1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg',
                    modality='image',
                ),
            ],
        ),
    ]
)
train_da.push('fashion-sample')
```
The full dataset has been converted to `clip-fashion-train-data` and `clip-fashion-eval-data` and pushed to the cloud, where it can be used directly in Finetuner.
|
||
## Start Finetuner | ||
|
||
You may now create and run a fine-tuning job after login to Jina ecosystem. | ||
|
||
```python
import finetuner

finetuner.login()
run = finetuner.fit(
    model='openai/clip-vit-base-patch32',
    run_name='clip-fashion',
    train_data='clip-fashion-train-data',
    eval_data='clip-fashion-eval-data',  # optional
    epochs=5,
    learning_rate=1e-5,
    loss='CLIPLoss',
    cpu=False,
)
```
After the job has started, you can use {meth}`~finetuner.run.Run.status` to check its status.
```python
import finetuner

finetuner.login()
run = finetuner.get_run('clip-fashion')
print(run.status())
```
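If you prefer not to check the status manually, polling can be wrapped in a small generic helper. This is a sketch under our own naming (`wait_until` is not a Finetuner API), and the exact return shape of `run.status()` should be checked against the Finetuner documentation.

```python
import time


def wait_until(get_status, target='FINISHED', interval=30.0, timeout=3600.0):
    """Poll `get_status` until it returns `target`, or give up
    after `timeout` seconds. Returns True on success, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_status() == target:
            return True
        time.sleep(interval)
    return False
```

For example, `wait_until(lambda: run.status(), interval=60)` would block until the run reports `FINISHED` (adapt the lambda if `status()` returns a richer object).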
When the status is `FINISHED`, you can download the tuned model to your local machine.
```python
import finetuner

finetuner.login()
run = finetuner.get_run('clip-fashion')
run.save_artifact('clip-model')
```
You should now have a zip file named `clip-fashion.zip`, containing the tuned model, under the folder `clip-model`.
## Use the Model

After unzipping the model from the previous step, a folder with the following structure is generated:
```text
.
└── clip-fashion/
    ├── config.yml
    ├── metadata.yml
    ├── metrics.yml
    └── models/
        ├── clip-text/
        │   ├── metadata.yml
        │   └── model.onnx
        ├── clip-vision/
        │   ├── metadata.yml
        │   └── model.onnx
        └── input-map.yml
```
Since the tuned model generated by Finetuner contains richer information such as metadata and config, we now transform it into the simpler structure used by CLIP-as-service.

* First, create a new folder named `clip-fashion-cas` (or a name of your choice). This will store the models to use in CLIP-as-service.

* Next, copy the textual model `clip-fashion/models/clip-text/model.onnx` into the folder `clip-fashion-cas` and rename it to `textual.onnx`.

* Similarly, copy the visual model `clip-fashion/models/clip-vision/model.onnx` into the folder `clip-fashion-cas` and rename it to `visual.onnx`.
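The copy-and-rename steps above can be scripted. Here is a minimal sketch; the function name is ours, and the paths assume the artifact layout shown earlier:

```python
import os
import shutil


def prepare_cas_folder(artifact_dir, out_dir):
    """Copy the two ONNX models out of a Finetuner artifact into the
    flat layout expected by CLIP-as-service, renaming them on the way."""
    os.makedirs(out_dir, exist_ok=True)
    mapping = {
        os.path.join('clip-text', 'model.onnx'): 'textual.onnx',
        os.path.join('clip-vision', 'model.onnx'): 'visual.onnx',
    }
    for src, dst in mapping.items():
        shutil.copy(
            os.path.join(artifact_dir, 'models', src),
            os.path.join(out_dir, dst),
        )
```

Calling `prepare_cas_folder('clip-fashion', 'clip-fashion-cas')` produces the structure described next.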
This is the expected structure of `clip-fashion-cas`: | ||
|
||
```text | ||
. | ||
└── clip-fashion-cas/ | ||
├── textual.onnx | ||
└── visual.onnx | ||
``` | ||
To use the fine-tuned model, create a custom YAML file `finetuned_clip.yml` like the one below. Learn more about [Flow YAML configuration](https://docs.jina.ai/fundamentals/flow/yaml-spec/) and [`clip_server` YAML configuration](https://clip-as-service.jina.ai/user-guides/server/#yaml-config).
```yaml
jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_o
    uses:
      jtype: CLIPEncoder
      metas:
        py_modules:
          - clip_server.executors.clip_onnx
      with:
        name: ViT-B/32
        model_path: 'clip-fashion-cas' # path to clip-fashion-cas
    replicas: 1
```
```{warning}
Note that Finetuner currently only supports the ViT-B/32 CLIP model. The model name must match the fine-tuned model, or you will get incorrect output.
```
You can now start the `clip_server` with the fine-tuned model to get a performance boost:

```bash
python -m clip_server finetuned_clip.yml
```
That's it, enjoy 🚀