TRANSFER LEARNING EXAMPLE #106
Comments
Hi @glenn-jocher , I have a question about this. I want to change the configuration of yolo layers(remove some layer, change the number of filters, etc..) and apply transfer learning. In this case, is it possible to use transfer learning using the official weight? If it's possible, could you give me the way or just a keyword about this? |
@jw-pyo you can do anything you want, but you have to do it, we can't "give you a way". Recommend you visit our tutorials to get started, and the PyTorch tutorials for more general customization questions. https://docs.ultralytics.com/yolov5/tutorials/train_custom_data |
I have a problem: I want to train some new classes and pictures using transfer learning. |
@hac135 If you want to use a pretrained model for transfer learning but your own model has a different shape, the only approach I know is to copy the weights whose shapes match the pretrained model and manually initialize the layers whose shapes differ. |
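The shape-matching approach described above can be sketched in PyTorch. This is a minimal illustration, not code from this repo: `load_matching_weights` is a hypothetical helper, and the two `nn.Sequential` models are toy stand-ins for a pretrained network and a modified one.

```python
import torch
import torch.nn as nn


def load_matching_weights(model: nn.Module, pretrained_state: dict) -> list:
    """Copy tensors whose name and shape both match; return the names copied."""
    own_state = model.state_dict()
    matched = {
        name: tensor
        for name, tensor in pretrained_state.items()
        if name in own_state and own_state[name].shape == tensor.shape
    }
    # strict=False leaves every unmatched layer at its fresh initialization
    model.load_state_dict(matched, strict=False)
    return sorted(matched)


# Toy stand-ins (hypothetical sizes): same first layer, different output layer
pretrained = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 80))
custom = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 3))

copied = load_matching_weights(custom, pretrained.state_dict())
print(copied)  # names of the tensors that were transferred
```

Only the first layer is transferred here; the resized output layer keeps its default initialization and has to be learned from the new data.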
@hac135 most people don't realize this, and it's not the recommended method to go about things, but you can technically use the existing YOLOv3 architecture (and hence the pretrained For example, our single class tutorial operates just as well with no modifications to the cfg file: It's not clean and it's not optimal, but it works. |
Thank you! It did work! |
that's a good suggestion, thanks |
@shahidammer try training from scratch, and observe your training results in results.txt. |
@shahidammer please note that most technical problems are due to:
sudo rm -rf yolov3 # remove existing repo
git clone https://github.com/ultralytics/yolov3 && cd yolov3 # git clone latest
python3 detect.py # verify detection
python3 train.py # verify training (a few batches only)
# CODE TO REPRODUCE YOUR ISSUE HERE
If none of these apply to you, we suggest you close this issue and raise a new one using the Bug Report template, providing screenshots and minimum viable code to reproduce your issue. Thank you! |
I want to retain the existing classes and add a new class, i.e. a total of 80+1=81 classes on the COCO dataset. Please tell me how to do it using transfer learning. |
@parul19 you create a new 81 class cfg. Follow the directions in the example above. |
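For the 81-class case, the guide quoted at the end of this thread gives the rule `filters=[4 + 1 + n] * 3` for the layer preceding each YOLO layer. A small worked example of that arithmetic follows; the helper name `yolo_filters` is made up for illustration.

```python
def yolo_filters(num_classes: int, anchors_per_layer: int = 3) -> int:
    """Output channels of the conv layer that feeds a YOLO layer.

    Per anchor: 4 box coordinates + 1 object confidence + one score per class.
    """
    return (4 + 1 + num_classes) * anchors_per_layer


print(yolo_filters(80))  # 255, the stock COCO configuration
print(yolo_filters(81))  # 258, COCO plus one new class
```

So an 81-class cfg would use `filters=258` before each of the 3 YOLO layers and `classes=81` inside them.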
Do we still need COCO dataset if we only do transfer-learning? |
@sooonism you need whatever dataset you want to train on. |
@glenn-jocher I have a vehicle that is not Planning to
My question is: how do I do it? |
@Santhosh1509 well I would start by reviewing the examples in the wiki, such as the custom training tutorial: |
@glenn-jocher Need your opinion on this. I just saw a post called transfer learning tutorial for SSD using keras. It's mentioned in Option 1: Just ignore the fact that we need only 8 classes
So I feel that even if I could somehow train as I mentioned above for a particular new class, the predictions for the other classes might be affected. Is my approach right? Is there an alternative way to preserve the predictions of the other classes while introducing this new class in the same neural network? I feel it needs to be trained from scratch then. What do you think? |
@Santhosh1509 training normally will produce the best results. Transfer learning produces mediocre results quickly. |
@glenn-jocher How do I get to know the All I get is this during training. Please guide me on how to tune my hyperparameters with the data displayed here. I could have increased the batch size, as I have more memory on the GPU. I do not understand the comment on these lines. PS: latest training image
|
@Santhosh1509 all of the information you mention is recorded in results.txt. You can plot this with obj and cls are training losses, they are supposed to decrease during training. See #392 for hyperparameter evolution, and explore the open issues for answers to your questions. |
@glenn-jocher This is what is stored in
Don't we have a graph that is easy to visualize, rather than just numbers? Something like this. Now we can even use TensorBoard support inside PyTorch to visualize the values. As the name mentions HYPERPARAMETER EVOLUTION is to plot those not how these ( |
@Santhosh1509 Tensorboard logs automatically in this repo if you have it installed. See #435 |
@glenn-jocher Please explain how I can only relate terms |
@glenn-jocher accuracy is a classification metric, it is not used here. The metrics displayed during training are training losses and the number of targets per batch. |
@glenn-jocher |
object loss and class loss. training loss is the total of all training losses. |
@joel5638 no, you call it once before training to convert your last.pt into a backbone.pt file ready to be used as pretrained weights for future trainings: Lines 607 to 619 in 5d42cc1
|
@glenn-jocher perfect will do that. Thank u so much |
@joel5638 can you paste your test_gt.jpg and test_pred.jpg here? |
@joel5638 ah, it looks like it's working well! Remember there is NMS, so if the person and the face are largely occupying the same region one or the other may be suppressed. You could try it on zidane.jpg to compare, as in that photo the faces and the persons do not occupy similar areas, the way Bush does above. |
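To make the NMS point concrete, here is a minimal sketch of greedy, class-agnostic non-maximum suppression in plain Python. It is not the repo's implementation; the function names, the 0.5 IoU threshold and the box coordinates are all made up for illustration, and whether suppression crosses class boundaries in practice depends on how NMS is configured.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def greedy_nms(detections, iou_threshold=0.5):
    """detections: list of (label, score, box). Highest scores are kept first."""
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        # Keep a detection only if it does not overlap a stronger kept one
        if all(iou(det[2], k[2]) <= iou_threshold for k in kept):
            kept.append(det)
    return kept


# Close-up portrait: face and person boxes nearly coincide
close_up = [("person", 0.90, (10, 10, 110, 110)), ("face", 0.80, (15, 15, 110, 105))]
# Full-body shot: the face covers a small part of the person box
full_body = [("person", 0.90, (10, 10, 110, 310)), ("face", 0.80, (40, 20, 80, 70))]

close_up_kept = [d[0] for d in greedy_nms(close_up)]
full_body_kept = [d[0] for d in greedy_nms(full_body)]
print(close_up_kept)   # ['person']
print(full_body_kept)  # ['person', 'face']
```

In the close-up the two boxes overlap with an IoU of about 0.86, so the lower-scoring face is dropped; in the full-body shot the IoU is about 0.07 and both survive.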
@glenn-jocher perfect. |
@joel5638 --transfer flag is deprecated, you may have been using an older version of the repo before. Basically you no longer need it. Simply train normally, specifying the --weights you want to start from (but making a backbone from them first!). Your command will technically work, but it is not recommended, as hyps with schedules like the learning rate will assume 273/373 epochs are complete, which is not the case. So just create a backbone first, and then use a normal training command:
|
@joel5638 just the same as before. For example to create
|
@Works fine for me. Remember this is a python command, so you run it from a python console. If you are trying to run it from a bash prompt, you need to encapsulate the command in quotes appropriately. |
@joel5638 from the ubuntu terminal you run the same command, but as
|
@joel5638 I'm not sure. The command works fine for me. You could try omitting the argument, as it's the default argument anyways. Maybe the single quotes are causing problems. EDIT: updated image |
@glenn-jocher got it. It's with the quotes. Thank you |
Try this command, it works: python3 -c "from utils.utils import *; create_backbone(f='weights/last.pt')" |
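The body of `create_backbone` is not shown in this thread. Going by the earlier remark that learning rate schedules would otherwise assume training is already complete, one plausible sketch of what "making a backbone" involves is clearing the training progress stored in a checkpoint. Everything below is an assumption for illustration: the helper name, the plain-dict checkpoint, and the key names `epoch`, `optimizer` and `model`.

```python
import copy


def make_backbone_checkpoint(checkpoint: dict) -> dict:
    """Return a copy of a checkpoint with its training progress cleared.

    Key names are assumed for illustration, not taken from the repo.
    """
    backbone = copy.copy(checkpoint)  # shallow copy: weights are shared, not duplicated
    backbone["epoch"] = -1            # schedules restart as if no epoch had run
    backbone["optimizer"] = None      # drop momentum and other optimizer state
    return backbone


# A stand-in for the contents of last.pt after a finished run
last = {"epoch": 272, "optimizer": {"lr": 0.0001}, "model": {"layer0.weight": [0.1, 0.2]}}
backbone = make_backbone_checkpoint(last)
print(backbone["epoch"], backbone["optimizer"])
```

The weights are untouched, so a training run started from such a file begins at epoch 0 of its schedule while keeping everything the model has learned.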
@joel5638 that's odd. I scanned my screen with iDetection and all people are picked up fine. Maybe your dataset is too small, or if you are using tiny you should switch to the default yolov3-spp.cfg with the default pretrained weights. Are you training with all default settings? You should also post your results.png. |
@joel5638 looks fine. If you want to add a class, like face, be aware you need to train all the existing classes plus your new class. If you're already doing this, then you may just need a larger dataset or longer training. |
Hi @glenn-jocher, I want to ask you about transfer learning. Is it possible to train on a new dataset to improve a model's predictions? The model I am using here is YOLOv3. For example, I want to improve predictions for the motorbike class by adding a dataset that consists of a few motorbike images. I have tried training with 44 new images (34 training, 10 validation), but the total number of detections decreased. Is there anything wrong with my training step or my dataset? This one is the result of using yolov3.weights: there are 47 detections in total. And this is the result of using yolov3-transf.weights (after training): the total has decreased slightly to 37. |
@renosatyaadrian I very seriously doubt that you would expect to improve upon a model trained on perhaps thousands of images of motorbikes by training it on 34 images and then expecting it to generalize better. If you come back with a dataset of 3400 images then perhaps. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
@glenn-jocher I want to apply transfer learning fine-tuning (add 2 or 3 layers to the YOLOv3 architecture and train it on my custom dataset) on darknet. Any help, bro? Thanks a lot |
@glenn-jocher I'm a bit confused between transfer learning and training on pretrained weights. Aren't both the same? In transfer learning we use pretrained model weights, freeze the layer weights, and fine-tune the model. So when training from a pretrained model, are you just using the weights without freezing any layers, and updating the weights as training progresses? |
@Dhruv312 it's just wording. Basically freezing will always lead to worse results on large datasets. |
@kairavpatel I get how you are confused with this because I was in the same position. Let me explain: When we say using pre-trained weights, the assumption is that our new custom data has images with labels that are also available in COCO. There are no new class labels in our new dataset, therefore we can use pre-trained weights from COCO for our desired class. Transfer learning is done when we have a new class in our custom dataset. The COCO pre-trained model will not be able to detect it. Therefore, we will have to train the model with COCO + custom dataset with updated class labeling (0-79 COCO labels, 80-n new labels from custom data). This is transfer learning, where pre-trained weights can be utilized to make predictions on a custom dataset with new class labels. To speed up the process, layer freezing is done for faster training time; however, mAP is reduced almost every time. Hope this explanation helps! |
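The relabeling described above (0-79 for COCO, 80 and up for new classes) can be sketched as a small text transformation. This assumes Darknet-style label lines of the form `class x_center y_center width height`, which the thread does not spell out, and the helper name `shift_label_line` is hypothetical.

```python
COCO_CLASS_COUNT = 80


def shift_label_line(line: str, offset: int = COCO_CLASS_COUNT) -> str:
    """Shift the class index of one 'class x_center y_center width height' line."""
    class_id, *box = line.split()
    return " ".join([str(int(class_id) + offset), *box])


# Custom dataset labelled with its own indices 0 and 1 (made-up boxes)
custom_labels = ["0 0.50 0.50 0.20 0.30", "1 0.25 0.40 0.10 0.10"]
merged_labels = [shift_label_line(line) for line in custom_labels]
print(merged_labels)  # ['80 0.50 0.50 0.20 0.30', '81 0.25 0.40 0.10 0.10']
```

After the shift, the custom labels can sit alongside unmodified COCO labels in one combined dataset without index collisions.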
@Utsabab Great question! In the context of using pretrained weights, the idea is to leverage what the model has already learned without specifically freezing any layers. This means we update the weights across all layers based on the new data, which usually gives a better performance because the model can adapt more flexibly to the new task. Transfer learning, as often discussed, involves modifying or extending the existing model architecture to better fit new data, which can include freezing certain layers to not update during training. Essentially, using pretrained weights and not freezing layers allows the whole model to adjust and learn from the new data, while freezing layers in transfer learning is more about fine-tuning or adapting the model to new, possibly related tasks. So, when training with pretrained weights the command is simple: python3 train.py --weights yolov3.pt --data yourdata.yaml And there's no need to explicitly freeze layers unless you have a very specific case where you believe it's necessary. 🤓 Hope that clarifies things! |
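For the specific case where freezing is wanted, the mechanism is the `requires_grad` flag on each parameter. The following is a minimal PyTorch sketch on a toy model, not the repo's training code; `freeze_all_but`, `trainable_names` and the layer sizes are made up for illustration.

```python
import torch.nn as nn


def freeze_all_but(model: nn.Module, trainable_prefixes: tuple) -> None:
    """Leave gradients on only for parameters whose name starts with a given prefix."""
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(trainable_prefixes)


def trainable_names(model: nn.Module) -> list:
    """Names of the parameters an optimizer would still update."""
    return [name for name, param in model.named_parameters() if param.requires_grad]


# Toy model: a "backbone" conv followed by an output conv with 255 channels
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 255, 1))
freeze_all_but(model, ("2.",))
print(trainable_names(model))  # ['2.weight', '2.bias']
```

Skipping the `freeze_all_but` call leaves every parameter trainable, which corresponds to the plain "train from pretrained weights" command given above.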
This guide explains how to train your data with YOLOv3 using Transfer Learning. Transfer learning can be a useful way to quickly retrain YOLOv3 on new data without needing to retrain the entire network. We accomplish this by starting from the official YOLOv3 weights, and setting the .requires_grad field to false on each layer that we do not want to calculate gradients for and optimize.
Before You Start
git clone https://github.com/ultralytics/yolov3
bash yolov3/data/get_coco2017.sh
Transfer Learning
1. Download pretrained weights from our Google Drive folder that you want to use to transfer learn, and place them in yolov3/weights/.
2. Update *.cfg file (optional). Each YOLO layer has 255 outputs: 85 outputs per anchor [4 box coordinates + 1 object confidence + 80 class confidences], times 3 anchors. If you use fewer classes, reduce filters to filters=[4 + 1 + n] * 3, where n is your class count. This modification should be made to the layer preceding each of the 3 YOLO layers. Also modify classes=80 to classes=n in each YOLO layer, where n is your class count.
3. Train.
Run the above code to transfer learn on COCO, or specify your own data as --data data/custom.data (See https://github.com/ultralytics/yolov3/wiki/Train-Custom-Data).
If you created a custom *.cfg file, specify it as --cfg custom.cfg.
You can observe in the Model Summary (using model_info(model, report='full') in train.py) that only the 3 YOLO layers have their gradients activated now (all other layers are frozen for duration of training):
Reproduce Our Environment
To access an up-to-date working environment (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled), consider a: