
YoloNAS training time #2056

Open
shekarneo opened this issue Oct 4, 2024 · 7 comments

Comments

@shekarneo

💡 Your Question

I am training the YoloNAS keypoint detection model on a custom dataset of 1000 images, and training is taking 45 minutes per epoch. Does the model train on the original image size, or is it resized to something like 640x640? Is this behavior normal?

Versions

No response

@BloodAxe
Contributor

BloodAxe commented Oct 4, 2024

It depends on many factors: model size, the number of dataloader workers you have set, and your GPU and CPU.
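To see how much the dataloader workers alone matter, here is a minimal probe (not from the thread; sizes are tiny synthetic stand-ins for real images) that times one epoch of iteration at different `num_workers` settings. With real image decoding and augmentations the gap is typically far larger:

```python
# Rough probe of how num_workers affects data-loading time.
# The dataset here is a synthetic stand-in; real 640x640 images with
# heavy augmentations make the difference much more pronounced.
import time

import torch
from torch.utils.data import DataLoader, TensorDataset


def time_epoch(num_workers, n_samples=64, batch_size=16):
    """Iterate one epoch over a synthetic dataset; return elapsed seconds."""
    data = TensorDataset(torch.randn(n_samples, 3, 64, 64))
    loader = DataLoader(data, batch_size=batch_size, num_workers=num_workers)
    start = time.perf_counter()
    for (batch,) in loader:
        pass  # stand-in for the forward/backward pass
    return time.perf_counter() - start


if __name__ == "__main__":
    for workers in (0, 2, 4):
        print(f"num_workers={workers}: {time_epoch(workers):.3f}s")
```

If per-epoch time drops sharply as workers increase, the bottleneck is data loading rather than the model itself.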

@shekarneo
Author

Okay, after training 20 epochs my AP was 62% and AR was 100%, yet still no keypoints are detected.

@shekarneo
Author

shekarneo commented Oct 7, 2024

I annotated the dataset using the CVAT tool and exported it to COCO format.
Below are the training results:

SUMMARY OF EPOCH 24
├── Train
│   ├── Yolonasposeloss/loss_cls = 2.464
│   │   ├── Epoch N-1      = 2.6261 (↘ -0.1621)
│   │   └── Best until now = 2.6261 (↘ -0.1621)
│   ├── Yolonasposeloss/loss_iou = 0.0
│   │   ├── Epoch N-1      = 0.0    (= 0.0)
│   │   └── Best until now = 0.0    (= 0.0)
│   ├── Yolonasposeloss/loss_dfl = 0.0
│   │   ├── Epoch N-1      = 0.0    (= 0.0)
│   │   └── Best until now = 0.0    (= 0.0)
│   ├── Yolonasposeloss/loss_pose_cls = 0.0
│   │   ├── Epoch N-1      = 0.0    (= 0.0)
│   │   └── Best until now = 0.0    (= 0.0)
│   ├── Yolonasposeloss/loss_pose_reg = 0.0
│   │   ├── Epoch N-1      = 0.0    (= 0.0)
│   │   └── Best until now = 0.0    (= 0.0)
│   └── Yolonasposeloss/loss = 2.464
│       ├── Epoch N-1      = 2.6261 (↘ -0.1621)
│       └── Best until now = 2.6261 (↘ -0.1621)
└── Validation
    ├── Yolonasposeloss/loss_cls = nan
    │   ├── Epoch N-1      = nan    (= nan)
    │   └── Best until now = nan    (= nan)
    ├── Yolonasposeloss/loss_iou = 0.0
    │   ├── Epoch N-1      = 0.0    (= 0.0)
    │   └── Best until now = 0.0    (= 0.0)
    ├── Yolonasposeloss/loss_dfl = 0.0
    │   ├── Epoch N-1      = 0.0    (= 0.0)
    │   └── Best until now = 0.0    (= 0.0)
    ├── Yolonasposeloss/loss_pose_cls = 0.0
    │   ├── Epoch N-1      = 0.0    (= 0.0)
    │   └── Best until now = 0.0    (= 0.0)
    ├── Yolonasposeloss/loss_pose_reg = 0.0
    │   ├── Epoch N-1      = 0.0    (= 0.0)
    │   └── Best until now = 0.0    (= 0.0)
    ├── Yolonasposeloss/loss = nan
    │   ├── Epoch N-1      = nan    (= nan)
    │   └── Best until now = nan    (= nan)
    ├── Ap = 0.5355
    │   ├── Epoch N-1      = 0.5786 (↘ -0.0431)
    │   └── Best until now = 0.8088 (↘ -0.2732)
    └── Ar = 1.0
        ├── Epoch N-1      = 1.0    (= 0.0)
        └── Best until now = 1.0    (= 0.0)
        

And my YAML file is:

num_joints: 12

oks_sigmas: [0.025, 0.025, 0.025, 0.025, 0.025, 0.025, 0.025, 0.072, 0.072, 0.025, 0.025, 0.025]

edge_links:
  - [3,9]
  - [10,2]
  - [0,7]
  - [9,6]
  - [4,0]
  - [10,7]
  - [1,2]
  - [11,5]
  - [4,9]
  - [7,8]
  - [1,11]
  - [11,4]
  - [6,10]
  - [8,3]

edge_colors:
  - [214, 39, 40]
  - [214, 39, 40]
  - [214, 39, 40]
  - [214, 39, 40]
  - [214, 39, 40]
  - [214, 39, 40]
  - [214, 39, 40]
  - [214, 39, 40]
  - [214, 39, 40]
  - [214, 39, 40]
  - [214, 39, 40]
  - [214, 39, 40]
  - [214, 39, 40]
  - [214, 39, 40]


keypoint_colors:
  - [250, 50, 83]
  - [250, 50, 83]
  - [250, 50, 83]
  - [250, 50, 83]
  - [250, 50, 83]
  - [250, 50, 83]
  - [250, 50, 83]
  - [250, 250, 55]
  - [250, 250, 55]
  - [250, 250, 55]
  - [250, 250, 55]
  - [250, 250, 55]

@BloodAxe
Contributor

BloodAxe commented Oct 7, 2024

What worries me in the reported losses is the zero values for the pose/bbox regression losses.
This may indicate there are 0 matches between the ground-truth boxes/poses and the boxes/poses predicted by the model.
Can you attach an example image and the annotation JSON that you exported? I would double-check that there are no export issues in the first place.

You are probably aware of it, but this notebook shows fine-tuning of YoloNAS-Pose on animal poses and works well: https://github.com/Deci-AI/super-gradients/blob/master/notebooks/YoloNAS_Pose_Fine_Tuning_Animals_Pose_Dataset.ipynb
So my best guess for the root cause of your problem is the data.

@shekarneo
Author

shekarneo commented Oct 7, 2024

Hi, I am not able to share the images here, but I can share the annotation files. I am also attaching the Python script used for training.

yolo_nas_pose_fine_tuning_custom_dataset.py.txt
Training.json.txt
validation.json.txt

@shekarneo
Author

And in my case, I don't need joint connections between the keypoints.
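If that is the case, the skeleton entries can likely be left empty. This is an unverified assumption: as far as I can tell, `edge_links` and `edge_colors` in super-gradients only control how the skeleton is drawn in visualizations, not the loss, so a config sketch like this might work (worth verifying against the notebook linked above):

```yaml
# Assumption: edge_links/edge_colors affect visualization only, not training.
num_joints: 12
oks_sigmas: [0.025, 0.025, 0.025, 0.025, 0.025, 0.025, 0.025, 0.072, 0.072, 0.025, 0.025, 0.025]
edge_links: []    # no skeleton connections drawn
edge_colors: []   # must stay in sync with edge_links
# keypoint_colors stay unchanged (12 entries, one per joint)
```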

@Sriparna2024

Hi, I am also facing a similar issue. Please advise on how to export a .json file from the CVAT tool that is compatible with YoloNAS-Pose. Is the .yaml file stated above formatted correctly?
