New frame selection model #195
Conversation
@@ -448,7 +448,7 @@ def test_megadetector_lite_yolox_dog(tmp_path):
     "-vcodec",
     "libx264",
     "-crf",
-    "25",
+    "23",
Set CRF to the default (otherwise the test fails due to lossy compression when creating the test video): https://trac.ffmpeg.org/wiki/Encode/H.264
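For reference, the encoding step the test performs can be sketched as a plain ffmpeg invocation; CRF 23 is libx264's default quality setting, and higher values are lossier. The paths and the rest of the argument list here are illustrative, not the test's exact command:

```python
import subprocess

# Hypothetical paths for illustration; the test builds an equivalent
# command when writing its synthetic clip.
input_path = "frames_%03d.png"
output_path = "test_video.mp4"

# CRF 23 is the libx264 default; raising it increases compression loss,
# which is what made frame comparisons in the test fail at CRF 25.
cmd = [
    "ffmpeg",
    "-i", input_path,
    "-vcodec", "libx264",
    "-crf", "23",
    output_path,
]

print(" ".join(cmd))
```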
Codecov Report
@@ Coverage Diff @@
## master #195 +/- ##
========================================
+ Coverage 85.3% 86.9% +1.6%
========================================
Files 30 29 -1
Lines 1858 1901 +43
========================================
+ Hits 1585 1653 +68
+ Misses 273 248 -25
@pjbull this is ready for your review. The only failing test is due to Netlify. Note: I've used this code to successfully train and predict with the new frame selection method on the original set of 15k videos, but I'll open a separate PR with the new model weights once this is reviewed and merged. We know from testing that this model is equivalently fast at a video level. However, since this model uses 640 x 640 as input for the MDLite model, the number of workers / batch size needs to be decreased for training and inference to avoid running out of GPU memory. This makes training and inference with this model slower and is the biggest current drawback.
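The memory pressure from the larger input can be roughed out with a back-of-the-envelope calculation: a batch of square RGB float32 frames scales quadratically with side length, so 640 x 640 inputs need about 2.4x the memory of the old 416 x 416 inputs. The batch size below is an arbitrary illustration, not a value from the repo:

```python
import numpy as np

def frame_batch_bytes(batch_size, side, channels=3, dtype=np.float32):
    """Rough memory footprint of one batch of square RGB frames."""
    return batch_size * side * side * channels * np.dtype(dtype).itemsize

old = frame_batch_bytes(batch_size=32, side=416)  # previous yolox-nano input size
new = frame_batch_bytes(batch_size=32, side=640)  # new yolox-tiny input size

# 640x640 inputs need (640/416)^2 ~= 2.37x the frame memory, which is why
# batch size / num workers must shrink to fit on the same GPU.
print(round(new / old, 2))
```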
zamba/object_detection/yolox/assets/yolox_tiny_640_20220528_model_kwargs.json
…zamba into new-frame-selection-model
✅ Deploy Preview for silly-keller-664934 ready!
Ready for another look @pjbull. Addressed all your comments and, as a bonus, fixed all the object detection links in the docs that had been broken.
Two little things that aren't dealbreakers:
resized_frames = []
resized_video = np.zeros(
    (video.shape[0], video.shape[3], self.config.image_height, self.config.image_width),
    dtype=np.float32,
Are these floats at this point? May be worth double-checking, since image data often gets loaded as uint8.
The image is loaded as an int here: https://github.com/drivendataorg/zamba/blob/master/zamba/object_detection/yolox/megadetector_lite_yolox.py#L107
But the output of `_preprocess` is a float, which is what gets slotted in: https://github.com/drivendataorg/zamba/blob/master/zamba/object_detection/yolox/megadetector_lite_yolox.py#L115-L124
AFAICT this is the correct input for MDLite.
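The dtype flow in this thread can be sketched as follows. The shapes and the `preprocess_sketch` helper are hypothetical stand-ins (the real `_preprocess` also resizes and normalizes); the point is just that a uint8 frame comes in, a float32 array comes out, and assigning it into the preallocated float32 buffer keeps everything float32:

```python
import numpy as np

# A decoded video frame typically arrives as uint8 in HWC layout.
frame = np.random.randint(0, 256, size=(640, 640, 3), dtype=np.uint8)

def preprocess_sketch(frame):
    # Hypothetical stand-in for the real _preprocess: cast to float32
    # and move channels first (resizing/normalization omitted).
    return frame.astype(np.float32).transpose(2, 0, 1)

# Preallocated buffer, matching the np.zeros(..., dtype=np.float32) above.
resized_video = np.zeros((1, 3, 640, 640), dtype=np.float32)
resized_video[0] = preprocess_sketch(frame)

print(resized_video.dtype)  # float32
```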
Replaces the existing frame selection model (yolox-nano, image size 416, trained on 80k frames) with a new model (yolox-tiny, image size 640, trained on 800k frames).
Bonus fixes:
- timm
Closes https://github.com/drivendataorg/pjmf-zamba/issues/88