Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verify video validity after transcoding #3407

Closed
kontrollanten opened this issue Dec 6, 2020 · 19 comments
Closed

Verify video validity after transcoding #3407

kontrollanten opened this issue Dec 6, 2020 · 19 comments

Comments

@kontrollanten
Copy link
Contributor

kontrollanten commented Dec 6, 2020

Describe the problem to be solved
We've imported a YouTube video to PeerTube HLS. It appeared that the PeerTube video where corrupt at a specific format (720p), which meant that the video crashed at a certain minute during a specific resolution. I downloaded the file and rand the following command.

$ ffmpeg -v error -i [uuid]-720-fragmented.mp4 -f null
[h264 @ 0x56055d962580] Invalid NAL unit size (0 > 1569).
[h264 @ 0x56055d962580] Error splitting the input into NAL units.
[h264 @ 0x56055d938d80] error while decoding MB 43 4, bytestream -6
Error while decoding stream #0:0: Invalid data found when processing input
[+ 15 000 similar lines]

I've tested the 240p version of the video, and it works fine. After doing a new import it works fine.

Describe the solution you would like:
Since data corruption always is a risk when working with data, I guess we should find a way to detect corruptions to avoid a bad user experience. I'd propose a solution where an admin can opt in for validation. When validation is enabled the user have three choices:

A) Don't publish a video before validation for all resolutions has succeeded.
B) Publish the video and all resolutions that didn't fail. If a specific resolution fails, then let the user to replace the file.
C) Run validation after publish and just send a notification the user.

Describe alternatives you have considered

I think the easiest way forward would be to create a plugin that listens for video creations. Upon every creation the file is validated by running ffmpeg. If they fails a notification is sent to the user.

@Chocobozzz
Copy link
Owner

Hello,

Do you have errors in your logs related to this video/resolution?

@kontrollanten
Copy link
Contributor Author

Unfortunately I couldn't find anything. The transcoding job reported no errors and I couldn't find anything somewhere else either.

@Chocobozzz Chocobozzz added the Status: To Reproduce Likely a bug but needs reproduction and/or extended log from the issuer label Dec 14, 2020
@Chocobozzz
Copy link
Owner

Do you think your issue is related to #3596?

@Chocobozzz Chocobozzz added the Status: Waiting for answer Waiting issue author answer label Jan 21, 2021
@kontrollanten
Copy link
Contributor Author

I don't thinks so. The issue with #3596 lead to audio only was played during the whole video. But in this issue the video just crashed at a certain minute.

@Chocobozzz
Copy link
Owner

aaedadd should fix this issue

@Chocobozzz Chocobozzz reopened this Feb 1, 2022
@Chocobozzz Chocobozzz added Component: Transcoding Type: Feature Request ✨ Component: Notifications and removed Status: To Reproduce Likely a bug but needs reproduction and/or extended log from the issuer Status: Waiting for answer Waiting issue author answer labels Feb 1, 2022
@Chocobozzz Chocobozzz changed the title Verify video validity Verify video validity after transcoding Feb 1, 2022
@kontrollanten
Copy link
Contributor Author

We've found three corrupt videos the last week, so I'd say this would be a really helpful feature.

@Chocobozzz
Copy link
Owner

Same AAC issue? If yes, where do these videos come from? PeerTube lives? External tools?

@kontrollanten
Copy link
Contributor Author

Same AAC issue? If yes, where do these videos come from? PeerTube lives? External tools?

Yes, same AAC issue. At least two of them are recorded with OBS and uploaded at the site.

@kontrollanten
Copy link
Contributor Author

I've a branch where I'm working on this. To begin with I'm doing the validation in a job that's created when all transcoding is done, just to keep the logic simple. The drawback is that it will not be able to parallelize the transcoding and validation, which I guess we want?

If we run them in parallel; should the validation be a part of the TO_TRANSCODE state? Or should we put the video in TO_VALIDATE state when pendingTranscode is 0 and pendingValidate > 0?

@kontrollanten
Copy link
Contributor Author

I tried with a solution where I created a TO_VALIDATE state. The issue there is that gets a bit complex in case both HLS and WebTorrent transcoding is enabled. When HLS transcoding is done, pendingTranscode is 0 and it goes into TO_VALIDATE state. Then when WebTorrent transcoding jobs are created, it should go back to TO_TRANSCODE state.

@Chocobozzz Maybe it's better to just to the validation within the TO_TRANSCODE state? From a conceptual perspective it's a part of the transcoding, and a user may not care whether it's transcoding or validating that the transcoding went well.

@Chocobozzz
Copy link
Owner

Sorry for the late answer. I don't have a lot of time to check the best way to implement this issue, but I think we could run the verify function just after transcoding, to ensure the file is okay. If the verification succeed, then we can move the transcoded file to the videos directory and replace the file in the database

kontrollanten added a commit to kontrollanten/PeerTube that referenced this issue Mar 18, 2022
@kontrollanten
Copy link
Contributor Author

kontrollanten commented Mar 30, 2022

After updating ffmpeg to version 5.0 we've had a lot of issues where all resolutions are crashing at the same timestamp. Output of ffmpeg -i fragmented-file.mp4 -f null -:

[aac @ 0x7f8c91826800] channel element 0.0 is not allocated
Error while decoding stream #0:0: Invalid data found when processing input
[aac @ 0x7f8c91826800] channel element 0.0 is not allocated
Error while decoding stream #0:0: Invalid data found when processing input
[aac @ 0x7f8c91826800] channel element 0.0 is not allocated
Error while decoding stream #0:0: Invalid data found when processing input
[aac @ 0x7f8c91826800] channel element 0.0 is not allocated
Error while decoding stream #0:0: Invalid data found when processing input
[aac @ 0x7f8c91826800] channel element 0.0 is not allocated
Error while decoding stream #0:0: Invalid data found when processing input
[aac @ 0x7f8c91826800] channel element 0.0 is not allocated
Error while decoding stream #0:0: Invalid data found when processing input
[aac @ 0x7f8c91826800] channel element 0.0 is not allocated
Error while decoding stream #0:0: Invalid data found when processing input

The original files are big (6-9 gb) and I'm wondering if it may be something during the upload that makes the files corrupt. kukhariev/node-uploadx#518

@Chocobozzz
Copy link
Owner

Chocobozzz commented Mar 31, 2022

You can try to check your assumption by validating the input file

@kontrollanten
Copy link
Contributor Author

It seems that it's related to config.object_storage.max_upload_part. I tried to decrease the limit to 200mb and uploaded a video of 700 mb, and it failed immediately. Have you considered using https://github.com/aws/aws-sdk-js-v3/tree/main/lib/lib-storage to handle the multipart upload?

@Chocobozzz
Copy link
Owner

Closing this issue now the root cause has been fixed. See also #4867 (comment)

@kontrollanten
Copy link
Contributor Author

FYI we've had this problem six times the last month. Same error as described here #3407 (comment) Hopefully the latest node-uploadx will solve it once and for all.

@kontrollanten
Copy link
Contributor Author

We're running a fork with #4867 and we still get some validation errors. Mostly "Missing lock ...", hopefully they will be fixed when switching to bullmq, but also some failed jobs having a lot of [aac @ 0x558cb2e6f140] channel element 0.0 duplicateError while decoding stream #0:1: Invalid data found when processing input errors. Haven't happened since 4.3.0 though, we'll see if it gets better.

@kontrollanten
Copy link
Contributor Author

We're still having the duplicateError even after switching to 4.3.0. When searching at StackOverflow, ffmpeg bugtracker and Google I can't find any hits at all for "duplicateError" though. No idea what it is and why it appears.

@kontrollanten
Copy link
Contributor Author

FYI, same problem appears in PeerTube 5.1.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants