Audible noise in watermarked audio even with --strength 5, also not extractable even at --strength 100 #48
Comments
That is interesting. I've run automated tests with a big set of music files, and I've never come across a file that would be impossible to watermark (or rather, where the watermark could not be detected when using a sufficiently high strength). Are the files long enough (one minute would be best)? Does the synchronization work (are the positions of the patterns detected at the right locations)? Maybe you can share one of these files by mail, or send me a link where I can get access? It is also strange (but of course not impossible) that the watermark is audible.

Indeed one approach would be parameter tuning. There is a bit of discussion about the frame size parameter in this PR #34. Most parameters should be in Params, in wmcommon.hh / wmcommon.cc. Using a bigger Note that the parameters depend on each other, so if you change one, you might also need to change others. The frequency range is controlled by Maybe if noisiness is a problem, making the watermark more sparse by using fewer bands over the frequency range could have an effect (so that would be using less

Unfortunately, tuning parameters to get a good result is not going to be easy, and things may break in unexpected ways, as doing this in a user-friendly way is not really supported at this point. It's a bit of trial and error to find the right balance so that all parameters play well together. There is also stuff in wmspeed.cc that somehow depends on the parameters without being in the main parameter section.

Btw, we have some developer documentation (already written) that will be included in the next release, which is coming really soon. This could help you better understand what the parameters actually do.
Hey @swesterfeld, thanks for getting back! The file is 18 seconds long, but it's probably not a length issue, as I have it working with much shorter files (e.g. 3 seconds). I can send the file by email but can't find your email address anywhere. Do you mind sharing it? This is an example output of running the watermark
When I try to use
Thanks for the details regarding the parameters. I will look into that, and I'm looking forward to the developer documentation.
Hey Stefan, I just sent you an email, let me know if you need anything else!
Well, I can at least tell you why the watermark detection doesn't work as expected: after listening to the file and looking at a spectrum view, I can see that for this file most of the signal energy is in the very low frequencies. However,
and all other bins are above this one, whereas for your file most of the energy is below this frequency.
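A rough way to spot this situation ahead of time is to measure how much of a signal's spectral energy lies below a given frequency. The sketch below is not part of audiowmark: `energy_fraction_below` and its parameters are hypothetical names, the cutoff frequency is left for the caller to choose, and a naive DFT is used only to keep the example dependency-free (an FFT library would be the practical choice for real files).

```python
import cmath
import math


def energy_fraction_below(samples, sample_rate, cutoff_hz):
    """Return the fraction of non-DC spectral energy below cutoff_hz.

    samples: sequence of floats (mono), sample_rate in Hz.
    Uses a naive O(n^2) DFT, so keep the input short (a few hundred samples).
    """
    n = len(samples)
    total = 0.0
    low = 0.0
    # Only the bins 1 .. n/2 are needed for a real-valued signal; DC is skipped.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        energy = abs(coeff) ** 2
        total += energy
        if k * sample_rate / n < cutoff_hz:
            low += energy
    return low / total if total > 0 else 0.0


def sine(freq_hz, sample_rate, n):
    """Generate n samples of a sine tone (helper for the usage example)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]
```

For a file like the one described here, a value close to 1.0 for a cutoff at the lowest watermark frequency would suggest that most of the energy sits below the watermarked range.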
Thanks Stefan, I thought it had something to do with the frequencies in the audio file being lower than those of the watermark. If we lower the settings below those of the audio file, would it still work? Or is there a lower limit where it stops being reliable? Also, for similar audio files where the extraction works but with noticeable noise, would spreading the watermark a little help with the noise? Finally, I know it's a long shot, but if audiowmark could detect the frequencies in a file and raise an error or show a message when watermarking is not going to work, that would be amazing.
Ok, I found an improvement that fixes the problems with your file. Changing the window function like this:
sounds better. I think it might even be reasonable to use this by default in new releases. I'd have to run a few tests to see if that affects robustness in a negative way. What is even better: with the change of the window function, watermark detection also works for your file. I guess the reason is that the window change reduces how much frequency bands that are somewhat apart affect each other. In your case, the very low frequencies do not affect the higher frequencies that much with the changed window function.

As for the frequency range (not sure if you still want to experiment with that), I think the absolute minimum bin that you can use from a signal processing point of view is bin 4. However, you probably need to do listening tests to see how it sounds. The actual frequency will depend on the bin number and the frame size.
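To make the relation between bin number, frame size and frequency concrete, here is a minimal sketch. It assumes the frame size equals the FFT length, so that bin k is centered at k * sample_rate / frame_size. The function name and the values 44100 Hz and 1024 are illustrative assumptions, not settings confirmed in this thread.

```python
def bin_to_frequency(bin_index, sample_rate, frame_size):
    """Center frequency in Hz of an FFT bin, assuming frame_size is the FFT length."""
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")
    return bin_index * sample_rate / frame_size


# Illustrative values only: 44100 Hz sample rate, 1024-sample frames.
lowest_usable = bin_to_frequency(4, 44100, 1024)
print(f"bin 4 is centered at about {lowest_usable:.1f} Hz")
```

Doubling the frame size halves the frequency of every bin, which is one reason why the frame size and the band range cannot be tuned independently.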
Not sure if implementing this is really worth it. However, if you want to test whether watermarking a file was successful, you can simply use audiowmark get after adding the watermark and look at the sync score. For instance:
Now 1.389 is significantly higher than 1.0, so we can assume it worked.
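For an automated pipeline, this check could be scripted roughly as follows. This is a sketch under assumptions: it assumes each result line printed by `audiowmark get` starts with `pattern` and carries the sync score in the fourth whitespace-separated column, and the helper names are hypothetical. Verify the layout against the output of your installed version before relying on it.

```python
import subprocess


def parse_sync_scores(output_text):
    """Collect sync scores from lines that look like detected patterns.

    Assumed layout (check against your audiowmark version):
    pattern <time> <payload> <sync score> <decoding error> [type]
    """
    scores = []
    for line in output_text.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[0] == "pattern":
            try:
                scores.append(float(fields[3]))
            except ValueError:
                continue  # not a numeric score, skip this line
    return scores


def best_sync_score(path):
    """Run `audiowmark get` on a file and return the highest sync score (0.0 if none)."""
    result = subprocess.run(
        ["audiowmark", "get", path], capture_output=True, text=True, check=True
    )
    return max(parse_sync_scores(result.stdout), default=0.0)
```

The caller still has to decide how far above 1.0 a score must be to count as a success. If the watermark was added with a key, the same key has to be passed when extracting.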
That's amazing, thank you very much. I don't think I need to experiment with the settings if it's working. I would not want to mess with the source code and lose the ability to update when there is a new release, unless there is a way to experiment without having to change the source code. Should we expect an update soon, or do you think I should just go ahead and make the change directly? One more question: I noticed that "Data Blocks" was 0 when adding a watermark to the file. Does 0 mean it hasn't been watermarked correctly? I'm building an API that's supposed to handle a lot of files and would like to avoid additional processing if it's not necessary.
There should be a new release soon. If you want to play with the change, you can do so by changing the source now. I hope that for the next release this will no longer be necessary (I'm currently running a few tests, to see if the change has a negative robustness impact for other files).
No, the number of data blocks only depends on the length of the file. Each data block is ~50 seconds long, so for a 1 minute file, you'll always see one data block being written. Note that watermarks on files shorter than one data block can still be detected; what is possible there really depends on the strength and on the amount of changes made to the file after watermarking (e.g. mp3 compression).
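As a rough illustration of why an 18 second file reports 0 data blocks, the count can be estimated from the duration alone. The 50 second figure is the approximate value mentioned above, and the function name is hypothetical; the exact block length in audiowmark may differ slightly.

```python
DATA_BLOCK_SECONDS = 50.0  # approximate block length taken from the discussion


def estimated_data_blocks(duration_seconds, block_seconds=DATA_BLOCK_SECONDS):
    """Estimate how many complete data blocks fit into a file of the given length."""
    if duration_seconds < 0:
        raise ValueError("duration must not be negative")
    return int(duration_seconds // block_seconds)


for seconds in (18, 60, 120):
    print(seconds, "s ->", estimated_data_blocks(seconds), "data block(s)")
```

A result of 0 therefore only says that the file is shorter than one block, not that watermarking failed.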
Awesome news, thank you! By the way, I tested this and it works. I can't hear any noise and the hash is extractable, which is amazing. As you mentioned, this is probably going to be best as a default. Looking forward to the results of your tests. However, I think the cos window might be less robust on high-frequency files. I'm not sure if it would be possible to analyze the audio frequencies, but if so, it might be best to switch between the two, or to have an option or a fallback mechanism for cases where it would be problematic. Makes sense regarding the data blocks, thanks. One more question if you don't mind: which settings need to stay the same on both
I think that's my last question, thanks again!
I have tested this now with lots of files in lots of situations, and on average changing the window function does have a positive effect on robustness, although it is not a lot. So I'll make that the default. It is also somewhat backward compatible, so if you have a hamming-window watermarked file, it can usually be extracted with the new window function, although extraction would work a little better if you watermark and extract with the same window.
Of course the key. If you watermark with different strengths (>=10), you can extract with the default strength, so you need not match the strength here. And of course if you use short payload for watermarking, you need to do so for extraction as well.
Hello, first of all, thank you very much for this amazing piece of software. I've been experimenting with it for the past month and it's definitely the best open source project for watermarking audio files.
I've encountered a consistent issue with watermarking certain audio files. The file, which I can describe as "Dark Pulse Flutter", has noticeable noise after watermarking, even when using a minimal strength setting (e.g., --strength 5). In addition, the watermark is not extractable at all, even when increasing the strength (e.g., --strength 100).
Here's what I've observed and attempted so far:
Given the above, I'm seeking advice on adjustments that could help mitigate the noise issue without compromising the watermark's extractability. Here are some specific points and questions:
Watermark Algorithm Sensitivity: Is there a way to adjust the sensitivity of the patchwork algorithm to these audio types, perhaps by modifying the distribution of frequency bands used for embedding?
Parameter Tuning: Could you suggest parameter tuning that might address the noise issue? I am particularly interested in whether there are non-documented parameters or advanced configurations that could be adjusted.
Low-Pass Filter Usage: Since a low-pass filter has helped, is there a recommended approach or a set of parameters that would allow for its use without affecting the watermark?
Frequency Resolution: Would changing the frame size, if possible, provide a solution? If so, how could this be achieved given the current limitations of the command-line options?
Any insights or suggestions would be greatly appreciated. The goal is to find a balance between minimizing the watermark's audibility and ensuring its reliability and extractability, especially for this specific audio type.