Guided reclock interface #5
First draft:

```cpp
// This interface is queried from the graph clock, preferably in IMediaFilter::SetSyncSource()
IGuidedReclockDraft : IUnknown
{
    // Returns the average amount of adjustment per second below which the audio renderer is
    // guaranteed to use high quality sample rate conversion instead of time stretching.
    // Usually, the audio renderer allows short bursts of larger adjustments before
    // switching to time stretching.
    //
    // The value is guaranteed to stay the same until a new graph clock is
    // set by IMediaFilter::SetSyncSource().
    STDMETHOD(GetTimestretchingThreshold)(REFERENCE_TIME* pThreshold) = 0;

    // Instantly offsets the graph clock by the requested amount. When the method returns,
    // the clock is guaranteed to be adjusted already. The audio renderer is responsible
    // for smoothing out the values and catches up to the adjustment after some time.
    STDMETHOD(OffsetClock)(REFERENCE_TIME offset) = 0;
};
```
@zachsaw This is what I meant by "new public interface". Will it be useful? Do video renderers need something more?
This is perfect! Very very useful, and won't take much time to add to MPDN at all! BTW, would it be possible to do pitch shift via the same interface too? For example, someone might want to correct PAL speedup by lowering the framerate along with the pitch.
Actually, thinking about this a bit more, I think OffsetClock needs to be against the system clock (i.e. QPC) and not relative to the default reference clock derived from the audio card. There's no way for a video renderer to know what the exact drift is to make a correction until it's gone about 4 minutes into playback without pausing. To make this useful without the user having to wait 4 minutes to fine-tune each media type of different FPS, the offset needs to be made relative to the QPC clock - this is how reclock does it too. With reclock, you can see MPDN's refclk deviation is always 0%. The only trouble is, there's no way to change this refclk deviation via an interface. Even if it's relative to system time, a video renderer still has to figure out the actual FPS of the source before setting the offset. Once it knows the actual FPS, it's quite easy for the video renderer to change the refclk deviation so it speeds up or slows down to match the FPS to the display refresh rate.
BTW, this would not work if we're bitstreaming, right?
0.1% difference in pitch is not perceivable. I think it makes more sense to just slave to the monitor rate, and if the drift is low enough the audio renderer will adjust with pitch (not time stretching).
Yes, I can't do much there. Reclock and bitstreaming just can't work together. Decoding and re-encoding is not really an option.
Resolution of the graph clock is already the same as QPC. I receive the audio position in conjunction with the QPC reading for that position, then "overclock"/extrapolate with the current QPC value.
It's definitely possible for the video renderer to do all of that, but wouldn't it be better if it were implemented in the audio renderer? Then it saves all of us having to write our own version of it.
If you don't provide us with the
I thought the video renderer could drop clock smoothing entirely with
I just need to be sure we're not complicating the interface needlessly.
Yes, that's the goal, but in order to do that we'll need to call OffsetClock with a value that corrects the drift. But what is the drift? We need something to detect the drift first, don't we? Otherwise, what value do we call OffsetClock with?
The goal is to correct the reference clock so it changes the playback rate to match the display refresh rate, isn't it? If so, since the display refresh rate is always measured relative to the system clock (QPC), it's only natural for video renderers to first correct the audio drift continually to keep it as close to the system clock as possible, then apply the difference needed to match the display refresh rate (calculation-wise, not literally calling OffsetClock in two steps, of course). All I'm saying is, if you make it slave to the system clock to begin with, then all we have to do in video renderers is apply that final difference.
Hmm, just so I understand what your original thoughts were, could you let me know how you'd expect OffsetClock to be called so we could replicate what reclock does in its dormant mode (i.e. it simply slaves to the system clock)?
You have the current graph time and the current QPC. But, just thought of it, I can't warp the graph clock backwards - MSDN says so. For backwards adjustments the graph clock will stay at the same value for some time. So we need
Yes, we're on the same page as far as correcting drift is concerned then. However, all I'm asking is that you do the subtraction and adjustment internally - you have the current graph time and current QPC too, so why make the video renderer / player do it? I think there's no need for backward adjustments, is there? A negative offset only slows down the clock; you should never make it go backwards...
It will add a layer of hidden adjustment invisible to the video renderer. And it will limit the usefulness of
I was thinking about the situation where the movie and display rates are the same, but frames are slightly misaligned. When the display frame comes just a little before the movie frame, I thought it would make more sense to do a small backwards adjustment than to adjust forward. Especially for ~24p content on ~24p displays.
And these hidden adjustments will be quite large. There's a delay between when I call
Though I may return a lesser value in
Yes. I don't think I've seen a refclk deviation of more than 0.05% on any system, so it should be safe to assume a 0.1% safety margin. So as we stand, the interface looks something like this now?

```cpp
IGuidedReclockDraft : IUnknown
{
    // Returns the average amount of adjustment per second below which the audio renderer is
    // guaranteed to use high quality sample rate conversion instead of time stretching.
    // Usually, the audio renderer allows short bursts of larger adjustments before
    // switching to time stretching.
    //
    // The value is guaranteed to stay the same until a new graph clock is
    // set by IMediaFilter::SetSyncSource().
    STDMETHOD(GetTimestretchingThreshold)(REFERENCE_TIME* pThreshold) = 0;

    STDMETHOD(SlaveToSystemClock)(BOOL enable) = 0;

    // Instantly offsets the graph clock by the requested amount. When the method returns,
    // the clock is guaranteed to be adjusted already. The audio renderer is responsible
    // for smoothing out the values and catches up to the adjustment after some time.
    STDMETHOD(OffsetClock)(REFERENCE_TIME offset) = 0;
};
```

For a video renderer to replicate reclock, we can simply do the following:
I think I've covered all the permutations?
More or less, but I expected
OffsetClock(0) means turning off any offsets in relative terms, doesn't it?
So essentially all we have to do to set a new offset is like this?

```cpp
OffsetClock(X);
OffsetClock(-X);
OffsetClock(Y);
```
Hmm, I'm not sure I like that at all. I know MPDN wouldn't work with a clock that warps backwards, and I believe a lot of other players wouldn't either - because MSDN specifically disallows it, for good reasons.
That's why we should include a method in
And you expected
Yes, there are good reasons why MSDN doesn't allow it; otherwise it would've been implemented. I can't remember off the top of my head why that was the case, but it made sense when I was writing the player from scratch. By adding
Adjusting only the rate won't allow you to get rid of a half-frame offset; for 24p it's around 20ms, which is a lot.
That's already possible - by changing the audio delay in LAV Audio Decoder (i.e. changing the timestamp).
It makes a lot more sense to include such adjustments in this interface.
There's no need to create a new interface for this at all. In fact, any player can change it (the audio timestamp) without an additional interface (a simple audio passthrough transform filter would do the job). If that's all
What reclock does is slave the audio clock to the system clock, and then allow an additional offset to be applied on top.
Player or video renderer? How exactly? And the interface will do what we include in it, nothing more, nothing less.
Just change the timestamp of the audio samples. It's up to the audio renderer then to gradually catch up / slow down. It is for this particular reason that you don't get an audio break when you change the delay.
But a video renderer can't change the timestamps of audio samples. We're designing the interface between the video renderer and the audio renderer here.
Why not? You can easily traverse the whole graph and add your own audio passthrough transform filter between the audio renderer and whatever comes before it. That's a much more generic solution that will work with every audio renderer out there.
I initially thought the whole point of IGuidedReclock was to provide something like reclock, where you change sample rates so you can speed up / slow down audio playback... What you're suggesting is simply changing the delay, which wouldn't be of much more use than what's already available. Actually, come to think of it, maybe we should leave IGuidedReclock out entirely. Like I said, the whole thing can easily be implemented as a transform filter that can be used with every audio renderer, including DirectSound.
What I propose is that
You know, by that logic video renderers could come equipped with their own audio renderers (heck, why not). This is going nowhere; I propose a timeout. Also, I'm packing for vacation right now - will be back on August 23. But I'll try to release sanear with the status page fixes before going. Also, we should really include madshi in the talk in case he's interested.
Well, having an audio transform filter isn't any different from a custom audio renderer. For example, MPDN (or madVR) still needs to traverse the graph to find out if Sanear is used as the audio renderer. Instead, it would just do the same for the audio transform filter. BTW, I'm not suggesting that a video renderer should include its own audio transform filter - but a player usually does that.
Anyway, enjoy your vacation! Try not to think too hard about this :)
It seems there are 2 different topics here. Here are my thoughts on them:
To be honest, I've zero experience with the audio side of DirectShow. I've no idea how much tweaking the audio renderer might be able to do without doing actual resampling. Maybe some tricks are possible by talking to OS audio APIs, the audio driver or even the hardware (e.g. changing audio playback clocks to odd values or something). If any of that is feasible at all, then it makes a lot of sense to implement all this in the audio renderer, because it wouldn't make sense for an audio transform filter to do things like talking to the audio driver. If this reclock magic is solely based on either resampling or time stretching, then zachsaw has a point in suggesting that this could be done in a simple transform filter and would not necessarily have to be part of the audio renderer. However, where it's implemented is ultimately the decision of the developer who implements it, and I personally don't care much where it ends up, so it's fine with me either way. Of course, the benefit of doing this in a transform filter would be that it could be used with any audio renderer. Not sure if that's just a theoretical advantage or a practical one.
To make it short: the way madVR works, the originally suggested interface would be just perfect for my needs. It would not only allow me to avoid any frame drops/repeats (when the refresh rate and frame rate are reasonably close), it would furthermore allow me to make lipsync perfect. That's just my 2 cents, of course. A separate GetTime() method which allows back warping might be useful; adding it wouldn't hurt, in any case. Personally, I'll probably never call GetTimestretchingThreshold, but that's just me.