[Proposal] Implement DASH Thumbnail tracks #1496
Open
peaBerberian
wants to merge
1
commit into
dev
from
feat/thumbnail-tracks
peaBerberian
added
DASH
Relative to the DASH streaming protocol
thumbnails
Relative to image thumbnails
labels
Aug 9, 2024
peaBerberian
force-pushed
the
feat/thumbnail-tracks
branch
3 times, most recently
from
August 9, 2024 17:08
1d8c00d
to
308cf1c
Compare
peaBerberian
force-pushed
the
feat/thumbnail-tracks
branch
from
August 29, 2024 19:59
308cf1c
to
cc827a7
Compare
peaBerberian
added
the
Priority: 2 (Medium)
This issue or PR has a medium priority.
label
Sep 3, 2024
peaBerberian
force-pushed
the
feat/thumbnail-tracks
branch
3 times, most recently
from
September 9, 2024 13:49
fe207f0
to
c6b53e6
Compare
Note for later: It seems that in the demo page, the spinner is displayed below the thumbnail div instead of on top. It's too late for CSS right now...
peaBerberian
force-pushed
the
feat/thumbnail-tracks
branch
2 times, most recently
from
September 9, 2024 15:57
84e40e2
to
f0557e3
Compare
peaBerberian
force-pushed
the
feat/thumbnail-tracks
branch
from
October 8, 2024 14:59
f0557e3
to
e811d07
Compare
peaBerberian
force-pushed
the
feat/thumbnail-tracks
branch
2 times, most recently
from
October 29, 2024 14:52
da09494
to
73da5fe
Compare
peaBerberian
force-pushed
the
feat/thumbnail-tracks
branch
from
November 15, 2024 17:46
73da5fe
to
4264c88
Compare
peaBerberian
force-pushed
the
feat/thumbnail-tracks
branch
from
December 10, 2024 14:52
d99d4cd
to
35abeeb
Compare
Overview
========

This is a feature proposal to add support for DASH thumbnail tracks as specified in the DASH-IF IOP 4.3, section 6.2.6.

Those thumbnail tracks generally allow providing previews when seeking, and they have been linked as such in our demo page.

In a DASH MPD
=============

In a DASH MPD (its manifest file), such tracks are set as regular `AdaptationSet` elements, with a `contentType` attribute set to `"image"` and a specific `EssentialProperty` element.

To support multiple thumbnail qualities (e.g. bigger or smaller thumbnails depending on the UI, the usage etc.), multiple `Representation` elements are also possible.

A curiosity is that unlike "trickmode" tracks (which also fill the role of optionally providing thumbnail previews in the RxPlayer, through our experimental `VideoThumbnailLoader` tool), thumbnail tracks are not linked to any video `AdaptationSet`.

So if there are multiple video tracks with different content in them, I'm not sure how we would choose the right thumbnail track, nor how to communicate it through the API.

I guess it could be communicated through a `Subset` element, as defined in the DASH specification to force usage of specific AdaptationSets together, but I have never actually encountered this element in the wild and it doesn't seem to be supported by any player.

The API
=======

Simple solution from other players
----------------------------------

For the API, I saw that most other players do very little. They generally just synchronously return the metadata on a thumbnail corresponding to a specified timestamp.

That metadata includes the thumbnail's URL (e.g. to a jpeg), height and width, but also x and y coordinates, as thumbnails are often in image sprites (images including multiple images). It is then the role of the application/UI to load and crop this correctly.

This seems acceptable to me: after all, UI developers are generally experienced at working with images and browsers are also very efficient with them (e.g. doing an `<img>.src = url` vs fetching the jpeg through a fetch request + linking the content to the DOM). Still, I wanted to explore another way for multiple reasons:

1. As the core of the RxPlayer may run in another thread (in what we call "multithreading mode"), and as for now precise manifest information is only available in the WebWorker, we would either have to make this kind of API asynchronous (which makes it much less easy to handle for an application), or send the corresponding metadata back to the main thread (with the supplementary synchronization complexities that implies).

2. As the thumbnail track is just another AdaptationSet/Representation in the MPD, it may be impacted in the same way by other MPD elements and attributes, like multiple CDNs, content steering...

   Though thumbnail tracks are much less critical (and they also seem explicitly more limited by the DASH-IF IOP than other media types), I have less confidence in being able to provide a stable API in which the RxPlayer would provide all necessary metadata to the application so it can load and render thumbnails, than in just doing the loading and thumbnail rendering ourselves.

Solution I propose
------------------

So I propose here two APIs:

```ts
/**
 * Get synchronously thumbnail information for the specified time, or
 * `null` if there's no thumbnail information for that time.
 *
 * The returned metadata does not allow an application to load and
 * render thumbnails, it is mainly meant for an application to check if
 * thumbnails are available at a particular time and which qualities if
 * there are multiple ones.
 */
getThumbnailMetadata({ time }: { time: number }): IThumbnailMetadata[] | null;

/** Information returned by the `getThumbnailMetadata` method. */
export interface IThumbnailMetadata {
  /** Identifier identifying a particular thumbnail track. */
  id: string;
  /**
   * Width in pixels of the individual thumbnails available in that
   * thumbnail track.
   */
  width: number | undefined;
  /**
   * Height in pixels of the individual thumbnails available in that
   * thumbnail track.
   */
  height: number | undefined;
  /**
   * Expected mime-type of the images in that thumbnail track (e.g.
   * `image/jpeg` or `image/png`).
   */
  mimeType: string | undefined;
}
```

With that API, however, an application has to continuously check whether there is a thumbnail at each timestamp by calling `getThumbnailMetadata` again and again, e.g. as a user moves their mouse over the seeking bar. So I'm still unsure about that part: we could also communicate this information per Period and only once, like we do for audio and video tracks.

And more importantly the loading and rendering API:

```ts
/**
 * Render inside the given `container` the thumbnail corresponding to the
 * given time.
 *
 * If no thumbnail is available at that time or if the RxPlayer does not succeed
 * to load or render it, reject the corresponding Promise and remove the
 * potential previous thumbnail from the container.
 *
 * If a new `renderThumbnail` call is made with the same `container` before it
 * had time to finish, the Promise is also rejected but the previous thumbnail
 * potentially found in the container is untouched.
 */
public async renderThumbnail(options: IThumbnailRenderingOptions): Promise<void>;

export interface IThumbnailRenderingOptions {
  /**
   * HTMLElement inside which the thumbnail should be displayed.
   *
   * The resulting thumbnail will fill that container if the thumbnail loading
   * and rendering operations succeed.
   *
   * If there was already a thumbnail rendering request on that container, the
   * previous operation is cancelled.
   */
  container: HTMLElement;
  /** Position, in seconds, for which you want to provide an image thumbnail. */
  time: number;
  /**
   * If set to `true`, we'll keep the potential previous thumbnail found inside
   * the container if the current `renderThumbnail` call fails with an error.
   * We'll still replace it if the new `renderThumbnail` call succeeds (with the
   * new thumbnail).
   *
   * If set to `false`, to `undefined`, or not set, the previous thumbnail
   * potentially found inside the container will also be removed if the
   * new `renderThumbnail` call fails.
   *
   * The default behavior (equivalent to `false`) is generally more expected, as
   * you usually don't want to provide an unrelated preview thumbnail for a
   * completely different time and prefer to display no thumbnail at all.
   */
  keepPreviousThumbnailOnError?: boolean | undefined;
  /**
   * If set, specify from which thumbnail track you want to display the
   * thumbnail. That identifier can be obtained from the
   * `getThumbnailMetadata` call (the `id` property).
   *
   * This is mainly useful when encountering multiple thumbnail track qualities.
   */
  thumbnailTrackId?: string | undefined;
}
```

Basically this method checks which thumbnail to load, loads it and renders it inside the given element.

For now this is done by going through a Canvas element for easy cropping/resizing. I could also go through an image tag and CSS, but I was unsure of how my CSS would interact with outside CSS I do not control, so I chose for now the maybe-less-efficient canvas way.

As you can see in the method description and in its implementation, a lot of added complexity comes from the fact that we do not control the container element (the application does) and that we're doing the loading ourselves instead of just letting e.g. the browser do it through an image tag:

- Multiple `renderThumbnail` calls may be performed in a row, in which case we have to cancel the previous requests to avoid rendering thumbnails in the wrong order.

- If a new thumbnail request fails, we also have to remove the older thumbnail to avoid having stale data.

- Because there are a lot of operations which may take some (still minor) time, and as thumbnails are often present in the same image sprite as the one before, there is a tiny cache implementation which handles just that case: if the previous image sprite already contains the right data, we do not go through the RxPlayer's core code (which may be in another thread) and back.

Still, I find the corresponding usage by an application relatively simple and elegant:

```js
rxPlayer
  .renderThumbnail({ time, container })
  .then(() => console.log("Thumbnail rendered!"))
  .catch((err) => {
    if (err.code !== "ABORTED") {
      console.warn("Error while loading thumbnails:", err);
    }
  });
```
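To make the sprite handling discussed in this description a bit more concrete, here is a minimal sketch of how the crop area of one thumbnail inside an image sprite could be computed, and how the same check can tell whether an already-loaded sprite covers a wanted time. This is not the code of this PR: `ISpriteInfo`, `ICropArea` and `getCropArea` are hypothetical names, and the sketch assumes thumbnails evenly spaced in time and laid out left to right, then top to bottom, inside the sprite.

```ts
/** Hypothetical description of one image sprite (a grid of thumbnails). */
interface ISpriteInfo {
  /** Time, in seconds, at which the first thumbnail of the sprite applies. */
  start: number;
  /** Time, in seconds, at which the sprite stops applying (exclusive). */
  end: number;
  /** Number of thumbnails per row in the sprite. */
  columns: number;
  /** Number of rows in the sprite. */
  rows: number;
  /** Width in pixels of a single thumbnail. */
  thumbnailWidth: number;
  /** Height in pixels of a single thumbnail. */
  thumbnailHeight: number;
}

/** Area of the sprite to crop to obtain a single thumbnail. */
interface ICropArea {
  x: number;
  y: number;
  width: number;
  height: number;
}

/**
 * Return the area to crop in `sprite` for the given `time`, or `null` if
 * that sprite does not cover that time (in which case another sprite would
 * have to be loaded).
 * Assumption: thumbnails are evenly spaced and stored in row-major order.
 */
function getCropArea(sprite: ISpriteInfo, time: number): ICropArea | null {
  if (time < sprite.start || time >= sprite.end) {
    return null;
  }
  const count = sprite.columns * sprite.rows;
  const thumbnailDuration = (sprite.end - sprite.start) / count;
  // Clamp to the last thumbnail to protect against rounding errors
  const index = Math.min(
    count - 1,
    Math.floor((time - sprite.start) / thumbnailDuration),
  );
  return {
    x: (index % sprite.columns) * sprite.thumbnailWidth,
    y: Math.floor(index / sprite.columns) * sprite.thumbnailHeight,
    width: sprite.thumbnailWidth,
    height: sprite.thumbnailHeight,
  };
}
```

With a canvas, the returned area could then be given as the source rectangle of `CanvasRenderingContext2D.drawImage` (its nine-argument form), with the destination rectangle being the whole canvas. A `null` return is also a simple way to express the cache case: a non-`null` result for the previously loaded sprite means no new request is needed.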
peaBerberian
force-pushed
the
feat/thumbnail-tracks
branch
from
December 12, 2024 13:51
35abeeb
to
5238382
Compare
peaBerberian
added a commit
that referenced
this pull request
Dec 13, 2024
Based on #1496

Problem
-------

We're currently trying to provide a complete[1] and easy-to-use API for DASH thumbnail tracks in the RxPlayer.

Today the proposal is to have an API called `renderThumbnail`, to which an application would just provide an HTML element and a timestamp, and the RxPlayer would do all that's necessary to fetch the corresponding thumbnail and display it in the corresponding element.

The API is like so:

```js
rxPlayer.renderThumbnail({ element, time })
  .then(() => console.log("The thumbnail is now rendered in the element"));
```

This works and seems to me very simple to understand.

Yet, we know of advanced use cases where an application might not just want to display a single thumbnail for a single position. For example, there are well-known examples where an application displays a window of multiple thumbnails at once on the player's UI to facilitate navigation inside the content.

To do that under the solution proposed in #1496, an application could just call `renderThumbnail` with several `element` and `time` values.

Yet for this type of feature, what the interface wants is not really to indicate a `time` value: it basically wants a list of distinct thumbnails around/before/after a given position.

By only being able to set a `time` value, an application is blind as to which `time` value is going to lead to a different thumbnail (i.e. is the thumbnail for the `time` `11` different from the thumbnail for the `time` `12`? Nobody but the RxPlayer knows).

So we have to find a solution for this.

[1] By complete, I here mean that we want to be able to handle its complexities inside the RxPlayer, to handle complex DASH situations like multi-CDN, retry settings for requests and so on, while still allowing all potential use cases for an application.

Solution
--------

In this solution, I experiment with a second thumbnail API, `getAvailableThumbnailTracks` (it already exists in #1496, but its role there was only to list the various thumbnail qualities, if there are several sizes for example).

As this solution builds upon yet stays compatible with #1496, I chose to open this second PR on top of that previous one.

I take advantage of the fact that most standardized thumbnail implementations I know of (BIF, DASH) seem to follow the principle of having evenly-spaced (in terms of time) thumbnails (though I do see a possibility for that to change, e.g. to have thumbnails corresponding to "important" scenes instead, so our implementation has to be resilient).

So here, what this commit does is add the following properties (all optional) to a track returned by the `getAvailableThumbnailTracks` API:

- `start`: The initial `time` the first thumbnail of that track will apply to

- `end`: The last `time` the last thumbnail of that track will apply to

- `thumbnailDuration`: The "duration" (in seconds) each thumbnail applies to (with the exception of the last thumbnail, which just fills until `end`)

Then, an application should have all the information needed to calculate a `time` which corresponds to a different thumbnail.

This solution does lead to a minor issue: by letting applications compute the `time` themselves from `start`, `end`, `thumbnailDuration` and so on, there's a risk of rounding errors leading to a `time` which does not correspond to the wanted thumbnail but to the one before or after.

To me, we could just indicate in our API documentation to application developers that they should be extra careful and may add an epsilon (or even choose a `time` in the "middle" of thumbnails each time) if they want that type of thumbnail list feature.

Thoughts?
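As an illustration of the "middle of thumbnails" idea, here is a minimal sketch of how an application could derive a list of `time` values, each expected to map to a distinct thumbnail, from the `start`, `end` and `thumbnailDuration` properties described above. `IThumbnailTrackTimings` and `getDistinctThumbnailTimes` are hypothetical names which are not part of the RxPlayer API, and the sketch assumes that the three properties are all defined and that thumbnails are evenly spaced.

```ts
/** Hypothetical subset of a thumbnail track's properties, all defined here. */
interface IThumbnailTrackTimings {
  start: number;
  end: number;
  thumbnailDuration: number;
}

/**
 * Return `time` values for the thumbnail covering `position`, up to
 * `countBefore` thumbnails before it and up to `countAfter` after it.
 * Each value is taken at the middle of a thumbnail's time range, so that
 * small rounding errors cannot select a neighbouring thumbnail.
 */
function getDistinctThumbnailTimes(
  track: IThumbnailTrackTimings,
  position: number,
  countBefore: number,
  countAfter: number,
): number[] {
  const { start, end, thumbnailDuration } = track;
  if (thumbnailDuration <= 0 || end <= start) {
    return [];
  }
  // Index of the last thumbnail, which may be shorter than the others
  const lastIndex = Math.ceil((end - start) / thumbnailDuration) - 1;
  const clamped = Math.min(Math.max(position, start), end);
  const currentIndex = Math.min(
    lastIndex,
    Math.floor((clamped - start) / thumbnailDuration),
  );
  const firstWanted = Math.max(0, currentIndex - countBefore);
  const lastWanted = Math.min(lastIndex, currentIndex + countAfter);
  const times: number[] = [];
  for (let i = firstWanted; i <= lastWanted; i++) {
    const thumbnailStart = start + i * thumbnailDuration;
    const thumbnailEnd = Math.min(end, thumbnailStart + thumbnailDuration);
    times.push((thumbnailStart + thumbnailEnd) / 2);
  }
  return times;
}
```

Each returned value could then be given as the `time` of a separate `renderThumbnail` call, one per element of the thumbnail window. If thumbnails were one day not evenly spaced, this kind of computation would no longer hold, which is the resilience concern raised above.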