WebCodecs encodes audio and video but leaves multiplexing the output into a container to the application. This project provides a way for your application to get its audio and video tracks into a WebM container.
Under the hood it uses libwebm and webmtools compiled to WebAssembly using Emscripten.
For a pure-TypeScript alternative to this project that doesn’t use WebAssembly, see webm-muxer.
webm-muxer.js works on Chrome 95. The WebCodecs spec changes frequently, so changes may be required to maintain support going forward.
webm-muxer.js is licensed under the terms of the MIT licence.
You can see a demo here (tested on Chrome 95 on Linux with #enable-experimental-web-platform-features enabled). The source code for the demo is available in demo.html and demo.js.
When you click the Start button, you’ll be asked by the browser to give permission to capture your camera and microphone. The data from each is then passed to two separate workers, which encode the video into VP9 or AV1 and the audio into Opus using the WebCodecs browser API.
The encoded video and audio from each worker are passed into a third worker, which muxes them into WebM format.
The WebM output from the third worker is then passed into a `<video>` element via `MediaSource` so you can see it on the page.
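The demo does something along these lines (see demo.js for the real code). The codec string below is an assumption and must match your encoder configuration:

```js
let on_muxed_data; // call with the data property of each muxed-data message

const video = document.querySelector('video');
const source = new MediaSource();
video.src = URL.createObjectURL(source);

source.addEventListener('sourceopen', () => {
    const buf = source.addSourceBuffer('video/webm; codecs="vp9,opus"');
    const queue = []; // appends must wait until the previous one completes
    buf.addEventListener('updateend', () => {
        if (queue.length > 0) {
            buf.appendBuffer(queue.shift());
        }
    });
    on_muxed_data = data => {
        if (buf.updating || queue.length > 0) {
            queue.push(data);
        } else {
            buf.appendBuffer(data);
        }
    };
});
```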
When you click the Stop button, the camera and microphone are closed and the workers will exit once they’ve processed the last of the data.
This all happens in near realtime — the data is effectively pipelined between the workers and you don’t have to wait until you press Stop to see the output.
You also have the option to record the WebM file to your local disk.
If you check Record before clicking Start, you’ll be prompted for a file to save the data into (default `camera.webm`). The data is written to disk as it is produced and then renamed to the file you chose once you click Stop. If you check In-memory too, the data will instead be buffered in memory first and then saved to `camera.webm` once you click Stop.
Finally, when you record, you can check PCM to have the raw audio data from your microphone passed into the WebM muxer rather than encoding it to Opus. This option is only available when recording because the `<video>` element doesn’t support PCM playback from WebM files, so you won’t be able to monitor the video in this case.
In your application, there are two JavaScript files which you should run in Web Workers:

- `encoder-worker.js`: takes output from a `MediaStreamTrackProcessor` and encodes it using the WebCodecs `VideoEncoder` or `AudioEncoder`. You should run this in up to two Workers, one for video and one for audio. If you have only video or only audio, run `encoder-worker.js` in one Worker.
- `webm-worker.js`: takes output from `encoder-worker.js` and muxes it into WebM container format.

You’ll also need to copy webm-muxer.js and webm-muxer.wasm to your application because `webm-worker.js` uses them.
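For example, an application with both a video and an audio track might create the Workers like this (the relative paths are assumptions; adjust them to wherever you serve the files from):

```js
// One muxer Worker plus one encoder Worker per track.
const webm_worker = new Worker('./webm-worker.js');
const video_worker = new Worker('./encoder-worker.js');
const audio_worker = new Worker('./encoder-worker.js');
```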
Your application should in general follow the procedure described below; a sketch putting the steps together appears after the list.

1. Create your `MediaStreamTrack`s (e.g. using `getUserMedia()`). You can have both video and audio tracks or just one.
2. Create up to two `MediaStreamTrackProcessor`s, one for each of your tracks.
3. Create a Web Worker using `webm-worker.js`.
4. Create up to two Web Workers using `encoder-worker.js`, one for each of your tracks.
5. When your application receives a message from the Worker running `webm-worker.js`:
   - If the message’s `type` property is `start-stream` then:
     - If you have a video track, send a message to the Worker running `encoder-worker.js` for video with the following properties:
       - `type`: `"start"`.
       - `readable`: the `readable` property of the `MediaStreamTrackProcessor` for the video. You’ll need to transfer this to the Worker.
       - `key_frame_interval`: how often to generate a key frame, in seconds. Use `0` for no key frames.
       - `count_frames`: use frame count rather than timestamp to determine key frames.
       - `config`: the `VideoEncoderConfig` for encoding the video.
     - If you have an audio track, send a message to the Worker running `encoder-worker.js` for audio with the following properties:
       - `type`: `"start"`.
       - `audio`: `true`.
       - `readable`: the `readable` property of the `MediaStreamTrackProcessor` for the audio. You’ll need to transfer this to the Worker.
       - `config`: the `AudioEncoderConfig` for encoding the audio.
   - If the message’s `type` property is `muxed-data` then the message’s `data` property contains the next chunk of the WebM output as an `ArrayBuffer`, which your application can use as it likes.
   - If the message’s `type` property is `error` then a muxing error occurred and the `detail` property contains the error description.
   - If the message’s `type` property is `exit` then the muxer has finished (all the tracks have finished, and the muxer has flushed its buffers and sent back all the muxed data).
   - If the message’s `type` property is `stats` then the `data` property contains an object with the following property:
     - `memory`: size of the Emscripten/WebAssembly heap. This may grow for long-lived sessions.
6. When your application receives a message from one of the Workers running `encoder-worker.js`:
   - If the message’s `type` property is `error` then an encoding error occurred and the `detail` property contains the error description.
   - If the message’s `type` property is `exit` then the encoder has finished (its track ended).
   - Otherwise, send the message on to the Worker running `webm-worker.js`. You should transfer the `data` property.
7. Send a message to the Worker running `webm-worker.js` with the following properties:
   - `type`: `"start"`.
   - `webm_metadata`: an object with the following properties:
     - `max_cluster_duration`: desired length in nanoseconds of each WebM output chunk. Use a `BigInt` to specify this.
     - `video`: if you have a video track, an object with the following properties:
       - `width`: width of the encoded video in pixels.
       - `height`: height of the encoded video in pixels.
       - `frame_rate`: number of frames per second in the video. This property is optional.
       - `codec_id`: WebM codec ID describing the video encoding method, e.g. `"V_VP9"`, `"V_AV1"` or `"V_MPEG4/ISO/AVC"`. See the codec mappings page for more values.
     - `audio`: if you have an audio track, an object with the following properties:
       - `sample_rate`: number of audio samples per second in the encoded audio.
       - `channels`: number of channels in the encoded audio.
       - `bit_depth`: number of bits in each sample. This property is usually used only for PCM-encoded audio.
       - `codec_id`: WebM codec ID describing the audio encoding method, e.g. `"A_OPUS"` or `"A_PCM/FLOAT/IEEE"`. See the codec mappings page for more values.
   - `webm_options`: an object with the following properties:
     - `video_queue_limit`: the number of video frames to buffer while waiting for audio with a later timestamp to arrive. Defaults to `Infinity`, i.e. all data is muxed in timestamp order, which is suitable if you have continuous data. However, if you have intermittent audio or video, including delayed start of one with respect to the other, then you can try setting `video_queue_limit` to a small value. For example, if your video is 30fps then setting `video_queue_limit` to `30` will buffer a maximum of one second of video while waiting for audio. If audio subsequently arrives with a timestamp earlier than the video, its timestamp is modified in order to maintain a monotonically increasing timestamp in the muxed output. This may result in the audio sounding slower. In general, if your audio and video are continuous and start at the same time, leave `video_queue_limit` at the default. Otherwise, the lower you set it, the more accurate the first audio timestamp in the muxed output will be, but subsequent audio timestamps may be altered; the higher you set it, the less accurate the first audio timestamp will be, but subsequent audio timestamps are less likely to be altered. This is because WebCodecs provides no way of synchronizing media streams; in fact, audio and video timestamps are completely unrelated to each other, so everything has to be based on initial arrival time in the muxer.
     - `audio_queue_limit`: the number of audio frames to buffer while waiting for video with a later timestamp to arrive. Same as `video_queue_limit` but for audio.
     - `use_audio_timestamps`: always use the timestamps in the encoded audio data rather than calculating them from the duration of each audio chunk. Defaults to `false`, i.e. the timestamp of an audio chunk is set to the sum of the durations of all the preceding audio chunks. This is suitable for continuous audio, but if you have intermittent audio, set this to `true`. Note that I’ve found the duration method to be more accurate than the timestamps WebCodecs generates.
   - `webm_stats_interval`: if you specify this then the Worker will repeatedly send a message with its `type` property set to `stats`. The interval between each message will be the number of milliseconds specified. See the description of `stats` messages above for details.
8. To stop muxing cleanly, wait for `exit` messages from all the Workers running `encoder-worker.js` and then send a message to the Worker running `webm-worker.js` with the following property:
   - `type`: `"end"`.
Per the above, your application will receive chunked WebM output in multiple `type: "muxed-data"` messages from the Worker running `webm-worker.js`. These chunks are suitable for live streaming, but if you concatenate them, for example to record them to a file, please be aware that the result will not be seekable.
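For instance, a minimal (non-seekable) recording could be assembled like this, assuming your application pushes the `data` property of each `muxed-data` message onto a `chunks` array:

```js
// Join the collected muxed-data chunks into a single (non-seekable) WebM Blob.
const blob = new Blob(chunks, { type: 'video/webm' });
const url = URL.createObjectURL(blob); // e.g. set as the href of a download link
```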
You can use webm-writer.js to make the WebM data seekable. It exports a class, `WebMWriter`, which uses one of two methods to index muxed data:
- Index as it goes: writes the data to disk as it’s produced, using the File System Access API. Once the data stops, it appends the cues, seeks back to the start of the file and rewrites the header. To use this method:

  1. Construct a `WebMWriter` object. The constructor takes an optional options object with a single property, `metadata_reserve_size`. This is the number of extra bytes to leave at the start of the file for the header so it can be fixed up after writing stops. The default is 1024, which is enough to rewrite the header. `WebMWriter` will try to put the cues into this space too if they’re small enough; otherwise they’re appended to the end of the file, after the track data. You can increase `metadata_reserve_size` to leave more space for the cues at the start of the file, but remember the longer the recording, the larger the cues section will be.
  2. Call the async `start` method. You must pass a filename argument to this function, otherwise the data is buffered in memory (see below). The user is prompted for the file to save the data into; the argument passed to `start` is used as the suggested name in the file picker.
  3. Call the async `write` method for each `type: "muxed-data"` message, passing it the `data` property of the message.
  4. Call the async `finish` method. Once this returns (after awaiting), the seekable WebM file will be ready in the file the user chose. Note that the `name` property of the `WebMWriter` object will contain the filename (but not the path), and the `handle` property will contain the `FileSystemFileHandle` for the file, which you can use to read it back in again if you need to. `finish` returns `true` if the cues were inserted at the start of the file or `false` if they were appended at the end.
- Buffer in memory: buffers the data in memory and then rewrites the header and cues. The cues are always inserted at the start, before the track data. To use this method:

  1. Construct a `WebMWriter` object.
  2. Call the async `start` method.
  3. Call the async `write` method for each `type: "muxed-data"` message, passing it the `data` property of the message.
  4. Call the async `finish` method. This returns (after awaiting) an array of `ArrayBuffer`s or typed arrays containing the seekable WebM recording, split into contiguous chunks.
With both methods, after `finish` returns, the `size` property of the `WebMWriter` object will contain the size of the file in bytes and the `duration` property will contain the length of the recording in milliseconds.
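As a sketch, the in-memory method might be used like this (assuming webm-writer.js can be imported as an ES module and `muxed_chunks` holds the `data` properties your application collected):

```js
import { WebMWriter } from './webm-writer.js';

// Buffer in memory: no filename passed to start().
const writer = new WebMWriter();
await writer.start();

for (const data of muxed_chunks) { // data properties of muxed-data messages
    await writer.write(data);
}

// finish() resolves to the seekable recording, split into contiguous chunks.
const buffers = await writer.finish();
console.log(`${writer.size} bytes, ${writer.duration} ms`);
const seekable_blob = new Blob(buffers, { type: 'video/webm' });
```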
See the demo for an example of how to use `WebMWriter`.