Action-Response Cycle bottlenecks in interactive music apps #97
Labels: Developer's Perspective, Discussion topic, User's Perspective
The talk "Interactive ML-Powered Music Applications on the Web" by @teropa explains that a key design consideration in musical instrument apps is the latency between user input (e.g. a key press on an instrument, a video input) and musical output, as illustrated by the Action-Response Cycle:
This cycle must execute within ~0-20 ms for the experience to feel natural.
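To put that budget in perspective, here is a rough back-of-the-envelope sketch. It assumes audio is rendered in blocks of 128 sample-frames (the render quantum size used by the Web Audio API); the function names are hypothetical and only illustrate the arithmetic, not anything shown in the talk.

```typescript
// Assumption: audio is processed in blocks of 128 sample-frames,
// the render quantum size used by the Web Audio API.
const RENDER_QUANTUM_FRAMES = 128;

// Duration of one render quantum in milliseconds at a given sample rate.
function renderQuantumMs(sampleRate: number): number {
  return (RENDER_QUANTUM_FRAMES / sampleRate) * 1000;
}

// Number of whole render quanta that fit in a latency budget.
function quantaWithinBudget(sampleRate: number, budgetMs: number): number {
  return Math.floor(budgetMs / renderQuantumMs(sampleRate));
}

console.log(renderQuantumMs(48000).toFixed(2)); // "2.67"
console.log(quantaWithinBudget(48000, 20));     // 7
console.log(quantaWithinBudget(44100, 20));     // 6
```

Under these assumptions, a 20 ms budget leaves room for only about six or seven audio blocks, and input handling, model inference, and output buffering all have to fit inside it.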
The talk notes that real-time audio is currently a very constrained capability on the web platform:
A particularly demanding task is generating actual audio data in the browser with ML (as opposed to generating symbolic music data with ML). The following proposals were mentioned for consideration, as they may help lower the latency in this scenario:
Another use case that involves video input (from webcam) and musical output has the following per-frame path:
Notably, the steps to get data into the model (Webcam MediaStream > Draw to Canvas > Build Pixel Tensor) take half of the time.
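As an illustration of why the "Build Pixel Tensor" step is not free, below is a minimal sketch of the per-pixel work involved once the frame has been drawn to a canvas. The helper `rgbaToRgbTensor` is hypothetical (it is not from the talk or from any ML library); it assumes the input uses the RGBA byte layout of `ImageData.data` and that the model wants normalized RGB floats.

```typescript
// Hypothetical helper: converts RGBA bytes (the layout of ImageData.data)
// into a flat Float32Array of RGB values normalized to [0, 1].
// In a browser, the input would typically come from:
//   ctx.drawImage(video, 0, 0, width, height);
//   const { data } = ctx.getImageData(0, 0, width, height);
function rgbaToRgbTensor(
  rgba: Uint8ClampedArray,
  width: number,
  height: number
): Float32Array {
  if (rgba.length !== width * height * 4) {
    throw new RangeError("RGBA buffer size does not match width * height * 4");
  }
  const out = new Float32Array(width * height * 3);
  // Walk the source four bytes at a time, dropping the alpha channel.
  for (let i = 0, j = 0; i < rgba.length; i += 4, j += 3) {
    out[j] = rgba[i] / 255;         // red
    out[j + 1] = rgba[i + 1] / 255; // green
    out[j + 2] = rgba[i + 2] / 255; // blue
  }
  return out;
}

// Usage: a 2x1 image with one red pixel and one bluish pixel.
const pixels = new Uint8ClampedArray([255, 0, 0, 255, 0, 128, 255, 255]);
const tensor = rgbaToRgbTensor(pixels, 2, 1);
console.log(tensor.length); // 6
```

Every frame goes through a draw, a pixel readback, and a full copy like this one before the model sees any data, which is consistent with the observation that input preparation takes a large share of the per-frame time.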
The canvas bottleneck (copying rendered video frames to a canvas element, processing the pixels extracted from the canvas, and rendering the result back to a canvas) was also identified as an inefficient path in the Media processing hooks for the Web talk by @tidoust.
This calls for APIs to provide better abstractions that allow feeding input data into ML models, @teropa concludes:
As a summary, the talk outlines the following areas as important:
This issue is to discuss the proposals that involve Web API surface improvements and other problematic aspects of real-time use cases that involve audio.
Looping in @padenot for AudioWorklet expertise as well as to reflect on the recent work on WebCodecs that might also help with these real-time audio use cases. Feel free to tag other folks who might be interested.