Drop the support of synchronous execution #531
Comments
Big +1 to this proposal from the Chrome team. Thank you for the detailed exploration!
Thank you @huningxin, please feel free to proceed with a PR. For context, this issue was discussed in https://www.w3.org/2024/01/25-webmachinelearning-minutes.html#t05

PR #532 awaits the landing of the PR that addresses this issue.
@huningxin Is there a detailed benchmark result that can be shared? I think 103% / 95% on average is good news to hear, but I wonder whether the performance delta is consistent across all models, or whether there's a certain grouping (i.e. the distribution of the numbers).
I added the details at onnxruntime/pull/19145. The updated results include more models. The average async / sync ratio on webnn-cpu becomes 93.45%, while webnn-gpu is still 103.84%. The newly added models have op fallbacks on webnn-cpu, which decreases the CPU number a bit.
Remove the definition and algorithm steps for:
- ML.createContextSync()
- MLGraphBuilder.buildSync()
- MLContext.computeSync()

Fix webmachinelearning#531
* Remove the definition and algorithm steps for:
  - ML.createContextSync()
  - MLGraphBuilder.buildSync()
  - MLContext.computeSync()
* Use [=reject=] |promise| with a {{TypeError}}
* Abort after rejecting promise in parallel steps

Fix #531
The current WebNN spec supports both asynchronous and synchronous execution modes. In particular, the synchronous execution mode, comprising the `MLContext.computeSync()`, `ML.createContextSync()` and `MLGraphBuilder.buildSync()` methods (available only in dedicated workers), was introduced for easy integration with ML frameworks written in C++ and compiled to Wasm; for example, the ONNXRuntime WebNN EP (Execution Provider) used sync execution before.

The Chromium WebNN prototype supports both execution modes for implementation feedback. The Chrome team encouraged the WG to check whether the sync APIs are really necessary before launch.
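For reference, a minimal sketch of that sync path under the spec drafts of the time, runnable only inside a dedicated worker (the graph and buffer contents are illustrative, not taken from this issue):

```js
// worker.js — synchronous WebNN per the older spec drafts (since removed).
const context = navigator.ml.createContextSync({ deviceType: 'cpu' });
const builder = new MLGraphBuilder(context);

// A trivial graph: c = a + b over 2x2 float32 tensors.
const desc = { type: 'float32', dimensions: [2, 2] };
const a = builder.input('a', desc);
const b = builder.input('b', desc);
const c = builder.add(a, b);
const graph = builder.buildSync({ c });

// computeSync() blocks the worker thread until inference completes.
const inputs = { a: new Float32Array(4).fill(1), b: new Float32Array(4).fill(2) };
const outputs = { c: new Float32Array(4) };
context.computeSync(graph, inputs, outputs);
```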
Recently, the ONNXRuntime WebNN EP experimented (onnxruntime#19145) with the async execution mode and compared its performance against sync. For sync execution, ONNXRuntime runs the WebNN EP in a dedicated worker and calls the WebNN `computeSync()` method there; the JavaScript user code in the main thread communicates (via `postMessage`) with the WebNN EP in the worker thread through the ONNXRuntime Wasm proxy. For async execution, ONNXRuntime runs the WebNN EP in the main thread and calls the WebNN async `compute()` method through asyncify.

According to the test results across 35 models (including CNNs and transformers), the difference in model inference time between the two execution modes is minimal. For GPU, async is even slightly faster than sync (async / sync 103% on average), while for CPU, async is a bit slower than sync (async / sync 95% on average). This is because the WebNN EP on CPU currently supports fewer operators (see the implementation status). For each unsupported op, model inference falls back to running the Wasm op and then returns to compute the next WebNN sub-graph. The more ops that fall back, the more async compute calls there are (and the more asyncify overhead).
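The worker-proxy arrangement described above can be sketched as follows (a simplified illustration of the pattern, not ONNXRuntime's actual proxy code; the message shape and file names are made up):

```js
// main.js — user code stays async; the blocking computeSync() call is
// confined to a dedicated worker.
const worker = new Worker('webnn-worker.js');

function runInference(inputs) {
  return new Promise((resolve, reject) => {
    worker.onmessage = (e) => resolve(e.data.outputs);
    worker.onerror = (e) => reject(e.error);
    worker.postMessage({ inputs });
  });
}

// webnn-worker.js (conceptually): receive inputs, run the blocking call,
// post the results back.
//
//   self.onmessage = (e) => {
//     context.computeSync(graph, e.data.inputs, outputs);
//     self.postMessage({ outputs });
//   };
```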
With more ops supported by the WebNN CPU/XNNPACK backend, there would be fewer op fallbacks, and therefore less asyncify overhead. And with JSPI (JavaScript Promise Integration) coming, that overhead will hopefully become smaller still. The performance of the async execution mode is expected to improve further.
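As a rough illustration of where JSPI could take this: under the API shape of the JS Promise Integration proposal (experimental at the time, so the names below are an assumption), a promise-returning import can suspend the Wasm stack directly, without asyncify's code transformation:

```js
// Sketch only — assumes the proposal's WebAssembly.Suspending /
// WebAssembly.promising surface; 'ep.wasm' and the import/export names
// are hypothetical.
const imports = {
  env: {
    // The Wasm side calls this import as if it were synchronous; JSPI
    // suspends the Wasm stack until the returned promise settles.
    webnn_compute: new WebAssembly.Suspending(
      (graph, inputs, outputs) => context.compute(graph, inputs, outputs)
    ),
  },
};
const { instance } = await WebAssembly.instantiateStreaming(
  fetch('ep.wasm'), imports);

// Wrapping the export lets JS callers receive a promise for the whole run.
const runModel = WebAssembly.promising(instance.exports.run_model);
await runModel();
```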
With onnxruntime#19145 merged, the ONNXRuntime WebNN EP now uses only the WebNN async execution mode and no longer uses sync execution.
Based on this implementation experience, the proposal for the WebNN spec is to remove support for sync execution. That would simplify both the spec and the implementation. Wasm ML frameworks can use the WebNN async methods via asyncify today and migrate to JSPI once it becomes available.
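For comparison with the sync sketch above, this is the async-only surface that remains (same illustrative graph; per the spec drafts around that time, `compute()` transfers the buffers and resolves with fresh views, and the exact result shape has since evolved):

```js
// Asynchronous WebNN — the mode the spec keeps; usable on the main thread.
const context = await navigator.ml.createContext({ deviceType: 'gpu' });
const builder = new MLGraphBuilder(context);

const desc = { type: 'float32', dimensions: [2, 2] };
const a = builder.input('a', desc);
const b = builder.input('b', desc);
const c = builder.add(a, b);
const graph = await builder.build({ c });

const inputs = { a: new Float32Array(4).fill(1), b: new Float32Array(4).fill(2) };
const outputs = { c: new Float32Array(4) };
const result = await context.compute(graph, inputs, outputs);
// result.outputs.c holds the output data.
```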