266793 – WebLLM Compatibility

RESOLVED FIXED 266793

https://bugs.webkit.org/show_bug.cgi?id=266793

tqchen

Reported 2023-12-21 15:18:45 PST

Congrats on getting WebGPU support into the tech preview. I am opening this issue to check compatibility with https://webllm.mlc.ai/ . We tried a few models and, as of now, it still does not work, whereas in Chrome we are able to run most of the models on my laptop. I know the failure cannot yet be localized, so it is a bit hard to figure out what is going on. I just want to keep the issue open, and I am happy to help if there are any possible action items.

Attachments
model working (430.84 KB, image/png) 2024-04-18 22:25 PDT, Mike Wyrzykowski	no flags	Details

Radar WebKit Bug Importer

Comment 1 2023-12-28 15:19:12 PST

<rdar://problem/120255613>

Mike Wyrzykowski

Comment 2 2023-12-28 15:27:31 PST

Thank you for reporting this issue. We are aware of the incompatibility but are still trying to narrow down the root cause. We have been tracking this internally with <rdar://problem/108093665>. If you become aware of any specific reasons why this does not work in WebKit, e.g., built-in WGSL functions returning incorrect results or synchronization issues in WebGPU calls from the JavaScript layer, any and all insight would be appreciated. Our expectation is that as we work towards a 100% pass rate on the WebGPU conformance test suite (https://github.com/gpuweb/cts), we will see https://webllm.mlc.ai/ work as expected. We still have quite a bit of work remaining until the WebGPU CTS is passing in WebKit. If we reach a 100% pass rate on the conformance test suite and are still failing to generate the same results as Chrome on https://webllm.mlc.ai/ , we will need to take a closer look to see where the incompatibility lies.

Siyuan Feng

Comment 3 2024-02-04 06:33:54 PST

Thanks for your reply. I have been able to locate the cause of this issue: the TVM WebGPU runtime uses the `onSubmittedWorkDone()` API <https://www.w3.org/TR/webgpu/#dom-gpuqueue-onsubmittedworkdone> to ensure synchronization. The spec states that "Resolution of this Promise implies the completion of mapAsync() calls made prior to that call, on GPUBuffers last used exclusively on that queue." However, it fails to synchronize on Safari Tech Preview but works well on Chrome. Please let me know if you need more detailed information. I appreciate the excellent work on WebGPU support.
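The readback pattern this comment refers to can be sketched as follows. This is a minimal illustration under stated assumptions, not the TVM runtime's actual code: `readBack`, `DeviceLike`, `BufferLike`, and `EncoderLike` are hypothetical names, the interfaces cover only the WebGPU members used here, and the numeric constants mirror the `GPUBufferUsage` and `GPUMapMode` values from the WebGPU specification so the sketch does not depend on browser globals.

```typescript
// Hypothetical minimal shapes for the WebGPU members used below.
interface BufferLike {
  mapAsync(mode: number): Promise<void>;
  getMappedRange(): ArrayBuffer;
  unmap(): void;
}
interface EncoderLike {
  copyBufferToBuffer(src: unknown, srcOffset: number, dst: unknown, dstOffset: number, size: number): void;
  finish(): unknown;
}
interface DeviceLike {
  createBuffer(desc: { size: number; usage: number }): BufferLike;
  createCommandEncoder(): EncoderLike;
  queue: { submit(commands: unknown[]): void; onSubmittedWorkDone(): Promise<void> };
}

const USAGE_MAP_READ = 0x0001; // GPUBufferUsage.MAP_READ
const USAGE_COPY_DST = 0x0008; // GPUBufferUsage.COPY_DST
const MAP_MODE_READ = 0x0001;  // GPUMapMode.READ

// Copies `size` bytes (assumed to be a multiple of 4) from `src` into a
// staging buffer and returns them once the queue reports completion.
async function readBack(device: DeviceLike, src: unknown, size: number): Promise<Uint8Array> {
  const staging = device.createBuffer({ size, usage: USAGE_MAP_READ | USAGE_COPY_DST });
  const encoder = device.createCommandEncoder();
  encoder.copyBufferToBuffer(src, 0, staging, 0, size);
  device.queue.submit([encoder.finish()]);

  // mapAsync() is issued before onSubmittedWorkDone(), so per the quoted spec
  // sentence the mapping should be complete once the latter resolves.
  const mapped = staging.mapAsync(MAP_MODE_READ);
  await device.queue.onSubmittedWorkDone();
  // Defensive: also await the map promise, so the read is safe even if an
  // implementation resolves the two promises out of order.
  await mapped;

  const bytes = new Uint8Array(staging.getMappedRange().slice(0)); // copy before unmap
  staging.unmap();
  return bytes;
}
```

Code that relies only on `onSubmittedWorkDone()` and skips the final `await mapped` is the variant that would be exposed to the synchronization failure described above.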

Mike Wyrzykowski

Comment 4 2024-02-07 10:11:36 PST

Oh, thank you for the update. We will take a look. In the meantime, if you have a reduced test case illustrating the race in mapAsync / onSubmittedWorkDone, that would be much appreciated. There may well be a validation error or logic error that is causing mapAsync / onSubmittedWorkDone to not function correctly. At least we attempt to wait until all commands are submitted: https://github.com/WebKit/WebKit/blob/0c1bf2e5136c5cba56cc4e647169be09861a1a52/Source/WebGPU/WebGPU/Buffer.mm#L303 but we do not wait for their completion. So I wonder if we are getting into a race where work is submitted but not completed and we return the promise. And since buffer mapping returns the contents directly: https://github.com/WebKit/WebKit/blob/0c1bf2e5136c5cba56cc4e647169be09861a1a52/Source/WebGPU/WebGPU/Buffer.mm#L248 we may really need to wait for work to be completed and not simply submitted.
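A reduced test case of the kind requested here might start from a small helper that records the order in which promises settle. The sketch below is an assumption about how such a check could be written, not a test from the WebKit or TVM repositories; `settleOrder` is a hypothetical name, and the commented usage assumes a `device` and a mappable `staging` buffer already exist.

```typescript
// Records the order in which labelled promises settle. Handlers are attached
// synchronously, so call this immediately after issuing the requests;
// promises that have already settled would be reported in list order.
async function settleOrder(labelled: Array<[string, Promise<unknown>]>): Promise<string[]> {
  const order: string[] = [];
  await Promise.all(
    labelled.map(([label, promise]) =>
      promise.then(
        () => { order.push(label); },
        () => { order.push(label + " (rejected)"); },
      ),
    ),
  );
  return order;
}

// Hypothetical browser usage:
//   const order = await settleOrder([
//     ["onSubmittedWorkDone", device.queue.onSubmittedWorkDone()],
//     ["mapAsync", staging.mapAsync(GPUMapMode.READ)],
//   ]);
// Note the list above calls onSubmittedWorkDone() first; to exercise the spec
// sentence quoted in comment 3, issue mapAsync() first and check that it is
// also the first entry in `order`.
```

Because the helper only observes promise settlement, it says nothing about why an ordering is violated; it would only confirm whether the observable behavior differs between browsers.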

Mike Wyrzykowski

Comment 5 2024-02-07 10:16:03 PST

Oh, never mind, we do in fact wait for completion before we return that promise: https://github.com/WebKit/WebKit/blob/0c1bf2e5136c5cba56cc4e647169be09861a1a52/Source/WebGPU/WebGPU/Queue.mm#L145 and https://github.com/WebKit/WebKit/blob/0c1bf2e5136c5cba56cc4e647169be09861a1a52/Source/WebGPU/WebGPU/Queue.mm#L204 . I will need to take a closer look to see why synchronization is not working for your case.

Mike Wyrzykowski

Comment 6 2024-04-18 22:25:40 PDT

Created attachment 471002: model working

I think I finally have a fix for this issue. I will need to perform a little more investigation and then make a PR.

Mike Wyrzykowski

Comment 7 2024-04-18 22:26:44 PDT

Appears to work now in MiniBrowser with some small changes. I tried the default model, Llama-3-8B-Instruct-q4f32_1.

tqchen

Comment 8 2024-04-19 04:43:17 PDT

Great! Just curious, what was the cause of the bug? Was it the async issue, as we guessed?

Mike Wyrzykowski

Comment 9 2024-04-19 18:25:48 PDT

It was! But I was looking at the wrong place initially. The Metal command buffers complete quite quickly, but there was a delay responding back to the web process. Still discussing an appropriate fix but should be resolved soon.

Mike Wyrzykowski

Comment 10 2024-04-26 12:37:04 PDT

https://github.com/WebKit/WebKit/pull/27814

Mike Wyrzykowski

Comment 11 2024-06-05 14:39:08 PDT

The async-in-workers issue was fixed with https://bugs.webkit.org/show_bug.cgi?id=274769 but now we need https://bugs.webkit.org/show_bug.cgi?id=273195 to merge, as it appears the site changed from dedicated workers to service workers.
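Since the dependency here is on which kind of worker exposes WebGPU, a page or worker script can check for the entry point before using it. The sketch below is illustrative only: `describeWebGPUSupport` and `ScopeLike` are hypothetical names, and the only platform fact assumed is that WebGPU is reached through `navigator.gpu` in scopes that support it.

```typescript
// Hypothetical minimal shape of a global scope (window, dedicated worker,
// shared worker, or service worker).
type ScopeLike = { navigator?: { gpu?: unknown } };

// Reports whether the WebGPU entry point is visible in the given scope.
function describeWebGPUSupport(scope: ScopeLike): string {
  if (!scope.navigator) {
    return "no navigator in this scope";
  }
  return scope.navigator.gpu ? "navigator.gpu is exposed" : "navigator.gpu is missing";
}

// Hypothetical usage inside any worker or page script:
//   console.log(describeWebGPUSupport(globalThis as ScopeLike));
```

Running the same check from a dedicated worker and from a service worker would show whether a given browser build exposes WebGPU in both.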

Mike Wyrzykowski

Comment 12 2024-06-12 14:46:54 PDT

Pull request: https://github.com/WebKit/WebKit/pull/29769

Mike Wyrzykowski

Comment 13 2024-06-12 14:48:44 PDT

Looks correct now that the race between GPUQueue.onSubmittedWorkDone and GPUBuffer.mapAsync is fixed and service+shared worker support has been added in https://bugs.webkit.org/show_bug.cgi?id=273195

EWS

Comment 14 2024-06-12 22:59:06 PDT

Committed 279981@main (5ddd8467ead1): <https://commits.webkit.org/279981@main> Reviewed commits have been landed. Closing PR #29769 and removing active labels.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version Safari Technology Preview

Hardware Mac (Apple Silicon)

OS macOS 14

Product WebKit

Component WebGPU

Assignee

Mike Wyrzykowski

Reported

2023-12-21 15:18 PST

Modified

2024-06-12 22:59 PDT

CC List

3 users

Keywords InRadar