RESOLVED FIXED 266793
WebLLM Compatibility
https://bugs.webkit.org/show_bug.cgi?id=266793
tqchen
Reported 2023-12-21 15:18:45 PST
Congrats on getting WebGPU support into the tech preview. I am opening this issue to check compatibility with WebLLM (https://webllm.mlc.ai/). We tried a few models and, as of now, it still does not work, while in Chrome we are able to run most of the models on my laptop. I know the failure cannot yet be localized, so it is a bit hard to figure out what is going on. I just want to keep the issue open, and I am happy to help if there are any action items.
Attachments
model working (430.84 KB, image/png)
2024-04-18 22:25 PDT, Mike Wyrzykowski
Radar WebKit Bug Importer
Comment 1 2023-12-28 15:19:12 PST
Mike Wyrzykowski
Comment 2 2023-12-28 15:27:31 PST
Thank you for reporting this issue. We are aware of the incompatibility but are still trying to narrow down the root cause. We have been tracking this internally with <rdar://problem/108093665>. If you become aware of any specific reasons why this does not work in WebKit, e.g., built-in WGSL functions returning incorrect results or synchronization issues in WebGPU calls from the JavaScript layer, any and all insight would be appreciated. Our expectation is that as we work towards a 100% pass rate on the WebGPU conformance test suite (https://github.com/gpuweb/cts), we will see https://webllm.mlc.ai/ work as expected. We still have quite a bit of work remaining until the WebGPU CTS is passing in WebKit. If we reach a 100% pass rate on the conformance test suite and are still failing to generate the same results as Chrome on https://webllm.mlc.ai/, we will need to take a closer look to see where the incompatibility lies.
Siyuan Feng
Comment 3 2024-02-04 06:33:54 PST
Thanks for your reply. I have been able to locate the cause of this issue: the TVM WebGPU runtime uses the `onSubmittedWorkDone()` API <https://www.w3.org/TR/webgpu/#dom-gpuqueue-onsubmittedworkdone> to ensure synchronization. The spec says that "Resolution of this Promise implies the completion of mapAsync() calls made prior to that call, on GPUBuffers last used exclusively on that queue." However, it fails to synchronize on Safari Technology Preview but works well on Chrome. Please let me know if you need more detailed information. I appreciate the excellent work on WebGPU support.
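To make the ordering guarantee quoted above concrete, here is a minimal sketch of a check for it. It is not code from TVM or WebLLM: the helper name `mapSettledBeforeWorkDone` and the two small interfaces are hypothetical, introduced only so the snippet type-checks without WebGPU typings. It assumes `staging` is a buffer last used exclusively on `queue`, and it reports whether an earlier `mapAsync()` call had already settled by the time `onSubmittedWorkDone()` resolved.

```typescript
// Minimal structural types (assumptions for this sketch) so it compiles
// without @webgpu/types. They mirror only the WebGPU members used below.
interface MappableBuffer {
  mapAsync(mode: number): Promise<void>;
  unmap(): void;
}
interface WorkQueue {
  onSubmittedWorkDone(): Promise<void>;
}

const MAP_MODE_READ = 0x0001; // numeric value of GPUMapMode.READ

// Hypothetical helper: resolves to true if the mapAsync() call issued first
// had settled by the time onSubmittedWorkDone() resolved.
async function mapSettledBeforeWorkDone(
  queue: WorkQueue,
  staging: MappableBuffer,
): Promise<boolean> {
  const state = { mapSettled: false };

  // Start the mapping, but do not await it yet.
  const mapping = staging.mapAsync(MAP_MODE_READ).then(
    () => { state.mapSettled = true; },
    () => { state.mapSettled = true; }, // a rejected mapping has also settled
  );

  // Wait for the queue, then record what was observable at that moment.
  await queue.onSubmittedWorkDone();
  const settledAtWorkDone = state.mapSettled;

  // Clean up: let the mapping finish, then release it.
  await mapping;
  staging.unmap();
  return settledAtWorkDone;
}
```

In a browser, one would pass `device.queue` and a buffer created with `MAP_READ | COPY_DST` usage. A result of `false` would suggest the two promises resolved out of the order the spec describes.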
Mike Wyrzykowski
Comment 4 2024-02-07 10:11:36 PST
Oh, thank you for the update. We will take a look. In the meantime, if you have a reduced test case illustrating the race in mapAsync / onSubmittedWorkDone, that would be much appreciated. There may well be a validation error or logic error that is causing mapAsync / onSubmittedWorkDone to not function correctly. At least we attempt to wait until all commands are submitted: https://github.com/WebKit/WebKit/blob/0c1bf2e5136c5cba56cc4e647169be09861a1a52/Source/WebGPU/WebGPU/Buffer.mm#L303 but we do not wait for their completion. So I wonder if we are getting into a race where work is submitted but not completed and we return the promise. And since buffer mapping returns the contents directly: https://github.com/WebKit/WebKit/blob/0c1bf2e5136c5cba56cc4e647169be09861a1a52/Source/WebGPU/WebGPU/Buffer.mm#L248 we may really need to wait for work to be completed and not simply submitted.
Mike Wyrzykowski
Comment 5 2024-02-07 10:16:03 PST
Oh never mind, we do in fact wait for completion before we return that promise: https://github.com/WebKit/WebKit/blob/0c1bf2e5136c5cba56cc4e647169be09861a1a52/Source/WebGPU/WebGPU/Queue.mm#L145 and https://github.com/WebKit/WebKit/blob/0c1bf2e5136c5cba56cc4e647169be09861a1a52/Source/WebGPU/WebGPU/Queue.mm#L204 I will need to take a closer look to see why synchronization is not working for your case.
Mike Wyrzykowski
Comment 6 2024-04-18 22:25:40 PDT
Created attachment 471002: model working. I think I finally have a fix for this issue. I will need to perform a little more investigation and then make a PR.
Mike Wyrzykowski
Comment 7 2024-04-18 22:26:44 PDT
Appears to work now in MiniBrowser with some small changes. I tried the default model, Llama-3-8B-Instruct-q4f32_1
tqchen
Comment 8 2024-04-19 04:43:17 PDT
Great! Just curious, what was the cause of the bug? Was it the async issue, as we guessed?
Mike Wyrzykowski
Comment 9 2024-04-19 18:25:48 PDT
It was! But I was looking at the wrong place initially. The Metal command buffers complete quite quickly, but there was a delay responding back to the web process. Still discussing an appropriate fix but should be resolved soon.
Mike Wyrzykowski
Comment 10 2024-04-26 12:37:04 PDT
Mike Wyrzykowski
Comment 11 2024-06-05 14:39:08 PDT
The async-in-workers issue was fixed with https://bugs.webkit.org/show_bug.cgi?id=274769, but now we need https://bugs.webkit.org/show_bug.cgi?id=273195 to merge, as it appears the site changed from dedicated workers to service workers.
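Since the site now runs in a service worker, whether WebGPU is exposed in that kind of global scope matters. A minimal feature-detection sketch follows. The helper `hasWebGPU` is hypothetical, and it only assumes that a scope exposes WebGPU through `navigator.gpu`, as the WebGPU spec defines for both `Navigator` and `WorkerNavigator`.

```typescript
// Hypothetical helper: reports whether the given global scope exposes WebGPU
// via `navigator.gpu`. It takes the scope as a parameter so it can be called
// from a window, a dedicated worker, or a service worker.
function hasWebGPU(scope: object): boolean {
  const nav = (scope as { navigator?: unknown }).navigator;
  if (typeof nav !== "object" || nav === null) {
    return false; // no navigator object in this scope
  }
  // Treat a present but null/undefined `gpu` as "not available".
  const gpu = (nav as { gpu?: unknown }).gpu;
  return gpu !== undefined && gpu !== null;
}

// Usage inside any scope: check the current global object.
const webgpuAvailableHere = hasWebGPU(globalThis);
```

A page could run this check inside the service worker before loading a model there, and fall back to a dedicated worker or the main thread when it returns `false`.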
Mike Wyrzykowski
Comment 12 2024-06-12 14:46:54 PDT
Mike Wyrzykowski
Comment 13 2024-06-12 14:48:44 PDT
Looks correct with the race between GPUQueue.onSubmittedWorkDone and GPUBuffer.mapAsync fixed, along with service + shared worker support added in https://bugs.webkit.org/show_bug.cgi?id=273195
EWS
Comment 14 2024-06-12 22:59:06 PDT
Committed 279981@main (5ddd8467ead1): <https://commits.webkit.org/279981@main> Reviewed commits have been landed. Closing PR #29769 and removing active labels.