Bug 240392

Summary: [GTK][WPE] Current-context enforcement in ANGLE is expensive
Product: WebKit Reporter: Zan Dobersek <zan>
Component: New BugsAssignee: Zan Dobersek <zan>
Severity: Normal CC: alex, aperez, benjamin, cdumez, cmarcelo, dino, ews-watchlist, kondapallykalyan, mcatanzaro, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=254286
Bug Depends on:    
Bug Blocks: 237649    
Description Flags
WIP patch
Patch for landing none

Description Zan Dobersek 2022-05-13 11:32:35 PDT
For basically every WebGL entrypoint, the ANGLE context is enforced to be current for the executing thread.

There's at this point in time the smart check to test whether the current context is already the one that we are using, through comparing against EGL_GetCurrentContext() return value. Only when not equal do we go on to call EGL_MakeCurrent().

Even the check is still expensive because it does an inefficient job of loading from thread-local storage. Because it's called on every WebGL entrypoint this becomes quite visible on profiles and a considerable amount of time is spent here. I can provide receipts later.

ANGLE is internally using the `thread_local` specifier on the data structure holding the current-context value. Mesa doesn't suffer from this, and is using the `__thread` specifier, along with the `tls_model("initial-exec")` attribute. This needs confirmation on what the differentiator exactly is, after that it should be possible to decide what to change and where to do it.
Comment 1 Zan Dobersek 2022-05-15 23:54:55 PDT
Created attachment 459407 [details]
WIP patch
Comment 2 Zan Dobersek 2022-05-16 01:26:58 PDT
Testing on equal WebGL load, this is the difference:

Current current-context handling, through ANGLE:
> CPU utilization of the WebProcess:
>           2,629.68 msec task-clock                #    0.657 CPUs utilized
> Percentage of CPU utilization spent under and inside makeContextCurrent():
>      4.82%     0.57%  WPEWebProcess    libWPEWebKit-1.0.so.3.17.0      [.] WebCore::GraphicsContextGLANGLE::makeContextCurrent

With a check on a thread-local variable with the tls_model("initial-exec") attribute, as implemented in the WIP patch:
> CPU utilization of the WebProcess:
>          17,924.20 msec task-clock                #    0.597 CPUs utilized 
> Percentage of CPU utilization spent under and inside makeContextCurrent():
>      0.86%     0.66%  WPEWebProcess    libWPEWebKit-1.0.so.3.17.0      [.] WebCore::GraphicsContextGLANGLE::makeContextCurrent
Comment 3 Zan Dobersek 2022-05-16 12:03:58 PDT
Created attachment 459445 [details]
Comment 4 Zan Dobersek 2022-05-16 12:04:58 PDT
Created attachment 459446 [details]
Comment 5 Adrian Perez 2022-05-16 13:25:29 PDT
Comment on attachment 459446 [details]

View in context: https://bugs.webkit.org/attachment.cgi?id=459446&action=review

> Source/WebCore/ChangeLog:34
> +        'ELF Handling For Thread-Local Storage' paper by Ulrich Drepper.

Reading this was good for a quick-ish summary for me:

Comment 6 Zan Dobersek 2022-05-16 22:39:01 PDT
Created attachment 459481 [details]
Patch for landing
Comment 7 Zan Dobersek 2022-05-16 22:52:18 PDT
Comment on attachment 459481 [details]
Patch for landing

Clearing flags on attachment: 459481

Committed r294290 (250632@trunk): <https://commits.webkit.org/250632@trunk>
Comment 8 Zan Dobersek 2022-05-16 22:52:24 PDT
All reviewed patches have been landed.  Closing bug.
Comment 9 Radar WebKit Bug Importer 2022-05-16 22:53:13 PDT
Comment 10 Michael Catanzaro 2023-09-06 14:44:50 PDT
Note this code has been deleted at some point. Not sure if that is OK or not, but TLS_MODEL_INITIAL_EXEC is no longer used anywhere.