Bug 234008 - ANGLE Metal framebuffer object clear performance regression
Summary: ANGLE Metal framebuffer object clear performance regression
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebGL (show other bugs)
Version: Safari 15
Hardware: iPhone / iPad iOS 15
: P2 Major
Assignee: Dan Glastonbury
URL:
Keywords: InRadar
Depends on:
Blocks: anglemetalregr
  Show dependency treegraph
 
Reported: 2021-12-08 07:46 PST by Jekfer Bichon
Modified: 2022-06-01 00:05 PDT (History)
13 users (show)

See Also:


Attachments
Sample to reproduce (4.77 KB, text/html)
2021-12-08 07:46 PST, Jekfer Bichon
no flags Details
Patch (4.20 KB, patch)
2022-04-04 21:21 PDT, Dan Glastonbury
no flags Details | Formatted Diff | Diff
Patch (29.01 KB, patch)
2022-04-04 23:04 PDT, Dan Glastonbury
no flags Details | Formatted Diff | Diff
Patch (8.28 KB, patch)
2022-04-05 22:01 PDT, Dan Glastonbury
kkinnunen: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jekfer Bichon 2021-12-08 07:46:09 PST
Created attachment 446365 [details]
Sample to reproduce

Hello,

We have detected a performance regression while manipulating framebuffers. I have attached a sample where I bind and clear framebuffers that are created once. The regression has been detected on an iPad Gen 8.

We were getting a comfortable 60 fps on the sample on Safari 14 while we are barely hitting 21 fps on Safari 15. Disabling the experimental option "WebGL via Metal" on Safari 15 puts the performances back at Safari 14 level.

Thank you.
Comment 1 Radar WebKit Bug Importer 2021-12-15 08:02:17 PST
<rdar://problem/86522988>
Comment 2 Kimmo Kinnunen 2021-12-15 12:32:49 PST
I can reproduce slight regressions in perf of iPhone 11, iPad Pro 2015, nothing as drastic as reported. iPhone 11 fps == ~30, iPad Pro 2015 fps == ~10

All samples under 
Sample Count, Samples %, Normalized CPU %, Symbol
209, 4.4%, 3.2%, WebCore::GraphicsContextGLANGLE::clear(unsigned int) (in WebCore)
209, 4.4%, 3.2%,     gl::Clear(unsigned int) (in libANGLE-shared.dylib)
173, 3.6%, 2.6%,         gl::Context::clear(unsigned int) (in libANGLE-shared.dylib)
169, 3.5%, 2.6%,             gl::Context::syncState(angle::IterableBitSet<63ul> const&, angle::BitSetT<12ul, unsigned int, unsigned long> const&, gl::Command) (in libANGLE-shared.dylib)
166, 3.5%, 2.5%,                 rx::ContextMtl::syncState(gl::Context const*, angle::IterableBitSet<63ul> const&, angle::IterableBitSet<63ul> const&) (in libANGLE-shared.dylib)
165, 3.5%, 2.5%,                     rx::ContextMtl::onDrawFrameBufferChangedState(gl::Context const*, rx::FramebufferMtl*, bool) (in libANGLE-shared.dylib)
165, 3.5%, 2.5%,                         rx::ContextMtl::endEncoding(bool) (in libANGLE-shared.dylib)
165, 3.5%, 2.5%,                             rx::ContextMtl::endRenderEncoding(rx::mtl::RenderCommandEncoder*) (in libANGLE-shared.dylib)
163, 3.4%, 2.5%,                                 rx::mtl::RenderCommandEncoder::endEncodingImpl(bool) (in libANGLE-shared.dylib)
Comment 3 Vincent Meynier 2022-01-30 11:36:21 PST
Hello, at Dassault Systèmes, all our 3D web applications are impacted by this regression.
"CATIA xGenerative Design" and "SolidWorks xDesign" are particularly concerned.
Our Apple users unfortunately can no longer take advantage of an essential feature, because less than 30 FPS was not acceptable, so we have restricted the use while waiting for a solution.

Kimmo, I'm sorry I'm not very familiar with bugs.webkit.org
- As a Reviewer, you validated the relevance of the issue and confirmed that the sample works, and now a developer needs to take the ticket? 
- What chance do we have that this one will be dealt with soon?
- How long after the fix will it take before our users' browsers are updated with it?

Thank you very much for your time and your help.
Comment 4 Sébastien VIDECOQ 2022-02-16 07:09:58 PST
Hello, any progress on this ?
Comment 5 Kimmo Kinnunen 2022-03-01 01:48:23 PST
I can still reproduce this on iOS 15.4 seed 4, iPad Pro 2020, portrait:
WebGL via OpenGL 50fps (onscreen counter)
WebGL via Metal 26fps (onscreen counter)

Will investigate this further.
Comment 6 Dan Glastonbury 2022-04-04 21:21:43 PDT
Created attachment 456665 [details]
Patch
Comment 7 EWS Watchlist 2022-04-04 21:23:45 PDT
Note that there are important steps to take when updating ANGLE. See https://trac.webkit.org/wiki/UpdatingANGLE
Comment 8 Dan Glastonbury 2022-04-04 23:04:20 PDT
Created attachment 456673 [details]
Patch
Comment 9 Dan Glastonbury 2022-04-04 23:05:28 PDT
Comment on attachment 456673 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=456673&action=review

> Source/ThirdParty/ANGLE/changes.diff:-19425
> -diff --git a/src/compiler/translator/TranslatorMetalDirect.cpp b/src/compiler/translator/TranslatorMetalDirect.cpp

I didn't make these changes

> Source/ThirdParty/ANGLE/changes.diff:-19840
> -diff --git a/src/libANGLE/renderer/metal/FrameBufferMtl.mm b/src/libANGLE/renderer/metal/FrameBufferMtl.mm

I didn't make these changes

> Source/ThirdParty/ANGLE/changes.diff:-19931
> -@@ -775,5 +775,4 @@ void PBufferSurfaceMtl::setFixedHeight(EGLint height)

I didn't make these changes
Comment 10 Kimmo Kinnunen 2022-04-05 00:11:03 PDT
Comment on attachment 456673 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=456673&action=review

some comments about  perhaps need for clarifications

> Source/ThirdParty/ANGLE/src/libANGLE/renderer/metal/ContextMtl.h:579
> +    uint32_t mRenderPassCount = 0;

The variable name should be more descriptive. It does not explain that it counts the number of render passes since last flush.

> Source/ThirdParty/ANGLE/src/libANGLE/renderer/metal/ContextMtl.mm:1704
> +        mRenderPassCount = 0;

I don't think it's correct to reset the render pass count only when the render pass count exceeds 16.
You need to reset it somehow else, at some other point too. E.g. for every flush?

> Source/ThirdParty/ANGLE/src/libANGLE/renderer/metal/ContextMtl.mm:1705
> +        flushCommandBuffer(mtl::WaitUntilScheduled);

Is there a reason:
-  exceeding max render passes needs WaitUntilScheduled
vs 
- exceeding max call count needs NoWait

> Source/ThirdParty/ANGLE/src/libANGLE/renderer/metal/mtl_command_buffer.h:171
> +    uint32_t mRenderPasses      = 0;

is this variable used at all?

> Source/ThirdParty/ANGLE/src/libANGLE/renderer/metal/mtl_common.h:151
> +constexpr uint32_t kMaxRenderPasses = 16;

kMaxRenderPasses per some quality (in-flight what? You might have 1000 command buffers "in flight", does it mean you can only have 16 render passes total "in flight"?)

kMaxRenderPassesPerCommandBuffer ?
Comment 11 Kimmo Kinnunen 2022-04-05 00:14:26 PDT
For the commit message, you might spend a sentence or two about what it is that makes having more than 16 render passes in a command buffer slow?
Comment 12 Dan Glastonbury 2022-04-05 22:01:39 PDT
Created attachment 456783 [details]
Patch
Comment 13 Kimmo Kinnunen 2022-04-11 23:21:42 PDT
Comment on attachment 456783 [details]
Patch

nice find!
Comment 14 Kenneth Russell 2022-04-12 11:44:16 PDT
Great find and fix! It would be ideal to upstream this to ANGLE as soon as possible.
Comment 15 Dan Glastonbury 2022-05-31 19:15:00 PDT
Pull request: https://github.com/WebKit/WebKit/pull/1201
Comment 16 EWS 2022-06-01 00:05:13 PDT
Committed r295082 (251177@main): <https://commits.webkit.org/251177@main>

Reviewed commits have been landed. Closing PR #1201 and removing active labels.