WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
215232
REGRESSION (
r257667
): 1.9x more CPU time in IPC::SharedBufferDataReference decoding during Netflix playback
https://bugs.webkit.org/show_bug.cgi?id=215232
Summary
REGRESSION (r257667): 1.9x more CPU time in IPC::SharedBufferDataReference de...
Andy Estes
Reported
2020-08-06 11:49:34 PDT
After
r257667
, traces show 1.9x more CPU time is spent decoding IPC::SharedBufferDataReference objects during video playback on Netflix.
Attachments
Patch for EWS
(6.39 KB, patch)
2020-08-18 11:40 PDT
,
Andy Estes
no flags
Details
Formatted Diff
Diff
Patch for EWS
(44.71 KB, patch)
2020-08-18 21:54 PDT
,
Andy Estes
no flags
Details
Formatted Diff
Diff
Patch for EWS
(50.32 KB, patch)
2020-08-19 03:04 PDT
,
Andy Estes
no flags
Details
Formatted Diff
Diff
Patch
(51.24 KB, patch)
2020-08-19 14:26 PDT
,
Andy Estes
no flags
Details
Formatted Diff
Diff
Patch for landing
(92.81 KB, patch)
2020-08-20 17:19 PDT
,
Andy Estes
no flags
Details
Formatted Diff
Diff
Patch for EWS
(91.81 KB, patch)
2020-08-20 17:52 PDT
,
Andy Estes
no flags
Details
Formatted Diff
Diff
Patch for landing
(91.56 KB, patch)
2020-08-20 18:09 PDT
,
Andy Estes
no flags
Details
Formatted Diff
Diff
Show Obsolete
(5)
View All
Add attachment
proposed patch, testcase, etc.
Radar WebKit Bug Importer
Comment 1
2020-08-06 11:51:06 PDT
<
rdar://problem/66637920
>
Andy Estes
Comment 2
2020-08-18 11:40:54 PDT
Comment hidden (obsolete)
Created
attachment 406796
[details]
Patch for EWS
Andy Estes
Comment 3
2020-08-18 21:54:27 PDT
Comment hidden (obsolete)
Created
attachment 406828
[details]
Patch for EWS
Andy Estes
Comment 4
2020-08-19 03:04:45 PDT
Comment hidden (obsolete)
Created
attachment 406832
[details]
Patch for EWS
Geoffrey Garen
Comment 5
2020-08-19 12:15:48 PDT
Comment on
attachment 406832
[details]
Patch for EWS View in context:
https://bugs.webkit.org/attachment.cgi?id=406832&action=review
> Source/WebKit/ChangeLog:24 > + 1. While both techniques copy the SharedBuffer only once during encoding, the new technique > + combines the buffer's data segments (by calling SharedBuffer::data()) while the old > + technique avoids this.
To fix a regression, a partial revert makes great sense. That said, should we also fix the encoder for SharedBuffer, so that it stops doing this inefficient thing?
> Source/WebKit/ChangeLog:30 > + 2. Once a SharedMemory handle is decoded and mapped, the new technique copies its memory > + into a SharedBuffer for use by the message receiver. Unlike the old technique that > + decodes as a pointer into the existing Decoder buffer, the new technique must decode a > + handle, map a SharedMemory allocation, copy it into a new SharedBuffer on the heap, and > + then destroy the handle and SharedMemory.
Same question here.
> Source/WebKit/Platform/IPC/LegacySharedBufferDataReference.h:36 > +class LegacySharedBufferDataReference {
Would be good to add a comment explaining why this should not be used, and what should be used instead.
Andy Estes
Comment 6
2020-08-19 14:26:41 PDT
Created
attachment 406872
[details]
Patch
Andy Estes
Comment 7
2020-08-19 14:37:46 PDT
(In reply to Geoffrey Garen from
comment #5
)
> Comment on
attachment 406832
[details]
> Patch for EWS > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=406832&action=review
> > > Source/WebKit/ChangeLog:24 > > + 1. While both techniques copy the SharedBuffer only once during encoding, the new technique > > + combines the buffer's data segments (by calling SharedBuffer::data()) while the old > > + technique avoids this. > > To fix a regression, a partial revert makes great sense. That said, should > we also fix the encoder for SharedBuffer, so that it stops doing this > inefficient thing?
Yes, I think we should fix the segment combining issue in a different bug.
> > > Source/WebKit/ChangeLog:30 > > + 2. Once a SharedMemory handle is decoded and mapped, the new technique copies its memory > > + into a SharedBuffer for use by the message receiver. Unlike the old technique that > > + decodes as a pointer into the existing Decoder buffer, the new technique must decode a > > + handle, map a SharedMemory allocation, copy it into a new SharedBuffer on the heap, and > > + then destroy the handle and SharedMemory. > > Same question here.
Yes, also in a different bug. I'll file these and add FIXMEs.
Andy Estes
Comment 8
2020-08-19 14:40:36 PDT
(Note that
https://bugs.webkit.org/attachment.cgi?id=406872
has a much differently worded ChangeLog)
Geoffrey Garen
Comment 9
2020-08-19 15:30:25 PDT
Comment on
attachment 406872
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=406872&action=review
r=me
> Source/WebKit/ChangeLog:43 > + LegacySharedBufferDataReference type, which reverts to the old behaior of producing an
behavior
> Source/WebKit/ChangeLog:44 > + Resolve this slowdown by reverting the argument encoding changes made to > + IPC::SharedBufferDataReference in
r257667
. For compatibility with code added after
r257667
, > + IPC::SharedBufferDataReference retains its new behavior of copying into a SharedBuffer when > + decoded, but this type is now only used in cases where the SharedBuffer can be ref'd instead > + of copied. Pre-
r257667
instances of SharedBufferDataReference were changed to a new > + LegacySharedBufferDataReference type, which reverts to the old behaior of producing an > + IPC::DataReference when decoded, avoiding many copies.
So LegacySharedBufferDataReference is the fast IPC data type and SharedBufferDataReference is the slow IPC data type? That is... surprising. Are we in danger of new adoptions of SharedBufferDataReference causing perf regressions?
Darin Adler
Comment 10
2020-08-19 15:30:55 PDT
Comment on
attachment 406872
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=406872&action=review
Great work! So exciting to see this diagnosed.
> Source/WebKit/ChangeLog:3 > + REGRESSION (
r257667
): 1.9x more CPU time in IPC::SharedBufferDataReference decoding during Netflix playback
What’s our strategy to protect ourselves from making this kind of mistake again? Maybe we can find a way to count how often we copy data buffers during regression tests? There can’t be too many different places that do that "copy a lot of bytes" operation.
> Source/WebKit/ChangeLog:21 > + handle. It makes only one copy like the old encoder did, but pays the added cost of > + combining SharedBuffer data segments prior to copying by needlessly calling > + SharedBuffer::data() (this should be fixed independently).
Seems like we have a real opportunity there; doesn’t seem that hard to code!
> Source/WebKit/ChangeLog:43 > + LegacySharedBufferDataReference type, which reverts to the old behaior of producing an
I’m not sure that "Legacy" accurately captures the difference between these two. Normally we name something Legacy when the dependency on the old behavior is an unwanted remnant of the past, to eventually be eliminated. Here it seems from your description there are reasons to prefer the two different behaviors for different cases of sending data across the process boundary; ideally those would be reflected in the names. Typo: "behaior"
> Tools/DumpRenderTree/DumpRenderTree.xcodeproj/project.pbxproj:414 > + BCB281EE0CFA713D007E533E /* Base.xcconfig */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text.xcconfig; name = Base.xcconfig; path = mac/Configurations/Base.xcconfig; sourceTree = "<group>"; }; > + BCB281F00CFA713D007E533E /* DumpRenderTree.xcconfig */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text.xcconfig; name = DumpRenderTree.xcconfig; path = mac/Configurations/DumpRenderTree.xcconfig; sourceTree = "<group>"; }; > + BCB282F40CFA7450007E533E /* DebugRelease.xcconfig */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text.xcconfig; name = DebugRelease.xcconfig; path = mac/Configurations/DebugRelease.xcconfig; sourceTree = "<group>"; }; > + BCB283DE0CFA7C20007E533E /* TestNetscapePlugIn.xcconfig */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text.xcconfig; name = TestNetscapePlugIn.xcconfig; path = mac/Configurations/TestNetscapePlugIn.xcconfig; sourceTree = "<group>"; };
Are these changes improvements? Is "4" a better file encoding than "30"? Or is this just a script doing some reverting?
Andy Estes
Comment 11
2020-08-19 17:06:33 PDT
(In reply to Geoffrey Garen from
comment #9
)
> So LegacySharedBufferDataReference is the fast IPC data type and > SharedBufferDataReference is the slow IPC data type? That is... surprising. > > Are we in danger of new adoptions of SharedBufferDataReference causing perf > regressions?
(In reply to Darin Adler from
comment #10
)
> I’m not sure that "Legacy" accurately captures the difference between these > two. Normally we name something Legacy when the dependency on the old > behavior is an unwanted remnant of the past, to eventually be eliminated. > Here it seems from your description there are reasons to prefer the two > different behaviors for different cases of sending data across the process > boundary; ideally those would be reflected in the names.
How about I rename IPC::LegacySharedBufferDataReference to IPC::SharedBufferDataReference and call the current IPC::SharedBufferDataReference something like IPC::SharedBufferCopy? The thinking that led to "legacy" was that while IPC::DataReference is currently the faster way to decode, IPC::SharedBufferDataReference provides the safer interface for message receivers with SharedBuffer instead of a raw pointer to an unretained buffer. So that LegacySharedBufferDataReference is decoded as a IPC::DataReference felt like a legacy behavior, but really it's a faster-or-safer tradeoff I want to convey. I think a modern design would be to phase out DataReference (and LegacySharedBufferDataReference) and make SharedBuffer faster to use for decoding. One thing I think would help here is to add a type of SharedBuffer data segment that just refs a SharedMemory or IPC::Decoder, but that's out-of-scope for this change.
Darin Adler
Comment 12
2020-08-19 17:21:47 PDT
All of that sounds smart.
Andy Estes
Comment 13
2020-08-20 17:19:37 PDT
Comment hidden (obsolete)
Created
attachment 406980
[details]
Patch for landing
Andy Estes
Comment 14
2020-08-20 17:52:39 PDT
Created
attachment 406989
[details]
Patch for EWS
Andy Estes
Comment 15
2020-08-20 18:09:26 PDT
Comment hidden (obsolete)
Created
attachment 406992
[details]
Patch for landing
Andy Estes
Comment 16
2020-08-20 18:17:21 PDT
(In reply to Darin Adler from
comment #10
)
> Comment on
attachment 406872
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=406872&action=review
> > Great work! So exciting to see this diagnosed. > > > Source/WebKit/ChangeLog:3 > > + REGRESSION (
r257667
): 1.9x more CPU time in IPC::SharedBufferDataReference decoding during Netflix playback > > What’s our strategy to protect ourselves from making this kind of mistake > again? > > Maybe we can find a way to count how often we copy data buffers during > regression tests? There can’t be too many different places that do that > "copy a lot of bytes" operation.
One thing we could do is assert in IPC::SharedBufferCopy's destructor that m_buffer is either null or has more than one ref. That would detect cases where we created a SharedBuffer that the message receiver never ref'd or moved.
EWS
Comment 17
2020-08-21 04:29:24 PDT
Committed
r266003
: <
https://trac.webkit.org/changeset/266003
> All reviewed patches have been landed. Closing bug and clearing flags on
attachment 406992
[details]
.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug