Bug 235615 - REGRESSION(r287684) speedtest.net uses many GB of memory
Summary: REGRESSION(r287684) speedtest.net uses many GB of memory
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Jean-Yves Avenard [:jya]
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2022-01-25 15:31 PST by Alex Christensen
Modified: 2022-02-10 16:40 PST (History)
32 users (show)

See Also:


Attachments
Patch (1.56 KB, patch)
2022-01-25 15:33 PST, Alex Christensen
no flags Details | Formatted Diff | Diff
Patch (252.15 KB, patch)
2022-01-25 19:26 PST, Alex Christensen
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
safari-613-branch-patch (255.45 KB, patch)
2022-01-25 21:12 PST, Alex Christensen
no flags Details | Formatted Diff | Diff
Patch (2.25 KB, patch)
2022-01-26 01:02 PST, Jean-Yves Avenard [:jya]
no flags Details | Formatted Diff | Diff
Patch (3.69 KB, patch)
2022-01-26 22:03 PST, Jean-Yves Avenard [:jya]
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alex Christensen 2022-01-25 15:31:50 PST
REGRESSION(r287684) speedtest.net uses many GB of memory
Comment 1 Alex Christensen 2022-01-25 15:33:13 PST
Created attachment 449974 [details]
Patch
Comment 2 Alex Christensen 2022-01-25 15:33:17 PST
<rdar://problem/87830583>
Comment 3 Chris Dumez 2022-01-25 15:34:12 PST
Comment on attachment 449974 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=449974&action=review

> Source/WebCore/ChangeLog:9
> +        For a reason that is still somewhat mysterious to me, the change to XMLHttpRequest::didReceiveData

Doesn't inspire confidence :)
Comment 4 Alex Christensen 2022-01-25 15:34:48 PST
Try it and see for yourself.  Before and after, the difference is quite clear.
Comment 5 Chris Dumez 2022-01-25 15:35:38 PST
(In reply to Alex Christensen from comment #4)
> Try it and see for yourself.  Before and after, the difference is quite
> clear.

Well, I believe you but it'd be good to figure out WHY still. You may be working around a very important bug that will bite us again in the future.
Comment 6 Chris Dumez 2022-01-25 15:38:32 PST
(In reply to Chris Dumez from comment #5)
> (In reply to Alex Christensen from comment #4)
> > Try it and see for yourself.  Before and after, the difference is quite
> > clear.
> 
> Well, I believe you but it'd be good to figure out WHY still. You may be
> working around a very important bug that will bite us again in the future.

Also, r287684 seems to be doing a lot of extra data copying (when allocating those SharedBuffers) unless I am misreading. It is not clear to me we want to keep going in that direction (perf/memory wise). We may want to revert r287684 instead of e.g.
Comment 7 Jean-Yves Avenard [:jya] 2022-01-25 16:07:05 PST
I don’t think this is the right fix. 

There are no copies/allocation. The memory coming from the network process is a SharedMemory and is wrapped in a SharedBuffer

So the memory would still be allocated to the network process rather than the content process. 

You need to change the ownership. 

Your change by doing a new allocation/copy as a side effect assign the memory to the content process
Comment 8 Chris Dumez 2022-01-25 16:09:26 PST
(In reply to Jean-Yves Avenard [:jya] from comment #7)
> I don’t think this is the right fix. 
> 
> There are no copies/allocation. The memory coming from the network process
> is a SharedMemory and is wrapped in a SharedBuffer

My understanding was that it is only SharedMemory if the data comes from the network cache, not when it comes from the network?

> 
> So the memory would still be allocated to the network process rather than
> the content process. 
> 
> You need to change the ownership. 
> 
> Your change by doing a new allocation/copy as a side effect assign the
> memory to the content process
Comment 9 Alex Christensen 2022-01-25 16:36:17 PST
To reduce risk for the branch, I'm preparing a revert of r287684 and r287687.
Comment 10 Jean-Yves Avenard [:jya] 2022-01-25 17:10:44 PST
(In reply to Chris Dumez from comment #8)
> (In reply to Jean-Yves Avenard [:jya] from comment #7)
> > I don’t think this is the right fix. 
> > 
> > There are no copies/allocation. The memory coming from the network process
> > is a SharedMemory and is wrapped in a SharedBuffer
> 
> My understanding was that it is only SharedMemory if the data comes from the
> network cache, not when it comes from the network?

It's all SharedMemory if the data originates from the network process.
Wrapped in a SharedBuffer.
Comment 11 Jean-Yves Avenard [:jya] 2022-01-25 17:14:24 PST
Comment on attachment 449974 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=449974&action=review

> Source/WebCore/xml/XMLHttpRequest.cpp:1062
> +        m_binaryResponseBuilder.append(buffer.data(), buffer.size());

BTW, this would revert to the behaviour of what was done prior https://commits.webkit.org/r287684.

So while it a memory/performance regression following r287684; it is not a regression compare to what we had before.
Comment 12 Alex Christensen 2022-01-25 19:26:13 PST
Created attachment 449993 [details]
Patch
Comment 13 Alex Christensen 2022-01-25 21:12:11 PST
Created attachment 449995 [details]
safari-613-branch-patch
Comment 14 Jean-Yves Avenard [:jya] 2022-01-26 01:02:48 PST
Created attachment 450003 [details]
Patch
Comment 15 youenn fablet 2022-01-26 02:39:29 PST
Comment on attachment 450003 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=450003&action=review

> Source/WebCore/loader/cache/CachedRawResource.cpp:71
> +        auto incrementalData = data.getSomeData(previousDataSize);

This makes SharedBufferDataView::createSharedBuffer() or more generally SharedBuffer created from views potentially harmful.

Unrelated but I wonder whether we need to update m_data (makeContiguous call) in the case of DataBufferingPolicy::DoNotBufferData.
It seems there is a potential to further reduce some memory allocations in that case: XHR, fetch...
Comment 16 Jean-Yves Avenard [:jya] 2022-01-26 02:46:51 PST
Comment on attachment 450003 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=450003&action=review

>> Source/WebCore/loader/cache/CachedRawResource.cpp:71
>> +        auto incrementalData = data.getSomeData(previousDataSize);
> 
> This makes SharedBufferDataView::createSharedBuffer() or more generally SharedBuffer created from views potentially harmful.
> 
> Unrelated but I wonder whether we need to update m_data (makeContiguous call) in the case of DataBufferingPolicy::DoNotBufferData.
> It seems there is a potential to further reduce some memory allocations in that case: XHR, fetch...

harmful in what way? it has always behaved that way.

could you open a bug for following for the non-cache side of things?
Comment 17 youenn fablet 2022-01-26 02:51:59 PST
> harmful in what way? it has always behaved that way.

A SharedBuffer might be much bigger than its size actually tells us.

> could you open a bug for following for the non-cache side of things?

I filed https://bugs.webkit.org/show_bug.cgi?id=235635
Comment 18 Jean-Yves Avenard [:jya] 2022-01-26 02:52:39 PST
Comment on attachment 450003 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=450003&action=review

>>> Source/WebCore/loader/cache/CachedRawResource.cpp:71
>>> +        auto incrementalData = data.getSomeData(previousDataSize);
>> 
>> This makes SharedBufferDataView::createSharedBuffer() or more generally SharedBuffer created from views potentially harmful.
>> 
>> Unrelated but I wonder whether we need to update m_data (makeContiguous call) in the case of DataBufferingPolicy::DoNotBufferData.
>> It seems there is a potential to further reduce some memory allocations in that case: XHR, fetch...
> 
> harmful in what way? it has always behaved that way.
> 
> could you open a bug for following for the non-cache side of things?

Actually, I can optimise things right away as m_data is cleared immediately after when data policy is to not cache
Comment 19 Jean-Yves Avenard [:jya] 2022-01-26 02:58:50 PST
Comment on attachment 450003 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=450003&action=review

>>>> Source/WebCore/loader/cache/CachedRawResource.cpp:71
>>>> +        auto incrementalData = data.getSomeData(previousDataSize);
>>> 
>>> This makes SharedBufferDataView::createSharedBuffer() or more generally SharedBuffer created from views potentially harmful.
>>> 
>>> Unrelated but I wonder whether we need to update m_data (makeContiguous call) in the case of DataBufferingPolicy::DoNotBufferData.
>>> It seems there is a potential to further reduce some memory allocations in that case: XHR, fetch...
>> 
>> harmful in what way? it has always behaved that way.
>> 
>> could you open a bug for following for the non-cache side of things?
> 
> Actually, I can optimise things right away as m_data is cleared immediately after when data policy is to not cache

on second thoughts, there's so many re-entrancy happening in this area of the code ; I'd prefer to keep it that way for now and look into the matter at a later stage, too scared of side effects
Comment 20 youenn fablet 2022-01-26 03:37:06 PST
> on second thoughts, there's so many re-entrancy happening in this area of
> the code ; I'd prefer to keep it that way for now and look into the matter
> at a later stage, too scared of side effects

Yep, it would be nice to make that code easier to reason about.
Comment 21 EWS 2022-01-26 04:57:30 PST
Committed r288614 (246433@main): <https://commits.webkit.org/246433@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 450003 [details].
Comment 22 Jean-Yves Avenard [:jya] 2022-01-26 05:15:31 PST
Comment on attachment 449993 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=449993&action=review

> Source/WebCore/ChangeLog:8
> +        This effectively reverts r287684 and r287687.

on ToT, you would need to revert several more changes for this test to be green. In particular https://commits.webkit.org/r287808
Comment 23 Geoffrey Garen 2022-01-26 10:05:30 PST
Comment on attachment 450003 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=450003&action=review

> Source/WebCore/ChangeLog:18
> +        However, that caused every single intermediary accumulated buffers to be
> +        kept referenced by the XMLHttpRequest SharedBufferBuilder leading to
> +        massive memory use.

Can you explain this detail a little more? What behavior changes when we operate on 'data' instead of 'm_data'?
Comment 24 Geoffrey Garen 2022-01-26 10:07:59 PST
Comment on attachment 450003 [details]
Patch

Seems like a valuable thing to test, now that we've discovered these quirks.

Is it possible to write a layout test for this that would jetsam on our iOS testers, or a cross platform test that uses window.internals to check for crazy high memory use?

We can use an http test with a .cgi script to supply a fast/huge XHR response.
Comment 25 youenn fablet 2022-01-26 10:20:20 PST
> > Source/WebCore/ChangeLog:18
> > +        However, that caused every single intermediary accumulated buffers to be
> > +        kept referenced by the XMLHttpRequest SharedBufferBuilder leading to
> > +        massive memory use.
> 
> Can you explain this detail a little more? What behavior changes when we
> operate on 'data' instead of 'm_data'?

Here is my understanding. For each update, we do the following steps:
- Set m_data to a new big continuous chunk of data that stores all received data
- Create a view of the new data as a SharedBuffer. This SharedBuffer references the big continuous chunk
- XHR stores a ref to the view, hence keep the big chunk alive.

Each XHR chunk, although small in data that can be accessed, is actually keeping in memory all temporary created m_data chunks.

> Is it possible to write a layout test for this that would jetsam on our iOS
> testers, or a cross platform test that uses window.internals to check for
> crazy high memory use?

An internals test is probably a good idea though that might require adding a getter in SharedBuffer to know the actual allocated size.
Comment 26 Geoffrey Garen 2022-01-26 10:32:02 PST
> An internals test is probably a good idea though that might require adding a
> getter in SharedBuffer to know the actual allocated size.

I was thinking of something coarser like window.internals.isUnderMemoryPressure, or adding a new interface like window.internals.memoryUsage which returned something like getrusage.ru_idrss or something.
Comment 27 Geoffrey Garen 2022-01-26 10:40:04 PST
> - Create a view of the new data as a SharedBuffer. This SharedBuffer
> references the big continuous chunk
> - XHR stores a ref to the view, hence keep the big chunk alive.

Got it: The key to the bug is that each call to didReceiveData receives a view of a *new* SharedBuffer, and each new SharedBuffer is as big as the whole response; but XHR expected to be passed a buffer that was only as big as the new data, to accumulate in a builder.

It seems like we could improve this interface by passing both the new data and the accumulated data, since CachedRawResource has both. Then XHR wouldn't need to maintain its own response builder.
Comment 28 youenn fablet 2022-01-26 10:45:53 PST
I am not sure we want to expose the whole data, not all clients need it.
We should actually be able to just pass the new data directly without any special processing in the nobuffer case, m_data would remain null in that case for instance.
Comment 29 Jean-Yves Avenard [:jya] 2022-01-26 15:41:55 PST
(In reply to Geoffrey Garen from comment #27)
> > - Create a view of the new data as a SharedBuffer. This SharedBuffer
> > references the big continuous chunk
> > - XHR stores a ref to the view, hence keep the big chunk alive.
> 
> Got it: The key to the bug is that each call to didReceiveData receives a
> view of a *new* SharedBuffer, and each new SharedBuffer is as big as the
> whole response; but XHR expected to be passed a buffer that was only as big
> as the new data, to accumulate in a builder.

That's right.
However the key is that the new incoming SharedBuffer is a fragmented one, it's made of multiple chunks, and XHR only expects that last chunk.

The CachedRawResource however maintains a *contiguous* SharedBuffer that is created at each didReceiveData call and that is kept alive for the duration of the loop letting all the client knows about that new chunks.
That contiguous SharedBuffer needs to be kept temporarily because some clients when being notified of a new chunk queries the CachedRawResource through a re-entrant call to get a pointer to the flatten data.

With this bug, the new chunk of data given to the client was a SharedBuffer made from a SharedBufferDataView that kept a strong ref to the temporary flatten data.
As the XHR client keeps a strong reference to every chunks, each accumulated buffer were never deleted.
Comment 30 Ryan Haddad 2022-01-26 17:22:13 PST
Reverted r288614 for reason:

Caused layout test crashes

Committed r288659 (246467@trunk): <https://commits.webkit.org/246467@trunk>
Comment 31 Jean-Yves Avenard [:jya] 2022-01-26 22:03:08 PST
Created attachment 450104 [details]
Patch

Fix UAF due to the CachedRawResource being cleared through a re-entrant call
Comment 32 EWS 2022-01-26 23:00:11 PST
Committed r288667 (246473@main): <https://commits.webkit.org/246473@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 450104 [details].