Bug 175083 - REGRESSION (r219895-219897): Number of leaks on Open Source went from 9240 to 235983 and is now at 302372
Summary: REGRESSION (r219895-219897): Number of leaks on Open Source went from 9240 to...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Filip Pizlo
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2017-08-02 11:56 PDT by Matt Lewis
Modified: 2017-08-06 09:40 PDT (History)
13 users (show)

See Also:


Attachments
the patch (24.39 KB, patch)
2017-08-05 18:05 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch (24.92 KB, patch)
2017-08-05 18:13 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch (24.92 KB, patch)
2017-08-05 18:31 PDT, Filip Pizlo
oliver: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Matt Lewis 2017-08-02 11:56:31 PDT
After the build:
https://build.webkit.org/builders/Apple%20Sierra%20%28Leaks%29/builds/2321

The number of reported leaks on the Sierra leaks tester went from 9240 to 235983 according to the archive:
https://build.webkit.org/builders/Apple%20Sierra%20%28Leaks%29?numbuilds=200
Comment 1 Chris Dumez 2017-08-02 12:04:02 PDT
I'll double check but my cleanup patch is unlikely to be the culprit. Phil should probably take a look.
Comment 2 Filip Pizlo 2017-08-02 12:35:25 PDT
Also likely to have the same root cause as https://bugs.webkit.org/show_bug.cgi?id=175082
Comment 3 Alexey Proskuryakov 2017-08-02 22:56:17 PDT
Bug 175082 was fixed, but we still have all those leaks as of r220178.
Comment 4 Radar WebKit Bug Importer 2017-08-02 22:57:03 PDT
<rdar://problem/33693778>
Comment 5 Alexey Proskuryakov 2017-08-03 12:10:33 PDT
I guess it has to be due to https://trac.webkit.org/r219897 (GC should be fine with trading blocks between destructor and non-destructor blocks)
Comment 6 Filip Pizlo 2017-08-03 13:42:03 PDT
(In reply to Alexey Proskuryakov from comment #5)
> I guess it has to be due to https://trac.webkit.org/r219897 (GC should be
> fine with trading blocks between destructor and non-destructor blocks)

I think so too.  I'm looking at this now.

I already tried reproducing obvious things.  It appears that destructors are still called exactly like they should be in simple tests.  I'm currently waiting to run the full tests...
Comment 7 Filip Pizlo 2017-08-04 15:47:00 PDT
I can repro this by just running fast/events/gc-freeze-with-attribute-listeners.html.

In that test, the entire leak is SparseArrayValueMap-related.  It looks like some destructor is not running.
Comment 8 Filip Pizlo 2017-08-05 13:22:33 PDT
This is so strange!  It looks as if all of the leak is that we're not GCing at the end, and `leaks` thinks that everything we allocated is a leak (because it kinda is, since we didn't GC).
Comment 9 Filip Pizlo 2017-08-05 13:46:36 PDT
I just ran into a bug where run-webkit-tests does not clear out the leaks files from previous runs.  I was getting totally nonsensical results.

Now I'm running with "rm -rf WebKitBuild/Release/layout-test-results" before each test run.
Comment 10 Filip Pizlo 2017-08-05 14:47:43 PDT
When I run locally, I can get a massive amount of leaks if I can get tests to timeout.  This is because the leaks tool works best when we are quiet.  If you run leaks at the moment that you are killing the process due to timeout, then things get very confusing.

On the leaks bot, there are many timeouts.

I wonder if this is the issue.
Comment 11 Filip Pizlo 2017-08-05 14:49:45 PDT
(In reply to Filip Pizlo from comment #10)
> When I run locally, I can get a massive amount of leaks if I can get tests
> to timeout.  This is because the leaks tool works best when we are quiet. 
> If you run leaks at the moment that you are killing the process due to
> timeout, then things get very confusing.
> 
> On the leaks bot, there are many timeouts.
> 
> I wonder if this is the issue.

No, there were many timeouts both before and after the regression.
Comment 12 Filip Pizlo 2017-08-05 15:24:36 PDT
I think I'm onto it.

workers/bomb.html has a massive leak.  If you turn off timeout, it leaks ~50MB.  It leaks less than that on the bots only because it times out.  Crazy!
Comment 13 Filip Pizlo 2017-08-05 17:17:10 PDT
A huge chunk of the leak was that we would measure leaks while VM destruction was in progress.  This meant that as the Heap::lastChanceToFinalize() proceeds through the object graph in basically a random order and destroys objects, at some point we will have something that looks like a massive leak because lastChanceToFinalize() will have cut off some subgraphs from the roots.  I fixed this issue by making sure that DumpRenderTree waits for VM shutdown to complete before going to the next test.

Once I did that, I could easily reproduce the actual leak.  Also, if I disable the ability to trade destructor blocks, the leak goes away.

Still for why...
Comment 14 Filip Pizlo 2017-08-05 18:05:21 PDT
Created attachment 317353 [details]
the patch

Maybe this'll do it.
Comment 15 Filip Pizlo 2017-08-05 18:13:27 PDT
Created attachment 317354 [details]
the patch
Comment 16 Filip Pizlo 2017-08-05 18:31:49 PDT
Created attachment 317355 [details]
the patch

Fix debug build
Comment 17 Oliver Hunt 2017-08-05 20:45:53 PDT
Comment on attachment 317355 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=317355&action=review

r=me assuming perf is clean

> Source/WTF/ChangeLog:9
> +        Adds a classic ReadWriteLock class. I wrote my own because I can never remember if the pthread one is
> +        guaranted to bias in favor of writers or not.

the best rationale :P

> Source/WTF/wtf/Condition.h:177
>      Atomic<bool> m_hasWaiters;

Any idea why this was protected in the first place? no matter, not relevant to this patch, just curious
Comment 18 Filip Pizlo 2017-08-05 21:44:03 PDT
Landed in https://trac.webkit.org/changeset/220322/webkit
Comment 19 Filip Pizlo 2017-08-05 21:44:40 PDT
(In reply to Oliver Hunt from comment #17)
> Comment on attachment 317355 [details]
> the patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=317355&action=review
> 
> r=me assuming perf is clean
> 
> > Source/WTF/ChangeLog:9
> > +        Adds a classic ReadWriteLock class. I wrote my own because I can never remember if the pthread one is
> > +        guaranted to bias in favor of writers or not.
> 
> the best rationale :P
> 
> > Source/WTF/wtf/Condition.h:177
> >      Atomic<bool> m_hasWaiters;
> 
> Any idea why this was protected in the first place? no matter, not relevant
> to this patch, just curious

Because there was no construct() function, so Condition's constructor had to set ConditionBase::m_hasWaiters directly.
Comment 20 Filip Pizlo 2017-08-06 09:40:04 PDT
Down to 5540 leaks on https://build.webkit.org/builders/Apple%20Sierra%20%28Leaks%29?numbuilds=200 as of build 2498, the first one to have this change.

Woohoo!!!