Bug 177051 - [DFG][FTL] Profile array vector length for array allocation
Summary: [DFG][FTL] Profile array vector length for array allocation
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Yusuke Suzuki
URL:
Keywords: InRadar
Depends on: 177352
Blocks: 175823
  Show dependency treegraph
 
Reported: 2017-09-17 05:43 PDT by Yusuke Suzuki
Modified: 2017-09-27 12:30 PDT (History)
8 users (show)

See Also:


Attachments
Patch (25.28 KB, patch)
2017-09-17 05:52 PDT, Yusuke Suzuki
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews116 for mac-elcapitan (2.95 MB, application/zip)
2017-09-17 09:47 PDT, Build Bot
no flags Details
Patch (25.98 KB, patch)
2017-09-17 10:09 PDT, Yusuke Suzuki
no flags Details | Formatted Diff | Diff
Patch (25.93 KB, patch)
2017-09-21 11:32 PDT, Yusuke Suzuki
no flags Details | Formatted Diff | Diff
Patch (25.93 KB, patch)
2017-09-21 11:34 PDT, Yusuke Suzuki
no flags Details | Formatted Diff | Diff
Patch (25.99 KB, patch)
2017-09-21 12:13 PDT, Yusuke Suzuki
saam: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Yusuke Suzuki 2017-09-17 05:43:52 PDT
[DFG][FTL] Profile array vector length for array allocation
Comment 1 Yusuke Suzuki 2017-09-17 05:52:15 PDT
Created attachment 321041 [details]
Patch
Comment 2 Build Bot 2017-09-17 05:53:27 PDT
Attachment 321041 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5115:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 1 in 15 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Yusuke Suzuki 2017-09-17 07:21:22 PDT
Performance is neutral.

                                                 baseline                  patched
Octane:
   encrypt                                   0.20807+-0.00303          0.20600+-0.00282         might be 1.0101x faster
   decrypt                                   3.14098+-0.00780    ?     3.16607+-0.04021       ?
   deltablue                        x2       0.18569+-0.00395          0.18560+-0.00264
   earley                                    0.38538+-0.01024          0.38403+-0.00911
   boyer                                     6.82270+-0.05162    ?     6.97224+-0.11820       ? might be 1.0219x slower
   navier-stokes                    x2       5.21573+-0.04177    ?     5.21574+-0.03458       ?
   raytrace                         x2       1.07764+-0.01811          1.07584+-0.00998
   richards                         x2       0.11382+-0.00141    ?     0.11548+-0.00446       ? might be 1.0145x slower
   splay                            x2       0.42082+-0.00307    ?     0.42293+-0.00598       ?
   regexp                           x2      24.01438+-0.46084    ?    24.80293+-0.56880       ? might be 1.0328x slower
   pdfjs                            x2      55.17571+-1.59673         53.15148+-0.98136         might be 1.0381x faster
   mandreel                         x2      58.92982+-0.67668    ?    59.29992+-0.77605       ?
   gbemu                            x2      47.53171+-3.03032    ?    48.04866+-2.64296       ? might be 1.0109x slower
   closure                                   0.76336+-0.03479          0.75708+-0.02371
   jquery                                    9.64067+-0.37570    ?     9.85404+-0.33870       ? might be 1.0221x slower
   box2d                            x2      12.58995+-0.36595    ?    12.96163+-0.46848       ? might be 1.0295x slower
   zlib                             x2     414.28073+-5.94692    ?   415.01335+-7.49360       ?
   typescript                       x2     840.64915+-20.59660       838.23090+-21.60916

   <geometric>                               6.75872+-0.03968    ?     6.79147+-0.04017       ? might be 1.0048x slower

                                                 baseline                  patched
Kraken:
   ai-astar                                  151.339+-5.551      ?     152.634+-5.059         ?
   audio-beat-detection                       54.184+-2.878             53.456+-2.852           might be 1.0136x faster
   audio-dft                                 101.937+-3.057            101.080+-3.787
   audio-fft                                  39.901+-1.877             38.888+-1.327           might be 1.0261x faster
   audio-oscillator                           67.136+-3.382      ?      68.898+-2.347         ? might be 1.0263x slower
   imaging-darkroom                          113.707+-3.292      ?     114.571+-3.951         ?
   imaging-desaturate                         72.823+-2.169             70.250+-2.169           might be 1.0366x faster
   imaging-gaussian-blur                     105.957+-5.224      ?     110.828+-6.102         ? might be 1.0460x slower
   json-parse-financial                       62.342+-1.823             62.091+-2.311
   json-stringify-tinderbox                   39.915+-1.171             38.649+-1.797           might be 1.0328x faster
   stanford-crypto-aes                        56.167+-1.860      ?      59.665+-2.724         ? might be 1.0623x slower
   stanford-crypto-ccm                        50.625+-2.726             50.208+-2.337
   stanford-crypto-pbkdf2                     91.671+-2.351      ?      94.705+-3.186         ? might be 1.0331x slower
   stanford-crypto-sha256-iterative           31.975+-1.351             31.511+-2.003           might be 1.0147x faster

   <arithmetic>                               74.263+-0.837      ?      74.817+-0.844         ? might be 1.0075x slower

                                                 baseline                  patched
Geomean of preferred means:
   <scaled-result>                           22.4006+-0.1453     ?     22.5385+-0.1485        ? might be 1.0062x slower
Comment 4 Build Bot 2017-09-17 09:47:22 PDT
Comment on attachment 321041 [details]
Patch

Attachment 321041 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/4575595

Number of test failures exceeded the failure limit.
Comment 5 Build Bot 2017-09-17 09:47:25 PDT
Created attachment 321045 [details]
Archive of layout-test-results from ews116 for mac-elcapitan

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews116  Port: mac-elcapitan  Platform: Mac OS X 10.11.6
Comment 6 Yusuke Suzuki 2017-09-17 10:09:23 PDT
Created attachment 321047 [details]
Patch
Comment 7 Build Bot 2017-09-17 10:11:56 PDT
Attachment 321047 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5115:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 1 in 15 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 8 Yusuke Suzuki 2017-09-21 08:48:44 PDT
Ping?
Comment 9 Saam Barati 2017-09-21 11:00:34 PDT
Comment on attachment 321047 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=321047&action=review

I'm a bit worried this is too tuned to microbenchmarks that don't matter. Comments below

> Source/JavaScriptCore/bytecode/ArrayAllocationProfile.cpp:54
> +        m_seenLargestVectorLength = std::min(std::max(m_seenLargestVectorLength, lastArray->getVectorLength()), BASE_CONTIGUOUS_VECTOR_LEN_MAX);

So this number is capped at 5? This seems a bit of a hack to just help a microbenchmark. I agree we shouldn't let this number grow unboundedly, but perhaps we can make it larger than 5? 5 seems too small. WDYT?

> Source/JavaScriptCore/bytecode/ArrayAllocationProfile.h:77
> +    unsigned m_seenLargestVectorLength { 0 };

let's name this: m_largestSeenVectorLength
Comment 10 Yusuke Suzuki 2017-09-21 11:26:24 PDT
Comment on attachment 321047 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=321047&action=review

Thanks! I've noted the rationale.

>> Source/JavaScriptCore/bytecode/ArrayAllocationProfile.cpp:54
>> +        m_seenLargestVectorLength = std::min(std::max(m_seenLargestVectorLength, lastArray->getVectorLength()), BASE_CONTIGUOUS_VECTOR_LEN_MAX);
> 
> So this number is capped at 5? This seems a bit of a hack to just help a microbenchmark. I agree we shouldn't let this number grow unboundedly, but perhaps we can make it larger than 5? 5 seems too small. WDYT?

This number "5" is just selected from the initial size of empty array.
The goal of this patch is, "While keeping new_array_buffer's initial vector size small for major cases, we would like to offer a bit large initial vector size (same to empty array) if it seems necessary".


If we have the same initial vector size for empty array and new_array_buffer, I think it's OK.
But the reality is that the initial vector size for new_array_buffer is smaller than the empty array's size (3 v.s. 5)
We do not want to extend new_array_buffer's initial vector size to 5 since it may waste memory.
But we do not want to penaltize the initial vector size of new_array_buffer if it is likely that new_array_buffer's array requires more vector size.

That's why I start using profiling.

>> Source/JavaScriptCore/bytecode/ArrayAllocationProfile.h:77
>> +    unsigned m_seenLargestVectorLength { 0 };
> 
> let's name this: m_largestSeenVectorLength

Fixed!
Comment 11 Yusuke Suzuki 2017-09-21 11:30:31 PDT
Comment on attachment 321047 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=321047&action=review

>>> Source/JavaScriptCore/bytecode/ArrayAllocationProfile.cpp:54
>>> +        m_seenLargestVectorLength = std::min(std::max(m_seenLargestVectorLength, lastArray->getVectorLength()), BASE_CONTIGUOUS_VECTOR_LEN_MAX);
>> 
>> So this number is capped at 5? This seems a bit of a hack to just help a microbenchmark. I agree we shouldn't let this number grow unboundedly, but perhaps we can make it larger than 5? 5 seems too small. WDYT?
> 
> This number "5" is just selected from the initial size of empty array.
> The goal of this patch is, "While keeping new_array_buffer's initial vector size small for major cases, we would like to offer a bit large initial vector size (same to empty array) if it seems necessary".
> 
> 
> If we have the same initial vector size for empty array and new_array_buffer, I think it's OK.
> But the reality is that the initial vector size for new_array_buffer is smaller than the empty array's size (3 v.s. 5)
> We do not want to extend new_array_buffer's initial vector size to 5 since it may waste memory.
> But we do not want to penaltize the initial vector size of new_array_buffer if it is likely that new_array_buffer's array requires more vector size.
> 
> That's why I start using profiling.

BTW, I'm OK to extend this value to a bit more larger numbers, like 16? The current "5" number is just picked from empty array.
Comment 12 Yusuke Suzuki 2017-09-21 11:32:33 PDT
Created attachment 321454 [details]
Patch
Comment 13 Yusuke Suzuki 2017-09-21 11:34:07 PDT
Created attachment 321455 [details]
Patch
Comment 14 Build Bot 2017-09-21 11:37:53 PDT
Attachment 321455 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5118:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 1 in 15 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 15 Yusuke Suzuki 2017-09-21 11:42:31 PDT
Comment on attachment 321047 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=321047&action=review

>>>> Source/JavaScriptCore/bytecode/ArrayAllocationProfile.cpp:54
>>>> +        m_seenLargestVectorLength = std::min(std::max(m_seenLargestVectorLength, lastArray->getVectorLength()), BASE_CONTIGUOUS_VECTOR_LEN_MAX);
>>> 
>>> So this number is capped at 5? This seems a bit of a hack to just help a microbenchmark. I agree we shouldn't let this number grow unboundedly, but perhaps we can make it larger than 5? 5 seems too small. WDYT?
>> 
>> This number "5" is just selected from the initial size of empty array.
>> The goal of this patch is, "While keeping new_array_buffer's initial vector size small for major cases, we would like to offer a bit large initial vector size (same to empty array) if it seems necessary".
>> 
>> 
>> If we have the same initial vector size for empty array and new_array_buffer, I think it's OK.
>> But the reality is that the initial vector size for new_array_buffer is smaller than the empty array's size (3 v.s. 5)
>> We do not want to extend new_array_buffer's initial vector size to 5 since it may waste memory.
>> But we do not want to penaltize the initial vector size of new_array_buffer if it is likely that new_array_buffer's array requires more vector size.
>> 
>> That's why I start using profiling.
> 
> BTW, I'm OK to extend this value to a bit more larger numbers, like 16? The current "5" number is just picked from empty array.

I'll extend it to a bit larger value, it should be fit for heap's size class.
Comment 16 Yusuke Suzuki 2017-09-21 12:13:28 PDT
Created attachment 321460 [details]
Patch
Comment 17 Yusuke Suzuki 2017-09-21 12:13:58 PDT
Comment on attachment 321460 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=321460&action=review

> Source/JavaScriptCore/ChangeLog:33
> +        We select 25 to make it fit to one of size classes.

I selected 25, which is fit for one of size classes.
Comment 18 Build Bot 2017-09-21 12:15:31 PDT
Attachment 321460 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5118:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 1 in 15 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 19 Yusuke Suzuki 2017-09-21 12:16:24 PDT
Comment on attachment 321460 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=321460&action=review

>> Source/JavaScriptCore/ChangeLog:33
>> +        We select 25 to make it fit to one of size classes.
> 
> I selected 25, which is fit for one of size classes.

Here is JSC Heap MarkedSpace size class dump: 16, 32, 48, 64, 80, 112, 160, 208, 224, 288, 320, 432, 592, 624, 848, 1248, 1792, 2704, 4048, 5408, 8112
Then, vector size can be calculated by (x - 8) / 8 for now.
So, the candidates are

1,3,5,7,9,13,19,25,27,35,39,53,73,77,105,155,223,337,505,675,1013
Comment 20 Saam Barati 2017-09-22 01:10:07 PDT
Comment on attachment 321460 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=321460&action=review

> Source/JavaScriptCore/ChangeLog:8
> +        Currently, NewArrayBuffer allocation is penaltized by JSC: While empty array gets 25 vector size (BASE_CONTIGUOUS_VECTOR_LEN),

penaltized=>penalized

> Source/JavaScriptCore/ChangeLog:31
> +        we should allocate 3 - 25 vector size if it is likely grown. So we should get profile on the resulted array.

what does "3 - 25" mean here? 3 to 25 or 3 minus 25? I think you mean the former, maybe just say "to" instead of "-"

> Source/JavaScriptCore/ChangeLog:36
> +        If the number of new_array_buffer constants is <= 25, array vector size would become 3-25 based on profiling. If the number of its constants

I'd say "to" instead of "-" here as well.
Comment 21 Yusuke Suzuki 2017-09-22 01:15:22 PDT
Comment on attachment 321460 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=321460&action=review

>> Source/JavaScriptCore/ChangeLog:8
>> +        Currently, NewArrayBuffer allocation is penaltized by JSC: While empty array gets 25 vector size (BASE_CONTIGUOUS_VECTOR_LEN),
> 
> penaltized=>penalized

Fixed.

>> Source/JavaScriptCore/ChangeLog:31
>> +        we should allocate 3 - 25 vector size if it is likely grown. So we should get profile on the resulted array.
> 
> what does "3 - 25" mean here? 3 to 25 or 3 minus 25? I think you mean the former, maybe just say "to" instead of "-"

It means 3 to 5. Sounds nice, fixed.

>> Source/JavaScriptCore/ChangeLog:36
>> +        If the number of new_array_buffer constants is <= 25, array vector size would become 3-25 based on profiling. If the number of its constants
> 
> I'd say "to" instead of "-" here as well.

Fixed.
Comment 22 Yusuke Suzuki 2017-09-22 01:22:50 PDT
Committed r222380: <http://trac.webkit.org/changeset/222380>
Comment 23 Yusuke Suzuki 2017-09-22 02:55:20 PDT
(In reply to Yusuke Suzuki from comment #22)
> Committed r222380: <http://trac.webkit.org/changeset/222380>

I'm now investigating whether box2d's regression is real.
https://arewefastyet.com/#machine=29&view=single&suite=octane&subtest=Box2D
If so, I'll explore the way to mitigate it (Maybe shrinking cap size a bit).
Comment 24 Yusuke Suzuki 2017-09-22 03:21:48 PDT
(In reply to Yusuke Suzuki from comment #23)
> (In reply to Yusuke Suzuki from comment #22)
> > Committed r222380: <http://trac.webkit.org/changeset/222380>
> 
> I'm now investigating whether box2d's regression is real.
> https://arewefastyet.com/#machine=29&view=single&suite=octane&subtest=Box2D
> If so, I'll explore the way to mitigate it (Maybe shrinking cap size a bit).

I downloaded r222376 and r222380 JSC from buildbot and ran benchmarks on my MacBookPro.
But I can't see any differences... To ensure the effect of this patch, I'll attempt to revert it once and see the result.
Comment 25 WebKit Commit Bot 2017-09-22 03:22:38 PDT
Re-opened since this is blocked by bug 177352
Comment 26 Yusuke Suzuki 2017-09-22 05:03:25 PDT
(In reply to WebKit Commit Bot from comment #25)
> Re-opened since this is blocked by bug 177352

I've a bit tweaked the patch not to use operationNewArrayWithSizeAndHint if it isn't necessary.
I'll attempt to reland it. If it still causes regression, I guess our ->getVectorLength() is a bit heavy for ArrayAllocationProfile... In this case, I'll introduce a tweak not to check getVectorLength() so frequently.
Comment 27 Yusuke Suzuki 2017-09-22 05:20:00 PDT
Committed r222384: <http://trac.webkit.org/changeset/222384>
Comment 28 Yusuke Suzuki 2017-09-22 05:21:38 PDT
(In reply to Yusuke Suzuki from comment #27)
> Committed r222384: <http://trac.webkit.org/changeset/222384>

I'm watching on arewefastyet now since this Octane/box2d regression cannot be reproduced in my environment (including MacBookPro).
Once the regression happens again, I'll revert it, split the patch into small ones and attempt to get reviews for them.
Comment 29 Yusuke Suzuki 2017-09-22 05:21:54 PDT
(In reply to Yusuke Suzuki from comment #28)
> (In reply to Yusuke Suzuki from comment #27)
> > Committed r222384: <http://trac.webkit.org/changeset/222384>
> 
> I'm watching on arewefastyet now since this Octane/box2d regression cannot
> be reproduced in my environment (including MacBookPro).
> Once the regression happens again, I'll revert it, split the patch into
> small ones and attempt to get reviews for them.

And eventually landing these patches in separated issues.
Comment 30 Yusuke Suzuki 2017-09-22 06:30:17 PDT
Ah, good. The updated patch does not cause regression ;)
Comment 31 Radar WebKit Bug Importer 2017-09-27 12:30:37 PDT
<rdar://problem/34693403>