[DFG][FTL] Profile array vector length for array allocation
Created attachment 321041 [details] Patch
Attachment 321041 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5115: Weird number of spaces at line-start. Are you using a 4-space indent? [whitespace/indent] [3] Total errors found: 1 in 15 files If any of these errors are false positives, please file a bug against check-webkit-style.
Performance is neutral. baseline patched Octane: encrypt 0.20807+-0.00303 0.20600+-0.00282 might be 1.0101x faster decrypt 3.14098+-0.00780 ? 3.16607+-0.04021 ? deltablue x2 0.18569+-0.00395 0.18560+-0.00264 earley 0.38538+-0.01024 0.38403+-0.00911 boyer 6.82270+-0.05162 ? 6.97224+-0.11820 ? might be 1.0219x slower navier-stokes x2 5.21573+-0.04177 ? 5.21574+-0.03458 ? raytrace x2 1.07764+-0.01811 1.07584+-0.00998 richards x2 0.11382+-0.00141 ? 0.11548+-0.00446 ? might be 1.0145x slower splay x2 0.42082+-0.00307 ? 0.42293+-0.00598 ? regexp x2 24.01438+-0.46084 ? 24.80293+-0.56880 ? might be 1.0328x slower pdfjs x2 55.17571+-1.59673 53.15148+-0.98136 might be 1.0381x faster mandreel x2 58.92982+-0.67668 ? 59.29992+-0.77605 ? gbemu x2 47.53171+-3.03032 ? 48.04866+-2.64296 ? might be 1.0109x slower closure 0.76336+-0.03479 0.75708+-0.02371 jquery 9.64067+-0.37570 ? 9.85404+-0.33870 ? might be 1.0221x slower box2d x2 12.58995+-0.36595 ? 12.96163+-0.46848 ? might be 1.0295x slower zlib x2 414.28073+-5.94692 ? 415.01335+-7.49360 ? typescript x2 840.64915+-20.59660 838.23090+-21.60916 <geometric> 6.75872+-0.03968 ? 6.79147+-0.04017 ? might be 1.0048x slower baseline patched Kraken: ai-astar 151.339+-5.551 ? 152.634+-5.059 ? audio-beat-detection 54.184+-2.878 53.456+-2.852 might be 1.0136x faster audio-dft 101.937+-3.057 101.080+-3.787 audio-fft 39.901+-1.877 38.888+-1.327 might be 1.0261x faster audio-oscillator 67.136+-3.382 ? 68.898+-2.347 ? might be 1.0263x slower imaging-darkroom 113.707+-3.292 ? 114.571+-3.951 ? imaging-desaturate 72.823+-2.169 70.250+-2.169 might be 1.0366x faster imaging-gaussian-blur 105.957+-5.224 ? 110.828+-6.102 ? might be 1.0460x slower json-parse-financial 62.342+-1.823 62.091+-2.311 json-stringify-tinderbox 39.915+-1.171 38.649+-1.797 might be 1.0328x faster stanford-crypto-aes 56.167+-1.860 ? 59.665+-2.724 ? might be 1.0623x slower stanford-crypto-ccm 50.625+-2.726 50.208+-2.337 stanford-crypto-pbkdf2 91.671+-2.351 ? 94.705+-3.186 ? might be 1.0331x slower stanford-crypto-sha256-iterative 31.975+-1.351 31.511+-2.003 might be 1.0147x faster <arithmetic> 74.263+-0.837 ? 74.817+-0.844 ? might be 1.0075x slower baseline patched Geomean of preferred means: <scaled-result> 22.4006+-0.1453 ? 22.5385+-0.1485 ? might be 1.0062x slower
Comment on attachment 321041 [details] Patch Attachment 321041 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/4575595 Number of test failures exceeded the failure limit.
Created attachment 321045 [details] Archive of layout-test-results from ews116 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews116 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Created attachment 321047 [details] Patch
Attachment 321047 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5115: Weird number of spaces at line-start. Are you using a 4-space indent? [whitespace/indent] [3] Total errors found: 1 in 15 files If any of these errors are false positives, please file a bug against check-webkit-style.
Ping?
Comment on attachment 321047 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=321047&action=review I'm a bit worried this is too tuned to microbenchmarks that don't matter. Comments below > Source/JavaScriptCore/bytecode/ArrayAllocationProfile.cpp:54 > + m_seenLargestVectorLength = std::min(std::max(m_seenLargestVectorLength, lastArray->getVectorLength()), BASE_CONTIGUOUS_VECTOR_LEN_MAX); So this number is capped at 5? This seems a bit of a hack to just help a microbenchmark. I agree we shouldn't let this number grow unboundedly, but perhaps we can make it larger than 5? 5 seems too small. WDYT? > Source/JavaScriptCore/bytecode/ArrayAllocationProfile.h:77 > + unsigned m_seenLargestVectorLength { 0 }; let's name this: m_largestSeenVectorLength
Comment on attachment 321047 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=321047&action=review Thanks! I've noted the rationale. >> Source/JavaScriptCore/bytecode/ArrayAllocationProfile.cpp:54 >> + m_seenLargestVectorLength = std::min(std::max(m_seenLargestVectorLength, lastArray->getVectorLength()), BASE_CONTIGUOUS_VECTOR_LEN_MAX); > > So this number is capped at 5? This seems a bit of a hack to just help a microbenchmark. I agree we shouldn't let this number grow unboundedly, but perhaps we can make it larger than 5? 5 seems too small. WDYT? This number "5" is just selected from the initial size of empty array. The goal of this patch is, "While keeping new_array_buffer's initial vector size small for major cases, we would like to offer a bit large initial vector size (same to empty array) if it seems necessary". If we have the same initial vector size for empty array and new_array_buffer, I think it's OK. But the reality is that the initial vector size for new_array_buffer is smaller than the empty array's size (3 v.s. 5) We do not want to extend new_array_buffer's initial vector size to 5 since it may waste memory. But we do not want to penaltize the initial vector size of new_array_buffer if it is likely that new_array_buffer's array requires more vector size. That's why I start using profiling. >> Source/JavaScriptCore/bytecode/ArrayAllocationProfile.h:77 >> + unsigned m_seenLargestVectorLength { 0 }; > > let's name this: m_largestSeenVectorLength Fixed!
Comment on attachment 321047 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=321047&action=review >>> Source/JavaScriptCore/bytecode/ArrayAllocationProfile.cpp:54 >>> + m_seenLargestVectorLength = std::min(std::max(m_seenLargestVectorLength, lastArray->getVectorLength()), BASE_CONTIGUOUS_VECTOR_LEN_MAX); >> >> So this number is capped at 5? This seems a bit of a hack to just help a microbenchmark. I agree we shouldn't let this number grow unboundedly, but perhaps we can make it larger than 5? 5 seems too small. WDYT? > > This number "5" is just selected from the initial size of empty array. > The goal of this patch is, "While keeping new_array_buffer's initial vector size small for major cases, we would like to offer a bit large initial vector size (same to empty array) if it seems necessary". > > > If we have the same initial vector size for empty array and new_array_buffer, I think it's OK. > But the reality is that the initial vector size for new_array_buffer is smaller than the empty array's size (3 v.s. 5) > We do not want to extend new_array_buffer's initial vector size to 5 since it may waste memory. > But we do not want to penaltize the initial vector size of new_array_buffer if it is likely that new_array_buffer's array requires more vector size. > > That's why I start using profiling. BTW, I'm OK to extend this value to a bit more larger numbers, like 16? The current "5" number is just picked from empty array.
Created attachment 321454 [details] Patch
Created attachment 321455 [details] Patch
Attachment 321455 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5118: Weird number of spaces at line-start. Are you using a 4-space indent? [whitespace/indent] [3] Total errors found: 1 in 15 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 321047 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=321047&action=review >>>> Source/JavaScriptCore/bytecode/ArrayAllocationProfile.cpp:54 >>>> + m_seenLargestVectorLength = std::min(std::max(m_seenLargestVectorLength, lastArray->getVectorLength()), BASE_CONTIGUOUS_VECTOR_LEN_MAX); >>> >>> So this number is capped at 5? This seems a bit of a hack to just help a microbenchmark. I agree we shouldn't let this number grow unboundedly, but perhaps we can make it larger than 5? 5 seems too small. WDYT? >> >> This number "5" is just selected from the initial size of empty array. >> The goal of this patch is, "While keeping new_array_buffer's initial vector size small for major cases, we would like to offer a bit large initial vector size (same to empty array) if it seems necessary". >> >> >> If we have the same initial vector size for empty array and new_array_buffer, I think it's OK. >> But the reality is that the initial vector size for new_array_buffer is smaller than the empty array's size (3 v.s. 5) >> We do not want to extend new_array_buffer's initial vector size to 5 since it may waste memory. >> But we do not want to penaltize the initial vector size of new_array_buffer if it is likely that new_array_buffer's array requires more vector size. >> >> That's why I start using profiling. > > BTW, I'm OK to extend this value to a bit more larger numbers, like 16? The current "5" number is just picked from empty array. I'll extend it to a bit larger value, it should be fit for heap's size class.
Created attachment 321460 [details] Patch
Comment on attachment 321460 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=321460&action=review > Source/JavaScriptCore/ChangeLog:33 > + We select 25 to make it fit to one of size classes. I selected 25, which is fit for one of size classes.
Attachment 321460 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5118: Weird number of spaces at line-start. Are you using a 4-space indent? [whitespace/indent] [3] Total errors found: 1 in 15 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 321460 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=321460&action=review >> Source/JavaScriptCore/ChangeLog:33 >> + We select 25 to make it fit to one of size classes. > > I selected 25, which is fit for one of size classes. Here is JSC Heap MarkedSpace size class dump: 16, 32, 48, 64, 80, 112, 160, 208, 224, 288, 320, 432, 592, 624, 848, 1248, 1792, 2704, 4048, 5408, 8112 Then, vector size can be calculated by (x - 8) / 8 for now. So, the candidates are 1,3,5,7,9,13,19,25,27,35,39,53,73,77,105,155,223,337,505,675,1013
Comment on attachment 321460 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=321460&action=review > Source/JavaScriptCore/ChangeLog:8 > + Currently, NewArrayBuffer allocation is penaltized by JSC: While empty array gets 25 vector size (BASE_CONTIGUOUS_VECTOR_LEN), penaltized=>penalized > Source/JavaScriptCore/ChangeLog:31 > + we should allocate 3 - 25 vector size if it is likely grown. So we should get profile on the resulted array. what does "3 - 25" mean here? 3 to 25 or 3 minus 25? I think you mean the former, maybe just say "to" instead of "-" > Source/JavaScriptCore/ChangeLog:36 > + If the number of new_array_buffer constants is <= 25, array vector size would become 3-25 based on profiling. If the number of its constants I'd say "to" instead of "-" here as well.
Comment on attachment 321460 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=321460&action=review >> Source/JavaScriptCore/ChangeLog:8 >> + Currently, NewArrayBuffer allocation is penaltized by JSC: While empty array gets 25 vector size (BASE_CONTIGUOUS_VECTOR_LEN), > > penaltized=>penalized Fixed. >> Source/JavaScriptCore/ChangeLog:31 >> + we should allocate 3 - 25 vector size if it is likely grown. So we should get profile on the resulted array. > > what does "3 - 25" mean here? 3 to 25 or 3 minus 25? I think you mean the former, maybe just say "to" instead of "-" It means 3 to 5. Sounds nice, fixed. >> Source/JavaScriptCore/ChangeLog:36 >> + If the number of new_array_buffer constants is <= 25, array vector size would become 3-25 based on profiling. If the number of its constants > > I'd say "to" instead of "-" here as well. Fixed.
Committed r222380: <http://trac.webkit.org/changeset/222380>
(In reply to Yusuke Suzuki from comment #22) > Committed r222380: <http://trac.webkit.org/changeset/222380> I'm now investigating whether box2d's regression is real. https://arewefastyet.com/#machine=29&view=single&suite=octane&subtest=Box2D If so, I'll explore the way to mitigate it (Maybe shrinking cap size a bit).
(In reply to Yusuke Suzuki from comment #23) > (In reply to Yusuke Suzuki from comment #22) > > Committed r222380: <http://trac.webkit.org/changeset/222380> > > I'm now investigating whether box2d's regression is real. > https://arewefastyet.com/#machine=29&view=single&suite=octane&subtest=Box2D > If so, I'll explore the way to mitigate it (Maybe shrinking cap size a bit). I downloaded r222376 and r222380 JSC from buildbot and ran benchmarks on my MacBookPro. But I can't see any differences... To ensure the effect of this patch, I'll attempt to revert it once and see the result.
Re-opened since this is blocked by bug 177352
(In reply to WebKit Commit Bot from comment #25) > Re-opened since this is blocked by bug 177352 I've a bit tweaked the patch not to use operationNewArrayWithSizeAndHint if it isn't necessary. I'll attempt to reland it. If it still causes regression, I guess our ->getVectorLength() is a bit heavy for ArrayAllocationProfile... In this case, I'll introduce a tweak not to check getVectorLength() so frequently.
Committed r222384: <http://trac.webkit.org/changeset/222384>
(In reply to Yusuke Suzuki from comment #27) > Committed r222384: <http://trac.webkit.org/changeset/222384> I'm watching on arewefastyet now since this Octane/box2d regression cannot be reproduced in my environment (including MacBookPro). Once the regression happens again, I'll revert it, split the patch into small ones and attempt to get reviews for them.
(In reply to Yusuke Suzuki from comment #28) > (In reply to Yusuke Suzuki from comment #27) > > Committed r222384: <http://trac.webkit.org/changeset/222384> > > I'm watching on arewefastyet now since this Octane/box2d regression cannot > be reproduced in my environment (including MacBookPro). > Once the regression happens again, I'll revert it, split the patch into > small ones and attempt to get reviews for them. And eventually landing these patches in separated issues.
Ah, good. The updated patch does not cause regression ;)
<rdar://problem/34693403>