RESOLVED FIXED 155559
REGRESSION(r192914): 10% regression on Sunspider's date-format-tofte
https://bugs.webkit.org/show_bug.cgi?id=155559
Summary REGRESSION(r192914): 10% regression on Sunspider's date-format-tofte
Ryosuke Niwa
Reported 2016-03-16 14:09:24 PDT
According to our internal dashboard, r192914 caused 10% regression on Sunspider's date-format-torte.
Attachments
Patch (8.97 KB, patch)
2016-03-26 10:27 PDT, Yusuke Suzuki
no flags
Archive of layout-test-results from ews117 for mac-yosemite (1.08 MB, application/zip)
2016-03-26 11:52 PDT, Build Bot
no flags
Patch (16.76 KB, patch)
2016-03-28 13:13 PDT, Yusuke Suzuki
no flags
Radar WebKit Bug Importer
Comment 1 2016-03-16 14:11:39 PDT
Yusuke Suzuki
Comment 2 2016-03-17 12:20:38 PDT
The most likely cause is that eval code cache. The other part in this patch is not related to the code not using generators. Investigating...
Yusuke Suzuki
Comment 3 2016-03-21 06:23:38 PDT
Still investigating. I found the cause of 8% regression. I'm now looking for the cause of remaining 2% contribution.
Yusuke Suzuki
Comment 4 2016-03-26 10:27:50 PDT
Yusuke Suzuki
Comment 5 2016-03-26 10:28:36 PDT
Performance regression is resolved. Generating benchmark report at /Users/yusukesuzuki/dev/WebKit/baseline_regression_SunSpiderLongSpider_dandelion_20160327_0221_report.txt And raw data at /Users/yusukesuzuki/dev/WebKit/baseline_regression_SunSpiderLongSpider_dandelion_20160327_0221.json Benchmark report for SunSpider and LongSpider on dandelion (MacBookPro10,1). VMs tested: "baseline" at /Users/yusukesuzuki/dev/WebKit/WebKitBuild/reg-master/Release/jsc "regression" at /Users/yusukesuzuki/dev/WebKit/WebKitBuild/reg/Release/jsc Collected 10 samples per benchmark/VM, with 10 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. baseline regression SunSpider: date-format-tofte 11.3569+-0.4065 10.5548+-0.5210 might be 1.0760x faster <arithmetic> 11.3569+-0.4065 10.5548+-0.5210 might be 1.0760x faster baseline regression LongSpider: date-format-tofte 834.4381+-19.9839 ^ 742.8171+-5.0476 ^ definitely 1.1233x faster <geometric> 834.4381+-19.9839 ^ 742.8171+-5.0476 ^ definitely 1.1233x faster baseline regression Geomean of preferred means: <scaled-result> 97.3169+-2.3728 ^ 88.4954+-2.2533 ^ definitely 1.0997x faster
Saam Barati
Comment 6 2016-03-26 11:40:39 PDT
Comment on attachment 274986 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=274986&action=review r=me > Source/JavaScriptCore/bytecode/EvalCodeCache.h:67 > + ASSERT_WITH_MESSAGE(!isArrowFunctionContext, "Arrow function is always evaluated as the strict code."); Interesting. I didn't know this was true! > Source/JavaScriptCore/bytecode/EvalCodeCache.h:68 > + ASSERT_WITH_MESSAGE(derivedContextType == DerivedContextType::None, "derivedContextType is always None because class methods and class constructors are always evaluated as the strict code."); Ditto
Saam Barati
Comment 7 2016-03-26 11:42:02 PDT
I guess r=me with the debug tests fix.
Build Bot
Comment 8 2016-03-26 11:52:39 PDT
Comment on attachment 274986 [details] Patch Attachment 274986 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/1043397 New failing tests: inspector/script-profiler/event-type-Other.html js/arrowfunction-lexical-bind-this.html js/arrowfunction-lexical-bind-arguments-non-strict.html inspector/unit-tests/protocol-test-dispatch-event-to-frontend.html
Build Bot
Comment 9 2016-03-26 11:52:43 PDT
Created attachment 274987 [details] Archive of layout-test-results from ews117 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews117 Port: mac-yosemite Platform: Mac OS X 10.10.5
Yusuke Suzuki
Comment 10 2016-03-26 20:17:00 PDT
Comment on attachment 274986 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=274986&action=review Thanks! > Source/JavaScriptCore/bytecode/EvalCodeCache.h:66 > + ASSERT_WITH_MESSAGE(thisTDZMode == ThisTDZMode::CheckIfNeeded, "Always CheckIfNeeded because the caching is enabled only in the sloppy mode."); It's true. "All parts of a ClassDeclaration or a ClassExpression are strict mode code." >> Source/JavaScriptCore/bytecode/EvalCodeCache.h:67 >> + ASSERT_WITH_MESSAGE(!isArrowFunctionContext, "Arrow function is always evaluated as the strict code."); > > Interesting. I didn't know this was true! Oops, IIRC, this was true in the draft at the some time, but reading the https://tc39.github.io/ecma262/#sec-strict-mode-code, it's not true now. I'll fix this by adding isArrowFunctionContext into Cache's key. And add tests for that. >> Source/JavaScriptCore/bytecode/EvalCodeCache.h:68 >> + ASSERT_WITH_MESSAGE(derivedContextType == DerivedContextType::None, "derivedContextType is always None because class methods and class constructors are always evaluated as the strict code."); > > Ditto But it's true. "All parts of a ClassDeclaration or a ClassExpression are strict mode code."
Yusuke Suzuki
Comment 11 2016-03-28 11:18:58 PDT
I created the patch that simply adds `bool isArrowFunctionContext` to HashMap's key. But the evaluation shows that the new one is 1% slower than the one of the currently submitted patch. After investigating the actual code by objdump, I found that HashMap has inlineGet for the smart pointer key, and it makes the HashMap::get() path inlining. I'm now attempt to add HashMap::fastGet() path. And after that, I'll re-upload the new patch for the review :)
Yusuke Suzuki
Comment 12 2016-03-28 13:13:09 PDT
Yusuke Suzuki
Comment 13 2016-03-28 13:15:54 PDT
Comment on attachment 275041 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=275041&action=review Uploaded the revised version. I'll paste the performance result soon. > Source/JavaScriptCore/bytecode/EvalCodeCache.h:66 > + bool operator==(const CacheKey& other) const This operator== is the functionally same to the RefPtr<StringImpl>'s ==. This operator is used when comparing the given CacheKey with empty value (zero filled CacheKey). > Source/JavaScriptCore/bytecode/EvalCodeCache.h:80 > + return StringHash::equal(lhs.m_source, rhs.m_source) && lhs.m_isArrowFunctionContext == rhs.m_isArrowFunctionContext; This equal is used for hash table's key's equality. > Source/JavaScriptCore/bytecode/EvalCodeCache.h:96 > + return m_cacheMap.fastGet(CacheKey(evalSource, isArrowFunctionContext)).get(); This `fastGet` is important for the performance.
Yusuke Suzuki
Comment 14 2016-03-28 13:20:26 PDT
30 times SunSpider. baseline regression 3d-cube 6.3334+-0.1511 6.3289+-0.1663 3d-morph 6.0476+-0.1262 6.0295+-0.1094 3d-raytrace 6.9939+-0.1817 ? 7.0416+-0.1855 ? access-binary-trees 2.5875+-0.0825 2.5458+-0.0833 might be 1.0164x faster access-fannkuch 6.8250+-0.1393 6.7827+-0.1649 access-nbody 3.0899+-0.0914 ? 3.1090+-0.0820 ? access-nsieve 3.5793+-0.1431 3.5720+-0.1208 bitops-3bit-bits-in-byte 1.4149+-0.0922 ? 1.4710+-0.1276 ? might be 1.0396x slower bitops-bits-in-byte 3.3190+-0.0821 ? 3.3645+-0.1125 ? might be 1.0137x slower bitops-bitwise-and 2.3692+-0.1015 2.3628+-0.1167 bitops-nsieve-bits 3.5379+-0.0863 3.4987+-0.0943 might be 1.0112x faster controlflow-recursive 2.7097+-0.0638 ? 2.7353+-0.0732 ? crypto-aes 5.0110+-0.1104 ? 5.0353+-0.1496 ? crypto-md5 2.9867+-0.0703 2.8915+-0.0702 might be 1.0329x faster crypto-sha1 2.9029+-0.0870 2.8579+-0.1041 might be 1.0158x faster date-format-tofte 11.1319+-0.3096 ^ 10.2712+-0.2418 ^ definitely 1.0838x faster date-format-xparb 6.1334+-0.2153 6.0941+-0.2674 math-cordic 3.2388+-0.0857 3.2244+-0.1031 math-partial-sums 5.8629+-0.2246 ? 5.8698+-0.1488 ? math-spectral-norm 2.4783+-0.1544 2.3772+-0.0914 might be 1.0425x faster regexp-dna 7.8111+-0.2990 7.7376+-0.2227 string-base64 5.8151+-0.1904 ? 5.8389+-0.1791 ? string-fasta 7.0290+-0.2698 ? 7.0758+-0.2538 ? string-tagcloud 11.0736+-0.4899 10.8812+-0.5226 might be 1.0177x faster string-unpack-code 20.9010+-0.4824 20.8857+-0.4810 string-validate-input 5.4000+-0.1421 ? 5.4465+-0.2000 ? <arithmetic> 5.6378+-0.0383 5.5896+-0.0395 might be 1.0086x faster
Yusuke Suzuki
Comment 15 2016-03-28 13:40:02 PDT
Comment on attachment 275041 [details] Patch Oops, it seems string-fasta shows regression in LongSpider. I'll validate it.
Yusuke Suzuki
Comment 16 2016-03-28 14:06:21 PDT
Comment on attachment 275041 [details] Patch OK, that was just a noise. Here is LongSpider results. Benchmark report for LongSpider on dandelion (MacBookPro10,1). VMs tested: "baseline" at /Users/yusukesuzuki/dev/WebKit/WebKitBuild/reg-master/Release/jsc "regression" at /Users/yusukesuzuki/dev/WebKit/WebKitBuild/reg2/Release/jsc Collected 4 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. baseline regression 3d-cube 934.5906+-18.0325 915.8828+-14.2535 might be 1.0204x faster 3d-morph 673.6235+-2.4252 669.7593+-1.9489 3d-raytrace 742.6982+-6.1829 ? 759.3533+-18.5827 ? might be 1.0224x slower access-binary-trees 983.5670+-15.0350 ? 1000.9830+-22.8137 ? might be 1.0177x slower access-fannkuch 288.8668+-18.0552 ? 291.1860+-14.4857 ? access-nbody 586.3848+-1.4081 ? 586.4259+-2.0672 ? access-nsieve 418.1097+-15.4589 410.5919+-9.5382 might be 1.0183x faster bitops-3bit-bits-in-byte 33.9247+-1.6988 ? 34.1122+-2.3422 ? bitops-bits-in-byte 127.9968+-1.8864 127.3935+-2.5033 bitops-nsieve-bits 396.8289+-3.0894 ? 397.2324+-8.7586 ? controlflow-recursive 513.6129+-1.1380 ? 516.9904+-3.7959 ? crypto-aes 743.8047+-11.4018 742.8596+-7.7084 crypto-md5 515.8610+-5.1683 ? 517.6034+-3.3569 ? crypto-sha1 696.9678+-7.8152 ? 699.5705+-9.5247 ? date-format-tofte 831.4608+-31.2608 ^ 736.3655+-16.6249 ^ definitely 1.1291x faster date-format-xparb 799.9339+-12.4595 792.9003+-57.0802 hash-map 166.5778+-5.0311 165.1250+-3.0846 math-cordic 519.3921+-7.1291 ? 520.1601+-7.9758 ? math-partial-sums 484.4500+-4.1169 483.7134+-2.6950 math-spectral-norm 559.0594+-9.7762 557.2775+-5.7057 string-base64 465.4588+-8.4789 ? 467.3469+-10.4887 ? string-fasta 399.1048+-8.3841 395.4541+-20.5484 string-tagcloud 192.7913+-0.9711 ? 193.0368+-2.1634 ? <geometric> 432.7297+-2.9257 430.3046+-3.4689 might be 1.0056x faster
WebKit Commit Bot
Comment 17 2016-03-29 03:25:44 PDT
Comment on attachment 275041 [details] Patch Clearing flags on attachment: 275041 Committed r198778: <http://trac.webkit.org/changeset/198778>
WebKit Commit Bot
Comment 18 2016-03-29 03:25:51 PDT
All reviewed patches have been landed. Closing bug.
Yusuke Suzuki
Comment 19 2016-03-29 09:48:08 PDT
Note You need to log in before you can comment on or make changes to this bug.