Bug 158719 - We should be able to generate more types of ICs inline
Summary: We should be able to generate more types of ICs inline
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Local Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Saam Barati
URL:
Keywords: InRadar
: 158888 (view as bug list)
Depends on: 159408
Blocks:
  Show dependency treegraph
 
Reported: 2016-06-13 19:37 PDT by Saam Barati
Modified: 2016-07-05 11:37 PDT (History)
13 users (show)

See Also:


Attachments
WIP (20.65 KB, patch)
2016-06-14 18:00 PDT, Saam Barati
no flags Details | Formatted Diff | Diff
WIP (64.09 KB, patch)
2016-06-15 17:52 PDT, Saam Barati
no flags Details | Formatted Diff | Diff
WIP (62.65 KB, patch)
2016-06-16 11:49 PDT, Saam Barati
no flags Details | Formatted Diff | Diff
WIP (71.31 KB, patch)
2016-06-16 15:51 PDT, Saam Barati
no flags Details | Formatted Diff | Diff
WIP (77.86 KB, patch)
2016-06-16 19:50 PDT, Saam Barati
no flags Details | Formatted Diff | Diff
WIP (75.95 KB, patch)
2016-06-17 00:19 PDT, Saam Barati
no flags Details | Formatted Diff | Diff
patch (84.43 KB, patch)
2016-06-17 15:20 PDT, Saam Barati
no flags Details | Formatted Diff | Diff
patch (91.47 KB, patch)
2016-06-19 11:51 PDT, Saam Barati
fpizlo: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Saam Barati 2016-06-13 19:37:53 PDT
i.e, we should have support for loads off butterfly, array length,
and maybe a few others. We can architect this by using LinkBuffer
to generate code inline. We can give StructureStubInfo a pointer
and a size into where it generates code. (Maybe we can just get away
with a pointer, and the size will be implied by platform).
This will help when we can emit some things inline instead of using PolymorphicAccess.
Comment 1 Saam Barati 2016-06-14 18:00:08 PDT
Created attachment 281305 [details]
WIP

It works for most tests. But there is a lot more work and refactoring to do to make it nice.
Comment 2 Radar WebKit Bug Importer 2016-06-15 16:30:35 PDT
<rdar://problem/26825641>
Comment 3 Saam Barati 2016-06-15 17:52:58 PDT
Created attachment 281418 [details]
WIP

it's getting better
Comment 4 Saam Barati 2016-06-16 11:49:35 PDT
Created attachment 281465 [details]
WIP

Works on arm64 now.
Comment 5 Saam Barati 2016-06-16 15:51:40 PDT
Created attachment 281496 [details]
WIP

more ARM64 stuff. Need to do 32-bit arm and x86
Comment 6 WebKit Commit Bot 2016-06-16 16:54:21 PDT
Attachment 281496 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/bytecode/InlineAccess.cpp:55:  Should have a space between // and comment  [whitespace/comments] [4]
ERROR: Source/JavaScriptCore/bytecode/InlineAccess.cpp:59:  Should have a space between // and comment  [whitespace/comments] [4]
ERROR: Source/JavaScriptCore/jit/Repatch.cpp:157:  An else statement can be removed when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
ERROR: Source/JavaScriptCore/jit/Repatch.cpp:512:  Should have a space between // and comment  [whitespace/comments] [4]
ERROR: Source/JavaScriptCore/assembler/LinkBuffer.h:106:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.cpp:107:  Wrong number of spaces before statement. (expected: 8)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.cpp:108:  Wrong number of spaces before statement. (expected: 8)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.h:81:  The parameter name "size" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.h:105:  The parameter name "value" adds no information, so it should be removed.  [readability/parameter_name] [5]
Total errors found: 9 in 22 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 7 Saam Barati 2016-06-16 19:50:49 PDT
Created attachment 281525 [details]
WIP

appears to work on armv7, arm64, x86-32bit.
Going to run perf and clean things up, then I will put up for review.
Comment 8 WebKit Commit Bot 2016-06-16 19:51:52 PDT
Attachment 281525 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/jsc.cpp:37:  Should have a space between // and comment  [whitespace/comments] [4]
ERROR: Source/JavaScriptCore/bytecode/InlineAccess.cpp:60:  Should have a space between // and comment  [whitespace/comments] [4]
ERROR: Source/JavaScriptCore/bytecode/InlineAccess.cpp:64:  Should have a space between // and comment  [whitespace/comments] [4]
ERROR: Source/JavaScriptCore/jit/Repatch.cpp:157:  An else statement can be removed when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
ERROR: Source/JavaScriptCore/jit/Repatch.cpp:509:  Should have a space between // and comment  [whitespace/comments] [4]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.cpp:107:  Wrong number of spaces before statement. (expected: 8)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.cpp:108:  Wrong number of spaces before statement. (expected: 8)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.h:85:  The parameter name "size" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.h:103:  The parameter name "value" adds no information, so it should be removed.  [readability/parameter_name] [5]
Total errors found: 9 in 26 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 9 Saam Barati 2016-06-17 00:19:51 PDT
Created attachment 281541 [details]
WIP

Close to done.
Comment 10 WebKit Commit Bot 2016-06-17 00:21:02 PDT
Attachment 281541 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/jit/Repatch.cpp:511:  Should have a space between // and comment  [whitespace/comments] [4]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.cpp:107:  Wrong number of spaces before statement. (expected: 8)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.cpp:108:  Wrong number of spaces before statement. (expected: 8)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.h:85:  The parameter name "size" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.h:103:  The parameter name "value" adds no information, so it should be removed.  [readability/parameter_name] [5]
Total errors found: 5 in 24 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 11 Saam Barati 2016-06-17 15:20:47 PDT
Created attachment 281594 [details]
patch
Comment 12 Saam Barati 2016-06-17 15:21:46 PDT
Comment on attachment 281594 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=281594&action=review

> Source/JavaScriptCore/ChangeLog:15
> +        This makes it much simpler to patch inline ICs. All that's
> +        needed to patch an inline IC is to memcpy the code from
> +        a macro assembler inline using LinkBuffer.

Also, I will also write here that I think this architecture will extend naturally into us
doing ICs for other things in the future.
Comment 13 WebKit Commit Bot 2016-06-17 15:22:17 PDT
Attachment 281594 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.cpp:107:  Wrong number of spaces before statement. (expected: 8)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.cpp:108:  Wrong number of spaces before statement. (expected: 8)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.h:85:  The parameter name "size" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.h:103:  The parameter name "value" adds no information, so it should be removed.  [readability/parameter_name] [5]
Total errors found: 4 in 25 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 14 Filip Pizlo 2016-06-17 15:25:58 PDT
Comment on attachment 281594 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=281594&action=review

> Source/JavaScriptCore/bytecode/InlineAccess.cpp:167
> +    CCallHelpers replacementJIT(jit.vm());
> +    auto jumpToSlowPath = replacementJIT.jump();
> +    // We don't need a nop sled because nobody should jump into the middle of an IC.
> +    bool needsBranchCompaction = false;
> +    LinkBuffer linkBuffer(replacementJIT, stubInfo.patch.start.dataLocation(), replacementJIT.m_assembler.buffer().codeSize(), JITCompilationMustSucceed, needsBranchCompaction);
> +    RELEASE_ASSERT(linkBuffer.isValid());
> +    linkBuffer.link(jumpToSlowPath, stubInfo.slowPathStartLocation());
> +
> +    FINALIZE_CODE(linkBuffer, ("InlineAccess linking a jump to slow path because we ran out of room inline"));

What's the limiting principle?

Say that we attempt to emit inline code and fail because of size.  Clearly, what we should do, is use PolymorphicAccess instead.  Can you fix this so that this happens?
Comment 15 Filip Pizlo 2016-06-17 15:29:10 PDT
Comment on attachment 281594 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=281594&action=review

> Source/JavaScriptCore/bytecode/InlineAccess.cpp:136
> +        while (jit.m_assembler.buffer().codeSize() < stubInfo.patch.inlineSize)
> +            jit.nop();

Why are you doing this?  Isn't this the LinkBuffer's responsibility?

Also, this is grossly inefficient.  You should be emitting a sled of large nops.  For example, x86 supports up to like 13 byte nops, so even if you are off by a lot, you can still have just one nop instrjuction.
Comment 16 Filip Pizlo 2016-06-17 15:34:56 PDT
Comment on attachment 281594 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=281594&action=review

> Source/JavaScriptCore/jit/Repatch.cpp:365
> +                InlineAccess::generateSelfPropertyReplace(vm, stubInfo, structure, slot.cachedOffset());

Per discussion on IRC, this method and its friends should return false if you fail to generate so you can fall back on addAccessCase code.
Comment 17 Saam Barati 2016-06-17 17:35:59 PDT
*** Bug 158888 has been marked as a duplicate of this bug. ***
Comment 18 Saam Barati 2016-06-19 11:51:37 PDT
Created attachment 281621 [details]
patch
Comment 19 WebKit Commit Bot 2016-06-19 11:54:47 PDT
Attachment 281621 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/assembler/ARM64Assembler.h:1487:  Should have a space between // and comment  [whitespace/comments] [4]
ERROR: Source/JavaScriptCore/assembler/ARMv7Assembler.h:2033:  Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons.  [readability/comparison_to_zero] [5]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.cpp:108:  Wrong number of spaces before statement. (expected: 8)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.cpp:109:  Wrong number of spaces before statement. (expected: 8)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.h:85:  The parameter name "size" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/jit/JITInlineCacheGenerator.h:103:  The parameter name "value" adds no information, so it should be removed.  [readability/parameter_name] [5]
Total errors found: 6 in 28 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 20 Saam Barati 2016-06-19 11:55:33 PDT
Going to run performance tests now and post numbers.
Comment 21 Filip Pizlo 2016-06-19 12:30:13 PDT
Comment on attachment 281621 [details]
patch

Nice!
Comment 22 Saam Barati 2016-06-19 12:31:22 PDT
Perf numbers:

Benchmark report for SunSpider, V8Spider, Octane, Kraken, and AsmBench on Saams-MacBook-Pro (MacBookPro11,3).

VMs tested:
"og" at /Volumes/Data/WK/b/OpenSource/WebKitBuild/Release/jsc (r202153)
"InlineAccess" at /Volumes/Data/WK/a/OpenSource/WebKitBuild/Release/jsc (r202153)

Collected 15 samples per benchmark/VM, with 15 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to
get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                                    og                   InlineAccess                                   
SunSpider:
   3d-cube                                    4.5174+-0.1883            4.5023+-0.2013        
   3d-morph                                   4.7473+-0.0291            4.7390+-0.0278        
   3d-raytrace                                5.1235+-0.1647            5.1196+-0.2577        
   access-binary-trees                        1.8918+-0.0332     ?      1.9480+-0.1582        ? might be 1.0297x slower
   access-fannkuch                            5.4174+-0.1556     ?      5.4567+-0.3207        ?
   access-nbody                               2.3266+-0.0344            2.3078+-0.0301        
   access-nsieve                              3.1924+-0.1498            3.0143+-0.0571          might be 1.0591x faster
   bitops-3bit-bits-in-byte                   1.0071+-0.0191     ?      1.0128+-0.0187        ?
   bitops-bits-in-byte                        2.4539+-0.0169     ?      2.4689+-0.0273        ?
   bitops-bitwise-and                         1.8488+-0.0852            1.8108+-0.0279          might be 1.0210x faster
   bitops-nsieve-bits                         2.9048+-0.1080     ?      2.9717+-0.1847        ? might be 1.0230x slower
   controlflow-recursive                      2.1209+-0.0092     ?      2.2629+-0.1830        ? might be 1.0669x slower
   crypto-aes                                 4.2479+-0.0339     ?      4.2876+-0.0590        ?
   crypto-md5                                 2.4719+-0.0584     ?      2.5302+-0.1734        ? might be 1.0236x slower
   crypto-sha1                                2.5351+-0.0315     ?      2.6102+-0.1912        ? might be 1.0296x slower
   date-format-tofte                          6.3338+-0.1818            6.2670+-0.2319          might be 1.0107x faster
   date-format-xparb                          4.5485+-0.1386            4.4919+-0.0976          might be 1.0126x faster
   math-cordic                                2.5905+-0.0203     ?      2.6733+-0.1568        ? might be 1.0320x slower
   math-partial-sums                          3.7992+-0.1284            3.7858+-0.1069        
   math-spectral-norm                         1.8097+-0.0158     ?      1.8127+-0.0295        ?
   regexp-dna                                 6.1001+-0.1342     ?      6.1028+-0.1047        ?
   string-base64                              3.8417+-0.2196            3.7851+-0.1688          might be 1.0149x faster
   string-fasta                               5.2342+-0.1242            5.1984+-0.0398        
   string-tagcloud                            7.7245+-0.0723     ?      7.9308+-0.1935        ? might be 1.0267x slower
   string-unpack-code                        17.9480+-0.4844     ?     18.3292+-0.7727        ? might be 1.0212x slower
   string-validate-input                      3.9070+-0.1197            3.8222+-0.0804          might be 1.0222x faster

   <arithmetic>                               4.2555+-0.0320     ?      4.2785+-0.0339        ? might be 1.0054x slower

                                                    og                   InlineAccess                                   
V8Spider:
   crypto                                    33.7267+-0.1982     ?     33.9083+-0.2841        ?
   deltablue                                 46.9386+-0.6550           46.6275+-0.8295        
   earley-boyer                              36.8774+-0.2159           36.7914+-0.1981        
   raytrace                                  19.6028+-0.8618           19.1805+-0.3196          might be 1.0220x faster
   regexp                                    51.0353+-0.9752           49.8716+-0.2763          might be 1.0233x faster
   richards                                  40.0663+-0.3924           39.9014+-0.7316        
   splay                                     27.6875+-0.2232           27.6486+-0.3807        

   <geometric>                               35.0198+-0.2977           34.7580+-0.1803          might be 1.0075x faster

                                                    og                   InlineAccess                                   
Octane:
   encrypt                                   0.15017+-0.00075    ?     0.15145+-0.00128       ?
   decrypt                                   2.63250+-0.03544    ^     2.53343+-0.00820       ^ definitely 1.0391x faster
   deltablue                        x2       0.12257+-0.00102          0.12231+-0.00084       
   earley                                    0.27383+-0.00167          0.27264+-0.00123       
   boyer                                     4.72874+-0.07408          4.71461+-0.06944       
   navier-stokes                    x2       4.67139+-0.01427          4.67029+-0.01919       
   raytrace                         x2       0.75466+-0.00320    ?     0.75631+-0.00305       ?
   richards                         x2       0.07840+-0.00097    ?     0.07876+-0.00070       ?
   splay                            x2       0.32159+-0.00115          0.32041+-0.00491       
   regexp                           x2      14.61364+-0.06893    ?    14.62196+-0.08089       ?
   pdfjs                            x2      36.64102+-0.19913         36.58593+-0.19661       
   mandreel                         x2      39.23289+-0.09777         39.20329+-0.10489       
   gbemu                            x2      29.92250+-1.29925    ^    26.91634+-0.61312       ^ definitely 1.1117x faster
   closure                                   0.46720+-0.00292          0.46482+-0.00203       
   jquery                                    6.28531+-0.02468    ?     6.30203+-0.03186       ?
   box2d                            x2       8.78905+-0.05368          8.70978+-0.03516       
   zlib                             x2     344.12271+-3.79571        341.08419+-2.26744       
   typescript                       x2     600.10041+-3.10070    ^   590.34425+-3.74688       ^ definitely 1.0165x faster

   <geometric>                               4.77255+-0.01595    ^     4.72230+-0.01083       ^ definitely 1.0106x faster

                                                    og                   InlineAccess                                   
Kraken:
   ai-astar                                   82.376+-0.920      ?      82.844+-0.851         ?
   audio-beat-detection                       35.772+-0.218      ?      35.781+-0.186         ?
   audio-dft                                  94.371+-0.998      ?      94.427+-1.074         ?
   audio-fft                                  28.351+-0.056      ?      28.365+-0.031         ?
   audio-oscillator                           46.137+-0.055      ?      46.208+-0.131         ?
   imaging-darkroom                           57.825+-0.713             57.477+-0.451         
   imaging-desaturate                         42.150+-0.683             41.907+-0.253         
   imaging-gaussian-blur                      57.918+-1.031             57.505+-0.671         
   json-parse-financial                       32.007+-0.492      ?      32.180+-0.484         ?
   json-stringify-tinderbox                   21.316+-0.479      !      22.790+-0.822         ! definitely 1.0692x slower
   stanford-crypto-aes                        35.715+-0.310      ?      35.985+-0.305         ?
   stanford-crypto-ccm                        29.500+-0.740      ?      29.692+-0.716         ?
   stanford-crypto-pbkdf2                     87.890+-0.687             87.762+-0.528         
   stanford-crypto-sha256-iterative           28.654+-0.218      ?      29.358+-1.258         ? might be 1.0246x slower

   <arithmetic>                               48.570+-0.137      ?      48.734+-0.128         ? might be 1.0034x slower

                                                    og                   InlineAccess                                   
AsmBench:
   bigfib.cpp                               417.7846+-3.4057          416.5540+-1.5975        
   cray.c                                   349.0533+-1.8112          346.6231+-1.0857        
   dry.c                                    397.3668+-2.7515     ?    406.6632+-9.9195        ? might be 1.0234x slower
   FloatMM.c                                686.2389+-4.0255          683.7029+-1.7082        
   gcc-loops.cpp                           3483.5148+-11.5750    ?   3484.4261+-22.3973       ?
   n-body.c                                 766.8353+-3.2524          765.1171+-2.7197        
   Quicksort.c                              375.1290+-3.0965     ?    375.4340+-3.4105        ?
   stepanov_container.cpp                  3137.8769+-17.3410    ?   3148.3259+-19.7186       ?
   Towers.c                                 258.8826+-1.2048     ?    259.8901+-2.3399        ?

   <geometric>                              683.0330+-1.7220     ?    684.1502+-1.5998        ? might be 1.0016x slower

                                                    og                   InlineAccess                                   
Geomean of preferred means:
   <scaled-result>                           29.8228+-0.0695           29.7773+-0.0518          might be 1.0015x faster
Comment 23 Saam Barati 2016-06-19 12:32:11 PDT
Also, a --outer 100 run of sun spider.
Still noisy though:

Benchmark report for SunSpider on Saams-MacBook-Pro (MacBookPro11,3).

VMs tested:
"og" at /Volumes/Data/WK/b/OpenSource/WebKitBuild/Release/jsc (r202153)
"InlineAccess" at /Volumes/Data/WK/a/OpenSource/WebKitBuild/Release/jsc (r202153)

Collected 100 samples per benchmark/VM, with 100 VM invocations per benchmark. Emitted a call to gc()
between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the
jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times
with 95% confidence intervals in milliseconds.

                                      og                   InlineAccess                                   

3d-cube                         4.5240+-0.0771            4.5005+-0.0725        
3d-morph                        4.8942+-0.0956     ?      4.9602+-0.1118        ? might be 1.0135x slower
3d-raytrace                     5.2819+-0.1646            5.1774+-0.1597          might be 1.0202x faster
access-binary-trees             2.0175+-0.0731            1.9259+-0.0502          might be 1.0476x faster
access-fannkuch                 5.5262+-0.0779     ?      5.6551+-0.1952        ? might be 1.0233x slower
access-nbody                    2.4586+-0.0777            2.4122+-0.0652          might be 1.0193x faster
access-nsieve                   3.1376+-0.0750     ?      3.1830+-0.0668        ? might be 1.0145x slower
bitops-3bit-bits-in-byte        1.0576+-0.0363            1.0293+-0.0263          might be 1.0275x faster
bitops-bits-in-byte             2.5642+-0.0490            2.5404+-0.0551        
bitops-bitwise-and              1.9002+-0.0340            1.8724+-0.0324          might be 1.0149x faster
bitops-nsieve-bits              2.9335+-0.0561     ?      2.9725+-0.0775        ? might be 1.0133x slower
controlflow-recursive           2.2924+-0.0968            2.2147+-0.0483          might be 1.0351x faster
crypto-aes                      4.4596+-0.0996            4.3827+-0.0700          might be 1.0176x faster
crypto-md5                      2.4925+-0.0474     ?      2.4952+-0.0574        ?
crypto-sha1                     2.6368+-0.0876            2.5865+-0.0589          might be 1.0194x faster
date-format-tofte               6.2944+-0.0908            6.2588+-0.1297        
date-format-xparb               4.5968+-0.0932     ?      4.6150+-0.0850        ?
math-cordic                     2.7359+-0.0768            2.6915+-0.0702          might be 1.0165x faster
math-partial-sums               3.9326+-0.0719            3.8637+-0.0553          might be 1.0178x faster
math-spectral-norm              1.8820+-0.0442     ?      1.9144+-0.1330        ? might be 1.0172x slower
regexp-dna                      6.3474+-0.1052            6.2623+-0.0896          might be 1.0136x faster
string-base64                   3.9416+-0.1063            3.9026+-0.1098          might be 1.0100x faster
string-fasta                    5.2489+-0.0564     ?      5.4911+-0.2166        ? might be 1.0461x slower
string-tagcloud                 8.1685+-0.2024            8.1424+-0.1730        
string-unpack-code             17.5705+-0.2154     ?     17.8504+-0.3205        ? might be 1.0159x slower
string-validate-input           3.9607+-0.1161     ?      4.0558+-0.1246        ? might be 1.0240x slower

<arithmetic>                    4.3406+-0.0189     ?      4.3445+-0.0244        ? might be 1.0009x slower
Comment 24 Saam Barati 2016-06-19 12:42:47 PDT
landed in:
http://trac.webkit.org/changeset/202214
Comment 25 Csaba Osztrogonác 2016-07-04 05:14:17 PDT
Comment on attachment 281621 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=281621&action=review

> Source/JavaScriptCore/assembler/ARM64Assembler.h:1488
> +    //template <bool isCopyingToExecutableMemory>
> +    static void fillNops(void* base, size_t size, bool isCopyingToExecutableMemory)

Why did you add the new isCopyingToExecutableMemory parameter?
It is always called with constant false. Is it for future use?
Comment 26 Csaba Osztrogonác 2016-07-04 06:42:49 PDT
Comment on attachment 281594 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=281594&action=review

> Source/JavaScriptCore/bytecode/InlineAccess.h:58
> +#if CPU(ARM_THUMB2)
> +        return 48;
> +#else
> +        return 50;
> +#endif

I don't think if 50 can be good on ARM traditional where all instructions are 4 bytes sized.

> Source/JavaScriptCore/bytecode/InlineAccess.h:78
> +#if CPU(ARM_THUMB2)
> +        return 48;
> +#else
> +        return 50;
> +#endif

ditto

> Source/JavaScriptCore/bytecode/InlineAccess.h:97
> +#if CPU(ARM_THUMB2)
> +        return 30;
> +#else
> +        return 50;
> +#endif

ditto

> Source/JavaScriptCore/bytecode/InlineAccess.h:114
> +    // This is helpful when determining the size of an IC on
> +    // various platforms. When adding a new type of IC, implement
> +    // its placeholder code here, and log the size. That way we
> +    // can intelligently choose sizes on various platforms.
> +    NO_RETURN_DUE_TO_CRASH void dumpCacheSizesAndCrash(VM&);

We can determine the above numbers with this function, don't we?
Where and how should I call it?
Comment 27 Saam Barati 2016-07-04 20:19:49 PDT
Comment on attachment 281594 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=281594&action=review

>> Source/JavaScriptCore/bytecode/InlineAccess.h:114
>> +    NO_RETURN_DUE_TO_CRASH void dumpCacheSizesAndCrash(VM&);
> 
> We can determine the above numbers with this function, don't we?
> Where and how should I call it?

This function should be static. I made a mistake by not making it static. You can call it from anywhere that executes.
Comment 28 Csaba Osztrogonác 2016-07-05 01:58:17 PDT
(In reply to comment #27)
> Comment on attachment 281594 [details]
> patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=281594&action=review
> 
> >> Source/JavaScriptCore/bytecode/InlineAccess.h:114
> >> +    NO_RETURN_DUE_TO_CRASH void dumpCacheSizesAndCrash(VM&);
> > 
> > We can determine the above numbers with this function, don't we?
> > Where and how should I call it?
> 
> This function should be static. I made a mistake by not making it static.
> You can call it from anywhere that executes.

I managed to get these numbers from dumpCacheSizesAndCrash():
array length size: 32
out of line offset cache size: 52
inline offset cache size: 48
replace cache size: 48
replace out of line cache size: 44

But we have these functions with magic numbers:
- sizeForPropertyAccess()
- sizeForPropertyReplace()
- sizeForLengthAccess()

Which belongs to which?
- sizeForPropertyAccess() == array length size ?
- sizeForPropertyReplace() == max(replace cache size,replace out of line cache size) ?
- sizeForLengthAccess() == max(out of line offset cache size, inline offset cache size) ?
Comment 29 Saam Barati 2016-07-05 11:35:58 PDT
(In reply to comment #28)
> (In reply to comment #27)
> > Comment on attachment 281594 [details]
> > patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=281594&action=review
> > 
> > >> Source/JavaScriptCore/bytecode/InlineAccess.h:114
> > >> +    NO_RETURN_DUE_TO_CRASH void dumpCacheSizesAndCrash(VM&);
> > > 
> > > We can determine the above numbers with this function, don't we?
> > > Where and how should I call it?
> > 
> > This function should be static. I made a mistake by not making it static.
> > You can call it from anywhere that executes.
> 
> I managed to get these numbers from dumpCacheSizesAndCrash():
> array length size: 32
> out of line offset cache size: 52
> inline offset cache size: 48
> replace cache size: 48
> replace out of line cache size: 44
> 
> But we have these functions with magic numbers:
> - sizeForPropertyAccess()
> - sizeForPropertyReplace()
> - sizeForLengthAccess()
> 
> Which belongs to which?
> - sizeForPropertyAccess() == array length size ?
> - sizeForPropertyReplace() == max(replace cache size,replace out of line
> cache size) ?
> - sizeForLengthAccess() == max(out of line offset cache size, inline offset
> cache size) ?

- sizeForPropertyAccess() == max(inline access, out of line access). 
- sizeForPropertyReplace() == what you wrote
- sizeForLengthAccess() == max(array length size, inline access, out of line access). After looking at the code I committed again, I didn't express this properly in the function. I'll open a bug to do this. The mistake I made here is that some "length" accesses could be on non-arrays.
Comment 30 Saam Barati 2016-07-05 11:37:25 PDT
opened a bug:
https://bugs.webkit.org/show_bug.cgi?id=159429