Bug 65901

Summary: The JSC JIT currently has no facility to profile and report the types of values
Product: WebKit Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: gustavo, webkit.review.bot, xan.lopez
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
the patch
gustavo: commit-queue-
the patch (fix style and build)
barraclough: review-
the patch (fix review)
none
the patch (fix style)
none
the patch (fix more style)
none
value profiling enabled patch none

Description Filip Pizlo 2011-08-09 00:49:17 PDT
Much of the effort of tuning the DFG involves trying to figure out what types to predict.  But we have no easy way of profiling or reporting the types seen by expressions in programs.
Comment 1 Filip Pizlo 2011-08-09 00:58:23 PDT
Created attachment 103334 [details]
the patch
Comment 2 WebKit Review Bot 2011-08-09 00:59:58 PDT
Attachment 103334 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1

Source/JavaScriptCore/wtf/StdLibExtras.h:163:  Should have only a single space after a punctuation in a comment.  [whitespace/comments] [5]
Source/JavaScriptCore/wtf/StdLibExtras.h:177:  An else if statement should be written as an if statement when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
Source/JavaScriptCore/bytecode/CodeBlock.h:270:  Should have only a single space after a punctuation in a comment.  [whitespace/comments] [5]
Total errors found: 3 in 11 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Gustavo Noronha (kov) 2011-08-09 01:05:06 PDT
Comment on attachment 103334 [details]
the patch

Attachment 103334 [details] did not pass gtk-ews (gtk):
Output: http://queues.webkit.org/results/9337173
Comment 4 Filip Pizlo 2011-08-09 01:10:52 PDT
Created attachment 103335 [details]
the patch (fix style and build)
Comment 5 Filip Pizlo 2011-08-09 01:12:31 PDT
Here is sample profiling output when this feature is enabled.  Note that this is most useful if s_dumpsGeneratedCode is set to true.


ValueProfile for 0x7fefc2021600:
   arg = 1: samples = 3, int32 = 64, double = 0, cell = 0
   bc = 105: samples = 3, int32 = 0, double = 64, cell = 0
   bc = 109: samples = 3, int32 = 0, double = 64, cell = 0
   bc = 129: samples = 3, int32 = 0, double = 64, cell = 0
   bc = 133: samples = 3, int32 = 0, double = 64, cell = 0
   bc = 159: samples = 2, int32 = 0, double = 0, cell = 64
   bc = 177: samples = 3, int32 = 0, double = 64, cell = 0
ValueProfile for 0x7fefc202d800:
   arg = 1: samples = 8, int32 = 0, double = 0, cell = 64
   arg = 2: samples = 8, int32 = 0, double = 0, cell = 64
   arg = 3: samples = 8, int32 = 0, double = 0, cell = 64
ValueProfile for 0x7fefc204ca00:
   arg = 1: samples = 8, int32 = 0, double = 0, cell = 64
   arg = 2: samples = 8, int32 = 0, double = 0, cell = 64
   bc = 33: samples = 8, int32 = 0, double = 64, cell = 0
   bc = 35: samples = 8, int32 = 0, double = 64, cell = 0
   bc = 54: samples = 8, int32 = 64, double = 0, cell = 0
   bc = 72: samples = 8, int32 = 64, double = 0, cell = 0
ValueProfile for 0x7fefc2037e00:
   arg = 1: samples = 8, int32 = 0, double = 0, cell = 64
   arg = 2: samples = 8, int32 = 0, double = 0, cell = 64
   bc = 33: samples = 8, int32 = 0, double = 64, cell = 0
   bc = 35: samples = 8, int32 = 0, double = 64, cell = 0
   bc = 54: samples = 8, int32 = 64, double = 0, cell = 0
   bc = 72: samples = 8, int32 = 64, double = 0, cell = 0
ValueProfile for 0x7fefc2042400:
   arg = 1: samples = 8, int32 = 64, double = 0, cell = 0
   arg = 2: samples = 8, int32 = 64, double = 0, cell = 0
ValueProfile for 0x7fefc2011000:
   bc = 22: samples = 3, int32 = 0, double = 64, cell = 0


Here is the performance impact when this is enabled.



[pizlo@minime bencher] ./bencher TipOfTree:/Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc ValueProfiler:/Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc
TipOfTree: /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
ValueProfiler: /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc
376/376
                                           TipOfTree            ValueProfiler    
SunSpider:
   3d-cube                               8.3063+-0.2447   ?     8.5934+-0.1407      ? might be 1.0346x slower
   3d-morph                              7.4778+-0.1030   ?     7.4816+-0.1469      ?
   3d-raytrace                           8.6724+-0.1935   !     9.8602+-0.1691      ! definitely 1.1370x slower
   access-binary-trees                   2.1589+-0.1113   ?     2.3136+-0.0478      ? might be 1.0717x slower
   access-fannkuch                      11.5040+-0.1137   !    12.4823+-0.3213      ! definitely 1.0850x slower
   access-nbody                          6.3985+-0.0758   !     6.9130+-0.1413      ! definitely 1.0804x slower
   access-nsieve                         3.4068+-0.1366   ?     3.5767+-0.0992      ? might be 1.0499x slower
   bitops-3bit-bits-in-byte              2.4047+-0.0435   !     2.6037+-0.0567      ! definitely 1.0827x slower
   bitops-bits-in-byte                   6.4718+-0.1681   ^     5.9623+-0.1465      ^ definitely 1.0855x faster
   bitops-bitwise-and                    3.8309+-0.1916         3.6319+-0.1368        might be 1.0548x faster
   bitops-nsieve-bits                    5.3318+-0.1112   !     5.7348+-0.0813      ! definitely 1.0756x slower
   controlflow-recursive                 2.0706+-0.0608   !     2.4789+-0.0614      ! definitely 1.1972x slower
   crypto-aes                            6.8621+-0.2065   !     7.5240+-0.1857      ! definitely 1.0965x slower
   crypto-md5                            2.9301+-0.0914   !     3.2301+-0.0483      ! definitely 1.1024x slower
   crypto-sha1                           2.4821+-0.0480   !     2.7375+-0.0685      ! definitely 1.1029x slower
   date-format-tofte                    10.8158+-0.2540   !    11.4041+-0.2644      ! definitely 1.0544x slower
   date-format-xparb                     8.5976+-0.1805   ?     8.7141+-0.2467      ? might be 1.0135x slower
   math-cordic                           6.1759+-0.0928   !     6.4378+-0.1011      ! definitely 1.0424x slower
   math-partial-sums                     7.4856+-0.1438         7.4773+-0.1746      
   math-spectral-norm                    3.6331+-0.0976   ?     3.7043+-0.0446      ? might be 1.0196x slower
   regexp-dna                           10.3002+-0.1749   ?    10.3529+-0.1731      ?
   string-base64                         5.3835+-0.1827   ?     5.5308+-0.1230      ? might be 1.0274x slower
   string-fasta                          7.0906+-0.1808   ?     7.2920+-0.1131      ? might be 1.0284x slower
   string-tagcloud                      13.3704+-0.1666   ?    13.6297+-0.2870      ? might be 1.0194x slower
   string-unpack-code                   18.5704+-0.3188        18.4064+-0.3542      
   string-validate-input                 6.5386+-0.2296   ?     6.9250+-0.2225      ? might be 1.0591x slower

   <arithmetic>                          6.8566+-0.0509   !     7.1153+-0.0401      ! definitely 1.0377x slower
   <geometric>                           5.8633+-0.0477   !     6.1364+-0.0315      ! definitely 1.0466x slower
   <harmonic>                            4.9473+-0.0516   !     5.2478+-0.0235      ! definitely 1.0607x slower

                                           TipOfTree            ValueProfiler    
V8:
   crypto                              196.2462+-0.8045   !   207.3415+-3.2399      ! definitely 1.0565x slower
   deltablue                           232.2978+-0.9027   !   290.4061+-2.9151      ! definitely 1.2501x slower
   earley-boyer                        105.3592+-1.3546   !   116.9401+-0.5730      ! definitely 1.1099x slower
   raytrace                             69.5510+-0.3109   !    81.2216+-0.6053      ! definitely 1.1678x slower
   regexp                              109.0020+-0.9485   ^   107.3776+-0.3122      ^ definitely 1.0151x faster
   richards                            228.9386+-1.6412   !   273.4472+-1.4395      ! definitely 1.1944x slower
   splay                               106.8907+-0.9253   !   109.6614+-0.9588      ! definitely 1.0259x slower

   <arithmetic>                        149.7551+-0.2917   !   169.4851+-1.0179      ! definitely 1.1317x slower
   <geometric>                         136.6708+-0.3215   !   151.6047+-0.7487      ! definitely 1.1093x slower
   <harmonic>                          124.6122+-0.3511   !   136.6541+-0.5536      ! definitely 1.0966x slower

                                           TipOfTree            ValueProfiler    
Kraken:
   ai-astar                           1340.1625+-6.2365   !  1397.7431+-8.9674      ! definitely 1.0430x slower
   audio-beat-detection                452.7537+-3.6013   !   484.9311+-2.7704      ! definitely 1.0711x slower
   audio-dft                           429.3198+-10.7121  !   465.6006+-13.4208     ! definitely 1.0845x slower
   audio-fft                           363.0736+-3.1600   !   378.7463+-2.4603      ! definitely 1.0432x slower
   audio-oscillator                    341.5923+-3.0383   !   379.5388+-3.6162      ! definitely 1.1111x slower
   imaging-darkroom                    534.9634+-3.7337   !   551.2264+-1.2353      ! definitely 1.0304x slower
   imaging-desaturate                  540.4312+-2.7447   !   561.6124+-2.8461      ! definitely 1.0392x slower
   imaging-gaussian-blur              1970.0386+-16.3271  !  2255.7230+-6.6906      ! definitely 1.1450x slower
   json-parse-financial                 48.0352+-0.1714   ?    48.4602+-0.2799      ?
   json-stringify-tinderbox             63.4177+-0.8523        62.5026+-0.4275        might be 1.0146x faster
   stanford-crypto-aes                 144.5496+-1.8578   !   158.5244+-1.4645      ! definitely 1.0967x slower
   stanford-crypto-ccm                 110.7869+-0.3804   !   120.0760+-0.5221      ! definitely 1.0838x slower
   stanford-crypto-pbkdf2              392.7411+-2.1725   !   407.1917+-4.3448      ! definitely 1.0368x slower
   stanford-crypto-sha256-iterative    145.6710+-0.8984   !   150.2614+-0.5913      ! definitely 1.0315x slower

   <arithmetic>                        491.2526+-0.9934   !   530.1527+-1.7205      ! definitely 1.0792x slower
   <geometric>                         301.4970+-0.9521   !   318.7165+-1.0444      ! definitely 1.0571x slower
   <harmonic>                          181.4722+-0.8372   !   187.5911+-0.3824      ! definitely 1.0337x slower

                                           TipOfTree            ValueProfiler    
All benchmarks:
   <arithmetic>                        172.4275+-0.2891   !   187.0964+-0.5749      ! definitely 1.0851x slower
   <geometric>                          30.3047+-0.1431   !    32.0880+-0.1061      ! definitely 1.0588x slower
   <harmonic>                            8.7219+-0.0890   !     9.2514+-0.0405      ! definitely 1.0607x slower
Comment 6 Gavin Barraclough 2011-08-17 14:45:20 PDT
Comment on attachment 103335 [details]
the patch (fix style and build)

View in context: https://bugs.webkit.org/attachment.cgi?id=103335&action=review

r- for a bunch of small suggested changes, mostly just that ValueProfile really should be in its own header.

> Source/JavaScriptCore/bytecode/CodeBlock.h:207
> +#if ENABLE(VALUE_PROFILER)

The file CodeBlock is a bit too big already, and this struct is non-trivial, I think we really need to add a new ValueProfile.h header.

> Source/JavaScriptCore/bytecode/CodeBlock.h:682
> +        SegmentedVector<ValueProfile, sizeof(ValueProfile) * 8> m_valueProfiles;

This code is a little weird - I think the second argument to the template should be a count of ValueProfiles, not a size, so isn't this growing n^2 in the size of ValueProfile? - don't you just want 8 here?

> Source/JavaScriptCore/jit/JIT.cpp:377
> +    unsigned numberOfValueProfiles = m_codeBlock->numberOfValueProfiles();

Is this just asserting that no slow cases add any new (FirstProfilingSite) value profiling code? - I think an comment here would be good.

> Source/JavaScriptCore/jit/JIT.h:313
> +#endif

Are value / scratch ever passed? - I didn't see any case of this.
If not, it may make the code simpler to remove these arguments.
Also, this function may trample regT2/regT3 - I think it's worth commenting this on the method, to try to make this more apparent.
Comment 7 Filip Pizlo 2011-08-17 16:40:09 PDT
> > Source/JavaScriptCore/bytecode/CodeBlock.h:207
> > +#if ENABLE(VALUE_PROFILER)
> 
> The file CodeBlock is a bit too big already, and this struct is non-trivial, I think we really need to add a new ValueProfile.h header.

OK, moved to a separate file.

> 
> > Source/JavaScriptCore/bytecode/CodeBlock.h:682
> > +        SegmentedVector<ValueProfile, sizeof(ValueProfile) * 8> m_valueProfiles;
> 
> This code is a little weird - I think the second argument to the template should be a count of ValueProfiles, not a size, so isn't this growing n^2 in the size of ValueProfile? - don't you just want 8 here?

Ooops, I misunderstood the meaning of that template argument.  Fixed.

> 
> > Source/JavaScriptCore/jit/JIT.cpp:377
> > +    unsigned numberOfValueProfiles = m_codeBlock->numberOfValueProfiles();
> 
> Is this just asserting that no slow cases add any new (FirstProfilingSite) value profiling code? - I think an comment here would be good.

Added a comment.  It's asserting that the slow case generation code uses existing ValueProfile instances rather than adding new ones, so that regardless of which path (fast or slow) generated the value it will go to the same ValueProfile, which is tied to that bytecode offset.

> 
> > Source/JavaScriptCore/jit/JIT.h:313
> > +#endif
> 
> Are value / scratch ever passed? - I didn't see any case of this.
> If not, it may make the code simpler to remove these arguments.
> Also, this function may trample regT2/regT3 - I think it's worth commenting this on the method, to try to make this more apparent.

Yeah, this was overly general.  Changed it to always use regT0 and regT3, with a comment saying how.
Comment 8 Filip Pizlo 2011-08-17 16:40:55 PDT
Created attachment 104282 [details]
the patch (fix review)

This will probably make qt and windos bots unhappy.  Will change to r? if/when they are happy.
Comment 9 WebKit Review Bot 2011-08-17 16:47:28 PDT
Attachment 104282 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1

Source/JavaScriptCore/bytecode/ValueProfile.h:37:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
Total errors found: 1 in 13 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 10 Filip Pizlo 2011-08-17 16:52:55 PDT
Created attachment 104284 [details]
the patch (fix style)
Comment 11 Filip Pizlo 2011-08-17 18:06:35 PDT
Created attachment 104286 [details]
the patch (fix more style)
Comment 12 Filip Pizlo 2011-08-17 18:07:33 PDT
Created attachment 104288 [details]
value profiling enabled patch

Posting this to check if enabling value profiling breaks any builds.
Comment 13 Filip Pizlo 2011-08-19 18:03:07 PDT
Comment on attachment 104286 [details]
the patch (fix more style)

Looks like this patch makes the bots happy, going to land it.
Comment 14 WebKit Review Bot 2011-08-19 19:18:04 PDT
Comment on attachment 104286 [details]
the patch (fix more style)

Clearing flags on attachment: 104286

Committed r93466: <http://trac.webkit.org/changeset/93466>
Comment 15 WebKit Review Bot 2011-08-19 19:18:09 PDT
All reviewed patches have been landed.  Closing bug.