Bug 71373

Summary: Enable the DFG JIT on x86-64 Linux platforms
Product: WebKit Reporter: Andy Wingo <wingo>
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: barraclough, fpizlo, hausmann, ossy, webkit.review.bot, yuqiang.xian, zherczeg
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 71882    
Bug Blocks:    
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
32 and 64 bit buildfix for stricter compilers
none
Patch none

Description Andy Wingo 2011-11-02 09:45:11 PDT
The DFG JIT work is looking very interesting.  Are you (Gavin, Filip, &c) interested in broader testing?  I was going to suggest a Platform.h patch but I didn't know if you wanted to restrict your work to one platform for some reason.

Otherwise it could look something like:

/* Some stubs only implemented for x86 and x86-64. */
#if !defined(ENABLE_DFG_JIT) && ENABLE(JIT) \
    && (CPU(X86) || CPU(X86_64))
#define ENABLE_DFG_JIT 1
#endif

This does enable the DFG on 32-bit PLATFORM(MAC) systems as well; perhaps you are not interested in that?

If you want to open it up to just one other platform, I am happy to help in any issues that come up related to the GTK+ port, at least.

Andy
Comment 1 Gavin Barraclough 2011-11-02 10:56:31 PDT
Yes! - we want to turn this on as soon as there are no known regressions, and I think we're pretty close to being there.

Last time I tested it looked like there were two sets of layout tests still being broken by the DFG JIT on x86 - the jquery & inspector tests.  I'm pretty sure that Yuqiang's patch to change how we set boolean results should have fixed the jquery regressions, the inspector ones might also have been fixed since I last tried them, too.

There is no reason I know of for it not to be enabled on x86-64 non-mac platforms, I just haven't tested & didn't want to turn on without someone having done so.

Btw, nice write up on JSC. :-)
Comment 2 Andy Wingo 2011-11-02 11:44:23 PDT
Created attachment 113342 [details]
Patch
Comment 3 Andy Wingo 2011-11-02 11:47:50 PDT
Cool, I'm looking forward to playing around with this in more detail :)  And thanks for the kind words, it has been fun to look into JSC, and I hope to hack on it more in the future.

To move things along, the attached patch enables the DFG JIT on all x86-64 platforms.  Perhaps Yuqiang can add x86-32 to the set when it is ready, too.
Comment 4 Yuqiang Xian 2011-11-03 22:23:46 PDT
For now, from my testing of GTK+ port on Linux x86 with DFG JIT turned on, I didn't observe any regressions comparing to ToT (including layout test, JavaScriptCore test, and JS benchmarks).
Comment 5 Andy Wingo 2011-11-08 06:59:09 PST
Nice work, Yuqiang!  And congrats on becoming a committer.

The patch to be attached enables the DFG JIT on x86-64 platforms as well.
Comment 6 Andy Wingo 2011-11-08 06:59:59 PST
Created attachment 114057 [details]
Patch
Comment 7 Andy Wingo 2011-11-08 07:25:28 PST
Sunspider:

EST                   COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:           1.074x as fast    162.0ms +/- 1.0%   150.9ms +/- 1.5%     significant

=============================================================================

  3d:                  -                  25.4ms +/- 7.2%    25.1ms +/- 7.9% 
    cube:              ??                 10.4ms +/- 17.5%    10.5ms +/- 21.1%     not conclusive: might be *1.010x as slow*
    morph:             *1.057x as slow*    7.0ms +/- 0.0%     7.4ms +/- 5.0%     significant
    raytrace:          1.111x as fast      8.0ms +/- 0.0%     7.2ms +/- 4.2%     significant

  access:              1.72x as fast      20.6ms +/- 2.4%    12.0ms +/- 0.0%     significant
    binary-trees:      2.00x as fast       2.0ms +/- 0.0%     1.0ms +/- 0.0%     significant
    fannkuch:          1.70x as fast      10.2ms +/- 3.0%     6.0ms +/- 0.0%     significant
    nbody:             1.80x as fast       5.4ms +/- 6.8%     3.0ms +/- 0.0%     significant
    nsieve:            1.50x as fast       3.0ms +/- 0.0%     2.0ms +/- 0.0%     significant

  bitops:              1.59x as fast      14.5ms +/- 2.6%     9.1ms +/- 2.5%     significant
    3bit-bits-in-byte: 2.00x as fast       2.0ms +/- 0.0%     1.0ms +/- 0.0%     significant
    bits-in-byte:      2.65x as fast       5.3ms +/- 6.5%     2.0ms +/- 0.0%     significant
    bitwise-and:       1.43x as fast       3.0ms +/- 0.0%     2.1ms +/- 10.8%     significant
    nsieve-bits:       -                   4.2ms +/- 7.2%     4.0ms +/- 0.0% 

  controlflow:         ??                  1.0ms +/- 0.0%     1.2ms +/- 25.1%     not conclusive: might be *1.20x as slow*
    recursive:         ??                  1.0ms +/- 0.0%     1.2ms +/- 25.1%     not conclusive: might be *1.20x as slow*

  crypto:              *1.108x as slow*   10.2ms +/- 3.0%    11.3ms +/- 3.1%     significant
    aes:               *1.197x as slow*    6.1ms +/- 3.7%     7.3ms +/- 4.7%     significant
    md5:               -                   2.1ms +/- 10.8%     2.0ms +/- 0.0% 
    sha1:              -                   2.0ms +/- 0.0%     2.0ms +/- 0.0% 

  date:                -                  18.0ms +/- 0.0%    17.9ms +/- 2.9% 
    format-tofte:      1.042x as fast     10.0ms +/- 0.0%     9.6ms +/- 3.8%     significant
    format-xparb:      ??                  8.0ms +/- 0.0%     8.3ms +/- 5.8%     not conclusive: might be *1.038x as slow*

  math:                -                  16.1ms +/- 1.4%    16.0ms +/- 0.0% 
    cordic:            *1.176x as slow*    5.1ms +/- 4.4%     6.0ms +/- 0.0%     significant
    partial-sums:      -                   8.0ms +/- 0.0%     8.0ms +/- 0.0% 
    spectral-norm:     1.50x as fast       3.0ms +/- 0.0%     2.0ms +/- 0.0%     significant

  regexp:              ??                 12.8ms +/- 2.4%    13.0ms +/- 0.0%     not conclusive: might be *1.016x as slow*
    dna:               ??                 12.8ms +/- 2.4%    13.0ms +/- 0.0%     not conclusive: might be *1.016x as slow*

  string:              *1.044x as slow*   43.4ms +/- 0.9%    45.3ms +/- 0.8%     significant
    base64:            1.25x as fast       4.0ms +/- 0.0%     3.2ms +/- 9.4%     significant
    fasta:             -                   6.0ms +/- 0.0%     6.0ms +/- 0.0% 
    tagcloud:          ??                 10.9ms +/- 2.1%    11.0ms +/- 0.0%     not conclusive: might be *1.009x as slow*
    unpack-code:       *1.149x as slow*   17.5ms +/- 2.2%    20.1ms +/- 1.1%     significant
    validate-input:    -                   5.0ms +/- 0.0%     5.0ms +/- 0.0% 


V8:


TEST              COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:      1.42x as fast     1070.4ms +/- 0.4%   751.8ms +/- 0.3%     significant

=============================================================================

  v8:             1.42x as fast     1070.4ms +/- 0.4%   751.8ms +/- 0.3%     significant
    crypto:       2.49x as fast      184.6ms +/- 1.0%    74.2ms +/- 2.2%     significant
    deltablue:    1.49x as fast      256.7ms +/- 0.4%   172.7ms +/- 0.7%     significant
    earley-boyer: 1.163x as fast     100.5ms +/- 0.4%    86.4ms +/- 0.6%     significant
    raytrace:     1.093x as fast      65.6ms +/- 0.8%    60.0ms +/- 0.8%     significant
    regexp:       *1.011x as slow*   111.4ms +/- 0.3%   112.6ms +/- 0.6%     significant
    richards:     1.82x as fast      217.9ms +/- 0.6%   119.7ms +/- 0.6%     significant
    splay:        1.059x as fast     133.7ms +/- 0.7%   126.2ms +/- 0.9%     significant


Kraken:


TEST                         COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:                 2.18x as fast     6721.2ms +/- 0.2%   3078.2ms +/- 0.3%     significant

=============================================================================

  ai:                        2.49x as fast     1126.0ms +/- 0.8%    451.5ms +/- 0.7%     significant
    astar:                   2.49x as fast     1126.0ms +/- 0.8%    451.5ms +/- 0.7%     significant

  audio:                     1.89x as fast     1694.9ms +/- 0.3%    897.2ms +/- 0.3%     significant
    beat-detection:          2.54x as fast      482.1ms +/- 0.2%    190.0ms +/- 0.3%     significant
    dft:                     1.58x as fast      526.6ms +/- 0.2%    332.4ms +/- 0.5%     significant
    fft:                     2.86x as fast      359.6ms +/- 0.5%    125.6ms +/- 0.7%     significant
    oscillator:              1.31x as fast      326.6ms +/- 0.9%    249.2ms +/- 0.5%     significant

  imaging:                   2.72x as fast     3009.1ms +/- 0.2%   1106.7ms +/- 0.5%     significant
    gaussian-blur:           3.59x as fast     1914.2ms +/- 0.5%    533.2ms +/- 0.6%     significant
    darkroom:                1.59x as fast      558.0ms +/- 0.9%    351.9ms +/- 0.6%     significant
    desaturate:              2.42x as fast      536.9ms +/- 0.4%    221.6ms +/- 0.7%     significant

  json:                      *1.058x as slow*   129.2ms +/- 2.2%    136.7ms +/- 1.0%     significant
    parse-financial:         -                   56.2ms +/- 2.7%     55.1ms +/- 1.1% 
    stringify-tinderbox:     *1.118x as slow*    73.0ms +/- 2.1%     81.6ms +/- 1.3%     significant

  stanford:                  1.57x as fast      762.0ms +/- 0.4%    486.1ms +/- 0.5%     significant
    crypto-aes:              1.39x as fast      143.7ms +/- 1.4%    103.1ms +/- 0.5%     significant
    crypto-ccm:              *1.018x as slow*   113.9ms +/- 1.2%    115.9ms +/- 1.2%     significant
    crypto-pbkdf2:           1.92x as fast      380.4ms +/- 0.4%    197.8ms +/- 0.9%     significant
    crypto-sha256-iterative: 1.79x as fast      124.0ms +/- 0.5%     69.3ms +/- 0.7%     significant
Comment 8 Gyuyoung Kim 2011-11-08 07:32:30 PST
Comment on attachment 114057 [details]
Patch

Attachment 114057 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/10379041
Comment 9 Andy Wingo 2011-11-08 07:54:01 PST
Created attachment 114067 [details]
Patch
Comment 10 Andy Wingo 2011-11-08 07:54:32 PST
The newly attached patch attempts to fix EFL.
Comment 11 Philippe Normand 2011-11-09 02:12:42 PST
Comment on attachment 114067 [details]
Patch

Flipping cq+ as Andy is not committer yet.
Comment 12 WebKit Review Bot 2011-11-09 03:03:56 PST
Comment on attachment 114067 [details]
Patch

Clearing flags on attachment: 114067

Committed r99678: <http://trac.webkit.org/changeset/99678>
Comment 13 WebKit Review Bot 2011-11-09 03:04:00 PST
All reviewed patches have been landed.  Closing bug.
Comment 14 Simon Hausmann 2011-11-09 03:28:50 PST
Re-opening as the patch was rolled out due to build-errors of the following kind:

The build errors were:

Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:734:30: error: variable 'tooBig' set but not used [-Werror=unused-but-set-variable]
Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:708:12: error: variable 'valueGPR' set but not used [-Werror=unused-but-set-variable]
Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:1552:27: error: variable 'functionCall' set but not used [-Werror=unused-but-set-variable]
Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:2471:27: error: variable 'functionCall' set but not used [-Werror=unused-but-set-variable]

We weren't 100% sure what the correct fixes were here :)
Comment 15 Csaba Osztrogonác 2011-11-09 03:39:56 PST
Created attachment 114235 [details]
32 and 64 bit buildfix for stricter compilers
Comment 16 Simon Hausmann 2011-11-09 03:43:26 PST
Comment on attachment 114235 [details]
32 and 64 bit buildfix for stricter compilers

View in context: https://bugs.webkit.org/attachment.cgi?id=114235&action=review

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:734
> -        MacroAssembler::Jump tooBig = m_jit.branch32(MacroAssembler::GreaterThan, scratchReg, TrustedImm32(0xff));
> +        m_jit.branch32(MacroAssembler::GreaterThan, scratchReg, TrustedImm32(0xff));

This is the part I'm rather unsure about. It seems "odd" to leave an instruction in there that is unlinked.
Comment 17 Andy Wingo 2011-11-09 03:46:45 PST
I added the patch at bug 71885 at the same time that Ossy wrote his patch.  It removes the unlinked branch, FWIW.
Comment 18 Zoltan Herczeg 2011-11-09 03:51:23 PST
Comment on attachment 114235 [details]
32 and 64 bit buildfix for stricter compilers

View in context: https://bugs.webkit.org/attachment.cgi?id=114235&action=review

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:-734
> -        MacroAssembler::Jump tooBig = m_jit.branch32(MacroAssembler::GreaterThan, scratchReg, TrustedImm32(0xff));

I think this call is unnecessary.
1) The previous check <=, so the it is always >
2) The following code would be dead code.
I think that it is working on x86 is a pure luck, since "jmp 0" jumps to the next instruction after the "jmp".
Comment 19 Zoltan Herczeg 2011-11-09 03:53:30 PST
Comment on attachment 114235 [details]
32 and 64 bit buildfix for stricter compilers

r=me

Necessary changes.
Comment 20 Csaba Osztrogonác 2011-11-09 03:59:49 PST
*** Bug 71885 has been marked as a duplicate of this bug. ***
Comment 21 Yuqiang Xian 2011-11-09 04:38:53 PST
Just a note... The original patch probably needs to merge the latest change in DFG, in particular some new files are added with bug #71787 but the 64bit version may be missing in the Efl cmake list.
Comment 22 Csaba Osztrogonác 2011-11-09 05:32:54 PST
Comment on attachment 114235 [details]
32 and 64 bit buildfix for stricter compilers

Clearing flags on attachment: 114235

Committed r99693: <http://trac.webkit.org/changeset/99693>
Comment 23 Andy Wingo 2011-11-09 05:46:33 PST
Created attachment 114257 [details]
Patch
Comment 24 Andy Wingo 2011-11-09 05:47:22 PST
Comment on attachment 114257 [details]
Patch

Updated, adding 64-bit OSR exit compiler to EFL build.
Comment 25 Csaba Osztrogonác 2011-11-09 05:51:04 PST
Comment on attachment 114257 [details]
Patch

r=me, I'll land it immediately.
Comment 26 Csaba Osztrogonác 2011-11-09 05:55:54 PST
Comment on attachment 114257 [details]
Patch

Clearing flags on attachment: 114257

Committed r99696: <http://trac.webkit.org/changeset/99696>
Comment 27 Csaba Osztrogonác 2011-11-09 05:56:04 PST
All reviewed patches have been landed.  Closing bug.
Comment 28 Filip Pizlo 2011-11-09 12:31:33 PST
(In reply to comment #18)
> (From update of attachment 114235 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=114235&action=review
> 
> > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:-734
> > -        MacroAssembler::Jump tooBig = m_jit.branch32(MacroAssembler::GreaterThan, scratchReg, TrustedImm32(0xff));
> 
> I think this call is unnecessary.
> 1) The previous check <=, so the it is always >
> 2) The following code would be dead code.
> I think that it is working on x86 is a pure luck, since "jmp 0" jumps to the next instruction after the "jmp".

No.  The first check is BelowOrEqual, which is an unsigned comparison.  The second check is GreaterThan, which is a signed comparison.

The tooBig branch should be linked to this line:

        m_jit.move(TrustedImm32(255), scratchReg);
Comment 29 Zoltan Herczeg 2011-11-09 23:36:42 PST
> No.  The first check is BelowOrEqual, which is an unsigned comparison.  The second check is GreaterThan, which is a signed comparison.
> 
> The tooBig branch should be linked to this line:
> 
>         m_jit.move(TrustedImm32(255), scratchReg);

Oh, Intel naming convection: Below/Above unsigned, Greater/Less signed. Perhaps we should change these names to something less confusing.
Comment 30 Filip Pizlo 2011-11-09 23:40:05 PST
(In reply to comment #29)
> > No.  The first check is BelowOrEqual, which is an unsigned comparison.  The second check is GreaterThan, which is a signed comparison.
> > 
> > The tooBig branch should be linked to this line:
> > 
> >         m_jit.move(TrustedImm32(255), scratchReg);
> 
> Oh, Intel naming convection: Below/Above unsigned, Greater/Less signed. Perhaps we should change these names to something less confusing.

Suggestions?