Bug 29268

Summary: [Qt] JSVALUE32_64 not works on Windows platform with MinGW compiler
Product: WebKit Reporter: Csaba Osztrogonác <ossy>
Component: JavaScriptCoreAssignee: QtWebKit Unassigned <webkit-qt-unassigned>
Status: RESOLVED FIXED    
Severity: Major CC: barraclough, jesus, jturcotte, kbalazs, loki, tonikitoo, vestbo, yselkowi, zherczeg
Priority: P2 Keywords: Qt, QtTriaged
Version: 528+ (Nightly build)   
Hardware: PC   
OS: Windows XP   
Attachments:
Description Flags
proposed patch
none
proposed patch none

Description Csaba Osztrogonác 2009-09-15 01:34:42 PDT
If building QtWebKit on Windows with MinGW compiler, I get this error message:

..\..\..\JavaScriptCore\jit\JITStubs.cpp:78: error: size of array 'dummyJITStack
Frame_maintains_16byte_stack_alignment' is negative
..\..\..\JavaScriptCore\jit\JITStubs.cpp:79: error: size of array 'dummyJITStack
Frame_stub_argument_space_matches_ctiTrampoline' is negative
..\..\..\JavaScriptCore\jit\JITStubs.cpp:80: error: size of array 'dummyJITStack
Frame_callFrame_offset_matches_ctiTrampoline' is negative
..\..\..\JavaScriptCore\jit\JITStubs.cpp:81: error: size of array 'dummyJITStack
Frame_code_offset_matches_ctiTrampoline' is negative

It caused by COMPILE_ASSERTs in JITStubs.cpp after enabling JSVALUE32_64
http://trac.webkit.org/changeset/46701

The main problem is aligning members of JITStackFrame struct. (JavaScriptCore/jit/JITStubs.h)
If building QtWebKit on linux with gcc, the compiler use packed alignment. JITStubArg has offset 4, 
padding has offset 52, ..., code has offset 80. But using MinGW, JITStubArg has offset 8, because
compiler insert 4 bytes size pad before JITStubArg. First I tried to modify padding member size to 1,
but it isn't works. (All regression test crash.) And then I tried to get compiler using same alignment 
as on linux with __attribute__((packed)), but it isn't work.

#elif PLATFORM(X86)
#if COMPILER(MSVC)
#pragma pack(push)
#pragma pack(4)
#endif // COMPILER(MSVC)
    struct JITStackFrame {
        void* reserved; // Unused
        JITStubArg args[6];
#if USE(JSVALUE32_64)
        void* padding[2]; // Maintain 16-byte stack alignment.
#endif

        void* savedEBX;
        void* savedEDI;
        void* savedESI;
        void* savedEBP;
        void* savedEIP;

        void* code;
        RegisterFile* registerFile;
        CallFrame* callFrame;
        JSValue* exception;
        Profiler** enabledProfilerReference;
        JSGlobalData* globalData;
        
        // When JIT code makes a call, it pushes its return address just below the rest of the stack.
        ReturnAddressPtr* returnAddressSlot() { return reinterpret_cast<ReturnAddressPtr*>(this) - 1; }
    };
Comment 1 Csaba Osztrogonác 2009-09-15 01:47:21 PDT
Created attachment 39591 [details]
proposed patch

I suggest we should disable JSVALUE32_64 until bug fixed.
Comment 2 Tor Arne Vestbø 2009-09-15 05:38:40 PDT
Comment on attachment 39591 [details]
proposed patch

> +/* On PLATFORM(WIN_OS) with COMPILER(MINGW) JSVALUE32_64 crashes, temporarily disabled */

You want to comment on _why_, and mention the bug, for example:

#elif PLATFORM(WIN_OS) && COMPILER(MINGW)
/* MinGW messes up the padding/alignment when using JSVALUE64, which
causes crashes. See https://bugs.webkit.org/show_bug.cgi?id=29268 */
#define WTF_USE_JSVALUE32 1
#else

I'll fix this when landing
Comment 3 Tor Arne Vestbø 2009-09-15 05:39:05 PDT
We should still figure out how to make MinGW behave though :/
Comment 4 Csaba Osztrogonác 2009-09-16 15:38:37 PDT
patch landed: http://trac.webkit.org/changeset/48390
(forgot to remove review flag earlier)
Comment 5 Csaba Osztrogonác 2010-04-06 08:31:27 PDT
If we could enable JSVALUE32_64 on Windows after fix this bug, SunSpider
would be 1.40x faster. (But unfortunately V8 would be 1.15x slower)

results on Linux: (where JSVALUE32 and JSVALUE32_64 work correctly)
SunSpider: 898.0ms +/- 0.5% -> 640.5ms +/- 0.4%
V8: 3015.9ms +/- 0.5% -> 3462.8ms +/- 0.3% 

I rebuilded QtWebKit on Windows with JSVALUE32_64, build finished
in release and debug mode too without any COMPILE_ASSERT. I think
because of some previously code refactoring. But unfortunataly
all javascriptcore test still fail. :(

/JSVALUE32_64 experts (loki04 and zherczeg) cc-ed/
Guys, have you got any idea how to fix this very old bug?
I have release and debug build on Windows, so I can
test all proposal patch. :)
Comment 6 Jesus Sanchez-Palencia 2010-05-14 07:03:02 PDT
Any updates here?!
Comment 7 Gavin Barraclough 2010-08-13 00:46:19 PDT
We do not intend to continue to support JSVALUE32 going forwards, so we're going to need to find a real solution to this problem (resolve the alignment issues in JSVALUE32_64), or disable the JIT with MinGW.
Comment 8 Csaba Osztrogonác 2010-09-09 08:12:09 PDT
Created attachment 67034 [details]
proposed patch

I tried JSVALUE32_64 with QtWebKit/Windows/MinGW again, and it works now. :)
I don't know which changeset fixed it in the last 5 months. :)

It works now, so I propose to switch to JSVALUE32_64.
SunSpider will be 1.60x faster, but V8 will be 1.19x slower.

TEST                   COMPARISON            FROM                 TO             DETAILS
=============================================================================

** TOTAL **:           1.60x as fast     971.4ms +/- 3.5%   607.4ms +/- 2.7%     significant

=============================================================================

  3d:                  2.90x as fast     243.8ms +/- 9.3%    84.2ms +/- 9.5%     significant
    cube:              3.02x as fast     108.6ms +/- 18.4%    36.0ms +/- 21.3%     significant
    morph:             4.06x as fast      75.6ms +/- 3.0%    18.6ms +/- 3.7%     significant
    raytrace:          2.01x as fast      59.6ms +/- 4.8%    29.6ms +/- 2.3%     significant

  access:              1.53x as fast      97.0ms +/- 7.3%    63.2ms +/- 1.6%     significant
    binary-trees:      *1.18x as slow*     7.6ms +/- 9.0%     9.0ms +/- 9.8%     significant
    fannkuch:          *1.41x as slow*    19.6ms +/- 3.5%    27.6ms +/- 2.5%     significant
    nbody:             3.33x as fast      64.0ms +/- 9.8%    19.2ms +/- 2.9%     significant
    nsieve:            *1.28x as slow*     5.8ms +/- 9.6%     7.4ms +/- 9.2%     significant

  bitops:              1.12x as fast      44.0ms +/- 3.5%    39.2ms +/- 4.1%     significant
    3bit-bits-in-byte: *1.94x as slow*     3.6ms +/- 18.9%     7.0ms +/- 0.0%     significant
    bits-in-byte:      *1.21x as slow*     8.4ms +/- 8.1%    10.2ms +/- 5.5%     significant
    bitwise-and:       *1.76x as slow*     4.2ms +/- 13.2%     7.4ms +/- 9.2%     significant
    nsieve-bits:       1.90x as fast      27.8ms +/- 2.0%    14.6ms +/- 4.7%     significant

  controlflow:         *1.40x as slow*     4.0ms +/- 0.0%     5.6ms +/- 12.2%     significant
    recursive:         *1.40x as slow*     4.0ms +/- 0.0%     5.6ms +/- 12.2%     significant

  crypto:              1.72x as fast      67.6ms +/- 2.8%    39.4ms +/- 24.1%     significant
    aes:               *1.28x as slow*    20.2ms +/- 2.8%    25.8ms +/- 35.6%     significant
    md5:               3.08x as fast      23.4ms +/- 2.9%     7.6ms +/- 9.0%     significant
    sha1:              4.00x as fast      24.0ms +/- 3.7%     6.0ms +/- 0.0%     significant

  date:                1.10x as fast      74.8ms +/- 1.4%    68.0ms +/- 2.9%     significant
    format-tofte:      1.15x as fast      40.2ms +/- 1.4%    35.0ms +/- 3.6%     significant
    format-xparb:      1.05x as fast      34.6ms +/- 2.0%    33.0ms +/- 2.7%     significant

  math:                2.99x as fast     170.8ms +/- 4.6%    57.2ms +/- 1.8%     significant
    cordic:            4.38x as fast      57.8ms +/- 9.9%    13.2ms +/- 4.2%     significant
    partial-sums:      2.68x as fast      86.4ms +/- 2.4%    32.2ms +/- 1.7%     significant
    spectral-norm:     2.25x as fast      26.6ms +/- 2.6%    11.8ms +/- 4.7%     significant

  regexp:              1.08x as fast      22.0ms +/- 0.0%    20.4ms +/- 3.3%     significant
    dna:               1.08x as fast      22.0ms +/- 0.0%    20.4ms +/- 3.3%     significant

  string:              1.07x as fast     247.4ms +/- 1.4%   230.2ms +/- 2.1%     significant
    base64:            1.13x as fast      40.6ms +/- 4.1%    36.0ms +/- 6.5%     significant
    fasta:             -                  38.8ms +/- 12.5%    37.8ms +/- 2.8% 
    tagcloud:          1.09x as fast      44.4ms +/- 3.2%    40.8ms +/- 2.5%     significant
    unpack-code:       -                  74.6ms +/- 1.9%    73.0ms +/- 2.9% 
    validate-input:    1.15x as fast      49.0ms +/- 1.8%    42.6ms +/- 1.6%     significant



    TEST             COMPARISON            FROM                 TO             DETAILS
=============================================================================

** TOTAL **:     *1.19x as slow*   2009.6ms +/- 2.2%   2387.0ms +/- 1.1%     significant

=============================================================================

  v8:            *1.19x as slow*   2009.6ms +/- 2.2%   2387.0ms +/- 1.1%     significant
    crypto:      *1.38x as slow*    318.4ms +/- 3.9%    438.2ms +/- 0.3%     significant
    deltablue:   *1.18x as slow*    591.8ms +/- 2.6%    696.4ms +/- 1.1%     significant
    early-boyer: *1.18x as slow*    377.0ms +/- 4.5%    445.4ms +/- 2.6%     significant
    raytrace:    1.39x as fast      268.6ms +/- 6.4%    193.0ms +/- 7.9%     significant
    richards:    *1.35x as slow*    453.8ms +/- 0.2%    614.0ms +/- 0.3%     significant
Comment 9 Gavin Barraclough 2010-09-09 09:19:54 PDT
Comment on attachment 67034 [details]
proposed patch

This is great news!
Comment 10 Csaba Osztrogonác 2010-09-09 09:28:13 PDT
Comment on attachment 67034 [details]
proposed patch

Clearing flags on attachment: 67034

Committed r67090: <http://trac.webkit.org/changeset/67090>
Comment 11 Csaba Osztrogonác 2010-09-09 09:28:23 PDT
All reviewed patches have been landed.  Closing bug.
Comment 12 Antonio Gomes 2010-09-09 09:38:35 PDT
> I tried JSVALUE32_64 with QtWebKit/Windows/MinGW again, and it works now. :)
> I don't know which changeset fixed it in the last 5 months. :)
> 
> It works now, so I propose to switch to JSVALUE32_64.
> SunSpider will be 1.60x faster, but V8 will be 1.19x slower.

do not they care about this perf regression, or i'm missing something?
Comment 13 Balazs Kelemen 2010-09-09 09:51:29 PDT
(In reply to comment #12)
> > I tried JSVALUE32_64 with QtWebKit/Windows/MinGW again, and it works now. :)
> > I don't know which changeset fixed it in the last 5 months. :)
> > 
> > It works now, so I propose to switch to JSVALUE32_64.
> > SunSpider will be 1.60x faster, but V8 will be 1.19x slower.
> 
> do not they care about this perf regression, or i'm missing something?

v8 will not be slower, but jsc will be slower on the v8 benchmark :)
Comment 14 Csaba Osztrogonác 2010-09-09 10:07:53 PDT
(In reply to comment #12)
> > I tried JSVALUE32_64 with QtWebKit/Windows/MinGW again, and it works now. :)
> > I don't know which changeset fixed it in the last 5 months. :)
> > 
> > It works now, so I propose to switch to JSVALUE32_64.
> > SunSpider will be 1.60x faster, but V8 will be 1.19x slower.
> 
> do not they care about this perf regression, or i'm missing something?

The difference between JSVALUE32 and JSVALUE32_64 is the size of JSValue,
which can be 32 bit or 64 bit. The JSVALUE32_64 is faster when you work
with a lot of 64 bit sized numbers (long int, double) such as 3D tests in
SunSpider. Unfortunately it has a little performance regression if your
application can't exploit the 64 bit sized JSValues.

But we have to know the only choice is JSVALUE32_64, because WebKit 
community don't intend to support JSVALUE32 in the future as Gavin said.