Bug 18991 - SquirrelFish: Major codegen issue in a.b=expr, a[b]=expr
Summary: SquirrelFish: Major codegen issue in a.b=expr, a[b]=expr
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P1 Blocker
Assignee: Cameron Zwarich (cpst)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-05-10 21:15 PDT by Oliver Hunt
Modified: 2008-05-17 21:56 PDT (History)
5 users (show)

See Also:


Attachments
testcase (739 bytes, text/html)
2008-05-10 21:46 PDT, Oliver Hunt
no flags Details
updated testcase (1.52 KB, text/html)
2008-05-10 22:18 PDT, Oliver Hunt
no flags Details
Patch in progress (2.39 KB, patch)
2008-05-11 01:30 PDT, Cameron Zwarich (cpst)
no flags Details | Formatted Diff | Diff
Proposed patch (20.70 KB, patch)
2008-05-14 19:06 PDT, Cameron Zwarich (cpst)
no flags Details | Formatted Diff | Diff
Revised proposed patch (23.18 KB, patch)
2008-05-14 21:29 PDT, Cameron Zwarich (cpst)
no flags Details | Formatted Diff | Diff
Revised proposed patch (20.97 KB, patch)
2008-05-15 18:19 PDT, Cameron Zwarich (cpst)
no flags Details | Formatted Diff | Diff
Further fixeration (23.09 KB, patch)
2008-05-15 23:26 PDT, Oliver Hunt
no flags Details | Formatted Diff | Diff
Fix that is not broken (23.15 KB, patch)
2008-05-16 00:38 PDT, Oliver Hunt
no flags Details | Formatted Diff | Diff
Revised proposed patch (22.81 KB, patch)
2008-05-16 22:30 PDT, Cameron Zwarich (cpst)
no flags Details | Formatted Diff | Diff
Revised proposed patch (22.82 KB, patch)
2008-05-17 01:42 PDT, Cameron Zwarich (cpst)
oliver: review+
Details | Formatted Diff | Diff
Proposed patch for last case (3.40 KB, patch)
2008-05-17 04:51 PDT, Cameron Zwarich (cpst)
oliver: review+
Details | Formatted Diff | Diff
Revised proposed patch (7.02 KB, patch)
2008-05-17 21:45 PDT, Cameron Zwarich (cpst)
oliver: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Oliver Hunt 2008-05-10 21:15:08 PDT
Cameron and I have isolated a number of the probable causes for gmail and ebay bustage, and they are expressions like (assume var a, b;):
* a=a.b=c -- should evaluate as  a.b=c; a=a.b; actually evals as a=c;a.b=a;a=a.b; (ditto for a[b])
* a.b=(a="wiffle") -- should eval as base=a;a="wiffle";base.b=a; actually evaluates as a=wiffle;a.b=a;

This makes me cry.
Comment 1 Oliver Hunt 2008-05-10 21:16:56 PDT
we originally found this through ebay which had what worke dout to be:

z=(z.bar)?z.bar:z.bar={};
Comment 2 Oliver Hunt 2008-05-10 21:46:45 PDT
Created attachment 21061 [details]
testcase

test cases
Comment 3 Oliver Hunt 2008-05-10 22:18:21 PDT
Created attachment 21062 [details]
updated testcase

More testeration
Comment 4 Cameron Zwarich (cpst) 2008-05-10 23:49:59 PDT
I have a patch that fixes Oliver's test case and more by copying everything to temporaries if necessary. It's a regression on SunSpider, but we hope that by checking for eval, closures, and the lack of more assignments on the RHS of the assignment, we can remove most of this regression.
Comment 5 Cameron Zwarich (cpst) 2008-05-11 01:20:51 PDT
I modified my patch to avoid unnecessary copying by tracking assignments on the right hand side. Unfortunately, this works for Oliver's testcase but it doesn't fix eBay. It also did worse on SunSpider than the version that always copies.
Comment 6 Cameron Zwarich (cpst) 2008-05-11 01:30:36 PDT
Created attachment 21064 [details]
Patch in progress

Here's the patch that works.
Comment 7 Cameron Zwarich (cpst) 2008-05-14 06:14:59 PDT
The naive patch I posted is about a 1.4% ms performance regression on ToT SquirrelFish. My slightly better version (that still misses some cases) is about a 0.8-0.9% regression. The tracking of assignments alone doesn't seem to affect much, as it is no change on SunSpider tending towards a 0.1% regression. I'll try fixing the missing case that causes the problem with eBay and see how it looks.

In theory, we also have to deal with the situation where parameters are modified with f.arguments, although this is a pretty terrible programming technique, so in addition to any checks for assignments in the subscript or on the right hand side, we also need to check whether the identifier involved is a function parameter.
Comment 8 Cameron Zwarich (cpst) 2008-05-14 19:06:30 PDT
Created attachment 21150 [details]
Proposed patch

Here is a patch that fixes most of the issues. As far as I know, the only two that it doesn't fix are the local case of ReadModifyResolveNode (it needs a similar fix to AssignResolveNode), and the case of modifying arguments to a function with no closure using f.arguments.
Comment 9 Cameron Zwarich (cpst) 2008-05-14 21:29:54 PDT
Created attachment 21154 [details]
Revised proposed patch

Here's a patch that hopefully corrects this issue, except in the case where f.arguments is used to modify arguments of something in the call chain. I'll post SunSpider results in the next post. I also ran diffs on the bytecode output. I have only looked at a sample of differences, but all of the examples I saw were cases where we definitely need to do different codegen, like global code, eval code, or the body of a function with a closure. I'll look at all of them and make a post along with the diffs.
Comment 10 Cameron Zwarich (cpst) 2008-05-15 05:59:51 PDT
Created attachment 21162


Here is the codegen for the various SunSpider tests, including before, after, and diffs. I'll post SunSpider results soon. There was a substantial performance regression that just got rolled out, so now I can get some reliable numbers.
Comment 11 Cameron Zwarich (cpst) 2008-05-15 06:31:15 PDT
This is really strange. When I ran it with network on, I got a regression, but with network off I got a 0.1% *progression*:

TEST                   COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:           1.001x as fast    2360.8ms +/- 0.1%   2357.9ms +/- 0.1%     significant

=============================================================================

  3d:                  *1.012x as slow*   305.9ms +/- 0.1%    309.6ms +/- 0.1%     significant
    cube:              *1.021x as slow*    92.4ms +/- 0.2%     94.3ms +/- 0.1%     significant
    morph:             *1.018x as slow*   109.8ms +/- 0.2%    111.7ms +/- 0.3%     significant
    raytrace:          -                  103.8ms +/- 0.2%    103.6ms +/- 0.1% 

  access:              1.013x as fast     357.3ms +/- 0.1%    352.8ms +/- 0.1%     significant
    binary-trees:      *1.009x as slow*    39.4ms +/- 0.4%     39.7ms +/- 0.3%     significant
    fannkuch:          1.034x as fast     140.5ms +/- 0.1%    135.9ms +/- 0.1%     significant
    nbody:             -                  140.8ms +/- 0.1%    140.6ms +/- 0.1% 
    nsieve:            -                   36.7ms +/- 0.4%     36.5ms +/- 0.4% 

  bitops:              1.010x as fast     265.9ms +/- 0.1%    263.3ms +/- 0.2%     significant
    3bit-bits-in-byte: -                   35.8ms +/- 0.7%     35.6ms +/- 0.4% 
    bits-in-byte:      -                   49.3ms +/- 0.3%     49.2ms +/- 0.2% 
    bitwise-and:       1.022x as fast      93.9ms +/- 0.1%     91.8ms +/- 0.4%     significant
    nsieve-bits:       1.002x as fast      86.9ms +/- 0.1%     86.7ms +/- 0.2%     significant

  controlflow:         *1.010x as slow*    36.0ms +/- 0.3%     36.3ms +/- 0.4%     significant
    recursive:         *1.010x as slow*    36.0ms +/- 0.3%     36.3ms +/- 0.4%     significant

  crypto:              ??                 160.2ms +/- 0.1%    160.4ms +/- 0.1%     not conclusive: might be *1.001x as slow*
    aes:               *1.010x as slow*    61.1ms +/- 0.2%     61.7ms +/- 0.2%     significant
    md5:               -                   48.4ms +/- 0.3%     48.4ms +/- 0.3% 
    sha1:              1.008x as fast      50.7ms +/- 0.2%     50.3ms +/- 0.3%     significant

  date:                1.002x as fast     193.5ms +/- 0.1%    193.1ms +/- 0.1%     significant
    format-tofte:      -                  120.0ms +/- 0.1%    120.0ms +/- 0.1% 
    format-xparb:      1.006x as fast      73.5ms +/- 0.2%     73.1ms +/- 0.1%     significant

  math:                1.006x as fast     265.1ms +/- 0.1%    263.5ms +/- 0.1%     significant
    cordic:            1.003x as fast      86.9ms +/- 0.2%     86.6ms +/- 0.2%     significant
    partial-sums:      -                  125.4ms +/- 0.1%    125.3ms +/- 0.1% 
    spectral-norm:     1.025x as fast      52.9ms +/- 0.2%     51.6ms +/- 0.3%     significant

  regexp:              -                  214.1ms +/- 0.2%    213.9ms +/- 0.2% 
    dna:               -                  214.1ms +/- 0.2%    213.9ms +/- 0.2% 

  string:              *1.004x as slow*   562.8ms +/- 0.1%    565.0ms +/- 0.1%     significant
    base64:            *1.007x as slow*    92.4ms +/- 0.2%     93.0ms +/- 0.1%     significant
    fasta:             *1.007x as slow*   117.1ms +/- 0.1%    118.0ms +/- 0.1%     significant
    tagcloud:          *1.005x as slow*   130.8ms +/- 0.2%    131.5ms +/- 0.2%     significant
    unpack-code:       -                  132.5ms +/- 0.2%    132.3ms +/- 0.1% 
    validate-input:    ??                  89.9ms +/- 0.4%     90.2ms +/- 0.4%     not conclusive: might be *1.004x as slow*

Comment 12 Cameron Zwarich (cpst) 2008-05-15 06:35:28 PDT
(In reply to comment #11)
> This is really strange. When I ran it with network on, I got a regression, but
> with network off I got a 0.1% *progression*:

Now I get the slight progression even with networking on.
Comment 13 Cameron Zwarich (cpst) 2008-05-15 08:33:21 PDT
Oliver said he saw a 0.3% regression on his system.

The most obvious bad codegen with this patch is the use of an extra temporary while loading a constant in global code, eval code or a function body with a closure. I'll try to fix that.
Comment 14 Cameron Zwarich (cpst) 2008-05-15 18:19:31 PDT
Created attachment 21190 [details]
Revised proposed patch

Here is a revised version of this patch, incorporating some suggestions by Oliver and Maciej. Unfortunately, it seems to be a 0.5% regression, as opposed to the patch above (attachment 21154 [details]), which is either no change or a slight progression on my machine. I'll post the codegen diffs and the SunSpider comparison in the next post. Hopefully some others can try it on their machines and see how it looks.
Comment 15 Cameron Zwarich (cpst) 2008-05-15 18:23:02 PDT
Created attachment 21191


Here are the SunSpider codegen differences for the latest patch, and here is the results comparison:

TEST                   COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:           *1.004x as slow*  2359.8ms +/- 0.1%   2369.0ms +/- 0.1%     significant

=============================================================================

  3d:                  *1.010x as slow*   305.5ms +/- 0.1%    308.5ms +/- 0.1%     significant
    cube:              *1.008x as slow*    92.4ms +/- 0.2%     93.2ms +/- 0.1%     significant
    morph:             *1.027x as slow*   109.5ms +/- 0.1%    112.5ms +/- 0.1%     significant
    raytrace:          1.007x as fast     103.6ms +/- 0.1%    102.9ms +/- 0.1%     significant

  access:              *1.014x as slow*   357.2ms +/- 0.1%    362.2ms +/- 0.1%     significant
    binary-trees:      *1.018x as slow*    39.3ms +/- 0.4%     40.0ms +/- 0.2%     significant
    fannkuch:          *1.015x as slow*   140.5ms +/- 0.2%    142.6ms +/- 0.1%     significant
    nbody:             *1.017x as slow*   140.8ms +/- 0.1%    143.2ms +/- 0.1%     significant
    nsieve:            1.007x as fast      36.6ms +/- 0.4%     36.4ms +/- 0.4%     significant

  bitops:              *1.012x as slow*   266.0ms +/- 0.1%    269.1ms +/- 0.1%     significant
    3bit-bits-in-byte: *1.006x as slow*    35.7ms +/- 0.4%     35.9ms +/- 0.3%     significant
    bits-in-byte:      1.011x as fast      49.4ms +/- 0.3%     48.8ms +/- 0.2%     significant
    bitwise-and:       1.010x as fast      94.0ms +/- 0.1%     93.0ms +/- 0.1%     significant
    nsieve-bits:       *1.049x as slow*    87.0ms +/- 0.3%     91.3ms +/- 0.3%     significant

  controlflow:         *1.009x as slow*    36.0ms +/- 0.2%     36.3ms +/- 0.4%     significant
    recursive:         *1.009x as slow*    36.0ms +/- 0.2%     36.3ms +/- 0.4%     significant

  crypto:              -                  160.1ms +/- 0.1%    160.2ms +/- 0.1% 
    aes:               *1.003x as slow*    61.1ms +/- 0.1%     61.2ms +/- 0.2%     significant
    md5:               -                   48.4ms +/- 0.3%     48.2ms +/- 0.2% 
    sha1:              -                   50.7ms +/- 0.3%     50.7ms +/- 0.3% 

  date:                1.003x as fast     193.3ms +/- 0.1%    192.7ms +/- 0.2%     significant
    format-tofte:      1.002x as fast     119.8ms +/- 0.1%    119.5ms +/- 0.3%     significant
    format-xparb:      1.004x as fast      73.4ms +/- 0.2%     73.1ms +/- 0.1%     significant

  math:                1.009x as fast     265.2ms +/- 0.1%    262.8ms +/- 0.1%     significant
    cordic:            *1.003x as slow*    86.9ms +/- 0.2%     87.1ms +/- 0.2%     significant
    partial-sums:      1.008x as fast     125.4ms +/- 0.1%    124.5ms +/- 0.2%     significant
    spectral-norm:     1.035x as fast      53.0ms +/- 0.2%     51.2ms +/- 0.2%     significant

  regexp:              *1.004x as slow*   213.9ms +/- 0.1%    214.8ms +/- 0.1%     significant
    dna:               *1.004x as slow*   213.9ms +/- 0.1%    214.8ms +/- 0.1%     significant

  string:              -                  562.5ms +/- 0.1%    562.6ms +/- 0.1% 
    base64:            -                   92.4ms +/- 0.3%     92.4ms +/- 0.2% 
    fasta:             1.006x as fast     117.1ms +/- 0.1%    116.4ms +/- 0.3%     significant
    tagcloud:          *1.008x as slow*   130.5ms +/- 0.3%    131.5ms +/- 0.2%     significant
    unpack-code:       ??                 132.5ms +/- 0.2%    132.8ms +/- 0.3%     not conclusive: might be *1.002x as slow*
    validate-input:    1.005x as fast      90.0ms +/- 0.3%     89.6ms +/- 0.3%     significant
Comment 16 Oliver Hunt 2008-05-15 21:57:21 PDT
Hmm, i see a 0.5% regression as well :-/
Comment 17 Oliver Hunt 2008-05-15 22:17:40 PDT
This is kind of dumb, string-base64 has the biggest regression on my system, yet according to the diff is unchanged... :-/
Comment 18 Oliver Hunt 2008-05-15 23:26:45 PDT
Created attachment 21194 [details]
Further fixeration

This patch improves upon the last by preventing unnecessary copying in cases like
local1.a = local2;
<expression> = local1.a = local2;

This gets it from a 0.5% regression to 0.4% on my system
Comment 19 Oliver Hunt 2008-05-16 00:22:35 PDT
whoops, my last patch introduced borkeration, working on a fix

Comment 20 Oliver Hunt 2008-05-16 00:38:33 PDT
Created attachment 21197 [details]
Fix that is not broken
Comment 21 Cameron Zwarich (cpst) 2008-05-16 14:15:40 PDT
The latest patch is a 1.1% regression for me, whereas the last patch is only a 0.5% regression. This doesn't make much sense, because it clearly has better codegen when I look at the diffs.
Comment 22 Cameron Zwarich (cpst) 2008-05-16 22:30:32 PDT
Created attachment 21209 [details]
Revised proposed patch

I managed to remove most of my regression by using ALWAYS_INLINE on the helper functions. It's about 0.3-0.4% for me right now. Any performance testing on other machines would be greatly appreciated.

The codegen is pretty much optimal given our current framework. I will try to run some tests and attempt to find determine whether the extra work during codegen is actually the cause of the performance regression.
Comment 23 Cameron Zwarich (cpst) 2008-05-17 01:42:08 PDT
Created attachment 21212 [details]
Revised proposed patch

I removed some RegisterID reference count thrashing and a useless test. This patch is now between a 0.0-0.1% regression on my machine. I don't think we can do much better.

I'll let you guys do performance testing while I finish the layout test.
Comment 24 Maciej Stachowiak 2008-05-17 02:44:09 PDT
Performance looks good to me. I got some slower times, some faster times, compared to without the patch.
Comment 25 Oliver Hunt 2008-05-17 03:26:17 PDT
Comment on attachment 21212 [details]
Revised proposed patch

The PassRefPtr stuff isn't the nicest thing i've ever seen, but i can't think of anything nicer.

This needs a testcase, and we should probably file individual bugs for the function.arguments, f(f=3), and for..in issues.
Comment 26 Cameron Zwarich (cpst) 2008-05-17 04:20:36 PDT
The patch Oliver reviewed was landed in r33552. There is one last case of this bug that is blocking SquirrelFish:

var a = 1;
a += (a += 1);

It was fixed by one of my earlier patches here, so I should just be able to pull out the fix.

I will make bugs for the other more obscure cases after this bug is no longer a blocker.
Comment 27 Cameron Zwarich (cpst) 2008-05-17 04:51:42 PDT
Created attachment 21214 [details]
Proposed patch for last case
Comment 28 Oliver Hunt 2008-05-17 05:00:16 PDT
Comment on attachment 21214 [details]
Proposed patch for last case

r- due to the valueOf issue i described in irc
Comment 29 Cameron Zwarich (cpst) 2008-05-17 05:49:02 PDT
(In reply to comment #28)
> (From update of attachment 21214 [details] [edit])
> r- due to the valueOf issue i described in irc

Neither of us can come up with a test case that actually shows why this is wrong right now. I actually think it is correct as it is in the patch.

I should also add tests for the Dot and Bracket cases.
Comment 30 Cameron Zwarich (cpst) 2008-05-17 21:45:41 PDT
Created attachment 21218 [details]
Revised proposed patch

This patch contains more tests, as well as a fix for another issue Oliver found.
Comment 31 Cameron Zwarich (cpst) 2008-05-17 21:56:23 PDT
Landed in r33562. The remaining corner cases we found will be made into separate bugs, but this one is closed.