Bug 79776 - Parser: Inline ScopeNodeData into ScopeNode
Summary: Parser: Inline ScopeNodeData into ScopeNode
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Andy Wingo
URL:
Keywords:
Depends on:
Blocks: 79112
  Show dependency treegraph
 
Reported: 2012-02-28 03:46 PST by Andy Wingo
Modified: 2012-03-07 02:19 PST (History)
6 users (show)

See Also:


Attachments
Patch (12.64 KB, patch)
2012-02-28 04:00 PST, Andy Wingo
no flags Details | Formatted Diff | Diff
Patch (12.72 KB, patch)
2012-03-06 02:46 PST, Andy Wingo
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Andy Wingo 2012-02-28 03:46:39 PST
The patch to be attached is a small refactor to the parser.
Comment 1 Andy Wingo 2012-02-28 04:00:06 PST
Created attachment 129227 [details]
Patch
Comment 2 Andy Wingo 2012-02-28 04:07:14 PST
Sunspider:


TEST                   COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:           ??                156.3ms +/- 0.8%   157.0ms +/- 1.3%     not conclusive: might be *1.004x as slow*

=============================================================================

  3d:                  ??                 23.8ms +/- 3.9%    24.9ms +/- 5.8%     not conclusive: might be *1.044x as slow*
    cube:              ??                  6.9ms +/- 7.8%     8.0ms +/- 19.2%     not conclusive: might be *1.152x as slow*
    morph:             -                   8.8ms +/- 5.2%     8.7ms +/- 3.7% 
    raytrace:          ??                  8.0ms +/- 1.5%     8.1ms +/- 2.7%     not conclusive: might be *1.014x as slow*

  access:              1.038x as fast     14.5ms +/- 1.1%    13.9ms +/- 0.8%     significant
    binary-trees:      1.078x as fast      2.2ms +/- 5.4%     2.0ms +/- 3.3%     significant
    fannkuch:          1.023x as fast      6.2ms +/- 1.3%     6.0ms +/- 0.7%     significant
    nbody:             1.062x as fast      3.4ms +/- 4.4%     3.2ms +/- 2.1%     significant
    nsieve:            -                   2.7ms +/- 1.2%     2.7ms +/- 0.7% 

  bitops:              -                  11.9ms +/- 1.7%    11.8ms +/- 0.3% 
    3bit-bits-in-byte: -                   1.4ms +/- 7.8%     1.3ms +/- 0.7% 
    bits-in-byte:      ??                  2.2ms +/- 1.4%     2.3ms +/- 1.3%     not conclusive: might be *1.009x as slow*
    bitwise-and:       -                   3.2ms +/- 2.4%     3.2ms +/- 0.6% 
    nsieve-bits:       -                   5.1ms +/- 3.5%     5.0ms +/- 0.3% 

  controlflow:         ??                  1.9ms +/- 0.9%     1.9ms +/- 3.6%     not conclusive: might be *1.027x as slow*
    recursive:         ??                  1.9ms +/- 0.9%     1.9ms +/- 3.6%     not conclusive: might be *1.027x as slow*

  crypto:              *1.018x as slow*   12.1ms +/- 0.4%    12.3ms +/- 1.3%     significant
    aes:               *1.029x as slow*    7.4ms +/- 0.3%     7.6ms +/- 1.8%     significant
    md5:               ??                  2.5ms +/- 1.3%     2.5ms +/- 1.2%     not conclusive: might be *1.008x as slow*
    sha1:              -                   2.2ms +/- 2.2%     2.2ms +/- 1.4% 

  date:                ??                 19.0ms +/- 1.1%    19.1ms +/- 1.1%     not conclusive: might be *1.002x as slow*
    format-tofte:      ??                 10.2ms +/- 1.5%    10.2ms +/- 0.8%     not conclusive: might be *1.002x as slow*
    format-xparb:      ??                  8.9ms +/- 1.8%     8.9ms +/- 1.7%     not conclusive: might be *1.002x as slow*

  math:                ??                 16.9ms +/- 1.7%    17.1ms +/- 1.7%     not conclusive: might be *1.012x as slow*
    cordic:            ??                  6.1ms +/- 1.8%     6.2ms +/- 2.4%     not conclusive: might be *1.015x as slow*
    partial-sums:      ??                  8.5ms +/- 1.9%     8.5ms +/- 1.8%     not conclusive: might be *1.010x as slow*
    spectral-norm:     ??                  2.3ms +/- 2.1%     2.4ms +/- 4.1%     not conclusive: might be *1.013x as slow*

  regexp:              -                   8.3ms +/- 1.8%     8.3ms +/- 1.4% 
    dna:               -                   8.3ms +/- 1.8%     8.3ms +/- 1.4% 

  string:              -                  47.9ms +/- 0.9%    47.6ms +/- 0.3% 
    base64:            1.044x as fast      4.7ms +/- 3.3%     4.5ms +/- 0.8%     significant
    fasta:             -                   6.4ms +/- 3.0%     6.4ms +/- 1.1% 
    tagcloud:          ??                 11.2ms +/- 1.1%    11.3ms +/- 1.3%     not conclusive: might be *1.007x as slow*
    unpack-code:       ??                 19.7ms +/- 1.1%    19.7ms +/- 0.6%     not conclusive: might be *1.003x as slow*
    validate-input:    1.025x as fast      6.0ms +/- 1.7%     5.8ms +/- 0.5%     significant


Parsing:

From 0m6.214s to 0m6.286s (1.1% slowdown)

Tested by running the following program:
    for (var i=0; i<1000; i++) checkSyntax("tests/parse-only/concat-jquery-mootools-prototype.js");
Taking a minimum of 10 run times.
Comment 3 Andy Wingo 2012-02-28 04:25:15 PST
So, regarding the remaining minor performance regression, I have a couple patches that can make it faster than the baseline case, but they are a bit iffy, so I'd like some feedback.

If I apply this one:

diff --git a/Source/JavaScriptCore/wtf/Vector.h b/Source/JavaScriptCore/wtf/Vector.h
index 29bbd37..351d3e8 100644
--- a/Source/JavaScriptCore/wtf/Vector.h
+++ b/Source/JavaScriptCore/wtf/Vector.h
@@ -281,11 +281,13 @@ namespace WTF {
 
         void deallocateBuffer(T* bufferToDeallocate)
         {
-            if (m_buffer == bufferToDeallocate) {
-                m_buffer = 0;
-                m_capacity = 0;
+            if (bufferToDeallocate) {
+                if (m_buffer == bufferToDeallocate) {
+                    m_buffer = 0;
+                    m_capacity = 0;
+                }
+                fastFree(bufferToDeallocate);
             }
-            fastFree(bufferToDeallocate);
         }
 
         T* buffer() { return m_buffer; }

Then I reach a minimum parse time of 6.220s, which brings the speed to the same as trunk.  The idea is to do less if the vector's buffer was never allocated.

Secondly if I apply this one, on top of that:


diff --git a/Source/JavaScriptCore/wtf/HashTable.h b/Source/JavaScriptCore/wtf/HashTable.h
index cbcc098..a3afd9a 100644
--- a/Source/JavaScriptCore/wtf/HashTable.h
+++ b/Source/JavaScriptCore/wtf/HashTable.h
@@ -310,7 +310,8 @@ namespace WTF {
         ~HashTable() 
         {
             invalidateIterators(); 
-            deallocateTable(m_table, m_tableSize); 
+            if (m_table)
+                deallocateTable(m_table, m_tableSize);
 #if CHECK_HASHTABLE_USE_AFTER_DESTRUCTION
             m_table = (ValueType*)(uintptr_t)0xbbadbeef;
 #endif
@@ -979,11 +980,13 @@ namespace WTF {
     void HashTable<Key, Value, Extractor, HashFunctions, Traits, KeyTraits>::clear()
     {
         invalidateIterators();
-        deallocateTable(m_table, m_tableSize);
-        m_table = 0;
-        m_tableSize = 0;
-        m_tableSizeMask = 0;
-        m_keyCount = 0;
+        if (m_table) {
+            deallocateTable(m_table, m_tableSize);
+            m_table = 0;
+            m_tableSize = 0;
+            m_tableSizeMask = 0;
+            m_keyCount = 0;
+        }
     }
 
     template<typename Key, typename Value, typename Extractor, typename HashFunctions, typename Traits, typename KeyTraits>

then the minimum parse time goes down to 6.181s, which is a 0.5% speedup relative to trunk.  Again, same idea: make it cheaper to clean up hash tables whose buffers are never allocated.

I don't know what other impact this has.  JSC + attachment 129277 [details] + these patches is neutral on sunspider, relative to trunk (within 0.1 ms).
Comment 4 Darin Adler 2012-02-28 12:32:04 PST
Optimizing the deletion of empty vectors and hash tables seems like a fine change. Maybe we could land that separately/first?
Comment 5 Andy Wingo 2012-02-29 07:02:07 PST
(In reply to comment #4)
> Optimizing the deletion of empty vectors and hash tables seems like a fine change. Maybe we could land that separately/first?

Sure.  See bug 79903.
Comment 6 Geoffrey Garen 2012-02-29 10:25:08 PST
Comment on attachment 129227 [details]
Patch

Looks like this needs to be refactored to account for the patch in bug 79903.
Comment 7 Andy Wingo 2012-02-29 11:31:53 PST
(In reply to comment #6)
> (From update of attachment 129227 [details])
> Looks like this needs to be refactored to account for the patch in bug 79903.

For what reason?  Attachment 129227 [details] does not include the patches from comment 3 or bug 79903.

(Not trying to contradict, just seeking clarity :)
Comment 8 Geoffrey Garen 2012-02-29 11:45:10 PST
Comment on attachment 129227 [details]
Patch

Oops! My mistake.
Comment 9 Andy Wingo 2012-03-05 06:00:05 PST
Anyone willing to review this one?  Geoffrey perhaps?
Comment 10 Geoffrey Garen 2012-03-05 16:12:39 PST
Comment on attachment 129227 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=129227&action=review

> Source/JavaScriptCore/parser/Nodes.cpp:91
> +    , m_numConstants(numConstants)
> +    , m_statements(children)

Seems problematic to leave these data members uninitialized in the other ScopeNode constructor.
Comment 11 Andy Wingo 2012-03-06 02:46:24 PST
Created attachment 130345 [details]
Patch
Comment 12 Andy Wingo 2012-03-06 02:53:06 PST
(In reply to comment #10)
> (From update of attachment 129227 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=129227&action=review
> 
> > Source/JavaScriptCore/parser/Nodes.cpp:91
> > +    , m_numConstants(numConstants)
> > +    , m_statements(children)
> 
> Seems problematic to leave these data members uninitialized in the other ScopeNode constructor.

Good catch; fixed.

Relative to current trunk, this patch is performance-neutral:

$ cat test.js
for (var i = 0; i < 1000; i++)
  checkSyntax("/home/wingo/src/WebKit/PerformanceTests/SunSpider/tests/parse-only/concat-jquery-mootools-prototype.js");

Trunk:
$ time ~/src/WebKit-trunk/WebKitBuild/Release/Programs/jsc test.js
real	0m6.075s
user	0m5.856s
sys	0m0.188s

This patch:
$ time ~/src/WebKit/WebKitBuild/Release/Programs/jsc test.js
real	0m6.046s
user	0m5.776s
sys	0m0.232s

Minimum over 10 runs each.
Comment 13 Geoffrey Garen 2012-03-06 17:19:18 PST
Comment on attachment 130345 [details]
Patch

r=me
Comment 14 Andy Wingo 2012-03-07 01:46:51 PST
Comment on attachment 130345 [details]
Patch

Thanks!
Comment 15 WebKit Review Bot 2012-03-07 02:19:00 PST
Comment on attachment 130345 [details]
Patch

Clearing flags on attachment: 130345

Committed r110039: <http://trac.webkit.org/changeset/110039>
Comment 16 WebKit Review Bot 2012-03-07 02:19:05 PST
All reviewed patches have been landed.  Closing bug.