Bug 154808 - NewRegexp should not prevent inlining
Summary: NewRegexp should not prevent inlining
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-02-28 21:49 PST by Filip Pizlo
Modified: 2016-08-25 00:05 PDT (History)
6 users (show)

See Also:


Attachments
Proposed Patch (10.57 KB, patch)
2016-08-20 01:05 PDT, Caio Lima
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews122 for ios-simulator-elcapitan-wk2 (690.74 KB, application/zip)
2016-08-20 03:26 PDT, Build Bot
no flags Details
Patch (13.67 KB, patch)
2016-08-22 09:19 PDT, Caio Lima
no flags Details | Formatted Diff | Diff
Patch (11.31 KB, patch)
2016-08-24 09:08 PDT, Caio Lima
no flags Details | Formatted Diff | Diff
Regexp Inline Benchmark 2 (78.91 KB, text/plain)
2016-08-24 23:11 PDT, Caio Lima
no flags Details
Regex inline benchmark 1 (77.62 KB, text/plain)
2016-08-24 23:12 PDT, Caio Lima
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Filip Pizlo 2016-02-28 21:49:09 PST
It currently prevents inlining because the Node uses indices into the CodeBlock.
Comment 1 Caio Lima 2016-08-19 12:05:44 PDT
Working on It. I already have an implementation to propose, however I am investigating if it is Safe because of Garbage Colector. I am going to post it tonight.
Comment 2 Caio Lima 2016-08-20 01:05:02 PDT
Created attachment 286537 [details]
Proposed Patch

This version that I propose. I am investigating if there is a possibility of callee CodeBock be GCed because if it happens, RegExp are potentially candidates to be collected and Mache Code Block are going to point o invalid RegExp.

The current test that I am using right now is:

```
function toInline(a) {
    var re = /cc+/;

    print("test ccc: "+re.test("ccc"));
    print("test abc: "+re.test("abc"));
    return 0;
}

function f(a) {
    toInline(a)
    var re = /(ab)+/;
    print("test ba: " + re.test("ba"));
    return 0;
}

noInline(f);

for (let i = 0; i < 100000; i++) {
    f(2);
}

gc();

print("GC Passed");

let a = Array(10000000);
for (let i = 0; i < 10000000; i++) {
    a[i] = {value: i};
}

print("Objects Allocated");
for (let i = 0; i < 100; i++) {
    f(4);
}
```

However it is possible write  more test codes with correct asserts.
This testing is working properly but I think it is worth more investigation.
Comment 3 Build Bot 2016-08-20 03:26:44 PDT
Comment on attachment 286537 [details]
Proposed Patch

Attachment 286537 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/1906777

New failing tests:
fast/scrolling/ios/scroll-events-back-forward.html
Comment 4 Build Bot 2016-08-20 03:26:47 PDT
Created attachment 286540 [details]
Archive of layout-test-results from ews122 for ios-simulator-elcapitan-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews122  Port: ios-simulator-elcapitan-wk2  Platform: Mac OS X 10.11.5
Comment 5 Caio Lima 2016-08-22 09:19:13 PDT
Created attachment 286601 [details]
Patch
Comment 6 Geoffrey Garen 2016-08-22 11:11:57 PDT
Comment on attachment 286601 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=286601&action=review

I think a better model might be for every CodeBlock always to copy its RegExp array out of its unlinked counterpart and into a unique vector. This will waste a little space, but we think it's rare, so it's probably acceptable. The nice thing about this change is that we won't need two different modes for accessing regexps.

If we do need two different modes to save space, I would call the first mode "unlinkedRegExp()" and the second mode "jitCodeRegExp()", and I would store the regexeps used only by optimizing compilers on the JITCode object.

> Source/JavaScriptCore/ChangeLog:9
> +        when it contains NewRegExp nodes. In current implementation, it is not

when the callee contains

In the current implementation, inlining

> Source/JavaScriptCore/ChangeLog:10
> +        possible because the RegExp references are being stored in their function's

are stored

> Source/JavaScriptCore/ChangeLog:12
> +        NewRegExp node is emmited with the RegExp's index in CodeBlock vector

emitted

in the CodeBlock vector

> Source/JavaScriptCore/ChangeLog:13
> +        as parameter to be retrived in futher phases of compilation

as a parameter

retrieved

> Source/JavaScriptCore/ChangeLog:16
> +        As solution, we are remmaping the RegExps from its callee inline canditate

As a solution, we remap each RegExp

candidate

> Source/JavaScriptCore/ChangeLog:17
> +        to its machine code block in inline phase into

in the inlining phase

> Source/JavaScriptCore/ChangeLog:24
> +        It is important notice that we created a new Vector because

to notice

> Source/JavaScriptCore/ChangeLog:27
> +        out of compilation thread.

from the compilation thread

> Source/JavaScriptCore/bytecode/CodeBlock.h:557
> +    unsigned addRegExpList(RegExp* regexp)

This function name sounds like you're adding a list of RegExps. I don't think it differs clearly enough from CodeBlock::regexp(). It is a little surprising that CodeBlock has two different RegExp stores, one that forwards to the unlinked code block and one that is local.

> Source/JavaScriptCore/bytecode/CodeBlock.h:900
> +        Vector<RegExp*> m_regExps;

Why is it OK to store raw pointers to RegExp without a write barrier operation? You've explained why we can't do write barrier operations on the compilation thread, but I don't think you've explained why it's OK just to skip them -- and I don't think it is OK.

> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:5401
> +        

Revert.

> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:5422
> +        

Revert.

> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:5428
> +    // We need to remap and copy RegExps from codeBlock->unlinkedCodeBlock() to
> +    // byteCodeParser->m_codeBlock because it allow us inline functions with NewRegExp.
> +    // nodes.

I don't fully understand this comment.
Comment 7 Caio Lima 2016-08-22 22:45:41 PDT
Comment on attachment 286601 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=286601&action=review

>> Source/JavaScriptCore/bytecode/CodeBlock.h:900
>> +        Vector<RegExp*> m_regExps;
> 
> Why is it OK to store raw pointers to RegExp without a write barrier operation? You've explained why we can't do write barrier operations on the compilation thread, but I don't think you've explained why it's OK just to skip them -- and I don't think it is OK.

I am using this Vector to store the RegExp from inlined functions and also store the RegExp from its own m_unlikedCode. In my current investigation and I tried to replicate it in JSTests/stress/new-regex-inline.js, and the RegExp pointer is valid while its UnlikedCode is not collected by GC. The WriteBarrie is strictly necessary in this case, otherwise it is not safe manipulate RegExp objects. So, In my understand,  we need to keep unlinkedCodeBlock->m_regexps [1], because it is the reason that RegExp* aren't collected by GC and remain valid in execution context. 

However, I agree with you that we can improve this implementation. I have the following changes in my mind:
1. remove all regExpList* methods I created;
2. Store RegEx from inlined functions in CodeBlock->m_rareData->m_inlinedRegexps as Vector<RegExp*> (consider it as a Weak pointer for "GC eyes"); 
3. In regex() check if "index < m_unlinkedCode->numberOfRegExps()" and if it is false, get the element in  CodeBlock->m_rareData->m_inlinedRegexps (we avoid waste space);

To 2. be valid, what I need to be sure is if (Inlined Function CodeBlock)->m_unlikedCode->regexps [1] is not collected in GC phase. I accept help here to be 100% sure that it is a valid implementation. I am trying to debug GC with JSTests/stress/new-regex-inline.js and also --logGC=2.

It is extremely helpful if you know a way to check if (Inlined Function CodeBlock)->m_unlikedCode is collected in "gc()" and I would be happy learning it. My Assumption is that if UnlinkedCode is not collected, m_regexps are pointed to them and won't be collected too. Does it make sense to you?

[1] - https://github.com/WebKit/webkit/blob/e9acddff34511e7f3e184b6e6f34897027e8b61e/Source/JavaScriptCore/bytecode/UnlinkedCodeBlock.h#L452

>> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:5428
>> +    // nodes.
> 
> I don't fully understand this comment.

I think it is confusing. What I am trying to say here is that we are mapping regExps from callee into machine code block to enable inlining, and we are also "mapping"  machine code block's own RegExps into m_regexpRemap (but it is an identity mapping m_regexpRemap[i] == byteCodeParser->m_codeBlock->m_unlinkedCodeBlock->m_regexps[i]).

Since we are planing change this implementation, it is going to be Out of Date.
Comment 8 Filip Pizlo 2016-08-22 23:41:40 PDT
Comment on attachment 286601 [details]
Patch

You shouldn't be adding any new fields to CodeBlock. 

The DFG already has a mechanism for strongly marking objects known to the compiler such as the RegExps. It's called freezing: just do m_graph.freezeStrongly(...). 

You don't need to record the RegExp object anywhere other than the NewRegExp node. That node should not have an index into any vectors. It should just hold a pointer to RegExp. We already do this for other node types. See for example Node::cellOperand(). Notice that its already compatible with freezing.
Comment 9 Caio Lima 2016-08-23 11:29:43 PDT
(In reply to comment #8)
> Comment on attachment 286601 [details]
> Patch
> 
> You shouldn't be adding any new fields to CodeBlock. 
> 
> The DFG already has a mechanism for strongly marking objects known to the
> compiler such as the RegExps. It's called freezing: just do
> m_graph.freezeStrongly(...). 
> 
> You don't need to record the RegExp object anywhere other than the NewRegExp
> node. That node should not have an index into any vectors. It should just
> hold a pointer to RegExp. We already do this for other node types. See for
> example Node::cellOperand(). Notice that its already compatible with
> freezing.

Thank you for this info. I really didn't know about that. I took a look in code using the freezing and it looks simpler and more correct than my current patch. Just for curiosity, as I understood, this Freeze mechanism interact with the GC, right? If yes, How does it happen?
Comment 10 Filip Pizlo 2016-08-23 11:54:45 PDT
(In reply to comment #9)
> (In reply to comment #8)
> > Comment on attachment 286601 [details]
> > Patch
> > 
> > You shouldn't be adding any new fields to CodeBlock. 
> > 
> > The DFG already has a mechanism for strongly marking objects known to the
> > compiler such as the RegExps. It's called freezing: just do
> > m_graph.freezeStrongly(...). 
> > 
> > You don't need to record the RegExp object anywhere other than the NewRegExp
> > node. That node should not have an index into any vectors. It should just
> > hold a pointer to RegExp. We already do this for other node types. See for
> > example Node::cellOperand(). Notice that its already compatible with
> > freezing.
> 
> Thank you for this info. I really didn't know about that. I took a look in
> code using the freezing and it looks simpler and more correct than my
> current patch. Just for curiosity, as I understood, this Freeze mechanism
> interact with the GC, right? If yes, How does it happen?

There's a lot of logic there:

- Freezing causes the resulting CodeBlock to have either a strong or weak reference to the frozen object, depending on whether you froze it strongly or weakly.

- Freezing immediately causes the ongoing DFG compilation plan to track the reference if a GC happens while the compiler is running.  The GC knows how to safepoint the compiler.  This means that from the compiler's standpoint, GCs can only happen at well-defined points: either before the compiler started, during B3 compilation, or after the compiler finished.

There are a lot of other details; I can't remember all of them off the top of my head.  The short version is just: we already use freezing a lot, and it's designed exactly for what you want: you have some object that the generated code will refer to and you want to make sure that this object gets marked.
Comment 11 Caio Lima 2016-08-23 15:40:26 PDT
(In reply to comment #10)
> 
> There's a lot of logic there:
> 
> - Freezing causes the resulting CodeBlock to have either a strong or weak
> reference to the frozen object, depending on whether you froze it strongly
> or weakly.
> 
> - Freezing immediately causes the ongoing DFG compilation plan to track the
> reference if a GC happens while the compiler is running.  The GC knows how
> to safepoint the compiler.  This means that from the compiler's standpoint,
> GCs can only happen at well-defined points: either before the compiler
> started, during B3 compilation, or after the compiler finished.
> 
> There are a lot of other details; I can't remember all of them off the top
> of my head.  The short version is just: we already use freezing a lot, and
> it's designed exactly for what you want: you have some object that the
> generated code will refer to and you want to make sure that this object gets
> marked.


Thank You so much.
Comment 12 Caio Lima 2016-08-24 09:08:09 PDT
Created attachment 286856 [details]
Patch
Comment 13 Caio Lima 2016-08-24 09:45:22 PDT
I am going to run benchmarks and compare to check for any improvement/regress this Patch can cause.
Comment 14 Geoffrey Garen 2016-08-24 14:26:31 PDT
Comment on attachment 286856 [details]
Patch

Looks much better.

r=me
Comment 15 Caio Lima 2016-08-24 23:11:32 PDT
Created attachment 286948 [details]
Regexp Inline Benchmark 2
Comment 16 Caio Lima 2016-08-24 23:12:02 PDT
Created attachment 286949 [details]
Regex inline benchmark 1
Comment 17 Caio Lima 2016-08-24 23:14:57 PDT
Benchmarks are showing possible ~0.63% regression in general results of Benchmark 2.

We may be interested in these results:

1. might be 1.1097x faster" in Octane
2. V8Spider  regex - neutral
3. Sunspider regexp-dna - might be 1.3842x slower
4. ~11 regressions with ~2% of regression

In benchmark 1, Sunspider regex-dna is resulting in "might be 1.0296x slower"

I think that this Patch seems neutral in with some noise in the second run.
Comment 18 WebKit Commit Bot 2016-08-24 23:16:17 PDT
Comment on attachment 286856 [details]
Patch

Rejecting attachment 286856 [details] from commit-queue.

ticaiolima@gmail.com does not have committer permissions according to http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/common/config/contributors.json.

- If you do not have committer rights please read http://webkit.org/coding/contributing.html for instructions on how to use bugzilla flags.

- If you have committer rights please correct the error in Tools/Scripts/webkitpy/common/config/contributors.json by adding yourself to the file (no review needed).  The commit-queue restarts itself every 2 hours.  After restart the commit-queue will correctly respect your committer rights.
Comment 19 Caio Lima 2016-08-24 23:17:41 PDT
(In reply to comment #18)
> Comment on attachment 286856 [details]
> Patch
> 
> Rejecting attachment 286856 [details] from commit-queue.
> 
> ticaiolima@gmail.com does not have committer permissions according to
> http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/common/config/
> contributors.json.
> 
> - If you do not have committer rights please read
> http://webkit.org/coding/contributing.html for instructions on how to use
> bugzilla flags.
> 
> - If you have committer rights please correct the error in
> Tools/Scripts/webkitpy/common/config/contributors.json by adding yourself to
> the file (no review needed).  The commit-queue restarts itself every 2
> hours.  After restart the commit-queue will correctly respect your committer
> rights.

I can't cq+ =).
Comment 20 WebKit Commit Bot 2016-08-25 00:05:47 PDT
Comment on attachment 286856 [details]
Patch

Clearing flags on attachment: 286856

Committed r204958: <http://trac.webkit.org/changeset/204958>
Comment 21 WebKit Commit Bot 2016-08-25 00:05:52 PDT
All reviewed patches have been landed.  Closing bug.