Bug 160921 - TryGetById should have a ValueProfile so that it can predict its output type
Summary: TryGetById should have a ValueProfile so that it can predict its output type
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: All All
: P2 Normal
Assignee: JF Bastien
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-08-16 16:58 PDT by Filip Pizlo
Modified: 2016-08-25 15:13 PDT (History)
8 users (show)

See Also:


Attachments
mostly-but-not-quite-ready patch (17.00 KB, patch)
2016-08-23 23:38 PDT, JF Bastien
buildbot: commit-queue-
Details | Formatted Diff | Diff
Archive of layout-test-results from ews115 for mac-yosemite (2.14 MB, application/zip)
2016-08-24 01:06 PDT, Build Bot
no flags Details
Updated patch, still WIP on tests (19.10 KB, patch)
2016-08-24 16:33 PDT, JF Bastien
saam: review+
Details | Formatted Diff | Diff
patch (20.75 KB, patch)
2016-08-24 21:55 PDT, JF Bastien
saam: review+
Details | Formatted Diff | Diff
patch (20.61 KB, patch)
2016-08-25 07:54 PDT, JF Bastien
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Filip Pizlo 2016-08-16 16:58:16 PDT
...
Comment 1 JF Bastien 2016-08-23 23:38:40 PDT
Created attachment 286837 [details]
mostly-but-not-quite-ready patch

This isn't ready to commit yet. Missing items:
 - Move the tests out of regress and into the new microbenchmark folder (these were in the process of moving today, see https://bugs.webkit.org/show_bug.cgi?id=161096).
 - Write regex tests (especially when ES6's overwrite of regex is used).
 - File bugs for FIXME.
 - File and fix check-webkit-style nits (its regexes need to be taught about macros).
Comment 2 WebKit Commit Bot 2016-08-23 23:40:29 PDT
Attachment 286837 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1214:  Else clause should never be on same line as else (use 2 lines)  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1215:  Should be indented on a separate line, with the colon or comma first on that line.  [whitespace/indent] [4]
Total errors found: 2 in 14 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Build Bot 2016-08-24 01:06:02 PDT
Comment on attachment 286837 [details]
mostly-but-not-quite-ready patch

Attachment 286837 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/1932695

New failing tests:
js/dom/JSON-parse.html
jquery/manipulation.html
js/promises-tests/promises-tests-2-3-4.html
inspector/debugger/tail-recursion.html
js/dom/JSON-stringify.html
imported/w3c/web-platform-tests/html/semantics/forms/the-input-element/type-change-state.html
jquery/attributes.html
js/promises-tests/promises-tests-2-2-7.html
inspector/debugger/tail-deleted-frames-from-vm-entry.html
jquery/event.html
Comment 4 Build Bot 2016-08-24 01:06:05 PDT
Created attachment 286842 [details]
Archive of layout-test-results from ews115 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews115  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 5 Geoffrey Garen 2016-08-24 10:59:49 PDT
Comment on attachment 286837 [details]
mostly-but-not-quite-ready patch

View in context: https://bugs.webkit.org/attachment.cgi?id=286837&action=review

> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1222
> +#define REAL_CONCAT(a, b) a ## b
> +
> +#define CONCAT(a, b) REAL_CONCAT(a, b)
> +
>  #define NEXT_OPCODE(name) \
> -    m_currentIndex += OPCODE_LENGTH(name); \
> -    continue
> +    if (true) { \
> +        m_currentIndex += OPCODE_LENGTH(name); \
> +        goto CONCAT(NEXT_OPCODE_, __LINE__); \
> +    } else \
> +    CONCAT(NEXT_OPCODE_, __LINE__): \
> +        continue
>  
>  #define LAST_OPCODE(name) \
> -    m_currentIndex += OPCODE_LENGTH(name); \
> -    m_exitOK = false; \
> -    return shouldContinueParsing
> +    return \
> +        m_currentIndex += OPCODE_LENGTH(name), \
> +        m_exitOK = false, \
> +        shouldContinueParsing

What's the purpose here?

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:4161
> +    case GetById: { // FIXME dedup with SpeculativeJIT::compileTryGetById and 64-bit version of this?

It's good to ask this question during patch development -- but I think the best thing is to decide if you think we should dedup, and if you think so, record the task in the bug tracker instead of in a comment.
Comment 6 JF Bastien 2016-08-24 13:33:42 PDT
Comment on attachment 286837 [details]
mostly-but-not-quite-ready patch

View in context: https://bugs.webkit.org/attachment.cgi?id=286837&action=review

>> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1222
>> +        shouldContinueParsing
> 
> What's the purpose here?

I was wondering who would ask first :-)

I started off trying to use NEXT_OPCODE and needing to switch on try_get_by_id or not. You can't do that in a ternary NEXT_OPCODE(op == TGBID ? TGBID : GBID) because NEXT_OPCODE calls OPCODE_LENGTH which preprocessor-concats. I then wrote the code you see below:

            if (op_try_get_by_id == opcodeID)
                NEXT_OPCODE(op_try_get_by_id); // Opcode's length is different from others in this case.
            else
                NEXT_OPCODE(op_get_by_id);

This didn't work because NEXT_OPCODE is two statements, which expands to a broken if / else.

The usual fix for this is the following pattern:

#define FOO() do { \
  ...; \
  ...; \
} while (false)

This behaves statement-like, is ended with a semicolon, and doesn't break unbraced if / else. However it doesn't work in this instance because NEXT_OPCODE uses `continue` which doesn't interact well with the dummy `do while (false)`.

The similar but less-often used idiom to fix this is what I have for NEXT_OPCODE:
#define FOO() \
  if (true) { \
    ...; \
    ...; \
    goto other; \
  } else \
  other: \
    continue

This behaves the same (statement-like, semicolon end, works with unbraced if / else) but allows the macro to contain a `continue`. It has the neat / horrible C-macro-ism of `if (true)` where both the true and false sides of the branch are taken because of the `goto`. The CONCAT bit is needed because the goto label has to be unique to the file.

That allows me to write what seems like the intuitive NEXT_OPCODE usage, and hides the ugly details in the macro. I then updated all other places in the file which needlessly braced their if / else for NEXT_OPCODE and LAST_OPCODE, which proceeded to break uses of LAST_OPCODE. That one doesn't need the `do while (false)` idiom to be fixed because it happens to only contain expressions followed by a return, which can be chained using the comma operator.

I half expect R- "just use braces" for this code, but I love this ugly idiom and don't get to use it often. It does hide complexity in one place too, which is its one technical merit.

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:4161
>> +    case GetById: { // FIXME dedup with SpeculativeJIT::compileTryGetById and 64-bit version of this?
> 
> It's good to ask this question during patch development -- but I think the best thing is to decide if you think we should dedup, and if you think so, record the task in the bug tracker instead of in a comment.

Filed: https://bugs.webkit.org/show_bug.cgi?id=161158
Will update the FIXME.
Comment 7 Filip Pizlo 2016-08-24 13:38:58 PDT
Comment on attachment 286837 [details]
mostly-but-not-quite-ready patch

View in context: https://bugs.webkit.org/attachment.cgi?id=286837&action=review

>>> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1222
>>> +        shouldContinueParsing
>> 
>> What's the purpose here?
> 
> I was wondering who would ask first :-)
> 
> I started off trying to use NEXT_OPCODE and needing to switch on try_get_by_id or not. You can't do that in a ternary NEXT_OPCODE(op == TGBID ? TGBID : GBID) because NEXT_OPCODE calls OPCODE_LENGTH which preprocessor-concats. I then wrote the code you see below:
> 
>             if (op_try_get_by_id == opcodeID)
>                 NEXT_OPCODE(op_try_get_by_id); // Opcode's length is different from others in this case.
>             else
>                 NEXT_OPCODE(op_get_by_id);
> 
> This didn't work because NEXT_OPCODE is two statements, which expands to a broken if / else.
> 
> The usual fix for this is the following pattern:
> 
> #define FOO() do { \
>   ...; \
>   ...; \
> } while (false)
> 
> This behaves statement-like, is ended with a semicolon, and doesn't break unbraced if / else. However it doesn't work in this instance because NEXT_OPCODE uses `continue` which doesn't interact well with the dummy `do while (false)`.
> 
> The similar but less-often used idiom to fix this is what I have for NEXT_OPCODE:
> #define FOO() \
>   if (true) { \
>     ...; \
>     ...; \
>     goto other; \
>   } else \
>   other: \
>     continue
> 
> This behaves the same (statement-like, semicolon end, works with unbraced if / else) but allows the macro to contain a `continue`. It has the neat / horrible C-macro-ism of `if (true)` where both the true and false sides of the branch are taken because of the `goto`. The CONCAT bit is needed because the goto label has to be unique to the file.
> 
> That allows me to write what seems like the intuitive NEXT_OPCODE usage, and hides the ugly details in the macro. I then updated all other places in the file which needlessly braced their if / else for NEXT_OPCODE and LAST_OPCODE, which proceeded to break uses of LAST_OPCODE. That one doesn't need the `do while (false)` idiom to be fixed because it happens to only contain expressions followed by a return, which can be chained using the comma operator.
> 
> I half expect R- "just use braces" for this code, but I love this ugly idiom and don't get to use it often. It does hide complexity in one place too, which is its one technical merit.

This is pretty neat.  Does the dummy if/else really make NEXT_OPCODE behave fully like a statement, the same way that do { ... } while(0) does?
Comment 8 JF Bastien 2016-08-24 13:42:37 PDT
Comment on attachment 286837 [details]
mostly-but-not-quite-ready patch

View in context: https://bugs.webkit.org/attachment.cgi?id=286837&action=review

>>>> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1222
>>>> +        shouldContinueParsing
>>> 
>>> What's the purpose here?
>> 
>> I was wondering who would ask first :-)
>> 
>> I started off trying to use NEXT_OPCODE and needing to switch on try_get_by_id or not. You can't do that in a ternary NEXT_OPCODE(op == TGBID ? TGBID : GBID) because NEXT_OPCODE calls OPCODE_LENGTH which preprocessor-concats. I then wrote the code you see below:
>> 
>>             if (op_try_get_by_id == opcodeID)
>>                 NEXT_OPCODE(op_try_get_by_id); // Opcode's length is different from others in this case.
>>             else
>>                 NEXT_OPCODE(op_get_by_id);
>> 
>> This didn't work because NEXT_OPCODE is two statements, which expands to a broken if / else.
>> 
>> The usual fix for this is the following pattern:
>> 
>> #define FOO() do { \
>>   ...; \
>>   ...; \
>> } while (false)
>> 
>> This behaves statement-like, is ended with a semicolon, and doesn't break unbraced if / else. However it doesn't work in this instance because NEXT_OPCODE uses `continue` which doesn't interact well with the dummy `do while (false)`.
>> 
>> The similar but less-often used idiom to fix this is what I have for NEXT_OPCODE:
>> #define FOO() \
>>   if (true) { \
>>     ...; \
>>     ...; \
>>     goto other; \
>>   } else \
>>   other: \
>>     continue
>> 
>> This behaves the same (statement-like, semicolon end, works with unbraced if / else) but allows the macro to contain a `continue`. It has the neat / horrible C-macro-ism of `if (true)` where both the true and false sides of the branch are taken because of the `goto`. The CONCAT bit is needed because the goto label has to be unique to the file.
>> 
>> That allows me to write what seems like the intuitive NEXT_OPCODE usage, and hides the ugly details in the macro. I then updated all other places in the file which needlessly braced their if / else for NEXT_OPCODE and LAST_OPCODE, which proceeded to break uses of LAST_OPCODE. That one doesn't need the `do while (false)` idiom to be fixed because it happens to only contain expressions followed by a return, which can be chained using the comma operator.
>> 
>> I half expect R- "just use braces" for this code, but I love this ugly idiom and don't get to use it often. It does hide complexity in one place too, which is its one technical merit.
> 
> This is pretty neat.  Does the dummy if/else really make NEXT_OPCODE behave fully like a statement, the same way that do { ... } while(0) does?

Yes, the one caveat is that you can't write two of them on the same line because of __LINE__. You could use the non-standard __COUNTER__ to fix this.
Comment 9 Filip Pizlo 2016-08-24 13:51:19 PDT
(In reply to comment #8)
> Comment on attachment 286837 [details]
> mostly-but-not-quite-ready patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=286837&action=review
> 
> >>>> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1222
> >>>> +        shouldContinueParsing
> >>> 
> >>> What's the purpose here?
> >> 
> >> I was wondering who would ask first :-)
> >> 
> >> I started off trying to use NEXT_OPCODE and needing to switch on try_get_by_id or not. You can't do that in a ternary NEXT_OPCODE(op == TGBID ? TGBID : GBID) because NEXT_OPCODE calls OPCODE_LENGTH which preprocessor-concats. I then wrote the code you see below:
> >> 
> >>             if (op_try_get_by_id == opcodeID)
> >>                 NEXT_OPCODE(op_try_get_by_id); // Opcode's length is different from others in this case.
> >>             else
> >>                 NEXT_OPCODE(op_get_by_id);
> >> 
> >> This didn't work because NEXT_OPCODE is two statements, which expands to a broken if / else.
> >> 
> >> The usual fix for this is the following pattern:
> >> 
> >> #define FOO() do { \
> >>   ...; \
> >>   ...; \
> >> } while (false)
> >> 
> >> This behaves statement-like, is ended with a semicolon, and doesn't break unbraced if / else. However it doesn't work in this instance because NEXT_OPCODE uses `continue` which doesn't interact well with the dummy `do while (false)`.
> >> 
> >> The similar but less-often used idiom to fix this is what I have for NEXT_OPCODE:
> >> #define FOO() \
> >>   if (true) { \
> >>     ...; \
> >>     ...; \
> >>     goto other; \
> >>   } else \
> >>   other: \
> >>     continue
> >> 
> >> This behaves the same (statement-like, semicolon end, works with unbraced if / else) but allows the macro to contain a `continue`. It has the neat / horrible C-macro-ism of `if (true)` where both the true and false sides of the branch are taken because of the `goto`. The CONCAT bit is needed because the goto label has to be unique to the file.
> >> 
> >> That allows me to write what seems like the intuitive NEXT_OPCODE usage, and hides the ugly details in the macro. I then updated all other places in the file which needlessly braced their if / else for NEXT_OPCODE and LAST_OPCODE, which proceeded to break uses of LAST_OPCODE. That one doesn't need the `do while (false)` idiom to be fixed because it happens to only contain expressions followed by a return, which can be chained using the comma operator.
> >> 
> >> I half expect R- "just use braces" for this code, but I love this ugly idiom and don't get to use it often. It does hide complexity in one place too, which is its one technical merit.
> > 
> > This is pretty neat.  Does the dummy if/else really make NEXT_OPCODE behave fully like a statement, the same way that do { ... } while(0) does?
> 
> Yes, the one caveat is that you can't write two of them on the same line
> because of __LINE__. You could use the non-standard __COUNTER__ to fix this.

Yup.  This sounds good to me.
Comment 10 JF Bastien 2016-08-24 16:33:21 PDT
Created attachment 286907 [details]
Updated patch, still WIP on tests

I updated the patch after offline comments and fixing the assertion that I'd missed. I'll update the tests, and see if the bots turn up other failures.
Comment 11 WebKit Commit Bot 2016-08-24 16:35:47 PDT
Attachment 286907 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1207:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1210:  WTF_CONCAT is incorrectly named. Don't use underscores in your identifier names.  [readability/naming/underscores] [4]
ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1211:  Else clause should never be on same line as else (use 2 lines)  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1212:  Should be indented on a separate line, with the colon or comma first on that line.  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1217:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1218:  When wrapping a line, only indent 4 spaces.  [whitespace/indent] [3]
Total errors found: 6 in 17 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 12 Saam Barati 2016-08-24 16:52:27 PDT
Comment on attachment 286907 [details]
Updated patch, still WIP on tests

View in context: https://bugs.webkit.org/attachment.cgi?id=286907&action=review

r=me with some comments

> JSTests/stress/tryGetById-basic.js:2
> +// FIXME Test regex path when overwriting.
> +// FIXME move to the new benchmark folder.

You should do this.

> JSTests/stress/tryGetById-basic.js:10
> +    var start = Date.now();

You should remove the date code and the commented out print.

> JSTests/stress/tryGetById-basic.js:18
> +bench(() => { var res = 0; for (var i = 0; i < 1e6; ++i) res += fooPlusBar(o); if (res != 1379000000) throw "unexpected result"; });

It might be worth creating a new builtin for each bench run.

> JSTests/stress/tryGetById-too-much.js:9
> +    var start = Date.now();

ditto

> Source/JavaScriptCore/ChangeLog:8
> +        Add a ValueProfile to TryGetById, and make sure DFG picks it up.

Can you add some info about speedups here?

> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1207
> +    /* The idiom `if (true) { ...; goto label; } else goto label: continue` allows using NEXT_OPCODE as a statement, even in unbraced if+else, while containing a `continue`. The more common `do { ...; } while (false)` idiom doesn't allow using `continue`. */ \

Style: I'd just add this as normal single line comments above the macro.

> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1217
> +        /* Chain expressions with comma-operator so LAST_OPCODE can be used as a statement. */ \

ditto

> Source/WTF/ChangeLog:3
> +        Add WTF_CONCAT to StdLibExtras.h

This line should be the name of your bug and then this line should go under the bug url.
Comment 13 JF Bastien 2016-08-24 20:54:22 PDT
I ran the test I wrote before / after my change: ./JSTests/stress/tryGetById-too-much.js

This benchmarks has 4 loops that execute enough times to get predicted. The loops:

 1) Add two tryGetById results, initially always predicting non bool int32.
 2) The same function predicts non int double as well (two predictions, non bool int32 is still there).
 3) String ident (at which point the engine decides it's too polymorphic).
 4) Again non bool int32 (still too polymorphic).

I expected the first two to get faster based on looking at the jsc verbose output (the type propagate properly, etc). And indeed they do! Each of the 4 loops does the following on my MBP:

 1) 80% runtime (good! expected faster)
 2) 89% runtime (good! expected faster)
 3) ~same runtime (expected ~same since too polymorphic)
 4) ~same runtime (ditto)

The tryGetById-basic.js test only executes the first 2 loops, and is useful in ensuring we get speedups for 1) and 2) without conflating the non-slowdown for 3) and 4).

I'll move these to the microbenchmark folder, and then write regex tests. I ran  run-jsc-benchmarks but it's all in the noise (even Octane regex).
Comment 14 Filip Pizlo 2016-08-24 21:43:08 PDT
Comment on attachment 286907 [details]
Updated patch, still WIP on tests

View in context: https://bugs.webkit.org/attachment.cgi?id=286907&action=review

> JSTests/stress/tryGetById-too-much.js:17
> +bench(() => { var res = 0; for (var i = 0; i < 1e6; ++i) res += fooPlusBar(o); if (res != 1379000000) throw "unexpected result"; });

Make sure that microbenchmarks are relatively quick, since they add to the total runtime of both run-jsc-benchmarks and run-jsc-stress-tests.  I like to aim for around 10ms, or around 50ms max if it really needs to run that long to demonstrate impact.

> Source/JavaScriptCore/llint/LLIntSlowPaths.cpp:578
> +    pc[OPCODE_LENGTH(op_try_get_by_id) - 1].u.profile->m_buckets[0] = JSValue::encode(result);
> +    LLINT_RETURN(result);

This should be LLINT_RETURN_PROFILED(result).
Comment 15 JF Bastien 2016-08-24 21:55:53 PDT
Created attachment 286944 [details]
patch

- Update patch to address Saam + Filip's comments.
- Move tests to microbenchmark.
- Still missing regexp microbenchmarks.
Comment 16 WebKit Commit Bot 2016-08-24 21:57:19 PDT
Attachment 286944 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1215:  WTF_CONCAT is incorrectly named. Don't use underscores in your identifier names.  [readability/naming/underscores] [4]
ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1216:  Else clause should never be on same line as else (use 2 lines)  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1217:  Should be indented on a separate line, with the colon or comma first on that line.  [whitespace/indent] [4]
Total errors found: 3 in 18 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 17 Saam Barati 2016-08-24 22:52:02 PDT
Comment on attachment 286944 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=286944&action=review

> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1207
> +//     if (true) { ...; goto label; } else goto label: continue

Don't you just mean:
else label: continue
Instead of
else goto label: continue

> Source/JavaScriptCore/llint/LLIntSlowPaths.cpp:577
> +    pc[OPCODE_LENGTH(op_try_get_by_id) - 1].u.profile->m_buckets[0] = JSValue::encode(result);

Fil meant that LLINT_RETURN_PROFILED does the value profiling for you. I also think you need to give it an opcode as an argument.
Comment 18 Saam Barati 2016-08-24 22:53:13 PDT
My review is contingent on it building, etc, which I suspect is a simple fix in the LLInt
Comment 19 JF Bastien 2016-08-25 07:54:11 PDT
Created attachment 286973 [details]
patch

- Oops, send the patch after building and fixing silly mistake, not before :-)
- Fix comment from Saam.
- Still need regex bench.
Comment 20 WebKit Commit Bot 2016-08-25 07:56:25 PDT
Attachment 286973 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1215:  WTF_CONCAT is incorrectly named. Don't use underscores in your identifier names.  [readability/naming/underscores] [4]
ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1216:  Else clause should never be on same line as else (use 2 lines)  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1217:  Should be indented on a separate line, with the colon or comma first on that line.  [whitespace/indent] [4]
Total errors found: 3 in 18 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 21 JF Bastien 2016-08-25 10:28:01 PDT
It turns out there are already regex tests which overwrite exec and other functions, so I think this patch is good to go.

I ran the tests on my MacBookPro, most applications shut down, Internet off. Results are somewhat noisy but in general look like a small win:

$ ./Tools/Scripts/run-jsc-benchmarks old:./before/bin/jsc new:./current/bin/jsc --benchmarks "(try-get-by-)|(regexp-).*" --microbenchmarks --outer 20

                                                     old                       new                                        

regexp-exec                                    37.2599+-2.1335           36.4781+-1.6365          might be 1.0214x faster
regexp-last-index                               9.9144+-0.3375            9.6545+-0.2376          might be 1.0269x faster
regexp-prototype-is-not-instance                0.4109+-0.0508            0.3846+-0.0290          might be 1.0684x faster
regexp-prototype-search-observable-side-effects
                                                0.9033+-0.1833            0.7939+-0.0244          might be 1.1379x faster
regexp-prototype-search-observable-side-effects2
                                                0.2887+-0.0702            0.2418+-0.0086          might be 1.1943x faster
regexp-prototype-split-observable-side-effects
                                                1.2081+-0.0314            1.1671+-0.0290          might be 1.0352x faster
regexp-prototype-split-observable-side-effects2
                                                0.2763+-0.0129            0.2707+-0.0071          might be 1.0207x faster
regexp-prototype-split-observable-side-effects3-flags
                                                0.3006+-0.0261            0.2903+-0.0121          might be 1.0353x faster
regexp-prototype-split-observable-side-effects3-global
                                                0.2927+-0.0170     ?      0.3036+-0.0298        ? might be 1.0371x slower
regexp-prototype-split-observable-side-effects3-ignoreCase
                                                0.2909+-0.0141     ?      0.2944+-0.0194        ? might be 1.0119x slower
regexp-prototype-split-observable-side-effects3-multiline
                                                0.3589+-0.0687            0.2940+-0.0120          might be 1.2208x faster
regexp-prototype-split-observable-side-effects3-sticky
                                                0.2848+-0.0125     ?      0.3050+-0.0432        ? might be 1.0711x slower
regexp-prototype-split-observable-side-effects3-unicode
                                                0.3174+-0.0492            0.3123+-0.0352          might be 1.0165x faster
regexp-prototype-split-observable-side-effects4
                                                0.2977+-0.0319     ?      0.3783+-0.1656        ? might be 1.2707x slower
regexp-prototype-test-observable-side-effects
                                                0.6849+-0.0534            0.6471+-0.0225          might be 1.0584x faster
regexp-prototype-test-observable-side-effects2
                                                0.2650+-0.0117            0.2585+-0.0092          might be 1.0252x faster
regexp-set-last-index                          10.3435+-0.3058     ?     10.4231+-0.3214        ?
simple-regexp-exec-folding-fail                 3.2071+-0.2286     ?      3.2136+-0.2755        ?
simple-regexp-exec-folding                     17.4885+-0.8920     ?     18.1168+-0.8843        ? might be 1.0359x slower
simple-regexp-test-folding-fail-with-hoisted-regexp
                                                9.0808+-0.1365     ?      9.1993+-0.2091        ? might be 1.0131x slower
simple-regexp-test-folding-fail                15.8828+-0.3887     ?     15.9778+-0.3977        ?
simple-regexp-test-folding-with-hoisted-regexp
                                               10.2888+-0.2087           10.2108+-0.1329        
simple-regexp-test-folding                     17.1112+-0.7364     ?     17.2812+-0.7045        ?
try-get-by-id-basic                             7.4579+-0.5580     ^      6.1939+-0.1557        ^ definitely 1.2041x faster
try-get-by-id-polymorphic                       7.2954+-0.3342            6.9700+-0.1630          might be 1.0467x faster
v8-regexp-search                               51.9154+-1.0472           51.4304+-0.8380        

<geometric>                                     1.8999+-0.0273            1.8592+-0.0214          might be 1.0219x faster



$ ./Tools/Scripts/run-jsc-benchmarks old:./before/bin/jsc new:./current/bin/jsc --benchmarks "regex" --octane --outer 10

                              old                       new                                        

regexp         x2      17.85925+-0.34929         17.25420+-0.50353         might be 1.0351x faster
Comment 22 JF Bastien 2016-08-25 11:39:20 PDT
I re-ran the benchmarks for longer as suggested by Keith. It seems to be mostly a wash, except for the one benchmark I wrote and expected to get faster:

$ ./Tools/Scripts/run-jsc-benchmarks old:./before/bin/jsc new:./current/bin/jsc --benchmarks "(try-get-by-)|(regexp)" --microbenchmarks --octane --outer 500

                                                           old                       new                                        
Octane:
   regexp                                   x2      18.13436+-0.08082    ?    18.18614+-0.07757       ?

   <geometric>                                      18.13436+-0.08082    ?    18.18614+-0.07757       ? might be 1.0029x slower

                                                           old                       new                                        
Microbenchmarks:
   regexp-exec                                       35.7820+-0.1331     ?     35.9175+-0.1576        ?
   regexp-last-index                                  9.6663+-0.0512            9.6290+-0.0508        
   regexp-prototype-is-not-instance                   0.3730+-0.0053     ?      0.3777+-0.0076        ? might be 1.0127x slower
   regexp-prototype-search-observable-side-effects   
                                                      0.8082+-0.0141     ?      0.8175+-0.0153        ? might be 1.0114x slower
   regexp-prototype-search-observable-side-effects2   
                                                      0.2571+-0.0045     ?      0.2610+-0.0063        ? might be 1.0152x slower
   regexp-prototype-split-observable-side-effects   
                                                      1.2159+-0.0137            1.2159+-0.0141        
   regexp-prototype-split-observable-side-effects2   
                                                      0.2865+-0.0060     ?      0.2907+-0.0074        ? might be 1.0149x slower
   regexp-prototype-split-observable-side-effects3-flags   
                                                      0.2986+-0.0057            0.2936+-0.0041          might be 1.0171x faster
   regexp-prototype-split-observable-side-effects3-global   
                                                      0.2946+-0.0041     ?      0.3050+-0.0084        ? might be 1.0353x slower
   regexp-prototype-split-observable-side-effects3-ignoreCase   
                                                      0.3053+-0.0116            0.2976+-0.0062          might be 1.0261x faster
   regexp-prototype-split-observable-side-effects3-multiline   
                                                      0.2962+-0.0054     ?      0.2990+-0.0056        ?
   regexp-prototype-split-observable-side-effects3-sticky   
                                                      0.3059+-0.0069     ^      0.2941+-0.0042        ^ definitely 1.0399x faster
   regexp-prototype-split-observable-side-effects3-unicode   
                                                      0.2974+-0.0061     ?      0.2998+-0.0075        ?
   regexp-prototype-split-observable-side-effects4   
                                                      0.3159+-0.0097            0.3087+-0.0075          might be 1.0235x faster
   regexp-prototype-test-observable-side-effects   
                                                      0.6642+-0.0090     ?      0.6726+-0.0128        ? might be 1.0127x slower
   regexp-prototype-test-observable-side-effects2   
                                                      0.2772+-0.0081            0.2736+-0.0064          might be 1.0131x faster
   regexp-set-last-index                             10.3625+-0.0567     ?     10.4021+-0.0583        ?
   simple-regexp-exec-folding-fail                    3.1157+-0.0306            3.1115+-0.0299        
   simple-regexp-exec-folding                        17.2350+-0.0906           17.2040+-0.0815        
   simple-regexp-test-folding-fail-with-hoisted-regexp   
                                                      9.0597+-0.0459     !      9.2154+-0.0618        ! definitely 1.0172x slower
   simple-regexp-test-folding-fail                   15.8090+-0.0861     ?     15.8215+-0.0874        ?
   simple-regexp-test-folding-with-hoisted-regexp   
                                                     10.2456+-0.0643     !     10.3946+-0.0626        ! definitely 1.0145x slower
   simple-regexp-test-folding                        16.8708+-0.0912           16.8080+-0.0802        
   try-get-by-id-basic                                7.2059+-0.0429     ^      6.2376+-0.0404        ^ definitely 1.1552x faster
   try-get-by-id-polymorphic                          7.1238+-0.0489     ?      7.1695+-0.0484        ?
   v8-regexp-search                                  51.9214+-0.2015     ?     52.2210+-0.2223        ?

   <geometric>                                        1.8679+-0.0041            1.8602+-0.0040          might be 1.0041x faster

                                                           old                       new                                        
Geomean of preferred means:
   <scaled-result>                                   18.3973+-0.0448           18.3861+-0.0440          might be 1.0006x faster
Comment 23 WebKit Commit Bot 2016-08-25 15:13:44 PDT
Comment on attachment 286973 [details]
patch

Clearing flags on attachment: 286973

Committed r204992: <http://trac.webkit.org/changeset/204992>
Comment 24 WebKit Commit Bot 2016-08-25 15:13:50 PDT
All reviewed patches have been landed.  Closing bug.