Bug 156832 - Align RegExp[@@match] with other @@ methods
Summary: Align RegExp[@@match] with other @@ methods
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Michael Saboff
URL:
Keywords:
Depends on: 157595
Blocks:
  Show dependency treegraph
 
Reported: 2016-04-20 21:39 PDT by Michael Saboff
Modified: 2016-05-17 05:29 PDT (History)
4 users (show)

See Also:


Attachments
Patch (17.34 KB, patch)
2016-04-20 21:58 PDT, Michael Saboff
mark.lam: review-
Details | Formatted Diff | Diff
Fixed typos and adopted suggestion to get exec property from RegExp.prototype in JSGlobalObject.cpp (16.36 KB, patch)
2016-04-20 23:22 PDT, Michael Saboff
mark.lam: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Saboff 2016-04-20 21:39:17 PDT
RegExp[@@search] and [@@split] were implemented after [@@match] and are written to take advantage of capabilities added after [@@match] was written, such as determining if RegExp.prototype.exec has been overwritten.  Therefore RegExp[@@match] should be changed to conform to these other implementations.  Other cleanup is possible, such as making RegExp.prototype.@exec a hidden property of the global object to reduce the overhead of accessing it.
Comment 1 Michael Saboff 2016-04-20 21:58:11 PDT
Created attachment 276898 [details]
Patch

Performance tests show the change to be neutral.

VMs tested:
"Baseline" at /Volumes/Data/src/webkit.work/WebKitBuild/Release/JavaScriptCore.framework/Versions/A/Resources/jsc
"MatchCleanup" at /Volumes/Data/src/webkit/WebKitBuild/Release/JavaScriptCore.framework/Versions/A/Resources/jsc

Collected 4 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to
get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                                 Baseline                MatchCleanup                                   
SunSpider:
   3d-cube                                    6.5433+-0.2248     ?      6.5587+-0.0851        ?
   3d-morph                                   6.4429+-1.0437            6.3725+-0.6656          might be 1.0111x faster
   3d-raytrace                                7.5284+-0.5019            7.4109+-0.5968          might be 1.0158x faster
   access-binary-trees                        2.5365+-0.1192     ?      2.6064+-0.1574        ? might be 1.0276x slower
   access-fannkuch                            7.4948+-0.3563            7.4009+-0.0886          might be 1.0127x faster
   access-nbody                               3.3707+-0.2332            3.3433+-0.1980        
   access-nsieve                              3.9188+-0.3051            3.9163+-0.2285        
   bitops-3bit-bits-in-byte                   1.4540+-0.0457     ?      1.4930+-0.0488        ? might be 1.0268x slower
   bitops-bits-in-byte                        3.6873+-0.1733            3.5651+-0.1466          might be 1.0343x faster
   bitops-bitwise-and                         2.4251+-0.0497     ?      2.4637+-0.1788        ? might be 1.0159x slower
   bitops-nsieve-bits                         3.8368+-0.1977            3.8152+-0.2503        
   controlflow-recursive                      2.8558+-0.1176     ?      2.9936+-0.1871        ? might be 1.0482x slower
   crypto-aes                                 5.6514+-0.3803            5.3979+-0.0748          might be 1.0470x faster
   crypto-md5                                 3.3098+-0.1738            3.1350+-0.1633          might be 1.0558x faster
   crypto-sha1                                3.0820+-0.0845     ?      3.1250+-0.1171        ? might be 1.0140x slower
   date-format-tofte                         10.8714+-0.1048           10.7490+-0.5254          might be 1.0114x faster
   date-format-xparb                          6.4147+-0.6334            5.7380+-0.2221          might be 1.1179x faster
   math-cordic                                3.5575+-0.0937     ?      3.5599+-0.2351        ?
   math-partial-sums                          6.1335+-0.3211     ?      6.1850+-0.1287        ?
   math-spectral-norm                         2.5281+-0.1955     ?      2.5421+-0.1738        ?
   regexp-dna                                 8.0870+-0.6587            7.9974+-0.2135          might be 1.0112x faster
   string-base64                              5.5316+-0.5308     ?      5.5733+-0.2920        ?
   string-fasta                               7.1266+-0.4464            6.9807+-0.3774          might be 1.0209x faster
   string-tagcloud                           10.3281+-1.1714           10.2260+-0.9551        
   string-unpack-code                        22.5161+-1.8216     ?     22.6944+-1.9004        ?
   string-validate-input                      5.2698+-0.2386     ?      5.5350+-0.3361        ? might be 1.0503x slower

   <arithmetic>                               5.8655+-0.0147            5.8223+-0.0728          might be 1.0074x faster

                                                 Baseline                MatchCleanup                                   
V8Spider:
   crypto                                    47.6120+-2.5335           47.0132+-0.9663          might be 1.0127x faster
   deltablue                                 63.2944+-1.6345           62.7013+-3.2528        
   earley-boyer                              49.7480+-0.9024     ?     49.9853+-2.4668        ?
   raytrace                                  35.7768+-0.5457     ?     36.9423+-2.2565        ? might be 1.0326x slower
   regexp                                    73.6180+-6.4317     ?     74.0724+-6.1114        ?
   richards                                  55.1022+-1.7113     ?     56.2052+-5.3655        ? might be 1.0200x slower
   splay                                     35.2679+-0.8128     ?     36.3536+-1.6283        ? might be 1.0308x slower

   <geometric>                               49.8615+-1.0403     ?     50.3598+-1.6381        ? might be 1.0100x slower

                                                 Baseline                MatchCleanup                                   
Octane:
   encrypt                                   0.21685+-0.02155          0.21004+-0.02030         might be 1.0324x faster
   decrypt                                   3.78450+-0.37222          3.67021+-0.25879         might be 1.0311x faster
   deltablue                        x2       0.17408+-0.01506    ?     0.17466+-0.01515       ?
   earley                                    0.38849+-0.02485    ?     0.38980+-0.02819       ?
   boyer                                     6.04879+-0.07190    ?     6.45180+-0.44488       ? might be 1.0666x slower
   navier-stokes                    x2       5.71549+-0.19270    ?     5.77068+-0.45744       ?
   raytrace                         x2       1.05800+-0.07801          1.04034+-0.02997         might be 1.0170x faster
   richards                         x2       0.11255+-0.00172    ?     0.11349+-0.00453       ?
   splay                            x2       0.40229+-0.00670          0.40183+-0.00218       
   regexp                           x2      20.58057+-0.60117    ?    22.20006+-1.38589       ? might be 1.0787x slower
   pdfjs                            x2      49.00001+-0.34920    ?    51.39205+-4.38271       ? might be 1.0488x slower
   mandreel                         x2      55.26996+-3.89472    ?    56.08615+-4.51110       ? might be 1.0148x slower
   gbemu                            x2      35.54019+-3.74243    ?    36.83474+-4.23541       ? might be 1.0364x slower
   closure                                   0.75688+-0.05865    ?     0.75805+-0.05174       ?
   jquery                                    9.38341+-0.07685    ?     9.64384+-0.83593       ? might be 1.0278x slower
   box2d                            x2      13.76967+-1.37758         13.48207+-1.09540         might be 1.0213x faster
   zlib                             x2     429.67440+-2.30962        424.39040+-26.78692        might be 1.0125x faster
   typescript                       x2     971.48657+-81.04340       948.14832+-69.03914        might be 1.0246x faster

   <geometric>                               6.60661+-0.19750    ?     6.66511+-0.08633       ? might be 1.0089x slower

                                                 Baseline                MatchCleanup                                   
Kraken:
   ai-astar                                  115.521+-4.322      ?     115.697+-7.411         ?
   audio-beat-detection                       52.152+-3.924             51.529+-0.596           might be 1.0121x faster
   audio-dft                                 118.021+-1.362            117.844+-1.016         
   audio-fft                                  42.993+-3.562             42.079+-2.501           might be 1.0217x faster
   audio-oscillator                           59.745+-0.447      ?      60.338+-1.025         ?
   imaging-darkroom                           78.641+-8.121             78.592+-5.613         
   imaging-desaturate                         68.223+-4.147      ?      71.065+-6.995         ? might be 1.0417x slower
   imaging-gaussian-blur                      88.408+-14.810     ?      88.441+-13.068        ?
   json-parse-financial                       49.546+-0.749      ?      51.770+-5.167         ? might be 1.0449x slower
   json-stringify-tinderbox                   30.319+-2.661             30.225+-1.589         
   stanford-crypto-aes                        47.246+-2.759      ?      48.446+-1.314         ? might be 1.0254x slower
   stanford-crypto-ccm                        44.336+-3.893             42.866+-2.626           might be 1.0343x faster
   stanford-crypto-pbkdf2                    116.219+-9.545      ?     116.504+-2.625         ?
   stanford-crypto-sha256-iterative           46.717+-2.416             46.299+-3.389         

   <arithmetic>                               68.435+-1.608      ?      68.692+-2.509         ? might be 1.0038x slower

                                                 Baseline                MatchCleanup                                   
AsmBench:
   bigfib.cpp                               560.6772+-22.6286    ?    579.9601+-53.6733       ? might be 1.0344x slower
   cray.c                                   449.7924+-27.4427         442.8202+-8.4744          might be 1.0157x faster
   dry.c                                    539.3149+-80.7257    ?    541.6490+-89.3612       ?
   FloatMM.c                                843.0039+-45.7769         827.4686+-4.3725          might be 1.0188x faster
   gcc-loops.cpp                           4786.2945+-50.3050        4785.0507+-85.7497       
   n-body.c                                1050.0835+-53.9163        1040.3228+-10.2812       
   Quicksort.c                              499.2238+-24.4644    ?    505.1614+-37.7292       ? might be 1.0119x slower
   stepanov_container.cpp                  3890.3535+-312.1475   ?   3891.8465+-295.8277      ?
   Towers.c                                 313.3977+-4.9532     ?    322.8462+-22.9486       ? might be 1.0301x slower

   <geometric>                              889.5447+-14.8601    ?    893.0001+-18.6593       ? might be 1.0039x slower

                                                 Baseline                MatchCleanup                                   
Geomean of preferred means:
   <scaled-result>                           41.1219+-0.2931     ?     41.2774+-0.3372        ? might be 1.0038x slower
Comment 2 Keith Miller 2016-04-20 22:17:43 PDT
Comment on attachment 276898 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=276898&action=review

r=me with comments.

> Source/JavaScriptCore/runtime/JSGlobalObject.cpp:265
> +static JSObject* getById(ExecState* exec, JSObject* base, const Identifier& ident)
> +{
> +    JSValue baseValue = JSValue(base);
> +    PropertySlot slot(baseValue, PropertySlot::InternalMethodType::VMInquiry);
> +    return baseValue.get(exec, ident, slot).toObject(exec);
> +}

You don't need this.

> Source/JavaScriptCore/runtime/JSGlobalObject.cpp:566
> +    JSObject* regExpProtoExecObject = getById(exec, m_regExpPrototype.get(), vm.propertyNames->exec);

You can just do "m_regExpPrototype->getDirect(exec, vm.propertyNames->exec)"
Comment 3 Mark Lam 2016-04-20 22:18:36 PDT
Comment on attachment 276898 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=276898&action=review

r- because there's a bug.

> Source/JavaScriptCore/ChangeLog:11
> +        called in @regExpBuiltinExec to match the name it has in the standard.

typo:/called in/called it/.
nit: I think "spec" or "specification" is a better term than "standard" in this case.

> Source/JavaScriptCore/ChangeLog:19
> +        Added the builting function @hasObservableSideEffectsForRegExpMatch() that

typo:/builting/builtin/.

> Source/JavaScriptCore/builtins/RegExpPrototype.js:122
> +        resultList.@push(resultString);
>      }
> -    return builtinExec.@call(regexp, str);
>  }

I think you have a bug here.   I think match() needs to end with a call to @regExpBuiltinExec() to achieve parity with the old code.  I recommend that you rig hasObservableSideEffectsForRegExpMatch() to always return true, and run all the tests (JSC and layout tests) to test the slow path.  That's what I did with @@split and @@search.
Comment 4 Mark Lam 2016-04-20 22:22:24 PDT
Comment on attachment 276898 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=276898&action=review

> Source/JavaScriptCore/builtins/RegExpPrototype.js:75
> +    // These are accessed by the builtin flags getter.
> +    let regexpGlobal = @tryGetById(regexp, "global");
> +    if (regexpGlobal !== @regExpProtoGlobalGetter)
> +        return true;
> +    let regexpIgnoreCase = @tryGetById(regexp, "ignoreCase");
> +    if (regexpIgnoreCase !== @regExpProtoIgnoreCaseGetter)
> +        return true;

Also, according to https://tc39.github.io/ecma262/2016/#sec-regexp.prototype-@@match, the 2 flags that are accessed are "global" and "unicode", not "ignoreCase".  Please fix.
Comment 5 Mark Lam 2016-04-20 22:28:46 PDT
BTW, do you already have tests that tests for the observable side effects on @@match for RegExp and String?  If not, you can take a look at LayoutTests/js/regress/script-tests/regexp-prototype-split-observable-side-effects*.js and .../string-prototype-split-observable-side-effects*.js.  There are also analogous versions for search.

In those tests, you'll find a collection of tests on objects that looks like RegExp, subclasses of RegExp, any of the above with the prototype chain tampered with, the default RegExp prototype tampered with, etc.  The tests that tamper with builtin objects like RegExp.prototype are put in separate files because that tampering cannot be undone to the primodial state (the structure has already transitioned).

If needed, you can used those as a template to write tests to check for observable side effects on @@match for RegExp and String.
Comment 6 Michael Saboff 2016-04-20 23:22:57 PDT
Created attachment 276900 [details]
Fixed typos and adopted suggestion to get exec property from RegExp.prototype in JSGlobalObject.cpp

(In reply to comment #3)
> Comment on attachment 276898 [details]
> > Source/JavaScriptCore/builtins/RegExpPrototype.js:122
> > +        resultList.@push(resultString);
> >      }
> > -    return builtinExec.@call(regexp, str);
> >  }
> 
> I think you have a bug here.   I think match() needs to end with a call to
> @regExpBuiltinExec() to achieve parity with the old code.  I recommend that
> you rig hasObservableSideEffectsForRegExpMatch() to always return true, and
> run all the tests (JSC and layout tests) to test the slow path.  That's what
> I did with @@split and @@search.

No bug.  The diff is confusing.  In the prior code, the fast path was after the loop, which was entered if RegExp.prototype.exec had been overridden.  I added the "not-global" case to the fast path in this patch and as a result moved it higher in this function.  The spec shows the exit for the global case from within the while true loop.

I did run the test with the fast path disabled.

(In reply to comment #5)
> BTW, do you already have tests that tests for the observable side effects on
> @@match for RegExp and String? 

Those tests were written shortly after the original @@match change.
Comment 7 Mark Lam 2016-04-20 23:34:40 PDT
Comment on attachment 276900 [details]
Fixed typos and adopted suggestion to get exec property from RegExp.prototype in JSGlobalObject.cpp

View in context: https://bugs.webkit.org/attachment.cgi?id=276900&action=review

Yes, you're right.  I misread the while loop in RegExp's match.

r=me with the one comment change / removal.

> Source/JavaScriptCore/builtins/RegExpPrototype.js:69
> +    // These are accessed by the builtin flags getter.

Please remove or change this comment.  The builtin flags getter says RegExp.prototype.flags to me, and @@match does not access flags.  In @@split, it actually did read "flags", and that indirectly reads the other individual flags.  In your case, I think this comment is not correct.
Comment 8 Michael Saboff 2016-04-21 03:57:25 PDT
Committed r199812: <http://trac.webkit.org/changeset/199812>