Bug 156832

Summary: Align RegExp[@@match] with other @@ methods
Product: WebKit Reporter: Michael Saboff <msaboff>
Component: JavaScriptCoreAssignee: Michael Saboff <msaboff>
Status: RESOLVED FIXED    
Severity: Normal CC: commit-queue, keith_miller, mark.lam, saam
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 157595    
Bug Blocks:    
Attachments:
Description Flags
Patch
mark.lam: review-
Fixed typos and adopted suggestion to get exec property from RegExp.prototype in JSGlobalObject.cpp mark.lam: review+

Michael Saboff
Reported 2016-04-20 21:39:17 PDT
RegExp[@@search] and [@@split] were implemented after [@@match] and are written to take advantage of capabilities added after [@@match] was written, such as determining if RegExp.prototype.exec has been overwritten. Therefore RegExp[@@match] should be changed to conform to these other implementations. Other cleanup is possible, such as making RegExp.prototype.@exec a hidden property of the global object to reduce the overhead of accessing it.
Attachments
Patch (17.34 KB, patch)
2016-04-20 21:58 PDT, Michael Saboff
mark.lam: review-
Fixed typos and adopted suggestion to get exec property from RegExp.prototype in JSGlobalObject.cpp (16.36 KB, patch)
2016-04-20 23:22 PDT, Michael Saboff
mark.lam: review+
Michael Saboff
Comment 1 2016-04-20 21:58:11 PDT
Created attachment 276898 [details] Patch Performance tests show the change to be neutral. VMs tested: "Baseline" at /Volumes/Data/src/webkit.work/WebKitBuild/Release/JavaScriptCore.framework/Versions/A/Resources/jsc "MatchCleanup" at /Volumes/Data/src/webkit/WebKitBuild/Release/JavaScriptCore.framework/Versions/A/Resources/jsc Collected 4 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. Baseline MatchCleanup SunSpider: 3d-cube 6.5433+-0.2248 ? 6.5587+-0.0851 ? 3d-morph 6.4429+-1.0437 6.3725+-0.6656 might be 1.0111x faster 3d-raytrace 7.5284+-0.5019 7.4109+-0.5968 might be 1.0158x faster access-binary-trees 2.5365+-0.1192 ? 2.6064+-0.1574 ? might be 1.0276x slower access-fannkuch 7.4948+-0.3563 7.4009+-0.0886 might be 1.0127x faster access-nbody 3.3707+-0.2332 3.3433+-0.1980 access-nsieve 3.9188+-0.3051 3.9163+-0.2285 bitops-3bit-bits-in-byte 1.4540+-0.0457 ? 1.4930+-0.0488 ? might be 1.0268x slower bitops-bits-in-byte 3.6873+-0.1733 3.5651+-0.1466 might be 1.0343x faster bitops-bitwise-and 2.4251+-0.0497 ? 2.4637+-0.1788 ? might be 1.0159x slower bitops-nsieve-bits 3.8368+-0.1977 3.8152+-0.2503 controlflow-recursive 2.8558+-0.1176 ? 2.9936+-0.1871 ? might be 1.0482x slower crypto-aes 5.6514+-0.3803 5.3979+-0.0748 might be 1.0470x faster crypto-md5 3.3098+-0.1738 3.1350+-0.1633 might be 1.0558x faster crypto-sha1 3.0820+-0.0845 ? 3.1250+-0.1171 ? might be 1.0140x slower date-format-tofte 10.8714+-0.1048 10.7490+-0.5254 might be 1.0114x faster date-format-xparb 6.4147+-0.6334 5.7380+-0.2221 might be 1.1179x faster math-cordic 3.5575+-0.0937 ? 3.5599+-0.2351 ? math-partial-sums 6.1335+-0.3211 ? 6.1850+-0.1287 ? math-spectral-norm 2.5281+-0.1955 ? 2.5421+-0.1738 ? regexp-dna 8.0870+-0.6587 7.9974+-0.2135 might be 1.0112x faster string-base64 5.5316+-0.5308 ? 5.5733+-0.2920 ? string-fasta 7.1266+-0.4464 6.9807+-0.3774 might be 1.0209x faster string-tagcloud 10.3281+-1.1714 10.2260+-0.9551 string-unpack-code 22.5161+-1.8216 ? 22.6944+-1.9004 ? string-validate-input 5.2698+-0.2386 ? 5.5350+-0.3361 ? might be 1.0503x slower <arithmetic> 5.8655+-0.0147 5.8223+-0.0728 might be 1.0074x faster Baseline MatchCleanup V8Spider: crypto 47.6120+-2.5335 47.0132+-0.9663 might be 1.0127x faster deltablue 63.2944+-1.6345 62.7013+-3.2528 earley-boyer 49.7480+-0.9024 ? 49.9853+-2.4668 ? raytrace 35.7768+-0.5457 ? 36.9423+-2.2565 ? might be 1.0326x slower regexp 73.6180+-6.4317 ? 74.0724+-6.1114 ? richards 55.1022+-1.7113 ? 56.2052+-5.3655 ? might be 1.0200x slower splay 35.2679+-0.8128 ? 36.3536+-1.6283 ? might be 1.0308x slower <geometric> 49.8615+-1.0403 ? 50.3598+-1.6381 ? might be 1.0100x slower Baseline MatchCleanup Octane: encrypt 0.21685+-0.02155 0.21004+-0.02030 might be 1.0324x faster decrypt 3.78450+-0.37222 3.67021+-0.25879 might be 1.0311x faster deltablue x2 0.17408+-0.01506 ? 0.17466+-0.01515 ? earley 0.38849+-0.02485 ? 0.38980+-0.02819 ? boyer 6.04879+-0.07190 ? 6.45180+-0.44488 ? might be 1.0666x slower navier-stokes x2 5.71549+-0.19270 ? 5.77068+-0.45744 ? raytrace x2 1.05800+-0.07801 1.04034+-0.02997 might be 1.0170x faster richards x2 0.11255+-0.00172 ? 0.11349+-0.00453 ? splay x2 0.40229+-0.00670 0.40183+-0.00218 regexp x2 20.58057+-0.60117 ? 22.20006+-1.38589 ? might be 1.0787x slower pdfjs x2 49.00001+-0.34920 ? 51.39205+-4.38271 ? might be 1.0488x slower mandreel x2 55.26996+-3.89472 ? 56.08615+-4.51110 ? might be 1.0148x slower gbemu x2 35.54019+-3.74243 ? 36.83474+-4.23541 ? might be 1.0364x slower closure 0.75688+-0.05865 ? 0.75805+-0.05174 ? jquery 9.38341+-0.07685 ? 9.64384+-0.83593 ? might be 1.0278x slower box2d x2 13.76967+-1.37758 13.48207+-1.09540 might be 1.0213x faster zlib x2 429.67440+-2.30962 424.39040+-26.78692 might be 1.0125x faster typescript x2 971.48657+-81.04340 948.14832+-69.03914 might be 1.0246x faster <geometric> 6.60661+-0.19750 ? 6.66511+-0.08633 ? might be 1.0089x slower Baseline MatchCleanup Kraken: ai-astar 115.521+-4.322 ? 115.697+-7.411 ? audio-beat-detection 52.152+-3.924 51.529+-0.596 might be 1.0121x faster audio-dft 118.021+-1.362 117.844+-1.016 audio-fft 42.993+-3.562 42.079+-2.501 might be 1.0217x faster audio-oscillator 59.745+-0.447 ? 60.338+-1.025 ? imaging-darkroom 78.641+-8.121 78.592+-5.613 imaging-desaturate 68.223+-4.147 ? 71.065+-6.995 ? might be 1.0417x slower imaging-gaussian-blur 88.408+-14.810 ? 88.441+-13.068 ? json-parse-financial 49.546+-0.749 ? 51.770+-5.167 ? might be 1.0449x slower json-stringify-tinderbox 30.319+-2.661 30.225+-1.589 stanford-crypto-aes 47.246+-2.759 ? 48.446+-1.314 ? might be 1.0254x slower stanford-crypto-ccm 44.336+-3.893 42.866+-2.626 might be 1.0343x faster stanford-crypto-pbkdf2 116.219+-9.545 ? 116.504+-2.625 ? stanford-crypto-sha256-iterative 46.717+-2.416 46.299+-3.389 <arithmetic> 68.435+-1.608 ? 68.692+-2.509 ? might be 1.0038x slower Baseline MatchCleanup AsmBench: bigfib.cpp 560.6772+-22.6286 ? 579.9601+-53.6733 ? might be 1.0344x slower cray.c 449.7924+-27.4427 442.8202+-8.4744 might be 1.0157x faster dry.c 539.3149+-80.7257 ? 541.6490+-89.3612 ? FloatMM.c 843.0039+-45.7769 827.4686+-4.3725 might be 1.0188x faster gcc-loops.cpp 4786.2945+-50.3050 4785.0507+-85.7497 n-body.c 1050.0835+-53.9163 1040.3228+-10.2812 Quicksort.c 499.2238+-24.4644 ? 505.1614+-37.7292 ? might be 1.0119x slower stepanov_container.cpp 3890.3535+-312.1475 ? 3891.8465+-295.8277 ? Towers.c 313.3977+-4.9532 ? 322.8462+-22.9486 ? might be 1.0301x slower <geometric> 889.5447+-14.8601 ? 893.0001+-18.6593 ? might be 1.0039x slower Baseline MatchCleanup Geomean of preferred means: <scaled-result> 41.1219+-0.2931 ? 41.2774+-0.3372 ? might be 1.0038x slower
Keith Miller
Comment 2 2016-04-20 22:17:43 PDT
Comment on attachment 276898 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=276898&action=review r=me with comments. > Source/JavaScriptCore/runtime/JSGlobalObject.cpp:265 > +static JSObject* getById(ExecState* exec, JSObject* base, const Identifier& ident) > +{ > + JSValue baseValue = JSValue(base); > + PropertySlot slot(baseValue, PropertySlot::InternalMethodType::VMInquiry); > + return baseValue.get(exec, ident, slot).toObject(exec); > +} You don't need this. > Source/JavaScriptCore/runtime/JSGlobalObject.cpp:566 > + JSObject* regExpProtoExecObject = getById(exec, m_regExpPrototype.get(), vm.propertyNames->exec); You can just do "m_regExpPrototype->getDirect(exec, vm.propertyNames->exec)"
Mark Lam
Comment 3 2016-04-20 22:18:36 PDT
Comment on attachment 276898 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=276898&action=review r- because there's a bug. > Source/JavaScriptCore/ChangeLog:11 > + called in @regExpBuiltinExec to match the name it has in the standard. typo:/called in/called it/. nit: I think "spec" or "specification" is a better term than "standard" in this case. > Source/JavaScriptCore/ChangeLog:19 > + Added the builting function @hasObservableSideEffectsForRegExpMatch() that typo:/builting/builtin/. > Source/JavaScriptCore/builtins/RegExpPrototype.js:122 > + resultList.@push(resultString); > } > - return builtinExec.@call(regexp, str); > } I think you have a bug here. I think match() needs to end with a call to @regExpBuiltinExec() to achieve parity with the old code. I recommend that you rig hasObservableSideEffectsForRegExpMatch() to always return true, and run all the tests (JSC and layout tests) to test the slow path. That's what I did with @@split and @@search.
Mark Lam
Comment 4 2016-04-20 22:22:24 PDT
Comment on attachment 276898 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=276898&action=review > Source/JavaScriptCore/builtins/RegExpPrototype.js:75 > + // These are accessed by the builtin flags getter. > + let regexpGlobal = @tryGetById(regexp, "global"); > + if (regexpGlobal !== @regExpProtoGlobalGetter) > + return true; > + let regexpIgnoreCase = @tryGetById(regexp, "ignoreCase"); > + if (regexpIgnoreCase !== @regExpProtoIgnoreCaseGetter) > + return true; Also, according to https://tc39.github.io/ecma262/2016/#sec-regexp.prototype-@@match, the 2 flags that are accessed are "global" and "unicode", not "ignoreCase". Please fix.
Mark Lam
Comment 5 2016-04-20 22:28:46 PDT
BTW, do you already have tests that tests for the observable side effects on @@match for RegExp and String? If not, you can take a look at LayoutTests/js/regress/script-tests/regexp-prototype-split-observable-side-effects*.js and .../string-prototype-split-observable-side-effects*.js. There are also analogous versions for search. In those tests, you'll find a collection of tests on objects that looks like RegExp, subclasses of RegExp, any of the above with the prototype chain tampered with, the default RegExp prototype tampered with, etc. The tests that tamper with builtin objects like RegExp.prototype are put in separate files because that tampering cannot be undone to the primodial state (the structure has already transitioned). If needed, you can used those as a template to write tests to check for observable side effects on @@match for RegExp and String.
Michael Saboff
Comment 6 2016-04-20 23:22:57 PDT
Created attachment 276900 [details] Fixed typos and adopted suggestion to get exec property from RegExp.prototype in JSGlobalObject.cpp (In reply to comment #3) > Comment on attachment 276898 [details] > > Source/JavaScriptCore/builtins/RegExpPrototype.js:122 > > + resultList.@push(resultString); > > } > > - return builtinExec.@call(regexp, str); > > } > > I think you have a bug here. I think match() needs to end with a call to > @regExpBuiltinExec() to achieve parity with the old code. I recommend that > you rig hasObservableSideEffectsForRegExpMatch() to always return true, and > run all the tests (JSC and layout tests) to test the slow path. That's what > I did with @@split and @@search. No bug. The diff is confusing. In the prior code, the fast path was after the loop, which was entered if RegExp.prototype.exec had been overridden. I added the "not-global" case to the fast path in this patch and as a result moved it higher in this function. The spec shows the exit for the global case from within the while true loop. I did run the test with the fast path disabled. (In reply to comment #5) > BTW, do you already have tests that tests for the observable side effects on > @@match for RegExp and String? Those tests were written shortly after the original @@match change.
Mark Lam
Comment 7 2016-04-20 23:34:40 PDT
Comment on attachment 276900 [details] Fixed typos and adopted suggestion to get exec property from RegExp.prototype in JSGlobalObject.cpp View in context: https://bugs.webkit.org/attachment.cgi?id=276900&action=review Yes, you're right. I misread the while loop in RegExp's match. r=me with the one comment change / removal. > Source/JavaScriptCore/builtins/RegExpPrototype.js:69 > + // These are accessed by the builtin flags getter. Please remove or change this comment. The builtin flags getter says RegExp.prototype.flags to me, and @@match does not access flags. In @@split, it actually did read "flags", and that indirectly reads the other individual flags. In your case, I think this comment is not correct.
Michael Saboff
Comment 8 2016-04-21 03:57:25 PDT
Note You need to log in before you can comment on or make changes to this bug.