JavaScriptCore needs to pull constants out of loops SpiderMonkey beats us on the "validate-input" test dramatically. validate-input: 58.2% *slower* 115.2ms +/- 1.2% 182.2ms +/- 0.7% significant I think much of that is due to them having some magical way to recognize constants and pull them out of for loops. in doTest() There is a "pattern" constant regexp, which we re-compile every time through the loop. If I hack the test to pull that out of the loop, we get a 14% speed increase in that one test: validate-input: 14.6% faster 162.8ms +/- 0.8% 139.0ms +/- 1.1% significant
Another way to handle this specific case would be to just unique RegExp objects based on Identifier, then once any one of them is compiled, they all are.
I'm working on this. I think we can just store the compiled regexp in the node tree.
Keep in mind that regexp literals are not truly constants, they create a new object every time and this difference can be observed. It should be safe to cache the compiled guts of the regexp, but not the JavaScript wrapper object.
It would be interesting to test this observation, and see what mozilla caches. :)
This test program seems to show Mozilla caching the whole JS-level regexp object (we do not). I wonder what the standard says? <script> var a = ""; for (var i = 0; i < 5; i++) { var b = /a/; b.foo = "foo"; alert(a.foo); a = b; } </script>
Wowsers. This test program seems to show even more thoroughly that Opera and Mozilla reuse the same regexp object for a regexp created repeatedly from the same source line: <script> var a = ""; for (var i = 0; i < 5; i++) { var b = /a/; b.foo = "bar"; alert(a.foo); alert(a === b); b.foo = ""; a = b; } </script> But not if you create a second instance of the /a/ regexp!
Created attachment 17099 [details] Patch1 Buy this one, get the next one free!
I don't think that re-using the JS wrapper is OK. The JS wrapper holds the lastIndex property, which influences the behavior of matching functions. If two regular expressions got their lastIndex properties tangled up, you'd be in crazy-ville. (Population 2, apparently.)
ECMA 7.8.5: A regular expression literal is an input element that is converted to a RegExp object (section 15.10) when it is scanned. The object is created before evaluation of the containing program or function begins. Evaluation of the literal produces a reference to that object; it does not create a new object. OK, population 3 :).
Do you still want your patch reviewed or would you like to do the more aggressive optimization that ECMA apparently mandates? :-)
Comment on attachment 17099 [details] Patch1 An even better way to eliminate duplication of flags would be to have a way to ask a JSRegExp what its flags are. I think subpatterns is a coined word, so the "P" should not be capitalized. r=me
(In reply to comment #10) > Do you still want your patch reviewed or would you like to do the more > aggressive optimization that ECMA apparently mandates? :-) I'm still going to check this one in -- constructing fewer RegExpImps will be great, but making RegExpImp construction faster is good, too.
Committed revision 27571. Now I'll get to work on creating only one RegExpImp. Kind of tricky, since just putting a ProtectedPtr in the node tree will cause a retain cycle if you set the containing function as a property of the RegExpImp.
Wow, what a fantastic change! ggaren++ ! This was landed in r27571, so closing. I'm assuming ggaren will file a second bug about "creating only one RegExpImp"
Maybe we can convince mjs to sent out updated numbers comparing against SpiderMonkey (or I just need to download the darn thing and run it myself). I doubt we beat them on validate-input yet, but we're probably limited on ActivationImp or something else instead now.
Re-titling to reflect the patch that landed with this bug.