WebKit Bugzilla
UNCONFIRMED
122891
Yarr does not compile Peacekeeper email validation regex
https://bugs.webkit.org/show_bug.cgi?id=122891
Jan de Mooij
Reported
2013-10-16 05:27:03 PDT
Created attachment 214358
Shell testcase

I'm attaching a simple shell version of Peacekeeper's stringValidateForm test. The good news is that Yarr is able to JIT-compile 4 of the 5 regular expressions. The bad news is that the fifth regex is interpreted, and this slows us down.

If I run the test in Safari it takes 1200 ms; with the email regex commented out it takes 460 ms, so the slow regex is where we spend most of our time on this test.

The email validation part is this:

input = "jaakko.alajoki@futuremark.com";
result = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/.test(input);
Attachments
Shell testcase
(664 bytes, application/x-javascript)
2013-10-16 05:27 PDT, Jan de Mooij
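The attached testcase is not reproduced on this page. As a rough stand-in, the following is a minimal sketch of how the email regex from the report could be timed in isolation. The iteration count, the `timeEmailRegex` helper, and the use of `console.log` are assumptions made for illustration; they are not taken from the attachment or from Peacekeeper.

```javascript
// Hypothetical micro-benchmark for the email regex quoted in the report.
// ITERATIONS and timeEmailRegex are illustrative names, not part of the
// original testcase.
const input = "jaakko.alajoki@futuremark.com";
const emailRegex = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;
const ITERATIONS = 100000;

function timeEmailRegex(iterations) {
  let matches = 0;
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    // test() only needs a boolean answer, never the captured groups.
    if (emailRegex.test(input)) {
      matches++;
    }
  }
  return { matches, elapsedMs: Date.now() - start };
}

const report = timeEmailRegex(ITERATIONS);
console.log(`${report.matches} matches in ${report.elapsedMs} ms`);
```

Absolute timings depend on the engine and machine, so only the relative cost against a run without this regex is meaningful.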
Jan de Mooij
Comment 1
2013-10-16 05:29:47 PDT
Yarr JIT compilation is aborting here, I think:

// We can currently only compile quantity 1 subpatterns that are
// not copies. We generate a copy in the case of a range quantifier,
// e.g. /(?:x){3,9}/, or /(?:x)+/ (These are effectively expanded to
// /(?:x){3,3}(?:x){0,6}/ and /(?:x)(?:x)*/ respectively). The problem
// comes where the subpattern is capturing, in which case we would
// need to restore the capture from the first subpattern upon a
// failure in the second.
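The expansion described in that source comment can be illustrated at the JavaScript level. This is only a sketch of the semantics, not of Yarr's internals: the `(\w){2,3}` pattern and the hand-written "naive copy" regex are examples chosen here, and the `??` fallback stands in for the capture restoration the comment says the JIT would need.

```javascript
// Non-capturing case: the range quantifier and its expanded form
// accept exactly the same inputs.
const ranged = /^(?:x){3,9}$/;
const expanded = /^(?:x){3,3}(?:x){0,6}$/;
const samples = ["xx", "xxx", "xxxxx", "x".repeat(10)];
const agree = samples.every((s) => ranged.test(s) === expanded.test(s));

// Capturing case: one group in the original becomes two groups when the
// subpattern is copied by hand (illustrative pattern, not from the bug).
const capturing = /^(\w){2,3}$/;
const naiveCopy = /^(\w){2,2}(\w){0,1}$/;

// When the second copy matches zero times its capture is undefined, so the
// value must be taken from the first copy instead.
function restoredCapture(s) {
  const m = naiveCopy.exec(s);
  return m === null ? null : m[2] ?? m[1];
}

const originalAb = capturing.exec("ab")[1];
const restoredAb = restoredCapture("ab");
const originalAbc = capturing.exec("abc")[1];
const restoredAbc = restoredCapture("abc");
```

For `"ab"` the second copy does not participate, so the single capture of the original pattern has to come from the first copy; for `"abc"` it comes from the second.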
Gavin Barraclough
Comment 2
2013-10-16 10:52:09 PDT
Yarr JIT can now compile a regular expression twice, once in each of two modes: 'match-only' and 'include-subpatterns'.

Match-only is used when the regexp is first run, to scan for the match, at which point we return either a boolean result (in the case of 'test' matches) or a lazily populated regexp matches array object. Include-subpatterns is used when a subpattern match is explicitly required, for example if an entry in the matches array is accessed.

The restriction referenced in the comment is that the JIT won't backtrack subpattern matches, so we can't compile quantified captures unless we can guarantee they won't backtrack. But this restriction doesn't apply to match-only compilations, which aren't recording the matched subpatterns anyway.

We can probably pretty much say that if (compileMode == MatchOnly) then we can compile any parens. We'd have to look a little more closely to check whether any of the restrictions still need to apply.
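From script, the two situations described above correspond roughly to asking only for a boolean versus reading entries of the matches array. The sketch below shows that difference using the regex from the report. Which compile mode an engine actually picks is not observable from JavaScript, so this is an illustration of the call sites only, under that assumption.

```javascript
const emailRegex = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;
const input = "jaakko.alajoki@futuremark.com";

// Boolean-only use: no captured subpattern is ever read, so a
// match-only compilation would be sufficient for this call.
const isValid = emailRegex.test(input);

// Capture use: entries of the matches array are read, so the engine
// must know what each capturing group last matched.
const match = emailRegex.exec(input);
const localPartTail = match[1]; // last iteration of the first group
const topLevelDomain = match[3]; // last iteration of the final group

console.log(isValid, localPartTail, topLevelDomain);
```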
Gavin Barraclough
Comment 3
2013-10-16 17:13:45 PDT
Actually, no, I'm way oversimplifying here. :-( Each iteration would also need separate backtracking state internally, so this does effectively demand stack allocation.