RESOLVED FIXED 48101
Yarr gives different results for /(?:a*?){2,}/
https://bugs.webkit.org/show_bug.cgi?id=48101
Summary Yarr gives different results for /(?:a*?){2,}/
Oliver Hunt
Reported 2010-10-21 16:43:01 PDT
The discussion in the referenced bug concludes that yarr behaviour is wrong in this case.
Attachments
The patch (49.18 KB, patch)
2010-11-27 13:45 PST, Gavin Barraclough
sam: review+
Gavin Barraclough
Comment 1 2010-11-27 13:45:31 PST
Created attachment 74965 [details] The patch No performance impact, will update linked mozilla bug with info this afternoon.
Sam Weinig
Comment 2 2010-11-28 12:46:22 PST
Comment on attachment 74965 [details] The patch View in context: https://bugs.webkit.org/attachment.cgi?id=74965&action=review Any perf change? r+. > JavaScriptCore/yarr/RegexInterpreter.cpp:1552 > + // then fix this up at the end! - simplyfying this should make it mch clearer. Typo: mch -> much. You may also want to file a bug on this and put the bug number in the comment.
Sam Weinig
Comment 3 2010-11-28 12:46:52 PST
(In reply to comment #2) > (From update of attachment 74965 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=74965&action=review > > Any perf change? Ignore that, I see your comment above.
Gavin Barraclough
Comment 4 2010-11-28 16:45:55 PST
Fixed in r72781.
Darin Adler
Comment 5 2010-11-29 08:29:06 PST
Comment on attachment 74965 [details] The patch View in context: https://bugs.webkit.org/attachment.cgi?id=74965&action=review > JavaScriptCore/ChangeLog:7 > + The test cases in the linked mozilla bug demostrate a couple of Typo: demostrate > JavaScriptCore/ChangeLog:14 > + matches. In the case of parenthese-single matches (quantity count Is “parenthese-single” the intended phrase or a typo? > JavaScriptCore/ChangeLog:16 > + of terminal subpattern matches we do currenty check, however there Typo: currenty > JavaScriptCore/ChangeLog:28 > + Terminal sunpattern matching contains a second bug, too. The frame Typo: sunpattern > JavaScriptCore/yarr/RegexCompiler.cpp:739 > + // * a set of parens at the end of the regular expression (last term in any of the alternatives of the main body disjunction). > + // * where the parens are non-capturing, and quantified unbounded greedy (*). > + // * where the parens do not contain any capturing subpatterns. Non-parallel grammar here. > JavaScriptCore/yarr/RegexCompiler.cpp:748 > + for (unsigned i =0; i < alternatives.size(); ++i) { Missing space after the "=". Should also probably use size_t for the type here instead of unsigned. > JavaScriptCore/yarr/RegexCompiler.cpp:754 > + && term.quantityCount == UINT_MAX Better to use a named constant instead of UINT_MAX here. We’d normally use numeric_limits<unsigned>::max() instead of UINT_MAX, by the way, but neither seems right since the maximum value has a special meaning. > JavaScriptCore/yarr/RegexInterpreter.cpp:722 > + // Technically this access to inputPosition should be accessing the begin term's > + // inputPosition, but for repeats other than fixed these values should be > + // the same anyway! (we don't pre-check for greedy or non-greedy matches.) Should capitalize the word “we” here. > JavaScriptCore/yarr/RegexInterpreter.cpp:780 > + // should always be returned as a successful match - we should never becktrack to here. Typo: becktrack >> JavaScriptCore/yarr/RegexInterpreter.cpp:1552 >> + // then fix this up at the end! - simplyfying this should make it mch clearer. > > Typo: mch -> much. You may also want to file a bug on this and put the bug number in the comment. Typo: simplyfying > JavaScriptCore/yarr/RegexJIT.cpp:957 > + Jump success = (term.quantityType == QuantifierFixedCount) ? > + jump() : > + branch32(NotEqual, index, Address(stackPointerRegister, (parenthesesFrameLocation * sizeof(void*)))); We normally put the ? and the : at the starts of the lines, not the ends.
Darin Adler
Comment 6 2010-11-29 08:29:31 PST
Comment on attachment 74965 [details] The patch View in context: https://bugs.webkit.org/attachment.cgi?id=74965&action=review > LayoutTests/fast/regex/script-tests/repeat-match-waldemar.js:38 > Property changes on: LayoutTests/fast/regex/script-tests/repeat-match-waldemar.js > ___________________________________________________________________ > Added: svn:executable > + * This should not be checked in executable.
Note You need to log in before you can comment on or make changes to this bug.