WebKit Bugzilla
RESOLVED FIXED
Bug 16207
JavaScript regular expressions should match UTF-16 code units rather than characters
https://bugs.webkit.org/show_bug.cgi?id=16207
Darin Adler
Reported
2007-11-30 07:02:13 PST
Testing with other browsers indicates that the JavaScript regular expression code needs to treat surrogate pairs as two "characters" rather than a single character to match them. This is good news in a way, because it's an easy way to make the regular expression engine faster, by removing the UTF-16 smarts from most of the engine.
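To make the distinction in the report concrete, here is a minimal sketch of how a "match any one character" step differs when the matcher advances by UTF-16 code unit rather than by code point. The helper names (`isHighSurrogate`, `isLowSurrogate`, `stepByCodePoint`, `countDotMatches`) are hypothetical and are not taken from the patch or from the WebKit sources; the sketch only illustrates why dropping the surrogate-pair handling leaves a simpler inner loop.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration; not WebKit code.

// First (high) half of a surrogate pair: 0xD800..0xDBFF.
inline bool isHighSurrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }

// Second (low) half of a surrogate pair: 0xDC00..0xDFFF.
inline bool isLowSurrogate(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Number of code units a single "character" occupies at pos when the
// matcher is surrogate-aware: 2 for a well-formed pair, otherwise 1.
// Assumes pos < subject.size().
std::size_t stepByCodePoint(const std::u16string& subject, std::size_t pos) {
    if (isHighSurrogate(subject[pos]) && pos + 1 < subject.size()
        && isLowSurrogate(subject[pos + 1]))
        return 2;
    return 1;
}

// Counts how many times an "any single character" step fits in the subject.
// With byCodeUnit == true every step is exactly one code unit, so no
// surrogate check is needed at all.
std::size_t countDotMatches(const std::u16string& subject, bool byCodeUnit) {
    std::size_t pos = 0;
    std::size_t matches = 0;
    while (pos < subject.size()) {
        pos += byCodeUnit ? 1 : stepByCodePoint(subject, pos);
        ++matches;
    }
    return matches;
}
```

For a subject such as `u"a\U0001D306b"` (four code units, one of the characters being a surrogate pair), the code-unit variant takes four steps and the surrogate-aware variant takes three.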
Attachments
patch, speeds up SunSpider (64.63 KB, patch)
2007-11-30 07:08 PST, Darin Adler
aroben: review+
Darin Adler
Comment 1
2007-11-30 07:08:54 PST
Created attachment 17606: patch, speeds up SunSpider
Adam Roben (:aroben)
Comment 2
2007-11-30 10:08:37 PST
Comment on attachment 17606: patch, speeds up SunSpider

2425 d = *++ptr;

The precedence here seems correct, but potentially confusing. Maybe *(++ptr) would be better?

757 int c = *stack.currentFrame->args.subjectPtr++;

Again, parentheses might make it clearer what precedence you're expecting here (and in the other instances of this expression).

1640 if (stack.currentFrame->args.subjectPtr >= md.end_subject || isNewline(*stack.currentFrame->args.subjectPtr))

Why did you leave the comparison with md.end_subject here but not elsewhere?

r=me
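As a small illustration of the precedence point raised in this review, the sketch below shows that the unparenthesized and parenthesized forms behave identically in C++, so the suggested change would be purely for readability. The function names are hypothetical and the pointer type is simplified to `const int*`; none of this is taken from the patch.

```cpp
// Hypothetical helpers for illustration; not code from the patch.

// Advance the pointer first, then read the element it now points at.
int readAfterAdvance(const int*& ptr) { return *++ptr; }
int readAfterAdvanceParen(const int*& ptr) { return *(++ptr); }

// Read the element the pointer currently points at, then advance.
// Postfix ++ binds tighter than unary *, so *ptr++ means *(ptr++):
// the pointer is incremented, not the value it points to.
int readThenAdvance(const int*& ptr) { return *ptr++; }
int readThenAdvanceParen(const int*& ptr) { return *(ptr++); }
```

Starting from a pointer to the first element of `{10, 20, 30}`, both `readAfterAdvance` variants return 20 and leave the pointer on the second element, while both `readThenAdvance` variants return 10 and leave it on the second element.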
Darin Adler
Comment 3
2007-11-30 10:55:00 PST
Committed revision 28243.