WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
16036
Further cleanup to PCRE
https://bugs.webkit.org/show_bug.cgi?id=16036
Summary
Further cleanup to PCRE
Eric Seidel (no email)
Reported
2007-11-17 21:09:37 PST
Clean up PCRE more so we can make it faster. :) Going to try attaching a whole bunch of patches to this bug which we can review one at at time.
Attachments
[PATCH] Clean up jsRegExpExecute
(24.14 KB, patch)
2007-11-17 21:09 PST
,
Eric Seidel (no email)
mjs
: review+
Details
Formatted Diff
Diff
[PATCH] Remove register keyword and more cleanup
(35.50 KB, patch)
2007-11-17 21:09 PST
,
Eric Seidel (no email)
mjs
: review+
Details
Formatted Diff
Diff
[PATCH] reformat get_othercase_range
(1.27 KB, patch)
2007-11-17 21:09 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Get rid of PCRE custom char types
(16.80 KB, patch)
2007-11-17 21:09 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Fix spacing for read_repeat_counts
(2.29 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] clean up formating in compile_branch
(98.14 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
mjs
: review+
Details
Formatted Diff
Diff
[PATCH] removing more macros
(9.84 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Clean up compile_branch, move _pcre_ord2utf8, and rename CompileData
(20.25 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Pass around CompileData& instead of CompileData*
(10.98 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
timothy
: review+
Details
Formatted Diff
Diff
[PATCH] Clean up find_firstassertedchar
(5.64 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] give PCRE_STARTLINE a better name and rename match_data to MatchData
(13.60 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Pass around MatchData objects by reference
(43.88 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Remove unused function could_be_empty_branch
(6.00 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
mjs
: review+
Details
Formatted Diff
Diff
[PATCH] reformat is_anchored
(2.31 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Reformat find_fixedlength
(7.91 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review-
Details
Formatted Diff
Diff
[PATCH] clean up check_escape
(8.04 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] clean up is_counted_repeat
(2.02 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] split first_significant_code into two simpler functions
(2.84 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Use better variable names for case ignoring options
(15.70 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
mjs
: review+
Details
Formatted Diff
Diff
[PATCH] Section off MatchData arguments into args struct
(91.45 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review-
Details
Formatted Diff
Diff
[PATCH] Abstract frame variables into locals and args
(72.81 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review-
Details
Formatted Diff
Diff
[PATCH] cleanup _pcre_ucp_othercase
(2.67 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Add inlines for toLowerCase, isWordChar, isSpaceChar for further regexp speedup
(25.56 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Give GET, PUT better names, and add (poor) moveOpcodePtrPastAnyAlternateBranches
(37.66 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Further cleanup GET/PUT inlines
(3.66 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Get rid of GETCHAR* macros, replacing them with better named inlines
(28.06 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Remove redundant match_call_count and move recursion check out of super-hot code path
(5.70 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
mjs
: review+
Details
Formatted Diff
Diff
[PATCH] Remove redundant eptrblock struct
(19.61 KB, patch)
2007-11-17 21:10 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Remove match_is_group variable for another 5% speedup
(23.38 KB, patch)
2007-11-19 00:58 PST
,
Eric Seidel (no email)
mjs
: review+
Details
Formatted Diff
Diff
[PATCH] Lower MATCH_LIMIT_RECURSION to more sane levels to prevent hangs on run-javascriptcore-tests
(1.01 KB, patch)
2007-11-19 00:58 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Small speedup (0.7%) by simplifying canUseStackBufferForNextFrame() check
(1.91 KB, patch)
2007-11-19 00:58 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Give subjectPtr and instructionPtr sane names, reduce size of MatchFrame for a 0.2% speedup.
(108.42 KB, patch)
2007-11-19 00:58 PST
,
Eric Seidel (no email)
mjs
: review+
Details
Formatted Diff
Diff
[PATCH] Comment cleanup
(2.43 KB, patch)
2007-11-19 00:58 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Make i locally scoped for better code clarity
(16.08 KB, patch)
2007-11-19 00:58 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] change getChar* functions to return result and push 'c' into local scopes for clarity
(29.01 KB, patch)
2007-11-19 00:58 PST
,
Eric Seidel (no email)
mjs
: review+
Details
Formatted Diff
Diff
[PATCH] Remove no longer used error code JSRegExpErrorMatchLimit
(1.72 KB, patch)
2007-11-19 00:58 PST
,
Eric Seidel (no email)
sam
: review-
Details
Formatted Diff
Diff
[PATCH] Add repeatInformationFromInstructionOffset inline
(7.08 KB, patch)
2007-11-19 00:58 PST
,
Eric Seidel (no email)
mjs
: review+
Details
Formatted Diff
Diff
[PATCH] Remove branch from return
(1.74 KB, patch)
2007-11-19 00:58 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Change _NC operators to use _IGNORING_CASE for clarity
(5.03 KB, patch)
2007-11-19 00:58 PST
,
Eric Seidel (no email)
mitz: review+
Details
Formatted Diff
Diff
[PATCH] Make cur_is_word and prev_is_word locals, and change OP_ANY to OP_ANY_CHAR for clarity
(6.27 KB, patch)
2007-11-19 00:58 PST
,
Eric Seidel (no email)
mitz: review+
Details
Formatted Diff
Diff
[PATCH] Centralize code for subjectPtr adjustments using inlines, only ever check for a single trailing surrogate (as UTF16 only allows one), possibly fix PCRE bugs involving char classes and garbled UTF16 strings.
(16.28 KB, patch)
2007-11-19 00:58 PST
,
Eric Seidel (no email)
mjs
: review+
Details
Formatted Diff
Diff
[PATCH] Deprecate jsRegExpExecute's offset-vector fallback code
(1.10 KB, patch)
2007-11-19 00:58 PST
,
Eric Seidel (no email)
oliver
: review+
Details
Formatted Diff
Diff
[PATCH] give PCRE_MULTILINE a better name: OptionMatchAcrossMultipleLines
(3.33 KB, patch)
2007-11-19 00:58 PST
,
Eric Seidel (no email)
mjs
: review+
Details
Formatted Diff
Diff
[PATCH] Pull first_byte and req_byte optimizations out into separate static funtions, SunSpider reported this as a win.
(10.54 KB, patch)
2007-11-19 00:58 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Give consistent naming to the RegExp options/compile flags
(13.19 KB, patch)
2007-11-19 00:58 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
[PATCH] Further cleanups to calculateCompiledPatternLengthAndFlags
(17.92 KB, patch)
2007-11-19 01:47 PST
,
Eric Seidel (no email)
sam
: review+
Details
Formatted Diff
Diff
Combined patch for Maciej's viewing pleasure
(361.97 KB, patch)
2007-11-19 01:52 PST
,
Eric Seidel (no email)
no flags
Details
Formatted Diff
Diff
View All
Add attachment
proposed patch, testcase, etc.
Eric Seidel (no email)
Comment 1
2007-11-17 21:09:54 PST
Created
attachment 17335
[details]
[PATCH] Clean up jsRegExpExecute JavaScriptCore/pcre/pcre_compile.cpp | 12 +- JavaScriptCore/pcre/pcre_exec.cpp | 553 ++++++++++++++++------------------ JavaScriptCore/pcre/pcre_internal.h | 6 +- 3 files changed, 269 insertions(+), 302 deletions(-)
Eric Seidel (no email)
Comment 2
2007-11-17 21:09:55 PST
Created
attachment 17336
[details]
[PATCH] Remove register keyword and more cleanup JavaScriptCore/pcre/pcre_compile.cpp | 97 ++++++++++++--------------- JavaScriptCore/pcre/pcre_exec.cpp | 119 +++++++++++++++++++-------------- JavaScriptCore/pcre/pcre_internal.h | 92 ++++++++----------------- JavaScriptCore/pcre/pcre_ord2utf8.cpp | 25 +++---- JavaScriptCore/pcre/pcre_xclass.cpp | 85 ++++++++++++++---------- 5 files changed, 203 insertions(+), 215 deletions(-)
Eric Seidel (no email)
Comment 3
2007-11-17 21:09:57 PST
Created
attachment 17337
[details]
[PATCH] reformat get_othercase_range JavaScriptCore/pcre/pcre_compile.cpp | 46 +++++++++++++++++---------------- 1 files changed, 24 insertions(+), 22 deletions(-)
Eric Seidel (no email)
Comment 4
2007-11-17 21:09:58 PST
Created
attachment 17338
[details]
[PATCH] Get rid of PCRE custom char types JavaScriptCore/pcre/pcre_compile.cpp | 84 +++++++++++++++------------------ JavaScriptCore/pcre/pcre_exec.cpp | 48 ++++++++++---------- JavaScriptCore/pcre/pcre_internal.h | 10 +--- 3 files changed, 65 insertions(+), 77 deletions(-)
Eric Seidel (no email)
Comment 5
2007-11-17 21:10:00 PST
Created
attachment 17339
[details]
[PATCH] Fix spacing for read_repeat_counts JavaScriptCore/pcre/pcre_compile.cpp | 79 +++++++++++++++++----------------- 1 files changed, 39 insertions(+), 40 deletions(-)
Eric Seidel (no email)
Comment 6
2007-11-17 21:10:02 PST
Created
attachment 17340
[details]
[PATCH] clean up formating in compile_branch JavaScriptCore/pcre/pcre_compile.cpp | 2351 +++++++++++++++++----------------- 1 files changed, 1145 insertions(+), 1206 deletions(-)
Eric Seidel (no email)
Comment 7
2007-11-17 21:10:03 PST
Created
attachment 17341
[details]
[PATCH] removing more macros JavaScriptCore/pcre/pcre_compile.cpp | 36 +++++++++++++++++----------------- JavaScriptCore/pcre/pcre_exec.cpp | 24 +++++++++++----------- JavaScriptCore/pcre/pcre_internal.h | 18 ++++++++-------- JavaScriptCore/pcre/pcre_xclass.cpp | 4 +- 4 files changed, 41 insertions(+), 41 deletions(-)
Eric Seidel (no email)
Comment 8
2007-11-17 21:10:05 PST
Created
attachment 17342
[details]
[PATCH] Clean up compile_branch, move _pcre_ord2utf8, and rename CompileData .../JavaScriptCore.xcodeproj/project.pbxproj | 4 - JavaScriptCore/pcre/pcre_compile.cpp | 110 ++++++++++++-------- JavaScriptCore/pcre/pcre_internal.h | 19 ++-- JavaScriptCore/pcre/pcre_ord2utf8.cpp | 75 ------------- 4 files changed, 76 insertions(+), 132 deletions(-)
Eric Seidel (no email)
Comment 9
2007-11-17 21:10:06 PST
Created
attachment 17343
[details]
[PATCH] Pass around CompileData& instead of CompileData* JavaScriptCore/pcre/pcre_compile.cpp | 69 +++++++++++++++------------------ 1 files changed, 31 insertions(+), 38 deletions(-)
Eric Seidel (no email)
Comment 10
2007-11-17 21:10:09 PST
Created
attachment 17344
[details]
[PATCH] Clean up find_firstassertedchar JavaScriptCore/pcre/pcre_compile.cpp | 111 +++++++++++++++++---------------- 1 files changed, 57 insertions(+), 54 deletions(-)
Eric Seidel (no email)
Comment 11
2007-11-17 21:10:10 PST
Created
attachment 17345
[details]
[PATCH] give PCRE_STARTLINE a better name and rename match_data to MatchData JavaScriptCore/pcre/pcre_compile.cpp | 160 ++++++++++++++++----------------- JavaScriptCore/pcre/pcre_exec.cpp | 45 +++++----- JavaScriptCore/pcre/pcre_internal.h | 3 +- 3 files changed, 104 insertions(+), 104 deletions(-)
Eric Seidel (no email)
Comment 12
2007-11-17 21:10:12 PST
Created
attachment 17346
[details]
[PATCH] Pass around MatchData objects by reference JavaScriptCore/pcre/pcre_exec.cpp | 294 ++++++++++++++++++------------------ 1 files changed, 147 insertions(+), 147 deletions(-)
Eric Seidel (no email)
Comment 13
2007-11-17 21:10:14 PST
Created
attachment 17347
[details]
[PATCH] Remove unused function could_be_empty_branch JavaScriptCore/pcre/pcre_compile.cpp | 155 ++-------------------------------- 1 files changed, 7 insertions(+), 148 deletions(-)
Eric Seidel (no email)
Comment 14
2007-11-17 21:10:15 PST
Created
attachment 17348
[details]
[PATCH] reformat is_anchored JavaScriptCore/pcre/pcre_compile.cpp | 62 +++++++++++++++------------------ 1 files changed, 28 insertions(+), 34 deletions(-)
Eric Seidel (no email)
Comment 15
2007-11-17 21:10:17 PST
Created
attachment 17349
[details]
[PATCH] Reformat find_fixedlength JavaScriptCore/pcre/pcre_compile.cpp | 281 +++++++++++++++++----------------- 1 files changed, 143 insertions(+), 138 deletions(-)
Eric Seidel (no email)
Comment 16
2007-11-17 21:10:19 PST
Created
attachment 17350
[details]
[PATCH] clean up check_escape JavaScriptCore/pcre/pcre_compile.cpp | 242 +++++++++++++++++----------------- 1 files changed, 118 insertions(+), 124 deletions(-)
Eric Seidel (no email)
Comment 17
2007-11-17 21:10:20 PST
Created
attachment 17351
[details]
[PATCH] clean up is_counted_repeat JavaScriptCore/pcre/pcre_compile.cpp | 44 ++++++++++++++++----------------- 1 files changed, 21 insertions(+), 23 deletions(-)
Eric Seidel (no email)
Comment 18
2007-11-17 21:10:22 PST
Created
attachment 17352
[details]
[PATCH] split first_significant_code into two simpler functions JavaScriptCore/pcre/pcre_compile.cpp | 55 ++++++++++++++++------------------ 1 files changed, 26 insertions(+), 29 deletions(-)
Eric Seidel (no email)
Comment 19
2007-11-17 21:10:23 PST
Created
attachment 17353
[details]
[PATCH] Use better variable names for case ignoring options JavaScriptCore/pcre/pcre_compile.cpp | 41 ++++++++++++++++----------------- JavaScriptCore/pcre/pcre_exec.cpp | 36 +++++++++++++++--------------- JavaScriptCore/pcre/pcre_internal.h | 4 +- 3 files changed, 40 insertions(+), 41 deletions(-)
Eric Seidel (no email)
Comment 20
2007-11-17 21:10:25 PST
Created
attachment 17354
[details]
[PATCH] Section off MatchData arguments into args struct JavaScriptCore/pcre/pcre_exec.cpp | 728 +++++++++++++++++++------------------ 1 files changed, 367 insertions(+), 361 deletions(-)
Eric Seidel (no email)
Comment 21
2007-11-17 21:10:27 PST
Created
attachment 17355
[details]
[PATCH] Abstract frame variables into locals and args JavaScriptCore/pcre/pcre_compile.cpp | 2 +- JavaScriptCore/pcre/pcre_exec.cpp | 465 +++++++++++++++++----------------- JavaScriptCore/pcre/pcre_internal.h | 16 +- 3 files changed, 244 insertions(+), 239 deletions(-)
Eric Seidel (no email)
Comment 22
2007-11-17 21:10:29 PST
Created
attachment 17356
[details]
[PATCH] cleanup _pcre_ucp_othercase JavaScriptCore/pcre/pcre_ucp_searchfuncs.cpp | 69 +++++++++++++------------ 1 files changed, 36 insertions(+), 33 deletions(-)
Eric Seidel (no email)
Comment 23
2007-11-17 21:10:30 PST
Created
attachment 17357
[details]
[PATCH] Add inlines for toLowerCase, isWordChar, isSpaceChar for further regexp speedup JavaScriptCore/pcre/pcre_compile.cpp | 22 +++++----- JavaScriptCore/pcre/pcre_exec.cpp | 75 +++++++++++++++------------------- JavaScriptCore/pcre/pcre_internal.h | 56 ++++++++++++++++--------- JavaScriptCore/pcre/pcre_xclass.cpp | 3 +- 4 files changed, 81 insertions(+), 75 deletions(-)
Eric Seidel (no email)
Comment 24
2007-11-17 21:10:32 PST
Created
attachment 17358
[details]
[PATCH] Give GET, PUT better names, and add (poor) moveOpcodePtrPastAnyAlternateBranches JavaScriptCore/pcre/pcre_compile.cpp | 120 +++++++++++++------------- JavaScriptCore/pcre/pcre_exec.cpp | 62 ++++++------- JavaScriptCore/pcre/pcre_internal.h | 47 ++++++---- JavaScriptCore/pcre/pcre_ucp_searchfuncs.cpp | 4 +- 4 files changed, 117 insertions(+), 116 deletions(-)
Eric Seidel (no email)
Comment 25
2007-11-17 21:10:33 PST
Created
attachment 17359
[details]
[PATCH] Further cleanup GET/PUT inlines JavaScriptCore/pcre/pcre_internal.h | 49 ++++++++++++++++++----------------- 1 files changed, 25 insertions(+), 24 deletions(-)
Eric Seidel (no email)
Comment 26
2007-11-17 21:10:35 PST
Created
attachment 17360
[details]
[PATCH] Get rid of GETCHAR* macros, replacing them with better named inlines JavaScriptCore/pcre/pcre_compile.cpp | 33 ++++------ JavaScriptCore/pcre/pcre_exec.cpp | 112 +++++++++++++++++----------------- JavaScriptCore/pcre/pcre_internal.h | 55 ++++++++++++----- 3 files changed, 106 insertions(+), 94 deletions(-)
Eric Seidel (no email)
Comment 27
2007-11-17 21:10:36 PST
Created
attachment 17361
[details]
[PATCH] Remove redundant match_call_count and move recursion check out of super-hot code path SunSpider says this is at least an 8% speedup for regexp. --- JavaScriptCore/pcre/pcre_exec.cpp | 33 ++++++++++----------------------- JavaScriptCore/pcre/pcre_internal.h | 25 +++++-------------------- 2 files changed, 15 insertions(+), 43 deletions(-)
Eric Seidel (no email)
Comment 28
2007-11-17 21:10:38 PST
Created
attachment 17362
[details]
[PATCH] Remove redundant eptrblock struct JavaScriptCore/pcre/pcre_exec.cpp | 99 ++++++++++++++---------------------- 1 files changed, 39 insertions(+), 60 deletions(-)
Eric Seidel (no email)
Comment 29
2007-11-17 21:11:40 PST
So I've not marked any of these for review. But anyone who would like to go through and review them, that would be most appreciated. :) Hum... I might need to make white-space ignoring versions.
Sam Weinig
Comment 30
2007-11-17 21:23:00 PST
Comment on
attachment 17337
[details]
[PATCH] reformat get_othercase_range The *'s should be next to the type here. +static bool get_othercase_range(int *cptr, int d, int *ocptr, int *odptr) Needs changelog. Otherwise, r=me.
Sam Weinig
Comment 31
2007-11-17 21:24:49 PST
Comment on
attachment 17339
[details]
[PATCH] Fix spacing for read_repeat_counts Needs ChangeLog r=me.
Sam Weinig
Comment 32
2007-11-17 21:28:09 PST
Comment on
attachment 17344
[details]
[PATCH] Clean up find_firstassertedchar r=me. Changelog please
Sam Weinig
Comment 33
2007-11-17 21:36:12 PST
Comment on
attachment 17348
[details]
[PATCH] reformat is_anchored We generally spell unsigned int as unsigned. Otherwise, r=me with a log of changes.
Sam Weinig
Comment 34
2007-11-17 21:38:09 PST
Comment on
attachment 17351
[details]
[PATCH] clean up is_counted_repeat r=me with ChangeLog.
Sam Weinig
Comment 35
2007-11-18 15:10:54 PST
Comment on
attachment 17352
[details]
[PATCH] split first_significant_code into two simpler functions The * should go next to the type here. + const uschar *scode = firstSignificantOpCode(code + 1 + LINK_SIZE); r=me with ChangeLog.
Sam Weinig
Comment 36
2007-11-18 15:14:04 PST
Comment on
attachment 17356
[details]
[PATCH] cleanup _pcre_ucp_othercase In the other patches I have reviewed so far you are changing infinite loops like + for (;;) { to while (true). I think it is better to be consistent and change this one as well. r=me with ChangeLog.
Sam Weinig
Comment 37
2007-11-18 15:22:24 PST
Comment on
attachment 17359
[details]
[PATCH] Further cleanup GET/PUT inlines r=me with the ChangeLog
Sam Weinig
Comment 38
2007-11-18 17:48:43 PST
Comment on
attachment 17341
[details]
[PATCH] removing more macros r=me with ChangeLog.
Sam Weinig
Comment 39
2007-11-18 19:10:11 PST
Comment on
attachment 17338
[details]
[PATCH] Get rid of PCRE custom char types The typedef for USPTR was const UChar* but you seem to be using a non-const version in most places you replaced it. What was the motivation behind that change. Otherwise, r=me with a ChangeLog.
Sam Weinig
Comment 40
2007-11-18 19:32:13 PST
Comment on
attachment 17349
[details]
[PATCH] Reformat find_fixedlength The * should go next to the type here. +static int find_fixedlength(uschar *code, int options) and here + uschar *cc = code + 1 + LINK_SIZE; I think should be while (true) for consistency. + for (;;) { Brace should go on the same line as the switch statement. + switch (op) + { I think spliting this onto multiple lines will make it more clear. (appears twice) + do cc += GET(cc, 1); while (*cc == OP_ALT); In the inner switch statement you indent the cases differently than the outer switch. Consistency is the key. + switch (*cc) { + case OP_CRSTAR: r-
Sam Weinig
Comment 41
2007-11-18 19:45:15 PST
Comment on
attachment 17350
[details]
[PATCH] clean up check_escape I am not a fan of multiple case statements on a line but I don't think we have a rule about this. + case '1': case '2': case '3': case '4': case '5': + case '6': case '7': case '8': case '9': r=me.
Sam Weinig
Comment 42
2007-11-18 19:49:56 PST
Comment on
attachment 17345
[details]
[PATCH] give PCRE_STARTLINE a better name and rename match_data to MatchData r=me with ChangeLoggy.
Sam Weinig
Comment 43
2007-11-18 19:53:15 PST
Comment on
attachment 17342
[details]
[PATCH] Clean up compile_branch, move _pcre_ord2utf8, and rename CompileData r=me
Eric Seidel (no email)
Comment 44
2007-11-19 00:58:27 PST
Created
attachment 17376
[details]
[PATCH] Remove match_is_group variable for another 5% speedup JavaScriptCore/pcre/pcre_compile.cpp | 1 + JavaScriptCore/pcre/pcre_exec.cpp | 143 +++++++++++++++------------------ 2 files changed, 66 insertions(+), 78 deletions(-)
Eric Seidel (no email)
Comment 45
2007-11-19 00:58:29 PST
Created
attachment 17377
[details]
[PATCH] Lower MATCH_LIMIT_RECURSION to more sane levels to prevent hangs on run-javascriptcore-tests JavaScriptCore/pcre/pcre_internal.h | 11 ++++++++--- 1 files changed, 8 insertions(+), 3 deletions(-)
Eric Seidel (no email)
Comment 46
2007-11-19 00:58:30 PST
Created
attachment 17378
[details]
[PATCH] Small speedup (0.7%) by simplifying canUseStackBufferForNextFrame() check JavaScriptCore/pcre/pcre_exec.cpp | 20 +++++++++----------- 1 files changed, 9 insertions(+), 11 deletions(-)
Eric Seidel (no email)
Comment 47
2007-11-19 00:58:31 PST
Created
attachment 17379
[details]
[PATCH] Give subjectPtr and instructionPtr sane names, reduce size of MatchFrame for a 0.2% speedup. JavaScriptCore/pcre/pcre_compile.cpp | 15 +- JavaScriptCore/pcre/pcre_exec.cpp | 653 +++++++++++++++++----------------- JavaScriptCore/pcre/pcre_internal.h | 38 +- JavaScriptCore/pcre/pcre_xclass.cpp | 6 +- 4 files changed, 355 insertions(+), 357 deletions(-)
Eric Seidel (no email)
Comment 48
2007-11-19 00:58:33 PST
Created
attachment 17380
[details]
[PATCH] Comment cleanup JavaScriptCore/pcre/pcre_exec.cpp | 14 ++++++++------ 1 files changed, 8 insertions(+), 6 deletions(-)
Eric Seidel (no email)
Comment 49
2007-11-19 00:58:34 PST
Created
attachment 17381
[details]
[PATCH] Make i locally scoped for better code clarity JavaScriptCore/pcre/pcre_exec.cpp | 64 ++++++++++++++++++------------------- 1 files changed, 31 insertions(+), 33 deletions(-)
Eric Seidel (no email)
Comment 50
2007-11-19 00:58:35 PST
Created
attachment 17382
[details]
[PATCH] change getChar* functions to return result and push 'c' into local scopes for clarity JavaScriptCore/pcre/pcre_compile.cpp | 13 ++-- JavaScriptCore/pcre/pcre_exec.cpp | 138 ++++++++++++++++------------------ JavaScriptCore/pcre/pcre_internal.h | 25 ++++--- 3 files changed, 87 insertions(+), 89 deletions(-)
Eric Seidel (no email)
Comment 51
2007-11-19 00:58:36 PST
Created
attachment 17383
[details]
[PATCH] Remove no longer used error code JSRegExpErrorMatchLimit JavaScriptCore/kjs/regexp.cpp | 2 +- JavaScriptCore/pcre/pcre.h | 1 - JavaScriptCore/pcre/pcre_internal.h | 2 +- 3 files changed, 2 insertions(+), 3 deletions(-)
Eric Seidel (no email)
Comment 52
2007-11-19 00:58:37 PST
Created
attachment 17384
[details]
[PATCH] Add repeatInformationFromInstructionOffset inline JavaScriptCore/pcre/pcre_exec.cpp | 63 ++++++++++++------------------------ 1 files changed, 21 insertions(+), 42 deletions(-)
Eric Seidel (no email)
Comment 53
2007-11-19 00:58:39 PST
Created
attachment 17385
[details]
[PATCH] Remove branch from return JavaScriptCore/pcre/pcre_compile.cpp | 6 ++++-- JavaScriptCore/pcre/pcre_exec.cpp | 3 ++- 2 files changed, 6 insertions(+), 3 deletions(-)
Eric Seidel (no email)
Comment 54
2007-11-19 00:58:40 PST
Created
attachment 17386
[details]
[PATCH] Change _NC operators to use _IGNORING_CASE for clarity JavaScriptCore/pcre/pcre_compile.cpp | 16 ++++++++-------- JavaScriptCore/pcre/pcre_exec.cpp | 4 ++-- JavaScriptCore/pcre/pcre_internal.h | 4 ++-- 3 files changed, 12 insertions(+), 12 deletions(-)
Eric Seidel (no email)
Comment 55
2007-11-19 00:58:41 PST
Created
attachment 17387
[details]
[PATCH] Make cur_is_word and prev_is_word locals, and change OP_ANY to OP_ANY_CHAR for clarity JavaScriptCore/pcre/pcre_compile.cpp | 8 ++++---- JavaScriptCore/pcre/pcre_exec.cpp | 15 +++++++++------ JavaScriptCore/pcre/pcre_internal.h | 4 ++-- 3 files changed, 15 insertions(+), 12 deletions(-)
Eric Seidel (no email)
Comment 56
2007-11-19 00:58:42 PST
Created
attachment 17388
[details]
[PATCH] Centralize code for subjectPtr adjustments using inlines, only ever check for a single trailing surrogate (as UTF16 only allows one), possibly fix PCRE bugs involving char classes and garbled UTF16 strings. JavaScriptCore/pcre/pcre_exec.cpp | 94 +++++++++++++++-------------------- JavaScriptCore/pcre/pcre_internal.h | 35 +++++++++++++- 2 files changed, 74 insertions(+), 55 deletions(-)
Eric Seidel (no email)
Comment 57
2007-11-19 00:58:44 PST
Created
attachment 17389
[details]
[PATCH] Deprecate jsRegExpExecute's offset-vector fallback code JavaScriptCore/pcre/pcre_exec.cpp | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-)
Eric Seidel (no email)
Comment 58
2007-11-19 00:58:45 PST
Created
attachment 17390
[details]
[PATCH] give PCRE_MULTILINE a better name: OptionMatchAcrossMultipleLines JavaScriptCore/pcre/pcre_compile.cpp | 8 ++++---- JavaScriptCore/pcre/pcre_exec.cpp | 2 +- JavaScriptCore/pcre/pcre_internal.h | 2 +- 3 files changed, 6 insertions(+), 6 deletions(-)
Eric Seidel (no email)
Comment 59
2007-11-19 00:58:46 PST
Created
attachment 17391
[details]
[PATCH] Pull first_byte and req_byte optimizations out into separate static funtions, SunSpider reported this as a win. JavaScriptCore/pcre/pcre_exec.cpp | 186 ++++++++++++++++------------------ JavaScriptCore/pcre/pcre_internal.h | 18 ++-- 2 files changed, 99 insertions(+), 105 deletions(-)
Eric Seidel (no email)
Comment 60
2007-11-19 00:58:47 PST
Created
attachment 17392
[details]
[PATCH] Give consistent naming to the RegExp options/compile flags JavaScriptCore/pcre/pcre_compile.cpp | 48 +++++++++++++++++----------------- JavaScriptCore/pcre/pcre_exec.cpp | 10 +++--- JavaScriptCore/pcre/pcre_internal.h | 16 +++++----- 3 files changed, 37 insertions(+), 37 deletions(-)
Eric Seidel (no email)
Comment 61
2007-11-19 01:04:53 PST
Ok. I'm ready to start landing these suckers. After all these patches, RegExp is 18.8% faster. Other tests are randomly slower and faster. None of them beyond 2.7% slower (controlflow-recursive) or 2.9% faster (math-cordic). These are all tests which have nothing to to with RegExps. Getting this in will allow others (like darin, ggaren) to pick off some of the remaining low-hanging fruit in the RegExp engine (now that one can actually begin to tell what the code is trying to do!) If you would like any of the patches re-attached in a white-space ignoring form (or if you'd like a combined patch) I'm happy to provide. I hope this "do lots of work in git and upload all the little patches" approach is a help, not a hinderance. I plan to land these all individually (with changelogs), as then it's very easy to track any accidental regressions.
Maciej Stachowiak
Comment 62
2007-11-19 01:20:15 PST
Comment on
attachment 17347
[details]
[PATCH] Remove unused function could_be_empty_branch r=me
Eric Seidel (no email)
Comment 63
2007-11-19 01:23:57 PST
unccing darin to curb the flood of email to his inbox
Maciej Stachowiak
Comment 64
2007-11-19 01:28:38 PST
Comment on
attachment 17335
[details]
[PATCH] Clean up jsRegExpExecute r=me
Maciej Stachowiak
Comment 65
2007-11-19 01:29:24 PST
Comment on
attachment 17336
[details]
[PATCH] Remove register keyword and more cleanup r=me
Maciej Stachowiak
Comment 66
2007-11-19 01:30:04 PST
Comment on
attachment 17340
[details]
[PATCH] clean up formating in compile_branch r=me
Eric Seidel (no email)
Comment 67
2007-11-19 01:47:30 PST
Created
attachment 17393
[details]
[PATCH] Further cleanups to calculateCompiledPatternLengthAndFlags JavaScriptCore/pcre/pcre_compile.cpp | 154 +++++++++++++++++----------------- JavaScriptCore/pcre/pcre_internal.h | 2 +- 2 files changed, 79 insertions(+), 77 deletions(-)
Eric Seidel (no email)
Comment 68
2007-11-19 01:52:06 PST
Created
attachment 17394
[details]
Combined patch for Maciej's viewing pleasure Here is a combined patch of all preceding patches for folks to try on their own machines. (I'm still looking for individual patch reviews, and plan to land them individually.)
Maciej Stachowiak
Comment 69
2007-11-19 01:52:13 PST
Comment on
attachment 17390
[details]
[PATCH] give PCRE_MULTILINE a better name: OptionMatchAcrossMultipleLines r=me
Maciej Stachowiak
Comment 70
2007-11-19 01:55:41 PST
Comment on
attachment 17353
[details]
[PATCH] Use better variable names for case ignoring options r=me
Eric Seidel (no email)
Comment 71
2007-11-19 17:51:28 PST
I just updated to TOT and re-timed. SunSpider now claims this total patch is a 1.1% speedup.
Sam Weinig
Comment 72
2007-11-19 18:45:22 PST
Comment on
attachment 17346
[details]
[PATCH] Pass around MatchData objects by reference r=me. This comment could be slightly altered now that you are passing by reference. md pointer to matching data block, if is_subject is true
Sam Weinig
Comment 73
2007-11-19 18:47:10 PST
Comment on
attachment 17392
[details]
[PATCH] Give consistent naming to the RegExp options/compile flags r=me
Sam Weinig
Comment 74
2007-11-19 19:51:14 PST
Comment on
attachment 17357
[details]
[PATCH] Add inlines for toLowerCase, isWordChar, isSpaceChar for further regexp speedup r=me
Sam Weinig
Comment 75
2007-11-19 20:00:37 PST
Comment on
attachment 17360
[details]
[PATCH] Get rid of GETCHAR* macros, replacing them with better named inlines r=me
Sam Weinig
Comment 76
2007-11-19 20:08:43 PST
Comment on
attachment 17380
[details]
[PATCH] Comment cleanup r=me
Sam Weinig
Comment 77
2007-11-20 11:03:24 PST
Comment on
attachment 17358
[details]
[PATCH] Give GET, PUT better names, and add (poor) moveOpcodePtrPastAnyAlternateBranches r=me
Sam Weinig
Comment 78
2007-11-20 13:37:30 PST
Comment on
attachment 17362
[details]
[PATCH] Remove redundant eptrblock struct r=me
Sam Weinig
Comment 79
2007-11-20 13:39:24 PST
Comment on
attachment 17354
[details]
[PATCH] Section off MatchData arguments into args struct For this and
http://bugs.webkit.org/attachment.cgi?id=17355
I would like to see some performance numbers to make sure this restructuring did not regress us in any way. r-
Sam Weinig
Comment 80
2007-11-20 13:39:55 PST
Comment on
attachment 17355
[details]
[PATCH] Abstract frame variables into locals and args r- for same reason as above.
Sam Weinig
Comment 81
2007-11-20 13:41:20 PST
Comment on
attachment 17377
[details]
[PATCH] Lower MATCH_LIMIT_RECURSION to more sane levels to prevent hangs on run-javascriptcore-tests r=me
Sam Weinig
Comment 82
2007-11-20 13:45:46 PST
Comment on
attachment 17378
[details]
[PATCH] Small speedup (0.7%) by simplifying canUseStackBufferForNextFrame() check This looks good but I think you can use a static const instead of the #define +#define FRAMES_ON_STACK 16 r=me otherwise.
Sam Weinig
Comment 83
2007-11-20 14:09:48 PST
Comment on
attachment 17391
[details]
[PATCH] Pull first_byte and req_byte optimizations out into separate static funtions, SunSpider reported this as a win. There are a few style issues in tryFirstByteOptimization: Brace after if. + if (first_byte_caseless) + while (subjectPtr < endSubject) { braces after the else + else + while (subjectPtr < endSubject && *subjectPtr != first_char) + subjectPtr++; Else on the same line as the brace. + } + /* Or to just after \n for a multiline match if possible */ + else if (useMultiLineFirstCharOptimization) { r=me
Sam Weinig
Comment 84
2007-11-20 14:52:08 PST
Comment on
attachment 17393
[details]
[PATCH] Further cleanups to calculateCompiledPatternLengthAndFlags r=me
Sam Weinig
Comment 85
2007-11-20 15:00:19 PST
Comment on
attachment 17381
[details]
[PATCH] Make i locally scoped for better code clarity Bad bad Eric. This code will decrement the i counter twice as much. Also, the indenting is off. - i = 255; - while (!opcode_jump_table[i]) - opcode_jump_table[i--] = &&CAPTURING_BRACKET; + for (int i = 255; !opcode_jump_table[i]; i--) + opcode_jump_table[i--] = &&CAPTURING_BRACKET; Otherwise, this looks great. r=me.
Sam Weinig
Comment 86
2007-11-20 15:10:39 PST
Comment on
attachment 17383
[details]
[PATCH] Remove no longer used error code JSRegExpErrorMatchLimit I think if you remove a error code it is a good idea to make the range continuous again (make JSRegExpErrorRecursionLimit = -4) as we aren't keeping any kind of compatibility here. This change seems like it should go in the patch that originally made the comment. - Currently that's 100000 - 16 * (23 * 4) ~ 9MB + Currently that's 100000 - 16 * (23 * 4) ~ 90MB r-
Sam Weinig
Comment 87
2007-11-20 16:07:02 PST
Comment on
attachment 17385
[details]
[PATCH] Remove branch from return r=me
mitz
Comment 88
2007-11-21 20:50:33 PST
Comment on
attachment 17386
[details]
[PATCH] Change _NC operators to use _IGNORING_CASE for clarity r=me
Maciej Stachowiak
Comment 89
2007-11-21 20:50:41 PST
Comment on
attachment 17361
[details]
[PATCH] Remove redundant match_call_count and move recursion check out of super-hot code path r=me
mitz
Comment 90
2007-11-21 20:51:58 PST
Comment on
attachment 17387
[details]
[PATCH] Make cur_is_word and prev_is_word locals, and change OP_ANY to OP_ANY_CHAR for clarity r=me
Maciej Stachowiak
Comment 91
2007-11-21 21:00:24 PST
Comment on
attachment 17387
[details]
[PATCH] Make cur_is_word and prev_is_word locals, and change OP_ANY to OP_ANY_CHAR for clarity r=me
Maciej Stachowiak
Comment 92
2007-11-22 19:58:03 PST
Comment on
attachment 17384
[details]
[PATCH] Add repeatInformationFromInstructionOffset inline r=me
Maciej Stachowiak
Comment 93
2007-11-22 23:03:40 PST
Comment on
attachment 17376
[details]
[PATCH] Remove match_is_group variable for another 5% speedup r=me
Maciej Stachowiak
Comment 94
2007-11-22 23:34:33 PST
Comment on
attachment 17379
[details]
[PATCH] Give subjectPtr and instructionPtr sane names, reduce size of MatchFrame for a 0.2% speedup. r=me
Maciej Stachowiak
Comment 95
2007-11-22 23:35:17 PST
Comment on
attachment 17379
[details]
[PATCH] Give subjectPtr and instructionPtr sane names, reduce size of MatchFrame for a 0.2% speedup. You should fix the indentation here: offset index into the offset vector - eptr points into the subject + subjectPtr points into the subject length length to be matched
Eric Seidel (no email)
Comment 96
2007-11-24 13:49:27 PST
(In reply to
comment #40
)
> (From update of
attachment 17349
[details]
[edit]) > r-
I fixed all your issues except for the case indenting. That will have to go on a later patch, as a I fear it would break all dependent patches in my tree. Sadly, you're not around on IRC, but I'm going to assume you're OK with this. this is git, so I can always go back and change it. :)
Maciej Stachowiak
Comment 97
2007-11-24 19:10:40 PST
Comment on
attachment 17382
[details]
[PATCH] change getChar* functions to return result and push 'c' into local scopes for clarity r=me
Maciej Stachowiak
Comment 98
2007-11-24 19:26:37 PST
Comment on
attachment 17388
[details]
[PATCH] Centralize code for subjectPtr adjustments using inlines, only ever check for a single trailing surrogate (as UTF16 only allows one), possibly fix PCRE bugs involving char classes and garbled UTF16 strings. r=me
Eric Seidel (no email)
Comment 99
2007-11-29 03:02:36 PST
Ok. I made the suggested changes, added ChangeLogs to each commit, and rechecked performance. The final number was a 1.7% improvement (some of that is caused by positive noise in the results). Going to land this in a minute. The final perf numbers: TEST COMPARISON FROM TO DETAILS ============================================================================= ** TOTAL **: 1.7% faster 4320.6ms +/- 0.3% 4249.0ms +/- 0.2% significant ============================================================================= 3d: - 500.2ms +/- 0.4% 498.0ms +/- 0.4% cube: - 168.2ms +/- 1.0% 166.4ms +/- 0.9% morph: ?? 161.4ms +/- 0.4% 161.6ms +/- 0.4% not conclusive: might be 0.1% *slower* raytrace: 0.4% faster 170.6ms +/- 0.4% 170.0ms +/- 0.0% significant access: 1.3% faster 716.4ms +/- 0.4% 707.0ms +/- 0.4% significant binary-trees: ?? 105.4ms +/- 0.6% 105.6ms +/- 0.6% not conclusive: might be 0.2% *slower* fannkuch: 1.8% faster 348.8ms +/- 0.3% 342.6ms +/- 0.4% significant nbody: 1.6% faster 184.0ms +/- 0.8% 181.0ms +/- 0.5% significant nsieve: - 78.2ms +/- 0.7% 77.8ms +/- 0.7% bitops: - 611.2ms +/- 0.2% 611.0ms +/- 0.2% 3bit-bits-in-byte: ?? 111.8ms +/- 0.5% 112.0ms +/- 1.1% not conclusive: might be 0.2% *slower* bits-in-byte: - 150.0ms +/- 0.6% 149.8ms +/- 0.4% bitwise-and: - 214.0ms +/- 0.0% 214.0ms +/- 0.0% nsieve-bits: - 135.4ms +/- 0.5% 135.2ms +/- 0.4% controlflow: - 151.0ms +/- 0.0% 151.0ms +/- 0.0% recursive: - 151.0ms +/- 0.0% 151.0ms +/- 0.0% crypto: - 339.8ms +/- 0.5% 340.0ms +/- 0.5% aes: 1.4% *slower* 97.2ms +/- 0.6% 98.6ms +/- 0.7% significant md5: - 121.2ms +/- 0.5% 121.2ms +/- 0.9% sha1: 1.0% faster 121.4ms +/- 0.6% 120.2ms +/- 0.9% significant date: 0.8% faster 371.8ms +/- 0.3% 369.0ms +/- 0.8% significant format-tofte: 0.9% *slower* 163.8ms +/- 0.3% 165.2ms +/- 1.6% significant format-xparb: 2.0% faster 208.0ms +/- 0.4% 203.8ms +/- 0.3% significant math: 0.4% faster 509.2ms +/- 0.2% 507.0ms +/- 0.5% significant cordic: 0.8% faster 216.0ms +/- 0.4% 214.2ms +/- 0.8% significant partial-sums: - 172.8ms +/- 0.3% 172.2ms +/- 0.6% spectral-norm: ?? 120.4ms +/- 0.6% 120.6ms +/- 0.6% not conclusive: might be 0.2% *slower* regexp: 18.0% faster 268.2ms +/- 0.4% 220.0ms +/- 0.4% significant dna: 18.0% faster 268.2ms +/- 0.4% 220.0ms +/- 0.4% significant string: 0.8% faster 852.8ms +/- 0.5% 846.0ms +/- 0.4% significant base64: 0.8% faster 121.4ms +/- 0.6% 120.4ms +/- 0.6% significant fasta: 1.2% faster 205.0ms +/- 0.6% 202.6ms +/- 0.3% significant tagcloud: ?? 191.8ms +/- 0.8% 192.4ms +/- 0.4% not conclusive: might be 0.3% *slower* unpack-code: 1.5% faster 199.0ms +/- 0.4% 196.0ms +/- 0.4% significant validate-input: - 135.6ms +/- 0.8% 134.6ms +/- 1.9%
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug