Summary: | First step towards replacing PCRE and completing regular expression JIT | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Gavin Barraclough <barraclough> | ||||||
Component: | JavaScriptCore | Assignee: | Gavin Barraclough <barraclough> | ||||||
Status: | RESOLVED FIXED | ||||||||
Severity: | Normal | CC: | darin | ||||||
Priority: | P2 | ||||||||
Version: | 528+ (Nightly build) | ||||||||
Hardware: | Mac | ||||||||
OS: | OS X 10.5 | ||||||||
Attachments: |
|
Description
Gavin Barraclough
2009-04-13 15:40:23 PDT
Created attachment 29442 [details]
Yarr!
Comment on attachment 29442 [details]
Yarr!
I'm ignoring commented-out code, because Gavin told me that he had a local change to remove it.
Let's rename the NARC #define to YARR_JIT.
+#if !ENABLE(YARR)
~RegExp();
+#endif
You can reduce the #ifdef'ing if you leave in ~RegExp in all builds, and just leave out the implementation in YARR builds.
+#if ENABLE(NARC)
+ jitCompileRegex(globalData, m_regExpJITCode, m_pattern, m_numSubpatterns, m_constructionError, ignoreCase(), multiline());
+#else
+ UNUSED_PARAM(globalData);
+ m_regExpBytecode.set(Yarr::byteCompileRegex(m_pattern, m_numSubpatterns, m_constructionError, ignoreCase(), multiline()));
+#endif
Kind of strange that the JIT version of the code does not use the Yarr namespace explicitly, but the bytecode version of the code does.
+CharacterClass* newlineOnce()
+{
+ static CharacterClass* once = newlineCreate();
+ return once;
+}
DANGER WILL ROBINSON. DANGER. Static initialization like this is not thread-safe. If we want to dynamically allocate these character classes, we need to store them in the parser, or the JSGlobalData.
I think function names like this would be easier to read as "newlineCharacterClass()", etc., instead of "newlineOnce()", etc.
+ // If the pattern contains illegal backreferences reset & reparse.
+ if (pattern.containsIllegalBackReference()) {
Let's be more explicit and mention that we're doing this to match a quirk in Firefox's behavior.
+ DisjunctionContext* allocDisjunctionContext(ByteDisjunction* disjunction)
+ {
+ return new(malloc(sizeof(DisjunctionContext) + (disjunction->m_frameSize - 1) * sizeof(uintptr_t))) DisjunctionContext();
+ }
+
+ void freeDisjunctionContext(DisjunctionContext* context)
+ {
+ free(context);
+ }
I think this code, and its client, would be cleaner if you made DisjunctionContext refcounted, and used RefPtr to manage its lifetime automatically.
+ BytecodePattern* compile()
+ {
+ regexBegin(m_pattern.m_body->m_callFrameSize);
+ emitDisjunction(m_pattern.m_body);
+ regexEnd();
+
+ BytecodePattern* bytecode = new BytecodePattern(bodyDisjunction, m_pattern.m_ignoreCase, m_pattern.m_multiline, m_pattern.m_numSubpatterns, m_allParenthesesInfo, m_pattern.m_userCharacterClasses);
+ bodyDisjunction = 0;
+ m_allParenthesesInfo.shrink(0);
+ m_pattern.m_userCharacterClasses.clear();
+ return bytecode;
+ }
RefPtr here would be good, too.
+ ~BytecodePattern()
+ {
+ delete m_body;
+ for (unsigned i = 0; i < m_allParenthesesInfo.size(); ++i)
+ delete m_allParenthesesInfo[i];
+ for (unsigned i = 0; i < m_userCharacterClasses.size(); ++i)
+ delete m_userCharacterClasses[i];
+ }
And here.
+struct RegexJITObject {
+ void* m_jitCode;
+ RefPtr<ExecutablePool> m_executablePool;
+
+ RegexJITObject()
+ : m_jitCode(0)
+ {
+ }
+};
Should we call this something more specific, like Yarr::CodeBlock? "Object" is a pretty generic name for an object.
+// Yarr::parse():
+//
Please use "/**/" syntax for multiline comments.
+ static const unsigned s_maxPatternSize = (1 << 13);
I think you want to standardize with the use of MAX_PATTERN_SIZE elsewhere, to avoid differences between the engines.
r=me
Created attachment 29460 [details]
(with review comment fixes, just missing a changelog...)
Sending JavaScriptCore/ChangeLog Sending JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj Sending JavaScriptCore/assembler/MacroAssemblerX86Common.h Sending JavaScriptCore/assembler/X86Assembler.h Sending JavaScriptCore/runtime/RegExp.cpp Sending JavaScriptCore/runtime/RegExp.h Sending JavaScriptCore/wtf/Platform.h Adding JavaScriptCore/yarr Adding JavaScriptCore/yarr/RegexCompiler.cpp Adding JavaScriptCore/yarr/RegexCompiler.h Adding JavaScriptCore/yarr/RegexInterpreter.cpp Adding JavaScriptCore/yarr/RegexInterpreter.h Adding JavaScriptCore/yarr/RegexJIT.cpp Adding JavaScriptCore/yarr/RegexJIT.h Adding JavaScriptCore/yarr/RegexParser.h Adding JavaScriptCore/yarr/RegexPattern.h Transmitting file data ............... Committed revision 42481. |