NEW 258706
JS markdown parser performs 50x slower in JSC compared to V8, likely due to regex
https://bugs.webkit.org/show_bug.cgi?id=258706
Summary JS markdown parser performs 50x slower in JSC compared to V8, likely due to r...
Jarred Sumner
Reported 2023-06-29 19:02:09 PDT
Created attachment 466880 [details] Reproduction that jsc shell can run I've attached a repro that can be executed in jsc shell directly. It prints out the length of the output file after parsing it. ❯ hyperfine "node out.mjs" "jsc -m out.mjs" Benchmark 1: node out.mjs Time (mean ± σ): 123.8 ms ± 21.1 ms [User: 94.7 ms, System: 11.0 ms] Range (min … max): 77.9 ms … 153.5 ms 16 runs Benchmark 2: jsc -m out.mjs Time (mean ± σ): 6.803 s ± 0.211 s [User: 2.511 s, System: 0.025 s] Range (min … max): 6.507 s … 7.196 s 10 runs Summary 'node out.mjs' ran 54.95 ± 9.53 times faster than 'jsc -m out.mjs' If we profile it with Instruments, we can see that majority of time is spent in JSC::Yarr 2.16 s 97.6% 2.15 s JSC::Yarr::Interpreter<char16_t>::matchDisjunction(JSC::Yarr::ByteDisjunction*, JSC::Yarr::Interpreter<char16_t>::DisjunctionContext*, bool) 3.00 ms 0.1% 3.00 ms void std::__1::__introsort<std::__1::_ClassicAlgPolicy, JSC::ARM64Assembler::jumpsToLink()::'lambda'(auto&, auto&)&, JSC::ARM64Assembler::LinkRecord*>(JSC::ARM64Assembler::LinkRecord*, JSC::ARM64Assembler::LinkRecord*, auto, std::__1::iterator_traits<JSC::ARM64Assembler::LinkRecord*>::difference_type) 2.00 ms 0.0% 2.00 ms JSC::BytecodeGenerator::variable(JSC::Identifier const&, JSC::ThisResolutionType) 5.00 ms 0.2% 2.00 ms void JSC::LinkBuffer::copyCompactAndLinkCode<unsigned int>(JSC::MacroAssembler&, JSC::JITCompilationEffort) 1.00 ms 0.0% 1.00 ms sys_icache_invalidate 3.00 ms 0.1% 1.00 ms JSC::ASTBuilder::SourceElements JSC::Parser<JSC::Lexer<unsigned char>>::parseModuleSourceElements<JSC::ASTBuilder>(JSC::ASTBuilder&) 1.00 ms 0.0% 1.00 ms ash_map.HashMapUnmanaged([]const u8,void,src.bun.StringHashMapContext,80).growIfNeeded 1.00 ms 0.0% 1.00 ms _platform_memset 1.00 ms 0.0% 1.00 ms __bzero 1.00 ms 0.0% 1.00 ms JSC::DFG::DefMethodClobberize<JSC::DFG::(anonymous namespace)::LocalCSEPhase::BlockCSE<JSC::DFG::(anonymous namespace)::LocalCSEPhase::LargeMaps>>::operator()(JSC::DFG::PureValue) const https://github.com/oven-sh/bun/assets/709451/f12bdd48-4bbe-4dd7-83f8-94d7dd920204
Attachments
Reproduction that jsc shell can run (147.61 KB, text/javascript)
2023-06-29 19:02 PDT, Jarred Sumner
no flags
Instruments profile of "vite dev" (654.51 KB, application/zip)
2023-07-26 15:08 PDT, Jarred Sumner
no flags
Radar WebKit Bug Importer
Comment 1 2023-07-06 19:03:17 PDT
Michael Saboff
Comment 2 2023-07-26 14:24:46 PDT
We are running several RegExp’s through the YARR interpreter. I instrumented the RegExp engine to determine which ones and the reasons why. There are 5 RegExp that contain variable counted parenthesis with a non-zero minimum. All 5 of these RegExp contain one or more disjunctions like (?:\*[ \t]*){3,}. There is one RegExp that contains a back reference that currently cannot be JIT’ed because it is a RegExp compiled for 16 bit character matching and it has the ignore case flag. The first non-zero based variable counted parenthesis issue could be addressed at least two ways: 1) With some RegExp rewriting, e.g. changing (?:\*[ \t]*){3,} to (?:\*[ \t]*){3}(?:\*[ \t]*)*. We currently do this for one or more variable counted parens, ie (?:\*[ \t]*)+ 2) Some more involved work to properly handle the fixed non-zero count of variable counted parens in the JIT directly. The reason we don’t support back references for ignore case 16bit JIT’ing is due to the complicated case folding rules for some Unicode characters. Again there are two possible options for addressing this bug. 1) If the RegExp contains back references, allow the back reference if the referenced group’s contents are easily case folded. 8 bit characters would be easily handled by this fix. 2) Completely handle Unicode case folding. This could be built upon the work of the first alternative. A full implementation of this approach would require calling out to a case folding helper for some patterns. This helper could be generated as needed.
Jarred Sumner
Comment 3 2023-07-26 15:08:36 PDT
Created attachment 467128 [details] Instruments profile of "vite dev"
Jarred Sumner
Comment 4 2023-07-26 15:11:42 PDT
I added another case where JSC::Yarr::Interpreter ranks high in a profile. This one is running "vite dev". On the main thread, it's 90% of execution time. I don't have a minimal reproduction of this one yet, but what's interesting about the stack trace is it shows Yarr::Interpreter<unsigned char> which implies it's not 16 bit related. I'll try to make a simpler reproduction of this since a profile in of itself isn't very actionable ``` 382.00 ms 91.6% 302.00 ms 191 JSC::Yarr::Interpreter<unsigned char>::matchDisjunction(JSC::Yarr::ByteDisjunction*, JSC::Yarr::Interpreter<unsigned char>::DisjunctionContext*, bool) 62.00 ms 14.8% 44.00 ms 31 JSC::Yarr::Interpreter<unsigned char>::allocParenthesesDisjunctionContext(JSC::Yarr::ByteDisjunction*, unsigned int*, JSC::Yarr::ByteTerm&) 60.00 ms 14.3% 40.00 ms 30 JSC::Lexer<unsigned char>::lexWithoutClearingLineTerminator(JSC::JSToken*, WTF::OptionSet<JSC::LexerFlags>, bool) 38.00 ms 9.1% 38.00 ms 19 __munmap 24.00 ms 5.7% 24.00 ms 12 __openat_nocancel 20.00 ms 4.7% 20.00 ms 10 __mmap 18.00 ms 4.3% 18.00 ms 9 lstat 14.00 ms 3.3% 14.00 ms 7 JSC::Yarr::Interpreter<unsigned char>::testCharacterClass(JSC::Yarr::CharacterClass*, int) 12.00 ms 2.8% 12.00 ms 6 WTF::equal(WTF::StringImpl const*, unsigned char const*, unsigned int) 12.00 ms 2.8% 12.00 ms 6 WTF::fastMalloc(unsigned long) 254.00 ms 60.9% 10.00 ms 127 llint_entry 10.00 ms 2.3% 10.00 ms 5 WTF::SmallSet<WTF::UniquedStringImpl*, WTF::PtrHashBase<WTF::UniquedStringImpl*, false>, 8u>::add(WTF::UniquedStringImpl*) 8.00 ms 1.9% 8.00 ms 4 __getdirentries64 8.00 ms 1.9% 8.00 ms 4 __psynch_cvsignal 8.00 ms 1.9% 8.00 ms 4 _platform_memset 8.00 ms 1.9% 6.00 ms 4 JSC::CodeBlock::updateAllNonLazyValueProfilePredictionsAndCountLiveness(JSC::ConcurrentJSLocker const&, unsigned int&, unsigned int&) 14.00 ms 3.3% 6.00 ms 7 bmalloc_heap_config_specialized_local_allocator_try_allocate_small_segregated_slow 6.00 ms 1.4% 6.00 ms 3 _platform_memmove 6.00 ms 1.4% 6.00 ms 3 __close_nocancel 6.00 ms 1.4% 6.00 ms 3 JSC::FunctionExecutable::finalizeUnconditionally(JSC::VM&, JSC::CollectionScope) 56.00 ms 13.4% 6.00 ms 28 JSC::SyntaxChecker::Expression JSC::Parser<JSC::Lexer<unsigned char>>::parseMemberExpression<JSC::SyntaxChecker>(JSC::SyntaxChecker&) 26.00 ms 6.2% 4.00 ms 13 JSC::ASTBuilder::Expression JSC::Parser<JSC::Lexer<unsigned char>>::parseMemberExpression<JSC::ASTBuilder>(JSC::ASTBuilder&) 12.00 ms 2.8% 4.00 ms 6 JSC::Scope::collectFreeVariables(JSC::Scope*, bool) 4.00 ms 0.9% 4.00 ms 2 WTF::HashTable<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, WTF::KeyValuePair<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, JSC::BytecodeGenerator::TDZNecessityLevel>, WTF::KeyValuePairKeyExtractor<WTF::KeyValuePair<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, JSC::BytecodeGenerator::TDZNecessityLevel>>, JSC::IdentifierRepHash, WTF::HashMap<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, JSC::BytecodeGenerator::TDZNecessityLevel, JSC::IdentifierRepHash, WTF::HashTraits<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>>, WTF::HashTraits<JSC::BytecodeGenerator::TDZNecessityLevel>, WTF::HashTableTraits>::KeyValuePairTraits, WTF::HashTraits<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>>>::HashTable(WTF::HashTable<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, WTF::KeyValuePair<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, JSC::BytecodeGenerator::TDZNecessityLevel>, WTF::KeyValuePairKeyExtractor<WTF::KeyValuePair<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, JSC::BytecodeGenerator::TDZNecessityLevel>>, JSC::IdentifierRepHash, WTF::HashMap<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, JSC::BytecodeGenerator::TDZNecessityLevel, JSC::IdentifierRepHash, WTF::HashTraits<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>>, WTF::HashTraits<JSC::BytecodeGenerator::TDZNecessityLevel>, WTF::HashTableTraits>::KeyValuePairTraits, WTF::HashTraits<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>>> const&) 4.00 ms 0.9% 4.00 ms 2 JSC::BytecodeGenerator::needsTDZCheck(JSC::Variable const&) 4.00 ms 0.9% 4.00 ms 2 JSC::ConservativeRoots::add(void*, void*, JSC::JITStubRoutineSet&, JSC::CodeBlockSet&) 4.00 ms 0.9% 4.00 ms 2 pthread_getspecific 4.00 ms 0.9% 4.00 ms 2 DYLD-STUB$$DYLD-STUB$$_memcpy 16.00 ms 3.8% 4.00 ms 8 WTF::HashTableAddResult<WTF::HashTableIterator<WTF::HashTable<WTF::Packed<WTF::StringImpl*>, WTF::Packed<WTF::StringImpl*>, WTF::IdentityExtractor, WTF::DefaultHash<WTF::Packed<WTF::StringImpl*>>, WTF::HashTraits<WTF::Packed<WTF::StringImpl*>>, WTF::HashTraits<WTF::Packed<WTF::StringImpl*>>>, WTF::Packed<WTF::StringImpl*>, WTF::Packed<WTF::StringImpl*>, WTF::IdentityExtractor, WTF::DefaultHash<WTF::Packed<WTF::StringImpl*>>, WTF::HashTraits<WTF::Packed<WTF::StringImpl*>>, WTF::HashTraits<WTF::Packed<WTF::StringImpl*>>>> WTF::HashTable<WTF::Packed<WTF::StringImpl*>, WTF::Packed<WTF::StringImpl*>, WTF::IdentityExtractor, WTF::DefaultHash<WTF::Packed<WTF::StringImpl*>>, WTF::HashTraits<WTF::Packed<WTF::StringImpl*>>, WTF::HashTraits<WTF::Packed<WTF::StringImpl*>>>::addPassingHashCode<WTF::HashSetTranslatorAdapter<WTF::LCharBufferTranslator>, WTF::HashTranslatorCharBuffer<unsigned char> const&, WTF::HashTranslatorCharBuffer<unsigned char> const&>(WTF::HashTranslatorCharBuffer<unsigned char> const&, WTF::HashTranslatorCharBuffer<unsigned char> const&) 4.00 ms 0.9% 4.00 ms 2 pas_segregated_page_construct 4.00 ms 0.9% 4.00 ms 2 pas_thread_local_cache_flush_deallocation_log 4.00 ms 0.9% 4.00 ms 2 JSC::MarkedBlock::Handle::stopAllocating(JSC::FreeList const&) 4.00 ms 0.9% 4.00 ms 2 slow_path_enter 60.00 ms 14.3% 4.00 ms 30 JSC::SyntaxChecker::Expression JSC::Parser<JSC::Lexer<unsigned char>>::parseAssignmentExpression<JSC::SyntaxChecker>(JSC::SyntaxChecker&, JSC::Parser<JSC::Lexer<unsigned char>>::ExpressionErrorClassifier&) 28.00 ms 6.7% 4.00 ms 14 JSC::ASTBuilder::Expression JSC::Parser<JSC::Lexer<unsigned char>>::parseAssignmentExpression<JSC::ASTBuilder>(JSC::ASTBuilder&, JSC::Parser<JSC::Lexer<unsigned char>>::ExpressionErrorClassifier&) 4.00 ms 0.9% 4.00 ms 2 0x13810aa31 4.00 ms 0.9% 4.00 ms 2 WTF::StringImpl::hashSlowCase() const 2.00 ms 0.4% 2.00 ms 1 WTF::fastFree(void*) 2.00 ms 0.4% 2.00 ms 1 operationSizeFrameForVarargs 2.00 ms 0.4% 2.00 ms 1 JSC::PolymorphicAccess::visitWeak(JSC::VM&) const 2.00 ms 0.4% 2.00 ms 1 __psynch_cvwait 2.00 ms 0.4% 2.00 ms 1 Bun__Path__relative 2.00 ms 0.4% 2.00 ms 1 Bun__getEnvNames 124.00 ms 29.7% 2.00 ms 62 JSC::Parser<JSC::Lexer<unsigned char>>::parseInner(JSC::Identifier const&, JSC::ParsingContext, std::__1::optional<int>, WTF::FixedVector<JSC::JSTextPosition> const*, WTF::HashMap<WTF::RefPtr<WTF::UniquedStringImpl, WTF::PackedPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, JSC::PrivateNameEntry, JSC::IdentifierRepHash, WTF::HashTraits<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>>, JSC::PrivateNameEntryHashTraits, WTF::HashTableTraits> const*) 2.00 ms 0.4% 2.00 ms 1 0x13804c01d 2.00 ms 0.4% 2.00 ms 1 JSC::Scope::declareFunction(JSC::Identifier const*, bool, bool) 2.00 ms 0.4% 2.00 ms 1 operationPutByIdStrictOptimize 2.00 ms 0.4% 2.00 ms 1 src.resolver.resolve_path.normalizeStringLooseBuf__anon_778932 36.00 ms 8.6% 2.00 ms 18 JSC::ASTBuilder::SourceElements JSC::Parser<JSC::Lexer<unsigned char>>::parseSourceElements<JSC::ASTBuilder>(JSC::ASTBuilder&, JSC::SourceElementsMode) 2.00 ms 0.4% 2.00 ms 1 JSC::WebAssemblyModuleRecord::initializeExports(JSC::JSGlobalObject*)::$_8::operator()(unsigned int) const 2.00 ms 0.4% 2.00 ms 1 JSC::Lexer<unsigned char>::~Lexer() 2.00 ms 0.4% 2.00 ms 1 JSC::repatchArrayPutByVal(JSC::JSGlobalObject*, JSC::CodeBlock*, JSC::JSValue, JSC::JSValue, JSC::StructureStubInfo&, JSC::PutByKind) 2.00 ms 0.4% 2.00 ms 1 JSC::BuiltinExecutables::createExecutable(JSC::VM&, JSC::SourceCode const&, JSC::Identifier const&, JSC::ImplementationVisibility, JSC::ConstructorKind, JSC::ConstructAbility, JSC::NeedsClassFieldInitializer, JSC::PrivateBrandRequirement) 812.00 ms 194.7% 2.00 ms 406 vmEntryToJavaScript 2.00 ms 0.4% 2.00 ms 1 JSC::BytecodeGenerator::~BytecodeGenerator() 2.00 ms 0.4% 2.00 ms 1 JSC::Scope::fillParametersForSourceProviderCache(JSC::SourceProviderCacheItemCreationParameters&, WTF::SmallSet<WTF::UniquedStringImpl*, WTF::PtrHashBase<WTF::UniquedStringImpl*, false>, 8u> const&) 2.00 ms 0.4% 2.00 ms 1 dyld4::PrebuiltLoader::dependent(dyld4::RuntimeState const&, unsigned int, dyld4::Loader::DependentKind*) const 4.00 ms 0.9% 2.00 ms 2 src.resolver.resolve_path.normalizeStringLooseBuf__anon_802881 2.00 ms 0.4% 2.00 ms 1 JSC::UnlinkedFunctionExecutable::create(JSC::VM&, JSC::SourceCode const&, JSC::FunctionMetadataNode*, JSC::UnlinkedFunctionKind, JSC::ConstructAbility, JSC::JSParserScriptMode, WTF::RefPtr<JSC::TDZEnvironmentLink, WTF::RawPtrTraits<JSC::TDZEnvironmentLink>, WTF::DefaultRefDerefTraits<JSC::TDZEnvironmentLink>>, std::__1::optional<WTF::HashMap<WTF::RefPtr<WTF::UniquedStringImpl, WTF::PackedPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, JSC::PrivateNameEntry, JSC::IdentifierRepHash, WTF::HashTraits<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>>, JSC::PrivateNameEntryHashTraits, WTF::HashTableTraits>>, JSC::DerivedContextType, JSC::NeedsClassFieldInitializer, JSC::PrivateBrandRequirement, bool) 2.00 ms 0.4% 2.00 ms 1 JSC::prepareChainForCaching(JSC::JSGlobalObject*, JSC::JSCell*, JSC::Structure*, WTF::UniquedStringImpl*, JSC::JSObject*) 2.00 ms 0.4% 2.00 ms 1 JSC::Parser<JSC::Lexer<unsigned char>>::declareVariable(JSC::Identifier const*, JSC::DeclarationType, JSC::DeclarationImportType) ```
Michael Saboff
Comment 5 2024-03-27 08:03:38 PDT
(In reply to Michael Saboff from comment #2) > The reason we don’t support back references for ignore case 16bit JIT’ing is > due to the complicated case folding rules for some Unicode characters. > Again there are two possible options for addressing this bug. > 1) If the RegExp contains back references, allow the back reference if the > referenced group’s contents are easily case folded. 8 bit characters would > be easily handled by this fix. > 2) Completely handle Unicode case folding. This could be built upon the > work of the first alternative. A full implementation of this approach would > require calling out to a case folding helper for some patterns. This helper > could be generated as needed. JIT support for ignore case backreferences on ARM64 and X86-64 landed in https://commits.webkit.org/276681@main via https://bugs.webkit.org/show_bug.cgi?id=271617. > The first non-zero based variable counted parenthesis issue could be addressed at least two ways: > 1) With some RegExp rewriting, e.g. changing (?:\*[ \t]*){3,} to (?:\*[ \t]*){3}(?:\*[ \t]*)*. We currently do this for one or more variable counted parens, ie (?:\*[ \t]*)+ > 2) Some more involved work to properly handle the fixed non-zero count of variable counted parens in the JIT directly. Working towards the second option to support non-zero based variable counted parenthesis in the Yarr JIT.
Note You need to log in before you can comment on or make changes to this bug.