RESOLVED FIXED 79199
Zero out CopiedBlocks on initialization
https://bugs.webkit.org/show_bug.cgi?id=79199
Summary Zero out CopiedBlocks on initialization
Mark Hahnenberg
Reported 2012-02-21 22:40:20 PST
We should do this in preparation for refactoring op_new_array. This turns out to be a performance win, even if we still manually clear the arrays.
Attachments
Patch (5.42 KB, patch)
2012-02-21 23:08 PST, Mark Hahnenberg
no flags
Bencher results (7.50 KB, text/plain)
2012-02-21 23:11 PST, Mark Hahnenberg
no flags
Patch (5.16 KB, patch)
2012-02-22 22:07 PST, Mark Hahnenberg
no flags
32-bit bencher results (7.53 KB, text/plain)
2012-02-22 23:38 PST, Mark Hahnenberg
no flags
Patch (5.09 KB, patch)
2012-02-23 11:13 PST, Mark Hahnenberg
no flags
32-bit results with just copying JSValues (no memset_pattern) (7.28 KB, text/plain)
2012-02-23 11:15 PST, Mark Hahnenberg
no flags
Mark Hahnenberg
Comment 1 2012-02-21 23:08:14 PST
Mark Hahnenberg
Comment 2 2012-02-21 23:11:50 PST
Created attachment 128140 [details] Bencher results ~1.5% win on v8, neutral on Kraken and SunSpider
Philippe Normand
Comment 3 2012-02-21 23:14:45 PST
Early Warning System Bot
Comment 4 2012-02-21 23:36:11 PST
Gyuyoung Kim
Comment 5 2012-02-22 00:11:48 PST
Filip Pizlo
Comment 6 2012-02-22 00:19:26 PST
Comment on attachment 128139 [details] Patch Why not just have a thing that blitzes empty JS values on 32_64, instead of making 32_64 pay the price twice?
Filip Pizlo
Comment 7 2012-02-22 00:22:10 PST
Ahhh, the wonders of bzero. You have two options here: - Use memset instead, which is generally a good idea since bzero is somewhat deprecated. Be sure to retest performance in case there's some weirdness with the compiler doing smarter things to bzero than it did to memset. - Include all the headers. Different platforms seem to put it in different places. It might be in strings.h, or string.h, or stdlib.h. But then strings.h might not exist on some platforms, ... Yuck. Moral of the story: just use memset if you can.
Mark Hahnenberg
Comment 8 2012-02-22 11:26:00 PST
(In reply to comment #6) > (From update of attachment 128139 [details]) > Why not just have a thing that blitzes empty JS values on 32_64, instead of making 32_64 pay the price twice? The problem with this strategy is that on 32-bit, we alternate 32-bits of zero and 32-bits of non-zero. If one of the allocations into the block is size n where n mod 8 = 4, we get out of sync with this pattern, thus requiring us to re-clear the arrays anyways. I guess we could potentially keep track of whether we're in sync or not inside each CopiedBlock and just check if we need to manually clear or not?
Mark Hahnenberg
Comment 9 2012-02-22 22:07:23 PST
Filip Pizlo
Comment 10 2012-02-22 22:09:39 PST
Comment on attachment 128394 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=128394&action=review > Source/JavaScriptCore/heap/CopiedBlock.h:54 > + ASSERT(is8ByteAligned(static_cast<void*>(m_payload))); > + JSValue emptyValue; > +#if OS(MAC_OS_X) || OS(IOS) > + memset_pattern8(static_cast<void*>(m_payload), static_cast<void*>(&emptyValue), static_cast<size_t>((reinterpret_cast<char*>(this) + allocation.size()) - m_payload)); > +#else > + JSValue* limit = reinterpret_cast<JSValue*>(reinterpret_cast<char*>(this) + allocation.size()); > + for (JSValue* currentValue = reinterpret_cast<JSValue*>(m_payload); currentValue < limit; currentValue++) > + *currentValue = emptyValue; > +#endif For good measure, I'd have a case for USE(JSVALUE64) in which I call memset. I don't trust memset_pattern8 to be as fast as memset if the pattern is zero. Also, did you verify that there is any speed-up with memset_pattern8 at all on JSVALUE32_64?
Mark Hahnenberg
Comment 11 2012-02-22 23:36:46 PST
> Also, did you verify that there is any speed-up with memset_pattern8 at all on JSVALUE32_64? Yep. The speedup is less dramatic, but it's still there.
Mark Hahnenberg
Comment 12 2012-02-22 23:38:54 PST
Created attachment 128409 [details] 32-bit bencher results
Filip Pizlo
Comment 13 2012-02-22 23:40:39 PST
(In reply to comment #11) > > Also, did you verify that there is any speed-up with memset_pattern8 at all on JSVALUE32_64? > > Yep. The speedup is less dramatic, but it's still there. I mean, is there any speed-up over writing your own loop? Like the one that you have when not on IOS or MAC_OS_X? (And btw you can use test both of them as DARWIN. I don't remember is it's OS(DARWIN) or PLATFORM(DARWIN).)
Mark Hahnenberg
Comment 14 2012-02-23 11:13:49 PST
Mark Hahnenberg
Comment 15 2012-02-23 11:15:29 PST
Created attachment 128508 [details] 32-bit results with just copying JSValues (no memset_pattern) There wasn't much of a difference (and the JSValue version might even be a tiny bit faster) so I just got rid of the memset_pattern part altogether to reduce the amount of preprocessor cruft.
WebKit Review Bot
Comment 16 2012-02-23 19:09:43 PST
Comment on attachment 128507 [details] Patch Clearing flags on attachment: 128507 Committed r108716: <http://trac.webkit.org/changeset/108716>
WebKit Review Bot
Comment 17 2012-02-23 19:09:49 PST
All reviewed patches have been landed. Closing bug.
Csaba Osztrogonác
Comment 18 2012-02-23 22:58:16 PST
Reopen, because it made all tests crash on 32 bit Qt. Maybe on other 32 bit platforms too, but I don't know, because build.webkit.org is unavailable nowadays. http://build.webkit.sed.hu/results/x86-32%20Linux%20Qt%20Release%20NRWT/r108716%20%286626%29/results.html $ cat block-wrappers-necessary-crash-log.txt | c++filt 1 0x8066e9b /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/bin/DumpRenderTree() [0x8066e9b] 2 0xf775e400 [0xf775e400] 3 0xf7023981 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WTF::fastRealloc(void*, unsigned int)+0xa21) [0xf7023981] 4 0xf6e81981 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(JSC::JIT::privateCompile(JSC::MacroAssemblerCodePtr*)+0xd21) [0xf6e81981] 5 0xf6f6f767 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(JSC::JIT::compile(JSC::JSGlobalData*, JSC::CodeBlock*, JSC::MacroAssemblerCodePtr*)+0x77) [0xf6f6f767] 6 0xf6f7134d /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(JSC::jitCompileFunctionIfAppropriate(JSC::JSGlobalData&, WTF::OwnPtr<JSC::FunctionCodeBlock>&, JSC::JITCode&, JSC::MacroAssemblerCodePtr&, JSC::SharedSymbolTable*&, JSC::JITCode::JITType)+0x12d) [0xf6f7134d] 7 0xf6f6cc0f /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(JSC::FunctionExecutable::compileForCallInternal(JSC::ExecState*, JSC::ScopeChainNode*, JSC::JITCode::JITType)+0x13f) [0xf6f6cc0f] 8 0xf6e06b68 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(JSC::FunctionExecutable::compileFor(JSC::ExecState*, JSC::ScopeChainNode*, JSC::CodeSpecializationKind)+0x68) [0xf6e06b68] 9 0xf6ea6715 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(cti_vm_lazyLinkCall+0x185) [0xf6ea6715] 10 0xf0d050c8 [0xf0d050c8] 11 0xf6e5b4f1 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(JSC::Interpreter::executeCall(JSC::ExecState*, JSC::JSObject*, JSC::CallType, JSC::CallData const&, JSC::JSValue, JSC::ArgList const&)+0x401) [0xf6e5b4f1] 12 0xf6f56ad6 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(JSC::call(JSC::ExecState*, JSC::JSValue, JSC::CallType, JSC::CallData const&, JSC::JSValue, JSC::ArgList const&)+0x66) [0xf6f56ad6] 13 0xf5e3f783 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::JSEventListener::handleEvent(WebCore::ScriptExecutionContext*, WebCore::Event*)+0x453) [0xf5e3f783] 14 0xf6021ddd /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::EventTarget::fireEventListeners(WebCore::Event*, WebCore::EventTargetData*, WTF::Vector<WebCore::RegisteredEventListener, 1u>&)+0x12d) [0xf6021ddd] 15 0xf6021f9d /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::EventTarget::fireEventListeners(WebCore::Event*)+0x5d) [0xf6021f9d] 16 0xf63960da /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::DOMWindow::dispatchEvent(WTF::PassRefPtr<WebCore::Event>, WTF::PassRefPtr<WebCore::EventTarget>)+0xca) [0xf63960da] 17 0xf639d137 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::DOMWindow::dispatchLoadEvent()+0x117) [0xf639d137] 18 0xf5fd8c3e /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::Document::dispatchWindowLoadEvent()+0x2e) [0xf5fd8c3e] 19 0xf5fe2762 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::Document::implicitClose()+0x122) [0xf5fe2762] 20 0xf631da34 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::FrameLoader::checkCallImplicitClose()+0x74) [0xf631da34] 21 0xf6326b11 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::FrameLoader::checkCompleted()+0xc1) [0xf6326b11] 22 0xf6326c7d /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::FrameLoader::loadDone()+0x1d) [0xf6326c7d] 23 0xf630bdeb /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::CachedResourceLoader::loadDone()+0x4b) [0xf630bdeb] 24 0xf635bc8f /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::SubresourceLoader::releaseResources()+0x5f) [0xf635bc8f] 25 0xf6351a7a /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::ResourceLoader::didFinishLoading(double)+0x3a) [0xf6351a7a] 26 0xf635c551 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::SubresourceLoader::didFinishLoading(double)+0xe1) [0xf635c551] 27 0xf6351541 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::ResourceLoader::didFinishLoading(WebCore::ResourceHandle*, double)+0x71) [0xf6351541] 28 0xf666aa2d /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::QNetworkReplyHandler::finish()+0x1bd) [0xf666aa2d] 29 0xf6667394 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::QNetworkReplyHandlerCallQueue::flush()+0x64) [0xf6667394] 30 0xf6667934 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::QNetworkReplyHandlerCallQueue::push(void (WebCore::QNetworkReplyHandler::*)())+0x34) [0xf6667934] 31 0xf666797f /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::QNetworkReplyWrapper::didReceiveFinished()+0x3f) [0xf666797f] Zoltán, Gábor, Péter, could you check it?
Csaba Osztrogonác
Comment 19 2012-02-24 01:34:21 PST
Additionally there are crashes on Qt ARM too. :(
Zoltan Herczeg
Comment 20 2012-02-24 04:35:45 PST
I summarize what I know so far. This is the new code: JSValue emptyValue; JSValue* limit = reinterpret_cast<JSValue*>(reinterpret_cast<char*>(this) + allocation.size()); for (JSValue* currentValue = reinterpret_cast<JSValue*>(m_payload); currentValue < limit; currentValue++) *currentValue = emptyValue; In DRT this function runs only once for dom/svg/level3/xpath/XPathEvaluator_evaluate_NAMESPACE_ERR.svg the 'this' (0xf02f0000) and 'limit' (0xf0300000) variables seems ok, but m_payload contains 0xf02f002c which is not divisible by 8! I feel this is the root of the issues. I continue debugging.
Zoltan Herczeg
Comment 21 2012-02-24 04:41:45 PST
I suspect this cause the alignment error: void* m_offset; uintptr_t m_isPinned; uintptr_t m_padding; char m_payload[1]; This is 3*4 = 12 bytes on 32 bit, while it is 8*3 = 24 bytes on 64 bit. 24 bytes are divisible by 8, but 12 bytes are not!
Zoltan Herczeg
Comment 22 2012-02-24 05:39:27 PST
Mark Hahnenberg
Comment 23 2012-02-25 08:02:10 PST
Bug 79556 should have taken care of any cross-platform alignment issues. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.