Bug 79199 - Zero out CopiedBlocks on initialization
Summary: Zero out CopiedBlocks on initialization
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Mark Hahnenberg
URL:
Keywords:
Depends on: 79271 79450
Blocks: 79198
  Show dependency treegraph
 
Reported: 2012-02-21 22:40 PST by Mark Hahnenberg
Modified: 2012-02-25 08:02 PST (History)
9 users (show)

See Also:


Attachments
Patch (5.42 KB, patch)
2012-02-21 23:08 PST, Mark Hahnenberg
no flags Details | Formatted Diff | Diff
Bencher results (7.50 KB, text/plain)
2012-02-21 23:11 PST, Mark Hahnenberg
no flags Details
Patch (5.16 KB, patch)
2012-02-22 22:07 PST, Mark Hahnenberg
no flags Details | Formatted Diff | Diff
32-bit bencher results (7.53 KB, text/plain)
2012-02-22 23:38 PST, Mark Hahnenberg
no flags Details
Patch (5.09 KB, patch)
2012-02-23 11:13 PST, Mark Hahnenberg
no flags Details | Formatted Diff | Diff
32-bit results with just copying JSValues (no memset_pattern) (7.28 KB, text/plain)
2012-02-23 11:15 PST, Mark Hahnenberg
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Hahnenberg 2012-02-21 22:40:20 PST
We should do this in preparation for refactoring op_new_array. This turns out to be a performance win, even if we still manually clear the arrays.
Comment 1 Mark Hahnenberg 2012-02-21 23:08:14 PST
Created attachment 128139 [details]
Patch
Comment 2 Mark Hahnenberg 2012-02-21 23:11:50 PST
Created attachment 128140 [details]
Bencher results

~1.5% win on v8, neutral on Kraken and SunSpider
Comment 3 Philippe Normand 2012-02-21 23:14:45 PST
Comment on attachment 128139 [details]
Patch

Attachment 128139 [details] did not pass gtk-ews (gtk):
Output: http://queues.webkit.org/results/11557399
Comment 4 Early Warning System Bot 2012-02-21 23:36:11 PST
Comment on attachment 128139 [details]
Patch

Attachment 128139 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/11561397
Comment 5 Gyuyoung Kim 2012-02-22 00:11:48 PST
Comment on attachment 128139 [details]
Patch

Attachment 128139 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/11558397
Comment 6 Filip Pizlo 2012-02-22 00:19:26 PST
Comment on attachment 128139 [details]
Patch

Why not just have a thing that blitzes empty JS values on 32_64, instead of making 32_64 pay the price twice?
Comment 7 Filip Pizlo 2012-02-22 00:22:10 PST
Ahhh, the wonders of bzero.  You have two options here:

- Use memset instead, which is generally a good idea since bzero is somewhat deprecated.  Be sure to retest performance in case there's some weirdness with the compiler doing smarter things to bzero than it did to memset.

- Include all the headers.  Different platforms seem to put it in different places.  It might be in strings.h, or string.h, or stdlib.h.  But then strings.h might not exist on some platforms, ...  Yuck.

Moral of the story: just use memset if you can.
Comment 8 Mark Hahnenberg 2012-02-22 11:26:00 PST
(In reply to comment #6)
> (From update of attachment 128139 [details])
> Why not just have a thing that blitzes empty JS values on 32_64, instead of making 32_64 pay the price twice?

The problem with this strategy is that on 32-bit, we alternate 32-bits of zero and 32-bits of non-zero. If one of the allocations into the block is size n where n mod 8 = 4, we get out of sync with this pattern, thus requiring us to re-clear the arrays anyways. I guess we could potentially keep track of whether we're in sync or not inside each CopiedBlock and just check if we need to manually clear or not?
Comment 9 Mark Hahnenberg 2012-02-22 22:07:23 PST
Created attachment 128394 [details]
Patch
Comment 10 Filip Pizlo 2012-02-22 22:09:39 PST
Comment on attachment 128394 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=128394&action=review

> Source/JavaScriptCore/heap/CopiedBlock.h:54
> +        ASSERT(is8ByteAligned(static_cast<void*>(m_payload)));
> +        JSValue emptyValue;
> +#if OS(MAC_OS_X) || OS(IOS)
> +        memset_pattern8(static_cast<void*>(m_payload), static_cast<void*>(&emptyValue), static_cast<size_t>((reinterpret_cast<char*>(this) + allocation.size()) - m_payload));
> +#else
> +        JSValue* limit = reinterpret_cast<JSValue*>(reinterpret_cast<char*>(this) + allocation.size());
> +        for (JSValue* currentValue = reinterpret_cast<JSValue*>(m_payload); currentValue < limit; currentValue++)
> +            *currentValue = emptyValue;
> +#endif

For good measure, I'd have a case for USE(JSVALUE64) in which I call memset.  I don't trust memset_pattern8 to be as fast as memset if the pattern is zero.

Also, did you verify that there is any speed-up with memset_pattern8 at all on JSVALUE32_64?
Comment 11 Mark Hahnenberg 2012-02-22 23:36:46 PST
> Also, did you verify that there is any speed-up with memset_pattern8 at all on JSVALUE32_64?

Yep. The speedup is less dramatic, but it's still there.
Comment 12 Mark Hahnenberg 2012-02-22 23:38:54 PST
Created attachment 128409 [details]
32-bit bencher results
Comment 13 Filip Pizlo 2012-02-22 23:40:39 PST
(In reply to comment #11)
> > Also, did you verify that there is any speed-up with memset_pattern8 at all on JSVALUE32_64?
> 
> Yep. The speedup is less dramatic, but it's still there.

I mean, is there any speed-up over writing your own loop?  Like the one that you have when not on IOS or MAC_OS_X?

(And btw you can use test both of them as DARWIN.  I don't remember is it's OS(DARWIN) or PLATFORM(DARWIN).)
Comment 14 Mark Hahnenberg 2012-02-23 11:13:49 PST
Created attachment 128507 [details]
Patch
Comment 15 Mark Hahnenberg 2012-02-23 11:15:29 PST
Created attachment 128508 [details]
32-bit results with just copying JSValues (no memset_pattern)

There wasn't much of a difference (and the JSValue version might even be a tiny bit faster) so I just got rid of the memset_pattern part altogether to reduce the amount of preprocessor cruft.
Comment 16 WebKit Review Bot 2012-02-23 19:09:43 PST
Comment on attachment 128507 [details]
Patch

Clearing flags on attachment: 128507

Committed r108716: <http://trac.webkit.org/changeset/108716>
Comment 17 WebKit Review Bot 2012-02-23 19:09:49 PST
All reviewed patches have been landed.  Closing bug.
Comment 18 Csaba Osztrogonác 2012-02-23 22:58:16 PST
Reopen, because it made all tests crash on 32 bit Qt.
Maybe on other 32 bit platforms too, but I don't know,
because build.webkit.org is unavailable nowadays.

http://build.webkit.sed.hu/results/x86-32%20Linux%20Qt%20Release%20NRWT/r108716%20%286626%29/results.html


$ cat block-wrappers-necessary-crash-log.txt | c++filt
1   0x8066e9b /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/bin/DumpRenderTree() [0x8066e9b]
2   0xf775e400 [0xf775e400]
3   0xf7023981 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WTF::fastRealloc(void*, unsigned int)+0xa21) [0xf7023981]
4   0xf6e81981 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(JSC::JIT::privateCompile(JSC::MacroAssemblerCodePtr*)+0xd21) [0xf6e81981]
5   0xf6f6f767 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(JSC::JIT::compile(JSC::JSGlobalData*, JSC::CodeBlock*, JSC::MacroAssemblerCodePtr*)+0x77) [0xf6f6f767]
6   0xf6f7134d /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(JSC::jitCompileFunctionIfAppropriate(JSC::JSGlobalData&, WTF::OwnPtr<JSC::FunctionCodeBlock>&, JSC::JITCode&, JSC::MacroAssemblerCodePtr&, JSC::SharedSymbolTable*&, JSC::JITCode::JITType)+0x12d) [0xf6f7134d]
7   0xf6f6cc0f /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(JSC::FunctionExecutable::compileForCallInternal(JSC::ExecState*, JSC::ScopeChainNode*, JSC::JITCode::JITType)+0x13f) [0xf6f6cc0f]
8   0xf6e06b68 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(JSC::FunctionExecutable::compileFor(JSC::ExecState*, JSC::ScopeChainNode*, JSC::CodeSpecializationKind)+0x68) [0xf6e06b68]
9   0xf6ea6715 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(cti_vm_lazyLinkCall+0x185) [0xf6ea6715]
10  0xf0d050c8 [0xf0d050c8]
11  0xf6e5b4f1 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(JSC::Interpreter::executeCall(JSC::ExecState*, JSC::JSObject*, JSC::CallType, JSC::CallData const&, JSC::JSValue, JSC::ArgList const&)+0x401) [0xf6e5b4f1]
12  0xf6f56ad6 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(JSC::call(JSC::ExecState*, JSC::JSValue, JSC::CallType, JSC::CallData const&, JSC::JSValue, JSC::ArgList const&)+0x66) [0xf6f56ad6]
13  0xf5e3f783 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::JSEventListener::handleEvent(WebCore::ScriptExecutionContext*, WebCore::Event*)+0x453) [0xf5e3f783]
14  0xf6021ddd /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::EventTarget::fireEventListeners(WebCore::Event*, WebCore::EventTargetData*, WTF::Vector<WebCore::RegisteredEventListener, 1u>&)+0x12d) [0xf6021ddd]
15  0xf6021f9d /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::EventTarget::fireEventListeners(WebCore::Event*)+0x5d) [0xf6021f9d]
16  0xf63960da /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::DOMWindow::dispatchEvent(WTF::PassRefPtr<WebCore::Event>, WTF::PassRefPtr<WebCore::EventTarget>)+0xca) [0xf63960da]
17  0xf639d137 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::DOMWindow::dispatchLoadEvent()+0x117) [0xf639d137]
18  0xf5fd8c3e /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::Document::dispatchWindowLoadEvent()+0x2e) [0xf5fd8c3e]
19  0xf5fe2762 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::Document::implicitClose()+0x122) [0xf5fe2762]
20  0xf631da34 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::FrameLoader::checkCallImplicitClose()+0x74) [0xf631da34]
21  0xf6326b11 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::FrameLoader::checkCompleted()+0xc1) [0xf6326b11]
22  0xf6326c7d /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::FrameLoader::loadDone()+0x1d) [0xf6326c7d]
23  0xf630bdeb /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::CachedResourceLoader::loadDone()+0x4b) [0xf630bdeb]
24  0xf635bc8f /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::SubresourceLoader::releaseResources()+0x5f) [0xf635bc8f]
25  0xf6351a7a /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::ResourceLoader::didFinishLoading(double)+0x3a) [0xf6351a7a]
26  0xf635c551 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::SubresourceLoader::didFinishLoading(double)+0xe1) [0xf635c551]
27  0xf6351541 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::ResourceLoader::didFinishLoading(WebCore::ResourceHandle*, double)+0x71) [0xf6351541]
28  0xf666aa2d /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::QNetworkReplyHandler::finish()+0x1bd) [0xf666aa2d]
29  0xf6667394 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::QNetworkReplyHandlerCallQueue::flush()+0x64) [0xf6667394]
30  0xf6667934 /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::QNetworkReplyHandlerCallQueue::push(void (WebCore::QNetworkReplyHandler::*)())+0x34) [0xf6667934]
31  0xf666797f /home/webkitbuildbot/slaves/release32bit-NRWT/buildslave/qt-linux-32-release-NRWT/build/WebKitBuild/Release/lib/libQtWebKit.so.4(WebCore::QNetworkReplyWrapper::didReceiveFinished()+0x3f) [0xf666797f]

Zoltán, Gábor, Péter, could you check it?
Comment 19 Csaba Osztrogonác 2012-02-24 01:34:21 PST
Additionally there are crashes on Qt ARM too. :(
Comment 20 Zoltan Herczeg 2012-02-24 04:35:45 PST
I summarize what I know so far.

This is the new code:

JSValue emptyValue;
JSValue* limit = reinterpret_cast<JSValue*>(reinterpret_cast<char*>(this) + allocation.size());
for (JSValue* currentValue = reinterpret_cast<JSValue*>(m_payload); currentValue < limit; currentValue++)
    *currentValue = emptyValue; 

In DRT this function runs only once for dom/svg/level3/xpath/XPathEvaluator_evaluate_NAMESPACE_ERR.svg

the 'this' (0xf02f0000) and 'limit' (0xf0300000) variables seems ok, but m_payload contains 0xf02f002c which is not divisible by 8! I feel this is the root of the issues. I continue debugging.
Comment 21 Zoltan Herczeg 2012-02-24 04:41:45 PST
I suspect this cause the alignment error:

    void* m_offset;
    uintptr_t m_isPinned;
    uintptr_t m_padding;
    char m_payload[1];

This is 3*4 = 12 bytes on 32 bit, while it is 8*3 = 24 bytes on 64 bit. 24 bytes are divisible by 8, but 12 bytes are not!
Comment 22 Zoltan Herczeg 2012-02-24 05:39:27 PST
Temporary fix for http://trac.webkit.org/changeset/108779
Comment 23 Mark Hahnenberg 2012-02-25 08:02:10 PST
Bug 79556 should have taken care of any cross-platform alignment issues. Closing bug.