Bug 134416

Summary: reduce dynamic memory allocation in css jit compiler
Product: WebKit Reporter: Alex Christensen <achristensen>
Component: CSSAssignee: Alex Christensen <achristensen>
Status: RESOLVED FIXED    
Severity: Normal CC: benjamin, bunhere, cdumez, cmarcelo, commit-queue, ggaren, gyuyoung.kim, sergio
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 134484    
Bug Blocks:    
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch benjamin: review+

Description Alex Christensen 2014-06-27 15:06:27 PDT
The css jit compiler could be sped up.  Calls to malloc are slow.  Let's get rid of them.
Comment 1 Alex Christensen 2014-06-27 15:12:06 PDT
Created attachment 234023 [details]
Patch
Comment 2 WebKit Commit Bot 2014-06-27 15:14:12 PDT
Attachment 234023 [details] did not pass style-queue:


ERROR: Source/WTF/wtf/Deque.h:41:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
Total errors found: 1 in 5 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Darin Adler 2014-06-27 16:48:13 PDT
Comment on attachment 234023 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=234023&action=review

> Source/WTF/ChangeLog:35
> +        (WTF::inlineCapacity>::checkValidity):
> +        (WTF::inlineCapacity>::checkIndexValidity):
> +        (WTF::inlineCapacity>::invalidateIterators):
> +        (WTF::inlineCapacity>::Deque):
> +        (WTF::=):
> +        (WTF::inlineCapacity>::destroyAll):
> +        (WTF::inlineCapacity>::~Deque):
> +        (WTF::inlineCapacity>::swap):
> +        (WTF::inlineCapacity>::clear):
> +        (WTF::inlineCapacity>::expandCapacityIfNeeded):
> +        (WTF::inlineCapacity>::expandCapacity):
> +        (WTF::inlineCapacity>::append):
> +        (WTF::inlineCapacity>::prepend):
> +        (WTF::inlineCapacity>::removeFirst):
> +        (WTF::inlineCapacity>::removeLast):
> +        (WTF::inlineCapacity>::remove):
> +        (WTF::inlineCapacity>::addToIteratorsList):
> +        (WTF::inlineCapacity>::removeFromIteratorsList):
> +        (WTF::inlineCapacity>::DequeIteratorBase):
> +        (WTF::inlineCapacity>::~DequeIteratorBase):
> +        (WTF::inlineCapacity>::isEqual):
> +        (WTF::inlineCapacity>::increment):
> +        (WTF::inlineCapacity>::decrement):
> +        (WTF::inlineCapacity>::after):
> +        (WTF::inlineCapacity>::before):

The script generated garbage here. Please either fix these manually or remove them. I think it’s OK to omit the list of function names entirely.

> Source/WebCore/cssjit/RegisterAllocator.h:65
> +    JSC::ARMRegisters::r9,

Change not mentioned in the change log.

> Source/WebCore/cssjit/RegisterAllocator.h:73
> +    JSC::ARMRegisters::r8,
> +    JSC::ARMRegisters::r10,
> +    JSC::ARMRegisters::r11,

Change not mentioned in the change log.
Comment 4 Benjamin Poulain 2014-06-27 17:08:29 PDT
Comment on attachment 234023 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=234023&action=review

+1 from me too. One comment:

> Source/WebCore/cssjit/StackAllocator.h:75
> -        unsigned registerCount = registerIDs.size();
> -        Vector<StackReference> stackReferences;
> -        stackReferences.reserveInitialCapacity(registerCount);
> +        Vector<StackReference, registerCount> stackReferences;

The stack allocation is not useful here because the vector is returned by value. This will have to be heap allocated in the copy constructor when returning.
Comment 5 Alex Christensen 2014-06-30 11:14:04 PDT
Created attachment 234081 [details]
Patch
Comment 6 Alex Christensen 2014-06-30 11:16:38 PDT
I'll put the additional armv7 registers in a different patch because that change was unrelated.

I expanded my usage of vector inline capacity throughout the selector compiler using 32 as a large number that will probably be more than the number of function calls, fragments, etc.
Comment 7 Geoffrey Garen 2014-06-30 11:17:47 PDT
> +1 from me too. One comment:
> 
> > Source/WebCore/cssjit/StackAllocator.h:75
> > -        unsigned registerCount = registerIDs.size();
> > -        Vector<StackReference> stackReferences;
> > -        stackReferences.reserveInitialCapacity(registerCount);
> > +        Vector<StackReference, registerCount> stackReferences;
> 
> The stack allocation is not useful here because the vector is returned by value. This will have to be heap allocated in the copy constructor when returning.

I don't think so. Vector with inline capacity has an optimization to copy to inline capacity without malloc-ating. It's a little far removed from the Vector class itself, but here is the code:

    VectorBuffer(size_t capacity, size_t size = 0)
        : Base(inlineBuffer(), inlineCapacity, size)
    {
        if (capacity > inlineCapacity)
            Base::allocateBuffer(capacity);
    }
Comment 8 Alex Christensen 2014-06-30 11:21:10 PDT
Created attachment 234083 [details]
Patch
Comment 9 WebKit Commit Bot 2014-06-30 11:23:40 PDT
Attachment 234083 [details] did not pass style-queue:


ERROR: Source/WTF/wtf/Deque.h:41:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
Total errors found: 1 in 7 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 10 Benjamin Poulain 2014-06-30 14:31:53 PDT
Comment on attachment 234083 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=234083&action=review

Some comments, nothing critical.

> Source/WebCore/cssjit/FunctionCall.h:183
> +    Vector<std::pair<JSC::MacroAssembler::Call, JSC::FunctionPtr>, 32>& m_callRegistry;

You can typedef this to avoid repeating the 32.

> Source/WebCore/cssjit/SelectorCompiler.cpp:166
> -    Vector<const AtomicStringImpl*, 1> classNames;
> +    Vector<const AtomicStringImpl*, 32> classNames;
>      HashSet<unsigned> pseudoClasses;
> -    Vector<JSC::FunctionPtr> unoptimizedPseudoClasses;
> -    Vector<AttributeMatchingInfo> attributes;
> -    Vector<std::pair<int, int>> nthChildFilters;
> +    Vector<JSC::FunctionPtr, 32> unoptimizedPseudoClasses;
> +    Vector<AttributeMatchingInfo, 32> attributes;
> +    Vector<std::pair<int, int>, 32> nthChildFilters;

I am not sure that is a good idea. The filters unoptimizedPseudoClasses and nthChildFilters are uncommon. The attributes rarely goes over 1. Classe names are usually 1 or 2.

> Source/WebCore/cssjit/SelectorCompiler.cpp:196
> +typedef Vector<SelectorFragment, 32> SelectorFragmentList;
> +typedef Vector<TagNamePattern, 32> TagNameList;

Again, this seems a little big.

> Source/WebCore/cssjit/StackAllocator.h:75
> +    Vector<StackReference, registerCount> push(const Vector<JSC::MacroAssembler::RegisterID>& registerIDs)
>      {
>          RELEASE_ASSERT(!m_hasFunctionCallPadding);
> -        unsigned registerCount = registerIDs.size();
> -        Vector<StackReference> stackReferences;
> -        stackReferences.reserveInitialCapacity(registerCount);
> +        Vector<StackReference, registerCount> stackReferences;

I think you need to get the output as an argument to avoid the copy.
Comment 11 Alex Christensen 2014-06-30 15:50:26 PDT
http://trac.webkit.org/changeset/170605

Passing a Vector& to push matched pop and it avoided an appendVector.
Comment 12 WebKit Commit Bot 2014-06-30 21:13:39 PDT
Re-opened since this is blocked by bug 134484
Comment 13 Benjamin Poulain 2014-06-30 21:16:51 PDT
I got into several crashes with this backtrace:
  * frame #0: 0x0000000100eb81c8 JavaScriptCore`WTFCrash + 72
    frame #1: 0x0000000101db1318 WebCore`WebCore::StackAllocator::pop(WTF::Vector<WebCore::StackAllocator::StackReference, 0ul, WTF::CrashOnOverflow> const&, WTF::Vector<JSC::ARM64Registers::RegisterID, 0ul, WTF::CrashOnOverflow> const&) + 432
    frame #2: 0x0000000101db54bc WebCore`WebCore::FunctionCall::restoreAllocatedCallerSavedRegisters() + 208
    frame #3: 0x0000000101da8c04 WebCore`WebCore::SelectorCompiler::SelectorCodeGenerator::generateElementFunctionCallTest(JSC::AbstractMacroAssembler<JSC::ARM64Assembler>::JumpList&, JSC::FunctionPtr) + 240
    frame #4: 0x0000000101da6170 WebCore`WebCore::SelectorCompiler::SelectorCodeGenerator::generateElementMatching(JSC::AbstractMacroAssembler<JSC::ARM64Assembler>::JumpList&, JSC::AbstractMacroAssembler<JSC::ARM64Assembler>::JumpList&, WebCore::SelectorCompiler::SelectorFragment const&) + 616
    frame #5: 0x0000000101da56d4 WebCore`WebCore::SelectorCompiler::SelectorCodeGenerator::generateSelectorChecker() + 936
    frame #6: 0x0000000101db04ac WebCore`WebCore::SelectorCompiler::SelectorCodeGenerator::compile(JSC::VM*, JSC::MacroAssemblerCodeRef&) + 72
    frame #7: 0x0000000101da5300 WebCore`WebCore::SelectorCompiler::compileSelector(WebCore::CSSSelector const*, JSC::VM*, WebCore::SelectorCompiler::SelectorContext, JSC::MacroAssemblerCodeRef&) + 84
    frame #8: 0x0000000101dbc9c0 WebCore`void WebCore::SelectorDataList::execute<WebCore::AllElementExtractorSelectorQueryTrait>(WebCore::ContainerNode&, WebCore::AllElementExtractorSelectorQueryTrait::OutputType&) const + 2496
    frame #9: 0x0000000101dbb988 WebCore`WebCore::SelectorDataList::queryAll(WebCore::ContainerNode&) const + 32
    frame #10: 0x00000001013056fc WebCore`WebCore::ContainerNode::querySelectorAll(WTF::String const&, int&) + 52
    frame #11: 0x0000000101888988 WebCore`WebCore::jsElementPrototypeFunctionQuerySelectorAll(JSC::ExecState*) + 340

It's odd because the patch had nothing to do with ARMv8. We'll need to look into that.
Comment 14 Alex Christensen 2014-07-01 10:11:51 PDT
I see what it was.  The very last line should not have changed registerIDs.size() to registerCount.  I don't know why I did that :(
Comment 15 Alex Christensen 2014-07-01 11:55:18 PDT
Created attachment 234181 [details]
Patch
Comment 16 WebKit Commit Bot 2014-07-01 11:56:38 PDT
Attachment 234181 [details] did not pass style-queue:


ERROR: Source/WTF/wtf/Deque.h:41:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
Total errors found: 1 in 7 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 17 Alex Christensen 2014-07-01 12:07:11 PDT
(In reply to comment #14)
> I see what it was.  The very last line should not have changed registerIDs.size() to registerCount.  I don't know why I did that :(
The problem was the exact opposite.  I should have changed all registerCount to registerIDs.size().  I added new variable names to avoid confusion between registerCount and popRegisterCount or pushRegisterCount.
Comment 18 Alex Christensen 2014-07-01 21:50:41 PDT
http://trac.webkit.org/changeset/170697