Bug 144051 - Use less memory when compiling content extensions.
Summary: Use less memory when compiling content extensions.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Alex Christensen
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-04-22 10:50 PDT by Alex Christensen
Modified: 2015-04-23 12:58 PDT (History)
5 users (show)

See Also:


Attachments
Patch (34.45 KB, patch)
2015-04-22 10:56 PDT, Alex Christensen
no flags Details | Formatted Diff | Diff
Patch (40.07 KB, patch)
2015-04-22 12:09 PDT, Alex Christensen
no flags Details | Formatted Diff | Diff
Patch (50.57 KB, patch)
2015-04-22 18:29 PDT, Alex Christensen
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews104 for mac-mavericks-wk2 (833.39 KB, application/zip)
2015-04-22 19:23 PDT, Build Bot
no flags Details
Patch (46.70 KB, patch)
2015-04-22 21:17 PDT, Alex Christensen
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews107 for mac-mavericks-wk2 (808.53 KB, application/zip)
2015-04-22 22:11 PDT, Build Bot
no flags Details
Patch (46.74 KB, patch)
2015-04-23 07:11 PDT, Alex Christensen
no flags Details | Formatted Diff | Diff
Patch (46.75 KB, patch)
2015-04-23 07:13 PDT, Alex Christensen
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews105 for mac-mavericks-wk2 (811.01 KB, application/zip)
2015-04-23 07:51 PDT, Build Bot
no flags Details
Patch (47.15 KB, patch)
2015-04-23 12:01 PDT, Alex Christensen
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alex Christensen 2015-04-22 10:50:25 PDT
Right now we have a gigantic DFA that uses way too much memory.  Changing the DFANode structure to not use a HashSet would make it a lot smaller.
Comment 1 Alex Christensen 2015-04-22 10:56:24 PDT
Created attachment 251337 [details]
Patch
Comment 2 Alex Christensen 2015-04-22 10:57:04 PDT
Comment on attachment 251337 [details]
Patch

This patch was work in progress.
Comment 3 Darin Adler 2015-04-22 11:13:41 PDT
Comment on attachment 251337 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=251337&action=review

> Source/WebCore/contentextensions/CombinedURLFilters.cpp:46
> +static size_t recursiveMemoryUsed(const std::unique_ptr<PrefixTreeVertex>& node)

Seems to me this should take const PrefixTreeVertex&.

> Source/WebCore/contentextensions/CombinedURLFilters.cpp:51
> +    for (const std::unique_ptr<PrefixTreeVertex>& child : node->edges.values())

Should use auto& here.

> Source/WebCore/contentextensions/CombinedURLFilters.cpp:52
> +        size += recursiveMemoryUsed(child);

Seems a shame to use explicit recursion here since we’re just trying to walk the tree. Is there no non-recursive way to traverse the tree? We could keep an explicit stack, I suppose.

> Source/WebCore/contentextensions/CombinedURLFilters.cpp:58
> +    return recursiveMemoryUsed(m_prefixTreeRoot);

If this is guaranteed to be non-null, then we should just pass *m_prefixTreeRoot.

> Source/WebCore/contentextensions/DFA.cpp:58
> +        vector.append(dfa.actions[i]);

This can be appendUnchecked since you have reserved capacity.

> Source/WebCore/contentextensions/DFA.cpp:68
> +        vector.append(dfa.transitions[i]);

This can be appendUnchecked since you have reserved capacity.

> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:136
> +    for (const std::pair<uint8_t, uint32_t>& pair : node.transitions(m_dfa)) {

Should use auto& here instead of the long type.

> Source/WebCore/contentextensions/DFANode.h:52
> +    DFANode()
> +        : m_actionsStart(0)
> +        , m_transitionsStart(0)
> +        , m_actionsLength(0)
> +        , m_transitionsLength(0)
> +        , m_flags(0)
> +    {
> +    }

Should initialize these members where they are defined, then we would not need to define a constructor at all.

> Source/WebCore/contentextensions/DFANode.h:70
> +    uint32_t m_actionsStart;

This is how to do that:

    uint32_t m_actionsStart { 0 };

> Source/WebCore/contentextensions/NFA.cpp:57
> +        size += sizeof(node)
> +        + node.transitions.capacity() * sizeof(std::pair<uint16_t, NFANodeIndexSet>)
> +        + node.transitionsOnAnyCharacter.capacity() * sizeof(unsigned)
> +        + node.finalRuleIds.size() * sizeof(uint64_t);

Wrong indentation.

> Source/WebCore/contentextensions/NFAToDFA.cpp:385
> +            dfa.transitions.append(std::pair<uint8_t, uint32_t>(static_cast<uint8_t>(key), targetNodeId));

Should be able to use make_pair or { } syntax and not need to specify the types of the pair.

> Tools/TestWebKitAPI/Tests/WebCore/DFAMinimizer.cpp:52
> +    for (unsigned i = 0; i < dfa.nodes.size(); ++i) {
> +        if (!dfa.nodes[i].isKilled())
>              ++counter;
>      }

Should use modern for loop now; we couldn’t before.
Comment 4 Alex Christensen 2015-04-22 12:09:14 PDT
Created attachment 251351 [details]
Patch
Comment 5 Alex Christensen 2015-04-22 12:20:04 PDT
(In reply to comment #3)
> Seems a shame to use explicit recursion here since we’re just trying to walk
> the tree. Is there no non-recursive way to traverse the tree? We could keep
> an explicit stack, I suppose.
This function is not used except in LOG_LARGE_STRUCTURES to help me debug the memory usage, so speed and efficiency are not important here.
Comment 6 Alex Christensen 2015-04-22 13:26:23 PDT
Comment on attachment 251351 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=251351&action=review

> Source/WebCore/contentextensions/DFAMinimizer.cpp:73
> +        // FIXME: Use the same location to write new transitions followed by unused memory.

By the way, Ben if you could get this working that'd be great! :)
Comment 7 Darin Adler 2015-04-22 17:52:38 PDT
Comment on attachment 251351 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=251351&action=review

> Source/WebCore/contentextensions/DFA.h:40
>  class WEBCORE_EXPORT DFA {

Should change this to a struct instead of a class.

> Source/WebCore/contentextensions/DFA.h:42
>      DFA();

Should omit this and just let the members initialize themselves.

> Source/WebCore/contentextensions/DFA.h:54
> +    unsigned root;

Should initialize this here:

    unsigned root { 0 };

> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:130
> +    for (const std::pair<uint8_t, uint32_t>& pair : node.transitions(m_dfa)) {

Should use const auto& here.

> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:131
> +        RELEASE_ASSERT(pair.first < 128);

Would be better if this compared with WTF_ARRAY_LENGTH(destinations) instead of 128.

> Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:132
> +        ASSERT_WITH_MESSAGE(destinations[pair.first] == noDestination, "Transitions should be unique");

I think this assertion would be fine without the message. Are you sure you want that message?

> Source/WebCore/contentextensions/DFAMinimizer.cpp:480
>          relocationVector.append(i);

Since you reserved capacity, you can use appendUnchecked here.

> Source/WebCore/contentextensions/DFAMinimizer.h:31
> +#include "DFA.h"

Don't need to include DFA.h here. Just forward declare the struct.

> Source/WebCore/contentextensions/DFAMinimizer.h:32
>  #include <wtf/Vector.h>

Don't need to include this header.

> Source/WebCore/contentextensions/DFANode.h:43
>  class DFANode {

This class is super hard to read because it’s got so many functions  all mushed together mixed in with implementations. Should paragraph the functions into logical groups, and put the function bodies separate after the class definition, marked inline since you want them inlined.

> Source/WebCore/contentextensions/DFANode.h:45
> +    

Excess blank line here.

> Source/WebCore/contentextensions/NFAToDFA.cpp:384
> +            RELEASE_ASSERT_WITH_MESSAGE(key <= std::numeric_limits<uint8_t>::max(), "ASCII transition should always fit in 8 bits");

Plain old RELEASE_ASSERT should be fine. I don't think we need this message. Can put it in a comment if you like.

> Source/WebCore/contentextensions/NFAToDFA.cpp:391
> +        RELEASE_ASSERT_WITH_MESSAGE(transitionsLength <= std::numeric_limits<uint8_t>::max(), "ASCII transition count should always fit in 8 bits");

Plain old RELEASE_ASSERT should be fine. I don't think we need this message. Can put it in a comment if you like.
Comment 8 Alex Christensen 2015-04-22 18:29:22 PDT
Created attachment 251393 [details]
Patch
Comment 9 Alex Christensen 2015-04-22 18:32:08 PDT
Comment on attachment 251393 [details]
Patch

This latest version is almost functionally ok.  It still fails ContentExtensionTest.IgnorePrevious.  I'll polish it up, fix that bug, address the comments, and land tomorrow or Friday.
Comment 10 Build Bot 2015-04-22 19:23:56 PDT
Comment on attachment 251393 [details]
Patch

Attachment 251393 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.appspot.com/results/4624308198440960

New failing tests:
http/tests/contentextensions/loading/main-resource-redirect-blocked.php
http/tests/contentextensions/text-track-blocked.html
http/tests/contentextensions/popups.html
http/tests/contentextensions/subresource-redirect-blocked.html
http/tests/contentextensions/media-filtered.html
http/tests/contentextensions/block-cookies-basic.html
http/tests/contentextensions/whitelist.html
http/tests/contentextensions/css-display-none.html
http/tests/contentextensions/basic-filter.html
http/tests/contentextensions/block-cookies-send.html
http/tests/contentextensions/character-set-basic-support.html
Comment 11 Build Bot 2015-04-22 19:23:57 PDT
Created attachment 251397 [details]
Archive of layout-test-results from ews104 for mac-mavericks-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews104  Port: mac-mavericks-wk2  Platform: Mac OS X 10.9.5
Comment 12 Benjamin Poulain 2015-04-22 20:50:17 PDT
Comment on attachment 251393 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=251393&action=review

I had a quick look. Probably useless to review at this point, this is WIP.

> Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:234
> +            unsigned actionsStart = dfa.actions.size();
>              for (uint64_t actionLocation : universalActionLocations)

Since you know the size of universalActionLocations and dfa.actions is big, it would be good to allocate the memory before the loop.

> Source/WebCore/contentextensions/DFA.cpp:47
> +        + transitions.size() * sizeof(std::pair<uint8_t, uint32_t>)

If you want to better pack the transitions, you could have 2 vectors: one for the character, one for the target.

> Source/WebCore/contentextensions/DFA.h:54
> +    Vector<uint64_t> actions;
> +    Vector<std::pair<uint8_t, uint32_t>> transitions;
> +    Vector<DFANode> nodes;
> +    unsigned root;

I am a bit worried about making everything public.

That is something JSC does a lot, and it leads to a lot of weird depedencies.
Comment 13 Alex Christensen 2015-04-22 21:17:24 PDT
Created attachment 251402 [details]
Patch
Comment 14 Build Bot 2015-04-22 22:11:39 PDT
Comment on attachment 251402 [details]
Patch

Attachment 251402 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.appspot.com/results/6472748306006016

New failing tests:
http/tests/contentextensions/loading/main-resource-redirect-blocked.php
http/tests/contentextensions/text-track-blocked.html
http/tests/contentextensions/popups.html
http/tests/contentextensions/subresource-redirect-blocked.html
http/tests/contentextensions/media-filtered.html
http/tests/contentextensions/block-cookies-basic.html
http/tests/contentextensions/whitelist.html
http/tests/contentextensions/css-display-none.html
http/tests/contentextensions/basic-filter.html
http/tests/contentextensions/block-cookies-send.html
http/tests/contentextensions/character-set-basic-support.html
Comment 15 Build Bot 2015-04-22 22:11:42 PDT
Created attachment 251406 [details]
Archive of layout-test-results from ews107 for mac-mavericks-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews107  Port: mac-mavericks-wk2  Platform: Mac OS X 10.9.5
Comment 16 Alex Christensen 2015-04-22 23:43:33 PDT
Tests definitely pass locally. Not sure what to do.
Comment 17 Benjamin Poulain 2015-04-22 23:47:57 PDT
Comment on attachment 251402 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=251402&action=review

The patch is really good. Pretty cool given how little time you had.

The tests fail though so I r-. I did not see the error while reviewing :(

> Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:137
> +    LOG_LARGE_STRUCTURES(actions, actions.size() * sizeof(SerializedActionByte));

Should be actions.capacity();

> Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:169
> +    LOG_LARGE_STRUCTURES(parsedRuleList, parsedRuleList.size() * sizeof(ContentExtensionRule)); // Doesn't include strings.
> +    LOG_LARGE_STRUCTURES(actionLocations, actionLocations.size() * sizeof(unsigned));

capacity() in both cases.

> Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:245
> +    LOG_LARGE_STRUCTURES(universalActionLocations, universalActionLocations.size() * sizeof(unsigned));

capacity()

> Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:260
> +    LOG_LARGE_STRUCTURES(bytecode, bytecode.size() * sizeof(uint8_t));

capacity() * sizeof(DFABytecode)

> Source/WebCore/contentextensions/ContentExtensionsDebugging.h:38
> +#define LOG_LARGE_STRUCTURES(name, size) /* if (size > 1000000) { WTFLogAlways("NAME: %s SIZE %d", #name, (int)(size)); }; */

Another way would be to have LOG_LARGE_STRUCTURES extend to nothing if CONTENT_EXTENSIONS_PERFORMANCE_REPORTING is zero.

> Source/WebCore/contentextensions/DFA.cpp:46
> +// FIXME: Make DFANode.cpp.

Oops :)

> Source/WebCore/contentextensions/DFAMinimizer.cpp:442
> +        hasher.addCharactersAssumingAligned(reinterpret_cast<const UChar*>(&dfa->actions[actionsStart]), actionsLength * sizeof(uint64_t) / sizeof(UChar));

This does not look correct. There is no guarantee the action order will be the same in two nodes.

> Source/WebCore/contentextensions/DFANode.h:103
> +// FIXME: Pack this down to 12.

__attribute__((packed)) may do the trick but I am a bit worried about it.

A loadPair instruction on any attribute would crash.

> Source/WebCore/contentextensions/NFAToDFA.cpp:384
> +            RELEASE_ASSERT(key <= std::numeric_limits<uint8_t>::max());

Wouldn't <= 127 be good too?

> Tools/ChangeLog:6
> +        Reviewed by Darin Adler.

?

> Tools/TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp:490
> +    // FIXME: This does not make multiple DFAs anymore. Add a test that does.

Still a good test, it caught a bug in the minimizer :)
Comment 18 Alex Christensen 2015-04-23 07:11:09 PDT
Created attachment 251432 [details]
Patch
Comment 19 Alex Christensen 2015-04-23 07:13:48 PDT
Created attachment 251433 [details]
Patch
Comment 20 Alex Christensen 2015-04-23 07:46:11 PDT
(In reply to comment #17)
> > Source/WebCore/contentextensions/DFAMinimizer.cpp:442
> > +        hasher.addCharactersAssumingAligned(reinterpret_cast<const UChar*>(&dfa->actions[actionsStart]), actionsLength * sizeof(uint64_t) / sizeof(UChar));
> 
> This does not look correct. There is no guarantee the action order will be
> the same in two nodes.
I added a FIXME for sorting the actions.  For now this is correct because if the order is not the same, then they will also be unequal.  Sorting would only make minimization more effective but not more correct.

I cannot reproduce the errors on Mavericks.  I'm wondering if it's an issue with the EWS bot.
Comment 21 Build Bot 2015-04-23 07:51:10 PDT
Comment on attachment 251433 [details]
Patch

Attachment 251433 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.appspot.com/results/6154858985947136

New failing tests:
http/tests/contentextensions/loading/main-resource-redirect-blocked.php
http/tests/contentextensions/text-track-blocked.html
http/tests/contentextensions/popups.html
http/tests/contentextensions/subresource-redirect-blocked.html
http/tests/contentextensions/media-filtered.html
http/tests/contentextensions/block-cookies-basic.html
http/tests/contentextensions/whitelist.html
http/tests/contentextensions/css-display-none.html
http/tests/contentextensions/basic-filter.html
http/tests/contentextensions/block-cookies-send.html
http/tests/contentextensions/character-set-basic-support.html
Comment 22 Build Bot 2015-04-23 07:51:15 PDT
Created attachment 251434 [details]
Archive of layout-test-results from ews105 for mac-mavericks-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews105  Port: mac-mavericks-wk2  Platform: Mac OS X 10.9.5
Comment 23 Alex Christensen 2015-04-23 08:05:37 PDT
Sure enough, there are a bunch of release-only failures.  Don't have time to look into them right now.  Maybe tonight or tomorrow.
Comment 24 Alex Christensen 2015-04-23 12:01:20 PDT
Created attachment 251461 [details]
Patch
Comment 25 Alex Christensen 2015-04-23 12:02:48 PDT
Inlining ActionKeyHash::equal seemed to be the problem.  Not sure why.  It should work fine now.
Comment 26 WebKit Commit Bot 2015-04-23 12:58:07 PDT
Comment on attachment 251461 [details]
Patch

Clearing flags on attachment: 251461

Committed r183204: <http://trac.webkit.org/changeset/183204>
Comment 27 WebKit Commit Bot 2015-04-23 12:58:13 PDT
All reviewed patches have been landed.  Closing bug.