Bug 144051

Summary: Use less memory when compiling content extensions.
Product: WebKit Reporter: Alex Christensen <achristensen>
Component: WebCore Misc.Assignee: Alex Christensen <achristensen>
Status: RESOLVED FIXED    
Severity: Normal CC: benjamin, buildbot, commit-queue, rniwa, sam
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Archive of layout-test-results from ews104 for mac-mavericks-wk2
none
Patch
none
Archive of layout-test-results from ews107 for mac-mavericks-wk2
none
Patch
none
Patch
none
Archive of layout-test-results from ews105 for mac-mavericks-wk2
none
Patch none

Alex Christensen
Reported 2015-04-22 10:50:25 PDT
Right now we have a gigantic DFA that uses way too much memory. Changing the DFANode structure to not use a HashSet would make it a lot smaller.
Attachments
Patch (34.45 KB, patch)
2015-04-22 10:56 PDT, Alex Christensen
no flags
Patch (40.07 KB, patch)
2015-04-22 12:09 PDT, Alex Christensen
no flags
Patch (50.57 KB, patch)
2015-04-22 18:29 PDT, Alex Christensen
no flags
Archive of layout-test-results from ews104 for mac-mavericks-wk2 (833.39 KB, application/zip)
2015-04-22 19:23 PDT, Build Bot
no flags
Patch (46.70 KB, patch)
2015-04-22 21:17 PDT, Alex Christensen
no flags
Archive of layout-test-results from ews107 for mac-mavericks-wk2 (808.53 KB, application/zip)
2015-04-22 22:11 PDT, Build Bot
no flags
Patch (46.74 KB, patch)
2015-04-23 07:11 PDT, Alex Christensen
no flags
Patch (46.75 KB, patch)
2015-04-23 07:13 PDT, Alex Christensen
no flags
Archive of layout-test-results from ews105 for mac-mavericks-wk2 (811.01 KB, application/zip)
2015-04-23 07:51 PDT, Build Bot
no flags
Patch (47.15 KB, patch)
2015-04-23 12:01 PDT, Alex Christensen
no flags
Alex Christensen
Comment 1 2015-04-22 10:56:24 PDT
Alex Christensen
Comment 2 2015-04-22 10:57:04 PDT
Comment on attachment 251337 [details] Patch This patch was work in progress.
Darin Adler
Comment 3 2015-04-22 11:13:41 PDT
Comment on attachment 251337 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=251337&action=review > Source/WebCore/contentextensions/CombinedURLFilters.cpp:46 > +static size_t recursiveMemoryUsed(const std::unique_ptr<PrefixTreeVertex>& node) Seems to me this should take const PrefixTreeVertex&. > Source/WebCore/contentextensions/CombinedURLFilters.cpp:51 > + for (const std::unique_ptr<PrefixTreeVertex>& child : node->edges.values()) Should use auto& here. > Source/WebCore/contentextensions/CombinedURLFilters.cpp:52 > + size += recursiveMemoryUsed(child); Seems a shame to use explicit recursion here since we’re just trying to walk the tree. Is there no non-recursive way to traverse the tree? We could keep an explicit stack, I suppose. > Source/WebCore/contentextensions/CombinedURLFilters.cpp:58 > + return recursiveMemoryUsed(m_prefixTreeRoot); If this is guaranteed to be non-null, then we should just pass *m_prefixTreeRoot. > Source/WebCore/contentextensions/DFA.cpp:58 > + vector.append(dfa.actions[i]); This can be appendUnchecked since you have reserved capacity. > Source/WebCore/contentextensions/DFA.cpp:68 > + vector.append(dfa.transitions[i]); This can be appendUnchecked since you have reserved capacity. > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:136 > + for (const std::pair<uint8_t, uint32_t>& pair : node.transitions(m_dfa)) { Should use auto& here instead of the long type. > Source/WebCore/contentextensions/DFANode.h:52 > + DFANode() > + : m_actionsStart(0) > + , m_transitionsStart(0) > + , m_actionsLength(0) > + , m_transitionsLength(0) > + , m_flags(0) > + { > + } Should initialize these members where they are defined, then we would not need to define a constructor at all. > Source/WebCore/contentextensions/DFANode.h:70 > + uint32_t m_actionsStart; This is how to do that: uint32_t m_actionsStart { 0 }; > Source/WebCore/contentextensions/NFA.cpp:57 > + size += sizeof(node) > + + node.transitions.capacity() * sizeof(std::pair<uint16_t, NFANodeIndexSet>) > + + node.transitionsOnAnyCharacter.capacity() * sizeof(unsigned) > + + node.finalRuleIds.size() * sizeof(uint64_t); Wrong indentation. > Source/WebCore/contentextensions/NFAToDFA.cpp:385 > + dfa.transitions.append(std::pair<uint8_t, uint32_t>(static_cast<uint8_t>(key), targetNodeId)); Should be able to use make_pair or { } syntax and not need to specify the types of the pair. > Tools/TestWebKitAPI/Tests/WebCore/DFAMinimizer.cpp:52 > + for (unsigned i = 0; i < dfa.nodes.size(); ++i) { > + if (!dfa.nodes[i].isKilled()) > ++counter; > } Should use modern for loop now; we couldn’t before.
Alex Christensen
Comment 4 2015-04-22 12:09:14 PDT
Alex Christensen
Comment 5 2015-04-22 12:20:04 PDT
(In reply to comment #3) > Seems a shame to use explicit recursion here since we’re just trying to walk > the tree. Is there no non-recursive way to traverse the tree? We could keep > an explicit stack, I suppose. This function is not used except in LOG_LARGE_STRUCTURES to help me debug the memory usage, so speed and efficiency are not important here.
Alex Christensen
Comment 6 2015-04-22 13:26:23 PDT
Comment on attachment 251351 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=251351&action=review > Source/WebCore/contentextensions/DFAMinimizer.cpp:73 > + // FIXME: Use the same location to write new transitions followed by unused memory. By the way, Ben if you could get this working that'd be great! :)
Darin Adler
Comment 7 2015-04-22 17:52:38 PDT
Comment on attachment 251351 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=251351&action=review > Source/WebCore/contentextensions/DFA.h:40 > class WEBCORE_EXPORT DFA { Should change this to a struct instead of a class. > Source/WebCore/contentextensions/DFA.h:42 > DFA(); Should omit this and just let the members initialize themselves. > Source/WebCore/contentextensions/DFA.h:54 > + unsigned root; Should initialize this here: unsigned root { 0 }; > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:130 > + for (const std::pair<uint8_t, uint32_t>& pair : node.transitions(m_dfa)) { Should use const auto& here. > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:131 > + RELEASE_ASSERT(pair.first < 128); Would be better if this compared with WTF_ARRAY_LENGTH(destinations) instead of 128. > Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:132 > + ASSERT_WITH_MESSAGE(destinations[pair.first] == noDestination, "Transitions should be unique"); I think this assertion would be fine without the message. Are you sure you want that message? > Source/WebCore/contentextensions/DFAMinimizer.cpp:480 > relocationVector.append(i); Since you reserved capacity, you can use appendUnchecked here. > Source/WebCore/contentextensions/DFAMinimizer.h:31 > +#include "DFA.h" Don't need to include DFA.h here. Just forward declare the struct. > Source/WebCore/contentextensions/DFAMinimizer.h:32 > #include <wtf/Vector.h> Don't need to include this header. > Source/WebCore/contentextensions/DFANode.h:43 > class DFANode { This class is super hard to read because it’s got so many functions all mushed together mixed in with implementations. Should paragraph the functions into logical groups, and put the function bodies separate after the class definition, marked inline since you want them inlined. > Source/WebCore/contentextensions/DFANode.h:45 > + Excess blank line here. > Source/WebCore/contentextensions/NFAToDFA.cpp:384 > + RELEASE_ASSERT_WITH_MESSAGE(key <= std::numeric_limits<uint8_t>::max(), "ASCII transition should always fit in 8 bits"); Plain old RELEASE_ASSERT should be fine. I don't think we need this message. Can put it in a comment if you like. > Source/WebCore/contentextensions/NFAToDFA.cpp:391 > + RELEASE_ASSERT_WITH_MESSAGE(transitionsLength <= std::numeric_limits<uint8_t>::max(), "ASCII transition count should always fit in 8 bits"); Plain old RELEASE_ASSERT should be fine. I don't think we need this message. Can put it in a comment if you like.
Alex Christensen
Comment 8 2015-04-22 18:29:22 PDT
Alex Christensen
Comment 9 2015-04-22 18:32:08 PDT
Comment on attachment 251393 [details] Patch This latest version is almost functionally ok. It still fails ContentExtensionTest.IgnorePrevious. I'll polish it up, fix that bug, address the comments, and land tomorrow or Friday.
Build Bot
Comment 10 2015-04-22 19:23:56 PDT
Comment on attachment 251393 [details] Patch Attachment 251393 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.appspot.com/results/4624308198440960 New failing tests: http/tests/contentextensions/loading/main-resource-redirect-blocked.php http/tests/contentextensions/text-track-blocked.html http/tests/contentextensions/popups.html http/tests/contentextensions/subresource-redirect-blocked.html http/tests/contentextensions/media-filtered.html http/tests/contentextensions/block-cookies-basic.html http/tests/contentextensions/whitelist.html http/tests/contentextensions/css-display-none.html http/tests/contentextensions/basic-filter.html http/tests/contentextensions/block-cookies-send.html http/tests/contentextensions/character-set-basic-support.html
Build Bot
Comment 11 2015-04-22 19:23:57 PDT
Created attachment 251397 [details] Archive of layout-test-results from ews104 for mac-mavericks-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews104 Port: mac-mavericks-wk2 Platform: Mac OS X 10.9.5
Benjamin Poulain
Comment 12 2015-04-22 20:50:17 PDT
Comment on attachment 251393 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=251393&action=review I had a quick look. Probably useless to review at this point, this is WIP. > Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:234 > + unsigned actionsStart = dfa.actions.size(); > for (uint64_t actionLocation : universalActionLocations) Since you know the size of universalActionLocations and dfa.actions is big, it would be good to allocate the memory before the loop. > Source/WebCore/contentextensions/DFA.cpp:47 > + + transitions.size() * sizeof(std::pair<uint8_t, uint32_t>) If you want to better pack the transitions, you could have 2 vectors: one for the character, one for the target. > Source/WebCore/contentextensions/DFA.h:54 > + Vector<uint64_t> actions; > + Vector<std::pair<uint8_t, uint32_t>> transitions; > + Vector<DFANode> nodes; > + unsigned root; I am a bit worried about making everything public. That is something JSC does a lot, and it leads to a lot of weird depedencies.
Alex Christensen
Comment 13 2015-04-22 21:17:24 PDT
Build Bot
Comment 14 2015-04-22 22:11:39 PDT
Comment on attachment 251402 [details] Patch Attachment 251402 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.appspot.com/results/6472748306006016 New failing tests: http/tests/contentextensions/loading/main-resource-redirect-blocked.php http/tests/contentextensions/text-track-blocked.html http/tests/contentextensions/popups.html http/tests/contentextensions/subresource-redirect-blocked.html http/tests/contentextensions/media-filtered.html http/tests/contentextensions/block-cookies-basic.html http/tests/contentextensions/whitelist.html http/tests/contentextensions/css-display-none.html http/tests/contentextensions/basic-filter.html http/tests/contentextensions/block-cookies-send.html http/tests/contentextensions/character-set-basic-support.html
Build Bot
Comment 15 2015-04-22 22:11:42 PDT
Created attachment 251406 [details] Archive of layout-test-results from ews107 for mac-mavericks-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews107 Port: mac-mavericks-wk2 Platform: Mac OS X 10.9.5
Alex Christensen
Comment 16 2015-04-22 23:43:33 PDT
Tests definitely pass locally. Not sure what to do.
Benjamin Poulain
Comment 17 2015-04-22 23:47:57 PDT
Comment on attachment 251402 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=251402&action=review The patch is really good. Pretty cool given how little time you had. The tests fail though so I r-. I did not see the error while reviewing :( > Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:137 > + LOG_LARGE_STRUCTURES(actions, actions.size() * sizeof(SerializedActionByte)); Should be actions.capacity(); > Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:169 > + LOG_LARGE_STRUCTURES(parsedRuleList, parsedRuleList.size() * sizeof(ContentExtensionRule)); // Doesn't include strings. > + LOG_LARGE_STRUCTURES(actionLocations, actionLocations.size() * sizeof(unsigned)); capacity() in both cases. > Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:245 > + LOG_LARGE_STRUCTURES(universalActionLocations, universalActionLocations.size() * sizeof(unsigned)); capacity() > Source/WebCore/contentextensions/ContentExtensionCompiler.cpp:260 > + LOG_LARGE_STRUCTURES(bytecode, bytecode.size() * sizeof(uint8_t)); capacity() * sizeof(DFABytecode) > Source/WebCore/contentextensions/ContentExtensionsDebugging.h:38 > +#define LOG_LARGE_STRUCTURES(name, size) /* if (size > 1000000) { WTFLogAlways("NAME: %s SIZE %d", #name, (int)(size)); }; */ Another way would be to have LOG_LARGE_STRUCTURES extend to nothing if CONTENT_EXTENSIONS_PERFORMANCE_REPORTING is zero. > Source/WebCore/contentextensions/DFA.cpp:46 > +// FIXME: Make DFANode.cpp. Oops :) > Source/WebCore/contentextensions/DFAMinimizer.cpp:442 > + hasher.addCharactersAssumingAligned(reinterpret_cast<const UChar*>(&dfa->actions[actionsStart]), actionsLength * sizeof(uint64_t) / sizeof(UChar)); This does not look correct. There is no guarantee the action order will be the same in two nodes. > Source/WebCore/contentextensions/DFANode.h:103 > +// FIXME: Pack this down to 12. __attribute__((packed)) may do the trick but I am a bit worried about it. A loadPair instruction on any attribute would crash. > Source/WebCore/contentextensions/NFAToDFA.cpp:384 > + RELEASE_ASSERT(key <= std::numeric_limits<uint8_t>::max()); Wouldn't <= 127 be good too? > Tools/ChangeLog:6 > + Reviewed by Darin Adler. ? > Tools/TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp:490 > + // FIXME: This does not make multiple DFAs anymore. Add a test that does. Still a good test, it caught a bug in the minimizer :)
Alex Christensen
Comment 18 2015-04-23 07:11:09 PDT
Alex Christensen
Comment 19 2015-04-23 07:13:48 PDT
Alex Christensen
Comment 20 2015-04-23 07:46:11 PDT
(In reply to comment #17) > > Source/WebCore/contentextensions/DFAMinimizer.cpp:442 > > + hasher.addCharactersAssumingAligned(reinterpret_cast<const UChar*>(&dfa->actions[actionsStart]), actionsLength * sizeof(uint64_t) / sizeof(UChar)); > > This does not look correct. There is no guarantee the action order will be > the same in two nodes. I added a FIXME for sorting the actions. For now this is correct because if the order is not the same, then they will also be unequal. Sorting would only make minimization more effective but not more correct. I cannot reproduce the errors on Mavericks. I'm wondering if it's an issue with the EWS bot.
Build Bot
Comment 21 2015-04-23 07:51:10 PDT
Comment on attachment 251433 [details] Patch Attachment 251433 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.appspot.com/results/6154858985947136 New failing tests: http/tests/contentextensions/loading/main-resource-redirect-blocked.php http/tests/contentextensions/text-track-blocked.html http/tests/contentextensions/popups.html http/tests/contentextensions/subresource-redirect-blocked.html http/tests/contentextensions/media-filtered.html http/tests/contentextensions/block-cookies-basic.html http/tests/contentextensions/whitelist.html http/tests/contentextensions/css-display-none.html http/tests/contentextensions/basic-filter.html http/tests/contentextensions/block-cookies-send.html http/tests/contentextensions/character-set-basic-support.html
Build Bot
Comment 22 2015-04-23 07:51:15 PDT
Created attachment 251434 [details] Archive of layout-test-results from ews105 for mac-mavericks-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews105 Port: mac-mavericks-wk2 Platform: Mac OS X 10.9.5
Alex Christensen
Comment 23 2015-04-23 08:05:37 PDT
Sure enough, there are a bunch of release-only failures. Don't have time to look into them right now. Maybe tonight or tomorrow.
Alex Christensen
Comment 24 2015-04-23 12:01:20 PDT
Alex Christensen
Comment 25 2015-04-23 12:02:48 PDT
Inlining ActionKeyHash::equal seemed to be the problem. Not sure why. It should work fine now.
WebKit Commit Bot
Comment 26 2015-04-23 12:58:07 PDT
Comment on attachment 251461 [details] Patch Clearing flags on attachment: 251461 Committed r183204: <http://trac.webkit.org/changeset/183204>
WebKit Commit Bot
Comment 27 2015-04-23 12:58:13 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.