Bug 190792

Summary: topPrivatelyControlledDomain is slow
Product: WebKit Reporter: Antti Koivisto <koivisto>
Component: Page LoadingAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, beidson, cdumez, commit-queue, ews-watchlist, ggaren, tsavell, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=190814
Attachments:
Description Flags
patch
none
patch
achristensen: review+
patch
ews-watchlist: commit-queue-
Archive of layout-test-results from ews117 for mac-sierra
none
with locking
none
with locking
none
patch none

Antti Koivisto
Reported 2018-10-22 06:22:06 PDT
It calls into some slowish CFNetwork code and ends up showing up in profiles.
Attachments
patch (2.75 KB, patch)
2018-10-22 06:57 PDT, Antti Koivisto
no flags
patch (2.92 KB, patch)
2018-10-22 07:21 PDT, Antti Koivisto
achristensen: review+
patch (2.93 KB, patch)
2018-10-23 01:29 PDT, Antti Koivisto
ews-watchlist: commit-queue-
Archive of layout-test-results from ews117 for mac-sierra (1.05 MB, application/zip)
2018-10-23 02:40 PDT, EWS Watchlist
no flags
with locking (3.32 KB, patch)
2018-10-23 03:53 PDT, Antti Koivisto
no flags
with locking (3.10 KB, patch)
2018-10-23 05:40 PDT, Antti Koivisto
no flags
patch (3.12 KB, patch)
2018-10-24 00:53 PDT, Antti Koivisto
no flags
Antti Koivisto
Comment 1 2018-10-22 06:57:24 PDT
Antti Koivisto
Comment 2 2018-10-22 07:21:24 PDT
Alex Christensen
Comment 3 2018-10-22 07:27:38 PDT
Comment on attachment 352890 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=352890&action=review > Source/WebCore/platform/mac/PublicSuffixMac.mm:51 > + static NeverDestroyed<HashMap<String, AtomicString, ASCIICaseInsensitiveHash>> cache; Is there really a benefit to using AtomicString here?
Antti Koivisto
Comment 4 2018-10-22 07:36:43 PDT
> Is there really a benefit to using AtomicString here? At least in theory. Many domains may map into the same top domain. It shouldn't hurt.
Chris Dumez
Comment 5 2018-10-22 08:44:00 PDT
Comment on attachment 352890 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=352890&action=review >> Source/WebCore/platform/mac/PublicSuffixMac.mm:51 >> + static NeverDestroyed<HashMap<String, AtomicString, ASCIICaseInsensitiveHash>> cache; > > Is there really a benefit to using AtomicString here? Have you confirmed that this method is not called from multiple threads? I suspect it could be (e.g. for ITP).
Chris Dumez
Comment 6 2018-10-22 08:47:46 PDT
Comment on attachment 352890 [details] patch At the very least, it seems we should ASSERT.
Chris Dumez
Comment 7 2018-10-22 09:44:04 PDT
Comment on attachment 352890 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=352890&action=review > Source/WebCore/platform/mac/PublicSuffixMac.mm:55 > + cache.get().clear(); Seems very aggressive to clear the whole cache? Why not simply kick one value out like we usually do?
Antti Koivisto
Comment 8 2018-10-22 10:20:08 PDT
> Seems very aggressive to clear the whole cache? Why not simply kick one > value out like we usually do? I tested and it turns out it is a terrible strategy for this sort of thing at least in the usual 'map.remove(map.begin())' form. Since the table is never rehashed we always end up removing keys that hash to the early part of the table and the rest are stuck forever.
Antti Koivisto
Comment 9 2018-10-22 10:21:03 PDT
> Have you confirmed that this method is not called from multiple threads? I > suspect it could be (e.g. for ITP). Ok, I'll just switch to String.
Chris Dumez
Comment 10 2018-10-22 10:26:28 PDT
(In reply to Antti Koivisto from comment #9) > > Have you confirmed that this method is not called from multiple threads? I > > suspect it could be (e.g. for ITP). > > Ok, I'll just switch to String. Switching to String would not be sufficient as you'd still have HashMap operations from multiple threads. That said, I do not know for sure we have multithreaded access and an assertion would suffice. If we do have multi threaded access though, then I'd suggest using String + Locks.
Chris Dumez
Comment 11 2018-10-22 10:27:20 PDT
(In reply to Antti Koivisto from comment #8) > > Seems very aggressive to clear the whole cache? Why not simply kick one > > value out like we usually do? > > I tested and it turns out it is a terrible strategy for this sort of thing > at least in the usual 'map.remove(map.begin())' form. Since the table is > never rehashed we always end up removing keys that hash to the early part of > the table and the rest are stuck forever. That's good to know. Sounds like we may want a method that returns a random iterator inside the HashMap? e.g. HashMap::random() or HashMap::any()
Antti Koivisto
Comment 12 2018-10-22 10:37:27 PDT
> Switching to String would not be sufficient as you'd still have HashMap > operations from multiple threads. That said, I do not know for sure we have > multithreaded access and an assertion would suffice. If we do have multi > threaded access though, then I'd suggest using String + Locks. Yep. I'll see if it is being used.
Antti Koivisto
Comment 13 2018-10-22 10:41:33 PDT
> That's good to know. Sounds like we may want a method that returns a random > iterator inside the HashMap? e.g. HashMap::random() or HashMap::any() Or more explicit support for this pattern: HashMap::shrinkRandomlyToSize(size_t)
Geoffrey Garen
Comment 14 2018-10-22 12:27:58 PDT
(In reply to Antti Koivisto from comment #8) > > Seems very aggressive to clear the whole cache? Why not simply kick one > > value out like we usually do? > > I tested and it turns out it is a terrible strategy for this sort of thing > at least in the usual 'map.remove(map.begin())' form. Since the table is > never rehashed we always end up removing keys that hash to the early part of > the table and the rest are stuck forever. Why doesn't the table rehash?
Chris Dumez
Comment 15 2018-10-22 12:49:47 PDT
(In reply to Geoffrey Garen from comment #14) > (In reply to Antti Koivisto from comment #8) > > > Seems very aggressive to clear the whole cache? Why not simply kick one > > > value out like we usually do? > > > > I tested and it turns out it is a terrible strategy for this sort of thing > > at least in the usual 'map.remove(map.begin())' form. Since the table is > > never rehashed we always end up removing keys that hash to the early part of > > the table and the rest are stuck forever. > > Why doesn't the table rehash? My understanding is that we only rehash when shrinking / expanding the HashTable capacity. In this kind of pattern, we have a max entries (in this case 100) so we never need to expand / shrink. We reach 100 entries, remove HashTable::begin() then add the new entry.
Geoffrey Garen
Comment 16 2018-10-22 13:13:16 PDT
(In reply to Chris Dumez from comment #15) > (In reply to Geoffrey Garen from comment #14) > > (In reply to Antti Koivisto from comment #8) > > > > Seems very aggressive to clear the whole cache? Why not simply kick one > > > > value out like we usually do? > > > > > > I tested and it turns out it is a terrible strategy for this sort of thing > > > at least in the usual 'map.remove(map.begin())' form. Since the table is > > > never rehashed we always end up removing keys that hash to the early part of > > > the table and the rest are stuck forever. > > > > Why doesn't the table rehash? > > My understanding is that we only rehash when shrinking / expanding the > HashTable capacity. In this kind of pattern, we have a max entries (in this > case 100) so we never need to expand / shrink. We reach 100 entries, remove > HashTable::begin() then add the new entry. I see. If we always remove one and add one, then the table never rehashes. Yeah, an explicit idiom to select a random entry would be nice.
Antti Koivisto
Comment 17 2018-10-23 01:29:22 PDT
EWS Watchlist
Comment 18 2018-10-23 02:40:38 PDT
Comment on attachment 352960 [details] patch Attachment 352960 [details] did not pass mac-debug-ews (mac): Output: https://webkit-queues.webkit.org/results/9703747 Number of test failures exceeded the failure limit.
EWS Watchlist
Comment 19 2018-10-23 02:40:40 PDT
Created attachment 352965 [details] Archive of layout-test-results from ews117 for mac-sierra The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews117 Port: mac-sierra Platform: Mac OS X 10.12.6
Antti Koivisto
Comment 20 2018-10-23 03:53:24 PDT
Created attachment 352967 [details] with locking
Antti Koivisto
Comment 21 2018-10-23 05:40:28 PDT
Created attachment 352969 [details] with locking
Chris Dumez
Comment 22 2018-10-23 09:00:14 PDT
We should update the code to use https://bugs.webkit.org/show_bug.cgi?id=190814 when it lands.
WebKit Commit Bot
Comment 23 2018-10-23 11:02:41 PDT
Comment on attachment 352969 [details] with locking Clearing flags on attachment: 352969 Committed r237357: <https://trac.webkit.org/changeset/237357>
WebKit Commit Bot
Comment 24 2018-10-23 11:02:43 PDT
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 25 2018-10-23 11:03:58 PDT
Truitt Savell
Comment 26 2018-10-23 12:48:09 PDT
After the changes in https://trac.webkit.org/changeset/237357/webkit The API test is now failing Log: https://build.webkit.org/builders/Apple%20Sierra%20Release%20WK1%20%28Tests%29/builds/13555/steps/run-api-tests/logs/stdio build: https://build.webkit.org/builders/Apple%20Sierra%20Release%20WK1%20%28Tests%29/builds/13555 Failed TestWebKitAPI.PublicSuffix.TopPrivatelyControlledDomain /Volumes/Data/slave/sierra-release/build/Tools/TestWebKitAPI/Tests/WebCore/PublicSuffix.cpp:159 Expected equality of these values: String() Which is: topPrivatelyControlledDomain("") Which is:
Truitt Savell
Comment 27 2018-10-23 13:53:25 PDT
Reverted r237357 for reason: API test is now failing on all platforms. Committed r237366: <https://trac.webkit.org/changeset/237366>
Antti Koivisto
Comment 28 2018-10-24 00:53:43 PDT
Antti Koivisto
Comment 29 2018-10-24 03:22:44 PDT
Note You need to log in before you can comment on or make changes to this bug.