Bug 208528

Summary: UIProcess needs mechanism to specify AppBound domains
Product: WebKit Reporter: Kate Cheney <katherine_cheney>
Component: WebKit Misc.Assignee: Kate Cheney <katherine_cheney>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, aestes, bfulgham, cdumez, commit-queue, simon.fraser, webkit-bug-importer, wilander
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch none

Kate Cheney
Reported 2020-03-03 12:19:15 PST
We should collect and store app-bound domains in the UIProcess
Attachments
Patch (9.32 KB, patch)
2020-03-03 13:05 PST, Kate Cheney
no flags
Patch (9.35 KB, patch)
2020-03-03 13:50 PST, Kate Cheney
no flags
Patch (33.02 KB, patch)
2020-03-06 12:20 PST, Kate Cheney
no flags
Patch (35.85 KB, patch)
2020-03-06 15:19 PST, Kate Cheney
no flags
Patch (35.98 KB, patch)
2020-03-06 15:34 PST, Kate Cheney
no flags
Patch (36.22 KB, patch)
2020-03-06 17:28 PST, Kate Cheney
no flags
Patch (36.43 KB, patch)
2020-03-06 19:20 PST, Kate Cheney
no flags
Kate Cheney
Comment 1 2020-03-03 12:19:53 PST
Kate Cheney
Comment 2 2020-03-03 13:05:08 PST
Kate Cheney
Comment 3 2020-03-03 13:50:39 PST
John Wilander
Comment 4 2020-03-04 16:38:43 PST
Comment on attachment 392324 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=392324&action=review > Source/WebCore/ChangeLog:3 > + UIProcess needs mechanism to specify AppBound domains It's a little weird to refer to UIProcess without further qualifications in the WebCore change log. Not sure how this could be done better. > Source/WebCore/platform/RuntimeApplicationChecks.h:28 > +#include "RegistrableDomain.h" I believe this could just be forward-declared instead of included. Then you include in the implementation file instead. We try to do that to shorten build times. > Source/WebCore/platform/cocoa/RuntimeApplicationChecksCocoa.mm:78 > +Vector<WebCore::RegistrableDomain>& getAppBoundDomains() I wonder if there's a way to make sure that callers of this function cannot change the app-bound domains. > Source/WebCore/platform/cocoa/RuntimeApplicationChecksCocoa.mm:84 > + NSArray *domains = [[NSBundle mainBundle] objectForInfoDictionaryKey:@"AppBoundDomains"]; Is there a guarantee that these domains will indeed be registrable domains? I don't think there even can be since the Public Suffix List which is the basis for RegistrableDomain changes and thus an array entry may be a registrable domain one day and not another, or vice versa. See comment below. > Source/WebCore/platform/cocoa/RuntimeApplicationChecksCocoa.mm:85 > + int appBoundDomainsCount = std::min(maxAppBoundDomainCount, static_cast<int>([domains count])); Where does maxAppBoundDomainCount come from? It doesn't look like a constant. > Source/WebCore/platform/cocoa/RuntimeApplicationChecksCocoa.mm:88 > + appBoundDomains->append(WebCore::RegistrableDomain::uncheckedCreateFromRegistrableDomainString([domains objectAtIndex:i])); You should use a checked constructor here by creating a URL object from the array entry and then letting RegistrableDomain figure out the registrable domain through its use of the Public Suffix List. Then you also need to check that the returned RegistrableDomain object is not empty since an array entry that does not have a registrable domain will result in an empty object. > Source/WebCore/platform/cocoa/RuntimeApplicationChecksCocoa.mm:90 > + return appBoundDomains; I wonder if we should return a deep copy of this Vector? Otherwise other code can change its content, right? (Not 100% sure about function statics.) It goes back to my thoughts that we should make sure callers of this function cannot change the set of app-bound domains. Another way of doing it is to have a void return type and instead have the caller supply a Vector that this function fills with copies of the registrable domains. > Source/WebKit/UIProcess/WebProcessProxy.cpp:1770 > + m_appBoundDomains.uncheckedAppend(domain); Since this is that way we're using WebCore::getAppBoundDomains(), it looks like supplying the function with the Vector that WebProcessProxy wants filled is a better pattern. > Source/WebKit/UIProcess/WebProcessProxy.h:379 > + Vector<WebCore::RegistrableDomain>& appBoundDomains() { return m_appBoundDomains; } First, this can be made const. Second, the same worry applies here in changing the contents of the Vector. > Tools/TestWebKitAPI/Info.plist:7 > + <string>testDomain1</string> You need to do full testing here. Real registrable domains, real domains with subdomains, bogus domains (such as the ones you have now), empty entries, whitespace entries, full URLs, and IP addresses. Then you need to test below max, at max, and above max number of entries.
Chris Dumez
Comment 5 2020-03-04 17:01:28 PST
Comment on attachment 392324 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=392324&action=review > Source/WebCore/platform/RuntimeApplicationChecks.h:31 > +static const auto maxAppBoundDomainCount = 10; Why is this in the header and not the implementation? > Source/WebCore/platform/RuntimeApplicationChecks.h:55 > +WEBCORE_EXPORT Vector<WebCore::RegistrableDomain>& getAppBoundDomains(); This definitely does not feel like the right header to put this in. Does it even need to be in WebCore? It seems you only use it at the WebKit layer. >> Source/WebCore/platform/cocoa/RuntimeApplicationChecksCocoa.mm:78 >> +Vector<WebCore::RegistrableDomain>& getAppBoundDomains() > > I wonder if there's a way to make sure that callers of this function cannot change the app-bound domains. Return a const Vector<WebCore::RegistrableDomain>& ? > Source/WebCore/platform/cocoa/RuntimeApplicationChecksCocoa.mm:80 > + static auto appBoundDomains = makeNeverDestroyed(Vector<WebCore::RegistrableDomain> { }); static const. Also, this could be initialized with a lambda, see MIMETypeRegistry::supportedMediaMIMETypes() for an example. > Source/WebCore/platform/cocoa/RuntimeApplicationChecksCocoa.mm:87 > + for (int i = 0; i < appBoundDomainsCount; i++) ++i > Source/WebKit/UIProcess/WebProcessProxy.cpp:192 > + readAppBoundDomains(); So every time we launch a WebProcess, the UIProcess will read entries from the disk on the main thread? I have some concerns: - Does it really need to happen more than once? - Is reading this on the main thread really suitable? > Source/WebKit/UIProcess/WebProcessProxy.cpp:1767 > + auto& domains = WebCore::getAppBoundDomains(); m_appBoundDomains = WebCore::getAppBoundDomains(); ? Or am I missing something obvious?
John Wilander
Comment 6 2020-03-04 17:04:23 PST
(In reply to Chris Dumez from comment #5) > > I wonder if there's a way to make sure that callers of this function cannot change the app-bound domains. > > Return a const Vector<WebCore::RegistrableDomain>& ? Will that guarantee that the entries in the Vector can't be changed? Sorry, I'm not that familiar with the semantics of const return types. :)
Chris Dumez
Comment 7 2020-03-04 17:10:17 PST
(In reply to John Wilander from comment #6) > (In reply to Chris Dumez from comment #5) > > > I wonder if there's a way to make sure that callers of this function cannot change the app-bound domains. > > > > Return a const Vector<WebCore::RegistrableDomain>& ? > > Will that guarantee that the entries in the Vector can't be changed? Sorry, > I'm not that familiar with the semantics of const return types. :) Yes, because our Vector implementation is const-correct. You cannot modify a const vector or modify the entries of a const vector, unless you const_cast.
Kate Cheney
Comment 8 2020-03-04 17:20:43 PST
Thanks for the comments Chris and John! (In reply to John Wilander from comment #4) > Comment on attachment 392324 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=392324&action=review > > > Source/WebCore/ChangeLog:3 > > + UIProcess needs mechanism to specify AppBound domains > > It's a little weird to refer to UIProcess without further qualifications in > the WebCore change log. Not sure how this could be done better. > Yes, after reading these comments I think this should be moved out of WebCore. > > Source/WebCore/platform/RuntimeApplicationChecks.h:28 > > +#include "RegistrableDomain.h" > > I believe this could just be forward-declared instead of included. Then you > include in the implementation file instead. We try to do that to shorten > build times. > Will do! > > Source/WebCore/platform/cocoa/RuntimeApplicationChecksCocoa.mm:78 > > +Vector<WebCore::RegistrableDomain>& getAppBoundDomains() > > I wonder if there's a way to make sure that callers of this function cannot > change the app-bound domains. > > > Source/WebCore/platform/cocoa/RuntimeApplicationChecksCocoa.mm:84 > > + NSArray *domains = [[NSBundle mainBundle] objectForInfoDictionaryKey:@"AppBoundDomains"]; > > Is there a guarantee that these domains will indeed be registrable domains? > I don't think there even can be since the Public Suffix List which is the > basis for RegistrableDomain changes and thus an array entry may be a > registrable domain one day and not another, or vice versa. See comment below. > No guarantee of anything, I should definitely add more rigorous checks here. > > Source/WebCore/platform/cocoa/RuntimeApplicationChecksCocoa.mm:88 > > + appBoundDomains->append(WebCore::RegistrableDomain::uncheckedCreateFromRegistrableDomainString([domains objectAtIndex:i])); > > You should use a checked constructor here by creating a URL object from the > array entry and then letting RegistrableDomain figure out the registrable > domain through its use of the Public Suffix List. Then you also need to > check that the returned RegistrableDomain object is not empty since an array > entry that does not have a registrable domain will result in an empty object. > Good point, I will refactor this. > > Source/WebCore/platform/cocoa/RuntimeApplicationChecksCocoa.mm:90 > > + return appBoundDomains; > > I wonder if we should return a deep copy of this Vector? Otherwise other > code can change its content, right? (Not 100% sure about function statics.) > It goes back to my thoughts that we should make sure callers of this > function cannot change the set of app-bound domains. Another way of doing it > is to have a void return type and instead have the caller supply a Vector > that this function fills with copies of the registrable domains. > > > Source/WebKit/UIProcess/WebProcessProxy.cpp:1770 > > + m_appBoundDomains.uncheckedAppend(domain); > > Since this is that way we're using WebCore::getAppBoundDomains(), it looks > like supplying the function with the Vector that WebProcessProxy wants > filled is a better pattern. You're right, that seems better. > > Tools/TestWebKitAPI/Info.plist:7 > > + <string>testDomain1</string> > > You need to do full testing here. Real registrable domains, real domains > with subdomains, bogus domains (such as the ones you have now), empty > entries, whitespace entries, full URLs, and IP addresses. Then you need to > test below max, at max, and above max number of entries. I will add some more testing!
Kate Cheney
Comment 9 2020-03-04 17:24:19 PST
(In reply to Chris Dumez from comment #5) > Comment on attachment 392324 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=392324&action=review > > > Source/WebCore/platform/RuntimeApplicationChecks.h:31 > > +static const auto maxAppBoundDomainCount = 10; > > Why is this in the header and not the implementation? > > > Source/WebCore/platform/RuntimeApplicationChecks.h:55 > > +WEBCORE_EXPORT Vector<WebCore::RegistrableDomain>& getAppBoundDomains(); > > This definitely does not feel like the right header to put this in. Does it > even need to be in WebCore? It seems you only use it at the WebKit layer. > Okay, I'll figure out a new place in WebKit. > >> Source/WebCore/platform/cocoa/RuntimeApplicationChecksCocoa.mm:78 > >> +Vector<WebCore::RegistrableDomain>& getAppBoundDomains() > > > > I wonder if there's a way to make sure that callers of this function cannot change the app-bound domains. > > Return a const Vector<WebCore::RegistrableDomain>& ? > Nice, I'll do this. > > Source/WebCore/platform/cocoa/RuntimeApplicationChecksCocoa.mm:80 > > + static auto appBoundDomains = makeNeverDestroyed(Vector<WebCore::RegistrableDomain> { }); > > static const. > Also, this could be initialized with a lambda, see > MIMETypeRegistry::supportedMediaMIMETypes() for an example. > Noted. > > Source/WebCore/platform/cocoa/RuntimeApplicationChecksCocoa.mm:87 > > + for (int i = 0; i < appBoundDomainsCount; i++) > > ++I Noted. > > Source/WebKit/UIProcess/WebProcessProxy.cpp:192 > > + readAppBoundDomains(); > > So every time we launch a WebProcess, the UIProcess will read entries from > the disk on the main thread? I have some concerns: > - Does it really need to happen more than once? > - Is reading this on the main thread really suitable? > It should not need to happen more than once, I'm not sure of an appropriate place to put it to ensure that it gets called only once though. What would you recommend? I am not an expert in threading, what makes it suitable or unsuitable to read something on the main thread? > > Source/WebKit/UIProcess/WebProcessProxy.cpp:1767 > > + auto& domains = WebCore::getAppBoundDomains(); > > m_appBoundDomains = WebCore::getAppBoundDomains(); ? Or am I missing > something obvious? Good catch!
John Wilander
Comment 10 2020-03-04 17:36:57 PST
(In reply to Chris Dumez from comment #7) > (In reply to John Wilander from comment #6) > > (In reply to Chris Dumez from comment #5) > > > > I wonder if there's a way to make sure that callers of this function cannot change the app-bound domains. > > > > > > Return a const Vector<WebCore::RegistrableDomain>& ? > > > > Will that guarantee that the entries in the Vector can't be changed? Sorry, > > I'm not that familiar with the semantics of const return types. :) > > Yes, because our Vector implementation is const-correct. You cannot modify a > const vector or modify the entries of a const vector, unless you const_cast. And the caller cannot do that cast?
Kate Cheney
Comment 11 2020-03-06 12:20:00 PST
Brent Fulgham
Comment 12 2020-03-06 12:43:11 PST
Comment on attachment 392753 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=392753&action=review I think this looks good, but I would suggest running it by Chris to make sure we captured all of this comments. > Source/WebKit/UIProcess/WebPageProxy.h:2265 > + void setIsNavigatingToAppBoundDomain(bool isMainFrame, const URL&, NavigatingToAppBoundDomain); Much simpler :-) > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:448 > + NSArray *domains = [[NSBundle mainBundle] objectForInfoDictionaryKey:@"AppBoundDomains"]; I think we should use the key "WKAppBoundDomains" to be consistent with other keys in the system (e.g., SBMatchingApplicationGenres or MPSupportsExternallyPlayableContent, etc.) > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:461 > + initializedAppBoundDomains = true; Wouldn't it be better to do all of this work in the dispatch queue, and only dispatch the final result to the main thread? Or did Chris feel that copying the NSArray across and working in the main thread would be safer?
Chris Dumez
Comment 13 2020-03-06 12:43:18 PST
Comment on attachment 392753 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=392753&action=review > Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:605 > + auto completionHandlerCopy = makeBlockPtr(completionHandler); This can be done in the lambda capture. > Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:607 > + Vector<RefPtr<API::Object>> apiDomains; Seems like a good use case for WTF::map() > Source/WebKit/UIProcess/WebFramePolicyListenerProxy.cpp:51 > + ASSERT(isMainThread()); RunLoop::isMain() is preferred in WebKit2 code, especially in the UIProcess layer where an app could be using both WK1 and WK2 (and thus have a WebThread). isMainThread() is never correct in the UIProcess for this reason. > Source/WebKit/UIProcess/WebFramePolicyListenerProxy.h:44 > +enum class ShouldExpectAppBoundDomainResult { No, Yes }; enum class ShouldExpectAppBoundDomainResult : bool { No, Yes }; > Source/WebKit/UIProcess/WebPageProxy.cpp:5060 > + setIsNavigatingToAppBoundDomain(frame->isMainFrame(), navigation->currentRequest().url(), isAppBoundDomain); What if the policyAction is Ignore? Then we won't navigate. Is it still correct to go set this IsNavigatingToAppBoundDomain flag? > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:431 > + ASSERT(isMainThread()); RunLoop::isMain() > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:436 > +void WebsiteDataStore::beginAppBoundDomainCheck(const WebCore::RegistrableDomain domain, WebFramePolicyListenerProxy& listener) WebCore::RegistrableDomain&& > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:438 > + ASSERT(RunLoop::isMain()); RunLoop::isMain() > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:442 > + if (initializedAppBoundDomains) { This does not look quite right. If beginAppBoundDomainCheck() gets called twice before initializedAppBoundDomains is set to true, then you'll dispatch to the background thread twice and read the file on disk twice. > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:447 > + m_queue->dispatch([domain = domain.isolatedCopy(), listener = makeRef(listener)] () mutable { domain = WTFMove(domain) once you switch to an rvalue reference. > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:448 > + NSArray *domains = [[NSBundle mainBundle] objectForInfoDictionaryKey:@"AppBoundDomains"]; Would 'NSArray<NSString *> *domains' build? > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:449 > + RunLoop::main().dispatch([domain = domain.isolatedCopy(), listener = WTFMove(listener), domains = retainPtr(domains)] { domain = WTFMove(domain) once you switch to an rvalue reference. > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:452 > + for (unsigned i = 0; i < appBoundDomainsCount && domainsAdded < maxAppBoundDomainCount; ++i) { for (NSString *domain in domains) > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:457 > + domainsAdded++; if (appBoundDomains() >= maxAppBoundDomainCount) break; > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:467 > +void WebsiteDataStore::getAppBoundDomainsForTesting(CompletionHandler<void(Vector<WebCore::RegistrableDomain>)>&& completionHandler) Vector<WebCore::RegistrableDomain>&& > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:469 > + completionHandler(copyToVector(appBoundDomains())); Nothing guarantees here that appBoundDomains() has been initialized. My personal opinion is that we should start initializing the app bound domains when the first WebsiteDataStore object is constructed to avoid slowing down the very first page load. > Source/WebKit/UIProcess/WebsiteData/WebsiteDataStore.h:286 > + void beginAppBoundDomainCheck(const WebCore::RegistrableDomain, WebFramePolicyListenerProxy&); WebCore::RegistrableDomain&& > Source/WebKit/UIProcess/WebsiteData/WebsiteDataStore.h:287 > + void getAppBoundDomainsForTesting(CompletionHandler<void(Vector<WebCore::RegistrableDomain>)>&&); Vector<WebCore::RegistrableDomain>&&
Brent Fulgham
Comment 14 2020-03-06 12:46:19 PST
Comment on attachment 392753 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=392753&action=review >> Source/WebKit/UIProcess/WebFramePolicyListenerProxy.h:44 >> +enum class ShouldExpectAppBoundDomainResult { No, Yes }; > > enum class ShouldExpectAppBoundDomainResult : bool { No, Yes }; I wonder why the others weren't marked this way. Would we improve memory use if we made that change?
Chris Dumez
Comment 15 2020-03-06 12:48:11 PST
(In reply to Brent Fulgham from comment #14) > Comment on attachment 392753 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=392753&action=review > > >> Source/WebKit/UIProcess/WebFramePolicyListenerProxy.h:44 > >> +enum class ShouldExpectAppBoundDomainResult { No, Yes }; > > > > enum class ShouldExpectAppBoundDomainResult : bool { No, Yes }; > > I wonder why the others weren't marked this way. Would we improve memory use > if we made that change? Please feel free to fix others at the same time. This is a somewhat recent pattern in WebKit so not all code has been ported yet. Yes, it can result in memory saving when stored as a data member, it also has some benefits when sent over IPC. Even if not doing any of those things, specifying an appropriate underlying type is always good practice.
Alex Christensen
Comment 16 2020-03-06 12:51:34 PST
Comment on attachment 392753 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=392753&action=review >>> Source/WebKit/UIProcess/WebFramePolicyListenerProxy.h:44 >>> +enum class ShouldExpectAppBoundDomainResult { No, Yes }; >> >> enum class ShouldExpectAppBoundDomainResult : bool { No, Yes }; > > I wonder why the others weren't marked this way. Would we improve memory use if we made that change? We may as well set their underlying type. It saves memory if they are stored as a member variable, but if they are only passed as parameters it doesn't effectively change anything because it still uses a whole register. >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:442 >> + if (initializedAppBoundDomains) { > > This does not look quite right. If beginAppBoundDomainCheck() gets called twice before initializedAppBoundDomains is set to true, then you'll dispatch to the background thread twice and read the file on disk twice. That is intentional. We don't want to hang the main thread, and there are no bad consequences of doing it twice. In the common case it will read the first time then immediately reply after the first time. On rare occasion it will read more than once with no negative consequences. >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:461 >> + initializedAppBoundDomains = true; > > Wouldn't it be better to do all of this work in the dispatch queue, and only dispatch the final result to the main thread? Or did Chris feel that copying the NSArray across and working in the main thread would be safer? We don't want to touch appBoundDomains() on a background queue or we will get hash table corruption.
Kate Cheney
Comment 17 2020-03-06 12:57:26 PST
(In reply to Chris Dumez from comment #13) > Comment on attachment 392753 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=392753&action=review > > What if the policyAction is Ignore? Then we won't navigate. Is it still > correct to go set this IsNavigatingToAppBoundDomain flag? > I think the listener will handle the ignore case, see void WebFramePolicyListenerProxy::ignore(). In this case we don't send a value, so I should add a check here for isAppBoundDomain.
Chris Dumez
Comment 18 2020-03-06 13:00:46 PST
Comment on attachment 392753 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=392753&action=review >> Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:605 >> + auto completionHandlerCopy = makeBlockPtr(completionHandler); > > This can be done in the lambda capture. This can be done in the lambda capture. >> Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:607 >> + Vector<RefPtr<API::Object>> apiDomains; > > Seems like a good use case for WTF::map() Seems like a good use case for WTF::map() >> Source/WebKit/UIProcess/WebFramePolicyListenerProxy.cpp:51 >> + ASSERT(isMainThread()); > > RunLoop::isMain() is preferred in WebKit2 code, especially in the UIProcess layer where an app could be using both WK1 and WK2 (and thus have a WebThread). isMainThread() is never correct in the UIProcess for this reason. RunLoop::isMain() is preferred in WebKit2 code, especially in the UIProcess layer where an app could be using both WK1 and WK2 (and thus have a WebThread). isMainThread() is never correct in the UIProcess for this reason. >>>>> Source/WebKit/UIProcess/WebFramePolicyListenerProxy.h:44 >>>>> +enum class ShouldExpectAppBoundDomainResult { No, Yes }; >>>> >>>> enum class ShouldExpectAppBoundDomainResult : bool { No, Yes }; >>> >>> I wonder why the others weren't marked this way. Would we improve memory use if we made that change? >> >> Please feel free to fix others at the same time. This is a somewhat recent pattern in WebKit so not all code has been ported yet. Yes, it can result in memory saving when stored as a data member, it also has some benefits when sent over IPC. Even if not doing any of those things, specifying an appropriate underlying type is always good practice. > > We may as well set their underlying type. It saves memory if they are stored as a member variable, but if they are only passed as parameters it doesn't effectively change anything because it still uses a whole register. enum class ShouldExpectAppBoundDomainResult : bool { No, Yes }; >> Source/WebKit/UIProcess/WebPageProxy.cpp:5060 >> + setIsNavigatingToAppBoundDomain(frame->isMainFrame(), navigation->currentRequest().url(), isAppBoundDomain); > > What if the policyAction is Ignore? Then we won't navigate. Is it still correct to go set this IsNavigatingToAppBoundDomain flag? What if the policyAction is Ignore? Then we won't navigate. Is it still correct to go set this IsNavigatingToAppBoundDomain flag? >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:431 >> + ASSERT(isMainThread()); > > RunLoop::isMain() RunLoop::isMain() >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:436 >> +void WebsiteDataStore::beginAppBoundDomainCheck(const WebCore::RegistrableDomain domain, WebFramePolicyListenerProxy& listener) > > WebCore::RegistrableDomain&& WebCore::RegistrableDomain&& >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:438 >> + ASSERT(RunLoop::isMain()); > > RunLoop::isMain() RunLoop::isMain() >>> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:442 >>> + if (initializedAppBoundDomains) { >> >> This does not look quite right. If beginAppBoundDomainCheck() gets called twice before initializedAppBoundDomains is set to true, then you'll dispatch to the background thread twice and read the file on disk twice. > > That is intentional. We don't want to hang the main thread, and there are no bad consequences of doing it twice. In the common case it will read the first time then immediately reply after the first time. On rare occasion it will read more than once with no negative consequences. Well there are bad consequences on power and perf to doing the same thing more than once unnecessarily. Also, my biggest concern is that you only start populating this on a low-priority thread once you've already started navigating and it actually delays the navigation. This is not a good practice for performance. You don't want your main thread (and user visible navigation) to be waiting on a low priority background thread. As a result, I believe this code has to be refactored anyway to be done eagerly when initializing the first WebsiteDataStore. Note that I agree we don't want to hang the main thread so I am not suggesting we lock. You can still dispatch on the background thread and then back to the main thread. However, you do NOT need to read the file more than once and populate appBoundDomains() more than once. >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:447 >> + m_queue->dispatch([domain = domain.isolatedCopy(), listener = makeRef(listener)] () mutable { > > domain = WTFMove(domain) once you switch to an rvalue reference. domain = WTFMove(domain) once you switch to an rvalue reference. >>> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:448 >>> + NSArray *domains = [[NSBundle mainBundle] objectForInfoDictionaryKey:@"AppBoundDomains"]; >> >> I think we should use the key "WKAppBoundDomains" to be consistent with other keys in the system (e.g., SBMatchingApplicationGenres or MPSupportsExternallyPlayableContent, etc.) > > Would 'NSArray<NSString *> *domains' build? Would 'NSArray<NSString *> *domains' build? >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:449 >> + RunLoop::main().dispatch([domain = domain.isolatedCopy(), listener = WTFMove(listener), domains = retainPtr(domains)] { > > domain = WTFMove(domain) once you switch to an rvalue reference. domain = WTFMove(domain) once you switch to an rvalue reference. >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:452 >> + for (unsigned i = 0; i < appBoundDomainsCount && domainsAdded < maxAppBoundDomainCount; ++i) { > > for (NSString *domain in domains) for (NSString *domain in domains) >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:457 >> + domainsAdded++; > > if (appBoundDomains() >= maxAppBoundDomainCount) > break; if (appBoundDomains() >= maxAppBoundDomainCount) break; >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:467 >> +void WebsiteDataStore::getAppBoundDomainsForTesting(CompletionHandler<void(Vector<WebCore::RegistrableDomain>)>&& completionHandler) > > Vector<WebCore::RegistrableDomain>&& Vector<WebCore::RegistrableDomain>&& >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:469 >> + completionHandler(copyToVector(appBoundDomains())); > > Nothing guarantees here that appBoundDomains() has been initialized. > > My personal opinion is that we should start initializing the app bound domains when the first WebsiteDataStore object is constructed to avoid slowing down the very first page load. Nothing guarantees here that appBoundDomains() has been initialized. My personal opinion is that we should start initializing the app bound domains when the first WebsiteDataStore object is constructed to avoid slowing down the very first page load. >> Source/WebKit/UIProcess/WebsiteData/WebsiteDataStore.h:286 >> + void beginAppBoundDomainCheck(const WebCore::RegistrableDomain, WebFramePolicyListenerProxy&); > > WebCore::RegistrableDomain&& WebCore::RegistrableDomain&& >> Source/WebKit/UIProcess/WebsiteData/WebsiteDataStore.h:287 >> + void getAppBoundDomainsForTesting(CompletionHandler<void(Vector<WebCore::RegistrableDomain>)>&&); > > Vector<WebCore::RegistrableDomain>&& Vector<WebCore::RegistrableDomain>&&
Chris Dumez
Comment 19 2020-03-06 13:01:39 PST
I don't know why bugzilla repeated all my comments. The part that was new is: >>> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:442 >>> + if (initializedAppBoundDomains) { >> >> This does not look quite right. If beginAppBoundDomainCheck() gets called twice before initializedAppBoundDomains is set to true, then you'll dispatch to the background thread twice and read the file on disk twice. > > That is intentional. We don't want to hang the main thread, and there are no bad consequences of doing it twice. In the common case it will read the first time then immediately reply after the first time. On rare occasion it will read more than once with no negative consequences. Well there are bad consequences on power and perf to doing the same thing more than once unnecessarily. Also, my biggest concern is that you only start populating this on a low-priority thread once you've already started navigating and it actually delays the navigation. This is not a good practice for performance. You don't want your main thread (and user visible navigation) to be waiting on a low priority background thread. As a result, I believe this code has to be refactored anyway to be done eagerly when initializing the first WebsiteDataStore. Note that I agree we don't want to hang the main thread so I am not suggesting we lock. You can still dispatch on the background thread and then back to the main thread. However, you do NOT need to read the file more than once and populate appBoundDomains() more than once.
Chris Dumez
Comment 20 2020-03-06 13:04:36 PST
(In reply to Alex Christensen from comment #16) > Comment on attachment 392753 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=392753&action=review > > >>> Source/WebKit/UIProcess/WebFramePolicyListenerProxy.h:44 > >>> +enum class ShouldExpectAppBoundDomainResult { No, Yes }; > >> > >> enum class ShouldExpectAppBoundDomainResult : bool { No, Yes }; > > > > I wonder why the others weren't marked this way. Would we improve memory use if we made that change? > > We may as well set their underlying type. It saves memory if they are > stored as a member variable, but if they are only passed as parameters it > doesn't effectively change anything because it still uses a whole register. > > >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:442 > >> + if (initializedAppBoundDomains) { > > > > This does not look quite right. If beginAppBoundDomainCheck() gets called twice before initializedAppBoundDomains is set to true, then you'll dispatch to the background thread twice and read the file on disk twice. > > That is intentional. We don't want to hang the main thread, and there are > no bad consequences of doing it twice. In the common case it will read the > first time then immediately reply after the first time. On rare occasion it > will read more than once with no negative consequences. > > >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:461 > >> + initializedAppBoundDomains = true; > > > > Wouldn't it be better to do all of this work in the dispatch queue, and only dispatch the final result to the main thread? Or did Chris feel that copying the NSArray across and working in the main thread would be safer? > > We don't want to touch appBoundDomains() on a background queue or we will > get hash table corruption. I think it could be done in a safe way but I do not think it matters much in practice to be doing in on the main thread. It would only be unsafe if the HashMap was queried / modifying by several threads in parallel. Since we only need to populate the HashMap on the background thread once and then we set an atomic<bool>. If the main thread checks this atomic<bool> before reading the HashMap, I do not think there would be a problem?
Chris Dumez
Comment 21 2020-03-06 14:11:30 PST
Comment on attachment 392753 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=392753&action=review > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:441 > + static std::atomic<bool> initializedAppBoundDomains = false; Note that this is only set and queried on the main thread so I do not see the need for it to be atomic, at least in this current patch iteration.
Kate Cheney
Comment 22 2020-03-06 15:19:34 PST
Alex Christensen
Comment 23 2020-03-06 15:22:47 PST
Comment on attachment 392783 [details] Patch Clever!
Kate Cheney
Comment 24 2020-03-06 15:34:39 PST
Kate Cheney
Comment 25 2020-03-06 15:37:31 PST
(In reply to Alex Christensen from comment #23) > Comment on attachment 392783 [details] > Patch > > Clever! : )
Chris Dumez
Comment 26 2020-03-06 15:56:57 PST
Comment on attachment 392788 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=392788&action=review > Source/WebKit/ChangeLog:11 > + the UI process whether a domain is app-bound. You're not really reporting it to the UIProcess, but you are reading the file from the UIProcess. > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:71 > +static std::atomic<bool> initializedAppBoundDomains = false; Boolean variables in WebKit need a prefix. So maybe hasInitializedAppBoundDomains ? > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:448 > + I would drop this empty line. > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:456 > + URL url { URL(), domain }; Do we really need to have URLs saved on disk? This means we have to parse the URL unnecessarily just to get the host and then compute the registrable domain. Why cannot we simply store registrable domains in the plist? > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:458 > + if (!appBoundDomain.isEmpty()) { To avoid nesting, maybe this: if (appBoundDomain.isEmpty()) continue; > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:479 > + ASSERT(initializedAppBoundDomains); Sorry, I know I recommended this approach but there is actually a subtle bug now that I think about it more. The issue is that you're using the WebsiteDataStore's WorkQueue. Each WebsiteDataStore has its own WorkQueue so: 1. WebsiteDataStore A could have queued the task to initialize on WorkQueue WQA 2. WebsiteDataStore B then tries to read app bound domains and initializedAppBoundDomains is false, so its queue a task on WorkQueue WQB. -> Because they are 2 separate work queues, there is no serialization guarantee and you may not have initialized the domains yet by the time you get back to the main thread for reading One way to fix this would be to use a single static WorkQueue dedicated to reading appBoundDomains. > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:497 > + completionHandler(copyToVector(domains)); Could we return a 'const HashSet<RegistrableDomain>&' here? It is kind of sad right now because we go from: HashSet<RegistrableDomain> -> Vector<RegistrableDomain> -> Vector<RefPtr<API::Object>> -> API::Array I like this better: HashSet<RegistrableDomain> -> Vector<RefPtr<API::Object>> -> API::Array > Source/WebKit/UIProcess/WebsiteData/WebsiteDataStore.cpp:121 > + initializeAppBoundDomains(); Please do this in platformInitialize() instead. > Source/WebKit/UIProcess/WebsiteData/WebsiteDataStore.h:287 > + void getAppBoundDomainsForTesting(CompletionHandler<void(Vector<WebCore::RegistrableDomain>&&)>&&); We don't use get prefix for getters in WebKit. Seems like this method should be const.
Brent Fulgham
Comment 27 2020-03-06 16:01:11 PST
Comment on attachment 392788 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=392788&action=review >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:71 >> +static std::atomic<bool> initializedAppBoundDomains = false; > > Boolean variables in WebKit need a prefix. So maybe hasInitializedAppBoundDomains ? +1 >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:456 >> + URL url { URL(), domain }; > > Do we really need to have URLs saved on disk? This means we have to parse the URL unnecessarily just to get the host and then compute the registrable domain. Why cannot we simply store registrable domains in the plist? Good point. Kate, let's just drop the URL parsing. >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:479 >> + ASSERT(initializedAppBoundDomains); > > Sorry, I know I recommended this approach but there is actually a subtle bug now that I think about it more. The issue is that you're using the WebsiteDataStore's WorkQueue. Each WebsiteDataStore has its own WorkQueue so: > 1. WebsiteDataStore A could have queued the task to initialize on WorkQueue WQA > 2. WebsiteDataStore B then tries to read app bound domains and initializedAppBoundDomains is false, so its queue a task on WorkQueue WQB. > -> Because they are 2 separate work queues, there is no serialization guarantee and you may not have initialized the domains yet by the time you get back to the main thread for reading > > One way to fix this would be to use a single static WorkQueue dedicated to reading appBoundDomains. Doesn't that seem like overkill for reading a maximum of 10 values out of an NSDictionary? >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:497 >> + completionHandler(copyToVector(domains)); > > Could we return a 'const HashSet<RegistrableDomain>&' here? It is kind of sad right now because we go from: > HashSet<RegistrableDomain> -> Vector<RegistrableDomain> -> Vector<RefPtr<API::Object>> -> API::Array > > I like this better: > HashSet<RegistrableDomain> -> Vector<RefPtr<API::Object>> -> API::Array No objections from me!
Chris Dumez
Comment 28 2020-03-06 16:05:33 PST
Comment on attachment 392788 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=392788&action=review >>> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:479 >>> + ASSERT(initializedAppBoundDomains); >> >> Sorry, I know I recommended this approach but there is actually a subtle bug now that I think about it more. The issue is that you're using the WebsiteDataStore's WorkQueue. Each WebsiteDataStore has its own WorkQueue so: >> 1. WebsiteDataStore A could have queued the task to initialize on WorkQueue WQA >> 2. WebsiteDataStore B then tries to read app bound domains and initializedAppBoundDomains is false, so its queue a task on WorkQueue WQB. >> -> Because they are 2 separate work queues, there is no serialization guarantee and you may not have initialized the domains yet by the time you get back to the main thread for reading >> >> One way to fix this would be to use a single static WorkQueue dedicated to reading appBoundDomains. > > Doesn't that seem like overkill for reading a maximum of 10 values out of an NSDictionary? The code is racy otherwise and you may end up proceeding with the loads without having your domains (which I hear is not acceptable). If you have a way to address the raciness that is not overkill, I am all hears :) Using a separate WorkQueue from the one the WebsiteDataStore object uses is not that much work I think. We can also destroy this WorkQueue object once the reading from disk is done (and I think we should).
Kate Cheney
Comment 29 2020-03-06 17:07:54 PST
(In reply to Brent Fulgham from comment #27) > Comment on attachment 392788 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=392788&action=review > > >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:456 > >> + URL url { URL(), domain }; > > > > Do we really need to have URLs saved on disk? This means we have to parse the URL unnecessarily just to get the host and then compute the registrable domain. Why cannot we simply store registrable domains in the plist? > > Good point. Kate, let's just drop the URL parsing. I talked to John and he re-iterated that we should keep this in to ensure the URL is valid
Kate Cheney
Comment 30 2020-03-06 17:28:41 PST
Brent Fulgham
Comment 31 2020-03-06 17:38:38 PST
Comment on attachment 392817 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=392817&action=review > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:464 > + if (!url.isValid()) Did you decide to leave this as a URL to make the code more robust against improper content in the Info.plist?
Brent Fulgham
Comment 32 2020-03-06 17:39:19 PST
Comment on attachment 392817 [details] Patch I think this looks good -- let's get it into a full build. We can massage more later.
Kate Cheney
Comment 33 2020-03-06 17:39:33 PST
Comment on attachment 392817 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=392817&action=review >> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:464 >> + if (!url.isValid()) > > Did you decide to leave this as a URL to make the code more robust against improper content in the Info.plist? Yes, as per recommended after talking with John
Chris Dumez
Comment 34 2020-03-06 18:40:59 PST
Comment on attachment 392817 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=392817&action=review > Source/WebKit/UIProcess/WebPageProxy.cpp:5059 > + nit: Unnecessary blank line > Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:73 > + static auto& queue = WorkQueue::create("com.apple.WebKit.AppBoundDomains", WorkQueue::Type::Serial).leakRef(); This is OK for now since you are in a hurry but as I mentioned before, we may destroy the WorkQueue once we're done with the import. We should check to see if this keeps unnecessary threads around for the lifetime of the application. >>> Source/WebKit/UIProcess/WebsiteData/Cocoa/WebsiteDataStoreCocoa.mm:464 >>> + if (!url.isValid()) >> >> Did you decide to leave this as a URL to make the code more robust against improper content in the Info.plist? > > Yes, as per recommended after talking with John It'd be good to discuss John's concerns at some point and why we cannot simply call RegistrableDomain::uncheckedCreateFromHost() on strings in the plist. Then we'd simply store "apple.com" in the plist instead of "http://www.apple.com". uncheckedCreateFromHost() calls topPrivatelyControlledDomain() on the input string so even if the input is a subdomain, it would convert it to a registrable domain. I do not really understand why this does not address concerns with regards to bad input. Unnecessary parsing of URLs, then to extra their host and then call topPrivatelyControlledDomain() on the host seems really overkill.
Kate Cheney
Comment 35 2020-03-06 19:20:38 PST
WebKit Commit Bot
Comment 36 2020-03-06 20:48:55 PST
Comment on attachment 392839 [details] Patch Clearing flags on attachment: 392839 Committed r258054: <https://trac.webkit.org/changeset/258054>
WebKit Commit Bot
Comment 37 2020-03-06 20:48:57 PST
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 38 2020-03-06 20:49:19 PST
Simon Fraser (smfr)
Comment 39 2020-03-07 10:52:49 PST
Comment on attachment 392839 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=392839&action=review > Tools/TestWebKitAPI/Info.plist:6 > + <key>WKAppBoundDomains</key> > + <array> What about a test where the value of WKAppBoundDomains is not an array, and a test where the values of the array are not strings?
Kate Cheney
Comment 40 2020-03-09 09:15:27 PDT
(In reply to Simon Fraser (smfr) from comment #39) > Comment on attachment 392839 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=392839&action=review > > > Tools/TestWebKitAPI/Info.plist:6 > > + <key>WKAppBoundDomains</key> > > + <array> > > What about a test where the value of WKAppBoundDomains is not an array, and > a test where the values of the array are not strings? Good point. I can add a test where a value is not a string, but since this reads from the Info.plist for TestWebKitAPI, I'm not sure of a good way to test a case when WKAppBoundDomains is not an array (while also keeping the case where it is).
Note You need to log in before you can comment on or make changes to this bug.