RESOLVED FIXED196558
Pick up Ad Click Attribution conversions in NetworkResourceLoader::willSendRedirectedRequest()
https://bugs.webkit.org/show_bug.cgi?id=196558
Summary Pick up Ad Click Attribution conversions in NetworkResourceLoader::willSendRe...
John Wilander
Reported 2019-04-03 13:38:07 PDT
We should look for HTTP redirects to URLs with a path starting with "/.well-known/ad-click-attribution/" and pick up Ad Click Attribution conversions.
Attachments
Work-in-progress (14.80 KB, patch)
2019-04-03 13:57 PDT, John Wilander
no flags
Patch (38.05 KB, patch)
2019-04-05 15:55 PDT, John Wilander
no flags
Patch (39.52 KB, patch)
2019-04-05 16:05 PDT, John Wilander
no flags
Archive of layout-test-results from ews122 for ios-simulator-wk2 (2.55 MB, application/zip)
2019-04-05 18:34 PDT, EWS Watchlist
no flags
Patch (41.12 KB, patch)
2019-04-06 16:12 PDT, John Wilander
no flags
Patch for landing (50.01 KB, patch)
2019-04-09 10:41 PDT, John Wilander
no flags
Radar WebKit Bug Importer
Comment 1 2019-04-03 13:40:11 PDT
John Wilander
Comment 2 2019-04-03 13:49:59 PDT
John Wilander
Comment 3 2019-04-03 13:57:24 PDT
Created attachment 366639 [details] Work-in-progress
Chris Dumez
Comment 4 2019-04-03 13:59:53 PDT
(In reply to John Wilander from comment #3) > Created attachment 366639 [details] > Work-in-progress work-in-progress or r? you have to choose one.
John Wilander
Comment 5 2019-04-03 14:02:45 PDT
Comment on attachment 366639 [details] Work-in-progress Sorry. Review comments are most welcome. I basically have the testing left which is why I named it WiP. My plan is to both augment the API test for WebCore::AdClickAttribution and to create a layout test that does the redirect dance. Removed review flag for now.
Chris Dumez
Comment 6 2019-04-03 14:10:32 PDT
Comment on attachment 366639 [details] Work-in-progress View in context: https://bugs.webkit.org/attachment.cgi?id=366639&action=review > Source/WebCore/loader/AdClickAttribution.cpp:35 > +static const char* const adClickAttributionPathPrefix = "/.well-known/ad-click-attribution/"; Please use: static const char const adClickAttributionPathPrefix[] = "..."; > Source/WebCore/loader/AdClickAttribution.cpp:36 > +static constexpr size_t adClickConversionDataPathSegmentSize = 2; const adClickConversionDataPathSegmentSize = 2; would suffice. > Source/WebCore/loader/AdClickAttribution.cpp:37 > +static constexpr size_t adClickPriorityPathSegmentSize = 2; ditto. > Source/WebCore/loader/AdClickAttribution.h:233 > + WEBCORE_EXPORT static void parseConversionRequest(const URL& redirectingURL, const URL& redirectURL, const URL& firstPartyURL, CompletionHandler<void(const RegistrableDomain& sourceDomain, const RegistrableDomain& destinationDomain, Optional<Conversion>&&)>); CompletionHandler should probably be &&. > Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:590 > + if (auto* networkSession = m_connection->networkProcess().networkSession(sessionID())) How do you know |this| is still alive here?
John Wilander
Comment 7 2019-04-03 14:19:16 PDT
(In reply to Chris Dumez from comment #6) > Comment on attachment 366639 [details] > Work-in-progress > > View in context: > https://bugs.webkit.org/attachment.cgi?id=366639&action=review > > > Source/WebCore/loader/AdClickAttribution.cpp:35 > > +static const char* const adClickAttributionPathPrefix = "/.well-known/ad-click-attribution/"; > > Please use: > static const char const adClickAttributionPathPrefix[] = "..."; Will fix. > > Source/WebCore/loader/AdClickAttribution.cpp:36 > > +static constexpr size_t adClickConversionDataPathSegmentSize = 2; > > const adClickConversionDataPathSegmentSize = 2; would suffice. Will fix. > > Source/WebCore/loader/AdClickAttribution.cpp:37 > > +static constexpr size_t adClickPriorityPathSegmentSize = 2; > > ditto. Will fix. > > Source/WebCore/loader/AdClickAttribution.h:233 > > + WEBCORE_EXPORT static void parseConversionRequest(const URL& redirectingURL, const URL& redirectURL, const URL& firstPartyURL, CompletionHandler<void(const RegistrableDomain& sourceDomain, const RegistrableDomain& destinationDomain, Optional<Conversion>&&)>); > > CompletionHandler should probably be &&. You mean for future use if I want to forward it? > > Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:590 > > + if (auto* networkSession = m_connection->networkProcess().networkSession(sessionID())) > > How do you know |this| is still alive here? I don't. What is the lifetime of the loader? So should use a protectedThis = makeRef(*this)?
Chris Dumez
Comment 8 2019-04-03 14:21:41 PDT
(In reply to John Wilander from comment #7) > (In reply to Chris Dumez from comment #6) > > Comment on attachment 366639 [details] > > Work-in-progress > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=366639&action=review > > > > > Source/WebCore/loader/AdClickAttribution.cpp:35 > > > +static const char* const adClickAttributionPathPrefix = "/.well-known/ad-click-attribution/"; > > > > Please use: > > static const char const adClickAttributionPathPrefix[] = "..."; > > Will fix. > > > > Source/WebCore/loader/AdClickAttribution.cpp:36 > > > +static constexpr size_t adClickConversionDataPathSegmentSize = 2; > > > > const adClickConversionDataPathSegmentSize = 2; would suffice. > > Will fix. > > > > Source/WebCore/loader/AdClickAttribution.cpp:37 > > > +static constexpr size_t adClickPriorityPathSegmentSize = 2; > > > > ditto. > > Will fix. > > > > Source/WebCore/loader/AdClickAttribution.h:233 > > > + WEBCORE_EXPORT static void parseConversionRequest(const URL& redirectingURL, const URL& redirectURL, const URL& firstPartyURL, CompletionHandler<void(const RegistrableDomain& sourceDomain, const RegistrableDomain& destinationDomain, Optional<Conversion>&&)>); > > > > CompletionHandler should probably be &&. > > You mean for future use if I want to forward it? No, now. CompletionHandler is not copyable and should always be moved around. Therefore, we normally pass is as a rvalue reference (&&). > > > > Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:590 > > > + if (auto* networkSession = m_connection->networkProcess().networkSession(sessionID())) > > > > How do you know |this| is still alive here? > > I don't. What is the lifetime of the loader? So should use a protectedThis = > makeRef(*this)? Likely yes.
youenn fablet
Comment 9 2019-04-03 14:50:41 PDT
Comment on attachment 366639 [details] Work-in-progress Some suggestions below. View in context: https://bugs.webkit.org/attachment.cgi?id=366639&action=review > Source/WebCore/loader/AdClickAttribution.cpp:51 > + if (!redirectingURL.protocolIsInHTTPFamily() || !redirectURL.protocolIsInHTTPFamily() || !firstPartyURL.protocolIsInHTTPFamily()) { Should we restrict to HTTPS only? > Source/WebCore/loader/AdClickAttribution.cpp:52 > + completionHandler({ }, { }, WTF::nullopt); It seems that the completionHandler takes either nothing or all parameters. I would probably define a structure like a struct Result { RegistrableDomain, RegistrableDomain, Conversion }. Then we could have "CompletionHandler<void(Result&&)>&& completionHandler", and the call here would be completionHandler({ }). > Source/WebCore/loader/AdClickAttribution.cpp:72 > + ConversionData conversionData = path.substring(prefixLength, adClickConversionDataPathSegmentSize).toUIntStrict(&ok); we are creating a string temporarily for the purpose of getting an unsigned integer. StringView has toUInt64Strict which returns an Optional<> which is even better. > Source/WebCore/loader/AdClickAttribution.cpp:93 > + auto conversionPriority = path.substring(prefixLength + adClickConversionDataPathSegmentSize + 1, adClickPriorityPathSegmentSize).toUIntStrict(&ok); Ditto for string > Source/WebCore/loader/AdClickAttribution.cpp:102 > + } Should we call completionHandler({ }) if that last if is not true? Also, it seems there is no asynchronous processing. How about returning an Optional<Result> instead? > Source/WebKit/NetworkProcess/NetworkAdClickAttribution.cpp:54 > +void NetworkAdClickAttribution::convert(const RegistrableDomain& sourceDomain, const RegistrableDomain& destinationDomain, Conversion&& conversion) Maybe it should take a const Source& and a const Destination&. Or a Result&& >>>> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:590 >>>> + if (auto* networkSession = m_connection->networkProcess().networkSession(sessionID())) >>> >>> How do you know |this| is still alive here? >> >> I don't. What is the lifetime of the loader? So should use a protectedThis = makeRef(*this)? > > Likely yes. Given parseConversionRequest is not doing asynchronous processing, I would do something like: if (auto result = WebCore::AdClickAttribution::parseConversionRequest()) { ... }
Chris Dumez
Comment 10 2019-04-03 15:12:56 PDT
Comment on attachment 366639 [details] Work-in-progress View in context: https://bugs.webkit.org/attachment.cgi?id=366639&action=review >>>>> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:590 >>>>> + if (auto* networkSession = m_connection->networkProcess().networkSession(sessionID())) >>>> >>>> How do you know |this| is still alive here? >>> >>> I don't. What is the lifetime of the loader? So should use a protectedThis = makeRef(*this)? >> >> Likely yes. > > Given parseConversionRequest is not doing asynchronous processing, I would do something like: > if (auto result = WebCore::AdClickAttribution::parseConversionRequest()) { > ... > } Oh duh, It is super confusing if an synchronous method takes in a completion handler ...
John Wilander
Comment 11 2019-04-05 15:55:18 PDT
John Wilander
Comment 12 2019-04-05 15:57:19 PDT
Thanks Chris and Youenn for your comments. Hopefully, I've addressed them all. The patch also has tests now.
John Wilander
Comment 13 2019-04-05 16:05:18 PDT
John Wilander
Comment 14 2019-04-05 16:05:45 PDT
Fixed an expect file.
youenn fablet
Comment 15 2019-04-05 16:31:20 PDT
Comment on attachment 366854 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=366854&action=review > Source/WebCore/loader/AdClickAttribution.cpp:54 > + auto path = redirectURL.path(); I think we should have path be a StringView. Either make URL::path() return a StringView (but larger change) or use something like StringView(url.string()).substring(url.pathStart(), url.pathEnd() - url.pathStart())); > Source/WebCore/loader/AdClickAttribution.cpp:61 > + ConversionData conversionData = path.substring(prefixLength, adClickConversionDataPathSegmentSize).toUIntStrict(&ok); Once path is a StringView, you can write something like: auto value = path.substring(prefixLength, adClickConversionDataPathSegmentSize).toUInt64Strict(); if (!value || *value > MaxEntropy) return { }; You could also remove the checks for MaxEntropy and use instead Conversion::isValid. That could be done as a last step of this method when returning conversion. > Source/WebCore/loader/AdClickAttribution.cpp:67 > + return conversion; We can probably write these 3 lines as one return line. > Source/WebKit/NetworkProcess/NetworkAdClickAttribution.cpp:54 > +void NetworkAdClickAttribution::convert(const RegistrableDomain& sourceDomain, const RegistrableDomain& destinationDomain, Conversion&& conversion) convert could take a const Source& and a const Destination& instead of registrable domains. Better typing and would remove the count churning inside this method. > Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:589 > + if (optionalAdClickConversion) { if (auto adClickConversion =) > Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:593 > + // The redirect has to be done by the same registrable domain and it has to be a third-party request. Do we absolutely need the request to be a third-party? > Source/WebKit/NetworkProcess/NetworkSession.cpp:143 > +void NetworkSession::convertAdClickAttribution(const WebCore::RegistrableDomain& sourceDomain, const WebCore::RegistrableDomain& destinationDomain, WebCore::AdClickAttribution::Conversion&& conversion) Let's take a source and destination here as well if possible.
John Wilander
Comment 16 2019-04-05 17:44:43 PDT
(In reply to youenn fablet from comment #15) > Comment on attachment 366854 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=366854&action=review > > > Source/WebCore/loader/AdClickAttribution.cpp:54 > > + auto path = redirectURL.path(); > > I think we should have path be a StringView. > Either make URL::path() return a StringView (but larger change) or use > something like StringView(url.string()).substring(url.pathStart(), > url.pathEnd() - url.pathStart())); OK. > > Source/WebCore/loader/AdClickAttribution.cpp:61 > > + ConversionData conversionData = path.substring(prefixLength, adClickConversionDataPathSegmentSize).toUIntStrict(&ok); > > Once path is a StringView, you can write something like: > auto value = path.substring(prefixLength, > adClickConversionDataPathSegmentSize).toUInt64Strict(); > if (!value || *value > MaxEntropy) > return { }; OK. > You could also remove the checks for MaxEntropy and use instead > Conversion::isValid. > That could be done as a last step of this method when returning conversion. Wouldn't that be less performant? Note that all of this code happens in WebCore::AdClickAttribtion anyway so we're not leaking abstractions. > > Source/WebCore/loader/AdClickAttribution.cpp:67 > > + return conversion; > > We can probably write these 3 lines as one return line. I had some compile time issues where it couldn't figure out the inner structs. But I'll take another stab at it. > > Source/WebKit/NetworkProcess/NetworkAdClickAttribution.cpp:54 > > +void NetworkAdClickAttribution::convert(const RegistrableDomain& sourceDomain, const RegistrableDomain& destinationDomain, Conversion&& conversion) > > convert could take a const Source& and a const Destination& instead of > registrable domains. > Better typing and would remove the count churning inside this method. Will try. See my comment furthest down. > > Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:589 > > + if (optionalAdClickConversion) { > > if (auto adClickConversion =) > > > Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:593 > > + // The redirect has to be done by the same registrable domain and it has to be a third-party request. > > Do we absolutely need the request to be a third-party? Yes. This functionality is just a compatibility fix for so called tracking pixels. The ad click sources want to be in control of who gets to report a conversion for their ad clicks so the redirect has to go through them. Combine that with the fact that you cannot store ad click attributions same-site and you end up with a requirement that this request has to be third-party. I'm trying avoid unnecessary calls into WebKit::NetworkAdClickAttribution since the map of pending ad clicks may become large. > > Source/WebKit/NetworkProcess/NetworkSession.cpp:143 > > +void NetworkSession::convertAdClickAttribution(const WebCore::RegistrableDomain& sourceDomain, const WebCore::RegistrableDomain& destinationDomain, WebCore::AdClickAttribution::Conversion&& conversion) > > Let's take a source and destination here as well if possible. Hmm, maybe that's possible now. I remember I ran into some issues with the old patch trying to do that.
EWS Watchlist
Comment 17 2019-04-05 18:34:08 PDT
Comment on attachment 366854 [details] Patch Attachment 366854 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: https://webkit-queues.webkit.org/results/11786538 New failing tests: http/tests/adClickAttribution/anchor-tag-attributes-validation.html
EWS Watchlist
Comment 18 2019-04-05 18:34:10 PDT
Created attachment 366866 [details] Archive of layout-test-results from ews122 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews122 Port: ios-simulator-wk2 Platform: Mac OS X 10.13.6
youenn fablet
Comment 19 2019-04-05 19:47:39 PDT
> Wouldn't that be less performant? Note that all of this code happens in > WebCore::AdClickAttribtion anyway so we're not leaking abstractions. This adds an unnecessary if check for the priority in case it is set to 0. I think this does not change much and would favor simplicity. > > Do we absolutely need the request to be a third-party? > > Yes. This functionality is just a compatibility fix for so called tracking > pixels. The ad click sources want to be in control of who gets to report a > conversion for their ad clicks so the redirect has to go through them. > Combine that with the fact that you cannot store ad click attributions > same-site and you end up with a requirement that this request has to be > third-party. I'm trying avoid unnecessary calls into > WebKit::NetworkAdClickAttribution since the map of pending ad clicks may > become large. So the source should in practice never be an add click. In theory though, this could happen. I do not think this will change much in terms of performance since: - this case should almost never happen in practice, so no need to optimize it. - the first key in the double map is the one that is checked for third-party, so it will be an empty entry on the first map.
John Wilander
Comment 20 2019-04-06 16:12:19 PDT
John Wilander
Comment 21 2019-04-06 16:17:26 PDT
(In reply to youenn fablet from comment #19) > > Wouldn't that be less performant? Note that all of this code happens in > > WebCore::AdClickAttribtion anyway so we're not leaking abstractions. > > This adds an unnecessary if check for the priority in case it is set to 0. > I think this does not change much and would favor simplicity. The new patch needs static casts from the 64-bit uints to 32-bit ones. To make that safe, I kept the max entropy checks. > > > Do we absolutely need the request to be a third-party? > > > > Yes. This functionality is just a compatibility fix for so called tracking > > pixels. The ad click sources want to be in control of who gets to report a > > conversion for their ad clicks so the redirect has to go through them. > > Combine that with the fact that you cannot store ad click attributions > > same-site and you end up with a requirement that this request has to be > > third-party. I'm trying avoid unnecessary calls into > > WebKit::NetworkAdClickAttribution since the map of pending ad clicks may > > become large. > > So the source should in practice never be an add click. > In theory though, this could happen. > I do not think this will change much in terms of performance since: > - this case should almost never happen in practice, so no need to optimize > it. > - the first key in the double map is the one that is checked for > third-party, so it will be an empty entry on the first map. domainA.com can have pending attribution under domainB.com while at the same time domainB.com can have pending attribution under domainC.com or even under domainA.com. But domainA.com can never have pending attribution under itself. See lines 446-448 in WebCore::HTMLAnchorElement::parseAdClickAttribution(). The HashMap m_adClickAttributionMap may become pretty large with many, many entries for a single source such as the user's search engine. That's why I don't want to do lookups that I know won't produce any results.
youenn fablet
Comment 22 2019-04-08 09:40:54 PDT
Comment on attachment 366891 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=366891&action=review > Source/WebCore/loader/AdClickAttribution.cpp:53 > + return WTF::nullopt; I prefer "return { };", personal taste only. > Source/WebCore/loader/AdClickAttribution.cpp:68 > + return conversion; Can it be written return Conversion { static_cast<uint32_t>(*conversionDataUInt64, Priority { 0 } }? Or change Conversion constructor to take a Priority optional parameter, something like: explicit Conversion(ConversionData data, Priority priority = 0) > Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:588 > + auto optionalAdClickConversion = WebCore::AdClickAttribution::parseConversionRequest(redirectURL); No need for WebCore:: > Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:589 > + if (optionalAdClickConversion) { if (auto optionalAdClickConversion = WebCore::AdClickAttribution::parseConversionRequest(redirectURL)) > Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:595 > + networkSession->convertAdClickAttribution(WebCore::AdClickAttribution::Source { redirectDomain }, WebCore::AdClickAttribution::Destination { firstPartyURL }, WTFMove(*optionalAdClickConversion)); Ideally, we would do AdClickAttribution::Source { WTFMove(redirectDomain) } Remove WebCore:: > Tools/TestWebKitAPI/Tests/WebCore/AdClickAttribution.cpp:92 > + const URL conversionURLWithoutPriority { { }, "https://webkit.org/.well-known/ad-click-attribution/22"_s }; Can you add 63? > Tools/TestWebKitAPI/Tests/WebCore/AdClickAttribution.cpp:102 > + const URL conversionURLWithPriority { { }, "https://webkit.org/.well-known/ad-click-attribution/22/12"_s }; Can you add 63 for priority? > Tools/TestWebKitAPI/Tests/WebCore/AdClickAttribution.cpp:228 > + optionalConversion = AdClickAttribution::parseConversionRequest(conversionURLWithTooLargeConversionDataAndTooLargePriority); Can you add error cases like "https://webkit.org/.well-known/ad-click-attribution//22/12"_s "https://webkit.org/.well-known/ad-click-attribution/22/12/"_s "https://webkit.org/.well-known/ad-click-attribution/22/12?"_s
John Wilander
Comment 23 2019-04-09 10:41:05 PDT
Created attachment 367052 [details] Patch for landing
John Wilander
Comment 24 2019-04-09 10:41:42 PDT
Thank you, Youenn! I addressed all of your comments and you inspired me to expand testing even further.
WebKit Commit Bot
Comment 25 2019-04-09 11:20:05 PDT
Comment on attachment 367052 [details] Patch for landing Clearing flags on attachment: 367052 Committed r244086: <https://trac.webkit.org/changeset/244086>
WebKit Commit Bot
Comment 26 2019-04-09 11:20:07 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.