Bug 196558 - Pick up Ad Click Attribution conversions in NetworkResourceLoader::willSendRedirectedRequest()
Summary: Pick up Ad Click Attribution conversions in NetworkResourceLoader::willSendRe...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit Misc. (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: John Wilander
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2019-04-03 13:38 PDT by John Wilander
Modified: 2019-04-09 11:20 PDT (History)
9 users (show)

See Also:


Attachments
Work-in-progress (14.80 KB, patch)
2019-04-03 13:57 PDT, John Wilander
no flags Details | Formatted Diff | Diff
Patch (38.05 KB, patch)
2019-04-05 15:55 PDT, John Wilander
no flags Details | Formatted Diff | Diff
Patch (39.52 KB, patch)
2019-04-05 16:05 PDT, John Wilander
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews122 for ios-simulator-wk2 (2.55 MB, application/zip)
2019-04-05 18:34 PDT, EWS Watchlist
no flags Details
Patch (41.12 KB, patch)
2019-04-06 16:12 PDT, John Wilander
no flags Details | Formatted Diff | Diff
Patch for landing (50.01 KB, patch)
2019-04-09 10:41 PDT, John Wilander
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description John Wilander 2019-04-03 13:38:07 PDT
We should look for HTTP redirects to URLs with a path starting with "/.well-known/ad-click-attribution/" and pick up Ad Click Attribution conversions.
Comment 1 Radar WebKit Bug Importer 2019-04-03 13:40:11 PDT
<rdar://problem/49571113>
Comment 2 John Wilander 2019-04-03 13:49:59 PDT
Real radar: <rdar://problem/47650245>
Comment 3 John Wilander 2019-04-03 13:57:24 PDT
Created attachment 366639 [details]
Work-in-progress
Comment 4 Chris Dumez 2019-04-03 13:59:53 PDT
(In reply to John Wilander from comment #3)
> Created attachment 366639 [details]
> Work-in-progress

work-in-progress or r? you have to choose one.
Comment 5 John Wilander 2019-04-03 14:02:45 PDT
Comment on attachment 366639 [details]
Work-in-progress

Sorry. Review comments are most welcome. I basically have the testing left which is why I named it WiP. My plan is to both augment the API test for WebCore::AdClickAttribution and to create a layout test that does the redirect dance.

Removed review flag for now.
Comment 6 Chris Dumez 2019-04-03 14:10:32 PDT
Comment on attachment 366639 [details]
Work-in-progress

View in context: https://bugs.webkit.org/attachment.cgi?id=366639&action=review

> Source/WebCore/loader/AdClickAttribution.cpp:35
> +static const char* const adClickAttributionPathPrefix = "/.well-known/ad-click-attribution/";

Please use:
static const char const adClickAttributionPathPrefix[] = "...";

> Source/WebCore/loader/AdClickAttribution.cpp:36
> +static constexpr size_t adClickConversionDataPathSegmentSize = 2;

const adClickConversionDataPathSegmentSize = 2; would suffice.

> Source/WebCore/loader/AdClickAttribution.cpp:37
> +static constexpr size_t adClickPriorityPathSegmentSize = 2;

ditto.

> Source/WebCore/loader/AdClickAttribution.h:233
> +    WEBCORE_EXPORT static void parseConversionRequest(const URL& redirectingURL, const URL& redirectURL, const URL& firstPartyURL, CompletionHandler<void(const RegistrableDomain& sourceDomain, const RegistrableDomain& destinationDomain, Optional<Conversion>&&)>);

CompletionHandler should probably be &&.

> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:590
> +        if (auto* networkSession = m_connection->networkProcess().networkSession(sessionID()))

How do you know |this| is still alive here?
Comment 7 John Wilander 2019-04-03 14:19:16 PDT
(In reply to Chris Dumez from comment #6)
> Comment on attachment 366639 [details]
> Work-in-progress
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=366639&action=review
> 
> > Source/WebCore/loader/AdClickAttribution.cpp:35
> > +static const char* const adClickAttributionPathPrefix = "/.well-known/ad-click-attribution/";
> 
> Please use:
> static const char const adClickAttributionPathPrefix[] = "...";

Will fix.

> > Source/WebCore/loader/AdClickAttribution.cpp:36
> > +static constexpr size_t adClickConversionDataPathSegmentSize = 2;
> 
> const adClickConversionDataPathSegmentSize = 2; would suffice.

Will fix.

> > Source/WebCore/loader/AdClickAttribution.cpp:37
> > +static constexpr size_t adClickPriorityPathSegmentSize = 2;
> 
> ditto.

Will fix.

> > Source/WebCore/loader/AdClickAttribution.h:233
> > +    WEBCORE_EXPORT static void parseConversionRequest(const URL& redirectingURL, const URL& redirectURL, const URL& firstPartyURL, CompletionHandler<void(const RegistrableDomain& sourceDomain, const RegistrableDomain& destinationDomain, Optional<Conversion>&&)>);
> 
> CompletionHandler should probably be &&.

You mean for future use if I want to forward it?

> > Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:590
> > +        if (auto* networkSession = m_connection->networkProcess().networkSession(sessionID()))
> 
> How do you know |this| is still alive here?

I don't. What is the lifetime of the loader? So should use a protectedThis = makeRef(*this)?
Comment 8 Chris Dumez 2019-04-03 14:21:41 PDT
(In reply to John Wilander from comment #7)
> (In reply to Chris Dumez from comment #6)
> > Comment on attachment 366639 [details]
> > Work-in-progress
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=366639&action=review
> > 
> > > Source/WebCore/loader/AdClickAttribution.cpp:35
> > > +static const char* const adClickAttributionPathPrefix = "/.well-known/ad-click-attribution/";
> > 
> > Please use:
> > static const char const adClickAttributionPathPrefix[] = "...";
> 
> Will fix.
> 
> > > Source/WebCore/loader/AdClickAttribution.cpp:36
> > > +static constexpr size_t adClickConversionDataPathSegmentSize = 2;
> > 
> > const adClickConversionDataPathSegmentSize = 2; would suffice.
> 
> Will fix.
> 
> > > Source/WebCore/loader/AdClickAttribution.cpp:37
> > > +static constexpr size_t adClickPriorityPathSegmentSize = 2;
> > 
> > ditto.
> 
> Will fix.
> 
> > > Source/WebCore/loader/AdClickAttribution.h:233
> > > +    WEBCORE_EXPORT static void parseConversionRequest(const URL& redirectingURL, const URL& redirectURL, const URL& firstPartyURL, CompletionHandler<void(const RegistrableDomain& sourceDomain, const RegistrableDomain& destinationDomain, Optional<Conversion>&&)>);
> > 
> > CompletionHandler should probably be &&.
> 
> You mean for future use if I want to forward it?

No, now. CompletionHandler is not copyable and should always be moved around. Therefore, we normally pass is as a rvalue reference (&&).

> 
> > > Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:590
> > > +        if (auto* networkSession = m_connection->networkProcess().networkSession(sessionID()))
> > 
> > How do you know |this| is still alive here?
> 
> I don't. What is the lifetime of the loader? So should use a protectedThis =
> makeRef(*this)?

Likely yes.
Comment 9 youenn fablet 2019-04-03 14:50:41 PDT
Comment on attachment 366639 [details]
Work-in-progress

Some suggestions below.

View in context: https://bugs.webkit.org/attachment.cgi?id=366639&action=review

> Source/WebCore/loader/AdClickAttribution.cpp:51
> +    if (!redirectingURL.protocolIsInHTTPFamily() || !redirectURL.protocolIsInHTTPFamily() || !firstPartyURL.protocolIsInHTTPFamily()) {

Should we restrict to HTTPS only?

> Source/WebCore/loader/AdClickAttribution.cpp:52
> +        completionHandler({ }, { }, WTF::nullopt);

It seems that the completionHandler takes either nothing or all parameters.
I would probably define a structure like a struct Result { RegistrableDomain, RegistrableDomain, Conversion }.
Then we could have "CompletionHandler<void(Result&&)>&& completionHandler", and the call here would be completionHandler({ }).

> Source/WebCore/loader/AdClickAttribution.cpp:72
> +        ConversionData conversionData = path.substring(prefixLength, adClickConversionDataPathSegmentSize).toUIntStrict(&ok);

we are creating a string temporarily for the purpose of getting an unsigned integer.
StringView has toUInt64Strict which returns an Optional<> which is even better.

> Source/WebCore/loader/AdClickAttribution.cpp:93
> +        auto conversionPriority = path.substring(prefixLength + adClickConversionDataPathSegmentSize + 1, adClickPriorityPathSegmentSize).toUIntStrict(&ok);

Ditto for string

> Source/WebCore/loader/AdClickAttribution.cpp:102
> +    }

Should we call completionHandler({ }) if that last if is not true?

Also, it seems there is no asynchronous processing.
How about returning an Optional<Result> instead?

> Source/WebKit/NetworkProcess/NetworkAdClickAttribution.cpp:54
> +void NetworkAdClickAttribution::convert(const RegistrableDomain& sourceDomain, const RegistrableDomain& destinationDomain, Conversion&& conversion)

Maybe it should take a const Source& and a const Destination&.
Or a Result&&

>>>> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:590
>>>> +        if (auto* networkSession = m_connection->networkProcess().networkSession(sessionID()))
>>> 
>>> How do you know |this| is still alive here?
>> 
>> I don't. What is the lifetime of the loader? So should use a protectedThis = makeRef(*this)?
> 
> Likely yes.

Given parseConversionRequest is not doing asynchronous processing, I would do something like:
if (auto result = WebCore::AdClickAttribution::parseConversionRequest()) {
   ...
}
Comment 10 Chris Dumez 2019-04-03 15:12:56 PDT
Comment on attachment 366639 [details]
Work-in-progress

View in context: https://bugs.webkit.org/attachment.cgi?id=366639&action=review

>>>>> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:590
>>>>> +        if (auto* networkSession = m_connection->networkProcess().networkSession(sessionID()))
>>>> 
>>>> How do you know |this| is still alive here?
>>> 
>>> I don't. What is the lifetime of the loader? So should use a protectedThis = makeRef(*this)?
>> 
>> Likely yes.
> 
> Given parseConversionRequest is not doing asynchronous processing, I would do something like:
> if (auto result = WebCore::AdClickAttribution::parseConversionRequest()) {
>    ...
> }

Oh duh, It is super confusing if an synchronous method takes in a completion handler ...
Comment 11 John Wilander 2019-04-05 15:55:18 PDT
Created attachment 366851 [details]
Patch
Comment 12 John Wilander 2019-04-05 15:57:19 PDT
Thanks Chris and Youenn for your comments. Hopefully, I've addressed them all. The patch also has tests now.
Comment 13 John Wilander 2019-04-05 16:05:18 PDT
Created attachment 366854 [details]
Patch
Comment 14 John Wilander 2019-04-05 16:05:45 PDT
Fixed an expect file.
Comment 15 youenn fablet 2019-04-05 16:31:20 PDT
Comment on attachment 366854 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=366854&action=review

> Source/WebCore/loader/AdClickAttribution.cpp:54
> +    auto path = redirectURL.path();

I think we should have path be a StringView.
Either make URL::path() return a StringView (but larger change) or use something like StringView(url.string()).substring(url.pathStart(), url.pathEnd() - url.pathStart()));

> Source/WebCore/loader/AdClickAttribution.cpp:61
> +        ConversionData conversionData = path.substring(prefixLength, adClickConversionDataPathSegmentSize).toUIntStrict(&ok);

Once path is a StringView, you can write something like:
auto value = path.substring(prefixLength, adClickConversionDataPathSegmentSize).toUInt64Strict();
if (!value || *value > MaxEntropy)
    return { };

You could also remove the checks for MaxEntropy and use instead Conversion::isValid.
That could be done as a last step of this method when returning conversion.

> Source/WebCore/loader/AdClickAttribution.cpp:67
> +        return conversion;

We can probably write these 3 lines as one return line.

> Source/WebKit/NetworkProcess/NetworkAdClickAttribution.cpp:54
> +void NetworkAdClickAttribution::convert(const RegistrableDomain& sourceDomain, const RegistrableDomain& destinationDomain, Conversion&& conversion)

convert could take a const Source& and a const Destination& instead of registrable domains.
Better typing and would remove the count churning inside this method.

> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:589
> +    if (optionalAdClickConversion) {

if (auto adClickConversion =)

> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:593
> +        // The redirect has to be done by the same registrable domain and it has to be a third-party request.

Do we absolutely need the request to be a third-party?

> Source/WebKit/NetworkProcess/NetworkSession.cpp:143
> +void NetworkSession::convertAdClickAttribution(const WebCore::RegistrableDomain& sourceDomain, const WebCore::RegistrableDomain& destinationDomain, WebCore::AdClickAttribution::Conversion&& conversion)

Let's take a source and destination here as well if possible.
Comment 16 John Wilander 2019-04-05 17:44:43 PDT
(In reply to youenn fablet from comment #15)
> Comment on attachment 366854 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=366854&action=review
> 
> > Source/WebCore/loader/AdClickAttribution.cpp:54
> > +    auto path = redirectURL.path();
> 
> I think we should have path be a StringView.
> Either make URL::path() return a StringView (but larger change) or use
> something like StringView(url.string()).substring(url.pathStart(),
> url.pathEnd() - url.pathStart()));

OK.

> > Source/WebCore/loader/AdClickAttribution.cpp:61
> > +        ConversionData conversionData = path.substring(prefixLength, adClickConversionDataPathSegmentSize).toUIntStrict(&ok);
> 
> Once path is a StringView, you can write something like:
> auto value = path.substring(prefixLength,
> adClickConversionDataPathSegmentSize).toUInt64Strict();
> if (!value || *value > MaxEntropy)
>     return { };

OK.

> You could also remove the checks for MaxEntropy and use instead
> Conversion::isValid.
> That could be done as a last step of this method when returning conversion.

Wouldn't that be less performant? Note that all of this code happens in WebCore::AdClickAttribtion anyway so we're not leaking abstractions.

> > Source/WebCore/loader/AdClickAttribution.cpp:67
> > +        return conversion;
> 
> We can probably write these 3 lines as one return line.

I had some compile time issues where it couldn't figure out the inner structs. But I'll take another stab at it.

> > Source/WebKit/NetworkProcess/NetworkAdClickAttribution.cpp:54
> > +void NetworkAdClickAttribution::convert(const RegistrableDomain& sourceDomain, const RegistrableDomain& destinationDomain, Conversion&& conversion)
> 
> convert could take a const Source& and a const Destination& instead of
> registrable domains.
> Better typing and would remove the count churning inside this method.

Will try. See my comment furthest down.

> > Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:589
> > +    if (optionalAdClickConversion) {
> 
> if (auto adClickConversion =)
> 
> > Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:593
> > +        // The redirect has to be done by the same registrable domain and it has to be a third-party request.
> 
> Do we absolutely need the request to be a third-party?

Yes. This functionality is just a compatibility fix for so called tracking pixels. The ad click sources want to be in control of who gets to report a conversion for their ad clicks so the redirect has to go through them. Combine that with the fact that you cannot store ad click attributions same-site and you end up with a requirement that this request has to be third-party. I'm trying avoid unnecessary calls into WebKit::NetworkAdClickAttribution since the map of pending ad clicks may become large.

> > Source/WebKit/NetworkProcess/NetworkSession.cpp:143
> > +void NetworkSession::convertAdClickAttribution(const WebCore::RegistrableDomain& sourceDomain, const WebCore::RegistrableDomain& destinationDomain, WebCore::AdClickAttribution::Conversion&& conversion)
> 
> Let's take a source and destination here as well if possible.

Hmm, maybe that's possible now. I remember I ran into some issues with the old patch trying to do that.
Comment 17 EWS Watchlist 2019-04-05 18:34:08 PDT
Comment on attachment 366854 [details]
Patch

Attachment 366854 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: https://webkit-queues.webkit.org/results/11786538

New failing tests:
http/tests/adClickAttribution/anchor-tag-attributes-validation.html
Comment 18 EWS Watchlist 2019-04-05 18:34:10 PDT
Created attachment 366866 [details]
Archive of layout-test-results from ews122 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews122  Port: ios-simulator-wk2  Platform: Mac OS X 10.13.6
Comment 19 youenn fablet 2019-04-05 19:47:39 PDT
> Wouldn't that be less performant? Note that all of this code happens in
> WebCore::AdClickAttribtion anyway so we're not leaking abstractions.

This adds an unnecessary if check for the priority in case it is set to 0.
I think this does not change much and would favor simplicity.

> > Do we absolutely need the request to be a third-party?
> 
> Yes. This functionality is just a compatibility fix for so called tracking
> pixels. The ad click sources want to be in control of who gets to report a
> conversion for their ad clicks so the redirect has to go through them.
> Combine that with the fact that you cannot store ad click attributions
> same-site and you end up with a requirement that this request has to be
> third-party. I'm trying avoid unnecessary calls into
> WebKit::NetworkAdClickAttribution since the map of pending ad clicks may
> become large.

So the source should in practice never be an add click.
In theory though, this could happen.
I do not think this will change much in terms of performance since:
- this case should almost never happen in practice, so no need to optimize it.
- the first key in the double map is the one that is checked for third-party, so it will be an empty entry on the first map.
Comment 20 John Wilander 2019-04-06 16:12:19 PDT
Created attachment 366891 [details]
Patch
Comment 21 John Wilander 2019-04-06 16:17:26 PDT
(In reply to youenn fablet from comment #19)
> > Wouldn't that be less performant? Note that all of this code happens in
> > WebCore::AdClickAttribtion anyway so we're not leaking abstractions.
> 
> This adds an unnecessary if check for the priority in case it is set to 0.
> I think this does not change much and would favor simplicity.

The new patch needs static casts from the 64-bit uints to 32-bit ones. To make that safe, I kept the max entropy checks.

> > > Do we absolutely need the request to be a third-party?
> > 
> > Yes. This functionality is just a compatibility fix for so called tracking
> > pixels. The ad click sources want to be in control of who gets to report a
> > conversion for their ad clicks so the redirect has to go through them.
> > Combine that with the fact that you cannot store ad click attributions
> > same-site and you end up with a requirement that this request has to be
> > third-party. I'm trying avoid unnecessary calls into
> > WebKit::NetworkAdClickAttribution since the map of pending ad clicks may
> > become large.
> 
> So the source should in practice never be an add click.
> In theory though, this could happen.
> I do not think this will change much in terms of performance since:
> - this case should almost never happen in practice, so no need to optimize
> it.
> - the first key in the double map is the one that is checked for
> third-party, so it will be an empty entry on the first map.

domainA.com can have pending attribution under domainB.com while at the same time domainB.com can have pending attribution under domainC.com or even under domainA.com. But domainA.com can never have pending attribution under itself. See lines 446-448 in WebCore::HTMLAnchorElement::parseAdClickAttribution().

The HashMap m_adClickAttributionMap may become pretty large with many, many entries for a single source such as the user's search engine. That's why I don't want to do lookups that I know won't produce any results.
Comment 22 youenn fablet 2019-04-08 09:40:54 PDT
Comment on attachment 366891 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=366891&action=review

> Source/WebCore/loader/AdClickAttribution.cpp:53
> +        return WTF::nullopt;

I prefer "return { };", personal taste only.

> Source/WebCore/loader/AdClickAttribution.cpp:68
> +        return conversion;

Can it be written return Conversion { static_cast<uint32_t>(*conversionDataUInt64, Priority { 0 } }?
Or change Conversion constructor to take a Priority optional parameter, something like:

explicit Conversion(ConversionData data, Priority priority = 0)

> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:588
> +    auto optionalAdClickConversion = WebCore::AdClickAttribution::parseConversionRequest(redirectURL);

No need for WebCore::

> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:589
> +    if (optionalAdClickConversion) {

if (auto optionalAdClickConversion = WebCore::AdClickAttribution::parseConversionRequest(redirectURL))

> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:595
> +            networkSession->convertAdClickAttribution(WebCore::AdClickAttribution::Source { redirectDomain }, WebCore::AdClickAttribution::Destination { firstPartyURL }, WTFMove(*optionalAdClickConversion));

Ideally, we would do AdClickAttribution::Source { WTFMove(redirectDomain) }
Remove WebCore::

> Tools/TestWebKitAPI/Tests/WebCore/AdClickAttribution.cpp:92
> +    const URL conversionURLWithoutPriority { { }, "https://webkit.org/.well-known/ad-click-attribution/22"_s };

Can you add 63?

> Tools/TestWebKitAPI/Tests/WebCore/AdClickAttribution.cpp:102
> +    const URL conversionURLWithPriority { { }, "https://webkit.org/.well-known/ad-click-attribution/22/12"_s };

Can you add 63 for priority?

> Tools/TestWebKitAPI/Tests/WebCore/AdClickAttribution.cpp:228
> +    optionalConversion = AdClickAttribution::parseConversionRequest(conversionURLWithTooLargeConversionDataAndTooLargePriority);

Can you add error cases like "https://webkit.org/.well-known/ad-click-attribution//22/12"_s "https://webkit.org/.well-known/ad-click-attribution/22/12/"_s  "https://webkit.org/.well-known/ad-click-attribution/22/12?"_s
Comment 23 John Wilander 2019-04-09 10:41:05 PDT
Created attachment 367052 [details]
Patch for landing
Comment 24 John Wilander 2019-04-09 10:41:42 PDT
Thank you, Youenn! I addressed all of your comments and you inspired me to expand testing even further.
Comment 25 WebKit Commit Bot 2019-04-09 11:20:05 PDT
Comment on attachment 367052 [details]
Patch for landing

Clearing flags on attachment: 367052

Committed r244086: <https://trac.webkit.org/changeset/244086>
Comment 26 WebKit Commit Bot 2019-04-09 11:20:07 PDT
All reviewed patches have been landed.  Closing bug.