Bug 193916 - Add data abstraction and validation for Ad Click Attribution
Summary: Add data abstraction and validation for Ad Click Attribution
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: John Wilander
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2019-01-28 11:12 PST by John Wilander
Modified: 2019-01-29 11:54 PST (History)
8 users (show)

See Also:


Attachments
Patch (26.10 KB, patch)
2019-01-28 11:41 PST, John Wilander
no flags Details | Formatted Diff | Diff
Patch (26.85 KB, patch)
2019-01-28 15:45 PST, John Wilander
no flags Details | Formatted Diff | Diff
Patch (26.54 KB, patch)
2019-01-28 21:10 PST, John Wilander
no flags Details | Formatted Diff | Diff
Patch for landing (27.73 KB, patch)
2019-01-29 11:19 PST, John Wilander
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description John Wilander 2019-01-28 11:12:55 PST
We need a class to represent and validate Ad Click Attribution requests.
Comment 1 Radar WebKit Bug Importer 2019-01-28 11:13:27 PST
<rdar://problem/47603481>
Comment 2 John Wilander 2019-01-28 11:41:00 PST
Created attachment 360361 [details]
Patch
Comment 3 Daniel Bates 2019-01-28 12:12:13 PST
Comment on attachment 360361 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=360361&action=review

More use of uniform initializer syntax, please?

> Source/WebCore/loader/AdClickAttribution.cpp:35
> +using namespace WTF;    

Why?

> Source/WebCore/loader/AdClickAttribution.cpp:37
> +static inline bool isMaxSixBits(unsigned short data)

Name feels weird and not flexible to change because it hardcodes policy in the name. One day if we want 7 bits then we have to rename...hopefully such a person will do the rename.

> Source/WebCore/loader/AdClickAttribution.cpp:52
> +void AdClickAttribution::setCampaign(Campaign campaign)

Why out-of-line? Will this be API one day? If not, just inline in header as binary compatibility is not a concern.

> Source/WebCore/loader/AdClickAttribution.h:41
> +    typedef unsigned short CampaignId;

Old style. New hotness is “using”.

> Source/WebCore/loader/AdClickAttribution.h:42
> +    typedef unsigned short ConversionData;

Ditto.

> Source/WebCore/loader/AdClickAttribution.h:43
> +    typedef unsigned short PriorityValue;

Ditto.

> Source/WebCore/loader/AdClickAttribution.h:45
> +struct Campaign {

Why a struct and a typeset for id? Where’s the type-safety issue? Prefer to defer this struct until a campaign is more than an id z

> Source/WebCore/loader/AdClickAttribution.h:51
> +};

Indent of struct is weird.

> Source/WebCore/loader/AdClickAttribution.h:63
> +};

Weird indent

> Source/WebCore/loader/AdClickAttribution.h:75
> +};

Weird indent.

> Source/WebCore/loader/AdClickAttribution.h:77
> +struct Conversion {

Ok... I can see why you made a Campaign struct

> Source/WebCore/loader/AdClickAttribution.h:101
> +    AdClickAttribution() = delete;

Unnecessary.

> Source/WebCore/loader/AdClickAttribution.h:102
> +    AdClickAttribution(const AdClickAttribution&) = delete;

Use Noncopyable instead of doing this.

> Source/WebCore/loader/AdClickAttribution.h:104
> +    AdClickAttribution(AdClickAttribution&&) = default;

I think this is unnecessary...

> Source/WebCore/loader/AdClickAttribution.h:105
> +    AdClickAttribution& operator=(AdClickAttribution&&) = default;

Ditto.

> Source/WebCore/loader/AdClickAttribution.h:107
> +    WEBCORE_EXPORT void setCampaign(Campaign);

Two ways to set campaign. Is this necessary?

> Source/WebCore/loader/AdClickAttribution.h:109
> +

(Reviewing on iOS is wack) Below method should be const. if not, why?

> Source/WebCore/loader/AdClickAttribution.h:110
> +    WEBCORE_EXPORT URL attributionUrl();

Name of this is wrong. Read style guide for methods with acronyms.

> Source/WebCore/loader/AdClickAttribution.h:111
> +    WEBCORE_EXPORT URL attributionReferrer();

Const? Also is the “attribution” prefix needed in this method and the one above as the class is called attribution. Avoid duplications

> Source/WebCore/loader/AdClickAttribution.h:112
> +    Optional<WallTime> earliestTimeToSendAttribution() { return m_earliestTimeToSend; };

Why optional?
Comment 4 Brent Fulgham 2019-01-28 12:17:18 PST
Comment on attachment 360361 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=360361&action=review

> Source/WebCore/loader/AdClickAttribution.cpp:37
> +static inline bool isMaxSixBits(unsigned short data)

I wonder if we should encode these as "isAllowableConversionBits" and "isAllowableCampaignBits", since they might be different?

Or better yet, make them const members of the Conversion and Campaign structs.

I think these also need "isValid" members so we can generate DOM errors if one is created in script outside of allowable bounds.

> Source/WebCore/loader/AdClickAttribution.cpp:61
> +        return;

This should generate a DOM error.

> Source/WebCore/loader/AdClickAttribution.cpp:76
> +    builder.append(m_source.topPrivatelyControlledDomain);

Why don't we call this member "m_attributionDomain" or something, rather than m_source?

> Source/WebCore/loader/AdClickAttribution.cpp:80
> +    builder.appendNumber(m_campaign.id);

Seems like we could build this at construction, rather than every time you asked for the Url. Then this could be const (and efficient).

Or at the very least, we should lazily build it and remember it for later.

> Source/WebCore/loader/AdClickAttribution.cpp:89
> +URL AdClickAttribution::attributionReferrer()

Ditto.

> Source/WebCore/loader/AdClickAttribution.cpp:96
> +    builder.append(m_destination.topPrivatelyControlledDomain);

Why don't we call this "m_referrer" rather than "m_destination"?

> Source/WebCore/loader/AdClickAttribution.h:54
> +    explicit Source(const String& host)

What is the benefit of a struct containing a String, versus just storing a String? Do you plan on incorporating more state about the Source/Destination in a later patch?

> Source/WebCore/loader/AdClickAttribution.h:56
> +        : topPrivatelyControlledDomain(WebCore::topPrivatelyControlledDomain(host))

If the domain is only every the top privately controlled domain, we might be better off using "topDomain" or "sourceDomain".

> Source/WebCore/loader/AdClickAttribution.h:85
> +struct Priority {

This seems like it would be better as a "plain-old-type" in the class, rather than this wrapping struct. Why is Priority an object?
Comment 5 Alex Christensen 2019-01-28 12:53:05 PST
Comment on attachment 360361 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=360361&action=review

> Source/WebCore/WebCore.xcodeproj/project.pbxproj:1999
> +		6B0A07F321FA4B5C00D57391 /* AdClickAttribution.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 6B0A07F121FA4B5C00D57391 /* AdClickAttribution.cpp */; };

This should be added to Sources.txt instead of included in the WebCore target.  We should also add a runtime switch, but probably in another patch.
Comment 6 John Wilander 2019-01-28 13:17:51 PST
(In reply to Daniel Bates from comment #3)
> Comment on attachment 360361 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=360361&action=review
> 
> More use of uniform initializer syntax, please?

Do you have a specific example so that I know I'm addressing the issue?

> > Source/WebCore/loader/AdClickAttribution.cpp:35
> > +using namespace WTF;    
> 
> Why?

For WallTime, String, and Optional. Maybe it's not needed?

> > Source/WebCore/loader/AdClickAttribution.cpp:37
> > +static inline bool isMaxSixBits(unsigned short data)
> 
> Name feels weird and not flexible to change because it hardcodes policy in
> the name. One day if we want 7 bits then we have to rename...hopefully such
> a person will do the rename.

The reason I hardcoded it is to make the code readable, especially for people who care about the bits of entropy here. Brent had a suggested new name. I can try that.

> > Source/WebCore/loader/AdClickAttribution.cpp:52
> > +void AdClickAttribution::setCampaign(Campaign campaign)
> 
> Why out-of-line? Will this be API one day? If not, just inline in header as
> binary compatibility is not a concern.

I can make it in-line.

> > Source/WebCore/loader/AdClickAttribution.h:41
> > +    typedef unsigned short CampaignId;
> 
> Old style. New hotness is “using”.

Ah. Will fix.

> > Source/WebCore/loader/AdClickAttribution.h:42
> > +    typedef unsigned short ConversionData;
> 
> Ditto.

Will fix.

> > Source/WebCore/loader/AdClickAttribution.h:43
> > +    typedef unsigned short PriorityValue;
> 
> Ditto.

Will fix.

> > Source/WebCore/loader/AdClickAttribution.h:45
> > +struct Campaign {
> 
> Why a struct and a typeset for id? Where’s the type-safety issue? Prefer to
> defer this struct until a campaign is more than an id z

I don't want the function call to use plain strings where the caller can make a mistake on which is the source and which is the destination. After implementing that abstraction it looked more consistent to me to have all three arguments be typed. On top of that, the code handles three pieces of data that are unsigned shorts and I want to avoid mistakes where they get mixed up.

> > Source/WebCore/loader/AdClickAttribution.h:51
> > +};
> 
> Indent of struct is weird.

Can you be more specific, please? Are the curly braces wrong, or the body of the structs, or what?

> > Source/WebCore/loader/AdClickAttribution.h:63
> > +};
> 
> Weird indent
> 
> > Source/WebCore/loader/AdClickAttribution.h:75
> > +};
> 
> Weird indent.
> 
> > Source/WebCore/loader/AdClickAttribution.h:77
> > +struct Conversion {
> 
> Ok... I can see why you made a Campaign struct
> 
> > Source/WebCore/loader/AdClickAttribution.h:101
> > +    AdClickAttribution() = delete;
> 
> Unnecessary.
> 
> > Source/WebCore/loader/AdClickAttribution.h:102
> > +    AdClickAttribution(const AdClickAttribution&) = delete;
> 
> Use Noncopyable instead of doing this.
> 
> > Source/WebCore/loader/AdClickAttribution.h:104
> > +    AdClickAttribution(AdClickAttribution&&) = default;
> 
> I think this is unnecessary...
> 
> > Source/WebCore/loader/AdClickAttribution.h:105
> > +    AdClickAttribution& operator=(AdClickAttribution&&) = default;
> 
> Ditto.
> 
> > Source/WebCore/loader/AdClickAttribution.h:107
> > +    WEBCORE_EXPORT void setCampaign(Campaign);
> 
> Two ways to set campaign. Is this necessary?

You mean constructor + this setter? I had thoughts on users clicking multiple ad campaigns but you're right. I don't need this setter now so it's better to add it if/when there really is a need.

> > Source/WebCore/loader/AdClickAttribution.h:109
> > +
> 
> (Reviewing on iOS is wack) Below method should be const. if not, why?

You're right. I'll make it const.

> > Source/WebCore/loader/AdClickAttribution.h:110
> > +    WEBCORE_EXPORT URL attributionUrl();
> 
> Name of this is wrong. Read style guide for methods with acronyms.

OK. I assume URL should be URL and not Url.

> > Source/WebCore/loader/AdClickAttribution.h:111
> > +    WEBCORE_EXPORT URL attributionReferrer();
> 
> Const? Also is the “attribution” prefix needed in this method and the one
> above as the class is called attribution. Avoid duplications

I went back and forth on the prefix, thinking about the caller and what they may have named the local object/parameter they're working on. But you're probably right. I'll remove the prefix.

> > Source/WebCore/loader/AdClickAttribution.h:112
> > +    Optional<WallTime> earliestTimeToSendAttribution() { return m_earliestTimeToSend; };
> 
> Why optional?

Since the object may not yet have received a call to requestAttribution(), m_earliestTimeToSend may not be set.

Thanks for all the comments!
Comment 7 John Wilander 2019-01-28 13:29:44 PST
(In reply to Alex Christensen from comment #5)
> Comment on attachment 360361 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=360361&action=review
> 
> > Source/WebCore/WebCore.xcodeproj/project.pbxproj:1999
> > +		6B0A07F321FA4B5C00D57391 /* AdClickAttribution.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 6B0A07F121FA4B5C00D57391 /* AdClickAttribution.cpp */; };
> 
> This should be added to Sources.txt instead of included in the WebCore
> target.

To be sure, that will still work with API tests? Because I had to make it WebCore private for that.

> We should also add a runtime switch, but probably in another patch.

Beyond what landed in https://bugs.webkit.org/show_bug.cgi?id=193685 ? Or would you like this class to make use of the runtime flag? I started out that way but decided that this class is merely a data abstraction and it'll be its users that check whether the feature is on or off.

Thanks!
Comment 8 John Wilander 2019-01-28 13:40:57 PST
(In reply to Brent Fulgham from comment #4)
> Comment on attachment 360361 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=360361&action=review
> 
> > Source/WebCore/loader/AdClickAttribution.cpp:37
> > +static inline bool isMaxSixBits(unsigned short data)
> 
> I wonder if we should encode these as "isAllowableConversionBits" and
> "isAllowableCampaignBits", since they might be different?

Dan commented on this too. I just wanted it to be explicit and readable for the people who care about entropy. But your abstract name works too.

> Or better yet, make them const members of the Conversion and Campaign
> structs.
> 
> I think these also need "isValid" members so we can generate DOM errors if
> one is created in script outside of allowable bounds.

So that would mean the two structs have constants and one isValid() each that's used by the class-level isValid()?

> > Source/WebCore/loader/AdClickAttribution.cpp:61
> > +        return;
> 
> This should generate a DOM error.

Shouldn't that be done outside this data abstraction, i.e. where the class is used? This class has no notion of a DOM or such.

> > Source/WebCore/loader/AdClickAttribution.cpp:76
> > +    builder.append(m_source.topPrivatelyControlledDomain);
> 
> Why don't we call this member "m_attributionDomain" or something, rather
> than m_source?

I'm trying to mirror the language we've used discussing this, i.e. ad click source and ad click destination. Do you think attributionDomain is more clear?

> > Source/WebCore/loader/AdClickAttribution.cpp:80
> > +    builder.appendNumber(m_campaign.id);
> 
> Seems like we could build this at construction, rather than every time you
> asked for the Url. Then this could be const (and efficient).

That is now a possibility, given that I'm removing the campaign ID setter (Dan's comment). But that also means holding duplicate data in memory. Normally, this function should only be called once — when the time has come to send the attribution request. After that, the object should be destructed.

What I'm trying to do is to hold minimal data in memory and construct the URL on-demand for the single call when it's needed.

> Or at the very least, we should lazily build it and remember it for later.
> 
> > Source/WebCore/loader/AdClickAttribution.cpp:89
> > +URL AdClickAttribution::attributionReferrer()
> 
> Ditto.

See my reasoning above.

> > Source/WebCore/loader/AdClickAttribution.cpp:96
> > +    builder.append(m_destination.topPrivatelyControlledDomain);
> 
> Why don't we call this "m_referrer" rather than "m_destination"?

Not a bad idea. I'm a little reluctant since referrer has a specific meaning in HTTP requests. The getter is named this way because it really should be the referrer (unless we redesign it to be for instance the Origin header).

> > Source/WebCore/loader/AdClickAttribution.h:54
> > +    explicit Source(const String& host)
> 
> What is the benefit of a struct containing a String, versus just storing a
> String? Do you plan on incorporating more state about the Source/Destination
> in a later patch?

The purpose is to avoid mistakes where source and destination get confused. By maintaining this type abstraction I can make any future internal functionality be explicit about which of the two strings it expects.

> 
> > Source/WebCore/loader/AdClickAttribution.h:56
> > +        : topPrivatelyControlledDomain(WebCore::topPrivatelyControlledDomain(host))
> 
> If the domain is only every the top privately controlled domain, we might be
> better off using "topDomain" or "sourceDomain".
> 
> > Source/WebCore/loader/AdClickAttribution.h:85
> > +struct Priority {
> 
> This seems like it would be better as a "plain-old-type" in the class,
> rather than this wrapping struct. Why is Priority an object?

Again, to avoid confusion and bugs when juggling the three pieces of data that are unsigned shorts.

Thanks for all the comments!
Comment 9 John Wilander 2019-01-28 15:45:06 PST
Created attachment 360391 [details]
Patch
Comment 10 John Wilander 2019-01-28 17:20:00 PST
Mac-wk2 test failures are unrelated. Win tree seems red.
Comment 11 Daniel Bates 2019-01-28 18:08:39 PST
Comment on attachment 360391 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=360391&action=review

> Source/WebCore/loader/AdClickAttribution.cpp:52
> +    m_priority = priority;

Move priority into the ivar

> Source/WebCore/loader/AdClickAttribution.h:51
> +            : id(id)

Uniform initializer syntax (UIS)?

> Source/WebCore/loader/AdClickAttribution.h:66
> +            : topPrivatelyControlledDomain(WebCore::topPrivatelyControlledDomain(host))

Ditto.

> Source/WebCore/loader/AdClickAttribution.h:72
> +        String topPrivatelyControlledDomain;

Ok as-is, but I don't like this terminology. The world seems to standardize on "registrable domain", which is also shorter. Maybe the "top" should be encoded in the name of this class? Or someone may just pass the registrable domain of a subframe's document.

> Source/WebCore/loader/AdClickAttribution.h:78
> +            : topPrivatelyControlledDomain(WebCore::topPrivatelyControlledDomain(host))

Why is the WebCore function prefixed with "Top"? What does "top" mean? Now, I am confused.

> Source/WebCore/loader/AdClickAttribution.h:84
> +        String topPrivatelyControlledDomain;

Registrable doman?

> Source/WebCore/loader/AdClickAttribution.h:89
> +            : data(data)

UIS?

> Source/WebCore/loader/AdClickAttribution.h:103
> +            : value(value)

UIS?

> Source/WebCore/loader/AdClickAttribution.h:115
> +    AdClickAttribution(const Campaign& campaign, const Source& source, const Destination& destination)

Pass campaign by value because it's small.

> Source/WebCore/loader/AdClickAttribution.h:116
> +        : m_campaign(campaign)

UIS here and the lines below.

> Source/WebCore/loader/AdClickAttribution.h:123
> +    WEBCORE_EXPORT void setConversion(const Conversion&, const Optional<Priority>&);

Pass conversion by value. Pass optional by rvalue reference.
Comment 12 John Wilander 2019-01-28 21:05:51 PST
(In reply to Daniel Bates from comment #11)
> Comment on attachment 360391 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=360391&action=review
> 
> > Source/WebCore/loader/AdClickAttribution.cpp:52
> > +    m_priority = priority;
> 
> Move priority into the ivar

Yeah, that's better. Will fix.

> > Source/WebCore/loader/AdClickAttribution.h:51
> > +            : id(id)
> 
> Uniform initializer syntax (UIS)?

Sorry, missed UIS. Will fix.

> > Source/WebCore/loader/AdClickAttribution.h:66
> > +            : topPrivatelyControlledDomain(WebCore::topPrivatelyControlledDomain(host))
> 
> Ditto.

Will fix.

> > Source/WebCore/loader/AdClickAttribution.h:72
> > +        String topPrivatelyControlledDomain;
> 
> Ok as-is, but I don't like this terminology. The world seems to standardize
> on "registrable domain", which is also shorter. Maybe the "top" should be
> encoded in the name of this class? Or someone may just pass the registrable
> domain of a subframe's document.

I recall you introducing registrable domain as part of your SameSite cookies work but then I couldn't find it. I'm fine using that term though. Will fix.

> > Source/WebCore/loader/AdClickAttribution.h:78
> > +            : topPrivatelyControlledDomain(WebCore::topPrivatelyControlledDomain(host))
> 
> Why is the WebCore function prefixed with "Top"? What does "top" mean? Now,
> I am confused.

It's top in the domain hierarchy, as in the highest domain you can control privately.

> > Source/WebCore/loader/AdClickAttribution.h:84
> > +        String topPrivatelyControlledDomain;
> 
> Registrable doman?

Yup. Will fix.

> > Source/WebCore/loader/AdClickAttribution.h:89
> > +            : data(data)
> 
> UIS?

Will fix.

> > Source/WebCore/loader/AdClickAttribution.h:103
> > +            : value(value)
> 
> UIS?

Will fix.

> > Source/WebCore/loader/AdClickAttribution.h:115
> > +    AdClickAttribution(const Campaign& campaign, const Source& source, const Destination& destination)
> 
> Pass campaign by value because it's small.

OK. Will fix.

> > Source/WebCore/loader/AdClickAttribution.h:116
> > +        : m_campaign(campaign)
> 
> UIS here and the lines below.

Will fix.

> > Source/WebCore/loader/AdClickAttribution.h:123
> > +    WEBCORE_EXPORT void setConversion(const Conversion&, const Optional<Priority>&);
> 
> Pass conversion by value. Pass optional by rvalue reference.

Since Conversion now holds both a ConversionID and a PriorityValue, I assume you'll want it as an rvalue reference.

Thanks for all these detailed review comments, Dan! New patch coming up.
Comment 13 John Wilander 2019-01-28 21:10:33 PST
Created attachment 360432 [details]
Patch
Comment 14 Daniel Bates 2019-01-29 10:49:06 PST
Comment on attachment 360432 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=360432&action=review

> Source/WebCore/loader/AdClickAttribution.h:51
> +            : id {id}

Thanks for switching to UIS, but if I had known you were just going to just replace (/) with {/} and add a space before the { then I wouldn't have said anything. There should be spaces inside the curly braces on both sides as well just like how you wrote line 85 in the .cpp. Where is the description in the ChangeLog? Where is the why or how comment on the 24-48 duration? Why do I still see "Url" references in this patch? It's the little things like this that give me pause when reviewing this patch (maybe for someone else they are just noise?). I am done reviewing.
Comment 15 Daniel Bates 2019-01-29 10:50:52 PST
Comment on attachment 360432 [details]
Patch

Oops! Didn't meant to r+ the patch because I actually stopped reviewed it after my last comment. Oh well, consider my r+ equivalent to a rubber-stamp, rs=me.
Comment 16 John Wilander 2019-01-29 11:19:48 PST
Created attachment 360480 [details]
Patch for landing
Comment 17 John Wilander 2019-01-29 11:22:00 PST
Thanks, Dan! I fixed the remaining issues.

The WebKit Style Guide says the following on names (https://webkit.org/code-style-guidelines/#names):

"Use CamelCase. Capitalize the first letter, including all letters in an acronym, in a class, struct, protocol, or namespace name. Lower-case the first letter, including all letters in an acronym, in a variable or function name."

I guess URL is an exception to this rule.
Comment 18 WebKit Commit Bot 2019-01-29 11:54:38 PST
Comment on attachment 360480 [details]
Patch for landing

Clearing flags on attachment: 360480

Committed r240667: <https://trac.webkit.org/changeset/240667>
Comment 19 WebKit Commit Bot 2019-01-29 11:54:40 PST
All reviewed patches have been landed.  Closing bug.