WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 182664
Resource Load Statistics: Classify resources as prevalent based on redirects to other prevalent resources
https://bugs.webkit.org/show_bug.cgi?id=182664
Summary
Resource Load Statistics: Classify resources as prevalent based on redirects ...
John Wilander
Reported
2018-02-09 16:33:21 PST
A and B should be classified as prevalent if domain A has redirected to domain B which in turn has redirected to prevalent resource domain C.
Attachments
Patch
(58.33 KB, patch)
2018-02-09 17:03 PST
,
John Wilander
no flags
Details
Formatted Diff
Diff
Patch
(59.63 KB, patch)
2018-02-09 17:12 PST
,
John Wilander
no flags
Details
Formatted Diff
Diff
Patch
(58.82 KB, patch)
2018-02-12 13:40 PST
,
John Wilander
no flags
Details
Formatted Diff
Diff
Patch
(58.69 KB, patch)
2018-02-12 13:48 PST
,
John Wilander
no flags
Details
Formatted Diff
Diff
Patch for landing
(58.63 KB, patch)
2018-02-12 20:59 PST
,
John Wilander
no flags
Details
Formatted Diff
Diff
Show Obsolete
(4)
View All
Add attachment
proposed patch, testcase, etc.
John Wilander
Comment 1
2018-02-09 16:33:31 PST
rdar://problem/37372572
John Wilander
Comment 2
2018-02-09 17:03:36 PST
Created
attachment 333536
[details]
Patch
John Wilander
Comment 3
2018-02-09 17:12:42 PST
Created
attachment 333539
[details]
Patch
Brent Fulgham
Comment 4
2018-02-12 10:28:43 PST
Comment on
attachment 333539
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=333539&action=review
Very nice! I asked you to make a few adjustments to calm my fears, but otherwise this looks good.
> Source/WebCore/loader/ResourceLoadObserver.cpp:167 > + auto& targetStatistics = ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain);
I don't see why this line had to be moved. It makes me nervous, because we have had many threading/memory corruption bugs caused by invalidated iterators, and I'd prefer to limit the scope of these references to the smaller area possible. It doesn't seem like this is referenced again in your new code below, so I don't think you should make this change. Or did you do this to make sure there is an existing entry for the top frame in the redirect code below? If so, I propose you do so at the point you need it, and add the same comment you do elsewhere ("For consistency, make sure we also have a statistics entry for the top frame domain."). I would also give it the name "topFrameStatistics" in the redirect sections below.
> Source/WebCore/loader/ResourceLoadObserver.cpp:234 > + auto& targetStatistics = ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain);
Ditto my comments in the "logFrameNavigation" code.
> Source/WebKit/UIProcess/WebResourceLoadStatisticsStore.cpp:488 > +{
Please add "ASSERT(!RunLoop::isMain());" here.
John Wilander
Comment 5
2018-02-12 10:40:35 PST
(In reply to Brent Fulgham from
comment #4
)
> Comment on
attachment 333539
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=333539&action=review
> > Very nice! I asked you to make a few adjustments to calm my fears, but > otherwise this looks good. > > > Source/WebCore/loader/ResourceLoadObserver.cpp:167 > > + auto& targetStatistics = ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain); > > I don't see why this line had to be moved. It makes me nervous, because we > have had many threading/memory corruption bugs caused by invalidated > iterators, and I'd prefer to limit the scope of these references to the > smaller area possible. It doesn't seem like this is referenced again in your > new code below, so I don't think you should make this change.
I see. However, I do use targetStatistics in two more places now, which is why I moved it. Please see new lines 182 and 185 where I now add topFrameUniqueRedirectsFrom and subresourceUniqueRedirectsFrom respectively. Previously we only added statistics to the redirecting domain but now we need to keep track of both directions for the purpose of traversing the redirect graph backwards. Please advice.
> Or did you do this to make sure there is an existing entry for the top frame > in the redirect code below? If so, I propose you do so at the point you need > it, and add the same comment you do elsewhere ("For consistency, make sure > we also have a statistics entry for the top frame domain."). I would also > give it the name "topFrameStatistics" in the redirect sections below. > > > Source/WebCore/loader/ResourceLoadObserver.cpp:234 > > + auto& targetStatistics = ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain); > > Ditto my comments in the "logFrameNavigation" code.
Ditto reply. Please advice.
> > Source/WebKit/UIProcess/WebResourceLoadStatisticsStore.cpp:488 > > +{ > > Please add "ASSERT(!RunLoop::isMain());" here.
Will do.
Brent Fulgham
Comment 6
2018-02-12 11:09:49 PST
Comment on
attachment 333539
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=333539&action=review
Now that I understand better, this really was introducing a nasty bug. I'm changing to r- for you to fix. :-)
>>> Source/WebCore/loader/ResourceLoadObserver.cpp:167 >>> + auto& targetStatistics = ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain); >> >> I don't see why this line had to be moved. It makes me nervous, because we have had many threading/memory corruption bugs caused by invalidated iterators, and I'd prefer to limit the scope of these references to the smaller area possible. It doesn't seem like this is referenced again in your new code below, so I don't think you should make this change. >> >> Or did you do this to make sure there is an existing entry for the top frame in the redirect code below? If so, I propose you do so at the point you need it, and add the same comment you do elsewhere ("For consistency, make sure we also have a statistics entry for the top frame domain."). I would also give it the name "topFrameStatistics" in the redirect sections below. > > I see. However, I do use targetStatistics in two more places now, which is why I moved it. Please see new lines 182 and 185 where I now add topFrameUniqueRedirectsFrom and subresourceUniqueRedirectsFrom respectively. > > Previously we only added statistics to the redirecting domain but now we need to keep track of both directions for the purpose of traversing the redirect graph backwards. > > Please advice.
Ah! I missed that. This is dangerous, because both calls to "ensureResourceStatisticsForPrimaryDomain" potentially insert content into the HashSet, invalidating the reference you obtained earlier. I think in the redirect case you need to change to retrieving a copy of the content, then setting it back in the HashSet to avoid creating another set of difficult-to-debug crashes. So: 1. Leave the original targetStatistics code on line 172 alone. That one is fine. 2. In the redirect section, do one of these two things: (a) Assign the return value of the 'ensureResourceStatisticsForPrimaryDomain" to a temporary value (a copy), and then set the value back in the m_resourceStatisticsMap, -or- (b) Update the 'redirectingOriginStatistics" as you have done, then do a call to "ensureResourceStatistics..." to a local reference and then update the 'targetStatistics' value. If you don't the 'targetStatistics' object will be potentially deallocated memory and your use of it will bring about the end of the world. Or at least, will cause some nasty security bugs.
> Source/WebCore/loader/ResourceLoadObserver.cpp:181 > + isNewRedirectToEntry = redirectingOriginStatistics.topFrameUniqueRedirectsTo.add(targetPrimaryDomain).isNewEntry;
So I suggest that immediately after line 181 you do: auto& targetStatistics = ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain);
> Source/WebCore/loader/ResourceLoadObserver.cpp:184 > + isNewRedirectToEntry = redirectingOriginStatistics.subresourceUniqueRedirectsTo.add(targetPrimaryDomain).isNewEntry;
Ditto, right here after line 184: auto& targetStatistics = ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain);
John Wilander
Comment 7
2018-02-12 11:13:33 PST
(In reply to Brent Fulgham from
comment #6
)
> Comment on
attachment 333539
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=333539&action=review
> > Now that I understand better, this really was introducing a nasty bug. I'm > changing to r- for you to fix. :-) > > >>> Source/WebCore/loader/ResourceLoadObserver.cpp:167 > >>> + auto& targetStatistics = ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain); > >> > >> I don't see why this line had to be moved. It makes me nervous, because we have had many threading/memory corruption bugs caused by invalidated iterators, and I'd prefer to limit the scope of these references to the smaller area possible. It doesn't seem like this is referenced again in your new code below, so I don't think you should make this change. > >> > >> Or did you do this to make sure there is an existing entry for the top frame in the redirect code below? If so, I propose you do so at the point you need it, and add the same comment you do elsewhere ("For consistency, make sure we also have a statistics entry for the top frame domain."). I would also give it the name "topFrameStatistics" in the redirect sections below. > > > > I see. However, I do use targetStatistics in two more places now, which is why I moved it. Please see new lines 182 and 185 where I now add topFrameUniqueRedirectsFrom and subresourceUniqueRedirectsFrom respectively. > > > > Previously we only added statistics to the redirecting domain but now we need to keep track of both directions for the purpose of traversing the redirect graph backwards. > > > > Please advice. > > Ah! I missed that. > > This is dangerous, because both calls to > "ensureResourceStatisticsForPrimaryDomain" potentially insert content into > the HashSet, invalidating the reference you obtained earlier. > > I think in the redirect case you need to change to retrieving a copy of the > content, then setting it back in the HashSet to avoid creating another set > of difficult-to-debug crashes. > > So: > > 1. Leave the original targetStatistics code on line 172 alone. That one is > fine. > 2. In the redirect section, do one of these two things: > > (a) Assign the return value of the > 'ensureResourceStatisticsForPrimaryDomain" to a temporary value (a copy), > and then set the value back in the m_resourceStatisticsMap, > -or- > (b) Update the 'redirectingOriginStatistics" as you have done, then do a > call to "ensureResourceStatistics..." to a local reference and then update > the 'targetStatistics' value. > > If you don't the 'targetStatistics' object will be potentially deallocated > memory and your use of it will bring about the end of the world. Or at > least, will cause some nasty security bugs. > > > Source/WebCore/loader/ResourceLoadObserver.cpp:181 > > + isNewRedirectToEntry = redirectingOriginStatistics.topFrameUniqueRedirectsTo.add(targetPrimaryDomain).isNewEntry; > > So I suggest that immediately after line 181 you do: > > auto& targetStatistics = > ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain); > > > Source/WebCore/loader/ResourceLoadObserver.cpp:184 > > + isNewRedirectToEntry = redirectingOriginStatistics.subresourceUniqueRedirectsTo.add(targetPrimaryDomain).isNewEntry; > > Ditto, right here after line 184: > > auto& targetStatistics = > ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain);
Thanks, Brent! I will do these changes and get it up here for you to look at again.
John Wilander
Comment 8
2018-02-12 13:40:14 PST
Created
attachment 333626
[details]
Patch
John Wilander
Comment 9
2018-02-12 13:48:34 PST
Created
attachment 333628
[details]
Patch
Brent Fulgham
Comment 10
2018-02-12 17:21:42 PST
Comment on
attachment 333628
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=333628&action=review
Looks great! r=me (but please fix the boolean initialization stuff I mentioned).
> Source/WebCore/loader/ResourceLoadObserver.cpp:178 > + bool isNewRedirectFromEntry;
I would initialize these to false, just because I am pedantic that way.
> Source/WebCore/loader/ResourceLoadObserver.cpp:247 > + bool isNewRedirectFromEntry;
Don't declare these up here. Just declare them at initialization.
> Source/WebCore/loader/ResourceLoadObserver.cpp:249 > + isNewRedirectToEntry = redirectingOriginStatistics.subresourceUniqueRedirectsTo.add(targetPrimaryDomain).isNewEntry;
bool isNewRedirectToEntry = redirectingOrigin....
> Source/WebCore/loader/ResourceLoadObserver.cpp:251 > + isNewRedirectFromEntry = targetStatistics.subresourceUniqueRedirectsFrom.add(sourcePrimaryDomain).isNewEntry;
bool isNewRedirectFromEntry = targetStatistics.subresource...
John Wilander
Comment 11
2018-02-12 20:59:57 PST
Created
attachment 333669
[details]
Patch for landing
John Wilander
Comment 12
2018-02-12 21:00:31 PST
Thanks for the review, Brent!
John Wilander
Comment 13
2018-02-12 22:07:59 PST
Comment on
attachment 333669
[details]
Patch for landing Tree seems red. Landing directly instead.
John Wilander
Comment 14
2018-02-12 22:08:20 PST
Committed
r228416
: <
https://trac.webkit.org/changeset/228416
>
Michael Catanzaro
Comment 15
2018-02-13 18:57:51 PST
Can I ask: what is a prevalent resource?
John Wilander
Comment 16
2018-02-13 20:18:54 PST
A prevalent resource is a domain that has shown its ability to track the user cross-site. Some simply call them trackers but since we don’t know what they do with their ability to track, we just call them prevalent.
Michael Catanzaro
Comment 17
2018-02-13 21:07:28 PST
Everything is much less confusing now, thanks!
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug