A and B should be classified as prevalent if domain A has redirected to domain B which in turn has redirected to prevalent resource domain C.
rdar://problem/37372572
Created attachment 333536 [details] Patch
Created attachment 333539 [details] Patch
Comment on attachment 333539 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=333539&action=review Very nice! I asked you to make a few adjustments to calm my fears, but otherwise this looks good. > Source/WebCore/loader/ResourceLoadObserver.cpp:167 > + auto& targetStatistics = ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain); I don't see why this line had to be moved. It makes me nervous, because we have had many threading/memory corruption bugs caused by invalidated iterators, and I'd prefer to limit the scope of these references to the smaller area possible. It doesn't seem like this is referenced again in your new code below, so I don't think you should make this change. Or did you do this to make sure there is an existing entry for the top frame in the redirect code below? If so, I propose you do so at the point you need it, and add the same comment you do elsewhere ("For consistency, make sure we also have a statistics entry for the top frame domain."). I would also give it the name "topFrameStatistics" in the redirect sections below. > Source/WebCore/loader/ResourceLoadObserver.cpp:234 > + auto& targetStatistics = ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain); Ditto my comments in the "logFrameNavigation" code. > Source/WebKit/UIProcess/WebResourceLoadStatisticsStore.cpp:488 > +{ Please add "ASSERT(!RunLoop::isMain());" here.
(In reply to Brent Fulgham from comment #4) > Comment on attachment 333539 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=333539&action=review > > Very nice! I asked you to make a few adjustments to calm my fears, but > otherwise this looks good. > > > Source/WebCore/loader/ResourceLoadObserver.cpp:167 > > + auto& targetStatistics = ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain); > > I don't see why this line had to be moved. It makes me nervous, because we > have had many threading/memory corruption bugs caused by invalidated > iterators, and I'd prefer to limit the scope of these references to the > smaller area possible. It doesn't seem like this is referenced again in your > new code below, so I don't think you should make this change. I see. However, I do use targetStatistics in two more places now, which is why I moved it. Please see new lines 182 and 185 where I now add topFrameUniqueRedirectsFrom and subresourceUniqueRedirectsFrom respectively. Previously we only added statistics to the redirecting domain but now we need to keep track of both directions for the purpose of traversing the redirect graph backwards. Please advice. > Or did you do this to make sure there is an existing entry for the top frame > in the redirect code below? If so, I propose you do so at the point you need > it, and add the same comment you do elsewhere ("For consistency, make sure > we also have a statistics entry for the top frame domain."). I would also > give it the name "topFrameStatistics" in the redirect sections below. > > > Source/WebCore/loader/ResourceLoadObserver.cpp:234 > > + auto& targetStatistics = ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain); > > Ditto my comments in the "logFrameNavigation" code. Ditto reply. Please advice. > > Source/WebKit/UIProcess/WebResourceLoadStatisticsStore.cpp:488 > > +{ > > Please add "ASSERT(!RunLoop::isMain());" here. Will do.
Comment on attachment 333539 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=333539&action=review Now that I understand better, this really was introducing a nasty bug. I'm changing to r- for you to fix. :-) >>> Source/WebCore/loader/ResourceLoadObserver.cpp:167 >>> + auto& targetStatistics = ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain); >> >> I don't see why this line had to be moved. It makes me nervous, because we have had many threading/memory corruption bugs caused by invalidated iterators, and I'd prefer to limit the scope of these references to the smaller area possible. It doesn't seem like this is referenced again in your new code below, so I don't think you should make this change. >> >> Or did you do this to make sure there is an existing entry for the top frame in the redirect code below? If so, I propose you do so at the point you need it, and add the same comment you do elsewhere ("For consistency, make sure we also have a statistics entry for the top frame domain."). I would also give it the name "topFrameStatistics" in the redirect sections below. > > I see. However, I do use targetStatistics in two more places now, which is why I moved it. Please see new lines 182 and 185 where I now add topFrameUniqueRedirectsFrom and subresourceUniqueRedirectsFrom respectively. > > Previously we only added statistics to the redirecting domain but now we need to keep track of both directions for the purpose of traversing the redirect graph backwards. > > Please advice. Ah! I missed that. This is dangerous, because both calls to "ensureResourceStatisticsForPrimaryDomain" potentially insert content into the HashSet, invalidating the reference you obtained earlier. I think in the redirect case you need to change to retrieving a copy of the content, then setting it back in the HashSet to avoid creating another set of difficult-to-debug crashes. So: 1. Leave the original targetStatistics code on line 172 alone. That one is fine. 2. In the redirect section, do one of these two things: (a) Assign the return value of the 'ensureResourceStatisticsForPrimaryDomain" to a temporary value (a copy), and then set the value back in the m_resourceStatisticsMap, -or- (b) Update the 'redirectingOriginStatistics" as you have done, then do a call to "ensureResourceStatistics..." to a local reference and then update the 'targetStatistics' value. If you don't the 'targetStatistics' object will be potentially deallocated memory and your use of it will bring about the end of the world. Or at least, will cause some nasty security bugs. > Source/WebCore/loader/ResourceLoadObserver.cpp:181 > + isNewRedirectToEntry = redirectingOriginStatistics.topFrameUniqueRedirectsTo.add(targetPrimaryDomain).isNewEntry; So I suggest that immediately after line 181 you do: auto& targetStatistics = ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain); > Source/WebCore/loader/ResourceLoadObserver.cpp:184 > + isNewRedirectToEntry = redirectingOriginStatistics.subresourceUniqueRedirectsTo.add(targetPrimaryDomain).isNewEntry; Ditto, right here after line 184: auto& targetStatistics = ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain);
(In reply to Brent Fulgham from comment #6) > Comment on attachment 333539 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=333539&action=review > > Now that I understand better, this really was introducing a nasty bug. I'm > changing to r- for you to fix. :-) > > >>> Source/WebCore/loader/ResourceLoadObserver.cpp:167 > >>> + auto& targetStatistics = ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain); > >> > >> I don't see why this line had to be moved. It makes me nervous, because we have had many threading/memory corruption bugs caused by invalidated iterators, and I'd prefer to limit the scope of these references to the smaller area possible. It doesn't seem like this is referenced again in your new code below, so I don't think you should make this change. > >> > >> Or did you do this to make sure there is an existing entry for the top frame in the redirect code below? If so, I propose you do so at the point you need it, and add the same comment you do elsewhere ("For consistency, make sure we also have a statistics entry for the top frame domain."). I would also give it the name "topFrameStatistics" in the redirect sections below. > > > > I see. However, I do use targetStatistics in two more places now, which is why I moved it. Please see new lines 182 and 185 where I now add topFrameUniqueRedirectsFrom and subresourceUniqueRedirectsFrom respectively. > > > > Previously we only added statistics to the redirecting domain but now we need to keep track of both directions for the purpose of traversing the redirect graph backwards. > > > > Please advice. > > Ah! I missed that. > > This is dangerous, because both calls to > "ensureResourceStatisticsForPrimaryDomain" potentially insert content into > the HashSet, invalidating the reference you obtained earlier. > > I think in the redirect case you need to change to retrieving a copy of the > content, then setting it back in the HashSet to avoid creating another set > of difficult-to-debug crashes. > > So: > > 1. Leave the original targetStatistics code on line 172 alone. That one is > fine. > 2. In the redirect section, do one of these two things: > > (a) Assign the return value of the > 'ensureResourceStatisticsForPrimaryDomain" to a temporary value (a copy), > and then set the value back in the m_resourceStatisticsMap, > -or- > (b) Update the 'redirectingOriginStatistics" as you have done, then do a > call to "ensureResourceStatistics..." to a local reference and then update > the 'targetStatistics' value. > > If you don't the 'targetStatistics' object will be potentially deallocated > memory and your use of it will bring about the end of the world. Or at > least, will cause some nasty security bugs. > > > Source/WebCore/loader/ResourceLoadObserver.cpp:181 > > + isNewRedirectToEntry = redirectingOriginStatistics.topFrameUniqueRedirectsTo.add(targetPrimaryDomain).isNewEntry; > > So I suggest that immediately after line 181 you do: > > auto& targetStatistics = > ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain); > > > Source/WebCore/loader/ResourceLoadObserver.cpp:184 > > + isNewRedirectToEntry = redirectingOriginStatistics.subresourceUniqueRedirectsTo.add(targetPrimaryDomain).isNewEntry; > > Ditto, right here after line 184: > > auto& targetStatistics = > ensureResourceStatisticsForPrimaryDomain(targetPrimaryDomain); Thanks, Brent! I will do these changes and get it up here for you to look at again.
Created attachment 333626 [details] Patch
Created attachment 333628 [details] Patch
Comment on attachment 333628 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=333628&action=review Looks great! r=me (but please fix the boolean initialization stuff I mentioned). > Source/WebCore/loader/ResourceLoadObserver.cpp:178 > + bool isNewRedirectFromEntry; I would initialize these to false, just because I am pedantic that way. > Source/WebCore/loader/ResourceLoadObserver.cpp:247 > + bool isNewRedirectFromEntry; Don't declare these up here. Just declare them at initialization. > Source/WebCore/loader/ResourceLoadObserver.cpp:249 > + isNewRedirectToEntry = redirectingOriginStatistics.subresourceUniqueRedirectsTo.add(targetPrimaryDomain).isNewEntry; bool isNewRedirectToEntry = redirectingOrigin.... > Source/WebCore/loader/ResourceLoadObserver.cpp:251 > + isNewRedirectFromEntry = targetStatistics.subresourceUniqueRedirectsFrom.add(sourcePrimaryDomain).isNewEntry; bool isNewRedirectFromEntry = targetStatistics.subresource...
Created attachment 333669 [details] Patch for landing
Thanks for the review, Brent!
Comment on attachment 333669 [details] Patch for landing Tree seems red. Landing directly instead.
Committed r228416: <https://trac.webkit.org/changeset/228416>
Can I ask: what is a prevalent resource?
A prevalent resource is a domain that has shown its ability to track the user cross-site. Some simply call them trackers but since we don’t know what they do with their ability to track, we just call them prevalent.
Everything is much less confusing now, thanks!