Bug 222248 - Non-cookie website data not deleted after 7 days of browser use without user interaction
Summary: Non-cookie website data not deleted after 7 days of browser use without user ...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit Misc. (show other bugs)
Version: Safari 14
Hardware: All All
: P2 Normal
Assignee: John Wilander
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2021-02-21 07:05 PST by Antoine Bourlon
Modified: 2021-02-26 14:26 PST (History)
4 users (show)

See Also:


Attachments
Patch (86.21 KB, patch)
2021-02-25 17:56 PST, John Wilander
no flags Details | Formatted Diff | Diff
Patch (86.54 KB, patch)
2021-02-26 10:37 PST, John Wilander
no flags Details | Formatted Diff | Diff
Patch (86.79 KB, patch)
2021-02-26 11:22 PST, John Wilander
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Antoine Bourlon 2021-02-21 07:05:06 PST
ITP is supposed to delete non-cookie data of a website after 7 days of browser use without user interaction with the website, as described in https://bugs.webkit.org/show_bug.cgi?id=204779 and https://webkit.org/blog/10218/full-third-party-cookie-blocking-and-more/. However we couldn't reproduce this expected behavior in our real-life tests on iOS 14 and macOS Safari 14. Only cookies (set with document.cookie) are deleted, not LocalStorage.

Steps to Reproduce: 

1) Open a page with Safari from domain A that adds some data to LocalStorage. Tap somewhere on the page (to register a user interaction) and close the tab.

2) Open Safari every day for the next 7 days.

3) On day 8 open a page (different than step 1) from domain A that doesn't write to LocalStorage and check that data stored in step 1 is still there. Do not tap or input anything on the page.

4) On day 9 and 10 retry step 3 just in case :)

Additional Information:

After some debugging on a local build of WebKit with MiniBrowser, it seems to be related to the condition below in hasStatisticsExpired(). I don't understand why mostRecentUserInteractionTime is compared to m_leastRecentOperatingDate (30 operating days ago IIUC) instead of "7 operating days ago", even when hasStatisticsExpired() is called from hasHadUnexpiredRecentUserInteraction() (shouldRemoveAllButCookiesFor()):

>    if (m_operatingDatesSize >= operatingDatesWindowInDays) {
>        if (OperatingDate::fromWallTime(mostRecentUserInteractionTime) < m_leastRecentOperatingDate)
>            return true;
>    }
Comment 1 John Wilander 2021-02-21 09:02:09 PST
Thanks for filing! We’ll have a look.
Comment 2 Radar WebKit Bug Importer 2021-02-22 13:58:39 PST
<rdar://problem/74612589>
Comment 3 John Wilander 2021-02-25 17:56:40 PST
Created attachment 421591 [details]
Patch
Comment 4 John Wilander 2021-02-25 19:45:54 PST
Failing API test is unrelated.
Comment 5 Kate Cheney 2021-02-26 09:52:45 PST
Comment on attachment 421591 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=421591&action=review

I think this looks good. I think m_leastRecentOperatingDate and related logic should be removed before landing, and I think it would be better to have a single statement to get operating dates windows to avoid duplication. Everything else is just nits. Thanks for fixing this!

> Source/WebKit/ChangeLog:28
> +            Test infrastructure. The same changes as in the DB store.

This function can probably be removed. We don't run tests with the memory store anymore.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:2805
> +    SQLiteStatement getLongWindowOperatingDateStatement(m_database, "SELECT * FROM OperatingDates ORDER BY year DESC, month DESC, monthDay DESC LIMIT 1 OFFSET ?;"_s);

I think this should be one statement, maybe getOperatingDateForWindowStatement. Then you can assign it to variables with different names and different bind parameters.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:2831
> +            RELEASE_LOG_ERROR_IF_ALLOWED(m_sessionID, "%p - ResourceLoadStatisticsDatabaseStore::updateOperatingDatesParameters getShortWindowOperatingDateStatement failed to step, error message: %{private}s", this, m_database.lastErrorMsg());

Nit, the error message would probably read better as "getShortWindowOperatingDateStatement failed, error ..." because the failure could be in the prepare(), bind() or step().

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:2841
> +        if (getLongWindowOperatingDateStatement.prepare() != SQLITE_OK

If you do end up using one statement for both, you will need to reset it before you bind again, using <statement-name>.reset().

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:2844
> +            RELEASE_LOG_ERROR_IF_ALLOWED(m_sessionID, "%p - ResourceLoadStatisticsDatabaseStore::updateOperatingDatesParameters getLongWindowOperatingDateStatement failed to step, error message: %{private}s", this, m_database.lastErrorMsg());

Ditto about the error message.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:-2887
> -        if (OperatingDate::fromWallTime(mostRecentUserInteractionTime) < m_leastRecentOperatingDate)

m_leastRecentOperatingDate is now unused and should be removed.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:2921
> +void ResourceLoadStatisticsDatabaseStore::insertExpiredStatisticForTesting(const RegistrableDomain& domain, unsigned numberOfOperatingDaysPassed, bool hasUserInteraction, bool isScheduledForAllButCookieDataRemoval, bool isPrevalent)

I wonder if isScheduledForAllButCookieDataRemoval should just be removed? It doesn't seem to be needed for testing, because it should get marked after a statistic is marked for website data removal.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:2925
> +    for (unsigned i = 1; i <= numberOfOperatingDaysPassed; i++) {

Nice, this is much better for testing.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:295
>      OperatingDate m_leastRecentOperatingDate;

Should be removed.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsMemoryStore.cpp:1140
> +void ResourceLoadStatisticsMemoryStore::insertExpiredStatisticForTesting(const RegistrableDomain& domain, unsigned numberOfOperatingDaysPassed, bool hasUserInteraction, bool isScheduledForAllButCookieDataRemoval, bool isPrevalent)

This function could be removed if you want.

> LayoutTests/ChangeLog:12
> +        A new test case makes sure website data is not deleted below the threshold.

Good idea to have this new test case.
Comment 6 John Wilander 2021-02-26 10:22:30 PST
(In reply to katherine_cheney from comment #5)
> Comment on attachment 421591 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=421591&action=review
> 
> I think this looks good. I think m_leastRecentOperatingDate and related
> logic should be removed before landing, and I think it would be better to
> have a single statement to get operating dates windows to avoid duplication.
> Everything else is just nits. Thanks for fixing this!
> 
> > Source/WebKit/ChangeLog:28
> > +            Test infrastructure. The same changes as in the DB store.
> 
> This function can probably be removed. We don't run tests with the memory
> store anymore.

It's a pure virtual function in WebKit::ResourceLoadStatisticsStore which means it has to be implemented in all concrete subclasses. We could make it a no-op but I don't think that's much better than just updating it. The whole memory store should be removed, right?

> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:2805
> > +    SQLiteStatement getLongWindowOperatingDateStatement(m_database, "SELECT * FROM OperatingDates ORDER BY year DESC, month DESC, monthDay DESC LIMIT 1 OFFSET ?;"_s);
> 
> I think this should be one statement, maybe
> getOperatingDateForWindowStatement. Then you can assign it to variables with
> different names and different bind parameters.

Obvious now that you point it out. The trap of incremental development.

> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:2831
> > +            RELEASE_LOG_ERROR_IF_ALLOWED(m_sessionID, "%p - ResourceLoadStatisticsDatabaseStore::updateOperatingDatesParameters getShortWindowOperatingDateStatement failed to step, error message: %{private}s", this, m_database.lastErrorMsg());
> 
> Nit, the error message would probably read better as
> "getShortWindowOperatingDateStatement failed, error ..." because the failure
> could be in the prepare(), bind() or step().

Will fix.

> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:2841
> > +        if (getLongWindowOperatingDateStatement.prepare() != SQLITE_OK
> 
> If you do end up using one statement for both, you will need to reset it
> before you bind again, using <statement-name>.reset().

Will fix.

> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:2844
> > +            RELEASE_LOG_ERROR_IF_ALLOWED(m_sessionID, "%p - ResourceLoadStatisticsDatabaseStore::updateOperatingDatesParameters getLongWindowOperatingDateStatement failed to step, error message: %{private}s", this, m_database.lastErrorMsg());
> 
> Ditto about the error message.

Will fix.

> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:-2887
> > -        if (OperatingDate::fromWallTime(mostRecentUserInteractionTime) < m_leastRecentOperatingDate)
> 
> m_leastRecentOperatingDate is now unused and should be removed.

Will check and remove if truly unused.

> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:2921
> > +void ResourceLoadStatisticsDatabaseStore::insertExpiredStatisticForTesting(const RegistrableDomain& domain, unsigned numberOfOperatingDaysPassed, bool hasUserInteraction, bool isScheduledForAllButCookieDataRemoval, bool isPrevalent)
> 
> I wonder if isScheduledForAllButCookieDataRemoval should just be removed? It
> doesn't seem to be needed for testing, because it should get marked after a
> statistic is marked for website data removal.

Afaik, isScheduledForAllButCookieDataRemoval is still used for when potential link decoration is detected and so having a way to test with that setting separately is probably good even though it was used for the wrong test earlier.

> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:2925
> > +    for (unsigned i = 1; i <= numberOfOperatingDaysPassed; i++) {
> 
> Nice, this is much better for testing.
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:295
> >      OperatingDate m_leastRecentOperatingDate;
> 
> Should be removed.

Will fix.

> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsMemoryStore.cpp:1140
> > +void ResourceLoadStatisticsMemoryStore::insertExpiredStatisticForTesting(const RegistrableDomain& domain, unsigned numberOfOperatingDaysPassed, bool hasUserInteraction, bool isScheduledForAllButCookieDataRemoval, bool isPrevalent)
> 
> This function could be removed if you want.

See comment above.

> > LayoutTests/ChangeLog:12
> > +        A new test case makes sure website data is not deleted below the threshold.
> 
> Good idea to have this new test case.

Thanks, Kate! Will upload a new patch shortly.
Comment 7 Kate Cheney 2021-02-26 10:27:09 PST
Comment on attachment 421591 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=421591&action=review

>>> Source/WebKit/ChangeLog:28
>>> +            Test infrastructure. The same changes as in the DB store.
>> 
>> This function can probably be removed. We don't run tests with the memory store anymore.
> 
> It's a pure virtual function in WebKit::ResourceLoadStatisticsStore which means it has to be implemented in all concrete subclasses. We could make it a no-op but I don't think that's much better than just updating it. The whole memory store should be removed, right?

Ah, you're right. I forgot I had made it virtual. And yes, rdar://57878709 is tracking the removal, I've been meaning to get around to it :)
Comment 8 John Wilander 2021-02-26 10:37:21 PST
Created attachment 421675 [details]
Patch
Comment 9 Brent Fulgham 2021-02-26 10:44:56 PST
Comment on attachment 421675 [details]
Patch

r=me
Comment 10 John Wilander 2021-02-26 10:51:01 PST
Thanks, Kate and Brent! The previous patch was all green and the changes I made should not affect its quality. I'll wait for a reasonable amount of EWS feedback before putting it on the queue.
Comment 11 John Wilander 2021-02-26 11:22:58 PST
Created attachment 421680 [details]
Patch
Comment 12 EWS 2021-02-26 14:26:40 PST
Committed r273590: <https://commits.webkit.org/r273590>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 421680 [details].