Bug 211637 - SHOULD NEVER BE REACHED in WebKit::ResourceLoadStatisticsDatabaseStore::openAndUpdateSchemaIfNecessary
Summary: SHOULD NEVER BE REACHED in WebKit::ResourceLoadStatisticsDatabaseStore::openA...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Kate Cheney
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2020-05-08 12:18 PDT by Ryan Haddad
Modified: 2020-05-19 11:48 PDT (History)
8 users (show)

See Also:


Attachments
Patch (118.32 KB, patch)
2020-05-12 11:40 PDT, Kate Cheney
no flags Details | Formatted Diff | Diff
Patch (118.41 KB, patch)
2020-05-12 17:10 PDT, Kate Cheney
no flags Details | Formatted Diff | Diff
Patch (118.36 KB, patch)
2020-05-13 08:27 PDT, Kate Cheney
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ryan Haddad 2020-05-08 12:18:13 PDT
Even after https://trac.webkit.org/changeset/261146/webkit, we are still seeing assertion failures during iOS Debug test runs:

ERROR: SQLite database failed to set journal_mode to WAL, error: database is locked
./platform/sql/SQLiteDatabase.cpp(208) : void WebCore::SQLiteDatabase::useWALJournalMode()
ERROR: SQLite database failed to checkpoint: database is locked
./platform/sql/SQLiteDatabase.cpp(218) : void WebCore::SQLiteDatabase::useWALJournalMode()
ERROR: Unable to prepare statement to fetch schema for the ObservedDomains table.
/Volumes/Data/slave/ios-simulator-13-debug/build/Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp(514) : void WebKit::ResourceLoadStatisticsDatabaseStore::openAndUpdateSchemaIfNecessary()
SHOULD NEVER BE REACHED
/Volumes/Data/slave/ios-simulator-13-debug/build/Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp(515) : void WebKit::ResourceLoadStatisticsDatabaseStore::openAndUpdateSchemaIfNecessary()
1   0x1189d4b89 WTFCrash
2   0x103b5584b WTFCrashWithInfo(int, char const*, char const*, int)
3   0x104283eff WebKit::ResourceLoadStatisticsDatabaseStore::openAndUpdateSchemaIfNecessary()
4   0x104283a86 WebKit::ResourceLoadStatisticsDatabaseStore::ResourceLoadStatisticsDatabaseStore(WebKit::WebResourceLoadStatisticsStore&, WTF::WorkQueue&, WebKit::ShouldIncludeLocalhost, WTF::String const&, PAL::SessionID)
5   0x104284943 WebKit::ResourceLoadStatisticsDatabaseStore::ResourceLoadStatisticsDatabaseStore(WebKit::WebResourceLoadStatisticsStore&, WTF::WorkQueue&, WebKit::ShouldIncludeLocalhost, WTF::String const&, PAL::SessionID)
6   0x104335cec std::__1::__unique_if<WebKit::ResourceLoadStatisticsDatabaseStore>::__unique_single std::__1::make_unique<WebKit::ResourceLoadStatisticsDatabaseStore, WebKit::WebResourceLoadStatisticsStore&, WTF::Ref<WTF::WorkQueue, WTF::DumbPtrTraits<WTF::WorkQueue> >&, WebKit::ShouldIncludeLocalhost const&, WTF::String const&, PAL::SessionID const&>(WebKit::WebResourceLoadStatisticsStore&, WTF::Ref<WTF::WorkQueue, WTF::DumbPtrTraits<WTF::WorkQueue> >&, WebKit::ShouldIncludeLocalhost const&, WTF::String const&, PAL::SessionID const&)
7   0x1043358d3 decltype(auto) WTF::makeUnique<WebKit::ResourceLoadStatisticsDatabaseStore, WebKit::WebResourceLoadStatisticsStore&, WTF::Ref<WTF::WorkQueue, WTF::DumbPtrTraits<WTF::WorkQueue> >&, WebKit::ShouldIncludeLocalhost const&, WTF::String const&, PAL::SessionID const&>(WebKit::WebResourceLoadStatisticsStore&, WTF::Ref<WTF::WorkQueue, WTF::DumbPtrTraits<WTF::WorkQueue> >&, WebKit::ShouldIncludeLocalhost const&, WTF::String const&, PAL::SessionID const&)
8   0x104335626 WebKit::WebResourceLoadStatisticsStore::WebResourceLoadStatisticsStore(WebKit::NetworkSession&, WTF::String const&, WebKit::ShouldIncludeLocalhost, WebCore::ResourceLoadStatistics::IsEphemeral)::$_42::operator()() const
9   0x1043354de WTF::Detail::CallableWrapper<WebKit::WebResourceLoadStatisticsStore::WebResourceLoadStatisticsStore(WebKit::NetworkSession&, WTF::String const&, WebKit::ShouldIncludeLocalhost, WebCore::ResourceLoadStatistics::IsEphemeral)::$_42, void>::call()
10  0x103c48452 WTF::Function<void ()>::operator()() const
11  0x10430bd1e WebKit::WebResourceLoadStatisticsStore::postTask(WTF::Function<void ()>&&)::'lambda'()::operator()() const
12  0x10430bc4e WTF::Detail::CallableWrapper<WebKit::WebResourceLoadStatisticsStore::postTask(WTF::Function<void ()>&&)::'lambda'(), void>::call()
13  0x118a00072 WTF::Function<void ()>::operator()() const
14  0x118b0ae2e WTF::WorkQueue::dispatch(WTF::Function<void ()>&&)::$_0::operator()() const
15  0x118b0b022 WTF::BlockPtr<void ()> WTF::BlockPtr<void ()>::fromCallable<WTF::WorkQueue::dispatch(WTF::Function<void ()>&&)::$_0>(WTF::WorkQueue::dispatch(WTF::Function<void ()>&&)::$_0)::'lambda'(void*)::operator()(void*) const
16  0x118b0aff5 WTF::BlockPtr<void ()> WTF::BlockPtr<void ()>::fromCallable<WTF::WorkQueue::dispatch(WTF::Function<void ()>&&)::$_0>(WTF::WorkQueue::dispatch(WTF::Function<void ()>&&)::$_0)::'lambda'(void*)::__invoke(void*)
17  0x111a8f951 _dispatch_call_block_and_release
18  0x111a908cb _dispatch_client_callout
19  0x111a9660c _dispatch_lane_serial_drain
20  0x111a97044 _dispatch_lane_invoke
21  0x111aa10c4 _dispatch_workloop_worker_thread
22  0x111f17a3d _pthread_wqthread
23  0x111f16b77 start_wqthread

https://build.webkit.org/results/Apple%20iOS%2013%20Simulator%20Debug%20WK2%20(Tests)/r261385%20(3583)/results.html
Comment 1 Radar WebKit Bug Importer 2020-05-08 12:18:44 PDT
<rdar://problem/63031185>
Comment 2 Sihui Liu 2020-05-08 19:15:25 PDT
From the error log, "ERROR: SQLite database failed to set journal_mode to WAL, error: database is locked", it looks like there are conflicted transactions on the same database connection (https://www.sqlite.org/rescode.html#locked). 

As this is regression from r260791(?), it's possible a database connection is not closed properly, e.g. a connection will not be closed if there are unfinalized prepared statements(https://www.sqlite.org/c3ref/close.html).

Looking at r260791, it seems NetworkSession::recreateResourceLoadStatisticStore can be called consecutively, and m_resourceLoadStatistics will be set before task dispatched in WebResourceLoadStatisticsStore::flushAndDestroyPersistentStore is done. Not sure if this is an issue. Maybe verify by waiting in NetworkSession::recreateResourceLoadStatisticStore until the completion handler is called to see if this fixes the issue.
Comment 3 Sihui Liu 2020-05-08 19:22:25 PDT
It may be helpful to add some logging about error code/message around the assertion.
Comment 4 Kate Cheney 2020-05-11 09:05:36 PDT
(In reply to Sihui Liu from comment #2)

Thanks for taking a look at this.

> From the error log, "ERROR: SQLite database failed to set journal_mode to
> WAL, error: database is locked", it looks like there are conflicted
> transactions on the same database connection
> (https://www.sqlite.org/rescode.html#locked). 
> 
> As this is regression from r260791(?), it's possible a database connection
> is not closed properly, e.g. a connection will not be closed if there are
> unfinalized prepared statements(https://www.sqlite.org/c3ref/close.html).
> 

This could definitely be related. The statements are all prepared in ResourceLoadStatisticsDatabaseStore::prepareStatements(), but the unused ones are not finalized before closing.

> Looking at r260791, it seems
> NetworkSession::recreateResourceLoadStatisticStore can be called
> consecutively, and m_resourceLoadStatistics will be set before task
> dispatched in WebResourceLoadStatisticsStore::flushAndDestroyPersistentStore
> is done. Not sure if this is an issue. Maybe verify by waiting in
> NetworkSession::recreateResourceLoadStatisticStore until the completion
> handler is called to see if this fixes the issue.

Isn't this already occurring? m_resourceLoadStatistics is set in the completion handler of destroyResourceLoadStatistics, which gets passed to WebResourceLoadStatisticsStore::flushAndDestroyPersistentStore(), so it won't be set until that has finished.
Comment 5 Kate Cheney 2020-05-12 11:40:26 PDT
Created attachment 399150 [details]
Patch
Comment 6 Kate Cheney 2020-05-12 11:41:48 PDT
Still trying to reproduce to confirm this patch will fix the crashes, letting EWS run in the meantime.
Comment 7 Kate Cheney 2020-05-12 17:10:25 PDT
Created attachment 399206 [details]
Patch
Comment 8 Kate Cheney 2020-05-12 17:11:24 PDT
Still can't reproduce this crash, but this fix makes sense from a performance standpoint (we should not be recreating a memory store between two database tests).
Comment 9 Kate Cheney 2020-05-12 17:16:26 PDT
(In reply to katherine_cheney from comment #8)
> Still can't reproduce this crash, but this fix makes sense from a
> performance standpoint (we should not be recreating a memory store between
> two database tests).

And, after talking with Sihui, I think it will likely fix the crash.
Comment 10 Kate Cheney 2020-05-13 08:27:55 PDT
Created attachment 399263 [details]
Patch
Comment 11 John Wilander 2020-05-13 13:23:51 PDT
Is testRunner.setUseITPDatabase() blocking so that the next statement after it guarantees that ITP is set up accordingly?
Comment 12 Alex Christensen 2020-05-13 13:44:48 PDT
It uses WKWebsiteDataStoreSetUseITPDatabase which is waited for.  The actual implementation might not be waiting for everything it needs to, but waiting is happening.
Comment 13 Kate Cheney 2020-05-13 14:46:15 PDT
(In reply to Alex Christensen from comment #12)
> It uses WKWebsiteDataStoreSetUseITPDatabase which is waited for.  The actual
> implementation might not be waiting for everything it needs to, but waiting
> is happening.

The goal of the patch is to make sure recreateResourceLoadStatisticsStore only gets called once. I don't think the call to setUseITPDatabase() waits for statisticsResetToConsisistentState() to be finished.
Comment 14 EWS 2020-05-13 15:00:25 PDT
Committed r261659: <https://trac.webkit.org/changeset/261659>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 399263 [details].
Comment 15 Chris Dumez 2020-05-19 10:57:00 PDT
Comment on attachment 399263 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=399263&action=review

> Source/WebKit/UIProcess/API/C/WKWebsiteDataStoreRef.cpp:-663
> -    store.setUseITPDatabase(false, [callbackAggregator = callbackAggregator.copyRef()] { });

This change is wrong and is leading to flakiness in the tests... Now I have tests that give different output whether or not the test that runs right before it called testRunner.setUseITPDatabase(true) or not.

TestController::resetState() needs to resets ALL state. It is NEVER OK to rely on tests to reset state and has historically always led to flakiness. Please fix.
Comment 16 Kate Cheney 2020-05-19 11:44:10 PDT
(In reply to Chris Dumez from comment #15)
> Comment on attachment 399263 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=399263&action=review
> 
> > Source/WebKit/UIProcess/API/C/WKWebsiteDataStoreRef.cpp:-663
> > -    store.setUseITPDatabase(false, [callbackAggregator = callbackAggregator.copyRef()] { });
> 
> This change is wrong and is leading to flakiness in the tests... Now I have
> tests that give different output whether or not the test that runs right
> before it called testRunner.setUseITPDatabase(true) or not.
> 
> TestController::resetState() needs to resets ALL state. It is NEVER OK to
> rely on tests to reset state and has historically always led to flakiness.
> Please fix.

fix: https://bugs.webkit.org/show_bug.cgi?id=212094

Interesting, do you know why the flakiness was happening based on testRunner.setUseITPDatabase(true)? I thought the original implementation seemed more likely to cause flakiness if two database tests were occurring back to back. With the reset-to-memory-store implementation, this would destruct a database store, make a memory store, destruct the memory store, then make a new database store, which did not seem right, and seemed to be contributing to database locking.
Comment 17 Chris Dumez 2020-05-19 11:48:26 PDT
(In reply to katherine_cheney from comment #16)
> (In reply to Chris Dumez from comment #15)
> > Comment on attachment 399263 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=399263&action=review
> > 
> > > Source/WebKit/UIProcess/API/C/WKWebsiteDataStoreRef.cpp:-663
> > > -    store.setUseITPDatabase(false, [callbackAggregator = callbackAggregator.copyRef()] { });
> > 
> > This change is wrong and is leading to flakiness in the tests... Now I have
> > tests that give different output whether or not the test that runs right
> > before it called testRunner.setUseITPDatabase(true) or not.
> > 
> > TestController::resetState() needs to resets ALL state. It is NEVER OK to
> > rely on tests to reset state and has historically always led to flakiness.
> > Please fix.
> 
> fix: https://bugs.webkit.org/show_bug.cgi?id=212094
> 
> Interesting, do you know why the flakiness was happening based on
> testRunner.setUseITPDatabase(true)? I thought the original implementation
> seemed more likely to cause flakiness if two database tests were occurring
> back to back. With the reset-to-memory-store implementation, this would
> destruct a database store, make a memory store, destruct the memory store,
> then make a new database store, which did not seem right, and seemed to be
> contributing to database locking.

Well, yes I know. The flakiness is happening because the previous test is setting the ITP database to true and the new test is not setting it to false. I know you tried in this patch to update tests to set the ITP database to false but it looks like some were missed. In any case, it is terrible practice to rely on tests to do clean up at the end and will always lead to flakiness, one way or another. TestController::resetState() is the central way of resetting state between tests to avoid flakiness.