Bug 155120 - Reduce startup and shutdown cost of resource load statistics
Summary: Reduce startup and shutdown cost of resource load statistics
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit2 (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Brent Fulgham
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2016-03-07 10:20 PST by Brent Fulgham
Modified: 2016-03-08 09:40 PST (History)
9 users (show)

See Also:


Attachments
Patch (8.37 KB, patch)
2016-03-07 10:41 PST, Brent Fulgham
no flags Details | Formatted Diff | Diff
Patch (9.47 KB, patch)
2016-03-07 13:20 PST, Brent Fulgham
no flags Details | Formatted Diff | Diff
Patch (24.13 KB, patch)
2016-03-07 16:48 PST, Brent Fulgham
no flags Details | Formatted Diff | Diff
Patch (24.10 KB, patch)
2016-03-07 16:55 PST, Brent Fulgham
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews100 for mac-yosemite (909.80 KB, application/zip)
2016-03-07 17:32 PST, Build Bot
no flags Details
Patch (22.83 KB, patch)
2016-03-07 17:57 PST, Brent Fulgham
no flags Details | Formatted Diff | Diff
Patch (22.85 KB, patch)
2016-03-07 18:02 PST, Brent Fulgham
aestes: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Brent Fulgham 2016-03-07 10:20:29 PST
Avoid blocking the main thread when starting up (or shutting down) by loading any existing resource load statistics off of the main thread. Likewise, be keep the on-disk copy of the data up-to-date by writing an update when new data has been added to the set of statistics.
Comment 1 Radar WebKit Bug Importer 2016-03-07 10:33:45 PST
<rdar://problem/25010162>
Comment 2 Radar WebKit Bug Importer 2016-03-07 10:33:52 PST
<rdar://problem/25010167>
Comment 3 Brent Fulgham 2016-03-07 10:41:54 PST
Created attachment 273194 [details]
Patch
Comment 4 Brent Fulgham 2016-03-07 11:07:31 PST
Comment on attachment 273194 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273194&action=review

> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:46
> +    , m_loadDataQueue(WorkQueue::create("WebResourceLoadStatisticsStore Load Data Queue"))

WK2 experts: Do we need two queues here? I was thinking we might want to continue receiving data from the WebProcess while the "old" statistics were read off disk.

> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:57
> +    Vector<WebCore::ResourceLoadStatistics> statistics(origins);

WK2 and C++ exports: This copy may be dumb, I'm not sure I need it. Shouldn't the 'statistics' capture in the lambda (below) cause this copy already?
Comment 5 Andy Estes 2016-03-07 12:49:00 PST
Comment on attachment 273194 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273194&action=review

Brent and I discussed this in person. We don't think we need two additional queues. Rather, we should create one queue that is also the message receiver queue for ResourceLoadStatisticsUpdated. Then we don't have to worry about having a lock to coordinate the file I/O, or about making isolated copies of statistics vectors.

>> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:57
>> +    Vector<WebCore::ResourceLoadStatistics> statistics(origins);
> 
> WK2 and C++ exports: This copy may be dumb, I'm not sure I need it. Shouldn't the 'statistics' capture in the lambda (below) cause this copy already?

If you end up using a separate queue, then you actually need to copy this in a way that makes isolated copies of all the Strings in the Vector.
Comment 6 Brent Fulgham 2016-03-07 13:20:36 PST
Created attachment 273203 [details]
Patch
Comment 7 Andy Estes 2016-03-07 13:57:20 PST
Comment on attachment 273203 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273203&action=review

> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:78
> +    if (!m_resourceLoadStatisticsEnabled || m_dataWasLoadedFromDisk)

It's not safe to read this if m_statisticsQueue can write to it concurrently. But if m_resourceLoadStatisticsEnabled == true, wouldn't readDataFromDiskIfNeeded() have already been called?

> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:-94
> -    if (!m_resourceLoadStatisticsEnabled)
> -        return;
> -
> -    // FIXME(154642): TEMPORARY CODE: This should not be done in one long operation when exiting.
> -    writeToDisk();

What guarantees data will be saved before shutdown? Do you not want to make that guarantee anymore?

> Source/WebKit2/UIProcess/WebsiteData/WebsiteDataStore.cpp:984
> -    connection.addWorkQueueMessageReceiver(Messages::WebResourceLoadStatisticsStore::messageReceiverName(), &m_queue.get(), m_resourceLoadStatistics.get());
> +    connection.addWorkQueueMessageReceiver(Messages::WebResourceLoadStatisticsStore::messageReceiverName(), &m_resourceLoadStatistics->queue(), m_resourceLoadStatistics.get());

Instead of adding a public getter for WebResourceLoadStatisticsStore's WorkQueue, I'd follow the pattern set by StorageManager and add processWillOpenConnection() and processDidCloseConnection() functions to WebResourceLoadStatisticsStore that add/remove the work queue message receiver.
Comment 8 Brent Fulgham 2016-03-07 14:23:27 PST
Comment on attachment 273203 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273203&action=review

>> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:78
>> +    if (!m_resourceLoadStatisticsEnabled || m_dataWasLoadedFromDisk)
> 
> It's not safe to read this if m_statisticsQueue can write to it concurrently. But if m_resourceLoadStatisticsEnabled == true, wouldn't readDataFromDiskIfNeeded() have already been called?

I'm worried about the case where someone enables the feature (activates resource load statistics), then deactivates the feature, then turns the feature back on. This could cause us to double-load the disk copy of the statistics, which would then be merged back into the in-memory data.

We could purge the in-memory data when the feature is turned off to avoid the problem, but it seems like using this flag would be less costly.
Comment 9 Brent Fulgham 2016-03-07 16:48:46 PST
Created attachment 273240 [details]
Patch
Comment 10 Brent Fulgham 2016-03-07 16:55:52 PST
Created attachment 273241 [details]
Patch
Comment 11 Build Bot 2016-03-07 17:32:26 PST
Comment on attachment 273241 [details]
Patch

Attachment 273241 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/938792

New failing tests:
http/tests/navigation/statistics.html
Comment 12 Build Bot 2016-03-07 17:32:30 PST
Created attachment 273245 [details]
Archive of layout-test-results from ews100 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews100  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 13 Brent Fulgham 2016-03-07 17:33:38 PST
Looks like I introduced a crash when migrating code out of WebCore. I'm looking at the cause right now.
Comment 14 Andy Estes 2016-03-07 17:38:52 PST
Comment on attachment 273241 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273241&action=review

> Source/WebCore/loader/ResourceLoadStatisticsStore.cpp:138
> +void ResourceLoadStatisticsStore::mergeStatistics(const ResourceLoadStatisticsStore& statistics)
> +{
> +    for (auto& statistic : statistics.m_resourceStatisticsMap) {
> +        auto result = m_resourceStatisticsMap.ensure(statistic.value.highLevelDomain, [&statistic] {
> +            return ResourceLoadStatistics(statistic.value.highLevelDomain);
> +        });
> +        
> +        result.iterator->value.merge(statistic.value);
> +    }
> +}

I don't think we actually need this. See below.

> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:76
> +    RefPtr<WebResourceLoadStatisticsStore> self(this);
> +    m_statisticsQueue->dispatch([self] {
> +        self->coreStore().clear();
> +    });

You only want to do this when enabled == true. Otherwise you'll potentially delete the store while web processes are still sending back data to merge.

Let's just make the clearing be part of readDataFromDiskIfNeeded(). It's safe for readDataFromDiskIfNeeded() to always clear the store, because it's guaranteed that (1) any previously received statistics have already been saved to disk, and (2) no statistics will be received concurrently with reading from disk, since both operations occur on the same work queue.

> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:100
> +        self->coreStore().mergeStatistics(tempStatisticsStore);

If we always clear the store when calling readDataFromDiskIfNeeded(), then there's nothing to merge. That means you don't need to add a new mergeStatistics().
Comment 15 Brent Fulgham 2016-03-07 17:43:00 PST
Comment on attachment 273241 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273241&action=review

>> Source/WebCore/loader/ResourceLoadStatisticsStore.cpp:138
>> +}
> 
> I don't think we actually need this. See below.

Oh, that's a good point. I'll scrap it.

>> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:76
>> +    });
> 
> You only want to do this when enabled == true. Otherwise you'll potentially delete the store while web processes are still sending back data to merge.
> 
> Let's just make the clearing be part of readDataFromDiskIfNeeded(). It's safe for readDataFromDiskIfNeeded() to always clear the store, because it's guaranteed that (1) any previously received statistics have already been saved to disk, and (2) no statistics will be received concurrently with reading from disk, since both operations occur on the same work queue.

Agreed. I'll change the code to reflect this.

>> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:100
>> +        self->coreStore().mergeStatistics(tempStatisticsStore);
> 
> If we always clear the store when calling readDataFromDiskIfNeeded(), then there's nothing to merge. That means you don't need to add a new mergeStatistics().

OK!
Comment 16 Brent Fulgham 2016-03-07 17:57:40 PST
Created attachment 273248 [details]
Patch
Comment 17 Brent Fulgham 2016-03-07 18:02:39 PST
Created attachment 273250 [details]
Patch
Comment 18 Brent Fulgham 2016-03-07 18:02:57 PST
Comment on attachment 273250 [details]
Patch

Attempt to satisfy EFL build.
Comment 19 Andy Estes 2016-03-07 18:49:06 PST
Comment on attachment 273250 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273250&action=review

> Source/WebCore/ChangeLog:3
> +        Reduce startup and shutdown cost of resource load statistics

Not sure this'll do much for shutdown time, since we synchronize with the statistics queue on shutdown.

> Source/WebCore/ChangeLog:10
> +        Move all file-related code out of WebCore. Add a new overload so that we can
> +        merge an entire ResourceLoadStatisticsStore object with another.

There's no overload anymore.

> Source/WebKit2/ChangeLog:13
> +        that it does not delay startup or shutdown. If load statistics are observed
> +        before the disk read is complete, we simply merge the on-disk data
> +        with whatever we've seen in the meantime.

Load statistics can't be observed during the disk read.
Comment 20 Brent Fulgham 2016-03-07 18:56:11 PST
Committed r197721: <http://trac.webkit.org/changeset/197721>
Comment 21 Brent Fulgham 2016-03-08 09:40:38 PST
Small follow-up bug fix:

Committed r197771: <http://trac.webkit.org/changeset/197771>