Bug 187773 - Add Web API Statistics Collection
Summary: Add Web API Statistics Collection
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Woodrow Wang
URL:
Keywords: InRadar
Depends on: 189211 189213 189216
Blocks:
  Show dependency treegraph
 
Reported: 2018-07-18 11:13 PDT by Woodrow Wang
Modified: 2019-01-11 16:03 PST (History)
15 users (show)

See Also:


Attachments
Patch (82.90 KB, patch)
2018-07-18 11:24 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews102 for mac-sierra (2.48 MB, application/zip)
2018-07-18 12:49 PDT, EWS Watchlist
no flags Details
Archive of layout-test-results from ews107 for mac-sierra-wk2 (2.86 MB, application/zip)
2018-07-18 12:58 PDT, EWS Watchlist
no flags Details
Patch (82.94 KB, patch)
2018-07-18 13:10 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (83.03 KB, patch)
2018-07-18 13:56 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews204 for win-future (13.02 MB, application/zip)
2018-07-19 00:00 PDT, EWS Watchlist
no flags Details
Patch (83.41 KB, patch)
2018-07-23 09:45 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (82.62 KB, patch)
2018-07-23 15:54 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (83.71 KB, patch)
2018-07-26 10:25 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews200 for win-future (12.91 MB, application/zip)
2018-07-26 14:02 PDT, EWS Watchlist
no flags Details
Patch (87.30 KB, patch)
2018-07-27 16:27 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (86.40 KB, patch)
2018-07-30 11:14 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews104 for mac-sierra-wk2 (3.36 MB, application/zip)
2018-07-30 12:56 PDT, EWS Watchlist
no flags Details
Archive of layout-test-results from ews124 for ios-simulator-wk2 (2.39 MB, application/zip)
2018-07-30 13:22 PDT, EWS Watchlist
no flags Details
Archive of layout-test-results from ews125 for ios-simulator-wk2 (2.53 MB, application/zip)
2018-07-30 15:22 PDT, EWS Watchlist
no flags Details
Patch (85.86 KB, patch)
2018-07-30 16:05 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (86.38 KB, patch)
2018-07-30 16:36 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (85.55 KB, patch)
2018-07-30 16:47 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (85.40 KB, patch)
2018-07-30 17:06 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (85.40 KB, patch)
2018-07-31 09:55 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (91.62 KB, patch)
2018-08-16 10:32 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (91.62 KB, patch)
2018-08-16 11:25 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (91.62 KB, patch)
2018-08-16 12:05 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (128.55 KB, patch)
2018-08-24 14:49 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (124.92 KB, patch)
2018-08-24 16:45 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews124 for ios-simulator-wk2 (831.35 KB, application/zip)
2018-08-24 18:28 PDT, EWS Watchlist
no flags Details
Archive of layout-test-results from ews105 for mac-sierra-wk2 (1.03 MB, application/zip)
2018-08-24 20:04 PDT, EWS Watchlist
no flags Details
Archive of layout-test-results from ews102 for mac-sierra (2.35 MB, application/zip)
2018-08-24 20:11 PDT, EWS Watchlist
no flags Details
Archive of layout-test-results from ews115 for mac-sierra (3.10 MB, application/zip)
2018-08-24 20:48 PDT, EWS Watchlist
no flags Details
Patch (124.95 KB, patch)
2018-08-27 16:02 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (124.59 KB, patch)
2018-08-27 16:11 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews116 for mac-sierra (3.18 MB, application/zip)
2018-08-27 19:27 PDT, EWS Watchlist
no flags Details
Patch (127.19 KB, patch)
2018-08-28 10:45 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews114 for mac-sierra (3.17 MB, application/zip)
2018-08-28 12:40 PDT, EWS Watchlist
no flags Details
Archive of layout-test-results from ews102 for mac-sierra (2.53 MB, application/zip)
2018-08-28 12:56 PDT, EWS Watchlist
no flags Details
Patch (136.59 KB, patch)
2018-08-30 11:10 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (136.64 KB, patch)
2018-08-30 11:59 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (136.94 KB, patch)
2018-08-30 12:23 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (137.05 KB, patch)
2018-08-30 13:57 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (138.92 KB, patch)
2018-08-31 11:17 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (138.85 KB, patch)
2018-08-31 12:06 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (19.49 KB, patch)
2018-08-31 17:03 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (113.77 KB, patch)
2018-08-31 17:55 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (71.36 KB, patch)
2018-09-05 14:39 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (71.84 KB, patch)
2018-09-05 15:02 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (77.36 KB, patch)
2018-09-05 16:08 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (77.44 KB, patch)
2018-09-07 10:37 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (72.63 KB, patch)
2018-09-07 12:20 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (72.68 KB, patch)
2018-09-07 14:38 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (72.85 KB, patch)
2018-09-07 14:47 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (72.83 KB, patch)
2018-09-10 09:55 PDT, Woodrow Wang
bfulgham: review+
bfulgham: commit-queue-
Details | Formatted Diff | Diff
Patch (72.96 KB, patch)
2018-09-10 14:20 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff
Patch (72.91 KB, patch)
2018-09-10 14:45 PDT, Woodrow Wang
commit-queue: commit-queue-
Details | Formatted Diff | Diff
Patch (72.93 KB, patch)
2018-09-10 16:28 PDT, Woodrow Wang
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Woodrow Wang 2018-07-18 11:13:11 PDT
Added Web API Statistics Collection
Comment 1 Woodrow Wang 2018-07-18 11:24:24 PDT
Created attachment 345263 [details]
Patch
Comment 2 EWS Watchlist 2018-07-18 12:49:53 PDT
Comment on attachment 345263 [details]
Patch

Attachment 345263 [details] did not pass mac-ews (mac):
Output: https://webkit-queues.webkit.org/results/8577220

New failing tests:
fast/dom/navigator-detached-no-crash.html
Comment 3 EWS Watchlist 2018-07-18 12:49:55 PDT
Created attachment 345271 [details]
Archive of layout-test-results from ews102 for mac-sierra

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews102  Port: mac-sierra  Platform: Mac OS X 10.12.6
Comment 4 EWS Watchlist 2018-07-18 12:58:03 PDT
Comment on attachment 345263 [details]
Patch

Attachment 345263 [details] did not pass mac-wk2-ews (mac-wk2):
Output: https://webkit-queues.webkit.org/results/8577293

New failing tests:
fast/dom/navigator-detached-no-crash.html
Comment 5 EWS Watchlist 2018-07-18 12:58:05 PDT
Created attachment 345272 [details]
Archive of layout-test-results from ews107 for mac-sierra-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews107  Port: mac-sierra-wk2  Platform: Mac OS X 10.12.6
Comment 6 Woodrow Wang 2018-07-18 13:10:01 PDT
Created attachment 345274 [details]
Patch
Comment 7 Woodrow Wang 2018-07-18 13:56:45 PDT
Created attachment 345278 [details]
Patch
Comment 8 Woodrow Wang 2018-07-18 13:57:52 PDT
Added null pointer checks to avoid dereferencing any null pointers, which fixed a regression found by the build bot
Comment 9 EWS Watchlist 2018-07-19 00:00:47 PDT
Comment on attachment 345278 [details]
Patch

Attachment 345278 [details] did not pass win-ews (win):
Output: https://webkit-queues.webkit.org/results/8583844

New failing tests:
http/tests/preload/onload_event.html
Comment 10 EWS Watchlist 2018-07-19 00:00:59 PDT
Created attachment 345335 [details]
Archive of layout-test-results from ews204 for win-future

The attached test failures were seen while running run-webkit-tests on the win-ews.
Bot: ews204  Port: win-future  Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit
Comment 11 Alex Christensen 2018-07-19 10:29:52 PDT
Comment on attachment 345278 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=345278&action=review

> Source/WebCore/ChangeLog:12
> +        Added data collection for web API statistics, specifically regarding the canvas, font loads, 
> +        screen functions, and navigator functions. All of the code is placed under the compile time flag

Why did you choose these specific functions?

> Source/WebCore/PAL/ChangeLog:9
> +        we would like to collect various web API statistics for analysis. 

What kind of analysis do you intend to do?
Comment 12 Daniel Bates 2018-07-19 11:12:56 PDT
Comment on attachment 345278 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=345278&action=review

> Source/WebCore/ChangeLog:13
> +        ENABLE_WEB_API_STATISTICS. The statistics are stored in a ResrouceLoadStatistics object and written

ResrouceLoadStatistics => ResourceLoadStatistics

> Source/WebCore/html/HTMLCanvasElement.cpp:1050
> +    unsigned currentHeight = height();
> +    unsigned currentWidth = width();
> +    canvasLog.maxHeight = std::max(canvasLog.maxHeight, currentHeight);
> +    canvasLog.maxWidth = std::max(canvasLog.maxWidth, currentWidth);
> +    canvasLog.exportCalled = exportCalled;
> +    ResourceLoadObserver::shared().logCanvasFingerprint(document(), canvasLog);

I suggest we stack allocate the CanvasLog here and then WTFMove() it into logCanvasFingerprint(). Also, is it necessary to accumulate all the written string (textWritten) indefinitely? Would it be sufficient to clear all of them in this function after notifying the observer?

> Source/WebCore/html/HTMLCanvasElement.h:160
> +    // Canvas stats

This comment does not add value.

> Source/WebCore/html/HTMLCanvasElement.h:161
> +    CanvasLog canvasLog;

Can we just track the written strings:

HashSet<String> m_textWritten;

(Can we come up with a better name?)

And expose a setter:

appendTextWritten(String&& s) { m_textWritten.add(WTFMove(s)); }

> Source/WebCore/html/HTMLCanvasElement.h:162
> +    void logFingerprint(bool exportCalled);

If we remove the calls to logFingerprint() from CanvasRenderingContext2D.cpp then we should make this private.

> Source/WebCore/html/canvas/CanvasRenderingContext2D.cpp:370
> +    canvas.logFingerprint(true);

Is it necessary to update statistics here? Would it be sufficient to update statistic when pixel data is read?

> Source/WebCore/html/canvas/CanvasRenderingContext2D.cpp:465
> +    canvas.logFingerprint(false);

Ditto.

> Source/WebCore/loader/CanvasLog.h:34
> +    // max width and max height when toDataURL or other function is called

I don't see the value of this comment.

> Source/WebCore/loader/CanvasLog.h:38
> +    bool exportCalled { false };

What is being exported? Can we come up with a better name for this?

> Source/WebCore/loader/ResourceLoadObserver.cpp:288
> +void ResourceLoadObserver::logFontLoad(Document& document, String familyName, bool hasSuccessfullyLoaded)

Can document be made const?

String  => const String&

> Source/WebCore/loader/ResourceLoadObserver.cpp:293
> +    if (url.isBlankURL() || url.isEmpty())

This does not seem correct. The first disjunct will exempt dynamic pages written to about: URLs. What is an example when the latter disjunct evaluates to true?

> Source/WebCore/loader/ResourceLoadObserver.cpp:296
> +    String primaryDomainString = primaryDomain(primaryDomainURL);

The preferred terminology for this is registrableDomain.

> Source/WebCore/loader/ResourceLoadObserver.cpp:297
> +    auto frame = document.frame();

I would inline the right-hand side into the expression below.

> Source/WebCore/loader/ResourceLoadObserver.cpp:324
> +    statistics.canvasLog.maxHeight = canvasLog.maxHeight;
> +    statistics.canvasLog.maxWidth = canvasLog.maxWidth;
> +    statistics.canvasLog.textWritten.add(canvasLog.textWritten.begin(), canvasLog.textWritten.end());
> +    statistics.canvasLog.exportCalled |= canvasLog.exportCalled;

statistics.canvasLog = WTFMove(canvasLog)

> Source/WebCore/loader/ResourceLoadObserver.cpp:349
> +void ResourceLoadObserver::logScreenLoad(Frame* frame, String functionName)

All of the callers are passing a string literal. We should change the datatype of the second argument functionName from String to const char*. Can we make the screenFunctions HashSet store const char*s?

> Source/WebCore/loader/ResourceLoadStatistics.h:123
> +    // Screen functions accessed
> +    HashSet<String> screenFunctions;

Can we come up with a better name and remove the comment?
Comment 13 Woodrow Wang 2018-07-23 09:45:25 PDT
Created attachment 345578 [details]
Patch
Comment 14 Daniel Bates 2018-07-23 11:23:10 PDT
Comment on attachment 345578 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=345578&action=review

Woody and I talked about this patch today. Looking at the canvas logic, it seems sufficient to define at least two new ResourceLoadObserver functions: logCanvasRead() and logCanvasMeasureOrWrite() to log any operation that reads the pixel data from the canvas and text measurement and draw operations, respectively. The logCanvasRead() function could take an HTMLCanvasElement (or just the height and width) and the logCanvasMeasureOrWrite() function could take  an HTMLCanvasElement (or just the height and width) and the string that was measured or drawn.

If we make the above change, then we should move the CanvasLog struct to be a private struct of ResourceLoadStatistics since that this only code that makes use of it.

> Source/WebCore/css/CSSFontSelector.cpp:328
> +#if ENABLE(WEB_API_STATISTICS)
> +        auto font = FontCache::singleton().fontForFamily(fontDescription, familyForLookup);
> +        if (font == nullptr) {
> +            // Log font failed to load
> +            ResourceLoadObserver::shared().logFontLoad(*m_document, familyName, false);
> +        }
> +        return FontRanges((RefPtr<Font>&&)font);
> +#else
>          return FontRanges(FontCache::singleton().fontForFamily(fontDescription, familyForLookup));
> +#endif

Can we share more code?

> Source/WebCore/html/HTMLCanvasElement.cpp:1050
> +    m_textWritten.clear();

m_textWritten always holds exactly one string. It is unnecessary for m_textWritten to be a HashSet.

> Source/WebCore/loader/CanvasLog.h:32
> +    HashSet<String> m_fontsUsed;

We are not making meaningful use of this. We should remove it.

> Source/WebCore/loader/CanvasLog.h:37
> +struct CanvasLog {
> +    HashSet<String> m_textWritten;
> +    HashSet<String> m_fontsUsed;
> +    
> +    unsigned m_maxWidth { 0 };
> +    unsigned m_maxHeight { 0 };
> +    
> +    bool m_canvasDataRead { false };

We should make this a private class of ResourceLoadStatistics and remove all "m_" prefixes from ivars.

> Source/WebCore/loader/ResourceLoadObserver.cpp:315
> +    if (statistics.canvasLog.m_textWritten.add(canvasLogForUpdate.m_textWritten.begin(), canvasLogForUpdate.m_textWritten.end()))

What is the lifetime of ResourceLoadStatistics? Better question, can statistics.canvasLog.m_textWritten grow unbounded?
Comment 15 Woodrow Wang 2018-07-23 15:54:58 PDT
Created attachment 345617 [details]
Patch
Comment 16 Daniel Bates 2018-07-24 11:25:27 PDT
Comment on attachment 345617 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=345617&action=review

> Source/WebCore/css/CSSFontSelector.cpp:306
>  {
>      // If this ASSERT() fires, it usually means you forgot a document.updateStyleIfNeeded() somewhere.
>      ASSERT(!m_buildIsUnderway || m_computingRootStyleFontCount);
> +    
> +#if ENABLE(WEB_API_STATISTICS)

From the name of this function CSSFontSelector::fontRangesForFamily() it is not obvious whether this is the most direct way to know whether a font loaded or failed. Where does this function fall in the flow to support the Font Loading API events <https://drafts.csswg.org/css-font-loading/#FontFaceSet-events>?

> Source/WebCore/css/CSSFontSelector.cpp:308
> +        ResourceLoadObserver::shared().logFontLoad(*m_document, familyName, true);

It is not obvious what the purpose of the last argument is without looking at the prototype of ResourceLoadObserver::logFontLoad(). I suggest we use an enum class instead of a boolean to make the purpose of the last argument clear at the call site.

> Source/WebCore/loader/ResourceLoadObserver.cpp:292
> +    String registrableDomainString = primaryDomain(document.url());

I do not see the value in having the suffix "String" in the name of this variable. I suggest we would remove this suffix and call this variable registrableDomain. I also suggest we change the data type of this variable from String to auto and move this definition to be above line 295 so as to be closer to where this variable is used.  In general, it is good programming practice to place variables as close to where they are used as possible both for readability as well as to avoid computation until needed.

> Source/WebCore/loader/ResourceLoadObserver.cpp:293
> +    const URL& mainFrameURL = document.frame()->mainFrame().document()->url();

This variable is only referenced in the line below and I do not find its name any more descriptive than its value. I suggest we inline its value into the line below and remove this variable.

> Source/WebCore/loader/ResourceLoadObserver.cpp:294
> +    auto mainFrameRegistrableDomain = primaryDomain(mainFrameURL);

I suggest that we move this variable to be above line 299 (right above the line that uses it).

> Source/WebCore/loader/ResourceLoadObserver.cpp:317
> +void ResourceLoadObserver::logCanvasRead(const Document& document, const unsigned height, const unsigned width)
> +{
> +    if (!shouldLog(document.page()))
> +        return;
> +    const URL& mainFrameURL = document.frame()->mainFrame().document()->url();
> +    String registrableDomainString = primaryDomain(document.url());
> +    auto mainFrameRegistrableDomain = primaryDomain(mainFrameURL);
> +    auto& statistics = ensureResourceStatisticsForPrimaryDomain(registrableDomainString);
> +    statistics.mainFrameURL.add(mainFrameRegistrableDomain);
> +    statistics.canvasLog.maxHeight = std::max(statistics.canvasLog.maxHeight, height);
> +    statistics.canvasLog.maxWidth = std::max(statistics.canvasLog.maxWidth, width);
> +    statistics.canvasLog.canvasDataRead = true;
> +    if (statistics.canvasUnderTopFrameOrigins.add(mainFrameRegistrableDomain).isNewEntry)
> +        scheduleNotificationIfNeeded();
> +}

Please simplify this code using the suggestions I gave for ResourceLoadObserver::logFontLoad()?

> Source/WebCore/loader/ResourceLoadObserver.cpp:337
> +void ResourceLoadObserver::logCanvasWriteOrMeasure(const Document& document, const unsigned height, const unsigned width, const String& textWritten)
> +{
> +    if (!shouldLog(document.page()))
> +        return;
> +    bool shouldCallNotificationCallback = false;
> +    const URL& mainFrameURL = document.frame()->mainFrame().document()->url();
> +    String registrableDomainString = primaryDomain(document.url());
> +    auto mainFrameRegistrableDomain = primaryDomain(mainFrameURL);
> +    auto& statistics = ensureResourceStatisticsForPrimaryDomain(registrableDomainString);
> +    statistics.mainFrameURL.add(mainFrameRegistrableDomain);
> +    statistics.canvasLog.maxHeight = std::max(statistics.canvasLog.maxHeight, height);
> +    statistics.canvasLog.maxWidth = std::max(statistics.canvasLog.maxWidth, width);
> +    if (statistics.canvasLog.textWritten.add(textWritten))
> +        shouldCallNotificationCallback = true;
> +    if (statistics.canvasUnderTopFrameOrigins.add(mainFrameRegistrableDomain).isNewEntry)
> +        shouldCallNotificationCallback = true;
> +    if (shouldCallNotificationCallback)
> +        scheduleNotificationIfNeeded();
> +}

Ditto.

> Source/WebCore/loader/ResourceLoadObserver.cpp:353
> +void ResourceLoadObserver::logNavigatorLoad(const Frame* frame, const String& functionName)
> +{
> +    if (!frame)
> +        return;
> +    Document* document = frame->document();
> +    if (!shouldLog(document->page()))
> +        return;
> +    String registrableDomainString = primaryDomain(document->url());
> +    const URL& mainFrameURL = frame->mainFrame().document()->url();
> +    auto mainFrameRegistrableDomain = primaryDomain(mainFrameURL);
> +    auto& statistics = ensureResourceStatisticsForPrimaryDomain(registrableDomainString);
> +    statistics.navigatorFunctionsAccessed.add(functionName);
> +    if (statistics.mainFrameURL.add(mainFrameRegistrableDomain).isNewEntry)
> +        scheduleNotificationIfNeeded();
> +}

Ditto.

This function mostly works with the document. I suggest it take a const Document& instead of a const Frame*.

> Source/WebCore/loader/ResourceLoadObserver.cpp:369
> +void ResourceLoadObserver::logScreenLoad(const Frame* frame, const String& functionName)
> +{
> +    if (!frame)
> +        return;
> +    Document* document = frame->document();
> +    if (!shouldLog(document->page()))
> +        return;
> +    String registrableDomainString = primaryDomain(document->url());
> +    const URL& mainFrameURL = frame->mainFrame().document()->url();
> +    auto mainFrameRegistrableDomain = primaryDomain(mainFrameURL);
> +    auto& statistics = ensureResourceStatisticsForPrimaryDomain(registrableDomainString);
> +    statistics.screenFunctionsAccessed.add(functionName);
> +    if (statistics.mainFrameURL.add(mainFrameRegistrableDomain).isNewEntry)
> +        scheduleNotificationIfNeeded();
> +}

Ditto.

This function mostly works with the document. I suggest it take a const Document& instead of a const Frame*.

> Source/WebCore/loader/ResourceLoadStatistics.cpp:65
> +    encoder.encodeObject(label, canvasLog, [](KeyedEncoder& encoderInner, const CanvasLog log) {

This lambda is unnecessarily copying the passed CanvasLog. We should change "const CanvasLog" to "const CanvasLog&"  to just pass a const lvalue reference to the CanvasLog and avoid the copy. Please put a space between the capture list ([]) and the parameter list.

> Source/WebCore/loader/ResourceLoadStatistics.cpp:69
> +        encoderInner.encodeObjects("textWritten", log.textWritten.begin(), log.textWritten.end(), [](KeyedEncoder& encoderInner2, const String text) {

This lambda cause ref-count churn with the pass string. We should change const String to "const String&" to avoid it. Please put a space between the capture list ([]) and the parameter list.

> Source/WebCore/loader/ResourceLoadStatistics.cpp:153
> +    decoder.decodeObject(label, canvasLog, [](KeyedDecoder& decoderInner, CanvasLog& canvasLog) {

Please put a space between the capture list ([]) and the parameter list.

> Source/WebCore/loader/ResourceLoadStatistics.cpp:161
> +        decoderInner.decodeObjects("textWritten", ignore, [&canvasLog](KeyedDecoder& decoderInner2, String text) {

Please put a space between the capture list and the parameter list. "String text"?

> Source/WebCore/loader/ResourceLoadStatistics.cpp:165
> +            const String canvasText = text;

We're making an unnecessary copy here.

> Source/WebCore/loader/ResourceLoadStatistics.cpp:418
> +    canvasLog.maxHeight =  std::max(otherCanvasLog.maxHeight, canvasLog.maxHeight);
> +    canvasLog.maxWidth =  std::max(otherCanvasLog.maxWidth, canvasLog.maxWidth);

Nit: Extra space character to the right of the '='.

> Source/WebCore/page/Navigator.cpp:78
> +    ResourceLoadObserver::shared().logNavigatorLoad(frame(), "appVersion");

We should only call this function when we have a non-null frame. Otherwise, we are performing an unnecessary function call because ResourceLoadObserver::logNavigatorLoad() does nothing when a null frame is passed to it.

> Source/WebCore/page/Navigator.cpp:91
> +    ResourceLoadObserver::shared().logNavigatorLoad(frame(), "userAgent");

Ditto.

> Source/WebCore/page/Navigator.cpp:106
> +    ResourceLoadObserver::shared().logNavigatorLoad(frame(), "plugins");

Ditto.

> Source/WebCore/page/Navigator.cpp:116
> +    ResourceLoadObserver::shared().logNavigatorLoad(frame(), "mimeTypes");

Ditto.

> Source/WebCore/page/Navigator.cpp:126
> +    ResourceLoadObserver::shared().logNavigatorLoad(frame(), "cookieEnabled");

Ditto.

> Source/WebCore/page/Navigator.cpp:144
> +    ResourceLoadObserver::shared().logNavigatorLoad(frame(), "javaEnabled");

Ditto.

> Source/WebCore/page/Navigator.cpp:162
> +    ResourceLoadObserver::shared().logNavigatorLoad(frame(), "standAlone");

Ditto.

> Source/WebCore/page/Navigator.cpp:172
> +    ResourceLoadObserver::shared().logNavigatorLoad(frame(), "getStorageUpdates");

Ditto.

> Source/WebCore/page/Screen.cpp:52
> +#if ENABLE(WEB_API_STATISTICS)
> +    ResourceLoadObserver::shared().logScreenLoad(frame(), "height");
> +#endif

We should only call this function when we have a non-null frame. Otherwise, we are performing an unnecessary function call because ResourceLoadObserver::logScreenLoad() does nothing when a null frame is passed to it.

> Source/WebCore/page/Screen.cpp:62
> +    ResourceLoadObserver::shared().logScreenLoad(frame(), "width");

Ditto.

> Source/WebCore/page/Screen.cpp:73
> +    ResourceLoadObserver::shared().logScreenLoad(frame(), "colorDepth");

Ditto.

> Source/WebCore/page/Screen.cpp:83
> +    ResourceLoadObserver::shared().logScreenLoad(frame(), "pixelDepth");

Ditto.

> Source/WebCore/page/Screen.cpp:93
> +    ResourceLoadObserver::shared().logScreenLoad(frame(), "availLeft");

Ditto.

> Source/WebCore/page/Screen.cpp:103
> +    ResourceLoadObserver::shared().logScreenLoad(frame(), "availTop");

Ditto.

> Source/WebCore/page/Screen.cpp:113
> +    ResourceLoadObserver::shared().logScreenLoad(frame(), "availHeight");

Ditto.

> Source/WebCore/page/Screen.cpp:123
> +    ResourceLoadObserver::shared().logScreenLoad(frame(), "availWidth");

Ditto.
Comment 17 Woodrow Wang 2018-07-26 10:25:48 PDT
Created attachment 345851 [details]
Patch
Comment 18 Daniel Bates 2018-07-26 12:53:32 PDT
Comment on attachment 345851 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=345851&action=review

Can we write tests for this change?

> Source/WebCore/css/CSSFontSelector.cpp:322
> +            // Log font failed to load

This comment just explains what the code does. Please remove.

> Source/WebCore/css/CSSFontSelector.cpp:326
> +        return FontRanges((RefPtr<Font>&&)font);

(RefPtr<Font>&&)font => WTFMove(font)

Can we use uniform initialization syntax (e.g. FontRanges { WTFMove(font) })?

> Source/WebCore/html/HTMLCanvasElement.cpp:695
> +#if ENABLE(WEB_API_STATISTICS)
> +    ResourceLoadObserver::shared().logCanvasRead(document(), height(), width());
> +#endif

Is it intentional that we log this even though the web page may not have read any data (say, if we thrown a security error). If it is intentional then we should write a comment to explain why we are doing this as well as add a remark for this function in the ChangeLog.

> Source/WebCore/html/HTMLCanvasElement.cpp:725
> +#if ENABLE(WEB_API_STATISTICS)
> +    ResourceLoadObserver::shared().logCanvasRead(document(), height(), width());
> +#endif

Ditto.

> Source/WebCore/html/HTMLCanvasElement.cpp:762
> +#if ENABLE(WEB_API_STATISTICS)
> +    ResourceLoadObserver::shared().logCanvasRead(document(), height(), width());
> +#endif

Ditto.

> Source/WebCore/html/HTMLCanvasElement.cpp:777
> +#if ENABLE(WEB_API_STATISTICS)
> +    ResourceLoadObserver::shared().logCanvasRead(document(), height(), width());
> +#endif

Ditto.

> Source/WebCore/html/HTMLCanvasElement.cpp:794
> +#if ENABLE(WEB_API_STATISTICS)
> +    ResourceLoadObserver::shared().logCanvasRead(document(), height(), width());
> +#endif

Ditto.

> Source/WebCore/loader/ResourceLoadObserver.cpp:287
> +// Web API stats

Please remove this comment as the ENABLE() above is sufficient to demarcate the section.

> Source/WebCore/loader/ResourceLoadObserver.cpp:317
> +static const auto CANVAS_TEXT_CAPACITY = 10;

No need for static, const have internal linkage. I suggest we be explicit with the datatype instead of using auto. Also the name of this constant does not follow the Code Style Guidelines. We use the same conventions for constants as we do for variables.

> Source/WebCore/loader/ResourceLoadObserver.cpp:331
> +    if (statistics.canvasUnderTopFrameOrigins.add(mainFrameRegistrableDomain).isNewEntry)

Maybe a better name for the instance variable would be topRegistrableDomainsWithCanvases or topPrivatelyControlledDomainsWithCanvases.

> Source/WebCore/loader/ResourceLoadObserver.cpp:357
> +    if (statistics.mainFrameURL.add(mainFrameRegistrableDomain).isNewEntry)

Can we come up with a better name for mainFrameURL?

> Source/WebCore/loader/ResourceLoadObserver.h:70
> +    void logCanvasWriteOrMeasure(const Document&, const unsigned height, const unsigned width, const String& textWritten);

const unsigned => unsigned

It is not meaningful to mark pass-by-value types as const as a copy of the value is given to the function by definition.
Comment 19 EWS Watchlist 2018-07-26 14:01:48 PDT
Comment on attachment 345851 [details]
Patch

Attachment 345851 [details] did not pass win-ews (win):
Output: https://webkit-queues.webkit.org/results/8664979

New failing tests:
http/tests/security/canvas-remote-read-remote-video-blocked-no-crossorigin.html
Comment 20 EWS Watchlist 2018-07-26 14:02:01 PDT
Created attachment 345865 [details]
Archive of layout-test-results from ews200 for win-future

The attached test failures were seen while running run-webkit-tests on the win-ews.
Bot: ews200  Port: win-future  Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit
Comment 21 Woodrow Wang 2018-07-27 16:27:07 PDT
Created attachment 345969 [details]
Patch
Comment 22 Woodrow Wang 2018-07-30 11:14:47 PDT
Created attachment 346072 [details]
Patch
Comment 23 Woodrow Wang 2018-07-30 11:15:35 PDT
Resolved changes in respective FeatureDefines.xcconfig files
Comment 24 EWS Watchlist 2018-07-30 12:56:27 PDT
Comment on attachment 346072 [details]
Patch

Attachment 346072 [details] did not pass mac-wk2-ews (mac-wk2):
Output: https://webkit-queues.webkit.org/results/8701546

New failing tests:
fast/text/css-font-loading-arraybuffer.html
Comment 25 EWS Watchlist 2018-07-30 12:56:28 PDT
Created attachment 346089 [details]
Archive of layout-test-results from ews104 for mac-sierra-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews104  Port: mac-sierra-wk2  Platform: Mac OS X 10.12.6
Comment 26 EWS Watchlist 2018-07-30 13:22:30 PDT
Comment on attachment 346072 [details]
Patch

Attachment 346072 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: https://webkit-queues.webkit.org/results/8701492

New failing tests:
fast/text/css-font-loading-arraybuffer.html
Comment 27 EWS Watchlist 2018-07-30 13:22:32 PDT
Created attachment 346092 [details]
Archive of layout-test-results from ews124 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews124  Port: ios-simulator-wk2  Platform: Mac OS X 10.13.4
Comment 28 EWS Watchlist 2018-07-30 15:22:54 PDT
Comment on attachment 346072 [details]
Patch

Attachment 346072 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: https://webkit-queues.webkit.org/results/8703010

New failing tests:
fast/text/css-font-loading-arraybuffer.html
Comment 29 EWS Watchlist 2018-07-30 15:22:56 PDT
Created attachment 346108 [details]
Archive of layout-test-results from ews125 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews125  Port: ios-simulator-wk2  Platform: Mac OS X 10.13.4
Comment 30 Woodrow Wang 2018-07-30 16:05:16 PDT
Created attachment 346115 [details]
Patch
Comment 31 Woodrow Wang 2018-07-30 16:36:21 PDT
Created attachment 346123 [details]
Patch
Comment 32 Woodrow Wang 2018-07-30 16:47:44 PDT
Created attachment 346126 [details]
Patch
Comment 33 Woodrow Wang 2018-07-30 17:06:06 PDT
Created attachment 346127 [details]
Patch
Comment 34 Woodrow Wang 2018-07-31 09:55:46 PDT
Created attachment 346172 [details]
Patch
Comment 35 Daniel Bates 2018-08-01 15:59:25 PDT
Comment on attachment 346172 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=346172&action=review

> Source/WebCore/css/CSSFontFace.cpp:810
> +        case CSSFontFaceSource::Status::Success: {
>              if (RefPtr<Font> result = source->font(fontDescription, syntheticBold, syntheticItalic, m_featureSettings, m_variantSettings, m_fontSelectionCapabilities))
>                  return result;
>              break;
> +        }

Please revert this change as it does not seem necessary to fix this bug.

> Source/WebCore/css/CSSFontFaceSource.cpp:193
> +            if (fontSelector->document()) {
> +                ResourceLoadObserver::shared().logFontLoad(*fontSelector->document(), m_familyNameOrURI.string(), ResourceLoadObserver::LoadStatus::Attempted);
> +                if (!success)
> +                    ResourceLoadObserver::shared().logFontLoad(*fontSelector->document(), m_familyNameOrURI.string(), ResourceLoadObserver::LoadStatus::Failed);
> +            }

When I first read the name "logFontLoad" I got the impression that this function should be invoked exactly once for each font that we loaded or attempted to load. This code implies that this function should be invoked whenever the load status of the font changes. We should rename ResourceLoadObserver::logFontLoad() to better reflect this design. Maybe logFontStatusChanged? Alternatively we could change the design of logFontLoad to match what its name implies and have it take a boolean, status, as to whether the load succeeded. When called with status := true it would do the same thing it does in your patch when invoked with ResourceLoadObserver::LoadStatus::Attempted. When called with status := false, it would do the same thing it does in your patch when invoked with ResourceLoadObserver::LoadStatus::Attempted plus do the same thing it does in your patch when invoked with ResourceLoadObserver::LoadStatus::Failed.

Then we can simplify this code to, caching fontSelector->document() in a scoped local:

if (auto& document = fontSelector->document())
    ResourceLoadObserver::shared().logFontLoad(document, m_familyNameOrURI.string(), success);

> Source/WebCore/css/CSSFontSelector.cpp:309
> +    
> +#if ENABLE(WEB_API_STATISTICS)
> +    if (m_document)
> +        ResourceLoadObserver::shared().logFontLoad(*m_document, familyName.string(), ResourceLoadObserver::LoadStatus::Attempted);
> +#endif

If you take my suggestion above then we do not need this code....

> Source/WebCore/css/CSSFontSelector.cpp:324
>      if (!face) {
>          if (!resolveGenericFamilyFirst)
>              familyForLookup = resolveGenericFamily(m_document, fontDescription, familyName);
> -        return FontRanges(FontCache::singleton().fontForFamily(fontDescription, familyForLookup));
> +        auto font = FontCache::singleton().fontForFamily(fontDescription, familyForLookup);
> +#if ENABLE(WEB_API_STATISTICS)
> +        if (!font && m_document)
> +            ResourceLoadObserver::shared().logFontLoad(*m_document, familyName.string(), ResourceLoadObserver::LoadStatus::Failed);
> +#endif
> +        return FontRanges { WTFMove(font) };

And this code would change to:

if (face) {
#if ENABLE(WEB_API_STATISTICS)
    if (m_document)
        ResourceLoadObserver::shared().logFontLoad(*m_document, familyName.string(), true /* success */);
#endif
    return face->fontRanges(fontDescription);
}

if (!resolveGenericFamilyFirst)
    familyForLookup = resolveGenericFamily(m_document, fontDescription, familyName);
auto font = FontCache::singleton().fontForFamily(fontDescription, familyForLookup);
#if ENABLE(WEB_API_STATISTICS)
if (m_document)
    ResourceLoadObserver::shared().logFontLoad(*m_document, familyName.string(), !!font);
#endif
return FontRanges { WTFMove(font) };

> Source/WebCore/css/CSSFontSelector.cpp:407
> +    

Please remove the leading whitespace.

> Source/WebCore/css/CSSFontSelector.cpp:409
> +    

Please remove the leading whitespace.

> Source/WebCore/css/CSSFontSelector.cpp:415
> +    if (m_document) {
> +        ResourceLoadObserver::shared().logFontLoad(*m_document, m_document->settings().pictographFontFamily().string(), ResourceLoadObserver::LoadStatus::Attempted);
> +        if (!font)
> +            ResourceLoadObserver::shared().logFontLoad(*m_document, m_document->settings().pictographFontFamily().string(), ResourceLoadObserver::LoadStatus::Failed);
> +    }

If you take my suggestion above then this code would change to:

if (m_document)
    ResourceLoadObserver::shared().logFontLoad(*m_document, m_document->settings().pictographFontFamily().string(), !!font);

> Source/WebCore/html/HTMLCanvasElement.cpp:701
> +    // Records attempt to read canvas data, regardless of if the read is successful

How did you come to the decision to add a comment about logCanvasRead() here and not in HTMLCanvasElement::toBlob() and HTMLCanvasElement::captureStream()?

"of if" => whether

Missing a period at the end of this sentence.

> Source/WebCore/html/HTMLCanvasElement.cpp:770
> +#if ENABLE(WEB_API_STATISTICS)
> +    ResourceLoadObserver::shared().logCanvasRead(document(), height(), width());
> +#endif

Is it intentional that we log this read even if we do not have a context? If so, please add a comment to explain why.

> Source/WebCore/html/HTMLCanvasElement.cpp:785
> +#if ENABLE(WEB_API_STATISTICS)
> +    ResourceLoadObserver::shared().logCanvasRead(document(), height(), width());
> +#endif

Is it intentional that we log this read even if we do not have a buffer? If so, please add a comment to explain why.

> Source/WebCore/html/HTMLCanvasElement.cpp:802
> +#if ENABLE(WEB_API_STATISTICS)
> +    ResourceLoadObserver::shared().logCanvasRead(document(), height(), width());
> +#endif

How did you come to the decision to not add a comment here about logCanvasRead() even though you did add a comment for logCanvasRead() in HTMLCanvasElement::toDataURL()?

> Source/WebCore/loader/CanvasLog.h:30
> +struct CanvasLog {

Maybe a better name for this struct would be CanvasActivityRecord?

> Source/WebCore/loader/CanvasLog.h:36
> +    bool canvasDataRead { false };

It is redundant to include the word "canvas" in the name of this variable as it is obvious from the name of this struct that we are talking about the HTML canvas element. Maybe a better name for this variable would be didReadData or wasDataRead?

> Source/WebCore/loader/ResourceLoadObserver.cpp:318
> +void ResourceLoadObserver::logCanvasWriteOrMeasure(const Document& document, unsigned height, unsigned width, const String& textWritten)

We should write the logic in this function to bail out early if topFrameRegistrableDomainsWithCanvases already contains mainFrameRegistrableDomain and the canvas log cannot record any additional strings.

> Source/WebCore/loader/ResourceLoadObserver.cpp:328
> +    statistics.canvasLog.maxHeight = std::max(statistics.canvasLog.maxHeight, height);
> +    statistics.canvasLog.maxWidth = std::max(statistics.canvasLog.maxWidth, width);

Eww. We shouldn't duplicate this computation. I suggest that move this computation to CanvasLog and let it ensure the invariant that max{Width, Height} represent what their names imply. One way to do this is to add setters for width and height to CanvasLog:

void setWidth(unsigned width) { m_maxWidth = std::max(m_maxWidth, width); }
void setHeight(unsigned height) { m_maxHeight = std::max(m_maxHeight, height); }

If wanted to strictly enforce this we could make CanvasLog a class, make all its data private and expose getters for its data.

> Source/WebCore/loader/ResourceLoadObserver.cpp:330
> +    if (statistics.canvasLog.textWritten.size() < canvasTextCapacity && statistics.canvasLog.textWritten.add(textWritten))
> +        shouldCallNotificationCallback = true;

It would be better to move this logic into CanvasLog. I suggest adding a recordWrittenOrMeasuredText(const String&) that knows the criterion for adding a new string to the hash set.

> Source/WebCore/loader/ResourceLoadObserver.cpp:331
> +    if (statistics.topFrameRegistrableDomainsWithCanvases.add(mainFrameRegistrableDomain).isNewEntry)

Minor: How did you come to the decision to explicitly check isNewEntry here but implicitly check it on line 329?

> Source/WebCore/loader/ResourceLoadObserver.cpp:345
> +    if (statistics.topFrameRegistrableDomains.add(mainFrameRegistrableDomain).isNewEntry)

Minor: How did you come to the decision to explicitly check isNewEntry here but implicitly check it on line 329?

> Source/WebCore/loader/ResourceLoadObserver.cpp:357
> +    if (statistics.topFrameRegistrableDomains.add(mainFrameRegistrableDomain).isNewEntry)

Minor: How did you come to the decision to explicitly check isNewEntry here but implicitly check it on line 329?

> Source/WebCore/loader/ResourceLoadObserver.h:67
> +    // Web API stats collection functions

Please remove this comment as the ENABLE(WEB_API_STATISTICS) provides sufficient context.

> Source/WebCore/loader/ResourceLoadStatistics.cpp:111
> +    // Web API stats

Ditto.

> Source/WebCore/loader/ResourceLoadStatistics.cpp:418
> +    canvasLog.maxHeight = std::max(otherCanvasLog.maxHeight, canvasLog.maxHeight);
> +    canvasLog.maxWidth = std::max(otherCanvasLog.maxWidth, canvasLog.maxWidth);
> +    canvasLog.textWritten.add(otherCanvasLog.textWritten.begin(), otherCanvasLog.textWritten.end());
> +    canvasLog.canvasDataRead |= otherCanvasLog.canvasDataRead;

We should add a CanvasLog::mergeWith(const CanvasLog&) that knows how to merge itself with another CanvasLog by performing this logic.

> Source/WebCore/loader/ResourceLoadStatistics.h:110
> +    // Keeps track of registrable domains of top frames under which API are accessed

We should explain that topFrameRegistrableDomains represents access to the top frame's content or one of its subframes.

> Source/WebCore/loader/ResourceLoadStatistics.h:111
> +    HashCountedSet<String> topFrameRegistrableDomains;

Can we come up with a more descriptive name for this variable and remove or minimize the need for the comment above it? Maybe topFrameRegistrableDomainsThatAccessedWebAPIs.

> Source/WebCore/loader/ResourceLoadStatistics.h:113
> +    // Keeps track of registrable domains of top frames under which canvas API are accessed
> +    HashCountedSet<String> topFrameRegistrableDomainsWithCanvases;

I suggest we rename the variable topFrameRegistrableDomainsWithAccessedCanvases and then update the comment to explain that the accesses include content loaded in the top frame and any sub frame.

> Source/WebCore/page/Navigator.cpp:95
> +    if (!m_frame)
> +        return m_userAgent;
> +#if ENABLE(WEB_API_STATISTICS)
> +    ResourceLoadObserver::shared().logNavigatorLoad(*m_frame->document(), "userAgent");
> +#endif
> +    if (m_userAgent.isNull() && m_frame->page())

Is this correct? Do we really want to log a navigator load when we do not have a page?

> Source/WebCore/page/Navigator.cpp:114
> +        ResourceLoadObserver::shared().logNavigatorLoad(*m_frame->document(), "plugins");

Can we come up with a more descriptive name for ResourceLoadObserver::logNavigatorLoad()? It does not seem to have any relevant to page loading. Maybe logNavigatorAccessed?

> Source/WebCore/page/Navigator.cpp:179
> +    

Please revert this change.

> Source/WebCore/page/Screen.cpp:53
> +    ResourceLoadObserver::shared().logScreenLoad(*m_frame->document(), "height");

Can we come up with a more descriptive name for ResourceLoadObserver::logScreenLoad()? It does not seem to have any relevant to page loading. Maybe logScreenAccessed? or logScreenAPIAccessed? (If you go with the latter we should rename logNavigatorLoad appropriately).

> Source/WebKit/Shared/WebCoreArgumentCoders.cpp:2637
> +    encoder << statistics.canvasLog.canvasDataRead;
> +    encoder << statistics.canvasLog.textWritten;
> +    encoder << statistics.canvasLog.maxHeight;
> +    encoder << statistics.canvasLog.maxWidth;

For your consideration, I suggest that we teach CanvasLog how to encode itself such that we can simplify this code to:

encoder << statistics.canvasLog;

> Source/WebKit/Shared/WebCoreArgumentCoders.cpp:2726
> +    if (!decoder.decode(statistics.canvasLog.canvasDataRead))
> +        return std::nullopt;
> +    
> +    if (!decoder.decode(statistics.canvasLog.textWritten))
> +        return std::nullopt;
> +    
> +    if (!decoder.decode(statistics.canvasLog.maxHeight))
> +        return std::nullopt;
> +    
> +    if (!decoder.decode(statistics.canvasLog.maxWidth))
> +        return std::nullopt;

Similarly I suggest that we move the CanvasLog decoding logic into CanvasLog and then we can simplify this code.
Comment 36 Woodrow Wang 2018-08-02 09:39:19 PDT
(In reply to Daniel Bates from comment #18)
> Comment on attachment 345851 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=345851&action=review
> 
> Can we write tests for this change?
> 
> > Source/WebCore/css/CSSFontSelector.cpp:322
> > +            // Log font failed to load
> 
> This comment just explains what the code does. Please remove.
> 
> > Source/WebCore/css/CSSFontSelector.cpp:326
> > +        return FontRanges((RefPtr<Font>&&)font);
> 
> (RefPtr<Font>&&)font => WTFMove(font)
> 
> Can we use uniform initialization syntax (e.g. FontRanges { WTFMove(font) })?
> 
> > Source/WebCore/html/HTMLCanvasElement.cpp:695
> > +#if ENABLE(WEB_API_STATISTICS)
> > +    ResourceLoadObserver::shared().logCanvasRead(document(), height(), width());
> > +#endif
> 
> Is it intentional that we log this even though the web page may not have
> read any data (say, if we thrown a security error). If it is intentional
> then we should write a comment to explain why we are doing this as well as
> add a remark for this function in the ChangeLog.
> 
> > Source/WebCore/html/HTMLCanvasElement.cpp:725
> > +#if ENABLE(WEB_API_STATISTICS)
> > +    ResourceLoadObserver::shared().logCanvasRead(document(), height(), width());
> > +#endif
> 
> Ditto.
> 
> > Source/WebCore/html/HTMLCanvasElement.cpp:762
> > +#if ENABLE(WEB_API_STATISTICS)
> > +    ResourceLoadObserver::shared().logCanvasRead(document(), height(), width());
> > +#endif
> 
> Ditto.
> 
> > Source/WebCore/html/HTMLCanvasElement.cpp:777
> > +#if ENABLE(WEB_API_STATISTICS)
> > +    ResourceLoadObserver::shared().logCanvasRead(document(), height(), width());
> > +#endif
> 
> Ditto.
> 
> > Source/WebCore/html/HTMLCanvasElement.cpp:794
> > +#if ENABLE(WEB_API_STATISTICS)
> > +    ResourceLoadObserver::shared().logCanvasRead(document(), height(), width());
> > +#endif
> 
> Ditto.
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:287
> > +// Web API stats
> 
> Please remove this comment as the ENABLE() above is sufficient to demarcate
> the section.
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:317
> > +static const auto CANVAS_TEXT_CAPACITY = 10;
> 
> No need for static, const have internal linkage. I suggest we be explicit
> with the datatype instead of using auto. Also the name of this constant does
> not follow the Code Style Guidelines. We use the same conventions for
> constants as we do for variables.
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:331
> > +    if (statistics.canvasUnderTopFrameOrigins.add(mainFrameRegistrableDomain).isNewEntry)
> 
> Maybe a better name for the instance variable would be
> topRegistrableDomainsWithCanvases or
> topPrivatelyControlledDomainsWithCanvases.
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:357
> > +    if (statistics.mainFrameURL.add(mainFrameRegistrableDomain).isNewEntry)
> 
> Can we come up with a better name for mainFrameURL?
> 
> > Source/WebCore/loader/ResourceLoadObserver.h:70
> > +    void logCanvasWriteOrMeasure(const Document&, const unsigned height, const unsigned width, const String& textWritten);
> 
> const unsigned => unsigned
> 
> It is not meaningful to mark pass-by-value types as const as a copy of the
> value is given to the function by definition.
Comment 37 Woodrow Wang 2018-08-16 10:32:37 PDT
Created attachment 347273 [details]
Patch
Comment 38 Woodrow Wang 2018-08-16 11:25:07 PDT
Created attachment 347277 [details]
Patch
Comment 39 Woodrow Wang 2018-08-16 12:05:01 PDT
Created attachment 347282 [details]
Patch
Comment 40 Woodrow Wang 2018-08-24 14:49:58 PDT
Created attachment 348041 [details]
Patch
Comment 41 Woodrow Wang 2018-08-24 16:45:57 PDT
Created attachment 348057 [details]
Patch
Comment 42 EWS Watchlist 2018-08-24 18:28:42 PDT
Comment on attachment 348057 [details]
Patch

Attachment 348057 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: https://webkit-queues.webkit.org/results/8977235

Number of test failures exceeded the failure limit.
Comment 43 EWS Watchlist 2018-08-24 18:28:44 PDT
Created attachment 348065 [details]
Archive of layout-test-results from ews124 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews124  Port: ios-simulator-wk2  Platform: Mac OS X 10.13.4
Comment 44 EWS Watchlist 2018-08-24 20:03:59 PDT
Comment on attachment 348057 [details]
Patch

Attachment 348057 [details] did not pass mac-wk2-ews (mac-wk2):
Output: https://webkit-queues.webkit.org/results/8977992

Number of test failures exceeded the failure limit.
Comment 45 EWS Watchlist 2018-08-24 20:04:01 PDT
Created attachment 348067 [details]
Archive of layout-test-results from ews105 for mac-sierra-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews105  Port: mac-sierra-wk2  Platform: Mac OS X 10.12.6
Comment 46 EWS Watchlist 2018-08-24 20:11:41 PDT
Comment on attachment 348057 [details]
Patch

Attachment 348057 [details] did not pass mac-ews (mac):
Output: https://webkit-queues.webkit.org/results/8978007

New failing tests:
http/tests/webAPIStatistics/font-load-data-collection.html
http/tests/webAPIStatistics/canvas-read-and-write-data-collection.html
http/tests/webAPIStatistics/navigator-functions-accessed-data-collection.html
http/tests/webAPIStatistics/screen-functions-accessed-data-collection.html
Comment 47 EWS Watchlist 2018-08-24 20:11:43 PDT
Created attachment 348068 [details]
Archive of layout-test-results from ews102 for mac-sierra

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews102  Port: mac-sierra  Platform: Mac OS X 10.12.6
Comment 48 EWS Watchlist 2018-08-24 20:48:02 PDT
Comment on attachment 348057 [details]
Patch

Attachment 348057 [details] did not pass mac-debug-ews (mac):
Output: https://webkit-queues.webkit.org/results/8978099

New failing tests:
http/tests/webAPIStatistics/font-load-data-collection.html
http/tests/webAPIStatistics/canvas-read-and-write-data-collection.html
http/tests/webAPIStatistics/navigator-functions-accessed-data-collection.html
http/tests/webAPIStatistics/screen-functions-accessed-data-collection.html
Comment 49 EWS Watchlist 2018-08-24 20:48:04 PDT
Created attachment 348070 [details]
Archive of layout-test-results from ews115 for mac-sierra

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews115  Port: mac-sierra  Platform: Mac OS X 10.12.6
Comment 50 Woodrow Wang 2018-08-27 16:02:09 PDT
Created attachment 348227 [details]
Patch
Comment 51 Woodrow Wang 2018-08-27 16:11:34 PDT
Created attachment 348230 [details]
Patch
Comment 52 EWS Watchlist 2018-08-27 19:27:11 PDT
Comment on attachment 348230 [details]
Patch

Attachment 348230 [details] did not pass mac-debug-ews (mac):
Output: https://webkit-queues.webkit.org/results/9004056

New failing tests:
http/tests/webAPIStatistics/font-load-data-collection.html
http/tests/webAPIStatistics/canvas-read-and-write-data-collection.html
http/tests/webAPIStatistics/navigator-functions-accessed-data-collection.html
http/tests/webAPIStatistics/screen-functions-accessed-data-collection.html
Comment 53 EWS Watchlist 2018-08-27 19:27:14 PDT
Created attachment 348253 [details]
Archive of layout-test-results from ews116 for mac-sierra

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews116  Port: mac-sierra  Platform: Mac OS X 10.12.6
Comment 54 Woodrow Wang 2018-08-28 10:45:11 PDT
Created attachment 348309 [details]
Patch
Comment 55 John Wilander 2018-08-28 11:25:30 PDT
Comment on attachment 348309 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=348309&action=review

Nice test infrastructure addition!

> Source/WebCore/css/CSSFontSelector.cpp:313
> +        // This logs an attempted font load by setting the loadStatus boolean to true

This comment seems unnecessary.

> Source/WebCore/html/HTMLCanvasElement.cpp:706
> +    // Records attempt to read canvas data, regardless of whether the read is successful or not.

Ditto.

> Source/WebCore/html/HTMLCanvasElement.cpp:739
> +    // Records attempt to read canvas data, regardless of whether the read is successful or not.

Ditto.

> Source/WebCore/html/HTMLCanvasElement.cpp:772
> +        // Records attempt to read canvas data, regardless of whether the read is successful or not.

Ditto.

> Source/WebCore/html/HTMLCanvasElement.cpp:789
> +    // Records attempt to read canvas data, regardless of whether the read is successful or not.

Ditto.

> Source/WebCore/html/HTMLCanvasElement.cpp:806
> +    // Records attempt to read canvas data, regardless of whether the read is successful or not.

Ditto.

> Source/WebCore/loader/ResourceLoadObserver.cpp:248
> +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())

I wonder if we should go with an ASSERT(document.frame()) instead. Or are we hitting known cases where there is no frame?

> Source/WebCore/loader/ResourceLoadObserver.cpp:250
> +    if (!document.frame())

Duplicate logic.

> Source/WebCore/loader/ResourceLoadObserver.cpp:264
> +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())

I wonder if we should go with an ASSERT(document.frame()) instead. Or are we hitting known cases where there is no frame?

> Source/WebCore/loader/ResourceLoadObserver.cpp:276
> +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())

Ditto.

> Source/WebCore/loader/ResourceLoadObserver.cpp:292
> +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())

Ditto.

> Source/WebCore/loader/ResourceLoadObserver.cpp:308
> +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())

Ditto.

> Source/WebCore/loader/ResourceLoadStatistics.h:111
> +    HashSet<String> navigatorFunctionsAccessed;

Could the functions be represented as an enum, declared here? That way we don't have to keep multiple copies of strings in memory and on disk. It would also provide a concise list of what we're monitoring.

> Source/WebCore/loader/ResourceLoadStatistics.h:112
> +    HashSet<String> screenFunctionsAccessed;

Similar for these, i.e. enum.

> Source/WebCore/page/Navigator.cpp:78
> +    ResourceLoadObserver::shared().logNavigatorAPIAccessed(*m_frame->document(), "appVersion");

This would be nicer with an enum for the function.

> Source/WebCore/page/Navigator.cpp:91
> +    ResourceLoadObserver::shared().logNavigatorAPIAccessed(*m_frame->document(), "userAgent");

Ditto.

> Source/WebCore/page/Navigator.cpp:112
> +        ResourceLoadObserver::shared().logNavigatorAPIAccessed(*m_frame->document(), "plugins");

Ditto.

> Source/WebCore/page/Navigator.cpp:123
> +        ResourceLoadObserver::shared().logNavigatorAPIAccessed(*m_frame->document(), "mimeTypes");

Ditto.

> Source/WebCore/page/Navigator.cpp:136
> +    ResourceLoadObserver::shared().logNavigatorAPIAccessed(*m_frame->document(), "cookieEnabled");

Ditto.

> Source/WebCore/page/Navigator.cpp:155
> +    ResourceLoadObserver::shared().logNavigatorAPIAccessed(*m_frame->document(), "javaEnabled");

Ditto.

> Source/WebCore/page/Screen.cpp:53
> +    ResourceLoadObserver::shared().logScreenAPIAccessed(*m_frame->document(), "height");

Ditto.

> Source/WebCore/page/Screen.cpp:64
> +    ResourceLoadObserver::shared().logScreenAPIAccessed(*m_frame->document(), "width");

Ditto.

> Source/WebCore/page/Screen.cpp:75
> +    ResourceLoadObserver::shared().logScreenAPIAccessed(*m_frame->document(), "colorDepth");

Ditto.

> Source/WebCore/page/Screen.cpp:85
> +    ResourceLoadObserver::shared().logScreenAPIAccessed(*m_frame->document(), "pixelDepth");

Ditto.

> Source/WebCore/page/Screen.cpp:95
> +    ResourceLoadObserver::shared().logScreenAPIAccessed(*m_frame->document(), "availLeft");

Ditto.

> Source/WebCore/page/Screen.cpp:105
> +    ResourceLoadObserver::shared().logScreenAPIAccessed(*m_frame->document(), "availTop");

Ditto.

> Source/WebCore/page/Screen.cpp:115
> +    ResourceLoadObserver::shared().logScreenAPIAccessed(*m_frame->document(), "availHeight");

Ditto.

> Source/WebCore/page/Screen.cpp:125
> +    ResourceLoadObserver::shared().logScreenAPIAccessed(*m_frame->document(), "availWidth");

Ditto.
Comment 56 EWS Watchlist 2018-08-28 12:40:21 PDT
Comment on attachment 348309 [details]
Patch

Attachment 348309 [details] did not pass mac-debug-ews (mac):
Output: https://webkit-queues.webkit.org/results/9012652

New failing tests:
http/tests/webAPIStatistics/font-load-data-collection.html
http/tests/webAPIStatistics/canvas-read-and-write-data-collection.html
http/tests/webAPIStatistics/navigator-functions-accessed-data-collection.html
http/tests/webAPIStatistics/screen-functions-accessed-data-collection.html
Comment 57 EWS Watchlist 2018-08-28 12:40:23 PDT
Created attachment 348323 [details]
Archive of layout-test-results from ews114 for mac-sierra

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews114  Port: mac-sierra  Platform: Mac OS X 10.12.6
Comment 58 EWS Watchlist 2018-08-28 12:56:12 PDT
Comment on attachment 348309 [details]
Patch

Attachment 348309 [details] did not pass mac-ews (mac):
Output: https://webkit-queues.webkit.org/results/9013083

New failing tests:
http/tests/webAPIStatistics/font-load-data-collection.html
http/tests/webAPIStatistics/canvas-read-and-write-data-collection.html
http/tests/webAPIStatistics/navigator-functions-accessed-data-collection.html
http/tests/webAPIStatistics/screen-functions-accessed-data-collection.html
Comment 59 EWS Watchlist 2018-08-28 12:56:14 PDT
Created attachment 348326 [details]
Archive of layout-test-results from ews102 for mac-sierra

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews102  Port: mac-sierra  Platform: Mac OS X 10.12.6
Comment 60 Woodrow Wang 2018-08-28 14:53:38 PDT
(In reply to John Wilander from comment #55)
> Comment on attachment 348309 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=348309&action=review
> 
> Nice test infrastructure addition!
> 
> > Source/WebCore/css/CSSFontSelector.cpp:313
> > +        // This logs an attempted font load by setting the loadStatus boolean to true
> 
> This comment seems unnecessary.
I'll remove the unnecessary comments. 
> 
> > Source/WebCore/html/HTMLCanvasElement.cpp:706
> > +    // Records attempt to read canvas data, regardless of whether the read is successful or not.
> 
> Ditto.
> 
> > Source/WebCore/html/HTMLCanvasElement.cpp:739
> > +    // Records attempt to read canvas data, regardless of whether the read is successful or not.
> 
> Ditto.
> 
> > Source/WebCore/html/HTMLCanvasElement.cpp:772
> > +        // Records attempt to read canvas data, regardless of whether the read is successful or not.
> 
> Ditto.
> 
> > Source/WebCore/html/HTMLCanvasElement.cpp:789
> > +    // Records attempt to read canvas data, regardless of whether the read is successful or not.
> 
> Ditto.
> 
> > Source/WebCore/html/HTMLCanvasElement.cpp:806
> > +    // Records attempt to read canvas data, regardless of whether the read is successful or not.
> 
> Ditto.
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:248
> > +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())
> 
> I wonder if we should go with an ASSERT(document.frame()) instead. Or are we
> hitting known cases where there is no frame?
There exists some test cases where the document no longer has a frame, thus when the frame pointer is dereferenced later a seg fault occurs. 
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:250
> > +    if (!document.frame())
> 
> Duplicate logic.
Oops, forgot to remove this!
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:264
> > +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())
> 
> I wonder if we should go with an ASSERT(document.frame()) instead. Or are we
> hitting known cases where there is no frame?
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:276
> > +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())
> 
> Ditto.
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:292
> > +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())
> 
> Ditto.
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:308
> > +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())
> 
> Ditto.
> 
> > Source/WebCore/loader/ResourceLoadStatistics.h:111
> > +    HashSet<String> navigatorFunctionsAccessed;
> 
> Could the functions be represented as an enum, declared here? That way we
> don't have to keep multiple copies of strings in memory and on disk. It
> would also provide a concise list of what we're monitoring.

After discussing with Dan Bates, I will look into storing which functions are accessed in a bitmask in an OptionSet<NavigatorFunctionsNames>. In the dumping of the resource load statistics map, the bitmask can be reconverted to the appropriate strings for debugging. 
> 
> > Source/WebCore/loader/ResourceLoadStatistics.h:112
> > +    HashSet<String> screenFunctionsAccessed;
> 
> Similar for these, i.e. enum.
> 
> > Source/WebCore/page/Navigator.cpp:78
> > +    ResourceLoadObserver::shared().logNavigatorAPIAccessed(*m_frame->document(), "appVersion");
> 
> This would be nicer with an enum for the function.
> 
> > Source/WebCore/page/Navigator.cpp:91
> > +    ResourceLoadObserver::shared().logNavigatorAPIAccessed(*m_frame->document(), "userAgent");
> 
> Ditto.
> 
> > Source/WebCore/page/Navigator.cpp:112
> > +        ResourceLoadObserver::shared().logNavigatorAPIAccessed(*m_frame->document(), "plugins");
> 
> Ditto.
> 
> > Source/WebCore/page/Navigator.cpp:123
> > +        ResourceLoadObserver::shared().logNavigatorAPIAccessed(*m_frame->document(), "mimeTypes");
> 
> Ditto.
> 
> > Source/WebCore/page/Navigator.cpp:136
> > +    ResourceLoadObserver::shared().logNavigatorAPIAccessed(*m_frame->document(), "cookieEnabled");
> 
> Ditto.
> 
> > Source/WebCore/page/Navigator.cpp:155
> > +    ResourceLoadObserver::shared().logNavigatorAPIAccessed(*m_frame->document(), "javaEnabled");
> 
> Ditto.
> 
> > Source/WebCore/page/Screen.cpp:53
> > +    ResourceLoadObserver::shared().logScreenAPIAccessed(*m_frame->document(), "height");
> 
> Ditto.
> 
> > Source/WebCore/page/Screen.cpp:64
> > +    ResourceLoadObserver::shared().logScreenAPIAccessed(*m_frame->document(), "width");
> 
> Ditto.
> 
> > Source/WebCore/page/Screen.cpp:75
> > +    ResourceLoadObserver::shared().logScreenAPIAccessed(*m_frame->document(), "colorDepth");
> 
> Ditto.
> 
> > Source/WebCore/page/Screen.cpp:85
> > +    ResourceLoadObserver::shared().logScreenAPIAccessed(*m_frame->document(), "pixelDepth");
> 
> Ditto.
> 
> > Source/WebCore/page/Screen.cpp:95
> > +    ResourceLoadObserver::shared().logScreenAPIAccessed(*m_frame->document(), "availLeft");
> 
> Ditto.
> 
> > Source/WebCore/page/Screen.cpp:105
> > +    ResourceLoadObserver::shared().logScreenAPIAccessed(*m_frame->document(), "availTop");
> 
> Ditto.
> 
> > Source/WebCore/page/Screen.cpp:115
> > +    ResourceLoadObserver::shared().logScreenAPIAccessed(*m_frame->document(), "availHeight");
> 
> Ditto.
> 
> > Source/WebCore/page/Screen.cpp:125
> > +    ResourceLoadObserver::shared().logScreenAPIAccessed(*m_frame->document(), "availWidth");
> 
> Ditto.
Comment 61 Woodrow Wang 2018-08-30 11:10:25 PDT
Created attachment 348512 [details]
Patch
Comment 62 John Wilander 2018-08-30 11:26:02 PDT
Comment on attachment 348512 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=348512&action=review

Looks good to me with a couple of minor comments below. Did you look at what the plist output looks like? Have you done at least a manual test that a relaunch reads back the correct values?

> Source/WebCore/loader/CanvasActivityRecord.cpp:29
> +const unsigned textCapacity = 10;

Could you come up with a brief code comment that explains why this is 10?

> Source/WebCore/loader/ResourceLoadStatistics.cpp:405
> +    appendBoolean(builder, "grandfathered", grandfathered);

Was this change deliberate? I think this indentation was done for readability. Was it wrong? Or are you making the change to make the output machine readable?

> Source/WebCore/loader/ResourceLoadStatistics.cpp:426
> +    appendBoolean(builder, "isVeryPrevalentResource", isVeryPrevalentResource);

Ditto.

> Source/WebCore/loader/ResourceLoadStatistics.h:131
> +    OptionSet<ScreenFunctionName> screenFunctionsAccessed;

These enums are a great improvement.
Comment 63 Woodrow Wang 2018-08-30 11:59:21 PDT
Created attachment 348523 [details]
Patch
Comment 64 Woodrow Wang 2018-08-30 12:08:54 PDT
(In reply to John Wilander from comment #62)
> Comment on attachment 348512 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=348512&action=review
> 
> Looks good to me with a couple of minor comments below. Did you look at what
> the plist output looks like? Have you done at least a manual test that a
> relaunch reads back the correct values?
> 
> > Source/WebCore/loader/CanvasActivityRecord.cpp:29
> > +const unsigned textCapacity = 10;
> 
> Could you come up with a brief code comment that explains why this is 10?
Yes, I'll add a comment to the code, but I found when analyzing the data that it is sufficient to see just a few strings that a certain registrable domain writes to canvas to determine whether or not they are a prevalent web API user. Also, limiting the size of the HashSet would keep the information on the plist more compact and prevent bloating it from a site that may use the canvas to write many strings in a legitimate way.  
> 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:405
> > +    appendBoolean(builder, "grandfathered", grandfathered);

I made a few changes to the format of the output of toString() for each resourceLoadStatistics object. The appendBoolean function already adds 4 spaces, so the old code added an extra 4 spaces, unless it was deliberate to have "grandfathered" be indented to the right under "mostRecentUserInteraction". See the test expectations below for the format of the dumping of the resourceLoadStatistics. 
> 
> Was this change deliberate? I think this indentation was done for
> readability. Was it wrong? Or are you making the change to make the output
> machine readable?
> 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:426
> > +    appendBoolean(builder, "isVeryPrevalentResource", isVeryPrevalentResource);
> 
> Ditto.
> 
> > Source/WebCore/loader/ResourceLoadStatistics.h:131
> > +    OptionSet<ScreenFunctionName> screenFunctionsAccessed;
> 
> These enums are a great improvement.
Comment 65 Woodrow Wang 2018-08-30 12:23:44 PDT
Created attachment 348528 [details]
Patch
Comment 66 Woodrow Wang 2018-08-30 13:57:14 PDT
Created attachment 348542 [details]
Patch
Comment 67 Daniel Bates 2018-08-30 16:56:55 PDT
Comment on attachment 348542 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=348542&action=review

> Source/WebCore/loader/ResourceLoadStatistics.h:112
> +    enum class NavigatorFunctionName : uint64_t {

Maybe a better name for this enum would be NavigatorAPI?

> Source/WebCore/loader/ResourceLoadStatistics.h:120
> +    enum class ScreenFunctionName : uint64_t {

Maybe a better name for this enum would be ScreenAPI?

> Source/WebCore/platform/KeyedCoding.h:45
> +#if PLATFORM(MAC) || PLATFORM(IOS)

I do not see a need to guard this function.

> Source/WebCore/platform/KeyedCoding.h:156
> +#if PLATFORM(MAC) || PLATFORM(IOS)

Ditto.

> Source/WebCore/platform/cf/KeyedDecoderCF.cpp:85
> +#if PLATFORM(MAC) || PLATFORM(IOS)

Ditto.

> Source/WebCore/platform/cf/KeyedDecoderCF.cpp:92
> +    return CFNumberGetValue(number, kCFNumberLongLongType, &result);

According to <https://developer.apple.com/documentation/corefoundation/cfnumbertype/kcfnumberlonglongtype> kCFNumberLongLongType corresponds to long long. This type is neither unsigned nor is its width guaranteed to be 64 bits. The C/C++ standards only guaranteed it to be at least 64 bits wide.

> Source/WebCore/platform/cf/KeyedDecoderCF.h:44
> +#if PLATFORM(MAC) || PLATFORM(IOS)

Ditto.

> Source/WebKit/UIProcess/API/C/WKWebsiteDataStoreRef.cpp:154
> +    

Please remove the leading whitespace on this line.

> Source/WebKit/UIProcess/API/C/WKWebsiteDataStoreRef.cpp:155
> +    store->dumpResourceLoadStatistics([context, callback](String dumpResourceLoadStatistics) {

Nit: Please put a space between the capture list and the parameter list.

> Source/WebKit/UIProcess/ResourceLoadStatisticsMemoryStore.cpp:679
> +    StringBuilder dumpNavigatorString;

Can we come up with a better name for this variable?

> Source/WebKit/UIProcess/ResourceLoadStatisticsMemoryStore.cpp:682
> +    dumpNavigatorString.appendLiteral("Begin resource load statistics map dump: ");
> +    dumpNavigatorString.append('\n');
> +    dumpNavigatorString.append('\n');

I would just emit "Resource Load Statistics:\n\n".

> Source/WebKit/UIProcess/ResourceLoadStatisticsMemoryStore.cpp:685
> +    dumpNavigatorString.appendLiteral("End resource load statistics map dump.");

I do not see the need for this as the statistics are always written to the end of the file.

> Source/WebKit/UIProcess/WebResourceLoadStatisticsStore.cpp:502
> +        String dumpResourceLoadStatistics = m_memoryStore ? m_memoryStore->dumpResourceLoadStatistics() : "";

"" => emptyString()

> Tools/WebKitTestRunner/TestInvocation.cpp:259
> +    if (m_shouldDumpResourceLoadStatistics) {
> +        if (m_dumpResourceLoadStatistics.isEmpty()) {
> +            WKStringRef dumpResourceLoadStatistics = TestController::singleton().dumpResourceLoadStatistics();
> +            m_dumpResourceLoadStatistics.append(toWTFString(dumpResourceLoadStatistics));
> +        }
> +        m_textOutput.append(m_dumpResourceLoadStatistics);
> +    }

It is unnecessary to append to m_dumpResourceLoadStatistics when m_dumpResourceLoadStatistics is empty. I would simplify this code to read:

if (m_shouldDumpResourceLoadStatistics)
    m_textOutput.append(m_dumpResourceLoadStatistics.isEmpty() ? toWTFString(TestController::singleton().dumpResourceLoadStatistics()) : m_dumpResourceLoadStatistics);

> Tools/WebKitTestRunner/TestInvocation.cpp:260
> +    

Please remove the leading whitespace.

> Tools/WebKitTestRunner/TestInvocation.cpp:1082
> +        

Please remove the leading whitespace.

> Tools/WebKitTestRunner/TestInvocation.h:138
> +    StringBuilder m_dumpResourceLoadStatistics;

Can we come up with a more descriptive name for the this variable? The purpose of this variable is to save the resource load statistics dump when we get the StatisticsResetToConsistentState message. Maybe a better name for this variable would be m_savedResourceLoadStatistics?

> LayoutTests/http/tests/webAPIStatistics/canvas-read-and-write-data-collection-expected.txt:9
> +Tests for canvas read and write data collection in ResourceLoadStatistics plist by rendering and reading text on the canvas and dumping the entire resource load statistics map.
> +
> +On success, you will see a series of "PASS" messages, followed by "TEST COMPLETE".
> +
> +
> +PASS successfullyParsed is true
> +
> +TEST COMPLETE
> +

All of these tests, including this one, are underutilizing the design choice of having the dumped statistics concatenated to the end of the test output. If this is the norm for such tests then I suggest that we make testRunner.setDumpResourceLoadStatistics() behave like testRunner.dumpDOMAsWebArchive() and have it opt-into a code path where we only dump the statistics.

> LayoutTests/http/tests/webAPIStatistics/canvas-read-and-write-data-collection.html:2
> +<html lang="en">

Is the HTML attribute lang needed?

> LayoutTests/http/tests/webAPIStatistics/navigator-functions-accessed-data-collection.html:2
> +<html lang="en">

Is the HTML attribute lang needed?

> LayoutTests/http/tests/webAPIStatistics/screen-functions-accessed-data-collection.html:2
> +<html lang="en">

Is the HTML attribute lang needed?
Comment 68 Woodrow Wang 2018-08-31 11:17:35 PDT
Created attachment 348653 [details]
Patch
Comment 69 Woodrow Wang 2018-08-31 11:23:01 PDT
(In reply to Daniel Bates from comment #67)
> Comment on attachment 348542 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=348542&action=review
> 
> > Source/WebCore/loader/ResourceLoadStatistics.h:112
> > +    enum class NavigatorFunctionName : uint64_t {
> 
> Maybe a better name for this enum would be NavigatorAPI?

Agreed, changing the name!
> 
> > Source/WebCore/loader/ResourceLoadStatistics.h:120
> > +    enum class ScreenFunctionName : uint64_t {
> 
> Maybe a better name for this enum would be ScreenAPI?

Agreed!
> 
> > Source/WebCore/platform/KeyedCoding.h:45
> > +#if PLATFORM(MAC) || PLATFORM(IOS)
> 
> I do not see a need to guard this function.

I guarded the function as it was a pure virtual and thus required to be implemented by derived classes. I've now implemented the function in the derived classes to remove the need for the guard. 
> 
> > Source/WebCore/platform/KeyedCoding.h:156
> > +#if PLATFORM(MAC) || PLATFORM(IOS)
> 
> Ditto.
> 
> > Source/WebCore/platform/cf/KeyedDecoderCF.cpp:85
> > +#if PLATFORM(MAC) || PLATFORM(IOS)
> 
> Ditto.
> 
> > Source/WebCore/platform/cf/KeyedDecoderCF.cpp:92
> > +    return CFNumberGetValue(number, kCFNumberLongLongType, &result);
> 
> According to
> <https://developer.apple.com/documentation/corefoundation/cfnumbertype/
> kcfnumberlonglongtype> kCFNumberLongLongType corresponds to long long. This
> type is neither unsigned nor is its width guaranteed to be 64 bits. The
> C/C++ standards only guaranteed it to be at least 64 bits wide.

I will change this to use a fixed width 64 bit type. 
> 
> > Source/WebCore/platform/cf/KeyedDecoderCF.h:44
> > +#if PLATFORM(MAC) || PLATFORM(IOS)
> 
> Ditto.
> 
> > Source/WebKit/UIProcess/API/C/WKWebsiteDataStoreRef.cpp:154
> > +    
> 
> Please remove the leading whitespace on this line.

Done. 
> 
> > Source/WebKit/UIProcess/API/C/WKWebsiteDataStoreRef.cpp:155
> > +    store->dumpResourceLoadStatistics([context, callback](String dumpResourceLoadStatistics) {
> 
> Nit: Please put a space between the capture list and the parameter list.

Done. 
> 
> > Source/WebKit/UIProcess/ResourceLoadStatisticsMemoryStore.cpp:679
> > +    StringBuilder dumpNavigatorString;
> 
> Can we come up with a better name for this variable?

Changed to dumpResourceLoadStatisticsMapString
> 
> > Source/WebKit/UIProcess/ResourceLoadStatisticsMemoryStore.cpp:682
> > +    dumpNavigatorString.appendLiteral("Begin resource load statistics map dump: ");
> > +    dumpNavigatorString.append('\n');
> > +    dumpNavigatorString.append('\n');
> 
> I would just emit "Resource Load Statistics:\n\n".

Done. 
> 
> > Source/WebKit/UIProcess/ResourceLoadStatisticsMemoryStore.cpp:685
> > +    dumpNavigatorString.appendLiteral("End resource load statistics map dump.");
> 
> I do not see the need for this as the statistics are always written to the
> end of the file.

Removed this line. 
> 
> > Source/WebKit/UIProcess/WebResourceLoadStatisticsStore.cpp:502
> > +        String dumpResourceLoadStatistics = m_memoryStore ? m_memoryStore->dumpResourceLoadStatistics() : "";
> 
> "" => emptyString()

Done. 
> 
> > Tools/WebKitTestRunner/TestInvocation.cpp:259
> > +    if (m_shouldDumpResourceLoadStatistics) {
> > +        if (m_dumpResourceLoadStatistics.isEmpty()) {
> > +            WKStringRef dumpResourceLoadStatistics = TestController::singleton().dumpResourceLoadStatistics();
> > +            m_dumpResourceLoadStatistics.append(toWTFString(dumpResourceLoadStatistics));
> > +        }
> > +        m_textOutput.append(m_dumpResourceLoadStatistics);
> > +    }
> 
> It is unnecessary to append to m_dumpResourceLoadStatistics when
> m_dumpResourceLoadStatistics is empty. I would simplify this code to read:
> 
> if (m_shouldDumpResourceLoadStatistics)
>     m_textOutput.append(m_dumpResourceLoadStatistics.isEmpty() ?
> toWTFString(TestController::singleton().dumpResourceLoadStatistics()) :
> m_dumpResourceLoadStatistics);
> 

Changed code to use the ternary line. 
> > Tools/WebKitTestRunner/TestInvocation.cpp:260
> > +    
> 
> Please remove the leading whitespace.
> 
Done. 
> > Tools/WebKitTestRunner/TestInvocation.cpp:1082
> > +        
> 
> Please remove the leading whitespace.
Done. 
> 
> > Tools/WebKitTestRunner/TestInvocation.h:138
> > +    StringBuilder m_dumpResourceLoadStatistics;
> 
> Can we come up with a more descriptive name for the this variable? The
> purpose of this variable is to save the resource load statistics dump when
> we get the StatisticsResetToConsistentState message. Maybe a better name for
> this variable would be m_savedResourceLoadStatistics?

Changed to use m_savedResourceLoadStatistics
> 
> > LayoutTests/http/tests/webAPIStatistics/canvas-read-and-write-data-collection-expected.txt:9
> > +Tests for canvas read and write data collection in ResourceLoadStatistics plist by rendering and reading text on the canvas and dumping the entire resource load statistics map.
> > +
> > +On success, you will see a series of "PASS" messages, followed by "TEST COMPLETE".
> > +
> > +
> > +PASS successfullyParsed is true
> > +
> > +TEST COMPLETE
> > +
> 
> All of these tests, including this one, are underutilizing the design choice
> of having the dumped statistics concatenated to the end of the test output.
> If this is the norm for such tests then I suggest that we make
> testRunner.setDumpResourceLoadStatistics() behave like
> testRunner.dumpDOMAsWebArchive() and have it opt-into a code path where we
> only dump the statistics.

Future tests will likely test if preventative measures are properly invoked on prevalent web API users and it would be nice to have a dump of the resource load statistics at the end to confirm data was collected properly. 
> 
> > LayoutTests/http/tests/webAPIStatistics/canvas-read-and-write-data-collection.html:2
> > +<html lang="en">
> 
> Is the HTML attribute lang needed?

Removed. 
> 
> > LayoutTests/http/tests/webAPIStatistics/navigator-functions-accessed-data-collection.html:2
> > +<html lang="en">
> 
> Is the HTML attribute lang needed?
> 
> > LayoutTests/http/tests/webAPIStatistics/screen-functions-accessed-data-collection.html:2
> > +<html lang="en">
> 
> Is the HTML attribute lang needed?
Comment 70 Woodrow Wang 2018-08-31 12:06:25 PDT
Created attachment 348660 [details]
Patch
Comment 71 Woodrow Wang 2018-08-31 17:03:58 PDT
Created attachment 348682 [details]
Patch
Comment 72 Woodrow Wang 2018-08-31 17:55:10 PDT
Created attachment 348693 [details]
Patch
Comment 73 Woodrow Wang 2018-09-05 14:39:16 PDT
Created attachment 348964 [details]
Patch
Comment 74 Radar WebKit Bug Importer 2018-09-05 14:46:09 PDT
<rdar://problem/44155162>
Comment 75 Woodrow Wang 2018-09-05 15:02:34 PDT
Created attachment 348968 [details]
Patch
Comment 76 Woodrow Wang 2018-09-05 16:08:56 PDT
Created attachment 348976 [details]
Patch
Comment 77 Simon Fraser (smfr) 2018-09-05 16:31:23 PDT
We don't generally allow research experiments to land in WebKit. It adds complexity, and is only used by a few people, often bit-rotting. Why is this any different?

If you think it should land, please post to webkit-dev@lists.webkit.org and argue your case.
Comment 78 Brent Fulgham 2018-09-05 16:33:35 PDT
(In reply to Simon Fraser (smfr) from comment #77)
> We don't generally allow research experiments to land in WebKit. It adds
> complexity, and is only used by a few people, often bit-rotting. Why is this
> any different?
> 
> If you think it should land, please post to webkit-dev@lists.webkit.org and
> argue your case.

Woody is working on this with me. Please see me if you would like to talk about this further.
Comment 79 Simon Fraser (smfr) 2018-09-05 17:07:17 PDT
(In reply to Simon Fraser (smfr) from comment #77)
> We don't generally allow research experiments to land in WebKit. It adds
> complexity, and is only used by a few people, often bit-rotting. Why is this
> any different?
> 
> If you think it should land, please post to webkit-dev@lists.webkit.org and
> argue your case.

Nevermind. The stanford.edu email threw me.
Comment 80 Chris Dumez 2018-09-06 09:20:12 PDT
Comment on attachment 348976 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=348976&action=review

> Source/WebCore/css/CSSFontFaceSource.cpp:187
> +                if (auto document = fontSelector->document())

nit: we prefer auto* assuming its returns a raw pointer.

> Source/WebCore/css/CSSFontSelector.cpp:403
> +    RefPtr<Font> font = FontCache::singleton().fontForFamily(fontDescription, m_document->settings().pictographFontFamily());

nit: auto

> Source/WebCore/css/CSSFontSelector.cpp:405
> +        if (m_document)

Unnecessary null check given that it is dereferenced right above.

> Source/WebCore/html/canvas/CanvasRenderingContext2D.cpp:369
> +        HTMLCanvasElement& canvas = this->canvas();

auto&

> Source/WebCore/html/canvas/CanvasRenderingContext2D.cpp:463
> +        HTMLCanvasElement& canvas = this->canvas();

unnecessary local variable.

> Source/WebCore/loader/CanvasActivityRecord.cpp:29
> +const unsigned textCapacity = 10;

I think the naming is not clear. Maybe something like maximumTextLengthToRecord ?

> Source/WebCore/loader/CanvasActivityRecord.cpp:36
> +    if (textWritten.size() < textCapacity && textWritten.add(text).isNewEntry)

Seems to me like this could get out of hand pretty quickly for some use cases. We should probably have a max number of entries in the HashSet as well.

> Source/WebCore/loader/ResourceLoadObserver.cpp:248
> +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())

We can probably drop the !document.frame() check if you use topDocument().

> Source/WebCore/loader/ResourceLoadObserver.cpp:254
> +    statistics.fontsLoaded.add(familyName);

Is it intentional that the font gets added to both fontsFailedToLoad and fontsLoaded when it fails to load?

> Source/WebCore/loader/ResourceLoadObserver.cpp:255
> +    auto mainFrameRegistrableDomain = primaryDomain(document.frame()->mainFrame().document()->url());

document.frame()->mainFrame().document()->url() -> document.topDocument().url()

> Source/WebCore/loader/ResourceLoadObserver.cpp:262
> +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())

We can probably drop the !document.frame() check if you use topDocument().

> Source/WebCore/loader/ResourceLoadObserver.cpp:266
> +    auto mainFrameRegistrableDomain = primaryDomain(document.frame()->mainFrame().document()->url());

topDocument()

> Source/WebCore/loader/ResourceLoadObserver.cpp:274
> +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())

We can probably drop the !document.frame() check if you use topDocument().

> Source/WebCore/loader/ResourceLoadObserver.cpp:279
> +    auto mainFrameRegistrableDomain = primaryDomain(document.frame()->mainFrame().document()->url());

topDocument()

> Source/WebCore/loader/ResourceLoadObserver.cpp:290
> +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())

We can probably drop the !document.frame() check if you use topDocument().

> Source/WebCore/loader/ResourceLoadObserver.cpp:295
> +    if (!statistics.navigatorFunctionsAccessed.contains(functionName)) {

Unnecessary double lookup? Can't use use to AddResult.isNewEntry?

> Source/WebCore/loader/ResourceLoadObserver.cpp:299
> +    auto mainFrameRegistrableDomain = primaryDomain(document.frame()->mainFrame().document()->url());

topDocument()

> Source/WebCore/loader/ResourceLoadObserver.cpp:308
> +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())

We can probably drop the !document.frame() check if you use topDocument().

> Source/WebCore/loader/ResourceLoadObserver.cpp:313
> +    if (!statistics.screenFunctionsAccessed.contains(functionName)) {

Unnecessary double lookup?

> Source/WebCore/loader/ResourceLoadObserver.cpp:317
> +    auto mainFrameRegistrableDomain = primaryDomain(document.frame()->mainFrame().document()->url());

topDocument()

> Source/WebCore/loader/ResourceLoadStatistics.cpp:52
> +static void encodeHashSet(KeyedEncoder& encoder, const String& label, const HashSet<String>& hashSet, const String& key)

Wouldn't it be clearer with hashSet / key parameters reversed?

> Source/WebCore/loader/ResourceLoadStatistics.cpp:62
> +static void encodeOriginHashSet(KeyedEncoder& encoder, const String& label, const HashSet<String>& hashSet)

Not sure this utility function brings much. Call sites could be updated to pass "origin".

> Source/WebCore/loader/ResourceLoadStatistics.cpp:77
> +static void encodeFontHashSet(KeyedEncoder& encoder, const String& label, const HashSet<String>& hashSet)

Call sites could be updated and this could be dropped.

> Source/WebCore/loader/ResourceLoadStatistics.cpp:162
> +static void decodeOriginHashSet(KeyedDecoder& decoder, const String& label, HashSet<String>& hashSet)

Call sites could be updated and this be dropped.

> Source/WebCore/loader/ResourceLoadStatistics.cpp:172
> +    optionSet.add(OptionSet<T>::fromRaw(optionSetBitMask));

doesn't this work?
optionSet = OptionSet<T>::fromRaw(optionSetBitMask);

> Source/WebCore/loader/ResourceLoadStatistics.cpp:175
> +static void decodeFontHashSet(KeyedDecoder& decoder, const String& label, HashSet<String>& hashSet)

Call sites could be updated and this be dropped.

> Source/WebCore/loader/ResourceLoadStatistics.cpp:309
> +static String navigatorEnumToString(ResourceLoadStatistics::NavigatorAPI navigatorEnum)

Bad naming IMO. maybe navigatorAPIEnumToString? or simply navigatorAPIToString().

> Source/WebCore/loader/ResourceLoadStatistics.cpp:313
> +        return "javaEnabled";

return ASCIILiteral("javaEnabled");

Ditto for other literal strings below.

> Source/WebCore/loader/ResourceLoadStatistics.cpp:324
> +    default:

Do we really need a default case? We usually omit it so that we get a compiling error if we fail to handle one of the enum values.

Ditto for other switch cases below.

> Source/WebCore/loader/ResourceLoadStatistics.cpp:329
> +static String screenEnumToString(ResourceLoadStatistics::ScreenAPI screenEnum)

Same comment about naming.

> Source/WebCore/loader/ResourceLoadStatistics.cpp:353
> +static void appendNavigatorOptionSet(StringBuilder& builder, const OptionSet<ResourceLoadStatistics::NavigatorAPI>& optionSet)

NavigatorAPIOptionSet ? Omitting "API" makes things confusing.

> Source/WebCore/loader/ResourceLoadStatistics.cpp:357
> +    builder.appendLiteral("    ");

builder.appendLiteral("    navigatorFunctionsAccessed:\n");

> Source/WebCore/loader/ResourceLoadStatistics.cpp:358
> +    builder.append("navigatorFunctionsAccessed:");

Not needed.

> Source/WebCore/loader/ResourceLoadStatistics.cpp:359
> +    builder.append('\n');

Not needed.

> Source/WebCore/loader/ResourceLoadStatistics.cpp:367
> +static void appendScreenOptionSet(StringBuilder& builder, const OptionSet<ResourceLoadStatistics::ScreenAPI>& optionSet)

Same comment about naming.

> Source/WebCore/loader/ResourceLoadStatistics.cpp:371
> +    builder.appendLiteral("    ");

Please do all 3 in one append, as suggested above.

> Source/WebCore/loader/ResourceLoadStatistics.cpp:374
> +    for (OptionSet<ResourceLoadStatistics::ScreenAPI>::iterator it = optionSet.begin(); it != optionSet.end(); ++it) {

Any reason you cannot use a modern C++ range loop? for (auto& screenAPI : optionSet)

> Source/WebCore/loader/ResourceLoadStatistics.h:108
> +    enum class NavigatorAPI : uint64_t {

Why uint64_t ?

> Source/WebCore/loader/ResourceLoadStatistics.h:116
> +    enum class ScreenAPI : uint64_t {

Why uint64_t ?
Comment 81 Woodrow Wang 2018-09-06 11:16:39 PDT
(In reply to Chris Dumez from comment #80)
> Comment on attachment 348976 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=348976&action=review
> 
> > Source/WebCore/css/CSSFontFaceSource.cpp:187
> > +                if (auto document = fontSelector->document())
> 
> nit: we prefer auto* assuming its returns a raw pointer.
Sounds good, will fix accordingly. 
> 
> > Source/WebCore/css/CSSFontSelector.cpp:403
> > +    RefPtr<Font> font = FontCache::singleton().fontForFamily(fontDescription, m_document->settings().pictographFontFamily());
> 
> nit: auto
> 
> > Source/WebCore/css/CSSFontSelector.cpp:405
> > +        if (m_document)
> 
> Unnecessary null check given that it is dereferenced right above.

Good point, I'll remove the unnecessary null check. 
> 
> > Source/WebCore/html/canvas/CanvasRenderingContext2D.cpp:369
> > +        HTMLCanvasElement& canvas = this->canvas();
> 
> auto&
> 
> > Source/WebCore/html/canvas/CanvasRenderingContext2D.cpp:463
> > +        HTMLCanvasElement& canvas = this->canvas();
> 
> unnecessary local variable.

I'll remove this local variable. 
> 
> > Source/WebCore/loader/CanvasActivityRecord.cpp:29
> > +const unsigned textCapacity = 10;
> 
> I think the naming is not clear. Maybe something like
> maximumTextLengthToRecord ?
> 
> > Source/WebCore/loader/CanvasActivityRecord.cpp:36
> > +    if (textWritten.size() < textCapacity && textWritten.add(text).isNewEntry)
> 
> Seems to me like this could get out of hand pretty quickly for some use
> cases. We should probably have a max number of entries in the HashSet as
> well.

The textCapacity variable is intended to be a max number of entries. I can rename that variable to maximumNumberOfStringsToRecord. 
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:248
> > +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())
> 
> We can probably drop the !document.frame() check if you use topDocument().

We would like to record these statistics on a frame by frame level as of now, so the top level document would not be appropriate for that design. 
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:254
> > +    statistics.fontsLoaded.add(familyName);
> 
> Is it intentional that the font gets added to both fontsFailedToLoad and
> fontsLoaded when it fails to load?

It is intentional, but after further thought, since fontsFailedToLoad is a subset of fontsLoaded, it may be more efficient just to keep a HashSet of fontsFailedToLoad and fontsSuccessfullyLoaded. 
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:255
> > +    auto mainFrameRegistrableDomain = primaryDomain(document.frame()->mainFrame().document()->url());
> 
> document.frame()->mainFrame().document()->url() ->
> document.topDocument().url()
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:262
> > +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())
> 
> We can probably drop the !document.frame() check if you use topDocument().
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:266
> > +    auto mainFrameRegistrableDomain = primaryDomain(document.frame()->mainFrame().document()->url());
> 
> topDocument()
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:274
> > +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())
> 
> We can probably drop the !document.frame() check if you use topDocument().
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:279
> > +    auto mainFrameRegistrableDomain = primaryDomain(document.frame()->mainFrame().document()->url());
> 
> topDocument()
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:290
> > +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())
> 
> We can probably drop the !document.frame() check if you use topDocument().
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:295
> > +    if (!statistics.navigatorFunctionsAccessed.contains(functionName)) {
> 
> Unnecessary double lookup? Can't use use to AddResult.isNewEntry?

Since navigatorFunctionsAccessed is an OptionSet, is there an addResult from using add on the OptionSet? My understanding is that the add function simply overloads a bitwise or operation, so I manually check if the entry is new and then set the boolean to update the resource load statistics. 
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:299
> > +    auto mainFrameRegistrableDomain = primaryDomain(document.frame()->mainFrame().document()->url());
> 
> topDocument()
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:308
> > +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())
> 
> We can probably drop the !document.frame() check if you use topDocument().
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:313
> > +    if (!statistics.screenFunctionsAccessed.contains(functionName)) {
> 
> Unnecessary double lookup?
> 
> > Source/WebCore/loader/ResourceLoadObserver.cpp:317
> > +    auto mainFrameRegistrableDomain = primaryDomain(document.frame()->mainFrame().document()->url());
> 
> topDocument()
> 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:52
> > +static void encodeHashSet(KeyedEncoder& encoder, const String& label, const HashSet<String>& hashSet, const String& key)
> 
> Wouldn't it be clearer with hashSet / key parameters reversed?

I agree it would be clearer. The current order is mainly because I added the key parameter to the existing function and figured it would convolute old call sites less if the parameter were just appended to the end; however, there are only a few other call sites, so I can easily make the change without significant refactoring. 
> 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:62
> > +static void encodeOriginHashSet(KeyedEncoder& encoder, const String& label, const HashSet<String>& hashSet)
> 
> Not sure this utility function brings much. Call sites could be updated to
> pass "origin".

After some discussion with Dan Bates, I figured there would be less chance of error if we provided functions that passed the proper string itself, such as "origin" or "font". Otherwise, if the programmer mistakenly inputs a string with a minor difference, such as "Origin" or a misspelling, the plist would have erroneous data. The utility functions are intended to ensure compliance with the keys in the existing plist infrastructure. An alternative would be to use an enum, but the added utility functions benefit in not needing to add logic convert the enum to the appropriate string for the keyed encoding/decoding. 
> 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:77
> > +static void encodeFontHashSet(KeyedEncoder& encoder, const String& label, const HashSet<String>& hashSet)
> 
> Call sites could be updated and this could be dropped.
> 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:162
> > +static void decodeOriginHashSet(KeyedDecoder& decoder, const String& label, HashSet<String>& hashSet)
> 
> Call sites could be updated and this be dropped.
> 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:172
> > +    optionSet.add(OptionSet<T>::fromRaw(optionSetBitMask));
> 
> doesn't this work?
> optionSet = OptionSet<T>::fromRaw(optionSetBitMask);
> 

Yes, that should work. The current code would combine the existing OptionSet and the decoded OptionSet, but when decoding, we would want to simply assign the OptionSet in memory to the decoded OptionSet. 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:175
> > +static void decodeFontHashSet(KeyedDecoder& decoder, const String& label, HashSet<String>& hashSet)
> 
> Call sites could be updated and this be dropped.
> 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:309
> > +static String navigatorEnumToString(ResourceLoadStatistics::NavigatorAPI navigatorEnum)
> 
> Bad naming IMO. maybe navigatorAPIEnumToString? or simply
> navigatorAPIToString().

To make it clear that an enum is being converted to a String, I will rename the respective functions to *APIEnumToString. 
> 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:313
> > +        return "javaEnabled";
> 
> return ASCIILiteral("javaEnabled");

I will change this accordingly. 
> 
> Ditto for other literal strings below.
> 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:324
> > +    default:
> 
> Do we really need a default case? We usually omit it so that we get a
> compiling error if we fail to handle one of the enum values.

I did not know about this convention, but that makes sense. I will remove the default values. 
> 
> Ditto for other switch cases below.
> 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:329
> > +static String screenEnumToString(ResourceLoadStatistics::ScreenAPI screenEnum)
> 
> Same comment about naming.
> 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:353
> > +static void appendNavigatorOptionSet(StringBuilder& builder, const OptionSet<ResourceLoadStatistics::NavigatorAPI>& optionSet)
> 
> NavigatorAPIOptionSet ? Omitting "API" makes things confusing.

I will fix this. 
> 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:357
> > +    builder.appendLiteral("    ");
> 
> builder.appendLiteral("    navigatorFunctionsAccessed:\n");

I'll condense the string building. 
> 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:358
> > +    builder.append("navigatorFunctionsAccessed:");
> 
> Not needed.
> 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:359
> > +    builder.append('\n');
> 
> Not needed.
> 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:367
> > +static void appendScreenOptionSet(StringBuilder& builder, const OptionSet<ResourceLoadStatistics::ScreenAPI>& optionSet)
> 
> Same comment about naming.
> 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:371
> > +    builder.appendLiteral("    ");
> 
> Please do all 3 in one append, as suggested above.
> 
> > Source/WebCore/loader/ResourceLoadStatistics.cpp:374
> > +    for (OptionSet<ResourceLoadStatistics::ScreenAPI>::iterator it = optionSet.begin(); it != optionSet.end(); ++it) {
> 
> Any reason you cannot use a modern C++ range loop? for (auto& screenAPI :
> optionSet)

I wasn't aware I could do this, but I see now that OptionSet implements begin() and end() so I can use the range loop. Thanks for the tip!
> 
> > Source/WebCore/loader/ResourceLoadStatistics.h:108
> > +    enum class NavigatorAPI : uint64_t {
> 
> Why uint64_t ?

Since our OptionSet is represented under the hood by a uint64_t, we would like each enum to be a uint64_t as well to be of the same type. We choose the maximum width of 64 bits so that, if ever necessary, we could record up to 64 different navigator/screen API calls in the OptionSet/underlying bitmask. 
> 
> > Source/WebCore/loader/ResourceLoadStatistics.h:116
> > +    enum class ScreenAPI : uint64_t {
> 
> Why uint64_t ?
Comment 82 Chris Dumez 2018-09-06 12:04:34 PDT
Comment on attachment 348976 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=348976&action=review

>>> Source/WebCore/loader/ResourceLoadObserver.cpp:248
>>> +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())
>> 
>> We can probably drop the !document.frame() check if you use topDocument().
> 
> We would like to record these statistics on a frame by frame level as of now, so the top level document would not be appropriate for that design.

You use document.frame()->mainFrame().document()->url() below, which is clearly the topDocument's URL.
Comment 83 Chris Dumez 2018-09-06 12:07:45 PDT
(In reply to Chris Dumez from comment #82)
> Comment on attachment 348976 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=348976&action=review
> 
> >>> Source/WebCore/loader/ResourceLoadObserver.cpp:248
> >>> +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())
> >> 
> >> We can probably drop the !document.frame() check if you use topDocument().
> > 
> > We would like to record these statistics on a frame by frame level as of now, so the top level document would not be appropriate for that design.
> 
> You use document.frame()->mainFrame().document()->url() below, which is
> clearly the topDocument's URL.

My point was that it seemed you were null-checking document.frame() here becomes you dereference it below to get to top document's URL. If you switch to using topDocument() as I suggest to get the URL, you no longer dereference document.frame() and it would seem you no longer need to null check it.
Comment 84 Woodrow Wang 2018-09-06 12:18:13 PDT
(In reply to Chris Dumez from comment #83)
> (In reply to Chris Dumez from comment #82)
> > Comment on attachment 348976 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=348976&action=review
> > 
> > >>> Source/WebCore/loader/ResourceLoadObserver.cpp:248
> > >>> +    if (!shouldLog(document.sessionID().isEphemeral()) || !document.frame())
> > >> 
> > >> We can probably drop the !document.frame() check if you use topDocument().
> > > 
> > > We would like to record these statistics on a frame by frame level as of now, so the top level document would not be appropriate for that design.
> > 
> > You use document.frame()->mainFrame().document()->url() below, which is
> > clearly the topDocument's URL.
> 
> My point was that it seemed you were null-checking document.frame() here
> becomes you dereference it below to get to top document's URL. If you switch
> to using topDocument() as I suggest to get the URL, you no longer
> dereference document.frame() and it would seem you no longer need to null
> check it.

Ah, I misunderstood, as I thought you were suggesting only considering top level documents and not the original document where the API were accessed. In that case, using topDocument would indeed remove the need to null check the document. I'll need to double check the behavior when the document does not have a frame, and how using topDocument() will affect it.
Comment 85 Woodrow Wang 2018-09-07 10:37:45 PDT
Created attachment 349156 [details]
Patch
Comment 86 Woodrow Wang 2018-09-07 12:20:07 PDT
Created attachment 349175 [details]
Patch
Comment 87 Woodrow Wang 2018-09-07 14:38:56 PDT
Created attachment 349193 [details]
Patch
Comment 88 Woodrow Wang 2018-09-07 14:47:48 PDT
Created attachment 349196 [details]
Patch
Comment 89 Woodrow Wang 2018-09-10 09:55:48 PDT
Created attachment 349314 [details]
Patch
Comment 90 Brent Fulgham 2018-09-10 11:56:05 PDT
Comment on attachment 349314 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=349314&action=review

I think the patch looks good, with a few small adjustments. Also, please double-check that you meant to use 'familyForLookup' in CSSFontSelector::fontRangesForFamily

> Source/WebCore/ChangeLog:4
> +        https://bugs.webkit.org/show_bug.cgi?id=187773

Please include a Radar number here.

> Source/WebCore/css/CSSFontSelector.cpp:312
> +                ResourceLoadObserver::shared().logFontLoad(*m_document, familyName.string(), true);

Should this be familyForLookup? Couldn't it have been changed by resolveGenericFamily?

Also: Do we really want to log what WebKit figured out was the closest match to the script's request? Why not log the actual requested font -- that seems like it might have more signal than the value WebKit found as the closest match.

> Source/WebCore/css/CSSFontSelector.cpp:403
> +    auto font = FontCache::singleton().fontForFamily(fontDescription, m_document->settings().pictographFontFamily());

Personal preference, but I would only grab the pictographFontFamily once and use it in both places. But it probably doesn't really matter.

> Source/WebCore/loader/CanvasActivityRecord.cpp:38
> +    return false;

Maybe this should be:

if (textWritten.size() >= maximumNumberOfStringsToRecord)
    return false;

return textWritten.add(text).isNewEntry;

I don't like how the important part of the function (textWritten.add(...)) is kind of hidden at the end of the size check.
Comment 91 Woodrow Wang 2018-09-10 13:59:45 PDT
(In reply to Brent Fulgham from comment #90)
> Comment on attachment 349314 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=349314&action=review
> 
> I think the patch looks good, with a few small adjustments. Also, please
> double-check that you meant to use 'familyForLookup' in
> CSSFontSelector::fontRangesForFamily
> 
> > Source/WebCore/ChangeLog:4
> > +        https://bugs.webkit.org/show_bug.cgi?id=187773
> 
> Please include a Radar number here.
I'll add the radar number!
> 
> > Source/WebCore/css/CSSFontSelector.cpp:312
> > +                ResourceLoadObserver::shared().logFontLoad(*m_document, familyName.string(), true);
> 
> Should this be familyForLookup? Couldn't it have been changed by
> resolveGenericFamily?
> 
> Also: Do we really want to log what WebKit figured out was the closest match
> to the script's request? Why not log the actual requested font -- that seems
> like it might have more signal than the value WebKit found as the closest
> match.
> 
After further examination, yes we can use familyForLookup, as it will resolve something like -webkit-serif to Times if that's the system settings, for example. As seen in the test cases, I believe these font names are before any matching occurs, as we successfully record the font "notARealFont" in fontsFailedToLoad. 
> > Source/WebCore/css/CSSFontSelector.cpp:403
> > +    auto font = FontCache::singleton().fontForFamily(fontDescription, m_document->settings().pictographFontFamily());
> 
> Personal preference, but I would only grab the pictographFontFamily once and
> use it in both places. But it probably doesn't really matter.
Agreed, I'll add a local variable to fix this. 
> 
> > Source/WebCore/loader/CanvasActivityRecord.cpp:38
> > +    return false;
> 
> Maybe this should be:
> 
> if (textWritten.size() >= maximumNumberOfStringsToRecord)
>     return false;
> 
> return textWritten.add(text).isNewEntry;
> 
> I don't like how the important part of the function (textWritten.add(...))
> is kind of hidden at the end of the size check.
I'll make this change as well.
Comment 92 Woodrow Wang 2018-09-10 14:20:09 PDT
Created attachment 349328 [details]
Patch
Comment 93 Woodrow Wang 2018-09-10 14:45:17 PDT
Created attachment 349330 [details]
Patch
Comment 94 WebKit Commit Bot 2018-09-10 15:54:26 PDT
Comment on attachment 349330 [details]
Patch

Rejecting attachment 349330 [details] from commit-queue.

Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-02', 'apply-attachment', '--no-update', '--non-interactive', 349330, '--port=mac']" exit_code: 2 cwd: /Volumes/Data/EWS/WebKit

Logging in as commit-queue@webkit.org...
Fetching: https://bugs.webkit.org/attachment.cgi?id=349330&action=edit
Fetching: https://bugs.webkit.org/show_bug.cgi?id=187773&ctype=xml&excludefield=attachmentdata
Processing 1 patch from 1 bug.
Processing patch 349330 from bug 187773.
Fetching: https://bugs.webkit.org/attachment.cgi?id=349330
Failed to run "[u'/Volumes/Data/EWS/WebKit/Tools/Scripts/svn-apply', '--force']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit

Parsed 33 diffs from patch file(s).
patching file Source/WebCore/ChangeLog
Hunk #1 succeeded at 1 with fuzz 3.
patching file Source/WebKit/ChangeLog
Hunk #1 succeeded at 1 with fuzz 3.
patching file Source/WebCore/Sources.txt
patching file Source/WebCore/WebCore.xcodeproj/project.pbxproj
patching file Source/WebCore/css/CSSFontFaceSource.cpp
patching file Source/WebCore/css/CSSFontSelector.cpp
patching file Source/WebCore/html/HTMLCanvasElement.cpp
patching file Source/WebCore/html/canvas/CanvasRenderingContext2D.cpp
patching file Source/WebCore/loader/CanvasActivityRecord.cpp
patching file Source/WebCore/loader/CanvasActivityRecord.h
patching file Source/WebCore/loader/DocumentThreadableLoader.cpp
patching file Source/WebCore/loader/FrameLoader.cpp
patching file Source/WebCore/loader/ResourceLoadObserver.cpp
patching file Source/WebCore/loader/ResourceLoadObserver.h
patching file Source/WebCore/loader/ResourceLoadStatistics.cpp
patching file Source/WebCore/loader/ResourceLoadStatistics.h
patching file Source/WebCore/loader/ResourceTiming.cpp
patching file Source/WebCore/page/Navigator.cpp
patching file Source/WebCore/page/Screen.cpp
patching file Source/WebKit/Shared/WebCoreArgumentCoders.cpp
patching file Source/WebKit/UIProcess/ResourceLoadStatisticsMemoryStore.cpp
patching file LayoutTests/ChangeLog
Hunk #1 succeeded at 1 with fuzz 3.
patching file LayoutTests/TestExpectations
Hunk #1 FAILED at 2242.
1 out of 1 hunk FAILED -- saving rejects to file LayoutTests/TestExpectations.rej
patching file LayoutTests/http/tests/webAPIStatistics/canvas-read-and-write-data-collection-expected.txt
patching file LayoutTests/http/tests/webAPIStatistics/canvas-read-and-write-data-collection.html
patching file LayoutTests/http/tests/webAPIStatistics/font-load-data-collection-expected.txt
patching file LayoutTests/http/tests/webAPIStatistics/font-load-data-collection.html
patching file LayoutTests/http/tests/webAPIStatistics/navigator-functions-accessed-data-collection-expected.txt
patching file LayoutTests/http/tests/webAPIStatistics/navigator-functions-accessed-data-collection.html
patching file LayoutTests/http/tests/webAPIStatistics/screen-functions-accessed-data-collection-expected.txt
patching file LayoutTests/http/tests/webAPIStatistics/screen-functions-accessed-data-collection.html
patching file LayoutTests/platform/ios-wk2/TestExpectations
patching file LayoutTests/platform/mac-wk2/TestExpectations

Failed to run "[u'/Volumes/Data/EWS/WebKit/Tools/Scripts/svn-apply', '--force']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit

Full output: https://webkit-queues.webkit.org/results/9165532
Comment 95 Woodrow Wang 2018-09-10 16:28:33 PDT
Created attachment 349353 [details]
Patch
Comment 96 WebKit Commit Bot 2018-09-11 10:14:14 PDT
Comment on attachment 349353 [details]
Patch

Clearing flags on attachment: 349353

Committed r235897: <https://trac.webkit.org/changeset/235897>
Comment 97 Truitt Savell 2018-09-11 16:26:18 PDT
Looks like the new test introduced in https://trac.webkit.org/changeset/235897

is flakey: http/tests/webAPIStatistics/canvas-read-and-write-data-collection.html
from the diff looks like the test isn't outputting data. 

History:
https://webkit-test-results.webkit.org/dashboards/flakiness_dashboard.html#showAllRuns=true&tests=http%2Ftests%2FwebAPIStatistics%2Fcanvas-read-and-write-data-collection.html


Diff:


--- /Volumes/Data/slave/highsierra-release-tests-wk2/build/layout-test-results/http/tests/webAPIStatistics/canvas-read-and-write-data-collection-expected.txt
+++ /Volumes/Data/slave/highsierra-release-tests-wk2/build/layout-test-results/http/tests/webAPIStatistics/canvas-read-and-write-data-collection-actual.txt
@@ -9,22 +9,3 @@
 

 Resource load statistics:
 
-High level domain: 127.0.0.1
-    lastSeen: 0
-    hadUserInteraction: No
-    mostRecentUserInteraction: -1
-    grandfathered: No
-    isPrevalentResource: No
-    isVeryPrevalentResource: No
-    dataRecordsRemoved: 0
-    isMarkedForCookieBlocking: No
-    fontsSuccessfullyLoaded:
-        Helvetica
-        Times
-        Courier
-    topFrameRegistrableDomainsWhichAccessedWebAPIs:
-        127.0.0.1: 8
-    canvasTextWritten:
-        suspicious invisible text
-    canvasReadData: Yes
-
Comment 98 Woodrow Wang 2018-09-12 15:02:36 PDT
Working on a fix for stability issues of the tests in https://bugs.webkit.org/show_bug.cgi?id=189560