Bug 153575

Summary: [WK1] Gather some rudimentary statistics during resource load
Product: WebKit Reporter: Brent Fulgham <bfulgham>
Component: WebCore Misc.Assignee: Brent Fulgham <bfulgham>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, andersca, beidson, bfulgham, buildbot, cdumez, commit-queue, darin, dbates, esprehn+autocc, japhet, kangil.han, mitz, rniwa, sam, wilander
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Mac   
OS: All   
Bug Depends on:    
Bug Blocks: 154278    
Attachments:
Description Flags
Patch
none
Archive of layout-test-results from ews104 for mac-yosemite-wk2
none
Archive of layout-test-results from ews112 for mac-yosemite
none
Archive of layout-test-results from ews100 for mac-yosemite
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Updated based on Darin's comments.
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Disable write-to-disk in normal build. beidson: review+

Brent Fulgham
Reported 2016-01-27 17:28:44 PST
Simple POC trying to gather some statistics about resource loads. Do not review for landing -- I just don't want to lose this code.
Attachments
Patch (69.55 KB, patch)
2016-01-27 17:47 PST, Brent Fulgham
no flags
Archive of layout-test-results from ews104 for mac-yosemite-wk2 (1.49 MB, application/zip)
2016-01-27 20:39 PST, Build Bot
no flags
Archive of layout-test-results from ews112 for mac-yosemite (1.20 MB, application/zip)
2016-01-27 20:44 PST, Build Bot
no flags
Archive of layout-test-results from ews100 for mac-yosemite (1.40 MB, application/zip)
2016-01-27 20:47 PST, Build Bot
no flags
Patch (57.58 KB, patch)
2016-01-29 14:24 PST, Brent Fulgham
no flags
Patch (58.78 KB, patch)
2016-01-29 14:36 PST, Brent Fulgham
no flags
Patch (70.04 KB, patch)
2016-01-29 18:14 PST, Brent Fulgham
no flags
Patch (72.03 KB, patch)
2016-01-30 16:06 PST, Brent Fulgham
no flags
Patch (72.05 KB, patch)
2016-01-31 12:41 PST, Brent Fulgham
no flags
Updated based on Darin's comments. (69.61 KB, patch)
2016-02-01 21:35 PST, Brent Fulgham
no flags
Patch (60.01 KB, patch)
2016-02-02 12:51 PST, Brent Fulgham
no flags
Patch (66.76 KB, patch)
2016-02-02 17:41 PST, Brent Fulgham
no flags
Patch (66.68 KB, patch)
2016-02-03 11:54 PST, Brent Fulgham
no flags
Patch (59.93 KB, patch)
2016-02-08 15:28 PST, Brent Fulgham
no flags
Patch (62.65 KB, patch)
2016-02-10 16:32 PST, Brent Fulgham
no flags
Patch (77.41 KB, patch)
2016-02-12 13:26 PST, Brent Fulgham
no flags
Patch (74.57 KB, patch)
2016-02-12 17:24 PST, Brent Fulgham
no flags
Disable write-to-disk in normal build. (74.71 KB, patch)
2016-02-15 13:09 PST, Brent Fulgham
beidson: review+
Brent Fulgham
Comment 1 2016-01-27 17:47:58 PST
WebKit Commit Bot
Comment 2 2016-01-27 17:50:59 PST
Attachment 270069 [details] did not pass style-queue: ERROR: Source/WebCore/loader/cf/ResourceLoadStatisticsPersistenceCF.cpp:30: Streams are highly discouraged. [readability/streams] [3] Total errors found: 1 in 17 files If any of these errors are false positives, please file a bug against check-webkit-style.
Build Bot
Comment 3 2016-01-27 20:39:30 PST
Comment on attachment 270069 [details] Patch Attachment 270069 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/748187 Number of test failures exceeded the failure limit.
Build Bot
Comment 4 2016-01-27 20:39:35 PST
Created attachment 270083 [details] Archive of layout-test-results from ews104 for mac-yosemite-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews104 Port: mac-yosemite-wk2 Platform: Mac OS X 10.10.5
Build Bot
Comment 5 2016-01-27 20:44:34 PST
Comment on attachment 270069 [details] Patch Attachment 270069 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/748192 Number of test failures exceeded the failure limit.
Build Bot
Comment 6 2016-01-27 20:44:37 PST
Created attachment 270084 [details] Archive of layout-test-results from ews112 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews112 Port: mac-yosemite Platform: Mac OS X 10.10.5
Build Bot
Comment 7 2016-01-27 20:47:11 PST
Comment on attachment 270069 [details] Patch Attachment 270069 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/748205 Number of test failures exceeded the failure limit.
Build Bot
Comment 8 2016-01-27 20:47:15 PST
Created attachment 270085 [details] Archive of layout-test-results from ews100 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews100 Port: mac-yosemite Platform: Mac OS X 10.10.5
Brady Eidson
Comment 9 2016-01-27 23:13:21 PST
Comment on attachment 270069 [details] Patch I don't have time before bed to give a full review, but I can both resolve the style complaint (don't use streams!), simplify a lot of the CF-style code, and set this up to be more easily cross platform going forward: Use KeyedEncoder/Decoder instead of your own custom property list code. They already exist in WebCore/platform and the CF backend is already a property list.
Brent Fulgham
Comment 10 2016-01-29 14:24:58 PST
Brent Fulgham
Comment 11 2016-01-29 14:36:21 PST
Sam Weinig
Comment 12 2016-01-29 14:40:13 PST
What makes this Mac specific?
Brent Fulgham
Comment 13 2016-01-29 18:14:47 PST
Brent Fulgham
Comment 14 2016-01-29 18:15:38 PST
Currently this patch is only useful for WK1 measurements.
Brent Fulgham
Comment 15 2016-01-29 18:16:14 PST
(In reply to comment #12) > What makes this Mac specific? It's not. It's now cross-platform, but I've only bothered to do a WK1 integration for now.
Brent Fulgham
Comment 16 2016-01-30 16:06:57 PST
Brady Eidson
Comment 17 2016-01-30 20:17:14 PST
All patches here are marked obsolete, and none are for review. Intentional?
Brent Fulgham
Comment 18 2016-01-30 23:17:26 PST
(In reply to comment #17) > All patches here are marked obsolete, and none are for review. > Intentional? Yes. I found a bug in that patch that wild cause test failures, so I pulled it before anyone wasted time on it.
Brent Fulgham
Comment 19 2016-01-31 12:41:21 PST
Darin Adler
Comment 20 2016-01-31 19:18:40 PST
Comment on attachment 270344 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=270344&action=review This patch is interesting. I don’t understand the basic goals and techniques here. I can’t think of anything else that just writes things into the home directory, and it’s also a little peculiar to build the writing in at such a low level. > Source/WebCore/dom/Document.h:1676 > + bool m_userHasInteractedWithDocument { false }; This is unused. Please don’t add it. > Source/WebCore/dom/UserGestureIndicator.cpp:59 > + Document& topDocument = document->topDocument(); > + > + if (!topDocument.userHasInteractedWithDocument()) > + ResourceLoadObserver::sharedObserver().logUserInteraction(topDocument); > + > + topDocument.updateLastHandledUserGestureTimestamp(); Could this logic go into updateLastHandledUserGestureTimestamp instead of here? If so, then we could avoid touching Document.h at all. > Source/WebCore/loader/DocumentLoader.cpp:528 > + const URL& targetUrl = newRequest.url(); Should be targetURL. But also, not sure I see the benefit of having this local variable. > Source/WebCore/loader/DocumentLoader.cpp:529 > + const Ref<SecurityOrigin> targetOrigin(SecurityOrigin::create(targetUrl)); No need to use const here. Also, this local variable is computed but never used, so please don’t make this change. > Source/WebCore/loader/DocumentLoader.cpp:532 > + Frame& frame = frameLoader()->frame(); > + Frame& topFrame = m_frame->tree().top(); Not sure why we have both frame and m_frame here. Really hard to see how the two differ. > Source/WebCore/loader/DocumentLoader.cpp:534 > + ResourceLoadObserver::sharedObserver().logFrameNavigation(!redirectResponse.isNull(), frame.document()->url(), targetUrl, frame.isMainFrame(), topFrame.document()->url()); Do we have guarantees that both frame.document() and topFrame.document() are non-null? > Source/WebCore/loader/ResourceLoadObserver.cpp:61 > +void ResourceLoadObserver::logFrameNavigation(bool isRedirect, const URL& sourceUrl, const URL& targetUrl, bool isMainFrame, const URL& mainFrameUrl) sourceURL, targetURL, mainFrameURL This function is quite long. Please consider refactoring it into multiple smaller functions. > Source/WebCore/loader/ResourceLoadObserver.cpp:67 > + if (!targetUrl.isValid() || !mainFrameUrl.isValid()) > + return; Why check these two and not sourceURL? > Source/WebCore/loader/ResourceLoadObserver.cpp:70 > + const Ref<SecurityOrigin> targetOrigin(SecurityOrigin::create(targetUrl)); > + const Ref<SecurityOrigin> mainFrameOrigin(SecurityOrigin::create(mainFrameUrl)); // Note: Is the same as sourceOrigin if navigation happens in the top frame No need for the local variables to be const. I suggest just using auto for these: auto targetOrigin = SecurityOrigin::create(targetURL); > Source/WebCore/loader/ResourceLoadObserver.cpp:72 > + const String& targetOriginString = targetOrigin->toString(); I suggest the type String or auto rather than const String& here. > Source/WebCore/loader/ResourceLoadObserver.cpp:74 > + auto& targetOriginResourceStatistics = resourceStatisticsForPrimaryDomain(targetUrl); In the context of this function perhaps we can use shorter names. No need to keep saying "resourceStatistics" all the time. And to say "origin" all the time. I think targetStatistics would be a good name for the local variable. > Source/WebCore/loader/ResourceLoadObserver.cpp:76 > + NetworkStorageSession& session = NetworkStorageSession::defaultStorageSession(); I suggest auto&. > Source/WebCore/loader/ResourceLoadObserver.cpp:77 > + String cookieStringForTargetUrl = platformStrategies()->cookiesStrategy()->cookiesForDOM(session, targetUrl, targetUrl); Are we guaranteed these strategy pointers are non-null? We should change them to be references. I suggest you call this cookies rather than cookieStringForTargetUrl, unless there is some reason to constantly be reminded exactly which cookies. > Source/WebCore/loader/ResourceLoadObserver.cpp:91 > + size_t numberOfTimesSubFrameUnderThisMainFrameOrigin = targetOriginResourceStatistics.subFrameUnderTopFrameOrigins.get(mainFrameOrigin->toString()); > + targetOriginResourceStatistics.subFrameUnderTopFrameOrigins.set(mainFrameOrigin->toString(), numberOfTimesSubFrameUnderThisMainFrameOrigin + 1); This is an inefficient way to use a HashMap. I suggest we use a HashCountedSet instead if the whole point is to count something; it does this efficiently. > Source/WebCore/loader/ResourceLoadObserver.cpp:96 > + Ref<SecurityOrigin> redirectingOrigin(SecurityOrigin::create(sourceUrl)); Again, I suggest using auto here for the type. > Source/WebCore/loader/ResourceLoadObserver.cpp:99 > + auto& redirectingOriginResourceStatistics = resourceStatisticsForPrimaryDomain(sourceUrl); How about "redirectingOriginStatistics". No need to keep repeating "resource". Or maybe sourceStatistics. > Source/WebCore/loader/ResourceLoadObserver.cpp:103 > + size_t numberOfTimesRedirectedToThisPrevalentOrigin = redirectingOriginResourceStatistics.redirectedToOtherPrevalentResourceOrigins.get(targetOriginString); > + redirectingOriginResourceStatistics.redirectedToOtherPrevalentResourceOrigins.set(targetOriginString, numberOfTimesRedirectedToThisPrevalentOrigin + 1); HashCountedSet would be better. > Source/WebCore/loader/ResourceLoadObserver.cpp:117 > + Ref<SecurityOrigin> sourceOrigin(SecurityOrigin::create(sourceUrl)); auto would be better > Source/WebCore/loader/ResourceLoadObserver.cpp:171 > + double totalVisited = std::max(static_cast<double>(m_originsVisitedMap.size()), 1.0); This will always be 1 since nothing is populating m_originsVisitedMap. There is no need to convert this to double before doing the math below. Only one of the numbers involved needs to be a double. > Source/WebCore/loader/ResourceLoadObserver.cpp:178 > + double totalVisited = std::max(static_cast<double>(m_originsVisitedMap.size()), 1.0); This will always be 1 since nothing is populating m_originsVisitedMap. There is no need to convert this to double before doing the math below. Only one of the numbers involved needs to be a double. > Source/WebCore/loader/ResourceLoadObserver.cpp:180 > + targetOriginResourceStatistics.subresourceHasBeenSubresourceCountDividedByTotalNumberOfOriginsVisited = static_cast<double>(targetOriginResourceStatistics.subresourceHasBeenSubresourceCount) / totalVisited; This code and much of the rest of this clause is copied and pasted. Please refactor to share more code. > Source/WebCore/loader/ResourceLoadObserver.cpp:191 > + auto& topOriginResourceStatistics = resourceStatisticsForPrimaryDomain(document.url()); Just "statistics" would be fine for this local variable. > Source/WebCore/loader/ResourceLoadObserver.cpp:198 > + UNUSED_PARAM(url); This is an ugly use of UNUSED_PARAM for a parameter that is not always unused. Would be nicer to do this below in an #else to make it clearer. > Source/WebCore/loader/ResourceLoadObserver.cpp:214 > + String primaryDomain = getPrimaryDomain(url); No need for the local variable here. > Source/WebCore/loader/ResourceLoadObserver.cpp:235 > +String ResourceLoadObserver::getPrimaryDomain(const URL& url) const WebKit coding style does not use "get" in function names like this. Why is this a const member function rather than a non-member function or a static member function? > Source/WebCore/loader/ResourceLoadObserver.cpp:242 > + String primaryDomain = ""; This is initialized in all branches below, so no need to initialize it here. Also, this is the slow way to initialize to the empty string. The fast way is emptyString(). > Source/WebCore/loader/ResourceLoadObserver.cpp:249 > + // Skip TLD and then up to two domains smaller than 4 characters This seems peculiar. Why not the real TLD rule? > Source/WebCore/loader/ResourceLoadObserver.cpp:263 > + // All parts are smaller than Strange comment here. > Source/WebCore/loader/ResourceLoadObserver.cpp:277 > + m_statisticsPersistence->writeDataToDisk(m_resourceStatisticsMap); Why no check for Settings::resourceLoadStatisticsEnabled here? > Source/WebCore/loader/ResourceLoadObserver.h:63 > + HashMap<String, size_t> m_originsVisitedMap; We want to fold case for origins, so this should be using ASCIICaseInsensitiveHash. I see no code writing to this map, just code calling size() on it. I have no idea what the value in the map is. A visit count maybe? > Source/WebCore/loader/ResourceLoadObserver.h:64 > + HashMap<String, std::unique_ptr<ResourceLoadStatistics>> m_resourceStatisticsMap; If the key here is an origin, then we should use ASCIICaseInsensitiveHash. Should just use ResourceLoadStatistics for the values in the map rather than using unique_ptr. > Source/WebCore/loader/ResourceLoadStatistics.cpp:45 > +// Magic numbers > +static size_t redirectedToOtherPrevalentResourceOriginLimit = 3; > + > +// Sub frame thresholds > +static size_t subFrameUnderTopFrameOriginsThreshold = 2; > +static size_t subFrameHasBeenRedirectedFromThreshold = 2; > +static size_t subFrameHasBeenRedirectedToThreshold = 3; > + > +// Subresource thresholds > +static size_t subresourceUnderTopFrameOriginsThreshold = 3; > +static size_t subresourceHasBeenRedirectedFromThreshold = 3; > +static size_t subresourceHasBeenRedirectedToThreshold = 5; > + > +static double subresourceHasBeenSubresourceCountDividedByTotalNumberOfOriginsVisitedThreshold = 1.5; > +static size_t minimumOriginsVisitedForPrevalenceClassification = 100; Why aren’t any of these marked const? > Source/WebCore/loader/ResourceLoadStatistics.cpp:47 > +bool ResourceLoadStatistics::shouldBecomePrevalentResource(const ResourceLoadStatistics& statistics, size_t originsVisitedSoFar) Why is this a static member function rather than a const member function? > Source/WebCore/loader/ResourceLoadStatistics.h:39 > + bool doesHavePrevalentRedirection() const; > + bool doesHavePrevalentFrameMembership() const; Should be named "has" rather than "doesHave" > Source/WebCore/loader/ResourceLoadStatistics.h:42 > + bool hasHadUserInteraction { false }; Should just be m_hadUserInteraction. > Source/WebCore/loader/ResourceLoadStatistics.h:49 > + uint32_t topFrameHasBeenRedirectedTo { 0 }; > + uint32_t topFrameHasBeenRedirectedFrom { 0 }; > + uint32_t topFrameInitialLoadCount { 0 }; > + uint32_t topFrameHasBeenNavigatedTo { 0 }; > + uint32_t topFrameHasBeenNavigatedFrom { 0 }; Why uint32_t? So specific. Elsewhere you were using size_t. And we normally use unsigned. > Source/WebCore/loader/ResourceLoadStatistics.h:71 > + HashMap<String, uint32_t> redirectedToOtherPrevalentResourceOrigins; Seems like this should use ASCIICaseInsensitiveHash. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:40 > +String ResourceLoadStatisticsPersistence::persistentStoragePath(const String& label) const I’m not sure I’m entirely comfortable with this kind of direct file system manipulation. Highly unusual for WebKit code. Is this the right idiom? > Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:52 > + String resourceLog(pwd.pw_dir); This is no the correct way to convert a file system path into a String. It won’t work if any characters are non-ASCII because it will treat them as Latin-1, not UTF-8. All the functions in FileSystemMac.mm seem to use String::fromUTF8 for this. And there's one named homeDirectoryPath(), so I suggest just using that! > Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:85 > + const auto& keys = resourceStatistics.subFrameUnderTopFrameOrigins.keys(); The const here is not all that helpful. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:86 > + encoder.encodeObjects("subFrameUnderTopFrameOrigins", keys.begin(), keys.end(), [&](KeyedEncoder& encoderInner, const String& origin) { Please consider naming the captured variable explicitly instead of using [&]. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:101 > + const auto& keys = resourceStatistics.subresourceUnderTopFrameOrigins.keys(); > + encoder.encodeObjects("subresourceUnderTopFrameOrigins", keys.begin(), keys.end(), [&](KeyedEncoder& encoderInner, const String& underTopFrameOrigin) { Ditto. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:110 > + const auto& keys = resourceStatistics.redirectedToOtherPrevalentResourceOrigins.keys(); > + encoder.encodeObjects("redirectedToOtherPrevalentResourceOrigins", keys.begin(), keys.end(), [&](KeyedEncoder& encoderInner, const String& otherPrevalentOrigin) { Ditto. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:125 > + ASSERT(storedOrigin == originString); > + UNUSED_PARAM(originString); ASSERT_UNUSED is designed for uses like this one. Please use it. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:183 > + Vector<String> ignore; > + decoder.decodeObjects("subFrameUnderTopFrameOrigins", ignore, [&](KeyedDecoder& decoderInner, String& origin) { > + if (!decoderInner.decodeString("origin", origin)) > + return false; > + > + uint32_t count; > + if (!decoderInner.decodeUInt32("count", count)) > + return false; > + > + resourceStatistics.subFrameUnderTopFrameOrigins.set(origin, count); > + return true; > + }); Looks like if the inner decoding fails, we won’t fail the outer decoding. Probably need to look at the return value from decodeObjects. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:204 > + decoder.decodeObjects("subresourceUnderTopFrameOrigins", ignore, [&](KeyedDecoder& decoderInner, String& origin) { Ditto. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:247 > + int64_t writtenBytes = writeToFile(handle, rawData->data(), rawData->size()); I suggest the type auto for writtenBytes instead of int64_t. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:251 > + if (writtenBytes != static_cast<int64_t>(rawData->size())) { > + deleteFile(resourceLog); Is it our error handling policy to delete the file if we can’t write it completely successfully a good one? Seems it would make things mysterious. How helpful is this anyway? Do we think partially written files are a likely error case and deleting those files will save us time in some way? > Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:252 > + return; The return here doesn’t do anything useful. This is the end of the function. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:260 > + String resourceLog = persistentStoragePath(label); > + > + RefPtr<SharedBuffer> rawData = SharedBuffer::createWithContentsOfFile(resourceLog); Would read better without the local variable, I think. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:278 > + String label = origin; > + label.replace('/', '_'); > + label.replace(':', '+'); What is this scheme about? FileSystem.h has a function named encodeForFileName that I think is intended to deal with things like this. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:310 > + browsingStatistics.set(origin, std::make_unique<ResourceLoadStatistics>()); > + auto& statistics = *browsingStatistics.get(origin); This is an inefficient way to add something to a set. Not sure exactly why it's being done like this but this does the same thing twice as fast: auto& statistics = browsingStatistics.add(origin, std::make_unique<ResourceLoadStatistics>()).iterator->value; > Source/WebCore/loader/ResourceLoadStatisticsPersistence.h:40 > +class KeyedDecoder; > +class KeyedEncoder; Not sure about the strategy of using KeyedEncoder/Decoder for this at all. I don’t think it’s designed for this kind of use. Please discuss with Anders. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.h:50 > + ResourceLoadStatisticsPersistence() > + { > + } > + > + ~ResourceLoadStatisticsPersistence() > + { > + } Just leave these out completely. Defining them like this has no effect. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.h:63 > + void writeDataToDisk(const String& originString, const ResourceLoadStatistics&) const; > + void writeDataToDisk(const HashMap<String, std::unique_ptr<ResourceLoadStatistics>>&) const; > + void readDataFromDisk(HashMap<String, std::unique_ptr<ResourceLoadStatistics>>&) const; > + > +private: > + void addOriginToEncoder(KeyedEncoder&, const String& origin, const ResourceLoadStatistics&) const; > + void writeEncoderToDisk(KeyedEncoder&, const String& label) const; > + > + bool readOriginFromDecoder(KeyedDecoder&, const String& origin, ResourceLoadStatistics&) const; > + std::unique_ptr<KeyedDecoder> createDecoderFromDisk(const String& label) const; > + > + String persistentStoragePath(const String& label) const; Since this class has no state, all its members could just be static member functions. Or these could just be non-member functions.
Darin Adler
Comment 21 2016-01-31 19:19:15 PST
Why is this patch marked review?
Brent Fulgham
Comment 22 2016-01-31 19:21:24 PST
(In reply to comment #21) > Why is this patch marked review? I spoke with Brady and others about this code, and we wanted to get it into the tree so more people could play with it. I'm waiting for Brady to do a review.
Darin Adler
Comment 23 2016-01-31 19:34:40 PST
OK, I was confused because above it says "Do not review for landing" and there was no comment about when that changed.
Brent Fulgham
Comment 24 2016-02-01 21:35:52 PST
Created attachment 270471 [details] Updated based on Darin's comments.
Brent Fulgham
Comment 25 2016-02-01 21:37:43 PST
Comment on attachment 270471 [details] Updated based on Darin's comments. This patch still isn't perfect, but is much better now. I would like to get Brady's review of the encoding logic, and hopefully Anders can give it a quick look.
Darin Adler
Comment 26 2016-02-02 09:03:00 PST
Comment on attachment 270471 [details] Updated based on Darin's comments. View in context: https://bugs.webkit.org/attachment.cgi?id=270471&action=review Many of my comments on the original patch were ignored. Here I repeat them. > Source/WebCore/dom/Document.h:1189 > + bool userHasInteractedWithDocument() const { return !!m_lastHandledUserGestureTimestamp; } No value to adding this function. It would be clearer if this rule ("0 means never stored a time stamp before") was implemented directly in the updateLastHandledUserGestureTimestamp. > Source/WebCore/loader/ResourceLoadObserver.cpp:42 > +#undef LOG_TO_FILE More conventional to use an always defined symbol and define it to either 0 or 1. > Source/WebCore/loader/ResourceLoadObserver.cpp:46 > +typedef HashMap<String, ResourceLoadStatistics>::iterator ResourceLoadStatisticsValue; Unused typedef should be removed. > Source/WebCore/loader/ResourceLoadObserver.cpp:72 > + const Ref<SecurityOrigin> targetOrigin(SecurityOrigin::create(targetURL)); > + const Ref<SecurityOrigin> mainFrameOrigin(SecurityOrigin::create(mainFrameURL)); // Note: Is the same as sourceOrigin if navigation happens in the top frame OK, but a little peculiar, to use const here. we have lots of other local variables that are const but we don’t state it explicitly. > Source/WebCore/loader/ResourceLoadObserver.cpp:74 > + const String& targetOriginString = targetOrigin->toString(); Clearer to have the type here be String or even auto. Using const String& is misleading because it’s going to be a reference to a temporary anyway. > Source/WebCore/loader/ResourceLoadObserver.cpp:105 > + targetStatistics.topFrameHasBeenRedirectedTo++; I think we should put the ++ at the start of the line on all of these. Makes it easier to spot, and kind of standard for C++ anyway. > Source/WebCore/loader/ResourceLoadObserver.cpp:123 > + } else if (!sourceOrigin.get().equal(&targetOrigin.get())) { We normally prefer targetOrigin.ptr() over &targetOrigin.get(). But also wondering why the equal function requires a pointer rather than a reference. > Source/WebCore/loader/ResourceLoadObserver.cpp:167 > + double totalVisited = std::max(static_cast<double>(m_originsVisitedMap.size()), 1.0); > + > + targetStatistics.subresourceHasBeenSubresourceCountDividedByTotalNumberOfOriginsVisited = static_cast<double>(targetStatistics.subresourceHasBeenSubresourceCount) / totalVisited; This is overdoing the "double" casting. We need only a single double here to make the floating point division use double, but instead we are making the size and the constant 1 both be double, and also casting the numerator to double. I suggest writing: auto totalVisited = std::max(m_originsVisitedMap.size(), 1U); Or something like that. No need to convert to double before doing the division. Might also be a better design to just track the numerator and denominator and do the math later when reporting. Not valuable to keep recomputing the ratio as the numbers change. > Source/WebCore/loader/ResourceLoadObserver.cpp:174 > + double totalVisited = std::max(static_cast<double>(m_originsVisitedMap.size()), 1.0); > + > + targetStatistics.subresourceHasBeenSubresourceCountDividedByTotalNumberOfOriginsVisited = static_cast<double>(targetStatistics.subresourceHasBeenSubresourceCount) / totalVisited; Ditto. > Source/WebCore/loader/ResourceLoadObserver.cpp:190 > +void ResourceLoadObserver::logIfPrevalentResource(const URL& url, ResourceLoadStatistics& statistics) Peculiar that this function has a side effect of setting the isPrevalentResource boolean given its name. Quite unclear. > Source/WebCore/loader/ResourceLoadObserver.cpp:255 > + primaryDomain = hostSplitOnDot.at(primaryDomainCutOffIndex); > + for (size_t j = primaryDomainCutOffIndex + 1; j < hostSplitOnDot.size(); j++) > + primaryDomain = primaryDomain + "." + hostSplitOnDot.at(j); Should not append strings in a loop like this. Super slow and inefficient. Should use StringBuilder for this sort of thing. > Source/WebCore/loader/ResourceLoadStatistics.cpp:37 > +// Sub frame thresholds > +static const unsigned subFrameUnderTopFrameOriginsThreshold = 2; > +static const unsigned subFrameHasBeenRedirectedFromThreshold = 2; > +static const unsigned subFrameHasBeenRedirectedToThreshold = 3; We should use the word “subframe”, not the two words “sub frame”, so the identifiers should not have a capital “F” and the comment should not have a space between “Sub” and “frame”. > Source/WebCore/loader/ResourceLoadStatistics.h:34 > +class ResourceLoadStatistics { This should be a struct, not a class. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:54 > + // Home Directory > + char buffer[4096]; > + int bufferSize = sizeof(buffer); > + struct passwd pwd; > + struct passwd* result = nullptr; > + if (getpwuid_r(getuid(), &pwd, buffer, bufferSize, &result) || !result) { > + WTFLogAlways("ResourceLoadStatisticsPersistence::persistentStoragePath: Couldn't find home directory\n"); > + return String(); > + } > + > + String resourceLog(pwd.pw_dir); Should use the homeDirectoryPath() function from FileSystem.h instead of reimplementing it. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:258 > + deleteFile(resourceLog); Still a slightly strange choice to delete the file in this case; not entirely sure what the benefit is. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:259 > + return; Strange to have a return here at the end of a function. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.h:42 > +class ResourceLoadStatisticsPersistence { This should not be a class. It has no state. If we do use a class all the function should be static member functions, not const member functions. > Source/WebCore/loader/ResourceLoadStatisticsPersistence.h:50 > + ResourceLoadStatisticsPersistence() > + { > + } > + > + ~ResourceLoadStatisticsPersistence() > + { > + } Please omit these. Defining them like this has no effect.
Brady Eidson
Comment 27 2016-02-02 09:06:25 PST
(I'll review after the next patch is uploaded, which I assume is imminent due to Darin's feedback)
Brent Fulgham
Comment 28 2016-02-02 11:52:06 PST
Comment on attachment 270471 [details] Updated based on Darin's comments. View in context: https://bugs.webkit.org/attachment.cgi?id=270471&action=review >> Source/WebCore/dom/Document.h:1189 >> + bool userHasInteractedWithDocument() const { return !!m_lastHandledUserGestureTimestamp; } > > No value to adding this function. It would be clearer if this rule ("0 means never stored a time stamp before") was implemented directly in the updateLastHandledUserGestureTimestamp. OK! >> Source/WebCore/loader/ResourceLoadObserver.cpp:42 >> +#undef LOG_TO_FILE > > More conventional to use an always defined symbol and define it to either 0 or 1. I see that in a few other cases; I'll do the same. >> Source/WebCore/loader/ResourceLoadObserver.cpp:46 >> +typedef HashMap<String, ResourceLoadStatistics>::iterator ResourceLoadStatisticsValue; > > Unused typedef should be removed. Done. >> Source/WebCore/loader/ResourceLoadObserver.cpp:72 >> + const Ref<SecurityOrigin> mainFrameOrigin(SecurityOrigin::create(mainFrameURL)); // Note: Is the same as sourceOrigin if navigation happens in the top frame > > OK, but a little peculiar, to use const here. we have lots of other local variables that are const but we don’t state it explicitly. I'll switch to auto, as we did elsewhere. >> Source/WebCore/loader/ResourceLoadObserver.cpp:74 >> + const String& targetOriginString = targetOrigin->toString(); > > Clearer to have the type here be String or even auto. Using const String& is misleading because it’s going to be a reference to a temporary anyway. OK! >> Source/WebCore/loader/ResourceLoadObserver.cpp:105 >> + targetStatistics.topFrameHasBeenRedirectedTo++; > > I think we should put the ++ at the start of the line on all of these. Makes it easier to spot, and kind of standard for C++ anyway. Will do. >> Source/WebCore/loader/ResourceLoadObserver.cpp:123 >> + } else if (!sourceOrigin.get().equal(&targetOrigin.get())) { > > We normally prefer targetOrigin.ptr() over &targetOrigin.get(). > > But also wondering why the equal function requires a pointer rather than a reference. At a minimum, it does seem like it should be "!sourceOrigin->equal(targetOrigin.ptr())" It's a little weird that the SourceOrigin class wants a pointer to the element being compared with, rather than a reference. Maybe SourceOrigins are frequently nullptr when being compared? >> Source/WebCore/loader/ResourceLoadObserver.cpp:167 >> + targetStatistics.subresourceHasBeenSubresourceCountDividedByTotalNumberOfOriginsVisited = static_cast<double>(targetStatistics.subresourceHasBeenSubresourceCount) / totalVisited; > > This is overdoing the "double" casting. We need only a single double here to make the floating point division use double, but instead we are making the size and the constant 1 both be double, and also casting the numerator to double. I suggest writing: > > auto totalVisited = std::max(m_originsVisitedMap.size(), 1U); > > Or something like that. No need to convert to double before doing the division. > > Might also be a better design to just track the numerator and denominator and do the math later when reporting. Not valuable to keep recomputing the ratio as the numbers change. Hmm. Maybe it could just be a member function of the statistics class that takes the visited count as an argument. >> Source/WebCore/loader/ResourceLoadObserver.cpp:174 >> + targetStatistics.subresourceHasBeenSubresourceCountDividedByTotalNumberOfOriginsVisited = static_cast<double>(targetStatistics.subresourceHasBeenSubresourceCount) / totalVisited; > > Ditto. Will do. >> Source/WebCore/loader/ResourceLoadObserver.cpp:190 >> +void ResourceLoadObserver::logIfPrevalentResource(const URL& url, ResourceLoadStatistics& statistics) > > Peculiar that this function has a side effect of setting the isPrevalentResource boolean given its name. Quite unclear. I changed it to "logAndSetAsPrevalentResourceIfNecessary", and changed "shouldBecomePrevalentResource" to "checkAndSetAsPrevalentResourceIfNecessary" >> Source/WebCore/loader/ResourceLoadObserver.cpp:255 >> + primaryDomain = primaryDomain + "." + hostSplitOnDot.at(j); > > Should not append strings in a loop like this. Super slow and inefficient. Should use StringBuilder for this sort of thing. Oh, Yes! Thank you for catching this. >> Source/WebCore/loader/ResourceLoadStatistics.cpp:37 >> +static const unsigned subFrameHasBeenRedirectedToThreshold = 3; > > We should use the word “subframe”, not the two words “sub frame”, so the identifiers should not have a capital “F” and the comment should not have a space between “Sub” and “frame”. Done. >> Source/WebCore/loader/ResourceLoadStatistics.h:34 >> +class ResourceLoadStatistics { > > This should be a struct, not a class. Done. >> Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:54 >> + String resourceLog(pwd.pw_dir); > > Should use the homeDirectoryPath() function from FileSystem.h instead of reimplementing it. OK! >> Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:258 >> + deleteFile(resourceLog); > > Still a slightly strange choice to delete the file in this case; not entirely sure what the benefit is. OK. I'll just log that something unexpected happened. >> Source/WebCore/loader/ResourceLoadStatisticsPersistence.cpp:259 >> + return; > > Strange to have a return here at the end of a function. OK! >> Source/WebCore/loader/ResourceLoadStatisticsPersistence.h:42 >> +class ResourceLoadStatisticsPersistence { > > This should not be a class. It has no state. If we do use a class all the function should be static member functions, not const member functions. I think I'll merge it back into ResourceLoadStatistics. >> Source/WebCore/loader/ResourceLoadStatisticsPersistence.h:50 >> + } > > Please omit these. Defining them like this has no effect. OK!
Brent Fulgham
Comment 29 2016-02-02 12:51:55 PST
Brent Fulgham
Comment 30 2016-02-02 17:14:20 PST
Comment on attachment 270508 [details] Patch I somehow missed a string name change I made while tidying up the patch. Doh!
Brent Fulgham
Comment 31 2016-02-02 17:41:58 PST
Darin Adler
Comment 32 2016-02-02 20:35:21 PST
Comment on attachment 270535 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=270535&action=review > Source/WebCore/loader/ResourceLoadObserver.cpp:240 > + primaryDomain = hostSplitOnDot.at(primaryDomainCutOffIndex); > + StringBuilder builder; > + builder.append(primaryDomain); This should just be: StringBuilder builder; builder.append(hostSplitOnDot.at(primaryDomainCutOffIndex)); There’s no reason to initialize primaryDomain to this string and then just overwrite it later. > Source/WebCore/loader/ResourceLoadObserver.cpp:242 > + builder.append("."); Should append '.', the character, rather than ".", the string. > Source/WebCore/loader/ResourceLoadObserver.cpp:291 > + String fileName = label + "_resourceLog.plist"; > + return pathByAppendingComponent(homeDirectoryPath(), fileName); Should be a one liner, not two lines: return pathByAppendingComponent(homeDirectoryPath(), label + "_resourceLog.plist");
Alexey Proskuryakov
Comment 33 2016-02-02 22:07:22 PST
Comment on attachment 270535 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=270535&action=review Is this purely debugging code for MiniBrowser only? If that's the case, I think that we should find a way to not expose this functionality in production via user defaults. > Source/WebCore/loader/ResourceLoadObserver.cpp:54 > +void ResourceLoadObserver::logFrameNavigation(bool isRedirect, const URL& sourceURL, const URL& targetURL, bool isMainFrame, const URL& mainFrameURL) This function hardcodes a lot of assumption about how loading works, so much that it seems unlikely that we could maintain it going forward. > Source/WebCore/loader/ResourceLoadObserver.cpp:65 > + auto& session = NetworkStorageSession::defaultStorageSession(); Why is the default session always appropriate here? > Source/WebCore/loader/ResourceLoadObserver.cpp:70 > + targetStatistics.topFrameHadCookiesBeforeFirstNavigationTo = true; cookiesForDOM is not the right function to use when dealing with navigation, as it excludes HTTPOnly cookies. > Source/WebCore/loader/ResourceLoadObserver.cpp:191 > +bool ResourceLoadObserver::isPrevalentResource(const URL& url) const Even after reading this function, I'm still not sure what a prevalent resource is. Did we use this concept before? > Source/WebCore/loader/ResourceLoadObserver.cpp:209 > +String ResourceLoadObserver::primaryDomain(const URL& url) I have not seen the concept of "primary domain" before, what is this function expected to do? > Source/WebCore/loader/ResourceLoadObserver.h:38 > +class Document; > +class KeyedDecoder; > +class KeyedEncoder; > +struct ResourceLoadStatistics; > +class URL; I think that we normally sort these lines, so all class come before all structs. > Source/WebCore/loader/ResourceLoadStatistics.h:30 > +#include <wtf/text/WTFString.h> This include shouldn't be needed, wtf/Forward.h declares String.
Brent Fulgham
Comment 34 2016-02-03 11:50:36 PST
Comment on attachment 270535 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=270535&action=review >> Source/WebCore/loader/ResourceLoadObserver.cpp:54 >> +void ResourceLoadObserver::logFrameNavigation(bool isRedirect, const URL& sourceURL, const URL& targetURL, bool isMainFrame, const URL& mainFrameURL) > > This function hardcodes a lot of assumption about how loading works, so much that it seems unlikely that we could maintain it going forward. Alexey, I'll have John go over these topics with you separately. >> Source/WebCore/loader/ResourceLoadObserver.cpp:240 >> + builder.append(primaryDomain); > > This should just be: > > StringBuilder builder; > builder.append(hostSplitOnDot.at(primaryDomainCutOffIndex)); > > There’s no reason to initialize primaryDomain to this string and then just overwrite it later. OK! >> Source/WebCore/loader/ResourceLoadObserver.cpp:242 >> + builder.append("."); > > Should append '.', the character, rather than ".", the string. Done. >> Source/WebCore/loader/ResourceLoadObserver.cpp:291 >> + return pathByAppendingComponent(homeDirectoryPath(), fileName); > > Should be a one liner, not two lines: > > return pathByAppendingComponent(homeDirectoryPath(), label + "_resourceLog.plist"); Done. >> Source/WebCore/loader/ResourceLoadObserver.h:38 >> +class URL; > > I think that we normally sort these lines, so all class come before all structs. Done. >> Source/WebCore/loader/ResourceLoadStatistics.h:30 >> +#include <wtf/text/WTFString.h> > > This include shouldn't be needed, wtf/Forward.h declares String. OK!
Brent Fulgham
Comment 35 2016-02-03 11:52:30 PST
Comment on attachment 270535 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=270535&action=review >>> Source/WebCore/loader/ResourceLoadStatistics.h:30 >>> +#include <wtf/text/WTFString.h> >> >> This include shouldn't be needed, wtf/Forward.h declares String. > > OK! Nope -- it won't build without the include.
Brent Fulgham
Comment 36 2016-02-03 11:54:03 PST
Brent Fulgham
Comment 37 2016-02-08 15:28:01 PST
Brady Eidson
Comment 38 2016-02-08 16:29:31 PST
Comment on attachment 270886 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=270886&action=review I have a few major quibbles. My overarching concern is that we're starting a new mechanism from scratch - one that will run code on every single load, and one that will touch the disk - yet there's pretty huge gaps in "best practices". Let's get this clean from the start. > Source/WebCore/ChangeLog:8 > + No new tests: No change in behavior. I'm concerned about the testing strategy here. I know this is pretty much "initial scaffolding" for a new feature, but IMHO that's the most important time to lay groundwork for testing. > Source/WebCore/dom/Document.cpp:6401 > + if (!m_lastHandledUserGestureTimestamp) > + ResourceLoadObserver::sharedObserver().logUserInteraction(*this); So we only care the first time the user interacts with a document? Why not any subsequent time? > Source/WebCore/loader/DocumentLoader.cpp:535 > + ResourceLoadObserver::sharedObserver().logFrameNavigation(!redirectResponse.isNull(), m_frame->document()->url(), newRequest.url(), m_frame->isMainFrame(), topFrame.document()->url()); While this feature is on the ground floor I think we need to get in the habit of always checking if its even enabled before we log stuff. On the one hand, logFrameNavigation() probably only logs things if logging is enabled. But on the other hand, we do a lot of unnecessary work just in calling logFrameNavigation() if logging is not enabled. Historically, all those little allocations and conversations between object types tend to show up on profiles in aggregate. > Source/WebCore/loader/ResourceLoadObserver.cpp:65 > + auto& session = NetworkStorageSession::defaultStorageSession(); Why the default session? > Source/WebCore/loader/ResourceLoadObserver.cpp:79 > + auto mainFrameOrigin = SecurityOrigin::create(mainFrameURL); // Note: Is the same as sourceOrigin if navigation happens in the top frame Please don't have comments on the same line as the code they're commenting on. > Source/WebCore/loader/ResourceLoadObserver.cpp:88 > + if (isPrevalentResource(targetURL)) What is a "prevalent resource"? As someone familiar with loading and networking code, I don't know what this means, nor can I guess. > Source/WebCore/loader/ResourceLoadObserver.cpp:184 > +#if LOG_STATISTICS_TO_INDIVIDUAL_FILES We're developing something brand new, from the ground up - Why are we giving it two different *compile-time* modes right out of the gate? It doesn't make sense to me why we need this. > Source/WebCore/loader/ResourceLoadObserver.cpp:223 > + int primaryDomainCutOffIndex = hostSplitOnDot.size() - 2; // Type int since we're looping backwards and it can become negative Please don't have comments on the same line as the code they're commenting on. > Source/WebCore/loader/ResourceLoadObserver.cpp:225 > + size_t numberOfParts = 1; // Start with TLD as a given part Please don't have comments on the same line as the code they're commenting on. > Source/WebCore/loader/ResourceLoadObserver.cpp:232 > + if (hostSplitOnDot.at(primaryDomainCutOffIndex).length() >= 4 || numberOfParts >= 3) { > + // We have either a domain part that's 4 chars or longer, or 3 domain parts including TLD > + break; > + } This comment could go above the "if", removing the stylistic need for the braces. > Source/WebCore/loader/ResourceLoadObserver.cpp:321 > + ASSERT(visitedOrigins == static_cast<size_t>(loadedOrigins.size())); // I add an extra key/value to hold total size Please don't have comments on the same line as the code they're commenting on. "I add..." is kind of a weird comment. > Source/WebCore/loader/ResourceLoadObserver.cpp:337 > +void ResourceLoadObserver::writeEncoderToDisk(KeyedEncoder& encoder, const String& label) const > +{ > +#if LOG_STATISTICS_TO_FILE This method is called "write to disk", then if this mysterious compile time flag is set it writes stuff to disk... but if the flag is not set, it does literally nothing. Again, wondering what the point is? > Source/WebCore/loader/SubresourceLoader.cpp:194 > + ResourceLoadObserver::sharedObserver().logSubresourceLoading(!redirectResponse.isNull(), redirectResponse.url(), newRequest.url(), m_frame ? m_frame->mainFrame().document()->url() : URL()); As mentioned before, this should only be called if the feature is currently enabled (otherwise we do some potentially costly work for no reason) > Tools/MiniBrowser/mac/SettingsController.m:116 > + [self _addItemWithTitle:@"Enable Resource Load Statistics" action:@selector(toggleResourceLoadStatisticsEnabled:) indented:NO]; Indication that this is WK1-only for now?
Brent Fulgham
Comment 39 2016-02-08 17:59:18 PST
Comment on attachment 270886 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=270886&action=review >> Source/WebCore/ChangeLog:8 >> + No new tests: No change in behavior. > > I'm concerned about the testing strategy here. > > I know this is pretty much "initial scaffolding" for a new feature, but IMHO that's the most important time to lay groundwork for testing. OK >> Source/WebCore/loader/DocumentLoader.cpp:535 >> + ResourceLoadObserver::sharedObserver().logFrameNavigation(!redirectResponse.isNull(), m_frame->document()->url(), newRequest.url(), m_frame->isMainFrame(), topFrame.document()->url()); > > While this feature is on the ground floor I think we need to get in the habit of always checking if its even enabled before we log stuff. > > On the one hand, logFrameNavigation() probably only logs things if logging is enabled. > > But on the other hand, we do a lot of unnecessary work just in calling logFrameNavigation() if logging is not enabled. > > Historically, all those little allocations and conversations between object types tend to show up on profiles in aggregate. That sounds reasonable. We can revise things to avoid the work if the feature is disabled. >> Source/WebCore/loader/ResourceLoadObserver.cpp:79 >> + auto mainFrameOrigin = SecurityOrigin::create(mainFrameURL); // Note: Is the same as sourceOrigin if navigation happens in the top frame > > Please don't have comments on the same line as the code they're commenting on. OK! >> Source/WebCore/loader/ResourceLoadObserver.cpp:184 >> +#if LOG_STATISTICS_TO_INDIVIDUAL_FILES > > We're developing something brand new, from the ground up - Why are we giving it two different *compile-time* modes right out of the gate? > > It doesn't make sense to me why we need this. This is basically debugging code that I left in for convenience. I'll strip it from the patch since no one else is likely to want this stuff, and it's easy enough to put back if I do want it. >> Source/WebCore/loader/ResourceLoadObserver.cpp:223 >> + int primaryDomainCutOffIndex = hostSplitOnDot.size() - 2; // Type int since we're looping backwards and it can become negative > > Please don't have comments on the same line as the code they're commenting on. OK! >> Source/WebCore/loader/ResourceLoadObserver.cpp:225 >> + size_t numberOfParts = 1; // Start with TLD as a given part > > Please don't have comments on the same line as the code they're commenting on. OK! >> Source/WebCore/loader/ResourceLoadObserver.cpp:232 >> + } > > This comment could go above the "if", removing the stylistic need for the braces. Good idea! >> Source/WebCore/loader/ResourceLoadObserver.cpp:321 >> + ASSERT(visitedOrigins == static_cast<size_t>(loadedOrigins.size())); // I add an extra key/value to hold total size > > Please don't have comments on the same line as the code they're commenting on. > > "I add..." is kind of a weird comment. Ha! It's not even true anymore! A great example of why comments are worse than code that just says what it does. >> Source/WebCore/loader/ResourceLoadObserver.cpp:337 >> +#if LOG_STATISTICS_TO_FILE > > This method is called "write to disk", then if this mysterious compile time flag is set it writes stuff to disk... but if the flag is not set, it does literally nothing. > > Again, wondering what the point is? One of the reasons I wanted you to review this patch was to discuss how to store these statistics. Right now I'm throwing them into a plist so I can play around with this stuff, but I'm not sure how this should be handled longer term. Essentially this is "writeEncoderToLongTermStorage" or something like that. But today, that's a plist in my home directory -- that's not how this should ship. This is essentially a cache that I want to reload when I launch so I can continue gathering statistics as I browse the web. What's the correct place to store this kind of stuff? >> Source/WebCore/loader/SubresourceLoader.cpp:194 >> + ResourceLoadObserver::sharedObserver().logSubresourceLoading(!redirectResponse.isNull(), redirectResponse.url(), newRequest.url(), m_frame ? m_frame->mainFrame().document()->url() : URL()); > > As mentioned before, this should only be called if the feature is currently enabled (otherwise we do some potentially costly work for no reason) Done! >> Tools/MiniBrowser/mac/SettingsController.m:116 >> + [self _addItemWithTitle:@"Enable Resource Load Statistics" action:@selector(toggleResourceLoadStatisticsEnabled:) indented:NO]; > > Indication that this is WK1-only for now? Hmm. I can rename this to "Enable WK1 Resource Load Statistics" for now.
Brady Eidson
Comment 40 2016-02-09 14:10:45 PST
(In reply to comment #39) > Comment on attachment 270886 [details] > Patch > >> Source/WebCore/loader/ResourceLoadObserver.cpp:337 > >> +#if LOG_STATISTICS_TO_FILE > > > > This method is called "write to disk", then if this mysterious compile time flag is set it writes stuff to disk... but if the flag is not set, it does literally nothing. > > > > Again, wondering what the point is? > > One of the reasons I wanted you to review this patch was to discuss how to > store these statistics. Right now I'm throwing them into a plist so I can > play around with this stuff, but I'm not sure how this should be handled > longer term. > > Essentially this is "writeEncoderToLongTermStorage" or something like that. > But today, that's a plist in my home directory -- that's not how this should > ship. > > This is essentially a cache that I want to reload when I launch so I can > continue gathering statistics as I browse the web. What's the correct place > to store this kind of stuff? A flat plist is fine while its under development (obviously not for shipping). We can figure out the longer term later. You call it a "cache", so if you're right and it really is a cache (e.g., purgable) then it can go in ~/Library/Caches/<app bundle id> If you don't want it to be anywhere purgable, ~/Library/Application Support/<app bundle id> In either case, the path should come from WebKit and not WebCore.
Brent Fulgham
Comment 41 2016-02-10 16:32:01 PST
Brent Fulgham
Comment 42 2016-02-10 16:35:37 PST
(In reply to comment #40) > You call it a "cache", so if you're right and it really is a cache (e.g., > purgable) then it can go in ~/Library/Caches/<app bundle id> > > If you don't want it to be anywhere purgable, ~/Library/Application > Support/<app bundle id> > > In either case, the path should come from WebKit and not WebCore. I decided it wasn't really a cache because we can't purge it at any time without losing history. So I am using the OS API calls to get the program's "Application Support" directory to store the file.
Brent Fulgham
Comment 43 2016-02-11 12:25:03 PST
(In reply to comment #39) > Comment on attachment 270886 [details] > >> + No new tests: No change in behavior. > > > > I'm concerned about the testing strategy here. > > > > I know this is pretty much "initial scaffolding" for a new feature, but IMHO that's the most important time to lay groundwork for testing. I think what I'd like to do is expose something through the Internals object that would allow us to dump the statistics in JS as part of a layout test. I could do something like our existing redirect or history tests, and just confirm that counts changed as expected or something. At present, this could only be a WK1 test. Do you want that to be in place as part of this initial scaffolding, or can we add it as part of the next step?
Brady Eidson
Comment 44 2016-02-11 13:18:35 PST
(In reply to comment #43) > (In reply to comment #39) > > Comment on attachment 270886 [details] > > >> + No new tests: No change in behavior. > > > > > > I'm concerned about the testing strategy here. > > > > > > I know this is pretty much "initial scaffolding" for a new feature, but IMHO that's the most important time to lay groundwork for testing. > > I think what I'd like to do is expose something through the Internals object > that would allow us to dump the statistics in JS as part of a layout test. > > I could do something like our existing redirect or history tests, and just > confirm that counts changed as expected or something. > > At present, this could only be a WK1 test. > > Do you want that to be in place as part of this initial scaffolding, or can > we add it as part of the next step? Ready for mind==blown? How about the Internals test dumping support... *before* this patch? =D =D =D
Brent Fulgham
Comment 45 2016-02-12 13:26:57 PST
Brady Eidson
Comment 46 2016-02-12 14:26:25 PST
Comment on attachment 271210 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=271210&action=review > Source/WebCore/loader/ResourceLoadObserver.cpp:64 > + auto& session = NetworkStorageSession::defaultStorageSession(); Both Alexey and I have asked this, and I don't see a response yet - Why is the default session the right one to use? This seems quite wrong.
Brent Fulgham
Comment 47 2016-02-12 16:47:46 PST
(In reply to comment #46) > Comment on attachment 271210 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=271210&action=review > > > Source/WebCore/loader/ResourceLoadObserver.cpp:64 > > + auto& session = NetworkStorageSession::defaultStorageSession(); > > Both Alexey and I have asked this, and I don't see a response yet - Why is > the default session the right one to use? It turns out this isn't needed. I'm correcting the patch.
Brent Fulgham
Comment 48 2016-02-12 17:24:05 PST
Brent Fulgham
Comment 49 2016-02-15 13:09:24 PST
Created attachment 271363 [details] Disable write-to-disk in normal build.
Brady Eidson
Comment 50 2016-02-15 16:09:44 PST
Comment on attachment 271363 [details] Disable write-to-disk in normal build. r+ as long as Windows builds!
Brent Fulgham
Comment 51 2016-02-15 20:35:12 PST
Windows build was failing because it didn't compile the KeyedEncoderCF/KeyedDecoderCF code.
Brent Fulgham
Comment 52 2016-02-15 21:36:35 PST
Note You need to log in before you can comment on or make changes to this bug.