Bug 195266

Summary: Web Inspector: Audit: provide a way to get the contents of resources
Product: WebKit Reporter: Devin Rousso <hi>
Component: Web InspectorAssignee: Devin Rousso <hi>
Status: RESOLVED FIXED    
Severity: Normal CC: commit-queue, ews-watchlist, hi, inspector-bugzilla-changes, joepeck, rniwa, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: All   
OS: All   
Bug Depends on: 190754    
Bug Blocks:    
Attachments:
Description Flags
Patch
ews-watchlist: commit-queue-
Archive of layout-test-results from ews101 for mac-highsierra
none
Archive of layout-test-results from ews106 for mac-highsierra-wk2
none
Archive of layout-test-results from ews112 for mac-highsierra
none
Patch
none
Patch
none
Archive of layout-test-results from ews100 for mac-highsierra
none
Archive of layout-test-results from ews105 for mac-highsierra-wk2
none
Patch
none
Patch
none
Patch
none
Archive of layout-test-results from ews103 for mac-highsierra
none
Archive of layout-test-results from ews105 for mac-highsierra-wk2
none
Archive of layout-test-results from ews117 for mac-highsierra
none
Patch
none
Patch
commit-queue: commit-queue-
Patch none

Description Devin Rousso 2019-03-03 21:35:30 PST
This would be very helpful for performing linting/style-checking kinds of audits.  Otherwise, you'd either have to use an inline <script> or re-fetch the contents of the script from the server to test it.
Comment 1 Radar WebKit Bug Importer 2019-03-03 21:36:00 PST
<rdar://problem/48550911>
Comment 2 Devin Rousso 2019-03-03 21:47:37 PST
Created attachment 363485 [details]
Patch

I'm missing some other cases (e.g. injected/anonymous/extension scripts), but I figured I'd put this up for now for early feedback.  I'd also like to make this work for non-web inspection, as it would be very useful there.
Comment 3 EWS Watchlist 2019-03-03 22:52:15 PST Comment hidden (obsolete)
Comment 4 EWS Watchlist 2019-03-03 22:52:17 PST Comment hidden (obsolete)
Comment 5 EWS Watchlist 2019-03-03 23:01:26 PST Comment hidden (obsolete)
Comment 6 EWS Watchlist 2019-03-03 23:01:28 PST Comment hidden (obsolete)
Comment 7 EWS Watchlist 2019-03-03 23:31:44 PST Comment hidden (obsolete)
Comment 8 EWS Watchlist 2019-03-03 23:31:46 PST Comment hidden (obsolete)
Comment 9 Devin Rousso 2019-03-04 13:29:24 PST
Created attachment 363548 [details]
Patch
Comment 10 Joseph Pecoraro 2019-03-05 13:26:00 PST
Comment on attachment 363548 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363548&action=review

> Source/WebCore/inspector/InspectorAuditResourcesObject.cpp:26
> +
> +

Nit: Extra

> Source/WebCore/inspector/InspectorAuditResourcesObject.cpp:54
> +    Vector<Resource> resources;
> +    if (auto* frame = document.frame()) {

Style: Early return?

> Source/WebCore/inspector/InspectorAuditResourcesObject.cpp:61
> +                if (!options.value().url.isNull() && !cachedResource->url().string().contains(options.value().url))
> +                    continue;
> +
> +                if (!options.value().mimeType.isNull() && !cachedResource->mimeType().contains(options.value().mimeType))
> +                    continue;

Should these queries by case-insensitive?

Maybe it make more sense to make the filters be an array. So that someone could do something like:

    WebInspectorAudit.Resources.getContents({
        mimeTypes: [
            "javascript",
            "json",
        ]
    });

To get a set of resources of varying mime types:

    text/javascript
    application/json
    application/vnd.api+json

> Source/WebCore/inspector/InspectorAuditResourcesObject.h:49
> +        String contents;

How about `content` instead of `contents`?

Shouldn't this include a base64Encoded property to signal to the client if they should base64 decode it or not instead of assuming off of the mime type?

> LayoutTests/inspector/audit/run-resources.html:7
> +<script src="resources/sample-resource.js"></script>
> +<link rel="stylesheet" href="resources/sample-resource.css">

How about an image resource?

• Someone could run a pngcrush optimizer and see if images could be reduced.
• Someone could check color schemes of images for adherence to some rules
Comment 11 Devin Rousso 2019-03-05 15:37:30 PST
Created attachment 363700 [details]
Patch
Comment 12 EWS Watchlist 2019-03-05 16:39:02 PST Comment hidden (obsolete)
Comment 13 EWS Watchlist 2019-03-05 16:39:04 PST Comment hidden (obsolete)
Comment 14 EWS Watchlist 2019-03-05 16:48:44 PST Comment hidden (obsolete)
Comment 15 EWS Watchlist 2019-03-05 16:48:45 PST Comment hidden (obsolete)
Comment 16 Devin Rousso 2019-03-05 17:30:34 PST
Created attachment 363713 [details]
Patch
Comment 17 Joseph Pecoraro 2019-03-06 20:19:27 PST
Comment on attachment 363713 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363713&action=review

> Source/WebCore/inspector/InspectorAuditResourcesObject.cpp:97
> +        if (base64Encoded)
> +            resource.base64Encoded = base64Encoded;

I'd expect this to always be set.

> Source/WebCore/inspector/InspectorAuditResourcesObject.h:58
> +    ExceptionOr<Vector<Resource>> getContents(Document&, Optional<QueryOptions>);

One drawback of having this be one API instead of two is that if you want all the resources you get an enormous JSON result with the contents of all resources. In fact, building that JSON might blow up the memory of inspected page.

We could have this be two APIs:

    getResources(query?)
    getResourceContent(url|id)

This is also nicer for developers to get the complete list of URLs/MIMETypes and filter them by RegExp on their own.

What do you think about this approach?

> Source/WebCore/inspector/InspectorAuditResourcesObject.idl:40
> +    boolean? base64Encoded;

Why make this optional? It is either true or false, what would null/undefined mean?

> LayoutTests/inspector/audit/run-resources-expected.txt:54
> +-- Running test case: Audit.run.Resources.getContents.InvalidURL

There aren't Invalid, they are not not found. Could rename the tests.
Comment 18 Joseph Pecoraro 2019-03-06 22:35:44 PST
Comment on attachment 363713 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363713&action=review

>> Source/WebCore/inspector/InspectorAuditResourcesObject.h:58
>> +    ExceptionOr<Vector<Resource>> getContents(Document&, Optional<QueryOptions>);
> 
> One drawback of having this be one API instead of two is that if you want all the resources you get an enormous JSON result with the contents of all resources. In fact, building that JSON might blow up the memory of inspected page.
> 
> We could have this be two APIs:
> 
>     getResources(query?)
>     getResourceContent(url|id)
> 
> This is also nicer for developers to get the complete list of URLs/MIMETypes and filter them by RegExp on their own.
> 
> What do you think about this approach?

I'm going to r- while we discuss this aspect, I think this is a better API.

Also maybe we should make these asynchronous, but everyone enjoys the simplicity of synchronous APIs.
Comment 19 Devin Rousso 2019-03-07 01:42:04 PST
Comment on attachment 363713 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363713&action=review

>>> Source/WebCore/inspector/InspectorAuditResourcesObject.h:58
>>> +    ExceptionOr<Vector<Resource>> getContents(Document&, Optional<QueryOptions>);
>> 
>> One drawback of having this be one API instead of two is that if you want all the resources you get an enormous JSON result with the contents of all resources. In fact, building that JSON might blow up the memory of inspected page.
>> 
>> We could have this be two APIs:
>> 
>>     getResources(query?)
>>     getResourceContent(url|id)
>> 
>> This is also nicer for developers to get the complete list of URLs/MIMETypes and filter them by RegExp on their own.
>> 
>> What do you think about this approach?
> 
> I'm going to r- while we discuss this aspect, I think this is a better API.
> 
> Also maybe we should make these asynchronous, but everyone enjoys the simplicity of synchronous APIs.

I like that approach a bit more.  Getting the content by URL makes the most sense to me.
Comment 20 Devin Rousso 2019-03-07 01:48:03 PST
Comment on attachment 363713 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363713&action=review

>>>> Source/WebCore/inspector/InspectorAuditResourcesObject.h:58
>>>> +    ExceptionOr<Vector<Resource>> getContents(Document&, Optional<QueryOptions>);
>>> 
>>> One drawback of having this be one API instead of two is that if you want all the resources you get an enormous JSON result with the contents of all resources. In fact, building that JSON might blow up the memory of inspected page.
>>> 
>>> We could have this be two APIs:
>>> 
>>>     getResources(query?)
>>>     getResourceContent(url|id)
>>> 
>>> This is also nicer for developers to get the complete list of URLs/MIMETypes and filter them by RegExp on their own.
>>> 
>>> What do you think about this approach?
>> 
>> I'm going to r- while we discuss this aspect, I think this is a better API.
>> 
>> Also maybe we should make these asynchronous, but everyone enjoys the simplicity of synchronous APIs.
> 
> I like that approach a bit more.  Getting the content by URL makes the most sense to me.

Actually, that may not be enough.  I think we'd want to generate an id for each resource.
Comment 21 Devin Rousso 2019-03-07 02:47:25 PST
Created attachment 363864 [details]
Patch
Comment 22 Joseph Pecoraro 2019-03-07 10:41:40 PST
Comment on attachment 363864 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363864&action=review

> Source/WebCore/inspector/InspectorAuditResourcesObject.h:71
> +    HashMap<String, CachedResource*> m_resources;

Even though this InspectorAuditResourcesObject gets recreated between audit runs it is still possible that someone could:

    1. Get a list of resources which has an (`id` => CachedResource A)
    2. Load enough resources such that (A) gets purged from the cache
    3. getResourceContent(`id`) => causing a use-after-free or worse

You will need to keep the CachedResource alive (which I think you can do by adding a CachedResourceClient) or remove CachedResources that are purged from this list.

> Source/WebCore/inspector/InspectorAuditResourcesObject.idl:31
> +    [CallWith=Document, MayThrowException] sequence<Resource> getResources(optional QueryOptions? options);

Should we drop the QueryObjects concept now? It is easier to filter the list script, and it is already weird that developers don't know that the filter is case-insensitive and partial.
Comment 23 Devin Rousso 2019-03-07 13:53:24 PST
Created attachment 363922 [details]
Patch

This made me cry T.T
Comment 24 Joseph Pecoraro 2019-03-07 14:01:05 PST
Comment on attachment 363922 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363922&action=review

> Source/WebCore/inspector/InspectorAuditResourcesObject.cpp:124
> +void InspectorAuditResourcesObject::willDestroyCachedResource(CachedResource& cachedResource)
> +{
> +    m_resources.removeIf([&] (const auto& entry) {
> +        return entry.value == &cachedResource;
> +    });
> +
> +    cachedResource.removeClient(clientForResource(cachedResource));
> +}

Can this happen if we are a client?

CachedResource.cpp will only call willDestroy if it canDelete which can only happen if it has no clients:

    bool canDelete() const { return !hasClients() && !m_loader && !m_preloadCount && !m_handleCount && !m_resourceToRevalidate && !m_proxyResource; }

If we have a client then we should be preventing deletion from happening.

> Source/WebCore/inspector/InspectorAuditResourcesObject.h:57
> +    class InspectorAuditCachedResourceClient : public CachedResourceClient { };
> +    class InspectorAuditCachedFontClient : public CachedFontClient { };
> +    class InspectorAuditCachedImageClient : public CachedImageClient { };
> +    class InspectorAuditCachedRawResourceClient : public CachedRawResourceClient { };
> +    class InspectorAuditCachedSVGDocumentClient : public CachedSVGDocumentClient { };
> +    class InspectorAuditCachedStyleSheetClient : public CachedStyleSheetClient { };

These deserve a comment.

Can these be private?

Why can we not just use an empty CachedResourceClient subclass? Do we need the specifics because of assertions?
Comment 25 EWS Watchlist 2019-03-07 15:02:11 PST Comment hidden (obsolete)
Comment 26 EWS Watchlist 2019-03-07 15:02:12 PST Comment hidden (obsolete)
Comment 27 EWS Watchlist 2019-03-07 16:21:22 PST Comment hidden (obsolete)
Comment 28 EWS Watchlist 2019-03-07 16:21:24 PST Comment hidden (obsolete)
Comment 29 EWS Watchlist 2019-03-07 18:26:43 PST Comment hidden (obsolete)
Comment 30 EWS Watchlist 2019-03-07 18:26:45 PST Comment hidden (obsolete)
Comment 31 Devin Rousso 2019-03-08 10:26:32 PST
Comment on attachment 363922 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363922&action=review

>> Source/WebCore/inspector/InspectorAuditResourcesObject.cpp:124
>> +}
> 
> Can this happen if we are a client?
> 
> CachedResource.cpp will only call willDestroy if it canDelete which can only happen if it has no clients:
> 
>     bool canDelete() const { return !hasClients() && !m_loader && !m_preloadCount && !m_handleCount && !m_resourceToRevalidate && !m_proxyResource; }
> 
> If we have a client then we should be preventing deletion from happening.

Good point!  I missed that.

>> Source/WebCore/inspector/InspectorAuditResourcesObject.h:57
>> +    class InspectorAuditCachedStyleSheetClient : public CachedStyleSheetClient { };
> 
> These deserve a comment.
> 
> Can these be private?
> 
> Why can we not just use an empty CachedResourceClient subclass? Do we need the specifics because of assertions?

I will try private.  They need to subclass because `CachedResourceClient` has a protected constructor.  It's necessary to have these be separate because there are assertions/casts.
Comment 32 Devin Rousso 2019-03-08 11:02:48 PST
Created attachment 364037 [details]
Patch
Comment 33 Joseph Pecoraro 2019-03-13 23:09:07 PDT
Comment on attachment 364037 [details]
Patch

r=me
Comment 34 WebKit Commit Bot 2019-03-14 00:05:12 PDT Comment hidden (obsolete)
Comment 35 Devin Rousso 2019-03-14 02:26:59 PDT
Created attachment 364650 [details]
Patch
Comment 36 WebKit Commit Bot 2019-03-14 02:31:04 PDT Comment hidden (obsolete)
Comment 37 Devin Rousso 2019-03-14 02:35:09 PDT
Created attachment 364651 [details]
Patch
Comment 38 WebKit Commit Bot 2019-03-14 03:14:03 PDT
Comment on attachment 364651 [details]
Patch

Clearing flags on attachment: 364651

Committed r242941: <https://trac.webkit.org/changeset/242941>
Comment 39 WebKit Commit Bot 2019-03-14 03:14:05 PDT
All reviewed patches have been landed.  Closing bug.