Bug 148363 - Implement Subresource Integrity (SRI)
Summary: Implement Subresource Integrity (SRI)
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit Misc. (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Sam Weinig
URL: https://w3c.github.io/webappsec/specs...
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2015-08-22 18:21 PDT by Michael[tm] Smith
Modified: 2017-03-18 17:26 PDT (History)
16 users (show)

See Also:


Attachments
Patch (132.71 KB, patch)
2017-03-08 17:37 PST, Sam Weinig
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews104 for mac-elcapitan-wk2 (1.10 MB, application/zip)
2017-03-08 19:01 PST, Build Bot
no flags Details
Archive of layout-test-results from ews101 for mac-elcapitan (1002.25 KB, application/zip)
2017-03-08 19:38 PST, Build Bot
no flags Details
Archive of layout-test-results from ews123 for ios-simulator-wk2 (827.92 KB, application/zip)
2017-03-08 19:55 PST, Build Bot
no flags Details
Patch (143.49 KB, patch)
2017-03-12 20:43 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Michael[tm] Smith 2015-08-22 18:21:49 PDT
The SRI specification "defines a mechanism by which user agents may verify that a fetched resource has been delivered without unexpected manipulation" using a validation scheme and "extending several HTML elements with an integrity attribute that contains a cryptographic hash of the representation of the resource the author expects to load." http://w3c.github.io/webappsec/specs/subresourceintegrity/

Example: If a document loads some JavaScript library code from a shared server at https://example.com/example-framework.js rather than from the same own origin as the document, the document can specify the expected SHA-256 hash of https://example.com/example-framework.js (e.g., C6CB9UYIS9UJeqinPHWTHVqh/E1uhG5Twh+Y5qFQmYg=) and the UA, before executing the JavaScript, can verify that the data matches that expected hash.

<script src="https://example.com/example-framework.js"
        integrity="sha256-C6CB9UYIS9UJeqinPHWTHVqh/E1uhG5Twh+Y5qFQmYg="
        crossorigin="anonymous"></script>

The mechanism can also be used for resources loaded through <link> elements.

As far as support in other UAs, Chrome has supported Subresource Integrity since v45, and Firefox has since v43. https://developer.mozilla.org/en/docs/Web/HTML/Element/script#Browser_compatibility
Comment 1 Michael[tm] Smith 2016-06-27 02:03:44 PDT
Subresource Integrity is now a final W3C Recommendation https://www.w3.org/TR/SRI/
Comment 2 Barrett Harber 2016-12-19 13:12:47 PST
+1 for this
Comment 3 Dan Fabulich 2017-03-08 10:17:45 PST
I filed a Radar issue for this; it was marked as a duplicate of rdar://18945879
Comment 4 Dan Fabulich 2017-03-08 12:46:10 PST
Oops, I mean rdar://problem/18945879
Comment 5 Sam Weinig 2017-03-08 17:37:40 PST
Created attachment 303875 [details]
Patch
Comment 6 Sam Weinig 2017-03-08 17:40:00 PST
WIP patch. I would like some folks who have played in the Fetch / Loader stack more recently than I to take a look.

Some known things I need to improve and test:
- Fetch with a stream.
- Fix FIXME in HTMLLinkElement about using the correct integrity metadata.
- Probably more.
Comment 7 Build Bot 2017-03-08 19:01:53 PST
Comment on attachment 303875 [details]
Patch

Attachment 303875 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/3273336

New failing tests:
imported/w3c/web-platform-tests/html/dom/reflection-metadata.html
imported/w3c/web-platform-tests/html/dom/reflection-misc.html
Comment 8 Build Bot 2017-03-08 19:01:57 PST
Created attachment 303887 [details]
Archive of layout-test-results from ews104 for mac-elcapitan-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews104  Port: mac-elcapitan-wk2  Platform: Mac OS X 10.11.6
Comment 9 Build Bot 2017-03-08 19:38:20 PST
Comment on attachment 303875 [details]
Patch

Attachment 303875 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/3273635

New failing tests:
imported/w3c/web-platform-tests/html/dom/reflection-metadata.html
imported/w3c/web-platform-tests/html/dom/reflection-misc.html
Comment 10 Build Bot 2017-03-08 19:38:24 PST
Created attachment 303890 [details]
Archive of layout-test-results from ews101 for mac-elcapitan

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews101  Port: mac-elcapitan  Platform: Mac OS X 10.11.6
Comment 11 Build Bot 2017-03-08 19:55:44 PST
Comment on attachment 303875 [details]
Patch

Attachment 303875 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/3273716

New failing tests:
imported/w3c/web-platform-tests/html/dom/reflection-metadata.html
imported/w3c/web-platform-tests/html/dom/reflection-misc.html
Comment 12 Build Bot 2017-03-08 19:55:49 PST
Created attachment 303891 [details]
Archive of layout-test-results from ews123 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews123  Port: ios-simulator-wk2  Platform: Mac OS X 10.11.6
Comment 13 youenn fablet 2017-03-08 21:12:37 PST
Looking quickly at the patch, I am wondering why some integrity checks are done at client loader level.
Why not doing that at SubresourceLoader/CachedResource/CachedResourceLoader level?
The one thing needed for clients would be to ensure not leaking any data before didFinishLoading is called.
I guess this might only apply to fetch API.
Comment 14 Sam Weinig 2017-03-09 08:42:15 PST
(In reply to comment #13)
> Looking quickly at the patch, I am wondering why some integrity checks are
> done at client loader level.
> Why not doing that at SubresourceLoader/CachedResource/CachedResourceLoader
> level?

I ended up doing them at the client level because I couldn't find a good way to handle the case where where there are two requests for the same cached resource (which has yet to finish loading). In that case, when the load finishes, the CachedResource only knows about clients of the resource, not the requests that initiated them, so can't retrieve the integrity metadata needed. We could consider changing CachedResource so that it keeps tracks of the CachedResourceRequests as well, but that didn't immediately seem like an improvement worth making without re-evaluating loading more holistically. 

> The one thing needed for clients would be to ensure not leaking any data
> before didFinishLoading is called.
> I guess this might only apply to fetch API.

Yeah, the fetch one is tricky, since the CachedResource doesn't seem to even be buffering the data (due to explicitly asking the ThreadableLoader not to). I am kind of curious if we are missing out on some memory cache goodness for fetch requests.
Comment 15 youenn fablet 2017-03-09 09:26:37 PST
(In reply to comment #14)
> (In reply to comment #13)
> > Looking quickly at the patch, I am wondering why some integrity checks are
> > done at client loader level.
> > Why not doing that at SubresourceLoader/CachedResource/CachedResourceLoader
> > level?
> 
> I ended up doing them at the client level because I couldn't find a good way
> to handle the case where where there are two requests for the same cached
> resource (which has yet to finish loading). In that case, when the load
> finishes, the CachedResource only knows about clients of the resource, not
> the requests that initiated them, so can't retrieve the integrity metadata
> needed. We could consider changing CachedResource so that it keeps tracks of
> the CachedResourceRequests as well, but that didn't immediately seem like an
> improvement worth making without re-evaluating loading more holistically. 

I see.
DocumentThreadableLoader would be a better place for fetch and the likes.
At least, it would be closer to ResourceLoader et al.

CORS checks are now removed at the client level and this seems like the right thing to do. The added complexity in CachedResource/CachedResourceLoader is not very good though, but origin is more fundamental than SRI.

Maybe we need a better communication channel with CachedResourceClient to accommodate SRI. For instance, we only have notifyFinished. Why not having one for success and one for error.

> > The one thing needed for clients would be to ensure not leaking any data
> > before didFinishLoading is called.
> > I guess this might only apply to fetch API.
> 
> Yeah, the fetch one is tricky, since the CachedResource doesn't seem to even
> be buffering the data (due to explicitly asking the ThreadableLoader not
> to). I am kind of curious if we are missing out on some memory cache
> goodness for fetch requests.

IIRC, RawResourceRequest memory caching is indeed something we should work on.
Especially with prefetching arriving.
As of the hash computation, it should probably be done on the fly.
We should not require to store the whole data just for this purpose.
Comment 16 Sam Weinig 2017-03-09 10:04:59 PST
(In reply to comment #15)
> (In reply to comment #14)
> > (In reply to comment #13)
> > > Looking quickly at the patch, I am wondering why some integrity checks are
> > > done at client loader level.
> > > Why not doing that at SubresourceLoader/CachedResource/CachedResourceLoader
> > > level?
> > 
> > I ended up doing them at the client level because I couldn't find a good way
> > to handle the case where where there are two requests for the same cached
> > resource (which has yet to finish loading). In that case, when the load
> > finishes, the CachedResource only knows about clients of the resource, not
> > the requests that initiated them, so can't retrieve the integrity metadata
> > needed. We could consider changing CachedResource so that it keeps tracks of
> > the CachedResourceRequests as well, but that didn't immediately seem like an
> > improvement worth making without re-evaluating loading more holistically. 
> 
> I see.
> DocumentThreadableLoader would be a better place for fetch and the likes.
> At least, it would be closer to ResourceLoader et al.
> 
> CORS checks are now removed at the client level and this seems like the
> right thing to do. The added complexity in
> CachedResource/CachedResourceLoader is not very good though, but origin is
> more fundamental than SRI.
> 
> Maybe we need a better communication channel with CachedResourceClient to
> accommodate SRI. For instance, we only have notifyFinished. Why not having
> one for success and one for error.
> 
> > > The one thing needed for clients would be to ensure not leaking any data
> > > before didFinishLoading is called.
> > > I guess this might only apply to fetch API.
> > 
> > Yeah, the fetch one is tricky, since the CachedResource doesn't seem to even
> > be buffering the data (due to explicitly asking the ThreadableLoader not
> > to). I am kind of curious if we are missing out on some memory cache
> > goodness for fetch requests.
> 
> IIRC, RawResourceRequest memory caching is indeed something we should work
> on.
> Especially with prefetching arriving.
> As of the hash computation, it should probably be done on the fly.
> We should not require to store the whole data just for this purpose.

Alas, I think we have to store the whole data for this purpose in the fetch API case, as we have to wait until we have checked the integrity to pass any data to the client. We could totally calculate the digest as the data comes in, I'm just not sure it adds anything.
Comment 17 youenn fablet 2017-03-09 14:53:46 PST
> Alas, I think we have to store the whole data for this purpose in the fetch
> API case, as we have to wait until we have checked the integrity to pass any
> data to the client. We could totally calculate the digest as the data comes
> in, I'm just not sure it adds anything.

If the checks are moved to DocumentThreadableLoader and/or below, it might be better to do so. DTL is just passing the data to the client for instance atm.
Comment 18 Sam Weinig 2017-03-12 20:43:16 PDT
Created attachment 304219 [details]
Patch
Comment 19 Sam Weinig 2017-03-12 20:56:39 PDT
Things I would like to still do for this patch:
- Add copy (or variant) of imported tests to http directory that use localhost / 127.0.0.1 instead of localhost vs. www.localhost for cross origin checks.
- Add tests of fetch that use a stream.
- Reconsider if the checks in CachedResourceLoader are necessary. Since we also check in the CachedResourceClient, I'm not sure they are.
- Decide if moving integrity check (and buffering) to DocumentThreadableLoader from Fetch code makes sense for this first pass.
Comment 20 Sam Weinig 2017-03-12 21:05:16 PDT
(In reply to comment #19)
> Things I would like to still do for this patch:
> - Add copy (or variant) of imported tests to http directory that use
> localhost / 127.0.0.1 instead of localhost vs. www.localhost for cross
> origin checks.
> - Add tests of fetch that use a stream.
> - Reconsider if the checks in CachedResourceLoader are necessary. Since we
> also check in the CachedResourceClient, I'm not sure they are.
> - Decide if moving integrity check (and buffering) to
> DocumentThreadableLoader from Fetch code makes sense for this first pass.

Additional thoughts:
- Add tests that use both base64 styles (normal and URL) and ensure this matches other browsers.
- Add tests that have options (https://w3c.github.io/webappsec-subresource-integrity/#grammardef-option-expression), and make sure they are ignored correctly (noting the spec says "In order for user agents to remain fully forwards compatible with future options, the user agent MUST ignore all unrecognized option-expressions.")
Comment 21 youenn fablet 2017-03-12 22:06:39 PDT
(In reply to comment #19)
> Things I would like to still do for this patch:
> - Add copy (or variant) of imported tests to http directory that use
> localhost / 127.0.0.1 instead of localhost vs. www.localhost for cross
> origin checks.

It would be better to update w3c wpt to use https://github.com/w3c/web-platform-tests/blob/master/common/get-host-info.sub.js
Just include this file in the html file, then use get_host_info(). HTTP_REMOTE_ORIGIN instead of creating the cross-origin directly in the test.
We have done so for other tests and upstreaming the changes to wpt should not cause any issue.

> - Add tests of fetch that use a stream.
> - Reconsider if the checks in CachedResourceLoader are necessary. Since we
> also check in the CachedResourceClient, I'm not sure they are.

Agreed, it is better to catch that at CachedResourceClient since there may be other paths leading to resource loading anyway.
Also, we could try to have a more declarative way to say that a CachedResource is SRI-enabled and if so, should check SRI for each of its client before calling loading finished callback. In such a case, the client would only be responsible to log the error if any.
That would be more inline with the fetch algorithm concept.

> - Decide if moving integrity check (and buffering) to
> DocumentThreadableLoader from Fetch code makes sense for this first pass.

That makes a lot of sense to me.
SRI is a fetch concept, not a fetch API one. Fetch API should just be a pass-through.
We could add a boolean flag asking whether to send the data once validated or whether progressive is fine to the threadable loader options in addition to the integrity parameter. That would remove the need to modify fetch API implementation.
Comment 22 youenn fablet 2017-03-18 17:26:44 PDT
(In reply to comment #14)
> (In reply to comment #13)
> > Looking quickly at the patch, I am wondering why some integrity checks are
> > done at client loader level.
> > Why not doing that at SubresourceLoader/CachedResource/CachedResourceLoader
> > level?
> 
> I ended up doing them at the client level because I couldn't find a good way
> to handle the case where where there are two requests for the same cached
> resource (which has yet to finish loading).

I forgot to mention: is this a practical use case? Is it worth optimising it? Can we just have separate loads until we agree on a better approach?