Bug 57277 - Preload scan external CSS scripts as they're downloaded
Summary: Preload scan external CSS scripts as they're downloaded
Status: REOPENED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Other OS X 10.5
: P2 Normal
Assignee: James Simonsen
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-03-28 15:28 PDT by James Simonsen
Modified: 2012-09-13 18:30 PDT (History)
3 users (show)

See Also:


Attachments
Patch (10.53 KB, patch)
2011-03-28 15:38 PDT, James Simonsen
no flags Details | Formatted Diff | Diff
Patch (13.12 KB, patch)
2011-03-29 10:29 PDT, James Simonsen
no flags Details | Formatted Diff | Diff
Patch (15.38 KB, patch)
2011-03-29 15:19 PDT, James Simonsen
no flags Details | Formatted Diff | Diff
Patch (14.67 KB, patch)
2011-03-29 15:30 PDT, James Simonsen
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description James Simonsen 2011-03-28 15:28:19 PDT
Preload scan external CSS scripts as they're downloaded
Comment 1 James Simonsen 2011-03-28 15:38:07 PDT
Created attachment 87223 [details]
Patch
Comment 2 Adam Barth 2011-03-28 16:23:45 PDT
Comment on attachment 87223 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=87223&action=review

> Source/WebCore/ChangeLog:8
> +        No new tests. Passes existing tests. Not sure how to test CSS preloading with layout tests.

We have some existing preload scanner tests.  They work using DRT magic.

> Source/WebCore/html/parser/CSSPreloadScanner.cpp:53
> +    m_loader = loader;

Usually we call these variables m_cachedResourceLoader because there are so many "loader" objects.

> Source/WebCore/html/parser/CSSPreloadScanner.cpp:63
> +    for (size_t i = 0; i < data.length(); ++i)
> +        tokenize(data[i]);

That seems slow.

> Source/WebCore/html/parser/CSSPreloadScanner.h:-66
> -    bool m_scanningBody;

We don't need m_scanningBody anymore ?  Seems worth discussing in the ChangeLog.

> Source/WebCore/loader/cache/CachedCSSStyleSheet.cpp:43
> +    , m_numCharactersPreloadScanned(0)

We generally avoid abbreviations like "num"

> Source/WebCore/loader/cache/CachedCSSStyleSheet.cpp:106
> +        String newlyDecodedSheetText = m_decoder->decode(m_data->data() + m_numCharactersPreloadScanned, m_data->size() - m_numCharactersPreloadScanned);
> +        m_preloadScanner.scan(newlyDecodedSheetText, m_cachedResourceLoader);

This looks like it will resume scanning in the middle of tokens.  Can that cause problems?  It seems like we'll get different behavior depending on how the message gets broken down into network packets...  Maybe the scanner keeps the state?

> Source/WebCore/loader/cache/CachedCSSStyleSheet.cpp:114
> +    m_decodedSheetText += m_decoder->flush();

So we never scan the part that needs to be flushed?

> Source/WebCore/loader/cache/CachedCSSStyleSheet.h:46
> +        virtual void load(CachedResourceLoader* cachedResourceLoader)  { load(cachedResourceLoader, true, DoSecurityCheck, true); }

Virtual methods should not have implementations in headers.  They can't be inlined.
Comment 3 Tony Gentilcore 2011-03-28 16:37:03 PDT
Comment on attachment 87223 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=87223&action=review

I like this idea. Could be a nice win for large external sheets that have @imports near the top. Just a few questions/comments. Antti should review this too.

> Source/WebCore/html/parser/CSSPreloadScanner.cpp:194
> +            m_loader->preload(CachedResource::CSSStyleSheet, value, String(), false);

This is a behavior change. For instance now foo.css could be loaded while blocked on foo.js:

<head>
<script src="foo.js"></script>
</head>
<body>
<style>
  @import url('foo.css')
</style>

Please call that out in the ChangeLog. Also, is it possible to test for that change (in a way similar to the other preload scanner tests)?

> Source/WebCore/loader/cache/CachedCSSStyleSheet.cpp:43
> +    , m_numCharactersPreloadScanned(0)

It is unfortunate for this class to have to track this. Any clean way for the CSSPreloadScanner to own this?

> Source/WebCore/loader/cache/CachedCSSStyleSheet.cpp:106
> +        m_preloadScanner.scan(newlyDecodedSheetText, m_cachedResourceLoader);

Is there any benefit to preload scanning when allDataReceived == true? Perhaps we should just save the CPU in that case and only worry about scanning when !allDataReceived.
Comment 4 James Simonsen 2011-03-29 10:28:52 PDT
Comment on attachment 87223 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=87223&action=review

>> Source/WebCore/ChangeLog:8
>> +        No new tests. Passes existing tests. Not sure how to test CSS preloading with layout tests.
> 
> We have some existing preload scanner tests.  They work using DRT magic.

Yeah, I saw that. The problem is there are no DOM events for us to hook into when the stylesheets load, so we can't do it the same way.

>> Source/WebCore/html/parser/CSSPreloadScanner.cpp:53
>> +    m_loader = loader;
> 
> Usually we call these variables m_cachedResourceLoader because there are so many "loader" objects.

Done.

>> Source/WebCore/html/parser/CSSPreloadScanner.cpp:63
>> +        tokenize(data[i]);
> 
> That seems slow.

That's how it works while parsing <style> blocks too.

>> Source/WebCore/html/parser/CSSPreloadScanner.cpp:194
>> +            m_loader->preload(CachedResource::CSSStyleSheet, value, String(), false);
> 
> This is a behavior change. For instance now foo.css could be loaded while blocked on foo.js:
> 
> <head>
> <script src="foo.js"></script>
> </head>
> <body>
> <style>
>   @import url('foo.css')
> </style>
> 
> Please call that out in the ChangeLog. Also, is it possible to test for that change (in a way similar to the other preload scanner tests)?

ChangeLog updated. I can't get the test to work reliably though. It works fine when I run it manually, but with run_webkit_tests.sh on Linux, it's as if the preload scanner isn't enabled. Any ideas?

>> Source/WebCore/html/parser/CSSPreloadScanner.h:-66
>> -    bool m_scanningBody;
> 
> We don't need m_scanningBody anymore ?  Seems worth discussing in the ChangeLog.

Done.

>>> Source/WebCore/loader/cache/CachedCSSStyleSheet.cpp:43
>>> +    , m_numCharactersPreloadScanned(0)
>> 
>> We generally avoid abbreviations like "num"
> 
> It is unfortunate for this class to have to track this. Any clean way for the CSSPreloadScanner to own this?

Done and done.

>>> Source/WebCore/loader/cache/CachedCSSStyleSheet.cpp:106
>>> +        m_preloadScanner.scan(newlyDecodedSheetText, m_cachedResourceLoader);
>> 
>> This looks like it will resume scanning in the middle of tokens.  Can that cause problems?  It seems like we'll get different behavior depending on how the message gets broken down into network packets...  Maybe the scanner keeps the state?
> 
> Is there any benefit to preload scanning when allDataReceived == true? Perhaps we should just save the CPU in that case and only worry about scanning when !allDataReceived.

Yep, the scanner keeps track of state.

We might be preloading a script that @imports another script near the end. In that case, we'd still want to preload scan the last packet, since we won't actually parse it until sometime later.

Should we make the cached resources aware of this case and decide appropriately here?
Comment 5 James Simonsen 2011-03-29 10:29:39 PDT
Created attachment 87358 [details]
Patch
Comment 6 James Simonsen 2011-03-29 15:19:30 PDT
Created attachment 87416 [details]
Patch
Comment 7 James Simonsen 2011-03-29 15:30:18 PDT
Created attachment 87417 [details]
Patch
Comment 8 Tony Gentilcore 2011-03-29 16:23:57 PDT
Comment on attachment 87417 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=87417&action=review

This LGTM, I'm going to hold off on r+ing in case Adam has more comments. And I still think you might want to have Antti take a look.

> Source/WebCore/html/parser/CSSPreloadScanner.cpp:36
> +#include <wtf/CurrentTime.h>

Stray diff

> Source/WebCore/html/parser/CSSPreloadScanner.cpp:200
> +        m_state = DoneParsingImportRules;

This change is begging for a test now.

Maybe even invalid CSS:
<style>
@import pass.css
body {background:red}
@import fail.css
</style>

> Source/WebCore/loader/TextResourceDecoder.cpp:695
> +String TextResourceDecoder::decodeAdditionalData(const char* data, size_t dataLength)

Perhaps for another patch, but are there other CachedResources that should share this now?
Comment 9 James Simonsen 2011-04-01 12:10:24 PDT
This ends up not being a noticeable benefit, so I'm going to shelve it for now. The only change that looks like a win, is terminating preload scanning early. I'll start a separate bug for that.
Comment 10 James Simonsen 2012-09-13 18:22:00 PDT
I've been pinged twice offline about sites where this would help. It might be time to try this out against those sites.
Comment 11 William Chan 2012-09-13 18:30:48 PDT
The example I saw was www.nytimes.com. It has a stylesheet at http://css.nyt.com/css/0.1/screen/build/homepage/styles.css?v=20120119 with several resources that don't get fetched until the style is used. YMMV in terms of performance impact, but in the trace I saw (with some network issues that magnified the problem), it would have loaded resources significantly earlier. This optimization may not be a big win in the median case, but in the long tail case, I posit that it may help out significantly.