Bug 163015

Summary: CachedResourceLoader should not need to remove fragment identifier
Product: WebKit Reporter: youenn fablet <youennf>
Component: WebCore Misc.Assignee: youenn fablet <youennf>
Status: RESOLVED FIXED    
Severity: Normal CC: cdumez, commit-queue, dbates, japhet
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Archive of layout-test-results from ews112 for mac-yosemite
none
Patch
none
Patch
none
Patch
none
Patch for landing none

Description youenn fablet 2016-10-06 08:28:21 PDT
CachedResourceLoader should not need to remove fragment identifier.
Currently, it is done to check for the cache, but CachedResource seems to need it sadly.
It may be best to encapsulate this in CachedResourceRequest.
Comment 1 youenn fablet 2016-10-06 08:45:05 PDT
Created attachment 290825 [details]
Patch
Comment 2 youenn fablet 2016-10-07 00:06:23 PDT
Created attachment 290906 [details]
Patch
Comment 3 Build Bot 2016-10-07 01:16:28 PDT
Comment on attachment 290906 [details]
Patch

Attachment 290906 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/2236086

New failing tests:
svg/css/svg-resource-fragment-identifier-img-src.html
accessibility/mac/text-marker-paragraph-nav.html
accessibility/text-marker/text-marker-previous-next.html
Comment 4 Build Bot 2016-10-07 01:16:31 PDT
Created attachment 290912 [details]
Archive of layout-test-results from ews112 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews112  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 5 youenn fablet 2016-10-07 02:50:26 PDT
Created attachment 290918 [details]
Patch
Comment 6 Darin Adler 2016-10-07 12:09:20 PDT
Comment on attachment 290906 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=290906&action=review

> Source/WebCore/loader/cache/CachedResource.cpp:158
> +    // FIXME: For this constructor, we should probably mandate that the URL has no fragment identifier.
>      if (!m_resourceRequest.url().hasFragmentIdentifier())

It’s really ugly to have the identical code here and in splitFragmentIdentifierFromRequestURL.

> Source/WebCore/loader/cache/CachedResourceLoader.cpp:155
>      URL url = m_document->completeURL(resourceURL);
> -    return cachedResource(url); 
> +    return cachedResource(url);

Should really get rid of this local variable as long as you are touching this.

> Source/WebCore/loader/cache/CachedResourceLoader.cpp:221
> +    request.clearFragmentIdentifier();

I don’t understand why this is needed. Wouldn’t splitFragmentIdentifierFromRequestURL have already taken care of this?

> Source/WebCore/loader/cache/CachedResourceRequest.cpp:60
> +    if (!m_resourceRequest.url().hasFragmentIdentifier())
> +        return;
> +    m_fragmentIdentifier = m_resourceRequest.url().fragmentIdentifier();
> +    m_resourceRequest.setURL(MemoryCache::removeFragmentIdentifierIfNeeded(m_resourceRequest.url()));

I am really confused by the "if needed" in the name of MemoryCache::removeFragmentIdentifierIfNeeded in the context of this code. Is there some case where we won’t remove it?

> Source/WebCore/loader/cache/CachedResourceRequest.h:68
> +

Please don’t put this blank line in here.

> Source/WebCore/loader/cache/MemoryCache.cpp:175
> +    // FIXME: CachedResourceLoader is making sure to make request url having no fragment identifier.
> +    // This should also be done for other users of the MemoryCache.

url should be URL in comments.

This comment is confusing. I would write.

    // FIXME: Change all clients to make sure URLs have no fragment identifiers before calling here.
    // CachedResourceLoader is now doing this. Add an assertion once all other clients are doing it too.
Comment 7 youenn fablet 2016-10-10 06:29:41 PDT
Thanks for the review.

> View in context:
> https://bugs.webkit.org/attachment.cgi?id=290906&action=review
> 
> > Source/WebCore/loader/cache/CachedResource.cpp:158
> > +    // FIXME: For this constructor, we should probably mandate that the URL has no fragment identifier.
> >      if (!m_resourceRequest.url().hasFragmentIdentifier())
> 
> It’s really ugly to have the identical code here and in
> splitFragmentIdentifierFromRequestURL.

OK, I will add a static CachedResourceRequest method.

> > Source/WebCore/loader/cache/CachedResourceLoader.cpp:155
> >      URL url = m_document->completeURL(resourceURL);
> > -    return cachedResource(url); 
> > +    return cachedResource(url);
> 
> Should really get rid of this local variable as long as you are touching
> this.

OK

> > Source/WebCore/loader/cache/CachedResourceLoader.cpp:221
> > +    request.clearFragmentIdentifier();
> 
> I don’t understand why this is needed. Wouldn’t
> splitFragmentIdentifierFromRequestURL have already taken care of this?

splitFragmentIdentifierFromRequestURL is storing the fragment identifier in the request so that the resource can retrieve it.

By calling clearFragmentIdentifier, the fragment identifier will get lost.
This should be the behavior for all resources: ideally CachedResource should not need that information.

I don't know why this is done so only for CSS stylesheets.


> > Source/WebCore/loader/cache/CachedResourceRequest.cpp:60
> > +    if (!m_resourceRequest.url().hasFragmentIdentifier())
> > +        return;
> > +    m_fragmentIdentifier = m_resourceRequest.url().fragmentIdentifier();
> > +    m_resourceRequest.setURL(MemoryCache::removeFragmentIdentifierIfNeeded(m_resourceRequest.url()));
> 
> I am really confused by the "if needed" in the name of
> MemoryCache::removeFragmentIdentifierIfNeeded in the context of this code.
> Is there some case where we won’t remove it?

Apparently fragment identifiers may be needed for none-HTTP URLs.
See the comment in MemoryCache::removeFragmentIdentifierIfNeeded implementation.
But I am not exactly sure how much that comment is correct/still true.

For instance, why data URLs should be kept unmodified? Is it a problem with our URL parser? data URL loader? Probably worth investigating.

> > Source/WebCore/loader/cache/CachedResourceRequest.h:68
> > +
> 
> Please don’t put this blank line in here.

OK

> > Source/WebCore/loader/cache/MemoryCache.cpp:175
> > +    // FIXME: CachedResourceLoader is making sure to make request url having no fragment identifier.
> > +    // This should also be done for other users of the MemoryCache.
> 
> url should be URL in comments.
> 
> This comment is confusing. I would write.
> 
>     // FIXME: Change all clients to make sure URLs have no fragment
> identifiers before calling here.
>     // CachedResourceLoader is now doing this. Add an assertion once all
> other clients are doing it too.

OK
Comment 8 youenn fablet 2016-10-10 06:47:38 PDT
Created attachment 291099 [details]
Patch
Comment 9 Darin Adler 2016-10-10 14:17:50 PDT
Comment on attachment 291099 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=291099&action=review

> Source/WebCore/loader/cache/CachedResourceLoader.cpp:154
> +    return cachedResource(m_document->completeURL(resourceURL));

Where is the code that removes the fragment identifier in this code path? Seems we take an arbitrary string, then complete it, then call a function that asserts there is no fragment identifier that needs to be removed. Sounds like wishful thinking!

> Source/WebCore/loader/cache/CachedResourceLoader.cpp:236
>      CachedResourceHandle<CachedCSSStyleSheet> userSheet = new CachedCSSStyleSheet(WTFMove(request), sessionID());

Wow, that looks wrong. Don’t we need adoptRef for lines of code like this one?

> Source/WebCore/loader/cache/CachedResourceRequest.cpp:58
> +        return String();

For the null string in cases like, I like to write { } rather than String(), or String { }.

> Source/WebCore/loader/cache/CachedResourceRequest.cpp:61
> +    String fragmentIdentifier = request.url().fragmentIdentifier();
> +    request.setURL(MemoryCache::removeFragmentIdentifierIfNeeded(request.url()));
> +    return fragmentIdentifier;

If MemoryCache::removeFragmentIdentifierIfNeeded decides not to remove the fragment identifier, do we really want the fragment identifier still be returned separately like this? Then we will have it twice, right?

> Source/WebCore/loader/cache/CachedResourceRequest.h:65
> +    void clearFragmentIdentifier() { m_fragmentIdentifier = String(); }

I think that { } is best style for a null string in a context like this, String { } second best, and String() a distant third.
Comment 10 youenn fablet 2016-10-11 07:55:07 PDT
(In reply to comment #9)
> Comment on attachment 291099 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=291099&action=review
> 
> > Source/WebCore/loader/cache/CachedResourceLoader.cpp:154
> > +    return cachedResource(m_document->completeURL(resourceURL));
> 
> Where is the code that removes the fragment identifier in this code path?
> Seems we take an arbitrary string, then complete it, then call a function
> that asserts there is no fragment identifier that needs to be removed.
> Sounds like wishful thinking!

I was not exactly sure what to do with this method.
This method is used by WebKit windows port and can receive any URL.

With the current patch, there is no behavior change in release mode, but there is a risk that it will assert on debug builds.
We can remove the fragment identifier, but it will change the behavior for that port.
Probably this is actually an improvement.

> 
> > Source/WebCore/loader/cache/CachedResourceLoader.cpp:236
> >      CachedResourceHandle<CachedCSSStyleSheet> userSheet = new CachedCSSStyleSheet(WTFMove(request), sessionID());
> 
> Wow, that looks wrong. Don’t we need adoptRef for lines of code like this
> one?

Not sure what is wrong.
Are you suggesting to add a method like adoptResource similarly to adoptRef?

> > Source/WebCore/loader/cache/CachedResourceRequest.cpp:58
> > +        return String();
> 
> For the null string in cases like, I like to write { } rather than String(),
> or String { }.

OK

> > Source/WebCore/loader/cache/CachedResourceRequest.cpp:61
> > +    String fragmentIdentifier = request.url().fragmentIdentifier();
> > +    request.setURL(MemoryCache::removeFragmentIdentifierIfNeeded(request.url()));
> > +    return fragmentIdentifier;
> 
> If MemoryCache::removeFragmentIdentifierIfNeeded decides not to remove the
> fragment identifier, do we really want the fragment identifier still be
> returned separately like this? Then we will have it twice, right?

Right, this would make unnecessary processing at CachedResource level.

> > Source/WebCore/loader/cache/CachedResourceRequest.h:65
> > +    void clearFragmentIdentifier() { m_fragmentIdentifier = String(); }
> 
> I think that { } is best style for a null string in a context like this,
> String { } second best, and String() a distant third.

OK
Comment 11 youenn fablet 2016-10-11 09:05:22 PDT
Created attachment 291265 [details]
Patch
Comment 12 Darin Adler 2016-10-17 10:13:00 PDT
Comment on attachment 291265 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=291265&action=review

> Source/WebCore/loader/cache/CachedResource.cpp:144
> +// FIXME: For this constructor, we should probably mandate that the URL has no fragment identifier.

I would write a comment more like the one in the memory cache, instead. Something a bit like this, I think:

    // FIXME: Once we update callers to pass URLs with no fragment identifiers, we should add an assertion to that effect and remove the call to splitFragmentIdentifierFromRequestURL.

But I am not sure that’s worded exactly correctly. For non-HTTP-family URLs we are not planning on removing fragment identifiers. Right?

> Source/WebCore/loader/cache/MemoryCache.cpp:180
> +    // FIXME: Change all clients to make sure URLs have no fragment identifiers before calling here.
> +    // CachedResourceLoader is now doing this. Add an assertion once all other clients are doing it too.

This doesn’t say HTTP-family URLs, and I think it should.
Comment 13 youenn fablet 2016-10-18 04:58:05 PDT
Created attachment 291940 [details]
Patch for landing
Comment 14 WebKit Commit Bot 2016-10-18 05:33:23 PDT
Comment on attachment 291940 [details]
Patch for landing

Clearing flags on attachment: 291940

Committed r207459: <http://trac.webkit.org/changeset/207459>
Comment 15 WebKit Commit Bot 2016-10-18 05:33:28 PDT
All reviewed patches have been landed.  Closing bug.
Comment 16 youenn fablet 2016-10-18 05:38:22 PDT
> But I am not sure that’s worded exactly correctly. For non-HTTP-family URLs
> we are not planning on removing fragment identifiers. Right?

I do not really understand this restriction to HTTP(s) URLS only.
This needs further investigation.

> > Source/WebCore/loader/cache/MemoryCache.cpp:180
> > +    // FIXME: Change all clients to make sure URLs have no fragment identifiers before calling here.
> > +    // CachedResourceLoader is now doing this. Add an assertion once all other clients are doing it too.
> 
> This doesn’t say HTTP-family URLs, and I think it should.

Done.