Bug 50182

Summary: Introduce the notion of a "display-isolated" URL scheme for use by Chrome-internal URLs
Product: WebKit Reporter: Adam Barth <abarth>
Component: New BugsAssignee: Adam Barth <abarth>
Status: RESOLVED FIXED    
Severity: Normal CC: commit-queue, darin, fishd, sam, webkit-ews
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Other   
OS: OS X 10.5   
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch eric: review+, abarth: commit-queue+

Description Adam Barth 2010-11-29 14:07:59 PST
Introduce the notion of a "display-isolated" URL scheme for use by Chrome-internal URLs
Comment 1 Adam Barth 2010-11-29 14:18:29 PST
Created attachment 75059 [details]
Patch
Comment 2 Adam Barth 2010-11-29 14:20:21 PST
Created attachment 75060 [details]
Patch
Comment 4 Adam Barth 2010-11-29 14:23:33 PST
See also http://codereview.chromium.org/5268006, which is the Chrome side of the change.
Comment 5 Early Warning System Bot 2010-11-29 14:27:38 PST
Attachment 75059 [details] did not build on qt:
Build output: http://queues.webkit.org/results/6384096
Comment 6 Darin Adler 2010-11-29 14:32:36 PST
Comment on attachment 75060 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=75060&action=review

I have enough questions about this that I’m going to say review- for now, but it may just be my misunderstanding rather than defects in the patch.

> WebCore/page/SecurityOrigin.cpp:-313
> -    if (!SchemeRegistry::shouldTreatURLAsLocal(url.string()))
> -        return true;

Since you’re changing this policy, how do we maintain compatibility with existing Mac OS X applications using the public API, +[WebKit registerURLSchemeAsLocal:], and expecting this behavior?

> WebCore/platform/SchemeRegistry.cpp:-95
> -    if (scheme == "file")
> -        return;
> -#if PLATFORM(MAC)
> -    if (scheme == "applewebdata")
> -        return;
> -#endif

Why did you remove this?

> WebCore/platform/SchemeRegistry.cpp:-134
> -    // This avoids an allocation of another String and the HashSet contains()
> -    // call for the file: and http: schemes.
> -    if (scheme.length() == 4) {
> -        const UChar* s = scheme.characters();
> -        if (s[0] == 'h' && s[1] == 't' && s[2] == 't' && s[3] == 'p')
> -            return false;
> -        if (s[0] == 'f' && s[1] == 'i' && s[2] == 'l' && s[3] == 'e')
> -            return true;
> -    }

What’s the rationale for removing this fast path?

> WebCore/platform/SchemeRegistry.cpp:116
> -    if (scheme.isEmpty())
> -        return false;
> -
> -    return WebCore::localURLSchemes().contains(scheme);
> +    return localURLSchemes().contains(scheme);

Why isn’t the isEmpty change needed any more? Does function still work for both null strings and empty strings? If not, did you audit the callers or do something else?

> WebKit/chromium/public/WebSecurityPolicy.h:53
> +    // Registers a URL scheme to be treated as display-isolated.  This means

Double space after period here. Are you a conscientious objector?

> WebKit/chromium/public/WebSecurityPolicy.h:54
> +    // that pages cannot dispaly these URLs unless they from the same scheme.

Typo: “dispaly”
Typo: “they from”
Comment 7 Darin Adler 2010-11-29 14:33:11 PST
Is there a way to isolate the addition of the new “display-isolated” feature from the refactoring and change in the behavior of existing features?
Comment 8 Early Warning System Bot 2010-11-29 14:36:39 PST
Attachment 75060 [details] did not build on qt:
Build output: http://queues.webkit.org/results/6459006
Comment 9 Adam Barth 2010-11-29 14:59:26 PST
Comment on attachment 75060 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=75060&action=review

>> WebCore/page/SecurityOrigin.cpp:-313
>> -    if (!SchemeRegistry::shouldTreatURLAsLocal(url.string()))
>> -        return true;
> 
> Since you’re changing this policy, how do we maintain compatibility with existing Mac OS X applications using the public API, +[WebKit registerURLSchemeAsLocal:], and expecting this behavior?

This shouldn't affect +[WebKit registerURLSchemeAsLocal:].  The diff is just confusing.  Notice that this branch now occurs on line 320.  The only difference is that we're using the protocol from the parsed URL instead of having shouldTreatURLAsLocal "reparse" the URL.

>> WebCore/platform/SchemeRegistry.cpp:-95
>> -void SchemeRegistry::removeURLSchemeRegisteredAsLocal(const String& scheme)
>> -{
>> -    if (scheme == "file")
>> -        return;
>> -#if PLATFORM(MAC)
>> -    if (scheme == "applewebdata")
>> -        return;
>> -#endif
> 
> Why did you remove this?

Because I couldn't find any callers, but it looks like I didn't search widely enough as Qt seems to use it.  :)

>> WebCore/platform/SchemeRegistry.cpp:-134
>> -    // This avoids an allocation of another String and the HashSet contains()
>> -    // call for the file: and http: schemes.
>> -    if (scheme.length() == 4) {
>> -        const UChar* s = scheme.characters();
>> -        if (s[0] == 'h' && s[1] == 't' && s[2] == 't' && s[3] == 'p')
>> -            return false;
>> -        if (s[0] == 'f' && s[1] == 'i' && s[2] == 'l' && s[3] == 'e')
>> -            return true;
>> -    }
> 
> What’s the rationale for removing this fast path?

It looks a lot like premature optimization.  Do you know of any benchmark that's affected by this change?  If so, we should add "https" to the list least secure pages be disadvantaged.

>> WebCore/platform/SchemeRegistry.cpp:116
>> +    return localURLSchemes().contains(scheme);
> 
> Why isn’t the isEmpty change needed any more? Does function still work for both null strings and empty strings? If not, did you audit the callers or do something else?

None of the other similar functions have a special-case for the empty scheme.  To get different behavior, someone would have to call this function with an empty string (which doesn't make sense because URL can't have an empty scheme) and someone would have to register the empty string as a local URL (which also doesn't make sense for the same reason).

In any case, I can restore this function to its previous form and post these changes separately.

>> WebKit/chromium/public/WebSecurityPolicy.h:53
>> +    // Registers a URL scheme to be treated as display-isolated.  This means
> 
> Double space after period here. Are you a conscientious objector?

I don't mean to be a conscientious objector.  I do a lot of writing and almost all of it is in "two space" style.  :(

>> WebKit/chromium/public/WebSecurityPolicy.h:54
>> +    // that pages cannot dispaly these URLs unless they from the same scheme.
> 
> Typo: “dispaly”
> Typo: “they from”

Thanks.  Fixed.
Comment 10 Adam Barth 2010-11-29 15:11:31 PST
Created attachment 75070 [details]
Patch
Comment 11 Darin Adler 2010-11-29 15:50:58 PST
(In reply to comment #9)
t> >> WebCore/platform/SchemeRegistry.cpp:-134
> >> -    // This avoids an allocation of another String and the HashSet contains()
> >> -    // call for the file: and http: schemes.
> >> -    if (scheme.length() == 4) {
> >> -        const UChar* s = scheme.characters();
> >> -        if (s[0] == 'h' && s[1] == 't' && s[2] == 't' && s[3] == 'p')
> >> -            return false;
> >> -        if (s[0] == 'f' && s[1] == 'i' && s[2] == 'l' && s[3] == 'e')
> >> -            return true;
> >> -    }
> > 
> > What’s the rationale for removing this fast path?
> 
> It looks a lot like premature optimization. Do you know of any benchmark that's affected by this change? If so, we should add "https" to the list least secure pages be disadvantaged.

WIth code that is 4 years old (added in http://trac.webkit.org/changeset/19952 in early 2007) I think the burden of proof may go in the other direction. I understand your desire to remove an unhelpful “optimization” and if we can quickly convince ourselves it’s not needed then it does seem good to get rid of it. To maximize our chances of noticing if it causes a problem, it would be best to remove it in a separate patch.

It does seem that the code path is much changed since that original change set. The original fast path was more valuable and avoided more work. The fast path was copied when the “scheme” version was done in http://trac.webkit.org/changeset/28343 nine months later.

> >> WebCore/platform/SchemeRegistry.cpp:116
> >> +    return localURLSchemes().contains(scheme);
> > 
> > Why isn’t the isEmpty change needed any more? Does function still work for both null strings and empty strings? If not, did you audit the callers or do something else?
> 
> None of the other similar functions have a special-case for the empty scheme. To get different behavior, someone would have to call this function with an empty string (which doesn't make sense because URL can't have an empty scheme) and someone would have to register the empty string as a local URL (which also doesn't make sense for the same reason).
> 
> In any case, I can restore this function to its previous form and post these changes separately.

The difference from other functions made me want to research who added the isEmpty and why. It looks like it was included when shouldTreatSchemeAsLocal was added back in r28343.

I think doing it separately would ease my concern a bit; maybe I would not have even squawked if you included a rationale in the change log.

The register functions can definitely be called with the null string and empty string as part of the Mac OS X WebKit API, so those functions need some kind of guard at either the WebKit or WebCore level, or we can just live with crashing if someone passes a nonsensical scheme string.

> I do a lot of writing and almost all of it is in "two space" style.  :(

Same here, but the other way around. But less of my writing ends up published, more’s the pity :-(

Since I was wrong about registerURLSchemeAsLocal, I would like to change my review- to a review+, but I think you have to make changes to compile on Qt anyway, so I’ll leave it for now.
Comment 12 Adam Barth 2010-11-29 16:17:07 PST
Comment on attachment 75070 [details]
Patch

Thanks for the review.  I'll most a separate patch with the controversial parts in a bit.
Comment 13 WebKit Commit Bot 2010-11-29 20:48:53 PST
Comment on attachment 75070 [details]
Patch

Clearing flags on attachment: 75070

Committed r72876: <http://trac.webkit.org/changeset/72876>
Comment 14 WebKit Commit Bot 2010-11-29 20:48:59 PST
All reviewed patches have been landed.  Closing bug.
Comment 15 Adam Barth 2011-01-10 16:04:41 PST
This was reverted in http://trac.webkit.org/changeset/73002
Comment 16 Adam Barth 2011-01-10 17:02:17 PST
Committed r75455: <http://trac.webkit.org/changeset/75455>
Comment 17 Adam Barth 2011-01-10 17:03:23 PST
Fixed is a bit of a misnomer.  I just landed the routine parts of the old patch.  Looking at the non-routine parts now.
Comment 18 Adam Barth 2011-01-10 18:31:53 PST
Created attachment 78487 [details]
Patch
Comment 19 Eric Seidel (no email) 2011-01-11 00:51:40 PST
Comment on attachment 78487 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=78487&action=review

In general this looks fine.  How do we test this?

> Source/WebCore/page/SecurityOrigin.cpp:306
> +    RefPtr<SecurityOrigin> targetOrigin = SecurityOrigin::create(url);
> +    return isAccessWhiteListed(targetOrigin.get());

Really?  A malloc is required to check a whitelist?

> Source/WebCore/page/SecurityOrigin.cpp:315
> -    if (url.protocolIs(BlobURL::blobProtocol()))
> +    if (protocol == BlobURL::blobProtocol())

How does .protocolIs differ from == ?  I'm reminded of QualifiedName == vs. .matches()
Comment 20 Adam Barth 2011-01-11 01:02:39 PST
(In reply to comment #19)
> (From update of attachment 78487 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=78487&action=review
> 
> In general this looks fine.  How do we test this?

That's a good question.  This stuff is really better tested with unit tests.  We could expose APIs in layoutTestController for manipulating these registries, but we wouldn't be able to test different URL schemes without also somehow registering URL scheme handlers...  We certainly have tests for the basic http/file URL interactions.

> > Source/WebCore/page/SecurityOrigin.cpp:306
> > +    RefPtr<SecurityOrigin> targetOrigin = SecurityOrigin::create(url);
> > +    return isAccessWhiteListed(targetOrigin.get());
> 
> Really?  A malloc is required to check a whitelist?

The problem is that the canonicalization logic is baked into the SecurityOrigin constructor.  We should pull it out into its own function so it can be called without instantiating a SecurityOrigin object.

> > Source/WebCore/page/SecurityOrigin.cpp:315
> > -    if (url.protocolIs(BlobURL::blobProtocol()))
> > +    if (protocol == BlobURL::blobProtocol())
> 
> How does .protocolIs differ from == ?  I'm reminded of QualifiedName == vs. .matches()

I can leave this as is.  protocolIs just saves a string allocation and understands how to do case folding.  In this case, we've already allocated the string and canonicalized it, so using == is fine.
Comment 21 Eric Seidel (no email) 2011-01-11 01:26:52 PST
Comment on attachment 78487 [details]
Patch

OK.
Comment 22 Adam Barth 2011-01-11 14:43:31 PST
Comment on attachment 78487 [details]
Patch

I'm going to watch the bloat-http perf bot once this lands.
Comment 23 Adam Barth 2011-01-11 14:53:54 PST
Committed r75557: <http://trac.webkit.org/changeset/75557>