Bug 103655 - Add PlugInOriginHash
Summary: Add PlugInOriginHash
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: Mac OS X 10.8
: P2 Normal
Assignee: Jon Lee
URL:
Keywords: InRadar
Depends on:
Blocks: 103206
  Show dependency treegraph
 
Reported: 2012-11-29 11:55 PST by Jon Lee
Modified: 2012-12-07 00:47 PST (History)
15 users (show)

See Also:


Attachments
Patch (13.52 KB, patch)
2012-11-29 12:47 PST, Jon Lee
no flags Details | Formatted Diff | Diff
Patch (13.53 KB, patch)
2012-11-29 13:14 PST, Jon Lee
no flags Details | Formatted Diff | Diff
Patch (14.14 KB, patch)
2012-11-29 13:18 PST, Jon Lee
no flags Details | Formatted Diff | Diff
Patch (13.99 KB, patch)
2012-12-04 01:32 PST, Jon Lee
no flags Details | Formatted Diff | Diff
Patch (14.05 KB, patch)
2012-12-04 02:07 PST, Jon Lee
no flags Details | Formatted Diff | Diff
Patch (14.13 KB, patch)
2012-12-05 09:14 PST, Jon Lee
ap: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jon Lee 2012-11-29 11:55:44 PST
Create a hash that uniquely identifies a combination of origins for plug-ins.
Comment 1 Radar WebKit Bug Importer 2012-11-29 11:55:52 PST
<rdar://problem/12778949>
Comment 2 Jon Lee 2012-11-29 12:47:50 PST
Created attachment 176795 [details]
Patch
Comment 3 Jon Lee 2012-11-29 13:14:03 PST
Created attachment 176798 [details]
Patch
Comment 4 Jon Lee 2012-11-29 13:18:13 PST
Created attachment 176800 [details]
Patch
Comment 5 Anders Carlsson 2012-11-29 14:45:15 PST
Comment on attachment 176800 [details]
Patch

I don't think you want to force localHost if the plug-in or page-host are empty. In addition, you should also check the plug-in MIME type. I'm wondering if this should just take a HTMLPlugInElement instead.
Comment 6 Alexey Proskuryakov 2012-11-29 14:51:07 PST
Comment on attachment 176800 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=176800&action=review

r- for layering violation.

> Source/WebCore/ChangeLog:18
> +            In the case where the host has an empty string (like a file URL), we substitute in "localhost".

Why is that desirable?

> Source/WebCore/platform/PlugInOriginHash.cpp:13
> + * THIS SOFTWARE IS PROVIDED BY APPLE COMPUTER, INC. ``AS IS'' AND ANY

What's "Apple Computer, Inc."?

> Source/WebCore/platform/PlugInOriginHash.cpp:41
> +static inline uint64_t codeFoldedHash(const String& string)
> +{
> +    if (string.is8Bit())
> +        return StringHasher::computeHash<LChar, CaseFoldingHash::foldCase<LChar> >(string.characters8(), string.length());

The function you are calling is named "compute hash", so why does the new function not have this verb in its name?

> Source/WebCore/platform/PlugInOriginHash.h:34
> +typedef uint64_t PlugInOriginHash;

Our existing types named "Hash" are classes, see e.g. ones in StringHash.h. It's quite confusing to have a drastically different one for plug-ins.

> Source/WebCore/platform/PlugInOriginHash.h:36
> +PlugInOriginHash plugInOriginHash(const Page*, const KURL& srcURL);

Code in WebCore/platform cannot use Page or Frame classes. It's either (1) platform API wrappers that do not know about Web classes or concepts or (2) platform level code that we build from scratch.

Even having "plugin" in the name is quite suspect.
Comment 7 Jon Lee 2012-12-04 01:32:31 PST
Created attachment 177446 [details]
Patch
Comment 8 WebKit Review Bot 2012-12-04 01:36:14 PST
Comment on attachment 177446 [details]
Patch

Attachment 177446 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/15100945
Comment 9 Build Bot 2012-12-04 01:38:34 PST
Comment on attachment 177446 [details]
Patch

Attachment 177446 [details] did not pass mac-ews (mac):
Output: http://queues.webkit.org/results/15132359
Comment 10 Early Warning System Bot 2012-12-04 01:38:37 PST
Comment on attachment 177446 [details]
Patch

Attachment 177446 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/15098972
Comment 11 EFL EWS Bot 2012-12-04 01:40:09 PST
Comment on attachment 177446 [details]
Patch

Attachment 177446 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/15097949
Comment 12 Early Warning System Bot 2012-12-04 01:41:04 PST
Comment on attachment 177446 [details]
Patch

Attachment 177446 [details] did not pass qt-wk2-ews (qt):
Output: http://queues.webkit.org/results/15120760
Comment 13 Peter Beverloo (cr-android ews) 2012-12-04 01:46:36 PST
Comment on attachment 177446 [details]
Patch

Attachment 177446 [details] did not pass cr-android-ews (chromium-android):
Output: http://queues.webkit.org/results/15126551
Comment 14 Jon Lee 2012-12-04 02:07:44 PST
Created attachment 177451 [details]
Patch
Comment 15 Darin Adler 2012-12-04 10:13:57 PST
Comment on attachment 177451 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=177451&action=review

review- because there is a much better way to do this hash correctly

> Source/WebCore/ChangeLog:19
> +        benefit either, since retrieving the host from the KURL creates a brand new string every time. We shift
> +        the page's host hash, since otherwise, in the case of where the plug-in is loading from the same domain
> +        as the page, it would lead to a combined hash of 0.

I don’t understand why a right shift of 1 is the right way to solve that problem. It throws away 1/32 of the host hash. I’d think a rotate would be better than a shift.

> Source/WebCore/plugins/PlugInOriginHash.cpp:44
> +static inline unsigned computeCaseFoldedHash(const String& string)
> +{
> +    if (string.is8Bit())
> +        return StringHasher::computeHash<LChar, CaseFoldingHash::foldCase<LChar> >(string.characters8(), string.length());
> +    return StringHasher::computeHash<UChar, CaseFoldingHash::foldCase<UChar> >(string.characters16(), string.length());
> +}

The fact that this is not a duplicate of the CaseFoldingHash::hash function from StringHash.h is a subtle detail that is not explained here. The change log mentions that we don’t want to mask the top 8 bits, but it’s easy to overlook that so we need a comment explaining that, not just change log. And I also think we should consider putting this function into StringHash.h in the future. We just need to figure out the right name for it.

> Source/WebCore/plugins/PlugInOriginHash.cpp:51
> +    String plugInSrcHost = plugInURL.host();

Why put this into a local variable, but not put plugInElement->serviceType() into a local variable?

> Source/WebCore/plugins/PlugInOriginHash.cpp:53
> +    return (computeCaseFoldedHash(pageHost) >> 1) ^ computeCaseFoldedHash(plugInSrcHost) ^ computeCaseFoldedHash(plugInElement->serviceType());

I’m not sure that exclusive or is a good way to combine these three hashes; normally we hash the hashes, or make a single hash. And the need to shift the hash of the page host and the source host is sufficiently non-obvious that it needs to be in a comment, not just change log.

I think I know a much better way to do this. Here’s my idea. First, we refactor the StringHasher::computeHash static member function into a new StringHasher::addCharacters(const T* data, unsigned length) non-static member function, so we have a function we can use on an existing StringHasher multiple times. Then we write a function to add a string to a hash named addCaseFoldedCharacters, which calls addCharacters. Then we write this:

    StringHasher hasher;
    addCaseFoldedCharacters(hasher, page->mainFrame()->document()->baseURL().host());
    hasher.addCharacter(0);
    addCaseFoldedCharacters(hasher, plugInURL.host());
    hasher.addCharacter(0);
    addCaseFoldedCharacters(hasher, plugInElement->serviceType());
    return hasher.hash();

This way, we won’t have to do any tricky shifting or worry about how well exclusive or works.

> Source/WebCore/plugins/PlugInOriginHash.h:35
> +    static unsigned hash(HTMLPlugInImageElement* plugInElement, const KURL& plugInURL);

Normally we need an equality function to go along with a hash function. Are we actually going to rely on these hashes without also doing equality checks?
Comment 16 Jon Lee 2012-12-04 22:49:22 PST
Comment on attachment 177451 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=177451&action=review

>> Source/WebCore/plugins/PlugInOriginHash.cpp:53
>> +    return (computeCaseFoldedHash(pageHost) >> 1) ^ computeCaseFoldedHash(plugInSrcHost) ^ computeCaseFoldedHash(plugInElement->serviceType());
> 
> I’m not sure that exclusive or is a good way to combine these three hashes; normally we hash the hashes, or make a single hash. And the need to shift the hash of the page host and the source host is sufficiently non-obvious that it needs to be in a comment, not just change log.
> 
> I think I know a much better way to do this. Here’s my idea. First, we refactor the StringHasher::computeHash static member function into a new StringHasher::addCharacters(const T* data, unsigned length) non-static member function, so we have a function we can use on an existing StringHasher multiple times. Then we write a function to add a string to a hash named addCaseFoldedCharacters, which calls addCharacters. Then we write this:
> 
>     StringHasher hasher;
>     addCaseFoldedCharacters(hasher, page->mainFrame()->document()->baseURL().host());
>     hasher.addCharacter(0);
>     addCaseFoldedCharacters(hasher, plugInURL.host());
>     hasher.addCharacter(0);
>     addCaseFoldedCharacters(hasher, plugInElement->serviceType());
>     return hasher.hash();
> 
> This way, we won’t have to do any tricky shifting or worry about how well exclusive or works.

Excellent idea. I've posted a patch to implement StringHasher::addCharacters() in bug 104076.

>> Source/WebCore/plugins/PlugInOriginHash.h:35
>> +    static unsigned hash(HTMLPlugInImageElement* plugInElement, const KURL& plugInURL);
> 
> Normally we need an equality function to go along with a hash function. Are we actually going to rely on these hashes without also doing equality checks?

Normally you would, but this hash is not being used as a hash function for any data structure, and the intent was to only have the hashes shared among the processes like visited links. Having an equal() would imply needing to create a new struct for plug-in origins, and then sharing that data.

I'm hoping with your proposed scheme that we avoid collisions. If we find ourselves with collisions, then I think we can revisit this, and switch from using HashSet<unsigned> to HashSet<PlugInOrigin> or something to that effect.
Comment 17 Jon Lee 2012-12-05 09:14:15 PST
Created attachment 177773 [details]
Patch
Comment 18 Alexey Proskuryakov 2012-12-06 17:55:49 PST
Comment on attachment 177773 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=177773&action=review

Looks like Darin's comments have been addressed.

> Source/WebCore/plugins/PlugInOriginHash.cpp:39
> +static inline void addCaseFoldedCharacters(StringHasher &hasher, const String& string)

Misplaced & here.

> Source/WebCore/plugins/PlugInOriginHash.cpp:49
> +    if (!plugInElement->document()->page())
> +        return 0;

That's quite likely to cause collisions!

Perhaps we should require the document to be in page instead of silently returning a bad hash?

> Source/WebCore/plugins/PlugInOriginHash.cpp:59
> +    LOG(Plugins, "Hash: %s %s %s", plugInElement->document()->page()->mainFrame()->document()->baseURL().host().utf8().data(), plugInURL.host().utf8().data(), plugInElement->serviceType().utf8().data());

Do you still need this? It looks like it will be too chatty with Plugins channel enabled.

> Source/WebCore/plugins/PlugInOriginHash.h:35
> +    static unsigned hash(HTMLPlugInImageElement* plugInElement, const KURL& plugInURL);

Argument names seem unnecessary here.
Comment 19 Jon Lee 2012-12-07 00:39:47 PST
Comment on attachment 177773 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=177773&action=review

Thanks for reviewing!

>> Source/WebCore/plugins/PlugInOriginHash.cpp:39
>> +static inline void addCaseFoldedCharacters(StringHasher &hasher, const String& string)
> 
> Misplaced & here.

Fixed.

>> Source/WebCore/plugins/PlugInOriginHash.cpp:49
>> +        return 0;
> 
> That's quite likely to cause collisions!
> 
> Perhaps we should require the document to be in page instead of silently returning a bad hash?

I can convert this to an assert.

>> Source/WebCore/plugins/PlugInOriginHash.cpp:59
>> +    LOG(Plugins, "Hash: %s %s %s", plugInElement->document()->page()->mainFrame()->document()->baseURL().host().utf8().data(), plugInURL.host().utf8().data(), plugInElement->serviceType().utf8().data());
> 
> Do you still need this? It looks like it will be too chatty with Plugins channel enabled.

I'd like to keep this in for a little while. It will only appear for ports that use plug-in snapshotting, anyway.

>> Source/WebCore/plugins/PlugInOriginHash.h:35
>> +    static unsigned hash(HTMLPlugInImageElement* plugInElement, const KURL& plugInURL);
> 
> Argument names seem unnecessary here.

Removed.
Comment 20 Jon Lee 2012-12-07 00:44:24 PST
Committed r136932: <http://trac.webkit.org/changeset/136932>
Comment 21 Jon Lee 2012-12-07 00:47:34 PST
And build fix r136933.