Bug 177575

Summary: Split some logic out of VisitedLinkStore and make it reusable
Product: WebKit Reporter: Chris Dumez <cdumez>
Component: WebKit2Assignee: Chris Dumez <cdumez>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, beidson, commit-queue, darin, ggaren, sam, webkit-bug-importer, youennf
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=174541
Bug Depends on:    
Bug Blocks: 177776    
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch
none
Patch none

Description Chris Dumez 2017-09-27 15:40:15 PDT
Split some logic out of VisitedLinkStore and make it reusable for other purposes than visited links and from other processes than the UIProcess.
Comment 1 Chris Dumez 2017-09-27 15:44:41 PDT
Created attachment 322032 [details]
Patch
Comment 2 Alex Christensen 2017-09-28 10:22:13 PDT
Comment on attachment 322032 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=322032&action=review

> Source/WebKit/Shared/LinkHashTable.h:35
> +class LinkHashTable {

Since we're going to be using this for things that aren't links, could we call this something like SharedStringHashTable?

> Source/WebKit/Shared/LinkHashTable.h:45
> +    bool contains(WebCore::LinkHash) const;

Could we rename LinkHash to SharedStringHash?
Comment 3 Alex Christensen 2017-09-28 10:26:36 PDT
Specifically, I think we're going to be putting SecurityOrigins into the shared hash table for service workers, so they won't even be URLs.
Comment 4 Chris Dumez 2017-09-28 10:39:20 PDT
(In reply to Alex Christensen from comment #3)
> Specifically, I think we're going to be putting SecurityOrigins into the
> shared hash table for service workers, so they won't even be URLs.

A SecurityOrigin can be stored as an URL and thus as a LinkHash as far as I can tell? Renaming LinkHash would make sense though.
Comment 5 Chris Dumez 2017-09-28 10:41:30 PDT
(In reply to Alex Christensen from comment #3)
> Specifically, I think we're going to be putting SecurityOrigins into the
> shared hash table for service workers, so they won't even be URLs.(In reply to Chris Dumez from comment #4)
> (In reply to Alex Christensen from comment #3)
> > Specifically, I think we're going to be putting SecurityOrigins into the
> > shared hash table for service workers, so they won't even be URLs.
> 
> A SecurityOrigin can be stored as an URL and thus as a LinkHash as far as I
> can tell? Renaming LinkHash would make sense though.

Also, if we eventually store the scope (path), storing as a URL makes even more sense.
Comment 6 Alex Christensen 2017-09-28 10:42:24 PDT
Either way, it will be a String that is not a link.
Comment 7 Chris Dumez 2017-09-28 11:57:42 PDT
(In reply to Alex Christensen from comment #6)
> Either way, it will be a String that is not a link.

How about renaming LinkHash to URLHash since it is constructed from a URL?
Then I'd rename LinkHashTable to URLHashTable.
Comment 8 Sam Weinig 2017-09-28 14:29:59 PDT
Comment on attachment 322032 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=322032&action=review

> Source/WebKit/Shared/WebSharedLinkHashStore.h:36
> +class WebSharedLinkHashStore {

Why the Web prefix? Seems like SharedLinkHashStore would be fine.
Comment 9 Alex Christensen 2017-09-28 14:31:26 PDT
(In reply to Chris Dumez from comment #7)
> (In reply to Alex Christensen from comment #6)
> > Either way, it will be a String that is not a link.
> 
> How about renaming LinkHash to URLHash since it is constructed from a URL?
> Then I'd rename LinkHashTable to URLHashTable.
It's constructed from a String.
Comment 10 Chris Dumez 2017-09-28 14:53:20 PDT
(In reply to Sam Weinig from comment #8)
> Comment on attachment 322032 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=322032&action=review
> 
> > Source/WebKit/Shared/WebSharedLinkHashStore.h:36
> > +class WebSharedLinkHashStore {
> 
> Why the Web prefix? Seems like SharedLinkHashStore would be fine.

Good to know. I have no idea when it is suitable to use the Web prefixed and when it is not :)
Comment 11 Chris Dumez 2017-09-28 16:09:44 PDT
Created attachment 322136 [details]
Patch
Comment 12 Chris Dumez 2017-09-28 16:10:25 PDT
Created attachment 322137 [details]
Patch
Comment 13 Chris Dumez 2017-09-28 16:22:19 PDT
Created attachment 322140 [details]
Patch
Comment 14 Chris Dumez 2017-09-28 16:45:33 PDT
Created attachment 322145 [details]
Patch
Comment 15 Sam Weinig 2017-09-28 19:56:31 PDT
(In reply to Chris Dumez from comment #10)
> (In reply to Sam Weinig from comment #8)
> > Comment on attachment 322032 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=322032&action=review
> > 
> > > Source/WebKit/Shared/WebSharedLinkHashStore.h:36
> > > +class WebSharedLinkHashStore {
> > 
> > Why the Web prefix? Seems like SharedLinkHashStore would be fine.
> 
> Good to know. I have no idea when it is suitable to use the Web prefixed and
> when it is not :)

Never. It was a mistake of my long forgotten youth.
Comment 16 Alex Christensen 2017-09-28 21:42:48 PDT
Comment on attachment 322145 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=322145&action=review

> Source/WebCore/platform/SharedStringHash.cpp:225
> +static ALWAYS_INLINE void computeSharedStringHashInline(const URL& base, const CharacterType* characters, unsigned length, Vector<CharacterType, 512>& buffer)

It looks like this is specific to the visited links, since it does a bunch of stuff with URLs.  This should probably not be shared.
Comment 17 Chris Dumez 2017-09-29 08:48:45 PDT
Comment on attachment 322145 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=322145&action=review

>> Source/WebCore/platform/SharedStringHash.cpp:225
>> +static ALWAYS_INLINE void computeSharedStringHashInline(const URL& base, const CharacterType* characters, unsigned length, Vector<CharacterType, 512>& buffer)
> 
> It looks like this is specific to the visited links, since it does a bunch of stuff with URLs.  This should probably not be shared.

It computes a SharedStringHash from a URL. Not sure I see anything wrong with using generic naming here.
Comment 18 Alex Christensen 2017-09-29 13:42:06 PDT
Comment on attachment 322145 [details]
Patch

I guess you're right.  r=me.
Comment 19 WebKit Commit Bot 2017-09-29 14:14:14 PDT
Comment on attachment 322145 [details]
Patch

Clearing flags on attachment: 322145

Committed r222664: <http://trac.webkit.org/changeset/222664>
Comment 20 WebKit Commit Bot 2017-09-29 14:14:16 PDT
All reviewed patches have been landed.  Closing bug.
Comment 21 Radar WebKit Bug Importer 2017-09-29 14:15:30 PDT
<rdar://problem/34747281>
Comment 22 Darin Adler 2017-09-30 17:03:54 PDT
Comment on attachment 322145 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=322145&action=review

>>> Source/WebCore/platform/SharedStringHash.cpp:225
>>> +static ALWAYS_INLINE void computeSharedStringHashInline(const URL& base, const CharacterType* characters, unsigned length, Vector<CharacterType, 512>& buffer)
>> 
>> It looks like this is specific to the visited links, since it does a bunch of stuff with URLs.  This should probably not be shared.
> 
> It computes a SharedStringHash from a URL. Not sure I see anything wrong with using generic naming here.

I think the generic naming might not be great.

What makes this the hash of a URL a "link hash" is that when computing the hash of a URL, the code ignores things to allow links that are “equivalent” in a high level way have the same hash. It’s not just a hash of the contents of the URL, it’s a hash that goes out of its way to ignore certain classes of differences and guarantees that these equivalent sets of links have the same hashes. I don’t think the name “link hash” is exactly right for this either.

The second thing that makes this hash different from the other hashes we provide in WTF is that this is a hash that is intended to be used without a fall back to string comparison. We use this hash to check if URLs are the same and we don’t compare the URL strings at all. That’s why it has to be larger than 32 bits.

I don’t think that “shared string hash” makes it clear that the hash of a URL has these properties, nor that the hash of the strings does, so I’m not sure it’s a great name. A great name would make it clear that it’s a hash used for direct comparisons rather than a hash used along with actual equality comparisons, and that’s why it’s large, and could drive other tradeoffs. A great name might also make it clear that the URL function is specifically designed for “link coloring” and might not be appropriate for all cases of hashing URLs.
Comment 23 Chris Dumez 2017-10-02 13:01:41 PDT
Comment on attachment 322145 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=322145&action=review

>>>> Source/WebCore/platform/SharedStringHash.cpp:225
>>>> +static ALWAYS_INLINE void computeSharedStringHashInline(const URL& base, const CharacterType* characters, unsigned length, Vector<CharacterType, 512>& buffer)
>>> 
>>> It looks like this is specific to the visited links, since it does a bunch of stuff with URLs.  This should probably not be shared.
>> 
>> It computes a SharedStringHash from a URL. Not sure I see anything wrong with using generic naming here.
> 
> I think the generic naming might not be great.
> 
> What makes this the hash of a URL a "link hash" is that when computing the hash of a URL, the code ignores things to allow links that are “equivalent” in a high level way have the same hash. It’s not just a hash of the contents of the URL, it’s a hash that goes out of its way to ignore certain classes of differences and guarantees that these equivalent sets of links have the same hashes. I don’t think the name “link hash” is exactly right for this either.
> 
> The second thing that makes this hash different from the other hashes we provide in WTF is that this is a hash that is intended to be used without a fall back to string comparison. We use this hash to check if URLs are the same and we don’t compare the URL strings at all. That’s why it has to be larger than 32 bits.
> 
> I don’t think that “shared string hash” makes it clear that the hash of a URL has these properties, nor that the hash of the strings does, so I’m not sure it’s a great name. A great name would make it clear that it’s a hash used for direct comparisons rather than a hash used along with actual equality comparisons, and that’s why it’s large, and could drive other tradeoffs. A great name might also make it clear that the URL function is specifically designed for “link coloring” and might not be appropriate for all cases of hashing URLs.

I don't have a good name suggestion at the moment. How about I keep the previous naming (computeVisitedLinkHash) for this function?
Comment 24 Darin Adler 2017-10-02 13:04:15 PDT
(In reply to Chris Dumez from comment #23)
> I don't have a good name suggestion at the moment. How about I keep the
> previous naming (computeVisitedLinkHash) for this function?

Sounds OK to me.