WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 224138
Use Hasher more, remove IntegerHasher, fix hashing-related mistakes
https://bugs.webkit.org/show_bug.cgi?id=224138
Summary
Use Hasher more, remove IntegerHasher, fix hashing-related mistakes
Darin Adler
Reported
2021-04-02 15:50:17 PDT
Use Hasher more, remove IntegerHasher, fix hashing-related mistakes
Attachments
Patch
(85.08 KB, patch)
2021-04-02 17:02 PDT
,
Darin Adler
no flags
Details
Formatted Diff
Diff
Patch
(86.57 KB, patch)
2021-04-05 16:38 PDT
,
Darin Adler
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(91.99 KB, patch)
2021-04-05 17:08 PDT
,
Darin Adler
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(92.62 KB, patch)
2021-04-05 17:31 PDT
,
Darin Adler
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(93.37 KB, patch)
2021-04-05 18:30 PDT
,
Darin Adler
no flags
Details
Formatted Diff
Diff
Patch
(93.34 KB, patch)
2021-04-06 14:43 PDT
,
Darin Adler
no flags
Details
Formatted Diff
Diff
Patch
(94.81 KB, patch)
2021-04-07 13:18 PDT
,
Darin Adler
cdumez
: review+
Details
Formatted Diff
Diff
Show Obsolete
(6)
View All
Add attachment
proposed patch, testcase, etc.
Darin Adler
Comment 1
2021-04-02 17:02:54 PDT
Created
attachment 425065
[details]
Patch
Darin Adler
Comment 2
2021-04-02 17:12:12 PDT
GTK build requires some changes to FontCacheFreeType.cpp
Darin Adler
Comment 3
2021-04-05 16:38:00 PDT
Created
attachment 425221
[details]
Patch
Darin Adler
Comment 4
2021-04-05 17:08:14 PDT
Created
attachment 425224
[details]
Patch
Darin Adler
Comment 5
2021-04-05 17:31:48 PDT
Created
attachment 425227
[details]
Patch
Darin Adler
Comment 6
2021-04-05 18:30:04 PDT
Created
attachment 425228
[details]
Patch
Darin Adler
Comment 7
2021-04-06 14:43:15 PDT
Created
attachment 425328
[details]
Patch
Darin Adler
Comment 8
2021-04-07 13:18:56 PDT
Created
attachment 425430
[details]
Patch
Chris Dumez
Comment 9
2021-04-07 14:30:21 PDT
Comment on
attachment 425430
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=425430&action=review
> Source/WebCore/ChangeLog:12 > + hashes using exclusive or. The new one is almost certainly bettter.
typo: bettter
> Source/WTF/wtf/Hasher.h:100 > + bool remainder = string.length() & 1;
Wouldn't it be more efficient to call String::hash()? In cases where the String hash has already been computed, String::hash() just returns the cached hash.
Chris Dumez
Comment 10
2021-04-07 14:42:40 PDT
Comment on
attachment 425430
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=425430&action=review
> Source/WebCore/platform/graphics/FontTaggedSettings.h:43 > +inline void add(Hasher& hasher, std::array<char, 4> array)
Could use FontTag?
Darin Adler
Comment 11
2021-04-07 14:56:01 PDT
Comment on
attachment 425430
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=425430&action=review
>> Source/WTF/wtf/Hasher.h:100 >> + bool remainder = string.length() & 1; > > Wouldn't it be more efficient to call String::hash()? In cases where the String hash has already been computed, String::hash() just returns the cached hash.
If we want to compute a hash by hashing a hash, then yes, that avoids scanning the characters again. But we’d have to always do it, not just in cases where the hash is already computed. But generally common wisdom I have read in research about hashing is that it’s much better to hash characters than to hash an already-computed hash of the characters. I believe enough better that it’s worthwhile to re-scan the characters in the string. Anyway, I’m open to either approach. What I like about Hasher is that we make the call here in one place.
>> Source/WebCore/platform/graphics/FontTaggedSettings.h:43 >> +inline void add(Hasher& hasher, std::array<char, 4> array) > > Could use FontTag?
We could, but I intentionally did not because this function is not a hash specific to FontTag; it’s hash for any 4-element array of bytes. Could even make it a template so it works for all different 8-bit integer types. Maybe even move it to Hasher.h instead of leaving it here.
Chris Dumez
Comment 12
2021-04-07 15:00:37 PDT
(In reply to Darin Adler from
comment #11
)
> Comment on
attachment 425430
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=425430&action=review
> > >> Source/WTF/wtf/Hasher.h:100 > >> + bool remainder = string.length() & 1; > > > > Wouldn't it be more efficient to call String::hash()? In cases where the String hash has already been computed, String::hash() just returns the cached hash. > > If we want to compute a hash by hashing a hash, then yes, that avoids > scanning the characters again. But we’d have to always do it, not just in > cases where the hash is already computed. > > But generally common wisdom I have read in research about hashing is that > it’s much better to hash characters than to hash an already-computed hash of > the characters. I believe enough better that it’s worthwhile to re-scan the > characters in the string. > > Anyway, I’m open to either approach. What I like about Hasher is that we > make the call here in one place.
I see. Your call. I just thought it was sad to have hashing logic and caching of that hash in String, and yet not use it for hashing.
> > >> Source/WebCore/platform/graphics/FontTaggedSettings.h:43 > >> +inline void add(Hasher& hasher, std::array<char, 4> array) > > > > Could use FontTag? > > We could, but I intentionally did not because this function is not a hash > specific to FontTag; it’s hash for any 4-element array of bytes. Could even > make it a template so it works for all different 8-bit integer types. Maybe > even move it to Hasher.h instead of leaving it here.
Ok.
Darin Adler
Comment 13
2021-04-07 17:00:28 PDT
Comment on
attachment 425430
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=425430&action=review
>>>> Source/WTF/wtf/Hasher.h:100 >>>> + bool remainder = string.length() & 1; >>> >>> Wouldn't it be more efficient to call String::hash()? In cases where the String hash has already been computed, String::hash() just returns the cached hash. >> >> If we want to compute a hash by hashing a hash, then yes, that avoids scanning the characters again. But we’d have to always do it, not just in cases where the hash is already computed. >> >> But generally common wisdom I have read in research about hashing is that it’s much better to hash characters than to hash an already-computed hash of the characters. I believe enough better that it’s worthwhile to re-scan the characters in the string. >> >> Anyway, I’m open to either approach. What I like about Hasher is that we make the call here in one place. > > I see. Your call. I just thought it was sad to have hashing logic and caching of that hash in String, and yet not use it for hashing.
I’ll land it like this, but it’s not my "final word" on the matter. It’s worth considering improvements and finding a way to measure.
Darin Adler
Comment 14
2021-04-07 19:19:28 PDT
Committed
r275650
(
236286@main
): <
https://commits.webkit.org/236286@main
>
Radar WebKit Bug Importer
Comment 15
2021-04-07 19:20:17 PDT
<
rdar://problem/76377558
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug