Use Hasher more, remove IntegerHasher, fix hashing-related mistakes
Created attachment 425065 [details] Patch
GTK build requires some changes to FontCacheFreeType.cpp
Created attachment 425221 [details] Patch
Created attachment 425224 [details] Patch
Created attachment 425227 [details] Patch
Created attachment 425228 [details] Patch
Created attachment 425328 [details] Patch
Created attachment 425430 [details] Patch
Comment on attachment 425430 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=425430&action=review > Source/WebCore/ChangeLog:12 > + hashes using exclusive or. The new one is almost certainly bettter. typo: bettter > Source/WTF/wtf/Hasher.h:100 > + bool remainder = string.length() & 1; Wouldn't it be more efficient to call String::hash()? In cases where the String hash has already been computed, String::hash() just returns the cached hash.
Comment on attachment 425430 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=425430&action=review > Source/WebCore/platform/graphics/FontTaggedSettings.h:43 > +inline void add(Hasher& hasher, std::array<char, 4> array) Could use FontTag?
Comment on attachment 425430 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=425430&action=review >> Source/WTF/wtf/Hasher.h:100 >> + bool remainder = string.length() & 1; > > Wouldn't it be more efficient to call String::hash()? In cases where the String hash has already been computed, String::hash() just returns the cached hash. If we want to compute a hash by hashing a hash, then yes, that avoids scanning the characters again. But we’d have to always do it, not just in cases where the hash is already computed. But generally common wisdom I have read in research about hashing is that it’s much better to hash characters than to hash an already-computed hash of the characters. I believe enough better that it’s worthwhile to re-scan the characters in the string. Anyway, I’m open to either approach. What I like about Hasher is that we make the call here in one place. >> Source/WebCore/platform/graphics/FontTaggedSettings.h:43 >> +inline void add(Hasher& hasher, std::array<char, 4> array) > > Could use FontTag? We could, but I intentionally did not because this function is not a hash specific to FontTag; it’s hash for any 4-element array of bytes. Could even make it a template so it works for all different 8-bit integer types. Maybe even move it to Hasher.h instead of leaving it here.
(In reply to Darin Adler from comment #11) > Comment on attachment 425430 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=425430&action=review > > >> Source/WTF/wtf/Hasher.h:100 > >> + bool remainder = string.length() & 1; > > > > Wouldn't it be more efficient to call String::hash()? In cases where the String hash has already been computed, String::hash() just returns the cached hash. > > If we want to compute a hash by hashing a hash, then yes, that avoids > scanning the characters again. But we’d have to always do it, not just in > cases where the hash is already computed. > > But generally common wisdom I have read in research about hashing is that > it’s much better to hash characters than to hash an already-computed hash of > the characters. I believe enough better that it’s worthwhile to re-scan the > characters in the string. > > Anyway, I’m open to either approach. What I like about Hasher is that we > make the call here in one place. I see. Your call. I just thought it was sad to have hashing logic and caching of that hash in String, and yet not use it for hashing. > > >> Source/WebCore/platform/graphics/FontTaggedSettings.h:43 > >> +inline void add(Hasher& hasher, std::array<char, 4> array) > > > > Could use FontTag? > > We could, but I intentionally did not because this function is not a hash > specific to FontTag; it’s hash for any 4-element array of bytes. Could even > make it a template so it works for all different 8-bit integer types. Maybe > even move it to Hasher.h instead of leaving it here. Ok.
Comment on attachment 425430 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=425430&action=review >>>> Source/WTF/wtf/Hasher.h:100 >>>> + bool remainder = string.length() & 1; >>> >>> Wouldn't it be more efficient to call String::hash()? In cases where the String hash has already been computed, String::hash() just returns the cached hash. >> >> If we want to compute a hash by hashing a hash, then yes, that avoids scanning the characters again. But we’d have to always do it, not just in cases where the hash is already computed. >> >> But generally common wisdom I have read in research about hashing is that it’s much better to hash characters than to hash an already-computed hash of the characters. I believe enough better that it’s worthwhile to re-scan the characters in the string. >> >> Anyway, I’m open to either approach. What I like about Hasher is that we make the call here in one place. > > I see. Your call. I just thought it was sad to have hashing logic and caching of that hash in String, and yet not use it for hashing. I’ll land it like this, but it’s not my "final word" on the matter. It’s worth considering improvements and finding a way to measure.
Committed r275650 (236286@main): <https://commits.webkit.org/236286@main>
<rdar://problem/76377558>