RESOLVED FIXED 45970
Add StringHasher class
https://bugs.webkit.org/show_bug.cgi?id=45970
Summary Add StringHasher class
Patrick R. Gansterer
Reported 2010-09-17 10:06:47 PDT
This is the the first patch required for adding AtomicString::fromUTF8. (see https://bugs.webkit.org/show_bug.cgi?id=45594#c10) required work after this patch (1-3 is cleanup): 1) Use StringHasher::createHash directly insted of WTF::stringHash. 2) Use StringHasher::createHash in StringHash.h and differenct WebCore files (e.g. QualifiedName). 3) Rename StringHashFunctions.h into StringHasher.h 4) Create a function for calculating the hash directly out of null terminated UTF-8 data 5) Use this UTF8->hash function in AtomicString::fromUTF8. 6) Use AtomicString::fromUTF8 in XMLParser.
Attachments
Patch (8.30 KB, patch)
2010-09-17 10:23 PDT, Patrick R. Gansterer
no flags
Patch (use one 31 bit) (8.39 KB, patch)
2010-09-17 10:55 PDT, Patrick R. Gansterer
no flags
Patch (use one 31 bit) (8.39 KB, patch)
2010-09-18 15:37 PDT, Patrick R. Gansterer
no flags
Fix typo (2.01 KB, patch)
2010-09-25 01:31 PDT, Patrick R. Gansterer
no flags
Patrick R. Gansterer
Comment 1 2010-09-17 10:23:57 PDT
Created attachment 67916 [details] Patch > + if (m_cachedCharacter != invalidCharacterValue) { > + addCharactersToHash(m_cachedCharacter, ch); > + m_cachedCharacter = invalidCharacterValue; > + return; > + } Will be used later in the UTF8 to hash function to add each single character (avoids caching there). > - hash &= 0x7fffffff; > - > - // this avoids ever returning a hash code of 0, since that is used to > - // signal "hash not computed yet", using a value that is likely to be > - // effectively the same as 0 when the low bits are masked > - if (hash == 0) > - hash = 0x40000000; > + if (!result) > + return 0x80000000; This is the same algorithm as in StringHash.h. > + template<typename T, UChar Coverter(T)> static inline unsigned createHash(const T* data, unsigned length) "UChar Coverter(T)" will be used later in CaseFoldingHash::hash to do a createHash<UChar, WTF::Unicode::foldCase>().
Darin Adler
Comment 2 2010-09-17 10:35:38 PDT
(In reply to comment #1) > > - hash &= 0x7fffffff; > > - > > - // this avoids ever returning a hash code of 0, since that is used to > > - // signal "hash not computed yet", using a value that is likely to be > > - // effectively the same as 0 when the low bits are masked > > - if (hash == 0) > > - hash = 0x40000000; > > + if (!result) > > + return 0x80000000; > This is the same algorithm as in StringHash.h. Changing this will almost certainly break JavaScriptCore, which uses the high bit for something else. Please talk to Gavin Barraclough about this.
Patrick R. Gansterer
Comment 3 2010-09-17 10:55:49 PDT
Created attachment 67920 [details] Patch (use one 31 bit) (In reply to comment #2) > Changing this will almost certainly break JavaScriptCore, which uses the high bit for something else. Please talk to Gavin Barraclough about this. http://trac.webkit.org/changeset/53456: I don't know what "it causes much sadness in this world" means exactly ;-) http://trac.webkit.org/changeset/52856: "change string hash result from 32-bit to 31-bit, to free a bit in UStringImpl for m_isIdentifier" I changed it to use only 31 bits and added a comment.
Early Warning System Bot
Comment 4 2010-09-18 15:31:09 PDT
Patrick R. Gansterer
Comment 5 2010-09-18 15:37:03 PDT
Created attachment 68017 [details] Patch (use one 31 bit) Make Qt build happy.
Patrick R. Gansterer
Comment 6 2010-09-24 10:32:52 PDT
ping?
Gavin Barraclough
Comment 7 2010-09-24 12:05:03 PDT
Comment on attachment 68017 [details] Patch (use one 31 bit) Sorry for the delay - looks like a nice improvement! We could probably switch back to a 32-bit hash now, if we wanted - I'd have to check, but I don't think we're stealing a bit for anything at the minute. However the hash algorithm is used in a compile stage, adding constant hashes to autogenerated tables, so we'd have to change that too. r+. G.
WebKit Commit Bot
Comment 8 2010-09-24 14:06:41 PDT
Comment on attachment 68017 [details] Patch (use one 31 bit) Clearing flags on attachment: 68017 Committed r68289: <http://trac.webkit.org/changeset/68289>
WebKit Commit Bot
Comment 9 2010-09-24 14:06:46 PDT
All reviewed patches have been landed. Closing bug.
WebKit Review Bot
Comment 10 2010-09-24 15:37:13 PDT
http://trac.webkit.org/changeset/68289 might have broken GTK Linux 64-bit Debug
Darin Adler
Comment 11 2010-09-24 18:35:44 PDT
There’s a typo: "coverter" in this patch.
Patrick R. Gansterer
Comment 12 2010-09-25 01:31:04 PDT
Created attachment 68816 [details] Fix typo
Patrick R. Gansterer
Comment 13 2010-09-25 01:34:38 PDT
Reopen for commit queue.
WebKit Commit Bot
Comment 14 2010-09-25 02:07:15 PDT
Comment on attachment 68816 [details] Fix typo Clearing flags on attachment: 68816 Committed r68330: <http://trac.webkit.org/changeset/68330>
WebKit Commit Bot
Comment 15 2010-09-25 02:07:21 PDT
All reviewed patches have been landed. Closing bug.
WebKit Review Bot
Comment 16 2010-09-25 09:10:30 PDT
Mihai Parparita
Comment 17 2010-09-25 13:51:02 PDT
This change causes the DRT to crash on fast/js/encode-URI-test.html (I can reproduce on Snow Leopard locally, but you can also see it on the Leopard bot (not sure why the SL one isn't crashing): http://build.webkit.org/results/Leopard%20Intel%20Debug%20(Tests)/r68336%20(20697)/results.html). The culprit appears to be the "\uFFFE" string literal in the test, just entering that string into the JS console in a build with this change is enough to trigger the assert on line 50 of StringHashFunctions.h.
Patrick R. Gansterer
Comment 18 2010-09-25 19:02:44 PDT
(In reply to comment #17) > This change causes the DRT to crash on fast/js/encode-URI-test.html (I can reproduce on Snow Leopard locally, but you can also see it on the Leopard bot (not sure why the SL one isn't crashing): http://build.webkit.org/results/Leopard%20Intel%20Debug%20(Tests)/r68336%20(20697)/results.html). Sorry for that! The patch at bug 46553 (needs r+) should fix it.
Adam Barth
Comment 19 2010-09-26 11:54:12 PDT
Reopen for commit-queue to see the patch.
WebKit Commit Bot
Comment 20 2010-09-26 11:55:17 PDT
Comment on attachment 68816 [details] Fix typo Rejecting patch 68816 from commit-queue. Failed to run "['./WebKitTools/Scripts/webkit-patch', '--status-host=queues.webkit.org', 'apply-attachment', '--force-clean', '--non-interactive', '--quiet', 68816]" exit_code: 2 Cleaning working directory Updating working directory Logging in as commit-queue@webkit.org... Fetching: https://bugs.webkit.org/attachment.cgi?id=68816&action=edit Fetching: https://bugs.webkit.org/show_bug.cgi?id=45970&ctype=xml Processing 1 patch from 1 bug. Processing patch 68816 from bug 45970. Failed to run "[u'/Projects/CommitQueue/WebKitTools/Scripts/svn-apply', u'--reviewer', u'Gavin Barraclough', u'--force']" exit_code: 1 Full output: http://queues.webkit.org/results/4049167
Patrick R. Gansterer
Comment 21 2010-09-26 11:58:17 PDT
Comment on attachment 68816 [details] Fix typo Clearing flags on attachment: 68816 Already committed r68330: <http://trac.webkit.org/changeset/68330>
Note You need to log in before you can comment on or make changes to this bug.