Bug 97794

Summary: IndexedDB: Optimize encodeString/decodeString
Product: WebKit Reporter: Joshua Bell <jsbell>
Component: WebCore Misc.Assignee: Joshua Bell <jsbell>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, alecflett, dgrogan, hans, tony, webkit.review.bot
Priority: P2    
Version: 523.x (Safari 3)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 97831, 98054    
Bug Blocks:    
Attachments:
Description Flags
Patch
none
Patch for landing
none
Patch
none
Patch for landing none

Joshua Bell
Reported 2012-09-27 09:17:25 PDT
CPU profiling shows that string encoding/decoding is a hot spot for IDB. * The backing store encodes strings in UTF-16BE. Most CPUs these days are little-endian, which means byte swapping * The byte swapping for decode is done into a StringBuilder with append() calls, which have overhead. Short term, this could be optimized by bypassing StringBuilder and walking memory short by short doing using htons/ntohs c/o arpa/inet.h Long term it would make sense to switch the backing store encoding to UTF-16LE to save work on most architectures.
Attachments
Patch (3.35 KB, patch)
2012-09-27 10:29 PDT, Joshua Bell
no flags
Patch for landing (3.60 KB, patch)
2012-09-27 14:27 PDT, Joshua Bell
no flags
Patch (3.89 KB, patch)
2012-09-28 11:03 PDT, Joshua Bell
no flags
Patch for landing (3.63 KB, patch)
2012-10-03 09:28 PDT, Joshua Bell
no flags
Joshua Bell
Comment 1 2012-09-27 10:29:36 PDT
Joshua Bell
Comment 2 2012-09-27 10:29:57 PDT
dgrogan@, alecflett@ - please take a look?
Alec Flett
Comment 3 2012-09-27 10:39:01 PDT
Comment on attachment 166028 [details] Patch Seems fine - is there any reason to think that using the iterators is better (e.g. safer, future-proof) than using raw buffers and length? (I guess it seems odd to use Vector<UChar> when we're really just treating it as UChar[length] ) I'm not sure if Vector<UChar, length> works (it might!) but I'm pretty sure UChar buffer[length] works in modern compilers...
Joshua Bell
Comment 4 2012-09-27 10:45:39 PDT
(In reply to comment #3) > (From update of attachment 166028 [details]) > Seems fine - is there any reason to think that using the iterators is better (e.g. safer, future-proof) than using raw buffers and length? Given the rest of the file, no. :) > (I guess it seems odd to use Vector<UChar> when we're really just treating it as UChar[length] ) > > I'm not sure if Vector<UChar, length> works (it might!) but I'm pretty sure UChar buffer[length] works in modern compilers... String::adopt(buffer) is optimized to avoid a copy, and takes a Vector (or StringBuffer).
Alec Flett
Comment 5 2012-09-27 10:46:57 PDT
oh right, I forgot that the buffer got adopted, so stack-based allocation isn't helpful :) LGTM then!
Joshua Bell
Comment 6 2012-09-27 10:53:20 PDT
tony@ - r?
Tony Chang
Comment 7 2012-09-27 14:00:08 PDT
Comment on attachment 166028 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=166028&action=review > Source/WebCore/ChangeLog:9 > + Optimize string encoding/decoding, which showed up as a CPU hot spot during profiling. > + The backing store uses big-endian ordering of 16-bit code unit strings, so a memcopy Nit: It would be nice to include some perf numbers. > Source/WebCore/Modules/indexeddb/IDBLevelDBCoding.cpp:29 > +// for htons/ntohs Nit: Capitalize and end with a period.
Joshua Bell
Comment 8 2012-09-27 14:27:41 PDT
Created attachment 166062 [details] Patch for landing
WebKit Review Bot
Comment 9 2012-09-27 15:03:34 PDT
Comment on attachment 166062 [details] Patch for landing Clearing flags on attachment: 166062 Committed r129806: <http://trac.webkit.org/changeset/129806>
WebKit Review Bot
Comment 10 2012-09-27 15:03:38 PDT
All reviewed patches have been landed. Closing bug.
WebKit Review Bot
Comment 11 2012-09-27 17:03:18 PDT
Re-opened since this is blocked by bug 97831
Joshua Bell
Comment 12 2012-09-28 11:03:59 PDT
Joshua Bell
Comment 13 2012-09-28 11:08:45 PDT
So arpa/inet.h is only available on Unices. Bleah, shoulda seen that coming. Other places in WebCore that need htons/ntohs use local defines, so I went that route. I loosely followed platform/graphics/WOFFFileFormat.cpp but made the macros call a static inline function so that the argument wouldn't be evaluated twice (otherwise *dst = htons(*src++) doesn't do what you'd expect). (Should possibly add these to a header in platform?)
Joshua Bell
Comment 14 2012-09-28 12:23:13 PDT
tony@ - can you take another look? any suggestions?
Tony Chang
Comment 15 2012-09-28 13:14:00 PDT
Adam suggests making a new file in WTF/wtf for this. Maybe ByteOrder.h? It would just include <arpa/inet.h> on unix and define the functions on Windows. Maybe do that as a patch first?
Joshua Bell
Comment 16 2012-09-28 13:50:17 PDT
(In reply to comment #15) > Adam suggests making a new file in WTF/wtf for this. Maybe ByteOrder.h? It would just include <arpa/inet.h> on unix and define the functions on Windows. Maybe do that as a patch first? SGTM. (We don't have that "add a file to all projects" script yet, do we?)
Adam Barth
Comment 17 2012-09-28 14:15:18 PDT
> We don't have that "add a file to all projects" script yet, do we? Sadly, we don't.
Hans Wennborg
Comment 18 2012-10-02 04:01:06 PDT
Nice one!
Joshua Bell
Comment 19 2012-10-03 09:28:38 PDT
Created attachment 166911 [details] Patch for landing
WebKit Review Bot
Comment 20 2012-10-03 10:01:01 PDT
Comment on attachment 166911 [details] Patch for landing Clearing flags on attachment: 166911 Committed r130299: <http://trac.webkit.org/changeset/130299>
WebKit Review Bot
Comment 21 2012-10-03 10:01:06 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.