CPU profiling shows that string encoding/decoding is a hot spot for IDB. * The backing store encodes strings in UTF-16BE. Most CPUs these days are little-endian, which means byte swapping * The byte swapping for decode is done into a StringBuilder with append() calls, which have overhead. Short term, this could be optimized by bypassing StringBuilder and walking memory short by short doing using htons/ntohs c/o arpa/inet.h Long term it would make sense to switch the backing store encoding to UTF-16LE to save work on most architectures.
Created attachment 166028 [details] Patch
dgrogan@, alecflett@ - please take a look?
Comment on attachment 166028 [details] Patch Seems fine - is there any reason to think that using the iterators is better (e.g. safer, future-proof) than using raw buffers and length? (I guess it seems odd to use Vector<UChar> when we're really just treating it as UChar[length] ) I'm not sure if Vector<UChar, length> works (it might!) but I'm pretty sure UChar buffer[length] works in modern compilers...
(In reply to comment #3) > (From update of attachment 166028 [details]) > Seems fine - is there any reason to think that using the iterators is better (e.g. safer, future-proof) than using raw buffers and length? Given the rest of the file, no. :) > (I guess it seems odd to use Vector<UChar> when we're really just treating it as UChar[length] ) > > I'm not sure if Vector<UChar, length> works (it might!) but I'm pretty sure UChar buffer[length] works in modern compilers... String::adopt(buffer) is optimized to avoid a copy, and takes a Vector (or StringBuffer).
oh right, I forgot that the buffer got adopted, so stack-based allocation isn't helpful :) LGTM then!
tony@ - r?
Comment on attachment 166028 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=166028&action=review > Source/WebCore/ChangeLog:9 > + Optimize string encoding/decoding, which showed up as a CPU hot spot during profiling. > + The backing store uses big-endian ordering of 16-bit code unit strings, so a memcopy Nit: It would be nice to include some perf numbers. > Source/WebCore/Modules/indexeddb/IDBLevelDBCoding.cpp:29 > +// for htons/ntohs Nit: Capitalize and end with a period.
Created attachment 166062 [details] Patch for landing
Comment on attachment 166062 [details] Patch for landing Clearing flags on attachment: 166062 Committed r129806: <http://trac.webkit.org/changeset/129806>
All reviewed patches have been landed. Closing bug.
Re-opened since this is blocked by bug 97831
Created attachment 166281 [details] Patch
So arpa/inet.h is only available on Unices. Bleah, shoulda seen that coming. Other places in WebCore that need htons/ntohs use local defines, so I went that route. I loosely followed platform/graphics/WOFFFileFormat.cpp but made the macros call a static inline function so that the argument wouldn't be evaluated twice (otherwise *dst = htons(*src++) doesn't do what you'd expect). (Should possibly add these to a header in platform?)
tony@ - can you take another look? any suggestions?
Adam suggests making a new file in WTF/wtf for this. Maybe ByteOrder.h? It would just include <arpa/inet.h> on unix and define the functions on Windows. Maybe do that as a patch first?
(In reply to comment #15) > Adam suggests making a new file in WTF/wtf for this. Maybe ByteOrder.h? It would just include <arpa/inet.h> on unix and define the functions on Windows. Maybe do that as a patch first? SGTM. (We don't have that "add a file to all projects" script yet, do we?)
> We don't have that "add a file to all projects" script yet, do we? Sadly, we don't.
Nice one!
Created attachment 166911 [details] Patch for landing
Comment on attachment 166911 [details] Patch for landing Clearing flags on attachment: 166911 Committed r130299: <http://trac.webkit.org/changeset/130299>