Bug 97794 - IndexedDB: Optimize encodeString/decodeString
Summary: IndexedDB: Optimize encodeString/decodeString
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 523.x (Safari 3)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Joshua Bell
URL:
Keywords:
Depends on: 97831 98054
Blocks:
  Show dependency treegraph
 
Reported: 2012-09-27 09:17 PDT by Joshua Bell
Modified: 2012-10-03 10:01 PDT (History)
6 users (show)

See Also:


Attachments
Patch (3.35 KB, patch)
2012-09-27 10:29 PDT, Joshua Bell
no flags Details | Formatted Diff | Diff
Patch for landing (3.60 KB, patch)
2012-09-27 14:27 PDT, Joshua Bell
no flags Details | Formatted Diff | Diff
Patch (3.89 KB, patch)
2012-09-28 11:03 PDT, Joshua Bell
no flags Details | Formatted Diff | Diff
Patch for landing (3.63 KB, patch)
2012-10-03 09:28 PDT, Joshua Bell
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Joshua Bell 2012-09-27 09:17:25 PDT
CPU profiling shows that string encoding/decoding is a hot spot for IDB.

* The backing store encodes strings in UTF-16BE. Most CPUs these days are little-endian, which means byte swapping
* The byte swapping for decode is done into a StringBuilder with append() calls, which have overhead.

Short term, this could be optimized by bypassing StringBuilder and walking memory short by short doing using htons/ntohs c/o arpa/inet.h

Long term it would make sense to switch the backing store encoding to UTF-16LE to save work on most architectures.
Comment 1 Joshua Bell 2012-09-27 10:29:36 PDT
Created attachment 166028 [details]
Patch
Comment 2 Joshua Bell 2012-09-27 10:29:57 PDT
dgrogan@, alecflett@ - please take a look?
Comment 3 Alec Flett 2012-09-27 10:39:01 PDT
Comment on attachment 166028 [details]
Patch

Seems fine - is there any reason to think that using the iterators is better (e.g. safer, future-proof) than using raw buffers and length?

(I guess it seems odd to use Vector<UChar> when we're really just treating it as UChar[length] )

I'm not sure if Vector<UChar, length> works (it might!) but I'm pretty sure UChar buffer[length] works in modern compilers...
Comment 4 Joshua Bell 2012-09-27 10:45:39 PDT
(In reply to comment #3)
> (From update of attachment 166028 [details])
> Seems fine - is there any reason to think that using the iterators is better (e.g. safer, future-proof) than using raw buffers and length?

Given the rest of the file, no. :)

> (I guess it seems odd to use Vector<UChar> when we're really just treating it as UChar[length] )
> 
> I'm not sure if Vector<UChar, length> works (it might!) but I'm pretty sure UChar buffer[length] works in modern compilers...

String::adopt(buffer) is optimized to avoid a copy, and takes a Vector (or StringBuffer).
Comment 5 Alec Flett 2012-09-27 10:46:57 PDT
oh right, I forgot that the buffer got adopted, so stack-based allocation isn't helpful :)

LGTM then!
Comment 6 Joshua Bell 2012-09-27 10:53:20 PDT
tony@ - r?
Comment 7 Tony Chang 2012-09-27 14:00:08 PDT
Comment on attachment 166028 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=166028&action=review

> Source/WebCore/ChangeLog:9
> +        Optimize string encoding/decoding, which showed up as a CPU hot spot during profiling.
> +        The backing store uses big-endian ordering of 16-bit code unit strings, so a memcopy

Nit: It would be nice to include some perf numbers.

> Source/WebCore/Modules/indexeddb/IDBLevelDBCoding.cpp:29
> +// for htons/ntohs

Nit: Capitalize and end with a period.
Comment 8 Joshua Bell 2012-09-27 14:27:41 PDT
Created attachment 166062 [details]
Patch for landing
Comment 9 WebKit Review Bot 2012-09-27 15:03:34 PDT
Comment on attachment 166062 [details]
Patch for landing

Clearing flags on attachment: 166062

Committed r129806: <http://trac.webkit.org/changeset/129806>
Comment 10 WebKit Review Bot 2012-09-27 15:03:38 PDT
All reviewed patches have been landed.  Closing bug.
Comment 11 WebKit Review Bot 2012-09-27 17:03:18 PDT
Re-opened since this is blocked by bug 97831
Comment 12 Joshua Bell 2012-09-28 11:03:59 PDT
Created attachment 166281 [details]
Patch
Comment 13 Joshua Bell 2012-09-28 11:08:45 PDT
So arpa/inet.h is only available on Unices. Bleah, shoulda seen that coming.

Other places in WebCore that need htons/ntohs use local defines, so I went that route. I loosely followed platform/graphics/WOFFFileFormat.cpp but made the macros call a static inline function so that the argument wouldn't be evaluated twice (otherwise *dst = htons(*src++) doesn't do what you'd expect).

(Should possibly add these to a header in platform?)
Comment 14 Joshua Bell 2012-09-28 12:23:13 PDT
tony@ - can you take another look? any suggestions?
Comment 15 Tony Chang 2012-09-28 13:14:00 PDT
Adam suggests making a new file in WTF/wtf for this.  Maybe ByteOrder.h?  It would just include <arpa/inet.h> on unix and define the functions on Windows.  Maybe do that as a patch first?
Comment 16 Joshua Bell 2012-09-28 13:50:17 PDT
(In reply to comment #15)
> Adam suggests making a new file in WTF/wtf for this.  Maybe ByteOrder.h?  It would just include <arpa/inet.h> on unix and define the functions on Windows.  Maybe do that as a patch first?

SGTM. (We don't have that "add a file to all projects" script yet, do we?)
Comment 17 Adam Barth 2012-09-28 14:15:18 PDT
> We don't have that "add a file to all projects" script yet, do we?

Sadly, we don't.
Comment 18 Hans Wennborg 2012-10-02 04:01:06 PDT
Nice one!
Comment 19 Joshua Bell 2012-10-03 09:28:38 PDT
Created attachment 166911 [details]
Patch for landing
Comment 20 WebKit Review Bot 2012-10-03 10:01:01 PDT
Comment on attachment 166911 [details]
Patch for landing

Clearing flags on attachment: 166911

Committed r130299: <http://trac.webkit.org/changeset/130299>
Comment 21 WebKit Review Bot 2012-10-03 10:01:06 PDT
All reviewed patches have been landed.  Closing bug.