Summary: | Optimize serialization of quoted JSON strings. | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Andreas Kling <kling> | ||||||
Component: | JavaScriptCore | Assignee: | Andreas Kling <kling> | ||||||
Status: | RESOLVED FIXED | ||||||||
Severity: | Normal | CC: | benjamin, cdumez, cgarcia, cmarcelo, commit-queue, darin, ggaren, kling, simon.fraser | ||||||
Priority: | P2 | Keywords: | Performance | ||||||
Version: | 528+ (Nightly build) | ||||||||
Hardware: | Unspecified | ||||||||
OS: | Unspecified | ||||||||
Bug Depends on: | 144784 | ||||||||
Bug Blocks: | |||||||||
Attachments: |
|
Description
Andreas Kling
2015-05-07 13:18:46 PDT
Created attachment 252615 [details]
Patch idea
Comment on attachment 252615 [details] Patch idea View in context: https://bugs.webkit.org/attachment.cgi?id=252615&action=review I guess I like this. No big deal to build the JSON logic into StringBuilder. We could always refactor to deal with that later if we like. > Source/WTF/wtf/text/StringBuilder.cpp:364 > +template <typename OutputCharType, typename InputCharType> We should write out CharacterType and not use CharType. > Source/WTF/wtf/text/StringBuilder.cpp:369 > + *(output++) = input[i++]; I don’t get why this is a separate loop. Why can’t we just do one character at a time and use continue and avoid the extra break below? if (input[i] > 0x1F && input[i] != '"' && input[i] != '\\') { *output++ = input[i]; continue; } Is there something I’m missing that would make this super slow? It’s also a little strange that we use indexing for input but pointer math for output. > Source/WTF/wtf/text/StringBuilder.cpp:374 > + *(output++) = '\\'; I think that idiomatically it’s always been *output++ for me, without parentheses. > Source/WTF/wtf/text/StringBuilder.cpp:402 > + static const char hexDigits[] = "0123456789abcdef"; It’s a little strange to use a different approach here than lowerNibbleToASCIIHexDigit does in ASCIICType.h. But this is lowercase hex and that is uppercase, and maybe this way of doing things is faster than what that does. > Source/WTF/wtf/text/StringBuilder.cpp:407 > + *(output++) = static_cast<LChar>(hexDigits[(ch >> 12) & 0xF]); > + *(output++) = static_cast<LChar>(hexDigits[(ch >> 8) & 0xF]); I think these are always '0'; we can’t get here unless the character is <= 0x1F; > Source/WTF/wtf/text/StringBuilder.cpp:417 > + size_t minimumNewCapacity = length() + 2 + (string.length() * 6); This could be a much lower value when string.is8Bit() is true. I think it’s worth optimizing that. I don’t think you need parentheses. I think you should explain in a comment both what the 2 and the 6 here mean. I think the word “minimum” here seems a little strange. As far as I can tell, this is a maximum size, not a minimum. I guess minimum means “before rounding up”? Committed r183961: <http://trac.webkit.org/changeset/183961> s/dom/JSON-stringify.html starting timing out after this: https://build.webkit.org/results/Apple%20Mavericks%20Release%20WK1%20(Tests)/r183961%20(14539)/results.html (In reply to comment #4) > s/dom/JSON-stringify.html starting timing out after this: > https://build.webkit.org/results/Apple%20Mavericks%20Release%20WK1%20(Tests)/ > r183961%20(14539)/results.html Investigating.. Re-opened since this is blocked by bug 144784 (In reply to comment #5) > (In reply to comment #4) > > s/dom/JSON-stringify.html starting timing out after this: > > https://build.webkit.org/results/Apple%20Mavericks%20Release%20WK1%20(Tests)/ > > r183961%20(14539)/results.html > > Investigating.. Hah, this was happening because I inadvertently made the \uNNNN output be uppercase hexadecimal, and this test was stringifying a JSON object containing all the unicode characters between \u0000 and \uFFFF. The timeout happened because WebKit spent eons going back and forth with fontd to find a fallback font for every single character in the "Expected \uabcd but got \uABCD etc" message. (In reply to comment #7) > Also this: > https://build.webkit.org/results/Apple%20Mavericks%20Debug%20WK2%20(Tests)/ > r183976%20(11594)/storage/indexeddb/list-ordering-crash-log.txt This was happening because I integrated the suggestion about a lower (than 6x) fudge factor for 8-bit strings. "\x00" is a valid 8-bit string, and encodes to "\u0000", so needs 6 characters, not 2. :/ Created attachment 252697 [details]
Patch for landing
Attachment 252697 [details] did not pass style-queue:
ERROR: Source/WTF/wtf/text/StringBuilder.cpp:402: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5]
ERROR: Source/WTF/wtf/text/StringBuilder.cpp:408: More than one command on the same line [whitespace/newline] [4]
Total errors found: 2 in 8 files
If any of these errors are false positives, please file a bug against check-webkit-style.
Committed r183988: <http://trac.webkit.org/changeset/183988> Potential 0.6% regression on JetStream according to the bots. (In reply to comment #9) > This was happening because I integrated the suggestion about a lower (than > 6x) fudge factor for 8-bit strings. "\x00" is a valid 8-bit string, and > encodes to "\u0000", so needs 6 characters, not 2. :/ Damn, I realized that was wrong, but only after the fact! (In reply to comment #13) > Potential 0.6% regression on JetStream according to the bots. Uh oh! Maybe it was my “continue” suggestion that caused the problem? (In reply to comment #16) > Maybe it was my “continue” suggestion that caused the problem? I don't think so. The loop was originally structured that way because it wanted to find the longest possible range to pass to StringBuilder::append(characters, length). The "continue" change was actually a speedup according to run-jsc-benchmarks :) I can't reproduce any slowdown on any of the JetStream subtests, nor can I understand what would get slower from this change. Given that the JetStream bot has quickly gone back up to the "pre-regression" level without reverting this change, I'm inclined to believe that what we saw was a bot hiccup. |