RESOLVED FIXED 144754
Optimize serialization of quoted JSON strings.
https://bugs.webkit.org/show_bug.cgi?id=144754
Summary Optimize serialization of quoted JSON strings.
Andreas Kling
Reported 2015-05-07 13:18:46 PDT
JSON.stringify() can do better than calling StringBuilder::append() over and over.
Attachments
Patch idea (14.16 KB, patch)
2015-05-07 13:34 PDT, Andreas Kling
darin: review+
Patch for landing (14.55 KB, patch)
2015-05-08 00:52 PDT, Andreas Kling
no flags
Andreas Kling
Comment 1 2015-05-07 13:34:39 PDT
Created attachment 252615 [details] Patch idea
Darin Adler
Comment 2 2015-05-07 15:32:12 PDT
Comment on attachment 252615 [details] Patch idea View in context: https://bugs.webkit.org/attachment.cgi?id=252615&action=review I guess I like this. No big deal to build the JSON logic into StringBuilder. We could always refactor to deal with that later if we like. > Source/WTF/wtf/text/StringBuilder.cpp:364 > +template <typename OutputCharType, typename InputCharType> We should write out CharacterType and not use CharType. > Source/WTF/wtf/text/StringBuilder.cpp:369 > + *(output++) = input[i++]; I don’t get why this is a separate loop. Why can’t we just do one character at a time and use continue and avoid the extra break below? if (input[i] > 0x1F && input[i] != '"' && input[i] != '\\') { *output++ = input[i]; continue; } Is there something I’m missing that would make this super slow? It’s also a little strange that we use indexing for input but pointer math for output. > Source/WTF/wtf/text/StringBuilder.cpp:374 > + *(output++) = '\\'; I think that idiomatically it’s always been *output++ for me, without parentheses. > Source/WTF/wtf/text/StringBuilder.cpp:402 > + static const char hexDigits[] = "0123456789abcdef"; It’s a little strange to use a different approach here than lowerNibbleToASCIIHexDigit does in ASCIICType.h. But this is lowercase hex and that is uppercase, and maybe this way of doing things is faster than what that does. > Source/WTF/wtf/text/StringBuilder.cpp:407 > + *(output++) = static_cast<LChar>(hexDigits[(ch >> 12) & 0xF]); > + *(output++) = static_cast<LChar>(hexDigits[(ch >> 8) & 0xF]); I think these are always '0'; we can’t get here unless the character is <= 0x1F; > Source/WTF/wtf/text/StringBuilder.cpp:417 > + size_t minimumNewCapacity = length() + 2 + (string.length() * 6); This could be a much lower value when string.is8Bit() is true. I think it’s worth optimizing that. I don’t think you need parentheses. I think you should explain in a comment both what the 2 and the 6 here mean. I think the word “minimum” here seems a little strange. As far as I can tell, this is a maximum size, not a minimum. I guess minimum means “before rounding up”?
Andreas Kling
Comment 3 2015-05-07 17:19:09 PDT
Simon Fraser (smfr)
Comment 4 2015-05-07 18:36:57 PDT
Andreas Kling
Comment 5 2015-05-07 20:10:21 PDT
(In reply to comment #4) > s/dom/JSON-stringify.html starting timing out after this: > https://build.webkit.org/results/Apple%20Mavericks%20Release%20WK1%20(Tests)/ > r183961%20(14539)/results.html Investigating..
WebKit Commit Bot
Comment 6 2015-05-07 20:22:18 PDT
Re-opened since this is blocked by bug 144784
Andreas Kling
Comment 8 2015-05-08 00:30:50 PDT
(In reply to comment #5) > (In reply to comment #4) > > s/dom/JSON-stringify.html starting timing out after this: > > https://build.webkit.org/results/Apple%20Mavericks%20Release%20WK1%20(Tests)/ > > r183961%20(14539)/results.html > > Investigating.. Hah, this was happening because I inadvertently made the \uNNNN output be uppercase hexadecimal, and this test was stringifying a JSON object containing all the unicode characters between \u0000 and \uFFFF. The timeout happened because WebKit spent eons going back and forth with fontd to find a fallback font for every single character in the "Expected \uabcd but got \uABCD etc" message.
Andreas Kling
Comment 9 2015-05-08 00:34:43 PDT
(In reply to comment #7) > Also this: > https://build.webkit.org/results/Apple%20Mavericks%20Debug%20WK2%20(Tests)/ > r183976%20(11594)/storage/indexeddb/list-ordering-crash-log.txt This was happening because I integrated the suggestion about a lower (than 6x) fudge factor for 8-bit strings. "\x00" is a valid 8-bit string, and encodes to "\u0000", so needs 6 characters, not 2. :/
Andreas Kling
Comment 10 2015-05-08 00:52:21 PDT
Created attachment 252697 [details] Patch for landing
WebKit Commit Bot
Comment 11 2015-05-08 00:53:59 PDT
Attachment 252697 [details] did not pass style-queue: ERROR: Source/WTF/wtf/text/StringBuilder.cpp:402: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] ERROR: Source/WTF/wtf/text/StringBuilder.cpp:408: More than one command on the same line [whitespace/newline] [4] Total errors found: 2 in 8 files If any of these errors are false positives, please file a bug against check-webkit-style.
Andreas Kling
Comment 12 2015-05-08 01:45:48 PDT
Chris Dumez
Comment 13 2015-05-08 13:50:19 PDT
Potential 0.6% regression on JetStream according to the bots.
Darin Adler
Comment 14 2015-05-08 19:33:06 PDT
(In reply to comment #9) > This was happening because I integrated the suggestion about a lower (than > 6x) fudge factor for 8-bit strings. "\x00" is a valid 8-bit string, and > encodes to "\u0000", so needs 6 characters, not 2. :/ Damn, I realized that was wrong, but only after the fact!
Darin Adler
Comment 15 2015-05-08 19:33:21 PDT
(In reply to comment #13) > Potential 0.6% regression on JetStream according to the bots. Uh oh!
Darin Adler
Comment 16 2015-05-09 18:24:29 PDT
Maybe it was my “continue” suggestion that caused the problem?
Andreas Kling
Comment 17 2015-05-09 18:55:03 PDT
(In reply to comment #16) > Maybe it was my “continue” suggestion that caused the problem? I don't think so. The loop was originally structured that way because it wanted to find the longest possible range to pass to StringBuilder::append(characters, length). The "continue" change was actually a speedup according to run-jsc-benchmarks :) I can't reproduce any slowdown on any of the JetStream subtests, nor can I understand what would get slower from this change. Given that the JetStream bot has quickly gone back up to the "pre-regression" level without reverting this change, I'm inclined to believe that what we saw was a bot hiccup.
Note You need to log in before you can comment on or make changes to this bug.