WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
90321
Convert HTML parser to handle 8-bit resources without converting to UChar*
https://bugs.webkit.org/show_bug.cgi?id=90321
Summary
Convert HTML parser to handle 8-bit resources without converting to UChar*
Michael Saboff
Reported
2012-06-29 15:52:03 PDT
This task is to convert the HTML parser so that it can properly parse 8-bit source strings without converting to 16-bit format.
Attachments
Patch for Review
(22.96 KB, patch)
2012-07-11 18:23 PDT
,
Michael Saboff
buildbot
: commit-queue-
Details
Formatted Diff
Diff
COmbined patches from 90319, 90320 and 90321 for EWS testing
(44.13 KB, patch)
2012-07-11 18:24 PDT
,
Michael Saboff
buildbot
: commit-queue-
Details
Formatted Diff
Diff
Updated patch
(24.44 KB, patch)
2012-07-12 11:12 PDT
,
Michael Saboff
webkit.review.bot
: commit-queue-
Details
Formatted Diff
Diff
Archive of layout-test-results from gce-cr-linux-01
(316.09 KB, application/zip)
2012-07-12 12:31 PDT
,
WebKit Review Bot
no flags
Details
Patch with speculative fix for chrome-linux test failure
(24.62 KB, patch)
2012-07-16 11:46 PDT
,
Michael Saboff
webkit.review.bot
: commit-queue-
Details
Formatted Diff
Diff
Archive of layout-test-results from gce-cr-linux-04
(356.49 KB, application/zip)
2012-07-16 13:06 PDT
,
WebKit Review Bot
no flags
Details
Updated Patch with another Speculative Fix for Chrome Linux
(24.71 KB, patch)
2012-07-16 18:36 PDT
,
Michael Saboff
no flags
Details
Formatted Diff
Diff
Updated patch with performance tunes for Parsing Tests
(36.94 KB, patch)
2012-07-24 11:43 PDT
,
Michael Saboff
ggaren
: review+
Details
Formatted Diff
Diff
Patch with performance update to r123679
(40.21 KB, patch)
2012-08-03 13:46 PDT
,
Michael Saboff
abarth
: review+
Details
Formatted Diff
Diff
Performance results of latest patch
(17.61 KB, text/plain)
2012-08-03 14:34 PDT
,
Michael Saboff
no flags
Details
Show Obsolete
(6)
View All
Add attachment
proposed patch, testcase, etc.
Michael Saboff
Comment 1
2012-07-11 18:23:37 PDT
Created
attachment 151836
[details]
Patch for Review This patch depends on the changes for
https://bugs.webkit.org/show_bug.cgi?id=90319
and
https://bugs.webkit.org/show_bug.cgi?id=90320
. I will post another patch with the combination of this patch and the other two for EWS bot analysis.
Michael Saboff
Comment 2
2012-07-11 18:24:28 PDT
Created
attachment 151838
[details]
COmbined patches from 90319, 90320 and 90321 for EWS testing
Michael Saboff
Comment 3
2012-07-11 18:35:56 PDT
The attached proposed patches are neutral in the page load tests across three tests before and after Before geo mean of 210.9ms After geo mean of 209.9ms Before test results: Mean (Confidence) Fastest Run **************************************************************************************************** 16784.35 ms +/- 2.34% 14990.80 ms Total 215.38 ms +/- 1.00% 207.80 ms Mean of Geometric Means **************************************************************************************************** **************************************************************************************************** 16413.54 ms +/- 1.80% 15042.10 ms Total 208.62 ms +/- 0.87% 199.90 ms Mean of Geometric Means **************************************************************************************************** **************************************************************************************************** 16514.65 ms +/- 1.71% 15592.80 ms Total 208.61 ms +/- 0.72% 202.70 ms Mean of Geometric Means **************************************************************************************************** After test results: Mean (Confidence) Fastest Run **************************************************************************************************** 16401.53 ms +/- 2.50% 14870.60 ms Total 206.50 ms +/- 1.17% 196.30 ms Mean of Geometric Means **************************************************************************************************** **************************************************************************************************** 16800.39 ms +/- 1.35% 15901.40 ms Total 216.90 ms +/- 0.44% 212.90 ms Mean of Geometric Means **************************************************************************************************** **************************************************************************************************** 16530.17 ms +/- 2.29% 14977.70 ms Total 206.26 ms +/- 0.92% 199.50 ms Mean of Geometric Means ****************************************************************************************************
Build Bot
Comment 4
2012-07-11 19:17:47 PDT
Comment on
attachment 151836
[details]
Patch for Review
Attachment 151836
[details]
did not pass mac-ews (mac): Output:
http://queues.webkit.org/results/13221027
WebKit Review Bot
Comment 5
2012-07-11 19:25:04 PDT
Comment on
attachment 151836
[details]
Patch for Review
Attachment 151836
[details]
did not pass chromium-ews (chromium-xvfb): Output:
http://queues.webkit.org/results/13209551
Build Bot
Comment 6
2012-07-11 19:41:51 PDT
Comment on
attachment 151836
[details]
Patch for Review
Attachment 151836
[details]
did not pass win-ews (win): Output:
http://queues.webkit.org/results/13208531
Early Warning System Bot
Comment 7
2012-07-11 20:07:48 PDT
Comment on
attachment 151836
[details]
Patch for Review
Attachment 151836
[details]
did not pass qt-wk2-ews (qt): Output:
http://queues.webkit.org/results/13204554
Sam Weinig
Comment 8
2012-07-11 20:08:56 PDT
Comment on
attachment 151836
[details]
Patch for Review View in context:
https://bugs.webkit.org/attachment.cgi?id=151836&action=review
> Source/WebCore/platform/text/SegmentedString.h:37 > + , m_offset(-1)
Should this -1 be in a constant?
> Source/WebCore/platform/text/SegmentedString.h:81 > int m_length; > - const UChar* m_current; > + UChar m_currentChar; > + int m_offset;
With the bool below, I think this could be packed to not expand the size of the class, at least on 64bit.
Early Warning System Bot
Comment 9
2012-07-11 20:20:19 PDT
Comment on
attachment 151836
[details]
Patch for Review
Attachment 151836
[details]
did not pass qt-ews (qt): Output:
http://queues.webkit.org/results/13201555
Gyuyoung Kim
Comment 10
2012-07-11 20:39:26 PDT
Comment on
attachment 151836
[details]
Patch for Review
Attachment 151836
[details]
did not pass efl-ews (efl): Output:
http://queues.webkit.org/results/13206554
Build Bot
Comment 11
2012-07-12 10:16:55 PDT
Comment on
attachment 151838
[details]
COmbined patches from 90319, 90320 and 90321 for EWS testing
Attachment 151838
[details]
did not pass win-ews (win): Output:
http://queues.webkit.org/results/13205737
WebKit Review Bot
Comment 12
2012-07-12 10:28:21 PDT
Comment on
attachment 151838
[details]
COmbined patches from 90319, 90320 and 90321 for EWS testing
Attachment 151838
[details]
did not pass chromium-ews (chromium-xvfb): Output:
http://queues.webkit.org/results/13205736
Early Warning System Bot
Comment 13
2012-07-12 10:45:05 PDT
Comment on
attachment 151838
[details]
COmbined patches from 90319, 90320 and 90321 for EWS testing
Attachment 151838
[details]
did not pass qt-wk2-ews (qt): Output:
http://queues.webkit.org/results/13209795
Build Bot
Comment 14
2012-07-12 10:55:27 PDT
Comment on
attachment 151838
[details]
COmbined patches from 90319, 90320 and 90321 for EWS testing
Attachment 151838
[details]
did not pass mac-ews (mac): Output:
http://queues.webkit.org/results/13199771
Michael Saboff
Comment 15
2012-07-12 11:12:08 PDT
Created
attachment 152005
[details]
Updated patch (In reply to
comment #8
)
> (From update of
attachment 151836
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=151836&action=review
> > > Source/WebCore/platform/text/SegmentedString.h:37 > > + , m_offset(-1) > > Should this -1 be in a constant?
I checked all the cases that use m_offset, they are all protected by !m_length so I changed this to m_offset(0).
> > > Source/WebCore/platform/text/SegmentedString.h:81 > > int m_length; > > - const UChar* m_current; > > + UChar m_currentChar; > > + int m_offset; > > With the bool below, I think this could be packed to not expand the size of the class, at least on 64bit.
I moved the fields around a little to allow for better packing: public: int m_length; int m_offset; UChar m_currentChar; private: bool m_doNotExcludeLineNumbers; String m_string; I moved that WTFString::dataSize() change to this patch and now it builds independent of the other two patches. It still isn't useful without one or both of the other two patches.
WebKit Review Bot
Comment 16
2012-07-12 12:31:02 PDT
Comment on
attachment 152005
[details]
Updated patch
Attachment 152005
[details]
did not pass chromium-ews (chromium-xvfb): Output:
http://queues.webkit.org/results/13200848
New failing tests: http/tests/misc/xmltokenizer-do-not-crash.pl
WebKit Review Bot
Comment 17
2012-07-12 12:31:06 PDT
Created
attachment 152023
[details]
Archive of layout-test-results from gce-cr-linux-01 The attached test failures were seen while running run-webkit-tests on the chromium-ews. Bot: gce-cr-linux-01 Port: <class 'webkitpy.common.config.ports.ChromiumXVFBPort'> Platform: Linux-2.6.39-gcg-201203291735-x86_64-with-Ubuntu-10.04-lucid
Michael Saboff
Comment 18
2012-07-16 11:46:59 PDT
Created
attachment 152575
[details]
Patch with speculative fix for chrome-linux test failure I suspect that we were getting a null character before isEmpty() was returning null with a SegmentedString. I tightened up the end checking.
WebKit Review Bot
Comment 19
2012-07-16 13:06:55 PDT
Comment on
attachment 152575
[details]
Patch with speculative fix for chrome-linux test failure
Attachment 152575
[details]
did not pass chromium-ews (chromium-xvfb): Output:
http://queues.webkit.org/results/13266033
New failing tests: http/tests/misc/xmltokenizer-do-not-crash.pl
WebKit Review Bot
Comment 20
2012-07-16 13:06:59 PDT
Created
attachment 152597
[details]
Archive of layout-test-results from gce-cr-linux-04 The attached test failures were seen while running run-webkit-tests on the chromium-ews. Bot: gce-cr-linux-04 Port: <class 'webkitpy.common.config.ports.ChromiumXVFBPort'> Platform: Linux-2.6.39-gcg-201203291735-x86_64-with-Ubuntu-10.04-lucid
Michael Saboff
Comment 21
2012-07-16 18:36:13 PDT
Created
attachment 152676
[details]
Updated Patch with another Speculative Fix for Chrome Linux Changed SegmentedSubstring::appendTo() to not add a substring after it has been cleared.
Oliver Hunt
Comment 22
2012-07-17 10:57:44 PDT
Comment on
attachment 152676
[details]
Updated Patch with another Speculative Fix for Chrome Linux View in context:
https://bugs.webkit.org/attachment.cgi?id=152676&action=review
r=me, but make sure you checked that html parser perf is still okay. Check with eseidel, abarth, or antti as I believe that there is a html parser benchmark in the repo somewhere.
> Source/WTF/wtf/text/WTFString.h:197 > + return m_impl->length() * (is8Bit() ? sizeof(LChar) : sizeof(UChar));
If this turns out to be hot (i doubt it) we can make it branchless with length + length * !is8Bit(). But only if this is hot, as it would be hideously non-obvious what was going on.
> Source/WebCore/platform/text/SegmentedString.h:81 > - const UChar* m_current; > + int m_offset;
Hopefully this will actually result in a reduced struct size in 64bit, which is nice.
Michael Saboff
Comment 23
2012-07-24 11:43:50 PDT
Created
attachment 154114
[details]
Updated patch with performance tunes for Parsing Tests (In reply to
comment #22
)
> (From update of
attachment 152676
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=152676&action=review
> > r=me, but make sure you checked that html parser perf is still okay. Check with eseidel, abarth, or antti as I believe that there is a html parser benchmark in the repo somewhere.
The prior patch was a regression to neutral on the parsing tests. This reworking of the patch has tunes to turn all but one test into progressions. The table below shows the test results. I restarted Safari before each test. The units and better direction is listed below the test. Test
r123007
r123011
and Delta baseline this patch css-parser-yui Average 368.65 367.76 -0.24% (runs/sec) Median 368.42 368.66 0.07% (Bigger is better) StDev 2.12 2.57 html-parser Average 1900.05 1856.50 2.29% (ms) Median 1903.50 1857.50 2.42% (Smaller is better) StDev 11.97 5.12 html5-full-render Average 20524 20381.5 0.69% (ms) Median 20524 20381.5 0.69% (Smaller is better) StDev 219 198.5 innerHTML-setter Average 289.67 295.67 2.07% (runs/sec) Median 289.86 296.11 2.16% (Bigger is better) StDev 1.43 1.66 query-selector-deep Average 386.64 395.68 2.34% (runs/sec) Median 386.47 395.55 2.35% (Bigger is better) StDev 1.15 1.09 query-selector-first Average 2108.09 2031.64 -3.63% (runs/sec) Median 2108.04 2034.33 -3.50% (Bigger is better) StDev 21.37 17.72 query-selector-last Average 372.19 389.22 4.58% (runs/sec) Median 372.34 389.29 4.55% (Bigger is better) StDev 0.94 0.67 simple-url Average 43.00 48.88 13.69% (runs/sec) Median 42.99 49.26 14.59% (Bigger is better) StDev 0.24 0.65 textarea-parsing Average 56.49 56.27 -0.38% (runs/sec) Median 56.63 56.37 -0.45% (Bigger is better) StDev 0.72 0.56 tiny-innerHTML Average 5.68 6.36 12.02% (runs/sec) Median 5.68 6.37 12.03% (Bigger is better) StDev 0.05 0.04 url-parser Average 96.96 113.65 17.20% (runs/sec) Median 97.03 113.71 17.19% (Bigger is better) StDev 1.09 0.56 xml-parser Average 6.89 7.09 2.81% (runs/sec) Median 6.90 7.10 2.84% (Bigger is better) StDev 0.04 0.03
Geoffrey Garen
Comment 24
2012-07-24 12:12:15 PDT
Comment on
attachment 154114
[details]
Updated patch with performance tunes for Parsing Tests View in context:
https://bugs.webkit.org/attachment.cgi?id=154114&action=review
Performance results look good. Please fix the style and organization issues below before landing.
> Source/WTF/wtf/text/WTFString.h:193 > + unsigned dataSize() const
Everything is data. How about calling this "sizeInBytes()"?
> Source/WebCore/html/parser/HTMLParserIdioms.cpp:76 > + if (string.is8Bit()) { > + const LChar* characters = string.characters8(); > + > + for (; numLeadingSpaces < length; ++numLeadingSpaces) { > + if (isNotHTMLSpace(characters[numLeadingSpaces])) > + break; > + } > + > + if (numLeadingSpaces == length) > + return string.isNull() ? string : emptyAtom.string(); > > + for (; numTrailingSpaces < length; ++numTrailingSpaces) { > + if (isNotHTMLSpace(characters[length - numTrailingSpaces - 1])) > + break; > + } > + } else { > + const UChar* characters = string.characters(); > + > + for (; numLeadingSpaces < length; ++numLeadingSpaces) { > + if (isNotHTMLSpace(characters[numLeadingSpaces])) > + break; > + } > + > + if (numLeadingSpaces == length) > + return string.isNull() ? string : emptyAtom.string(); > + > + for (; numTrailingSpaces < length; ++numTrailingSpaces) { > + if (isNotHTMLSpace(characters[length - numTrailingSpaces - 1])) > + break; > + } > + }
Here we have two code paths that differ only in their character type. This comes up a lot in 8/16 work. In this case, and cases like it, I'd rather see a function template that abstracted out the character type, to avoid duplicating all this code. A template will make our code more compact, and it will mean that we only need to fix bugs in one place instead of two. In this case, I would templatize stripLeadingAndTrailingHTMLSpaces.
> Source/WebCore/platform/text/SegmentedString.cpp:40 > + m_advanceFunc = other.m_advanceFunc; > + m_advanceAndUpdateLineNumberFunc = other.m_advanceAndUpdateLineNumberFunc;
These two assignments should use initializer syntax.
> Source/WebCore/platform/text/SegmentedString.cpp:247 > +void SegmentedString::advanceAndUpdateLineNumber8IncludeLineNumbers()
Other 8 vs 16 functions put the "8" at the end. This one should too. Can you just remove the "IncludeLineNumbers" from the end here? I don't think it adds anything above "UpdateLineNumber".
> Source/WebCore/platform/text/SegmentedString.cpp:259 > +void SegmentedString::advanceAndUpdateLineNumber16IncludeLineNumbers()
Ditto.
> Source/WebCore/platform/text/SegmentedString.h:151 > + m_advanceFunc = &SegmentedString::advanceEmpty; > + m_advanceAndUpdateLineNumberFunc = &SegmentedString::advanceEmpty;
Initializer syntax, please.
> Source/WebCore/platform/text/SegmentedString.h:211 > + (this->*m_advanceFunc)();
No need for "this->".
> Source/WebCore/platform/text/SegmentedString.h:216 > + (this->*m_advanceAndUpdateLineNumberFunc)();
No need for "this->".
> Source/WebCore/platform/text/SegmentedString.h:293 > + void setSlowCase()
WebKit style uses the "set" prefix to mean "I am passing you the value to set". This function should be named "updateSlowCaseFunctionPointers()".
> Source/WebCore/platform/text/SegmentedString.h:306 > + void setAdvanceFunctionPointers()
WebKit style uses the "set" prefix to mean "I am passing you the value to set". This function should be named "updateAdvanceFunctionPointers()".
Michael Saboff
Comment 25
2012-07-24 17:49:40 PDT
> > Source/WebCore/platform/text/SegmentedString.h:211 > > + (this->*m_advanceFunc)(); > > No need for "this->". > > > Source/WebCore/platform/text/SegmentedString.h:216 > > + (this->*m_advanceAndUpdateLineNumberFunc)(); > > No need for "this->".
The C++ standard requires either ->* or .* to bind an specific class object when invoking a member function. Furthermore, given the precedence of these "pointer to member select" operators being lower than the () for the arguments, the construct needs to be surrounded with a set of parenthesis. Without the (this->*..) clang gives: /Volumes/Data/src/webkit.work/Source/WebCore/platform/text/SegmentedString.h:211:10: error: indirection requires pointer operand ('void (WebCore::SegmentedString::*)()' invalid) (*m_advanceFunc)(); ^~~~~~~~~~~~~~ I made all the other recommended changes.
Michael Saboff
Comment 26
2012-07-24 18:11:37 PDT
Committed
r123560
: <
http://trac.webkit.org/changeset/123560
>
Andrew Wilson
Comment 27
2012-07-25 10:43:16 PDT
I'm seeing valgrind errors in the chromium build bots, as well as failing/flaky tests after this patch: 10:24:57 memcheck_analyze.py [ERROR] Command: InvalidRead Invalid read of size 1 WebCore::SegmentedSubstring::incrementAndGetCurrentChar8() (third_party/WebKit/Source/WebCore/platform/text/SegmentedString.h:94) WebCore::SegmentedString::advanceAndUpdateLineNumberSlowCase() (third_party/WebKit/Source/WebCore/platform/text/SegmentedString.h:121) WebCore::SegmentedString::advanceAndUpdateLineNumber() (third_party/WebKit/Source/WebCore/platform/text/SegmentedString.h:216) WebCore::HTMLTokenizer::nextToken(WebCore::SegmentedString&, WebCore::HTMLToken&) (third_party/WebKit/Source/WebCore/xml/parser/MarkupTokenizerBase.h:121) WebCore::HTMLDocumentParser::pumpTokenizer(WebCore::HTMLDocumentParser::SynchronousMode) (third_party/WebKit/Source/WebCore/html/parser/HTMLDocumentParser.cpp:258) WebCore::HTMLDocumentParser::pumpTokenizerIfPossible(WebCore::HTMLDocumentParser::SynchronousMode) (third_party/WebKit/Source/WebCore/html/parser/HTMLDocumentParser.cpp:173) WebCore::HTMLDocumentParser::append(WebCore::SegmentedString const&) (third_party/WebKit/Source/WebCore/html/parser/HTMLDocumentParser.cpp:361) WebCore::TextDocumentParser::append(WebCore::SegmentedString const&) (third_party/WebKit/Source/WebCore/html/parser/TextDocumentParser.cpp:52) WebCore::DecodedDataDocumentParser::appendBytes(WebCore::DocumentWriter*, char const*, unsigned int) (third_party/WebKit/Source/WebCore/dom/DecodedDataDocumentParser.cpp:50) WebCore::DocumentWriter::addData(char const*, unsigned int) (third_party/WebKit/Source/WebCore/loader/DocumentWriter.cpp:218) WebCore::DocumentLoader::commitData(char const*, unsigned int) (third_party/WebKit/Source/WebCore/loader/DocumentLoader.cpp:355) WebKit::WebFrameImpl::commitDocumentData(char const*, unsigned int) (third_party/WebKit/Source/WebKit/chromium/src/WebFrameImpl.cpp:1158) WebKit::FrameLoaderClientImpl::committedLoad(WebCore::DocumentLoader*, char const*, int) (third_party/WebKit/Source/WebKit/chromium/src/FrameLoaderClientImpl.cpp:1124) WebCore::DocumentLoader::commitLoad(char const*, int) (third_party/WebKit/Source/WebCore/loader/DocumentLoader.cpp:322) WebCore::DocumentLoader::receivedData(char const*, int) (third_party/WebKit/Source/WebCore/loader/DocumentLoader.cpp:384) WebCore::MainResourceLoader::addData(char const*, int, bool) (third_party/WebKit/Source/WebCore/loader/MainResourceLoader.cpp:192) WebCore::ResourceLoader::didReceiveData(char const*, int, long long, bool) (third_party/WebKit/Source/WebCore/loader/ResourceLoader.cpp:272) WebCore::MainResourceLoader::didReceiveData(char const*, int, long long, bool) (third_party/WebKit/Source/WebCore/loader/MainResourceLoader.cpp:474) WebCore::ResourceLoader::didReceiveData(WebCore::ResourceHandle*, char const*, int, int) (third_party/WebKit/Source/WebCore/loader/ResourceLoader.cpp:430) WebCore::ResourceHandleInternal::didReceiveData(WebKit::WebURLLoader*, char const*, int, int) (third_party/WebKit/Source/WebCore/platform/network/chromium/ResourceHandle.cpp:138) webkit_glue::WebURLLoaderImpl::Context::OnReceivedData(char const*, int, int) (webkit/glue/weburlloader_impl.cc:615) webkit_glue::WebURLLoaderImpl::Context::HandleDataURL() (webkit/glue/weburlloader_impl.cc:713) Address 0x72cfed3 is 0 bytes after a block of size 27 alloc'd malloc (m_replacemalloc/vg_replace_malloc.c:1072) WTF::fastMalloc(unsigned int) (third_party/WebKit/Source/WTF/wtf/FastMalloc.cpp:268) WTF::StringImpl::createUninitialized(unsigned int, unsigned char*&) (third_party/WebKit/Source/WTF/wtf/text/StringImpl.cpp:98) WTF::String::createUninitialized(unsigned int, unsigned char*&) (third_party/WebKit/Source/WTF/wtf/text/WTFString.h:331) WebCore::TextCodecLatin1::decode(char const*, unsigned int, bool, bool, bool&) (third_party/WebKit/Source/WebCore/platform/text/TextCodecLatin1.cpp:126) WebCore::TextResourceDecoder::decode(char const*, unsigned int) (third_party/WebKit/Source/WebCore/loader/TextResourceDecoder.cpp:648) WebCore::DecodedDataDocumentParser::appendBytes(WebCore::DocumentWriter*, char const*, unsigned int) (third_party/WebKit/Source/WebCore/dom/DecodedDataDocumentParser.cpp:45) WebCore::DocumentWriter::addData(char const*, unsigned int) (third_party/WebKit/Source/WebCore/loader/DocumentWriter.cpp:218) WebCore::DocumentLoader::commitData(char const*, unsigned int) (third_party/WebKit/Source/WebCore/loader/DocumentLoader.cpp:355) WebKit::WebFrameImpl::commitDocumentData(char const*, unsigned int) (third_party/WebKit/Source/WebKit/chromium/src/WebFrameImpl.cpp:1158) WebKit::FrameLoaderClientImpl::committedLoad(WebCore::DocumentLoader*, char const*, int) (third_party/WebKit/Source/WebKit/chromium/src/FrameLoaderClientImpl.cpp:1124) WebCore::DocumentLoader::commitLoad(char const*, int) (third_party/WebKit/Source/WebCore/loader/DocumentLoader.cpp:322) WebCore::DocumentLoader::receivedData(char const*, int) (third_party/WebKit/Source/WebCore/loader/DocumentLoader.cpp:384) WebCore::MainResourceLoader::addData(char const*, int, bool) (third_party/WebKit/Source/WebCore/loader/MainResourceLoader.cpp:192) WebCore::ResourceLoader::didReceiveData(char const*, int, long long, bool) (third_party/WebKit/Source/WebCore/loader/ResourceLoader.cpp:272) WebCore::MainResourceLoader::didReceiveData(char const*, int, long long, bool) (third_party/WebKit/Source/WebCore/loader/MainResourceLoader.cpp:474) WebCore::ResourceLoader::didReceiveData(WebCore::ResourceHandle*, char const*, int, int) (third_party/WebKit/Source/WebCore/loader/ResourceLoader.cpp:430) WebCore::ResourceHandleInternal::didReceiveData(WebKit::WebURLLoader*, char const*, int, int) (third_party/WebKit/Source/WebCore/platform/network/chromium/ResourceHandle.cpp:138) webkit_glue::WebURLLoaderImpl::Context::OnReceivedData(char const*, int, int) (webkit/glue/weburlloader_impl.cc:615) webkit_glue::WebURLLoaderImpl::Context::HandleDataURL() (webkit/glue/weburlloader_impl.cc:713) Suppression (error hash=#26BEFD99CCA1F7AB#): For more info on using suppressions see
http://dev.chromium.org/developers/tree-sheriffs/sheriff-details-chromium/memory-sheriff#TOC-Suppressing-memory-reports
{ <insert_a_suppression_name_here> Memcheck:Unaddressable fun:_ZN7WebCore18SegmentedSubstring27incrementAndGetCurrentChar8Ev fun:_ZN7WebCore15SegmentedString34advanceAndUpdateLineNumberSlowCaseEv fun:_ZN7WebCore15SegmentedString26advanceAndUpdateLineNumberEv fun:_ZN7WebCore13HTMLTokenizer9nextTokenERNS_15SegmentedStringERNS_9HTMLTokenE fun:_ZN7WebCore18HTMLDocumentParser13pumpTokenizerENS0_15SynchronousModeE fun:_ZN7WebCore18HTMLDocumentParser23pumpTokenizerIfPossibleENS0_15SynchronousModeE fun:_ZN7WebCore18HTMLDocumentParser6appendERKNS_15SegmentedStringE fun:_ZN7WebCore18TextDocumentParser6appendERKNS_15SegmentedStringE fun:_ZN7WebCore25DecodedDataDocumentParser11appendBytesEPNS_14DocumentWriterEPKcj fun:_ZN7WebCore14DocumentWriter7addDataEPKcj fun:_ZN7WebCore14DocumentLoader10commitDataEPKcj fun:_ZN6WebKit12WebFrameImpl18commitDocumentDataEPKcj fun:_ZN6WebKit21FrameLoaderClientImpl13committedLoadEPN7WebCore14DocumentLoaderEPKci fun:_ZN7WebCore14DocumentLoader10commitLoadEPKci fun:_ZN7WebCore14DocumentLoader12receivedDataEPKci fun:_ZN7WebCore18MainResourceLoader7addDataEPKcib fun:_ZN7WebCore14ResourceLoader14didReceiveDataEPKcixb fun:_ZN7WebCore18MainResourceLoader14didReceiveDataEPKcixb fun:_ZN7WebCore14ResourceLoader14didReceiveDataEPNS_14ResourceHandleEPKcii fun:_ZN7WebCore22ResourceHandleInternal14didReceiveDataEPN6WebKit12WebURLLoaderEPKcii fun:_ZN11webkit_glue16WebURLLoaderImpl7Context14OnReceivedDataEPKcii fun:_ZN11webkit_glue16WebURLLoaderImpl7Context13HandleDataURLEv }
http://build.chromium.org/p/chromium.webkit/builders/Linux%20Valgrind/builds/28539/steps/memory%20test%3A%20test_shell/logs/stdio
Geoffrey Garen
Comment 28
2012-07-25 10:47:50 PDT
Could you post a list of failing tests, along with instructions for how to reproduce the failures?
Andrew Wilson
Comment 29
2012-07-25 10:51:03 PDT
The valgrind output I linked to should show a backtrace for the invalid read that's happening. As for failing tests, they are part of the chromium browser tests:
http://build.chromium.org/p/chromium.webkit/builders/Linux%20Tests/builds/22779/steps/browser_tests/logs/stdio
You can compile the chromium browser tests (assuming you have a chromium buildout) and run them like this: browser_tests --gtest_filter=WebUiBidiCheckerBrowserTest* Specific test failures are: WebUIBidiCheckerBrowserTestLTR.TestSettingsClearBrowserDataPage WebUIBidiCheckerBrowserTestLTR.TestSettingsContentSettingsExceptionsPage WebUIBidiCheckerBrowserTestLTR.TestSettingsFrameSyncSetup WebUIBidiCheckerBrowserTestLTR.TestSettingsFrameMangageProfile WebUIBidiCheckerBrowserTestLTR.TestSettingsFrameContentExceptionsJavascript WebUIBidiCheckerBrowserTestLTR.TestSettingsFrameContentExceptionsPlugins These tests are flaky now, which is expected given the nature of the memory issue.
Andrew Wilson
Comment 30
2012-07-25 11:12:36 PDT
Reverted
r123560
for reason: Breaks Committed
r123635
: <
http://trac.webkit.org/changeset/123635
>
Michael Saboff
Comment 31
2012-07-25 16:42:31 PDT
Committed
r123679
: <
http://trac.webkit.org/changeset/123679
>
Kwang Yul Seo
Comment 32
2012-07-25 18:08:16 PDT
(In reply to
comment #31
)
> Committed
r123679
: <
http://trac.webkit.org/changeset/123679
>
I've noticed that this patch caused 3.4% performance regression in the HTML parsing benchmark. (2.26GHz Intel Core 2 Duo Mac Book Pro) Is this expected? * Before kseo@cat:WebKit (master)> ./Tools/Scripts/run-perf-tests Parser/html-parser.html Running 1 tests Running Parser/html-parser.html (1 of 1) RESULT Parser: html-parser= 2746.7 ms median= 2746.0 ms, stdev= 4.07553677446 ms, min= 2741.0 ms, max= 2757.0 ms * After kseo@cat:WebKit (master)> ./Tools/Scripts/run-perf-tests Parser/html-parser.html Running 1 tests Running Parser/html-parser.html (1 of 1) RESULT Parser: html-parser= 2839.0 ms median= 2838.0 ms, stdev= 3.30151480384 ms, min= 2833.0 ms, max= 2847.0 ms
Michael Saboff
Comment 33
2012-07-25 18:19:26 PDT
(In reply to
comment #32
)
> (In reply to
comment #31
) > > Committed
r123679
: <
http://trac.webkit.org/changeset/123679
> > > I've noticed that this patch caused 3.4% performance regression in the HTML parsing benchmark. (2.26GHz Intel Core 2 Duo Mac Book Pro) Is this expected? > > * Before > kseo@cat:WebKit (master)> ./Tools/Scripts/run-perf-tests Parser/html-parser.html > Running 1 tests > Running Parser/html-parser.html (1 of 1) > RESULT Parser: html-parser= 2746.7 ms > median= 2746.0 ms, stdev= 4.07553677446 ms, min= 2741.0 ms, max= 2757.0 ms > > * After > kseo@cat:WebKit (master)> ./Tools/Scripts/run-perf-tests Parser/html-parser.html > Running 1 tests > Running Parser/html-parser.html (1 of 1) > RESULT Parser: html-parser= 2839.0 ms > median= 2838.0 ms, stdev= 3.30151480384 ms, min= 2833.0 ms, max= 2847.0 ms
My experience is just the opposite. See the results I got in
comment #23
. I will rerun results before and after
r123679
and post.
Kwang Yul Seo
Comment 34
2012-07-25 18:39:35 PDT
(In reply to
comment #33
)
> My experience is just the opposite. See the results I got in
comment #23
. > > I will rerun results before and after
r123679
and post.
I noticed the regression of
r123560
, which was rolled back due to valgrind errors in the chromium build bots. I will also rerun tests with
r123679
.
Kwang Yul Seo
Comment 35
2012-07-25 23:25:27 PDT
(In reply to
comment #33
) I ran the HTML parser benchmark again before and after
r123679
. The performance regression is 1.17%, which is still far from the 2.27% progression (
comment #23
). Michael, what about your result? * Before (
r123678
) kseo@cat:WebKit (before)> ./Tools/Scripts/run-perf-tests Parser/html-parser.html Running 1 tests Running Parser/html-parser.html (1 of 1) RESULT Parser: html-parser= 2794.3 ms median= 2793.0 ms, stdev= 4.8795491595 ms, min= 2789.0 ms, max= 2808.0 ms * After (
r123679
) kseo@cat:WebKit (after)> ./Tools/Scripts/run-perf-tests Parser/html-parser.html Running 1 tests Running Parser/html-parser.html (1 of 1) RESULT Parser: html-parser= 2827.0 ms median= 2826.5 ms, stdev= 4.38178046004 ms, min= 2820.0 ms, max= 2838.0 ms
Michael Saboff
Comment 36
2012-07-26 10:21:30 PDT
I ran all the parser tests on both 123678 and 123679 on a MacPro with results below. For html-parser, I see a 3.6% regression. Here are the pro/re-gression percents: Parser/css-parser-yui.html 0.16% Parser/html-parser.html -3.59% Parser/html5-full-render.html -0.31% Parser/innerHTML-setter.html -1.28% Parser/query-selector-deep.html 0.55% Parser/query-selector-first.html 4.97% Parser/query-selector-last.html 1.70% Parser/simple-url.html 8.17% Parser/textarea-parsing.html -3.03% Parser/tiny-innerHTML.html -2.81% Parser/url-parser.html 13.42% Parser/xml-parser.html 1.83% Two other things I want to point out about my earlier results, I was measuring the impact of three changes (
r123008
,
r123011
and pre-checkin of this change) and they were run in browser. My before was a build of
r123007
while my after was 123011 plus these changes. === Results from 123678: === Running 12 tests Running Parser/css-parser-yui.html (1 of 12) RESULT Parser: css-parser-yui= 367.458737712 runs/s median= 367.69549667 runs/s, stdev= 1.30800629291 runs/s, min= 364.583333333 runs/s, max= 369.393139842 runs/s Running Parser/html-parser.html (2 of 12) RESULT Parser: html-parser= 1769.7 ms median= 1768.0 ms, stdev= 9.40265919833 ms, min= 1754.0 ms, max= 1789.0 ms Running Parser/html5-full-render.html (3 of 12) RESULT Parser: html5-full-render= 21201.5 ms median= 21201.5 ms, stdev= 67.5 ms, min= 21134.0 ms, max= 21269.0 ms Running Parser/innerHTML-setter.html (4 of 12) DESCRIPTION: This benchmark tests innerHTML setter for a large DOM tree RESULT Parser: innerHTML-setter= 287.075598983 runs/s median= 286.831812256 runs/s, stdev= 0.38009048583 runs/s, min= 286.458333333 runs/s, max= 287.958115183 runs/s Running Parser/query-selector-deep.html (5 of 12) DESCRIPTION: This benchmark tests querySelector() for an element that appears in the depths in the document RESULT Parser: query-selector-deep= 388.162250314 runs/s median= 388.349514563 runs/s, stdev= 0.671546741178 runs/s, min= 386.473429952 runs/s, max= 388.821385176 runs/s Running Parser/query-selector-first.html (6 of 12) DESCRIPTION: This benchmark tests querySelector() for an element that appears near the head of the document RESULT Parser: query-selector-first= 1990.6349835 runs/s median= 1993.77024438 runs/s, stdev= 13.3392583811 runs/s, min= 1960.78431373 runs/s, max= 2007.52823087 runs/s Running Parser/query-selector-last.html (7 of 12) DESCRIPTION: This benchmark tests querySelector() for an element that appears near the tail of the document RESULT Parser: query-selector-last= 378.8329393 runs/s median= 378.698224852 runs/s, stdev= 0.287089450694 runs/s, min= 378.250591017 runs/s, max= 379.146919431 runs/s Running Parser/simple-url.html (8 of 12) RESULT Parser: simple-url= 46.0038726752 runs/s median= 45.9770114943 runs/s, stdev= 0.139322571179 runs/s, min= 45.6621004566 runs/s, max= 46.2427745665 runs/s Running Parser/textarea-parsing.html (9 of 12) RESULT Parser: textarea-parsing= 54.8423222733 runs/s median= 54.8847420417 runs/s, stdev= 0.431728764152 runs/s, min= 53.1208499336 runs/s, max= 55.4323725055 runs/s Running Parser/tiny-innerHTML.html (10 of 12) RESULT Parser: tiny-innerHTML= 5.53018943992 runs/s median= 5.52486187845 runs/s, stdev= 0.0271940183488 runs/s, min= 5.47645125958 runs/s, max= 5.58035714286 runs/s Running Parser/url-parser.html (11 of 12) RESULT Parser: url-parser= 99.3605864085 runs/s median= 99.3788819876 runs/s, stdev= 0.147430421588 runs/s, min= 99.1325898389 runs/s, max= 99.6264009963 runs/s Running Parser/xml-parser.html (12 of 12) RESULT Parser: xml-parser= 7.22943725008 runs/s median= 7.23763570567 runs/s, stdev= 0.0248782874786 runs/s, min= 7.15137067938 runs/s, max= 7.26392251816 runs/s === Results from 123679: === Running 12 tests Running Parser/css-parser-yui.html (1 of 12) RESULT Parser: css-parser-yui= 368.042099343 runs/s median= 368.421052632 runs/s, stdev= 1.75896891388 runs/s, min= 363.636363636 runs/s, max= 370.37037037 runs/s Running Parser/html-parser.html (2 of 12) RESULT Parser: html-parser= 1833.2 ms median= 1833.5 ms, stdev= 3.84187454246 ms, min= 1826.0 ms, max= 1842.0 ms Running Parser/html5-full-render.html (3 of 12) RESULT Parser: html5-full-render= 21267.5 ms median= 21267.5 ms, stdev= 9.5 ms, min= 21258.0 ms, max= 21277.0 ms Running Parser/innerHTML-setter.html (4 of 12) DESCRIPTION: This benchmark tests innerHTML setter for a large DOM tree RESULT Parser: innerHTML-setter= 283.414547175 runs/s median= 283.505154639 runs/s, stdev= 0.444597811497 runs/s, min= 282.413350449 runs/s, max= 283.870967742 runs/s Running Parser/query-selector-deep.html (5 of 12) DESCRIPTION: This benchmark tests querySelector() for an element that appears in the depths in the document RESULT Parser: query-selector-deep= 390.292536214 runs/s median= 390.48214658 runs/s, stdev= 0.635651062783 runs/s, min= 388.821385176 runs/s, max= 391.19804401 runs/s Running Parser/query-selector-first.html (6 of 12) DESCRIPTION: This benchmark tests querySelector() for an element that appears near the head of the document RESULT Parser: query-selector-first= 2089.65837585 runs/s median= 2090.13805696 runs/s, stdev= 11.8558960002 runs/s, min= 2067.18346253 runs/s, max= 2113.60634082 runs/s Running Parser/query-selector-last.html (7 of 12) DESCRIPTION: This benchmark tests querySelector() for an element that appears near the tail of the document RESULT Parser: query-selector-last= 385.288272322 runs/s median= 385.542168675 runs/s, stdev= 0.735912678771 runs/s, min= 382.317801673 runs/s, max= 386.007237636 runs/s Running Parser/simple-url.html (8 of 12) RESULT Parser: simple-url= 49.761237881 runs/s median= 49.7823763572 runs/s, stdev= 0.187995923193 runs/s, min= 49.504950495 runs/s, max= 50.0 runs/s Running Parser/textarea-parsing.html (9 of 12) RESULT Parser: textarea-parsing= 53.1830425356 runs/s median= 53.1914893617 runs/s, stdev= 0.388830329202 runs/s, min= 51.8806744488 runs/s, max= 53.9374325782 runs/s Running Parser/tiny-innerHTML.html (10 of 12) RESULT Parser: tiny-innerHTML= 5.37477235626 runs/s median= 5.37056928034 runs/s, stdev= 0.0405333190561 runs/s, min= 5.27704485488 runs/s, max= 5.44069640914 runs/s Running Parser/url-parser.html (11 of 12) RESULT Parser: url-parser= 112.698565873 runs/s median= 112.852708937 runs/s, stdev= 0.388816247722 runs/s, min= 111.940298507 runs/s, max= 113.065326633 runs/s Running Parser/xml-parser.html (12 of 12) RESULT Parser: xml-parser= 7.36174978442 runs/s median= 7.35294117647 runs/s, stdev= 0.0419260067497 runs/s, min= 7.29927007299 runs/s, max= 7.45341614907 runs/s
Kwang Yul Seo
Comment 37
2012-07-26 22:16:26 PDT
(In reply to
comment #36
)
> I ran all the parser tests on both 123678 and 123679 on a MacPro with results below. For html-parser, I see a 3.6% regression. > > Here are the pro/re-gression percents: > Parser/css-parser-yui.html 0.16% > Parser/html-parser.html -3.59% > Parser/html5-full-render.html -0.31% > Parser/innerHTML-setter.html -1.28% > Parser/query-selector-deep.html 0.55% > Parser/query-selector-first.html 4.97% > Parser/query-selector-last.html 1.70% > Parser/simple-url.html 8.17% > Parser/textarea-parsing.html -3.03% > Parser/tiny-innerHTML.html -2.81% > Parser/url-parser.html 13.42% > Parser/xml-parser.html 1.83%
I assume this patch is intended to improve the performance of parser tests including html5-parser.html. 3.59% regression seems quite big and unexpected to me. Do you have any clue why this happened? Recently, I've worked on HTML5 parser refactoring and bug fixes (
Bug 91703
,
Bug 92065
,
Bug 92079
, ...) and struggled with performance regressions. So I happened to notice this regression.
Peter Kasting
Comment 38
2012-07-27 17:07:10 PDT
This _might_ have caused an increase of about 1% in the number of renderer syscalls on one of the Chromium perf bots:
http://code.google.com/p/chromium/issues/detail?id=139382
We basically decided not to worry about this downstream, and no one tried to bisect the builds, so I'm not even sure it's this anyway. Just noting that in case it's somehow useful data if you guys are looking at perf impacts.
Adam Barth
Comment 39
2012-07-27 17:33:02 PDT
A 3.6% regression in the HTML parser benchmark is sad times. The "advance" function in SegmentedString are really performance sensitive. It looks like you've made SegmentedString::advance() into an indirect jump, which might be causing the problem. We probably should roll out the patch while we investigate.
Adam Barth
Comment 40
2012-07-27 17:34:55 PDT
> Parser/tiny-innerHTML.html -2.81%
This benchmark is an abstraction of one of the hot parts of the "peacemaker" benchmark. I suspect your change also make peacemaker slower.
WebKit Review Bot
Comment 41
2012-07-27 17:35:15 PDT
Re-opened since this is blocked by 92565
WebKit Review Bot
Comment 42
2012-07-27 17:46:33 PDT
Comment on
attachment 152676
[details]
Updated Patch with another Speculative Fix for Chrome Linux Cleared Oliver Hunt's review+ from obsolete
attachment 152676
[details]
so that this bug does not appear in
http://webkit.org/pending-commit
.
Michael Saboff
Comment 43
2012-08-03 13:46:46 PDT
Created
attachment 156448
[details]
Patch with performance update to
r123679
This patch eliminates the 3.59% slowdown on html-parser.html and the 2.81% slowdown on tiny-innerHTML.html. These are now a 2.17 progression and a -.22% regression (greater that the StDev). Comparison of pre-8bit changes with this patch. The pre-8bit changes is
r124244
with the 8 bit Latin-1 (
r123008
) and UTF8 (
r123011
) changesets removed. This patch is built on
r124244
. Pre-8bit This patch css-parser-yui.html Average 359.00 359.42 0.12% (runs/sec) Median 359.67 359.90 0.06% (Bigger is better) StDev 1.63 1.73 Min 355.33 355.33 Max 360.82 362.23 html-parser.html Average 1835.25 1795.45 2.17% (ms) Median 1835.00 1794.00 2.23% (Smaller is better) StDev 4.38 3.98 Min 1827.00 1788.00 Max 1844.00 1804.00 html5-full-render.html Average 21508 21303 0.96% (ms) Median 21508 21303 0.96% (Smaller is better) StDev 61 60 Min 21447 21243 Max 21569 21362 innerHTML-setter.html Average 273.21 299.57 9.65% (runs/sec) Median 273.12 299.63 9.70% (Bigger is better) StDev 0.74 0.76 Min 271.94 297.40 Max 274.66 300.75 query-selector-deep.html Average 392.83 387.46 -1.37% (runs/sec) Median 393.12 387.41 -1.45% (Bigger is better) StDev 0.93 0.92 Min 390.72 385.54 Max 394.09 389.29 query-selector-first.html Average 2065.49 1996.82 -3.32% (runs/sec) Median 2065.85 1996.26 -3.37% (Bigger is better) StDev 18.65 11.74 Min 2007.53 1975.31 Max 2091.50 2022.76 query-selector-last.html Average 361.27 378.34 4.73% (runs/sec) Median 361.29 378.25 4.69% (Bigger is better) StDev 0.65 0.74 Min 359.44 376.91 Max 362.23 379.60 simple-url.html Average 45.67 49.89 9.23% (runs/sec) Median 45.71 49.82 8.97% (Bigger is better) StDev 0.13 0.50 Min 45.35 49.26 Max 45.87 50.51 textarea-parsing.html Average 52.22 56.20 7.62% (runs/sec) Median 52.29 56.31 7.69% (Bigger is better) StDev 0.33 0.33 Min 51.41 55.07 Max 52.63 56.56 tiny-innerHTML.html Average 4.89 4.88 -0.22% (runs/sec) Median 4.88 4.88 -0.18% (Bigger is better) StDev 0.05 0.03 Min 4.81 4.82 Max 4.98 4.93 url-parser.html Average 102.56 107.27 4.59% (runs/sec) Median 102.63 104.99 2.30% (Bigger is better) StDev 0.86 4.71 Min 100.88 104.44 Max 103.90 118.27 xml-parser.html Average 7.64 7.77 1.70% (runs/sec) Median 7.64 7.76 1.62% (Bigger is better) StDev 0.03 0.04 Min 7.56 7.70 Max 7.71 7.82
Michael Saboff
Comment 44
2012-08-03 14:34:02 PDT
Created
attachment 156460
[details]
Performance results of latest patch Comparison of query-selector-deep and query-selector-first across three test runs. The raw data from those three complete test runs.
Adam Barth
Comment 45
2012-08-03 14:49:18 PDT
Comment on
attachment 156448
[details]
Patch with performance update to
r123679
View in context:
https://bugs.webkit.org/attachment.cgi?id=156448&action=review
Thanks for iterating on the patch. I'm mostly forwarding Geoffrey Garen's r+, but I read through the whole patch and it looks reasonable. SegmentedString is getting kind of complicated, but I think that's the price of performance here.
> Source/WebCore/html/parser/HTMLParserIdioms.cpp:66 > + unsigned length = string.length();
At some point we need to change all these "unsigned" types to "size_t", but that's not a problem we should solve in this patch.
Michael Saboff
Comment 46
2012-08-03 18:08:08 PDT
Committed
r124679
: <
http://trac.webkit.org/changeset/124679
>
Ryosuke Niwa
Comment 47
2012-08-09 22:03:24 PDT
It appears that this patch improved our score on Parser/url-parser by 24%:
http://webkit-perf.appspot.com/graph.html#tests=[[3069,2001,32196]]&sel=1344036206653.709,1344071023730.035,45.96774193548387,97.1774193548387&displayrange=7&datatype=running
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug