WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
40172
HTML5 parser should be within 1% of old parser performance
https://bugs.webkit.org/show_bug.cgi?id=40172
Summary
HTML5 parser should be within 1% of old parser performance
Adam Barth
Reported
2010-06-04 10:27:07 PDT
HTML5 parser should be within 1% of old parser performance
Attachments
Patch
(9.74 KB, patch)
2010-06-04 10:30 PDT
,
Adam Barth
no flags
Details
Formatted Diff
Diff
Patch
(11.15 KB, patch)
2010-06-04 11:19 PDT
,
Adam Barth
no flags
Details
Formatted Diff
Diff
Patch
(11.15 KB, patch)
2010-06-04 13:17 PDT
,
Adam Barth
abarth
: review-
Details
Formatted Diff
Diff
patch with check converted to assert
(11.67 KB, patch)
2010-06-04 22:48 PDT
,
Adam Barth
darin
: review+
Details
Formatted Diff
Diff
Show Obsolete
(3)
View All
Add attachment
proposed patch, testcase, etc.
Adam Barth
Comment 1
2010-06-04 10:30:06 PDT
Created
attachment 57893
[details]
Patch
Eric Seidel (no email)
Comment 2
2010-06-04 10:36:03 PDT
Attachment 57893
[details]
did not build on mac: Build output:
http://webkit-commit-queue.appspot.com/results/3076033
Early Warning System Bot
Comment 3
2010-06-04 10:37:10 PDT
Attachment 57893
[details]
did not build on qt: Build output:
http://webkit-commit-queue.appspot.com/results/3086045
Eric Seidel (no email)
Comment 4
2010-06-04 10:39:27 PDT
Comment on
attachment 57893
[details]
Patch Seems we should understand which of these changes actually matter. The m_appropriateEndTagName one seems unrelated. Also, this changes behavior. This should fix the \0 layout tests.
Darin Adler
Comment 5
2010-06-04 10:41:35 PDT
I don’t think that we should limit ourselves to “within 1%”. It would be fine with me if it was 10% faster ;-)
Darin Adler
Comment 6
2010-06-04 10:43:36 PDT
Comment on
attachment 57893
[details]
Patch Looks like this doesn't compile either because of distinct Vector types. To assign between vectors, currently the inline capacities need to be identical. Also, it's likely to be inefficient to assign between vectors with inline capacities since the characters need to be copied. We could add code to make assignment between vectors with different inline capacities possible, but it might be worth looking at a design that to uses pointers a bit more so that is not needed.
WebKit Review Bot
Comment 7
2010-06-04 10:59:56 PDT
Attachment 57893
[details]
did not build on gtk: Build output:
http://webkit-commit-queue.appspot.com/results/3143019
Adam Barth
Comment 8
2010-06-04 11:03:50 PDT
There is a function for assigning between vectors of different inline capacities: template<typename T, size_t inlineCapacity> template<size_t otherCapacity> Vector<T, inlineCapacity>& Vector<T, inlineCapacity>::operator=(const Vector<T, otherCapacity>& other) I had to change this locally to get it to build, but I must have lost the change somehow.
Adam Barth
Comment 9
2010-06-04 11:07:25 PDT
> Also, it's likely to be inefficient to assign between vectors with inline capacities since the > characters need to be copied. We could add code to make assignment between vectors with > different inline capacities possible, but it might be worth looking at a design that to uses > pointers a bit more so that is not needed.
That's an interesting idea. I'd have to look at the code to how we can avoid more copies. For the moment, this patch seems like a big perf win, so I'm inclined to start with it (once it builds) and then iterate.
Adam Barth
Comment 10
2010-06-04 11:12:01 PDT
Benchmark numbers: HTML5 parser w/ patch: Running 20 times Ignoring warm-up run (3176) 2805 2795 2810 2810 2789 2817 2822 2814 2811 2840 2826 2826 2831 2845 2841 2832 2849 2857 2849 2841 avg 2825.5 stdev 18.481071397513727 Old parser: Running 20 times Ignoring warm-up run (3103) 2769 2781 2770 2771 2786 2783 2775 2780 2796 2787 2784 2808 2799 2795 2814 2807 2810 2805 2833 2822 avg 2793.75 stdev 17.699929378390188
Adam Barth
Comment 11
2010-06-04 11:19:27 PDT
Created
attachment 57899
[details]
Patch
WebKit Review Bot
Comment 12
2010-06-04 12:24:53 PDT
Attachment 57899
[details]
did not build on chromium: Build output:
http://webkit-commit-queue.appspot.com/results/3099036
Eric Seidel (no email)
Comment 13
2010-06-04 13:08:28 PDT
../../JavaScriptCore/wtf/Vector.h: In member function ‘WTF::Vector<T, inlineCapacity>& WTF::Vector<T, inlineCapacity>::operator=(const WTF::Vector<T, otherCapacity>&) [with unsigned int otherCapacity = 1024u, T = short unsigned int, unsigned int inlineCapacity = 32u]’: ../../WebCore/html/HTML5Lexer.cpp:1589: instantiated from here ../../JavaScriptCore/wtf/Vector.h:706: error: invalid cast from type ‘const WTF::Vector<short unsigned int, 1024u>’ to type ‘const void*’ make: *** [out/Release/obj.target/webcore/../../WebCore/html/HTML5Lexer.o] Error 1
Eric Seidel (no email)
Comment 14
2010-06-04 13:09:42 PDT
Comment on
attachment 57899
[details]
Patch Would be nice to know which part of this is actually the big speedup. In general I like the concept, but can't really r+ since it would break the world. :(
Adam Barth
Comment 15
2010-06-04 13:17:58 PDT
Created
attachment 57908
[details]
Patch
WebKit Review Bot
Comment 16
2010-06-04 13:36:47 PDT
Attachment 57899
[details]
did not build on gtk: Build output:
http://webkit-commit-queue.appspot.com/results/3068045
Darin Adler
Comment 17
2010-06-04 17:34:13 PDT
Comment on
attachment 57908
[details]
Patch Why reinterpret_cast? I'm pretty sure that static_cast will work. Another possibility would be to use local variables or a helper function then you would need no casting at all. // Helper function that does a type conversion. inline bool typelessPointersAreEqual(const void* a, const void* b) { return a == b; } if (typelessPointersAreEqual(&other, this)) return *this; I think that's way better than a reinterpret_cast.
Adam Barth
Comment 18
2010-06-04 19:14:42 PDT
> I think that's way better than a reinterpret_cast.
Ok. They're pretty close semantically because static_cast to void* doesn't do any of the multi-inheritance offsets that static_cast usually does. Using a helper function is just getting the compiler to generate the static_cast implicitly.
Adam Barth
Comment 19
2010-06-04 19:21:36 PDT
Actually, the more I think about it, that check can probably be removed. If the objects occupy the same address, they must have the same inline capacity, which means we'll call the more specific template. We can change it into an ASSERT if we're scared of just deleting it.
Darin Adler
Comment 20
2010-06-04 22:27:21 PDT
(In reply to
comment #18
)
> They're pretty close semantically because static_cast to void* doesn't do any of the multi-inheritance offsets that static_cast usually does.
Sure, what the two casts do is pretty close. But the point of the differently named casts is more about cases where they the casts can't be used be used. A reinterpret_cast could convert an integer to a pointer for example, but when you see a static_cast you know that’s not going on even without looking at the rest of the line of code. I don’t want to see these sorts of conversions to void* when I grep through the code for all reinterpret_casts.
> Using a helper function is just getting the compiler to generate the static_cast implicitly.
That’s a funny way to put it. In my opinion, relying on normal type conversion rules and avoiding a cast is not “just a way to generate [a cast]”. Casts can force the compiler to perform conversions that normally would not be done. Type conversion does a more limited set. It’s great, though, that you can avoid the check entirely. I do think you should just omit it.
Adam Barth
Comment 21
2010-06-04 22:48:12 PDT
Created
attachment 57962
[details]
patch with check converted to assert
Adam Barth
Comment 22
2010-06-04 22:57:48 PDT
Thanks Darin.
Adam Barth
Comment 23
2010-06-04 23:26:03 PDT
Committed
r60738
: <
http://trac.webkit.org/changeset/60738
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug