Bug 142495

Summary: Provide a faster path for :lang() with a single language range argument
Product: WebKit Reporter: Dhi Aurrahman <diorahman>
Component: New BugsAssignee: Dhi Aurrahman <diorahman>
Status: NEW    
Severity: Normal CC: benjamin, buildbot, rniwa
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch
none
Archive of layout-test-results from ews105 for mac-mavericks-wk2
none
Archive of layout-test-results from ews103 for mac-mavericks
none
Patch
none
Patch darin: review+

Dhi Aurrahman
Reported 2015-03-09 09:52:16 PDT
Provide a faster path for :lang() with single language range argument
Attachments
Patch (6.10 KB, patch)
2015-03-09 10:00 PDT, Dhi Aurrahman
no flags
Patch (8.79 KB, patch)
2015-03-15 23:58 PDT, Dhi Aurrahman
no flags
Patch (8.91 KB, patch)
2015-03-16 05:08 PDT, Dhi Aurrahman
no flags
Patch (4.81 KB, patch)
2015-03-17 21:54 PDT, Dhi Aurrahman
no flags
Archive of layout-test-results from ews105 for mac-mavericks-wk2 (922.33 KB, application/zip)
2015-03-17 22:11 PDT, Build Bot
no flags
Archive of layout-test-results from ews103 for mac-mavericks (831.50 KB, application/zip)
2015-03-17 22:43 PDT, Build Bot
no flags
Patch (4.80 KB, patch)
2015-03-18 20:52 PDT, Dhi Aurrahman
no flags
Patch (15.21 KB, patch)
2015-03-20 07:59 PDT, Dhi Aurrahman
darin: review+
Dhi Aurrahman
Comment 1 2015-03-09 10:00:51 PDT
Darin Adler
Comment 2 2015-03-10 10:30:26 PDT
Comment on attachment 248251 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=248251&action=review This is good, but there is significant room for improvement. I’m going to say review+ but I’d request that you consider my comments and provide a new patch unless you don’t find anything actionable. > Source/WTF/wtf/text/StringView.h:104 > + bool startsWithIgnoringASCIICase(StringView) const; We should add tests for this to Tools/TestWebKitAPI/Tests/WTF/StringView.cpp While some of our old classes aren’t tested well there, StringView is covered relatively well right now so it would be good to add a test when adding a function. > Source/WTF/wtf/text/StringView.h:359 > +inline bool StringView::startsWithIgnoringASCIICase(StringView a) const Why does this need to be inlined? Seems like a pretty long function; the similar StringImpl::startsWith is not inlined. If this is really performance critical, then we probably want separate loops for 8-bit and 16-bit; the common case is likely 8-bit so we’d presumably only want to inline that. Lets start with it not inlined and inline as needed based on performance testing. For functions with two arguments with no special meaning of the two arguments it’s OK to just name them “a” and “b”. But here, “a” is not a good name for the argument. I think it should be named “prefix”. > Source/WTF/wtf/text/StringView.h:362 > + if (a.isEmpty()) > + return false; This behavior doesn’t match String::startsWith. That function returns true if passed an empty string as the prefix, which makes sense and comes naturally out of the implementation without a special case. (Strangely, it returns false if passed a null string as the prefix; we might want to remove that quirk at some point.) I’d prefer if the semantics of startsWith and startsWithIgnoringASCIICase were identical. Or alternatively if we just left out this special case check since there’s no reason to pay a performance cost for this quirk; then we could fix the quirk in String::startsWith later. > Source/WTF/wtf/text/StringView.h:367 > + if (toASCIILower(a[i]) != toASCIILower(is8Bit() ? characters8()[i] : characters16()[i])) That is8Bit() thing is not needed. Should just write: if (toASCIILower(a[i]) != toASCIILower((*this)[i])) Unless we are explicitly writing separate loops for 8-bit and 16-bit. > Source/WebCore/css/SelectorCheckerTestFunctions.h:221 > + ASSERT(element); Strange to assert element here but not assert range. > Source/WebCore/css/SelectorCheckerTestFunctions.h:229 > + AtomicString language; > +#if ENABLE(VIDEO_TRACK) > + if (is<WebVTTElement>(*element)) > + language = downcast<WebVTTElement>(*element).language(); > + else > +#endif > + language = element->computeInheritedLanguage(); This strange code to get the language for an element seems like it belongs in a helper function, not in-line here. Is there no other place that has to implement this rule? > Source/WebCore/css/SelectorCheckerTestFunctions.h:232 > + if (language.isEmpty() || range->isEmpty() || language.length() < range->length()) > + return false; The check of language.isEmpty() here is redundant. If language is empty, the rest of that expression will definitely return true. > Source/WebCore/css/SelectorCheckerTestFunctions.h:243 > + if (languageStringView.length() == rangeStringView.length()) > + return true; This is dead code. Can’t ever happen because equalIgnoringASCIICase would have been true above. You should either remove this or remove the equalIgnoringASCIICase call above.
Benjamin Poulain
Comment 3 2015-03-10 13:58:59 PDT
Comment on attachment 248251 [details] Patch Nice! Did you test how much faster this is?
Dhi Aurrahman
Comment 4 2015-03-11 08:40:27 PDT
(In reply to comment #2) > Comment on attachment 248251 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=248251&action=review > > This is good, but there is significant room for improvement. I’m going to > say review+ but I’d request that you consider my comments and provide a new > patch unless you don’t find anything actionable. I will definitely visit each comment.
Dhi Aurrahman
Comment 5 2015-03-11 08:41:17 PDT
(In reply to comment #3) > Comment on attachment 248251 [details] > Patch > > Nice! > > Did you test how much faster this is? Before: https://cloudup.com/c0N0v-uBxAk After: https://cloudup.com/cQww6mEW32R
Benjamin Poulain
Comment 6 2015-03-12 23:43:40 PDT
(In reply to comment #5) > (In reply to comment #3) > > Comment on attachment 248251 [details] > > Patch > > > > Nice! > > > > Did you test how much faster this is? > > Before: https://cloudup.com/c0N0v-uBxAk > After: https://cloudup.com/cQww6mEW32R Damn, impressive how much this helps.
Dhi Aurrahman
Comment 7 2015-03-15 23:58:14 PDT
Dhi Aurrahman
Comment 8 2015-03-16 05:08:28 PDT
Dhi Aurrahman
Comment 9 2015-03-17 21:54:17 PDT
Build Bot
Comment 10 2015-03-17 22:11:39 PDT
Comment on attachment 248908 [details] Patch Attachment 248908 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.appspot.com/results/5955964486287360 New failing tests: fast/selectors/lang-valid-extended-filtering.html http/tests/misc/acid3.html fast/selectors/lang-specificity.html fast/selectors/lang-extended-filtering.html fast/selectors/lang-chained-multiple.html fast/selectors/lang-vs-xml-lang.html fast/selectors/lang-vs-xml-lang-xhtml.xhtml fast/dom/SelectorAPI/resig-SelectorsAPI-test.xhtml fast/selectors/lang-inheritance.html fast/selectors/lang-specificity-xml.xhtml
Build Bot
Comment 11 2015-03-17 22:11:41 PDT
Created attachment 248909 [details] Archive of layout-test-results from ews105 for mac-mavericks-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews105 Port: mac-mavericks-wk2 Platform: Mac OS X 10.9.5
Dhi Aurrahman
Comment 12 2015-03-17 22:24:50 PDT
It depends on StringView::startsWithIgnoringASCIICase https://bugs.webkit.org/show_bug.cgi?id=142772
Build Bot
Comment 13 2015-03-17 22:43:19 PDT
Comment on attachment 248908 [details] Patch Attachment 248908 [details] did not pass mac-ews (mac): Output: http://webkit-queues.appspot.com/results/5758712207638528 New failing tests: fast/selectors/lang-valid-extended-filtering.html http/tests/misc/acid3.html fast/selectors/lang-specificity.html fast/selectors/lang-extended-filtering.html fast/selectors/lang-chained-multiple.html fast/selectors/lang-vs-xml-lang.html fast/selectors/lang-vs-xml-lang-xhtml.xhtml fast/dom/SelectorAPI/resig-SelectorsAPI-test.xhtml fast/selectors/lang-inheritance.html fast/selectors/lang-specificity-xml.xhtml
Build Bot
Comment 14 2015-03-17 22:43:22 PDT
Created attachment 248911 [details] Archive of layout-test-results from ews103 for mac-mavericks The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews103 Port: mac-mavericks Platform: Mac OS X 10.9.5
Dhi Aurrahman
Comment 15 2015-03-18 20:52:51 PDT
Dhi Aurrahman
Comment 16 2015-03-18 21:13:33 PDT
Comment on attachment 249013 [details] Patch After r181717
Darin Adler
Comment 17 2015-03-19 08:23:35 PDT
Comment on attachment 249013 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=249013&action=review > Source/WebCore/css/SelectorCheckerTestFunctions.h:153 > +ALWAYS_INLINE AtomicString elementLanguage(const Element* element) This function should return const AtomicString&. WebVTTElement::language should return const AtomicString&. Element::computeInheritedLanguage should return const AtomicString&. Document::contentLanguage should return const AtomicString&. Document::m_contentLanguage should be AtomicString. Those changes will prevent us from doing a wasteful ref/deref every time. > Source/WebCore/css/SelectorCheckerTestFunctions.h:222 > + if (range->isEmpty() || language.length() < range->length()) > + return false; Neither of these checks seems necessary. Why are we optimizing the range empty case? Is that a common case that needs a fast path for some reason? It seems unlikely to ever happen in reasonable markup, so seems strange to optimize it. Why are we optimizing the case where the language is the same length as the prefix? Is that common? The startsWith function will correctly handle this, so we should only add another check for it if we have some reason to think we need this case to be even faster, and worth having redundant code. > Source/WebCore/css/SelectorCheckerTestFunctions.h:224 > + if (*range == "*") > + return true; I suggest checking this before calling elementLanguage(element). I also realize I am not fully understanding why we have to do this here. Is there no way to notice the string "*" at compile time and generate completely different code? > Source/WebCore/css/SelectorCheckerTestFunctions.h:227 > + StringView languageStringView = language.string(); > + StringView rangeStringView = range->string(); It seems overkill to create local variables just to call startWithIgnoringASCIICase. And it makes the whole function look more complicated than it should. Ideally we would just pass in *range to that function, but given the current state of things I think we’d need to pass range->string(). I think we should be adding String::view() and AtomicString::view() just to clean up code like this. Then it would be one of these: if (language.view().startsWithIgnoringASCIICase(range->view())) if (language.view().startsWithIgnoringASCIICase(*range)) The other uses of length and subscripting seem to me that they should just be done on the original strings directly. AtomicString has both of those operations. I’d feel differently if we had factored this whole thing out into a function that takes two StringView arguments.
Dhi Aurrahman
Comment 18 2015-03-19 19:51:21 PDT
(In reply to comment #17) > Comment on attachment 249013 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=249013&action=review > > > Source/WebCore/css/SelectorCheckerTestFunctions.h:153 > > +ALWAYS_INLINE AtomicString elementLanguage(const Element* element) > > This function should return const AtomicString&. WebVTTElement::language > should return const AtomicString&. Element::computeInheritedLanguage should > return const AtomicString&. Document::contentLanguage should return const > AtomicString&. Document::m_contentLanguage should be AtomicString. Those > changes will prevent us from doing a wasteful ref/deref every time. > > > Source/WebCore/css/SelectorCheckerTestFunctions.h:222 > > + if (range->isEmpty() || language.length() < range->length()) > > + return false; > > Neither of these checks seems necessary. > > Why are we optimizing the range empty case? Is that a common case that needs > a fast path for some reason? It seems unlikely to ever happen in reasonable > markup, so seems strange to optimize it. > > Why are we optimizing the case where the language is the same length as the > prefix? Is that common? The startsWith function will correctly handle this, > so we should only add another check for it if we have some reason to think > we need this case to be even faster, and worth having redundant code. OK. I will remove this. > > > Source/WebCore/css/SelectorCheckerTestFunctions.h:224 > > + if (*range == "*") > > + return true; > > I suggest checking this before calling elementLanguage(element). There were discussion: If the element's language is empty, it will always give "not match", even the range is "*". > > I also realize I am not fully understanding why we have to do this here. Is > there no way to notice the string "*" at compile time and generate > completely different code? > > > Source/WebCore/css/SelectorCheckerTestFunctions.h:227 > > + StringView languageStringView = language.string(); > > + StringView rangeStringView = range->string(); > > It seems overkill to create local variables just to call > startWithIgnoringASCIICase. And it makes the whole function look more > complicated than it should. > > Ideally we would just pass in *range to that function, but given the current > state of things I think we’d need to pass range->string(). I think we should > be adding String::view() and AtomicString::view() just to clean up code like > this. Then it would be one of these: > > if (language.view().startsWithIgnoringASCIICase(range->view())) > if (language.view().startsWithIgnoringASCIICase(*range)) > > The other uses of length and subscripting seem to me that they should just > be done on the original strings directly. AtomicString has both of those > operations. I’d feel differently if we had factored this whole thing out > into a function that takes two StringView arguments.
Dhi Aurrahman
Comment 19 2015-03-20 07:59:07 PDT
Dhi Aurrahman
Comment 20 2015-03-20 08:02:56 PDT
Comment on attachment 249114 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=249114&action=review > Source/WebCore/css/SelectorCheckerTestFunctions.h:226 > + if (language.view().startsWithIgnoringASCIICase(range->view())) { If both the language and range are empty, it will give true and the length of both language and range are the same. So I guess, checking language.isEmpty() is still needed.
Benjamin Poulain
Comment 21 2015-03-23 13:54:37 PDT
(In reply to comment #17) > > Source/WebCore/css/SelectorCheckerTestFunctions.h:227 > > + StringView languageStringView = language.string(); > > + StringView rangeStringView = range->string(); > > It seems overkill to create local variables just to call > startWithIgnoringASCIICase. And it makes the whole function look more > complicated than it should. > > Ideally we would just pass in *range to that function, but given the current > state of things I think we’d need to pass range->string(). I think we should > be adding String::view() and AtomicString::view() just to clean up code like > this. Then it would be one of these: > > if (language.view().startsWithIgnoringASCIICase(range->view())) > if (language.view().startsWithIgnoringASCIICase(*range)) > > The other uses of length and subscripting seem to me that they should just > be done on the original strings directly. AtomicString has both of those > operations. I’d feel differently if we had factored this whole thing out > into a function that takes two StringView arguments. I am unconvinced. When you have to start thinking about the inline/out-of-line functions, things get really complicated. Creating a StringView from a String is a lot of useless operations when not eliminated by the compiler (which would be the case here since startWith() is too big). Darin, what do you think of having startsWithIgnoringASCIICase(StringView) on String? Alternatively, startsWithIgnoringASCIICase() from StringCommon handle any combination of String/StringView. The problem with it is we cannot tune the inlining the way we optimize StringImpl.
Darin Adler
Comment 22 2015-03-31 09:59:44 PDT
Comment on attachment 249114 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=249114&action=review I’m saying review+ but I am not happy with the one place here where we are using view() so I think we still need more refinement on this. > Source/WTF/wtf/text/AtomicString.cpp:587 > +StringView AtomicString::view() const > +{ > + return m_string.view(); > +} This needs to be inlined. I don’t think we want function call overhead every time this is called. > Source/WTF/wtf/text/WTFString.cpp:876 > +StringView String::view() const > +{ > + return StringView(*this); > +} This needs to be inlined. I don’t think we want function call overhead every time this is called. >> Source/WebCore/css/SelectorCheckerTestFunctions.h:226 >> + if (language.view().startsWithIgnoringASCIICase(range->view())) { > > If both the language and range are empty, it will give true and the length of both language and range are the same. So I guess, checking language.isEmpty() is still needed. This isn’t great. We do want the operations to work on StringView, but explicitly calling view on both arguments here seems really weak. I’d be happier with the function as a free function that we can overload as needed. Code would read better with early return, I think. if (!language.view().startsWithIgnoringASCIICase(range->view())) Sorry for all the headache about StringView, though. We want functions that work on StringView, but at call sites we want people to call functions with the “natural” types and let the inlining be controlled by overloads in the WTF implementation. Explicit conversion with view() defeats future optimization. > Source/WebCore/cssjit/SelectorCompiler.cpp:3277 > +bool SelectorCodeGenerator::isAsteriskRange(const Vector<AtomicString>* ranges) > +{ > + ASSERT(ranges); > + return ranges->find("*") != notFound; > +} I’m concerned about all the functions that take “non-null pointers”. The correct idiom for this in WebKit is to use references. If we are using pointers because this is very low level and used directly by the code generator, that’s fine, but once we are passing on to helper functions that aren’t referenced directly by the generated code, we should be using references, as opposed to “pointers guaranteed not to be null”. This is an example that should take a reference.
Note You need to log in before you can comment on or make changes to this bug.