Summary: | [JSC] Add ASCII comparison fast path for IntlCollator | ||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Yusuke Suzuki <ysuzuki> | ||||||||||||||||||||||||
Component: | New Bugs | Assignee: | Yusuke Suzuki <ysuzuki> | ||||||||||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||||||||||
Severity: | Normal | CC: | 7raivis, darin, ews-watchlist, keith_miller, mark.lam, msaboff, ross.kirsling, saam, tzagallo, webkit-bug-importer | ||||||||||||||||||||||||
Priority: | P2 | Keywords: | InRadar | ||||||||||||||||||||||||
Version: | WebKit Nightly Build | ||||||||||||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||||||||||||
OS: | Unspecified | ||||||||||||||||||||||||||
Bug Depends on: | |||||||||||||||||||||||||||
Bug Blocks: | 213425 | ||||||||||||||||||||||||||
Attachments: |
|
Description
Yusuke Suzuki
2020-08-24 23:16:20 PDT
Created attachment 407175 [details]
Patch
Created attachment 407176 [details]
Patch
Created attachment 407177 [details]
Patch
Comment on attachment 407177 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=407177&action=review > Source/JavaScriptCore/runtime/IntlCollator.cpp:294 > + if (x.isAllASCII() && y.isAllASCII()) { see comment right below, but I'd write this as: if (canDoASCIIUCADUCETComparison() && x.isAllASCII() && y.isAllASCII()) > Source/JavaScriptCore/runtime/IntlCollator.cpp:295 > + if (canDoASCIIUCADUCETComparison()) { shouldn't be the first thing to check for the speed of not having to check the entire strings if they're ASCII when in a UCollator that can't go down the fast path? > Source/JavaScriptCore/runtime/IntlCollator.cpp:396 > +static bool canDoASCIIUCADUCETComparisonWithUCollator(UCollator& collator) should some of this code live in WTF? > Source/JavaScriptCore/runtime/IntlCollator.cpp:409 > + UErrorCode status = U_ZERO_ERROR; should this go inside the loop? > Source/JavaScriptCore/runtime/IntlCollator.cpp:411 > + int32_t index = 0; remove > Source/JavaScriptCore/runtime/IntlCollator.cpp:417 > + index++; remove > Source/JavaScriptCore/runtime/IntlObjectInlines.h:126 > + auto compareCharacter = [](CharacterType1 lhs, CharacterType2 rhs) { should we inline this below just to ensure that we don't actually have the "result != 0" branch in case the compiler doesn't optimize it away? > Source/JavaScriptCore/runtime/IntlObjectInlines.h:137 > + while (true) { why not just: "while (position < commonLength)"? Comment on attachment 407177 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=407177&action=review >> Source/JavaScriptCore/runtime/IntlCollator.cpp:295 >> + if (canDoASCIIUCADUCETComparison()) { > > shouldn't be the first thing to check for the speed of not having to check the entire strings if they're ASCII when in a UCollator that can't go down the fast path? No, this has the ucol_strcollUTF8 optimization too, not just compareASCIIWithUACDUCET. > Source/JavaScriptCore/runtime/IntlCollator.cpp:297 > + Vector<LChar> xBuffer; > + Vector<LChar> yBuffer; If strings are often short, inline capacity could avoid two malloc/free operations. Also, could we limit the buffers to the shorter of the two lengths, or the shorter of the two lengths + 1? Maybe we don’t need to copy leading characters that are equal in both strings? I guess a better proposal is to call compareASCIIWithUACDUCET on the characters with the widths they currently have rather than allocating memory to narrow them to 8-bit before calling. if (x.is8Bit() && y.is8Bit()) return static_cast<UCollationResult>(compareASCIIWithUACDUCET(m_collator.get(), x.characters8(), x.length(), y.characters8(), y.length())); if (x.is8Bit()) return static_cast<UCollationResult>(compareASCIIWithUACDUCET(m_collator.get(), x.characters8(), x.length(), y.characters16(), y.length())); if (y.is8Bit()) return static_cast<UCollationResult>(compareASCIIWithUACDUCET(m_collator.get(), x.characters16(), x.length(), y.characters8(), y.length())); return static_cast<UCollationResult>(compareASCIIWithUACDUCET(m_collator.get(), x.characters16(), x.length(), y.characters16(), y.length())); > Source/JavaScriptCore/runtime/IntlCollator.cpp:298 > + auto getCharacters8 = [&] (const StringView& view, Vector<LChar>& buffer) -> const LChar* { Can this use a return value instead of an out argument? Or will the vector operation be too expensive? > Source/JavaScriptCore/runtime/IntlCollator.cpp:301 > + buffer.resize(view.length()); This should be grow rather than resize. > Source/JavaScriptCore/runtime/IntlCollator.cpp:449 > + m_canDoASCIIUCADUCETComparison = result ? TriState::True : TriState::False; m_canDoASCIIUCADUCETComparison = triState(result); > Source/JavaScriptCore/runtime/IntlCollator.cpp:456 > +void IntlCollator::checkICULocaleInvariants(const HashSet<String>& locales) > +{ > + UNUSED_PARAM(locales); > +#if ASSERT_ENABLED Small style idea. Consider putting this in the header: #if !ASSERT_ENABLED inline void IntlCollator::checkICULocaleInvariants(const HashSet<String>&) { } #endif Then here just write: #if ASSERT_ENABLED void IntlCollator::checkICULocaleInvariants(const HashSet<String>& locales) { } #endif Without the UNUSED_PARAM. > Source/JavaScriptCore/runtime/IntlCollator.cpp:470 > + dataLogLn("BAD ", locale, " ", makeString(hex(x)), "(", String(xstring, 1), ") <=> ", makeString(hex(y)), "(", String(ystring, 1), ") ICU:(", static_cast<int32_t>(resultICU), "),WTF:(", static_cast<int32_t>(resultWTF), ")"); These should be StringView, no need to create and destroy a string each time. Comment on attachment 407177 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=407177&action=review > Source/JavaScriptCore/runtime/IntlCollator.cpp:293 > + UCollationResult result = ([&]() -> UCollationResult { How come this is an immediately-invoked lambda? Is there a benefit over using a block? > Source/JavaScriptCore/runtime/IntlCollator.cpp:456 > +void IntlCollator::checkICULocaleInvariants(const HashSet<String>& locales) > +{ > + UNUSED_PARAM(locales); > +#if ASSERT_ENABLED I can see how this was useful to figure out ICU's current behavior and how it would provide a way to immediately see if that changes, but it seems rather heavy to be treated as "just some assertions"? > Source/JavaScriptCore/runtime/IntlCollator.cpp:458 > + ([&] { Here too I'm not quite sure about the immediately-invoked lambda. > Source/JavaScriptCore/runtime/IntlObjectInlines.h:145 > + ++characters1; > + ++characters2; > + ++position; Is this better than just indexing on position? Comment on attachment 407177 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=407177&action=review Thanks! >> Source/JavaScriptCore/runtime/IntlCollator.cpp:293 >> + UCollationResult result = ([&]() -> UCollationResult { > > How come this is an immediately-invoked lambda? Is there a benefit over using a block? This is for early returning. This makes code readable. >> Source/JavaScriptCore/runtime/IntlCollator.cpp:294 >> + if (x.isAllASCII() && y.isAllASCII()) { > > see comment right below, but I'd write this as: > > if (canDoASCIIUCADUCETComparison() && x.isAllASCII() && y.isAllASCII()) There is ucol_strcollUTF8 case. So we anyway need to perform `x.isAllASCII() && y.isAllASCII()` first. >>> Source/JavaScriptCore/runtime/IntlCollator.cpp:295 >>> + if (canDoASCIIUCADUCETComparison()) { >> >> shouldn't be the first thing to check for the speed of not having to check the entire strings if they're ASCII when in a UCollator that can't go down the fast path? > > No, this has the ucol_strcollUTF8 optimization too, not just compareASCIIWithUACDUCET. Yeah, this is necessary since there is ucol_strcollUTF8 path. >> Source/JavaScriptCore/runtime/IntlCollator.cpp:297 >> + Vector<LChar> yBuffer; > > If strings are often short, inline capacity could avoid two malloc/free operations. Also, could we limit the buffers to the shorter of the two lengths, or the shorter of the two lengths + 1? Maybe we don’t need to copy leading characters that are equal in both strings? > > I guess a better proposal is to call compareASCIIWithUACDUCET on the characters with the widths they currently have rather than allocating memory to narrow them to 8-bit before calling. > > if (x.is8Bit() && y.is8Bit()) > return static_cast<UCollationResult>(compareASCIIWithUACDUCET(m_collator.get(), x.characters8(), x.length(), y.characters8(), y.length())); > if (x.is8Bit()) > return static_cast<UCollationResult>(compareASCIIWithUACDUCET(m_collator.get(), x.characters8(), x.length(), y.characters16(), y.length())); > if (y.is8Bit()) > return static_cast<UCollationResult>(compareASCIIWithUACDUCET(m_collator.get(), x.characters16(), x.length(), y.characters8(), y.length())); > return static_cast<UCollationResult>(compareASCIIWithUACDUCET(m_collator.get(), x.characters16(), x.length(), y.characters16(), y.length())); Sounds good. Changing. >> Source/JavaScriptCore/runtime/IntlCollator.cpp:298 >> + auto getCharacters8 = [&] (const StringView& view, Vector<LChar>& buffer) -> const LChar* { > > Can this use a return value instead of an out argument? Or will the vector operation be too expensive? This is because we do not want to allocate Vector if view is 8Bit. I'll specialize function instead of doing this here. >> Source/JavaScriptCore/runtime/IntlCollator.cpp:301 >> + buffer.resize(view.length()); > > This should be grow rather than resize. I'll change getCharacters16 side. >> Source/JavaScriptCore/runtime/IntlCollator.cpp:396 >> +static bool canDoASCIIUCADUCETComparisonWithUCollator(UCollator& collator) > > should some of this code live in WTF? For now, I think putting this function in IntlCollator is better because the following options are derived from the default options used in Intl.Collator. And currently, I don't think any other WTF client wants to use it. If we have such a client, then I think we should move it to WTF. >> Source/JavaScriptCore/runtime/IntlCollator.cpp:409 >> + UErrorCode status = U_ZERO_ERROR; > > should this go inside the loop? Moved. >> Source/JavaScriptCore/runtime/IntlCollator.cpp:411 >> + int32_t index = 0; > > remove Removed. >> Source/JavaScriptCore/runtime/IntlCollator.cpp:417 >> + index++; > > remove Removed. >> Source/JavaScriptCore/runtime/IntlCollator.cpp:449 >> + m_canDoASCIIUCADUCETComparison = result ? TriState::True : TriState::False; > > m_canDoASCIIUCADUCETComparison = triState(result); Fixed. >>> Source/JavaScriptCore/runtime/IntlCollator.cpp:456 >>> +#if ASSERT_ENABLED >> >> Small style idea. Consider putting this in the header: >> >> #if !ASSERT_ENABLED >> inline void IntlCollator::checkICULocaleInvariants(const HashSet<String>&) { } >> #endif >> >> Then here just write: >> >> #if ASSERT_ENABLED >> >> void IntlCollator::checkICULocaleInvariants(const HashSet<String>& locales) >> { >> } >> >> #endif >> >> Without the UNUSED_PARAM. > > I can see how this was useful to figure out ICU's current behavior and how it would provide a way to immediately see if that changes, but it seems rather heavy to be treated as "just some assertions"? I think this function is very important assertion. If the invariant is broken, we should fix the code, and without this exhaustive check, we cannot notice ICU is changed. We should exhaustively check ICU behavior here to catch future changes in ICU. >> Source/JavaScriptCore/runtime/IntlCollator.cpp:458 >> + ([&] { > > Here too I'm not quite sure about the immediately-invoked lambda. In this case, I can use `continue` instead. Changed. >> Source/JavaScriptCore/runtime/IntlCollator.cpp:470 >> + dataLogLn("BAD ", locale, " ", makeString(hex(x)), "(", String(xstring, 1), ") <=> ", makeString(hex(y)), "(", String(ystring, 1), ") ICU:(", static_cast<int32_t>(resultICU), "),WTF:(", static_cast<int32_t>(resultWTF), ")"); > > These should be StringView, no need to create and destroy a string each time. Fixed. >> Source/JavaScriptCore/runtime/IntlObjectInlines.h:126 >> + auto compareCharacter = [](CharacterType1 lhs, CharacterType2 rhs) { > > should we inline this below just to ensure that we don't actually have the "result != 0" branch in case the compiler doesn't optimize it away? Sounds like inlining this manually is better. Changed. >> Source/JavaScriptCore/runtime/IntlObjectInlines.h:137 >> + while (true) { > > why not just: "while (position < commonLength)"? This sounds good, fixed. >> Source/JavaScriptCore/runtime/IntlObjectInlines.h:145 >> + ++position; > > Is this better than just indexing on position? We can just use position. Created attachment 407219 [details]
Patch
Comment on attachment 407177 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=407177&action=review >>>> Source/JavaScriptCore/runtime/IntlCollator.cpp:295 >>>> + if (canDoASCIIUCADUCETComparison()) { >>> >>> shouldn't be the first thing to check for the speed of not having to check the entire strings if they're ASCII when in a UCollator that can't go down the fast path? >> >> No, this has the ucol_strcollUTF8 optimization too, not just compareASCIIWithUACDUCET. > > Yeah, this is necessary since there is ucol_strcollUTF8 path. yeah I missed that path. Ignore comment Created attachment 407236 [details]
Patch
Created attachment 407239 [details]
Patch
Created attachment 407240 [details]
Patch
Comment on attachment 407239 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=407239&action=review > Source/JavaScriptCore/runtime/IntlCollator.cpp:297 > + if (x.isAllSpecialCharacters<static_cast<bool*(UChar)>(canUseFastPath)>() && y.isAllSpecialCharacters<static_cast<bool*(UChar)>(canUseFastPath)>()) { Do the type conversion with another local variable? Could get rid of the need for static_cast. bool (*canUseFastPathPredicate)(UChar) = canUseFastPath; > Source/JavaScriptCore/runtime/IntlCollator.cpp:309 > + if (x.is8Bit() && y.is8Bit()) > + return ucol_strcollUTF8(m_collator.get(), bitwise_cast<const char*>(x.characters8()), x.length(), bitwise_cast<const char*>(y.characters8()), y.length(), &status); This could be done even in cases involving control characters. Is it worth doing that? > Source/JavaScriptCore/runtime/IntlCollator.cpp:324 > + auto getCharacters16 = [&] (const StringView& view, Vector<UChar>& buffer) -> const UChar* { > if (!view.is8Bit()) > return view.characters16(); > - buffer.resize(view.length()); > + buffer.grow(view.length()); > StringImpl::copyCharacters(buffer.data(), view.characters8(), view.length()); > return buffer.data(); > }; > > Vector<UChar> xBuffer; > Vector<UChar> yBuffer; > - const UChar* xCharacters = getCharacters(x, xBuffer); > - const UChar* yCharacters = getCharacters(y, yBuffer); > - result = ucol_strcoll(m_collator.get(), xCharacters, x.length(), yCharacters, y.length()); > - } > + const UChar* xCharacters = getCharacters16(x, xBuffer); > + const UChar* yCharacters = getCharacters16(y, yBuffer); > + return ucol_strcoll(m_collator.get(), xCharacters, x.length(), yCharacters, y.length()); The StringView::upconvertedCharacters function is designed to that uses like this are easier to write: return ucol_strcoll(m_collator.get(), x.upconvertedCharacters(), x.length(), y.upconvertedCharacters(), y.length()); Will that work? > Source/JavaScriptCore/runtime/IntlCollator.cpp:403 > + // We do not check UCOL_NORMALIZATION_MODE status since FCD normalization do nothing for ASCII strings. "do nothing" -> "does nothing" > Source/JavaScriptCore/runtime/IntlCollator.cpp:408 > + UColAttributeValue result = ucol_getAttribute(&collator, pair.first, &status); auto > Source/JavaScriptCore/runtime/IntlCollator.cpp:414 > + // Check existence of tailoring rules. If it does not exist, collation algorithm is UAC DUCET. "it does not exist" -> "they do not exist" > Source/JavaScriptCore/runtime/IntlCollator.h:71 > + if (m_canDoASCIIUCADUCETComparison != TriState::Indeterminate) > + return m_canDoASCIIUCADUCETComparison == TriState::True; > + return canDoASCIIUCADUCETComparisonSlow(); I prefer to a style like this: if (m_canDoASCIIUCADUCETComparison == TriState::Indeterminate) updateCanDoASCIIUCADUCETComparison(); return m_canDoASCIIUCADUCETComparison == TriState::True; > Source/JavaScriptCore/runtime/IntlObjectInlines.h:121 > +inline int compareASCIIWithUACDUCET(const CharacterType1* characters1, unsigned length1, const CharacterType2* characters2, unsigned length2) Caller wants UCollationResult. Why not return it here instead of putting the typecast at the call location. Created attachment 407241 [details]
Patch
Created attachment 407243 [details]
Patch
Comment on attachment 407239 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=407239&action=review >> Source/JavaScriptCore/runtime/IntlCollator.cpp:297 >> + if (x.isAllSpecialCharacters<static_cast<bool*(UChar)>(canUseFastPath)>() && y.isAllSpecialCharacters<static_cast<bool*(UChar)>(canUseFastPath)>()) { > > Do the type conversion with another local variable? Could get rid of the need for static_cast. > > bool (*canUseFastPathPredicate)(UChar) = canUseFastPath; It turned out this is not compiled well in GCC. I'll just define a function. >> Source/JavaScriptCore/runtime/IntlCollator.cpp:309 >> + return ucol_strcollUTF8(m_collator.get(), bitwise_cast<const char*>(x.characters8()), x.length(), bitwise_cast<const char*>(y.characters8()), y.length(), &status); > > This could be done even in cases involving control characters. Is it worth doing that? I think this is not worth doing. Control characters are very rare. Calling isAllASCII for control characters could hurt UTF-16 case performance. >> Source/JavaScriptCore/runtime/IntlCollator.cpp:324 >> + return ucol_strcoll(m_collator.get(), xCharacters, x.length(), yCharacters, y.length()); > > The StringView::upconvertedCharacters function is designed to that uses like this are easier to write: > > return ucol_strcoll(m_collator.get(), x.upconvertedCharacters(), x.length(), y.upconvertedCharacters(), y.length()); > > Will that work? It seems that upconvertedCharacters can just work. Changed. >> Source/JavaScriptCore/runtime/IntlCollator.cpp:403 >> + // We do not check UCOL_NORMALIZATION_MODE status since FCD normalization do nothing for ASCII strings. > > "do nothing" -> "does nothing" Fixed. >> Source/JavaScriptCore/runtime/IntlCollator.cpp:408 >> + UColAttributeValue result = ucol_getAttribute(&collator, pair.first, &status); > > auto Changed. >> Source/JavaScriptCore/runtime/IntlCollator.cpp:414 >> + // Check existence of tailoring rules. If it does not exist, collation algorithm is UAC DUCET. > > "it does not exist" -> "they do not exist" Fixed. >> Source/JavaScriptCore/runtime/IntlCollator.h:71 >> + return canDoASCIIUCADUCETComparisonSlow(); > > I prefer to a style like this: > > if (m_canDoASCIIUCADUCETComparison == TriState::Indeterminate) > updateCanDoASCIIUCADUCETComparison(); > return m_canDoASCIIUCADUCETComparison == TriState::True; Changed. >> Source/JavaScriptCore/runtime/IntlObjectInlines.h:121 >> +inline int compareASCIIWithUACDUCET(const CharacterType1* characters1, unsigned length1, const CharacterType2* characters2, unsigned length2) > > Caller wants UCollationResult. Why not return it here instead of putting the typecast at the call location. Originally, it was aligned to WTF's generic codePointCompare. But since this function is only used by IntlCollator, using UCollationResult is OK. Changed. Created attachment 407246 [details]
Patch
Created attachment 407261 [details]
Patch
Comment on attachment 407261 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=407261&action=review > Source/JavaScriptCore/runtime/IntlObjectInlines.h:125 > + static_assert(UCOL_EQUAL == 0); > + static_assert(UCOL_GREATER == 1); > + static_assert(UCOL_LESS == -1); Don’t need these any more. > Source/JavaScriptCore/runtime/IntlObjectInlines.h:129 > + CharacterType1 lhs = characters1[position]; > + CharacterType2 rhs = characters2[position]; auto > Source/JavaScriptCore/runtime/IntlObjectInlines.h:144 > + return (length1 > length2) ? UCOL_GREATER : UCOL_LESS; No parentheses needed here (match the weight if statement above). Comment on attachment 407261 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=407261&action=review Thanks! >> Source/JavaScriptCore/runtime/IntlObjectInlines.h:125 >> + static_assert(UCOL_LESS == -1); > > Don’t need these any more. Removed. >> Source/JavaScriptCore/runtime/IntlObjectInlines.h:129 >> + CharacterType2 rhs = characters2[position]; > > auto Fixed. >> Source/JavaScriptCore/runtime/IntlObjectInlines.h:144 >> + return (length1 > length2) ? UCOL_GREATER : UCOL_LESS; > > No parentheses needed here (match the weight if statement above). Removed Committed r266178: <https://trac.webkit.org/changeset/266178> *** Bug 230199 has been marked as a duplicate of this bug. *** |