RESOLVED FIXED Bug 200565
XMLHttpRequest: getAllResponseHeaders() sorting
https://bugs.webkit.org/show_bug.cgi?id=200565
Summary XMLHttpRequest: getAllResponseHeaders() sorting
Anne van Kesteren
Reported 2019-08-09 01:14:12 PDT
As discussed in https://github.com/whatwg/xhr/issues/248 getAllResponseHeaders() should sort headers using the ASCII-uppercase header name as key (the end result is still ASCII-lowercase header names). This should not affect the Headers object. (Perhaps there can be a Fetch Component for issues like this?)
Attachments
Patch (4.62 KB, patch)
2020-03-20 14:12 PDT, Rob Buis
no flags
Patch (8.38 KB, patch)
2020-03-20 14:49 PDT, Rob Buis
no flags
Patch (13.13 KB, patch)
2020-03-21 09:39 PDT, Rob Buis
no flags
Patch (13.11 KB, patch)
2020-03-21 12:01 PDT, Rob Buis
no flags
Patch (15.05 KB, patch)
2020-03-22 13:26 PDT, Rob Buis
no flags
Patch (13.77 KB, patch)
2020-03-23 00:58 PDT, Rob Buis
no flags
Patch (13.86 KB, patch)
2020-03-23 09:16 PDT, Rob Buis
no flags
Rob Buis
Comment 1 2020-03-20 14:12:51 PDT
Rob Buis
Comment 2 2020-03-20 14:49:24 PDT
Darin Adler
Comment 3 2020-03-20 15:00:33 PDT
Comment on attachment 394132 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=394132&action=review > Source/WebCore/xml/XMLHttpRequest.cpp:822 > + for (auto& header : m_response.httpHeaderFields()) > + headers.uncheckedAppend(std::make_pair(header.key.convertToASCIIUppercase(), header.value)); > + > + std::sort(headers.begin(), headers.end(), [] (const std::pair<String, String>& x, const std::pair<String, String>& y) { > + return WTF::codePointCompareLessThan(x.first, y.first); > + }); Seems a little silly that we have to actually make a copy of the string converted to ASCII uppercase just to do *comparison* based on the uppercased letters; should be easy to write a custom codePointCompareLessThan that converts ASCII characters to uppercase one by one as we go. But fine for now, I suppose. > Source/WebCore/xml/XMLHttpRequest.cpp:832 > + for (auto& header : headers) { > + stringBuilder.append(header.first.convertToASCIILowercase()); > + stringBuilder.append(':'); > + stringBuilder.append(' '); > + stringBuilder.append(header.second); > + stringBuilder.append('\r'); > + stringBuilder.append('\n'); > + } Here’s how to write this in modern code: for (auto& header : headers) stringBuilder.append(header.first.convertToASCIILowercase(), ": ", header.second, "\r\n"); It’s not only easier to read than the old code but also more efficient. If we really wanted to go crazy with optimization we could come up with a manipulator that allows you to append a string and convert it ASCII lowercase in the process of doing the appending without allocating and destroying a string every time. But we probably don’t have it already sitting on the shelf.
Darin Adler
Comment 4 2020-03-20 15:07:52 PDT
(In reply to Darin Adler from comment #3) > If we really wanted to go crazy with optimization we could come up with a > manipulator that allows you to append a string and convert it ASCII > lowercase in the process of doing the appending without allocating and > destroying a string every time. But we probably don’t have it already > sitting on the shelf. Let me know if you decide you really want to do that optimization. Want to learn how to write this. It would integrate with string concatenation the way the pad() and PaddingSpecification from StringConcatenate.h does, although much more simply. The actual concatenation would be much like the code in StringTypeAdapter<String, void> except instead of StringView::getCharactersWithUpconvert we'd write a function that calls toASCIIUpper on each character as we go.
Rob Buis
Comment 5 2020-03-21 09:39:50 PDT
Rob Buis
Comment 6 2020-03-21 12:01:16 PDT
Rob Buis
Comment 7 2020-03-21 14:07:54 PDT
Comment on attachment 394132 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=394132&action=review >> Source/WebCore/xml/XMLHttpRequest.cpp:832 >> + } > > Here’s how to write this in modern code: > > for (auto& header : headers) > stringBuilder.append(header.first.convertToASCIILowercase(), ": ", header.second, "\r\n"); > > It’s not only easier to read than the old code but also more efficient. > > If we really wanted to go crazy with optimization we could come up with a manipulator that allows you to append a string and convert it ASCII lowercase in the process of doing the appending without allocating and destroying a string every time. But we probably don’t have it already sitting on the shelf. I took a stab at that, let me know what you tihink.
Darin Adler
Comment 8 2020-03-21 17:16:24 PDT
Comment on attachment 394172 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=394172&action=review Neat that you did the case conversion manipulator! Nice job, too. > Source/WTF/wtf/text/StringConcatenate.h:290 > +struct ToASCIICaseConvertor { Seems like this name should be ASCIICaseConverter or ASCIICaseConversionSpecification. I don’t know why we’d use the unconventional spelling "convertor" instead of the more conventional "converter" and I don’t think the converter converts "to" ASCII case, it converts ASCII case. > Source/WTF/wtf/text/StringConcatenate.h:299 > +template<StringView::CaseConvertType type> > +ToASCIICaseConvertor toASCIICase(StringView string) > +{ > + return { type, string }; > +} I think it would be more elegant to have manipulators named "lowercase" and "uppercase" or "lowercased" and "uppercased" rather than toASCIICase<StringView::CaseConvertType::Lower> and toASCIICase<StringView::CaseConvertType::Upper>. The analogy here is with "pad". > Source/WTF/wtf/text/StringConcatenate.h:313 > + WTF_STRINGTYPEADAPTER_COPIED_WTF_STRING(); This is not correct and should be omitted. > Source/WTF/wtf/text/StringConcatenate.h:317 > + const ToASCIICaseConvertor& m_toASCIICaseConvertor; Just m_convertor or m_converter would be a fine name. There’s plenty of context and we don’t need to repeat it all. > Source/WTF/wtf/text/StringView.cpp:249 > + static_assert(std::is_same<SourceCharacterType, LChar>::value || std::is_same<SourceCharacterType, UChar>::value); > + static_assert(std::is_same<DestinationCharacterType, LChar>::value || std::is_same<DestinationCharacterType, UChar>::value); Could also static_assert that sizeof(DestinationCharacterType) <= sizeof(SourceCharacterType), since this would not work properly if destination was LChar and source was UChar. > Source/WTF/wtf/text/StringView.cpp:251 > + destination[i] = type == StringView::CaseConvertType::Lower ? toASCIILower(source[i]) : toASCIIUpper(source[i]); Maybe this conditional should go outside the loop? I am not sure the compiler will be smart enough to hoist the branch out of the loop. > Source/WTF/wtf/text/StringView.h:123 > + enum class CaseConvertType { Upper, Lower }; It’s quite inelegant that we have this separately in every string class! Should clean that up at some point. > Source/WebCore/xml/XMLHttpRequest.cpp:818 > + headers.uncheckedAppend(std::make_pair(header.key.convertToASCIIUppercase(), header.value)); No need to convert to ASCII uppercase. We can write a version of WTF::codePointCompareLessThan that does that as part of the comparison process. I mentioned that in my last review. Did you decide against it, or just miss the comment? > Source/WebCore/xml/XMLHttpRequest.cpp:821 > + return WTF::codePointCompareLessThan(x.first, y.first); Here’s one cut at it: unsigned xLength = x.first.length(); unsigned yLength = y.first.length(); unsigned commonLength = std::min(xLength, yLength); for (unsigned i = 0; i < commonLength; ++i) { auto xCharacter = toASCIIUpper(x.first[i]); auto yCharacter = toASCIIUpper(y.first[i]); if (xCharacter != yCharacter) return xCharacter < yCharacter); } return xLength < yLength; Could optimize more, but maybe we don’t need to. The big win is not copying the strings.
Rob Buis
Comment 9 2020-03-22 13:26:02 PDT
Darin Adler
Comment 10 2020-03-22 13:40:39 PDT
Comment on attachment 394227 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=394227&action=review > Source/WTF/wtf/text/StringConcatenate.h:290 > +struct ToASCIICaseConverter { I suggest removing the "To" from this name. > Source/WTF/wtf/text/StringConcatenate.h:291 > + StringView::CaseConvertType caseConvertType; How about just the name "type" for this data member? > Source/WTF/wtf/text/StringConcatenate.h:306 > + bool is8Bit() const { return m_converter.string.isNull() || m_converter.string.is8Bit(); } No need for the isNull check here. Both String::is8Bit and StringView::is8Bit already return true for null strings. > Source/WTF/wtf/text/StringConcatenate.h:394 > using WTF::makeString; > using WTF::pad; > +using WTF::lowercase; > +using WTF::uppercase; Alphabetical order would put lowercase before makeString, not after pad. > Source/WTF/wtf/text/StringView.cpp:250 > + static_assert(std::is_same<DestinationCharacterType, LChar>::value || std::is_same<DestinationCharacterType, UChar>::value); You didn’t add the static_assert I suggested. static_assert(sizeof(DestinationCharacterType) <= sizeof(SourceCharacterType)); Did you try and decide not to? > Source/WTF/wtf/text/StringView.cpp:251 > + auto fun = (type == StringView::CaseConvertType::Lower) ? toASCIILower<SourceCharacterType> : toASCIIUpper<SourceCharacterType>; WebKit coding style says use words, to "function" rather than "fun". > Source/WTF/wtf/text/StringView.cpp:406 > +ToASCIICaseConverter lowercase(StringView string) > +{ > + return { StringView::CaseConvertType::Lower, string }; > +} > + > +ToASCIICaseConverter uppercase(StringView string) > +{ > + return { StringView::CaseConvertType::Upper, string }; > +} I don’t think these will work correctly. We’re returning a reference to the argument, and that argument will be destroyed when the function returns. I think these need to take const StringView& instead of StringView. Also probably should make these inline. > Source/WTF/wtf/text/WTFString.h:439 > +bool codePointCompareLessThanCaseInsensitive(const String&, const String&); This name isn’t right. 1) Case insensitive is not the same as ASCII case insensitive. 2) ASCII case insensitive is a way of comparing for equality, not ordering. When doing a code point comparison, it matters whether the ASCII letters are being uppercased or lowercased -- those two orderings are different when letters are mixed with non-letters. So we need a name that mentions ASCII and mentions uppercasing. I’d be tempted to just put this function in place in the file using it and not adding it to String.h at all. > Source/WTF/wtf/text/WTFString.h:614 > +inline bool codePointCompareLessThanCaseInsensitive(const String& a, const String& b) > +{ > + unsigned xLength = a.length(); > + unsigned yLength = b.length(); > + unsigned commonLength = std::min(xLength, yLength); > + for (unsigned i = 0; i < commonLength; ++i) { > + auto xCharacter = toASCIIUpper(a[i]); > + auto yCharacter = toASCIIUpper(b[i]); > + if (xCharacter != yCharacter) > + return xCharacter < yCharacter; > + } > + return xLength < yLength; > +} This is a relatively unoptimized function, which I think is fine for what we’re using it for. But it seems wasteful to make it inlined since we aren’t doing the other kinds of optimizations on it.
Rob Buis
Comment 11 2020-03-23 00:58:50 PDT
Rob Buis
Comment 12 2020-03-23 09:11:12 PDT
Comment on attachment 394227 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=394227&action=review >> Source/WTF/wtf/text/StringConcatenate.h:290 >> +struct ToASCIICaseConverter { > > I suggest removing the "To" from this name. Done. >> Source/WTF/wtf/text/StringConcatenate.h:291 >> + StringView::CaseConvertType caseConvertType; > > How about just the name "type" for this data member? Sure, done. >> Source/WTF/wtf/text/StringConcatenate.h:306 >> + bool is8Bit() const { return m_converter.string.isNull() || m_converter.string.is8Bit(); } > > No need for the isNull check here. Both String::is8Bit and StringView::is8Bit already return true for null strings. Done. >> Source/WTF/wtf/text/StringConcatenate.h:394 >> +using WTF::uppercase; > > Alphabetical order would put lowercase before makeString, not after pad. Done. >> Source/WTF/wtf/text/StringView.cpp:250 >> + static_assert(std::is_same<DestinationCharacterType, LChar>::value || std::is_same<DestinationCharacterType, UChar>::value); > > You didn’t add the static_assert I suggested. > > static_assert(sizeof(DestinationCharacterType) <= sizeof(SourceCharacterType)); > > Did you try and decide not to? I tried and hit the assert. Let me check one more time... >> Source/WTF/wtf/text/StringView.cpp:251 >> + auto fun = (type == StringView::CaseConvertType::Lower) ? toASCIILower<SourceCharacterType> : toASCIIUpper<SourceCharacterType>; > > WebKit coding style says use words, to "function" rather than "fun". Can't we have fun? :) Fixed. >> Source/WTF/wtf/text/StringView.cpp:406 >> +} > > I don’t think these will work correctly. We’re returning a reference to the argument, and that argument will be destroyed when the function returns. I think these need to take const StringView& instead of StringView. Also probably should make these inline. Done. >> Source/WTF/wtf/text/WTFString.h:439 >> +bool codePointCompareLessThanCaseInsensitive(const String&, const String&); > > This name isn’t right. > > 1) Case insensitive is not the same as ASCII case insensitive. > 2) ASCII case insensitive is a way of comparing for equality, not ordering. When doing a code point comparison, it matters whether the ASCII letters are being uppercased or lowercased -- those two orderings are different when letters are mixed with non-letters. > > So we need a name that mentions ASCII and mentions uppercasing. > > I’d be tempted to just put this function in place in the file using it and not adding it to String.h at all. I did inline it in the end. >> Source/WTF/wtf/text/WTFString.h:614 >> +} > > This is a relatively unoptimized function, which I think is fine for what we’re using it for. But it seems wasteful to make it inlined since we aren’t doing the other kinds of optimizations on it. Okay, not relevant anymore since it got inlined.
Darin Adler
Comment 13 2020-03-23 09:12:44 PDT
Comment on attachment 394227 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=394227&action=review >>> Source/WTF/wtf/text/StringView.cpp:250 >>> + static_assert(std::is_same<DestinationCharacterType, LChar>::value || std::is_same<DestinationCharacterType, UChar>::value); >> >> You didn’t add the static_assert I suggested. >> >> static_assert(sizeof(DestinationCharacterType) <= sizeof(SourceCharacterType)); >> >> Did you try and decide not to? > > I tried and hit the assert. Let me check one more time... Oops, wrote that backwards. It’s >=.
Rob Buis
Comment 14 2020-03-23 09:16:41 PDT
Rob Buis
Comment 15 2020-03-23 11:11:27 PDT
Comment on attachment 394227 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=394227&action=review >>>> Source/WTF/wtf/text/StringView.cpp:250 >>>> + static_assert(std::is_same<DestinationCharacterType, LChar>::value || std::is_same<DestinationCharacterType, UChar>::value); >>> >>> You didn’t add the static_assert I suggested. >>> >>> static_assert(sizeof(DestinationCharacterType) <= sizeof(SourceCharacterType)); >>> >>> Did you try and decide not to? >> >> I tried and hit the assert. Let me check one more time... > > Oops, wrote that backwards. It’s >=. Aha! Fixed. Could you do one final check?
Darin Adler
Comment 16 2020-03-23 11:39:14 PDT
Comment on attachment 394264 [details] Patch This all looks great
EWS
Comment 17 2020-03-23 12:31:41 PDT
Committed r258866: <https://trac.webkit.org/changeset/258866> All reviewed patches have been landed. Closing bug and clearing flags on attachment 394264 [details].
Radar WebKit Bug Importer
Comment 18 2020-03-23 12:32:19 PDT
Note You need to log in before you can comment on or make changes to this bug.