Bug 200565 - XMLHttpRequest: getAllResponseHeaders() sorting
Summary: XMLHttpRequest: getAllResponseHeaders() sorting
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Rob Buis
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2019-08-09 01:14 PDT by Anne van Kesteren
Modified: 2020-03-23 12:32 PDT (History)
8 users (show)

See Also:


Attachments
Patch (4.62 KB, patch)
2020-03-20 14:12 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (8.38 KB, patch)
2020-03-20 14:49 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (13.13 KB, patch)
2020-03-21 09:39 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (13.11 KB, patch)
2020-03-21 12:01 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (15.05 KB, patch)
2020-03-22 13:26 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (13.77 KB, patch)
2020-03-23 00:58 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (13.86 KB, patch)
2020-03-23 09:16 PDT, Rob Buis
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Anne van Kesteren 2019-08-09 01:14:12 PDT
As discussed in https://github.com/whatwg/xhr/issues/248 getAllResponseHeaders() should sort headers using the ASCII-uppercase header name as key (the end result is still ASCII-lowercase header names). This should not affect the Headers object. 

(Perhaps there can be a Fetch Component for issues like this?)
Comment 1 Rob Buis 2020-03-20 14:12:51 PDT
Created attachment 394124 [details]
Patch
Comment 2 Rob Buis 2020-03-20 14:49:24 PDT
Created attachment 394132 [details]
Patch
Comment 3 Darin Adler 2020-03-20 15:00:33 PDT
Comment on attachment 394132 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=394132&action=review

> Source/WebCore/xml/XMLHttpRequest.cpp:822
> +        for (auto& header : m_response.httpHeaderFields())
> +            headers.uncheckedAppend(std::make_pair(header.key.convertToASCIIUppercase(), header.value));
> +
> +        std::sort(headers.begin(), headers.end(), [] (const std::pair<String, String>& x, const std::pair<String, String>& y) {
> +            return WTF::codePointCompareLessThan(x.first, y.first);
> +        });

Seems a little silly that we have to actually make a copy of the string converted to ASCII uppercase just to do *comparison* based on the uppercased letters; should be easy to write a custom codePointCompareLessThan that converts ASCII characters to uppercase one by one as we go. But fine for now, I suppose.

> Source/WebCore/xml/XMLHttpRequest.cpp:832
> +        for (auto& header : headers) {
> +            stringBuilder.append(header.first.convertToASCIILowercase());
> +            stringBuilder.append(':');
> +            stringBuilder.append(' ');
> +            stringBuilder.append(header.second);
> +            stringBuilder.append('\r');
> +            stringBuilder.append('\n');
> +        }

Here’s how to write this in modern code:

    for (auto& header : headers)
        stringBuilder.append(header.first.convertToASCIILowercase(), ": ", header.second, "\r\n");

It’s not only easier to read than the old code but also more efficient.

If we really wanted to go crazy with optimization we could come up with a manipulator that allows you to append a string and convert it ASCII lowercase in the process of doing the appending without allocating and destroying a string every time. But we probably don’t have it already sitting on the shelf.
Comment 4 Darin Adler 2020-03-20 15:07:52 PDT
(In reply to Darin Adler from comment #3)
> If we really wanted to go crazy with optimization we could come up with a
> manipulator that allows you to append a string and convert it ASCII
> lowercase in the process of doing the appending without allocating and
> destroying a string every time. But we probably don’t have it already
> sitting on the shelf.

Let me know if you decide you really want to do that optimization. Want to learn how to write this. It would integrate with string concatenation the way the pad() and PaddingSpecification from StringConcatenate.h does, although much more simply. The actual concatenation would be much like the code in StringTypeAdapter<String, void> except instead of StringView::getCharactersWithUpconvert we'd write a function that calls toASCIIUpper on each character as we go.
Comment 5 Rob Buis 2020-03-21 09:39:50 PDT
Created attachment 394168 [details]
Patch
Comment 6 Rob Buis 2020-03-21 12:01:16 PDT
Created attachment 394172 [details]
Patch
Comment 7 Rob Buis 2020-03-21 14:07:54 PDT
Comment on attachment 394132 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=394132&action=review

>> Source/WebCore/xml/XMLHttpRequest.cpp:832
>> +        }
> 
> Here’s how to write this in modern code:
> 
>     for (auto& header : headers)
>         stringBuilder.append(header.first.convertToASCIILowercase(), ": ", header.second, "\r\n");
> 
> It’s not only easier to read than the old code but also more efficient.
> 
> If we really wanted to go crazy with optimization we could come up with a manipulator that allows you to append a string and convert it ASCII lowercase in the process of doing the appending without allocating and destroying a string every time. But we probably don’t have it already sitting on the shelf.

I took a stab at that, let me know what you tihink.
Comment 8 Darin Adler 2020-03-21 17:16:24 PDT
Comment on attachment 394172 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=394172&action=review

Neat that you did the case conversion manipulator! Nice job, too.

> Source/WTF/wtf/text/StringConcatenate.h:290
> +struct ToASCIICaseConvertor {

Seems like this name should be ASCIICaseConverter or ASCIICaseConversionSpecification. I don’t know why we’d use the unconventional spelling "convertor" instead of the more conventional "converter" and I don’t think the converter converts "to" ASCII case, it converts ASCII case.

> Source/WTF/wtf/text/StringConcatenate.h:299
> +template<StringView::CaseConvertType type>
> +ToASCIICaseConvertor toASCIICase(StringView string)
> +{
> +    return { type, string };
> +}

I think it would be more elegant to have manipulators named "lowercase" and "uppercase" or "lowercased" and "uppercased" rather than toASCIICase<StringView::CaseConvertType::Lower> and toASCIICase<StringView::CaseConvertType::Upper>. The analogy here is with "pad".

> Source/WTF/wtf/text/StringConcatenate.h:313
> +        WTF_STRINGTYPEADAPTER_COPIED_WTF_STRING();

This is not correct and should be omitted.

> Source/WTF/wtf/text/StringConcatenate.h:317
> +    const ToASCIICaseConvertor& m_toASCIICaseConvertor;

Just m_convertor or m_converter would be a fine name. There’s plenty of context and we don’t need to repeat it all.

> Source/WTF/wtf/text/StringView.cpp:249
> +    static_assert(std::is_same<SourceCharacterType, LChar>::value || std::is_same<SourceCharacterType, UChar>::value);
> +    static_assert(std::is_same<DestinationCharacterType, LChar>::value || std::is_same<DestinationCharacterType, UChar>::value);

Could also static_assert that sizeof(DestinationCharacterType) <= sizeof(SourceCharacterType), since this would not work properly if destination was LChar and source was UChar.

> Source/WTF/wtf/text/StringView.cpp:251
> +        destination[i] = type == StringView::CaseConvertType::Lower ? toASCIILower(source[i]) : toASCIIUpper(source[i]);

Maybe this conditional should go outside the loop? I am not sure the compiler will be smart enough to hoist the branch out of the loop.

> Source/WTF/wtf/text/StringView.h:123
> +    enum class CaseConvertType { Upper, Lower };

It’s quite inelegant that we have this separately in every string class! Should clean that up at some point.

> Source/WebCore/xml/XMLHttpRequest.cpp:818
> +            headers.uncheckedAppend(std::make_pair(header.key.convertToASCIIUppercase(), header.value));

No need to convert to ASCII uppercase. We can write a version of WTF::codePointCompareLessThan that does that as part of the comparison process. I mentioned that in my last review. Did you decide against it, or just miss the comment?

> Source/WebCore/xml/XMLHttpRequest.cpp:821
> +            return WTF::codePointCompareLessThan(x.first, y.first);

Here’s one cut at it:

    unsigned xLength = x.first.length();
    unsigned yLength = y.first.length();
    unsigned commonLength = std::min(xLength, yLength);
    for (unsigned i = 0; i < commonLength; ++i) {
        auto xCharacter = toASCIIUpper(x.first[i]);
        auto yCharacter = toASCIIUpper(y.first[i]);
        if (xCharacter != yCharacter)
            return xCharacter < yCharacter);
    }
    return xLength < yLength;

Could optimize more, but maybe we don’t need to. The big win is not copying the strings.
Comment 9 Rob Buis 2020-03-22 13:26:02 PDT
Created attachment 394227 [details]
Patch
Comment 10 Darin Adler 2020-03-22 13:40:39 PDT
Comment on attachment 394227 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=394227&action=review

> Source/WTF/wtf/text/StringConcatenate.h:290
> +struct ToASCIICaseConverter {

I suggest removing the "To" from this name.

> Source/WTF/wtf/text/StringConcatenate.h:291
> +    StringView::CaseConvertType caseConvertType;

How about just the name "type" for this data member?

> Source/WTF/wtf/text/StringConcatenate.h:306
> +    bool is8Bit() const { return m_converter.string.isNull() || m_converter.string.is8Bit(); }

No need for the isNull check here. Both String::is8Bit and StringView::is8Bit already return true for null strings.

> Source/WTF/wtf/text/StringConcatenate.h:394
>  using WTF::makeString;
>  using WTF::pad;
> +using WTF::lowercase;
> +using WTF::uppercase;

Alphabetical order would put lowercase before makeString, not after pad.

> Source/WTF/wtf/text/StringView.cpp:250
> +    static_assert(std::is_same<DestinationCharacterType, LChar>::value || std::is_same<DestinationCharacterType, UChar>::value);

You didn’t add the static_assert I suggested.

    static_assert(sizeof(DestinationCharacterType) <= sizeof(SourceCharacterType));

Did you try and decide not to?

> Source/WTF/wtf/text/StringView.cpp:251
> +    auto fun = (type == StringView::CaseConvertType::Lower) ? toASCIILower<SourceCharacterType> : toASCIIUpper<SourceCharacterType>;

WebKit coding style says use words, to "function" rather than "fun".

> Source/WTF/wtf/text/StringView.cpp:406
> +ToASCIICaseConverter lowercase(StringView string)
> +{
> +    return { StringView::CaseConvertType::Lower, string };
> +}
> +
> +ToASCIICaseConverter uppercase(StringView string)
> +{
> +    return { StringView::CaseConvertType::Upper, string };
> +}

I don’t think these will work correctly. We’re returning a reference to the argument, and that argument will be destroyed when the function returns. I think these need to take const StringView& instead of StringView. Also probably should make these inline.

> Source/WTF/wtf/text/WTFString.h:439
> +bool codePointCompareLessThanCaseInsensitive(const String&, const String&);

This name isn’t right.

1) Case insensitive is not the same as ASCII case insensitive.
2) ASCII case insensitive is a way of comparing for equality, not ordering. When doing a code point comparison, it matters whether the ASCII letters are being uppercased or lowercased -- those two orderings are different when letters are mixed with non-letters.

So we need a name that mentions ASCII and mentions uppercasing.

I’d be tempted to just put this function in place in the file using it and not adding it to String.h at all.

> Source/WTF/wtf/text/WTFString.h:614
> +inline bool codePointCompareLessThanCaseInsensitive(const String& a, const String& b)
> +{
> +    unsigned xLength = a.length();
> +    unsigned yLength = b.length();
> +    unsigned commonLength = std::min(xLength, yLength);
> +    for (unsigned i = 0; i < commonLength; ++i) {
> +        auto xCharacter = toASCIIUpper(a[i]);
> +        auto yCharacter = toASCIIUpper(b[i]);
> +        if (xCharacter != yCharacter)
> +            return xCharacter < yCharacter;
> +    }
> +    return xLength < yLength;
> +}

This is a relatively unoptimized function, which I think is fine for what we’re using it for. But it seems wasteful to make it inlined since we aren’t doing the other kinds of optimizations on it.
Comment 11 Rob Buis 2020-03-23 00:58:50 PDT
Created attachment 394242 [details]
Patch
Comment 12 Rob Buis 2020-03-23 09:11:12 PDT
Comment on attachment 394227 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=394227&action=review

>> Source/WTF/wtf/text/StringConcatenate.h:290
>> +struct ToASCIICaseConverter {
> 
> I suggest removing the "To" from this name.

Done.

>> Source/WTF/wtf/text/StringConcatenate.h:291
>> +    StringView::CaseConvertType caseConvertType;
> 
> How about just the name "type" for this data member?

Sure, done.

>> Source/WTF/wtf/text/StringConcatenate.h:306
>> +    bool is8Bit() const { return m_converter.string.isNull() || m_converter.string.is8Bit(); }
> 
> No need for the isNull check here. Both String::is8Bit and StringView::is8Bit already return true for null strings.

Done.

>> Source/WTF/wtf/text/StringConcatenate.h:394
>> +using WTF::uppercase;
> 
> Alphabetical order would put lowercase before makeString, not after pad.

Done.

>> Source/WTF/wtf/text/StringView.cpp:250
>> +    static_assert(std::is_same<DestinationCharacterType, LChar>::value || std::is_same<DestinationCharacterType, UChar>::value);
> 
> You didn’t add the static_assert I suggested.
> 
>     static_assert(sizeof(DestinationCharacterType) <= sizeof(SourceCharacterType));
> 
> Did you try and decide not to?

I tried and hit the assert. Let me check one more time...

>> Source/WTF/wtf/text/StringView.cpp:251
>> +    auto fun = (type == StringView::CaseConvertType::Lower) ? toASCIILower<SourceCharacterType> : toASCIIUpper<SourceCharacterType>;
> 
> WebKit coding style says use words, to "function" rather than "fun".

Can't we have fun? :) Fixed.

>> Source/WTF/wtf/text/StringView.cpp:406
>> +}
> 
> I don’t think these will work correctly. We’re returning a reference to the argument, and that argument will be destroyed when the function returns. I think these need to take const StringView& instead of StringView. Also probably should make these inline.

Done.

>> Source/WTF/wtf/text/WTFString.h:439
>> +bool codePointCompareLessThanCaseInsensitive(const String&, const String&);
> 
> This name isn’t right.
> 
> 1) Case insensitive is not the same as ASCII case insensitive.
> 2) ASCII case insensitive is a way of comparing for equality, not ordering. When doing a code point comparison, it matters whether the ASCII letters are being uppercased or lowercased -- those two orderings are different when letters are mixed with non-letters.
> 
> So we need a name that mentions ASCII and mentions uppercasing.
> 
> I’d be tempted to just put this function in place in the file using it and not adding it to String.h at all.

I did inline it in the end.

>> Source/WTF/wtf/text/WTFString.h:614
>> +}
> 
> This is a relatively unoptimized function, which I think is fine for what we’re using it for. But it seems wasteful to make it inlined since we aren’t doing the other kinds of optimizations on it.

Okay, not relevant anymore since it got inlined.
Comment 13 Darin Adler 2020-03-23 09:12:44 PDT
Comment on attachment 394227 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=394227&action=review

>>> Source/WTF/wtf/text/StringView.cpp:250
>>> +    static_assert(std::is_same<DestinationCharacterType, LChar>::value || std::is_same<DestinationCharacterType, UChar>::value);
>> 
>> You didn’t add the static_assert I suggested.
>> 
>>     static_assert(sizeof(DestinationCharacterType) <= sizeof(SourceCharacterType));
>> 
>> Did you try and decide not to?
> 
> I tried and hit the assert. Let me check one more time...

Oops, wrote that backwards. It’s >=.
Comment 14 Rob Buis 2020-03-23 09:16:41 PDT
Created attachment 394264 [details]
Patch
Comment 15 Rob Buis 2020-03-23 11:11:27 PDT
Comment on attachment 394227 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=394227&action=review

>>>> Source/WTF/wtf/text/StringView.cpp:250
>>>> +    static_assert(std::is_same<DestinationCharacterType, LChar>::value || std::is_same<DestinationCharacterType, UChar>::value);
>>> 
>>> You didn’t add the static_assert I suggested.
>>> 
>>>     static_assert(sizeof(DestinationCharacterType) <= sizeof(SourceCharacterType));
>>> 
>>> Did you try and decide not to?
>> 
>> I tried and hit the assert. Let me check one more time...
> 
> Oops, wrote that backwards. It’s >=.

Aha! Fixed. Could you do one final check?
Comment 16 Darin Adler 2020-03-23 11:39:14 PDT
Comment on attachment 394264 [details]
Patch

This all looks great
Comment 17 EWS 2020-03-23 12:31:41 PDT
Committed r258866: <https://trac.webkit.org/changeset/258866>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 394264 [details].
Comment 18 Radar WebKit Bug Importer 2020-03-23 12:32:19 PDT
<rdar://problem/60786515>