Bug 133504

Summary: Align srcset parser with recent spec changes
Product: WebKit Reporter: Yoav Weiss <yoav>
Component: New BugsAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: bdakin, benjamin, bunhere, calvaris, cmarcelo, commit-queue, dino, eric.carlson, esprehn+autocc, gyuyoung.kim, jonlee, kling, philipj, rakuco, sergio, simon.fraser
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch none

Description Yoav Weiss 2014-06-04 00:39:14 PDT
Align srcset parser with recent spec changes
Comment 1 Yoav Weiss 2014-06-04 01:19:26 PDT
Created attachment 232468 [details]
Patch
Comment 2 Yoav Weiss 2014-06-04 01:21:47 PDT
The srcset parser changes align it with the spec: http://picture.responsiveimages.org/#parse-srcset-attr.

The spec changes are meant to give the parser better future compatibility, and enable it to work well with the 'w' descriptor, the soon to be added 'h' descriptor and function descriptors (e.g. future-descriptor(a, b, c) ).
Comment 3 Darin Adler 2014-06-04 15:18:36 PDT
Comment on attachment 232468 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=232468&action=review

Patch is OK but I still see a lot of things that are either mysterious or need a bit of work. I’m especially concerned about the code here that does "length - 1"

> Source/WTF/ChangeLog:9
> +        I've exposed charactersToInt so that it can be used by
> +        HTMLSrcsetParser.cpp.

The best way for us to do this is to make versions that take StringView arguments in WTF.

> Source/WebCore/html/HTMLImageElement.cpp:144
> +            fastGetAttribute(srcAttr),
> +            fastGetAttribute(srcsetAttr));

These would look better on one line rather than successive lines.

> Source/WebCore/html/parser/HTMLParserIdioms.h:93
> +template<typename CharType>
> +inline bool isComma(CharType character)
> +{
> +    return character == ',';
> +}

Do we really need a function for this? I think that using == ',' at the call site would be fine and it’s not enhanced with a function.

> Source/WebCore/html/parser/HTMLPreloadScanner.cpp:110
> +            ImageCandidate imageCandidate = bestFitSourceForImageAttributes(m_deviceScaleFactor,
> +                m_urlToLoad,
> +                m_srcSetAttribute);

I think this would read better all on one line.

> Source/WebCore/html/parser/HTMLPreloadScanner.cpp:112
> +            String srcMatchingScale = imageCandidate.url();
>              setUrlToLoad(srcMatchingScale, true);

I don’t think the local variable is helpful here. I suggest combining these two into a single line.

But also, it’s really strange to call the url() function returning an AtomicString but then just put it into a String.

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:40
> +static bool compareByDensity(const ImageCandidate& first, const ImageCandidate& second)

Should mark this inline.

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:56
> +    descriptorStart = 0;

Please use nullptr.

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:72
> +template<typename CharType>
> +static bool isEOF(const CharType* position, const CharType* end)
> +{
> +    return position >= end;
> +}

Does this helper function aid in readability?

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:77
> +template<typename CharType>
> +static void tokenizeDescriptors(const CharType*& position,
> +    const CharType* attributeEnd,
> +    Vector<StringView>& descriptors)

Please put this all on one line rather than splitting it up vertically.

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:82
> +    while (true) {

I think this would read better as a for loop:

    for (; ; ++position) {

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:134
> +static float stringViewToFloat(const StringView& string, bool& isValid)
> +{
> +    if (string.is8Bit())
> +        return charactersToFloat(string.characters8(), string.length() - 1, &isValid);
> +    return charactersToFloat(string.characters16(), string.length() - 1, &isValid);
>  }

This function belongs in WTF. And also it need not have string view in its name. Also, why length - 1?

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:141
> +static int stringViewToInt(const StringView& string, bool& isValid)
>  {
> -    ASSERT(imageCandidates.isEmpty());
> +    if (string.is8Bit())
> +        return charactersToInt(string.characters8(), string.length() - 1, &isValid);
> +    return charactersToInt(string.characters16(), string.length() - 1, &isValid);
> +}

This function belongs in WTF. And also it need not have string view in its name. Also, why length - 1?

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:145
> +    for (Vector<StringView>::iterator descriptor = descriptors.begin(); descriptor != descriptors.end(); ++descriptor) {

An iterator should not be named "descriptor". Also, we should almost never use iterators explicitly to iterate through vectors; iterating with an index is almost always better, except when using something generic that always uses iterators. But, we can use a modern for loop to solve both those problems:

    for (auto& descriptor : descriptors) {

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:147
> +        if (!descriptor->length())
> +            continue;

Normally we’d use isEmpty() for this.

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:148
> +        UChar c = (*descriptor)[descriptor->length() - 1];

Please consider a word rather than a letter for the name of this variable.

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:179
> +static void parseImageCandidatesFromSrcsetAttribute(const String& attribute, const CharType* attributeStart, unsigned length, Vector<ImageCandidate>& imageCandidates)

This function does not need the attribute argument. You can make a StringView from a pointer and length, you don’t need to go back to the original String to make it.

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:184
> +    while (position < attributeEnd) {

I suggest a for loop instead of a while loop here.

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:229
> +static void parseImageCandidatesFromSrcsetAttribute(const String& attribute, Vector<ImageCandidate>& imageCandidates)

This function should take a StringView argument rather than a const String&.

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:232
> +    if (attribute.isNull())
> +        return;

No need for this check. The code would already do the right thing without it and it’s not something we need to optimize for.

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:237
> +    if (attribute.is8Bit())
> +        parseImageCandidatesFromSrcsetAttribute<LChar>(attribute, attribute.characters8(), attribute.length(), imageCandidates);
> +    else
> +        parseImageCandidatesFromSrcsetAttribute<UChar>(attribute, attribute.characters16(), attribute.length(), imageCandidates);

I’m not sure we need separately optimized versions for 8-bit and 16-bit characters. We should considering using StringView more for the argument types and indexing rather than chasing pointers during the parsing process.

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:241
> +static ImageCandidate pickBestImageCandidate(float deviceScaleFactor,
> +    Vector<ImageCandidate>& imageCandidates)

Please put this on one line.

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:243
> +    const float defaultDensityValue = 1.0;

I don’t think this constant adds much to the readability of the code, but I guess maybe it’s OK. It’s strange to define it here, though, high up in the function. Also, I see places that are using a default of 1.0 elsewhere without a named constant.

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:244
> +    bool ignoreSrc = false;

What’s the point of this? It’s always false?

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:249
> +    for (Vector<ImageCandidate>::iterator it = imageCandidates.begin(); it != imageCandidates.end(); ++it) {

Please use a modern for loop here:

    for (auto& candidate : imageCandidates) {

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:250
> +        if (it->density() < 0)

It’s really strange to have this "< 0" here. Is there some cleaner way for us to do this? I noticed in other places you have both a named constant and a hasDensity function, but here we are following a different pattern.

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:279
> +ImageCandidate bestFitSourceForImageAttributes(float deviceScaleFactor,
> +    const String& srcAttribute,
> +    const String& srcsetAttribute)

Please put this all on one line.

The argument types should probably be const AtomicString& if they are attribute values since that’s what attribute values are.

> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:295
> +    return pickBestImageCandidate(deviceScaleFactor,
> +        imageCandidates);

This should be a single line.

> Source/WebCore/html/parser/HTMLSrcsetParser.h:36
>  #include <wtf/text/WTFString.h>

Can remove this include since StringView.h includes WTFString.h.

> Source/WebCore/html/parser/HTMLSrcsetParser.h:69
> +class ImageCandidate {

I think it would be better to use a struct rather than a class for this. The class isn’t adding much encapsulation.

> Source/WebCore/html/parser/HTMLSrcsetParser.h:81
> +    ImageCandidate(const String& source, unsigned start, unsigned length, const DescriptorParsingResult& result, OriginAttribute originAttribute)

This constructor should just take a StringView rather than source, start, and length.

> Source/WebCore/html/parser/HTMLSrcsetParser.h:84
> +        , m_density(result.hasDensity()?result.density():UninitializedDescriptor)
> +        , m_resourceWidth(result.hasWidth()?result.resourceWidth():UninitializedDescriptor)

Please put spaces around the ? and the : in the ?: operator.

> Source/WebCore/html/parser/HTMLSrcsetParser.h:92
> +    String toString() const
>      {
> -        return m_scaleFactor;
> +        return String(m_string.toString());
> +    }

The getter should just return the StringView. We don’t need to build a toString function into this class.

It’s also wrong to write String(x.toString()). Should just be x.toString().

> Source/WebCore/html/parser/HTMLSrcsetParser.h:97
> +    AtomicString url() const
> +    {
> +        return AtomicString(m_string.toString());
> +    }

We should not have this function. Also, we would use toAtomicString for this, not toString, if we needed it.

> Source/WebCore/html/parser/HTMLSrcsetParser.h:122
> +    inline bool isEmpty() const
> +    {
> +        return m_string.isEmpty();
>      }

We don’t need a separate function for this. We should just use the function that returns a StringView and call isEmpty on that.

Also, no need to mark this inline. All functions defined in a class like this are inline.

> Source/WebCore/html/parser/HTMLSrcsetParser.h:133
> +ImageCandidate bestFitSourceForImageAttributes(float deviceScaleFactor,
> +    const String& srcAttribute,
> +    const String& srcsetAttribute);

Should be all on one line. There’s this new vertical coding style you are using a lot. Maybe it’s a Google/Blink thing?

> Source/WebCore/html/parser/ParsingUtilities.h:42
> +template<typename CharType>
> +bool skipExactly(const CharType*& position, const CharType* end, CharType delimiter)
> +{
> +    if (position < end && *position == delimiter) {
> +        ++position;
> +        return true;
> +    }
> +    return false;
> +}

We should consider writing these utilities for a StringView instead of a character pointer so we can do this kind of parsing without having two whole copies of every function.
Comment 4 Yoav Weiss 2014-06-05 02:24:47 PDT
Comment on attachment 232468 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=232468&action=review

>> Source/WebCore/html/parser/HTMLParserIdioms.h:93
>> +}
> 
> Do we really need a function for this? I think that using == ',' at the call site would be fine and it’s not enhanced with a function.

The function is needed in order to pass it to reverseSkipWhile in HTMLSrcsetParser.

>> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:72
>> +}
> 
> Does this helper function aid in readability?

I think it does since it aligns well with the spec's language.

>> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:134
>>  }
> 
> This function belongs in WTF. And also it need not have string view in its name. Also, why length - 1?

OK. The length-1 is there since we need to ignore the latest char, which is the descriptor char (e.g. 'x').
I've changed it to make it clearer.

>> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:184
>> +    while (position < attributeEnd) {
> 
> I suggest a for loop instead of a while loop here.

Since this loop doesn't advance position on every step (according to spec), I thought a while is more appropriate.

>> Source/WebCore/html/parser/HTMLSrcsetParser.cpp:237
>> +        parseImageCandidatesFromSrcsetAttribute<UChar>(attribute, attribute.characters16(), attribute.length(), imageCandidates);
> 
> I’m not sure we need separately optimized versions for 8-bit and 16-bit characters. We should considering using StringView more for the argument types and indexing rather than chasing pointers during the parsing process.

OK. I've added a FIXME for that. I'll address it in a separate patch.

>> Source/WebCore/html/parser/ParsingUtilities.h:42
>> +}
> 
> We should consider writing these utilities for a StringView instead of a character pointer so we can do this kind of parsing without having two whole copies of every function.

I think that would be required in order to move the parsing to working with StringView and position instead of direct pointers.
Comment 5 Yoav Weiss 2014-06-05 02:29:05 PDT
Created attachment 232533 [details]
Patch
Comment 6 Yoav Weiss 2014-06-05 02:34:29 PDT
Thanks for reviewing! :)
I addressed the review comments
Comment 7 Yoav Weiss 2014-06-05 03:51:03 PDT
Created attachment 232534 [details]
Patch
Comment 8 WebKit Commit Bot 2014-06-05 23:06:46 PDT
Comment on attachment 232534 [details]
Patch

Clearing flags on attachment: 232534

Committed r169637: <http://trac.webkit.org/changeset/169637>
Comment 9 WebKit Commit Bot 2014-06-05 23:06:54 PDT
All reviewed patches have been landed.  Closing bug.