Bug 137094 - [SVG -> OTF Converter] Implement ligatures
Summary: [SVG -> OTF Converter] Implement ligatures
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Layout and Rendering (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Myles C. Maxfield
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-09-24 17:25 PDT by Myles C. Maxfield
Modified: 2015-01-11 14:55 PST (History)
15 users (show)

See Also:


Attachments
Patch depends on https://bugs.webkit.org/show_bug.cgi?id=137092 (22.48 KB, patch)
2014-09-28 19:14 PDT, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
Patch (19.22 KB, patch)
2014-10-05 10:04 PDT, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
Patch (22.07 KB, patch)
2014-11-10 22:06 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
Patch (21.41 KB, patch)
2014-12-16 17:13 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
Patch (21.80 KB, patch)
2014-12-16 17:40 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
Patch (23.14 KB, patch)
2015-01-06 17:52 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Myles C. Maxfield 2014-09-24 17:25:28 PDT
The following tests fail because of this (when the SVG -> OTF converter is turned on):

svg/W3C-SVG-1.1/fonts-glyph-04-t.svg
svg/W3C-SVG-1.1/text-text-06-t.svg
svg/text/kerning.svg
svg/text/multichar-glyph.svg

Note that svg/W3C-SVG-1.1/fonts-glyph-04-t.svg claims that the elements should be searched in order. This should be verified with the spec before implementing.
Comment 1 Myles C. Maxfield 2014-09-28 19:14:16 PDT
Created attachment 238835 [details]
Patch depends on https://bugs.webkit.org/show_bug.cgi?id=137092
Comment 2 Myles C. Maxfield 2014-10-05 10:04:15 PDT
Created attachment 239300 [details]
Patch
Comment 3 Myles C. Maxfield 2014-11-10 22:06:02 PST
Created attachment 241332 [details]
Patch
Comment 4 mitz 2014-12-11 14:21:51 PST
Comment on attachment 241332 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=241332&action=review

> Source/WebCore/ChangeLog:17
> +        (WebCore::getNextCodepoint): Convenience function around U16_NEXT().
> +        (WebCore::SVGToOTFFontConverter::appendCMAPTable): Use getNextCodepoint().

This tells me that the change log entry is out of date

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:185
> +    unsigned featureCountGSUB;

Member variable names should have the m_ prefix

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:233
> +        bool skipCodepoint = false;

Not sure why you need this variable…

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:240
> +                skipCodepoint = true;

…you can simply continue here.

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:609
> +        ASSERT(lhs.first.size() > 1 && rhs.first.size() > 1);

This should be split into two assertions, though I don’t think it’s necessary to assert this here. The code that prevents shorter-than-2 vectors is just above.

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:618
> +            ASSERT(ligaturePair.first.size() > 1);

Not sure this assertion is useful.

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:621
> +                segmentStart = i; // FIXME: This is not quite right for the first one

It’s not? What’s wrong with it?

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:633
> +    for (size_t i = 0; i < overlappingFirstGlyphSegmentLengths.size(); ++i)
> +        append16(0); // Placeholder for offset to LigatureSet table

Do we have a more efficient way to do this?

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:641
> +        for (size_t j = 0; j < overlappingFirstGlyphSegmentLengths[i]; ++j)
> +            append16(0); // Placeholder for offset to Ligature table

And this?

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:650
> +            ASSERT(ligaturePairIndex < ligaturePairs.size());
> +            auto& ligaturePair = ligaturePairs[ligaturePairIndex];

I think this second line includes a release assert that’s equivalent to the assertion on the first line.

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:873
> +Vector<Glyph, 1> SVGToOTFFontConverter::lookupCodepoint(UChar32 codepoint) const

I think this function could have a better name. Maybe glyphsForCodepoint?
Comment 5 Myles C. Maxfield 2014-12-16 17:13:10 PST
Created attachment 243406 [details]
Patch
Comment 6 Myles C. Maxfield 2014-12-16 17:14:11 PST
Comment on attachment 241332 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=241332&action=review

>> Source/WebCore/ChangeLog:17
>> +        (WebCore::SVGToOTFFontConverter::appendCMAPTable): Use getNextCodepoint().
> 
> This tells me that the change log entry is out of date

Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:185
>> +    unsigned featureCountGSUB;
> 
> Member variable names should have the m_ prefix

Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:233
>> +        bool skipCodepoint = false;
> 
> Not sure why you need this variable…

Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:240
>> +                skipCodepoint = true;
> 
> …you can simply continue here.

Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:609
>> +        ASSERT(lhs.first.size() > 1 && rhs.first.size() > 1);
> 
> This should be split into two assertions, though I don’t think it’s necessary to assert this here. The code that prevents shorter-than-2 vectors is just above.

Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:618
>> +            ASSERT(ligaturePair.first.size() > 1);
> 
> Not sure this assertion is useful.

Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:621
>> +                segmentStart = i; // FIXME: This is not quite right for the first one
> 
> It’s not? What’s wrong with it?

It is right. Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:633
>> +        append16(0); // Placeholder for offset to LigatureSet table
> 
> Do we have a more efficient way to do this?

Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:641
>> +            append16(0); // Placeholder for offset to Ligature table
> 
> And this?

Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:650
>> +            auto& ligaturePair = ligaturePairs[ligaturePairIndex];
> 
> I think this second line includes a release assert that’s equivalent to the assertion on the first line.

I didn't realize that. Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:873
>> +Vector<Glyph, 1> SVGToOTFFontConverter::lookupCodepoint(UChar32 codepoint) const
> 
> I think this function could have a better name. Maybe glyphsForCodepoint?

Done.
Comment 7 Myles C. Maxfield 2014-12-16 17:40:02 PST
Created attachment 243412 [details]
Patch
Comment 8 mitz 2015-01-01 11:09:48 PST
Comment on attachment 243412 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=243412&action=review

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:111
> +    void growByLength(size_t numberOfBytes)

“by length” doesn’t contribute much to this function’s name. If you think that “grow” is not sufficiently clear, then maybe this should be called “growResult”.

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:174
> +    Vector<Glyph, 1> glyphsForCodepoint(UChar32 codepoint) const;

No need to name the parameter here.

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:242
> +            codepoint = *iterator;
> +            ++iterator;

codepoint = iterator++;

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:598
> +        Vector<Glyph> ligatureGlyphs;

You (almost?) always append at least one element to this Vector. You should give it some inline capacity (of probably more than 1) to avoid many heap allocations. Then you’ll typically only allocate buffers on the heap for the few ligatures in the font when you create a LigaturePair. You can probably even further minimize heap allocations by giving the Vector in LigaturePair inline capacity, and you can avoid copying by “speculatively” appending to ligaturePairs and doing all the work in the appended element.

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:613
> +    if (!ligaturePairs.isEmpty()) {

If there are no ligatures, can we just avoid writing this subtable?

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:620
> +                segmentStart = i;

You forgot the update previousFirstGlyph here!

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:661
> +    for (auto segmentLength : overlappingFirstGlyphSegmentLengths) {

Why use range-based iteration here and index-based iteration everywhere else?

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:710
> +        append16(m_featureCountGSUB + i); // Features indices
> +    m_featureCountGSUB += featureCount;

Can just replace this with
    append16(m_featureCountGSUB++);

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:781
> +        case 0:

Why does 4 come before 0? It’s kind of weird to write a for loop with a switch statement over the loop index variable.

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:1155
> +    Vector<char> charString;
> +    writeCFFEncodedNumber(charString, m_unitsPerEm);
> +    writeCFFEncodedNumber(charString, 0);
> +    writeCFFEncodedNumber(charString, 0);
> +    charString.append(rMoveTo);
> +    charString.append(endChar);

Perhaps there should be a copy of this in a member variable so that you don’t have to create it every time this function is called?

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:1169
> +        UChar32 codepoint = *codePointsIterator;
> +        ++codePointsIterator;

UChar32 codepoint = codePointsIterator++;
Comment 9 Myles C. Maxfield 2015-01-06 17:47:19 PST
Comment on attachment 243412 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=243412&action=review

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:111
>> +    void growByLength(size_t numberOfBytes)
> 
> “by length” doesn’t contribute much to this function’s name. If you think that “grow” is not sufficiently clear, then maybe this should be called “growResult”.

I want to be clear of the semantics of "grow". In particular, you pass Vector::grow() the resulting size that you want the vector to be. However, for this function, I wanted to pass the delta. I think I can better solve this by making the function void grow(size_t delta).

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:174
>> +    Vector<Glyph, 1> glyphsForCodepoint(UChar32 codepoint) const;
> 
> No need to name the parameter here.

Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:242
>> +            ++iterator;
> 
> codepoint = iterator++;

WTF::StringView::CodePoints::Iterator doesn't have a suffix increment operator, and given the implementation of that class (in particular, the particular semantics that Darin and I discussed), I don't think adding one would be a good idea.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:598
>> +        Vector<Glyph> ligatureGlyphs;
> 
> You (almost?) always append at least one element to this Vector. You should give it some inline capacity (of probably more than 1) to avoid many heap allocations. Then you’ll typically only allocate buffers on the heap for the few ligatures in the font when you create a LigaturePair. You can probably even further minimize heap allocations by giving the Vector in LigaturePair inline capacity, and you can avoid copying by “speculatively” appending to ligaturePairs and doing all the work in the appended element.

Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:613
>> +    if (!ligaturePairs.isEmpty()) {
> 
> If there are no ligatures, can we just avoid writing this subtable?

I don't think that's worth it. Doing so would both introduce additional complexity to SVGToOTFFontConverter::appendGSUBTable() which is already too complex, as well as splitting ligature handling into two places - a prepass to determine if we have ligatures (and probably build a data structure) and a secondary pass to iterate over the data structure and write the table. I don't think those costs justify this idea.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:620
>> +                segmentStart = i;
> 
> You forgot the update previousFirstGlyph here!

Oh my. It's a wonder the tests didn't catch this. I'll make sure I add a test for this.

Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:661
>> +    for (auto segmentLength : overlappingFirstGlyphSegmentLengths) {
> 
> Why use range-based iteration here and index-based iteration everywhere else?

The other loops all use their loop variables.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:710
>> +    m_featureCountGSUB += featureCount;
> 
> Can just replace this with
>     append16(m_featureCountGSUB++);

Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:781
>> +        case 0:
> 
> Why does 4 come before 0? It’s kind of weird to write a for loop with a switch statement over the loop index variable.

I wanted to make it clear that I'm populating the LookupList with each feature. I think it's better to switch for a particular output value than have a bunch of duplicated code where it's not very obvious what the intention of the code is.

Do you think maybe I should pull out this switch statement into a helper function?

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:1155
>> +    charString.append(endChar);
> 
> Perhaps there should be a copy of this in a member variable so that you don’t have to create it every time this function is called?

Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:1169
>> +        ++codePointsIterator;
> 
> UChar32 codepoint = codePointsIterator++;

See above.
Comment 10 Myles C. Maxfield 2015-01-06 17:52:47 PST
Created attachment 244114 [details]
Patch
Comment 11 WebKit Commit Bot 2015-01-11 14:55:43 PST
Comment on attachment 244114 [details]
Patch

Clearing flags on attachment: 244114

Committed r178249: <http://trac.webkit.org/changeset/178249>
Comment 12 WebKit Commit Bot 2015-01-11 14:55:49 PST
All reviewed patches have been landed.  Closing bug.