Bug 137097 - [SVG -> OTF Converter] Support non-BMP codepoints
Summary: [SVG -> OTF Converter] Support non-BMP codepoints
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Layout and Rendering (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Myles C. Maxfield
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-09-24 17:29 PDT by Myles C. Maxfield
Modified: 2014-10-03 13:11 PDT (History)
12 users (show)

See Also:


Attachments
Patch (60.27 KB, patch)
2014-09-27 20:47 PDT, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
Patch (58.26 KB, patch)
2014-09-28 13:55 PDT, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
Patch (58.34 KB, patch)
2014-09-28 15:12 PDT, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
Patch (58.70 KB, patch)
2014-09-28 19:12 PDT, Myles C. Maxfield
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Myles C. Maxfield 2014-09-24 17:29:09 PDT
The following test fails because of this (when the SVG -> OTF converter is turned on):

svg/custom/glyph-selection-non-bmp.svg
Comment 1 Myles C. Maxfield 2014-09-27 20:47:46 PDT
Created attachment 238798 [details]
Patch
Comment 2 Myles C. Maxfield 2014-09-28 12:14:19 PDT
Comment on attachment 238798 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=238798&action=review

> Source/WebCore/ChangeLog:1
> +2014-09-27  Myles C. Maxfield  <litherum@gmail.com>

Fix email address
Comment 3 Myles C. Maxfield 2014-09-28 13:55:40 PDT
Created attachment 238820 [details]
Patch
Comment 4 Myles C. Maxfield 2014-09-28 15:12:29 PDT
Created attachment 238825 [details]
Patch
Comment 5 Myles C. Maxfield 2014-09-28 17:11:10 PDT
Comment on attachment 238825 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=238825&action=review

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:949
> +            codepoint2 = data1.codepoints[i2++];

data2.codepoints
Comment 6 Darin Adler 2014-09-28 17:57:48 PDT
Comment on attachment 238825 [details]
Patch

Myles, are you going to review my tweaks to this file before landing all these additional changes? Or are you not reviewing because you are not a reviewer?

Code point is two words, not one.
Comment 7 Myles C. Maxfield 2014-09-28 19:12:14 PDT
Created attachment 238834 [details]
Patch
Comment 8 Myles C. Maxfield 2014-09-28 19:19:57 PDT
Darin: Oh, whoops - I didn't realize there was more to review!
Comment 9 Darin Adler 2014-10-03 09:33:58 PDT
Comment on attachment 238834 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=238834&action=review

Has this been rebased after my changes? I think I see things that I thought I improved back in the old style.

I see a few problems but this seems OK to land.

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:43
> +static inline void append32(Vector<char>& result, uint32_t value)

Comment in change log makes it sound like you got rid of this, but it’s still here!

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:62
> +    typedef String Codepoints;

I don’t think this typedef is a good abstraction. Should just use String directly.

It also seems a bis strange to use a string for this.

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:97
> +    inline void append32(uint32_t value)

The keyword inline has no effect here in a class definition. Any functions defined in the class definition are implicitly treated as inline. I suggest either moving thereout of the class definition, or removing the redundant inline keywords.

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:108
> +    inline void append4ByteCode(const char code[4])

Seems to me this could just be an overload of append32. I don’t think 4ByteCode adds any clarity. The only reason we include the size is to avoid accidentally writing the wrong amount of data. We should use the same terminology for appending 4 bytes in both cases, not arbitrarily call one 32 and the other 4Byte.

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:645
> +            U16_APPEND(buffer, length, 2, codepoint, error);

Seems to me that U16_APPEND is overkill here; all the buffer length checking is pointless. But maybe it compiles to something efficient enough.

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:647
> +                for (auto index : m_codepointsToIndicesMap.get(String(buffer, length)))

This is absurdly expensive, allocating and destroying a String for each code point.

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:939
> +bool SVGToOTFFontConverter::compareCodepointsLexicographically(const GlyphData& data1, const GlyphData& data2)

Why is this a member function?

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:943
> +    unsigned i1 = 0, i2 = 0;
> +    UChar32 codepoint1, codepoint2;
> +    unsigned length1 = data1.codepoints.length(), length2 = data2.codepoints.length();

We normally avoid doing these definitions two on a line like this in WebKit coding style.

I don’t understand why codepoint1 and codepoint2 are defined outside the loop.

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:944
> +    for ( ; i1 < length1 && i2 < length2; ) {

We should put the initialization in here or just use while instead of for.

> Source/WebCore/svg/SVGToOTFFontConversion.cpp:962
> +        && equalIgnoringCase(data1.glyphElement->fastGetAttribute(SVGNames::arabic_formAttr), String("isolated", String::ConstructFromLiteral)))

There is no need to construct a String just to call equalIgnoringCase. Or if there is, it's just a missing overload that should be added. Please remove the "String/ConstructFromLiteral" dance.
Comment 10 Myles C. Maxfield 2014-10-03 12:14:46 PDT
Comment on attachment 238834 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=238834&action=review

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:43
>> +static inline void append32(Vector<char>& result, uint32_t value)
> 
> Comment in change log makes it sound like you got rid of this, but it’s still here!

This is for CFFBuilder. I've updated the ChangeLog. Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:62
>> +    typedef String Codepoints;
> 
> I don’t think this typedef is a good abstraction. Should just use String directly.
> 
> It also seems a bis strange to use a string for this.

Done.

Do you think I should use Vector<UChar32> instead? I'll keep String for now but I can always change it before it gets turned on.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:97
>> +    inline void append32(uint32_t value)
> 
> The keyword inline has no effect here in a class definition. Any functions defined in the class definition are implicitly treated as inline. I suggest either moving thereout of the class definition, or removing the redundant inline keywords.

Ah, good to know! I'll remove the keyword. Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:108
>> +    inline void append4ByteCode(const char code[4])
> 
> Seems to me this could just be an overload of append32. I don’t think 4ByteCode adds any clarity. The only reason we include the size is to avoid accidentally writing the wrong amount of data. We should use the same terminology for appending 4 bytes in both cases, not arbitrarily call one 32 and the other 4Byte.

Turns out that this leads to ambiguous calls such as "append32(0)". I'll rename this to append32BitCode(). Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:645
>> +            U16_APPEND(buffer, length, 2, codepoint, error);
> 
> Seems to me that U16_APPEND is overkill here; all the buffer length checking is pointless. But maybe it compiles to something efficient enough.

Given that length will always be 0 and capacity will always be 2, the compiler should be able to optimize out the check.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:647
>> +                for (auto index : m_codepointsToIndicesMap.get(String(buffer, length)))
> 
> This is absurdly expensive, allocating and destroying a String for each code point.

I'll make a bug about this - I'm thinking about making another data structure that remembers code points as UChar32 instead of Strings. https://bugs.webkit.org/show_bug.cgi?id=137395 Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:939
>> +bool SVGToOTFFontConverter::compareCodepointsLexicographically(const GlyphData& data1, const GlyphData& data2)
> 
> Why is this a member function?

GlyphData is a private member of SVGToOTFFontConverter

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:943
>> +    unsigned length1 = data1.codepoints.length(), length2 = data2.codepoints.length();
> 
> We normally avoid doing these definitions two on a line like this in WebKit coding style.
> 
> I don’t understand why codepoint1 and codepoint2 are defined outside the loop.

Done.

Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:944
>> +    for ( ; i1 < length1 && i2 < length2; ) {
> 
> We should put the initialization in here or just use while instead of for.

Done.

>> Source/WebCore/svg/SVGToOTFFontConversion.cpp:962
>> +        && equalIgnoringCase(data1.glyphElement->fastGetAttribute(SVGNames::arabic_formAttr), String("isolated", String::ConstructFromLiteral)))
> 
> There is no need to construct a String just to call equalIgnoringCase. Or if there is, it's just a missing overload that should be added. Please remove the "String/ConstructFromLiteral" dance.

Done.
Comment 11 Myles C. Maxfield 2014-10-03 13:11:16 PDT
https://trac.webkit.org/r174271