RESOLVED FIXED 135756
Elements whose contents start with an astral Unicode symbol disappear when CSS `::first-letter` is applied to them
https://bugs.webkit.org/show_bug.cgi?id=135756
Summary Elements whose contents start with an astral Unicode symbol disappear when CS...
Mathias Bynens
Reported 2014-08-08 06:20:55 PDT
Created attachment 236279 [details] Reduced test case Elements whose contents start with an astral Unicode symbol (such as U+1D306, which I’ll use in examples) disappear when CSS `::first-letter` is applied to them. For example, the following displays just fine, with and without `::first-letter { background: red; }`: <p>Lorem ipsum</p> But prepend any astral symbol to the element’s contents and it disappears when the `::first-letter` styles are active: <p>&#x1D306;Lorem ipsum (1)</p> Blink has the same bug: https://code.google.com/p/chromium/issues/detail?id=401945
Attachments
Reduced test case (221 bytes, text/html)
2014-08-08 06:20 PDT, Mathias Bynens
no flags
Patch (3.76 KB, patch)
2014-08-11 13:14 PDT, Myles C. Maxfield
no flags
Patch (10.33 KB, patch)
2014-08-12 16:58 PDT, Myles C. Maxfield
darin: review+
Myles C. Maxfield
Comment 1 2014-08-11 10:48:03 PDT
DumpRenderTree says that we aren't determining the first letter correctly. RenderBlock {P} at (0,34) size 784x18 RenderInline (generated) at (0,0) size 8x18 [bgcolor=#FF0000] RenderText {#text} at (0,0) size 8x18 text run at (0,0) width 8: "\x{F0}" RenderText {#text} at (8,0) size 128x18 text run at (8,0) width 128: "\x{9D}\x{152}\x{2020}Lorem ipsum (1)"
Myles C. Maxfield
Comment 2 2014-08-11 10:48:55 PDT
Whoops, ignore that last. I didn't put a meta charset tag in the document.
Myles C. Maxfield
Comment 3 2014-08-11 10:50:28 PDT
RenderBlock {P} at (0,34) size 784x18 RenderInline (generated) at (0,0) size 0x18 [bgcolor=#FF0000] RenderText {#text} at (0,0) size 0x18 text run at (0,0) width 0: "\x{D834}" RenderText {#text} at (0,0) size 64x18 text run at (0,0) width 64: "\x{DF06}Lorem ipsum (1)"
Myles C. Maxfield
Comment 4 2014-08-11 13:14:41 PDT
Darin Adler
Comment 5 2014-08-11 22:27:38 PDT
Comment on attachment 236395 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=236395&action=review We can do better. > Source/WebCore/rendering/RenderBlock.cpp:-3589 > - length++; You should just write something like this: length += numCharactersInGraphemeClusters(StringView(oldText).substring(length), 1); You’ll have to change the numCharactersInGraphemeClusters to take a StringView rather than a String, but otherwise this should do the job. Maybe you’ll need to performance test to see that’s fast enough, but it will do the right thing. > Source/WebCore/rendering/RenderBlock.cpp:3589 > + // FIXME: The surrogate might come from another <span> It would be more appropriate to say another text node, not <span>: // FIXME: The trail surrogate might be in a subsequent text node. But it seems strange to put the FIXME here when the same issue applies to all the rest of the string processing in this function. It’s not specific to this one part of the algorithm. > Source/WebCore/rendering/RenderBlock.cpp:3590 > + // FIXME: Presumably, combining marks should be a part of this, too I would word this completely differently, but it’s not needed if you take my advice above. > Source/WebCore/rendering/RenderBlock.cpp:3593 > + UChar32 codepoint = 0; > + U16_NEXT(oldText.characters16(), length, oldText.length(), codepoint); No need to initialize this to zero. Also no reason to smash two words together: “codepoint”. I would just call this local variable “character”. But you don’t need this code at all.
Myles C. Maxfield
Comment 6 2014-08-12 16:30:45 PDT
Comment on attachment 236395 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=236395&action=review >> Source/WebCore/rendering/RenderBlock.cpp:-3589 >> - length++; > > You should just write something like this: > > length += numCharactersInGraphemeClusters(StringView(oldText).substring(length), 1); > > You’ll have to change the numCharactersInGraphemeClusters to take a StringView rather than a String, but otherwise this should do the job. > > Maybe you’ll need to performance test to see that’s fast enough, but it will do the right thing. numCharactersInGraphemeClusters is super useful! Thanks for letting me know about it. Done. >> Source/WebCore/rendering/RenderBlock.cpp:3589 >> + // FIXME: The surrogate might come from another <span> > > It would be more appropriate to say another text node, not <span>: > > // FIXME: The trail surrogate might be in a subsequent text node. > > But it seems strange to put the FIXME here when the same issue applies to all the rest of the string processing in this function. It’s not specific to this one part of the algorithm. I moved the FIXME to RenderBlock::updateFirstLetter where the assumption is actually made. >> Source/WebCore/rendering/RenderBlock.cpp:3590 >> + // FIXME: Presumably, combining marks should be a part of this, too > > I would word this completely differently, but it’s not needed if you take my advice above. Yup. Done. >> Source/WebCore/rendering/RenderBlock.cpp:3593 >> + U16_NEXT(oldText.characters16(), length, oldText.length(), codepoint); > > No need to initialize this to zero. Also no reason to smash two words together: “codepoint”. I would just call this local variable “character”. But you don’t need this code at all. Yeah - numCharactersInGraphemeClusters fixes this right up. Done.
Myles C. Maxfield
Comment 7 2014-08-12 16:58:36 PDT
Darin Adler
Comment 8 2014-08-12 17:49:03 PDT
Comment on attachment 236483 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=236483&action=review > Source/WebCore/html/HTMLTextAreaElement.cpp:324 > + return proposedValue.left(numCharactersInGraphemeClusters(StringView(proposedValue), maxLength)); Is the explicit conversion to StringView needed? I’d expect an implicit conversion. > Source/WebCore/html/TextFieldInputType.cpp:341 > + unsigned newLength = numCharactersInGraphemeClusters(StringView(string), maxLength); Is the explicit conversion to StringView needed? I’d expect an implicit conversion. > Source/WebCore/platform/LocalizedStrings.cpp:85 > + unsigned numberOfCharacters = numCharactersInGraphemeClusters(StringView(trimmed), maxNumberOfGraphemeClustersInLookupMenuItem); Is the explicit conversion to StringView needed? I’d expect an implicit conversion.
Darin Adler
Comment 9 2014-08-12 17:49:52 PDT
Comment on attachment 236483 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=236483&action=review > Source/WebCore/ChangeLog:8 > + The previous code assumed that all "characters" are exactly 1 16-bit code unit wide. Instead, use U16_NEXT(). Oops, this comment is obsolete since it mentions U16_NEXT. > Source/WebCore/ChangeLog:12 > + This patch also modifies the signature of numCharactersInGraphemeClusters() to take a StringView instead > + of a string, which will avoid a copy. As soon as there is an implicit StringView constructor which takes a > + String, we can revert the changes to these callers. Oh, I did not see this comment. I thought there was an implicit StringView constructor.
Myles C. Maxfield
Comment 10 2014-08-12 20:50:35 PDT
Comment on attachment 236483 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=236483&action=review >> Source/WebCore/ChangeLog:8 >> + The previous code assumed that all "characters" are exactly 1 16-bit code unit wide. Instead, use U16_NEXT(). > > Oops, this comment is obsolete since it mentions U16_NEXT. Done. >> Source/WebCore/ChangeLog:12 >> + String, we can revert the changes to these callers. > > Oh, I did not see this comment. I thought there was an implicit StringView constructor. There is now! >> Source/WebCore/html/HTMLTextAreaElement.cpp:324 >> + return proposedValue.left(numCharactersInGraphemeClusters(StringView(proposedValue), maxLength)); > > Is the explicit conversion to StringView needed? I’d expect an implicit conversion. Nope! >> Source/WebCore/html/TextFieldInputType.cpp:341 >> + unsigned newLength = numCharactersInGraphemeClusters(StringView(string), maxLength); > > Is the explicit conversion to StringView needed? I’d expect an implicit conversion. Nope! >> Source/WebCore/platform/LocalizedStrings.cpp:85 >> + unsigned numberOfCharacters = numCharactersInGraphemeClusters(StringView(trimmed), maxNumberOfGraphemeClustersInLookupMenuItem); > > Is the explicit conversion to StringView needed? I’d expect an implicit conversion. Nope!
Myles C. Maxfield
Comment 11 2014-08-12 20:54:35 PDT
Note You need to log in before you can comment on or make changes to this bug.