NEW 210502
[GTK] TextNode::splitText() can lose content visually
https://bugs.webkit.org/show_bug.cgi?id=210502
Milan Crha
Reported 2020-04-14 09:42:09 PDT
Created attachment 396429
How it looks like in Firefox

I just noticed that calling splitText() in the middle of a character made of multiple UTF-16 code units (a surrogate pair) causes content to be lost visually on both sides of the split. This is with trunk at r259630.

Steps:
a) run: MiniBrowser --editor-mode
b) open the Inspector and in its console run:
   document.body.innerText = "😏😉🙂"
c) still in the Inspector run:
   document.body.firstChild.splitText(2)
   * all is fine, the Elements tab shows the text properly split into one and two Emojis
d) still in the Inspector run:
   document.body.firstChild.nextSibling.splitText(1)

The outcome after d) is three text nodes in the body: the first shows the first Emoji, the second shows as empty text, and the third shows what are probably two letters, although they look like whitespace:

   document.body.firstChild.nextSibling.nodeValue.length
   1
   document.body.firstChild.nextSibling.nodeValue.charCodeAt(0)
   55357
   document.body.firstChild.nextSibling.nextSibling.nodeValue.length
   3
   document.body.firstChild.nextSibling.nextSibling.nodeValue.charCodeAt(0)
   56841
   document.body.firstChild.nextSibling.nextSibling.nodeValue.charCodeAt(1)
   55357
   document.body.firstChild.nextSibling.nextSibling.nodeValue.charCodeAt(2)
   56898

I do not know what to expect from this, but being able to break "a letter" in the middle and have it lost completely, together with the next letter, is not ideal.

Notes:
- Calling document.body.normalize() fixes the situation; the document is back in the state it was in after step b).
- It seems splitText() itself is correct (see the values above), but the visual interpretation is broken (at least the second Emoji could be visible instead of looking like whitespace).

I tried with Firefox (67.0) and it behaves similarly (also two characters per Emoji), but there the splitText() call has no impact on the visual interpretation in the document body. It does have an impact on the interpretation in the Inspector (the Inspector shows letters it cannot visualize as rectangles with the hex code).
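The values reported above can be reproduced on a plain string, without a DOM, because splitText() offsets and String.prototype.length both count UTF-16 code units. The following is only a sketch of that arithmetic; the helper snapToCodePoint() is hypothetical (it is not a DOM or WebKit API) and merely illustrates how a caller could avoid splitting inside a surrogate pair.

```javascript
// Text.splitText(offset) counts UTF-16 code units, the same units that
// String.prototype.length and charCodeAt() use.
const text = "😏😉🙂";

// Hypothetical helper (not a DOM or WebKit API): if `offset` falls between
// a high and a low surrogate, step back one unit so the pair stays together.
function snapToCodePoint(str, offset) {
  if (offset <= 0 || offset >= str.length) return offset;
  const prev = str.charCodeAt(offset - 1);
  const next = str.charCodeAt(offset);
  const isHigh = prev >= 0xd800 && prev <= 0xdbff;
  const isLow = next >= 0xdc00 && next <= 0xdfff;
  return isHigh && isLow ? offset - 1 : offset;
}

// Mirror steps c) and d) with slice() instead of splitText().
const first = text.slice(0, 2);   // "😏"
const rest = text.slice(2);       // "😉🙂"
const second = rest.slice(0, 1);  // lone high surrogate
const third = rest.slice(1);      // lone low surrogate followed by "🙂"

console.log(text.length);              // 6
console.log(second.charCodeAt(0));     // 55357 (0xD83D)
console.log(third.charCodeAt(0));      // 56841 (0xDE09)
console.log(snapToCodePoint(rest, 1)); // 0
```

With such a helper, a caller would pass snapToCodePoint(node.nodeValue, offset) to splitText() instead of the raw offset. That only works around the problem on the caller's side; it does not address the rendering difference described in this report.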
-------------------------------------------
Side notes: Are there any other sequences that use characters taking more than one UTF-16 code unit, for example in some Chinese variants? The fact that an Emoji occupies two characters is also impractical for line length calculations, even though it is drawn as a single character. I know of "composite" Emojis, which are an even bigger nightmare on many fronts.
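On the side notes: any character outside the Basic Multilingual Plane takes two UTF-16 code units, which includes CJK Extension B ideographs such as U+20000 as well as Emojis. For line length calculations, the sketch below counts the same string in code units, code points and, where the engine provides Intl.Segmenter, grapheme clusters. The function name measure() is made up for this example, and the composite Emoji used is just one sample sequence joined with U+200D (zero width joiner).

```javascript
// Count the same string in three different units.
function measure(str) {
  const codeUnits = str.length;        // UTF-16 code units
  const codePoints = [...str].length;  // string iteration goes by code point
  let graphemes = null;                // stays null if Intl.Segmenter is missing
  if (typeof Intl !== "undefined" && typeof Intl.Segmenter === "function") {
    const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
    graphemes = [...segmenter.segment(str)].length;
  }
  return { codeUnits, codePoints, graphemes };
}

const simple = measure("😏😉🙂");
console.log(simple.codeUnits, simple.codePoints);  // 6 3

// A "composite" Emoji: three Emojis joined by two zero width joiners.
const family = measure("\u{1F468}\u200D\u{1F469}\u200D\u{1F467}");
console.log(family.codeUnits, family.codePoints);  // 8 5
console.log(family.graphemes);
```

Counting code points is enough for simple Emojis, but a composite Emoji is still several code points, so only grapheme segmentation comes close to what is drawn as a single character.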
Attachments
How it looks like in Firefox (8.67 KB, image/png)
2020-04-14 09:42 PDT, Milan Crha
no flags