Bug 78711 - SVG <tspan> with Hebrew text incorrectly shifted
Alias: None
Product: WebKit
Classification: Unclassified
Component: SVG
Version: 528+ (Nightly build)
Hardware: PC Windows 7
Importance: P2 Normal
Assignee: Nobody
Depends on:
Reported: 2012-02-15 06:42 PST by Michael O'Rourke
Modified: 2023-01-16 07:11 PST

Example file and screenshot (75.52 KB, application/octet-stream)
2012-02-15 06:43 PST, Michael O'Rourke

Description Michael O'Rourke 2012-02-15 06:42:48 PST
Hebrew text is disobeying the <tspan> breaks and merging content between separate <tspan> elements that have different y offsets.

Open the .svg file in the attached zip file. Note that the first line of the two-line text block renders differently, even though the first <tspan> in both text blocks is identical. For some reason, WebKit is taking the first character from the second <tspan> and placing it on the first line, even though the SVG content clearly does not ask for this.

Per the screenshot in the zip file, this content works perfectly fine in Firefox. It is broken in Safari, Chrome, and IE9.
Comment 1 Michael O'Rourke 2012-02-15 06:43:29 PST
Created attachment 127176
Example file and screenshot
Comment 2 Nikolas Zimmermann 2012-02-15 07:24:49 PST
This is the problematic snippet:
<tspan y="40.3" x="0">ב־Unicode. הירשמו כעת לכנס Unicode</tspan>

I'm not yet sure what the problem with this file is, or whether this is actually a bug. Here's some context:
The first logical character of this text has an absolute position (x=0, y=40.3) associated with it.
Let's examine a simple non-BiDi example first:
<text x="100 200 300">ABC</text>

Here are some important definitions from SVG 1.1 2nd edition:
<quote http://www.w3.org/TR/SVG/text.html#TextLayoutIntroduction>
Adjustments to the current text position are either absolute position adjustments or relative position adjustments. An absolute position adjustment occurs in the following circumstances:

- At the start of a ‘text’ element
- At the start of each ‘textPath’ element
- For each character within a ‘text’, ‘tspan’, ‘tref’ and ‘altGlyph’ element which has an ‘x’ or ‘y’ attribute value assigned to it explicitly (IMPORTANT!)

All other position adjustments to the current text position are relative position adjustments.

Each absolute position adjustment defines a new text chunk. Absolute position adjustments impact text layout in the following ways:

- Ligatures only occur when a set of characters which might map to a ligature are all in the same text chunk.
- Each text chunk represents a separate block of text for alignment due to ‘text-anchor’ property values.
- Reordering of characters due to bidirectionality only occurs within a text chunk. Reordering does not happen across text chunks. (IMPORTANT!)
</quote>

<text x="100 200 300">ABC</text> leads to three text chunks, as A, B and C all carry absolute positions.
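
The chunking rule quoted above can be sketched in a few lines of Python. This is only an illustration of the spec text, not WebKit code: the function name split_text_chunks and the idea of passing the absolutely positioned character indices as a set are assumptions made for the example.

```python
def split_text_chunks(text, abs_indices):
    """Split text into text chunks.

    abs_indices is the set of character indices that carry an explicit
    x or y value. Index 0 always starts a chunk, because the start of
    a 'text' element is itself an absolute position adjustment.
    """
    starts = sorted(set(abs_indices) | {0})
    chunks = []
    for i, start in enumerate(starts):
        # A chunk runs until the next absolute position (or end of text).
        end = starts[i + 1] if i + 1 < len(starts) else len(text)
        if start < len(text):
            chunks.append(text[start:end])
    return chunks


# x="100 200 300" gives every character its own absolute position.
print(split_text_chunks("ABC", {0, 1, 2}))  # ['A', 'B', 'C']
# x="100" positions only the first character.
print(split_text_chunks("ABC", {0}))        # ['ABC']
```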

Now what happens when we have BiDi text? (ABC = Latin, abc = Hebrew)
<text x="100 200 300">ABC</text> -> A at 100, B at 200, C at 300
<text x="100 200 300">abc</text> -> c at 100, b at 200, a at 300 (reordered!)

Now consider:
<text x="100">abc</text> -> c at 100, b at 110, a at 120 (say each glyph is 10px wide)
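
As a rough sketch of the single-chunk case, the positions can be computed as below. The function name layout_rtl_chunk is hypothetical, and the model assumes that every character in the chunk is right-to-left and that every glyph has the same fixed advance (10px here), which real fonts will not satisfy.

```python
def layout_rtl_chunk(chunk, x_start, advance=10):
    """Return (character, x) pairs in visual order for an all-RTL chunk.

    The chunk is reversed as a whole, then glyphs are placed left to
    right starting at x_start, each one 'advance' pixels after the last.
    """
    visual = chunk[::-1]
    return [(ch, x_start + i * advance) for i, ch in enumerate(visual)]


print(layout_rtl_chunk("abc", 100))
# [('c', 100), ('b', 110), ('a', 120)]
```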

The x/y/dx/dy/rotate lists are reordered as well to maintain consistency; this is required by SVG.
Now consider:
<text x="0">defGHI</text> -> This yields just one text chunk, but it spans multiple InlineBoxes:
'def' is one box and 'GHI' is another. Applying the BiDi rules, this would visually render as:
fedGHI - given the default text-direction is left-to-right.
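
A toy model of this reordering, using the same convention as above (lowercase stands for Hebrew, uppercase for Latin), might look like the following. It is not the Unicode Bidirectional Algorithm: it only reverses each right-to-left run inside a left-to-right paragraph, and the name reorder_ltr_paragraph is made up for the example.

```python
from itertools import groupby


def reorder_ltr_paragraph(text):
    """Reverse every lowercase (stand-in for RTL) run; keep run order."""
    out = []
    for is_rtl, run in groupby(text, key=str.islower):
        run = "".join(run)
        out.append(run[::-1] if is_rtl else run)
    return "".join(out)


print(reorder_ltr_paragraph("defGHI"))  # fedGHI
```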

Now let's just use: <text x="0">de<tspan y="20">f</tspan>GHI</text>. This leads to three text chunks: one for 'de', one for 'f', and one for 'GHI'.

This would render as:
y=0:    "ed"
y=20: "fGHI"

The inner tspan now defines an absolute position for the 'f'. As SVG doesn't allow BiDi reordering to span multiple text chunks, only the 'de' part is reversed; the 'f' is left alone.
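
The effect of the chunk boundary can be illustrated with the same toy model (lowercase = Hebrew, uppercase = Latin, reversal of RTL runs only). The chunk list is written out by hand to match the description above, and the names reorder_chunk and render_lines are hypothetical. Each chunk is reordered on its own and then appended to the line for its y value.

```python
from itertools import groupby


def reorder_chunk(text):
    """Reverse every lowercase (stand-in for RTL) run within one chunk."""
    return "".join(
        "".join(run)[::-1] if is_rtl else "".join(run)
        for is_rtl, run in groupby(text, key=str.islower)
    )


def render_lines(chunks):
    """chunks is a list of (y, text); reordering never crosses a chunk."""
    lines = {}
    for y, text in chunks:
        lines[y] = lines.get(y, "") + reorder_chunk(text)
    return lines


# With the inner tspan: 'de' stays at y=0, 'f' and 'GHI' land at y=20.
print(render_lines([(0, "de"), (20, "f"), (20, "GHI")]))
# {0: 'ed', 20: 'fGHI'}

# Without the tspan there is a single chunk, so 'def' reverses as a unit.
print(render_lines([(0, "defGHI")]))
# {0: 'fedGHI'}
```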