Bug 17701

Summary: issue with charset gb2312 causes layout to be broken
Product: WebKit Reporter: jasneet <jasneet>
Component: WebKit Misc. Assignee: Nobody <webkit-unassigned>
Status: NEW ---    
Severity: Normal CC: ahmad.saleem792, ap, bfulgham, jasneet, mitz, mjhsieh, mmaxfield, webkit-bug-importer
Priority: P2 Keywords: HasReduction, InRadar
Version: 528+ (Nightly build)   
Hardware: PC   
OS: Windows XP   
URL: http://www.taobao.com/
See Also: https://bugs.webkit.org/show_bug.cgi?id=18085
Attachments:
Description Flags
screenshot none
reduction none

Description jasneet 2008-03-06 17:54:15 PST
I Steps:
Go to
http://www.taobao.com/
Click on 7th tab: "手机数码"

II Issue:
Notice that the layout is broken in the 2nd and 4th columns: the text wraps onto 3 lines instead of 2.

III Conclusion: What FF and IE do, and WebKit does not, is infer the language from the encoding (in this case gb2312) and use a font for that language when no font is specified (or when only a generic CSS family is specified).

FF and IE have lang/script-based font preferences while webkit currently does not.

In this particular case, gb2312 implies Simplified Chinese (SC), so FF and IE use an SC font unless that font does not cover a character. SC fonts usually include Latin letters, so they are used to render Latin as well as Chinese text.
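
The behavior described above for FF and IE can be sketched roughly as follows. This is only an illustration, not code from any browser: the charset table, the per-language font table, the coverage checks, and the function names (infer_lang, pick_fonts) are all hypothetical stand-ins.

```python
# Hypothetical sketch: infer a language from the page charset, use that
# language's default font, and fall back only for characters it lacks.

# Assumed mapping from charset label to language tag (illustrative subset).
CHARSET_TO_LANG = {
    "gb2312": "zh-Hans",
    "big5": "zh-Hant",
    "shift_jis": "ja",
    "euc-kr": "ko",
}

# Assumed per-language default fonts; the None entry is the global default.
LANG_DEFAULT_FONT = {
    "zh-Hans": "SimSun",
    None: "Times New Roman",
}


def is_cjk(ch):
    """Crude check: CJK Unified Ideographs block only."""
    return "\u4e00" <= ch <= "\u9fff"


# Assumed coverage: SimSun has Latin and CJK, Times New Roman has Latin only.
FONT_COVERS = {
    "SimSun": lambda ch: ch.isascii() or is_cjk(ch),
    "Times New Roman": lambda ch: ch.isascii(),
}

FALLBACK_ORDER = ["SimSun", "Times New Roman"]


def infer_lang(charset):
    """Return a language tag for a charset label, or None if unknown."""
    if not charset:
        return None
    return CHARSET_TO_LANG.get(charset.strip().lower())


def pick_fonts(text, charset, lang_aware):
    """Return the font chosen for each character when no font is specified."""
    lang = infer_lang(charset) if lang_aware else None
    primary = LANG_DEFAULT_FONT.get(lang, LANG_DEFAULT_FONT[None])
    chosen = []
    for ch in text:
        if FONT_COVERS[primary](ch):
            chosen.append(primary)
        else:
            # Per-character fallback to the first font that covers it.
            fallback = next(
                (f for f in FALLBACK_ORDER if FONT_COVERS[f](ch)), primary
            )
            chosen.append(fallback)
    return chosen
```

With lang_aware=True on a gb2312 page, both Latin and Chinese characters come from the same font. With lang_aware=False (global preferences only), Latin comes from Times New Roman and only the Chinese characters fall back to SimSun.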

In WebKit's case, there are only global font preferences. With no font specified, the default is Times New Roman, so Latin letters are rendered with it. Chinese characters are obviously not covered, so WebKit ends up using SimSun *only* for Chinese characters. Because Latin glyphs in Times New Roman are wider than those in SimSun, the layout (which relies on the precise width of the rendered text) is broken.
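
To make the width argument concrete, here is a small sketch of how slightly wider Latin glyphs can push text from 2 lines to 3. All numbers are assumptions chosen for illustration (advance widths in 1/1000 em, a 5 em column, a made-up sample string), and the wrapping is a naive per-character greedy wrap, not a real line-breaking algorithm.

```python
# Hypothetical advance widths in 1/1000 em; real font metrics vary per glyph.
ADVANCE = {
    ("SimSun", "latin"): 500,            # assumed half-width Latin glyphs
    ("SimSun", "cjk"): 1000,             # full-width CJK glyphs
    ("Times New Roman", "latin"): 600,   # assumed average, wider than SimSun
}


def script_of(ch):
    """Classify a character as 'cjk' or 'latin' (crude, for illustration)."""
    return "cjk" if "\u4e00" <= ch <= "\u9fff" else "latin"


def advance(ch, latin_font):
    """Width of one character: CJK always from SimSun, Latin from latin_font."""
    script = script_of(ch)
    font = "SimSun" if script == "cjk" else latin_font
    return ADVANCE[(font, script)]


def lines_needed(text, column_units, latin_font):
    """Count lines under naive per-character greedy wrapping."""
    lines, used = 1, 0
    for ch in text:
        w = advance(ch, latin_font)
        if used > 0 and used + w > column_units:
            lines += 1
            used = 0
        used += w
    return lines


SAMPLE = "NokiaN95手机数码相机"  # made-up text: 8 Latin + 6 CJK characters
COLUMN = 5000                    # assumed column width: 5 em

print(lines_needed(SAMPLE, COLUMN, "SimSun"))           # prints 2
print(lines_needed(SAMPLE, COLUMN, "Times New Roman"))  # prints 3
```

Under these assumed metrics the sample fits exactly in 2 lines when SimSun renders everything, but the extra 100 units per Latin glyph in the mixed case overflow the first line and cascade into a third.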

IV Other browsers:
IE7: ok
FF: ok
Opera: not ok

V Nightly tested: 30236
Comment 1 jasneet 2008-03-06 17:54:47 PST
Created attachment 19578 [details]
screenshot
Comment 2 jasneet 2008-03-06 17:56:08 PST
Created attachment 19579 [details]
reduction
Comment 3 Mengjuei Hsieh 2011-01-15 09:31:51 PST
The current taobao.com layout does not have this problem. Works for me with Safari 5.0.3 (6533.19.4) and WebKit r75772. I guess there is no way to confirm it, since archive.org doesn't have any archive of taobao.com. Suggestion: close the bug as works for me.
Comment 4 Alexey Proskuryakov 2011-01-15 12:31:30 PST
The bug description is pretty good, in fact. We should indeed strive to use the same font for Roman and Chinese characters on Chinese pages, and heuristics based on page encoding make sense.
Comment 5 mitz 2011-01-15 12:34:39 PST
I think the Chromium Windows font code might have better heuristics for this (although obviously not encoding-based) than the Apple Windows font code. Would be interesting to see what it looks like in the former.
Comment 6 Ahmad Saleem 2022-08-30 15:02:34 PDT
All browsers differ from each other in the attached reduction.

> Safari Technology Preview 152 does not show "bullets" or "list" icons.
> Chrome Canary 107 shows "list" bullets, but in the wrong places compared to Firefox.
> Firefox Nightly 106 shows the same as Chrome, but with "bullets" in the wrong places.

I am changing this to "New" and will tag a few WebKit engineers for further comment.
Comment 8 Radar WebKit Bug Importer 2022-08-30 15:12:17 PDT
<rdar://problem/99353482>