11448 – &lang; and &rang; entities are mapped to the incorrect Unicode codepoint

RESOLVED FIXED 11448

https://bugs.webkit.org/show_bug.cgi?id=11448

Mark Rowe (bdash)

Reported 2006-10-29 03:56:49 PST

According to the DTDs for HTML 4 (http://www.w3.org/TR/html4/HTMLsymbol.ent) and XHTML 1 (http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent) the lang and rang entities should correspond to U+3008 and U+3009 respectively. It states:  Currently lang and rang incorrectly end up as U+2039 and U+203A.

Attachments
proposed patch (383.66 KB, patch), 2006-11-03 13:57 PST, Alexey Proskuryakov, mjs: review+

Alexey Proskuryakov

Comment 1 2006-10-29 04:07:33 PST

Actually, the correspondence is a bit different: lang should be U+2329 according to the DTDs, but that character is deprecated in Unicode. Its canonical form is U+3008.

<!ENTITY lang "〈">

According to <http://www.w3.org/TR/charmod-norm/>, text on the Web should be in canonical precomposed form. We currently do this canonicalization for XHTML, but not for HTML.
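The canonical equivalence mentioned in this comment can be checked with a short sketch. It uses Python's standard `unicodedata` module rather than anything from WebKit; the constant names and the `canonical` helper are made up for illustration.

```python
import unicodedata

# Code points discussed in this bug (names are illustrative only).
DTD_LANG, DTD_RANG = "\u2329", "\u232A"      # values named by the HTML 4 / XHTML 1 DTDs
CANON_LANG, CANON_RANG = "\u3008", "\u3009"  # their canonical equivalents


def canonical(ch: str) -> str:
    """Return the NFC (canonical precomposed) form of a single character."""
    return unicodedata.normalize("NFC", ch)


lang_nfc = canonical(DTD_LANG)
rang_nfc = canonical(DTD_RANG)

# Show the code points that normalization produces.
print(f"lang: U+{ord(DTD_LANG):04X} -> U+{ord(lang_nfc):04X}")
print(f"rang: U+{ord(DTD_RANG):04X} -> U+{ord(rang_nfc):04X}")
```

Normalizing the U+3008/U+3009 pair leaves it unchanged, which is why a parser that canonicalizes its output never emits U+2329 or U+232A.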

Mark Rowe (bdash)

Comment 2 2006-10-29 04:14:37 PST

I stuffed up the initial description of this bug completely. WebKit's current behaviour is to map "lang" to U+2329 for HTML but to U+3008 in XHTML. It maps "rang" to U+232A for HTML but to U+3009 for XHTML. The behaviour as defined in the HTML and XHTML DTDs is to map "lang" to U+2329 and "rang" to U+232A. Our behaviour is therefore technically incorrect for XHTML, but as Alexey mentions U+2329 and U+232A are deprecated. This means that U+3008/U+3009 are arguably "more right". . . or something.
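The behaviour described in this comment can be laid out as a small table. The sketch below is not WebKit code: the `DTD` and `REPORTED` dictionaries simply transcribe the code points quoted above, and `describe` is a hypothetical helper that reports, for each entity, whether the value matches the DTD and whether it is already in NFC.

```python
import unicodedata

# Values the HTML 4 and XHTML 1 DTDs assign, as stated in the comment above.
DTD = {"lang": 0x2329, "rang": 0x232A}

# WebKit's behaviour before the fix, as reported in the comment above.
REPORTED = {
    "HTML": {"lang": 0x2329, "rang": 0x232A},
    "XHTML": {"lang": 0x3008, "rang": 0x3009},
}


def describe(mode: str) -> list:
    """For each entity, return (name, code point, matches DTD?, already NFC?)."""
    rows = []
    for name, dtd_cp in DTD.items():
        actual = REPORTED[mode][name]
        ch = chr(actual)
        matches_dtd = actual == dtd_cp
        is_nfc = unicodedata.normalize("NFC", ch) == ch
        rows.append((name, f"U+{actual:04X}", matches_dtd, is_nfc))
    return rows


for mode in REPORTED:
    for row in describe(mode):
        print(mode, *row)
```

Under these assumptions the HTML rows match the DTD but are not normalized, while the XHTML rows are normalized but differ from the DTD, which is the tension the comment describes.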

Alexey Proskuryakov

Comment 3 2006-11-03 13:57:36 PST

Created attachment 11368: proposed patch

OK, I'm not really convinced myself, but we should either go this way or make XHTML work as HTML for these entities...

Maciej Stachowiak

Comment 4 2006-11-03 16:04:43 PST

Comment on attachment 11368: proposed patch

I'm convinced. r=me

Does this affect any other test results?

Alexey Proskuryakov

Comment 5 2006-11-04 00:03:21 PST

Committed revision 17591. No other tests were affected.

Ian 'Hixie' Hickson

Comment 6 2007-06-14 16:22:57 PDT

Fixed this in the HTML5 spec too.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 420+

Hardware Mac

OS OS X 10.4

Product WebKit

Component WebCore Misc.

Assignee

Nobody

Reported

2006-10-29 03:56 PST

Modified

2007-06-14 16:22 PDT
