Bug 30015

Summary: text-transform:capitalize is failing in CSS2.1 test suite
Product: WebKit Reporter: Shinichiro Hamaji <hamaji>
Component: CSSAssignee: Nobody <webkit-unassigned>
Status: NEW    
Severity: Normal CC: adamk, adele, ap, bdakin, darin, eric, hyatt, ian, mitz
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: OS X 10.5   
Attachments:
Description Flags
Patch v0 mjs: review-

Shinichiro Hamaji
Reported 2009-10-02 04:57:20 PDT
(Though this is duplicate of Bug 23267, I'd like to open separate bug for this test case as the original bug has actually 3 bugs.) Let me explain why t1605-c545-txttrans-00-b-ag (http://www.w3.org/Style/CSS/Test/CSS2.1/current/html4/t1605-c545-txttrans-00-b-ag.htm) is failing. This test is exercising CSS' text-transform property. This test is failing for text-transform:capitalize. For the following texts, Xx xx x. (x.x. XX) Pp pp p. (p.p. PP) Éé éé é. (é.é. ÉÉ) the test suite is expecting Xx Xx X. (x.x. XX) Pp Pp P. (p.p. PP) Éé Éé É. (é.é. ÉÉ) but WebKit is capitalizing lower characters in parens like Xx Xx X. (X.X. XX) Pp Pp P. (P.P. PP) Éé Éé É. (É.É. ÉÉ) For example, this behavior converts "e.g.," into "E.G.,", which may look weird. You can see this example in another CSS test suite, LayoutTests/css1/text_properties/text_transform.html (http://www.w3.org/Style/CSS/Test/CSS1/current/sec545.htm). The CSS1 test suite is saying "There Should Be A Capital Letter After A Non-Breaking Space (&Nbsp;).". I guess it may mean a character should be capitalized if and only if it comes after whitespaces. The CSS2.1 test suite is expecting this behavior as well. Now, our implementation of capitalization uses ICU's word breaking, which is kind of "smart". For example, it converts from "hi" to "Hi", from foo's to Foo's, from layout-tests to Layout-Tests, and from e.g., to E.G., . If we use the approach the CSS test suite may be expecting (capitalize characters iff. it comes after whitespaces), the results will be "hi", Foo's, Layout-tests, and E.g., specifically. I'm not sure what is the best behavior. It seems that we intentionally chose the current behavior in Bug 4171. If we decide not to fix this, please just make this bug WONTFIX. Otherwise, my patch will fix this. FYI, I summarized results for some texts using 4 browsers. http://spreadsheets.google.com/pub?key=tGCsFWV1bwyYanB9y4Ri9IQ&output=html Bug 3406 is also related to this bug.
Attachments
Patch v0 (16.53 KB, patch)
2009-10-02 04:58 PDT, Shinichiro Hamaji
mjs: review-
Shinichiro Hamaji
Comment 1 2009-10-02 04:58:02 PDT
Created attachment 40511 [details] Patch v0
Shinichiro Hamaji
Comment 2 2009-10-02 04:58:52 PDT
(In reply to comment #1) > Created an attachment (id=40511) [details] > Patch v0 If we decide to make the CSS2.1 test suite pass, I'll refine this patch a bit.
Darin Adler
Comment 3 2009-10-02 09:51:14 PDT
Comment on attachment 40511 [details] Patch v0 Is this approach really correct? Shouldn't this be fixed inside the text break function instead?
Shinichiro Hamaji
Comment 4 2009-10-02 10:17:49 PDT
(In reply to comment #3) > (From update of attachment 40511 [details]) > Is this approach really correct? Shouldn't this be fixed inside the text break > function instead? It seems that the word break iterator is used by selection in editing, too. I guess changing word break iterator causes regressions for selection. For example, if you double click 'o' in "(foo bar)", current implementation selects "foo". If we change wordBreakIterator, "(foo" would be selected. But, it's just my guess. I didn't confirm if my guess is true. I'll check it. Or, I'm happy if someone who knows editing well gives comments.
Darin Adler
Comment 5 2009-10-02 11:12:15 PDT
I think we really need some input from CSS experts. We have here the ICU library’s concept of word breaks, and then the CSS concept of what words are capitalization-wise. We also are not considering the language. It’s challenging to change this without being sure what the desired behavior is. I suppose the CSS test suite makes it clear we’re wrong to be using the ICU concept?
Shinichiro Hamaji
Comment 6 2009-10-06 05:12:30 PDT
Yeah, I agree. I want advices from CSS experts. Thanks Darin for adding dhyatt into the CC lists. By the way, it seems that a microsoft person is saying that the result should be "(É.é. ÉÉ)" so IE and Firefox are OK because of the word breaking rule in Unicode. I'm not sure Unicode's rule should be applied to this style though. http://lists.w3.org/Archives/Public/public-css-testsuite/2009Feb/0040.html http://www.unicode.org/reports/tr29/tr29-13.html#Word_Boundaries
Eric Seidel (no email)
Comment 7 2009-11-26 21:18:51 PST
Any thoughts Ian or Hixie?
Maciej Stachowiak
Comment 8 2009-12-28 01:24:30 PST
Comment on attachment 40511 [details] Patch v0 I don't think the test suite behavior is either correct or required by the spec. The test's expectation is just wrong. If we match any other browser, it should be Firefox. I don't think the Opera behavior gives good results. r- to either reconsider this or try a different approach. Input from experts still welcome.
Shinichiro Hamaji
Comment 9 2010-01-20 01:56:33 PST
(In reply to comment #8) > (From update of attachment 40511 [details]) > I don't think the test suite behavior is either correct or required by the > spec. The test's expectation is just wrong. If we match any other browser, it > should be Firefox. I don't think the Opera behavior gives good results. r- to > either reconsider this or try a different approach. Input from experts still > welcome. Thanks for the comment and sorry for the latency. I played with Firefox and found the logic is not as simple as I expected. For example, no capitalization occurs for "あfoo" but "、foo" becomes "、Foo". I think I have too few knowledges around this to fix this issue correctly. I give up to fix this issue for now :(
Adam Klein
Comment 10 2012-03-02 15:44:10 PST
Rebaselined a bunch of Chromium results, including t1605-c545-txttrans-00-b-ag, in http://trac.webkit.org/changeset/109625. This may be a "bad" baseline, but decided it was better to have it checked in than to keep this test marked as "failing" for years.
Note You need to log in before you can comment on or make changes to this bug.