Bug 93424

Summary: Expand list of supported languages for RenderQuote to match WHATWG spec
Product: WebKit Reporter: Elliott Sprehn <esprehn>
Component: Layout and RenderingAssignee: Elliott Sprehn <esprehn>
Status: RESOLVED FIXED    
Severity: Normal CC: dglazkov, eric, jchaffraix, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on:    
Bug Blocks: 92061    
Attachments:
Description Flags
Patch
none
Archive of layout-test-results from gce-cr-linux-06
none
Patch
none
Archive of layout-test-results from gce-cr-linux-02
none
Patch none

Description Elliott Sprehn 2012-08-07 19:05:59 PDT
IE does language specific quote selection. We should also pick the right quotes based on the language.
Comment 1 Elliott Sprehn 2012-08-07 19:15:58 PDT
Created attachment 157085 [details]
Patch

Expand the table of quotes for many languages. This table came from WKbug 3234.
Comment 2 Eric Seidel (no email) 2012-08-07 19:18:45 PDT
Comment on attachment 157085 [details]
Patch

There is totally behavior here... can't we test this using <q lang="foo">?
Comment 3 Elliott Sprehn 2012-08-07 19:21:12 PDT
This builds the big hash map on first access which means constructing and inserting 39 QuotesData's into a HashMap.

I'm not sure how "slow" this is considered to be? The old implementation of this had an array of structs it binary searched on first access for a language, and then if it found a match it created a QuotesData and put it into the map for faster access later. Do we need that optimization?

Also it turns out that IE is the only browser that does language based quote selection automatically. Hyatt wanted this originally in WKBug 3234, but I wonder if it's needed.

If we decide developers should instead specify the quotes they want with :lang and quotes in CSS we could remove the HashMap entirely, and the 2 extra ::create methods on QuotesData.

Should we support this feature to emulate IE? If so how important is the perf hit on first access?
Comment 4 Build Bot 2012-08-07 19:42:25 PDT
Comment on attachment 157085 [details]
Patch

Attachment 157085 [details] did not pass win-ews (win):
Output: http://queues.webkit.org/results/13459050
Comment 5 Elliott Sprehn 2012-08-07 20:00:22 PDT
(In reply to comment #2)
> (From update of attachment 157085 [details])
> There is totally behavior here... can't we test this using <q lang="foo">?

Yeah, this needs a test.
Comment 6 WebKit Review Bot 2012-08-07 20:03:09 PDT
Comment on attachment 157085 [details]
Patch

Attachment 157085 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/13454355

New failing tests:
fast/css-generated-content/quotes-lang.html
Comment 7 WebKit Review Bot 2012-08-07 20:03:12 PDT
Created attachment 157092 [details]
Archive of layout-test-results from gce-cr-linux-06

The attached test failures were seen while running run-webkit-tests on the chromium-ews.
Bot: gce-cr-linux-06  Port: <class 'webkitpy.common.config.ports.ChromiumXVFBPort'>  Platform: Linux-2.6.39-gcg-201203291735-x86_64-with-Ubuntu-10.04-lucid
Comment 8 Elliott Sprehn 2012-08-07 20:26:29 PDT
Created attachment 157095 [details]
Patch

Add a test and a hack for multi char strings.
Comment 9 Eric Seidel (no email) 2012-08-07 20:43:14 PDT
Comment on attachment 157095 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=157095&action=review

> LayoutTests/fast/css-generated-content/quotes-auto-lang-expected.html:7
> +    q:lang(af) { quotes: '\201E' '\201D' '\201A' '\2019' }

I think it might be a better test inline:
<!-- afrikaans -->
<span>&#201E;&#201D;&#201A;&#2019;</span>

But I could also be convinced that it's OK as is.
Comment 10 Build Bot 2012-08-07 20:43:50 PDT
Comment on attachment 157095 [details]
Patch

Attachment 157095 [details] did not pass win-ews (win):
Output: http://queues.webkit.org/results/13456323
Comment 11 Eric Seidel (no email) 2012-08-07 20:44:17 PDT
Comment on attachment 157095 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=157095&action=review

>> LayoutTests/fast/css-generated-content/quotes-auto-lang-expected.html:7
>> +    q:lang(af) { quotes: '\201E' '\201D' '\201A' '\2019' }
> 
> I think it might be a better test inline:
> <!-- afrikaans -->
> <span>&#201E;&#201D;&#201A;&#2019;</span>
> 
> But I could also be convinced that it's OK as is.

Actually, I realize that my example makes no sense.  Maybe this is closer to reality?

<span>&#201E;&#201D;af&#201A;&#2019;</span>
Comment 12 WebKit Review Bot 2012-08-07 22:09:36 PDT
Comment on attachment 157095 [details]
Patch

Attachment 157095 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/13455367

New failing tests:
fast/css-generated-content/quotes-lang.html
Comment 13 WebKit Review Bot 2012-08-07 22:09:39 PDT
Created attachment 157111 [details]
Archive of layout-test-results from gce-cr-linux-02

The attached test failures were seen while running run-webkit-tests on the chromium-ews.
Bot: gce-cr-linux-02  Port: <class 'webkitpy.common.config.ports.ChromiumXVFBPort'>  Platform: Linux-2.6.39-gcg-201203291735-x86_64-with-Ubuntu-10.04-lucid
Comment 14 Alexey Proskuryakov 2012-08-08 11:00:22 PDT
Comment on attachment 157095 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=157095&action=review

> Source/WebCore/ChangeLog:3
> +        <q> should pick the right kind of quote based on the language

I'm confused by the title. Isn't the patch just expanding the list of supported languages?

> Source/WebCore/ChangeLog:8
> +        Support lots of different language specific quotes automatically. This matches IE behavior.

Does this mean that the list of supported languages matches IE 100%?

> Source/WebCore/rendering/RenderQuote.cpp:145
> +    // Table of quotes from https://bug-3234-attachments.webkit.org/attachment.cgi?id=2135

I'm somewhat unsure whether people working on this code in the future need to know that.

Besides, the best data on quotation marks is available at <http://unicode.org/repos/cldr-tmp/trunk/diff/by_type/misc.delimiters.html>. Perhaps it's also exposed by ICU - I couldn't find it quickly, but sounds like it should be.

>>> LayoutTests/fast/css-generated-content/quotes-auto-lang-expected.html:7
>>> +    q:lang(af) { quotes: '\201E' '\201D' '\201A' '\2019' }
>> 
>> I think it might be a better test inline:
>> <!-- afrikaans -->
>> <span>&#201E;&#201D;&#201A;&#2019;</span>
>> 
>> But I could also be convinced that it's OK as is.
> 
> Actually, I realize that my example makes no sense.  Maybe this is closer to reality?
> 
> <span>&#201E;&#201D;af&#201A;&#2019;</span>

I agree with Eric. The expectation needs to be simpler, or a browser that doesn't honor q:lang() would erroneously pass.

> LayoutTests/fast/css-generated-content/quotes-auto-lang.html:5
> +<q><q>default</q></q>

This looks like it's bound to fail when users's default language is not English, or when the platform takes default quotes from the OS, which is not unthinkable, even though not implemented here.
Comment 15 Elliott Sprehn 2012-08-08 15:59:39 PDT
(In reply to comment #14)
> (From update of attachment 157095 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=157095&action=review
>...
>
> > Source/WebCore/ChangeLog:8
> > +        Support lots of different language specific quotes automatically. This matches IE behavior.
> 
> Does this mean that the list of supported languages matches IE 100%?

No, they don't publish that and we'd have to sit and compare which seems tedious.

> 
> > Source/WebCore/rendering/RenderQuote.cpp:145
> > +    // Table of quotes from https://bug-3234-attachments.webkit.org/attachment.cgi?id=2135
> 
> I'm somewhat unsure whether people working on this code in the future need to know that.
> 
> Besides, the best data on quotation marks is available at <http://unicode.org/repos/cldr-tmp/trunk/diff/by_type/misc.delimiters.html>. Perhaps it's also exposed by ICU - I couldn't find it quickly, but sounds like it should be.

I can't find this in ICU, but it looks like we should just use:
http://www.whatwg.org/specs/web-apps/current-work/multipage/rendering.html#quotes

> ... 
> > <span>&#201E;&#201D;af&#201A;&#2019;</span>
> 
> I agree with Eric. The expectation needs to be simpler, or a browser that doesn't honor q:lang() would erroneously pass.

That's a lot of work to maintain this test since you need to transform that css file...

> > LayoutTests/fast/css-generated-content/quotes-auto-lang.html:5
> > +<q><q>default</q></q>
> 
> This looks like it's bound to fail when users's default language is not English, or when the platform takes default quotes from the OS, which is not unthinkable, even though not implemented here.

That'll fail tons of layout tests that use <q> elements, so this should be fine. LayoutTests already assume " and ' are the default quotes.
Comment 16 Elliott Sprehn 2012-08-13 15:47:01 PDT
Created attachment 158136 [details]
Patch

Use the table from the WHATWG spec, and make the tests fail if you dont implement CSS quotes at all.
Comment 17 Eric Seidel (no email) 2012-08-13 15:59:26 PDT
Comment on attachment 158136 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=158136&action=review

LGTM.

> Source/WebCore/rendering/RenderQuote.cpp:94
> +    QUOTES_LANG("en",            "\x201c", "\x201d", "\x2018", "\x2019");

Do you want to share this with basicQuotesData?
Comment 18 Elliott Sprehn 2012-08-13 16:05:05 PDT
(In reply to comment #17)
> (From update of attachment 158136 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=158136&action=review
> 
> LGTM.
> 
> > Source/WebCore/rendering/RenderQuote.cpp:94
> > +    QUOTES_LANG("en",            "\x201c", "\x201d", "\x2018", "\x2019");
> 
> Do you want to share this with basicQuotesData?

You couldn't use it directly since it does staticQuotesMap.set(...). I suppose we could do: staticQuotesMap.set("en", QUOTES_DATA("\x201c", "\x201d", "\x2018", "\x2019"))

But there was something kind of nice about hiding the boilerplate of the map in the table def here.
Comment 19 WebKit Review Bot 2012-08-13 17:38:35 PDT
Comment on attachment 158136 [details]
Patch

Clearing flags on attachment: 158136

Committed r125476: <http://trac.webkit.org/changeset/125476>
Comment 20 WebKit Review Bot 2012-08-13 17:38:40 PDT
All reviewed patches have been landed.  Closing bug.
Comment 21 Alexey Proskuryakov 2012-08-14 09:47:53 PDT
> > This looks like it's bound to fail when users's default language is not English, or when the platform takes default quotes from the OS, which is not unthinkable, even though not implemented here.

> That'll fail tons of layout tests that use <q> elements, so this should be fine. LayoutTests already assume " and ' are the default quotes.

I disagree. All those tests will fail once we start using system language for -webkit-locale, which we should.

Looks like we didn't have any bugs filed for that, so I filed bug 93985. See also bug 18085 and bug 76892.