Bug 158188

Summary: [INTL] Implement the caseFirst option for Intl.Collator
Product: WebKit Reporter: Sukolsak Sakshuwong <sukolsak>
Component: JavaScriptCoreAssignee: Andy VanWagoner <andy>
Status: RESOLVED FIXED    
Severity: Normal CC: andy, buildbot, commit-queue, ggaren, jfbastien, keith_miller, mark.lam, msaboff, rniwa, saam
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Archive of layout-test-results from ews103 for mac-yosemite
none
Patch
none
Patch none

Description Sukolsak Sakshuwong 2016-05-29 01:33:21 PDT
The caseFirst option lets users specify whether upper case or lower case should sort first when using Intl.Collator.prototype.compare. For example,

    new Intl.Collator('en', {caseFirst: 'upper'}).compare('a', 'A'); // returns 1
    new Intl.Collator('en', {caseFirst: 'lower'}).compare('a', 'A'); // returns -1

According to the ECMAScript 2015 Internationalization API Specification (ECMA-402 2nd edition), implementations are not required to support caseFirst. But it would be a nice feature to have. Currently, only Chrome seems to supports this.

http://www.ecma-international.org/ecma-402/2.0/
Comment 1 Sukolsak Sakshuwong 2016-05-29 01:38:31 PDT
Created attachment 280056 [details]
Patch
Comment 2 Build Bot 2016-05-29 02:14:44 PDT
Comment on attachment 280056 [details]
Patch

Attachment 280056 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/1400967

Number of test failures exceeded the failure limit.
Comment 3 Build Bot 2016-05-29 02:14:47 PDT
Created attachment 280057 [details]
Archive of layout-test-results from ews103 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews103  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 4 Sukolsak Sakshuwong 2016-05-30 08:58:55 PDT
Created attachment 280100 [details]
Patch

Resubmit
Comment 5 Darin Adler 2016-05-30 20:52:53 PDT
Comment on attachment 280100 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=280100&action=review

> Source/JavaScriptCore/runtime/IntlCollator.cpp:289
>      m_numeric = (result.get(ASCIILiteral("kn")) == "true");

I’d like how this reads better without the parentheses.

> Source/JavaScriptCore/runtime/IntlCollator.cpp:290
> +    const String& caseFirst = result.get(ASCIILiteral("kf"));

As with the cases below, I think it’s nicer to have the translation from string to CaseFirst be in a helper function. It can be inlined if we like.

> Source/JavaScriptCore/runtime/IntlCollator.cpp:296
> +        ASSERT(caseFirst == "false");

Why can we assert this? What prevents invalid values?

> Source/JavaScriptCore/runtime/IntlCollator.cpp:389
> +    UColAttributeValue caseFirst;
> +    switch (m_caseFirst) {
> +    case CaseFirst::Upper:
> +        caseFirst = UCOL_UPPER_FIRST;
> +        break;
> +    case CaseFirst::Lower:
> +        caseFirst = UCOL_LOWER_FIRST;
> +        break;
> +    case CaseFirst::False:
> +        caseFirst = UCOL_OFF;
> +        break;
> +    default:
> +        ASSERT_NOT_REACHED();
> +    }

It’s a better idiom to put this kind of translation into a helper function that takes a CaseFirst and returns a UColAttributeValue. Then we can use return inside the switch cases, so we can put the ASSERT_NOT_REACHED outside the switch instead of in a default case, and the compiler will give us a warning if we ever leave a case out. It also makes the main code flow easier to read and helps us spot if we made a mistake.

This would be similar to the caseFirstString function. It can be inlined if we like.

Only trick might be using the UColAttributeValue type in the class header without including ICU headers.

> Source/JavaScriptCore/runtime/IntlCollator.cpp:451
> +const char* IntlCollator::caseFirstString(CaseFirst caseFirst)

I think we should probably mark this function inline.

> Source/JavaScriptCore/runtime/IntlCollator.cpp:462
> +    return nullptr;

I think we should return "false" here instead of nullptr.
Comment 6 Sukolsak Sakshuwong 2016-05-30 22:05:06 PDT
Thanks.

(In reply to comment #5)
> Comment on attachment 280100 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=280100&action=review
> 
> > Source/JavaScriptCore/runtime/IntlCollator.cpp:289
> >      m_numeric = (result.get(ASCIILiteral("kn")) == "true");
> 
> I’d like how this reads better without the parentheses.

Will fix.

> > Source/JavaScriptCore/runtime/IntlCollator.cpp:290
> > +    const String& caseFirst = result.get(ASCIILiteral("kf"));
> 
> As with the cases below, I think it’s nicer to have the translation from
> string to CaseFirst be in a helper function. It can be inlined if we like.

Will do. Do you have any name suggestion for the helper function? caseFirstString(CaseFirst) <-> caseFirstFromString(const String&)?

> > Source/JavaScriptCore/runtime/IntlCollator.cpp:296
> > +        ASSERT(caseFirst == "false");
> 
> Why can we assert this? What prevents invalid values?

The HashMap "result" which we get the caseFirst value from is sanitized by resolveLocale(...). We test invalid values in the tests.

> > Source/JavaScriptCore/runtime/IntlCollator.cpp:451
> > +const char* IntlCollator::caseFirstString(CaseFirst caseFirst)
> 
> I think we should probably mark this function inline.

Will do.

> > Source/JavaScriptCore/runtime/IntlCollator.cpp:462
> > +    return nullptr;
> 
> I think we should return "false" here instead of nullptr.

Why is that? Is it so that the compiler can optimize out the "case CaseFirst::False:" branch?
Comment 7 Andy VanWagoner 2017-04-27 21:45:59 PDT
Created attachment 308501 [details]
Patch
Comment 8 Andy VanWagoner 2017-04-27 21:50:10 PDT
Comment on attachment 308501 [details]
Patch

Only small changes from what Sukolsak submitted a year ago. The code matches the existing patterns in these files. Larger changes to clean up or optimize I'd suggest should be in a future patch.
Comment 9 Geoffrey Garen 2017-04-27 21:53:22 PDT
Comment on attachment 308501 [details]
Patch

r=me
Comment 10 WebKit Commit Bot 2017-04-27 23:00:23 PDT
Comment on attachment 308501 [details]
Patch

Clearing flags on attachment: 308501

Committed r215921: <http://trac.webkit.org/changeset/215921>
Comment 11 WebKit Commit Bot 2017-04-27 23:00:25 PDT
All reviewed patches have been landed.  Closing bug.