Bug 209774

Summary: [ECMA-402] Implement unified Intl.NumberFormat
Product: WebKit Reporter: Shane Carr <sffc>
Component: JavaScriptCoreAssignee: Yusuke Suzuki <ysuzuki>
Status: RESOLVED FIXED    
Severity: Normal CC: andy, annulen, benjamin, cdumez, cmarcelo, darin, ews-watchlist, feliziani.emanuele, gyuyoung.kim, keith_miller, magross, mark.lam, mmaxfield, msaboff, ross.kirsling, ryuan.choi, saam, sergio, sffc, ticaiolima, tzagallo, webkit-bug-importer, ysuzuki
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 213425    
Attachments:
Description Flags
WIP
none
WIP
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
ross.kirsling: review+
Patch for landing
none
Patch for landing
none
Patch for landing
none
Patch none

Description Shane Carr 2020-03-30 14:39:49 PDT
Unified Intl.NumberFormat, the proposal which adds compact notation, measurement units, and a handful of other options to Intl.NumberFormat, has reached Stage 4 in TC39 as of February 2020, and the other major browser engines (V8 and SpiderMonkey) have been shipping it for some time.

As usage of these features of Intl.NumberFormat increases throughout the ecosystem, WebKit users will be left behind with legacy polyfills, leading to poorer performance relative to other browsers.  At Google, we are currently weighing our options for calling these features of Intl.NumberFormat in supported environments in order to give users better performance and smaller download sizes.

ICU4C exposes C and C++ APIs that can be used to implement Unified Intl.NumberFormat.  Implementing Unified Intl.NumberFormat is largely a matter of adding the glue between JavaScript and ICU4C, as WebKit already does for other Intl types.

Original proposal repo: https://github.com/tc39/proposal-unified-intl-numberformat
Comment 1 Radar WebKit Bug Importer 2020-03-30 16:39:03 PDT
<rdar://problem/61079763>
Comment 2 Yusuke Suzuki 2020-06-24 19:49:42 PDT
Created attachment 402712 [details]
WIP
Comment 3 Yusuke Suzuki 2020-06-25 11:04:17 PDT
Created attachment 402749 [details]
WIP
Comment 4 Yusuke Suzuki 2020-06-25 19:32:28 PDT
Created attachment 402843 [details]
Patch

WIP
Comment 5 Yusuke Suzuki 2020-06-25 22:22:09 PDT
Created attachment 402853 [details]
Patch

WIP
Comment 6 Ross Kirsling 2020-06-25 22:55:38 PDT
Comment on attachment 402843 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=402843&action=review

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:188
> +static Optional<MeasureUnit> isSanctionedSimpleUnitIdentifier(StringView unitIdentifier)

I think you should drop the `is` since this isn't bool-typed like the spec operation.

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:264
> +static Optional<WellFormedUnit> isWellFormedUnitIdentifier(StringView unitIdentifier)

Ditto.

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:524
> +        skeletonBuilder.append(numeratorUnit.type());
> +        skeletonBuilder.append('-');
> +        skeletonBuilder.append(numeratorUnit.subType());

I think you can use the variadic overload here?
  skeletonBuilder.append(numeratorUnit.type(), '-', numeratorUnit.subType());

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:530
> +            skeletonBuilder.append(denominatorUnit.type());
> +            skeletonBuilder.append('-');
> +            skeletonBuilder.append(denominatorUnit.subType());

Ditto.

> Source/JavaScriptCore/runtime/IntlNumberFormat.h:47
> +enum class IntlNotation { Standard, Scientific, Engineering, Compact };

Looks like you forgot `: uint8_t` here.

> Source/JavaScriptCore/runtime/IntlNumberFormat.h:84
> +    friend void setNumberFormatDigitOptions(JSGlobalObject*, IntlType*, JSObject*, unsigned, unsigned, IntlNotation);

We should probably label the unsigneds, since there are two.

> Source/JavaScriptCore/runtime/IntlPluralRules.h:57
> +    friend void setNumberFormatDigitOptions(JSGlobalObject*, IntlType*, JSObject*, unsigned, unsigned, IntlNotation);

Ditto.

> Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:35
> +void setNumberFormatDigitOptions(JSGlobalObject* globalObject, IntlType* intlObject, JSObject* options, unsigned minimumFractionDigitsDefault, unsigned maximumFractionDigitsDefault, IntlNotation notation)

Maybe `intlInstance` would be better than `intlObject`, just so we don't get it confused with *the* IntlObject.

> Source/WTF/wtf/unicode/icu/ICUHelpers.h:112
> +unsigned majorVersion();

FYI: We could have $vm.icuVersion depend on this too.
Comment 7 Yusuke Suzuki 2020-06-25 23:18:06 PDT
Comment on attachment 402843 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=402843&action=review

Thanks!

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:188
>> +static Optional<MeasureUnit> isSanctionedSimpleUnitIdentifier(StringView unitIdentifier)
> 
> I think you should drop the `is` since this isn't bool-typed like the spec operation.

Fixed.

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:264
>> +static Optional<WellFormedUnit> isWellFormedUnitIdentifier(StringView unitIdentifier)
> 
> Ditto.

Fixed.

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:524
>> +        skeletonBuilder.append(numeratorUnit.subType());
> 
> I think you can use the variadic overload here?
>   skeletonBuilder.append(numeratorUnit.type(), '-', numeratorUnit.subType());

Nice! Fixed.

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:530
>> +            skeletonBuilder.append(denominatorUnit.subType());
> 
> Ditto.

Fixed.

>> Source/JavaScriptCore/runtime/IntlNumberFormat.h:47
>> +enum class IntlNotation { Standard, Scientific, Engineering, Compact };
> 
> Looks like you forgot `: uint8_t` here.

Oops! Fixed.

>> Source/JavaScriptCore/runtime/IntlNumberFormat.h:84
>> +    friend void setNumberFormatDigitOptions(JSGlobalObject*, IntlType*, JSObject*, unsigned, unsigned, IntlNotation);
> 
> We should probably label the unsigneds, since there are two.

Nice, fixed.

>> Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:35
>> +void setNumberFormatDigitOptions(JSGlobalObject* globalObject, IntlType* intlObject, JSObject* options, unsigned minimumFractionDigitsDefault, unsigned maximumFractionDigitsDefault, IntlNotation notation)
> 
> Maybe `intlInstance` would be better than `intlObject`, just so we don't get it confused with *the* IntlObject.

Fixed.

>> Source/JavaScriptCore/runtime/IntlPluralRules.h:57
>> +    friend void setNumberFormatDigitOptions(JSGlobalObject*, IntlType*, JSObject*, unsigned, unsigned, IntlNotation);
> 
> Ditto.

Fixed.

>> Source/WTF/wtf/unicode/icu/ICUHelpers.h:112
>> +unsigned majorVersion();
> 
> FYI: We could have $vm.icuVersion depend on this too.

Sounds nice! Fixed.
Comment 8 Yusuke Suzuki 2020-06-25 23:20:50 PDT
Created attachment 402854 [details]
Patch

WIP
Comment 9 Yusuke Suzuki 2020-06-26 13:15:44 PDT
Created attachment 402895 [details]
Patch
Comment 10 Yusuke Suzuki 2020-06-26 22:28:22 PDT
Created attachment 402944 [details]
Patch
Comment 11 Yusuke Suzuki 2020-06-27 00:19:16 PDT
Created attachment 402945 [details]
Patch
Comment 12 Yusuke Suzuki 2020-06-27 00:31:47 PDT
Created attachment 402946 [details]
Patch
Comment 13 Yusuke Suzuki 2020-06-27 00:34:44 PDT
Created attachment 402947 [details]
Patch
Comment 14 Yusuke Suzuki 2020-06-27 00:46:43 PDT
Created attachment 402948 [details]
Patch
Comment 15 Yusuke Suzuki 2020-06-27 01:08:55 PDT
Created attachment 402950 [details]
Patch
Comment 16 Yusuke Suzuki 2020-06-27 07:23:12 PDT
Created attachment 402953 [details]
Patch
Comment 17 Ross Kirsling 2020-06-27 15:25:16 PDT
Comment on attachment 402953 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=402953&action=review

r=me, this is awesome.

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:223
> +    if (iterator != std::end(simpleUnits)) {
> +        if (!WTF::codePointCompare(iterator->subType().characters(), iterator->subType().length(), unitIdentifier.characters8(), unitIdentifier.length()))

&& perhaps?

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:317
> -        throwTypeError(globalObject, scope, "failed to initialize NumberFormat due to invalid locale"_s);
> +        throwTypeError(globalObject, scope, "Failed to initialize NumberFormat due to invalid locale"_s);

Hmm, it seems like most Intl error messages start with a lowercase letter -- wonder if we should address them en masse in a separate patch?

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:397
> +    } else {
> +        if (m_style == Style::Unit) {

Seems like we should just make it `else if`.

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:640
> +        case CurrencyDisplay::NarrowSymbol:
> +            throwTypeError(globalObject, scope, "Failed to initialize NumberFormat"_s);
> +            return;

Hmm. I don't object to these messages because "not supported in the ICU version you're using" isn't appropriate to show to the user, but I wonder if we should differentiate them from the U_FAILURE cases somehow.

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:726
> +    status = U_ZERO_ERROR;

I think you can delete this line?
(U_FAILURE is a `> U_ZERO_ERROR` check.)

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:761
> +    status = U_ZERO_ERROR;

Ditto.

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:1042
> +    status = U_ZERO_ERROR;

Ditto x3 in this section.

> Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:90
> +class IntlFieldIterator {

Fancy!

> Source/WTF/wtf/unicode/icu/ICUHelpers.h:112
> +struct ICUDeleter {

What a good idea. :D

> JSTests/stress/intl-numberformat-unified.js:38
> +    shouldBe((-100).toLocaleString("bn", {

Do you think we should have some more tests for non-English locales (at least, ones which have more predictable behavior)?

> JSTests/test262/expectations.yaml:1707
> +test/intl402/NumberFormat/prototype/format/engineering-scientific-de-DE.js:

These all require ICU >64, I take it? :(
Comment 18 Darin Adler 2020-06-28 13:39:54 PDT
Comment on attachment 402953 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=402953&action=review

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:153
> +class MeasureUnit {

Seems like this should be a struct rather than a class, without a constructor or getter functions.

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:217
> +        [] (const MeasureUnit& unit, const StringView& unitIdentifier) {

Should be StringView, not const StringView&; I believe it’s more efficient!

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:220
> +            if (unitIdentifier.is8Bit())
> +                return WTF::codePointCompare(unit.subType().characters(), unit.subType().length(), unitIdentifier.characters8(), unitIdentifier.length()) < 0;
> +            return WTF::codePointCompare(unit.subType().characters(), unit.subType().length(), unitIdentifier.characters16(), unitIdentifier.length()) < 0;

Seems like a codePointCompare function that takes two StringView would make this code way easier to read.

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:229
> +class WellFormedUnit {

Seems like this would be better as a struct, rather than a class.

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:256
> +    auto per = StringView("-per-"_s);

Putting the ASCIILiteral into a StringView prevents some kinds of optimizations, I think. Too bad we need to do it.

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:257
> +    auto pos = unitIdentifier.find(per);

The name "pos" isn’t following WebKit coding style. We normally use words.

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:262
> +    auto pos2 = unitIdentifier.find(per, pos + per.length());
> +    if (pos2 != WTF::notFound)
> +        return WTF::nullopt;

No need to name "pos2" with a local variable. Maybe could use contains?

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:726
>> +    status = U_ZERO_ERROR;
> 
> I think you can delete this line?
> (U_FAILURE is a `> U_ZERO_ERROR` check.)

I am not sure that’s right. Warnings are negative numbers, which are "< U_ZERO_ERROR". Not "== U_ZERO_ERROR".

> Source/JavaScriptCore/runtime/IntlNumberFormat.h:38
> +#if !defined(HAVE_ICU_U_NUMBER_FORMATTER)
> +// UNUM_COMPACT_FIELD and UNUM_MEASURE_UNIT_FIELD are available after ICU 64.
> +#if U_ICU_VERSION_MAJOR_NUM >= 64
> +#define HAVE_ICU_U_NUMBER_FORMATTER 1
> +#endif
> +#endif

Can we put this into PlatformHave.h instead of here?

> Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:35
> +void setNumberFormatDigitOptions(JSGlobalObject* globalObject, IntlType* intlInstance, JSObject* options, unsigned minimumFractionDigitsDefault, unsigned maximumFractionDigitsDefault, IntlNotation notation)

I suggest references for the two pointer arguments, but when in JavaScriptCore I am often overruled on this.

> Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:94
> +    explicit IntlFieldIterator(UFieldPositionIterator* iterator)

Using a reference would have 3 benefits:

1) Helps make clear we can’t call next if it’s null.
2) Makes the class non copyable without requiring the use of WTF_MAKE_NONCOPYABLE.
3) Obviates need for ".get()" at the call site.

> Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:101
> +    int32_t next(int32_t& beginIndex, int32_t& endIndex, UErrorCode& status)
> +    {
> +        UNUSED_PARAM(status);

Might prefer the "unnamed argument" pattern to UNUSED_PARAM.
Comment 19 Ross Kirsling 2020-06-28 13:56:03 PDT
(In reply to Darin Adler from comment #18)
> >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:726
> >> +    status = U_ZERO_ERROR;
> > 
> > I think you can delete this line?
> > (U_FAILURE is a `> U_ZERO_ERROR` check.)
> 
> I am not sure that’s right. Warnings are negative numbers, which are "<
> U_ZERO_ERROR". Not "== U_ZERO_ERROR".

That is a good point, though I think even internally the early outs always check for U_FAILURE, such that warnings remain purely informative. Resetting may be more correct but I'm not certain that it buys us anything unless we do end up checking for a warning (which for us probably just means "unless we're using needsToGrowToProduceCString").
Comment 20 Yusuke Suzuki 2020-06-29 01:02:48 PDT
(In reply to Ross Kirsling from comment #19)
> (In reply to Darin Adler from comment #18)
> > >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:726
> > >> +    status = U_ZERO_ERROR;
> > > 
> > > I think you can delete this line?
> > > (U_FAILURE is a `> U_ZERO_ERROR` check.)
> > 
> > I am not sure that’s right. Warnings are negative numbers, which are "<
> > U_ZERO_ERROR". Not "== U_ZERO_ERROR".
> 
> That is a good point, though I think even internally the early outs always
> check for U_FAILURE, such that warnings remain purely informative. Resetting
> may be more correct but I'm not certain that it buys us anything unless we
> do end up checking for a warning (which for us probably just means "unless
> we're using needsToGrowToProduceCString").

OK, I checked ICU documents. And the sample code in ICU document always use "first initialize it with U_ZERO_ERROR, and continue using it without re-initialize it" manner. So, it seems initializing it only once is the intended use by ICU :)
Comment 21 Yusuke Suzuki 2020-06-29 01:08:29 PDT
Comment on attachment 402953 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=402953&action=review

Thanks!

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:153
>> +class MeasureUnit {
> 
> Seems like this should be a struct rather than a class, without a constructor or getter functions.

Fixed.

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:217
>> +        [] (const MeasureUnit& unit, const StringView& unitIdentifier) {
> 
> Should be StringView, not const StringView&; I believe it’s more efficient!

Fixed.

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:220
>> +            return WTF::codePointCompare(unit.subType().characters(), unit.subType().length(), unitIdentifier.characters16(), unitIdentifier.length()) < 0;
> 
> Seems like a codePointCompare function that takes two StringView would make this code way easier to read.

Nice, fixed.

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:223
>> +        if (!WTF::codePointCompare(iterator->subType().characters(), iterator->subType().length(), unitIdentifier.characters8(), unitIdentifier.length()))
> 
> && perhaps?

Fixed.

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:229
>> +class WellFormedUnit {
> 
> Seems like this would be better as a struct, rather than a class.

Fixed.

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:256
>> +    auto per = StringView("-per-"_s);
> 
> Putting the ASCIILiteral into a StringView prevents some kinds of optimizations, I think. Too bad we need to do it.

Yeah...

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:257
>> +    auto pos = unitIdentifier.find(per);
> 
> The name "pos" isn’t following WebKit coding style. We normally use words.

Fixed, renamed to "position".

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:262
>> +        return WTF::nullopt;
> 
> No need to name "pos2" with a local variable. Maybe could use contains?

Fixed.

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:317
>> +        throwTypeError(globalObject, scope, "Failed to initialize NumberFormat due to invalid locale"_s);
> 
> Hmm, it seems like most Intl error messages start with a lowercase letter -- wonder if we should address them en masse in a separate patch?

OK, changed :)

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:397
>> +        if (m_style == Style::Unit) {
> 
> Seems like we should just make it `else if`.

Fixed.

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:640
>> +            return;
> 
> Hmm. I don't object to these messages because "not supported in the ICU version you're using" isn't appropriate to show to the user, but I wonder if we should differentiate them from the U_FAILURE cases somehow.

Fixed.

>>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:726
>>> +    status = U_ZERO_ERROR;
>> 
>> I think you can delete this line?
>> (U_FAILURE is a `> U_ZERO_ERROR` check.)
> 
> I am not sure that’s right. Warnings are negative numbers, which are "< U_ZERO_ERROR". Not "== U_ZERO_ERROR".

I've checked ICU document, and it seems that this re-initialization is not necessary. Removed.

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:761
>> +    status = U_ZERO_ERROR;
> 
> Ditto.

Fixed.

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:1042
>> +    status = U_ZERO_ERROR;
> 
> Ditto x3 in this section.

Fixed.

>> Source/JavaScriptCore/runtime/IntlNumberFormat.h:38
>> +#endif
> 
> Can we put this into PlatformHave.h instead of here?

Unfortunately, this check requires U_ICU_VERSION_MAJOR_NUM which is defined in ICU headers. And Platform.h etc. assumes that this only contains macros (it should not include any C forward declarations) since it is also used by sandbox preprocessors.
So, we cannot include ICU headers in PlatformHave.h. I leave this here for now.

>> Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:35
>> +void setNumberFormatDigitOptions(JSGlobalObject* globalObject, IntlType* intlInstance, JSObject* options, unsigned minimumFractionDigitsDefault, unsigned maximumFractionDigitsDefault, IntlNotation notation)
> 
> I suggest references for the two pointer arguments, but when in JavaScriptCore I am often overruled on this.

Since these pointers are JS objects, for now, I follow the JSC's convention: using pointers for JS cells.

>> Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:90
>> +class IntlFieldIterator {
> 
> Fancy!

:)

>> Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:94
>> +    explicit IntlFieldIterator(UFieldPositionIterator* iterator)
> 
> Using a reference would have 3 benefits:
> 
> 1) Helps make clear we can’t call next if it’s null.
> 2) Makes the class non copyable without requiring the use of WTF_MAKE_NONCOPYABLE.
> 3) Obviates need for ".get()" at the call site.

Changed.

>> Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:101
>> +        UNUSED_PARAM(status);
> 
> Might prefer the "unnamed argument" pattern to UNUSED_PARAM.

Fixed.

>> Source/WTF/wtf/unicode/icu/ICUHelpers.h:112
>> +struct ICUDeleter {
> 
> What a good idea. :D

This `template<auto>` is C++17's new fancy feature :)

>> JSTests/stress/intl-numberformat-unified.js:38
>> +    shouldBe((-100).toLocaleString("bn", {
> 
> Do you think we should have some more tests for non-English locales (at least, ones which have more predictable behavior)?

Added ja-JP pattern.

>> JSTests/test262/expectations.yaml:1707
>> +test/intl402/NumberFormat/prototype/format/engineering-scientific-de-DE.js:
> 
> These all require ICU >64, I take it? :(

They are failing because ICU returns *slightly* different pattern text :(
It passes with very new ICU like ICU 68. I've ensured that all of them pass with ICU 68.
Comment 22 Yusuke Suzuki 2020-06-29 01:13:01 PDT
Created attachment 403033 [details]
Patch for landing
Comment 23 Yusuke Suzuki 2020-06-29 01:21:45 PDT
Created attachment 403035 [details]
Patch for landing
Comment 24 Yusuke Suzuki 2020-06-29 07:29:47 PDT
Failed ios-wk2 tests are completely unrelated to Intl.

media/now-playing-status-without-media.html
webanimations/accelerated-animation-with-easing.html
imported/w3c/web-platform-tests/cors/credentials-flag.htm
imported/w3c/web-platform-tests/cors/origin.htm
Comment 25 Yusuke Suzuki 2020-06-29 09:37:52 PDT
I'll first land non-Intl part (StringView / ICUDeleter) to unlock further implementation.
Comment 26 Yusuke Suzuki 2020-06-29 10:23:41 PDT
Committed r263665: <https://trac.webkit.org/changeset/263665>
Comment 27 Yusuke Suzuki 2020-06-29 10:24:01 PDT
I've just landed WTF part only. Remaining part will be landed later.
Comment 28 Yusuke Suzuki 2020-06-29 10:27:54 PDT
Created attachment 403079 [details]
Patch for landing
Comment 29 Yusuke Suzuki 2020-07-26 00:38:37 PDT
Comment on attachment 403079 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=403079&action=review

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:213
> +// Create MeasureUnit like ICU4J.
> +struct MeasureUnit {
> +    ASCIILiteral type;
> +    ASCIILiteral subType;
> +};
> +
> +static Optional<MeasureUnit> sanctionedSimpleUnitIdentifier(StringView unitIdentifier)
> +{
> +    static constexpr MeasureUnit simpleUnits[] = {
> +        { "area"_s, "acre"_s },
> +        { "digital"_s, "bit"_s },
> +        { "digital"_s, "byte"_s },
> +        { "temperature"_s, "celsius"_s },
> +        { "length"_s, "centimeter"_s },
> +        { "duration"_s, "day"_s },
> +        { "angle"_s, "degree"_s },
> +        { "temperature"_s, "fahrenheit"_s },
> +        { "volume"_s, "fluid-ounce"_s },
> +        { "length"_s, "foot"_s },
> +        { "volume"_s, "gallon"_s },
> +        { "digital"_s, "gigabit"_s },
> +        { "digital"_s, "gigabyte"_s },
> +        { "mass"_s, "gram"_s },
> +        { "area"_s, "hectare"_s },
> +        { "duration"_s, "hour"_s },
> +        { "length"_s, "inch"_s },
> +        { "digital"_s, "kilobit"_s },
> +        { "digital"_s, "kilobyte"_s },
> +        { "mass"_s, "kilogram"_s },
> +        { "length"_s, "kilometer"_s },
> +        { "volume"_s, "liter"_s },
> +        { "digital"_s, "megabit"_s },
> +        { "digital"_s, "megabyte"_s },
> +        { "length"_s, "meter"_s },
> +        { "length"_s, "mile"_s },
> +        { "length"_s, "mile-scandinavian"_s },
> +        { "volume"_s, "milliliter"_s },
> +        { "length"_s, "millimeter"_s },
> +        { "duration"_s, "millisecond"_s },
> +        { "duration"_s, "minute"_s },
> +        { "duration"_s, "month"_s },
> +        { "mass"_s, "ounce"_s },
> +        { "concentr"_s, "percent"_s },
> +        { "digital"_s, "petabyte"_s },
> +        { "mass"_s, "pound"_s },
> +        { "duration"_s, "second"_s },
> +        { "mass"_s, "stone"_s },
> +        { "digital"_s, "terabit"_s },
> +        { "digital"_s, "terabyte"_s },
> +        { "duration"_s, "week"_s },
> +        { "length"_s, "yard"_s },
> +        { "duration"_s, "year"_s },
> +    };
> +
> +    auto iterator = std::lower_bound(std::begin(simpleUnits), std::end(simpleUnits), unitIdentifier,
> +        [] (const MeasureUnit& unit, StringView unitIdentifier) {
> +            return WTF::codePointCompare(StringView(unit.subType), unitIdentifier) < 0;
> +        });
> +    if (iterator != std::end(simpleUnits) && !WTF::codePointCompare(StringView(iterator->subType), unitIdentifier))
> +        return *iterator;
> +    return WTF::nullopt;
> +}

Let's land this part to unlock further prototyping.
Comment 30 Yusuke Suzuki 2020-07-26 00:39:31 PDT
Comment on attachment 403079 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=403079&action=review

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:213
>> +}
> 
> Let's land this part to unlock further prototyping.

Or, for now, I'll just copy & paste.
Comment 31 Darin Adler 2020-07-26 09:12:27 PDT
Comment on attachment 403079 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=403079&action=review

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:209
> +    auto iterator = std::lower_bound(std::begin(simpleUnits), std::end(simpleUnits), unitIdentifier,
> +        [] (const MeasureUnit& unit, StringView unitIdentifier) {
> +            return WTF::codePointCompare(StringView(unit.subType), unitIdentifier) < 0;
> +        });

I’d like to see a std::is_sorted assertion. If we can’t check it is properly sorted at compile time, at least we should check it at runtime.

Seems like we might also want to consider overloading codePointCompare so it can handle the mix of ASCIILiteral and StringView without an explicit cast. There may even be some value in optimizing that case the way we optimize== for such combinations. Although right now it looks like we only optimize == with const char*, not ASCIILiteral.

> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:210
> +    if (iterator != std::end(simpleUnits) && !WTF::codePointCompare(StringView(iterator->subType), unitIdentifier))

This can use == instead of "!codePointCompare". The named function is only needed for sorting, which is ambiguous enough that we didn’t want a code-point based comparison ignoring linguistic rules to be implied. We don't treat equality as ambiguous, though, so == can be used.
Comment 32 Yusuke Suzuki 2020-08-12 20:45:55 PDT
Created attachment 406493 [details]
Patch
Comment 33 Yusuke Suzuki 2020-08-22 11:07:07 PDT
Comment on attachment 403079 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=403079&action=review

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:209
>> +        });
> 
> I’d like to see a std::is_sorted assertion. If we can’t check it is properly sorted at compile time, at least we should check it at runtime.
> 
> Seems like we might also want to consider overloading codePointCompare so it can handle the mix of ASCIILiteral and StringView without an explicit cast. There may even be some value in optimizing that case the way we optimize== for such combinations. Although right now it looks like we only optimize == with const char*, not ASCIILiteral.

Added.

>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:210
>> +    if (iterator != std::end(simpleUnits) && !WTF::codePointCompare(StringView(iterator->subType), unitIdentifier))
> 
> This can use == instead of "!codePointCompare". The named function is only needed for sorting, which is ambiguous enough that we didn’t want a code-point based comparison ignoring linguistic rules to be implied. We don't treat equality as ambiguous, though, so == can be used.

Changed!
Comment 34 Yusuke Suzuki 2020-08-22 11:32:31 PDT
Committed r266031: <https://trac.webkit.org/changeset/266031>
Comment 35 Yusuke Suzuki 2020-08-22 15:14:49 PDT
Catalina JSC bot is failing while EWS is green.
It looks like Catalina Safari is built against older SDK which includes old ICU header.
Comment 36 Yusuke Suzuki 2020-08-22 15:34:48 PDT
Committed r266037: <https://trac.webkit.org/changeset/266037>
Comment 37 Yusuke Suzuki 2020-08-23 00:11:56 PDT
Committed r266044: <https://trac.webkit.org/changeset/266044>
Comment 38 Yusuke Suzuki 2021-06-03 10:09:09 PDT
*** Bug 226297 has been marked as a duplicate of this bug. ***