RESOLVED FIXED 209774
[ECMA-402] Implement unified Intl.NumberFormat
https://bugs.webkit.org/show_bug.cgi?id=209774
Summary [ECMA-402] Implement unified Intl.NumberFormat
Shane Carr
Reported 2020-03-30 14:39:49 PDT
Unified Intl.NumberFormat, the proposal which adds compact notation, measurement units, and a handful of other options to Intl.NumberFormat, has reached Stage 4 in TC39 as of February 2020, and the other major browser engines (V8 and SpiderMonkey) have been shipping it for some time. As usage of these features of Intl.NumberFormat increases throughout the ecosystem, WebKit users will be left behind with legacy polyfills, leading to poorer performance relative to other browsers. At Google, we are currently weighing our options for calling these features of Intl.NumberFormat in supported environments in order to give users better performance and smaller download sizes. ICU4C exposes C and C++ APIs that can be used to implement Unified Intl.NumberFormat. Implementing Unified Intl.NumberFormat is largely a matter of adding the glue between JavaScript and ICU4C, as WebKit already does for other Intl types. Original proposal repo: https://github.com/tc39/proposal-unified-intl-numberformat
Attachments
WIP (38.00 KB, patch)
2020-06-24 19:49 PDT, Yusuke Suzuki
no flags
WIP (46.81 KB, patch)
2020-06-25 11:04 PDT, Yusuke Suzuki
no flags
Patch (64.43 KB, patch)
2020-06-25 19:32 PDT, Yusuke Suzuki
no flags
Patch (70.43 KB, patch)
2020-06-25 22:22 PDT, Yusuke Suzuki
no flags
Patch (71.50 KB, patch)
2020-06-25 23:20 PDT, Yusuke Suzuki
no flags
Patch (76.44 KB, patch)
2020-06-26 13:15 PDT, Yusuke Suzuki
no flags
Patch (87.47 KB, patch)
2020-06-26 22:28 PDT, Yusuke Suzuki
no flags
Patch (76.33 KB, patch)
2020-06-27 00:19 PDT, Yusuke Suzuki
no flags
Patch (87.79 KB, patch)
2020-06-27 00:31 PDT, Yusuke Suzuki
no flags
Patch (109.04 KB, patch)
2020-06-27 00:34 PDT, Yusuke Suzuki
no flags
Patch (109.04 KB, patch)
2020-06-27 00:46 PDT, Yusuke Suzuki
no flags
Patch (109.08 KB, patch)
2020-06-27 01:08 PDT, Yusuke Suzuki
no flags
Patch (109.29 KB, patch)
2020-06-27 07:23 PDT, Yusuke Suzuki
ross.kirsling: review+
Patch for landing (110.38 KB, patch)
2020-06-29 01:13 PDT, Yusuke Suzuki
no flags
Patch for landing (110.44 KB, patch)
2020-06-29 01:21 PDT, Yusuke Suzuki
no flags
Patch for landing (97.57 KB, patch)
2020-06-29 10:27 PDT, Yusuke Suzuki
no flags
Patch (93.05 KB, patch)
2020-08-12 20:45 PDT, Yusuke Suzuki
no flags
Radar WebKit Bug Importer
Comment 1 2020-03-30 16:39:03 PDT
Yusuke Suzuki
Comment 2 2020-06-24 19:49:42 PDT
Yusuke Suzuki
Comment 3 2020-06-25 11:04:17 PDT
Yusuke Suzuki
Comment 4 2020-06-25 19:32:28 PDT
Created attachment 402843 [details] Patch WIP
Yusuke Suzuki
Comment 5 2020-06-25 22:22:09 PDT
Created attachment 402853 [details] Patch WIP
Ross Kirsling
Comment 6 2020-06-25 22:55:38 PDT
Comment on attachment 402843 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=402843&action=review > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:188 > +static Optional<MeasureUnit> isSanctionedSimpleUnitIdentifier(StringView unitIdentifier) I think you should drop the `is` since this isn't bool-typed like the spec operation. > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:264 > +static Optional<WellFormedUnit> isWellFormedUnitIdentifier(StringView unitIdentifier) Ditto. > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:524 > + skeletonBuilder.append(numeratorUnit.type()); > + skeletonBuilder.append('-'); > + skeletonBuilder.append(numeratorUnit.subType()); I think you can use the variadic overload here? skeletonBuilder.append(numeratorUnit.type(), '-', numeratorUnit.subType()); > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:530 > + skeletonBuilder.append(denominatorUnit.type()); > + skeletonBuilder.append('-'); > + skeletonBuilder.append(denominatorUnit.subType()); Ditto. > Source/JavaScriptCore/runtime/IntlNumberFormat.h:47 > +enum class IntlNotation { Standard, Scientific, Engineering, Compact }; Looks like you forgot `: uint8_t` here. > Source/JavaScriptCore/runtime/IntlNumberFormat.h:84 > + friend void setNumberFormatDigitOptions(JSGlobalObject*, IntlType*, JSObject*, unsigned, unsigned, IntlNotation); We should probably label the unsigneds, since there are two. > Source/JavaScriptCore/runtime/IntlPluralRules.h:57 > + friend void setNumberFormatDigitOptions(JSGlobalObject*, IntlType*, JSObject*, unsigned, unsigned, IntlNotation); Ditto. > Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:35 > +void setNumberFormatDigitOptions(JSGlobalObject* globalObject, IntlType* intlObject, JSObject* options, unsigned minimumFractionDigitsDefault, unsigned maximumFractionDigitsDefault, IntlNotation notation) Maybe `intlInstance` would be better than `intlObject`, just so we don't get it confused with *the* IntlObject. > Source/WTF/wtf/unicode/icu/ICUHelpers.h:112 > +unsigned majorVersion(); FYI: We could have $vm.icuVersion depend on this too.
Yusuke Suzuki
Comment 7 2020-06-25 23:18:06 PDT
Comment on attachment 402843 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=402843&action=review Thanks! >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:188 >> +static Optional<MeasureUnit> isSanctionedSimpleUnitIdentifier(StringView unitIdentifier) > > I think you should drop the `is` since this isn't bool-typed like the spec operation. Fixed. >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:264 >> +static Optional<WellFormedUnit> isWellFormedUnitIdentifier(StringView unitIdentifier) > > Ditto. Fixed. >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:524 >> + skeletonBuilder.append(numeratorUnit.subType()); > > I think you can use the variadic overload here? > skeletonBuilder.append(numeratorUnit.type(), '-', numeratorUnit.subType()); Nice! Fixed. >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:530 >> + skeletonBuilder.append(denominatorUnit.subType()); > > Ditto. Fixed. >> Source/JavaScriptCore/runtime/IntlNumberFormat.h:47 >> +enum class IntlNotation { Standard, Scientific, Engineering, Compact }; > > Looks like you forgot `: uint8_t` here. Oops! Fixed. >> Source/JavaScriptCore/runtime/IntlNumberFormat.h:84 >> + friend void setNumberFormatDigitOptions(JSGlobalObject*, IntlType*, JSObject*, unsigned, unsigned, IntlNotation); > > We should probably label the unsigneds, since there are two. Nice, fixed. >> Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:35 >> +void setNumberFormatDigitOptions(JSGlobalObject* globalObject, IntlType* intlObject, JSObject* options, unsigned minimumFractionDigitsDefault, unsigned maximumFractionDigitsDefault, IntlNotation notation) > > Maybe `intlInstance` would be better than `intlObject`, just so we don't get it confused with *the* IntlObject. Fixed. >> Source/JavaScriptCore/runtime/IntlPluralRules.h:57 >> + friend void setNumberFormatDigitOptions(JSGlobalObject*, IntlType*, JSObject*, unsigned, unsigned, IntlNotation); > > Ditto. Fixed. >> Source/WTF/wtf/unicode/icu/ICUHelpers.h:112 >> +unsigned majorVersion(); > > FYI: We could have $vm.icuVersion depend on this too. Sounds nice! Fixed.
Yusuke Suzuki
Comment 8 2020-06-25 23:20:50 PDT
Created attachment 402854 [details] Patch WIP
Yusuke Suzuki
Comment 9 2020-06-26 13:15:44 PDT
Yusuke Suzuki
Comment 10 2020-06-26 22:28:22 PDT
Yusuke Suzuki
Comment 11 2020-06-27 00:19:16 PDT
Yusuke Suzuki
Comment 12 2020-06-27 00:31:47 PDT
Yusuke Suzuki
Comment 13 2020-06-27 00:34:44 PDT
Yusuke Suzuki
Comment 14 2020-06-27 00:46:43 PDT
Yusuke Suzuki
Comment 15 2020-06-27 01:08:55 PDT
Yusuke Suzuki
Comment 16 2020-06-27 07:23:12 PDT
Ross Kirsling
Comment 17 2020-06-27 15:25:16 PDT
Comment on attachment 402953 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=402953&action=review r=me, this is awesome. > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:223 > + if (iterator != std::end(simpleUnits)) { > + if (!WTF::codePointCompare(iterator->subType().characters(), iterator->subType().length(), unitIdentifier.characters8(), unitIdentifier.length())) && perhaps? > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:317 > - throwTypeError(globalObject, scope, "failed to initialize NumberFormat due to invalid locale"_s); > + throwTypeError(globalObject, scope, "Failed to initialize NumberFormat due to invalid locale"_s); Hmm, it seems like most Intl error messages start with a lowercase letter -- wonder if we should address them en masse in a separate patch? > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:397 > + } else { > + if (m_style == Style::Unit) { Seems like we should just make it `else if`. > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:640 > + case CurrencyDisplay::NarrowSymbol: > + throwTypeError(globalObject, scope, "Failed to initialize NumberFormat"_s); > + return; Hmm. I don't object to these messages because "not supported in the ICU version you're using" isn't appropriate to show to the user, but I wonder if we should differentiate them from the U_FAILURE cases somehow. > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:726 > + status = U_ZERO_ERROR; I think you can delete this line? (U_FAILURE is a `> U_ZERO_ERROR` check.) > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:761 > + status = U_ZERO_ERROR; Ditto. > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:1042 > + status = U_ZERO_ERROR; Ditto x3 in this section. > Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:90 > +class IntlFieldIterator { Fancy! > Source/WTF/wtf/unicode/icu/ICUHelpers.h:112 > +struct ICUDeleter { What a good idea. :D > JSTests/stress/intl-numberformat-unified.js:38 > + shouldBe((-100).toLocaleString("bn", { Do you think we should have some more tests for non-English locales (at least, ones which have more predictable behavior)? > JSTests/test262/expectations.yaml:1707 > +test/intl402/NumberFormat/prototype/format/engineering-scientific-de-DE.js: These all require ICU >64, I take it? :(
Darin Adler
Comment 18 2020-06-28 13:39:54 PDT
Comment on attachment 402953 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=402953&action=review > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:153 > +class MeasureUnit { Seems like this should be a struct rather than a class, without a constructor or getter functions. > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:217 > + [] (const MeasureUnit& unit, const StringView& unitIdentifier) { Should be StringView, not const StringView&; I believe it’s more efficient! > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:220 > + if (unitIdentifier.is8Bit()) > + return WTF::codePointCompare(unit.subType().characters(), unit.subType().length(), unitIdentifier.characters8(), unitIdentifier.length()) < 0; > + return WTF::codePointCompare(unit.subType().characters(), unit.subType().length(), unitIdentifier.characters16(), unitIdentifier.length()) < 0; Seems like a codePointCompare function that takes two StringView would make this code way easier to read. > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:229 > +class WellFormedUnit { Seems like this would be better as a struct, rather than a class. > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:256 > + auto per = StringView("-per-"_s); Putting the ASCIILiteral into a StringView prevents some kinds of optimizations, I think. Too bad we need to do it. > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:257 > + auto pos = unitIdentifier.find(per); The name "pos" isn’t following WebKit coding style. We normally use words. > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:262 > + auto pos2 = unitIdentifier.find(per, pos + per.length()); > + if (pos2 != WTF::notFound) > + return WTF::nullopt; No need to name "pos2" with a local variable. Maybe could use contains? >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:726 >> + status = U_ZERO_ERROR; > > I think you can delete this line? > (U_FAILURE is a `> U_ZERO_ERROR` check.) I am not sure that’s right. Warnings are negative numbers, which are "< U_ZERO_ERROR". Not "== U_ZERO_ERROR". > Source/JavaScriptCore/runtime/IntlNumberFormat.h:38 > +#if !defined(HAVE_ICU_U_NUMBER_FORMATTER) > +// UNUM_COMPACT_FIELD and UNUM_MEASURE_UNIT_FIELD are available after ICU 64. > +#if U_ICU_VERSION_MAJOR_NUM >= 64 > +#define HAVE_ICU_U_NUMBER_FORMATTER 1 > +#endif > +#endif Can we put this into PlatformHave.h instead of here? > Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:35 > +void setNumberFormatDigitOptions(JSGlobalObject* globalObject, IntlType* intlInstance, JSObject* options, unsigned minimumFractionDigitsDefault, unsigned maximumFractionDigitsDefault, IntlNotation notation) I suggest references for the two pointer arguments, but when in JavaScriptCore I am often overruled on this. > Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:94 > + explicit IntlFieldIterator(UFieldPositionIterator* iterator) Using a reference would have 3 benefits: 1) Helps make clear we can’t call next if it’s null. 2) Makes the class non copyable without requiring the use of WTF_MAKE_NONCOPYABLE. 3) Obviates need for ".get()" at the call site. > Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:101 > + int32_t next(int32_t& beginIndex, int32_t& endIndex, UErrorCode& status) > + { > + UNUSED_PARAM(status); Might prefer the "unnamed argument" pattern to UNUSED_PARAM.
Ross Kirsling
Comment 19 2020-06-28 13:56:03 PDT
(In reply to Darin Adler from comment #18) > >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:726 > >> + status = U_ZERO_ERROR; > > > > I think you can delete this line? > > (U_FAILURE is a `> U_ZERO_ERROR` check.) > > I am not sure that’s right. Warnings are negative numbers, which are "< > U_ZERO_ERROR". Not "== U_ZERO_ERROR". That is a good point, though I think even internally the early outs always check for U_FAILURE, such that warnings remain purely informative. Resetting may be more correct but I'm not certain that it buys us anything unless we do end up checking for a warning (which for us probably just means "unless we're using needsToGrowToProduceCString").
Yusuke Suzuki
Comment 20 2020-06-29 01:02:48 PDT
(In reply to Ross Kirsling from comment #19) > (In reply to Darin Adler from comment #18) > > >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:726 > > >> + status = U_ZERO_ERROR; > > > > > > I think you can delete this line? > > > (U_FAILURE is a `> U_ZERO_ERROR` check.) > > > > I am not sure that’s right. Warnings are negative numbers, which are "< > > U_ZERO_ERROR". Not "== U_ZERO_ERROR". > > That is a good point, though I think even internally the early outs always > check for U_FAILURE, such that warnings remain purely informative. Resetting > may be more correct but I'm not certain that it buys us anything unless we > do end up checking for a warning (which for us probably just means "unless > we're using needsToGrowToProduceCString"). OK, I checked ICU documents. And the sample code in ICU document always use "first initialize it with U_ZERO_ERROR, and continue using it without re-initialize it" manner. So, it seems initializing it only once is the intended use by ICU :)
Yusuke Suzuki
Comment 21 2020-06-29 01:08:29 PDT
Comment on attachment 402953 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=402953&action=review Thanks! >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:153 >> +class MeasureUnit { > > Seems like this should be a struct rather than a class, without a constructor or getter functions. Fixed. >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:217 >> + [] (const MeasureUnit& unit, const StringView& unitIdentifier) { > > Should be StringView, not const StringView&; I believe it’s more efficient! Fixed. >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:220 >> + return WTF::codePointCompare(unit.subType().characters(), unit.subType().length(), unitIdentifier.characters16(), unitIdentifier.length()) < 0; > > Seems like a codePointCompare function that takes two StringView would make this code way easier to read. Nice, fixed. >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:223 >> + if (!WTF::codePointCompare(iterator->subType().characters(), iterator->subType().length(), unitIdentifier.characters8(), unitIdentifier.length())) > > && perhaps? Fixed. >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:229 >> +class WellFormedUnit { > > Seems like this would be better as a struct, rather than a class. Fixed. >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:256 >> + auto per = StringView("-per-"_s); > > Putting the ASCIILiteral into a StringView prevents some kinds of optimizations, I think. Too bad we need to do it. Yeah... >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:257 >> + auto pos = unitIdentifier.find(per); > > The name "pos" isn’t following WebKit coding style. We normally use words. Fixed, renamed to "position". >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:262 >> + return WTF::nullopt; > > No need to name "pos2" with a local variable. Maybe could use contains? Fixed. >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:317 >> + throwTypeError(globalObject, scope, "Failed to initialize NumberFormat due to invalid locale"_s); > > Hmm, it seems like most Intl error messages start with a lowercase letter -- wonder if we should address them en masse in a separate patch? OK, changed :) >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:397 >> + if (m_style == Style::Unit) { > > Seems like we should just make it `else if`. Fixed. >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:640 >> + return; > > Hmm. I don't object to these messages because "not supported in the ICU version you're using" isn't appropriate to show to the user, but I wonder if we should differentiate them from the U_FAILURE cases somehow. Fixed. >>> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:726 >>> + status = U_ZERO_ERROR; >> >> I think you can delete this line? >> (U_FAILURE is a `> U_ZERO_ERROR` check.) > > I am not sure that’s right. Warnings are negative numbers, which are "< U_ZERO_ERROR". Not "== U_ZERO_ERROR". I've checked ICU document, and it seems that this re-initialization is not necessary. Removed. >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:761 >> + status = U_ZERO_ERROR; > > Ditto. Fixed. >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:1042 >> + status = U_ZERO_ERROR; > > Ditto x3 in this section. Fixed. >> Source/JavaScriptCore/runtime/IntlNumberFormat.h:38 >> +#endif > > Can we put this into PlatformHave.h instead of here? Unfortunately, this check requires U_ICU_VERSION_MAJOR_NUM which is defined in ICU headers. And Platform.h etc. assumes that this only contains macros (it should not include any C forward declarations) since it is also used by sandbox preprocessors. So, we cannot include ICU headers in PlatformHave.h. I leave this here for now. >> Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:35 >> +void setNumberFormatDigitOptions(JSGlobalObject* globalObject, IntlType* intlInstance, JSObject* options, unsigned minimumFractionDigitsDefault, unsigned maximumFractionDigitsDefault, IntlNotation notation) > > I suggest references for the two pointer arguments, but when in JavaScriptCore I am often overruled on this. Since these pointers are JS objects, for now, I follow the JSC's convention: using pointers for JS cells. >> Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:90 >> +class IntlFieldIterator { > > Fancy! :) >> Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:94 >> + explicit IntlFieldIterator(UFieldPositionIterator* iterator) > > Using a reference would have 3 benefits: > > 1) Helps make clear we can’t call next if it’s null. > 2) Makes the class non copyable without requiring the use of WTF_MAKE_NONCOPYABLE. > 3) Obviates need for ".get()" at the call site. Changed. >> Source/JavaScriptCore/runtime/IntlNumberFormatInlines.h:101 >> + UNUSED_PARAM(status); > > Might prefer the "unnamed argument" pattern to UNUSED_PARAM. Fixed. >> Source/WTF/wtf/unicode/icu/ICUHelpers.h:112 >> +struct ICUDeleter { > > What a good idea. :D This `template<auto>` is C++17's new fancy feature :) >> JSTests/stress/intl-numberformat-unified.js:38 >> + shouldBe((-100).toLocaleString("bn", { > > Do you think we should have some more tests for non-English locales (at least, ones which have more predictable behavior)? Added ja-JP pattern. >> JSTests/test262/expectations.yaml:1707 >> +test/intl402/NumberFormat/prototype/format/engineering-scientific-de-DE.js: > > These all require ICU >64, I take it? :( They are failing because ICU returns *slightly* different pattern text :( It passes with very new ICU like ICU 68. I've ensured that all of them pass with ICU 68.
Yusuke Suzuki
Comment 22 2020-06-29 01:13:01 PDT
Created attachment 403033 [details] Patch for landing
Yusuke Suzuki
Comment 23 2020-06-29 01:21:45 PDT
Created attachment 403035 [details] Patch for landing
Yusuke Suzuki
Comment 24 2020-06-29 07:29:47 PDT
Failed ios-wk2 tests are completely unrelated to Intl. media/now-playing-status-without-media.html webanimations/accelerated-animation-with-easing.html imported/w3c/web-platform-tests/cors/credentials-flag.htm imported/w3c/web-platform-tests/cors/origin.htm
Yusuke Suzuki
Comment 25 2020-06-29 09:37:52 PDT
I'll first land non-Intl part (StringView / ICUDeleter) to unlock further implementation.
Yusuke Suzuki
Comment 26 2020-06-29 10:23:41 PDT
Yusuke Suzuki
Comment 27 2020-06-29 10:24:01 PDT
I've just landed WTF part only. Remaining part will be landed later.
Yusuke Suzuki
Comment 28 2020-06-29 10:27:54 PDT
Created attachment 403079 [details] Patch for landing
Yusuke Suzuki
Comment 29 2020-07-26 00:38:37 PDT
Comment on attachment 403079 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=403079&action=review > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:213 > +// Create MeasureUnit like ICU4J. > +struct MeasureUnit { > + ASCIILiteral type; > + ASCIILiteral subType; > +}; > + > +static Optional<MeasureUnit> sanctionedSimpleUnitIdentifier(StringView unitIdentifier) > +{ > + static constexpr MeasureUnit simpleUnits[] = { > + { "area"_s, "acre"_s }, > + { "digital"_s, "bit"_s }, > + { "digital"_s, "byte"_s }, > + { "temperature"_s, "celsius"_s }, > + { "length"_s, "centimeter"_s }, > + { "duration"_s, "day"_s }, > + { "angle"_s, "degree"_s }, > + { "temperature"_s, "fahrenheit"_s }, > + { "volume"_s, "fluid-ounce"_s }, > + { "length"_s, "foot"_s }, > + { "volume"_s, "gallon"_s }, > + { "digital"_s, "gigabit"_s }, > + { "digital"_s, "gigabyte"_s }, > + { "mass"_s, "gram"_s }, > + { "area"_s, "hectare"_s }, > + { "duration"_s, "hour"_s }, > + { "length"_s, "inch"_s }, > + { "digital"_s, "kilobit"_s }, > + { "digital"_s, "kilobyte"_s }, > + { "mass"_s, "kilogram"_s }, > + { "length"_s, "kilometer"_s }, > + { "volume"_s, "liter"_s }, > + { "digital"_s, "megabit"_s }, > + { "digital"_s, "megabyte"_s }, > + { "length"_s, "meter"_s }, > + { "length"_s, "mile"_s }, > + { "length"_s, "mile-scandinavian"_s }, > + { "volume"_s, "milliliter"_s }, > + { "length"_s, "millimeter"_s }, > + { "duration"_s, "millisecond"_s }, > + { "duration"_s, "minute"_s }, > + { "duration"_s, "month"_s }, > + { "mass"_s, "ounce"_s }, > + { "concentr"_s, "percent"_s }, > + { "digital"_s, "petabyte"_s }, > + { "mass"_s, "pound"_s }, > + { "duration"_s, "second"_s }, > + { "mass"_s, "stone"_s }, > + { "digital"_s, "terabit"_s }, > + { "digital"_s, "terabyte"_s }, > + { "duration"_s, "week"_s }, > + { "length"_s, "yard"_s }, > + { "duration"_s, "year"_s }, > + }; > + > + auto iterator = std::lower_bound(std::begin(simpleUnits), std::end(simpleUnits), unitIdentifier, > + [] (const MeasureUnit& unit, StringView unitIdentifier) { > + return WTF::codePointCompare(StringView(unit.subType), unitIdentifier) < 0; > + }); > + if (iterator != std::end(simpleUnits) && !WTF::codePointCompare(StringView(iterator->subType), unitIdentifier)) > + return *iterator; > + return WTF::nullopt; > +} Let's land this part to unlock further prototyping.
Yusuke Suzuki
Comment 30 2020-07-26 00:39:31 PDT
Comment on attachment 403079 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=403079&action=review >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:213 >> +} > > Let's land this part to unlock further prototyping. Or, for now, I'll just copy & paste.
Darin Adler
Comment 31 2020-07-26 09:12:27 PDT
Comment on attachment 403079 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=403079&action=review > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:209 > + auto iterator = std::lower_bound(std::begin(simpleUnits), std::end(simpleUnits), unitIdentifier, > + [] (const MeasureUnit& unit, StringView unitIdentifier) { > + return WTF::codePointCompare(StringView(unit.subType), unitIdentifier) < 0; > + }); I’d like to see a std::is_sorted assertion. If we can’t check it is properly sorted at compile time, at least we should check it at runtime. Seems like we might also want to consider overloading codePointCompare so it can handle the mix of ASCIILiteral and StringView without an explicit cast. There may even be some value in optimizing that case the way we optimize== for such combinations. Although right now it looks like we only optimize == with const char*, not ASCIILiteral. > Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:210 > + if (iterator != std::end(simpleUnits) && !WTF::codePointCompare(StringView(iterator->subType), unitIdentifier)) This can use == instead of "!codePointCompare". The named function is only needed for sorting, which is ambiguous enough that we didn’t want a code-point based comparison ignoring linguistic rules to be implied. We don't treat equality as ambiguous, though, so == can be used.
Yusuke Suzuki
Comment 32 2020-08-12 20:45:55 PDT
Yusuke Suzuki
Comment 33 2020-08-22 11:07:07 PDT
Comment on attachment 403079 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=403079&action=review >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:209 >> + }); > > I’d like to see a std::is_sorted assertion. If we can’t check it is properly sorted at compile time, at least we should check it at runtime. > > Seems like we might also want to consider overloading codePointCompare so it can handle the mix of ASCIILiteral and StringView without an explicit cast. There may even be some value in optimizing that case the way we optimize== for such combinations. Although right now it looks like we only optimize == with const char*, not ASCIILiteral. Added. >> Source/JavaScriptCore/runtime/IntlNumberFormat.cpp:210 >> + if (iterator != std::end(simpleUnits) && !WTF::codePointCompare(StringView(iterator->subType), unitIdentifier)) > > This can use == instead of "!codePointCompare". The named function is only needed for sorting, which is ambiguous enough that we didn’t want a code-point based comparison ignoring linguistic rules to be implied. We don't treat equality as ambiguous, though, so == can be used. Changed!
Yusuke Suzuki
Comment 34 2020-08-22 11:32:31 PDT
Yusuke Suzuki
Comment 35 2020-08-22 15:14:49 PDT
Catalina JSC bot is failing while EWS is green. It looks like Catalina Safari is built against older SDK which includes old ICU header.
Yusuke Suzuki
Comment 36 2020-08-22 15:34:48 PDT
Yusuke Suzuki
Comment 37 2020-08-23 00:11:56 PDT
Yusuke Suzuki
Comment 38 2021-06-03 10:09:09 PDT
*** Bug 226297 has been marked as a duplicate of this bug. ***
Note You need to log in before you can comment on or make changes to this bug.