WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
209772
[ECMA-402] Implement Intl.Locale
https://bugs.webkit.org/show_bug.cgi?id=209772
Summary
[ECMA-402] Implement Intl.Locale
Shane Carr
Reported
2020-03-30 14:36:22 PDT
Intl.Locale has reached Stage 4 in TC39 as of February 2020, and the other major browser engines (V8 and SpiderMonkey) have been shipping it for some time. As usage of Intl.Locale increases throughout the ecosystem, WebKit users will be left behind with legacy polyfills, leading to poorer performance relative to other browsers. Intl.Locale is a low-level type whose usage we expect to grow rapidly as more and more libraries begin to leverage it. ICU4C exposes C and C++ APIs that can be used to implement Intl.Locale. Implementing Intl.Locale is largely a matter of adding the glue between JavaScript and ICU4C, as WebKit already does for other Intl types.
Attachments
Patch
(97.71 KB, patch)
2020-05-02 18:19 PDT
,
Ross Kirsling
no flags
Details
Formatted Diff
Diff
Patch
(98.39 KB, patch)
2020-05-02 20:09 PDT
,
Ross Kirsling
no flags
Details
Formatted Diff
Diff
Patch for landing
(100.73 KB, patch)
2020-05-05 21:25 PDT
,
Ross Kirsling
no flags
Details
Formatted Diff
Diff
Show Obsolete
(2)
View All
Add attachment
proposed patch, testcase, etc.
Shane Carr
Comment 1
2020-03-30 14:40:08 PDT
Original proposal repo:
https://github.com/tc39/proposal-intl-locale
Radar WebKit Bug Importer
Comment 2
2020-03-30 16:39:02 PDT
<
rdar://problem/61079762
>
Ross Kirsling
Comment 3
2020-05-02 18:19:22 PDT
Created
attachment 398301
[details]
Patch
Ross Kirsling
Comment 4
2020-05-02 19:06:45 PDT
Comment hidden (obsolete)
Hmm, since this is basically a static object with a slew of lazy getters, I suppose it would be good to have them all do call_once.
Ross Kirsling
Comment 5
2020-05-02 19:32:15 PDT
Comment hidden (obsolete)
(In reply to Ross Kirsling from
comment #4
)
> Hmm, since this is basically a static object with a slew of lazy getters, I > suppose it would be good to have them all do call_once.
Err no, that's not for caching instance data, what am I talking about...
Ross Kirsling
Comment 6
2020-05-02 20:09:39 PDT
Created
attachment 398303
[details]
Patch
Alexey Shvayka
Comment 7
2020-05-05 10:35:14 PDT
Comment on
attachment 398303
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=398303&action=review
> Source/JavaScriptCore/runtime/IntlLocalePrototype.cpp:56 > +const ClassInfo IntlLocalePrototype::s_info = { "Object", &Base::s_info, &localePrototypeTable, nullptr, CREATE_METHOD_TABLE(IntlLocalePrototype) };
We might also set `className` to "Intl.Locale" and use JSC_TO_STRING_TAG_WITHOUT_TRANSITION in IntlLocalePrototype::finishCreation().
Ross Kirsling
Comment 8
2020-05-05 11:08:28 PDT
(In reply to Alexey Shvayka from
comment #7
)
> Comment on
attachment 398303
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=398303&action=review
> > > Source/JavaScriptCore/runtime/IntlLocalePrototype.cpp:56 > > +const ClassInfo IntlLocalePrototype::s_info = { "Object", &Base::s_info, &localePrototypeTable, nullptr, CREATE_METHOD_TABLE(IntlLocalePrototype) }; > > We might also set `className` to "Intl.Locale" and use > JSC_TO_STRING_TAG_WITHOUT_TRANSITION in > IntlLocalePrototype::finishCreation().
Oh, I didn't know about this approach. Perhaps I should do so for all of the Intl prototypes?
Alexey Shvayka
Comment 9
2020-05-05 11:15:43 PDT
(In reply to Ross Kirsling from
comment #8
)
> Oh, I didn't know about this approach. Perhaps I should do so for all of the > Intl prototypes?
It's very new (introduced in
r260992
); other Intl prototypes are already covered. Also, I hope that
https://github.com/tc39/ecma402/pull/430
will align @@toStringTag values of all Intl prototypes.
Darin Adler
Comment 10
2020-05-05 11:17:02 PDT
Comment on
attachment 398303
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=398303&action=review
Looks generally great. I have some class design thoughts.
> Source/JavaScriptCore/runtime/IntlLocale.cpp:85 > + UErrorCode status = U_ZERO_ERROR; > + Vector<char, 32> buffer(32); > + int32_t parsedLength; > + auto bufferLength = uloc_forLanguageTag(input.data(), buffer.data(), buffer.size(), &parsedLength, &status); > + if (needsToGrowToProduceBuffer(status)) { > + buffer.resize(bufferLength); > + status = U_ZERO_ERROR; > + uloc_forLanguageTag(input.data(), buffer.data(), bufferLength, &parsedLength, &status); > + } > + if (U_FAILURE(status) || parsedLength != static_cast<int32_t>(input.length())) > + return false;
I am not OK with us continuing to write these out individually. I am going to come up with a template/pattern for these so we have a better chance of doing it right.
> Source/JavaScriptCore/runtime/IntlLocale.cpp:475 > + m_numeric = triState(keyword("colnumeric"_s).startsWith("true"_s));
I looked at the specification and did not know how to find the string "colnumeric" or the rule about "true" as a prefix. Surprised that a "starts with" check is correct here. Checking that something starts with an ASCII literal (case sensitive) is not well optimized, perhaps because it’s not very common. Surprised that it is the right thing here. From a micro-optimization point of view, annoying that this creates/destroys a WTF::String containing the characters "true"; a malloc/free pair. Neither == nor startsWithLettersIgnoringASCIICase does that. We just don’t have a startsWith overload that takes a literal. If we don’t want to add new overloads for WTF::String::startsWith, to avoid allocating memory we could use StringView: StringView { keyword("colnumeric"_s) }.startsWith("true") That would be more efficient. Also, ironically, the free function startsWith from StringCommon.h is better optimized, although it doesn’t work with literals: startsWith(keyword("colnumeric"_s), StringView { "true "}) Both of the above would be more efficient than String::startsWith. Further, the StringCommon.h functions could be fixed to work with ASCII literals by adding member functions named length(), is8Bit(), characters8(), and characters16() to ASCIILiteral or by some other kinds of refactoring.
> Source/JavaScriptCore/runtime/IntlLocale.h:76 > + void setKeyword(ASCIILiteral, const String&);
I think this should be called setKeywordValue. I think this should take a StringView. I know that currently String has a function to produce UTF-8 characters and StringView does not, but we should fix that.
> Source/JavaScriptCore/runtime/IntlLocale.h:77 > + String keyword(ASCIILiteral) const;
I think this should be called keywordValue.
> Source/JavaScriptCore/runtime/IntlLocale.h:80 > + CString m_localeID;
CString seems fine/great for the locale ID after it’s computed. While it’s being computed I don’t think CString is optimal. Vector<char> would probably be better, probably even with inline capacity. I think it’s a mistake that IntlLocale object is both the "locale ID builder" and the thing to use after it’s built. I suggest a separate object for use as the locale ID builder, that gets created and destroyed within the initializeLocaleFunction. Many of the private member functions would be part of that, not part of IntlLocale.
Saam Barati
Comment 11
2020-05-05 11:39:16 PDT
Comment on
attachment 398303
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=398303&action=review
> Source/JavaScriptCore/runtime/IntlLocale.cpp:79 > + if (needsToGrowToProduceBuffer(status)) {
This is wrong. It needs to be: needsToGrowToProduceCString
Saam Barati
Comment 12
2020-05-05 11:41:56 PDT
Comment on
attachment 398303
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=398303&action=review
>> Source/JavaScriptCore/runtime/IntlLocale.cpp:79 >> + if (needsToGrowToProduceBuffer(status)) { > > This is wrong. It needs to be: > needsToGrowToProduceCString
Sorry I think I'm wrong here based on your usage. Reinstating Darin's r+
> Source/JavaScriptCore/runtime/IntlLocale.cpp:95 > + auto bufferLength = uloc_canonicalize(m_localeID.data(), buffer.data(), buffer.size(), &status);
Looks like the same issue.
Saam Barati
Comment 13
2020-05-05 11:45:36 PDT
Comment on
attachment 398303
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=398303&action=review
>> Source/JavaScriptCore/runtime/IntlLocale.cpp:95 >> + auto bufferLength = uloc_canonicalize(m_localeID.data(), buffer.data(), buffer.size(), &status); > > Looks like the same issue.
Also wrong here. I did a quick vetting of all uses of "needsToGrowXYZ" and they seem correct.
Ross Kirsling
Comment 14
2020-05-05 11:47:36 PDT
(In reply to Saam Barati from
comment #13
)
> Comment on
attachment 398303
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=398303&action=review
> > >> Source/JavaScriptCore/runtime/IntlLocale.cpp:95 > >> + auto bufferLength = uloc_canonicalize(m_localeID.data(), buffer.data(), buffer.size(), &status); > > > > Looks like the same issue. > > Also wrong here. I did a quick vetting of all uses of "needsToGrowXYZ" and > they seem correct.
Thanks. I guess this may imply that the naming is a little confusing? Since the relevance of the warning is when we need the buffer itself to be null-terminated so that we can use its data *without* making any sort of string class.
Ross Kirsling
Comment 15
2020-05-05 11:58:12 PDT
(In reply to Darin Adler from
comment #10
)
> > Source/JavaScriptCore/runtime/IntlLocale.cpp:475 > > + m_numeric = triState(keyword("colnumeric"_s).startsWith("true"_s)); > > I looked at the specification and did not know how to find the string > "colnumeric" or the rule about "true" as a prefix. Surprised that a "starts > with" check is correct here. > > Checking that something starts with an ASCII literal (case sensitive) is not > well optimized, perhaps because it’s not very common. Surprised that it is > the right thing here.
This situation here is kind of goofy because we're operating on ICU's legacy locale identifiers. In particular, they represent boolean true as "yes", but when we modernize these legacy "yes" and "no" values they become "truefalse" and "falsefalse" (which would seem like a very weird bug, but then a missing value turns into just "false" so maybe it's somehow intended??). I didn't feel great about using startsWith as a solution to begin with, so I think this means we should just avoid the modernization step for boolean values and check `== "yes"` instead.
Saam Barati
Comment 16
2020-05-05 12:01:37 PDT
(In reply to Ross Kirsling from
comment #14
)
> (In reply to Saam Barati from
comment #13
) > > Comment on
attachment 398303
[details]
> > Patch > > > > View in context: > >
https://bugs.webkit.org/attachment.cgi?id=398303&action=review
> > > > >> Source/JavaScriptCore/runtime/IntlLocale.cpp:95 > > >> + auto bufferLength = uloc_canonicalize(m_localeID.data(), buffer.data(), buffer.size(), &status); > > > > > > Looks like the same issue. > > > > Also wrong here. I did a quick vetting of all uses of "needsToGrowXYZ" and > > they seem correct. > > Thanks. I guess this may imply that the naming is a little confusing? Since > the relevance of the warning is when we need the buffer itself to be > null-terminated so that we can use its data *without* making any sort of > string class.
The confusion on my end is just not trusting my immediate reaction that a char* buffer needs to be null terminated. It doesn’t depending on how the result is used. I think Darin’s planned refactoring will make this a lot nicer by having this code all in one place
Ross Kirsling
Comment 17
2020-05-05 12:30:18 PDT
(In reply to Ross Kirsling from
comment #8
)
> (In reply to Alexey Shvayka from
comment #7
) > > Comment on
attachment 398303
[details]
> > Patch > > > > View in context: > >
https://bugs.webkit.org/attachment.cgi?id=398303&action=review
> > > > > Source/JavaScriptCore/runtime/IntlLocalePrototype.cpp:56 > > > +const ClassInfo IntlLocalePrototype::s_info = { "Object", &Base::s_info, &localePrototypeTable, nullptr, CREATE_METHOD_TABLE(IntlLocalePrototype) }; > > > > We might also set `className` to "Intl.Locale" and use > > JSC_TO_STRING_TAG_WITHOUT_TRANSITION in > > IntlLocalePrototype::finishCreation(). > > Oh, I didn't know about this approach. Perhaps I should do so for all of the > Intl prototypes?
Oh oops, I now see this was done in
bug 211020
. Will follow suit here.
Ross Kirsling
Comment 18
2020-05-05 21:14:44 PDT
(In reply to Darin Adler from
comment #10
)
> > Source/JavaScriptCore/runtime/IntlLocale.h:76 > > + void setKeyword(ASCIILiteral, const String&); > > I think this should be called setKeywordValue. I think this should take a > StringView. I know that currently String has a function to produce UTF-8 > characters and StringView does not, but we should fix that.
We can do this, but since characters8() lacks a null terminator, I think we still have to make a CString before passing to uloc_setKeywordValue.
Darin Adler
Comment 19
2020-05-05 21:20:16 PDT
(In reply to Ross Kirsling from
comment #18
)
> We can do this, but since characters8() lacks a null terminator, I think we > still have to make a CString before passing to uloc_setKeywordValue.
Of course: characters8() also won’t produce UTF-8. I’m just saying that we can create a function like String::utf8 that works on a StringView.
Ross Kirsling
Comment 20
2020-05-05 21:25:49 PDT
Created
attachment 398585
[details]
Patch for landing
EWS
Comment 21
2020-05-05 23:01:11 PDT
Committed
r261215
: <
https://trac.webkit.org/changeset/261215
> All reviewed patches have been landed. Closing bug and clearing flags on
attachment 398585
[details]
.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug