Bug 213999

Summary: [WTF] Bundle ICU
Product: WebKit Reporter: Yusuke Suzuki <ysuzuki>
Component: Web Template FrameworkAssignee: Yusuke Suzuki <ysuzuki>
Status: RESOLVED WONTFIX    
Severity: Normal CC: ap, cgarcia, darin, fpizlo, Hironori.Fujii, mitz, mmaxfield, ross.kirsling, thorton, webkit-bug-importer, xan.lopez
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 218829    

Description Yusuke Suzuki 2020-07-06 12:58:12 PDT
Discussed with Phil. ECMA402 is integrating a lot of Intl features which can be only usable on the latest ICU 67. And these APIs are marked as draft APIs.
If we wait for stable APIs for these features, and if we wait for shipping of ICU in macOS which includes stable APIs, then it is likely that we will get 2~ years behind in Intl features, that's not a non-starter.
We decided bundling the latest ICU into WebKit.
Comment 1 Yusuke Suzuki 2020-07-06 12:59:36 PDT
(In reply to Yusuke Suzuki from comment #0)
> Discussed with Phil. ECMA402 is integrating a lot of Intl features which can
> be only usable on the latest ICU 67. And these APIs are marked as draft APIs.
> If we wait for stable APIs for these features, and if we wait for shipping
> of ICU in macOS which includes stable APIs, then it is likely that we will
> get 2~ years behind in Intl features, that's not a non-starter.
> We decided bundling the latest ICU into WebKit.

And discussed with Phil, because Xcode causes super slow build if new project is added etc. we will start building ICU as a part of libWTF.
Comment 2 Radar WebKit Bug Importer 2020-07-06 13:00:26 PDT
<rdar://problem/65144498>
Comment 3 Myles C. Maxfield 2020-07-06 14:15:40 PDT
What? What increasing binary size? What about Apple-specific modifications to ICU algorithms/data and SPI? There's also value in having i18n behavior match the rest of the system, from a text editing standpoint. Did we consider:

1) For features that don't require new ICU API (but do require new data) have the features enabled, but use the old ICU data (thereby matching system behavior)
2) For features that require new ICU API, guard them behind version macros

? I don't think including two separate, yet distinct versions of ICU on the system is the right path forward.
Comment 4 Yusuke Suzuki 2020-07-06 14:38:31 PDT
(In reply to Myles C. Maxfield from comment #3)
> What? What increasing binary size? What about Apple-specific modifications
> to ICU algorithms/data and SPI? There's also value in having i18n behavior
> match the rest of the system, from a text editing standpoint. Did we
> consider:
> 
> 1) For features that don't require new ICU API (but do require new data)
> have the features enabled, but use the old ICU data (thereby matching system
> behavior)
> 2) For features that require new ICU API, guard them behind version macros
> 
> ? I don't think including two separate, yet distinct versions of ICU on the
> system is the right path forward.

(1) is already happening (See JSTests/test262/config.yaml). We are already getting failing tests in test262 just because of ICU version.
(2) is the main concern here.

The main problem here is that ECMA-402 update speed to very fast and we cannot catch up with this without bundling ICU.
ECMA-402 is not tied to ICU officially since it is implementation details, but in fact the engines for consensus are bundling ICU except for JSC (SpiderMonkey and V8).
And Intl.ListFormat and Intl.DateTimeFormat.formatRangeToParts are about to going to the stage 4 because there are two engines implementing these features.

However, the APIs required for the features only exist in ICU 67~ (Note that Big Sur beta's ICU is 66), and these required APIs are still marked as "draft" because it is added in ICU 67.
These APIs are added to fill the need for this new JS Intl features, and SpiderMonkey and V8 use these draft APIs since they are bundling ICU directly.
If we wait for the stable version of these APIs, we will likely wait for Big Sur + 2 release since they will get stable in ICU 69.
Comment 5 Myles C. Maxfield 2020-07-06 14:49:40 PDT
(In reply to Yusuke Suzuki from comment #4)
> (In reply to Myles C. Maxfield from comment #3)
> > What? What increasing binary size? What about Apple-specific modifications
> > to ICU algorithms/data and SPI? There's also value in having i18n behavior
> > match the rest of the system, from a text editing standpoint. Did we
> > consider:
> > 
> > 1) For features that don't require new ICU API (but do require new data)
> > have the features enabled, but use the old ICU data (thereby matching system
> > behavior)
> > 2) For features that require new ICU API, guard them behind version macros
> > 
> > ? I don't think including two separate, yet distinct versions of ICU on the
> > system is the right path forward.
> 
> (1) is already happening (See JSTests/test262/config.yaml). We are already
> getting failing tests in test262 just because of ICU version.

Why does test262 test old versions of the OS?

> (2) is the main concern here.
> 
> The main problem here is that ECMA-402 update speed to very fast and we
> cannot catch up with this without bundling ICU.

Apple *does* bundle ICU. We just bundle it with the entire OS, rather than with WebKit.

If there is a problem about the update rate of ICU, I suggest we start an internal thread about it.

> ECMA-402 is not tied to ICU officially since it is implementation details,
> but in fact the engines for consensus are bundling ICU except for JSC
> (SpiderMonkey and V8).
> And Intl.ListFormat and Intl.DateTimeFormat.formatRangeToParts are about to
> going to the stage 4 because there are two engines implementing these
> features.
> 
> However, the APIs required for the features only exist in ICU 67~ (Note that
> Big Sur beta's ICU is 66), and these required APIs are still marked as
> "draft" because it is added in ICU 67.
> These APIs are added to fill the need for this new JS Intl features, and
> SpiderMonkey and V8 use these draft APIs since they are bundling ICU
> directly.
> If we wait for the stable version of these APIs, we will likely wait for Big
> Sur + 2 release since they will get stable in ICU 69.
Comment 6 Yusuke Suzuki 2020-07-06 15:37:49 PDT
(In reply to Myles C. Maxfield from comment #5)
> (In reply to Yusuke Suzuki from comment #4)
> > (In reply to Myles C. Maxfield from comment #3)
> > > What? What increasing binary size? What about Apple-specific modifications
> > > to ICU algorithms/data and SPI? There's also value in having i18n behavior
> > > match the rest of the system, from a text editing standpoint. Did we
> > > consider:
> > > 
> > > 1) For features that don't require new ICU API (but do require new data)
> > > have the features enabled, but use the old ICU data (thereby matching system
> > > behavior)
> > > 2) For features that require new ICU API, guard them behind version macros
> > > 
> > > ? I don't think including two separate, yet distinct versions of ICU on the
> > > system is the right path forward.
> > 
> > (1) is already happening (See JSTests/test262/config.yaml). We are already
> > getting failing tests in test262 just because of ICU version.
> 
> Why does test262 test old versions of the OS?

Test262 is expecting the results from the latest ICU (currently, it is using ICU 67), while our current Catalina's ICU is 64.
So, some of tests are failing because our ICU is old.

> 
> > (2) is the main concern here.
> > 
> > The main problem here is that ECMA-402 update speed to very fast and we
> > cannot catch up with this without bundling ICU.
> 
> Apple *does* bundle ICU. We just bundle it with the entire OS, rather than
> with WebKit.
> 
> If there is a problem about the update rate of ICU, I suggest we start an
> internal thread about it.

I think we should update ICU more frequently, but we need to figure out whether we can use ICU "draft" APIs, which is critical.
SpiderMonkey and V8 are using "draft" APIs because they are controlling the exact version of ICU which are using.
If we cannot use "draft" APIs, then we need to wait for ICU 69 anyway since these APIs will not get "stable" status until that,
and this means that we need to wait for 1.5 - 2 years even if update is frequent enough (like, twice a year etc.).

> 
> > ECMA-402 is not tied to ICU officially since it is implementation details,
> > but in fact the engines for consensus are bundling ICU except for JSC
> > (SpiderMonkey and V8).
> > And Intl.ListFormat and Intl.DateTimeFormat.formatRangeToParts are about to
> > going to the stage 4 because there are two engines implementing these
> > features.
> > 
> > However, the APIs required for the features only exist in ICU 67~ (Note that
> > Big Sur beta's ICU is 66), and these required APIs are still marked as
> > "draft" because it is added in ICU 67.
> > These APIs are added to fill the need for this new JS Intl features, and
> > SpiderMonkey and V8 use these draft APIs since they are bundling ICU
> > directly.
> > If we wait for the stable version of these APIs, we will likely wait for Big
> > Sur + 2 release since they will get stable in ICU 69.
Comment 7 Alexey Proskuryakov 2020-07-11 14:46:34 PDT
In addition to fully agreeing with concerns stated by Myles, I'm so much not looking forward to making WebKit build slower.

If there is an internal discussion, I'd like to be part of it, as there are some additional caveats that may have not been brought up yet.
Comment 8 Myles C. Maxfield 2021-04-09 12:53:50 PDT
This bug is 9 months old, and I think we've found another path forward. Can this bug be closed?
Comment 9 Yusuke Suzuki 2021-04-09 12:56:19 PDT
Internally discussed, but we gave up on this idea.