<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>209579</bug_id>
          
          <creation_ts>2020-03-25 19:20:21 -0700</creation_ts>
          <short_desc>Escaped UTF-8 not parsed correctly with icu 52</short_desc>
          <delta_ts>2020-04-21 22:12:50 -0700</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>JavaScriptCore</component>
          <version>WebKit Nightly Build</version>
          <rep_platform>All</rep_platform>
          <op_sys>Linux</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>WONTFIX</resolution>
          
          <see_also>https://bugs.webkit.org/show_bug.cgi?id=209694</see_also>
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>Minor</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Mike Gorse">mgorse</reporter>
          <assigned_to name="Nobody">webkit-unassigned</assigned_to>
          <cc>bfulgham</cc>
    
    <cc>bugs-noreply</cc>
    
    <cc>clopez</cc>
    
    <cc>darin</cc>
    
    <cc>ews-watchlist</cc>
    
    <cc>keith_miller</cc>
    
    <cc>mark.lam</cc>
    
    <cc>mcatanzaro</cc>
    
    <cc>mcrha</cc>
    
    <cc>mgorse</cc>
    
    <cc>msaboff</cc>
    
    <cc>saam</cc>
    
    <cc>tzagallo</cc>
    
    <cc>ysuzuki</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>1634172</commentid>
    <comment_count>0</comment_count>
    <who name="Mike Gorse">mgorse</who>
    <bug_when>2020-03-25 19:20:21 -0700</bug_when>
    <thetext>I&apos;m not sure if there&apos;s a minimum version of icu that we support; I noticed this while trying to prepare a security update for SLED 12 SP2, which has old versions of many things, including icu.
Anyhow, prior to icu 53, U8_COUNT_TRAIL_BYTES does not cast its parameter to unsigned before doing comparisons, and it ends up always returning 0 if a signed char is passed in.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1634174</commentid>
    <comment_count>1</comment_count>
      <attachid>394569</attachid>
    <who name="Mike Gorse">mgorse</who>
    <bug_when>2020-03-25 19:24:29 -0700</bug_when>
    <thetext>Created attachment 394569
Patch.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1634393</commentid>
    <comment_count>2</comment_count>
      <attachid>394569</attachid>
    <who name="Alexey Proskuryakov">ap</who>
    <bug_when>2020-03-26 10:19:25 -0700</bug_when>
    <thetext>Comment on attachment 394569
Patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=394569&amp;action=review

&gt; Source/JavaScriptCore/ChangeLog:3
&gt; +        Escaped UTF-8 not parsed correctly with icu 52

That&apos;s... from 2013. It doesn&apos;t seem meaningful to support such systems in WebKit trunk.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1634451</commentid>
    <comment_count>3</comment_count>
      <attachid>394569</attachid>
    <who name="Darin Adler">darin</who>
    <bug_when>2020-03-26 11:31:14 -0700</bug_when>
    <thetext>Comment on attachment 394569
Patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=394569&amp;action=review

&gt;&gt; Source/JavaScriptCore/ChangeLog:3
&gt;&gt; +        Escaped UTF-8 not parsed correctly with icu 52
&gt; 
&gt; That&apos;s... from 2013. It doesn&apos;t seem meaningful to support such systems in WebKit trunk.

I agree, unless there are some extenuating circumstances that require that we support this going forward.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1634452</commentid>
    <comment_count>4</comment_count>
      <attachid>394569</attachid>
    <who name="Darin Adler">darin</who>
    <bug_when>2020-03-26 11:31:39 -0700</bug_when>
    <thetext>Comment on attachment 394569
Patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=394569&amp;action=review

&gt;&gt;&gt; Source/JavaScriptCore/ChangeLog:3
&gt;&gt;&gt; +        Escaped UTF-8 not parsed correctly with icu 52
&gt;&gt; 
&gt;&gt; That&apos;s... from 2013. It doesn&apos;t seem meaningful to support such systems in WebKit trunk.
&gt; 
&gt; I agree, unless there are some extenuating circumstances that require that we support this going forward.

Who has a need to build the very latest WebKit, but with very old ICU?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1634515</commentid>
    <comment_count>5</comment_count>
    <who name="Mike Gorse">mgorse</who>
    <bug_when>2020-03-26 13:13:56 -0700</bug_when>
    <thetext>(In reply to Darin Adler from comment #4)
&gt; Comment on attachment 394569 [details]
&gt; Patch.
&gt; 
&gt; View in context:
&gt; https://bugs.webkit.org/attachment.cgi?id=394569&amp;action=review
&gt; 
&gt; &gt;&gt;&gt; Source/JavaScriptCore/ChangeLog:3
&gt; &gt;&gt;&gt; +        Escaped UTF-8 not parsed correctly with icu 52
&gt; &gt;&gt; 
&gt; &gt;&gt; That&apos;s... from 2013. It doesn&apos;t seem meaningful to support such systems in WebKit trunk.
&gt; &gt; 
&gt; &gt; I agree, unless there are some extenuating circumstances that require that we support this going forward.
&gt; 
&gt; Who has a need to build the very latest WebKit, but with very old ICU?

We are still, for instance, supporting SLE 12, which is on GNOME 3.20 as of SP2. But, from what I gather, it is recommended to ship the newest WebKitGTK release, in order to provide security fixes, although this becomes increasingly difficult over time when we are shipping versions of several libraries that WebKit no longer supports. Anyway, which versions of libraries to support is a larger question than just icu, and this wouldn&apos;t be the only downstream patch that I&apos;d need to carry...</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1634516</commentid>
    <comment_count>6</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2020-03-26 13:15:21 -0700</bug_when>
    <thetext>(In reply to Mike Gorse from comment #5)
&gt; We are still, for instance, supporting SLE 12, which is on GNOME 3.20 as of
&gt; SP2. But, from what I gather, it is recommended to ship the newest WebKitGTK
&gt; release, in order to provide security fixes, although this becomes
&gt; increasingly difficult over time when we are shipping versions of several
&gt; libraries that WebKit no longer supports. Anyway, which versions of
&gt; libraries to support is a larger question than just icu, and this wouldn&apos;t
&gt; be the only downstream patch that I&apos;d need to carry...

If updating WebKit I suggest also updating ICU to something newer than 7 years old.

If that’s not practical, then I think carrying a downstream patch would be the preferred solution.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1634756</commentid>
    <comment_count>7</comment_count>
    <who name="Milan Crha">mcrha</who>
    <bug_when>2020-03-26 23:51:11 -0700</bug_when>
    <thetext>(In reply to Darin Adler from comment #6)
&gt; If updating WebKit I suggest also updating ICU to something newer than 7
&gt; years old.

I agree with this, though if WebKit itself doesn&apos;t require newer ICU, it should work properly with the older too. In other words, if WebKit expects certain functionality or behaviour of some of its dependencies, which is available only since some version, then it&apos;s WebKit, which should claim that it requires such version of the dependencies. Just like when using new API from the dependency.

The other way around is to make the software working in the older versions as well.

Looking into Mike&apos;s change, it&apos;s quite trivial. I do not know the details, but I&apos;d even use &apos;unsigned char&apos; for the &apos;b0&apos; directly, to not have it defined as &apos;char&apos;. There&apos;s always a problem with the highest bit when it comes to cast between signed and unsigned types.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1634837</commentid>
    <comment_count>8</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2020-03-27 09:45:25 -0700</bug_when>
    <thetext>(In reply to Milan Crha from comment #7)
&gt; if WebKit itself doesn&apos;t require newer ICU, it
&gt; should work properly with the older too

That’s something we don’t agree on. WebKit often moves dependency forward and it’s a non-goal to work around bugs in very old versions of dependent libraries.

. In other words, if WebKit expects
&gt; certain functionality or behaviour of some of its dependencies, which is
&gt; available only since some version, then it&apos;s WebKit, which should claim that
&gt; it requires such version of the dependencies. Just like when using new API
&gt; from the dependency

Yes, the WebKit package should clearly state the dependency on a newer version of ICU. Let&apos;s fix that!

&gt; The other way around is to make the software working in the older versions
&gt; as well.

Easier said than done.

&gt; Looking into Mike&apos;s change, it&apos;s quite trivial. I do not know the details,
&gt; but I&apos;d even use &apos;unsigned char&apos; for the &apos;b0&apos; directly, to not have it
&gt; defined as &apos;char&apos;. There&apos;s always a problem with the highest bit when it
&gt; comes to cast between signed and unsigned types.

I’m not sure I agree. Is this the only problem with older ICU? We don’t test with those older versions and there is no guarantee this is sufficient.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1635429</commentid>
    <comment_count>9</comment_count>
    <who name="Milan Crha">mcrha</who>
    <bug_when>2020-03-30 02:51:30 -0700</bug_when>
    <thetext>(In reply to Darin Adler from comment #8)
&gt; That’s something we don’t agree on.

Partly. You kind of contradict that statement lower in the comment. See my adds:

&gt; WebKit often moves dependency forward and it’s a non-goal to work around
&gt; bugs in very old versions of dependent libraries.

I definitely agree with you in this, there are no resources to try to build with every single combination of the libraries out there. Neither the software should fix bugs in the libraries it uses, that&apos;s not the way it&apos;s supposed to work by all means. There are sometimes workarounds, but the rule of thumb should be: fix the library, not me, thus everyone will benefit from the fix.

&gt; Yes, the WebKit package should clearly state the dependency on a newer
&gt; version of ICU. Let&apos;s fix that!

This makes us agree on the subject :)

&gt; &gt; The other way around is to make the software working in the older versions
&gt; &gt; as well.
&gt; 
&gt; Easier said than done.

I know. I&apos;m sorry for the language, it was supposed to be sarcastic and hopefully (at least partly) funny by pushing into the extremes, but s I re-read it it didn&apos;t make it. I&apos;m sorry.

&gt; I’m not sure I agree. Is this the only problem with older ICU? We don’t test
&gt; with those older versions and there is no guarantee this is sufficient.

I cannot answer the above question. My comments about the patch were more or less generic, knowing the issues with conversion between signed and unsigned types and what issues it can cause.

As long as you bump the version of the ICU requirement it&apos;s fine to ignore the patch.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1636484</commentid>
    <comment_count>10</comment_count>
    <who name="Michael Catanzaro">mcatanzaro</who>
    <bug_when>2020-04-01 07:47:56 -0700</bug_when>
    <thetext>So our dependencies policy [1] allows us to require ICU 60 (the version available in Ubuntu 18.04).

[1] https://trac.webkit.org/wiki/WebKitGTK/DependenciesPolicy

On the other hand, this is a one-line patch. Normally when we bump dependencies, we do it because there&apos;s some actual strong reason to do so (to use a new API, or due to an incompatible change in the dependency) rather than to avoid one cast. :P I would just accept the patch; it&apos;s small and noninvasive and saves us the minor inconvenience of us both having to carry the patch downstream in RHEL and SLED. (I assume I&apos;ll need this when I try to update WebKitGTK in RHEL 7; we have ICU 50 there. :)

If we really want to bump our ICU dependency to avoid a cast, then we need to declare it properly in every Options*.cmake that uses find_package(ICU): OptionsAppleWin.cmake, OptionsFTW.cmake, OptionsGTK.cmake, OptionsJSCOnly.cmake, OptionsMac.cmake, OptionsPlayStation.cmake, OptionsWPE.cmake, and OptionsWinCairo.cmake. We currently don&apos;t pass any minimum required version there.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1636485</commentid>
    <comment_count>11</comment_count>
    <who name="Michael Catanzaro">mcatanzaro</who>
    <bug_when>2020-04-01 07:52:26 -0700</bug_when>
    <thetext>(In reply to Mike Gorse from comment #0)
&gt; Anyhow, prior to icu 53, U8_COUNT_TRAIL_BYTES does not cast its parameter to
&gt; unsigned before doing comparisons, and it ends up always returning 0 if a
&gt; signed char is passed in.

Oh, *wow*, so this is a runtime failure, not a build failure?

I didn&apos;t look closely enough at first and assumed it was a build failure. Since this is a runtime failure, and distributors updating WebKit will not see any build error, that seems like a pretty strong argument to take the patch; otherwise, we&apos;ll just wind up with broken behavior at runtime. Is there some example HTML that demonstrates the incorrect behavior?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1636489</commentid>
    <comment_count>12</comment_count>
    <who name="Michael Catanzaro">mcatanzaro</who>
    <bug_when>2020-04-01 08:00:39 -0700</bug_when>
    <thetext>My thinking is this: a build failure is inconvenient. Carrying patches in WebKit to fix downstream build failures is also inconvenient for WebKit. If it&apos;s a big patch, that&apos;s probably too much inconvenience for upstream, and a downstream patch is warranted. If it&apos;s a one-liner, it&apos;s probably simpler for everyone to take the patch upstream.

But a runtime failure, where WebKit does not behave properly at runtime, is a footgun. We can&apos;t possibly expect distributors to patch WebKit to avoid this, because they receive no build warning to notice that anything is wrong. In particular, in this case, we&apos;re calling a library function that used to fail unless passed an unsigned char... just adding a cast to unsigned char is not very onerous.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1636512</commentid>
    <comment_count>13</comment_count>
    <who name="Carlos Alberto Lopez Perez">clopez</who>
    <bug_when>2020-04-01 09:00:21 -0700</bug_when>
    <thetext>(In reply to Michael Catanzaro from comment #10)
&gt; If we really want to bump our ICU dependency to avoid a cast, then we need
&gt; to declare it properly in every Options*.cmake that uses find_package(ICU):
&gt; OptionsAppleWin.cmake, OptionsFTW.cmake, OptionsGTK.cmake,
&gt; OptionsJSCOnly.cmake, OptionsMac.cmake, OptionsPlayStation.cmake,
&gt; OptionsWPE.cmake, and OptionsWinCairo.cmake. We currently don&apos;t pass any
&gt; minimum required version there.

See bug 209694</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1636576</commentid>
    <comment_count>14</comment_count>
    <who name="Mike Gorse">mgorse</who>
    <bug_when>2020-04-01 11:18:10 -0700</bug_when>
    <thetext>(In reply to Michael Catanzaro from comment #11)
&gt; (In reply to Mike Gorse from comment #0)
&gt; &gt; Anyhow, prior to icu 53, U8_COUNT_TRAIL_BYTES does not cast its parameter to
&gt; &gt; unsigned before doing comparisons, and it ends up always returning 0 if a
&gt; &gt; signed char is passed in.
&gt; 
&gt; Oh, *wow*, so this is a runtime failure, not a build failure?
&gt; 
&gt; I didn&apos;t look closely enough at first and assumed it was a build failure.
&gt; Since this is a runtime failure, and distributors updating WebKit will not
&gt; see any build error, that seems like a pretty strong argument to take the
&gt; patch; otherwise, we&apos;ll just wind up with broken behavior at runtime. Is
&gt; there some example HTML that demonstrates the incorrect behavior?

Yes, it is a runtime failure. I don&apos;t have a sample off-hand. I noticed it when I tried to smoke test my WebKitGTK package and found that I couldn&apos;t authenticate a gmail connection in evolution; clicking Next on the screen that asked for an email address wouldn&apos;t move to the next screen for me to enter a password. JavaScriptCore appears to encounter %e2 somewhere on the page and fail to recognize it as the beginning of a UTF-8 sequence.</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="1"
              isprivate="0"
          >
            <attachid>394569</attachid>
            <date>2020-03-25 19:24:29 -0700</date>
            <delta_ts>2020-03-27 09:45:39 -0700</delta_ts>
            <desc>Patch.</desc>
            <filename>icu.patch</filename>
            <type>text/plain</type>
            <size>1580</size>
            <attacher name="Mike Gorse">mgorse</attacher>
            
              <data encoding="base64">ZGlmZiAtLWdpdCBhL1NvdXJjZS9KYXZhU2NyaXB0Q29yZS9DaGFuZ2VMb2cgYi9Tb3VyY2UvSmF2
YVNjcmlwdENvcmUvQ2hhbmdlTG9nCmluZGV4IDUyOWRkYTYzZGNiLi5jOGViMDFhYzc3ZiAxMDA2
NDQKLS0tIGEvU291cmNlL0phdmFTY3JpcHRDb3JlL0NoYW5nZUxvZworKysgYi9Tb3VyY2UvSmF2
YVNjcmlwdENvcmUvQ2hhbmdlTG9nCkBAIC0xLDMgKzEsMTQgQEAKKzIwMjAtMDMtMjUgIE1pa2Ug
R29yc2UgIDxtZ29yc2VAc3VzZS5jb20+CisKKyAgICAgICAgRXNjYXBlZCBVVEYtOCBub3QgcGFy
c2VkIGNvcnJlY3RseSB3aXRoIGljdSA1MgorICAgICAgICBodHRwczovL2J1Z3Mud2Via2l0Lm9y
Zy9zaG93X2J1Zy5jZ2k/aWQ9MjA5NTc5CisKKyAgICAgICAgUmV2aWV3ZWQgYnkgTk9CT0RZIChP
T1BTISkuCisKKyAgICAgICAgKiBydW50aW1lL0pTR2xvYmFsT2JqZWN0RnVuY3Rpb25zLmNwcDoK
KyAgICAgICAgKEpTQzo6ZGVjb2RlKTogY2FzdCBjaGFyIHRvIHVuc2lnbmVkIGJlZm9yZSBjYWxs
aW5nCisgICAgICAgIFU4X0NPVU5UX1RSQUlMX0JZVEVTLgorCiAyMDIwLTAzLTI1ICBBbGV4ZXkg
U2h2YXlrYSAgPHNodmFpa2FsZXNoQGdtYWlsLmNvbT4KIAogICAgICAgICBcYiBlc2NhcGVzIGlu
c2lkZSBjaGFyYWN0ZXIgY2xhc3NlcyBzaG91bGQgYmUgdmFsaWQgaW4gVW5pY29kZSBwYXR0ZXJu
cwpkaWZmIC0tZ2l0IGEvU291cmNlL0phdmFTY3JpcHRDb3JlL3J1bnRpbWUvSlNHbG9iYWxPYmpl
Y3RGdW5jdGlvbnMuY3BwIGIvU291cmNlL0phdmFTY3JpcHRDb3JlL3J1bnRpbWUvSlNHbG9iYWxP
YmplY3RGdW5jdGlvbnMuY3BwCmluZGV4IDliYThjNzBkNzZhLi41MmJlNzNhMWMxOCAxMDA2NDQK
LS0tIGEvU291cmNlL0phdmFTY3JpcHRDb3JlL3J1bnRpbWUvSlNHbG9iYWxPYmplY3RGdW5jdGlv
bnMuY3BwCisrKyBiL1NvdXJjZS9KYXZhU2NyaXB0Q29yZS9ydW50aW1lL0pTR2xvYmFsT2JqZWN0
RnVuY3Rpb25zLmNwcApAQCAtMTgwLDcgKzE4MCw3IEBAIHN0YXRpYyBKU1ZhbHVlIGRlY29kZShK
U0dsb2JhbE9iamVjdCogZ2xvYmFsT2JqZWN0LCBjb25zdCBDaGFyVHlwZSogY2hhcmFjdGVycywK
ICAgICAgICAgICAgIGludCBjaGFyTGVuID0gMDsKICAgICAgICAgICAgIGlmIChrIDw9IGxlbmd0
aCAtIDMgJiYgaXNBU0NJSUhleERpZ2l0KHBbMV0pICYmIGlzQVNDSUlIZXhEaWdpdChwWzJdKSkg
ewogICAgICAgICAgICAgICAgIGNvbnN0IGNoYXIgYjAgPSBMZXhlcjxDaGFyVHlwZT46OmNvbnZl
cnRIZXgocFsxXSwgcFsyXSk7Ci0gICAgICAgICAgICAgICAgY29uc3QgaW50IHNlcXVlbmNlTGVu
ID0gMSArIFU4X0NPVU5UX1RSQUlMX0JZVEVTKGIwKTsKKyAgICAgICAgICAgICAgICBjb25zdCBp
bnQgc2VxdWVuY2VMZW4gPSAxICsgVThfQ09VTlRfVFJBSUxfQllURVMoc3RhdGljX2Nhc3Q8dW5z
aWduZWQgY2hhcj4oYjApKTsKICAgICAgICAgICAgICAgICBpZiAoayA8PSBsZW5ndGggLSBzZXF1
ZW5jZUxlbiAqIDMpIHsKICAgICAgICAgICAgICAgICAgICAgY2hhckxlbiA9IHNlcXVlbmNlTGVu
ICogMzsKICNpZiBVX0lDVV9WRVJTSU9OX01BSk9SX05VTSA+PSA2MAo=
</data>
<flag name="review"
          id="409996"
          type_id="1"
          status="-"
          setter="darin"
    />
    <flag name="commit-queue"
          id="409997"
          type_id="3"
          status="-"
          setter="darin"
    />
          </attachment>
      

    </bug>

</bugzilla>