Bug 135380 - [ARM] Incorrect handling of Unicode characters
Summary: [ARM] Incorrect handling of Unicode characters
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
Depends on:
Blocks: 108645
Reported: 2014-07-29 04:36 PDT by Dániel Bátyai
Modified: 2014-08-21 01:59 PDT

Patch (1.35 KB, patch)
2014-07-29 04:39 PDT, Dániel Bátyai
mark.lam: review-
Patch (1.65 KB, patch)
2014-08-06 07:23 PDT, Dániel Bátyai
no flags

Description Dániel Bátyai 2014-07-29 04:36:16 PDT
In C++, the signedness of char is implementation-defined. Unfortunately, this caused incorrect handling of Unicode characters in JavaScriptCore on ARM: the code in stringFromUTF8 in jsc.cpp (http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/jsc.cpp#L658) assumes that char is signed, which is not the case on ARM.
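
To illustrate the failure mode, here is a minimal sketch; it is not the actual jsc.cpp code. It assumes the scan for the first non-ASCII byte is written as `*pos > 0`. The helper `prefixLengthWhilePositive` and its template parameter are hypothetical, used only so that both signedness choices can be compared in a single build.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from jsc.cpp): counts the leading elements for
// which "*pos > 0" holds. CharT stands in for whatever plain char happens
// to be on the target.
template <typename CharT>
std::size_t prefixLengthWhilePositive(const CharT* s)
{
    const CharT* pos = s;
    while (*pos > 0)
        ++pos;
    return static_cast<std::size_t>(pos - s);
}

// Input in both cases: "a", the two UTF-8 bytes of U+00E9 (0xC3 0xA9),
// "b", then NUL.
inline std::size_t prefixWithSignedChar()
{
    const signed char bytes[] = { 97, -61, -87, 98, 0 };
    return prefixLengthWhilePositive(bytes);
}

inline std::size_t prefixWithUnsignedChar()
{
    const unsigned char bytes[] = { 97, 0xC3, 0xA9, 98, 0 };
    return prefixLengthWhilePositive(bytes);
}

inline void checkScanBehaviour()
{
    // Signed: the scan stops at the first byte >= 0x80 (it reads as negative).
    assert(prefixWithSignedChar() == 1);
    // Unsigned: every non-NUL byte is > 0, so the scan runs to the terminator.
    assert(prefixWithUnsignedChar() == 4);
}
```

With an unsigned char the scan never stops at a non-ASCII byte, so a string containing multi-byte UTF-8 sequences would be classified as pure ASCII.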
Comment 1 Dániel Bátyai 2014-07-29 04:39:46 PDT
Created attachment 235676

Force GCC to use signed char for char
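
Whether plain char is signed for a given toolchain and target can be checked directly. The following is a minimal sketch using only the standard library; the function name `plainCharIsSigned` is hypothetical. GCC's `-fsigned-char` and `-funsigned-char` options change the result.

```cpp
#include <cassert>
#include <climits>
#include <limits>
#include <type_traits>

// Hypothetical helper: true when plain char behaves like signed char in the
// current build configuration.
constexpr bool plainCharIsSigned()
{
    return std::numeric_limits<char>::is_signed;
}

// CHAR_MIN is 0 exactly when plain char is unsigned, so the two must agree.
static_assert(plainCharIsSigned() == (CHAR_MIN < 0),
              "numeric_limits and CHAR_MIN disagree about char signedness");
```

A check like this only reports the configuration; code that must behave identically everywhere still has to avoid relying on the answer.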
Comment 2 Mark Lam 2014-08-05 10:54:16 PDT
Comment on attachment 235676

I think the better fix is to change stringFromUTF() in jsc.cpp to explicitly use a signed char, since it depends on signed behavior for correctness.  This ensures that the code is correct independent of build configuration.  Are there other places where you’ve found the signedness of char to be an issue?
Comment 3 Darin Adler 2014-08-05 16:33:14 PDT
Comment on attachment 235676

I think this change is OK, but I also think we should fix the code to not depend on char being signed. I don’t think we need to change the type to “signed char” — we can and should just remove the dependency.
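
One way to remove the dependency, sketched under the assumption that the code only needs to find the first non-ASCII byte, is to convert each byte to unsigned char before comparing. The names `isASCIIByte` and `asciiPrefixLength` are hypothetical and are not WebKit API.

```cpp
#include <cassert>
#include <cstddef>

// True for bytes 0x00..0x7F regardless of whether plain char is signed.
inline bool isASCIIByte(char c)
{
    return static_cast<unsigned char>(c) < 0x80;
}

// Length of the leading run of non-NUL ASCII bytes in a NUL-terminated string.
inline std::size_t asciiPrefixLength(const char* s)
{
    const char* pos = s;
    while (*pos && isASCIIByte(*pos))
        ++pos;
    return static_cast<std::size_t>(pos - s);
}
```

Because the comparison is done on an unsigned value, the result is the same whether the compiler treats plain char as signed or unsigned.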
Comment 4 Darin Adler 2014-08-05 16:36:11 PDT
Let me go further. We should remove the silly optimization in stringFromUTF. If we need a fast case for ASCII, that should be inside the fromUTF8WithLatin1Fallback function, not in the JSC tool. Please submit a patch that deletes the misguided “fast case” code from jsc.cpp.
Comment 5 Dániel Bátyai 2014-08-06 07:23:12 PDT
Created attachment 236100

Removed fast case, as suggested.
Comment 6 WebKit Commit Bot 2014-08-06 10:27:57 PDT
Comment on attachment 236100

Clearing flags on attachment: 236100

Committed r172152: <http://trac.webkit.org/changeset/172152>
Comment 7 WebKit Commit Bot 2014-08-06 10:28:01 PDT
All reviewed patches have been landed.  Closing bug.