WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
187947
JavaScript string corruption using RegExp with unicode character
https://bugs.webkit.org/show_bug.cgi?id=187947
Summary
JavaScript string corruption using RegExp with unicode character
Luca
Reported
2018-07-24 01:19:14 PDT
When a 2-bytes unicode character is removed from a JavaScript string, a combination of using the string as an object key, and string concatenation can corrupt the string with null bytes. Steps to Reproduce: Running this snippet on Safari, verified at least on Version 11.1.1 (13605.2.8), reproduces the bug (see notes below):
https://gist.github.com/lucaong/a7d7a2eee869e2c7afe8b59fc0dfda2b
Here is a JSFiddle of the same snippet (read comment):
https://jsfiddle.net/DukeLeNoir/mkrfw4g8/
Expected Results vs Actual Result: The script would be expected to find both strings as keys in the object, and therefore alert "All fine" twice. That's indeed what happens on other browsers. On Safari, the script instead alerts twice, the second time from a code path that should not occur. If the alert text is copied and pasted into a file and inspected with a hex dump tool (like xdd), the string is found to be corrupted with unexpected null bytes, explaining why it was not found as a key in the object. Calling any String.prototype method on the corrupted string before using it to index the object "fixes" it (even without using the method return value). Other operations that "fix" the corrupted string are using it in a comparison, logging it with console.log, or accessing one of its characters with []. This affects Safari Version 11.1.1 (13605.2.8) and Safari for iOS, but other versions are likely affected, as shown in this bug report for a popular JavaScript library, caused by the same root issue:
https://github.com/olivernn/lunr.js/issues/279
The provided script seems to be the minimum needed to reproduce the bug: using shorter strings, skipping the character enumeration, or any other operation done in this short script seems to cause the bug to disappear.
Attachments
Patch
(9.13 KB, patch)
2020-02-06 08:09 PST
,
Alexey Shvayka
ysuzuki
: review+
ysuzuki
: commit-queue+
Details
Formatted Diff
Diff
Patch
(9.15 KB, patch)
2020-02-06 12:40 PST
,
Alexey Shvayka
no flags
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Radar WebKit Bug Importer
Comment 1
2018-07-24 15:22:50 PDT
<
rdar://problem/42558747
>
Sukolsak Sakshuwong
Comment 2
2018-07-26 18:14:24 PDT
The code from the snippet above causes an assertion failure in Debug mode: ASSERTION FAILED: is8Bit() /webkit/WebKitBuild/Debug/usr/local/include/wtf/text/StringImpl.h(271) : const LChar *WTF::StringImpl::characters8() const 1 0x10c34a419 WTFCrash 2 0x10c34dd19 WTF::StringImpl::characters8() const 3 0x10d6976f4 JSC::JSRopeString::resolveRopeInternal8NoSubstring(unsigned char*) const 4 0x10d69758a JSC::JSRopeString::resolveRopeInternal8(unsigned char*) const 5 0x10d698c82 JSC::JSRopeString::resolveRopeToExistingAtomicString(JSC::ExecState*) const 6 0x10cc9436c JSC::JSString::toExistingAtomicString(JSC::ExecState*) const 7 0x10d36f12f JSC::LLInt::getByVal(JSC::VM&, JSC::ExecState*, JSC::Instruction*, JSC::JSValue, JSC::JSValue) 8 0x10d36eeb5 llint_slow_path_get_by_val 9 0x10c439862 llint_entry 10 0x10c4355b2 vmEntryToJavaScript 11 0x10d28b50a JSC::JITCode::execute(JSC::VM*, JSC::ProtoCallFrame*) 12 0x10d28aab1 JSC::Interpreter::executeProgram(JSC::SourceCode const&, JSC::ExecState*, JSC::JSObject*) 13 0x10d543ad7 JSC::evaluate(JSC::ExecState*, JSC::SourceCode const&, JSC::JSValue, WTF::NakedPtr<JSC::Exception>&) 14 0x10c285010 runWithOptions(GlobalObject*, CommandLine&, bool&) 15 0x10c25c9cc jscmain(int, char**)::$_3::operator()(JSC::VM&, GlobalObject*, bool&) const 16 0x10c244174 int runJSC<jscmain(int, char**)::$_3>(CommandLine, bool, jscmain(int, char**)::$_3 const&) 17 0x10c242c5f jscmain(int, char**) 18 0x10c242bbe main 19 0x7fff6dff4015 start Segmentation fault: 11 Here is a smaller script that reproduces the same bug: var ab16bit = 'abcĀ'.replace(/c.*/, ''); var map = {}; map[ab16bit]; var ropeAB = 'a' + 'b'; var ropeABC = ropeAB + 'c'; map[ropeAB]; map[ropeABC] = 42; console.log(JSON.stringify(map)); // Expected: {"abc":42}. Actual: {"a\u0000c":42}. "map[ab16bit]" creates a 16-bit AtomicString "ab". ropeAB and ropeABC are initially 8-bit JSRopeStrings. "map[ropeAB]" causes ropeAB to resolve into the 16-bit AtomicString "ab". Because of
https://webkit.org/b/133574
, ropeAB becomes a 16-bit string. However, ropeABC, which points to ropeAB, is still an 8-bit JSRopeString. So, when it is forced to resolve, it copies only the first two bytes of ropeAB, which is "a\0". Thus it returns "a\0c". One way to fix this seems to be that, when an 8-bit JSRopeString becomes 16-bit, it should set all its ancestor JSRopeStrings to be 16-bit. But I'm not sure how.
Yusuke Suzuki
Comment 3
2018-07-26 18:22:58 PDT
(In reply to Sukolsak Sakshuwong from
comment #2
)
> The code from the snippet above causes an assertion failure in Debug mode: > > ASSERTION FAILED: is8Bit() > /webkit/WebKitBuild/Debug/usr/local/include/wtf/text/StringImpl.h(271) : > const LChar *WTF::StringImpl::characters8() const > 1 0x10c34a419 WTFCrash > 2 0x10c34dd19 WTF::StringImpl::characters8() const > 3 0x10d6976f4 JSC::JSRopeString::resolveRopeInternal8NoSubstring(unsigned > char*) const > 4 0x10d69758a JSC::JSRopeString::resolveRopeInternal8(unsigned char*) const > 5 0x10d698c82 > JSC::JSRopeString::resolveRopeToExistingAtomicString(JSC::ExecState*) const > 6 0x10cc9436c JSC::JSString::toExistingAtomicString(JSC::ExecState*) const > 7 0x10d36f12f JSC::LLInt::getByVal(JSC::VM&, JSC::ExecState*, > JSC::Instruction*, JSC::JSValue, JSC::JSValue) > 8 0x10d36eeb5 llint_slow_path_get_by_val > 9 0x10c439862 llint_entry > 10 0x10c4355b2 vmEntryToJavaScript > 11 0x10d28b50a JSC::JITCode::execute(JSC::VM*, JSC::ProtoCallFrame*) > 12 0x10d28aab1 JSC::Interpreter::executeProgram(JSC::SourceCode const&, > JSC::ExecState*, JSC::JSObject*) > 13 0x10d543ad7 JSC::evaluate(JSC::ExecState*, JSC::SourceCode const&, > JSC::JSValue, WTF::NakedPtr<JSC::Exception>&) > 14 0x10c285010 runWithOptions(GlobalObject*, CommandLine&, bool&) > 15 0x10c25c9cc jscmain(int, char**)::$_3::operator()(JSC::VM&, > GlobalObject*, bool&) const > 16 0x10c244174 int runJSC<jscmain(int, char**)::$_3>(CommandLine, bool, > jscmain(int, char**)::$_3 const&) > 17 0x10c242c5f jscmain(int, char**) > 18 0x10c242bbe main > 19 0x7fff6dff4015 start > Segmentation fault: 11 > > Here is a smaller script that reproduces the same bug: > > var ab16bit = 'abcĀ'.replace(/c.*/, ''); > > var map = {}; > map[ab16bit]; > > var ropeAB = 'a' + 'b'; > var ropeABC = ropeAB + 'c'; > > map[ropeAB]; > map[ropeABC] = 42; > console.log(JSON.stringify(map)); // Expected: {"abc":42}. Actual: > {"a\u0000c":42}. > > "map[ab16bit]" creates a 16-bit AtomicString "ab". ropeAB and ropeABC are > initially 8-bit JSRopeStrings. "map[ropeAB]" causes ropeAB to resolve into > the 16-bit AtomicString "ab". Because of
https://webkit.org/b/133574
, ropeAB > becomes a 16-bit string. However, ropeABC, which points to ropeAB, is still > an 8-bit JSRopeString. So, when it is forced to resolve, it copies only the > first two bytes of ropeAB, which is "a\0". Thus it returns "a\0c". > > One way to fix this seems to be that, when an 8-bit JSRopeString becomes > 16-bit, it should set all its ancestor JSRopeStrings to be 16-bit. But I'm > not sure how.
I think the correct fix should be making AtomicString 8-bit if it can be.
Alexey Shvayka
Comment 4
2020-02-06 08:09:24 PST
Created
attachment 389958
[details]
Patch
Yusuke Suzuki
Comment 5
2020-02-06 12:32:49 PST
Comment on
attachment 389958
[details]
Patch rs=me. I think this is fixed in
https://trac.webkit.org/changeset/253648/webkit
Alexey Shvayka
Comment 6
2020-02-06 12:40:04 PST
Created
attachment 389982
[details]
Patch Set reviewer and link
https://bugs.webkit.org/show_bug.cgi?id=205323
in ChangeLog.
WebKit Commit Bot
Comment 7
2020-02-06 13:36:20 PST
The commit-queue encountered the following flaky tests while processing
attachment 389982
[details]
: editing/spelling/spellcheck-async-remove-frame.html
bug 158401
(authors:
morrita@google.com
,
rniwa@webkit.org
, and
tony@chromium.org
) The commit-queue is continuing to process your patch.
WebKit Commit Bot
Comment 8
2020-02-06 13:36:54 PST
Comment on
attachment 389982
[details]
Patch Clearing flags on attachment: 389982 Committed
r255975
: <
https://trac.webkit.org/changeset/255975
>
WebKit Commit Bot
Comment 9
2020-02-06 13:36:55 PST
All reviewed patches have been landed. Closing bug.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug