When a 2-byte Unicode character is removed from a JavaScript string, a combination of using the string as an object key and string concatenation can corrupt the string with null bytes.

Steps to Reproduce:

Running this snippet on Safari, verified at least on Version 11.1.1 (13605.2.8), reproduces the bug (see notes below): https://gist.github.com/lucaong/a7d7a2eee869e2c7afe8b59fc0dfda2b

Here is a JSFiddle of the same snippet (read the comment): https://jsfiddle.net/DukeLeNoir/mkrfw4g8/

Expected Results vs. Actual Results:

The script is expected to find both strings as keys in the object, and therefore alert "All fine" twice. That is indeed what happens on other browsers. On Safari, the script still alerts twice, but the second alert comes from a code path that should not be reached. If the alert text is copied into a file and inspected with a hex dump tool (like xxd), the string turns out to be corrupted with unexpected null bytes, which explains why it was not found as a key in the object.

Calling any String.prototype method on the corrupted string before using it to index the object "fixes" it (even without using the method's return value). Other operations that "fix" the corrupted string are using it in a comparison, logging it with console.log, or accessing one of its characters with [].

This affects Safari Version 11.1.1 (13605.2.8) and Safari for iOS, but other versions are likely affected too, as shown in this bug report for a popular JavaScript library, caused by the same root issue: https://github.com/olivernn/lunr.js/issues/279

The provided script seems to be the minimum needed to reproduce the bug: using shorter strings, skipping the character enumeration, or omitting any other operation in this short script seems to make the bug disappear.
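The full snippet lives in the linked gist; the general shape of the failing pattern can be sketched as follows. This is a simplified reconstruction based on the reduced test case later in this report, not the original snippet, and the variable names are illustrative. On an unaffected engine the final lookup behaves as expected:

```javascript
// Simplified reconstruction of the failing pattern (hypothetical names).
// 1. Removing a 2-byte character leaves "ab" with 16-bit backing storage.
var key16 = 'abĀ'.replace(/Ā/, '');

// 2. Using it as an object key interns ("atomizes") the 16-bit "ab".
var obj = {};
obj[key16] = 'found';

// 3. Rebuild the same text via concatenation, producing rope strings.
var ropeAB  = 'a' + 'b';      // 8-bit rope
var ropeABC = ropeAB + 'c';   // 8-bit rope that points at ropeAB
obj[ropeAB];                  // resolves ropeAB against the 16-bit atom
obj[ropeABC] = 'also found';  // on affected Safari, the key is corrupted

// On a correct engine the keys come out clean:
console.log(Object.keys(obj)); // expected: ["ab", "abc"]
```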
<rdar://problem/42558747>
The code from the snippet above causes an assertion failure in Debug mode:

ASSERTION FAILED: is8Bit()
/webkit/WebKitBuild/Debug/usr/local/include/wtf/text/StringImpl.h(271) : const LChar *WTF::StringImpl::characters8() const
1   0x10c34a419 WTFCrash
2   0x10c34dd19 WTF::StringImpl::characters8() const
3   0x10d6976f4 JSC::JSRopeString::resolveRopeInternal8NoSubstring(unsigned char*) const
4   0x10d69758a JSC::JSRopeString::resolveRopeInternal8(unsigned char*) const
5   0x10d698c82 JSC::JSRopeString::resolveRopeToExistingAtomicString(JSC::ExecState*) const
6   0x10cc9436c JSC::JSString::toExistingAtomicString(JSC::ExecState*) const
7   0x10d36f12f JSC::LLInt::getByVal(JSC::VM&, JSC::ExecState*, JSC::Instruction*, JSC::JSValue, JSC::JSValue)
8   0x10d36eeb5 llint_slow_path_get_by_val
9   0x10c439862 llint_entry
10  0x10c4355b2 vmEntryToJavaScript
11  0x10d28b50a JSC::JITCode::execute(JSC::VM*, JSC::ProtoCallFrame*)
12  0x10d28aab1 JSC::Interpreter::executeProgram(JSC::SourceCode const&, JSC::ExecState*, JSC::JSObject*)
13  0x10d543ad7 JSC::evaluate(JSC::ExecState*, JSC::SourceCode const&, JSC::JSValue, WTF::NakedPtr<JSC::Exception>&)
14  0x10c285010 runWithOptions(GlobalObject*, CommandLine&, bool&)
15  0x10c25c9cc jscmain(int, char**)::$_3::operator()(JSC::VM&, GlobalObject*, bool&) const
16  0x10c244174 int runJSC<jscmain(int, char**)::$_3>(CommandLine, bool, jscmain(int, char**)::$_3 const&)
17  0x10c242c5f jscmain(int, char**)
18  0x10c242bbe main
19  0x7fff6dff4015 start
Segmentation fault: 11

Here is a smaller script that reproduces the same bug:

var ab16bit = 'abcĀ'.replace(/c.*/, '');

var map = {};
map[ab16bit];

var ropeAB = 'a' + 'b';
var ropeABC = ropeAB + 'c';

map[ropeAB];
map[ropeABC] = 42;
console.log(JSON.stringify(map)); // Expected: {"abc":42}. Actual: {"a\u0000c":42}.

"map[ab16bit]" creates a 16-bit AtomicString "ab". ropeAB and ropeABC are initially 8-bit JSRopeStrings. "map[ropeAB]" causes ropeAB to resolve into the 16-bit AtomicString "ab".
Because of https://webkit.org/b/133574, ropeAB becomes a 16-bit string. However, ropeABC, which points to ropeAB, is still an 8-bit JSRopeString. So, when it is forced to resolve, it copies only the first two bytes of ropeAB, which read back as "a\0", and thus returns "a\0c". One way to fix this seems to be that, when an 8-bit JSRopeString becomes 16-bit, it should also mark all its ancestor JSRopeStrings as 16-bit. But I'm not sure how.
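The mechanism can be simulated in plain JavaScript with typed arrays. This is an illustration of the failure mode only, not WebKit code: copying an 8-bit-sized span out of a 16-bit backing buffer yields exactly the observed null byte.

```javascript
// Illustrative simulation, not WebKit code: "ab" stored as 16-bit
// (UTF-16LE) occupies 4 bytes: 61 00 62 00.
const backing16 = new Uint8Array([0x61, 0x00, 0x62, 0x00]);

// An 8-bit rope resolver believes "ab" is 2 bytes long, so it copies
// only the first 2 bytes of the buffer: 61 00.
const copied = backing16.slice(0, 2);

// Interpreted as Latin-1, those bytes read back as "a" followed by NUL.
const asLatin1 = String.fromCharCode(...copied);  // "a\u0000"
const corrupted = asLatin1 + 'c';
console.log(JSON.stringify(corrupted));           // "a\u0000c"
```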
(In reply to Sukolsak Sakshuwong from comment #2)
> [quoting the assertion failure, reduced test case, and analysis from comment #2]

I think the correct fix is to make the AtomicString 8-bit when it can be.
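The suggested direction, deciding whether a string's contents still fit in 8-bit storage, can be sketched in JavaScript terms. The helper below is hypothetical, not the actual WebKit implementation; it just expresses the eligibility check: a string can use 8-bit (Latin-1) storage iff every UTF-16 code unit is at most 0xFF.

```javascript
// Hypothetical helper, not WebKit code: returns true when every UTF-16
// code unit of the string fits in one byte (Latin-1 / 8-bit storage).
function canUse8Bit(str) {
  for (let i = 0; i < str.length; i++) {
    if (str.charCodeAt(i) > 0xFF) return false;
  }
  return true;
}

// 'abĀ'.replace(/Ā/, '') leaves "ab", which no longer needs 16-bit storage:
console.log(canUse8Bit('abĀ'.replace(/Ā/, '')));  // true
console.log(canUse8Bit('abĀ'));                    // false ('Ā' is U+0100)
```

Under this idea, the AtomicString created for the post-replace "ab" would be stored 8-bit, so the 8-bit ropes in the test case would resolve against a same-width atom.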
Created attachment 389958 [details] Patch
Comment on attachment 389958 [details]
Patch

rs=me. I think this is fixed in https://trac.webkit.org/changeset/253648/webkit
Created attachment 389982 [details]
Patch

Set reviewer and link https://bugs.webkit.org/show_bug.cgi?id=205323 in the ChangeLog.
The commit-queue encountered the following flaky tests while processing attachment 389982 [details]: editing/spelling/spellcheck-async-remove-frame.html bug 158401 (authors: morrita@google.com, rniwa@webkit.org, and tony@chromium.org) The commit-queue is continuing to process your patch.
Comment on attachment 389982 [details]
Patch

Clearing flags on attachment: 389982

Committed r255975: <https://trac.webkit.org/changeset/255975>
All reviewed patches have been landed. Closing bug.