Summary: | WebKit has too much of its own UTF-8 code and should rely more on ICU's UTF-8 support | ||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Darin Adler <darin> | ||||||||||||||||||||||
Component: | Web Template Framework | Assignee: | Darin Adler <darin> | ||||||||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||||||||
Severity: | Normal | CC: | achristensen, ap, Basuke.Suzuki, benjamin, cdumez, cmarcelo, dbates, ews-watchlist, guijemont, guijemont+jsc-armv7-ews, keith_miller, mark.lam, mitz, mmaxfield, msaboff, rniwa, ross.kirsling, ryanhaddad, saam, sroberts, webkit-bug-importer, ysuzuki | ||||||||||||||||||||||
Priority: | P2 | Keywords: | InRadar | ||||||||||||||||||||||
Version: | WebKit Nightly Build | ||||||||||||||||||||||||
Hardware: | All | ||||||||||||||||||||||||
OS: | All | ||||||||||||||||||||||||
Attachments: |
|
Description
Darin Adler
2019-03-10 14:29:56 PDT
Created attachment 364198 [details]
Patch
Created attachment 364200 [details]
Patch
Comment on attachment 364200 [details] Patch Attachment 364200 [details] did not pass mac-ews (mac): Output: https://webkit-queues.webkit.org/results/11449715 Number of test failures exceeded the failure limit. Created attachment 364203 [details]
Archive of layout-test-results from ews100 for mac-highsierra
The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews100 Port: mac-highsierra Platform: Mac OS X 10.13.6
Comment on attachment 364200 [details] Patch Attachment 364200 [details] did not pass mac-wk2-ews (mac-wk2): Output: https://webkit-queues.webkit.org/results/11449717 Number of test failures exceeded the failure limit. Created attachment 364204 [details]
Archive of layout-test-results from ews106 for mac-highsierra-wk2
The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews106 Port: mac-highsierra-wk2 Platform: Mac OS X 10.13.6
Comment on attachment 364200 [details] Patch Attachment 364200 [details] did not pass mac-debug-ews (mac): Output: https://webkit-queues.webkit.org/results/11449689 Number of test failures exceeded the failure limit. Created attachment 364208 [details]
Archive of layout-test-results from ews115 for mac-highsierra
The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews115 Port: mac-highsierra Platform: Mac OS X 10.13.6
Trying to figure out what kind of failure/crash this is. Comment on attachment 364200 [details] Patch Attachment 364200 [details] did not pass jsc-ews (mac): Output: https://webkit-queues.webkit.org/results/11450006 New failing tests: jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout stress/encode-decode-uri-surrogates.js.ftl-no-cjit-small-pool stress/encode-decode-uri-component-surrogates.js.default stress/encode-decode-unicode.js.no-cjit-validate-phases stress/encode-decode-uri-surrogates.js.ftl-eager stress/encode-decode-unicode.js.ftl-eager jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-dfg-eager-no-cjit stress/encode-decode-ascii.js.no-llint stress/encode-decode-ascii.js.dfg-eager-no-cjit-validate stress/encode-decode-unicode.js.dfg-eager-no-cjit-validate stress/encode-decode-uri-surrogates.js.no-llint stress/encode-decode-unicode.js.ftl-no-cjit-no-put-stack-validate stress/encode-decode-uri-component-surrogates.js.ftl-no-cjit-validate-sampling-profiler stress/encode-decode-uri-component-surrogates.js.ftl-no-cjit-b3o0 stress/encode-decode-unicode.js.no-llint stress/encode-decode-ascii.js.default stress/encode-decode-ascii.js.dfg-maximal-flush-validate-no-cjit stress/encode-decode-ascii.js.ftl-no-cjit-no-inline-validate stress/encode-decode-ascii.js.bytecode-cache stress/encode-decode-uri-component-surrogates.js.ftl-no-cjit-small-pool stress/encode-decode-unicode.js.dfg-eager stress/encode-decode-unicode.js.dfg-maximal-flush-validate-no-cjit stress/encode-decode-unicode.js.ftl-eager-no-cjit stress/encode-decode-unicode.js.ftl-no-cjit-no-inline-validate stress/encode-decode-uri-component-surrogates.js.no-llint stress/encode-decode-uri-surrogates.js.bytecode-cache stress/encode-decode-uri-component-surrogates.js.ftl-eager stress/encode-decode-uri-surrogates.js.ftl-no-cjit-validate-sampling-profiler stress/encode-decode-uri-surrogates.js.no-cjit-validate-phases stress/encode-decode-uri-surrogates.js.dfg-maximal-flush-validate-no-cjit stress/encode-decode-ascii.js.ftl-no-cjit-small-pool jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-no-ftl stress/encode-decode-uri-component-surrogates.js.ftl-no-cjit-no-put-stack-validate stress/encode-decode-unicode.js.no-ftl stress/encode-decode-uri-surrogates.js.no-ftl stress/encode-decode-ascii.js.ftl-no-cjit-no-put-stack-validate stress/encode-decode-uri-surrogates.js.dfg-eager-no-cjit-validate stress/encode-decode-uri-surrogates.js.ftl-eager-no-cjit-b3o1 stress/encode-decode-ascii.js.no-cjit-validate-phases stress/encode-decode-uri-component-surrogates.js.dfg-eager stress/encode-decode-uri-surrogates.js.no-cjit-collect-continuously stress/encode-decode-ascii.js.ftl-eager-no-cjit-b3o1 stress/encode-decode-uri-component-surrogates.js.ftl-no-cjit-no-inline-validate stress/encode-decode-uri-component-surrogates.js.ftl-eager-no-cjit-b3o1 stress/encode-decode-ascii.js.ftl-eager-no-cjit stress/encode-decode-uri-component-surrogates.js.bytecode-cache stress/encode-decode-uri-component-surrogates.js.no-cjit-collect-continuously stress/encode-decode-unicode.js.no-cjit-collect-continuously stress/encode-decode-uri-component-surrogates.js.dfg-eager-no-cjit-validate stress/encode-decode-uri-surrogates.js.default jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-ftl-no-cjit jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-no-cjit stress/encode-decode-ascii.js.ftl-eager stress/encode-decode-ascii.js.no-ftl stress/encode-decode-unicode.js.ftl-no-cjit-small-pool stress/encode-decode-uri-component-surrogates.js.ftl-eager-no-cjit stress/encode-decode-uri-surrogates.js.ftl-eager-no-cjit stress/encode-decode-uri-surrogates.js.ftl-no-cjit-no-put-stack-validate stress/encode-decode-ascii.js.dfg-eager stress/encode-decode-unicode.js.default stress/encode-decode-ascii.js.ftl-no-cjit-b3o0 stress/encode-decode-uri-component-surrogates.js.no-cjit-validate-phases stress/encode-decode-unicode.js.ftl-no-cjit-validate-sampling-profiler stress/encode-decode-uri-component-surrogates.js.dfg-maximal-flush-validate-no-cjit stress/encode-decode-uri-surrogates.js.ftl-no-cjit-b3o0 stress/encode-decode-unicode.js.bytecode-cache stress/encode-decode-uri-surrogates.js.dfg-eager stress/encode-decode-uri-surrogates.js.ftl-no-cjit-no-inline-validate ChakraCore.yaml/ChakraCore/test/GlobalFunctions/GlobalFunctions.js.default stress/encode-decode-uri-component-surrogates.js.no-ftl stress/encode-decode-unicode.js.ftl-no-cjit-b3o0 stress/encode-decode-unicode.js.ftl-eager-no-cjit-b3o1 jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-no-llint stress/encode-decode-ascii.js.no-cjit-collect-continuously stress/encode-decode-ascii.js.ftl-no-cjit-validate-sampling-profiler jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-ftl-eager-no-cjit Created attachment 367915 [details]
Patch
I figured it out: previous patch broke AtomicString::fromUTF8 when passed a null-terminated string, and that’s fixed now. Comment on attachment 367915 [details] Patch Attachment 367915 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: https://webkit-queues.webkit.org/results/11946636 New failing tests: imported/w3c/web-platform-tests/html/syntax/parsing/html5lib_html5test-com.html imported/w3c/web-platform-tests/html/syntax/parsing/html5lib_entities02.html js/dom/encode-URI-test.html imported/w3c/web-platform-tests/html/syntax/parsing/html5lib_tests24.html imported/w3c/web-platform-tests/html/syntax/parsing/html5lib_domjs-unsafe.html imported/w3c/web-platform-tests/html/syntax/parsing/html5lib_plain-text-unsafe.html imported/w3c/web-platform-tests/encoding/textdecoder-utf16-surrogates.html imported/w3c/web-platform-tests/html/syntax/parsing/html5lib_entities01.html js/dom/webidl-type-mapping.html css3/escape-dom-api.html fast/text/dangling-surrogates.html fast/xsl/utf8-chunks.xml js/kde/encode_decode_uri.html Created attachment 367917 [details]
Archive of layout-test-results from ews123 for ios-simulator-wk2
The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews123 Port: ios-simulator-wk2 Platform: Mac OS X 10.13.6
Comment on attachment 367915 [details] Patch Attachment 367915 [details] did not pass jsc-armv7-ews (jsc-only): Output: https://webkit-queues.webkit.org/results/11946624 New failing tests: jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout stress/encode-decode-uri-component-surrogates.js.default stress/encode-decode-unicode.js.no-cjit-validate-phases jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-dfg-eager-no-cjit stress/encode-decode-unicode.js.dfg-eager-no-cjit-validate stress/encode-decode-uri-surrogates.js.no-llint stress/encode-decode-unicode.js.no-llint stress/encode-decode-unicode.js.dfg-eager stress/encode-decode-unicode.js.dfg-maximal-flush-validate-no-cjit stress/encode-decode-uri-component-surrogates.js.no-llint stress/encode-decode-uri-surrogates.js.no-cjit-validate-phases stress/encode-decode-uri-surrogates.js.dfg-maximal-flush-validate-no-cjit stress/encode-decode-uri-component-surrogates.js.no-cjit-collect-continuously stress/encode-decode-uri-surrogates.js.dfg-eager-no-cjit-validate stress/encode-decode-uri-component-surrogates.js.dfg-eager stress/encode-decode-uri-surrogates.js.no-cjit-collect-continuously stress/encode-decode-unicode.js.no-cjit-collect-continuously stress/encode-decode-uri-component-surrogates.js.dfg-eager-no-cjit-validate stress/encode-decode-uri-surrogates.js.default jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-no-cjit stress/encode-decode-unicode.js.default stress/encode-decode-uri-component-surrogates.js.no-cjit-validate-phases stress/encode-decode-uri-component-surrogates.js.dfg-maximal-flush-validate-no-cjit stress/encode-decode-uri-surrogates.js.dfg-eager jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-no-llint apiTests Comment on attachment 367915 [details] Patch Attachment 367915 [details] did not pass jsc-ews (mac): Output: https://webkit-queues.webkit.org/results/11946627 New failing tests: jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout stress/encode-decode-uri-surrogates.js.ftl-no-cjit-small-pool stress/encode-decode-uri-component-surrogates.js.default stress/encode-decode-unicode.js.no-cjit-validate-phases stress/encode-decode-uri-surrogates.js.ftl-eager stress/encode-decode-unicode.js.ftl-eager jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-dfg-eager-no-cjit stress/encode-decode-ascii.js.no-llint stress/encode-decode-ascii.js.dfg-eager-no-cjit-validate stress/encode-decode-unicode.js.dfg-eager-no-cjit-validate stress/encode-decode-uri-surrogates.js.no-llint stress/encode-decode-unicode.js.ftl-no-cjit-no-put-stack-validate stress/encode-decode-uri-component-surrogates.js.ftl-no-cjit-validate-sampling-profiler stress/encode-decode-uri-component-surrogates.js.ftl-no-cjit-b3o0 stress/encode-decode-unicode.js.no-llint stress/encode-decode-ascii.js.default stress/encode-decode-ascii.js.dfg-maximal-flush-validate-no-cjit stress/encode-decode-ascii.js.ftl-no-cjit-no-inline-validate stress/encode-decode-ascii.js.bytecode-cache stress/encode-decode-uri-component-surrogates.js.ftl-no-cjit-small-pool stress/encode-decode-unicode.js.dfg-eager stress/encode-decode-unicode.js.dfg-maximal-flush-validate-no-cjit stress/encode-decode-unicode.js.ftl-eager-no-cjit stress/encode-decode-unicode.js.ftl-no-cjit-no-inline-validate stress/encode-decode-uri-component-surrogates.js.no-llint stress/encode-decode-uri-surrogates.js.bytecode-cache stress/encode-decode-uri-component-surrogates.js.ftl-eager stress/encode-decode-uri-surrogates.js.ftl-no-cjit-validate-sampling-profiler stress/encode-decode-uri-surrogates.js.no-cjit-validate-phases stress/encode-decode-uri-surrogates.js.dfg-maximal-flush-validate-no-cjit stress/encode-decode-ascii.js.ftl-no-cjit-small-pool jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-no-ftl stress/encode-decode-uri-component-surrogates.js.ftl-no-cjit-no-put-stack-validate stress/encode-decode-unicode.js.no-ftl stress/encode-decode-uri-surrogates.js.no-ftl stress/encode-decode-ascii.js.ftl-no-cjit-no-put-stack-validate stress/encode-decode-uri-surrogates.js.dfg-eager-no-cjit-validate stress/encode-decode-uri-surrogates.js.ftl-eager-no-cjit-b3o1 stress/encode-decode-ascii.js.no-cjit-validate-phases stress/encode-decode-uri-component-surrogates.js.dfg-eager stress/encode-decode-uri-surrogates.js.no-cjit-collect-continuously stress/encode-decode-ascii.js.ftl-eager-no-cjit-b3o1 stress/encode-decode-uri-component-surrogates.js.ftl-no-cjit-no-inline-validate stress/encode-decode-uri-component-surrogates.js.ftl-eager-no-cjit-b3o1 stress/encode-decode-ascii.js.ftl-eager-no-cjit stress/encode-decode-uri-component-surrogates.js.bytecode-cache stress/encode-decode-uri-component-surrogates.js.no-cjit-collect-continuously stress/encode-decode-unicode.js.no-cjit-collect-continuously stress/encode-decode-uri-component-surrogates.js.dfg-eager-no-cjit-validate stress/encode-decode-uri-surrogates.js.default jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-ftl-no-cjit jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-no-cjit stress/encode-decode-ascii.js.ftl-eager stress/encode-decode-ascii.js.no-ftl stress/encode-decode-unicode.js.ftl-no-cjit-small-pool stress/encode-decode-uri-component-surrogates.js.ftl-eager-no-cjit stress/encode-decode-uri-surrogates.js.ftl-eager-no-cjit stress/encode-decode-uri-surrogates.js.ftl-no-cjit-no-put-stack-validate stress/encode-decode-ascii.js.dfg-eager stress/encode-decode-unicode.js.default stress/encode-decode-ascii.js.ftl-no-cjit-b3o0 stress/encode-decode-uri-component-surrogates.js.no-cjit-validate-phases stress/encode-decode-unicode.js.ftl-no-cjit-validate-sampling-profiler stress/encode-decode-uri-component-surrogates.js.dfg-maximal-flush-validate-no-cjit stress/encode-decode-uri-surrogates.js.ftl-no-cjit-b3o0 stress/encode-decode-unicode.js.bytecode-cache stress/encode-decode-uri-surrogates.js.dfg-eager stress/encode-decode-uri-surrogates.js.ftl-no-cjit-no-inline-validate ChakraCore.yaml/ChakraCore/test/GlobalFunctions/GlobalFunctions.js.default stress/encode-decode-uri-component-surrogates.js.no-ftl stress/encode-decode-unicode.js.ftl-no-cjit-b3o0 stress/encode-decode-unicode.js.ftl-eager-no-cjit-b3o1 jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-no-llint stress/encode-decode-ascii.js.no-cjit-collect-continuously stress/encode-decode-ascii.js.ftl-no-cjit-validate-sampling-profiler jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-ftl-eager-no-cjit Created attachment 367919 [details]
Patch
Comment on attachment 367919 [details] Patch Attachment 367919 [details] did not pass jsc-armv7-ews (jsc-only): Output: https://webkit-queues.webkit.org/results/11947369 New failing tests: jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-no-cjit jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-dfg-eager-no-cjit jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-no-llint apiTests Comment on attachment 367919 [details] Patch Attachment 367919 [details] did not pass jsc-ews (mac): Output: https://webkit-queues.webkit.org/results/11947376 New failing tests: jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-no-cjit jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-no-ftl jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-dfg-eager-no-cjit jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-ftl-no-cjit jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-ftl-eager-no-cjit jsc-layout-tests.yaml/js/script-tests/invalid-utf8-in-syntax-error.js.layout-no-llint Not sure how to run those tests, but I’ll look into why they are failing next time I have a chance to work on this. This may not be the optimal way to run just a subset of tests, but at the high level, the command is "run-javascriptcore-tests". (In reply to Alexey Proskuryakov from comment #21) > This may not be the optimal way to run just a subset of tests, but at the > high level, the command is "run-javascriptcore-tests". You can also run only the layout JSC tests by: $ Tools/Scripts/run-jsc-stress-tests (--release or --debug) LayoutTests/jsc-layout-tests.yaml You can run an individual test using: $ ./Tools/Scripts/run-jsc (--release or --debug) LayoutTests/resources/standalone-pre.js <test> LayoutTests/resources/standalone-post.js e.g: $ ./Tools/Scripts/run-jsc (--release or --debug) LayoutTests/resources/standalone-pre.js LayoutTests/js/script-tests/invalid-utf8-in-syntax-error.js LayoutTests/resources/standalone-post.js You can also pass `--filter="text"` option to filter out tests :) Created attachment 368432 [details]
Patch
Comments in ChangeLog explain why that one test was failing. Not a real problem, just a test that needed to be rebased since the test *output* now includes a U+FFFD rather than an unpaired UTF-16 surrogate. The test behavior is identical, and correct, in both cases. EWS points out that Windows build is broken: C:\cygwin\home\buildbot\WebKit\Source\WTF\wtf\unicode\UTF8Conversion.cpp(45): error C2220: warning treated as error - no 'object' file generated [C:\cygwin\home\buildbot\WebKit\WebKitBuild\Release\Source\WTF\wtf\WTF.vcxproj] C:\cygwin\home\buildbot\WebKit\Source\WTF\wtf\unicode\UTF8Conversion.cpp(45): warning C4333: '>>': right shift by too large amount, data loss [C:\cygwin\home\buildbot\WebKit\WebKitBuild\Release\Source\WTF\wtf\WTF.vcxproj] Yes, I have a patch locally on my machine at home that works around that bug in either the Windows compiler's warning logic or in the Windows version of the ICU headers. Planning to upload it for review. Didn’t get a chance to upload it last night. Created attachment 368529 [details]
Patch
All EWS tests passing now. Finally ready for someone to review. Comment on attachment 368529 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=368529&action=review Large patch, somewhat superficial review. Overall comment: I don't think that hiding the types in "auto result = convertUTF16ToUTF8" and "auto success = convertLatin1ToUTF8" is an improvement, especially as the types are different, and the plan is to change them further. This took me an outstanding amount of effort to follow, with all the subtle differences with regards to where "target exhausted" is a failure, and where it is a success. > Source/WTF/wtf/text/WTFString.cpp:862 > + if (!convertUTF8ToUTF16(stringCurrent, reinterpret_cast<const char *>(stringStart + length), &bufferCurrent, bufferCurrent + buffer.size())) Pre-existing issue, misplaced star. > Source/WTF/wtf/unicode/UTF8Conversion.cpp:2 > - * Copyright (C) 2007, 2014 Apple Inc. All rights reserved. > + * Copyright (C) 2007-2019 Apple Inc. All rights reserved. Have we actually made non-trivial changes every year? > Source/WebCore/platform/SharedBuffer.cpp:340 > - result = WTF::Unicode::convertLatin1ToUTF8(&d, d + length, &p, p + buffer.size()); > + if (!WTF::Unicode::convertLatin1ToUTF8(&d, d + length, &p, p + buffer.size())) Typically, we export symbols in WTF into the global namespace, and don't use the WTF prefix at call sites. Is there a good reason for WTF::Unicode to be special? > Source/WebKit/Shared/API/APIString.h:26 > +#pragma once Do we now fully rely on pragma once even in cross-platform API headers? > LayoutTests/ChangeLog:14 > + * css3/escape-dom-api-expected.txt: > + * fast/text/dangling-surrogates-expected.txt: > + * js/dom/webidl-type-mapping-expected.txt: > + * js/invalid-utf8-in-syntax-error-expected.txt: > + Updated expected results to have the Unicode replacement character in cases where the > + text contains unpaired surrogates. The tests are still doing the same operations, and > + still getting the same results, but the text output no longer includes illegal UTF-8. It would be useful to mention whether the new behavior matches other browser engines. > LayoutTests/js/invalid-utf8-in-syntax-error.html:1 > +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> This should use HTML5 doctype. How did you create the wrapper? I just verified that "make-new-script-test js/invalid-utf8-in-syntax-error.html" creates a modern wrapper, also without pre/post scripts. Comment on attachment 368529 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=368529&action=review Thank you for the review. > Source/WTF/wtf/text/StringImpl.cpp:1857 > + auto success = convertLatin1ToUTF8(&characters, characters + length, &buffer, buffer + bufferVector.size()); This should be bool success, not auto success! >> Source/WTF/wtf/text/WTFString.cpp:862 >> + if (!convertUTF8ToUTF16(stringCurrent, reinterpret_cast<const char *>(stringStart + length), &bufferCurrent, bufferCurrent + buffer.size())) > > Pre-existing issue, misplaced star. I’ll fix that. >> Source/WTF/wtf/unicode/UTF8Conversion.cpp:2 >> + * Copyright (C) 2007-2019 Apple Inc. All rights reserved. > > Have we actually made non-trivial changes every year? I’ll check and update it correctly. >> Source/WebCore/platform/SharedBuffer.cpp:340 >> + if (!WTF::Unicode::convertLatin1ToUTF8(&d, d + length, &p, p + buffer.size())) > > Typically, we export symbols in WTF into the global namespace, and don't use the WTF prefix at call sites. Is there a good reason for WTF::Unicode to be special? Good point. Worth reconsidering later. I was not changing anything here at this time, but I am willing to look into this. Long term, maybe we should stop doing the "export WTF symbols into the global namespace" entirely. It’s less important in an era where we don’t write out whole identifiers so often. But my choice to follow the pattern of the nearby code here was not intentionally to move us in any direction. >> Source/WebKit/Shared/API/APIString.h:26 >> +#pragma once > > Do we now fully rely on pragma once even in cross-platform API headers? As far as I know, this is not a cross-platform API header. This is a header for a class used in the back end of the C API, not itself part of an API. The API directory is confusingly named since many of the headers in it are to *implement* the API, not part of the API. The entire WebKit API namespace is not itself part of API. >> LayoutTests/ChangeLog:14 >> + still getting the same results, but the text output no longer includes illegal UTF-8. > > It would be useful to mention whether the new behavior matches other browser engines. As I understand it, this is a change in the behavior of our test harnesses, to the text dumping done by DumpRenderTree and WebKitTestRunner; no change to behavior of the browser engine. >> LayoutTests/js/invalid-utf8-in-syntax-error.html:1 >> +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> > > This should use HTML5 doctype. > > How did you create the wrapper? I just verified that "make-new-script-test js/invalid-utf8-in-syntax-error.html" creates a modern wrapper, also without pre/post scripts. I’ll use make-new-script-test to regenerate this file. I copied another wrapper from another file in this directory and edited the filename; did not remember there was a make-new-script-test script. Committed r244821: <https://trac.webkit.org/changeset/244821> Seeing 4 Test262 JSC failures on Release and Debug after this change. FAIL test/built-ins/RegExp/prototype/Symbol.match/builtin-infer-unicode.js (default) Full Output: Exception: Test262Error: Expected SameValue(«�», «null») to be true FAIL test/built-ins/RegExp/prototype/Symbol.match/builtin-infer-unicode.js (strict mode) Full Output: Exception: Test262Error: Expected SameValue(«�», «null») to be true FAIL test/built-ins/RegExp/prototype/exec/u-lastindex-adv.js (default) Full Output: Exception: Test262Error: Expected SameValue(«�», «null») to be true FAIL test/built-ins/RegExp/prototype/exec/u-lastindex-adv.js (strict mode) Full Output: Exception: Test262Error: Expected SameValue(«�», «null») to be true https://build.webkit.org/builders/Apple%20High%20Sierra%20Release%20Test262%20%28Tests%29/builds/2784/steps/test262-test/logs/stdio https://build.webkit.org/builders/Apple%20High%20Sierra%20Debug%20Test262%20%28Tests%29/builds/2108/steps/test262-test/logs/stdio (In reply to Shawn Roberts from comment #34) > Seeing 4 Test262 JSC failures on Release and Debug after this change. Those are already-failing tests so we just need to update test262/expectations.yaml. Reverted r244821 for reason: Causing Committed r244827: <https://trac.webkit.org/changeset/244827> (In reply to Ross Kirsling from comment #35) > (In reply to Shawn Roberts from comment #34) > > Seeing 4 Test262 JSC failures on Release and Debug after this change. > > Those are already-failing tests so we just need to update > test262/expectations.yaml. Hm, the test run before this revision didn't report the tests as failing. The bottom of the stdio shows: "0 tests newly fail" I do see the failures when I search the test names, though. https://build.webkit.org/builders/Apple%20High%20Sierra%20Debug%20Test262%20%28Tests%29/builds/2107/steps/test262-test/logs/stdio (In reply to Ryan Haddad from comment #37) > (In reply to Ross Kirsling from comment #35) > > (In reply to Shawn Roberts from comment #34) > > > Seeing 4 Test262 JSC failures on Release and Debug after this change. > > > > Those are already-failing tests so we just need to update > > test262/expectations.yaml. Oh, I see what you are saying. The test was already failing but the output changed from: Exception: Test262Error: Expected SameValue(«���», «null») to be true to Exception: Test262Error: Expected SameValue(«�», «null») to be true I re-landed r244821 and r244822 in https://trac.webkit.org/changeset/244828/webkit. Will address the test262 failures separately. (In reply to Ryan Haddad from comment #39) > I re-landed r244821 and r244822 in > https://trac.webkit.org/changeset/244828/webkit. > > Will address the test262 failures separately. Committed r244833: <https://trac.webkit.org/changeset/244833> I see now. The test262 failures are analogous to the layout tests, where we had to rebase because the test harness now writes out strings differently (more correctly) but the actual test results are unchanged. So the expected results need to be rebased. Comment on attachment 368529 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=368529&action=review > Source/WTF/wtf/unicode/UTF8Conversion.cpp:99 > + U8_NEXT(reinterpret_cast<const uint8_t*>(source), sourceOffset, sourceEnd - source, character); It appears we would more closely match the behavior of Chrome and Firefox if we used U8_NEXT_OR_FFFD here instead of U8_NEXT. I'm planning to propose that change in an upcoming patch. Comment on attachment 368529 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=368529&action=review >> Source/WTF/wtf/unicode/UTF8Conversion.cpp:99 >> + U8_NEXT(reinterpret_cast<const uint8_t*>(source), sourceOffset, sourceEnd - source, character); > > It appears we would more closely match the behavior of Chrome and Firefox if we used U8_NEXT_OR_FFFD here instead of U8_NEXT. I'm planning to propose that change in an upcoming patch. Given that this is a widely used function, I am not sure that all call sites need the same behavior. Ok, feel free to discuss in https://bugs.webkit.org/show_bug.cgi?id=215671 |