Remove KJS::UChar Darin and I talked about this long ago. No need for a real class. This also gets rid of the silly KJS::UChar vs. ::UChar confusion.
Created attachment 18703 [details] Send KJS::UChar out to pasture JavaScriptCore/API/JSStringRef.cpp | 4 +- JavaScriptCore/JavaScriptCore.exp | 10 +- .../JavaScriptCore.xcodeproj/project.pbxproj | 1 - JavaScriptCore/bindings/c/c_utility.cpp | 2 +- JavaScriptCore/kjs/Parser.h | 2 - JavaScriptCore/kjs/function.cpp | 32 +- JavaScriptCore/kjs/function_object.cpp | 4 +- JavaScriptCore/kjs/identifier.cpp | 6 +- JavaScriptCore/kjs/interpreter.h | 2 - JavaScriptCore/kjs/lexer.cpp | 29 +- JavaScriptCore/kjs/lookup.cpp | 2 +- JavaScriptCore/kjs/nodes2string.cpp | 2 +- JavaScriptCore/kjs/string_object.cpp | 6 +- JavaScriptCore/kjs/ustring.cpp | 30 +- JavaScriptCore/kjs/ustring.h | 62 +- JavaScriptGlue/Configurations/Base.xcconfig | 2 +- .../ForwardingHeaders/wtf/unicode/Unicode.h | 1 + .../ForwardingHeaders/wtf/unicode/icu/UnicodeIcu.h | 1 + JavaScriptGlue/icu/LICENSE | 25 + JavaScriptGlue/icu/README | 4 + JavaScriptGlue/icu/unicode/platform.h | 267 ++ JavaScriptGlue/icu/unicode/putil.h | 180 ++ JavaScriptGlue/icu/unicode/uchar.h | 2798 ++++++++++++++++++++ JavaScriptGlue/icu/unicode/uconfig.h | 186 ++ JavaScriptGlue/icu/unicode/uiter.h | 707 +++++ JavaScriptGlue/icu/unicode/umachine.h | 371 +++ JavaScriptGlue/icu/unicode/urename.h | 1468 ++++++++++ JavaScriptGlue/icu/unicode/ustring.h | 1320 +++++++++ JavaScriptGlue/icu/unicode/utf.h | 221 ++ JavaScriptGlue/icu/unicode/utf16.h | 605 +++++ JavaScriptGlue/icu/unicode/utf8.h | 627 +++++ JavaScriptGlue/icu/unicode/utypes.h | 745 ++++++ JavaScriptGlue/icu/unicode/uversion.h | 216 ++ WebCore/WebCore.xcodeproj/project.pbxproj | 1 - WebCore/bindings/js/kjs_proxy.cpp | 2 +- WebCore/bindings/js/kjs_window.cpp | 4 +- WebCore/platform/DeprecatedString.cpp | 4 +- WebCore/platform/text/String.cpp | 4 +- 38 files changed, 9816 insertions(+), 137 deletions(-)
I will note that this patch is *entirely* the damn ICU headers :(
Comment on attachment 18703 [details] Send KJS::UChar out to pasture I'm going to upload a split patch.
Created attachment 18704 [details] Add ICU and appropriate Unicode forwarding headers to JSG (ENTIRELY ICU headers) JavaScriptGlue/Configurations/Base.xcconfig | 2 +- .../ForwardingHeaders/wtf/unicode/Unicode.h | 1 + .../ForwardingHeaders/wtf/unicode/icu/UnicodeIcu.h | 1 + JavaScriptGlue/icu/LICENSE | 25 + JavaScriptGlue/icu/README | 4 + JavaScriptGlue/icu/unicode/platform.h | 267 ++ JavaScriptGlue/icu/unicode/putil.h | 180 ++ JavaScriptGlue/icu/unicode/uchar.h | 2798 ++++++++++++++++++++ JavaScriptGlue/icu/unicode/uconfig.h | 186 ++ JavaScriptGlue/icu/unicode/uiter.h | 707 +++++ JavaScriptGlue/icu/unicode/umachine.h | 371 +++ JavaScriptGlue/icu/unicode/urename.h | 1468 ++++++++++ JavaScriptGlue/icu/unicode/ustring.h | 1320 +++++++++ JavaScriptGlue/icu/unicode/utf.h | 221 ++ JavaScriptGlue/icu/unicode/utf16.h | 605 +++++ JavaScriptGlue/icu/unicode/utf8.h | 627 +++++ JavaScriptGlue/icu/unicode/utypes.h | 745 ++++++ JavaScriptGlue/icu/unicode/uversion.h | 216 ++ 18 files changed, 9743 insertions(+), 1 deletions(-)
Created attachment 18705 [details] send KJS::UChar out to pasture (minus ICU headers) JavaScriptCore/API/JSStringRef.cpp | 4 +- JavaScriptCore/JavaScriptCore.exp | 10 ++-- JavaScriptCore/bindings/c/c_utility.cpp | 2 +- JavaScriptCore/kjs/Parser.h | 2 - JavaScriptCore/kjs/function.cpp | 32 ++++++++-------- JavaScriptCore/kjs/function_object.cpp | 4 +- JavaScriptCore/kjs/identifier.cpp | 6 +- JavaScriptCore/kjs/interpreter.h | 2 - JavaScriptCore/kjs/lexer.cpp | 29 +++++++------- JavaScriptCore/kjs/lookup.cpp | 2 +- JavaScriptCore/kjs/nodes2string.cpp | 2 +- JavaScriptCore/kjs/string_object.cpp | 6 +- JavaScriptCore/kjs/ustring.cpp | 30 +++++++------- JavaScriptCore/kjs/ustring.h | 62 +------------------------------ WebCore/bindings/js/kjs_proxy.cpp | 2 +- WebCore/bindings/js/kjs_window.cpp | 4 +- WebCore/platform/DeprecatedString.cpp | 4 +- WebCore/platform/text/String.cpp | 4 +- 18 files changed, 73 insertions(+), 134 deletions(-)
Comment on attachment 18705 [details] send KJS::UChar out to pasture (minus ICU headers) Generally looks good. + return toRef(UString(reinterpret_cast<const UChar*>(chars), static_cast<int>(numChars)).rep()->ref()); + return toRef(UString(reinterpret_cast<UChar*>(buffer.data()), p - buffer.data()).rep()->ref()); + const ::UChar* d = reinterpret_cast<const ::UChar*>(&data()[0]); + Completion comp = Interpreter::evaluate(exec, filename, baseLine, reinterpret_cast<const UChar*>(str.characters()), str.length(), thisNode); + return Identifier(reinterpret_cast<const UChar*>(m_impl->characters()), m_impl->length()); + return UString(reinterpret_cast<const UChar*>(m_impl->characters()), m_impl->length()); No typecast needed at all at these call sites now. There's a small but potentially slightly dangerous difference between KJS::UChar and the integer type UChar typedef, and that's the behavior of converting a char into UChar. This will silently behave differently, sign extending rather than padding with zeroes. Here are the three callsites that are affected: From CStringTranslator::translate: for (size_t i = 0; i != length; i++) d[i] = c[i]; This needs a typecast to "unsigned char" on the right side of the assignment statement. Unless we guarantee we only use this on ASCII strings. From SourceStream& SourceStream::operator<<(char c): UChar ch(c); This also needs a typecast to "unsigned char". It's probably a little less dangerous than the above because we probably use this operator only with ASCII characters, but I'm not sure. From UString::append(const char*), in 3 different places: for (int i = 0; i < tSize; ++i) d[thisSize + i] = t[i]; Same issue, same explanation -- safe if only used on ASCII, or we can add a cast to preserve the old behavior. I'm going to say review- because of this issue; otherwise good patch.
Comment on attachment 18704 [details] Add ICU and appropriate Unicode forwarding headers to JSG (ENTIRELY ICU headers) r=me But I suggest using svn copy on the whole icu directory from JavaScriptCore instead of doing the individual files like this.
I actually copied 3 fewer files than the full ICU directory. :) But I agree, that would be cleaner. I'll do that. The sign-extend issue calls the whole patch into question! Since the goal was not to make it easier to hack on WebKit (less confusion) rather than harder (dangerous to convert from char to UChar)! Hum...
s/not//
(In reply to comment #8) > The sign-extend issue calls the whole patch into question! Since the goal was > not to make it easier to hack on WebKit (less confusion) rather than harder > (dangerous to convert from char to UChar)! Nah, it's not all that important. It's not super-helpful to have char convert to UChar by 0-filling any more than by sign extending. It's just a subtle change at the switchover time. In the long run, it's find to have the single integral type. The safest way to do things would be to change KJS::UChar to remove the special char-conversion feature, then do the switchover in a separate step. I thought of one other difference between UChar and KJS::UChar. The latter default constructs to 0 even in cases where the raw type constructs uninitialized. These are the only things to worry about when switching over.
I think that's an excellent idea. I'll split the patch out into stages. One which makes UChar(char*) and UChar() private, and a second which does the removal.
UChar(char) rather. :)
Comment on attachment 18704 [details] Add ICU and appropriate Unicode forwarding headers to JSG (ENTIRELY ICU headers) Landed this patch (using SVN cp). Working on the other pieces still.
Created attachment 18753 [details] Corrections for darin JavaScriptCore/API/JSStringRef.cpp | 8 ++++---- JavaScriptCore/API/JSStringRefCF.cpp | 2 +- JavaScriptCore/bindings/objc/objc_utility.mm | 2 +- JavaScriptCore/kjs/identifier.cpp | 2 +- JavaScriptCore/kjs/nodes2string.cpp | 3 ++- JavaScriptCore/kjs/string_object.cpp | 8 ++++---- JavaScriptCore/kjs/ustring.cpp | 14 +++++++------- WebCore/bindings/js/kjs_proxy.cpp | 2 +- WebCore/dom/Document.cpp | 4 ++-- WebCore/html/HTMLFormElement.cpp | 2 +- WebCore/platform/text/AtomicString.cpp | 4 ++-- WebCore/platform/text/String.cpp | 8 ++++---- WebCore/platform/text/TextCodecICU.cpp | 2 +- 13 files changed, 31 insertions(+), 30 deletions(-)
Created attachment 18754 [details] Combined patch for darin JavaScriptCore/API/JSStringRef.cpp | 8 ++-- JavaScriptCore/API/JSStringRefCF.cpp | 2 +- JavaScriptCore/JavaScriptCore.exp | 10 ++-- JavaScriptCore/bindings/c/c_utility.cpp | 2 +- JavaScriptCore/bindings/objc/objc_utility.mm | 2 +- JavaScriptCore/kjs/Parser.h | 2 - JavaScriptCore/kjs/function.cpp | 32 +++++++------- JavaScriptCore/kjs/function_object.cpp | 4 +- JavaScriptCore/kjs/identifier.cpp | 8 ++-- JavaScriptCore/kjs/interpreter.h | 2 - JavaScriptCore/kjs/lexer.cpp | 29 ++++++------ JavaScriptCore/kjs/lookup.cpp | 2 +- JavaScriptCore/kjs/nodes2string.cpp | 5 +- JavaScriptCore/kjs/string_object.cpp | 14 +++--- JavaScriptCore/kjs/ustring.cpp | 40 ++++++++-------- JavaScriptCore/kjs/ustring.h | 62 +------------------------- WebCore/bindings/js/kjs_proxy.cpp | 2 +- WebCore/bindings/js/kjs_window.cpp | 4 +- WebCore/dom/Document.cpp | 4 +- WebCore/html/HTMLFormElement.cpp | 2 +- WebCore/platform/DeprecatedString.cpp | 4 +- WebCore/platform/text/AtomicString.cpp | 4 +- WebCore/platform/text/String.cpp | 8 ++-- WebCore/platform/text/TextCodecICU.cpp | 2 +- 24 files changed, 97 insertions(+), 157 deletions(-)
Comment on attachment 18754 [details] Combined patch for darin 552 if (charLen && (u == 0 || u >= 128 || !strchr(do_not_unescape, static_cast<unsigned char>(u)))) { There's no need for a cast here. strchr takes an int and the earlier code on the line already checks that it's a value in the range 1-127, so a normal conversion to int will do. 837 if (Lexer::isHexDigit((c + 2)[0]) && Lexer::isHexDigit((c + 3)[0]) && Lexer::isHexDigit((c + 4)[0]) && Lexer::isHexDigit((c + 5)[0])) { 838 u = Lexer::convertUnicode((c + 2)[0], (c + 3)[0], (c + 4)[0], (c + 5)[0]); Really should be c[2] instead of (c + 2)[0]. The [0] is a good trick to do this with a global replace, but it's a little ugly. 116 d[i] = static_cast<unsigned char>(c[i]); // static_cast to unsigned char to avoid erroneous sign-extend The comment says the sign extend here would be "erroneous", but really I think that this function is only called on pure ASCII strings. While changing 128-255 into U+0080 to U+00FF is arguably better than changing them to U+FF80 to U+FFFF, I think what's really going on here is that it's never going to matter. I think a better comment would simply state that we use unsigned char because we want to zero-extend rather than sign extending. Note that cross-platform, plain old "char" can be *either* unsigned or signed. 97 void Lexer::setCode(int startingLineNumber, const UChar *c, unsigned int len) Surprised you didn't move the star. 266212 UString &append(const char *); 267213 UString &append(unsigned short); 268214 UString &append(char c) { return append(static_cast<unsigned short>(static_cast<unsigned char>(c))); } 269 UString &append(UChar c) { return append(c.uc); } I think we need the append(unsigned short) to be changed to be an append(UChar), because on Windows those two types aren't the same. As-is I think this might break the Windows build. r=me on this as-is, but please make sure you don't break the Windows build
Eric, this patch has been sitting around in the commit queue for over a month now. Do you intend to land it?
Eric has been a busy little boy @ work. However my big project ended on Wed, so I'll look at making time to land this this afternoon. :)
Created attachment 19533 [details] Fix patch to compile on TOT, change ::UChar -> UChar JavaScriptCore/kjs/interpreter.h | 2 ++ JavaScriptCore/pcre/pcre.h | 4 ++-- .../bindings/js/JSCSSStyleDeclarationCustom.cpp | 12 ++++++------ WebCore/bindings/js/JSSVGPODTypeWrapper.h | 2 +- WebCore/svg/SVGAnimatedTemplate.h | 2 +- 5 files changed, 12 insertions(+), 10 deletions(-)
Created attachment 19534 [details] Fix per darin's suggestions JavaScriptCore/bindings/c/c_utility.cpp | 6 +++--- JavaScriptCore/kjs/function.cpp | 12 ++++++------ JavaScriptCore/kjs/identifier.cpp | 2 +- JavaScriptCore/kjs/lexer.cpp | 2 +- JavaScriptCore/kjs/nodes2string.cpp | 2 +- JavaScriptCore/kjs/regexp.cpp | 6 +++--- JavaScriptCore/kjs/string_object.cpp | 24 ++++++++++++------------ JavaScriptCore/kjs/ustring.cpp | 20 ++++++++++---------- JavaScriptCore/kjs/ustring.h | 12 ++++++------ WebCore/bindings/js/JSSVGPODTypeWrapper.h | 2 +- WebCore/svg/SVGAnimatedTemplate.h | 2 +- 11 files changed, 45 insertions(+), 45 deletions(-)
Created attachment 19535 [details] Updated combined patch for removing KJS::UChar JavaScriptCore/API/JSStringRef.cpp | 8 +- JavaScriptCore/API/JSStringRefCF.cpp | 2 +- JavaScriptCore/JavaScriptCore.exp | 10 ++-- JavaScriptCore/bindings/c/c_utility.cpp | 8 +- JavaScriptCore/bindings/objc/objc_utility.mm | 2 +- JavaScriptCore/kjs/Parser.h | 2 - JavaScriptCore/kjs/function.cpp | 34 +++++----- JavaScriptCore/kjs/function_object.cpp | 4 +- JavaScriptCore/kjs/identifier.cpp | 8 +- JavaScriptCore/kjs/interpreter.h | 4 +- JavaScriptCore/kjs/lexer.cpp | 29 ++++---- JavaScriptCore/kjs/lookup.cpp | 2 +- JavaScriptCore/kjs/nodes2string.cpp | 5 +- JavaScriptCore/kjs/regexp.cpp | 6 +- JavaScriptCore/kjs/string_object.cpp | 38 +++++----- JavaScriptCore/kjs/ustring.cpp | 48 ++++++------ JavaScriptCore/kjs/ustring.h | 74 ++----------------- JavaScriptCore/pcre/pcre.h | 4 +- .../bindings/js/JSCSSStyleDeclarationCustom.cpp | 12 ++-- WebCore/bindings/js/JSSVGPODTypeWrapper.h | 2 +- WebCore/bindings/js/kjs_proxy.cpp | 2 +- WebCore/bindings/js/kjs_window.cpp | 4 +- WebCore/dom/Document.cpp | 4 +- WebCore/platform/text/AtomicString.cpp | 4 +- WebCore/platform/text/String.cpp | 8 +- WebCore/platform/text/TextCodecICU.cpp | 2 +- WebCore/svg/SVGAnimatedTemplate.h | 2 +- 27 files changed, 135 insertions(+), 193 deletions(-)
Comment on attachment 19535 [details] Updated combined patch for removing KJS::UChar 213 UString& append(const UChar); 807 UString& UString::append(const UChar c) There's no reason for or value to having this "const" here. r=me
Comment on attachment 19533 [details] Fix patch to compile on TOT, change ::UChar -> UChar Clearing review flag on this, with the understanding that this is included in the combined patch.
Comment on attachment 19534 [details] Fix per darin's suggestions Clearing review flag on this, with the understanding that this is included in the combined patch.
I tried to apply this and land it, but because it's a git patch and not a Subversion patch, I couldn't use the --merge option of svn-apply and there were a few conflicts. So I think Eric is going to have to land this himself.
r30942