RESOLVED FIXED 17017
Remove KJS::UChar
https://bugs.webkit.org/show_bug.cgi?id=17017
Summary Remove KJS::UChar
Eric Seidel (no email)
Reported 2008-01-26 03:13:54 PST
Remove KJS::UChar Darin and I talked about this long ago. No need for a real class. This also gets rid of the silly KJS::UChar vs. ::UChar confusion.
Attachments
Send KJS::UChar out to pasture (394.34 KB, patch)
2008-01-26 03:19 PST, Eric Seidel (no email)
no flags
Add ICU and appropriate Unicode forwarding headers to JSG (ENTIRELY ICU headers) (367.09 KB, patch)
2008-01-26 03:26 PST, Eric Seidel (no email)
no flags
send KJS::UChar out to pasture (minus ICU headers) (26.08 KB, patch)
2008-01-26 03:26 PST, Eric Seidel (no email)
darin: review-
Corrections for darin (13.31 KB, patch)
2008-01-28 18:58 PST, Eric Seidel (no email)
no flags
Combined patch for darin (35.57 KB, patch)
2008-01-28 19:01 PST, Eric Seidel (no email)
darin: review+
Fix patch to compile on TOT, change ::UChar -> UChar (4.40 KB, patch)
2008-03-04 16:50 PST, Eric Seidel (no email)
no flags
Fix per darin's suggestions (16.19 KB, patch)
2008-03-04 16:50 PST, Eric Seidel (no email)
no flags
Updated combined patch for removing KJS::UChar (45.00 KB, patch)
2008-03-04 16:56 PST, Eric Seidel (no email)
darin: review+
Eric Seidel (no email)
Comment 1 2008-01-26 03:19:29 PST
Created attachment 18703 [details] Send KJS::UChar out to pasture JavaScriptCore/API/JSStringRef.cpp | 4 +- JavaScriptCore/JavaScriptCore.exp | 10 +- .../JavaScriptCore.xcodeproj/project.pbxproj | 1 - JavaScriptCore/bindings/c/c_utility.cpp | 2 +- JavaScriptCore/kjs/Parser.h | 2 - JavaScriptCore/kjs/function.cpp | 32 +- JavaScriptCore/kjs/function_object.cpp | 4 +- JavaScriptCore/kjs/identifier.cpp | 6 +- JavaScriptCore/kjs/interpreter.h | 2 - JavaScriptCore/kjs/lexer.cpp | 29 +- JavaScriptCore/kjs/lookup.cpp | 2 +- JavaScriptCore/kjs/nodes2string.cpp | 2 +- JavaScriptCore/kjs/string_object.cpp | 6 +- JavaScriptCore/kjs/ustring.cpp | 30 +- JavaScriptCore/kjs/ustring.h | 62 +- JavaScriptGlue/Configurations/Base.xcconfig | 2 +- .../ForwardingHeaders/wtf/unicode/Unicode.h | 1 + .../ForwardingHeaders/wtf/unicode/icu/UnicodeIcu.h | 1 + JavaScriptGlue/icu/LICENSE | 25 + JavaScriptGlue/icu/README | 4 + JavaScriptGlue/icu/unicode/platform.h | 267 ++ JavaScriptGlue/icu/unicode/putil.h | 180 ++ JavaScriptGlue/icu/unicode/uchar.h | 2798 ++++++++++++++++++++ JavaScriptGlue/icu/unicode/uconfig.h | 186 ++ JavaScriptGlue/icu/unicode/uiter.h | 707 +++++ JavaScriptGlue/icu/unicode/umachine.h | 371 +++ JavaScriptGlue/icu/unicode/urename.h | 1468 ++++++++++ JavaScriptGlue/icu/unicode/ustring.h | 1320 +++++++++ JavaScriptGlue/icu/unicode/utf.h | 221 ++ JavaScriptGlue/icu/unicode/utf16.h | 605 +++++ JavaScriptGlue/icu/unicode/utf8.h | 627 +++++ JavaScriptGlue/icu/unicode/utypes.h | 745 ++++++ JavaScriptGlue/icu/unicode/uversion.h | 216 ++ WebCore/WebCore.xcodeproj/project.pbxproj | 1 - WebCore/bindings/js/kjs_proxy.cpp | 2 +- WebCore/bindings/js/kjs_window.cpp | 4 +- WebCore/platform/DeprecatedString.cpp | 4 +- WebCore/platform/text/String.cpp | 4 +- 38 files changed, 9816 insertions(+), 137 deletions(-)
Eric Seidel (no email)
Comment 2 2008-01-26 03:20:16 PST
I will note that this patch is *entirely* the damn ICU headers :(
Eric Seidel (no email)
Comment 3 2008-01-26 03:26:06 PST
Comment on attachment 18703 [details] Send KJS::UChar out to pasture I'm going to upload a split patch.
Eric Seidel (no email)
Comment 4 2008-01-26 03:26:25 PST
Created attachment 18704 [details] Add ICU and appropriate Unicode forwarding headers to JSG (ENTIRELY ICU headers) JavaScriptGlue/Configurations/Base.xcconfig | 2 +- .../ForwardingHeaders/wtf/unicode/Unicode.h | 1 + .../ForwardingHeaders/wtf/unicode/icu/UnicodeIcu.h | 1 + JavaScriptGlue/icu/LICENSE | 25 + JavaScriptGlue/icu/README | 4 + JavaScriptGlue/icu/unicode/platform.h | 267 ++ JavaScriptGlue/icu/unicode/putil.h | 180 ++ JavaScriptGlue/icu/unicode/uchar.h | 2798 ++++++++++++++++++++ JavaScriptGlue/icu/unicode/uconfig.h | 186 ++ JavaScriptGlue/icu/unicode/uiter.h | 707 +++++ JavaScriptGlue/icu/unicode/umachine.h | 371 +++ JavaScriptGlue/icu/unicode/urename.h | 1468 ++++++++++ JavaScriptGlue/icu/unicode/ustring.h | 1320 +++++++++ JavaScriptGlue/icu/unicode/utf.h | 221 ++ JavaScriptGlue/icu/unicode/utf16.h | 605 +++++ JavaScriptGlue/icu/unicode/utf8.h | 627 +++++ JavaScriptGlue/icu/unicode/utypes.h | 745 ++++++ JavaScriptGlue/icu/unicode/uversion.h | 216 ++ 18 files changed, 9743 insertions(+), 1 deletions(-)
Eric Seidel (no email)
Comment 5 2008-01-26 03:26:27 PST
Created attachment 18705 [details] send KJS::UChar out to pasture (minus ICU headers) JavaScriptCore/API/JSStringRef.cpp | 4 +- JavaScriptCore/JavaScriptCore.exp | 10 ++-- JavaScriptCore/bindings/c/c_utility.cpp | 2 +- JavaScriptCore/kjs/Parser.h | 2 - JavaScriptCore/kjs/function.cpp | 32 ++++++++-------- JavaScriptCore/kjs/function_object.cpp | 4 +- JavaScriptCore/kjs/identifier.cpp | 6 +- JavaScriptCore/kjs/interpreter.h | 2 - JavaScriptCore/kjs/lexer.cpp | 29 +++++++------- JavaScriptCore/kjs/lookup.cpp | 2 +- JavaScriptCore/kjs/nodes2string.cpp | 2 +- JavaScriptCore/kjs/string_object.cpp | 6 +- JavaScriptCore/kjs/ustring.cpp | 30 +++++++------- JavaScriptCore/kjs/ustring.h | 62 +------------------------------ WebCore/bindings/js/kjs_proxy.cpp | 2 +- WebCore/bindings/js/kjs_window.cpp | 4 +- WebCore/platform/DeprecatedString.cpp | 4 +- WebCore/platform/text/String.cpp | 4 +- 18 files changed, 73 insertions(+), 134 deletions(-)
Darin Adler
Comment 6 2008-01-26 08:18:28 PST
Comment on attachment 18705 [details] send KJS::UChar out to pasture (minus ICU headers) Generally looks good. + return toRef(UString(reinterpret_cast<const UChar*>(chars), static_cast<int>(numChars)).rep()->ref()); + return toRef(UString(reinterpret_cast<UChar*>(buffer.data()), p - buffer.data()).rep()->ref()); + const ::UChar* d = reinterpret_cast<const ::UChar*>(&data()[0]); + Completion comp = Interpreter::evaluate(exec, filename, baseLine, reinterpret_cast<const UChar*>(str.characters()), str.length(), thisNode); + return Identifier(reinterpret_cast<const UChar*>(m_impl->characters()), m_impl->length()); + return UString(reinterpret_cast<const UChar*>(m_impl->characters()), m_impl->length()); No typecast needed at all at these call sites now. There's a small but potentially slightly dangerous difference between KJS::UChar and the integer type UChar typedef, and that's the behavior of converting a char into UChar. This will silently behave differently, sign extending rather than padding with zeroes. Here are the three callsites that are affected: From CStringTranslator::translate: for (size_t i = 0; i != length; i++) d[i] = c[i]; This needs a typecast to "unsigned char" on the right side of the assignment statement. Unless we guarantee we only use this on ASCII strings. From SourceStream& SourceStream::operator<<(char c): UChar ch(c); This also needs a typecast to "unsigned char". It's probably a little less dangerous than the above because we probably use this operator only with ASCII characters, but I'm not sure. From UString::append(const char*), in 3 different places: for (int i = 0; i < tSize; ++i) d[thisSize + i] = t[i]; Same issue, same explanation -- safe if only used on ASCII, or we can add a cast to preserve the old behavior. I'm going to say review- because of this issue; otherwise good patch.
Darin Adler
Comment 7 2008-01-26 08:20:51 PST
Comment on attachment 18704 [details] Add ICU and appropriate Unicode forwarding headers to JSG (ENTIRELY ICU headers) r=me But I suggest using svn copy on the whole icu directory from JavaScriptCore instead of doing the individual files like this.
Eric Seidel (no email)
Comment 8 2008-01-26 13:48:47 PST
I actually copied 3 fewer files than the full ICU directory. :) But I agree, that would be cleaner. I'll do that. The sign-extend issue calls the whole patch into question! Since the goal was not to make it easier to hack on WebKit (less confusion) rather than harder (dangerous to convert from char to UChar)! Hum...
Eric Seidel (no email)
Comment 9 2008-01-26 13:50:11 PST
s/not//
Darin Adler
Comment 10 2008-01-26 16:01:15 PST
(In reply to comment #8) > The sign-extend issue calls the whole patch into question! Since the goal was > not to make it easier to hack on WebKit (less confusion) rather than harder > (dangerous to convert from char to UChar)! Nah, it's not all that important. It's not super-helpful to have char convert to UChar by 0-filling any more than by sign extending. It's just a subtle change at the switchover time. In the long run, it's find to have the single integral type. The safest way to do things would be to change KJS::UChar to remove the special char-conversion feature, then do the switchover in a separate step. I thought of one other difference between UChar and KJS::UChar. The latter default constructs to 0 even in cases where the raw type constructs uninitialized. These are the only things to worry about when switching over.
Eric Seidel (no email)
Comment 11 2008-01-26 16:08:38 PST
I think that's an excellent idea. I'll split the patch out into stages. One which makes UChar(char*) and UChar() private, and a second which does the removal.
Eric Seidel (no email)
Comment 12 2008-01-26 16:09:20 PST
UChar(char) rather. :)
Eric Seidel (no email)
Comment 13 2008-01-28 18:06:39 PST
Comment on attachment 18704 [details] Add ICU and appropriate Unicode forwarding headers to JSG (ENTIRELY ICU headers) Landed this patch (using SVN cp). Working on the other pieces still.
Eric Seidel (no email)
Comment 14 2008-01-28 18:58:03 PST
Created attachment 18753 [details] Corrections for darin JavaScriptCore/API/JSStringRef.cpp | 8 ++++---- JavaScriptCore/API/JSStringRefCF.cpp | 2 +- JavaScriptCore/bindings/objc/objc_utility.mm | 2 +- JavaScriptCore/kjs/identifier.cpp | 2 +- JavaScriptCore/kjs/nodes2string.cpp | 3 ++- JavaScriptCore/kjs/string_object.cpp | 8 ++++---- JavaScriptCore/kjs/ustring.cpp | 14 +++++++------- WebCore/bindings/js/kjs_proxy.cpp | 2 +- WebCore/dom/Document.cpp | 4 ++-- WebCore/html/HTMLFormElement.cpp | 2 +- WebCore/platform/text/AtomicString.cpp | 4 ++-- WebCore/platform/text/String.cpp | 8 ++++---- WebCore/platform/text/TextCodecICU.cpp | 2 +- 13 files changed, 31 insertions(+), 30 deletions(-)
Eric Seidel (no email)
Comment 15 2008-01-28 19:01:13 PST
Created attachment 18754 [details] Combined patch for darin JavaScriptCore/API/JSStringRef.cpp | 8 ++-- JavaScriptCore/API/JSStringRefCF.cpp | 2 +- JavaScriptCore/JavaScriptCore.exp | 10 ++-- JavaScriptCore/bindings/c/c_utility.cpp | 2 +- JavaScriptCore/bindings/objc/objc_utility.mm | 2 +- JavaScriptCore/kjs/Parser.h | 2 - JavaScriptCore/kjs/function.cpp | 32 +++++++------- JavaScriptCore/kjs/function_object.cpp | 4 +- JavaScriptCore/kjs/identifier.cpp | 8 ++-- JavaScriptCore/kjs/interpreter.h | 2 - JavaScriptCore/kjs/lexer.cpp | 29 ++++++------ JavaScriptCore/kjs/lookup.cpp | 2 +- JavaScriptCore/kjs/nodes2string.cpp | 5 +- JavaScriptCore/kjs/string_object.cpp | 14 +++--- JavaScriptCore/kjs/ustring.cpp | 40 ++++++++-------- JavaScriptCore/kjs/ustring.h | 62 +------------------------- WebCore/bindings/js/kjs_proxy.cpp | 2 +- WebCore/bindings/js/kjs_window.cpp | 4 +- WebCore/dom/Document.cpp | 4 +- WebCore/html/HTMLFormElement.cpp | 2 +- WebCore/platform/DeprecatedString.cpp | 4 +- WebCore/platform/text/AtomicString.cpp | 4 +- WebCore/platform/text/String.cpp | 8 ++-- WebCore/platform/text/TextCodecICU.cpp | 2 +- 24 files changed, 97 insertions(+), 157 deletions(-)
Darin Adler
Comment 16 2008-01-29 12:56:05 PST
Comment on attachment 18754 [details] Combined patch for darin 552 if (charLen && (u == 0 || u >= 128 || !strchr(do_not_unescape, static_cast<unsigned char>(u)))) { There's no need for a cast here. strchr takes an int and the earlier code on the line already checks that it's a value in the range 1-127, so a normal conversion to int will do. 837 if (Lexer::isHexDigit((c + 2)[0]) && Lexer::isHexDigit((c + 3)[0]) && Lexer::isHexDigit((c + 4)[0]) && Lexer::isHexDigit((c + 5)[0])) { 838 u = Lexer::convertUnicode((c + 2)[0], (c + 3)[0], (c + 4)[0], (c + 5)[0]); Really should be c[2] instead of (c + 2)[0]. The [0] is a good trick to do this with a global replace, but it's a little ugly. 116 d[i] = static_cast<unsigned char>(c[i]); // static_cast to unsigned char to avoid erroneous sign-extend The comment says the sign extend here would be "erroneous", but really I think that this function is only called on pure ASCII strings. While changing 128-255 into U+0080 to U+00FF is arguably better than changing them to U+FF80 to U+FFFF, I think what's really going on here is that it's never going to matter. I think a better comment would simply state that we use unsigned char because we want to zero-extend rather than sign extending. Note that cross-platform, plain old "char" can be *either* unsigned or signed. 97 void Lexer::setCode(int startingLineNumber, const UChar *c, unsigned int len) Surprised you didn't move the star. 266212 UString &append(const char *); 267213 UString &append(unsigned short); 268214 UString &append(char c) { return append(static_cast<unsigned short>(static_cast<unsigned char>(c))); } 269 UString &append(UChar c) { return append(c.uc); } I think we need the append(unsigned short) to be changed to be an append(UChar), because on Windows those two types aren't the same. As-is I think this might break the Windows build. r=me on this as-is, but please make sure you don't break the Windows build
Mark Rowe (bdash)
Comment 17 2008-02-29 14:23:09 PST
Eric, this patch has been sitting around in the commit queue for over a month now. Do you intend to land it?
Eric Seidel (no email)
Comment 18 2008-02-29 14:35:42 PST
Eric has been a busy little boy @ work. However my big project ended on Wed, so I'll look at making time to land this this afternoon. :)
Eric Seidel (no email)
Comment 19 2008-03-04 16:50:02 PST
Created attachment 19533 [details] Fix patch to compile on TOT, change ::UChar -> UChar JavaScriptCore/kjs/interpreter.h | 2 ++ JavaScriptCore/pcre/pcre.h | 4 ++-- .../bindings/js/JSCSSStyleDeclarationCustom.cpp | 12 ++++++------ WebCore/bindings/js/JSSVGPODTypeWrapper.h | 2 +- WebCore/svg/SVGAnimatedTemplate.h | 2 +- 5 files changed, 12 insertions(+), 10 deletions(-)
Eric Seidel (no email)
Comment 20 2008-03-04 16:50:05 PST
Created attachment 19534 [details] Fix per darin's suggestions JavaScriptCore/bindings/c/c_utility.cpp | 6 +++--- JavaScriptCore/kjs/function.cpp | 12 ++++++------ JavaScriptCore/kjs/identifier.cpp | 2 +- JavaScriptCore/kjs/lexer.cpp | 2 +- JavaScriptCore/kjs/nodes2string.cpp | 2 +- JavaScriptCore/kjs/regexp.cpp | 6 +++--- JavaScriptCore/kjs/string_object.cpp | 24 ++++++++++++------------ JavaScriptCore/kjs/ustring.cpp | 20 ++++++++++---------- JavaScriptCore/kjs/ustring.h | 12 ++++++------ WebCore/bindings/js/JSSVGPODTypeWrapper.h | 2 +- WebCore/svg/SVGAnimatedTemplate.h | 2 +- 11 files changed, 45 insertions(+), 45 deletions(-)
Eric Seidel (no email)
Comment 21 2008-03-04 16:56:00 PST
Created attachment 19535 [details] Updated combined patch for removing KJS::UChar JavaScriptCore/API/JSStringRef.cpp | 8 +- JavaScriptCore/API/JSStringRefCF.cpp | 2 +- JavaScriptCore/JavaScriptCore.exp | 10 ++-- JavaScriptCore/bindings/c/c_utility.cpp | 8 +- JavaScriptCore/bindings/objc/objc_utility.mm | 2 +- JavaScriptCore/kjs/Parser.h | 2 - JavaScriptCore/kjs/function.cpp | 34 +++++----- JavaScriptCore/kjs/function_object.cpp | 4 +- JavaScriptCore/kjs/identifier.cpp | 8 +- JavaScriptCore/kjs/interpreter.h | 4 +- JavaScriptCore/kjs/lexer.cpp | 29 ++++---- JavaScriptCore/kjs/lookup.cpp | 2 +- JavaScriptCore/kjs/nodes2string.cpp | 5 +- JavaScriptCore/kjs/regexp.cpp | 6 +- JavaScriptCore/kjs/string_object.cpp | 38 +++++----- JavaScriptCore/kjs/ustring.cpp | 48 ++++++------ JavaScriptCore/kjs/ustring.h | 74 ++----------------- JavaScriptCore/pcre/pcre.h | 4 +- .../bindings/js/JSCSSStyleDeclarationCustom.cpp | 12 ++-- WebCore/bindings/js/JSSVGPODTypeWrapper.h | 2 +- WebCore/bindings/js/kjs_proxy.cpp | 2 +- WebCore/bindings/js/kjs_window.cpp | 4 +- WebCore/dom/Document.cpp | 4 +- WebCore/platform/text/AtomicString.cpp | 4 +- WebCore/platform/text/String.cpp | 8 +- WebCore/platform/text/TextCodecICU.cpp | 2 +- WebCore/svg/SVGAnimatedTemplate.h | 2 +- 27 files changed, 135 insertions(+), 193 deletions(-)
Darin Adler
Comment 22 2008-03-07 12:03:24 PST
Comment on attachment 19535 [details] Updated combined patch for removing KJS::UChar 213 UString& append(const UChar); 807 UString& UString::append(const UChar c) There's no reason for or value to having this "const" here. r=me
Darin Adler
Comment 23 2008-03-07 12:09:59 PST
Comment on attachment 19533 [details] Fix patch to compile on TOT, change ::UChar -> UChar Clearing review flag on this, with the understanding that this is included in the combined patch.
Darin Adler
Comment 24 2008-03-07 12:10:27 PST
Comment on attachment 19534 [details] Fix per darin's suggestions Clearing review flag on this, with the understanding that this is included in the combined patch.
Darin Adler
Comment 25 2008-03-10 09:32:43 PDT
I tried to apply this and land it, but because it's a git patch and not a Subversion patch, I couldn't use the --merge option of svn-apply and there were a few conflicts. So I think Eric is going to have to land this himself.
Eric Seidel (no email)
Comment 26 2008-03-10 15:08:26 PDT
Note You need to log in before you can comment on or make changes to this bug.