Bug 17017 - Remove KJS::UChar
Summary: Remove KJS::UChar
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Macintosh OS X 10.4
: P2 Normal
Assignee: Eric Seidel (no email)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-01-26 03:13 PST by Eric Seidel (no email)
Modified: 2008-03-10 15:08 PDT (History)
2 users (show)

See Also:


Attachments
Send KJS::UChar out to pasture (394.34 KB, patch)
2008-01-26 03:19 PST, Eric Seidel (no email)
no flags Details | Formatted Diff | Diff
Add ICU and appropriate Unicode forwarding headers to JSG (ENTIRELY ICU headers) (367.09 KB, patch)
2008-01-26 03:26 PST, Eric Seidel (no email)
no flags Details | Formatted Diff | Diff
send KJS::UChar out to pasture (minus ICU headers) (26.08 KB, patch)
2008-01-26 03:26 PST, Eric Seidel (no email)
darin: review-
Details | Formatted Diff | Diff
Corrections for darin (13.31 KB, patch)
2008-01-28 18:58 PST, Eric Seidel (no email)
no flags Details | Formatted Diff | Diff
Combined patch for darin (35.57 KB, patch)
2008-01-28 19:01 PST, Eric Seidel (no email)
darin: review+
Details | Formatted Diff | Diff
Fix patch to compile on TOT, change ::UChar -> UChar (4.40 KB, patch)
2008-03-04 16:50 PST, Eric Seidel (no email)
no flags Details | Formatted Diff | Diff
Fix per darin's suggestions (16.19 KB, patch)
2008-03-04 16:50 PST, Eric Seidel (no email)
no flags Details | Formatted Diff | Diff
Updated combined patch for removing KJS::UChar (45.00 KB, patch)
2008-03-04 16:56 PST, Eric Seidel (no email)
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Seidel (no email) 2008-01-26 03:13:54 PST
Remove KJS::UChar

Darin and I talked about this long ago.  No need for a real class.  This also gets rid of the silly KJS::UChar vs. ::UChar confusion.
Comment 1 Eric Seidel (no email) 2008-01-26 03:19:29 PST
Created attachment 18703 [details]
Send KJS::UChar out to pasture

 JavaScriptCore/API/JSStringRef.cpp                 |    4 +-
 JavaScriptCore/JavaScriptCore.exp                  |   10 +-
 .../JavaScriptCore.xcodeproj/project.pbxproj       |    1 -
 JavaScriptCore/bindings/c/c_utility.cpp            |    2 +-
 JavaScriptCore/kjs/Parser.h                        |    2 -
 JavaScriptCore/kjs/function.cpp                    |   32 +-
 JavaScriptCore/kjs/function_object.cpp             |    4 +-
 JavaScriptCore/kjs/identifier.cpp                  |    6 +-
 JavaScriptCore/kjs/interpreter.h                   |    2 -
 JavaScriptCore/kjs/lexer.cpp                       |   29 +-
 JavaScriptCore/kjs/lookup.cpp                      |    2 +-
 JavaScriptCore/kjs/nodes2string.cpp                |    2 +-
 JavaScriptCore/kjs/string_object.cpp               |    6 +-
 JavaScriptCore/kjs/ustring.cpp                     |   30 +-
 JavaScriptCore/kjs/ustring.h                       |   62 +-
 JavaScriptGlue/Configurations/Base.xcconfig        |    2 +-
 .../ForwardingHeaders/wtf/unicode/Unicode.h        |    1 +
 .../ForwardingHeaders/wtf/unicode/icu/UnicodeIcu.h |    1 +
 JavaScriptGlue/icu/LICENSE                         |   25 +
 JavaScriptGlue/icu/README                          |    4 +
 JavaScriptGlue/icu/unicode/platform.h              |  267 ++
 JavaScriptGlue/icu/unicode/putil.h                 |  180 ++
 JavaScriptGlue/icu/unicode/uchar.h                 | 2798 ++++++++++++++++++++
 JavaScriptGlue/icu/unicode/uconfig.h               |  186 ++
 JavaScriptGlue/icu/unicode/uiter.h                 |  707 +++++
 JavaScriptGlue/icu/unicode/umachine.h              |  371 +++
 JavaScriptGlue/icu/unicode/urename.h               | 1468 ++++++++++
 JavaScriptGlue/icu/unicode/ustring.h               | 1320 +++++++++
 JavaScriptGlue/icu/unicode/utf.h                   |  221 ++
 JavaScriptGlue/icu/unicode/utf16.h                 |  605 +++++
 JavaScriptGlue/icu/unicode/utf8.h                  |  627 +++++
 JavaScriptGlue/icu/unicode/utypes.h                |  745 ++++++
 JavaScriptGlue/icu/unicode/uversion.h              |  216 ++
 WebCore/WebCore.xcodeproj/project.pbxproj          |    1 -
 WebCore/bindings/js/kjs_proxy.cpp                  |    2 +-
 WebCore/bindings/js/kjs_window.cpp                 |    4 +-
 WebCore/platform/DeprecatedString.cpp              |    4 +-
 WebCore/platform/text/String.cpp                   |    4 +-
 38 files changed, 9816 insertions(+), 137 deletions(-)
Comment 2 Eric Seidel (no email) 2008-01-26 03:20:16 PST
I will note that this patch is *entirely* the damn ICU headers :(
Comment 3 Eric Seidel (no email) 2008-01-26 03:26:06 PST
Comment on attachment 18703 [details]
Send KJS::UChar out to pasture

I'm going to upload a split patch.
Comment 4 Eric Seidel (no email) 2008-01-26 03:26:25 PST
Created attachment 18704 [details]
Add ICU and appropriate Unicode forwarding headers to JSG (ENTIRELY ICU headers)

 JavaScriptGlue/Configurations/Base.xcconfig        |    2 +-
 .../ForwardingHeaders/wtf/unicode/Unicode.h        |    1 +
 .../ForwardingHeaders/wtf/unicode/icu/UnicodeIcu.h |    1 +
 JavaScriptGlue/icu/LICENSE                         |   25 +
 JavaScriptGlue/icu/README                          |    4 +
 JavaScriptGlue/icu/unicode/platform.h              |  267 ++
 JavaScriptGlue/icu/unicode/putil.h                 |  180 ++
 JavaScriptGlue/icu/unicode/uchar.h                 | 2798 ++++++++++++++++++++
 JavaScriptGlue/icu/unicode/uconfig.h               |  186 ++
 JavaScriptGlue/icu/unicode/uiter.h                 |  707 +++++
 JavaScriptGlue/icu/unicode/umachine.h              |  371 +++
 JavaScriptGlue/icu/unicode/urename.h               | 1468 ++++++++++
 JavaScriptGlue/icu/unicode/ustring.h               | 1320 +++++++++
 JavaScriptGlue/icu/unicode/utf.h                   |  221 ++
 JavaScriptGlue/icu/unicode/utf16.h                 |  605 +++++
 JavaScriptGlue/icu/unicode/utf8.h                  |  627 +++++
 JavaScriptGlue/icu/unicode/utypes.h                |  745 ++++++
 JavaScriptGlue/icu/unicode/uversion.h              |  216 ++
 18 files changed, 9743 insertions(+), 1 deletions(-)
Comment 5 Eric Seidel (no email) 2008-01-26 03:26:27 PST
Created attachment 18705 [details]
send KJS::UChar out to pasture (minus ICU headers)

 JavaScriptCore/API/JSStringRef.cpp      |    4 +-
 JavaScriptCore/JavaScriptCore.exp       |   10 ++--
 JavaScriptCore/bindings/c/c_utility.cpp |    2 +-
 JavaScriptCore/kjs/Parser.h             |    2 -
 JavaScriptCore/kjs/function.cpp         |   32 ++++++++--------
 JavaScriptCore/kjs/function_object.cpp  |    4 +-
 JavaScriptCore/kjs/identifier.cpp       |    6 +-
 JavaScriptCore/kjs/interpreter.h        |    2 -
 JavaScriptCore/kjs/lexer.cpp            |   29 +++++++-------
 JavaScriptCore/kjs/lookup.cpp           |    2 +-
 JavaScriptCore/kjs/nodes2string.cpp     |    2 +-
 JavaScriptCore/kjs/string_object.cpp    |    6 +-
 JavaScriptCore/kjs/ustring.cpp          |   30 +++++++-------
 JavaScriptCore/kjs/ustring.h            |   62 +------------------------------
 WebCore/bindings/js/kjs_proxy.cpp       |    2 +-
 WebCore/bindings/js/kjs_window.cpp      |    4 +-
 WebCore/platform/DeprecatedString.cpp   |    4 +-
 WebCore/platform/text/String.cpp        |    4 +-
 18 files changed, 73 insertions(+), 134 deletions(-)
Comment 6 Darin Adler 2008-01-26 08:18:28 PST
Comment on attachment 18705 [details]
send KJS::UChar out to pasture (minus ICU headers)

Generally looks good.

+    return toRef(UString(reinterpret_cast<const UChar*>(chars), static_cast<int>(numChars)).rep()->ref());

+    return toRef(UString(reinterpret_cast<UChar*>(buffer.data()), p - buffer.data()).rep()->ref());

+  const ::UChar* d = reinterpret_cast<const ::UChar*>(&data()[0]);

+    Completion comp = Interpreter::evaluate(exec, filename, baseLine, reinterpret_cast<const UChar*>(str.characters()), str.length(), thisNode);

+    return Identifier(reinterpret_cast<const UChar*>(m_impl->characters()), m_impl->length());

+    return UString(reinterpret_cast<const UChar*>(m_impl->characters()), m_impl->length());

No typecast needed at all at these call sites now.

There's a small but potentially slightly dangerous difference between KJS::UChar and the integer type UChar typedef, and that's the behavior of converting a char into UChar. This will silently behave differently, sign extending rather than padding with zeroes. Here are the three callsites that are affected:

From CStringTranslator::translate:

        for (size_t i = 0; i != length; i++)
            d[i] = c[i];

This needs a typecast to "unsigned char" on the right side of the assignment statement. Unless we guarantee we only use this on ASCII strings.

From SourceStream& SourceStream::operator<<(char c):

    UChar ch(c);

This also needs a typecast to "unsigned char". It's probably a little less dangerous than the above because we probably use this operator only with ASCII characters, but I'm not sure.

From UString::append(const char*), in 3 different places:

        for (int i = 0; i < tSize; ++i)
            d[thisSize + i] = t[i];

Same issue, same explanation -- safe if only used on ASCII, or we can add a cast to preserve the old behavior.

I'm going to say review- because of this issue; otherwise good patch.
Comment 7 Darin Adler 2008-01-26 08:20:51 PST
Comment on attachment 18704 [details]
Add ICU and appropriate Unicode forwarding headers to JSG (ENTIRELY ICU headers)

r=me

But I suggest using svn copy on the whole icu directory from JavaScriptCore instead of doing the individual files like this.
Comment 8 Eric Seidel (no email) 2008-01-26 13:48:47 PST
I actually copied 3 fewer files than the full ICU directory. :) But I agree, that would be cleaner.  I'll do that.

The sign-extend issue calls the whole patch into question!  Since the goal was not to make it easier to hack on WebKit (less confusion) rather than harder (dangerous to convert from char to UChar)!

Hum...
Comment 9 Eric Seidel (no email) 2008-01-26 13:50:11 PST
s/not//
Comment 10 Darin Adler 2008-01-26 16:01:15 PST
(In reply to comment #8)
> The sign-extend issue calls the whole patch into question!  Since the goal was
> not to make it easier to hack on WebKit (less confusion) rather than harder
> (dangerous to convert from char to UChar)!

Nah, it's not all that important. It's not super-helpful to have char convert to UChar by 0-filling any more than by sign extending. It's just a subtle change at the switchover time. In the long run, it's find to have the single integral type.

The safest way to do things would be to change KJS::UChar to remove the special char-conversion feature, then do the switchover in a separate step.

I thought of one other difference between UChar and KJS::UChar. The latter default constructs to 0 even in cases where the raw type constructs uninitialized. These are the only things to worry about when switching over.
Comment 11 Eric Seidel (no email) 2008-01-26 16:08:38 PST
I think that's an excellent idea.

I'll split the patch out into stages.  One which makes UChar(char*) and UChar() private, and a second which does the removal.
Comment 12 Eric Seidel (no email) 2008-01-26 16:09:20 PST
UChar(char) rather. :)
Comment 13 Eric Seidel (no email) 2008-01-28 18:06:39 PST
Comment on attachment 18704 [details]
Add ICU and appropriate Unicode forwarding headers to JSG (ENTIRELY ICU headers)

Landed this patch (using SVN cp).  Working on the other pieces still.
Comment 14 Eric Seidel (no email) 2008-01-28 18:58:03 PST
Created attachment 18753 [details]
Corrections for darin

 JavaScriptCore/API/JSStringRef.cpp           |    8 ++++----
 JavaScriptCore/API/JSStringRefCF.cpp         |    2 +-
 JavaScriptCore/bindings/objc/objc_utility.mm |    2 +-
 JavaScriptCore/kjs/identifier.cpp            |    2 +-
 JavaScriptCore/kjs/nodes2string.cpp          |    3 ++-
 JavaScriptCore/kjs/string_object.cpp         |    8 ++++----
 JavaScriptCore/kjs/ustring.cpp               |   14 +++++++-------
 WebCore/bindings/js/kjs_proxy.cpp            |    2 +-
 WebCore/dom/Document.cpp                     |    4 ++--
 WebCore/html/HTMLFormElement.cpp             |    2 +-
 WebCore/platform/text/AtomicString.cpp       |    4 ++--
 WebCore/platform/text/String.cpp             |    8 ++++----
 WebCore/platform/text/TextCodecICU.cpp       |    2 +-
 13 files changed, 31 insertions(+), 30 deletions(-)
Comment 15 Eric Seidel (no email) 2008-01-28 19:01:13 PST
Created attachment 18754 [details]
Combined patch for darin

 JavaScriptCore/API/JSStringRef.cpp           |    8 ++--
 JavaScriptCore/API/JSStringRefCF.cpp         |    2 +-
 JavaScriptCore/JavaScriptCore.exp            |   10 ++--
 JavaScriptCore/bindings/c/c_utility.cpp      |    2 +-
 JavaScriptCore/bindings/objc/objc_utility.mm |    2 +-
 JavaScriptCore/kjs/Parser.h                  |    2 -
 JavaScriptCore/kjs/function.cpp              |   32 +++++++-------
 JavaScriptCore/kjs/function_object.cpp       |    4 +-
 JavaScriptCore/kjs/identifier.cpp            |    8 ++--
 JavaScriptCore/kjs/interpreter.h             |    2 -
 JavaScriptCore/kjs/lexer.cpp                 |   29 ++++++------
 JavaScriptCore/kjs/lookup.cpp                |    2 +-
 JavaScriptCore/kjs/nodes2string.cpp          |    5 +-
 JavaScriptCore/kjs/string_object.cpp         |   14 +++---
 JavaScriptCore/kjs/ustring.cpp               |   40 ++++++++--------
 JavaScriptCore/kjs/ustring.h                 |   62 +-------------------------
 WebCore/bindings/js/kjs_proxy.cpp            |    2 +-
 WebCore/bindings/js/kjs_window.cpp           |    4 +-
 WebCore/dom/Document.cpp                     |    4 +-
 WebCore/html/HTMLFormElement.cpp             |    2 +-
 WebCore/platform/DeprecatedString.cpp        |    4 +-
 WebCore/platform/text/AtomicString.cpp       |    4 +-
 WebCore/platform/text/String.cpp             |    8 ++--
 WebCore/platform/text/TextCodecICU.cpp       |    2 +-
 24 files changed, 97 insertions(+), 157 deletions(-)
Comment 16 Darin Adler 2008-01-29 12:56:05 PST
Comment on attachment 18754 [details]
Combined patch for darin

 552       if (charLen && (u == 0 || u >= 128 || !strchr(do_not_unescape, static_cast<unsigned char>(u)))) {

There's no need for a cast here. strchr takes an int and the earlier code on the line already checks that it's a value in the range 1-127, so a normal conversion to int will do.

 837             if (Lexer::isHexDigit((c + 2)[0]) && Lexer::isHexDigit((c + 3)[0]) && Lexer::isHexDigit((c + 4)[0]) && Lexer::isHexDigit((c + 5)[0])) {
 838                 u = Lexer::convertUnicode((c + 2)[0], (c + 3)[0], (c + 4)[0], (c + 5)[0]);

Really should be c[2] instead of (c + 2)[0]. The [0] is a good trick to do this with a global replace, but it's a little ugly.

 116             d[i] = static_cast<unsigned char>(c[i]); // static_cast to unsigned char to avoid erroneous sign-extend

The comment says the sign extend here would be "erroneous", but really I think that this function is only called on pure ASCII strings. While changing 128-255 into U+0080 to U+00FF is arguably better than changing them to U+FF80 to U+FFFF, I think what's really going on here is that it's never going to matter. I think a better comment would simply state that we use unsigned char because we want to zero-extend rather than sign extending. Note that cross-platform, plain old "char" can be *either* unsigned or signed.

 97 void Lexer::setCode(int startingLineNumber, const UChar *c, unsigned int len)

Surprised you didn't move the star.

266212     UString &append(const char *);
267213     UString &append(unsigned short);
268214     UString &append(char c) { return append(static_cast<unsigned short>(static_cast<unsigned char>(c))); }
269      UString &append(UChar c) { return append(c.uc); }

I think we need the append(unsigned short) to be changed to be an append(UChar), because on Windows those two types aren't the same. As-is I think this might break the Windows build.

r=me on this as-is, but please make sure you don't break the Windows build
Comment 17 Mark Rowe (bdash) 2008-02-29 14:23:09 PST
Eric, this patch has been sitting around in the commit queue for over a month now.  Do you intend to land it?
Comment 18 Eric Seidel (no email) 2008-02-29 14:35:42 PST
Eric has been a busy little boy @ work.  However my big project ended on Wed, so I'll look at making time to land this this afternoon. :)
Comment 19 Eric Seidel (no email) 2008-03-04 16:50:02 PST
Created attachment 19533 [details]
Fix patch to compile on TOT, change ::UChar -> UChar

 JavaScriptCore/kjs/interpreter.h                   |    2 ++
 JavaScriptCore/pcre/pcre.h                         |    4 ++--
 .../bindings/js/JSCSSStyleDeclarationCustom.cpp    |   12 ++++++------
 WebCore/bindings/js/JSSVGPODTypeWrapper.h          |    2 +-
 WebCore/svg/SVGAnimatedTemplate.h                  |    2 +-
 5 files changed, 12 insertions(+), 10 deletions(-)
Comment 20 Eric Seidel (no email) 2008-03-04 16:50:05 PST
Created attachment 19534 [details]
Fix per darin's suggestions

 JavaScriptCore/bindings/c/c_utility.cpp   |    6 +++---
 JavaScriptCore/kjs/function.cpp           |   12 ++++++------
 JavaScriptCore/kjs/identifier.cpp         |    2 +-
 JavaScriptCore/kjs/lexer.cpp              |    2 +-
 JavaScriptCore/kjs/nodes2string.cpp       |    2 +-
 JavaScriptCore/kjs/regexp.cpp             |    6 +++---
 JavaScriptCore/kjs/string_object.cpp      |   24 ++++++++++++------------
 JavaScriptCore/kjs/ustring.cpp            |   20 ++++++++++----------
 JavaScriptCore/kjs/ustring.h              |   12 ++++++------
 WebCore/bindings/js/JSSVGPODTypeWrapper.h |    2 +-
 WebCore/svg/SVGAnimatedTemplate.h         |    2 +-
 11 files changed, 45 insertions(+), 45 deletions(-)
Comment 21 Eric Seidel (no email) 2008-03-04 16:56:00 PST
Created attachment 19535 [details]
Updated combined patch for removing KJS::UChar

 JavaScriptCore/API/JSStringRef.cpp                 |    8 +-
 JavaScriptCore/API/JSStringRefCF.cpp               |    2 +-
 JavaScriptCore/JavaScriptCore.exp                  |   10 ++--
 JavaScriptCore/bindings/c/c_utility.cpp            |    8 +-
 JavaScriptCore/bindings/objc/objc_utility.mm       |    2 +-
 JavaScriptCore/kjs/Parser.h                        |    2 -
 JavaScriptCore/kjs/function.cpp                    |   34 +++++-----
 JavaScriptCore/kjs/function_object.cpp             |    4 +-
 JavaScriptCore/kjs/identifier.cpp                  |    8 +-
 JavaScriptCore/kjs/interpreter.h                   |    4 +-
 JavaScriptCore/kjs/lexer.cpp                       |   29 ++++----
 JavaScriptCore/kjs/lookup.cpp                      |    2 +-
 JavaScriptCore/kjs/nodes2string.cpp                |    5 +-
 JavaScriptCore/kjs/regexp.cpp                      |    6 +-
 JavaScriptCore/kjs/string_object.cpp               |   38 +++++-----
 JavaScriptCore/kjs/ustring.cpp                     |   48 ++++++------
 JavaScriptCore/kjs/ustring.h                       |   74 ++-----------------
 JavaScriptCore/pcre/pcre.h                         |    4 +-
 .../bindings/js/JSCSSStyleDeclarationCustom.cpp    |   12 ++--
 WebCore/bindings/js/JSSVGPODTypeWrapper.h          |    2 +-
 WebCore/bindings/js/kjs_proxy.cpp                  |    2 +-
 WebCore/bindings/js/kjs_window.cpp                 |    4 +-
 WebCore/dom/Document.cpp                           |    4 +-
 WebCore/platform/text/AtomicString.cpp             |    4 +-
 WebCore/platform/text/String.cpp                   |    8 +-
 WebCore/platform/text/TextCodecICU.cpp             |    2 +-
 WebCore/svg/SVGAnimatedTemplate.h                  |    2 +-
 27 files changed, 135 insertions(+), 193 deletions(-)
Comment 22 Darin Adler 2008-03-07 12:03:24 PST
Comment on attachment 19535 [details]
Updated combined patch for removing KJS::UChar

 213     UString& append(const UChar);

 807 UString& UString::append(const UChar c)

There's no reason for or value to having this "const" here.

r=me
Comment 23 Darin Adler 2008-03-07 12:09:59 PST
Comment on attachment 19533 [details]
Fix patch to compile on TOT, change ::UChar -> UChar

Clearing review flag on this, with the understanding that this is included in the combined patch.
Comment 24 Darin Adler 2008-03-07 12:10:27 PST
Comment on attachment 19534 [details]
Fix per darin's suggestions

Clearing review flag on this, with the understanding that this is included in the combined patch.
Comment 25 Darin Adler 2008-03-10 09:32:43 PDT
I tried to apply this and land it, but because it's a git patch and not a Subversion patch, I couldn't use the --merge option of svn-apply and there were a few conflicts. So I think Eric is going to have to land this himself.
Comment 26 Eric Seidel (no email) 2008-03-10 15:08:26 PDT
r30942