WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
15907
Implement USE(QT4_UNICODE) for UTF8 -> UTF16 conversion
https://bugs.webkit.org/show_bug.cgi?id=15907
Summary
Implement USE(QT4_UNICODE) for UTF8 -> UTF16 conversion
Justin Haygood
Reported
2007-11-08 14:51:26 PST
UTF8->UTF16 conversion in c_utility.cpp doesn't have a Qt4 Unicode implementation
Attachments
implements qt4 unicode in c_utility.cpp
(976 bytes, patch)
2007-11-08 14:52 PST
,
Justin Haygood
no flags
Details
Formatted Diff
Diff
implements qt4 unicode in c_utility.cpp
(1.30 KB, patch)
2007-11-08 14:53 PST
,
Justin Haygood
mrowe
: review-
Details
Formatted Diff
Diff
implements Qt 4 Unicode path in c_utility.cpp
(2.07 KB, patch)
2007-11-08 19:47 PST
,
Justin Haygood
ap
: review-
Details
Formatted Diff
Diff
Changes implementation of convertutf8toutf16 to use unicode.org conversion function
(30.12 KB, patch)
2007-11-09 13:33 PST
,
Justin Haygood
staikos
: review-
Details
Formatted Diff
Diff
Show Obsolete
(3)
View All
Add attachment
proposed patch, testcase, etc.
Justin Haygood
Comment 1
2007-11-08 14:52:07 PST
Created
attachment 17128
[details]
implements qt4 unicode in c_utility.cpp
Justin Haygood
Comment 2
2007-11-08 14:53:20 PST
Created
attachment 17129
[details]
implements qt4 unicode in c_utility.cpp Ooops.. I need to add my copyright at the top :)
Mark Rowe (bdash)
Comment 3
2007-11-08 14:58:51 PST
Comment on
attachment 17129
[details]
implements qt4 unicode in c_utility.cpp The patch needs a ChangeLog entry, and the coding style should follow the style guidelines. I know that the ICU code above does not, but it'd be great if new code would. In particular, the use of C-style casts, and the one-line for statement with strange whitespace usage.
Justin Haygood
Comment 4
2007-11-08 19:47:09 PST
Created
attachment 17138
[details]
implements Qt 4 Unicode path in c_utility.cpp
Alexey Proskuryakov
Comment 5
2007-11-08 23:17:14 PST
Comment on
attachment 17138
[details]
implements Qt 4 Unicode path in c_utility.cpp + *UTF16Length = wcslen(utf16); This won't work correctly if there are null characters in the string, and won't even compile on platforms with a different definition of wchar_t (such as Mac OS X). I think it would be best to take code from <
http://www.unicode.org/Public/PROGRAMS/CVTUTF/ConvertUTF.c
>, possibly throwing away the ICU-based implementation. Also, see the Latin-1 fallback code path in ICU branch - I think we'd like to have it on all platforms.
Justin Haygood
Comment 6
2007-11-09 13:33:45 PST
Created
attachment 17161
[details]
Changes implementation of convertutf8toutf16 to use unicode.org conversion function Alright, changes implementation to use unicode.org sources. ap was iffy on adding the files in their entirety, but I don't see a problem with it.
Holger Freyther
Comment 7
2007-11-10 17:26:52 PST
Tossing my two cents. I care a lot for the binary size. So this adds 30k of code for something already available in the toolkit so please make it optional to use the built-in codec...
George Staikos
Comment 8
2007-11-10 17:42:20 PST
Comment on
attachment 17161
[details]
Changes implementation of convertutf8toutf16 to use unicode.org conversion function Agreed, this is adding a lot of code we just don't need in Qt afaik.
Alexey Proskuryakov
Comment 9
2007-11-11 02:21:25 PST
(In reply to
comment #7
)
> So this adds 30k of code for something already available in the toolkit so please make it optional > to use the built-in codec...
Hmm, where do these 30K come from? We only use a single tiny function (I assume you do have dead code stripping enabled). I don't think such a trivial operation warrants having platform-specific branches. I was actually considering making a custom UTF-8 decoder for WebCore, too - just as with Latin-1, it's bad for performance to initialize the whole codec registry just for those.
Alexey Proskuryakov
Comment 10
2007-11-11 11:15:40 PST
Darin suggests that it would be best to have UTF-8 decoding in one place, such as wtf/unicode. UString::UTF8String() can go there, too.
Alexey Proskuryakov
Comment 11
2007-11-12 05:15:53 PST
(In reply to
comment #10
)
> Darin suggests that it would be best to have UTF-8 decoding in one place, such > as wtf/unicode. UString::UTF8String() can go there, too.
Doing that in
bug 15953
. Keeping this bug open for port authors to consider adding platform-specific branches.
Simon Hausmann
Comment 12
2010-03-23 02:17:40 PDT
Closing this bug since
r30794
consolidates this into WebCore::String::fromUTF8, which uses the TextEncoding/TextEncoder infrastructure.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug