Bug 27371 - [WINCE] Add WinCE specific platform/text files to WebCore
Summary: [WINCE] Add WinCE specific platform/text files to WebCore
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Platform (show other bugs)
Version: 528+ (Nightly build)
Hardware: Other OS X 10.5
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks: 23154 27715
  Show dependency treegraph
 
Reported: 2009-07-17 07:23 PDT by Adam Treat
Modified: 2009-12-04 17:48 PST (History)
7 users (show)

See Also:


Attachments
Adds platform/text/wince files to WebCore (163.21 KB, patch)
2009-07-17 07:43 PDT, Adam Treat
no flags Details | Formatted Diff | Diff
code clean-up (159.10 KB, patch)
2009-07-20 13:41 PDT, Yong Li
no flags Details | Formatted Diff | Diff
Latest version with latest cpplint cleanup (159.30 KB, patch)
2009-07-21 13:27 PDT, Adam Treat
eric: review-
Details | Formatted Diff | Diff
modified (32.03 KB, patch)
2009-08-14 15:05 PDT, Yong Li
eric: review-
Details | Formatted Diff | Diff
modified as Eric suggested (31.79 KB, patch)
2009-09-01 17:31 PDT, Yong Li
eric: review-
Details | Formatted Diff | Diff
modified again (31.54 KB, patch)
2009-09-04 17:45 PDT, Yong Li
abarth: review-
Details | Formatted Diff | Diff
TextBreakIterator (7.95 KB, patch)
2009-10-16 13:44 PDT, Yong Li
no flags Details | Formatted Diff | Diff
(2) TextBoundaries (3.36 KB, patch)
2009-10-16 13:45 PDT, Yong Li
no flags Details | Formatted Diff | Diff
(3) TextCodec (20.91 KB, patch)
2009-10-16 13:47 PDT, Yong Li
abarth: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Adam Treat 2009-07-17 07:23:52 PDT
The following patch contains the WinCE specific platform/text files for WebCore.
Comment 1 Adam Treat 2009-07-17 07:43:26 PDT
Created attachment 32939 [details]
Adds platform/text/wince files to WebCore

I've run them through cpplint and fixed them up by hand... fwiw.
Comment 2 Kwang Yul Seo 2009-07-17 08:23:58 PDT
Japanese and Thai are natively supported with libiconv while Chinese and Korean are not. Any reason for this?
Comment 3 Yong Li 2009-07-20 13:41:32 PDT
Created attachment 33103 [details]
code clean-up

modified for obvious coding style conflicts
Comment 4 Yong Li 2009-07-20 13:44:45 PDT
(In reply to comment #3)
> Created an attachment (id=33103) [details]
> code clean-up
> 
> modified for obvious coding style conflicts

I forgot to list 2 files in ChangeLog: TextCodecWince.h and Thai.cpp. But they are included in the patch.
Comment 5 Adam Treat 2009-07-21 13:27:34 PDT
Created attachment 33203 [details]
Latest version with latest cpplint cleanup

This version also has all licensing info included in the patch files.  They are all compatible.
Comment 6 Joe Mason 2009-07-27 07:30:23 PDT
We try to use the WinCE's os support as much as possible to avoid duplication.  These Japanese and Thai codecs aren't supported by the os so we have to implement them ourself.
Comment 7 Eric Seidel (no email) 2009-08-06 18:29:51 PDT
Comment on attachment 33203 [details]
Latest version with latest cpplint cleanup

This is too large for me to review.

Why are we adding:
platform/text/wince/Euc_jp.cpp
?

Now that you all have your magical UnicodeCE library, is this something that shoudl go in that?  Does the rest of WebCore want this Euc_jp.cpp file?  How is it generated, where does it come from?
Comment 8 Kwang Yul Seo 2009-08-07 04:17:54 PDT
Korean and Chinese codecs aren't supported on most devices too.

IMultiLanguage::EnumCodePages is the only to way to know what codecs are supported under WINCE. The list of WINCE OS supported codecs are different among devices.


I think there are three options here:

1) Remove Euc_JP and That and use only the OS supported codecs.

2) Use libiconv enitrely so that make sure most codecs are supported.

3) Make each codec as a plugin, so more codecs can be installed later.

I think the option 1 is good enough because other options can be done anyway by browser vendors anyway.
Comment 9 George Staikos 2009-08-07 05:36:01 PDT
We can move them out but the more we move out, the more external dependencies there will be in order to get it to build.  We have no desire to link in large libraries.  We only take what is necessary.
Comment 10 Yong Li 2009-08-07 07:03:48 PDT
(In reply to comment #8)
> Korean and Chinese codecs aren't supported on most devices too.
> 
> IMultiLanguage::EnumCodePages is the only to way to know what codecs are
> supported under WINCE. The list of WINCE OS supported codecs are different
> among devices.
> 
> 
> I think there are three options here:
> 
> 1) Remove Euc_JP and That and use only the OS supported codecs.
> 
> 2) Use libiconv enitrely so that make sure most codecs are supported.
> 
> 3) Make each codec as a plugin, so more codecs can be installed later.
> 
> I think the option 1 is good enough because other options can be done anyway by
> browser vendors anyway.

You do not only just want the codecs, but also want to display them with correct fonts. Is it helpful providing Korean and Chinese codecs without installing necessary fonts?

Most codecs of displayable languages are provided by OS and font provider. So we don't have to take care of them.

euc_jp is a special one. It's not because we only do this for Japanese language, but because euc_jp codec is not provided even by Windows Mobile Japanese version, neither by some 3rd-party Japanese language packs.

Thai is similar, but not that important, because most Thai websites use UTF-8.
Comment 11 Yong Li 2009-08-14 15:05:34 PDT
Created attachment 34875 [details]
modified

Thai.cpp and Euc_jp.cpp moved out
Comment 12 Eric Seidel (no email) 2009-08-24 17:26:20 PDT
Comment on attachment 34875 [details]
modified

This line is in the middle of nowhere:
 5  * All rights reserved.

Not true:
 2  * This file is part of the DOM implementation for KDE.

Why?
 246 #undef isSentenceStop

I'm not sure what this syntax does:
 59     friend LanguageManager& languageManager();

Why std::map instead of HashMap?
static std::map<UINT, std::string>& codePageCharsets()
 65 {
 66     static std::map<UINT, std::string> cc;
 67     return cc;
 68 }

Indent:
214 TextCodecWince::TextCodecWince(const TextEncoding& encoding)
 215 : m_encoding(encoding)

Comments are likely needed here:
 231     if (codePage > 50000) {
 232         if ((codePage >= 50220 && codePage <= 50222)
 233             || codePage == 50225
 234             || codePage == 50227
 235             || codePage == 50229
 236             || codePage == 52936
 237             || codePage == 54936
 238             || (codePage >= 57002 && codePage <= 57001)
 239             || codePage == 65000 // UTF-7

We generally use full english words for variable names:
 293                         int rtn = encMbToWc(dst, (const unsigned char*)bytes, srcEnd - bytes);
 294                         if (rtn >= 0)
(obviously we may have to adjust to avoid keywords here)

I dont' think explicit construction is needed here:
66     if (resultLength <= 0)
 467         return CString("?");

r-, mostly needing explanation as to why the std:: types, generally we only use std:: algorithms, not types in WebCore.  IIRC that's due to compiling without exceptions.
Comment 13 Yong Li 2009-08-24 19:50:45 PDT
(In reply to comment #12)

> I'm not sure what this syntax does:
>  59     friend LanguageManager& languageManager();

Only this function can construct the object.

Thanks for reviewing. Will upload a new patch soon.
Comment 14 Yong Li 2009-09-01 17:31:36 PDT
Created attachment 38902 [details]
modified as Eric suggested
Comment 15 Eric Seidel (no email) 2009-09-03 01:03:28 PDT
Comment on attachment 38902 [details]
modified as Eric suggested

Style:
 247     static WordBreakIterator *iterator = 0;

Also, these lines can be combined:
247     static WordBreakIterator *iterator = 0;
 248     if (!iterator)
 249         iterator = new WordBreakIterator;

static WordBreakIterator* iterator = new WordBreakIterator;

that will only run once on the function's first run.  Likewise you could just use:

static WordBreakIterator iterator; and return &iterator instead.  Although you might need to use the DEFINE_STATIC_LOCAL macro.

Please add a contructor for your BreakIterators sine they all seem to share the same initialization code.

Longer variable names are preferrd:
int textBreakFirst(TextBreakIterator* bi)
 298 {
 299     return bi->first();
 300 }

Why WebCore::?
 420         WebCore::decode(result, m_encoding.name(), bytes, length, &left, m_decodeBuffer.isEmpty(), sawInvalidChar);

Why L'' here?
 427         result.append(L'?');
Comment 16 Eric Seidel (no email) 2009-09-03 01:04:49 PDT
This code is large and complicated enough that it's difficutl to approve, so I expect we're going to go through this "you posting a patch and me picking out style nits" cycles for a while.  Would help to make it smaller, or to go over it yourself with a fine tooth comb to make sure that there is as little as possible code duplication and that it conforms as best as you know how to the style rules.
Comment 17 Yong Li 2009-09-03 19:51:02 PDT
> 
> Why WebCore::?
>  420         WebCore::decode(result, m_encoding.name(), bytes, length, &left,
> m_decodeBuffer.isEmpty(), sawInvalidChar);

because we want to use the static function but not the method. if there's no "WebCore::", compiler gives an error.

> 
> Why L'' here?
>  427         result.append(L'?');

L'' means it's wide char, the same type as UChar.
Comment 18 Yong Li 2009-09-04 17:45:42 PDT
Created attachment 39106 [details]
modified again
Comment 19 Adam Barth 2009-10-13 13:12:10 PDT
Comment on attachment 39106 [details]
modified again

I'm sorry, but this patch is too large to review properly.  Please create smaller patches.  Also, are there any tests for this code so we can see that it's correct?
Comment 20 Yong Li 2009-10-16 13:44:49 PDT
Created attachment 41318 [details]
TextBreakIterator
Comment 21 Yong Li 2009-10-16 13:45:44 PDT
Created attachment 41319 [details]
(2) TextBoundaries
Comment 22 Yong Li 2009-10-16 13:47:34 PDT
Created attachment 41321 [details]
(3) TextCodec
Comment 23 Yong Li 2009-10-16 14:06:10 PDT
(In reply to comment #19)
> (From update of attachment 39106 [details])
> I'm sorry, but this patch is too large to review properly.  Please create
> smaller patches.  Also, are there any tests for this code so we can see that
> it's correct?

I've separated it into 3 parts.

I don't think the wince port can build yet. As you can see, upstreaming is still in process.

But this code is used for Iris Browser at least in 1.1.9, and works well and stably.
Comment 24 Eric Seidel (no email) 2009-11-10 15:57:54 PST
Comment on attachment 41318 [details]
TextBreakIterator

Looks sane.
Comment 25 Eric Seidel (no email) 2009-11-10 15:58:18 PST
Comment on attachment 41319 [details]
(2) TextBoundaries

OK.
Comment 26 Eric Seidel (no email) 2009-11-10 15:59:46 PST
Comment on attachment 41321 [details]
(3) TextCodec

I am not really capable of reviewing this patch.
Comment 27 WebKit Commit Bot 2009-11-11 07:56:53 PST
Comment on attachment 41318 [details]
TextBreakIterator

Rejecting patch 41318 from commit-queue.

yong.li.webkit@gmail.com does not have committer permissions according to http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/modules/committers.py.
Comment 28 WebKit Commit Bot 2009-11-11 07:57:06 PST
Comment on attachment 41319 [details]
(2) TextBoundaries

Rejecting patch 41319 from commit-queue.

yong.li.webkit@gmail.com does not have committer permissions according to http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/modules/committers.py.
Comment 29 Eric Seidel (no email) 2009-11-11 08:18:56 PST
(In reply to comment #27)
> (From update of attachment 41318 [details])
> Rejecting patch 41318 from commit-queue.
> 
> yong.li.webkit@gmail.com does not have committer permissions according to
> http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/modules/committers.py.

It looks like you have more than one bugzilla account. :(  The email address field in the Committer constructor can take an array.  You should feel free to commit a patch which adds your second email address to Committer("Yong Li", ["yong.li@torchmobile.com", "yong.li.webkit@gmail.com"])
Comment 30 Eric Seidel (no email) 2009-11-11 10:40:36 PST
I've added your second bugzilla account to committers.py in:
http://trac.webkit.org/changeset/50821
Comment 31 Adam Barth 2009-11-14 22:07:07 PST
Checking style for patch 41321 from bug 27371.
Running check-webkit-style
WebCore/platform/text/wince/TextCodecWince.h:33:  Alphabetical sorting problem.  [build/include_order] [4]
WebCore/platform/text/wince/TextCodecWince.h:38:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
Done processing WebCore/platform/text/wince/TextCodecWince.h
WebCore/platform/text/wince/TextCodecWince.cpp:28:  Alphabetical sorting problem.  [build/include_order] [4]
Done processing WebCore/platform/text/wince/TextCodecWince.cpp
Total errors found: 3
Comment 32 WebKit Commit Bot 2009-11-18 18:30:15 PST
Comment on attachment 41318 [details]
TextBreakIterator

Clearing flags on attachment: 41318

Committed r51164: <http://trac.webkit.org/changeset/51164>
Comment 33 WebKit Commit Bot 2009-11-18 18:38:24 PST
Comment on attachment 41319 [details]
(2) TextBoundaries

Clearing flags on attachment: 41319

Committed r51165: <http://trac.webkit.org/changeset/51165>
Comment 34 Adam Barth 2009-11-30 12:14:49 PST
Attachment 41321 [details] did not pass style-queue:

Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
WebCore/platform/text/wince/TextCodecWince.h:33:  Alphabetical sorting problem.  [build/include_order] [4]
WebCore/platform/text/wince/TextCodecWince.h:38:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
Done processing WebCore/platform/text/wince/TextCodecWince.h
WebCore/platform/text/wince/TextCodecWince.cpp:28:  Alphabetical sorting problem.  [build/include_order] [4]
Done processing WebCore/platform/text/wince/TextCodecWince.cpp
Total errors found: 3
Comment 35 Adam Barth 2009-12-04 12:58:26 PST
Comment on attachment 41321 [details]
(3) TextCodec

This patch hasn't been touched in a month and has style problem.  Would you be willing to fix the style issues and re-post the patch, ideally on a new bug?
Comment 36 Yong Li 2009-12-04 13:05:32 PST
(In reply to comment #35)
> (From update of attachment 41321 [details])
> This patch hasn't been touched in a month and has style problem.  Would you be
> willing to fix the style issues and re-post the patch, ideally on a new bug?

Yes. I'll fix the style issues, and raise a new bug.
Comment 37 Adam Treat 2009-12-04 13:33:47 PST
(In reply to comment #36)
> (In reply to comment #35)
> > (From update of attachment 41321 [details] [details])
> > This patch hasn't been touched in a month and has style problem.  Would you be
> > willing to fix the style issues and re-post the patch, ideally on a new bug?
> 
> Yes. I'll fix the style issues, and raise a new bug.

You don't need to fix the namespace indentation issue.  The style guide has changed since the time you posted this for review.  As such it has been grandfathered in as far as I'm concerned.

This is why changing the style guide at whimsy is folly.
Comment 38 Adam Barth 2009-12-04 17:48:40 PST
> You don't need to fix the namespace indentation issue.

Yep, I agree.  Although, you could if you wanted to.  :)