WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
216253
Derive big5, jis0208 and jis0212 tables from ICU
https://bugs.webkit.org/show_bug.cgi?id=216253
Summary
Derive big5, jis0208 and jis0212 tables from ICU
Alex Christensen
Reported
2020-09-07 15:06:02 PDT
Derive jis0208 and jis0212 tables from ICU
Attachments
Patch
(303.78 KB, patch)
2020-09-07 15:07 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(977.24 KB, patch)
2020-09-07 23:36 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Alex Christensen
Comment 1
2020-09-07 15:07:46 PDT
Created
attachment 408198
[details]
Patch
Darin Adler
Comment 2
2020-09-07 16:00:06 PDT
Comment on
attachment 408198
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=408198&action=review
r=me assuming all the tests pass
> Source/WebCore/platform/text/EncodingTables.cpp:1064 > + array = new std::array<std::pair<uint16_t, UChar>, 7724>();
No need for the parens here.
> Source/WebCore/platform/text/EncodingTables.cpp:1081 > + ucnv_toUnicode(icuConverter.get(), &output, output + 1, &input, input + sizeof(icuInput), nullptr, true, &error);
Are these 8836 separate calls to ucnv_toUnicode fast enough for our purposes?
> Source/WebCore/platform/text/EncodingTables.cpp:1887 > + ucnv_toUnicode(icuConverter.get(), &output, output + 1, &input, input + sizeof(icuInput), nullptr, true, &error);
Ditto.
> Source/WebCore/platform/text/EncodingTables.h:47 > -template<typename CollectionType> void sortByFirst(CollectionType&); > +template<typename CollectionType> void stableSortByFirst(CollectionType&);
I suggest keeping both around because std::sort is more efficient than std::stable_sort when they keys are all unique. We would use sortByFirst in any cases where the keys are unique, and it could even do the sortedFirstsAreUnique assertion. Removing the assertion would be an nice cleanup to the call sites.
Alex Christensen
Comment 3
2020-09-07 23:34:45 PDT
Comment on
attachment 408198
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=408198&action=review
> Source/WebCore/platform/text/TextCodecCJK.cpp:632 > + for (auto* pair = range.first; pair < range.second; pair++) {
It looks like I'll have to make pair auto instead of auto* because of MSVC's STL implementation.
Alex Christensen
Comment 4
2020-09-07 23:36:25 PDT
Created
attachment 408214
[details]
Patch
Alex Christensen
Comment 5
2020-09-07 23:47:33 PDT
I measured these time overheads in the complete table generation process: big5: 0.001075 seconds jis0208: 0.000405 seconds jis0212: 0.000417 seconds I also measured a 146240 byte decrease in WebCore's binary size with this change, which I think is worth it.
EWS
Comment 6
2020-09-08 10:51:09 PDT
Committed
r266729
: <
https://trac.webkit.org/changeset/266729
> All reviewed patches have been landed. Closing bug and clearing flags on
attachment 408214
[details]
.
Radar WebKit Bug Importer
Comment 7
2020-09-08 10:52:19 PDT
<
rdar://problem/68517504
>
Darin Adler
Comment 8
2020-09-08 10:53:11 PDT
Comment on
attachment 408214
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=408214&action=review
> Source/WebCore/platform/text/TextCodecCJK.cpp:126 > +using JIS0208EncodeIndex = std::array<std::pair<UChar, uint16_t>, sizeof(jis0208()) / sizeof(jis0208()[0])>;
Why this sizeof math instead of std::size?
> Source/WebCore/platform/text/TextCodecCJK.cpp:648 > +using EUCKREncodingIndex = std::array<std::pair<UChar, uint16_t>, sizeof(eucKR()) / sizeof(eucKR()[0])>;
Why this sizeof math instead of std::size?
Darin Adler
Comment 9
2020-09-08 10:55:34 PDT
Comment on
attachment 408214
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=408214&action=review
>> Source/WebCore/platform/text/TextCodecCJK.cpp:126 >> +using JIS0208EncodeIndex = std::array<std::pair<UChar, uint16_t>, sizeof(jis0208()) / sizeof(jis0208()[0])>; > > Why this sizeof math instead of std::size?
There must be some way to get the size based on the type rather than doing this math. I guess std::size is not it.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug