WebKit Bugzilla
New
Browse
Search+
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
54446
Share the helper functions used by Latin-1 and UTF-8 text codecs
https://bugs.webkit.org/show_bug.cgi?id=54446
Summary
Share the helper functions used by Latin-1 and UTF-8 text codecs
Andreas Kling
Reported
2011-02-15 04:13:27 PST
The new TextCodecUTF8 is using the 0x80808080 trick from TextCodecLatin1. We should put the helper functions in a single place instead of duplicating it in both implementations.
Attachments
Proposed patch
(16.05 KB, patch)
2011-02-15 04:19 PST
,
Andreas Kling
darin
: review-
Details
Formatted Diff
Diff
Proposed patch v2
(16.33 KB, patch)
2011-02-15 09:14 PST
,
Andreas Kling
darin
: review+
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Andreas Kling
Comment 1
2011-02-15 04:19:15 PST
Created
attachment 82435
[details]
Proposed patch
Darin Adler
Comment 2
2011-02-15 08:52:57 PST
Comment on
attachment 82435
[details]
Proposed patch View in context:
https://bugs.webkit.org/attachment.cgi?id=82435&action=review
> Source/WebCore/ChangeLog:10 > + Move MachineWord, UCharByteFiller and the other helper functions > + into TextCodecHelpers.h where they can be used by both TextCodecUTF8 > + and TextCodecLatin1.
Very good idea--I should have done this--but I do not like the name “helpers” because it’s so vague. I would call this TextCodecASCIIFastPath.h or something along those lines.
> Source/WebCore/platform/text/TextCodecLatin1.cpp:130 > + if (*source < 0x80) {
Should use isASCII instead of hardcoding 0x80.
Darin Adler
Comment 3
2011-02-15 08:53:59 PST
Comment on
attachment 82435
[details]
Proposed patch View in context:
https://bugs.webkit.org/attachment.cgi?id=82435&action=review
> Source/WebCore/platform/text/TextCodecHelpers.h:34 > +// Assuming that a pointer is the size of a "machine word", then > +// uintptr_t is an integer type that is also a machine word. > +typedef uintptr_t MachineWord;
Oops, this header needs to be in the WebCore namespace, not the global namespace.
Andreas Kling
Comment 4
2011-02-15 09:14:46 PST
Created
attachment 82469
[details]
Proposed patch v2 Updated patch addressing all comments.
Darin Adler
Comment 5
2011-02-15 09:34:50 PST
Comment on
attachment 82469
[details]
Proposed patch v2 View in context:
https://bugs.webkit.org/attachment.cgi?id=82469&action=review
r=me -- also had some ideas for further cleanup
> Source/WebCore/platform/text/TextCodecLatin1.cpp:137 > - if (chunk & NonASCIIMask<sizeof(uintptr_t)>::value()) > + if (chunk & NonASCIIMask<sizeof(MachineWord)>::value()) > goto useLookupTable;
I realized we can make this even cleaner by defining a function in the header: inline MachineWord NonASCIIMachineWordMask() { return NonASCIIMask<sizeof(MachineWord)>::value(); } Or: inline bool isAllASCII(MachineWord word) { return word & NonASCIIMask<sizeof(MachineWord)>::value(); } And use one of these functions in the files. Then the template magic stays in the header.
> Source/WebCore/platform/text/TextCodecLatin1.cpp:139 > + UCharByteFiller<sizeof(MachineWord)>::copy(destination, source);
And here: inline void copyASCIIMachineWord(UChar* destination, const uint8_t* source) { UCharByteFiller<sizeof(MachineWord)>::copy(destination, source); } Or the type of source could be const MachineWord* and we could cast it. It’s nice to keep the template goop in the header.
Andreas Kling
Comment 6
2011-02-15 09:43:30 PST
(In reply to
comment #5
)
> (From update of
attachment 82469
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=82469&action=review
> > r=me -- also had some ideas for further cleanup
And good ideas they are, will incorporate them when landing.
> inline bool isAllASCII(MachineWord word) > { > return word & NonASCIIMask<sizeof(MachineWord)>::value(); > }
This should probably be called isNotAllASCII(). ;)
Darin Adler
Comment 7
2011-02-15 09:45:11 PST
(In reply to
comment #6
)
> And good ideas they are, will incorporate them when landing. > > > inline bool isAllASCII(MachineWord word) > > { > > return word & NonASCIIMask<sizeof(MachineWord)>::value(); > > } > > This should probably be called isNotAllASCII(). ;)
Or we could put a ! in the function, which is probably what I would do.
Andreas Kling
Comment 8
2011-02-15 10:22:07 PST
Committed
r78580
: <
http://trac.webkit.org/changeset/78580
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug