RESOLVED FIXED54446
Share the helper functions used by Latin-1 and UTF-8 text codecs
https://bugs.webkit.org/show_bug.cgi?id=54446
Summary Share the helper functions used by Latin-1 and UTF-8 text codecs
Andreas Kling
Reported 2011-02-15 04:13:27 PST
The new TextCodecUTF8 is using the 0x80808080 trick from TextCodecLatin1. We should put the helper functions in a single place instead of duplicating it in both implementations.
Attachments
Proposed patch (16.05 KB, patch)
2011-02-15 04:19 PST, Andreas Kling
darin: review-
Proposed patch v2 (16.33 KB, patch)
2011-02-15 09:14 PST, Andreas Kling
darin: review+
Andreas Kling
Comment 1 2011-02-15 04:19:15 PST
Created attachment 82435 [details] Proposed patch
Darin Adler
Comment 2 2011-02-15 08:52:57 PST
Comment on attachment 82435 [details] Proposed patch View in context: https://bugs.webkit.org/attachment.cgi?id=82435&action=review > Source/WebCore/ChangeLog:10 > + Move MachineWord, UCharByteFiller and the other helper functions > + into TextCodecHelpers.h where they can be used by both TextCodecUTF8 > + and TextCodecLatin1. Very good idea--I should have done this--but I do not like the name “helpers” because it’s so vague. I would call this TextCodecASCIIFastPath.h or something along those lines. > Source/WebCore/platform/text/TextCodecLatin1.cpp:130 > + if (*source < 0x80) { Should use isASCII instead of hardcoding 0x80.
Darin Adler
Comment 3 2011-02-15 08:53:59 PST
Comment on attachment 82435 [details] Proposed patch View in context: https://bugs.webkit.org/attachment.cgi?id=82435&action=review > Source/WebCore/platform/text/TextCodecHelpers.h:34 > +// Assuming that a pointer is the size of a "machine word", then > +// uintptr_t is an integer type that is also a machine word. > +typedef uintptr_t MachineWord; Oops, this header needs to be in the WebCore namespace, not the global namespace.
Andreas Kling
Comment 4 2011-02-15 09:14:46 PST
Created attachment 82469 [details] Proposed patch v2 Updated patch addressing all comments.
Darin Adler
Comment 5 2011-02-15 09:34:50 PST
Comment on attachment 82469 [details] Proposed patch v2 View in context: https://bugs.webkit.org/attachment.cgi?id=82469&action=review r=me -- also had some ideas for further cleanup > Source/WebCore/platform/text/TextCodecLatin1.cpp:137 > - if (chunk & NonASCIIMask<sizeof(uintptr_t)>::value()) > + if (chunk & NonASCIIMask<sizeof(MachineWord)>::value()) > goto useLookupTable; I realized we can make this even cleaner by defining a function in the header: inline MachineWord NonASCIIMachineWordMask() { return NonASCIIMask<sizeof(MachineWord)>::value(); } Or: inline bool isAllASCII(MachineWord word) { return word & NonASCIIMask<sizeof(MachineWord)>::value(); } And use one of these functions in the files. Then the template magic stays in the header. > Source/WebCore/platform/text/TextCodecLatin1.cpp:139 > + UCharByteFiller<sizeof(MachineWord)>::copy(destination, source); And here: inline void copyASCIIMachineWord(UChar* destination, const uint8_t* source) { UCharByteFiller<sizeof(MachineWord)>::copy(destination, source); } Or the type of source could be const MachineWord* and we could cast it. It’s nice to keep the template goop in the header.
Andreas Kling
Comment 6 2011-02-15 09:43:30 PST
(In reply to comment #5) > (From update of attachment 82469 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=82469&action=review > > r=me -- also had some ideas for further cleanup And good ideas they are, will incorporate them when landing. > inline bool isAllASCII(MachineWord word) > { > return word & NonASCIIMask<sizeof(MachineWord)>::value(); > } This should probably be called isNotAllASCII(). ;)
Darin Adler
Comment 7 2011-02-15 09:45:11 PST
(In reply to comment #6) > And good ideas they are, will incorporate them when landing. > > > inline bool isAllASCII(MachineWord word) > > { > > return word & NonASCIIMask<sizeof(MachineWord)>::value(); > > } > > This should probably be called isNotAllASCII(). ;) Or we could put a ! in the function, which is probably what I would do.
Andreas Kling
Comment 8 2011-02-15 10:22:07 PST
Note You need to log in before you can comment on or make changes to this bug.