RESOLVED FIXED 74063
Improve charactersAreAllASCII() to compare multiple characters at a time
https://bugs.webkit.org/show_bug.cgi?id=74063
Summary Improve charactersAreAllASCII() to compare multiple characters at a time
Benjamin Poulain
Reported 2011-12-08 00:14:04 PST
We could load a word size of character at every iteration... Memory access would be more effective, and we would need less iterations in the loop to go over the string.
Attachments
Patch (2.71 KB, patch)
2011-12-08 00:42 PST, Benjamin Poulain
no flags
Patch (18.96 KB, patch)
2011-12-12 01:39 PST, Benjamin Poulain
no flags
Patch (19.01 KB, patch)
2011-12-12 13:54 PST, Benjamin Poulain
no flags
Patch (17.93 KB, patch)
2012-01-03 17:58 PST, Benjamin Poulain
no flags
Benjamin Poulain
Comment 1 2011-12-08 00:42:27 PST
Benjamin Poulain
Comment 2 2011-12-08 00:47:15 PST
This is more for fun than anything. I measured the gain of the function itself, it is a bit more than the expected 4x, 8x, 16x. I did not really look into it but I guess that is because of the reduced number of memory access. The function get a lot more complicated than the original, so I am not sure that is worth it for the project.
Early Warning System Bot
Comment 3 2011-12-08 02:32:56 PST
Darin Adler
Comment 4 2011-12-08 10:35:08 PST
Comment on attachment 118344 [details] Patch The functions in TextCodecASCIIFastPath.h provide a more readable way to do this type of optimization for single-byte characters, and ideally I’d like to see this share code with that. Maybe we can move those functions into WTF?
Benjamin Poulain
Comment 5 2011-12-08 10:48:30 PST
> The functions in TextCodecASCIIFastPath.h provide a more readable way to do this type of optimization for single-byte characters, and ideally I’d like to see this share code with that. Maybe we can move those functions into WTF? I was unaware the mask were available in TextCodecASCIIFastPath.h. I can have a look at moving the file to WTF.
Benjamin Poulain
Comment 6 2011-12-12 01:39:20 PST
Early Warning System Bot
Comment 7 2011-12-12 01:57:01 PST
Benjamin Poulain
Comment 8 2011-12-12 13:54:54 PST
Benjamin Poulain
Comment 9 2011-12-19 02:30:48 PST
Now that ASCIIType.h is no longer bloated, what do you think about merging the ASCIIFastPathHelper.h in this patch in ASCIIType?
Darin Adler
Comment 10 2011-12-19 11:31:38 PST
(In reply to comment #9) > Now that ASCIIType.h is no longer bloated, what do you think about merging the ASCIIFastPathHelper.h in this patch in ASCIIType? I don’t think this belongs in <wtf/ASCIICType.h>. The point of <ASCIICType.h> is to offer functions like the ones in <ctype.h>, but ones that work with ASCII. It should not have other functions in it. In fact, I believe some of the functions creeping in there do not belong there. I do think its’s good to have some functions for working with ASCII all in one header. It could even be named <wtf/ASCII.h>. I don’t like the word “helper” in a header name. Headers all help you!
Darin Adler
Comment 11 2011-12-19 11:38:07 PST
Comment on attachment 118838 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=118838&action=review Did you measure the performance impact of this? The code is faster for long strings but I suspect it’s slower for short ones. > Source/JavaScriptCore/wtf/text/ASCIIFastPathHelper.h:4 > + * Copyright (C) 2011 Benjamin Poulain <benjamin@webkit.org> If you are doing this work as an Apple employee you should not add your own copyright to the file. If you did the work when not an Apple employee that might be OK. > Source/JavaScriptCore/wtf/text/ASCIIFastPathHelper.h:46 > + > + We normally don’t use multiple blank lines to paragraph. We use just one. If you want a larger separator you’d normally use a comment. > Source/JavaScriptCore/wtf/text/ASCIIFastPathHelper.h:47 > +template<size_t size, typename CharType> struct NonASCIIMask; I’d prefer that “character type” not be abbreviated to CharType. How about CharacterType? I would like that in ASCIICType.h too. > Source/JavaScriptCore/wtf/text/ASCIIFastPathHelper.h:69 > +// Note: this function assume the input is likely all ASCII, and > +// do not leave early if it is not the case. The word “this” should be capitalized. The word “do” should be “does”. > Source/JavaScriptCore/wtf/text/ASCIIFastPathHelper.h:80 > + // Prologue: align the input. > + while (!isAlignedToMachineWord(characters) && (characters != end)) { > + allCharBits |= *characters; > + ++characters; > + } We may need to find a way to incorporate a fast path for buffers that are known to be aligned to a machine word that omits this first loop. For example, I suspect all string buffers for StringImpl have such a guarantee. I don’t think the parentheses are characters != end are needed. > Source/JavaScriptCore/wtf/text/ASCIIFastPathHelper.h:83 > + const CharType *wordEnd = alignToMachineWord(end); Misplaced * here.
Benjamin Poulain
Comment 12 2011-12-19 11:53:01 PST
ASCIIFastPath.h it is then :) > > Source/JavaScriptCore/wtf/text/ASCIIFastPathHelper.h:4 > > + * Copyright (C) 2011 Benjamin Poulain <benjamin@webkit.org> > > If you are doing this work as an Apple employee you should not add your own copyright to the file. If you did the work when not an Apple employee that might be OK. I use to have the distinction between work done...at work, and what I do on my weekends at home. I'll remove this.
Benjamin Poulain
Comment 13 2012-01-03 17:58:53 PST
Andreas Kling
Comment 14 2012-01-03 18:01:50 PST
Comment on attachment 121030 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=121030&action=review > Source/JavaScriptCore/wtf/text/ASCIIFastPath.h:2 > + * Copyright (C) 2011 Apple Inc. All rights reserved. Needs more 2012!
Benjamin Poulain
Comment 15 2012-01-03 18:03:41 PST
All the bots are red, I'll wait a bit before landing this.
WebKit Review Bot
Comment 16 2012-01-05 01:39:24 PST
Comment on attachment 121030 [details] Patch Clearing flags on attachment: 121030 Committed r104127: <http://trac.webkit.org/changeset/104127>
WebKit Review Bot
Comment 17 2012-01-05 01:39:30 PST
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.