Bug 35362 - Parsing of CSS rgb() values is slow
Summary: Parsing of CSS rgb() values is slow
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Other OS X 10.5
: P2 Normal
Assignee: Stephen White
URL:
Keywords:
: 23110 (view as bug list)
Depends on:
Blocks: 23110
  Show dependency treegraph
 
Reported: 2010-02-24 14:42 PST by Stephen White
Modified: 2010-03-05 17:13 PST (History)
5 users (show)

See Also:


Attachments
First draft (2.15 KB, patch)
2010-02-24 15:02 PST, Stephen White
no flags Details | Formatted Diff | Diff
Patch (2.71 KB, patch)
2010-02-26 10:11 PST, Stephen White
no flags Details | Formatted Diff | Diff
Patch (2.76 KB, patch)
2010-02-26 11:03 PST, Stephen White
no flags Details | Formatted Diff | Diff
Patch (2.95 KB, patch)
2010-02-26 12:23 PST, Stephen White
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Stephen White 2010-02-24 14:42:09 PST
Parsing of CSS rgb() values is slow
Comment 1 Stephen White 2010-02-24 14:45:49 PST
When running the "Complex Graphics" test from the Peacekeeper suite, upwards of 25% of CPU time is taken parsing CSS rgb() values.  Implementing a simple ad-hoc parser (as is done for hex color values) would go a long way to improving performance on this test.
Comment 2 Stephen White 2010-02-24 15:02:55 PST
Created attachment 49447 [details]
First draft
Comment 3 Stephen White 2010-02-24 17:18:44 PST
Re: patch1:

This functionality could be moved up into the other flavor of parseColor() on line 278. The layout tests seem ok with it, but I wasn't sure if this is correct.
Comment 4 Simon Fraser (smfr) 2010-02-24 17:31:44 PST
*** Bug 23110 has been marked as a duplicate of this bug. ***
Comment 5 Simon Fraser (smfr) 2010-02-24 17:32:12 PST
You should just do rgba() as well.
Comment 6 Oliver Hunt 2010-02-24 17:45:02 PST
Comment on attachment 49447 [details]
First draft

String doesn't guarantee null termination, you should instead pass in a length parameter, then you can do:
const UChar* end = *str + length;
and the *c pieces become
c != end
Comment 7 Stephen White 2010-02-26 10:11:49 PST
Created attachment 49597 [details]
Patch
Comment 8 Stephen White 2010-02-26 10:15:04 PST
Ok, fixed.  Also changed to use isCSSWhitespace(), since it doesn't include vertical tab, and isASCIISpace() does.
Comment 9 Stephen White 2010-02-26 10:15:56 PST
(In reply to comment #5)
> You should just do rgba() as well.

Unfortunately, rgba() requires parsing of floats (for the alpha), which is a bit trickier than ints.  I'll give it a shot but I might end up doing another patch for it if that's ok.
Comment 10 Oliver Hunt 2010-02-26 10:20:20 PST
Comment on attachment 49597 [details]
Patch

I would be interested in what you get for perf numbers if you replace isCSSWhitespace with a simplified ' ' and '\t' only check -- on the basis we are wanting to get the maximum perf out of the normal case.

Also my reading of this seems to indicate that this will accept
rgb(1,1,1)<random text>
Comment 11 Oliver Hunt 2010-02-26 10:22:14 PST
(In reply to comment #9)
> (In reply to comment #5)
> > You should just do rgba() as well.
> 
> Unfortunately, rgba() requires parsing of floats (for the alpha), which is a
> bit trickier than ints.  I'll give it a shot but I might end up doing another
> patch for it if that's ok.

I think it's important to understand you don't need to parse all valid inputs, you only need the fast path to handle the common cases, eg. 0.[0-9]{1,5} say
Comment 12 Darin Adler 2010-02-26 10:58:22 PST
(In reply to comment #10)
> I would be interested in what you get for perf numbers if you replace
> isCSSWhitespace with a simplified ' ' and '\t' only check -- on the basis we
> are wanting to get the maximum perf out of the normal case.

I would think space only. I expect that very few cases have tabs.
Comment 13 Stephen White 2010-02-26 11:03:33 PST
Created attachment 49602 [details]
Patch
Comment 14 Stephen White 2010-02-26 11:05:36 PST
(In reply to comment #10)
> (From update of attachment 49597 [details])
> I would be interested in what you get for perf numbers if you replace
> isCSSWhitespace with a simplified ' ' and '\t' only check -- on the basis we
> are wanting to get the maximum perf out of the normal case.

Ok, I gave it a try.  The difference was negligible:  +1.75% one run, -1.8% the
next, so within the variance of this particular test, which is about +/-2%. 
(Note that this test is +52.7% faster overall with my fix).

> Also my reading of this seems to indicate that this will accept
> rgb(1,1,1)<random text>

Fixed.  Note that I'm not accepting trailing whitespace either, which I figure is the
common case (no whitespace after the paren).
Comment 15 Darin Adler 2010-02-26 11:38:51 PST
Comment on attachment 49602 [details]
Patch

> +static inline bool parseInt(const UChar** str, const UChar* end, UChar terminator, int* value)

For in/out arguments we normally use references rather than pointers. Thus "str" and "value" would typically be references.

> +    const UChar *c = *str;
> +    int v = 0;

The "*" here is in the wrong place, should be next to the UChar.

We normally use words to name local variables, not letters.

> +        const UChar* c = name.characters() + 4;

We normally use words to name local variables, not letters.

> +        int r = 0, g = 0, b = 0;

We normally define variables on separate lines rather than all on one line.

There's no need to initialize these variables. I suggest not doing so.

I'm going to say r=me, despite these concerns since they are all stylistic issues.
Comment 16 Stephen White 2010-02-26 12:06:48 PST
(In reply to comment #15)
> (From update of attachment 49602 [details])
> > +static inline bool parseInt(const UChar** str, const UChar* end, UChar terminator, int* value)
> 
> For in/out arguments we normally use references rather than pointers. Thus
> "str" and "value" would typically be references.
> 
> > +    const UChar *c = *str;
> > +    int v = 0;
> 
> The "*" here is in the wrong place, should be next to the UChar.
> 
> We normally use words to name local variables, not letters.
> 
> > +        const UChar* c = name.characters() + 4;
> 
> We normally use words to name local variables, not letters.
> 
> > +        int r = 0, g = 0, b = 0;
> 
> We normally define variables on separate lines rather than all on one line.
> 
> There's no need to initialize these variables. I suggest not doing so.
> 
> I'm going to say r=me, despite these concerns since they are all stylistic
> issues.

Fixed.

(I wasn't sure if you meant "this is ok, but fix these before landing", "fix and re-upload" or "can land as is", so I fixed and re-uploaded just to be safe).
Comment 17 Stephen White 2010-02-26 12:23:51 PST
Created attachment 49612 [details]
Patch
Comment 18 Stephen White 2010-02-26 13:44:59 PST
Committed r55308: <http://trac.webkit.org/changeset/55308>