WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
50804
[chromium] Reduce canvas.toDataURL("image/jpeg") run-time cost by 10%
https://bugs.webkit.org/show_bug.cgi?id=50804
Summary
[chromium] Reduce canvas.toDataURL("image/jpeg") run-time cost by 10%
noel gordon
Reported
2010-12-10 00:38:57 PST
Reduce canvas.toDataURL("image/jpeg") run-time cost by 10%
Attachments
patch
(2.46 KB, patch)
2010-12-10 00:58 PST
,
noel gordon
no flags
Details
Formatted Diff
Diff
patch
(2.45 KB, patch)
2010-12-10 01:06 PST
,
noel gordon
no flags
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
noel gordon
Comment 1
2010-12-10 00:58:34 PST
Created
attachment 76173
[details]
patch
WebKit Review Bot
Comment 2
2010-12-10 01:00:16 PST
Attachment 76173
[details]
did not pass style-queue: Failed to run "['WebKitTools/Scripts/check-webkit-style', '--diff-files', u'WebCore/ChangeLog', u'WebCore/platform/image-encoders/skia/JPEGImageEncoder.cpp']" exit_code: 1 WebCore/platform/image-encoders/skia/JPEGImageEncoder.cpp:86: One or more unexpected \r (^M) found; better to use only a \n [whitespace/carriage_return] [1] Suppressing further [whitespace/carriage_return] reports for this file. WebCore/platform/image-encoders/skia/JPEGImageEncoder.cpp:37: Alphabetical sorting problem. [build/include_order] [4] WebCore/platform/image-encoders/skia/JPEGImageEncoder.cpp:90: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] Total errors found: 12 in 2 files If any of these errors are false positives, please file a bug against check-webkit-style.
noel gordon
Comment 3
2010-12-10 01:06:04 PST
Created
attachment 76174
[details]
patch style: can't avoid the non-alphabetical include ordering nit.
noel gordon
Comment 4
2010-12-10 01:07:48 PST
so I expected to nitted again.
WebKit Review Bot
Comment 5
2010-12-10 01:08:49 PST
Attachment 76174
[details]
did not pass style-queue: Failed to run "['WebKitTools/Scripts/check-webkit-style', '--diff-files', u'WebCore/ChangeLog', u'WebCore/platform/image-encoders/skia/JPEGImageEncoder.cpp']" exit_code: 1 WebCore/platform/image-encoders/skia/JPEGImageEncoder.cpp:37: Alphabetical sorting problem. [build/include_order] [4] WebCore/platform/image-encoders/skia/JPEGImageEncoder.cpp:90: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] Total errors found: 2 in 2 files If any of these errors are false positives, please file a bug against check-webkit-style.
Eric Seidel (no email)
Comment 6
2010-12-10 01:38:23 PST
Comment on
attachment 76174
[details]
patch You should consider using webkit-patch upload to upload your patches so that they get marked properly as patch files.
Eric Seidel (no email)
Comment 7
2010-12-10 01:40:51 PST
Comment on
attachment 76174
[details]
patch View in context:
https://bugs.webkit.org/attachment.cgi?id=76174&action=review
It's a little surprising that this unroll makes things faster, but OK. Are you certain in the stability of your numbers?
> WebCore/platform/image-encoders/skia/JPEGImageEncoder.cpp:90 > + if ((alpha != 0) && (alpha != 255)) {
We generally use just "alpha" instead of "alpha != 0", but it might be OK here.
noel gordon
Comment 8
2010-12-12 17:40:43 PST
(In reply to
comment #7
)
> (From update of
attachment 76174
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=76174&action=review
> > It's a little surprising that this unroll makes things faster, but OK. Are you certain in the stability of your numbers?
Test image is 1800 x 1350 px JPEG. I load it into a <canvas> and then I do back-to-back canvas.toDataURL("image/jpeg") calls in js, logging the time as I go. My results are: Safari 5.0.2 win32 (7533.18.5): 12:15:45:062 extract image/jpeg 145 ms length: 667763 bytes 12:15:45:214 extract image/jpeg 146 ms length: 667763 bytes 12:15:45:366 extract image/jpeg 146 ms length: 667763 bytes 12:15:45:517 extract image/jpeg 145 ms length: 667763 bytes 12:15:45:668 extract image/jpeg 145 ms length: 667763 bytes 12:15:45:819 extract image/jpeg 144 ms length: 667763 bytes 12:15:45:970 extract image/jpeg 144 ms length: 667763 bytes 12:15:46:122 extract image/jpeg 145 ms length: 667763 bytes 12:15:46:273 extract image/jpeg 145 ms length: 667763 bytes 12:15:46:424 extract image/jpeg 145 ms length: 667763 bytes 12:15:46:575 extract image/jpeg 144 ms length: 667763 bytes Chrome ToT win32: 12:09:16:144 extract image/jpeg 129 ms length: 630179 bytes 12:09:16:301 extract image/jpeg 130 ms length: 630179 bytes 12:09:16:458 extract image/jpeg 130 ms length: 630179 bytes 12:09:16:617 extract image/jpeg 130 ms length: 630179 bytes 12:09:16:752 extract image/jpeg 130 ms length: 630179 bytes 12:09:16:909 extract image/jpeg 129 ms length: 630179 bytes 12:09:17:044 extract image/jpeg 129 ms length: 630179 bytes 12:09:17:203 extract image/jpeg 130 ms length: 630179 bytes 12:09:17:339 extract image/jpeg 130 ms length: 630179 bytes 12:09:17:474 extract image/jpeg 130 ms length: 630179 bytes 12:09:17:632 extract image/jpeg 130 ms length: 630179 bytes Chrome ToT win32 with this change: 12:13:14:649 extract image/jpeg 115 ms length: 630179 bytes 12:13:14:769 extract image/jpeg 115 ms length: 630179 bytes 12:13:14:891 extract image/jpeg 114 ms length: 630179 bytes 12:13:15:011 extract image/jpeg 114 ms length: 630179 bytes 12:13:15:131 extract image/jpeg 115 ms length: 630179 bytes 12:13:15:253 extract image/jpeg 115 ms length: 630179 bytes 12:13:15:373 extract image/jpeg 115 ms length: 630179 bytes 12:13:15:494 extract image/jpeg 115 ms length: 630179 bytes 12:13:15:617 extract image/jpeg 115 ms length: 630179 bytes 12:13:15:738 extract image/jpeg 115 ms length: 630179 bytes 12:13:15:857 extract image/jpeg 114 ms length: 630179 bytes worth it?
> > WebCore/platform/image-encoders/skia/JPEGImageEncoder.cpp:90 > > + if ((alpha != 0) && (alpha != 255)) { > > We generally use just "alpha" instead of "alpha != 0", but it might be OK here.
agree, OK here.
Eric Seidel (no email)
Comment 9
2010-12-13 00:20:34 PST
Comment on
attachment 76174
[details]
patch Those are definitely small numbers. :) But 10% is still 10%. Woud be even better if we had a benchmark for this sort of thing, but 115ms is pretty small already. :) WebCore/benchmarks is where such a benchmark would go.
WebKit Review Bot
Comment 10
2010-12-13 01:12:15 PST
Comment on
attachment 76174
[details]
patch Clearing flags on attachment: 76174 Committed
r73890
: <
http://trac.webkit.org/changeset/73890
>
WebKit Review Bot
Comment 11
2010-12-13 01:12:21 PST
All reviewed patches have been landed. Closing bug.
Stephen White
Comment 12
2010-12-13 07:46:02 PST
Comment on
attachment 76174
[details]
patch I'm fine with this change, but I'm surprised that putting the conditional in the loop doesn't make performance worse. Perhaps your tests are mostly alpha 255 or 0, and are benefiting disproportionately from branch prediction? You could also look at writing an int at a time to the output, rather than a char at a time. Of course, this would require unrolling 3X or so in order to write RGBR GBRG BRGB, kind of a pain, but usually nets a good speedup. If this continues to be a bottleneck, it might be worth looking at an SSE2 version as well.
noel gordon
Comment 13
2010-12-13 20:02:07 PST
> Those are definitely small numbers. :) But 10% is still 10%. Woud be even better if we had a benchmark for this sort of thing, but 115ms is pretty small already. :) WebCore/benchmarks is where such a benchmark would go.
Happy to take the 10%, we are about to get faster still. hbono is is working on
bug 50054
, which should make the encoding phase 3x faster.
noel gordon
Comment 14
2010-12-13 20:37:32 PST
(In reply to
comment #12
)
> I'm fine with this change, but I'm surprised that putting the conditional in the loop doesn't make performance worse. > Perhaps your tests are mostly alpha 255 or 0, and are benefiting disproportionately from branch prediction?
Apply the first statement to canvas.getImageData and canvas.putImageData :) But yes, I believe the else branch benefits from alpha 0|255 and that it also exposes the cost of the existing "lookup" table. This is not new idea; see the PNG ncoder implementation in the skia repository for example.
> You could also look at writing an int at a time to the output, rather than a char at a time. Of course, this would > require unrolling 3X or so in order to write RGBR GBRG BRGB, kind of a pain, but usually nets a good speedup.
Yes a pain. Would it be better to build premultipication/unpremultiplication tables?
http://mxr.mozilla.org/mozilla2.0/source/content/canvas/src/nsCanvasRenderingContext2D.cpp#3917
http://mxr.mozilla.org/mozilla2.0/source/content/canvas/src/nsCanvasRenderingContext2D.cpp#3996
> If this continues to be a bottleneck, it might be worth looking at an SSE2 version as well.
Indeed, hbono suggested exactly that approach, in particular that he could implement the (un)premultiplication via SSE or MMX. I'm awaiting the result of his
bug 50054
(I'm not synced to it yet) to measure the expected encoding phase speed-up (3x), and to consider if the unpremultiplication is a "bottleneck" worth more optimization ...
Hironori Bono
Comment 15
2010-12-14 18:21:02 PST
(In reply to
comment #13
)
> Happy to take the 10%, we are about to get faster still. hbono is is working > on
bug 50054
, which should make the encoding phase 3x faster.
This is just for your information. I have been integrating premultiply/unpremultiply operations into libjpeg-turbo so we can directly use skia pixels, i.e. we don't need Vector<JSAMPLE>. For those who are interested in it, I have uploaded a part of this change (only unpremultiply operations for SSE) to Rietveld: <
http://codereview.chromium.org/5862001/
>. Regards,
noel gordon
Comment 16
2010-12-14 21:44:02 PST
(In reply to
comment #14
)
> (In reply to
comment #12
) > > I'm fine with this change, but I'm surprised that putting the conditional in the loop doesn't make performance worse. > > Perhaps your tests are mostly alpha 255 or 0, and are benefiting disproportionately from branch prediction? > > Apply the first statement to canvas.getImageData and canvas.putImageData :)
correction, those are templated.
noel gordon
Comment 17
2010-12-14 23:59:43 PST
(In reply to
comment #15
) @hbono thanks for
http://codereview.chromium.org/5862001
, sounds good. I should note, be aware of
http://www.w3.org/Bugs/Public/show_bug.cgi?id=11431
noel gordon
Comment 18
2010-12-17 01:49:09 PST
http://www.w3.org/Bugs/Public/show_bug.cgi?id=11431
is easy to resolve - the spec is correct. Opened
bug 51237
to correct chromium. It's related to this bug, and we maintain the speedup mentioned here, but there's no need to unpremultiply RGB. See
bug 51237
for details.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug