Bug 33268

Summary: make JPEG image dcoder read segmented SharedBuffer
Product: WebKit Reporter: Yong Li <yong.li.webkit>
Component: ImagesAssignee: Nobody <webkit-unassigned>
Status: RESOLVED WONTFIX    
Severity: Normal CC: abarth, dglazkov, pkasting, skyul, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
the patch
none
fixed some style issues
none
fixed built error levin: review-

Description Yong Li 2010-01-06 13:52:49 PST
patch is coming
Comment 1 Yong Li 2010-01-06 13:58:41 PST
Created attachment 45993 [details]
the patch
Comment 2 WebKit Review Bot 2010-01-06 13:59:19 PST
Attachment 45993 [details] did not pass style-queue:

Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
WebCore/platform/image-decoders/jpeg/JPEGImageDecoder.cpp:148:  num_bytes is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebCore/platform/image-decoders/jpeg/JPEGImageDecoder.cpp:168:  Missing space before ( in while(  [whitespace/parens] [5]
Total errors found: 2
Comment 3 Peter Kasting 2010-01-06 14:13:52 PST
Comment on attachment 45993 [details]
the patch

Other than the style nits the bot pointed out, my comments:

> +unsigned SharedBuffer::extract(char* buffer, unsigned bufferLength, unsigned from)

"from" (and to some degree "buffer") are a little vague.  How about:

unsigned SharedBuffer::copyToContiguousBuffer(char* destBuffer, size_t bufferLength, size_t readOffset)

> +class SharedBufferDataWalker {

Not totally obvious what this class does, maybe you can write a brief comment above it?

> +        unsigned lastOffset = -1;
> +        Vector<unsigned char, 8192> buffer;

Is it a potential perf issue that we can decode at most 8K at a time?

> +        SharedBufferDataWalker segment;

"segment" is kind of odd as a name... why not "dataWalker" or something?

> +        while(ture) {

Won't compile (typo)

> +            if (readOffset == lastOffset && m_bufferLength == data.size())
> +                break;
> +
> +            unsigned segmentLength = 0;
> +            if (readOffset == lastOffset) {

Could just move the first conditional here inside the second conditional.

> +                // Need more data to proceed
> +                if (buffer.isEmpty()) {
> +                    segment.set(data, readOffset);
> +                    segmentLength = segment.length();
> +                    buffer.append(segment.data(), segmentLength);
> +                }
> +                segment.set(data, readOffset + buffer.size());
> +                segmentLength = segment.length();
> +                buffer.append(segment.data(), segmentLength);
> +                m_info.src->bytes_in_buffer += segmentLength;
> +                m_info.src->next_input_byte = (JOCTET*)buffer.data();
> +                m_bufferLength = readOffset + buffer.size();
> +            } else {
> +                if (!buffer.isEmpty())
> +                    buffer.clear();
> +
> +                lastOffset = readOffset;
> +
> +                segment.set(data, readOffset);
> +                segmentLength = segment.length();
> +
> +                unsigned totalSize = readOffset + segmentLength;
> +                m_info.src->bytes_in_buffer += totalSize - m_bufferLength;
> +                m_info.src->next_input_byte = (JOCTET*)segment.data();
> +
> +                // If we still have bytes to skip, try to skip those now.
> +                if (m_bytesToSkip)
> +                    skipBytes(m_bytesToSkip);
> +
> +                m_bufferLength = totalSize;
> +            }

I confess, I can't totally follow all this even after peering through it a few times.  There are too many different buffers and lengths.  Maybe some block comment somewhere explaining the algorithm, I dunno.
Comment 4 WebKit Review Bot 2010-01-06 14:15:12 PST
Attachment 45993 [details] did not build on chromium:
Build output: http://webkit-commit-queue.appspot.com/results/165045
Comment 5 Yong Li 2010-01-06 14:46:52 PST
(In reply to comment #3)
> (From update of attachment 45993 [details])
> Other than the style nits the bot pointed out, my comments:
> 
> > +unsigned SharedBuffer::extract(char* buffer, unsigned bufferLength, unsigned from)
> 
> "from" (and to some degree "buffer") are a little vague.  How about:
> 
> unsigned SharedBuffer::copyToContiguousBuffer(char* destBuffer, size_t
> bufferLength, size_t readOffset)

"readOffset" is good. was thinking "offset" is confusing.

> 
> > +class SharedBufferDataWalker {
> 
> Not totally obvious what this class does, maybe you can write a brief comment
> above it?
sure.
> 
> > +        unsigned lastOffset = -1;
> > +        Vector<unsigned char, 8192> buffer;
> 
> Is it a potential perf issue that we can decode at most 8K at a time?

A segment in SharedBuffer is 4k. in the case JPEG reader needs 2 segments to continue moving, 8k is a good number. The vector can grow when 8k is not enough. but I think 8k is better than 0 for performance (no need to call malloc in most cases)

> 
> > +        SharedBufferDataWalker segment;
> 
> "segment" is kind of odd as a name... why not "dataWalker" or something?
ok.
> 
> > +        while(ture) {
> 
> Won't compile (typo)
found this too. just replaced for(;;) with it before making the patch
> 
> > +            if (readOffset == lastOffset && m_bufferLength == data.size())
> > +                break;
> > +
> > +            unsigned segmentLength = 0;
> > +            if (readOffset == lastOffset) {
> 
> Could just move the first conditional here inside the second conditional.
yes.
> 
> > +                // Need more data to proceed
> > +                if (buffer.isEmpty()) {
> > +                    segment.set(data, readOffset);
> > +                    segmentLength = segment.length();
> > +                    buffer.append(segment.data(), segmentLength);
> > +                }
> > +                segment.set(data, readOffset + buffer.size());
> > +                segmentLength = segment.length();
> > +                buffer.append(segment.data(), segmentLength);
> > +                m_info.src->bytes_in_buffer += segmentLength;
> > +                m_info.src->next_input_byte = (JOCTET*)buffer.data();
> > +                m_bufferLength = readOffset + buffer.size();
> > +            } else {
> > +                if (!buffer.isEmpty())
> > +                    buffer.clear();
> > +
> > +                lastOffset = readOffset;
> > +
> > +                segment.set(data, readOffset);
> > +                segmentLength = segment.length();
> > +
> > +                unsigned totalSize = readOffset + segmentLength;
> > +                m_info.src->bytes_in_buffer += totalSize - m_bufferLength;
> > +                m_info.src->next_input_byte = (JOCTET*)segment.data();
> > +
> > +                // If we still have bytes to skip, try to skip those now.
> > +                if (m_bytesToSkip)
> > +                    skipBytes(m_bytesToSkip);
> > +
> > +                m_bufferLength = totalSize;
> > +            }
> 
> I confess, I can't totally follow all this even after peering through it a few
> times.  There are too many different buffers and lengths.  Maybe some block
> comment somewhere explaining the algorithm, I dunno.

sometimes the low level decoding doesn't move on unless you feed it more data in consecutive memory space. "buffer" is there for this purpose.
Comment 6 Yong Li 2010-01-06 14:47:22 PST
Comment on attachment 45993 [details]
the patch

will upload new patch. cleared the flag
Comment 7 Peter Kasting 2010-01-06 14:52:29 PST
(In reply to comment #5)
> (In reply to comment #3)
> > (From update of attachment 45993 [details] [details])
> > > +        Vector<unsigned char, 8192> buffer;
> > 
> > Is it a potential perf issue that we can decode at most 8K at a time?
> 
> A segment in SharedBuffer is 4k. in the case JPEG reader needs 2 segments to
> continue moving, 8k is a good number. The vector can grow when 8k is not
> enough. but I think 8k is better than 0 for performance (no need to call malloc
> in most cases)

Oh, I meant in comparison to the pre-segmented-SharedBuffer case where we could perhaps decode the entire image in one shot.

> sometimes the low level decoding doesn't move on unless you feed it more data
> in consecutive memory space. "buffer" is there for this purpose.

Yeah, I got the general idea that we were copying segments out to a contiguous buffer to then feed to the decoder.  I was trying to understand the details to double-check if there were any errors/edge cases/etc. but I kept getting lost in how many different offsets we were tracking.
Comment 8 Yong Li 2010-01-06 15:02:13 PST
(In reply to comment #7)
> (In reply to comment #5)
> > (In reply to comment #3)
> > > (From update of attachment 45993 [details] [details] [details])
> > > > +        Vector<unsigned char, 8192> buffer;
> > > 
> > > Is it a potential perf issue that we can decode at most 8K at a time?
> > 
> > A segment in SharedBuffer is 4k. in the case JPEG reader needs 2 segments to
> > continue moving, 8k is a good number. The vector can grow when 8k is not
> > enough. but I think 8k is better than 0 for performance (no need to call malloc
> > in most cases)
> 
> Oh, I meant in comparison to the pre-segmented-SharedBuffer case where we could
> perhaps decode the entire image in one shot.

in that case "buffer" is not used at all. also "segmenting" can be easy to be disabled in SharedBuffer::append(). probably I should add #if ENABLE() there
> 
> > sometimes the low level decoding doesn't move on unless you feed it more data
> > in consecutive memory space. "buffer" is there for this purpose.
> 
> Yeah, I got the general idea that we were copying segments out to a contiguous
> buffer to then feed to the decoder.  I was trying to understand the details to
> double-check if there were any errors/edge cases/etc. but I kept getting lost
> in how many different offsets we were tracking.

life is not easy :) 

BTW, this algorithm has been running for a long while in iris browser winmob. (I believe you've seen this in one of wince port patches)
Comment 9 Yong Li 2010-01-06 15:08:05 PST
(In reply to comment #8)
> (In reply to comment #7)
> > (In reply to comment #5)
> > > (In reply to comment #3)
> > > > (From update of attachment 45993 [details] [details] [details] [details])

> > Oh, I meant in comparison to the pre-segmented-SharedBuffer case where we could
> > perhaps decode the entire image in one shot.
> 
> in that case "buffer" is not used at all. also "segmenting" can be easy to be
> disabled in SharedBuffer::append(). probably I should add #if ENABLE() there
> > 

probably I misunderstood you again. with internally segmented SharedBuffer, we don't need to maintain a big chunk of memory for the resource data. for perf, we don't have to move all bytes to a bigger buffer when receiving data from network. When decoding the image, perf could be a little bit worse.
Comment 10 Yong Li 2010-01-07 07:48:45 PST
Created attachment 46053 [details]
fixed some style issues
Comment 11 WebKit Review Bot 2010-01-07 07:52:46 PST
Attachment 46053 [details] did not pass style-queue:

Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
WebCore/platform/image-decoders/jpeg/JPEGImageDecoder.cpp:80:  skip_input_data is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebCore/platform/image-decoders/jpeg/JPEGImageDecoder.cpp:433:  skip_input_data is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
Total errors found: 2
Comment 12 Yong Li 2010-01-07 07:55:07 PST
(In reply to comment #11)
> Attachment 46053 [details] did not pass style-queue:
> 
> Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
> WebCore/platform/image-decoders/jpeg/JPEGImageDecoder.cpp:80:  skip_input_data
> is incorrectly named. Don't use underscores in your identifier names. 
> [readability/naming] [4]
> WebCore/platform/image-decoders/jpeg/JPEGImageDecoder.cpp:433:  skip_input_data
> is incorrectly named. Don't use underscores in your identifier names. 
> [readability/naming] [4]
> Total errors found: 2

should I fix all existing style issues in the file?
Comment 13 WebKit Review Bot 2010-01-07 07:55:23 PST
Attachment 46053 [details] did not build on chromium:
Build output: http://webkit-commit-queue.appspot.com/results/166405
Comment 14 Yong Li 2010-01-07 08:06:03 PST
Comment on attachment 46053 [details]
fixed some style issues

need to fix another error
Comment 15 Adam Barth 2010-01-07 11:16:33 PST
> should I fix all existing style issues in the file?

I'm not sure about this file in specific, but the GIF image decoder, for example, is not in WebKit style on purpose.
Comment 16 Yong Li 2010-01-07 11:18:23 PST
Created attachment 46066 [details]
fixed built error
Comment 17 Peter Kasting 2010-01-07 11:19:39 PST
(In reply to comment #15)
> > should I fix all existing style issues in the file?
> 
> I'm not sure about this file in specific, but the GIF image decoder, for
> example, is not in WebKit style on purpose.

Well, the GIF image "reader" (I'm using the names the files give themselves) is not, because the whole file was copied.  The "decoder" portion is.

In the JPEG/PNG cases, the decoders should in theory be in webkit style, except that in some cases we use names that match the libjpeg/libpng interfaces for clarity, even though it violates WK style.

If there are more style problems than those, it'd be nice to fix them, although if it's very many, I'm not sure it's fair to Yong to mandate fixing the whole file when it's touched.
Comment 18 WebKit Review Bot 2010-01-07 11:20:27 PST
Attachment 46066 [details] did not pass style-queue:

Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
WebCore/platform/image-decoders/jpeg/JPEGImageDecoder.cpp:80:  skip_input_data is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebCore/platform/image-decoders/jpeg/JPEGImageDecoder.cpp:433:  skip_input_data is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
Total errors found: 2
Comment 19 Yong Li 2010-01-07 11:20:47 PST
(In reply to comment #15)
> > should I fix all existing style issues in the file?
> 
> I'm not sure about this file in specific, but the GIF image decoder, for
> example, is not in WebKit style on purpose.

I changed argument "num_bytes" to "numBytes", but then more style errors show up. Seem those functions are following libjpeg style
Comment 20 Yong Li 2010-01-07 11:22:13 PST
(In reply to comment #17)
> (In reply to comment #15)
> > > should I fix all existing style issues in the file?
> > 
> > I'm not sure about this file in specific, but the GIF image decoder, for
> > example, is not in WebKit style on purpose.
> 
> Well, the GIF image "reader" (I'm using the names the files give themselves) is
> not, because the whole file was copied.  The "decoder" portion is.
> 
> In the JPEG/PNG cases, the decoders should in theory be in webkit style, except
> that in some cases we use names that match the libjpeg/libpng interfaces for
> clarity, even though it violates WK style.
> 
> If there are more style problems than those, it'd be nice to fix them, although
> if it's very many, I'm not sure it's fair to Yong to mandate fixing the whole
> file when it's touched.

probably should be fixed in another bug
Comment 21 David Levin 2010-02-05 10:59:53 PST
Comment on attachment 46066 [details]
fixed built error

I gave this a quick once over since it was sitting in the queue for a while, but it was surface level and needs a more in depth review in the future.

r- for lack of tests/no indication of how the code is tested.


> diff --git a/WebCore/ChangeLog b/WebCore/ChangeLog
> +        Reviewed by NOBODY (OOPS!).
> +
> +        Make JPEGImageDecoder read segmented SharedBuffer
> +        Also move function copyFromSharedBuffer() to SharedBuffer
> +        and rename to extract().
> +        https://bugs.webkit.org/show_bug.cgi?id=33268
> +

This makes no mention of tests. How is this tested? Do new tests need to be added?


> +class SharedBufferDataWalker {
> +public:
> +    SharedBufferDataWalker() : m_buffer(0), m_length(0), m_data(0) {}

This isn't the style for initializing member variables (and then it would be good to put the braces on separate lines).

> +    // Returns the current pointer

Add a period to the end of the comment.


> diff --git a/WebCore/platform/image-decoders/jpeg/JPEGImageDecoder.cpp b/WebCore/platform/image-decoders/jpeg/JPEGImageDecoder.cpp
> + * Copyright (C) Research In Motion Limited 2009-2010. All rights reserved.

WebKit encourages uses comma delimited years instead of ranges.

> +void skip_input_data(j_decompress_ptr jd, long numBytes);

It looks like the parameter name "jd" adds no information so it should be removed.

> +        long bytesToSkip = std::min(numBytes, (long)src->pub.bytes_in_buffer);

Use C++ casting instead of C style casting.

> +        if (numBytes > bytesToSkip)
> +            m_bytesToSkip = (size_t)(numBytes - bytesToSkip);

Use C++ casting instead of C style casting.

> +
> +            if (readOffset == lastOffset) {
> +                if (m_bufferLength == data.size())
> +                    break;
> +
> +                // Need more contiguous data to proceed

Please add periods to ends of sentences.

> +                m_info.src->bytes_in_buffer += segmentLength;
> +                m_info.src->next_input_byte = (JOCTET*)contiguousBuffer.data();

Use C++ casting instead of C style casting.

> +                unsigned totalSize = readOffset + segmentLength;
> +                m_info.src->bytes_in_buffer += totalSize - m_bufferLength;
> +                m_info.src->next_input_byte = (JOCTET*)dataWalker.data();

Use C++ casting instead of C style casting.
Comment 22 Yong Li 2010-09-23 13:19:35 PDT
Not sure this is still worth doing. Close it for now.