Full Text Bug Listing

Peter Kasting

Reported 2009-06-16 17:30:30 PDT

This bug was filed at eseidel's request to have a clean slate to discuss some of the issues originally raised on bug 26379. (The following is quite long, but there has been enough discussion on this issue that I want to try and be precise as to what's going on.) The most prominent issue to be considered here is that RGBA32Buffer's name, location, and implementation are all currently questionable. RGBA32Buffer used to be a Vector<unsigned> used by (to my knowledge) Cairo (i.e. Cairo/Win and GTK+) and wx. During Chromium development this was forked so the Skia backend could store pixel data directly in an SkBitmap. To unfork the Skia decoders on bug 25709 I made RGBA32Buffer be an API around a platform-dependent storage system, which at the moment is still a Vector<unsigned> for wx and Cairo but could conceivably be e.g. a wxBitmap or cairo_surface_t or whatever. Unfortunately, this means that "RGBA32" is a bit misleading, and ports that want to switch to the cross-platform image decoders need to do extra work. Additionally, it may or may not be weird to have code in image-decoders/ that's cross platform, when most such things are in graphics/. Furthermore, the ImageBuffer class has been noted as another "platform-specific low level pixel store", and as such, has at least some conceptual overlap with both the original RGBA32Buffer and (perhaps even more) the current version. As far as I can tell there is general agreement that these two classes should combine into one (perhaps named "ImageFrame" or similar). The actual storage inside this class is less finalized. Three proposals have been made: (1) Store everything as a Vector<unsigned>, provide functions either in this class or outside it (as friends or consumers or whatever) to adopt this pixel data into a platform-specific storage class. At least for Cairo and Skia this can probably be done without copying (although the resulting object must not be used past the life of the original pixel data, so care would need to be taken), maybe not for wx. This probably involves the least platform-specific code overall. This is closest to the old RGBA32Buffer implementation. (2) Store everything in platform-specific storage, but provide any APIs necessary to read/write data as needed by image decoders or other consumers. This has the fewest theoretical barriers to performance as ports can implement operations directly, although it's not clear that there are large practical differences; also, wx might not benefit much from this in cases where it lacks sufficient APIs to operate on its own native surfaces (see bug 26379 comment 51). This might be beneficial for special-purpose ports, e.g. a port on a memory-constrained device that wished to always downsample to 16bpp, since that port would never have to keep any frames in memory as 32bpp. This is basically what the current RGBA32Buffer implementation is, and perhaps ImageBuffer too (but I haven't looked). (3) Store things in platform-specific storage, but provide a default implementation using Vector<unsigned>, for ports which don't want or need anything else. This is something of a hybrid of the above options; whether it benefits anyone today probably depends on whether the Cairo or wx ports are interested in using platform-specific storage soon, while whether it benefits anyone in the future is hard to say. It's possible this would provide an easy "glide path" for future ports, although the set of APIs currently on RGBA32Buffer is small and simple enough that I don't think there is much practical difference. I personally lean toward option (2), but option (3) would not be hard to write. Of course, the exact details of the merged class, or even of merging with ImageBuffer at all, are subject to change pending a deeper understanding of ImageBuffer. There is other work to be done here as well. The ImageSource implementations should probably merge, partly or completely, into one; the RGBA32Buffer APIs should be changed to use more WebKit types; and in general the entire image subsytem needs to be better-understood to the point where a complete design doc can be made. Ideally the CG port can be included here too, so the demarcation of "CG versus other" can become clearer, and so the relevant classes can be designed to accommodate not only CG but other ports too. My (perhaps naive) ideal is that someday there will be mostly just shared, cross-platform code for boring bits like plumbing image frames between the decoders and the cache, coupled to a single set of simple classes that do the actual low-level per-platform storage. Right now things are much more forked than this.

Peter Kasting

Comment 1 2009-06-16 17:38:18 PDT

*** Bug 26379 has been marked as a duplicate of this bug. ***

Brent Fulgham

Comment 2 2009-06-16 21:42:18 PDT

It might be useful to consider bug 16028 during this work.

Brent Fulgham

Comment 3 2009-06-16 21:44:21 PDT

(In reply to comment #2) > It might be useful to consider bug 16028 during this work. Also, bug 16200 might have some relation to the various subtle memory issues Peter has alluded to in earlier discussion of this topic.

Peter Kasting

Comment 4 2009-06-16 22:29:31 PDT

Thanks for the references. I have tried to leave comments on both bugs. The idea of storing decoded GIF frames as <32bpp is interesting; I am not at all sure how to make it work. At some point, of course, we need to render to the screen, and even more, we need to be able to eventually get some sort of a platform-specific frame handle to pass around (NativeImagePtr); if the individual platforms can transparently implement both 32bpp and paletted images underneath this pointer, we could probably do this on a per-platform basis, but if we wanted to use a common storage implementation we'd be left with the question of how this paletted data gets converted to bits for display and who owns those bits. If I stuck with just the backend I know well (Skia) and tried to mock something up I might have a better idea how to do this and what the payoff would be (since we already discard memory aggressively for large images the benefit is tough to estimate, though probably nonzero). I suspect this is orthogonal enough to the work on this bug to leave in a separate bug but it's definitely worth a look at some point.

Yong Li

Comment 5 2009-06-17 08:27:24 PDT

We've already done this in WINCE port. We also think this can be shared by other platforms. But so far we haven't got ready to merge all our code. Take a look at this file: http://code.staikos.net/cgi-bin/gitweb.cgi?p=WebKit-CE;a=blob;f=WebCore/platform/graphics/wince/ImageFrameSink.h;h=60281851ae5ec64cfd6b701409e257339677cc28;hb=HEAD What we've done is: 1. Define ImageFrameSink interface, which the platform-independent interface, but some of its methods must be implemented for specific platform. ImageFrameSink can support 16bit, and can also write directly to platform-specific image storage buffer. 2. In WINCE port, we implement ImageFrameSink with another utility class "SharedBitmap". The buffer that image decoder directly writes is also directly used to render the image. It's very easy to implement ImageFrameSinkFoo.cpp with current RGB32Buffer. 3. We only support 16bit in 2 formats: RGB555 and RGB565, but it's determined by a macro switch. Images with alpha channel cannot use 16bit currently. Transparent GIF images with single transparent colors can also use 16bit if the transparent color doesn't conflict with other colors in the image after converting to 16bit. 4. We made changes in image decoders, and to make them work with ImageFrameSink and support 16bit. 5. We also support scaling huge images down to smaller size, and this is done on the fly of decoding, so there's no extra buffer-to-buffer scaling. Refer to ImageDecoder.h and ImageDecocer.cpp in our GIT repository. This feature is available when USE(IMAGEFRAMESINK) is true. 6. We also did some work for resource caching. We've implemented SharedBuffer in our way, and so that the buffer doesn't need to be flat Vector. Instead, it can be a series of segments. See the definition of OptimizedBuffer. And we make all image decoders (and other modules that uses text decoders) work with segmented source. This is currently only for WINCE & TORCHMOBILE. But we think it can be helpful for other platform, too. 7. Other things I cannot remember at this time. We hope to get our efforts merged to upstream as much as we can, and wish this can save your development time. Best regards, Yong Li Torch Mobile, Inc.

Adam Treat

Comment 6 2009-06-17 08:40:50 PDT

I think #3 is a good hybrid of the first two proposals. I'd like to see us go in that direction personally.

Holger Freyther

Comment 7 2009-06-17 09:14:06 PDT

(In reply to comment #6) > I think #3 is a good hybrid of the first two proposals. I'd like to see us go > in that direction personally. #3 is my favorite as well. Having a default would be great for things like cairo, Qt (when putting the result into XServer memory, meaning going from Vector<unsigned> to a XPixmap or GL texture), EFL (when using a smart_object) and allows wide use.

Peter Kasting

Comment 8 2009-06-17 09:45:53 PDT

Yong Li: Thanks for the pointers. I will definitely have to take a close look at what you've done here. Other folks should look too. Other replies: I don't have a problem with approach #3. If we start merging in some of the more complex WinCE work it may be that the "architecture-neutral" backend won't benefit from every optimization (not sure), but I wouldn't think it wouldactively block any work. At the moment I'm working on bug 26460, but that's really a blocker for merging the ImageSource definitions more than further work on the backing buffer, so if someone is keen to start looking at the WinCE changes it wouldn't be wasted effort.

Peter Kasting

Comment 9 2009-06-17 10:29:38 PDT

(In reply to comment #5) > 3. We only support 16bit in 2 formats: RGB555 and RGB565, but it's > determined by a macro switch. Images with alpha channel cannot use 16bit > currently. Transparent GIF images with single transparent colors can also > use 16bit if the transparent color doesn't conflict with other colors in the > image after converting to 16bit. I suspect if we implemented this for all platforms we might want a runtime switch instead of a macro. I think this is orthogonal to the idea of storing paletted (8bpp) data in the buffer but if it's not that's another design consideration here. > 5. We also support scaling huge images down to smaller size, and this is > done on the fly of decoding, so there's no extra buffer-to-buffer scaling. > Refer to ImageDecoder.h and ImageDecocer.cpp in our GIT repository. This > feature is available when USE(IMAGEFRAMESINK) is true. This one might be a macro when implemented for everyone. A port like Chromium would probably never want this, while a mobile port might always want this (?). > 6. We also did some work for resource caching. We've implemented > SharedBuffer in our way, and so that the buffer doesn't need to be flat > Vector. Instead, it can be a series of segments. See the definition of > OptimizedBuffer. And we make all image decoders (and other modules that uses > text decoders) work with segmented source. This is currently only for WINCE > & TORCHMOBILE. But we think it can be helpful for other platform, too. This is the most confusing bit to me. There's already an image cache in WebKit that can hang onto or discard decoded image data. And there's also code that throws away the unneeded decoded data from large animated GIFs. Is this work you've done separate from that, or integrated with it? I worry about having multiple isolated caching systems as a single system seems like it would make better decisions. I will have to look at precisely how you've implemented this.

Yong Li

Comment 10 2009-06-17 11:29:53 PDT

(In reply to comment #9) > (In reply to comment #5) > > 3. We only support 16bit in 2 formats: RGB555 and RGB565, but it's > > determined by a macro switch. Images with alpha channel cannot use 16bit > > currently. Transparent GIF images with single transparent colors can also > > use 16bit if the transparent color doesn't conflict with other colors in the > > image after converting to 16bit. > > I suspect if we implemented this for all platforms we might want a runtime > switch instead of a macro. > Making it runtime can allow us to choose the format that is most suitable on the target device at run time. But I'm afraid that could affect performance, unless we take care of all pixel walking code like this: case format A: for each pixels { } case format B: for each pixels { } ... > > I think this is orthogonal to the idea of storing paletted (8bpp) data in the > buffer but if it's not that's another design consideration here. > > > 5. We also support scaling huge images down to smaller size, and this is > > done on the fly of decoding, so there's no extra buffer-to-buffer scaling. > > Refer to ImageDecoder.h and ImageDecocer.cpp in our GIT repository. This > > feature is available when USE(IMAGEFRAMESINK) is true. > > This one might be a macro when implemented for everyone. A port like Chromium > would probably never want this, while a mobile port might always want this (?). > > > 6. We also did some work for resource caching. We've implemented > > SharedBuffer in our way, and so that the buffer doesn't need to be flat > > Vector. Instead, it can be a series of segments. See the definition of > > OptimizedBuffer. And we make all image decoders (and other modules that uses > > text decoders) work with segmented source. This is currently only for WINCE > > & TORCHMOBILE. But we think it can be helpful for other platform, too. > > This is the most confusing bit to me. There's already an image cache in WebKit > that can hang onto or discard decoded image data. And there's also code that > throws away the unneeded decoded data from large animated GIFs. Is this work > you've done separate from that, or integrated with it? I worry about having > multiple isolated caching systems as a single system seems like it would make > better decisions. I will have to look at precisely how you've implemented > this. > SharedBuffer is used for raw resource data (encoded data). The reason why we allow segmented buffer is that memory situation on mobile device is very bad, and allocating a big block of memory can fail. Also, it takes CPU time to resize the Vector. I guess JSC::SegmentedVector is a good utility class for implementing OptimizedBuffer easily.

Peter Kasting

Comment 11 2009-06-17 11:40:18 PDT

(In reply to comment #10) > Making it runtime can allow us to choose the format that is most suitable on > the target device at run time. But I'm afraid that could affect performance, > unless we take care of all pixel walking code like this: > case format A: > for each pixels { > } > case format B: > for each pixels { > } > ... I'm skeptical; we're already doing nontrivial work on every pixel, it's not clear to me that adding an if (or perhaps writing some very clever algorithmic code) will have a measurable perf impact. Obviously, testing would say for sure. > > > 6. We also did some work for resource caching. We've implemented > > > SharedBuffer in our way, and so that the buffer doesn't need to be flat > > > Vector. Instead, it can be a series of segments. See the definition of > > > OptimizedBuffer. And we make all image decoders (and other modules that uses > > > text decoders) work with segmented source. This is currently only for WINCE > > > & TORCHMOBILE. But we think it can be helpful for other platform, too. > > > > This is the most confusing bit to me. There's already an image cache in WebKit > > that can hang onto or discard decoded image data. And there's also code that > > throws away the unneeded decoded data from large animated GIFs. Is this work > > you've done separate from that, or integrated with it? I worry about having > > multiple isolated caching systems as a single system seems like it would make > > better decisions. I will have to look at precisely how you've implemented > > this. > > > SharedBuffer is used for raw resource data (encoded data). The reason why we > allow segmented buffer is that memory situation on mobile device is very bad, > and allocating a big block of memory can fail. Also, it takes CPU time to > resize the Vector. I guess JSC::SegmentedVector is a good utility class for > implementing OptimizedBuffer easily. OK, so basically you're trying to decode images where perhaps not even one frame can fit into memory at once? Although if that were the case I'm not sure how the full image would get drawn, unless the drawing loop got changed... maybe what you're saying is, allocating the memory as a single chunk fails (due to fragmentation) but will succeed when broken into smaller pieces?

Yong Li

Comment 12 2009-06-17 11:45:26 PDT

> > OK, so basically you're trying to decode images where perhaps not even one > frame can fit into memory at once? Although if that were the case I'm not sure > how the full image would get drawn, unless the drawing loop got changed... > maybe what you're saying is, allocating the memory as a single chunk fails (due > to fragmentation) but will succeed when broken into smaller pieces? > No, nothing to do with decoded data. It's just the source data usually stored in a CachedResource object, where image decoders read encoded data from.

Adam Treat

Comment 13 2009-06-17 12:02:47 PDT

At this point, we're starting to get off-topic with the Torch talk. I only wanted it posted here to keep everyone generally informed. For more specific questions and back and forth, might I suggest we talk over IRC about Torch's changes and how they might fit in.

Yong Li

Comment 14 2009-07-22 14:15:34 PDT

see bug 27561 The patch is used in our wince port

Peter Kasting

Comment 15 2010-10-19 15:20:13 PDT

Adam expressed some interest in this cleanup work.

Summary:	Cross-platform image system architecture cleanup metabug
Product:	WebKit	Reporter:	Peter Kasting <pkasting>
Component:	Images	Assignee:	Nobody <webkit-unassigned>
Status:	NEW
Severity:	Normal	CC:	abarth, bfulgham, emacemac7, eric, hyatt, kevino, manyoso, skyul, staikos, yong.li.webkit, zecke
Priority:	P2
Version:	528+ (Nightly build)
Hardware:	PC
OS:	All
Bug Depends on:	26460, 27965, 28751, 28785
Bug Blocks:

Bug 26467