Bug 105821

Summary:

Optimize the texture packing for texImage2D() and texSubImage2D() in WebGL

Product:

WebKit

Reporter:

Jun Jiang <jun.a.jiang>

Component:

WebGL

Assignee:

Jun Jiang <jun.a.jiang>

Status:

RESOLVED FIXED

Severity:

Normal

CC:

dino, d-r, junov, kbr, noam, ojan.autocc, senorblanco, webkit.review.bot

Priority:

Version:

528+ (Nightly build)

Hardware:

All

OS:

All

Attachments:

Description	Flags
Patch	none
Patch	none
Patch	none

Jun Jiang

Reported 2012-12-28 00:29:10 PST

In texture packing for texImage2D() and texSubImage2D() in WebGL, there are hundreds of unpack and pack operations and instances for different combinations of source format, destination format and Alpha operations. Current Implementation uses function pointers to denote each corresponding unpack and pack operation. The code logic is a little complex and could be made cleaner and simpler. Mozilla has done some optimizations based on this implementation, we could merge it back. There are basically two things: 1. use template functions and avoid code bloat. 2. merge the flipY operation into the packing operation.

Attachments
Patch (98.06 KB, patch) 2012-12-28 00:49 PST, Jun Jiang	no flags	Details Formatted Diff Diff
Patch (117.35 KB, patch) 2013-01-08 13:03 PST, Jun Jiang	no flags	Details Formatted Diff Diff
Patch (117.30 KB, patch) 2013-01-16 03:30 PST, Jun Jiang	no flags	Details Formatted Diff Diff
Show Obsolete (2) View All Add attachment proposed patch, testcase, etc.

Jun Jiang

Comment 1 2012-12-28 00:49:54 PST

Created attachment 180850 [details] Patch

Jun Jiang

Comment 2 2012-12-31 00:44:43 PST

After applying the patch, the binary size of GraphicsContext3D.o for Non-CG ports are smaller than before. Taking chromium on Linux for example, the size of GraphicsContext3D.o decreases from 60976 bytes to 33488 bytes (release build). While for CG port, the binary size of GraphicsContext3D.o is bigger than before, increasing from 405600 bytes to 1194444 bytes(release build, symbols not striped). The difference between CG port and other ports is that the number of possible source formats for Image from DOM elements are bigger. The detail is in FormatConverter::convert(). This difference introduces much more instances for template functions and results a binary size increase for CG port.

Jun Jiang

Comment 3 2013-01-07 05:49:18 PST

find two ways to reduce the binary size of GraphicsContext3D.o for CG port, one is to optimize the convert() template function, another is to reduce possible formats from HTMLImageElement by considering the little/big endian of the machine. Simulated CG setting by using these two methods in SKIA port and the size is smaller than original. Will upload a successive patch.

Jun Jiang

Comment 4 2013-01-08 13:03:53 PST

Created attachment 181738 [details] Patch

Kenneth Russell

Comment 5 2013-01-14 19:13:18 PST

Thanks for your patch and sorry for the delay reviewing it. I'm reviewing it now. Other reviewers, please hold off r+'ing this patch for now.

Kenneth Russell

Comment 6 2013-01-15 10:41:36 PST

Comment on attachment 181738 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=181738&action=review This is nice work that gets closer to unifying the pixel packing/unpacking code again between Firefox and WebKit. Some relatively small issues, plus one higher-level one. Per offline discussion it sounds like a bug was caught during testing of this patch, so r-'ing this one, but please post a revision when ready. > Source/WebCore/platform/graphics/GraphicsContext3D.cpp:320 > +// Some code are merged back from Mozilla code in content/canvas/src/WebGLTexelConversions.h A high-level point: it would be better if the code that is expected to be shared between Firefox and WebKit were clearly marked at its beginning and end, as it is in Firefox's http://mxr.mozilla.org/mozilla-central/source/content/canvas/src/WebGLTexelConversions.h . > Source/WebCore/platform/graphics/GraphicsContext3D.cpp:1237 > + || format == GraphicsContext3D::DataFormatA16Little Since the 16-bit per channel big and little endian formats are only used by the Core Graphics port, and their use is rare, I wonder whether they could be handled differently to reduce the code size on all platforms. Specifically, perhaps data of this format could be converted to one of the 8-bit formats inside GraphicsContext3DCG.cpp, and the cross-platform code wouldn't need to know about these formats. If you agree, it isn't necessary to make that change in this patch -- it's large enough as it is -- but please file another bug and make it depend on this one to track that work. > Source/WebCore/platform/graphics/GraphicsContext3D.cpp:1350 > +ALWAYS_INLINE unsigned TexelBytesForFormat(int format) Can this take "GraphicsContext3D::DataFormat" rather than int? > Source/WebCore/platform/graphics/GraphicsContext3D.cpp:1408 > + unpackedIntermediateSrcData = malloc(mWidth * MaxNumberOfComponents *MaxBytesPerComponent); Use unpackedIntermediateSrcData = adoptArrayPtr(new uint8_t[mWidth * MaxNumberOfComponents * MaxBytesPerComponent]). > Source/WebCore/platform/graphics/GraphicsContext3D.cpp:1414 > + free(unpackedIntermediateSrcData); With OwnArrayPtr the call to free() is unnecessary. > Source/WebCore/platform/graphics/GraphicsContext3D.cpp:1430 > + const unsigned mWidth, mHeight; Use WebKit naming convention here and throughout: m_ prefix. > Source/WebCore/platform/graphics/GraphicsContext3D.cpp:1435 > + void* unpackedIntermediateSrcData; Avoid use of raw pointers -- use OwnArrayPtr<uint8_t> instead. Also, fix naming convention. > Source/WebCore/platform/graphics/GraphicsContext3D.cpp:1549 > + const bool CanSrcFormatComeFromDOMElementOrImageData = GraphicsContext3D::srcFormatComeFromDOMElementOrImageData(SrcFormat); Capitalization: canSrcFormat... Grammar: srcFormatComesFrom... > Source/WebCore/platform/graphics/GraphicsContext3D.cpp:1571 > + const bool trivialPack = (DstFormat == GraphicsContext3D::DataFormatRGBA8 || DstFormat == GraphicsContext3D::DataFormatRGBA32F) && alphaOp == GraphicsContext3D::AlphaDoNothing && mDstStride > 0; This method looks similar to Firefox's WebGLImageConverter::run. Why does this version need to compute these two flags? > Source/WebCore/platform/graphics/GraphicsContext3D.cpp:1577 > + if (!trivialUnpack && trivialPack) { Have you measured the cost of having these three if/else branches inside the loop? The alternative would be to clone the loop. > Source/WebCore/platform/graphics/GraphicsContext3D.cpp:1591 > +bool webGLPackPixels(const uint8_t* sourceData, GraphicsContext3D::DataFormat sourceFormat, unsigned width, unsigned height, unsigned sourceUnpackAlignment, unsigned destinationFormat, unsigned destinationType, GraphicsContext3D::AlphaOp alphaOp, void* destinationData, bool flipY) Avoid referring to "WebGL" inside GraphicsContext3D. This separate function doesn't appear to be needed at all; just put its body into GraphicsContext3D::packPixels. > Source/WebCore/platform/graphics/GraphicsContext3D.cpp:1594 > + int reminder = sourceUnpackAlignment ? (validSrc % sourceUnpackAlignment) : 0; Typo: reminder -> remainder > Source/WebCore/platform/graphics/GraphicsContext3D.cpp:1600 > + destinationData = (uint8_t *)destinationData + dstStride*(height - 1); Use C++-style rather than C-style casts in WebKit code: static_cast<uint8_t*>(...). Also, spaces around "*" operator. > Source/WebCore/platform/graphics/GraphicsContext3D.cpp:1610 > + uint8_t* dst = (uint8_t*)destinationData; Fix C-style cast. > Source/WebCore/platform/graphics/GraphicsContext3D.h:645 > + // The formats from DOM elements vary with Graphics ports. It can only be RGBA8 or BGRA8 for non-CG port while much more for CGi port. Typo: CGi -> CG

Jun Jiang

Comment 7 2013-01-15 11:34:42 PST

Kenneth, thanks for your useful comments. That Mozilla did not use the trivialUnpack and trivialPack flags has some intrinsic reasons. Mozilla depends on compiler to do optimization instead of algorithm level. The unpack, convert, pack operations are done in a per-pixel basis and all these functions are inlined. During compilation, compiler will optimize off the cost for unnecessary unpack, convert or pack operations. It indeed brings performance gain. But it still has several drawbacks: first, in some cases, per-pixel assignment in a loop is slower than a single memcpy() for a row; Second, some ARCHs, like arm, could not use instruction acceleration(NEON) for the unpack and pack on a per-row basis to transfer more data in a single trip.

Jun Jiang

Comment 8 2013-01-16 03:30:54 PST

Created attachment 182953 [details] Patch

Jun Jiang

Comment 9 2013-01-16 17:50:09 PST

Make changes according to Kenneth's suggestions and fix some issue with uploading textures to floating-point formats.

Kenneth Russell

Comment 10 2013-01-22 18:37:38 PST

Comment on attachment 182953 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=182953&action=review Thanks for persisting with this patch. It gets much closer to unifying this code again with Mozilla's, which will make it easier to share code in the future. Two tiny nits, but neither is worth continuing to delay your patch. r=me > Source/WebCore/platform/graphics/GraphicsContext3D.cpp:1239 > + || format == GraphicsContext3D::DataFormatA16Little Please see the question in the previous review regarding these 16-bit per channel formats. Since they are exposed here only for the Core Graphics port, and they don't seem to be useful to represent any existing high dynamic range file format, I think we should hide their use inside CG-specific code and normalize incoming images to 8 bits per channel if necessary. Do you agree? If so please file another bug. > Source/WebCore/platform/graphics/GraphicsContext3D.cpp:1434 > + OwnArrayPtr<uint8_t> unpackedIntermediateSrcData; Should have been m_unpackedIntermediateSrcData. Not a huge deal.

WebKit Review Bot

Comment 11 2013-01-22 19:04:20 PST

Comment on attachment 182953 [details] Patch Clearing flags on attachment: 182953 Committed r140497: <http://trac.webkit.org/changeset/140497>

WebKit Review Bot

Comment 12 2013-01-22 19:04:26 PST

All reviewed patches have been landed. Closing bug.

Note You need to log in before you can comment on or make changes to this bug.