Mozilla has recently reworked the code that converts texture data between formats during upload to WebGL. From an email from Benoit Jacob:
Here is a changeset (currently only on mozilla-inbound) that brings this code size down to 17k on x86-64 (as measured with nm -S on linux). Our previous version, which was already quite careful, was 44k. This is using the 'fully templatized' approach i.e. a separate conversion loop is compiled for each case, traversing the bitmaps only once, with all the texel conversion functions inlined into it.
The key was to avoid compiling paths that are never called, the code for this is there:
It would be ideal to switch WebKit to use this improved code. Among other things, it can avoid a pixel copy in some situations and use a memcpy-optimized loop in more situations than the current code.
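To make the "fully templatized" approach from the email concrete, here is a minimal sketch. It is not Mozilla's code: the `TexelFormat` enum, `unpack`, `pack`, `convertLoop` and `convert` are hypothetical names, and only three 8-bit formats are shown. Each (source, destination) pair gets its own loop with the texel functions inlined, identical formats fall back to `memcpy`, and pairs that the dispatcher never names are never instantiated, so no code is generated for them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical format tags; the real enums have many more entries.
enum class TexelFormat { RGBA8, BGRA8, RGB8 };

constexpr std::size_t bytesPerTexel(TexelFormat f) {
    return f == TexelFormat::RGB8 ? 3 : 4;
}

// Unpack one source texel into canonical R, G, B, A order.
template <TexelFormat Src>
void unpack(const std::uint8_t* s, std::uint8_t* rgba);

template <> inline void unpack<TexelFormat::RGBA8>(const std::uint8_t* s, std::uint8_t* d) {
    d[0] = s[0]; d[1] = s[1]; d[2] = s[2]; d[3] = s[3];
}
template <> inline void unpack<TexelFormat::BGRA8>(const std::uint8_t* s, std::uint8_t* d) {
    d[0] = s[2]; d[1] = s[1]; d[2] = s[0]; d[3] = s[3];
}
template <> inline void unpack<TexelFormat::RGB8>(const std::uint8_t* s, std::uint8_t* d) {
    d[0] = s[0]; d[1] = s[1]; d[2] = s[2]; d[3] = 0xFF;
}

// Pack a canonical RGBA texel into the destination format.
// BGRA8 is deliberately left without a pack(): it is never a destination here.
template <TexelFormat Dst>
void pack(const std::uint8_t* rgba, std::uint8_t* d);

template <> inline void pack<TexelFormat::RGBA8>(const std::uint8_t* s, std::uint8_t* d) {
    d[0] = s[0]; d[1] = s[1]; d[2] = s[2]; d[3] = s[3];
}
template <> inline void pack<TexelFormat::RGB8>(const std::uint8_t* s, std::uint8_t* d) {
    d[0] = s[0]; d[1] = s[1]; d[2] = s[2];
}

// One loop per (Src, Dst) pair; the bitmap is traversed exactly once.
template <TexelFormat Src, TexelFormat Dst>
void convertLoop(const std::uint8_t* src, std::uint8_t* dst, std::size_t texelCount) {
    if (Src == Dst) {  // compile-time constant: same format needs only a copy
        std::memcpy(dst, src, texelCount * bytesPerTexel(Src));
        return;
    }
    for (std::size_t i = 0; i < texelCount; ++i) {
        std::uint8_t rgba[4];
        unpack<Src>(src + i * bytesPerTexel(Src), rgba);
        pack<Dst>(rgba, dst + i * bytesPerTexel(Dst));
    }
}

template <TexelFormat Dst>
bool convertTo(TexelFormat srcFormat, const std::uint8_t* src, std::uint8_t* dst, std::size_t n) {
    switch (srcFormat) {
    case TexelFormat::RGBA8: convertLoop<TexelFormat::RGBA8, Dst>(src, dst, n); return true;
    case TexelFormat::BGRA8: convertLoop<TexelFormat::BGRA8, Dst>(src, dst, n); return true;
    case TexelFormat::RGB8:  convertLoop<TexelFormat::RGB8, Dst>(src, dst, n);  return true;
    }
    return false;
}

// Runtime dispatch: only the pairs named here are ever compiled.
bool convert(TexelFormat srcFormat, TexelFormat dstFormat,
             const std::uint8_t* src, std::uint8_t* dst, std::size_t texelCount) {
    assert(src && dst);
    switch (dstFormat) {
    case TexelFormat::RGBA8: return convertTo<TexelFormat::RGBA8>(srcFormat, src, dst, texelCount);
    case TexelFormat::RGB8:  return convertTo<TexelFormat::RGB8>(srcFormat, src, dst, texelCount);
    default:                 return false;  // unsupported destination, no loop generated
    }
}
```

In this sketch, adding a destination format means adding one `pack` specialization and one `case`; the code-size cost is then one loop per reachable source format, which is why the number of supported source formats matters.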
However, analysis shows that a couple of issues would need to be resolved first. The first is that WebKit currently supports many more source formats than the final code Mozilla settled on; compare SourceDataFormat in GraphicsContext3D.h to WebGLTexelFormat in the patch's WebGLContext.h. The major difference is support for 16-bit-per-channel formats, in both big-endian and little-endian variants, which arise only in the Core Graphics code path and only in limited situations. Supporting the additional source formats would probably increase the code size significantly. Since it seems undesirable to have features that are supported only in the Mac port, such images could be converted to 8 bits per channel in the CG code path before the bits are handed to the cross-platform repacking code.
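The narrowing step proposed above could look roughly like the following. This is a sketch under assumptions, not existing WebKit code: `ByteOrder`, `readChannel16`, `narrowTo8` and `narrowImageTo8` are hypothetical names, and the rounding rule (scale by 255/65535 and round to nearest) is one reasonable choice; simply keeping the high byte would be cheaper but slightly less accurate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical tag for the byte order of the 16-bit source channels.
enum class ByteOrder { Little, Big };

// Assemble one 16-bit channel from two bytes in the given byte order.
inline std::uint16_t readChannel16(const std::uint8_t* p, ByteOrder order) {
    return order == ByteOrder::Little
        ? static_cast<std::uint16_t>(p[0] | (p[1] << 8))
        : static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Map [0, 65535] onto [0, 255], rounding to the nearest value.
inline std::uint8_t narrowTo8(std::uint16_t v) {
    return static_cast<std::uint8_t>((v * 255u + 32767u) / 65535u);
}

// Convert channelCount 16-bit channels (2 * channelCount bytes) to 8 bits each.
std::vector<std::uint8_t> narrowImageTo8(const std::uint8_t* src,
                                         std::size_t channelCount,
                                         ByteOrder order) {
    assert(src || channelCount == 0);
    std::vector<std::uint8_t> out(channelCount);
    for (std::size_t i = 0; i < channelCount; ++i)
        out[i] = narrowTo8(readChannel16(src + 2 * i, order));
    return out;
}
```

Doing this once in the port-specific path means the cross-platform repacking code only ever sees 8-bit-per-channel sources, so it needs no 16-bit or endianness variants.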
The second is that the current structure of the WebKit code always packs the Image or ImageData into a Vector<uint8_t> as its first step, preventing some of the key short-circuits in Mozilla's code from taking effect. To fix this, classes that provide scoped access ("lock"/"unlock") to the data in Image or ImageData objects need to be added to GraphicsContext3D; these would be implemented with port-specific code.
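One possible shape for such a scoped-access class is sketched below. All names here (`PixelSource`, `ScopedPixelAccess`, `InMemoryPixelSource`) are hypothetical and do not exist in GraphicsContext3D; the in-memory source merely stands in for a port-specific implementation that would lock the backing store of an Image or ImageData.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical port-neutral interface; each port would supply its own subclass.
class PixelSource {
public:
    virtual ~PixelSource() = default;
    virtual const std::uint8_t* lock() = 0;   // pin the pixels and return a pointer to them
    virtual void unlock() = 0;                // release the pin
    virtual std::size_t byteLength() const = 0;
};

// RAII wrapper: locks on construction, unlocks on destruction, never copies pixels.
class ScopedPixelAccess {
public:
    explicit ScopedPixelAccess(PixelSource& source)
        : m_source(source), m_data(source.lock()) {}
    ~ScopedPixelAccess() { m_source.unlock(); }

    ScopedPixelAccess(const ScopedPixelAccess&) = delete;
    ScopedPixelAccess& operator=(const ScopedPixelAccess&) = delete;

    const std::uint8_t* data() const { return m_data; }
    std::size_t byteLength() const { return m_source.byteLength(); }

private:
    PixelSource& m_source;
    const std::uint8_t* m_data;
};

// Stand-in for a port-specific source, backed by an ordinary buffer.
class InMemoryPixelSource final : public PixelSource {
public:
    explicit InMemoryPixelSource(std::vector<std::uint8_t> pixels)
        : m_pixels(std::move(pixels)) {}

    const std::uint8_t* lock() override {
        assert(!m_locked);
        m_locked = true;
        return m_pixels.data();
    }
    void unlock() override {
        assert(m_locked);
        m_locked = false;
    }
    std::size_t byteLength() const override { return m_pixels.size(); }
    bool isLocked() const { return m_locked; }

private:
    std::vector<std::uint8_t> m_pixels;
    bool m_locked = false;
};
```

With something along these lines, the repacking code could read directly from the locked pointer returned by `data()` instead of from an intermediate Vector<uint8_t>, which is what the short-circuits rely on.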
Because of the complexity of integrating this code, Bug 85942, which addresses some issues Mozilla found during development of the above patch, has been fixed separately.
When incorporating this code, care must be taken not to lose the optimizations John Bauman added in Bug 66884 (associated Chromium bug: http://crbug.com/92388). He indicated offline that he tested with http://jsperf.com/webgl-teximage2d-vs-texsubimage2d/2 (and possibly other versions of the test) as well as webgl-ios-rage. He also indicated: "part of my fix should become redundant due to the fact that they're never copying to an intermediate buffer, and you should be able to modify unpack<BGRA8, uint8_t, uint8_t> to use the faster BGRA->RGBA conversion".
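As an illustration of what a faster BGRA->RGBA conversion can look like, the sketch below swaps the red and blue bytes one 32-bit word at a time instead of byte by byte. This is an assumption about the general technique, not a copy of the code from Bug 66884; `convertBGRA8ToRGBA8` is a hypothetical name, and the bit masks assume a little-endian host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Convert texelCount BGRA8 texels to RGBA8, one 32-bit word per texel.
// Assumes a little-endian host: in the loaded word, memory byte 0 (blue)
// occupies bits 0-7 and memory byte 2 (red) occupies bits 16-23.
// src and dst may be the same buffer.
void convertBGRA8ToRGBA8(const std::uint8_t* src, std::uint8_t* dst,
                         std::size_t texelCount) {
    assert(src && dst);
    for (std::size_t i = 0; i < texelCount; ++i) {
        std::uint32_t texel;
        std::memcpy(&texel, src + 4 * i, 4);       // safe for unaligned input
        texel = (texel & 0xFF00FF00u)              // keep green and alpha in place
              | ((texel >> 16) & 0x000000FFu)      // move red down to byte 0
              | ((texel & 0x000000FFu) << 16);     // move blue up to byte 2
        std::memcpy(dst + 4 * i, &texel, 4);
    }
}
```

Compilers typically turn the two fixed-size `memcpy` calls into plain loads and stores, and a loop of this shape is a good candidate for auto-vectorization; any real replacement should still be measured against the benchmarks mentioned above.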