Bug 76654

Summary: WebGL must allocate smaller drawing buffer if allocation fails
Product: WebKit Reporter: Kenneth Russell <kbr>
Component: WebGLAssignee: yongsheng <yongsheng.zhu>
Status: RESOLVED FIXED    
Severity: Normal CC: cmarrin, logingx, twiz, webkit.review.bot, yongsheng.zhu, zmo
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on: 76239, 77481    
Bug Blocks:    
Attachments:
Description Flags
Patch
kbr: review-
Patch
none
Patch
kbr: review-
Patch
none
Patch
none
Patch none

Description Kenneth Russell 2012-01-19 12:18:40 PST
Per http://www.khronos.org/registry/webgl/specs/latest/#2.2 ("The Drawing Buffer"), the WebGL implementation must allocate a drawing buffer with a smaller than requested width and height if the requested size could not be allocated. The fix for Bug 76239 exposes a problem where the reallocation is not being retried.

I recall that zmo implemented this functionality once in Chromium's command buffer port, but I believe it needs to be reimplemented now after the conversion to DrawingBuffer.
Comment 1 Zhenyao Mo 2012-01-22 11:15:16 PST
Assigned to yongsheng.  Thanks for picking up this bug.  I've been thinking of working on it, but kept being distracted.
Comment 2 yongsheng 2012-01-22 22:53:25 PST
(In reply to comment #1)
> Assigned to yongsheng.  Thanks for picking up this bug.  I've been thinking of working on it, but kept being distracted.

I posted my thinkings about the policy to allocate smalller buffer than requested:
1) allocate all of the remaining buffer for the request
2) multiple the requested width and heighth with a ratio(such as 0.5) in the loop until the size of requested buffer is met

which one is preferred? or any better idea?
Comment 3 Zhenyao Mo 2012-01-23 09:16:45 PST
I don't think I fully understand what option 1) stands, but option 2) sounds reasonable.

My original implementation in chromium side is

1) if it's multisampled buffer, try a non-multisampled buffer with the same size
2) if 1) fails, try multisampled buffer with half size
3) if 2) fails, try non-multisampled buffer with half size
4) if 3) fails, goto 2) again

However, I think just doing half the size looping is good enough.
Comment 4 Kenneth Russell 2012-01-23 09:38:59 PST
Also, the first step should be to retry the allocation at the original size. I believe that we currently try to reallocate the new drawing buffer while holding on to its original resources, which fails and causes the drawing buffer to be released. The allocation should be retried after releasing the drawing buffer. All of this logic should probably be restructured.
Comment 5 Kenneth Russell 2012-01-25 09:50:24 PST
Yongsheng, do you have an update on this bug? The conformance suite failures in WebKit must be resolved soon. Either we need to push forward a fix for this or roll out bug 76239's patch for now.
Comment 6 yongsheng 2012-01-25 19:04:30 PST
oh, I'll try to do it today. The late response is because I'm on Chinese New Year.
Comment 7 yongsheng 2012-01-25 19:18:08 PST
(In reply to comment #3)
> I don't think I fully understand what option 1) stands, but option 2) sounds reasonable.
> My original implementation in chromium side is
> 
> 1) if it's multisampled buffer, try a non-multisampled buffer with the same size
> 2) if 1) fails, try multisampled buffer with half size
> 3) if 2) fails, try non-multisampled buffer with half size
> 4) if 3) fails, goto 2) again
> 
> However, I think just doing half the size looping is good enough.
okay, I'll follow the simple option 2).
Comment 8 yongsheng 2012-01-26 03:50:57 PST
Created attachment 124099 [details]
Patch

Please review the patch.
Comment 9 Zhenyao Mo 2012-01-26 06:22:40 PST
Comment on attachment 124099 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=124099&action=review

> Source/WebCore/platform/graphics/gpu/DrawingBuffer.cpp:248
> +        while ((s_currentResourceUsePixels + pixelDelta) > s_maximumResourceUsePixels) {

I don't think making sure less than s_maximumResourceUsePixels is enough.

You should try creating a buffer with adjustedSize, check if it succeeds, if not, try with half the size.  Do this until you get a valid backbuffer.
Comment 10 Kenneth Russell 2012-01-26 14:13:05 PST
Comment on attachment 124099 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=124099&action=review

Thanks for working on this during the Chinese new year, but it needs more work.

> Source/WebCore/ChangeLog:4
> +        the allocation fails. 

Try to keep the synopsis on one line.

> Source/WebCore/platform/graphics/gpu/DrawingBuffer.cpp:52
> +static float s_resourceAdjustedRatio = 0.5;

Should be const, not static.

>> Source/WebCore/platform/graphics/gpu/DrawingBuffer.cpp:248
>> +        while ((s_currentResourceUsePixels + pixelDelta) > s_maximumResourceUsePixels) {
> 
> I don't think making sure less than s_maximumResourceUsePixels is enough.
> 
> You should try creating a buffer with adjustedSize, check if it succeeds, if not, try with half the size.  Do this until you get a valid backbuffer.

I agree with Mo's assessment.
Comment 11 yongsheng 2012-01-26 19:24:20 PST
(In reply to comment #9)
> (From update of attachment 124099 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=124099&action=review
> 
> > Source/WebCore/platform/graphics/gpu/DrawingBuffer.cpp:248
> > +        while ((s_currentResourceUsePixels + pixelDelta) > s_maximumResourceUsePixels) {
> 
> I don't think making sure less than s_maximumResourceUsePixels is enough.
> 
> You should try creating a buffer with adjustedSize, check if it succeeds, if not, try with half the size.  Do this until you get a valid backbuffer.

ok, since there are two checkings of the framebuffer status in the function, if there is any failure, then try with a smaller size and do all of allocation again util all are okay, am I right?
Comment 12 yongsheng 2012-01-26 19:45:43 PST
(In reply to comment #10)
> (From update of attachment 124099 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=124099&action=review
> 
> Thanks for working on this during the Chinese new year, but it needs more work.
> 
> > Source/WebCore/ChangeLog:4
> > +        the allocation fails. 
> 
> Try to keep the synopsis on one line.
> 
> > Source/WebCore/platform/graphics/gpu/DrawingBuffer.cpp:52
> > +static float s_resourceAdjustedRatio = 0.5;
> 
> Should be const, not static.
okay, should be 'static const'. 
> >> Source/WebCore/platform/graphics/gpu/DrawingBuffer.cpp:248
> >> +        while ((s_currentResourceUsePixels + pixelDelta) > s_maximumResourceUsePixels) {
> > 
> > I don't think making sure less than s_maximumResourceUsePixels is enough.
> > 
> > You should try creating a buffer with adjustedSize, check if it succeeds, if not, try with half the size.  Do this until you get a valid backbuffer.
> 
> I agree with Mo's assessment.
thanks for your and Mo's comments. I'll refine it.
Comment 13 yongsheng 2012-01-27 02:31:22 PST
Created attachment 124280 [details]
Patch
Comment 14 yongsheng 2012-01-27 02:32:44 PST
The 'drawingbuffer-test.html' can be passed. However, there might be a stderr message "Source/WebKit/chromium/third_party/tcmalloc/chromium/src/system-alloc.cc:423] SbrkSysAllocator failed.". Does anyone know this?
Comment 15 WebKit Review Bot 2012-01-27 02:34:58 PST
Attachment 124280 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1

Source/WebCore/platform/graphics/gpu/DrawingBuffer.cpp:314:  An else statement can be removed when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
Total errors found: 1 in 2 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 16 yongsheng 2012-01-27 06:03:27 PST
Created attachment 124301 [details]
Patch
Comment 17 Kenneth Russell 2012-01-27 18:13:51 PST
(In reply to comment #14)
> The 'drawingbuffer-test.html' can be passed. However, there might be a stderr message "Source/WebKit/chromium/third_party/tcmalloc/chromium/src/system-alloc.cc:423] SbrkSysAllocator failed.". Does anyone know this?

Is this new behavior that is caused by your patch?

Have you set a breakpoint at that point in the renderer process to see what made the call that failed? I don't think that we should be trying to do huge allocations.

Have you run all of the WebGL conformance tests with this patch? Does it fix all of the intended failures?

The code looks fine overall but you need to remove the associated line from LayoutTests/platform/chromium/test_expectations.txt . r- because of this issue.
Comment 18 Kenneth Russell 2012-01-27 18:16:15 PST
Note, I just ran the canvas/ subset of the WebGL conformance tests with a debug Chromium build and didn't see any tcmalloc messages printed to stderr.
Comment 19 Jeff Timanus 2012-01-27 18:18:04 PST
(In reply to comment #18)
> Note, I just ran the canvas/ subset of the WebGL conformance tests with a debug Chromium build and didn't see any tcmalloc messages printed to stderr.

I ran the layout tests fast/canvas/webgl, and drawing-buffer-test.html passed with this change applied.

Output:

twiz@visitor-1950-259h:/usr/local/google/twiz/chrome_src/src/webkit/tools/layout_tests$ ./run_webkit_tests.py fast/canvas/webgl
  fast/canvas/webgl/drawingbuffer-test.html -> unexpected pass
  fast/canvas/webgl/invalid-passed-params.html -> unexpected pass
98 tests ran as expected, 2 didn't:                 


Expected to fail, but passed: (2)
  fast/canvas/webgl/drawingbuffer-test.html
  fast/canvas/webgl/invalid-passed-params.html
Comment 20 yongsheng 2012-01-27 19:20:45 PST
(In reply to comment #17)
> (In reply to comment #14)
> > The 'drawingbuffer-test.html' can be passed. However, there might be a stderr message "Source/WebKit/chromium/third_party/tcmalloc/chromium/src/system-alloc.cc:423] SbrkSysAllocator failed.". Does anyone know this?
> 
> Is this new behavior that is caused by your patch?
Not very sure. It's a random issue(10%~20%) and only occurs when running the conformance test (that means it never occur when running the single case). Seems not reproducible on your platforms. 
> Have you set a breakpoint at that point in the renderer process to see what made the call that failed? I don't think that we should be trying to do huge allocations.
I'll try to have a look. Any hint to use 
> Have you run all of the WebGL conformance tests with this patch? Does it fix all of the intended failures?
yes, all are okay for me.
> 
> The code looks fine overall but you need to remove the associated line from LayoutTests/platform/chromium/test_expectations.txt . r- because of this issue.
okay, I'll do it.
Comment 21 yongsheng 2012-01-27 20:22:06 PST
Created attachment 124425 [details]
Patch
Comment 22 yongsheng 2012-01-29 01:29:59 PST
> twiz@visitor-1950-259h:/usr/local/google/twiz/chrome_src/src/webkit/tools/layout_tests$ ./run_webkit_tests.py fast/canvas/webgl
>   fast/canvas/webgl/drawingbuffer-test.html -> unexpected pass
>   fast/canvas/webgl/invalid-passed-params.html -> unexpected pass
> 98 tests ran as expected, 2 didn't:                 
on my testing environment, the 'invalid-passed-params.html' is 'expected pass'.
Comment 23 yongsheng 2012-01-29 01:35:23 PST
> > Is this new behavior that is caused by your patch?
> Not very sure. It's a random issue(10%~20%) and only occurs when running the conformance test (that means it never occur when running the single case). Seems not reproducible on your platforms. 

The error is from the GPU side. It resizes its framebuffer with the passed size {width=1, height=1}. It seems the driver allocates a real big buffer(11804672).
I encountered this issue without the patch. but the occurrence frequency is lower than with the patch.

Here is the call stack.
#0  DefaultSysAllocator::Alloc (this=0xb7b911d8, size=11804672, actual_size=0xbfccbacc, alignment=4096)
    at third_party/tcmalloc/chromium/src/system-alloc.cc:423
#1  0xb09991b1 in TCMalloc_SystemAlloc (size=11804672, actual_size=0xbfccbacc, alignment=4096)
    at third_party/tcmalloc/chromium/src/system-alloc.cc:473
#2  0xb0993685 in tcmalloc::PageHeap::GrowHeap (this=0xb7b8c0c0, n=2882)
    at third_party/tcmalloc/chromium/src/page_heap.cc:549
#3  0xb09916c0 in tcmalloc::PageHeap::New (this=0xb7b8c0c0, n=2882) at third_party/tcmalloc/chromium/src/page_heap.cc:102
#4  0xb0980728 in (anonymous namespace)::do_malloc_pages (heap=0xb7f37000, size=11804672)
    at third_party/tcmalloc/chromium/src/tcmalloc.cc:1140
#5  0xb098087c in (anonymous namespace)::do_malloc (size=11804428) at third_party/tcmalloc/chromium/src/tcmalloc.cc:1171
#6  0xb09813f9 in (anonymous namespace)::cpp_alloc (size=11804424, nothrow=true)
    at third_party/tcmalloc/chromium/src/tcmalloc.cc:1453
#7  0xb0980534 in (anonymous namespace)::do_malloc_or_cpp_alloc (size=11804424)
    at third_party/tcmalloc/chromium/src/tcmalloc.cc:1102
#8  0xb09808e9 in (anonymous namespace)::do_calloc (n=1, elem_size=11804424)
    at third_party/tcmalloc/chromium/src/tcmalloc.cc:1183
#9  0xb44908b3 in tc_calloc (n=1, elem_size=11804424) at third_party/tcmalloc/chromium/src/tcmalloc.cc:1602
#10 0xae56a345 in ralloc_size () from /usr/lib/i386-linux-gnu/dri/libglsl.so
#11 0xae56a3ee in rzalloc_size () from /usr/lib/i386-linux-gnu/dri/libglsl.so
#12 0xae56a4c8 in rzalloc_array_size () from /usr/lib/i386-linux-gnu/dri/libglsl.so
#13 0xae90a72c in do_wm_prog () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
#14 0xae90ab48 in ?? () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
#15 0xae9002fd in brw_validate_state () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
#16 0xae8ecec2 in brw_draw_prims () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
#17 0xae7680d1 in ?? () from /usr/lib/i386-linux-gnu/dri/libdricore.so
#18 0xae80129a in _mesa_meta_Clear () from /usr/lib/i386-linux-gnu/dri/libdricore.so
#19 0xae8c8e15 in ?? () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
#20 0xae6996b1 in _mesa_Clear () from /usr/lib/i386-linux-gnu/dri/libdricore.so
#21 0xb3514b2b in gpu::gles2::GLES2DecoderImpl::ResizeOffscreenFrameBuffer (this=0xbd467e00, size=...)   <--- size = {width=1, height =1}
    at gpu/command_buffer/service/gles2_cmd_decoder.cc:2967
#22 0xb351180f in gpu::gles2::GLES2DecoderImpl::Initialize (this=0xbd467e00, surface=..., context=..., size=...,
    disallowed_features=..., allowed_extensions=0xb87a1bcc "*", attribs=...)
    at gpu/command_buffer/service/gles2_cmd_decoder.cc:2155
#23 0xb23a61d0 in GpuCommandBufferStub::OnInitialize (this=0xbd9f0a90, reply_message=0xb87e2d20)
    at content/common/gpu/gpu_command_buffer_stub.cc:234
Comment 24 Kenneth Russell 2012-01-31 11:00:41 PST
Comment on attachment 124425 [details]
Patch

Thanks, looks good. r=me
Comment 25 Kenneth Russell 2012-01-31 11:03:54 PST
(In reply to comment #23)
> > > Is this new behavior that is caused by your patch?
> > Not very sure. It's a random issue(10%~20%) and only occurs when running the conformance test (that means it never occur when running the single case). Seems not reproducible on your platforms. 
> 
> The error is from the GPU side. It resizes its framebuffer with the passed size {width=1, height=1}. It seems the driver allocates a real big buffer(11804672).
> I encountered this issue without the patch. but the occurrence frequency is lower than with the patch.
> 
> Here is the call stack.
> #0  DefaultSysAllocator::Alloc (this=0xb7b911d8, size=11804672, actual_size=0xbfccbacc, alignment=4096)
>     at third_party/tcmalloc/chromium/src/system-alloc.cc:423
> #1  0xb09991b1 in TCMalloc_SystemAlloc (size=11804672, actual_size=0xbfccbacc, alignment=4096)
>     at third_party/tcmalloc/chromium/src/system-alloc.cc:473
> #2  0xb0993685 in tcmalloc::PageHeap::GrowHeap (this=0xb7b8c0c0, n=2882)
>     at third_party/tcmalloc/chromium/src/page_heap.cc:549
> #3  0xb09916c0 in tcmalloc::PageHeap::New (this=0xb7b8c0c0, n=2882) at third_party/tcmalloc/chromium/src/page_heap.cc:102
> #4  0xb0980728 in (anonymous namespace)::do_malloc_pages (heap=0xb7f37000, size=11804672)
>     at third_party/tcmalloc/chromium/src/tcmalloc.cc:1140
> #5  0xb098087c in (anonymous namespace)::do_malloc (size=11804428) at third_party/tcmalloc/chromium/src/tcmalloc.cc:1171
> #6  0xb09813f9 in (anonymous namespace)::cpp_alloc (size=11804424, nothrow=true)
>     at third_party/tcmalloc/chromium/src/tcmalloc.cc:1453
> #7  0xb0980534 in (anonymous namespace)::do_malloc_or_cpp_alloc (size=11804424)
>     at third_party/tcmalloc/chromium/src/tcmalloc.cc:1102
> #8  0xb09808e9 in (anonymous namespace)::do_calloc (n=1, elem_size=11804424)
>     at third_party/tcmalloc/chromium/src/tcmalloc.cc:1183
> #9  0xb44908b3 in tc_calloc (n=1, elem_size=11804424) at third_party/tcmalloc/chromium/src/tcmalloc.cc:1602
> #10 0xae56a345 in ralloc_size () from /usr/lib/i386-linux-gnu/dri/libglsl.so
> #11 0xae56a3ee in rzalloc_size () from /usr/lib/i386-linux-gnu/dri/libglsl.so
> #12 0xae56a4c8 in rzalloc_array_size () from /usr/lib/i386-linux-gnu/dri/libglsl.so
> #13 0xae90a72c in do_wm_prog () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
> #14 0xae90ab48 in ?? () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
> #15 0xae9002fd in brw_validate_state () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
> #16 0xae8ecec2 in brw_draw_prims () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
> #17 0xae7680d1 in ?? () from /usr/lib/i386-linux-gnu/dri/libdricore.so
> #18 0xae80129a in _mesa_meta_Clear () from /usr/lib/i386-linux-gnu/dri/libdricore.so
> #19 0xae8c8e15 in ?? () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
> #20 0xae6996b1 in _mesa_Clear () from /usr/lib/i386-linux-gnu/dri/libdricore.so
> #21 0xb3514b2b in gpu::gles2::GLES2DecoderImpl::ResizeOffscreenFrameBuffer (this=0xbd467e00, size=...)   <--- size = {width=1, height =1}
>     at gpu/command_buffer/service/gles2_cmd_decoder.cc:2967
> #22 0xb351180f in gpu::gles2::GLES2DecoderImpl::Initialize (this=0xbd467e00, surface=..., context=..., size=...,
>     disallowed_features=..., allowed_extensions=0xb87a1bcc "*", attribs=...)
>     at gpu/command_buffer/service/gles2_cmd_decoder.cc:2155
> #23 0xb23a61d0 in GpuCommandBufferStub::OnInitialize (this=0xbd9f0a90, reply_message=0xb87e2d20)
>     at content/common/gpu/gpu_command_buffer_stub.cc:234

Well, it doesn't look like your change to DrawingBuffer::reset() is causing this, at least not directly, since DrawingBuffer::reset() only allocates textures and renderbuffers, and the above stack trace is likely occurring during initialization of the WebGLRenderingContext. However, the behavior clearly looks wrong and I would suggest you raise this issue with Intel's driver team, as they may be able to track down the bug in Mesa or wherever it is.
Comment 26 WebKit Review Bot 2012-01-31 12:00:22 PST
Comment on attachment 124425 [details]
Patch

Clearing flags on attachment: 124425

Committed r106376: <http://trac.webkit.org/changeset/106376>
Comment 27 WebKit Review Bot 2012-01-31 12:00:27 PST
All reviewed patches have been landed.  Closing bug.
Comment 28 Jeff Timanus 2012-01-31 12:02:55 PST
*** Bug 76562 has been marked as a duplicate of this bug. ***
Comment 29 Kenneth Russell 2012-01-31 14:44:37 PST
http://trac.webkit.org/changeset/106376 broke Chromium's webkit_unit_tests and is being rolled out under https://bugs.webkit.org/show_bug.cgi?id=77481 . Reopening this bug.
Comment 30 Kenneth Russell 2012-01-31 16:32:30 PST
Yongsheng, please build and test Chromium's webkit_unit_tests target with your original patch. Either the patch or the unit test needs to be updated.
Comment 31 yongsheng 2012-01-31 23:54:51 PST
(In reply to comment #30)
> Yongsheng, please build and test Chromium's webkit_unit_tests target with your original patch. Either the patch or the unit test needs to be updated.
The reason is that when the newSize is empty, and adjustedSize will also be empty, so the code treat it as a failure. However, it should not.
So the fix is to call 'clear()' when the newSize is not empty and adjustedSize is empty.

Tested it with webkit_unit_tests, gpu_tests, layout tests.
Comment 32 yongsheng 2012-02-01 00:02:24 PST
Created attachment 124892 [details]
Patch

thanks to the unit tests to find this issue.
Comment 33 yongsheng 2012-02-01 08:14:18 PST
Created attachment 124952 [details]
Patch

reupload it. Please review
Comment 34 yongsheng 2012-02-02 17:28:03 PST
any comments?
Comment 35 Jeff Timanus 2012-02-03 08:23:52 PST
Comment on attachment 124952 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=124952&action=review

> Source/WebCore/platform/graphics/gpu/DrawingBuffer.cpp:324
> +        if (!newSize.isEmpty() && adjustedSize.isEmpty()) {

This change looks ok to me.  Would it have been possible to early-out if newSize is empty?  The need to dynamically shrink by 1/2 and test allocation does not seem necessary when we are starting at 0.
Comment 36 yongsheng 2012-02-04 00:43:16 PST
(In reply to comment #35)
> (From update of attachment 124952 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=124952&action=review
> 
> > Source/WebCore/platform/graphics/gpu/DrawingBuffer.cpp:324
> > +        if (!newSize.isEmpty() && adjustedSize.isEmpty()) {
> 
> This change looks ok to me.  Would it have been possible to early-out if newSize is empty?  The need to dynamically shrink by 1/2 and test allocation does not seem necessary when we are starting at 0.

I just want to follow up the previous implementation. If the newSize is empty, the old code would also do allocation. In new impl, I want to do that part of code too and then jump out the loop. Make sense?
Comment 37 Kenneth Russell 2012-02-06 11:53:20 PST
Comment on attachment 124952 [details]
Patch

Looks fine as long as it passes the test. r=me again
Comment 38 WebKit Review Bot 2012-02-06 17:58:32 PST
Comment on attachment 124952 [details]
Patch

Clearing flags on attachment: 124952

Committed r106888: <http://trac.webkit.org/changeset/106888>
Comment 39 WebKit Review Bot 2012-02-06 17:58:38 PST
All reviewed patches have been landed.  Closing bug.