Summary: | WebGL must allocate smaller drawing buffer if allocation fails | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Kenneth Russell <kbr> | ||||||||||||||
Component: | WebGL | Assignee: | yongsheng <yongsheng.zhu> | ||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||
Severity: | Normal | CC: | cmarrin, logingx, twiz, webkit.review.bot, yongsheng.zhu, zmo | ||||||||||||||
Priority: | P2 | ||||||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||||
Hardware: | All | ||||||||||||||||
OS: | All | ||||||||||||||||
Bug Depends on: | 76239, 77481 | ||||||||||||||||
Bug Blocks: | |||||||||||||||||
Attachments: |
|
Description
Kenneth Russell
2012-01-19 12:18:40 PST
Assigned to yongsheng. Thanks for picking up this bug. I've been thinking of working on it, but kept being distracted. (In reply to comment #1) > Assigned to yongsheng. Thanks for picking up this bug. I've been thinking of working on it, but kept being distracted. I posted my thinkings about the policy to allocate smalller buffer than requested: 1) allocate all of the remaining buffer for the request 2) multiple the requested width and heighth with a ratio(such as 0.5) in the loop until the size of requested buffer is met which one is preferred? or any better idea? I don't think I fully understand what option 1) stands, but option 2) sounds reasonable. My original implementation in chromium side is 1) if it's multisampled buffer, try a non-multisampled buffer with the same size 2) if 1) fails, try multisampled buffer with half size 3) if 2) fails, try non-multisampled buffer with half size 4) if 3) fails, goto 2) again However, I think just doing half the size looping is good enough. Also, the first step should be to retry the allocation at the original size. I believe that we currently try to reallocate the new drawing buffer while holding on to its original resources, which fails and causes the drawing buffer to be released. The allocation should be retried after releasing the drawing buffer. All of this logic should probably be restructured. Yongsheng, do you have an update on this bug? The conformance suite failures in WebKit must be resolved soon. Either we need to push forward a fix for this or roll out bug 76239's patch for now. oh, I'll try to do it today. The late response is because I'm on Chinese New Year. (In reply to comment #3) > I don't think I fully understand what option 1) stands, but option 2) sounds reasonable. > My original implementation in chromium side is > > 1) if it's multisampled buffer, try a non-multisampled buffer with the same size > 2) if 1) fails, try multisampled buffer with half size > 3) if 2) fails, try non-multisampled buffer with half size > 4) if 3) fails, goto 2) again > > However, I think just doing half the size looping is good enough. okay, I'll follow the simple option 2). Created attachment 124099 [details]
Patch
Please review the patch.
Comment on attachment 124099 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=124099&action=review > Source/WebCore/platform/graphics/gpu/DrawingBuffer.cpp:248 > + while ((s_currentResourceUsePixels + pixelDelta) > s_maximumResourceUsePixels) { I don't think making sure less than s_maximumResourceUsePixels is enough. You should try creating a buffer with adjustedSize, check if it succeeds, if not, try with half the size. Do this until you get a valid backbuffer. Comment on attachment 124099 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=124099&action=review Thanks for working on this during the Chinese new year, but it needs more work. > Source/WebCore/ChangeLog:4 > + the allocation fails. Try to keep the synopsis on one line. > Source/WebCore/platform/graphics/gpu/DrawingBuffer.cpp:52 > +static float s_resourceAdjustedRatio = 0.5; Should be const, not static. >> Source/WebCore/platform/graphics/gpu/DrawingBuffer.cpp:248 >> + while ((s_currentResourceUsePixels + pixelDelta) > s_maximumResourceUsePixels) { > > I don't think making sure less than s_maximumResourceUsePixels is enough. > > You should try creating a buffer with adjustedSize, check if it succeeds, if not, try with half the size. Do this until you get a valid backbuffer. I agree with Mo's assessment. (In reply to comment #9) > (From update of attachment 124099 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=124099&action=review > > > Source/WebCore/platform/graphics/gpu/DrawingBuffer.cpp:248 > > + while ((s_currentResourceUsePixels + pixelDelta) > s_maximumResourceUsePixels) { > > I don't think making sure less than s_maximumResourceUsePixels is enough. > > You should try creating a buffer with adjustedSize, check if it succeeds, if not, try with half the size. Do this until you get a valid backbuffer. ok, since there are two checkings of the framebuffer status in the function, if there is any failure, then try with a smaller size and do all of allocation again util all are okay, am I right? (In reply to comment #10) > (From update of attachment 124099 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=124099&action=review > > Thanks for working on this during the Chinese new year, but it needs more work. > > > Source/WebCore/ChangeLog:4 > > + the allocation fails. > > Try to keep the synopsis on one line. > > > Source/WebCore/platform/graphics/gpu/DrawingBuffer.cpp:52 > > +static float s_resourceAdjustedRatio = 0.5; > > Should be const, not static. okay, should be 'static const'. > >> Source/WebCore/platform/graphics/gpu/DrawingBuffer.cpp:248 > >> + while ((s_currentResourceUsePixels + pixelDelta) > s_maximumResourceUsePixels) { > > > > I don't think making sure less than s_maximumResourceUsePixels is enough. > > > > You should try creating a buffer with adjustedSize, check if it succeeds, if not, try with half the size. Do this until you get a valid backbuffer. > > I agree with Mo's assessment. thanks for your and Mo's comments. I'll refine it. Created attachment 124280 [details]
Patch
The 'drawingbuffer-test.html' can be passed. However, there might be a stderr message "Source/WebKit/chromium/third_party/tcmalloc/chromium/src/system-alloc.cc:423] SbrkSysAllocator failed.". Does anyone know this? Attachment 124280 [details] did not pass style-queue:
Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1
Source/WebCore/platform/graphics/gpu/DrawingBuffer.cpp:314: An else statement can be removed when the prior "if" concludes with a return, break, continue or goto statement. [readability/control_flow] [4]
Total errors found: 1 in 2 files
If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 124301 [details]
Patch
(In reply to comment #14) > The 'drawingbuffer-test.html' can be passed. However, there might be a stderr message "Source/WebKit/chromium/third_party/tcmalloc/chromium/src/system-alloc.cc:423] SbrkSysAllocator failed.". Does anyone know this? Is this new behavior that is caused by your patch? Have you set a breakpoint at that point in the renderer process to see what made the call that failed? I don't think that we should be trying to do huge allocations. Have you run all of the WebGL conformance tests with this patch? Does it fix all of the intended failures? The code looks fine overall but you need to remove the associated line from LayoutTests/platform/chromium/test_expectations.txt . r- because of this issue. Note, I just ran the canvas/ subset of the WebGL conformance tests with a debug Chromium build and didn't see any tcmalloc messages printed to stderr. (In reply to comment #18) > Note, I just ran the canvas/ subset of the WebGL conformance tests with a debug Chromium build and didn't see any tcmalloc messages printed to stderr. I ran the layout tests fast/canvas/webgl, and drawing-buffer-test.html passed with this change applied. Output: twiz@visitor-1950-259h:/usr/local/google/twiz/chrome_src/src/webkit/tools/layout_tests$ ./run_webkit_tests.py fast/canvas/webgl fast/canvas/webgl/drawingbuffer-test.html -> unexpected pass fast/canvas/webgl/invalid-passed-params.html -> unexpected pass 98 tests ran as expected, 2 didn't: Expected to fail, but passed: (2) fast/canvas/webgl/drawingbuffer-test.html fast/canvas/webgl/invalid-passed-params.html (In reply to comment #17) > (In reply to comment #14) > > The 'drawingbuffer-test.html' can be passed. However, there might be a stderr message "Source/WebKit/chromium/third_party/tcmalloc/chromium/src/system-alloc.cc:423] SbrkSysAllocator failed.". Does anyone know this? > > Is this new behavior that is caused by your patch? Not very sure. It's a random issue(10%~20%) and only occurs when running the conformance test (that means it never occur when running the single case). Seems not reproducible on your platforms. > Have you set a breakpoint at that point in the renderer process to see what made the call that failed? I don't think that we should be trying to do huge allocations. I'll try to have a look. Any hint to use > Have you run all of the WebGL conformance tests with this patch? Does it fix all of the intended failures? yes, all are okay for me. > > The code looks fine overall but you need to remove the associated line from LayoutTests/platform/chromium/test_expectations.txt . r- because of this issue. okay, I'll do it. Created attachment 124425 [details]
Patch
> twiz@visitor-1950-259h:/usr/local/google/twiz/chrome_src/src/webkit/tools/layout_tests$ ./run_webkit_tests.py fast/canvas/webgl
> fast/canvas/webgl/drawingbuffer-test.html -> unexpected pass
> fast/canvas/webgl/invalid-passed-params.html -> unexpected pass
> 98 tests ran as expected, 2 didn't:
on my testing environment, the 'invalid-passed-params.html' is 'expected pass'.
> > Is this new behavior that is caused by your patch?
> Not very sure. It's a random issue(10%~20%) and only occurs when running the conformance test (that means it never occur when running the single case). Seems not reproducible on your platforms.
The error is from the GPU side. It resizes its framebuffer with the passed size {width=1, height=1}. It seems the driver allocates a real big buffer(11804672).
I encountered this issue without the patch. but the occurrence frequency is lower than with the patch.
Here is the call stack.
#0 DefaultSysAllocator::Alloc (this=0xb7b911d8, size=11804672, actual_size=0xbfccbacc, alignment=4096)
at third_party/tcmalloc/chromium/src/system-alloc.cc:423
#1 0xb09991b1 in TCMalloc_SystemAlloc (size=11804672, actual_size=0xbfccbacc, alignment=4096)
at third_party/tcmalloc/chromium/src/system-alloc.cc:473
#2 0xb0993685 in tcmalloc::PageHeap::GrowHeap (this=0xb7b8c0c0, n=2882)
at third_party/tcmalloc/chromium/src/page_heap.cc:549
#3 0xb09916c0 in tcmalloc::PageHeap::New (this=0xb7b8c0c0, n=2882) at third_party/tcmalloc/chromium/src/page_heap.cc:102
#4 0xb0980728 in (anonymous namespace)::do_malloc_pages (heap=0xb7f37000, size=11804672)
at third_party/tcmalloc/chromium/src/tcmalloc.cc:1140
#5 0xb098087c in (anonymous namespace)::do_malloc (size=11804428) at third_party/tcmalloc/chromium/src/tcmalloc.cc:1171
#6 0xb09813f9 in (anonymous namespace)::cpp_alloc (size=11804424, nothrow=true)
at third_party/tcmalloc/chromium/src/tcmalloc.cc:1453
#7 0xb0980534 in (anonymous namespace)::do_malloc_or_cpp_alloc (size=11804424)
at third_party/tcmalloc/chromium/src/tcmalloc.cc:1102
#8 0xb09808e9 in (anonymous namespace)::do_calloc (n=1, elem_size=11804424)
at third_party/tcmalloc/chromium/src/tcmalloc.cc:1183
#9 0xb44908b3 in tc_calloc (n=1, elem_size=11804424) at third_party/tcmalloc/chromium/src/tcmalloc.cc:1602
#10 0xae56a345 in ralloc_size () from /usr/lib/i386-linux-gnu/dri/libglsl.so
#11 0xae56a3ee in rzalloc_size () from /usr/lib/i386-linux-gnu/dri/libglsl.so
#12 0xae56a4c8 in rzalloc_array_size () from /usr/lib/i386-linux-gnu/dri/libglsl.so
#13 0xae90a72c in do_wm_prog () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
#14 0xae90ab48 in ?? () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
#15 0xae9002fd in brw_validate_state () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
#16 0xae8ecec2 in brw_draw_prims () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
#17 0xae7680d1 in ?? () from /usr/lib/i386-linux-gnu/dri/libdricore.so
#18 0xae80129a in _mesa_meta_Clear () from /usr/lib/i386-linux-gnu/dri/libdricore.so
#19 0xae8c8e15 in ?? () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
#20 0xae6996b1 in _mesa_Clear () from /usr/lib/i386-linux-gnu/dri/libdricore.so
#21 0xb3514b2b in gpu::gles2::GLES2DecoderImpl::ResizeOffscreenFrameBuffer (this=0xbd467e00, size=...) <--- size = {width=1, height =1}
at gpu/command_buffer/service/gles2_cmd_decoder.cc:2967
#22 0xb351180f in gpu::gles2::GLES2DecoderImpl::Initialize (this=0xbd467e00, surface=..., context=..., size=...,
disallowed_features=..., allowed_extensions=0xb87a1bcc "*", attribs=...)
at gpu/command_buffer/service/gles2_cmd_decoder.cc:2155
#23 0xb23a61d0 in GpuCommandBufferStub::OnInitialize (this=0xbd9f0a90, reply_message=0xb87e2d20)
at content/common/gpu/gpu_command_buffer_stub.cc:234
Comment on attachment 124425 [details]
Patch
Thanks, looks good. r=me
(In reply to comment #23) > > > Is this new behavior that is caused by your patch? > > Not very sure. It's a random issue(10%~20%) and only occurs when running the conformance test (that means it never occur when running the single case). Seems not reproducible on your platforms. > > The error is from the GPU side. It resizes its framebuffer with the passed size {width=1, height=1}. It seems the driver allocates a real big buffer(11804672). > I encountered this issue without the patch. but the occurrence frequency is lower than with the patch. > > Here is the call stack. > #0 DefaultSysAllocator::Alloc (this=0xb7b911d8, size=11804672, actual_size=0xbfccbacc, alignment=4096) > at third_party/tcmalloc/chromium/src/system-alloc.cc:423 > #1 0xb09991b1 in TCMalloc_SystemAlloc (size=11804672, actual_size=0xbfccbacc, alignment=4096) > at third_party/tcmalloc/chromium/src/system-alloc.cc:473 > #2 0xb0993685 in tcmalloc::PageHeap::GrowHeap (this=0xb7b8c0c0, n=2882) > at third_party/tcmalloc/chromium/src/page_heap.cc:549 > #3 0xb09916c0 in tcmalloc::PageHeap::New (this=0xb7b8c0c0, n=2882) at third_party/tcmalloc/chromium/src/page_heap.cc:102 > #4 0xb0980728 in (anonymous namespace)::do_malloc_pages (heap=0xb7f37000, size=11804672) > at third_party/tcmalloc/chromium/src/tcmalloc.cc:1140 > #5 0xb098087c in (anonymous namespace)::do_malloc (size=11804428) at third_party/tcmalloc/chromium/src/tcmalloc.cc:1171 > #6 0xb09813f9 in (anonymous namespace)::cpp_alloc (size=11804424, nothrow=true) > at third_party/tcmalloc/chromium/src/tcmalloc.cc:1453 > #7 0xb0980534 in (anonymous namespace)::do_malloc_or_cpp_alloc (size=11804424) > at third_party/tcmalloc/chromium/src/tcmalloc.cc:1102 > #8 0xb09808e9 in (anonymous namespace)::do_calloc (n=1, elem_size=11804424) > at third_party/tcmalloc/chromium/src/tcmalloc.cc:1183 > #9 0xb44908b3 in tc_calloc (n=1, elem_size=11804424) at third_party/tcmalloc/chromium/src/tcmalloc.cc:1602 > #10 0xae56a345 in ralloc_size () from /usr/lib/i386-linux-gnu/dri/libglsl.so > #11 0xae56a3ee in rzalloc_size () from /usr/lib/i386-linux-gnu/dri/libglsl.so > #12 0xae56a4c8 in rzalloc_array_size () from /usr/lib/i386-linux-gnu/dri/libglsl.so > #13 0xae90a72c in do_wm_prog () from /usr/lib/i386-linux-gnu/dri/i965_dri.so > #14 0xae90ab48 in ?? () from /usr/lib/i386-linux-gnu/dri/i965_dri.so > #15 0xae9002fd in brw_validate_state () from /usr/lib/i386-linux-gnu/dri/i965_dri.so > #16 0xae8ecec2 in brw_draw_prims () from /usr/lib/i386-linux-gnu/dri/i965_dri.so > #17 0xae7680d1 in ?? () from /usr/lib/i386-linux-gnu/dri/libdricore.so > #18 0xae80129a in _mesa_meta_Clear () from /usr/lib/i386-linux-gnu/dri/libdricore.so > #19 0xae8c8e15 in ?? () from /usr/lib/i386-linux-gnu/dri/i965_dri.so > #20 0xae6996b1 in _mesa_Clear () from /usr/lib/i386-linux-gnu/dri/libdricore.so > #21 0xb3514b2b in gpu::gles2::GLES2DecoderImpl::ResizeOffscreenFrameBuffer (this=0xbd467e00, size=...) <--- size = {width=1, height =1} > at gpu/command_buffer/service/gles2_cmd_decoder.cc:2967 > #22 0xb351180f in gpu::gles2::GLES2DecoderImpl::Initialize (this=0xbd467e00, surface=..., context=..., size=..., > disallowed_features=..., allowed_extensions=0xb87a1bcc "*", attribs=...) > at gpu/command_buffer/service/gles2_cmd_decoder.cc:2155 > #23 0xb23a61d0 in GpuCommandBufferStub::OnInitialize (this=0xbd9f0a90, reply_message=0xb87e2d20) > at content/common/gpu/gpu_command_buffer_stub.cc:234 Well, it doesn't look like your change to DrawingBuffer::reset() is causing this, at least not directly, since DrawingBuffer::reset() only allocates textures and renderbuffers, and the above stack trace is likely occurring during initialization of the WebGLRenderingContext. However, the behavior clearly looks wrong and I would suggest you raise this issue with Intel's driver team, as they may be able to track down the bug in Mesa or wherever it is. Comment on attachment 124425 [details] Patch Clearing flags on attachment: 124425 Committed r106376: <http://trac.webkit.org/changeset/106376> All reviewed patches have been landed. Closing bug. *** Bug 76562 has been marked as a duplicate of this bug. *** http://trac.webkit.org/changeset/106376 broke Chromium's webkit_unit_tests and is being rolled out under https://bugs.webkit.org/show_bug.cgi?id=77481 . Reopening this bug. Yongsheng, please build and test Chromium's webkit_unit_tests target with your original patch. Either the patch or the unit test needs to be updated. (In reply to comment #30) > Yongsheng, please build and test Chromium's webkit_unit_tests target with your original patch. Either the patch or the unit test needs to be updated. The reason is that when the newSize is empty, and adjustedSize will also be empty, so the code treat it as a failure. However, it should not. So the fix is to call 'clear()' when the newSize is not empty and adjustedSize is empty. Tested it with webkit_unit_tests, gpu_tests, layout tests. Created attachment 124892 [details]
Patch
thanks to the unit tests to find this issue.
Created attachment 124952 [details]
Patch
reupload it. Please review
any comments? Comment on attachment 124952 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=124952&action=review > Source/WebCore/platform/graphics/gpu/DrawingBuffer.cpp:324 > + if (!newSize.isEmpty() && adjustedSize.isEmpty()) { This change looks ok to me. Would it have been possible to early-out if newSize is empty? The need to dynamically shrink by 1/2 and test allocation does not seem necessary when we are starting at 0. (In reply to comment #35) > (From update of attachment 124952 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=124952&action=review > > > Source/WebCore/platform/graphics/gpu/DrawingBuffer.cpp:324 > > + if (!newSize.isEmpty() && adjustedSize.isEmpty()) { > > This change looks ok to me. Would it have been possible to early-out if newSize is empty? The need to dynamically shrink by 1/2 and test allocation does not seem necessary when we are starting at 0. I just want to follow up the previous implementation. If the newSize is empty, the old code would also do allocation. In new impl, I want to do that part of code too and then jump out the loop. Make sense? Comment on attachment 124952 [details]
Patch
Looks fine as long as it passes the test. r=me again
Comment on attachment 124952 [details] Patch Clearing flags on attachment: 124952 Committed r106888: <http://trac.webkit.org/changeset/106888> All reviewed patches have been landed. Closing bug. |