RESOLVED FIXED 39966
Crash in qsort() because compareBorders() is not reliable
https://bugs.webkit.org/show_bug.cgi?id=39966
Summary Crash in qsort() because compareBorders() is not reliable
Yong Li
Reported Monday, May 31, 2010 7:28:23 PM UTC
In RenderTableCell.cpp, compareBorders() is used to implement compareBorderStylesForQSort(). But it's not stable. When compareBorderStylesForQSort(a, b) returns 1, compareBorderStylesForQSort(b, a) can return 1 or 0. We should pass a stable compare function to qsort().
Attachments
the patch (2.98 KB, patch)
2010-05-31 11:30 PDT, Yong Li
ap: review-
the test case (1.94 KB, text/html)
2010-06-06 19:50 PDT, Yong Li
no flags
Test case (2.71 KB, patch)
2010-06-07 07:19 PDT, Yong Li
ap: review+
commit-queue: commit-queue-
the patch that fixes the problem (3.09 KB, patch)
2010-06-07 07:19 PDT, Yong Li
no flags
Fixed bugs. use same function for both qsort and other cases (20.49 KB, patch)
2010-06-07 10:03 PDT, Yong Li
no flags
updated patch (20.49 KB, patch)
2010-06-07 10:08 PDT, Yong Li
no flags
Fix comments & keep old behavior for BNONE style (20.44 KB, patch)
2010-07-23 08:49 PDT, Yong Li
no flags
test case (tab removed) (2.73 KB, patch)
2010-09-20 14:20 PDT, Yong Li
commit-queue: commit-queue-
Yong Li
Comment 1 Monday, May 31, 2010 7:30:49 PM UTC
Created attachment 57485 [details] the patch
Alexey Proskuryakov
Comment 2 Wednesday, June 2, 2010 12:18:16 AM UTC
Comment on attachment 57485 [details] the patch Please add a test case, or explain why making one isn't possible (sounds like it should be possible).
Yong Li
Comment 3 Wednesday, June 2, 2010 4:46:03 AM UTC
(In reply to comment #2) > (From update of attachment 57485 [details]) > Please add a test case, or explain why making one isn't possible (sounds like it should be possible). it's a hard job, because it relies on the implementation of qsort()
Alexey Proskuryakov
Comment 4 Wednesday, June 2, 2010 6:22:20 AM UTC
Can a test be made that crashes on any common implementation? The bug title says "crash", so surely this was observed in practice?
Yong Li
Comment 5 Wednesday, June 2, 2010 3:22:57 PM UTC
(In reply to comment #4) > Can a test be made that crashes on any common implementation? The bug title says "crash", so surely this was observed in practice? Sorry, the title is probably misleading. I have seen it crashed in armcc's qsort(), and using stable compare function fixed it. I cannot reproduce it on Safari win32. But in general, the compare functions passed to sorting algorithm (including qsort, stable_sort, and sorted containers like std::set/map) should be written in a reliable way and should give consistent result. The compare function passed to qsort should always make this true: compare(a, b) == -compare(b, a). The old code uses a function that can make compare(a, b) == 1 and compare(b, a) == 0. The behavior of running qsort() with it is undetermined.
Alexey Proskuryakov
Comment 6 Wednesday, June 2, 2010 4:03:41 PM UTC
Can you add a test that crashes on armcc without this fix?
Yong Li
Comment 7 Wednesday, June 2, 2010 4:06:11 PM UTC
(In reply to comment #6) > Can you add a test that crashes on armcc without this fix? I'll give a try.
Yong Li
Comment 8 Monday, June 7, 2010 3:50:55 AM UTC
Created attachment 57990 [details] the test case
Yong Li
Comment 9 Monday, June 7, 2010 3:51:57 AM UTC
(In reply to comment #6) > Can you add a test that crashes on armcc without this fix? Finally got a test case that can reproduce the crash. Do you think it should go into LayoutTest?
Yong Li
Comment 10 Monday, June 7, 2010 3:19:07 PM UTC
Created attachment 58026 [details] Test case
Yong Li
Comment 11 Monday, June 7, 2010 3:19:44 PM UTC
Created attachment 58027 [details] the patch that fixes the problem
Darin Adler
Comment 12 Monday, June 7, 2010 3:25:23 PM UTC
Comment on attachment 58027 [details] the patch that fixes the problem Is there a way to fix this without writing a whole new copy of the compareBorders function? That's quite undesirable, because we can have real problems if the two functions get out of sync. What's the particular thing in the original function that makes it inconsistent? The word "stable" has a specific meaning in sorting contexts. It means a sorting algorithm that does not reorder equal items in the array being sorted. Using the term stable in the title of the new version of the compareBorders function is imprecise and confusing since it already has a specific meaning for sorting.
Yong Li
Comment 13 Monday, June 7, 2010 3:58:06 PM UTC
(In reply to comment #12) > (From update of attachment 58027 [details]) > Is there a way to fix this without writing a whole new copy of the compareBorders function? That's quite undesirable, because we can have real problems if the two functions get out of sync. What's the particular thing in the original function that makes it inconsistent? > The word "stable" has a specific meaning in sorting contexts. It means a sorting algorithm that does not reorder equal items in the array being sorted. Using the term stable in the title of the new version of the compareBorders function is imprecise and confusing since it already has a specific meaning for sorting. 1) The existing compareBorders function is mainly used to choose a border style from 2. It looks buggy for example the santiy check ignore the BHIDDEN check. If border 1 doesn't exist and border 2 is hidden, it doesn't return an empty border value as it's supposed to do. In fact, !border1.exists() rarely happens. Also, this function returns border1 when 2 border styles are equal, and the old compareBorderStylesForQSort implementation only checks the return value once and assumes the returned value is the bigger one. so it can get compareBorderStylesForQSort(a, b) == 1 and compare...ForQSort(b, a) == 1. > return border1.precedence() >= border2.precedence() ? border1 : border2; When both borders are BHIDDEN, the result is compareBorderStylesForQSort(a, b) == -1 and compare...ForQSort(b, a) == -1. I'll try to make them use same code. 2) I thought about that, too, but couldn't find a good name. how about consistentCompare? or? rename stableCompareBorders to compareBorderStyles, but rename compareBorders to chooseBorderBetween?
Yong Li
Comment 14 Monday, June 7, 2010 4:33:58 PM UTC
Comment on attachment 58027 [details] the patch that fixes the problem there's a bug in there. HIDDEN border should take precedence.
Yong Li
Comment 15 Monday, June 7, 2010 6:03:56 PM UTC
Created attachment 58041 [details] Fixed bugs. use same function for both qsort and other cases Also fixed a bug in the old function: when both borders have NONE style, it didn't continue comparing any more. But it should continue checking other properties in this case according to the comment: // (2) Borders with a style of 'none' have the lowest priority. Only if the border properties of all // the elements meeting at this edge are 'none' will the border be omitted (but note that 'none' is // the default value for the border style.) // (3) If none of the styles are 'hidden' and at least one of them is not 'none', then narrow borders
Yong Li
Comment 16 Monday, June 7, 2010 6:04:35 PM UTC
Comment on attachment 58041 [details] Fixed bugs. use same function for both qsort and other cases oops wrong patch.
Yong Li
Comment 17 Monday, June 7, 2010 6:08:26 PM UTC
Created attachment 58042 [details] updated patch
Early Warning System Bot
Comment 18 Monday, June 7, 2010 6:08:57 PM UTC
WebKit Review Bot
Comment 19 Monday, June 7, 2010 6:13:55 PM UTC
Yong Li
Comment 20 Monday, June 7, 2010 6:25:26 PM UTC
Also, FireFox renders the test case in the way different from webkit browsers. FireFox renders some borders around a few elements. WebKit browsers like Safari and Qt port don't render any border. My patch makes webkit do the similar thing as FireFox. I'm not sure if FireFox is doing the right thing. Any expert?
Yong Li
Comment 21 Monday, June 7, 2010 9:33:06 PM UTC
(In reply to comment #20) > Also, FireFox renders the test case in the way different from webkit browsers. FireFox renders some borders around a few elements. WebKit browsers like Safari and Qt port don't render any border. My patch makes webkit do the similar thing as FireFox. I'm not sure if FireFox is doing the right thing. Any expert? Please ignore this
Yong Li
Comment 22 Thursday, June 10, 2010 6:45:18 PM UTC
FYI, Daniel Bates has tested this patch on mac, and it passed all tests. He also ran pixels test for LayoutTests/table and LayoutTest/fast/table and did not notice any significant differences. Thanks, Dan.
Kent Hansen
Comment 23 Friday, July 23, 2010 2:32:18 PM UTC
Comment on attachment 58042 [details] updated patch > + Make compareBorders() a consistent compare function which can used be by qsort(). "which can be used by..." > + Also fixed a bug in previous implementation that it didn't continue comparing when > + both borders have NONE style. If that's a separate bug it should have a separate patch and test case, and be fixed before this one. (See http://trac.webkit.org/wiki/CodeReview) Alternatively, leave that part out of this patch (keeping the behavior of the old implementation) and create a follow-up bug after this has landed. > // Sanity check the values passed in. If either is null, return the other. You might want to reword comments like the above one, since the function now returns an int, not one of its arguments. > // Rule #2 above. A style of 'none' has lowest priority and always loses to any other border. > - if (border2.style() == BNONE) > - return border1; > + if (border2.style() == BNONE) { > + if (border1.style() != BNONE) > + return 1; > + } > if (border1.style() == BNONE) > - return border2; > + return -1; It looks like -1 is always returned when both have style BNONE. Shouldn't it be 0, like for BHIDDEN? > @@ -767,10 +785,7 @@ static int compareBorderStylesForQSort(const void* pa, const void* pb) > const CollapsedBorderValue* b = static_cast<const CollapsedBorderValue*>(pb); > if (*a == *b) > return 0; > - CollapsedBorderValue borderWithHigherPrecedence = compareBorders(*a, *b); > - if (*a == borderWithHigherPrecedence) > - return 1; > - return -1; > + return compareBorders(*a, *b); > } Nice! In general I like that compareBorders() returns an int rather than a border; the name is better suited now than before. :) As for chooseBorder(), I'm not able to come up with a better name.
Yong Li
Comment 24 Friday, July 23, 2010 3:55:53 PM UTC
(In reply to comment #23) > (From update of attachment 58042 [details]) > > - if (border2.style() == BNONE) > > - return border1; > > + if (border2.style() == BNONE) { > > + if (border1.style() != BNONE) > > + return 1; > > + } > > if (border1.style() == BNONE) > > - return border2; > > + return -1; > It looks like -1 is always returned when both have style BNONE. Shouldn't it be 0, like for BHIDDEN? Nice catch! Actuall when both borders have style BNONE, we should continue comparing with other criteria. so it should > > "else" if (border1.style() == BNONE)
Yong Li
Comment 25 Friday, July 23, 2010 3:56:35 PM UTC
Comment on attachment 58042 [details] updated patch Review cancelled because of the bug Kent found.
Yong Li
Comment 26 Friday, July 23, 2010 4:34:40 PM UTC
(In reply to comment #24) > (In reply to comment #23) > > (From update of attachment 58042 [details] [details]) > > > > - if (border2.style() == BNONE) > > > - return border1; > > > + if (border2.style() == BNONE) { > > > + if (border1.style() != BNONE) > > > + return 1; > > > + } > > > if (border1.style() == BNONE) > > > - return border2; > > > + return -1; > > It looks like -1 is always returned when both have style BNONE. Shouldn't it be 0, like for BHIDDEN? > > Nice catch! Actuall when both borders have style BNONE, we should continue comparing with other criteria. so it should > > > > > "else" if (border1.style() == BNONE) I'm not very sure. I'll keep the old behavior (just return 0) and open another bug for that.
Yong Li
Comment 27 Friday, July 23, 2010 4:49:03 PM UTC
Created attachment 62429 [details] Fix comments & keep old behavior for BNONE style
Alexey Proskuryakov
Comment 28 Monday, September 20, 2010 6:42:40 PM UTC
Comment on attachment 58026 [details] Test case Commit-queue won't land this, please land manually.
Yong Li
Comment 29 Monday, September 20, 2010 6:44:21 PM UTC
(In reply to comment #28) > (From update of attachment 58026 [details]) > Commit-queue won't land this, please land manually. ha. thanks!
Eric Seidel (no email)
Comment 30 Monday, September 20, 2010 6:48:23 PM UTC
Why won't it land it?
WebKit Commit Bot
Comment 31 Monday, September 20, 2010 6:55:57 PM UTC
Comment on attachment 62429 [details] Fix comments & keep old behavior for BNONE style Clearing flags on attachment: 62429 Committed r67862: <http://trac.webkit.org/changeset/67862>
Yong Li
Comment 32 Monday, September 20, 2010 6:56:52 PM UTC
Comment on attachment 58026 [details] Test case give a try...
Alexey Proskuryakov
Comment 33 Monday, September 20, 2010 7:01:49 PM UTC
"This" was the other patch, the one with test. I thought commit queue didn't like multiple patches per bug.
Yong Li
Comment 34 Monday, September 20, 2010 7:15:31 PM UTC
Comment on attachment 58026 [details] Test case commit queue says: Building and testing without the patch as a sanity check 5 minutes ago Unable to build and test patch I'll manually commit it
Eric Seidel (no email)
Comment 35 Monday, September 20, 2010 7:18:03 PM UTC
(In reply to comment #34) > (From update of attachment 58026 [details]) > commit queue says: Building and testing without the patch as a sanity check 5 minutes ago Unable to build and test patch > > I'll manually commit it This just means the cq had trouble building. (likely due to tree redness). It's a non-fatal error.
Eric Seidel (no email)
Comment 36 Monday, September 20, 2010 7:18:27 PM UTC
(In reply to comment #35) > This just means the cq had trouble building. (likely due to tree redness). It's a non-fatal error. The commit-queue will always comment in the bug if it hits a fatal error.
Eric Seidel (no email)
Comment 37 Monday, September 20, 2010 7:19:34 PM UTC
(In reply to comment #33) > "This" was the other patch, the one with test. I thought commit queue didn't like multiple patches per bug. The commit-queue can handle multiple patches per bug. It didn't used to like them, no. :) I wouldn't say they're recommended, but the cq will try and process patches in attachment order per-bug. it also won't close a bug after landing if there are still non-obsolete patches on that bug.
WebKit Commit Bot
Comment 38 Monday, September 20, 2010 9:05:57 PM UTC
Comment on attachment 58026 [details] Test case Rejecting patch 58026 from commit-queue. Failed to run "[u'git', u'svn', u'dcommit']" exit_code: 1 Last 500 characters of output: repository/webkit/trunk ... M LayoutTests/ChangeLog A LayoutTests/tables/sort-collapsed-border-styles.html A repository hook failed: Commit blocked by pre-commit hook (exit code 1) with output: The following files contain tab characters: trunk/LayoutTests/tables/sort-collapsed-border-styles.html Please use spaces instead to indent. If you must commit a file with tabs, use svn propset to set the "allow-tabs" property. at /usr/local/git/libexec/git-core/git-svn line 572 Full output: http://queues.webkit.org/results/4026079
Yong Li
Comment 39 Monday, September 20, 2010 10:20:26 PM UTC
Created attachment 68135 [details] test case (tab removed)
Yong Li
Comment 40 Monday, September 20, 2010 10:21:16 PM UTC
Comment on attachment 68135 [details] test case (tab removed) with the tab removed
Yong Li
Comment 41 Monday, September 20, 2010 10:23:35 PM UTC
When I tried manually committing it. I got this: RA layer request failed: MKACTIVITY of '/repository/webkit/!svn/act/9569dd30-c4f c-11df-b9b0-c0cdfa48311f': 403 Forbidden (http://svn.webkit.org) at C:\Git/libex ec/git-core/git-svn line 3945
Eric Seidel (no email)
Comment 42 Monday, September 20, 2010 11:06:45 PM UTC
This appears to have hung the commit-cluster again. I'm investigating. It's not your fault, it should not be possible for a patch to hang the cluster. :)
Yong Li
Comment 43 Monday, September 20, 2010 11:18:56 PM UTC
(In reply to comment #42) > This appears to have hung the commit-cluster again. I'm investigating. It's not your fault, it should not be possible for a patch to hang the cluster. :) I did type wrong user/password more than one time. I haven't pushed a commit by myself for very long time. I cannot live without CQ :)
Andrew Wilson
Comment 44 Tuesday, September 21, 2010 1:36:54 AM UTC
After this revision landed, fast/repaint/table-cell-collapsed-border.html started generating a different result and so is causing failures on the Chromium build bots. I am not certain whether the new result is still valid or not, but I suspect it's now drawing the borders in a different order and generating invalid results. Please run this test and validate whether the results are still valid - I don't want to revert your CL when it's possible we just need to rebaseline the test expectations.
Eric Seidel (no email)
Comment 45 Tuesday, September 21, 2010 1:41:07 AM UTC
Did this get landed manually and the bug not updated?
Andrew Wilson
Comment 46 Tuesday, September 21, 2010 1:51:00 AM UTC
Looks like it landed as r67862.
Eric Seidel (no email)
Comment 47 Tuesday, September 21, 2010 2:03:57 AM UTC
Closing, per above.
Yong Li
Comment 48 Tuesday, September 21, 2010 3:28:15 PM UTC
(In reply to comment #47) > Closing, per above. The test case is still not landed...
Yong Li
Comment 49 Tuesday, September 21, 2010 3:30:29 PM UTC
(In reply to comment #44) > After this revision landed, fast/repaint/table-cell-collapsed-border.html started generating a different result and so is causing failures on the Chromium build bots. I am not certain whether the new result is still valid or not, but I suspect it's now drawing the borders in a different order and generating invalid results. > > Please run this test and validate whether the results are still valid - I don't want to revert your CL when it's possible we just need to rebaseline the test expectations. The patch fixes the compare function passed to qsort. So it is not surprising that it invalidates some old test result. I will check with fast/repaint/table-cell-collapsed-border.html.
Yong Li
Comment 50 Tuesday, September 21, 2010 4:22:17 PM UTC
(In reply to comment #44) > After this revision landed, fast/repaint/table-cell-collapsed-border.html started generating a different result and so is causing failures on the Chromium build bots. I am not certain whether the new result is still valid or not, but I suspect it's now drawing the borders in a different order and generating invalid results. > > Please run this test and validate whether the results are still valid - I don't want to revert your CL when it's possible we just need to rebaseline the test expectations. I noticed even the 2 table-cell-collapsed-border-expected.png files in chromium-linux and chromium-win are different. The new result is reverse of the old result: the actual result in chromium-win build is expected result in chromium-linux, and the actual result in chromium-linux is the expected result in chromium-win. I guess it must be due to the difference between 2 qsort algorithms in handling equal items.
Andrew Wilson
Comment 51 Tuesday, September 21, 2010 5:56:34 PM UTC
OK, so I'll rebaseline the tests then. Thanks for verifying that they are correct.
Darin Adler
Comment 52 Tuesday, September 21, 2010 5:59:46 PM UTC
Ideally we don’t want test results that are different on different platforms because of qsort. Maybe we can change things around so the sort is stable?
Yong Li
Comment 53 Tuesday, September 21, 2010 6:29:19 PM UTC
(In reply to comment #52) > Ideally we don’t want test results that are different on different platforms because of qsort. Maybe we can change things around so the sort is stable? 2 ways I can see: 1. use stable_sort instead of qsort to avoid changing the order of 2 equal items 2. make compare function never return 0 Seems 2) is very hard to be done gracefully. Also in most cases there are not too many collapsed border styles to sort. So probably 1) is acceptable?
Steve Block
Comment 54 Wednesday, September 22, 2010 11:16:41 AM UTC
*** Bug 13147 has been marked as a duplicate of this bug. ***
Darin Adler
Comment 55 Wednesday, September 22, 2010 4:20:22 PM UTC
(In reply to comment #53) > 2 ways I can see: > > 1. use stable_sort instead of qsort to avoid changing the order of 2 equal items > 2. make compare function never return 0 > > Seems 2) is very hard to be done gracefully. Also in most cases there are not too many collapsed border styles to sort. So probably 1) is acceptable? I’m not sure what order the borders are in beforehand. It seems fine to use stable_sort instead of qsort if the order beforehand is something well defined. In other words, what is the ordering we want to apply to otherwise equal border styles? Some sort of document order? I guess this bug may not be the right place to discuss this, but it would be nice to get this tidy.
Yong Li
Comment 56 Thursday, September 23, 2010 1:11:55 AM UTC
(In reply to comment #55) > (In reply to comment #53) > > 2 ways I can see: > > > > 1. use stable_sort instead of qsort to avoid changing the order of 2 equal items > > 2. make compare function never return 0 > > > > Seems 2) is very hard to be done gracefully. Also in most cases there are not too many collapsed border styles to sort. So probably 1) is acceptable? > > I’m not sure what order the borders are in beforehand. It seems fine to use stable_sort instead of qsort if the order beforehand is something well defined. > > In other words, what is the ordering we want to apply to otherwise equal border styles? Some sort of document order? > Seems the pre-sort order is: 1) order of children nodes 2) for each child node: left border, right, top and then bottom
Yong Li
Comment 57 Wednesday, December 22, 2010 3:54:25 PM UTC
try the test case
WebKit Commit Bot
Comment 58 Wednesday, December 22, 2010 8:23:55 PM UTC
Comment on attachment 68135 [details] test case (tab removed) Rejecting attachment 68135 [details] from commit-queue. Failed to run "['./Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '--bot-id=cr-jail-3', 'build-and-test', '--no-clean', '--no-update', '--test', '--non-interactive']" exit_code: 2 Last 500 characters of output: sts/websocket/tests/workers ...... http/tests/workers ..... http/tests/xhtmlmp . http/tests/xmlhttprequest ............................................................................................................................................................................ http/tests/xmlhttprequest/web-apps ............... http/tests/xmlhttprequest/workers ......... 635.73s total testing time 22228 test cases (99%) succeeded 1 test case (<1%) was new 13 test cases (<1%) had stderr output Full output: http://queues.webkit.org/results/7325106
Eric Seidel (no email)
Comment 59 Wednesday, December 22, 2010 9:36:41 PM UTC
The commit-queue should give you a better message here, but your patch is missing expectation files for your new tests.
Eric Seidel (no email)
Comment 60 Wednesday, December 22, 2010 9:37:42 PM UTC
This test shoudl be dump as text. Are you sure this is a reduced test case? it looks like it might be way bigger than necessary.
Yong Li
Comment 61 Wednesday, December 22, 2010 9:58:37 PM UTC
(In reply to comment #60) > This test shoudl be dump as text. > Are you sure this is a reduced test case? it looks like it might be way bigger than necessary. Probably bigger than necessary. I used it to reprodce a crash in qsort with the old compare function. I didn't see rvct's qsort code, and just blindly tried different combinations and got a crash. Also, to test the sorting algorithm, problably the longer the better? Anyway the code has been changed from using qsort to using std::sort. Not sure if this test case is still important to us. Maybe leaving it to this bug is good enough?
Eric Seidel (no email)
Comment 62 Tuesday, January 11, 2011 11:11:05 AM UTC
I'm confused as to the status of this bug.
Note You need to log in before you can comment on or make changes to this bug.