141046 – BinarySwitch should be faster on average

RESOLVED FIXED141046

BinarySwitch should be faster on average

https://bugs.webkit.org/show_bug.cgi?id=141046

Summary BinarySwitch should be faster on average

Filip Pizlo

Reported 2015-01-29 12:56:49 PST

I've done some messing around in Mathematica with recurrence relations representing our BinarySwitch's average performance - that is, the average number of branches you will take to get to the case you want, for varying numbers of cases. I found that we could improve on BinarySwitch's strategy to achieve better average-case performance in some situations.

Attachments
average binary switch performance (96.98 KB, image/png) 2015-01-29 13:06 PST, Filip Pizlo	no flags	Details
average binary switch performance (101.83 KB, image/png) 2015-01-29 15:28 PST, Filip Pizlo	no flags	Details
the patch (17.66 KB, patch) 2015-01-31 13:33 PST, Filip Pizlo	no flags	Details Formatted Diff Diff
the patch (17.42 KB, patch) 2015-01-31 13:36 PST, Filip Pizlo	no flags	Details Formatted Diff Diff
the patch (18.43 KB, patch) 2015-02-02 09:27 PST, Filip Pizlo	no flags	Details Formatted Diff Diff
the patch (18.28 KB, patch) 2015-02-02 09:31 PST, Filip Pizlo	andersca: review+	Details Formatted Diff Diff
Show Obsolete (4) View All Add attachment proposed patch, testcase, etc.

Filip Pizlo

Comment 1 2015-01-29 13:06:54 PST

Created attachment 245643 [details] average binary switch performance This plots the average number of branches that will be taken to get to the code for some case, assuming that cases are sparse and each case is taken with equal probability. The X axis is the total number of cases. The top blue line is our current algorithm, which special-cases 1 and 2 cases (just branches on each of them in turn, which I say is a linear branch cascade) and handles all other cases by finding the median (choosing one of the two median candidates at random for odd inputs) and then saying something like: if (value < median) recurse left else if (value == median) handle median else recurse right The red line just below it shows the performance if we also special-cases 3 cases with a linear branch cascade. This is indeed faster. Here's an example of 3 cases with the old algorithm for case values (left, middle, right): if (value < middle) { if (value == left) handle left } else if (value == middle) { handle middle } else { if (value == right) handle right } This requires 2 branches for left, 2 branches for middle, and three branches for right. If we do a linear branch cascade: if (value == left) handle left else if (value == middle) handle middle else if (value == right) handle right then we get 1 branch for left, 2 branches for middle, and three branches for right. So, on average, we have 2 branches instead of 2.33. The lowest line represents an additional optimization, where we handle even numbers of cases totally differently. Instead of picking one of the two median candidates and using the left-median-right approach where the median is directly compared against, we just branch to determine whether we lie in the left half of the cases or the right half. Let rightMedian be the value in an even-sized array that is just to the right of the formal median (i.e. it's the larger of the two median candidates). Then for even numbers of cases we do: if (value < rightMedian) recurse left else recurse right And for odd numbers of cases we still do as before: if (value < median) recurse left else if (value == median) handle median else recurse right This will sometimes save us a dramatic number of branches. I'm still exploring ways of improving this even further. The noisiness of this algorithm's performance suggests that maybe we can do even better.

Filip Pizlo

Comment 2 2015-01-29 15:28:53 PST

Created attachment 245663 [details] average binary switch performance As I suspected, you can get even better performance by simply never having a direct equality comparison against the median when bisecting. Of course this assumes that branches are the true cost and so doing one compare and two branches is as bad as doing two compare/branches. I believe that this is the case in our simple JIT backend.

Filip Pizlo

Comment 3 2015-01-29 15:52:55 PST

Something else to keep in mind: in odd cases where one side of the bisection will be one larger than the other, it totally doesn't matter if you allow unbalancing to happen. One side will end up having one more element than the other, and that side might still be odd, in which case it will also have two sides and one of those sides will be bigger. Attempting to be left-right fair doesn't improve average throughput assuming that all cases have equal probability. Still, the code I'm working on will attempt to preserve balance using randomness, to ensure that we don't end up with a deterministic pathology where the input always goes down the right side of the tree and that side is always deeper.

Filip Pizlo

Comment 4 2015-01-31 13:33:19 PST

Created attachment 245798 [details] the patch

Filip Pizlo

Comment 5 2015-01-31 13:36:33 PST

Created attachment 245799 [details] the patch Just fixed a comment.

Filip Pizlo

Comment 6 2015-02-02 09:27:34 PST

Created attachment 245882 [details] the patch Revised the comments a bit. I believe I came up with a good formal way of reasoning about the cost of taking default. We stand to shave off 1/6 branch on average if we optimized for default by reducing leafThreshold to 2, but then we'd regress by 1/6 branch on average in the case that we take a case.

Filip Pizlo

Comment 7 2015-02-02 09:31:59 PST

Created attachment 245883 [details] the patch Fix the changelog to say that I've thought about the fall through case.

Filip Pizlo

Comment 8 2015-02-02 12:55:19 PST

Landed in http://trac.webkit.org/changeset/179490

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware All

OS All

Product WebKit

Component JavaScriptCore

Assignee

Filip Pizlo

Reported

2015-01-29 12:56 PST

Modified

2015-02-02 12:55 PST History

CC List

10 users Show

URL

Keywords

Depends on

Blocks