Bug 113635 - fourthTier: FTL JIT should be able to compile the Marsaglia random number generator
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore
Version: 528+ (Nightly build)
Hardware: All
OS: All
Importance: P2 Normal
Assignee: Filip Pizlo
Blocks: 112840
Reported: 2013-03-29 21:45 PDT by Filip Pizlo
Modified: 2013-04-01 10:54 PDT
CC: 7 users


Attachments
work in progress (21.48 KB, patch)
2013-03-29 23:06 PDT, Filip Pizlo
no flags
the patch (35.78 KB, patch)
2013-03-29 23:52 PDT, Filip Pizlo
oliver: review+

Description Filip Pizlo 2013-03-29 21:45:54 PDT
This is a very simple PRNG that we ought to be able to compile:

function marsaglia(m_z, m_w, n) {
    var result;
    for (var i = 0; i < n; ++i) {
        m_z = (36969 * (m_z & 65535) + (m_z >> 16)) | 0;
        m_w = (18000 * (m_w & 65535) + (m_w >> 16)) | 0;
        result = ((m_z << 16) + m_w) | 0;
    }
    return result;
}

var result = 0;
for (var i = 0; i < 100; ++i)
    result += marsaglia(i, i + 1, 1000000);

print(result);
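For readers unfamiliar with the recurrence, here is an annotated restatement of the function above (the arithmetic is unchanged; the comments are mine). Each of `m_z` and `m_w` is one multiply-with-carry stream: the low 16 bits hold the state and the high 16 bits hold the previous step's carry.

```javascript
// Annotated restatement of Marsaglia's multiply-with-carry generator.
function marsaglia(m_z, m_w, n) {
    var result;
    for (var i = 0; i < n; ++i) {
        m_z = (36969 * (m_z & 65535) + (m_z >> 16)) | 0; // state * 36969 + carry
        m_w = (18000 * (m_w & 65535) + (m_w >> 16)) | 0; // second MWC stream
        result = ((m_z << 16) + m_w) | 0;                // combine both streams
    }
    return result;
}
```

Note that `&`, `>>`, `<<`, and `| 0` all coerce to 32-bit integers, which is exactly what makes this loop a good integer-arithmetic stress test for a compiler tier. The output is fully deterministic in the seeds.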
Comment 1 Filip Pizlo 2013-03-29 23:06:35 PDT
Created attachment 195843 [details]
work in progress
Comment 2 Filip Pizlo 2013-03-29 23:52:26 PDT
Created attachment 195844 [details]
the patch
Comment 3 Oliver Hunt 2013-03-30 16:53:04 PDT
Comment on attachment 195844 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=195844&action=review

> Source/JavaScriptCore/ChangeLog:14
> +        The Marsaglia function runs ~60% faster with FTL, than DFG. Not a terrible start.

Does the code look sane?
Comment 4 Oliver Hunt 2013-03-30 16:54:15 PDT
Comment on attachment 195844 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=195844&action=review

> LayoutTests/fast/js/regress/script-tests/marsaglia.js:15
> +print(result);

Oh, this should be debug(result) I think, as it occurs to me that in-browser print() brings up the print dialog (obviously :D )
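For what it's worth, a hypothetical host-agnostic output helper could sidestep the pitfall described above (this shim is my own sketch, not part of the patch): the jsc shell exposes a global `print()`, WebKit's js-test harness provides `debug()`, and a plain browser page has neither, but does have `console.log` — while its `window.print()` opens the print dialog.

```javascript
// Hypothetical portability shim (not in the patch): pick whichever output
// hook this host actually provides, preferring the harness's debug().
var report =
    (typeof debug === 'function') ? debug :
    (typeof window === 'undefined' && typeof print === 'function') ? print :
    console.log.bind(console);

report(42); // safe in shell, test harness, and browser alike
```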
Comment 5 Mark Hahnenberg 2013-03-30 16:57:32 PDT
Comment on attachment 195844 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=195844&action=review

>> Source/JavaScriptCore/ChangeLog:14
>> +        The Marsaglia function runs ~60% faster with FTL, than DFG. Not a terrible start.
> 
> Does the code look sane?

And when you say 60% faster, is that for the full 100 iterations in your original benchmark code or comparing the two once both of them fully tier up?
Comment 6 Filip Pizlo 2013-03-30 16:57:51 PDT
(In reply to comment #4)
> (From update of attachment 195844 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=195844&action=review
> 
> > LayoutTests/fast/js/regress/script-tests/marsaglia.js:15
> > +print(result);
> 
> Oh, this should be debug(result) I think, as it occurs to me that in-browser print() brings up the print dialog (obviously :D )

Ooooops!!  Haha, thanks for the catch.
Comment 7 Filip Pizlo 2013-03-30 16:58:29 PDT
(In reply to comment #3)
> (From update of attachment 195844 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=195844&action=review
> 
> > Source/JavaScriptCore/ChangeLog:14
> > +        The Marsaglia function runs ~60% faster with FTL, than DFG. Not a terrible start.
> 
> Does the code look sane?

The LLVM IR looks sane.

I'm still doing the wiring to allow us to actually disassemble the things that LLVM generates.
Comment 8 Filip Pizlo 2013-03-30 16:58:53 PDT
(In reply to comment #5)
> (From update of attachment 195844 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=195844&action=review
> 
> >> Source/JavaScriptCore/ChangeLog:14
> >> +        The Marsaglia function runs ~60% faster with FTL, than DFG. Not a terrible start.
> > 
> > Does the code look sane?
> 
> And when you say 60% faster, is that for the full 100 iterations in your original benchmark code or comparing the two once both of them fully tier up?

Yup.
Comment 9 Mark Hahnenberg 2013-03-30 17:00:31 PDT
> > And when you say 60% faster, is that for the full 100 iterations in your original benchmark code or comparing the two once both of them fully tier up?
> 
> Yup.

I'll assume you meant "yup" to the first option :-P
Comment 10 Filip Pizlo 2013-03-30 17:02:21 PDT
(In reply to comment #9)
> > > And when you say 60% faster, is that for the full 100 iterations in your original benchmark code or comparing the two once both of them fully tier up?
> > 
> > Yup.
> 
> I'll assume you meant "yup" to the first option :-P

Actually, I don't understand the question.  What is the difference between "the full 100 iterations in your original benchmark" and "comparing the two once both of them fully tier up"?

I'm just running the program, as it exists in the benchmark I'm landing in JSRegress, either with FTL turned on, or with FTL turned off.  Both runs involve tier-ups.  Both runs include compile time.  Both runs include 100 iterations, by virtue of the fact that the program has a loop that calls marsaglia() 100 times.
Comment 11 Filip Pizlo 2013-03-30 17:03:36 PDT
Landed in http://trac.webkit.org/changeset/147283
Comment 12 Mark Hahnenberg 2013-03-30 17:08:56 PDT
> Actually, I don't understand the question.  What is the difference between "the full 100 iterations in your original benchmark" and "comparing the two once both of them fully tier up"?

Sorry, it was kind of a vague question. You could imagine comparing the two with "tier up time" not included in their times. So you'd wait some amount of time until both tiered all the way up, then test how long it takes each to run n iterations. I would imagine this would make the FTL look even better :-) I know our harness isn't really set up for that, but I didn't know if you had cooked up something custom.
Comment 13 Filip Pizlo 2013-03-30 17:15:22 PDT
(In reply to comment #12)
> > Actually, I don't understand the question.  What is the difference between "the full 100 iterations in your original benchmark" and "comparing the two once both of them fully tier up"?
> 
> Sorry, it was kind of a vague question. You could imagine comparing the two with "tier up time" not included in their times. So you'd wait some amount of time until both tiered all the way up, then test how long it takes each to run n iterations. I would imagine this would make the FTL look even better :-) I know our harness isn't really set up for that, but I didn't know if you had cooked up something custom.

Nope, I included the tier-up times.

I actually didn't even run in the harness - since the harness still makes it super awkward to pass environment variables to one configuration (you did some things to fix that, I think, but I don't remember and anyways I didn't feel like using my brain to think or eyes to read).  I just used 'time' on the command-line.

So these measurements include:

- The time it takes for the 'jsc' tool to start.
- The time it takes to run in the interpreter and baseline JIT prior to tier-up.
- The time it takes to compile the benchmark and tier up all the way to either DFG or FTL.
- The time it takes to actually run once tiered up.

One way to gauge the tier-up times is to increase or decrease the running time of the benchmark and see how that affects the speed-up. I just tried increasing it to 10x longer and the speed-up went to 68% instead of 60%. This implies that compilation accounts for only a very small fraction of the 0.22 sec total FTL run-time.
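Those two data points can be turned into a rough estimate of the fixed overhead. The following back-of-envelope model is my own, not from the bug: it assumes the FTL run is a fixed one-time cost C (startup + compilation) plus a cost R that scales linearly with the workload, and that the DFG baseline scales linearly too — at best an approximation.

```javascript
// Estimate what fraction of the 1x FTL run is fixed overhead, given the
// measured speed-ups at the 1x and 10x workloads. Normalize DFG time at
// the 1x workload to 1, so FTL times are C + R and C + 10R respectively.
function compileFraction(speedup1x, speedup10x) {
    var total1x = 1 / speedup1x;         // C + R
    var total10x = 10 / speedup10x;      // C + 10R
    var R = (total10x - total1x) / 9;    // per-1x-workload running cost
    var C = total1x - R;                 // fixed startup + compile cost
    return C / total1x;                  // overhead fraction of the 1x run
}

var frac = compileFraction(1.60, 1.68);
// Under these (strong) linearity assumptions, frac comes out around 5%,
// consistent with compilation being a very small slice of the 0.22 sec run.
```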

Basically what I'm seeing consistently when I mess around with the Battlestar is that it actually takes significantly less time to launch than the nay-sayers claimed.
Comment 14 Geoffrey Garen 2013-04-01 10:54:57 PDT
> Basically what I'm seeing consistently when I mess around with the Battlestar is that it actually takes significantly less time to launch than the nay-sayers claimed.

Launch times are much shorter when you start from low Earth orbit!