RESOLVED FIXED Bug 113635
fourthTier: FTL JIT should be able to compile the Marsaglia random number generator
https://bugs.webkit.org/show_bug.cgi?id=113635
Filip Pizlo
Reported 2013-03-29 21:45:54 PDT
This is a very simple PRNG that we ought to be able to compile:

    function marsaglia(m_z, m_w, n) {
        var result;
        for (var i = 0; i < n; ++i) {
            m_z = (36969 * (m_z & 65535) + (m_z >> 16)) | 0;
            m_w = (18000 * (m_w & 65535) + (m_w >> 16)) | 0;
            result = ((m_z << 16) + m_w) | 0;
        }
        return result;
    }

    var result = 0;
    for (var i = 0; i < 100; ++i)
        result += marsaglia(i, i + 1, 1000000);
    print(result);
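[Editor's note: for reference, the generator above is Marsaglia's multiply-with-carry (MWC) scheme with two 16-bit streams. A self-contained sketch of the same recurrence, wrapped as a seeded generator, might look like the following; mapping the combined 32-bit result to [0, 1) is an illustrative addition, not part of the benchmark.]

```javascript
// Marsaglia multiply-with-carry, same recurrence as the benchmark above.
// makeMarsaglia is a hypothetical wrapper name; the division by 2^32 to
// get a float in [0, 1) is an assumption for illustration only.
function makeMarsaglia(m_z, m_w) {
    return function () {
        m_z = (36969 * (m_z & 65535) + (m_z >> 16)) | 0;
        m_w = (18000 * (m_w & 65535) + (m_w >> 16)) | 0;
        var result = ((m_z << 16) + m_w) | 0;
        // result is a signed 32-bit int; shift it into [0, 2^32), then scale.
        return (result + 2147483648) / 4294967296;
    };
}

var rng = makeMarsaglia(1, 2);
var x = rng(); // deterministic: the same seeds always give the same sequence
```

Because every step is forced through `| 0`, the whole inner loop stays in int32 arithmetic, which is exactly what makes it a good early target for an optimizing tier.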
Attachments
work in progress (21.48 KB, patch)
2013-03-29 23:06 PDT, Filip Pizlo
no flags
the patch (35.78 KB, patch)
2013-03-29 23:52 PDT, Filip Pizlo
oliver: review+
Filip Pizlo
Comment 1 2013-03-29 23:06:35 PDT
Created attachment 195843 [details] work in progress
Filip Pizlo
Comment 2 2013-03-29 23:52:26 PDT
Created attachment 195844 [details] the patch
Oliver Hunt
Comment 3 2013-03-30 16:53:04 PDT
Comment on attachment 195844 [details] the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=195844&action=review

> Source/JavaScriptCore/ChangeLog:14
> + The Marsaglia function runs ~60% faster with FTL, than DFG. Not a terrible start.

Does the code look sane?
Oliver Hunt
Comment 4 2013-03-30 16:54:15 PDT
Comment on attachment 195844 [details] the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=195844&action=review

> LayoutTests/fast/js/regress/script-tests/marsaglia.js:15
> +print(result);

Oh, this should be debug(result) I think, as it occurs to me that in-browser print() brings up the print dialog (obviously :D )
Mark Hahnenberg
Comment 5 2013-03-30 16:57:32 PDT
Comment on attachment 195844 [details] the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=195844&action=review

>> Source/JavaScriptCore/ChangeLog:14
>> + The Marsaglia function runs ~60% faster with FTL, than DFG. Not a terrible start.
>
> Does the code look sane?

And when you say 60% faster, is that for the full 100 iterations in your original benchmark code or comparing the two once both of them fully tier up?
Filip Pizlo
Comment 6 2013-03-30 16:57:51 PDT
(In reply to comment #4)
> (From update of attachment 195844 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=195844&action=review
>
> > LayoutTests/fast/js/regress/script-tests/marsaglia.js:15
> > +print(result);
>
> Oh, this should be debug(result) I think, as it occurs to me that in-browser print() brings up the print dialog (obviously :D )

Ooooops!! Haha, thanks for the catch.
Filip Pizlo
Comment 7 2013-03-30 16:58:29 PDT
(In reply to comment #3)
> (From update of attachment 195844 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=195844&action=review
>
> > Source/JavaScriptCore/ChangeLog:14
> > + The Marsaglia function runs ~60% faster with FTL, than DFG. Not a terrible start.
>
> Does the code look sane?

The LLVM IR looks sane. I'm still doing the wiring to allow us to actually disassemble the things that LLVM generates.
Filip Pizlo
Comment 8 2013-03-30 16:58:53 PDT
(In reply to comment #5)
> (From update of attachment 195844 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=195844&action=review
>
> >> Source/JavaScriptCore/ChangeLog:14
> >> + The Marsaglia function runs ~60% faster with FTL, than DFG. Not a terrible start.
> >
> > Does the code look sane?
>
> And when you say 60% faster, is that for the full 100 iterations in your original benchmark code or comparing the two once both of them fully tier up?

Yup.
Mark Hahnenberg
Comment 9 2013-03-30 17:00:31 PDT
> > And when you say 60% faster, is that for the full 100 iterations in your original benchmark code or comparing the two once both of them fully tier up?
>
> Yup.

I'll assume you meant "yup" to the first option :-P
Filip Pizlo
Comment 10 2013-03-30 17:02:21 PDT
(In reply to comment #9)
> > > And when you say 60% faster, is that for the full 100 iterations in your original benchmark code or comparing the two once both of them fully tier up?
> >
> > Yup.
>
> I'll assume you meant "yup" to the first option :-P

Actually, I don't understand the question. What is the difference between "the full 100 iterations in your original benchmark" and "comparing the two once both of them fully tier up"? I'm just running the program, as it exists in the benchmark I'm landing in JSRegress, either with FTL turned on or with FTL turned off. Both runs involve tier-ups. Both runs include compile time. Both runs include 100 iterations, by virtue of the fact that the program has a loop that calls marsaglia() 100 times.
Filip Pizlo
Comment 11 2013-03-30 17:03:36 PDT
Mark Hahnenberg
Comment 12 2013-03-30 17:08:56 PDT
> Actually, I don't understand the question. What is the difference between "the full 100 iterations in your original benchmark" and "comparing the two once both of them fully tier up"?

Sorry, it was kind of a vague question. You could imagine comparing the two with "tier-up time" not included in their times: you'd wait some amount of time until both tiered all the way up, then test how long it takes each to run n iterations. I would imagine this would make the FTL look even better :-) I know our harness isn't really set up for that, but I didn't know if you had cooked up something custom.
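[Editor's note: the warm-up-then-measure approach described above could be sketched as follows. This is a hypothetical harness for illustration, not WebKit's JSRegress harness; `measureSteadyState` and its parameters are invented names.]

```javascript
// Hypothetical steady-state measurement: run warm-up iterations first so
// the VM has time to tier up fully, then time only the measured iterations.
// This excludes interpreter/baseline time and JIT compile time from the
// reported number, unlike timing the whole process with 'time'.
function measureSteadyState(fn, warmupIters, measuredIters) {
    for (var i = 0; i < warmupIters; ++i)
        fn();                                     // warm-up: let the JIT tier up
    var start = Date.now();
    for (var j = 0; j < measuredIters; ++j)
        fn();                                     // measured: steady-state only
    return (Date.now() - start) / measuredIters;  // ms per iteration
}
```

Comparing two configurations this way isolates steady-state throughput, at the cost of hiding compile-time differences between tiers.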
Filip Pizlo
Comment 13 2013-03-30 17:15:22 PDT
(In reply to comment #12)
> > Actually, I don't understand the question. What is the difference between "the full 100 iterations in your original benchmark" and "comparing the two once both of them fully tier up"?
>
> Sorry, it was kind of a vague question. You could imagine comparing the two with "tier up time" not included in their times. So you'd wait some amount of time until both tiered all the way up, then test how long it takes each to run n iterations. I would imagine this would make the FTL look even better :-) I know our harness isn't really setup for that, but I didn't know if you had cooked up something custom.

Nope, I included the tier-up times. I actually didn't even run in the harness, since the harness still makes it super awkward to pass environment variables to one configuration (you did some things to fix that, I think, but I don't remember and anyway I didn't feel like using my brain to think or eyes to read). I just used 'time' on the command line.

So these measurements include:

- The time it takes for the 'jsc' tool to start.
- The time it takes to run in the interpreter and baseline JIT prior to tier-up.
- The time it takes to compile the benchmark and tier up all the way to either DFG or FTL.
- The time it takes to actually run once tiered up.

One way to gauge the tier-up times is to increase, or decrease, the running time of the benchmark and see how it affects the speed-up. I just tried increasing it to 10x longer and the speed-up went to 68% instead of 60%. This implies that only a very small fraction of the 0.22 sec FTL total run-time includes compilation.

Basically what I'm seeing consistently when I mess around with the Battlestar is that it actually takes significantly less time to launch than the nay-sayers claimed.
Geoffrey Garen
Comment 14 2013-04-01 10:54:57 PDT
> Basically what I'm seeing consistently when I mess around with the Battlestar is that it actually takes significantly less time to launch than the nay-sayers claimed.

Launch times are much shorter when you start from low Earth orbit!