Bug 22736

Summary:

State-of-art peephole state machine

Product:

WebKit

Reporter:

Judit Jász <jasy>

Component:

JavaScriptCore

Assignee:

Nobody <webkit-unassigned>

Status:

RESOLVED WONTFIX

Severity:

Normal

CC:

emacemac7, ggaren, oliver, ossy

Priority:

Version:

528+ (Nightly build)

Hardware:

OS:

All

Attachments:

Description	Flags
Peephole framwork with state machine	oliver: review-
performance results (on jit)	none
performance results	none
updated patch	ggaren: review-
updated patch	eric: review-
performance result on MAC	none
performance result (if JIT and WREC disabled)	none

Judit Jász

Reported 2008-12-08 07:42:01 PST

We were thinking about if there any useful way to increase the maintainability and the efficiency of the current peephole framework. The current approach is quick, but the more peepholes are included the slower the bytecode generation is. (Because the continuous checks.) We have implemented a peephole framework which works with a state machine. The idea is originated from the Tamarin. This state machine works in the following way: 1. After each bytecode is emitted the state machine changes its state according to the given bytecode. 2. If it is found a pattern candidate, it tries to apply the peephole. Otherwise it counts the states forward according to the next bytecodes. In this framework it is very easy to adding a new peephole pattern to the system. All peepholes are grouped in the same file, and in a same style. We found there is no performance regression with this new framework. What are your opinions and thoughts about this solution?

Attachments
Peephole framwork with state machine (86.19 KB, patch) 2008-12-08 10:33 PST, Judit Jász	oliver: review-	Details Formatted Diff Diff
performance results (on jit) (3.89 KB, text/plain) 2008-12-08 12:46 PST, Judit Jász	no flags	Details
performance results (3.92 KB, text/plain) 2008-12-08 12:47 PST, Judit Jász	no flags	Details
updated patch (91.10 KB, patch) 2009-01-23 09:04 PST, Judit Jász	ggaren: review-	Details Formatted Diff Diff
updated patch (77.06 KB, patch) 2009-03-11 03:23 PDT, Csaba Osztrogonác	eric: review-	Details Formatted Diff Diff
performance result on MAC (3.67 KB, text/plain) 2009-03-11 03:27 PDT, Csaba Osztrogonác	no flags	Details
performance result (if JIT and WREC disabled) (3.81 KB, text/plain) 2009-03-11 03:28 PDT, Csaba Osztrogonác	no flags	Details
View All Add attachment proposed patch, testcase, etc.

Judit Jász

Comment 1 2008-12-08 10:33:43 PST

Created attachment 25843 [details] Peephole framwork with state machine This patch introduces the peephole framework, which works with the help of the generated state tables. We can switch among the original peepholes and the peepholes of the introduced framework with the macro PEEPHOLE_OPTIMIZER.

Sam Weinig

Comment 2 2008-12-08 10:38:33 PST

What is the performance change due to this patch? Is it a win?

Judit Jász

Comment 3 2008-12-08 11:02:56 PST

(In reply to comment #2) > What is the performance change due to this patch? Is it a win? > Whit these existed peepholes there is no any significant gain (and sometimes are minimal slow down). But I think if we increase the number of peepholes there could be some performance improvement. I will send the detailed performance results soon.

Judit Jász

Comment 4 2008-12-08 12:46:01 PST

Created attachment 25850 [details] performance results (on jit)

Judit Jász

Comment 5 2008-12-08 12:47:36 PST

Created attachment 25851 [details] performance results

Oliver Hunt

Comment 6 2009-01-22 18:34:23 PST

Comment on attachment 25843 [details] Peephole framwork with state machine While this patch looks like it would be a great addition and improvement over the existing simplistic optimiser there are a number of issues that we'll need to address. The most significant of these is the use of the STL -- basically WebKit does not use STL, and for the more complex STL structures (such as vector, etc) we run the risk of mismatching malloc implementations. My other concern is that the state tables appear to be hand coded, and I would much rather we had a way to autogenerate them. Ignoring those issues the code itself looks good (bar some minor style errors) and probably just needs to be brought up to tip of tree (we do now have length info for each opcode which should remove/reduce that portion of this patch). Will CC myself to this bug so i don't miss updates. Many apologies for the huge delay in review :-(

Judit Jász

Comment 7 2009-01-23 09:04:26 PST

Created attachment 26973 [details] updated patch This patch contains the up to date version of the peephole framework patch with some new patterns. There are no significant performance differences on the SunSpider tests, but only our WindScorpion benchmark (http://www.sed.hu/webkit/?page=downloads). The performance gain is about 3% in this later case. The state tables of the peepholes are included by the peephole.cpp. This file is fully (re)generated at compilation time by the execution of the code of peephole_generator.cpp If you want to use this peephole framework, define the PEEPHOLE_OPTIMIZER macro.

Oliver Hunt

Comment 8 2009-01-23 11:06:58 PST

Comment on attachment 26973 [details] updated patch Ah ha, peephole_generator is stand alone executable i take it? Sorry, I misunderstood that in my first pass Hmmm, I'm not sure I understand the change to JITCall.cpp? --Oliver

Judit Jász

Comment 9 2009-01-23 11:53:44 PST

(In reply to comment #8) > (From update of attachment 26973 [details] [review]) > Ah ha, peephole_generator is stand alone executable i take it? Sorry, I > misunderstood that in my first pass > > Hmmm, I'm not sure I understand the change to JITCall.cpp? > > --Oliver Ok, the modification of the JITCall does not connect to the peephole framework. It is only the remainder of my earlier bug riport: https://bugs.webkit.org/show_bug.cgi?id=23497

Geoffrey Garen

Comment 10 2009-01-26 17:16:51 PST

Latest performance results still show no improvement.

Oliver Hunt

Comment 11 2009-01-26 17:28:17 PST

Geoff, while i realise there isn't a perf improvement i believe this is the direction we wish to move in future. A few things that need to occur in the patch (that i noticed over the weekend) peephole_optimiser should be renamed to PeepholeOptimiser (or rather appropriate changes to the generator need to occur to make this happen). Generation of the optimiser should be a build step -- the generated file itself shouldn't be included in the patch.

Csaba Osztrogonác

Comment 12 2009-03-11 03:23:31 PDT

Created attachment 28469 [details] updated patch I removed the previously included generated file, and renamed files suite to code style guidelines. Additionally the optimizer generation moved to a build phase, already it works on MAC, Qt, GTK and Windows platforms.

Csaba Osztrogonác

Comment 13 2009-03-11 03:27:15 PDT

Created attachment 28470 [details] performance result on MAC

Csaba Osztrogonác

Comment 14 2009-03-11 03:28:50 PDT

Created attachment 28471 [details] performance result (if JIT and WREC disabled)

Csaba Osztrogonác

Comment 15 2009-03-11 03:33:51 PDT

(In reply to comment #12) > Created an attachment (id=28469) [review] > updated patch And I forgot to mention, the second peephole pattern fixed, which had a hidden bug. (It works correctly on SunSpider and Regression tests, but appeared with another test.)

Oliver Hunt

Comment 16 2009-03-11 03:53:26 PDT

I'm not entirely convinced by this patch, while it is helpful to have a sane peephole optimiser the current tradeoff between win vs. additional complexity does not make it seem entirely worth while. It would be interesting to know what the results are if you do run-sunspider --v8 (which uses the original v8 tests) to get further performance information.

Eric Seidel (no email)

Comment 17 2009-05-21 19:04:21 PDT

Comment on attachment 28469 [details] updated patch Sounds like Oliver is waiting on v8 numbers to review this. Given that there has been no activity in this bug in 2 months, I'm marking this as r- to remove it from the review queue. You should feel encouraged to email oliver or geoff or maciej directly for further guidance as to where to take this patch. Thanks again for the patch!

Maciej Stachowiak

Comment 18 2009-05-21 22:36:46 PDT

Please reflag for review when the requested perf testing info is provided.

Csaba Osztrogonác

Comment 19 2009-12-16 23:23:17 PST

I closed it, because it is obsolete now.

Note You need to log in before you can comment on or make changes to this bug.