Bug 51091

Summary: [NRWT] Rewrite current test_types/* classes completely.
Product: WebKit Reporter: Hayato Ito <hayato>
Component: Tools / TestsAssignee: Hayato Ito <hayato>
Status: RESOLVED FIXED    
Severity: Normal CC: dpranke, eric, koz, mihaip, ojan, tony
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: OS X 10.5   
Bug Depends on: 53004, 53063, 53071, 53908, 55123    
Bug Blocks: 36065    
Attachments:
Description Flags
patch-in-progress
none
patch-in-progress2 none

Description Hayato Ito 2010-12-14 20:32:14 PST
As I and Dirk disscussed in chat, it might be better to rewrite current test_types/* classes completely.

The reasons are:

1) All of the test_types/* classes are pretty kludgy at this point.
   They try to do too many things in test_type.compare_output() method, like
    - regenerate baselines.
    - copy the output around, etc..

2) The current class hieralchy does't give us significant benefits at this point.
   There are only two subclasses, TextDiff and ImageDiff. It is difficult to add new test type because compare_output() method has many side effects and doesn't have clear responsibility what subclass should implement in this method.

I think more simple approach is better than the current test_type/* approach.
Comment 1 Hayato Ito 2011-01-14 01:10:06 PST
Created attachment 78904 [details]
patch-in-progress
Comment 2 Hayato Ito 2011-01-14 01:10:56 PST
Hi Dirk and other reviews,

I posted a patch for early reviews.
Although the patch itself is incomplete, I'd like to get early feedbacks to know whether you guys like this approach or not.
Comment 3 Dirk Pranke 2011-01-17 18:32:29 PST
Comment on attachment 78904 [details]
patch-in-progress

View in context: https://bugs.webkit.org/attachment.cgi?id=78904&action=review

Generally, I think this patch looks great, and will be a big improvement. Thanks for working on this! I have a few minor comments, but nothing too substantial.

> Tools/Scripts/webkitpy/layout_tests/layout_package/dump_render_tree_thread.py:75
> +def _run_single_test(port, options, test_input, driver, worker_name):

I am tempted to pull this into single_test_runner.py and make the SingleTestRunner object private to that file.

> Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:1
> +# Copyright (C) 2010 Google Inc. All rights reserved.

I'm not wild about the name "single_test_runner", but I have yet to think of anything better. I'll keep thinking about it.

> Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:36
> +_log = logging.getLogger("webkitpy.layout_tests.layout_package.single_test_runner")

You can actually use _log = logging.getLogger(__name__) here.

> Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:118
> +        driver_output = self._driver.run_test(self._driver_input())

I think you probably will want to pull everything except the actual invocation of run_test() into a separate routine, to make things easier to write unit tests for, e,g, def process_output(self, driver_output) .

> Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:127
> +        driver_output = self._driver.run_test(self._driver_input())

you're calling run_test() twice ... this is probably a cut&paste error.

> Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:161
> +        self._write_results_to_result_driectory(failures, driver_output, expected_driver_output)

"_drietct" ... spelling error ;)

> Tools/Scripts/webkitpy/layout_tests/layout_package/test_runner.py:949
>          self._print_statistics_for_test_timings(

It sure seems like we can get rid of all of this debugging stats tracking. Maybe ping ojan to see if he thinks it is still valuable?

> Tools/Scripts/webkitpy/layout_tests/port/base.py:810
> +class DriverInput:

moving TestInput, TestOutput here seems like a good thing to do.

> Tools/Scripts/webkitpy/layout_tests/run_webkit_tests.py:280
> +        # FIXME: Remove "--reset-results" option.

--reset-results and --new-baseline are not the same thing. One writes into the platform directory, and one overwrites the existing baselines, wherever they are (platform or generic). Arguably we need both types of behavior, and I think this patch is breaking --reset-results (the one that overwrites the existing baselines regardless of their location).

It would be nice if these things had better names, though. --reset-results is kind of vague, and --new-baseline doesn't tell me if the baseline is generic or platform-specific. Maybe we want something like

--overwrite-existing-baseline  (or --update-existing-baselines)
--new-platform-specific-baseline  (awfully long, though)

We should also update the parameter name in save_baseline_data() accordingly.

Note that --old-run-webkit-tests has:
--add-platform-exceptions
--new-test-results
--reset-results

as well.
Comment 4 Hayato Ito 2011-01-20 05:33:35 PST
Created attachment 79587 [details]
patch-in-progress2
Comment 5 Hayato Ito 2011-01-20 05:50:17 PST
Hi Dirk,

Thank you for your time reviewing this in-progress patch. I'm glad to hear that this patch looks good for you.

(In reply to comment #3)
> 
> > Tools/Scripts/webkitpy/layout_tests/layout_package/dump_render_tree_thread.py:75
> > +def _run_single_test(port, options, test_input, driver, worker_name):
> 
> I am tempted to pull this into single_test_runner.py and make the SingleTestRunner object private to that file.

Nice point. Done.

> 
> > Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:1
> > +# Copyright (C) 2010 Google Inc. All rights reserved.
> 
> I'm not wild about the name "single_test_runner", but I have yet to think of anything better. I'll keep thinking about it.

single_test_runner is not good name. I don't have a beter name yet :)
test_executer? driver_runner_and_output_comparator? Hmm. I am still thinking.

> 
> > Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:36
> > +_log = logging.getLogger("webkitpy.layout_tests.layout_package.single_test_runner")
> 
> You can actually use _log = logging.getLogger(__name__) here.

Thank you. Done.

> 
> > Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:118
> > +        driver_output = self._driver.run_test(self._driver_input())
> 
> I think you probably will want to pull everything except the actual invocation of run_test() into a separate routine, to make things easier to write unit tests for, e,g, def process_output(self, driver_output) .

Thank you. I've refactored run() into separate routines.

> 
> > Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:127
> > +        driver_output = self._driver.run_test(self._driver_input())
> 
> you're calling run_test() twice ... this is probably a cut&paste error.

My bad. Thank you. Done.

> 
> > Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:161
> > +        self._write_results_to_result_driectory(failures, driver_output, expected_driver_output)
> 
> "_drietct" ... spelling error ;)

Thank you. Fixed.

> 
> > Tools/Scripts/webkitpy/layout_tests/layout_package/test_runner.py:949
> >          self._print_statistics_for_test_timings(
> 
> It sure seems like we can get rid of all of this debugging stats tracking. Maybe ping ojan to see if he thinks it is still valuable?

Although I am not sure whether this debugging tracking for each 'TEST_TYPE' is still valuable or not, I've removed all tracking code which tracks stats per TEST_TYPE. I'll ask ojan whether it's ok or not.

> 
> > Tools/Scripts/webkitpy/layout_tests/port/base.py:810
> > +class DriverInput:
> 
> moving TestInput, TestOutput here seems like a good thing to do.
> 
> > Tools/Scripts/webkitpy/layout_tests/run_webkit_tests.py:280
> > +        # FIXME: Remove "--reset-results" option.
> 
> --reset-results and --new-baseline are not the same thing. One writes into the platform directory, and one overwrites the existing baselines, wherever they are (platform or generic). Arguably we need both types of behavior, and I think this patch is breaking --reset-results (the one that overwrites the existing baselines regardless of their location).
> 
> It would be nice if these things had better names, though. --reset-results is kind of vague, and --new-baseline doesn't tell me if the baseline is generic or platform-specific. Maybe we want something like
> 
> --overwrite-existing-baseline  (or --update-existing-baselines)
> --new-platform-specific-baseline  (awfully long, though)
> 
> We should also update the parameter name in save_baseline_data() accordingly.
> 
> Note that --old-run-webkit-tests has:
> --add-platform-exceptions
> --new-test-results
> --reset-results
> 
> as well.


Thank you. I am sorry that I missed the difference between '--reset-results' and '--new-baseline'.
I reverted all change related with these options. Now the patch should not break '--reset-results'.
Comment 6 Eric Seidel (no email) 2011-01-20 13:47:14 PST
Comment on attachment 79587 [details]
patch-in-progress2

View in context: https://bugs.webkit.org/attachment.cgi?id=79587&action=review

> Tools/ChangeLog:9
> +        kludgy at this point. That class hieralchy doesn't give us significant

hieralchy?

> Tools/ChangeLog:16
> +        So I just introduced single_test_runner, which simply does all compare tasks,
> +        text and image compares.
> +        I also introduced test_result_writer which takes a responsibility of
> +        writing a test result to the layout test result directory.

I think we need to better explain what this patch is doing.  It's not yet clear to me what this patch is trying to accomplish.  I agree with you that the existing test_type system is kludgy, but I'd like to better understand your new system.  Better explanation in the ChangeLog will help that. :)

> Tools/Scripts/webkitpy/layout_tests/layout_package/test_results.py:41
> +    def __init__(self, filename, failures=None, test_run_time=None):

Did you fix all callers?

> Tools/Scripts/webkitpy/layout_tests/port/chromium.py:68
> +    # FIXME: This is a temporary fix.

This comment isn't very clear.

> Tools/Scripts/webkitpy/layout_tests/port/chromium.py:359
> +        # FIXME: Temporary fix for review.
> +        self._filesystem = port._filesystem

Again, not clear.
Comment 7 Eric Seidel (no email) 2011-01-20 13:48:04 PST
Comment on attachment 79587 [details]
patch-in-progress2

Small (<20k) patches are also easier to review than 50k ones. :)

Removing r? since this is marked as "in-progress".
Comment 8 Dirk Pranke 2011-01-20 16:36:51 PST
This is my take on things; hopefully Ito-san can chime in with whether this is also his take.

1) A lot of the logic that used to be in the test_types/* classes has moved into the port/* classes. What is left is now just confusing and hard to follow or extend.

2) Much of the logic in dump_render_tree_thread is also hard to follow and it mixes the logic of actually running DRT, passing it a set of input, getting a set of output, and comparing the output against the baselines with the logic of managing sets of tests. It would be good to split this logic out. The NRWT multiprocessing code does a lot of this, by creating the "Worker" abstraction, but it would be good to split out the "actually do the test" abstraction as well, which is what I think the "single_test_runner" object does.

I think this is all good stuff. Now, a couple of other thoughts ...

(1) The port/base.py module was already large and is getting larger with this patch. 

(2) We're starting to get lots of small relatively-fine-grained objects, and I'm a bit worried that it will be hard to keep track of which class has which responsibilitiies.

That said, the actual Driver class is pretty thin. Maybe it makes sense to combine single_test_runner and Driver and move them into a separate file?

Ideally the port/* modules do not depend on the layout_package/* modules (or anything else under webkitpy.layout_tests). I think this patch gets us pretty close to this, but we still have the test_failures class. I wonder if it makes sense to move that into port/* as well?
Comment 9 Hayato Ito 2011-01-20 20:29:53 PST
Hi Eric, Dirk,
Thank you for reviews.


> > Tools/ChangeLog:16
> > +        So I just introduced single_test_runner, which simply does all compare tasks,
> > +        text and image compares.
> > +        I also introduced test_result_writer which takes a responsibility of
> > +        writing a test result to the layout test result directory.
> 
> I think we need to better explain what this patch is doing.  It's not yet clear to me what this patch is trying to accomplish.  I agree with you that the existing test_type system is kludgy, but I'd like to better understand your new system.  Better explanation in the ChangeLog will help that. :)

The one of the motivations in this patch is that we can add reftests support more easily, but this patch is not only for reftests, but also for making new-run-webkit-tests code clean and more understandable.

I agree current *in-progress* ChangeLog entry is not clear for everyone. I'll update ChangeLog to reflect my intension. 


> 
> > Tools/Scripts/webkitpy/layout_tests/layout_package/test_results.py:41
> > +    def __init__(self, filename, failures=None, test_run_time=None):
> 
> Did you fix all callers?

Yes, I believe so.

> 
> > Tools/Scripts/webkitpy/layout_tests/port/chromium.py:68
> > +    # FIXME: This is a temporary fix.
> 
> This comment isn't very clear.
> 
> > Tools/Scripts/webkitpy/layout_tests/port/chromium.py:359
> > +        # FIXME: Temporary fix for review.
> > +        self._filesystem = port._filesystem
> 
> Again, not clear.

Sorry for confusing. This 'filesystem' fix is needed to run new_run_webkit_tests on chromium-linux as of yesterday after I merged WebKit HEAD. WebKit HEAD seemed to be broken when running new_run_webkit_tests on chromium-linux. I should have mentioned it. I'll fix it in other patch.

(In reply to comment #8)
> This is my take on things; hopefully Ito-san can chime in with whether this is also his take.
> 
> 1) A lot of the logic that used to be in the test_types/* classes has moved into the port/* classes. What is left is now just confusing and hard to follow or extend.
> 
> 2) Much of the logic in dump_render_tree_thread is also hard to follow and it mixes the logic of actually running DRT, passing it a set of input, getting a set of output, and comparing the output against the baselines with the logic of managing sets of tests. It would be good to split this logic out. The NRWT multiprocessing code does a lot of this, by creating the "Worker" abstraction, but it would be good to split out the "actually do the test" abstraction as well, which is what I think the "single_test_runner" object does.
> 
> I think this is all good stuff. Now, a couple of other thoughts ...
> 
> (1) The port/base.py module was already large and is getting larger with this patch. 
> (2) We're starting to get lots of small relatively-fine-grained objects, and I'm a bit worried that it will be hard to keep track of which class has which responsibilitiies.
> 
> That said, the actual Driver class is pretty thin. Maybe it makes sense to combine single_test_runner and Driver and move them into a separate file?

At first glace, I am not sure that it is good to merge single_test_runner with Driver.  I think that Driver should be thin and is used only for getting image dump (and text dump). But I'll keep on thinking on it. At least, separating Driver into a separe file sounds good for me. 

> 
> Ideally the port/* modules do not depend on the layout_package/* modules (or anything else under webkitpy.layout_tests). I think this patch gets us pretty close to this, but we still have the test_failures class. I wonder if it makes sense to move that into port/* as well?

TestFailuers is one of the things which frustrates me. :)   I'd like to use these TestFailures as is in this refactoring process until we have clear ideas. 

Now I believe I got early feedback, I continue to polish this patch. Maybe I'll separate this patch into several small patches for easy reviews.

Again, thank you for feedbacks, that's always helpful for me.
Comment 10 Hayato Ito 2011-01-24 06:17:03 PST
Hi Eric, Dirk,

I am separating this patch into smaller patches as you suggested.
The one is here: https://bugs.webkit.org/show_bug.cgi?id=53004

Could you review that patch?
Comment 11 Hayato Ito 2011-01-25 00:45:40 PST
I’ve been separeting this patch into smaller patches. So I'd like to use this bugzilla entry as a master bug.

So far, I've made the following patches:

1. https://bugs.webkit.org/show_bug.cgi?id=53004
2. https://bugs.webkit.org/show_bug.cgi?id=53063
3. https://bugs.webkit.org/show_bug.cgi?id=53071

I only set the review flag(r?) if that can be applied to WebKit HEAD. Other patches cannot be applied to WebKit HEAD and depends on the preceeding patches.
Comment 12 Hayato Ito 2011-02-28 20:18:15 PST
All dependents bugs were landed. Closing this bug.