Bug 46128

Summary: webkitpy - display web pages using a common routine
Product: WebKit Reporter: Dirk Pranke <dpranke>
Component: New BugsAssignee: Dirk Pranke <dpranke>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, atwilson, cjerdonek, eric, tony, victorw
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Other   
OS: OS X 10.5   
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch abarth: review+

Description Dirk Pranke 2010-09-20 15:02:12 PDT
webkitpy - display web pages using a common routine
Comment 1 Dirk Pranke 2010-09-20 15:03:12 PDT
Created attachment 68141 [details]
Patch
Comment 2 Dirk Pranke 2010-09-20 15:05:37 PDT
Note that this fixes an issue with the rebaselining tool as it is currently checked in, which is crashing when trying to display the results file.
Comment 3 Adam Barth 2010-09-20 16:24:20 PDT
Comment on attachment 68141 [details]
Patch

LGTM.  It's generally better to use an instance version rather than a static version of User to make it easier to write unit tests (by injecting a mock user object as a dependency), but this is a step in the right direction.
Comment 4 Dirk Pranke 2010-09-20 16:51:39 PDT
(In reply to comment #3)
> (From update of attachment 68141 [details])
> LGTM.  It's generally better to use an instance version rather than a static version of User to make it easier to write unit tests (by injecting a mock user object as a dependency), but this is a step in the right direction.

That is using an instance of User, and not a static object. I assume you mean instead that show_results_html_file() should take a user passed in as an argument; that would mean that I'd have to create a User object in the calling routine for no other purpose than to pass it to this routine, which shouldn't have to know that it uses a User at all, which feels like a layering violation.

Frankly, I'm not sure why we have have the User.open_url() call, since it's basically just a wrapper around webbrowser.open(). I only used User() because there was a comment in one of the FIXMEs that mentioned it as desirable.
Comment 5 Eric Seidel (no email) 2010-09-20 17:00:06 PDT
(In reply to comment #4)
> That is using an instance of User, and not a static object. I assume you mean instead that show_results_html_file() should take a user passed in as an argument; that would mean that I'd have to create a User object in the calling routine for no other purpose than to pass it to this routine, which shouldn't have to know that it uses a User at all, which feels like a layering violation.

In these cases, where a User object isn't accessible, we can use patterns like Executive() does.  The whole point of a User object is to abstract away interactions with the user so they can be easily mocked out for testing.

> Frankly, I'm not sure why we have have the User.open_url() call, since it's basically just a wrapper around webbrowser.open(). I only used User() because there was a comment in one of the FIXMEs that mentioned it as desirable.

For exactly the reason mentioned above.  Ease-of-mocking.  It's also generally a good idea to factor the code into different objects.  Just like how Executive does all the dealing with processes instead of having individual os.kill calls sprinkled throughout the codebase.  User handles platform-specific differences, and abstracts the whole idea of dealing with a User.

If you ever don't have a User object, but feel like you should, make a static method on User which just creates a User() and calls the instance method.
Comment 6 Dirk Pranke 2010-09-20 17:34:30 PDT
(In reply to comment #5)
> (In reply to comment #4)
> > That is using an instance of User, and not a static object. I assume you mean instead that show_results_html_file() should take a user passed in as an argument; that would mean that I'd have to create a User object in the calling routine for no other purpose than to pass it to this routine, which shouldn't have to know that it uses a User at all, which feels like a layering violation.
> 
> In these cases, where a User object isn't accessible, we can use patterns like Executive() does.  The whole point of a User object is to abstract away interactions with the user so they can be easily mocked out for testing.

I'm not 100% sure that I follow what you're saying here. Looking at the current version of executive.py, the only pattern I see that might be applicable is how executive.run_command creates an instance and then calls executive.Executive.run_command on it. Is that what you're referring to?

If so, how is executive.run_command() (or an equivalent User.static_open_url()) easier to mock than webbrowser.open() ?

> It's also generally a good idea to factor the code into different objects.  Just like how Executive does all the dealing with processes instead of having individual os.kill calls sprinkled throughout the codebase.  User handles platform-specific differences, and abstracts the whole idea of dealing with a User.

There's two reasons to factor the code - to ensure that wherever we try to do X, we do it the same way, and, where necessary, to deviate from built in implementations because the built in doesn't do what we want. os.kill() qualifies for both the first and second cases (because simply calling os.kill() doesn't work right on windows). User.open_url() -- as we're using it currently -- only falls into the first category. In this case we are simply duplicating the work the platform already does for us, which is less compelling to me. But there is still value in trying to get common behavior.

> If you ever don't have a User object, but feel like you should, make a static method on User which just creates a User() and calls the instance method.

I'm not sure I understand how that makes things better. Are you suggesting that by introducing a User.static_open_url() that wrapped User().open_url() it's easier to grep for the usages? Or is there some other benefit (apart from the first thing discussed above)?
Comment 7 Eric Seidel (no email) 2010-09-20 17:43:09 PDT
(In reply to comment #6)
> (In reply to comment #5)
> > (In reply to comment #4)
> > > That is using an instance of User, and not a static object. I assume you mean instead that show_results_html_file() should take a user passed in as an argument; that would mean that I'd have to create a User object in the calling routine for no other purpose than to pass it to this routine, which shouldn't have to know that it uses a User at all, which feels like a layering violation.
> > 
> > In these cases, where a User object isn't accessible, we can use patterns like Executive() does.  The whole point of a User object is to abstract away interactions with the user so they can be easily mocked out for testing.
> 
> I'm not 100% sure that I follow what you're saying here. Looking at the current version of executive.py, the only pattern I see that might be applicable is how executive.run_command creates an instance and then calls executive.Executive.run_command on it. Is that what you're referring to?

Yes.

> If so, how is executive.run_command() (or an equivalent User.static_open_url()) easier to mock than webbrowser.open() ?

You can replace the User object with a MockUser (see mocktool.py) and then all parts of the code are known never to block on the user.  Or even more useful is replacing the Tool object with a MockTool, and then the test never touches anything it shouldn't.  Otherwise you'd need a whole set of functions where you go and replace pieces of python before any testing.

I think this all makes sense form the webkit-patch world where there is a central non-static tool object which is passed around, and all interesting functionality is exposed off the tool or its sub objects.  Never do Command or Step objects talk directly to the OS as then they'd be hard to mock.

> > It's also generally a good idea to factor the code into different objects.  Just like how Executive does all the dealing with processes instead of having individual os.kill calls sprinkled throughout the codebase.  User handles platform-specific differences, and abstracts the whole idea of dealing with a User.
> 
> There's two reasons to factor the code - to ensure that wherever we try to do X, we do it the same way, and, where necessary, to deviate from built in implementations because the built in doesn't do what we want. os.kill() qualifies for both the first and second cases (because simply calling os.kill() doesn't work right on windows). User.open_url() -- as we're using it currently -- only falls into the first category. In this case we are simply duplicating the work the platform already does for us, which is less compelling to me. But there is still value in trying to get common behavior.
> 
> > If you ever don't have a User object, but feel like you should, make a static method on User which just creates a User() and calls the instance method.
> 
> I'm not sure I understand how that makes things better. Are you suggesting that by introducing a User.static_open_url() that wrapped User().open_url() it's easier to grep for the usages? Or is there some other benefit (apart from the first thing discussed above)?

Eventually new-run-webkit-tests should also have a Tool object (see bug 45838) and probably be a Command subclass using Steps to get its work done so as to most easily participate in webkitpy unit testing.  In that world, it makes a lot of sense to have a User object and be able to mock out all user-related functions at once instead of wondering if you've overridden the right set of webbrowser or urllib2 module functions in your unit tests.
Comment 8 Eric Seidel (no email) 2010-09-20 17:44:56 PDT
Using instances like "User" for easy mocking also makes the code more future resilient.  As when some lower-level part of webkitpy changes out from under you, you know your higher-level test is never going to start hitting servers or starting processes or whatever, because the mocks are in place.  Those lower levels are going to be using the same MockUser/MockTool you passed into your higher level test.
Comment 9 Dirk Pranke 2010-09-20 17:49:02 PDT
In case my previous comments weren't clear as to what I was saying ...

If we added a function to user.py like:

def open_url(url):
     user = User()
     user.open_url(url)

then I would be fine with calling that instead of calling User().open_url() like I am now. I agree that the latter can't be Mocked very easily, and that there is value in all of the code calling a single routine to open URLs. I'm not convinced that the routine above really makes things much clearer than just asking everyone to call webbrowser.open(), but I don't feel that strongly about it.

On the other hand, any changes to the existing run_webkit_tests code to create a User in one routine and pass it to others makes less sense to me.

I will post an updated patch and add the static routine and we can make sure we're all on the same page.
Comment 10 Dirk Pranke 2010-09-20 17:52:25 PDT
(In reply to comment #7)
> > I'm not 100% sure that I follow what you're saying here. Looking at the current version of executive.py, the only pattern I see that might be applicable is how executive.run_command creates an instance and then calls executive.Executive.run_command on it. Is that what you're referring to?
> 
> Yes.
> 
> > If so, how is executive.run_command() (or an equivalent User.static_open_url()) easier to mock than webbrowser.open() ?
> 
> You can replace the User object with a MockUser (see mocktool.py) and then all parts of the code are known never to block on the user.  Or even more useful is replacing the Tool object with a MockTool, and then the test never touches anything it shouldn't.  Otherwise you'd need a whole set of functions where you go and replace pieces of python before any testing.
> 
> I think this all makes sense form the webkit-patch world where there is a central non-static tool object which is passed around, and all interesting functionality is exposed off the tool or its sub objects.  Never do Command or Step objects talk directly to the OS as then they'd be hard to mock.
> 

Yeah, if you can assume that you have some common global objects then life is easier, I agree. NRWT doesn't have the Tool and User objects, so this gets more contorted.
Comment 11 Eric Seidel (no email) 2010-09-20 17:56:46 PDT
I don't care re: User().foo() or User.foo().

The only real advantage to User.foo() is that you can later remove it and catch all the callers in ones unit testing.  That's the plan with removing Executive.run_command() since everything should have an Executive() instance eventually.

The idea of piping User through various routines in new-run-webkit-tests isn't super-appealing.  Which is part of the idea behind the Step design.  Running layout tests really shouldn't be very complicated.  Its amazing to me what a tangled mess of code old-run-webkit-tests is (and NRWT is to some degree).  We started moving Commands towards independent execution units called Steps which pass their state between them through a state object.  Eventually I suspect NRWT may move to a similar model.  It's difficult to see a global overview of what NRWT actually does, where-as it's super easy to see what "land" is doing:

http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/webkitpy/tool/commands/download.py#L76

Part of that is made possible by having a global "talk to the rest of the world" object, which we call self.tool.  It's possible for us to test at the function level, step level, command level, or even the whole mutli-command-tool level.  For any of those we just pass our mock object down in and never have to worry about what may or may not be properly mocked in the subcomponents, because they all go through the global instances.
Comment 12 Eric Seidel (no email) 2010-09-20 17:57:48 PDT
(In reply to comment #10)
> > I think this all makes sense form the webkit-patch world where there is a central non-static tool object which is passed around, and all interesting functionality is exposed off the tool or its sub objects.  Never do Command or Step objects talk directly to the OS as then they'd be hard to mock.
> > 
> 
> Yeah, if you can assume that you have some common global objects then life is easier, I agree. NRWT doesn't have the Tool and User objects, so this gets more contorted.

Sounds like we're on a similar page.  Much of that is our fault for never having split the Tool logic out.  Command was designed to be able to be run in a single-command-tool-like-design.  We just never finished ripping Tool out from WebKitPatch so that other objects could use the Command and Step infrastructure safely.
Comment 13 Dirk Pranke 2010-09-20 18:20:53 PDT
Created attachment 68168 [details]
Patch
Comment 14 Adam Barth 2010-09-20 18:25:49 PDT
Comment on attachment 68168 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=68168&action=review

> WebKitTools/Scripts/webkitpy/common/system/user.py:53
> +def open_url(url):

We should give this some sort of pejorative name to discourage folks from using this function.  They shouldn't be using a global static.  Instead, they should be using a user instance hung off a context object.

> WebKitTools/Scripts/webkitpy/layout_tests/run_webkit_tests_unittest.py:100
> +    def setUp(self):
> +        def open_url(url):
> +            self._url_opened = url
> +        self._user_open_url = user.open_url
> +        user.open_url = open_url
> +        self._url_opened = None

Mutating global statics in unit tests is error-prone.  It's much better to pass in the dependencies explicitly so you can inject them explicitly too.
Comment 15 Eric Seidel (no email) 2010-09-20 18:27:18 PDT
We also often use OutputCapture for this sort of test.

MockUser() writes to stdout whenever something tries to open a URL.  Which is what OutputCapture can verify by calling assert_outputs with your expected stdout, which would include the MOCK: open_url http:://myurl.com/ message.
Comment 16 Dirk Pranke 2010-09-20 18:32:02 PDT
(In reply to comment #14)
> (From update of attachment 68168 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=68168&action=review
> 
> > WebKitTools/Scripts/webkitpy/common/system/user.py:53
> > +def open_url(url):
> 
> We should give this some sort of pejorative name to discourage folks from using this function.  They shouldn't be using a global static.  Instead, they should be using a user instance hung off a context object.

Did you have any perjorative names in mind?

> > WebKitTools/Scripts/webkitpy/layout_tests/run_webkit_tests_unittest.py:100
> > +    def setUp(self):
> > +        def open_url(url):
> > +            self._url_opened = url
> > +        self._user_open_url = user.open_url
> > +        user.open_url = open_url
> > +        self._url_opened = None
> 
> Mutating global statics in unit tests is error-prone.  It's much better to pass in the dependencies explicitly so you can inject them explicitly too.

I generally agree that it's better to pass in dependencies than to mutate globals. In my experience, though, this pattern has been pretty much 100% reliable; do have some reason to think the above might fail?

As discussed earlier, it may make sense eventually to have run_webkit_tests create or get passed a User object early on, but there isn't one now and so I think this'll have to do until such time.

Are you suggesting some other change instead?
Comment 17 Adam Barth 2010-09-20 18:37:14 PDT
Comment on attachment 68168 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=68168&action=review

>>> WebKitTools/Scripts/webkitpy/common/system/user.py:53
>>> +def open_url(url):
>> 
>> We should give this some sort of pejorative name to discourage folks from using this function.  They shouldn't be using a global static.  Instead, they should be using a user instance hung off a context object.
> 
> Did you have any perjorative names in mind?

deprecated_open_url ?

>>> WebKitTools/Scripts/webkitpy/layout_tests/run_webkit_tests_unittest.py:100
>>> +        self._url_opened = None
>> 
>> Mutating global statics in unit tests is error-prone.  It's much better to pass in the dependencies explicitly so you can inject them explicitly too.
> 
> I generally agree that it's better to pass in dependencies than to mutate globals. In my experience, though, this pattern has been pretty much 100% reliable; do have some reason to think the above might fail?
> 
> As discussed earlier, it may make sense eventually to have run_webkit_tests create or get passed a User object early on, but there isn't one now and so I think this'll have to do until such time.
> 
> Are you suggesting some other change instead?

I'd just skip these changes to the unittests.
Comment 18 Dirk Pranke 2010-09-20 18:43:05 PDT
(In reply to comment #11)
> 
> The idea of piping User through various routines in new-run-webkit-tests isn't super-appealing.
>

We need to pipe in some global context object, though, right?

  Which is part of the idea behind the Step design.  Running layout tests really shouldn't be very complicated.  Its amazing to me what a tangled mess of code old-run-webkit-tests is (and NRWT is to some degree).  We started moving Commands towards independent execution units called Steps which pass their state between them through a state object.  Eventually I suspect NRWT may move to a similar model.  It's difficult to see a global overview of what NRWT actually does, where-as it's super easy to see what "land" is doing:
> 
> http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/webkitpy/tool/commands/download.py#L76
> 
> Part of that is made possible by having a global "talk to the rest of the world" object, which we call self.tool.  It's possible for us to test at the function level, step level, command level, or even the whole mutli-command-tool level.  For any of those we just pass our mock object down in and never have to worry about what may or may not be properly mocked in the subcomponents, because they all go through the global instances.

NRWT is gradually getting cleaner, but I suspect that at least part of the reason that you find "land" easy to follow is that you're much more familiar with the Tool/Step/Command infrastructure than you are with NRWT.

I haven't yet spent enough time with that code to really understand it, and I find it difficult to follow what's going on at any more complicated level than "what steps are run" - e.g., how is state propagated from step to step? How are options parsed? Etc.

It may be that once I do get my head wrapped around it, it will be clearer overall. Dunno yet :)

At any rate, your design of having your global mock object is pretty similar to my design for the Port object, for what sounds like much the same goals. I think perhaps this is why you've been confused as to why I had so many things hanging off Port in the past - you wanted them to be on some other global object I didn't know existed.
Comment 19 Dirk Pranke 2010-09-20 18:44:36 PDT
(In reply to comment #17)
> (From update of attachment 68168 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=68168&action=review
> 
> >>> WebKitTools/Scripts/webkitpy/common/system/user.py:53
> >>> +def open_url(url):
> >> 
> >> We should give this some sort of pejorative name to discourage folks from using this function.  They shouldn't be using a global static.  Instead, they should be using a user instance hung off a context object.
> > 
> > Did you have any perjorative names in mind?
> 
> deprecated_open_url ?
> 

Okay.

> >>> WebKitTools/Scripts/webkitpy/layout_tests/run_webkit_tests_unittest.py:100
> >>> +        self._url_opened = None
> >> 
> >> Mutating global statics in unit tests is error-prone.  It's much better to pass in the dependencies explicitly so you can inject them explicitly too.
> > 
> > I generally agree that it's better to pass in dependencies than to mutate globals. In my experience, though, this pattern has been pretty much 100% reliable; do have some reason to think the above might fail?
> > 
> > As discussed earlier, it may make sense eventually to have run_webkit_tests create or get passed a User object early on, but there isn't one now and so I think this'll have to do until such time.
> > 
> > Are you suggesting some other change instead?
> 
> I'd just skip these changes to the unittests.

No can do. It turns out that if I don't mock out user.open_url, then the person running the tests gets a whole bunch of pages opened in his/her browser. Also, I can't easily test that the open call is in fact getting called, which is kind of important (yes, the tests weren't doing that earlier, but that's not a good thing).
Comment 20 Adam Barth 2010-09-20 18:48:22 PDT
Comment on attachment 68168 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=68168&action=review

>>>>> WebKitTools/Scripts/webkitpy/layout_tests/run_webkit_tests_unittest.py:100
>>>>> +        self._url_opened = None
>>>> 
>>>> Mutating global statics in unit tests is error-prone.  It's much better to pass in the dependencies explicitly so you can inject them explicitly too.
>>> 
>>> I generally agree that it's better to pass in dependencies than to mutate globals. In my experience, though, this pattern has been pretty much 100% reliable; do have some reason to think the above might fail?
>>> 
>>> As discussed earlier, it may make sense eventually to have run_webkit_tests create or get passed a User object early on, but there isn't one now and so I think this'll have to do until such time.
>>> 
>>> Are you suggesting some other change instead?
>> 
>> I'd just skip these changes to the unittests.
> 
> No can do. It turns out that if I don't mock out user.open_url, then the person running the tests gets a whole bunch of pages opened in his/her browser. Also, I can't easily test that the open call is in fact getting called, which is kind of important (yes, the tests weren't doing that earlier, but that's not a good thing).

It sounds like you're fixing the architectural problems in the wrong order then.  You first need to make it so you can inject dependencies and the create the dependencies to inject.
Comment 21 Dirk Pranke 2010-09-20 18:59:15 PDT
> It sounds like you're fixing the architectural problems in the wrong order then.  You first need to make it so you can inject dependencies and the create the dependencies to inject.

Like I said above, I've not found any difficulties to occur in practice if you do mutate global statics for testing purposes, so I don't know that I would really call it an "architectural problem". This is a pattern that's used a fair amount in the unit tests for NRWT. I personally find it preferable to introducing new objects and interfaces *solely* to be able to mock things out.

I've already suggested that it may make sense to evaluate refactoring NRWT on top of the other webkitpy objects, but I would prefer not to have to do this before I can make this simple change, which fixes bugs, increases code reuse, and gets better test coverage.

I will note that both rebaseline-chromium-webkit-tests and new-run-webkit-tests on the webkit ports are currently broken without this fix, so this isn't an idle thing ... Given that, do you still want me to hold off on this change? I can always fix the bug differently without actually trying to move in the right direction.
Comment 22 Adam Barth 2010-09-20 19:09:12 PDT
I feel I've given you my perspective.  One thing you can do without re-writing the world is hang the User instance off the port and then have fake port sub in the MockUser object.
Comment 23 Dirk Pranke 2010-09-20 19:29:48 PDT
(In reply to comment #22)
> I feel I've given you my perspective.  One thing you can do without re-writing the world is hang the User instance off the port and then have fake port sub in the MockUser object.

Sorry, my last comment came off as more confrontational than I intended :(

To clarify: you do not want to R+ a patch that modifies the global statics, so you would prefer it if I abandoned that approach?

If that's true, then I think there's no point in adding the User.open_url() method, since that can't be mocked, and I essentially have to do something to propagate a user object in to this routine, right,
in order to keep from actually opening windows?

You're suggesting that a way to do this is to have ports create actual User() objects, and the fake port create a MockUser() object, and then I would modify the unit tests to grep captured stdout to see if MockUser.open_url() was called?
Comment 24 Adam Barth 2010-09-20 19:34:53 PDT
That sounds like it would work.  Alternatively, you could find another reviewer who thinks the other approach is better.
Comment 25 Dirk Pranke 2010-09-20 21:06:44 PDT
Created attachment 68184 [details]
Patch
Comment 26 Dirk Pranke 2010-09-20 21:08:02 PDT
Okay, trying again, this time with a MockUser and no mutated statics. I didn't use the MockUser object you have in ../common (a) because I don't really understand it yet and this was easier and (b) I wanted to actually verify the URL cleanly without having to grep output, so this seemed like a reasonable tradeoff.
Comment 27 Dirk Pranke 2010-09-20 21:08:46 PDT
There's probably some additional cleanup we can do to assume that certain keyword arguments are getting passed and things are getting set up, but I'll leave that for some other patch.
Comment 28 Dirk Pranke 2010-09-21 12:34:16 PDT
Created attachment 68275 [details]
Patch
Comment 29 Adam Barth 2010-09-21 16:35:56 PDT
Comment on attachment 68275 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=68275&action=review

This looks good.  I'm sorry that I don't have time to review it in complete detail, but this looks like a good compromise for now.

> WebKitTools/Scripts/webkitpy/layout_tests/port/base.py:90
> +        self._name = kwargs.get('port_name', None)
> +        self._options = kwargs.get('options', None)
> +        self._executive = kwargs.get('executive', Executive())
> +        self._user = kwargs.get('user', User())

You can just name these in the argument list and let **kwargs grab the rest.
Comment 30 Dirk Pranke 2010-09-21 16:44:13 PDT
(In reply to comment #29)
> (From update of attachment 68275 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=68275&action=review
> 
> This looks good.  I'm sorry that I don't have time to review it in complete detail, but this looks like a good compromise for now.
>

Thanks! Hopefully soon I'll switch this ove to the Tool/Step infrastructure and we'll use real User and Executive objects. 
 
> > WebKitTools/Scripts/webkitpy/layout_tests/port/base.py:90
> > +        self._name = kwargs.get('port_name', None)
> > +        self._options = kwargs.get('options', None)
> > +        self._executive = kwargs.get('executive', Executive())
> > +        self._user = kwargs.get('user', User())
> 
> You can just name these in the argument list and let **kwargs grab the rest.

True, but I intentionally deleted those two parameters and moved to an all-keyword approach because the args are all optional and this simplifies the interface to the constructor. This way we don't have to change the signature in every constructor whenever we want to propagate a new parameter from the caller to the base classes.
Comment 31 Dirk Pranke 2010-09-21 19:25:05 PDT
Committed r68008: <http://trac.webkit.org/changeset/68008>