Bug 71237 - BaselineOptimizer tests should use mocks instead of real Executive/FileSystem objects
Summary: BaselineOptimizer tests should use mocks instead of real Executive/FileSystem...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Eric Seidel (no email)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-10-31 14:12 PDT by Eric Seidel (no email)
Modified: 2022-02-27 23:47 PST (History)
4 users (show)

See Also:


Attachments
Patch (8.80 KB, patch)
2011-10-31 14:16 PDT, Eric Seidel (no email)
no flags Details | Formatted Diff | Diff
Make our PortFactory mocking slightly more explicit (9.43 KB, patch)
2011-10-31 14:53 PDT, Eric Seidel (no email)
abarth: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Seidel (no email) 2011-10-31 14:12:24 PDT
BaselineOptimizer tests should use mocks instead of real Executive/FileSystem objects
Comment 1 Eric Seidel (no email) 2011-10-31 14:16:32 PDT
Created attachment 113086 [details]
Patch
Comment 2 Dirk Pranke 2011-10-31 14:31:13 PDT
Comment on attachment 113086 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=113086&action=review

> Tools/Scripts/webkitpy/common/checkout/baselineoptimizer.py:-34
> -def _baseline_search_hypergraph(fs):

I think it is better to be explicit about the objects that you depend on, rather than looking for a global context-like object, like host.

> Tools/Scripts/webkitpy/common/checkout/baselineoptimizer.py:47
> +        port = port_factory.get(port_name)

Why aren't you passing the filesystem to the port any more?

> Tools/Scripts/webkitpy/common/checkout/baselineoptimizer.py:67
> +    def __init__(self, host):

Same comment as above, I think it is better to be explicit about the objects that you depend on, rather than looking for a global context-like object, like host.

> Tools/Scripts/webkitpy/common/checkout/baselineoptimizer_unittest.py:39
> +        BaselineOptimizer.__init__(self, MockHost())

Same comment.
Comment 3 Eric Seidel (no email) 2011-10-31 14:42:29 PDT
(In reply to comment #2)
> (From update of attachment 113086 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=113086&action=review
> Same comment as above, I think it is better to be explicit about the objects that you depend on, rather than looking for a global context-like object, like host.

Unfortunately, that gets us into the trouble we have now.  Since Port is explicit about what it depends on, and you are expected to pass it an Executive, User, and FileSystem, parts of code which fail to do so, aren't mockable.  Part of this is the confusion of Port with some sort of Host object.  Passing Host here, makes it easy to mock these calls.  If they need to talk out to the OS, they do so via Host.  Whether that's via Executive, Filesystem or User objects.
Comment 4 Eric Seidel (no email) 2011-10-31 14:44:15 PDT
(In reply to comment #2)
> (From update of attachment 113086 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=113086&action=review
> > Tools/Scripts/webkitpy/common/checkout/baselineoptimizer.py:47
> > +        port = port_factory.get(port_name)
> 
> Why aren't you passing the filesystem to the port any more?

It's no longer needed.  It just happens that port_factory is the same name as the module used to be imported as.  But now it's a PortFactory instance, which contains all the information needed to wire up a Port with the proper Executive, User, etc.  Previously we only had a FileSystem, so we passed that, since that was better than nothing.  Now we have a Host, and better yet, we have a PortFactory from that host, so we just get the port we want, knowing that it will be properly wired up with the right Executive, etc.
Comment 5 Eric Seidel (no email) 2011-10-31 14:53:48 PDT
Created attachment 113087 [details]
Make our PortFactory mocking slightly more explicit
Comment 6 Dirk Pranke 2011-10-31 14:55:40 PDT
(In reply to comment #3)
> (In reply to comment #2)
> > (From update of attachment 113086 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=113086&action=review
> > Same comment as above, I think it is better to be explicit about the objects that you depend on, rather than looking for a global context-like object, like host.
> 
> Unfortunately, that gets us into the trouble we have now.  Since Port is explicit about what it depends on, and you are expected to pass it an Executive, User, and FileSystem, parts of code which fail to do so, aren't mockable.  Part of this is the confusion of Port with some sort of Host object.  Passing Host here, makes it easy to mock these calls.  If they need to talk out to the OS, they do so via Host.  Whether that's via Executive, Filesystem or User objects.

I understand what you're aiming towards; the problem is that the MockHost includes a whole bunch of other objects as well (a SCM object, a Bugzilla object, etc.), and by switching to that, you lose visibility into the fact that you *only* need an executive, a user, and a filesystem.

Perhaps part of the problem is that it's not clear to me what a "host" is supposed to be. Is it supposed to truly be just things like those three objects (i.e., pointers to stuff in common/system), or a bigger catch-all class? If the former, I'm okay with that (and agree that it will be good to split this out from Port). But I'm not okay with the latter.

(In reply to comment #4)
> (In reply to comment #2)
> > (From update of attachment 113086 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=113086&action=review
> > > Tools/Scripts/webkitpy/common/checkout/baselineoptimizer.py:47
> > > +        port = port_factory.get(port_name)
> > 
> > Why aren't you passing the filesystem to the port any more?
> 
> It's no longer needed.  

Ah, sorry, I missed the insertion on line 45. I follow it now.
Comment 7 Dirk Pranke 2011-10-31 14:58:08 PDT
Comment on attachment 113087 [details]
Make our PortFactory mocking slightly more explicit

View in context: https://bugs.webkit.org/attachment.cgi?id=113087&action=review

> Tools/Scripts/webkitpy/common/checkout/baselineoptimizer_unittest.py:41
> +        host.port_factory = PortFactory(host)  # We want a real PortFactory, but it should use mocks.

Okay, I'm lost again. Why does the host have a port_factory? That seems like a layering inversion.
Comment 8 Eric Seidel (no email) 2011-10-31 17:26:17 PDT
(In reply to comment #7)
> (From update of attachment 113087 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=113087&action=review
> 
> > Tools/Scripts/webkitpy/common/checkout/baselineoptimizer_unittest.py:41
> > +        host.port_factory = PortFactory(host)  # We want a real PortFactory, but it should use mocks.
> 
> Okay, I'm lost again. Why does the host have a port_factory? That seems like a layering inversion.

Host has historically just been where all the Mock-able singletons have gone.  PortFactory is one of them.  We could certainly split Host and have an additional layer for higher-level mockable singletons.
Comment 9 Dirk Pranke 2011-10-31 17:48:53 PDT
(In reply to comment #8)
> (In reply to comment #7)
> > (From update of attachment 113087 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=113087&action=review
> > 
> > > Tools/Scripts/webkitpy/common/checkout/baselineoptimizer_unittest.py:41
> > > +        host.port_factory = PortFactory(host)  # We want a real PortFactory, but it should use mocks.
> > 
> > Okay, I'm lost again. Why does the host have a port_factory? That seems like a layering inversion.
> 
> Host has historically just been where all the Mock-able singletons have gone.  PortFactory is one of them.  We could certainly split Host and have an additional layer for higher-level mockable singletons.

Okay, I am definitely opposed to having one catch-all object that contains singletons from multiple layers. I would probably be okay with a version of this change that had a "host" object that contained singletons only for things in common/system, since that is a clear layer and set of functionality, although I still don't think I prefer that to requiring the specific objects that are needed.

It would be good if we separated out the concept of "hosts" from "mockable objects", though. I think that is making things unnecessarily confusing (to me, at least).
Comment 10 Eric Seidel (no email) 2011-10-31 18:35:19 PDT
So if I'm looking for this same functionality, of having BaselineOptimizer not actually touch the filesystem (or launch processes) while running the unittests, how would you compose the change?
Comment 11 Eric Seidel (no email) 2011-10-31 18:38:49 PDT
(In reply to comment #9)
> Okay, I am definitely opposed to having one catch-all object that contains singletons from multiple layers. I would probably be okay with a version of this change that had a "host" object that contained singletons only for things in common/system, since that is a clear layer and set of functionality, although I still don't think I prefer that to requiring the specific objects that are needed.

I think the layers are at best il-defined.  Port for example, is supposed to be rather low on the dependency chain, yet it's the "god object" for NRWT.  Port holds NRWT's Executive and Filesystem.  We'd like to fix that, but for now, when instantiating port objects we have to be very careful to pass them the current FileSystem/Executive/User or the unittests touch the real filesystem.

This change is about making BaselineOptimizer pass the Host along to the Port, effectively, through the PortFactory (since that's the preferred way to make ports).  The real layering violation of which you speak is that Port is the Host for NRWT.  Until we fix that we basically need to pass a Host through any object which intends to create Port objects.
Comment 12 Dirk Pranke 2011-10-31 18:48:04 PDT
(In reply to comment #10)
> So if I'm looking for this same functionality, of having BaselineOptimizer not actually touch the filesystem (or launch processes) while running the unittests, how would you compose the change?

Assuming I am following the code properly, I would pass a MockExecutive object and a MockFilesystem object to the TestBaselineOptimizer constructor, since the port will use a real user but we probably don't care about that. If I was being pedantic/thorough, I would pass a MockUser as well. I would not use the PortFactory object, since it takes host objects but those are not defined outside of the MockHost object. Alternatively, would create a default host object in common/system/host, and then allow myself to use MockHost. 

I don't think the (MockHost) host object should have a member called port_factory. That seems like it is just inviting trouble and isn't needed in this case, but maybe I'm missing something?
Comment 13 Dirk Pranke 2011-10-31 18:58:13 PDT
(In reply to comment #11)
> (In reply to comment #9)
> > Okay, I am definitely opposed to having one catch-all object that contains singletons from multiple layers. I would probably be okay with a version of this change that had a "host" object that contained singletons only for things in common/system, since that is a clear layer and set of functionality, although I still don't think I prefer that to requiring the specific objects that are needed.
> 
> I think the layers are at best il-defined.  Port for example, is supposed to be rather low on the dependency chain, yet it's the "god object" for NRWT.  Port holds NRWT's Executive and Filesystem.  We'd like to fix that, but for now, when instantiating port objects we have to be very careful to pass them the current FileSystem/Executive/User or the unittests touch the real filesystem.

I think we agree that the layers are ill-defined and where we eventually want to be; we might just be disagreeing over the path to get there. 

I would be happy to see filesystem, executive, and user objects be mandatory in the port constructors, and for them to be explicitly passed to most if not all other constructors under webkitpy.layout_tests; I would also -- as I attempted to describe above -- be happy to see a common.system-level "host" replace those three params (I put "host" in quotes here because I don't want to confuse this with the MockHost object that currently holds a bunch of other mocks as well). I don't think such a system-level host wants or needs a handle to a port_factory object, but maybe I am wrong?

I'm not positive, but I think I've removed most of the places where we need to retrieve a filesystem from the port elsewhere in NRWT. It shouldn't be too hard to remove the few others that remain. There may be a few places that need to retrieve port.executive, but those shouldn't be too hard to fix, either. If you don't get to this before then, I would be happy to try and put together a patch to that effect tomorrow.
Comment 14 Adam Barth 2011-11-02 12:13:27 PDT
> Perhaps part of the problem is that it's not clear to me what a "host" is supposed to be.

In webkitpy, our design philosophy has been to avoid instantiating object from the global static scope when those objects interact with the outside environment (e.g., create processes, touch the file system, or interact with the network).

The "host" object is just an object to carry these dependencies around explicitly.  Rather then instantiating these objects from the global static scope, we retrieve them from the host.  The goal of those design is to make it easier to be able to write tests that do not have dependencies on the outside world.

This approach has worked very well for us, at least for code that was developed as part of webkitpy.  Code that we've imported from elsewhere often hasn't been designed with this philosophy in mind.  In the case of NRWT, we've retrofitted some of this idea onto the code using the Port abstraction, but currently the Port abstraction is too broad.  There's a lot of code in NRWT that gets things from Port that have nothing to do with ports of WebKit.

Over time, we'd like to integrate NRWT better with the rest of webkitpy, which means infusing the Host object throughout NRWT.  I haven't looked at this patch specifically, but from reading the comments, that appears to be what Eric is doing here.

There's certainly a tradeoff in this design between using Python's import mechanism to create clean abstraction layers and removing dependencies on statics.  Ideally the language would let give us an import-like mechanism for controlling the names available in the host namespace in the same way that it lets us control the names available in the global scope.  Unfortunately, Python doesn't provide this facility, so we pay some cost for this design.  Over time, we might want to break Host down into sub objects for each layer, which might make it easier to catch layering violation in code review.

> I would be happy to try and put together a patch to that effect tomorrow.

I didn't see this patch in my bugmail.  (Although, admittedly, I did have hundreds of bugmails to wade through.)
Comment 15 Adam Barth 2011-11-02 12:17:43 PDT
Comment on attachment 113087 [details]
Make our PortFactory mocking slightly more explicit

View in context: https://bugs.webkit.org/attachment.cgi?id=113087&action=review

> Tools/Scripts/webkitpy/common/checkout/baselineoptimizer_unittest.py:35
> +from webkitpy.tool.mocktool import MockHost

We should move MockHost to common now that we've moved Host to common.  As written, Dirk is correct that importing tool packages in common is a layering violation.

> Tools/Scripts/webkitpy/common/checkout/baselineoptimizer_unittest.py:59
> +        host.port_factory = PortFactory(host)  # We want a real PortFactory, but it should use mocks.

Perhaps that's what the default port_factory on MockHost should be?

> Tools/Scripts/webkitpy/common/checkout/baselineoptimizer_unittest.py:139
> +        # FIXME: What functionality does this rely on?  When can we remove this if?

I should have made a more detailed comment.  This can probably be uncommented if we've succeeded in injecting all the dependencies.

> Tools/Scripts/webkitpy/tool/commands/rebaseline.py:119
> -        self._baseline_optimizer = BaselineOptimizer(tool.scm(), tool.filesystem)
> +        self._baseline_optimizer = BaselineOptimizer(tool)

We should do a global search-and-replace for tool -> host.  It's way past time for that.

> Tools/Scripts/webkitpy/tool/mocktool.py:842
> -class MockTool(object):
> +class MockHost(object):

This should move to common.
Comment 16 Dirk Pranke 2011-11-02 12:43:47 PDT
((In reply to comment #14)
> > Perhaps part of the problem is that it's not clear to me what a "host" is supposed to be.
> 
> In webkitpy, our design philosophy has been to avoid instantiating object from the global static scope when those objects interact with the outside environment (e.g., create processes, touch the file system, or interact with the network).
> 

I certainly agree with this design philosophy. 

> The "host" object is just an object to carry these dependencies around explicitly.  Rather then instantiating these objects from the global static scope, we retrieve them from the host.  The goal of those design is to make it easier to be able to write tests that do not have dependencies on the outside world.
>

I definitely agree with the second sentence of this. I am less enthusiastic about the first sentence, because a using a single global context, while convenient, can obscure what the true dependencies are. 

What I was attempting to get at with my comment was that I didn't know what the type signature of the host was supposed to be (i.e., what the interface is). What things should be hung off the host, and what shouldn't?

> There's certainly a tradeoff in this design between using Python's import mechanism to create clean abstraction layers and removing dependencies on statics.  Ideally the language would let give us an import-like mechanism for controlling the names available in the host namespace in the same way that it lets us control the names available in the global scope.  Unfortunately, Python doesn't provide this facility, so we pay some cost for this design.  Over time, we might want to break Host down into sub objects for each layer, which might make it easier to catch layering violation in code review.
> 

Given that the Port class already has the mechanisms needed to catch layering violations (by taking the filesystem, user, executive objects directly), I don't want to lose that. That is what I am objecting to.

If we had some concept of a "SystemHost" (for lack of a better word) that was being passed to the port, that would be better. The generic Host could then contain a system_host as a field, rather than containing a filesystem, executive, etc. Alternatively (and less satisfactorily), the Host could derive from SystemHost. 

> > I would be happy to try and put together a patch to that effect tomorrow.
> 
> I didn't see this patch in my bugmail.  (Although, admittedly, I did have hundreds of bugmails to wade through.)

Yeah, sorry, I was occupied with my primary task these days and didn't have time for it. I will try to get to it today, but I don't believe that such a patch is holding up anything on this bug.
Comment 17 Adam Barth 2011-11-02 13:01:50 PDT
> What I was attempting to get at with my comment was that I didn't know what the type signature of the host was supposed to be (i.e., what the interface is). What things should be hung off the host, and what shouldn't?

Generally, any object that reads or writes the environment (e.g., disk, network) should be obtained from the Host, either directly (as a property of the host) or from an object that was obtained from the host.  For example, we could move SCM from Host to Checkout, which we get from the host.

> Given that the Port class already has the mechanisms needed to catch layering violations (by taking the filesystem, user, executive objects directly), I don't want to lose that. That is what I am objecting to.

As part of this process, we'd like to reduce the scope of things the Port needs to worry about.  Today the Port object is concerned with a great many things that are unrelated to ports of WebKit.  These concerns should be separated.  That net result of that process is that much of the code that today holds a Port object will in the future hold a Host object.  If it needs the Port, it can get it from the Host.

> If we had some concept of a "SystemHost" (for lack of a better word) that was being passed to the port, that would be better. The generic Host could then contain a system_host as a field, rather than containing a filesystem, executive, etc. Alternatively (and less satisfactorily), the Host could derive from SystemHost. 

Yeah, in the future, we'll probably want to break the Host object down by layers.  However, I don't think we should block this patch on that refactoring.

> Yeah, sorry, I was occupied with my primary task these days and didn't have time for it. I will try to get to it today, but I don't believe that such a patch is holding up anything on this bug.

Code should not be obtaining the Executive from the Port.  The executive facilities of the environment are not related to ports of WebKit.  Rather, that code should get the executive from the Host, which will require plumbing the Host object deeper into NRWT.

Changing code to get the Executive from the Port might be worth doing, but we can also wait until Host is more prevalent throughout NRWT.
Comment 18 Dirk Pranke 2011-11-02 14:34:09 PDT
(In reply to comment #17)
> > What I was attempting to get at with my comment was that I didn't know what the type signature of the host was supposed to be (i.e., what the interface is). What things should be hung off the host, and what shouldn't?
> 
> Generally, any object that reads or writes the environment (e.g., disk, network) should be obtained from the Host, either directly (as a property of the host) or from an object that was obtained from the host.  For example, we could move SCM from Host to Checkout, which we get from the host.
> 

To clarify, by "network" do you mean something like comon.system.url_fetcher or common.net.bugzilla.? I would normally call the latter a service, and the "net" is something of a misnomer (but terminology is not all that important as long as we understand each other). 

As I said before, Port implementations can depend on stuff in webkitpy.common.system but it's not clear to me that anything else should be allowed, and I am loathe to propagate coding patterns that will allow other dependencies to creep in unnoticed.
 
> As part of this process, we'd like to reduce the scope of things the Port needs to worry about.  Today the Port object is concerned with a great many things that are unrelated to ports of WebKit.  These concerns should be separated.  That net result of that process is that much of the code that today holds a Port object will in the future hold a Host object.  If it needs the Port, it can get it from the Host.

I have filed bug 71403 to track what we think should be split out from the Port object. While I believe that we all agree on the general claim, I would be very surprised if we agreed on the details. Please take a look and comment there :)

> Code should not be obtaining the Executive from the Port.  The executive facilities of the environment are not related to ports of WebKit. 

Agreed. At least under webkitpy.layout_tests, today no code does this. A perhaps slightly better example is the filesystem, which is referenced in a couple places and is the thing I was talking about fixing in the separate patch.

>  Rather, that code should get the executive from the Host, which will require plumbing the Host object deeper into NRWT.

I think this is the part that we disagree about somewhat. 

I would say that code that needs an executive should be passed an executive, not passed a host from which it can get an executive. Accordingly, NRWT doesn't need to know about Hosts at all. That said, I obviously don't want to get into a situation where you have to pass 20 arguments to every constructor, so some form of aggregation is appropriate. I just don't believe in the "one Host to rule them all" aggregate.
Comment 19 Adam Barth 2011-11-02 14:56:04 PDT
> To clarify, by "network" do you mean something like comon.system.url_fetcher or common.net.bugzilla.? I would normally call the latter a service, and the "net" is something of a misnomer (but terminology is not all that important as long as we understand each other). 

Ideally, bugzilla would be built on top of a mockable abstraction of the network.  Unfortunately, it's not, so using that class directly causes side effects.

> I would say that code that needs an executive should be passed an executive, not passed a host from which it can get an executive.

Yes.  We disagree about that point.  The design we've been using in webkitpy is to supply a Host in most cases.  In some cases where it's clear only an executive will be needed, we pass just the Executive, but that's not the common case.
Comment 20 Eric Seidel (no email) 2011-11-02 15:25:38 PDT
I'm going to land this and start work on more cleanup.  I'm glad we had this discussion!
Comment 21 Eric Seidel (no email) 2011-11-02 15:26:09 PDT
Committed r99106: <http://trac.webkit.org/changeset/99106>