Bug 54070 - nrwt multiprocessing: implement basic support for threads and processes
Summary: nrwt multiprocessing: implement basic support for threads and processes
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Other OS X 10.5
: P2 Normal
Assignee: Dirk Pranke
URL:
Keywords:
Depends on: 54068
Blocks: 49566 54071
  Show dependency treegraph
 
Reported: 2011-02-08 23:24 PST by Dirk Pranke
Modified: 2011-02-13 02:19 PST (History)
6 users (show)

See Also:


Attachments
Patch (15.38 KB, patch)
2011-02-08 23:30 PST, Dirk Pranke
no flags Details | Formatted Diff | Diff
update w/ review feedback from tony, mihai, merge in changes from 54068 (16.12 KB, patch)
2011-02-11 13:35 PST, Dirk Pranke
no flags Details | Formatted Diff | Diff
delegate to objects for concurrency in *WorkerConnection rather than mixing them in (16.12 KB, patch)
2011-02-11 16:51 PST, Dirk Pranke
no flags Details | Formatted Diff | Diff
incremental diff of the prior two attachments, to show only the changes necessary to use delegates instead of mixins (4.28 KB, patch)
2011-02-11 16:54 PST, Dirk Pranke
no flags Details | Formatted Diff | Diff
can't pickle nested classes (16.04 KB, patch)
2011-02-11 17:14 PST, Dirk Pranke
no flags Details | Formatted Diff | Diff
update w/ additional feedback from tony (16.09 KB, patch)
2011-02-11 17:21 PST, Dirk Pranke
tony: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Dirk Pranke 2011-02-08 23:24:41 PST
nrwt multiprocessing: implement basic support for threads and processes
Comment 1 Dirk Pranke 2011-02-08 23:30:49 PST
Created attachment 81758 [details]
Patch
Comment 2 Mihai Parparita 2011-02-11 11:39:57 PST
Comment on attachment 81758 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=81758&action=review

> Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker.py:195
> +        return _ThreadedWorker(self._broker, self._port)

There seems to be naming mismatch here (worker connection vs. worker). I realize that may be because connections are supposed to be facades, but perhaps we should reconsider the "connection" name? In the Chromium multi-process scheme, facades for an object in another process are known as *Host (e.g. RenderView and RenderViewHost). Perhaps we should use the same naming here?

> Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker_unittest.py:122
> +    # Methods to implement the Manager side of the ClientInterface

ClientInterface is now BrokerClient. Also, can you make _TestsMixin actually subclass BrokerClient, so that the relationship is more obvious?

Also, these header-style comments take up a bit too much room (not sure if we have a convention for them in webkitpy)

> Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker_unittest.py:201
> +    def queue(self):

Is this (or the other queue() implementation below) necessary anymore? The queue class is now determined in based on the passed in worker model in manager_worker_broker.get.
Comment 3 Tony Chang 2011-02-11 11:49:11 PST
Comment on attachment 81758 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=81758&action=review

> Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker.py:223
> +    def start(self):
> +        raise NotImplementedError

start() seems like a confusing name when there's also BrokerConneciton.run_message_loop().  What's the difference?  Also, how is it different from run()?

> Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker.py:252
> +class _ThreadedWorker(threading.Thread, _WorkerConnection):
> +    def __init__(self, broker, port):

This looks like multiple inheritance, no?  For example, both threading.Thread and _WorkerConnection seem to have a run() method.  This doesn't feel like a mixin to me . . .

> Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker_unittest.py:123
> +    #
> +    # Methods to implement the Manager side of the ClientInterface
> +    #

These 3 line comments are weird.  They should be single line docstrings.
Comment 4 Dirk Pranke 2011-02-11 12:56:35 PST
(In reply to comment #2)
> (From update of attachment 81758 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=81758&action=review
> 
> > Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker.py:195
> > +        return _ThreadedWorker(self._broker, self._port)
> 
> There seems to be naming mismatch here (worker connection vs. worker). I realize that may be because connections are supposed to be facades, but perhaps we should reconsider the "connection" name? In the Chromium multi-process scheme, facades for an object in another process are known as *Host (e.g. RenderView and RenderViewHost). Perhaps we should use the same naming here?
> 

It might be less confusing if _{Inline,Threaded,MultiProcess}Worker were named *WorkerConnection instead. I intentionally didn't use *Host because the connection objects live in the same process/thread as the objects they are connecting to the broker. I'll rename them to *Connection for now.
 
> > Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker_unittest.py:122
> > +    # Methods to implement the Manager side of the ClientInterface
> 
> ClientInterface is now BrokerClient. Also, can you make _TestsMixin actually subclass BrokerClient, so that the relationship is more obvious?
>

Sure.
 
> Also, these header-style comments take up a bit too much room (not sure if we have a convention for them in webkitpy)
>

I can delete them.
 
> > Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker_unittest.py:201
> > +    def queue(self):
> 
> Is this (or the other queue() implementation below) necessary anymore? The queue class is now determined in based on the passed in worker model in manager_worker_broker.get.

The queues aren't needed in this patch, but they will be needed in the patch for bug 54072, where we use them to control the TestWorker separately from the messaging. This also points out that I did actually need the **kwargs to start_worker and the Worker constructor, but I think I can get around that with lexical scoping. I'll update the patches and resubmit.
Comment 5 Dirk Pranke 2011-02-11 13:12:22 PST
(In reply to comment #3)
> (From update of attachment 81758 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=81758&action=review
> 
> > Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker.py:223
> > +    def start(self):
> > +        raise NotImplementedError
> 
> start() seems like a confusing name when there's also BrokerConneciton.run_message_loop().  What's the difference?  Also, how is it different from run()?
> 

Any class that implements a WorkerConnection needs to implement start() and run(). In the threaded and multiprocessing cases, these are implemented by Thread.start()/run() and Process.start()/run(). In the inline case, they're stubs. 

The start() method causes the other thread/process to actually get created., and the run() method is invoked inside the other method. The WorkerConnection's run() method is usually a thin shim around the Worker's run() method, which itself usually a thin shim around run_message_loop().

> > Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker.py:252
> > +class _ThreadedWorker(threading.Thread, _WorkerConnection):
> > +    def __init__(self, broker, port):
> 
> This looks like multiple inheritance, no?  For example, both threading.Thread and _WorkerConnection seem to have a run() method.  This doesn't feel like a mixin to me . . .

Yes, it is multiple inheritance. _WorkerConnection is an abstract base class, not a mixin. See above.

> 
> > Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker_unittest.py:123
> > +    #
> > +    # Methods to implement the Manager side of the ClientInterface
> > +    #
> 
> These 3 line comments are weird.  They should be single line docstrings.

They refer to groups of methods, so I don't think docstrings make much sense. That said, I'll probably just delete them outright.
Comment 6 Dirk Pranke 2011-02-11 13:35:09 PST
Created attachment 82167 [details]
update w/ review feedback from tony, mihai, merge in changes from 54068
Comment 7 Tony Chang 2011-02-11 13:37:42 PST
View in context: https://bugs.webkit.org/attachment.cgi?id=81758&action=review

>>> Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker.py:252
>>> +    def __init__(self, broker, port):
>> 
>> This looks like multiple inheritance, no?  For example, both threading.Thread and _WorkerConnection seem to have a run() method.  This doesn't feel like a mixin to me . . .
> 
> Yes, it is multiple inheritance. _WorkerConnection is an abstract base class, not a mixin. See above.

I thought we decided that multiple inheritance is bad unless it's a mixin or for interfaces?  For example, ThreadedWorker.start depends on the order in which we inherit from the base classes.

Can the worker connection hold a subclass of threading.Thread and delegate calls to run() and start() to it?  You could even declare this class within _ThreadedWorkerConnection.
Comment 8 Dirk Pranke 2011-02-11 14:05:31 PST
(In reply to comment #7)
> View in context: https://bugs.webkit.org/attachment.cgi?id=81758&action=review
> 
> >>> Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker.py:252
> >>> +    def __init__(self, broker, port):
> >> 
> >> This looks like multiple inheritance, no?  For example, both threading.Thread and _WorkerConnection seem to have a run() method.  This doesn't feel like a mixin to me . . .
> > 
> > Yes, it is multiple inheritance. _WorkerConnection is an abstract base class, not a mixin. See above.
> 
> I thought we decided that multiple inheritance is bad unless it's a mixin or for interfaces?  For example, ThreadedWorker.start depends on the order in which we inherit from the base classes.
>

Well, unfortunately, there isn't really such a thing as interfaces in Python. WorkerConnection is barely more than an interface, but if this layout is troubling, we have a few options:

(1) I can delete start() and run() from _WorkerConnection, eliminating the overlap. Technically, they don't need to be there, as they are only called by routines that don't require the polymorphism.

(2) I can explictly override start() and run() in ThreadedWorkerConnection and MultipleProcessWorkerConnection(), to make it clear which methods will be called.
  
Or I can do (3):

> Can the worker connection hold a subclass of threading.Thread and delegate calls to run() and start() to it?  You could even declare this class within _ThreadedWorkerConnection.

But, I've done this before and I think it's probably less clear than either (1) or (2). Thoughts?
Comment 9 Tony Chang 2011-02-11 14:33:59 PST
(In reply to comment #8)
> Well, unfortunately, there isn't really such a thing as interfaces in Python. WorkerConnection is barely more than an interface, but if this layout is troubling, we have a few options:
> 
> (1) I can delete start() and run() from _WorkerConnection, eliminating the overlap. Technically, they don't need to be there, as they are only called by routines that don't require the polymorphism.

Hmm, but isn't this still multiple inheritance?  We would still have the two constructors and the methods from BrokerConnection mixed with the methods from threading.Thread.  All this does is eliminate the overlapping methods.  It's also possible that the member variables (like self._client and self.name) could collide with member variables in threading.Thread.

> (2) I can explictly override start() and run() in ThreadedWorkerConnection and MultipleProcessWorkerConnection(), to make it clear which methods will be called.

Wouldn't this be multiple inheritance too?  Another way to think about it is, this isn't something you could do in Java because WorkerConnection isn't an interface.  You could do it in C++, but it would violate the style guide.

> Or I can do (3):
> 
> > Can the worker connection hold a subclass of threading.Thread and delegate calls to run() and start() to it?  You could even declare this class within _ThreadedWorkerConnection.
> 
> But, I've done this before and I think it's probably less clear than either (1) or (2). Thoughts?

I agree that this solution requires more classes and more delegating (which I dislike), but of the three options, it's the only one that avoid multiple inheritance.
Comment 10 Dirk Pranke 2011-02-11 15:47:57 PST
(In reply to comment #9)
> (In reply to comment #8)
> > Well, unfortunately, there isn't really such a thing as interfaces in Python. WorkerConnection is barely more than an interface, but if this layout is troubling, we have a few options:
> > 
> > (1) I can delete start() and run() from _WorkerConnection, eliminating the overlap. Technically, they don't need to be there, as they are only called by routines that don't require the polymorphism.
> 
> Hmm, but isn't this still multiple inheritance?

Yes, but the methods don't overlap (apart from the constructor), which makes it a mixin under my definition. If you're worried about name conflicts, I can change to double underscores for the private fields).

(mixins have state and can depend on the linearization order, see, e.g.: http://stackoverflow.com/questions/925609/mixins-vs-traits ).
Comment 11 Dirk Pranke 2011-02-11 16:11:10 PST
(In reply to comment #10)
> (In reply to comment #9)
> > (In reply to comment #8)
> > > Well, unfortunately, there isn't really such a thing as interfaces in Python. WorkerConnection is barely more than an interface, but if this layout is troubling, we have a few options:
> > > 
> > > (1) I can delete start() and run() from _WorkerConnection, eliminating the overlap. Technically, they don't need to be there, as they are only called by routines that don't require the polymorphism.
> > 
> > Hmm, but isn't this still multiple inheritance?
> 
> Yes, but the methods don't overlap (apart from the constructor), which makes it a mixin under my definition. If you're worried about name conflicts, I can change to double underscores for the private fields).
> 
> (mixins have state and can depend on the linearization order, see, e.g.: http://stackoverflow.com/questions/925609/mixins-vs-traits ).

Frankly, a better way to think about this is that Threading.Thread and multiprocessing.Process are really the things being mixed in.
Comment 12 Dirk Pranke 2011-02-11 16:51:56 PST
Created attachment 82208 [details]
delegate to objects for concurrency in *WorkerConnection rather than mixing them in
Comment 13 Dirk Pranke 2011-02-11 16:54:21 PST
Created attachment 82210 [details]
incremental diff of the prior two attachments, to show only the changes necessary to use delegates instead of mixins

Okay, I've posted another rev that shows basically how things would work w/ delegates instead of mixins. Note that the patch in 54074 would need to change as well, but the changes there are not very interesting.
Comment 14 Tony Chang 2011-02-11 17:10:32 PST
Comment on attachment 82208 [details]
delegate to objects for concurrency in *WorkerConnection rather than mixing them in

View in context: https://bugs.webkit.org/attachment.cgi?id=82208&action=review

> Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker.py:243
> +    class _Thread(threading.Thread):
> +        def __init__(self, worker_connection):
> +            threading.Thread.__init__(self)
> +            self._worker_connection = worker_connection

Should we pass in client and port so run() doesn't have to access _client and _port?

> Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker.py:260
> +class _MultiProcessWorkerConnection(_WorkerConnection):

If multiprocessing doesn't exist, will this basically be the same as ThreadedWorkerConnection?

> Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker.py:264
> +    class _Process(_Multiprocessing_Process):
> +        def __init__(self, worker_connection):
> +            _Multiprocessing_Process.__init__(self)
> +            self._worker_connection = worker_connection

Same thing, can we pass in what is needed by run()?
Comment 15 Dirk Pranke 2011-02-11 17:14:16 PST
Created attachment 82211 [details]
can't pickle nested classes
Comment 16 Dirk Pranke 2011-02-11 17:21:14 PST
Created attachment 82213 [details]
update w/ additional feedback from tony
Comment 17 Dirk Pranke 2011-02-11 17:28:11 PST
(In reply to comment #14)
> (From update of attachment 82208 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=82208&action=review
> 
> > Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker.py:243
> > +    class _Thread(threading.Thread):
> > +        def __init__(self, worker_connection):
> > +            threading.Thread.__init__(self)
> > +            self._worker_connection = worker_connection
> 
> Should we pass in client and port so run() doesn't have to access _client and _port?
> 

Done.

> > Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker.py:260
> > +class _MultiProcessWorkerConnection(_WorkerConnection):
> 
> If multiprocessing doesn't exist, will this basically be the same as ThreadedWorkerConnection?
>

Yes.
 
> > Tools/Scripts/webkitpy/layout_tests/layout_package/manager_worker_broker.py:264
> > +    class _Process(_Multiprocessing_Process):
> > +        def __init__(self, worker_connection):
> > +            _Multiprocessing_Process.__init__(self)
> > +            self._worker_connection = worker_connection
> 
> Same thing, can we pass in what is needed by run()?

Done.
Comment 18 Dirk Pranke 2011-02-11 17:42:09 PST
Committed r78398: <http://trac.webkit.org/changeset/78398>
Comment 19 WebKit Review Bot 2011-02-13 02:19:54 PST
http://trac.webkit.org/changeset/78398 might have broken SnowLeopard Intel Release (WebKit2 Tests)
The following tests are not passing:
fast/loader/empty-embed-src-attribute.html