Bug 92365 - Refactor cross thread communication
Summary: Refactor cross thread communication
Status: UNCONFIRMED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Platform (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-07-26 04:16 PDT by Kwonjin Jeong
Modified: 2012-08-16 09:28 PDT (History)
12 users (show)

See Also:


Attachments
This patch is not for committing. (61.74 KB, patch)
2012-08-13 05:25 PDT, Kwonjin Jeong
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Kwonjin Jeong 2012-07-26 04:16:36 PDT
1. Introduction

In WebCore, there are several implementations of threads, tasks and cross thread communication. For example, FileThread, DatabaseThread and StorageThread perform blocking IO asynchronously and invoke callbacks using ScriptExecutionContext.  Unfortunately, there is no code sharing at all though each of these implementation is pretty similar. Every thread has its own slightly different cross thread communication mechanism.

We will be better off if we could unify the cross thread communication mechanism and reuse it. Levin’s sample proposal looks promising, but I think the very first step is to refactor common code into reusable components and combine various task classes into one.

Levin filed a bug on Bug 51857 and wrote a wiki on this issue:
https://trac.webkit.org/wiki/ThreadCommunication


2. Requirements

I analyzed the current task threads code and extracted basic requirements:

- Add a task to the end of the task queue (append)
- Add a task to the front of the task queue (prepend)
- Cancel a group of tasks that are pending in the task queue
- Specify an action that will performed on thread termination



3. API Proposal

3.1 Task

Instead of creating a task type for each use case, WTF::Function is used everywhere. CrossThreadTask is merged into WTF::Function because they play essentially the same role. 

However, there is one problem with WTF::Function. When binding a non-static method with WTF::Function, if the receiver object’s class has ref/deref methods, the object’s reference count is incremented and decremented on WTF::Function construction and destruction respectively. If the object’s class is not a subclass of ThreadSafeRefCounted, this behavior can cause a race condition. We must modify WTF::Function to perform ref/deref only for ThreadSafeRefCounted.

3.2 TaskGroup

From the requirement above, we must be able to cancel a group of task. TaskGroup is introduced to specify the group of a task. When posting a task, we can specify the task group where the task belongs. See TaskThread::postTask(const Function<void()>& task, PassRefPtr<TaskGroup> group) method below.

class TaskGroup : public ThreadSafeRefCounted<TaskGroup> { };


3.3 Thread

TaskThread is our new abstraction for task thread. FileThread, DatabaseThread and other task threads reuse the code by inheriting from TaskThread.

class TaskThread : public ThreadSafeRefCounted<TaskThread> {
public:
    ~TaskThread();
    void start();
    void stop();

    // Schedule a task to be executed by the task thread.
    void postTask(const Function<void()>& task);
    // Schedule a task to be executed by the task thread and also specify the task group.
    // Tasks scheduled with this method can be cancelled by removeTasks() method.
    void postTask(const Function<void()>& task, PassRefPtr<TaskGroup> group);

    // Schedule a task to be immediately executed by the task thread.
    void postImmediateTask(PassOwnPtr<TaskThreadTask> task);
    // Schedule a task to be executed by the task thread and also specify the task group.
    // Tasks scheduled with this method can be cancelled by removeTasks() method.
    void postImmediateTask(const Function<void()>& task, PassRefPtr<TaskGroup> group);

    // Cancel pending tasks that belong to the given group.
    void removeTasks(PassRefPtr<TaskGroup> group);

protected:
    TaskThread();
    // Subclasses can override this method to specify an action to be executed
    // on thread termination
    virtual void onRunLoopTerminated();
};


3.4 Callback to ScriptExecutionContext

ScriptExecutionContext::postTask() now takes a WTF::Function as an argument because WTF::Function replaced CrossThreadTask.

class ScriptExecutionContext :
   public SecurityContext, public Supplementable<ScriptExecutionContext> {
public:
…
   virtual void postTask(const Function<void()>&) = 0;
...
};

There is a small change in usage. CrossThreadTask does not bind the pointer to ScriptExecutionContext because it is automatically passed as the first argument when the task is invoked. However, since WTF::Function does not support partial application, the pointer to ScriptExecutionContext must be bound as the first argument when binding parameters to a task. This is possible because task threads already know the pointer. See the example below.

void AsyncFileStream::openForReadOnFileThread(const String& path, long long offset, long long length)
{
   bool success = m_stream->openForRead(path, offset, length);
   m_context->postTask(createCallbackTask(&didOpen, AllowCrossThreadAccess(this), success));
}

=>

void AsyncFileStream::openForReadOnFileThread(const String& path, long long offset, long long length)
{
   bool success = m_stream->openForRead(path, offset, length);
   m_context->postTask(bind(&didOpen, m_context, this, success));
}


3.5 CrossThreadCopyable

If the parameter type is a subclass of CrossThreadCopyable, the parameter is copied by calling crossThreadCopy().

template<typename Orig, New>
class CrossThreadCopyable<Orig, New> {
public:
    virtual New crossThreadCopy();
}

AllowCrossThreadAccess is no longer needed because WTF::Function ref counts only the first parameter.


3.6 deepCopy

If a class can’t be a subclass of CrossThreadCopyable, deepCopy() can be used to pass thread-unsafe instances by calling custom copy methods. For example, we can pass String and KURL instances thread-safely by wrapping them with deepCopy. For a string s, deepCopy(s) returns s.isolatedCopy().

void LocalFileSystem::readFileSystem(ScriptExecutionContext* context, FileSystemType, PassOwnPtr<AsyncFileSystemCallbacks> callbacks, FileSystemSynchronousType)
{
    context->postTask(bind(&openFileSystem, context, deepCopy(fileSystemBasePath()), deepCopy(fcontext->securityOrigin()->databaseIdentifier()), false, callbacks));
}


4. Plan

This proposal does not add any new mechanism except for a few minor changes in WTF::Function. It simply removes duplicated code and extracts common code into reusable components. Once we have this refactoring done successfully, we can move on to fancier cross thread communication mechanism.

Before I take an initiative to work hard on this proposal, I want to hear opinions from reviewers and committers who understand WebKit threads very well. Also if you have any concerns on implementation level details, please let me know before I struggle with C++ template hacks :)
Comment 1 Kwonjin Jeong 2012-07-26 19:37:54 PDT
Correct some mistakes and change the name of the virtual method.

> class TaskThread : public ThreadSafeRefCounted<TaskThread> {
> public:
>     ~TaskThread();
    virtual ~TaskThread();
>     void start();
>     void stop();
> 
>     // Schedule a task to be executed by the task thread.
>     void postTask(const Function<void()>& task);
>     // Schedule a task to be executed by the task thread and also specify the task group.
>     // Tasks scheduled with this method can be cancelled by removeTasks() method.
>     void postTask(const Function<void()>& task, PassRefPtr<TaskGroup> group);
> 
>     // Schedule a task to be immediately executed by the task thread.
>     void postImmediateTask(PassOwnPtr<TaskThreadTask> task);
    void postImmediateTask(const Function<void()>& task);
>     // Schedule a task to be executed by the task thread and also specify the task group.
>     // Tasks scheduled with this method can be cancelled by removeTasks() method.
>     void postImmediateTask(const Function<void()>& task, PassRefPtr<TaskGroup> group);
> 
>     // Cancel pending tasks that belong to the given group.
>     void removeTasks(PassRefPtr<TaskGroup> group);
> 
> protected:
>     TaskThread();
>     // Subclasses can override this method to specify an action to be executed
>     // on thread termination
>     virtual void onRunLoopTerminated();
    virtual void didStopThread();
> };
Comment 2 David Levin 2012-07-26 19:44:58 PDT
This makes every task implement ThreadSafeRefCounted solely in order to support this
  "Cancel a group of tasks that are pending in the task queue."

Could that be done in a different way in order to avoid this? Also, what is the scenario in which this is needed?


It also looks like you're proposing getting rid of the fact that all the copying was done automatically so that people couldn't mess that up by accident. Is that true?
Comment 3 Kwonjin Jeong 2012-07-27 00:01:54 PDT
(In reply to comment #2)
> This makes every task implement ThreadSafeRefCounted solely in order to support this
>   "Cancel a group of tasks that are pending in the task queue."
> 
> Could that be done in a different way in order to avoid this? Also, what is the scenario in which this is needed?

I can't understand your question exactly.
In my proposal, the task is just a WTF::Function instance. Implementing new class for a task isn't needed.

There are some cases canceling a group of the tasks.
AsyncFileStream::stop() method unschedules all tasks related to specific FileStream instance by calling FileThread::unscheduleTasks() method. DatabaseThread also has an API for unscheduling all tasks related to specific Database instance.


> It also looks like you're proposing getting rid of the fact that all the copying was done automatically so that people couldn't mess that up by accident. Is that true?

I don't want to get rid of it so I add CrossThreadCopyable class for automatic parameter copying.
But I have to keep the behavior of WTF::Function. For some types like WTF::String, because of that, parameters are copied manually using deepCopy() function.

Thank you for your comment.
Comment 4 David Levin 2012-07-27 00:26:30 PDT
(In reply to comment #3)

> There are some cases canceling a group of the tasks.
> AsyncFileStream::stop() method unschedules all tasks related to specific FileStream instance by calling FileThread::unscheduleTasks() method. DatabaseThread also has an API for unscheduling all tasks related to specific Database instance.
> 

Ok. I misread thing before anyway. I kind of like the TaskGroup thing but I wonder why:
1. I can't just call cancel on it.
2. Why it is called TaskGroup when its sole purpose is to be able to cancel tasks?
3. If there is some error check to determine if it is passed to more than one TaskThread (or perhaps that is ok).



> But I have to keep the behavior of WTF::Function. For some types like WTF::String, because of that, parameters are copied manually using deepCopy() function.
> 

Only refer counting the first item seems odd and error prone. I'm all for getting rid of CrossThreadTask but I don't think that WTF::Function plays the same role with this odd behavior. (Actually, I think the CrossThreadTask isn't bad but ScriptExecutionContext should be removed from it.) If WTF::Function is used like CrossThreadTask, then perhaps it should get the functionality to copy/ref every parameter instead of just the first (and CrossThreadTask can go away).


One more comment re TaskThread, instead of making it ThreadSafeRefCounted, I'd suggest make a TaskThreadProxy that other threads could use. (The Proxy would hold some sort of weak pointer to a TaskThread.) From what I've seen typically it is nice for a run loop to know which thread it will be destroyed on.


btw, thanks for taking this up. I didn't have time to complete all of this and more recently my time on WebKit, etc. has been vastly reduced (largely to making comment like this :).), so I'll admit that perhaps my views may be out of touch with any new thinking that has happened.

btw, https://trac.webkit.org/wiki/ThreadCommunication was meant to be the highest level of abstraction with lower level things that could be used. But perhaps it is ok just to go for this lower level but there were some common error prone patterns that kept emerging which insipred it.


PS I think this " void postImmediateTask(PassOwnPtr<TaskThreadTask> task);" is a typo (i.e. TaskThreadTask).
Comment 5 Kwang Yul Seo 2012-07-27 00:44:07 PDT
(In reply to comment #4)
> Only refer counting the first item seems odd and error prone. I'm all for getting rid of CrossThreadTask but I don't think that WTF::Function plays the same role with this odd behavior. (Actually, I think the CrossThreadTask isn't bad but ScriptExecutionContext should be removed from it.) If WTF::Function is used like CrossThreadTask, then perhaps it should get the functionality to copy/ref every parameter instead of just the first (and CrossThreadTask can go away).

I agree. It seems hard to merge CrossThreadTask into WTF::Function while maintaining the original behavior of WTF::Function. Because WTF::Function is not used frequently in WebKit, we can change the behavior of WTF::Function and modify the corresponding call sites.
Comment 6 David Levin 2012-07-27 00:47:10 PDT
(In reply to comment #5)
> (In reply to comment #4)
> > Only refer counting the first item seems odd and error prone. I'm all for getting rid of CrossThreadTask but I don't think that WTF::Function plays the same role with this odd behavior. (Actually, I think the CrossThreadTask isn't bad but ScriptExecutionContext should be removed from it.) If WTF::Function is used like CrossThreadTask, then perhaps it should get the functionality to copy/ref every parameter instead of just the first (and CrossThreadTask can go away).
> 
> I agree. It seems hard to merge CrossThreadTask into WTF::Function while maintaining the original behavior of WTF::Function. Because WTF::Function is not used frequently in WebKit, we can change the behavior of WTF::Function and modify the corresponding call sites.

Alternatively, you could put a layer on top of WTF::Function to do the copy/ref (but then I would believe that the ref of the first parameter in WTF::Function shouldn't be there and anyone who wants it should use the layer on top of it).

I haven't thought deeply about this. Also, if WTF::Function is used within one thread, then it would not be good to have to do CrossThreadCopy for every parameter.
Comment 7 Kwang Yul Seo 2012-07-27 00:59:06 PDT
(In reply to comment #6)
> I haven't thought deeply about this. Also, if WTF::Function is used within one thread, then it would not be good to have to do CrossThreadCopy for every parameter.

That's a big problem. Because the name WTF::Function seems not relevant to cross thread communication, people will be surprised to know that every argument of WTF::Function is copied in a cross-thread safe way.
Comment 8 David Levin 2012-07-27 01:00:53 PDT
(In reply to comment #7)
> (In reply to comment #6)
> > I haven't thought deeply about this. Also, if WTF::Function is used within one thread, then it would not be good to have to do CrossThreadCopy for every parameter.
> 
> That's a big problem. Because the name WTF::Function seems not relevant to cross thread communication, people will be surprised to know that every argument of WTF::Function is copied in a cross-thread safe way.

Then perhaps there should just be a WTF::CrossThreadFunction (which does the copy/ref thing) and then uses WTF::Function?
Comment 9 Kwang Yul Seo 2012-07-27 01:01:56 PDT
(In reply to comment #8)
> Then perhaps there should just be a WTF::CrossThreadFunction (which does the copy/ref thing) and then uses WTF::Function?

It sounds good to me! Let's listen to Kwonjin's opinion.
Comment 10 Kwonjin Jeong 2012-07-27 01:10:45 PDT
(In reply to comment #9)
> (In reply to comment #8)
> > Then perhaps there should just be a WTF::CrossThreadFunction (which does the copy/ref thing) and then uses WTF::Function?
> 
> It sounds good to me! Let's listen to Kwonjin's opinion.

I also agree on yours.
Comment 11 Kwonjin Jeong 2012-07-27 02:05:12 PDT
(In reply to comment #4)
> One more comment re TaskThread, instead of making it ThreadSafeRefCounted, I'd suggest make a TaskThreadProxy that other threads could use. (The Proxy would hold some sort of weak pointer to a TaskThread.) From what I've seen typically it is nice for a run loop to know which thread it will be destroyed on.

It's a good idea. But in this step, I just want to refactor current code. 
I will consider further improvements in the next step.
Comment 12 Kwang Yul Seo 2012-07-27 02:13:49 PDT
(In reply to comment #11)
> (In reply to comment #4)
> > One more comment re TaskThread, instead of making it ThreadSafeRefCounted, I'd suggest make a TaskThreadProxy that other threads could use. (The Proxy would hold some sort of weak pointer to a TaskThread.) From what I've seen typically it is nice for a run loop to know which thread it will be destroyed on.
> 
> It's a good idea. But in this step, I just want to refactor current code. 
> I will consider further improvements in the next step.

I think it is important to proceed this refactoring efforts step by step. As Kwonjin mentioned in the design document, the first step is to extract common code into reusable components. Once we have WTF::CrossThreadFunction, TaskGroup and TaskThread in hand, we can easily add fancier features.


BTW, Levein, do you know actual WebKit code that can benefit from TaskThreadProxy?
Comment 13 David Levin 2012-07-27 10:02:41 PDT
(In reply to comment #12)
> (In reply to comment #11)
> > (In reply to comment #4)
> > > One more comment re TaskThread, instead of making it ThreadSafeRefCounted, I'd suggest make a TaskThreadProxy that other threads could use. (The Proxy would hold some sort of weak pointer to a TaskThread.) From what I've seen typically it is nice for a run loop to know which thread it will be destroyed on.
> > 
> > It's a good idea. But in this step, I just want to refactor current code. 
> > I will consider further improvements in the next step.
> 
> I think it is important to proceed this refactoring efforts step by step. As Kwonjin mentioned in the design document, the first step is to extract common code into reusable components. Once we have WTF::CrossThreadFunction, TaskGroup and TaskThread in hand, we can easily add fancier features.
> 
> 
> BTW, Levein, do you know actual WebKit code that can benefit from TaskThreadProxy?

I don't believe that the current implementations of TaskThread are ThreadSafeRefCounted so that would be a change and may introduce some issues when it may be deleted on any thread.

So the idea was that other threads would use a TaskThreadProxy instead of the TaskThread itself. (I was inspired by this http://code.google.com/searchframe#OAMlx_jo-ck/src/base/message_loop_proxy.h&l=1 with a little more info here: http://dev.chromium.org/developers/coding-style/important-abstractions-and-data-structures).
Comment 14 Kwonjin Jeong 2012-07-27 20:06:42 PDT
(In reply to comment #13)
> (In reply to comment #12)
> > (In reply to comment #11)
> > > (In reply to comment #4)
> > > > One more comment re TaskThread, instead of making it ThreadSafeRefCounted, I'd suggest make a TaskThreadProxy that other threads could use. (The Proxy would hold some sort of weak pointer to a TaskThread.) From what I've seen typically it is nice for a run loop to know which thread it will be destroyed on.
> > > 
> > > It's a good idea. But in this step, I just want to refactor current code. 
> > > I will consider further improvements in the next step.
> > 
> > I think it is important to proceed this refactoring efforts step by step. As Kwonjin mentioned in the design document, the first step is to extract common code into reusable components. Once we have WTF::CrossThreadFunction, TaskGroup and TaskThread in hand, we can easily add fancier features.
> > 
> > 
> > BTW, Levein, do you know actual WebKit code that can benefit from TaskThreadProxy?
> 
> I don't believe that the current implementations of TaskThread are ThreadSafeRefCounted so that would be a change and may introduce some issues when it may be deleted on any thread.

The sole reason why TaskThread inherits ThreadSafeRefCounted is to prevent TaskThread instance from being deleted while the thread is running.
TaskThread holds a RefPtr for itself all the while so TaskThread isn't deleted. You can find the similar implementations on FileThread and DatabaseThread.
Comment 15 David Levin 2012-07-29 10:25:18 PDT
(In reply to comment #14)
> (In reply to comment #13)
> > (In reply to comment #12)
> > > (In reply to comment #11)
> > > > (In reply to comment #4)
> > > > > One more comment re TaskThread, instead of making it ThreadSafeRefCounted, I'd suggest make a TaskThreadProxy that other threads could use. (The Proxy would hold some sort of weak pointer to a TaskThread.) From what I've seen typically it is nice for a run loop to know which thread it will be destroyed on.
> > > > 
> > > > It's a good idea. But in this step, I just want to refactor current code. 
> > > > I will consider further improvements in the next step.
> > > 
> > > I think it is important to proceed this refactoring efforts step by step. As Kwonjin mentioned in the design document, the first step is to extract common code into reusable components. Once we have WTF::CrossThreadFunction, TaskGroup and TaskThread in hand, we can easily add fancier features.
> > > 
> > > 
> > > BTW, Levein, do you know actual WebKit code that can benefit from TaskThreadProxy?
> > 
> > I don't believe that the current implementations of TaskThread are ThreadSafeRefCounted so that would be a change and may introduce some issues when it may be deleted on any thread.
> 
> The sole reason why TaskThread inherits ThreadSafeRefCounted is to prevent TaskThread instance from being deleted while the thread is running.
> TaskThread holds a RefPtr for itself all the while so TaskThread isn't deleted. You can find the similar implementations on FileThread and DatabaseThread.

Only one thread should have access to it. Then it won't need to be ThreadSafeRefCounted.
Comment 16 Kwonjin Jeong 2012-08-02 23:32:08 PDT
(In reply to comment #15)
> Only one thread should have access to it. Then it won't need to be ThreadSafeRefCounted.

There are two threads that access to it.

1) TaskThread - has a reference to keep it alive while running.
2) Owner thread - has a reference to post tasks to the TaskThread.
Comment 17 Kwonjin Jeong 2012-08-13 05:25:10 PDT
Created attachment 157969 [details]
This patch is not for committing.