Bug 35689

Summary: ResourceLoader should call cancel() on ResourceHandle after clearing its client
Product: WebKit Reporter: Yong Li <yong.li.webkit>
Component: Page LoadingAssignee: Nobody <webkit-unassigned>
Status: RESOLVED WONTFIX    
Severity: Normal CC: ap, joenotcharles, staikos
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
the patch ap: review-, ap: commit-queue-

Description Yong Li 2010-03-03 09:30:44 PST
As done in ApplicationCacheGroup.cpp:

        m_manifestHandle->setClient(0);
        m_manifestHandle->cancel();
        m_manifestHandle = 0;

ResourceLoader should also call cancel() on the handle after clearing its client, because the job is supposed to be canceled at this point, and setClient(0) does nothing but zeroing m_client. 

Patch is following
Comment 1 Yong Li 2010-03-03 09:33:58 PST
Created attachment 49915 [details]
the patch
Comment 2 George Staikos 2010-03-03 09:39:27 PST
No testcase?
Comment 3 Yong Li 2010-03-03 10:26:19 PST
(In reply to comment #2)
> No testcase?

No. This shouldn't affect unless you stop loading manually (stop or click on a link)
Comment 4 Yong Li 2010-03-03 11:02:19 PST
Actually, it's hard to reproduce problem and verify the patch helps. But I think the patch is reasonable and at least harmless to all ports.

When a ResourceHandle has a null client, there's no reason why it's not canceled.

Always checking "client()" pointer in a special port will also work, but it is just ugly.
Comment 5 Alexey Proskuryakov 2010-03-03 16:07:11 PST
I think that one difference is that if you don't cancel, the resource will still load into disk cache, making subsequent loads faster.

This sounds like a duplicate of bug 6656, and if you agree, please mark it as such, even though it has a patch attached already.
Comment 6 Yong Li 2010-03-04 08:29:11 PST
(In reply to comment #5)
> I think that one difference is that if you don't cancel, the resource will
> still load into disk cache, making subsequent loads faster.
> 
> This sounds like a duplicate of bug 6656, and if you agree, please mark it as
> such, even though it has a patch attached already.

No. It's different from bug 6656.

If we dont' cancel, the result is undetermined, depending on the port. "m_handle" is cleared just after setClient(0), and in some ports, this ResourceHandle's ref count becomes to 0 and then the job is canceled anyway. In some other ports, the ResourceHandle object is still referenced in a list. When it receives any data or response next time and the client is not there any more, it must check "client()" otherwise it crashes. Something like this:
if (!client()) {
 cancel();
 return;
}
must be added into many places.

We should add a macro if we intend to load the not needed resource to disk cache. Also, this may be unwanted for mobile devices. The code is called most from MainResourceLoader, and there's not too much chance we want to load main resource from disk.
Comment 7 Eric Seidel (no email) 2010-03-05 16:26:40 PST
Comment on attachment 49915 [details]
the patch

OK.  How do we test this?
Comment 8 George Staikos 2010-03-05 17:36:37 PST
Discussed with Yong and we haven't found a [good] way so far.
Comment 9 George Staikos 2010-03-09 11:55:32 PST
Comment on attachment 49915 [details]
the patch

No known way to test this one so far.  Patch appears to be fully safe though.
Comment 10 Alexey Proskuryakov 2010-03-09 12:24:11 PST
Comment on attachment 49915 [details]
the patch

I disagree with taking a patch that doesn't describe what it changes behavior-wise.
Comment 11 Alexey Proskuryakov 2010-03-09 12:25:03 PST
What are "some ports" and "some other ports"? How can I tell if this will affect the Mac port?
Comment 12 Alexey Proskuryakov 2010-03-09 12:30:46 PST
Per comment 6, this fixes crashes on some platforms. Why not make a patch for the crash? You may be able to exercise this code by detaching a subframe, or just removing an img element.
Comment 13 Yong Li 2010-03-10 08:15:00 PST
(In reply to comment #12)
> Per comment 6, this fixes crashes on some platforms. Why not make a patch for
> the crash? You may be able to exercise this code by detaching a subframe, or
> just removing an img element.

Yes, we can fix crash in the port where it crashes. But is there any reason why we don't want to cancel the job at this point? It's hard to verify the behavior change, but what the small change does is very obvious. There's similar code at 2 places in ApplicationCacheGroup.cpp. Apparently it's very safe.

        m_manifestHandle->setClient(0);
        m_manifestHandle->cancel();
        m_manifestHandle = 0;
Comment 14 Alexey Proskuryakov 2010-03-10 08:59:34 PST
I have two specific concerns:
- how does this affect the behavior described in bug 6656;
- what will happen to loads in subframes when those are re-attached to another parent.

The change looks safe in that it's unlikely to introduce crashing, but it's not clear whether it can regress behavior, and not clear how it improves it. It would be much easier to judge if its effects on different platforms were investigated and described somewhat deeper.
Comment 15 Yong Li 2010-03-10 10:40:01 PST
(In reply to comment #14)
> I have two specific concerns:
> - how does this affect the behavior described in bug 6656;
> - what will happen to loads in subframes when those are re-attached to another
> parent.
> 

98  if (m_handle->client() == this) {
9999 m_handle->setClient(0);
 100 m_handle->cancel();
 101 }
100102 m_handle = 0; 

There's a check for current client. If the resource is taken over by others (like a download manager), its client is supposed to be changed to that. If the client is null, the data will be lost anyway.

I just did a search, find that ResourceLoader::releaseResources() is only called in these places:

1) NetscapePlugInStreamLoader::releaseResources()
2) MainResourceLoader::receivedError()
3) ResourceLoader::didFinishLoading()
4) ResourceLoader::didFail()
5) ResourceLoader::didCancel()

For 1), I cannot find where it's called.
For 3), 4), 5), no need to cancel the job because it's already done or cancelled

For 2), it happens only in 2 cases:
a) MainResourceLoader::didFail(const ResourceError&
b) when Policy is changed to PolicyDownload

For b), the resource handle has been transferred to the downloader. My patch will give a trouble if the downloader is still using the ResourceHandle without changing its client. But is it possible?

For a), this can be called indirectly from ResourceHandler but can also be called by other webkit modules. This is where my patch can help.
Comment 16 Alexey Proskuryakov 2010-03-10 10:57:25 PST
This is a useful analysis, thanks! It doesn't answer my questions, but it raises good new ones.

Yes, it seems that downloading is another potential problem area for this. Resources requested by plug-ins might be tricky, too.
Comment 17 Joe Mason 2010-03-10 11:05:05 PST
I think the core issue here is that without this patch, a ResourceHandle's client() can become 0 unexpectedly while a network request is still open.  For the author of a port, it's easy to assume that a ResourceHandle will always have a valid client(), and if this invariant isn't true it can lead to bugs in the port.  (I'm thinking mainly about new ports here; existing ports deal with this already either by always checking client() before using it, or having port-specific code to always call cancel() as soon as the client becomes 0.)

So I've been assuming that if some network data comes in, and I call resourceHandle->client()->didReceiveData(...), but the client() is null, that's a bug in webkit.  Is this a good invariant to strive for?

However, while typing this up, I just realized that this isn't true if the network data is being received in a different thread than the one that called setClient - in that case, there's still a window where data can be processed after the client is set to 0 but before the job is cleaned up.  So ports will still need to check the client() before using it, even if webkit always does the right thing for the single-threaded case.

Also, I do think this patch needs a test case, so I think we should withdraw it.
Comment 18 Yong Li 2010-03-10 11:23:43 PST
(In reply to comment #17)
> 
> Also, I do think this patch needs a test case, so I think we should withdraw
> it.

Agree. I almost said "withdraw" in last comment. We can always find a way to make it in the port. But this is still a tricky problem that webkit doesn't define clearly, and it probably will screw up somebody in the future.