Bug 20777 - [CURL] memory leak of ResouceHandles
Summary: [CURL] memory leak of ResouceHandles
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Marco Barisione
URL:
Keywords: Curl
Depends on:
Blocks:
 
Reported: 2008-09-11 03:57 PDT by Marco Barisione
Modified: 2008-12-19 08:03 PST (History)
5 users (show)

See Also:


Attachments
Fix the leak (1.47 KB, patch)
2008-09-11 04:07 PDT, Marco Barisione
eric: review-
Details | Formatted Diff | Diff
Updated patch with refs() and manager-side refcount handling (2.65 KB, patch)
2008-12-14 23:33 PST, Kalle Vahlman
zecke: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Marco Barisione 2008-09-11 03:57:44 PDT
ResourceHandle::start increases the ref count of the handle but this is never decrease so we leak ResourceHandles, see
See https://lists.webkit.org/pipermail/webkit-dev/2008-September/004851.html

Patch coming soon.
Comment 1 Marco Barisione 2008-09-11 04:07:35 PDT
Created attachment 23343 [details]
Fix the leak

Delete the ResourceHandle when the job is completed or there is an error. I used delete directly instead of deref because the win port is doing the same.
Comment 2 Michael Emmel 2008-09-11 09:34:41 PDT
This probably causes a crash it did last time I tried it. You need to verify that the loader code which owns the handle never references it past this point.

I had to deref in the next run of the event loop.
Comment 3 Marco Barisione 2008-09-11 09:41:07 PDT
(In reply to comment #2)
> This probably causes a crash it did last time I tried it. You need to verify
> that the loader code which owns the handle never references it past this point.

I used WebKit compiled with this patch for normal navigation and I didn't get any crash, but probably there are corner cases they I was not able to reproduce.
 
> I had to deref in the next run of the event loop.

I wonder why the win port uses delete instead of deref, if we want to be more sure that nothing bad is going to happen we could deref the ResourceHandle in the next run of the event loop.

Comment 4 Michael Emmel 2008-09-11 09:45:22 PDT
Have you looked at the ncurl patch ?
Comment 5 Eric Seidel (no email) 2008-09-12 12:26:57 PDT
Comment on attachment 23343 [details]
Fix the leak

I hate this kind of memory management.  (Nothing personal).  Could we change this API so that it either used ref-counting, or didn't have this kind of single-ownership memory model burried inside these functions which are named "remove" and not "destroy".  Why is it clear that passing something for removal should destroy it!?
Comment 6 Michael Emmel 2008-09-12 12:46:04 PDT
(In reply to comment #5)
> (From update of attachment 23343 [details] [edit])
> I hate this kind of memory management.  (Nothing personal).  Could we change
> this API so that it either used ref-counting, or didn't have this kind of
> single-ownership memory model burried inside these functions which are named
> "remove" and not "destroy".  Why is it clear that passing something for removal
> should destroy it!?
> 

Eric if the loader would up the reference count by one we can drop ours at the right time and destroy the backend resources correctly. The ref count should be two IMHO.
Comment 7 Marco Barisione 2008-10-06 08:34:23 PDT
(In reply to comment #5)
> (From update of attachment 23343 [details] [edit])
> I hate this kind of memory management.  (Nothing personal).  Could we change
> this API so that it either used ref-counting, or didn't have this kind of
> single-ownership memory model burried inside these functions which are named
> "remove" and not "destroy".  Why is it clear that passing something for removal
> should destroy it!?

I'm not sure to understand what you mean. I copied the direct use of delete from the Windows port, would be enough for you to deref the object instead of deleting it?

Comment 8 Michael Emmel 2008-10-06 10:17:14 PDT
Unless the code has changed you will see that the Ref is still accessed in the loader.
By using delete the memory is referenced after its freed.
Comment 9 Frederic Marmond 2008-10-22 08:27:04 PDT
I worked on this bug, and found that upgrading libcurl from 7.18.0 to 7.18.2 fixed memory leak problems.
Can you test please?
Comment 10 Michael Emmel 2008-10-22 11:47:14 PDT
(In reply to comment #9)
> I worked on this bug, and found that upgrading libcurl from 7.18.0 to 7.18.2
> fixed memory leak problems.
> Can you test please?
> 

How did you get the ref count down to zero ?
The problem with the current patch is it deletes pointer leaving the ref pointed at freed memory.
Comment 11 Frederic Marmond 2008-10-23 00:58:16 PDT
(In reply to comment #10)
> (In reply to comment #9)
> 
> How did you get the ref count down to zero ?
> The problem with the current patch is it deletes pointer leaving the ref
> pointed at freed memory.
> 

In fact, I didn't want to say upgrading libcurl solved ALL memory leaks, but lots of them, which were inside libcurl 7.18.0. I thought that this info may be useful to guys who are reading this bug.
Applying the attached patch didn't made any crash for me, and save few kB. upgrading libcurl saved me few MB (on gmail load, for example).
Comment 12 Michael Emmel 2008-10-23 01:02:43 PDT
Understood.

Correct there are actually two issues.
Comment 13 Kalle Vahlman 2008-11-24 13:40:31 PST
(In reply to comment #7)
> (In reply to comment #5)
> > (From update of attachment 23343 [details] [review] [edit])
> > I hate this kind of memory management.  (Nothing personal).  Could we change
> > this API so that it either used ref-counting, or didn't have this kind of
> > single-ownership memory model burried inside these functions which are named
> > "remove" and not "destroy".  Why is it clear that passing something for removal
> > should destroy it!?
> 
> I'm not sure to understand what you mean. I copied the direct use of delete
> from the Windows port, would be enough for you to deref the object instead of
> deleting it?

I switched the delete's to derefs() and the ResourceHandle leaks vanished from valgrind (for the cases I tested with, eg. planet.gnome.org).

Reading the code, the ResourceHandleCurl implementation adds a ref for itself (to protect against the ResourceLoader releasing it's ref) and then the ResourceHandleManager code removes it. Now, to me it looks like it would be clearer to move the ref from ResourceHandleCurl to ResourceHandleManager:add() so the ref is managed by ResourceHandleManager only.

I can redo Marco's patch if needed, but let's get this fixed! :)
Comment 14 Michael Emmel 2008-11-24 16:30:16 PST
You probably should take a look at the lifetime of the handle over in the loader code. The original problem was that the handle was accessed by the loader code after it was removed from curl and the loader code did not ref itself.

I think what should happen is both curl and the loader code should ref the handle until neither is using it i.e the ref count should be two for a active handle.

I suspect the loader code has been refactored to remove this problem. If so I may be wrong about needing the ref at all in curl you might be able to remove it.

So if your not segfaulting now maybe we don't need to ref at all. Thats the best approach i.e the lifetime of the Reference is fully controlled by the loader code.

Comment 15 Kalle Vahlman 2008-11-24 22:59:33 PST
(In reply to comment #14)
> You probably should take a look at the lifetime of the handle over in the
> loader code. The original problem was that the handle was accessed by the
> loader code after it was removed from curl and the loader code did not ref
> itself.
>
> I think what should happen is both curl and the loader code should ref the
> handle until neither is using it i.e the ref count should be two for a active
> handle.
> 
> I suspect the loader code has been refactored to remove this problem. If so I
> may be wrong about needing the ref at all in curl you might be able to remove
> it.

Yes, I looked at the loader and it keeps a RefPtr of the ResourceHandle. I also checked that the refcount indeed was 2 when the handle comes in. The ref is really need for the curl manager since it needs to cleanup the curl side with the ResourceHandleInternal data and calling to the ResourceHandleClient callbacks might (and does) result in the client dropping its ref. I verified that whenever we hit the removeFromCurl() where the deref happens, the refcount was already 1 (ie. the client had released the ref).

> So if your not segfaulting now maybe we don't need to ref at all. Thats the
> best approach i.e the lifetime of the Reference is fully controlled by the
> loader code.
 
I'm not sure if that's easily possible as the curl seems to operate in async mode, but if the curl operations are reliably cancellable I suppose it would be doable by specifically cleaning the ongoing curl actions through ~ResourceHandle or so.

But before rewriting it, the patch with the proposed modifications should do the Right Thing for the current situation.
Comment 16 Michael Emmel 2008-11-25 08:00:54 PST
(In reply to comment #15)
> (In reply to comment #14)
> > You probably should take a look at the lifetime of the handle over in the
> > loader code. The original problem was that the handle was accessed by the
> > loader code after it was removed from curl and the loader code did not ref
> > itself.
> >
> > I think what should happen is both curl and the loader code should ref the
> > handle until neither is using it i.e the ref count should be two for a active
> > handle.
> > 
> > I suspect the loader code has been refactored to remove this problem. If so I
> > may be wrong about needing the ref at all in curl you might be able to remove
> > it.
> 
> Yes, I looked at the loader and it keeps a RefPtr of the ResourceHandle. I also
> checked that the refcount indeed was 2 when the handle comes in. The ref is
> really need for the curl manager since it needs to cleanup the curl side with
> the ResourceHandleInternal data and calling to the ResourceHandleClient
> callbacks might (and does) result in the client dropping its ref. I verified
> that whenever we hit the removeFromCurl() where the deref happens, the refcount
> was already 1 (ie. the client had released the ref).
> 
> > So if your not segfaulting now maybe we don't need to ref at all. Thats the
> > best approach i.e the lifetime of the Reference is fully controlled by the
> > loader code.
> 
> I'm not sure if that's easily possible as the curl seems to operate in async
> mode, but if the curl operations are reliably cancellable I suppose it would be
> doable by specifically cleaning the ongoing curl actions through
> ~ResourceHandle or so.
> 
> But before rewriting it, the patch with the proposed modifications should do
> the Right Thing for the current situation.
> 


Yep should finally work correctly your right it can be cleaned up later.
I'd say get it submitted.
Comment 17 Kalle Vahlman 2008-12-14 23:33:28 PST
Created attachment 26023 [details]
Updated patch with refs() and manager-side refcount handling 

So finally I found time to create and attach the patch, but I noticed the warning about test cases in the changelog entry and am now wondering if that will block inclusion..?

I have personally no experience in creating test cases so if someone would help with that (if needed), it'd be great...
Comment 18 Holger Freyther 2008-12-16 06:51:36 PST
Comment on attachment 26023 [details]
Updated patch with refs() and manager-side refcount handling 

> diff --git a/WebCore/ChangeLog b/WebCore/ChangeLog

> +        WARNING: NO TEST CASES ADDED OR CHANGED

Just remove this is if you think it is not testable. I would agree that testing this is incredible hard with a LayoutTest and from within the API... I will run run-webkit-tests and see if we crash...



> index bbc31d4..ca24ec5 100644
> --- a/WebCore/platform/network/curl/ResourceHandleCurl.cpp
> +++ b/WebCore/platform/network/curl/ResourceHandleCurl.cpp
> @@ -104,7 +104,6 @@ ResourceHandle::~ResourceHandle()
>  bool ResourceHandle::start(Frame* frame)
>  {
>      ASSERT(frame);
> -    ref();
>      ResourceHandleManager::sharedInstance()->add(this);
>      return true;
>  }

yes!



> diff --git a/WebCore/platform/network/curl/ResourceHandleManager.cpp b/WebCore/platform/network/curl/ResourceHandleManager.cpp
> index 410f9eb..f1a2be4 100644
> --- a/WebCore/platform/network/curl/ResourceHandleManager.cpp
> +++ b/WebCore/platform/network/curl/ResourceHandleManager.cpp
> @@ -350,6 +350,7 @@ void ResourceHandleManager::removeFromCurl(ResourceHandle* job)
>      curl_multi_remove_handle(m_curlMultiHandle, d->m_handle);
>      curl_easy_cleanup(d->m_handle);
>      d->m_handle = 0;
> +    job->deref();
>  }

deref when the job was started and finished by curl? looks right.



>  
>  void ResourceHandleManager::setupPUT(ResourceHandle*, struct curl_slist**)
> @@ -443,6 +444,7 @@ void ResourceHandleManager::add(ResourceHandle* job)
>  {
>      // we can be called from within curl, so to avoid re-entrancy issues
>      // schedule this job to be added the next time we enter curl download loop
> +    job->ref();

yupp, ref it when we start to handle it..




>          if (job == m_resourceHandleList[i]) {
>              m_resourceHandleList.remove(i);
> +            job->deref();
>              return true;

Cool. It looks right. When we add something to m_resourceHandleList as ref the job, when we cancel/remove a not started job we deref, once curl finished (which means the job was started and removed from the m_resourceHandleList) we will deref it.
Comment 19 Holger Freyther 2008-12-16 07:32:01 PST
Landed in r39334.
Comment 20 Holger Freyther 2008-12-19 08:03:58 PST
Closing it.