Bug 104665 - [EFL] API unit tests are running extremely slow on the bots
Summary: [EFL] API unit tests are running extremely slow on the bots
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit EFL (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Thiago Marcos P. Santos
URL:
Keywords:
Depends on: 104666 104782
Blocks: 90451
  Show dependency treegraph
 
Reported: 2012-12-11 05:53 PST by Thiago Marcos P. Santos
Modified: 2013-01-21 11:33 PST (History)
15 users (show)

See Also:


Attachments
Proposed Patch (1.59 KB, patch)
2012-12-23 22:46 PST, Joone Hur
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Thiago Marcos P. Santos 2012-12-11 05:53:37 PST
They take ~60 seconds a full run on my desktop machine. For some reason on the bots we have a lot of dangling WebProcess sucking 100% of the CPU.
Comment 1 Thiago Marcos P. Santos 2012-12-11 09:02:21 PST
Problem found!

The cache is creating too many garbage at:
/home/buildslave-1/.cache/WebKitEfl/
Comment 2 Thiago Marcos P. Santos 2012-12-11 09:17:59 PST
Guys, this is a serious issue.

The conclusion here is: soup_cache creates a lot of files over the time at .cache/WebKitEfl. When I checked the bots, it had close to 66k files.

Performance will deteriorate over time, since soup_cache_load() at WebProcessMainEfl.cpp does a complete readdir(.cache/WebKitEfl), so it will be O(n) and it takes a lot of CPU for that. *sigh*

We have to limit somehow the number of entries at this cache.
Comment 3 Kenneth Rohde Christiansen 2012-12-11 09:21:07 PST
void soup_cache_set_max_size(SoupCache *cache, guint max_size);

http://developer.gnome.org/libsoup/unstable/SoupCache.html
Comment 4 Kenneth Rohde Christiansen 2012-12-11 09:27:14 PST
Why not use a different cache dir when running the tests and clear it before running the tests. Also I think something like 50MB should be sufficient and that should keep the file amount down as well
Comment 5 Kenneth Rohde Christiansen 2012-12-11 09:30:38 PST
https://bugzilla.mozilla.org/show_bug.cgi?id=559942 
https://code.google.com/p/chromium/issues/detail?id=40079

Info on chosen cache sizes in other browsers
Comment 6 Sergio Villar Senin 2012-12-12 01:09:54 PST
(In reply to comment #2)
> Guys, this is a serious issue.
> 
> The conclusion here is: soup_cache creates a lot of files over the time at .cache/WebKitEfl. When I checked the bots, it had close to 66k files.
> 
> Performance will deteriorate over time, since soup_cache_load() at WebProcessMainEfl.cpp does a complete readdir(.cache/WebKitEfl), so it will be O(n) and it takes a lot of CPU for that. *sigh*

That is not correct. The load of the cache reads the contents of a single file which stores the cache indexes. It's true that we need to improve that (there are several bugs in the gnome bugzilla for that) but it isn't not a complete readdir.

> We have to limit somehow the number of entries at this cache.

There are a couple of options:

* setup a temporal dir for the cache during testing
* disable the cache during testing
* limit the size of the cache during testing
Comment 7 Joone Hur 2012-12-12 02:04:15 PST
I think we need an API to clear the cache files before running the test cases.
So, I've filed a bug: https://bugs.webkit.org/show_bug.cgi?id=104782
Comment 8 Thiago Marcos P. Santos 2012-12-12 08:15:56 PST
(In reply to comment #6)
> (In reply to comment #2)
> > Guys, this is a serious issue.
> > 
> > The conclusion here is: soup_cache creates a lot of files over the time at .cache/WebKitEfl. When I checked the bots, it had close to 66k files.
> > 
> > Performance will deteriorate over time, since soup_cache_load() at WebProcessMainEfl.cpp does a complete readdir(.cache/WebKitEfl), so it will be O(n) and it takes a lot of CPU for that. *sigh*
> 
> That is not correct. The load of the cache reads the contents of a single file which stores the cache indexes. It's true that we need to improve that (there are several bugs in the gnome bugzilla for that) but it isn't not a complete readdir.

Indeed, but looks like the entry count grows at the same rate as the file count on the folder. The process spends a lot of time inserting these entries on a list.

> 
> > We have to limit somehow the number of entries at this cache.
> 
> There are a couple of options:
> 
> * setup a temporal dir for the cache during testing
> * disable the cache during testing
> * limit the size of the cache during testing

My main concern here is from the browser user point of view. I really don't expect the users to clear the cache when they fell like the browser is not performing.

At the time when I deleted the .cache/WebKitEfl folder on the bots, it had 65k files and the size of ~200MB. After removing the cache, the unit test run time dropped from 17 min to 1 min, which tells something about the impact of loading the cache when starting a WebProcess.

I did that yesterday and today it is already 1k files and ~3MB. I gonna fix this on the bots by creating some sort of temp dir, but it won't solve the underlining problem.

Would be good to work together with the libsoup developers on this issue.
Comment 9 Dominik Röttsches (drott) 2012-12-12 08:17:48 PST
CC'ed Dan Winship - maybe he can comment.
Comment 10 Sergio Villar Senin 2012-12-12 08:26:43 PST
(In reply to comment #8)
> (In reply to comment #6)
> > (In reply to comment #2)
> > > Guys, this is a serious issue.
> > > 
> > > The conclusion here is: soup_cache creates a lot of files over the time at .cache/WebKitEfl. When I checked the bots, it had close to 66k files.
> > > 
> > > Performance will deteriorate over time, since soup_cache_load() at WebProcessMainEfl.cpp does a complete readdir(.cache/WebKitEfl), so it will be O(n) and it takes a lot of CPU for that. *sigh*
> > 
> > That is not correct. The load of the cache reads the contents of a single file which stores the cache indexes. It's true that we need to improve that (there are several bugs in the gnome bugzilla for that) but it isn't not a complete readdir.
> 
> Indeed, but looks like the entry count grows at the same rate as the file count on the folder. The process spends a lot of time inserting these entries on a list.
> 
> > 
> > > We have to limit somehow the number of entries at this cache.
> > 
> > There are a couple of options:
> > 
> > * setup a temporal dir for the cache during testing
> > * disable the cache during testing
> > * limit the size of the cache during testing
> 
> My main concern here is from the browser user point of view. I really don't expect the users to clear the cache when they fell like the browser is not performing.
> 
> At the time when I deleted the .cache/WebKitEfl folder on the bots, it had 65k files and the size of ~200MB. After removing the cache, the unit test run time dropped from 17 min to 1 min, which tells something about the impact of loading the cache when starting a WebProcess.
> 
> I did that yesterday and today it is already 1k files and ~3MB. I gonna fix this on the bots by creating some sort of temp dir, but it won't solve the underlining problem.
> 
> Would be good to work together with the libsoup developers on this issue.

So as I said there are a couple of reasons why it takes so long. The main one is likely that the HTTP headers are currently stored in the index. Loading the index means allocating space for all of them and also deserialize them from the index file.

I opened https://bugzilla.gnome.org/show_bug.cgi?id=665884 some time ago to fix that. The patches were accepted but I didn't commit them yet because I am not totally sure that they're correct (not having the headers in memory added some complexities to the SoupCache).
Comment 11 Thiago Marcos P. Santos 2012-12-12 14:47:39 PST
(In reply to comment #10)
> (In reply to comment #8)
> > (In reply to comment #6)
> > > (In reply to comment #2)
> > > > Guys, this is a serious issue.
> > > > 
> > > > The conclusion here is: soup_cache creates a lot of files over the time at .cache/WebKitEfl. When I checked the bots, it had close to 66k files.
> > > > 
> > > > Performance will deteriorate over time, since soup_cache_load() at WebProcessMainEfl.cpp does a complete readdir(.cache/WebKitEfl), so it will be O(n) and it takes a lot of CPU for that. *sigh*
> > > 
> > > That is not correct. The load of the cache reads the contents of a single file which stores the cache indexes. It's true that we need to improve that (there are several bugs in the gnome bugzilla for that) but it isn't not a complete readdir.
> > 
> > Indeed, but looks like the entry count grows at the same rate as the file count on the folder. The process spends a lot of time inserting these entries on a list.
> > 
> > > 
> > > > We have to limit somehow the number of entries at this cache.
> > > 
> > > There are a couple of options:
> > > 
> > > * setup a temporal dir for the cache during testing
> > > * disable the cache during testing
> > > * limit the size of the cache during testing
> > 
> > My main concern here is from the browser user point of view. I really don't expect the users to clear the cache when they fell like the browser is not performing.
> > 
> > At the time when I deleted the .cache/WebKitEfl folder on the bots, it had 65k files and the size of ~200MB. After removing the cache, the unit test run time dropped from 17 min to 1 min, which tells something about the impact of loading the cache when starting a WebProcess.
> > 
> > I did that yesterday and today it is already 1k files and ~3MB. I gonna fix this on the bots by creating some sort of temp dir, but it won't solve the underlining problem.
> > 
> > Would be good to work together with the libsoup developers on this issue.
> 
> So as I said there are a couple of reasons why it takes so long. The main one is likely that the HTTP headers are currently stored in the index. Loading the index means allocating space for all of them and also deserialize them from the index file.
> 
> I opened https://bugzilla.gnome.org/show_bug.cgi?id=665884 some time ago to fix that. The patches were accepted but I didn't commit them yet because I am not totally sure that they're correct (not having the headers in memory added some complexities to the SoupCache).

I made the mistake of just deleting the cache on the bots. I should have saved it to help on the profiling the hotspots for optimization. Fortunately (if I can say that) the EFL bot maintained by Samsung still has the issue.

Gyuyoung, is it possible to share somehow the contents of .cache/WebKitEfl from the efl-buildslave-1 before you remove it? It would be very helpful on debugging this issue.
Comment 12 Gyuyoung Kim 2012-12-12 20:32:03 PST
(In reply to comment #11)
> Gyuyoung, is it possible to share somehow the contents of .cache/WebKitEfl from the efl-buildslave-1 before you remove it? It would be very helpful on debugging this issue.

Sure, I will share it via email or ftp server.
Comment 13 Joone Hur 2012-12-23 22:46:25 PST
Created attachment 180645 [details]
Proposed Patch

I've implemented ewk_context_resource_cache_clear API, which allows to clear all resource caches in memory and local storage. 
https://bugs.webkit.org/show_bug.cgi?id=104782

We can clear all cache files by calling this API before running the API unit tests.
Comment 14 Kenneth Rohde Christiansen 2012-12-24 01:48:39 PST
Comment on attachment 180645 [details]
Proposed Patch

Is this up for review? I still think the name is not so good, what is a resource cache? Does it include more than http?
Comment 15 Thiago Marcos P. Santos 2012-12-27 07:38:19 PST
Comment on attachment 180645 [details]
Proposed Patch

I'm fine with this patch as long as we create a patch to fix the performance degradation caused by the cache.
Comment 16 Gyuyoung Kim 2012-12-29 21:03:14 PST
Comment on attachment 180645 [details]
Proposed Patch

LGTM
Comment 17 Gyuyoung Kim 2012-12-29 21:05:31 PST
Comment on attachment 180645 [details]
Proposed Patch

It looks someone else might wanna have final a look.
Comment 18 Joone Hur 2012-12-29 21:07:05 PST
(In reply to comment #15)
> (From update of attachment 180645 [details])
> I'm fine with this patch as long as we create a patch to fix the performance degradation caused by the cache.

The ewk_context_resource_cache_clear API landed, so this patch could fix the performance degradation.

http://trac.webkit.org/changeset/138554
Comment 19 Thiago Marcos P. Santos 2013-01-07 11:57:09 PST
What about landing this patch? Joone?
Comment 20 Joone Hur 2013-01-07 14:53:25 PST
(In reply to comment #19)
> What about landing this patch? Joone?

I think that we need Kenneth's final confirmation.
Comment 21 Laszlo Gombos 2013-01-07 16:02:20 PST
(In reply to comment #6)
> (In reply to comment #2)
> > Guys, this is a serious issue.
> > 
> > The conclusion here is: soup_cache creates a lot of files over the time at .cache/WebKitEfl. When I checked the bots, it had close to 66k files.
> > 
> > Performance will deteriorate over time, since soup_cache_load() at WebProcessMainEfl.cpp does a complete readdir(.cache/WebKitEfl), so it will be O(n) and it takes a lot of CPU for that. *sigh*
> 
> That is not correct. The load of the cache reads the contents of a single file which stores the cache indexes. It's true that we need to improve that (there are several bugs in the gnome bugzilla for that) but it isn't not a complete readdir.

Sergio, can you share with us which soup bugs (bug ids or urls) are relevant for this issue so that we can track them ?
Comment 22 Dan Winship 2013-01-07 18:13:01 PST
https://bugzilla.gnome.org/show_bug.cgi?id=665884 moves the HTTP headers from the cache index file into the individual cache files, thereby making the index much smaller.

https://bugzilla.gnome.org/show_bug.cgi?id=666143 discusses changing the way cache files are stored, but Sergio keeps forgetting to attach his old patches there (hint hint :).
Comment 23 WebKit Review Bot 2013-01-21 11:33:24 PST
Comment on attachment 180645 [details]
Proposed Patch

Clearing flags on attachment: 180645

Committed r140348: <http://trac.webkit.org/changeset/140348>
Comment 24 WebKit Review Bot 2013-01-21 11:33:30 PST
All reviewed patches have been landed.  Closing bug.