WebKit Bugzilla
Bug 68160: curl backend has horrible performance
Status: UNCONFIRMED
https://bugs.webkit.org/show_bug.cgi?id=68160
Onne Gorter
Reported 2011-09-15 07:36:53 PDT
Created attachment 107496 [details]: fix

The curl backend always uses a poll timer and, worse, doesn't fully read all kernel buffers when there is work to do. Instead it always defers more work to the next poll interval. Loading large images takes a lot longer than it should. Attached is a fix that:
* schedules work immediately rather than waiting for the poll timeout
* never waits in select; it only asks the kernel for activity
* if select was useful, does the work and immediately tries select again
* polls every 0.02 seconds only if there is nothing left to do but wait on network traffic
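For illustration, a minimal sketch of the drive loop the patch describes, using the public libcurl multi API; the function driveCurl and its structure are assumptions for this sketch, not code from the attached patch:

    #include <curl/curl.h>
    #include <sys/select.h>

    /* Sketch: do all immediately available work, asking the kernel for
     * activity with a zero-timeout select instead of blocking. */
    static void driveCurl(CURLM* multiHandle)
    {
        int runningHandles = 0;
        for (;;) {
            /* Perform until curl has no more immediate work. */
            while (curl_multi_perform(multiHandle, &runningHandles) == CURLM_CALL_MULTI_PERFORM) { }

            fd_set readSet, writeSet, excSet;
            FD_ZERO(&readSet); FD_ZERO(&writeSet); FD_ZERO(&excSet);
            int maxFd = -1;
            curl_multi_fdset(multiHandle, &readSet, &writeSet, &excSet, &maxFd);

            /* Zero timeout: never wait in select, just ask for activity. */
            struct timeval zeroTimeout = { 0, 0 };
            if (select(maxFd + 1, &readSet, &writeSet, &excSet, &zeroTimeout) <= 0)
                break; /* nothing left but waiting: fall back to the 0.02 s poll timer */
        }
    }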
Attachments
fix (3.28 KB, patch), 2011-09-15 07:36 PDT, Onne Gorter, no flags
Gyuyoung Kim
Comment 1
2011-09-15 17:46:11 PDT
Comment on attachment 107496 [details] (fix)

I'm not sure whether the EFL port can use only the curl backend now. Are there any test cases that reproduce this problem? BTW, you are missing a ChangeLog. Please see this link:
http://www.webkit.org/coding/contributing.html#changelogs
Onne Gorter
Comment 2
2011-09-16 01:27:13 PDT
I doubt anybody uses curl as a "production" backend; it doesn't handle cookies nicely, and apparently it has horrible performance. EFL can use it, but it can also use libsoup if you enable glib support. A timed test that downloads large images (or other files) would show the problem. Before the patch, curl reads a maximum of BUFFER_SIZE * (1/0.05) bytes per second. My guess is BUFFER_SIZE = 16k, so about 320k per second, which is 3 seconds per MB, even from localhost.
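Spelling out that arithmetic as a runnable back-of-the-envelope check (BUFFER_SIZE = 16k is the comment's guess, not a confirmed value):

    #include <stdio.h>

    int main(void)
    {
        const double bufferSize = 16 * 1024;       /* bytes per read; assumed */
        const double pollsPerSecond = 1.0 / 0.05;  /* 20 ticks/s at a 50 ms interval */
        const double bytesPerSecond = bufferSize * pollsPerSecond;
        printf("max throughput: %.0f bytes/s (~320k)\n", bytesPerSecond);
        printf("time per MB: %.1f s\n", (1024.0 * 1024.0) / bytesPerSecond);
        return 0;
    }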
Gyuyoung Kim
Comment 3
2011-09-16 01:50:17 PDT
We are using libsoup by default, but I'm personally interested in the curl backend. I'm not an expert in curl, though. It looks like Alp Toker is the contact for the curl network backend, according to the WebKit Team wiki.
Raphael Kubo da Costa (:rakuco)
Comment 4
2011-09-16 08:39:35 PDT
Alp does not seem to be that active these days (his last commit is from 2009) according to git log. This is not a bug specific to EFL, so I'm changing the component to WebCore Misc and CC'ing Alp anyway. Onne, could you run Tools/Scripts/check-webkit-style on your patch and then resubmit?
ssseintr
Comment 5
2011-09-19 23:43:50 PDT
Hi all, we are also facing the curl performance issue; GMail is too slow. I'm currently investigating this issue. Do you face the GMail loading issue too? Regards, Vicky.
ssseintr
Comment 6
2011-09-21 01:04:30 PDT
Hi Onne, why have you used recursion here? Why can't we achieve it with a loop?

    if (again) {
        downloadTimerCallback(timer);
        return;
    }

Regards, Vicky.
Onne Gorter
Comment 7
2011-09-22 00:33:33 PDT
It could, but this way it has minimal impact on the existing code, and the diff stays small. Such simple recursion gets compiled away anyhow.
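For reference, a loop form of that tail call, as comment 6 suggests; Timer is a placeholder type and doOneRound a hypothetical stand-in for the rest of the callback body, not names from the patch:

    #include <stdbool.h>

    typedef struct Timer Timer;     /* placeholder for WebKit's timer type */
    bool doOneRound(Timer* timer);  /* hypothetical: one perform + select pass */

    /* Equivalent of `if (again) { downloadTimerCallback(timer); return; }`;
     * a compiler typically turns that tail call into exactly this loop. */
    static void downloadTimerCallback(Timer* timer)
    {
        bool again = true;
        while (again)
            again = doOneRound(timer);
    }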
ssseintr
Comment 8
2011-10-11 01:56:13 PDT
Hi, any updates? When will the patch land in the main tree? Regards, Vicky.
Onne Gorter
Comment 9
2011-10-11 02:39:49 PDT
Well, I don't really have a "normal" WebKit checkout, just something much smaller and patched for a specific use case. It is definitely not the latest upstream, so right now I cannot do what Raphael asks. Feel free to do it yourself: the patch does not fully comply with WebKit coding standards, but it is very small and simple, so it should be easy to adapt and land in the main tree. I will do it myself at some later point, when I get some time to invest in our WebKit and build environment, but no promises. Regards, -Onne
ssseintr
Comment 10
2011-10-11 23:42:21 PDT
Hi Onne, thanks. Yes, I did it myself; it is working for me without issues. Regards, Vicky.
Arunprasad
Comment 11
2012-08-03 01:52:54 PDT
If I move curl_multi_perform to the WebKit mainloop, will there be any performance improvement? Currently curl_multi_perform is called from a timer with a 50 ms interval. I think that design is behind the horrible performance, since most benchmarks say libcurl is far better than libsoup.
Onne Gorter
Comment 12
2012-08-06 01:58:06 PDT
> If I move curl_multi_perform to the WebKit mainloop

There is no WebKit mainloop; that is part of the problem. But yes, your GUI library, or whatever you run WebKit code from, probably has a mainloop. If you integrate curl_multi_perform there, that would totally fix the performance problem. It would still be wise to empty the kernel buffers before going back into the mainloop, but that is a minor detail.
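A sketch of that mainloop integration using libcurl's socket interface; watchFd, unwatchFd, and startTimer stand for whatever primitives the host event loop provides and are assumptions here, not libcurl API:

    #include <curl/curl.h>

    /* Hypothetical mainloop primitives, assumed to exist in the host app: */
    void watchFd(curl_socket_t s, int what);  /* register fd for read/write events */
    void unwatchFd(curl_socket_t s);
    void startTimer(long timeoutMs);          /* one-shot timer; -1 means cancel */

    static int socketCallback(CURL* easy, curl_socket_t s, int what, void* userp, void* socketp)
    {
        if (what == CURL_POLL_REMOVE)
            unwatchFd(s);
        else
            watchFd(s, what);  /* CURL_POLL_IN, CURL_POLL_OUT, or CURL_POLL_INOUT */
        return 0;
    }

    static int timerCallback(CURLM* multi, long timeoutMs, void* userp)
    {
        startTimer(timeoutMs);
        return 0;
    }

    /* Called by the mainloop when fd is ready, or with CURL_SOCKET_TIMEOUT
     * and a zero bitmask when the timer fires: */
    static void onSocketActivity(CURLM* multi, curl_socket_t fd, int eventBitmask)
    {
        int runningHandles = 0;
        curl_multi_socket_action(multi, fd, eventBitmask, &runningHandles);
    }

    /* Setup:
     *   curl_multi_setopt(multi, CURLMOPT_SOCKETFUNCTION, socketCallback);
     *   curl_multi_setopt(multi, CURLMOPT_TIMERFUNCTION, timerCallback); */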
Arunprasad
Comment 13
2012-08-06 02:05:49 PDT
> There is no WebKit mainloop; that is part of the problem.

Yes, wrongly phrased on my side; it is our implementation-specific event loop.

> It would still be wise to empty the kernel buffers before going back into the mainloop, but that is a minor detail.

I agree, but malfunctioning servers can be an issue here. Suppose a server keeps sending 1 byte per send; emptying that data outside the mainloop would cause responsiveness issues. Anyhow, this can be avoided by keeping track of the time spent emptying the buffer.
Onne Gorter
Comment 14
2012-08-06 02:27:49 PDT
> ... Anyhow, this can be avoided by keeping track of the time spent emptying the buffer.

I think you misunderstand the issue. When a file descriptor becomes readable, your select (or epoll/kqueue) will wake up, and then you actually read. You should read until the kernel buffer is empty (not until the remote is done sending!). So put the fd into nonblocking mode and read until EWOULDBLOCK.

A slow server means you read a few bytes and then go back to the mainloop. Much later the kernel receives the next few bytes and wakes your select again, you read some more... no issue there.

A fast server, however, means you read from the kernel using a 16k buffer while the kernel has maybe half a meg of data ready. The current implementation will read 16k, go back into the mainloop, immediately wake up because the same fd has more to read, finally get back to curl and read another 16k, then go back into the mainloop... and round and round. That is the behavior of curl as implemented in WebKit without this patch.

You can significantly reduce the CPU time spent in the fast-server case by reading in a loop until EWOULDBLOCK, instead of "looping" by going all the way back into the mainloop every time. That was the reason for my remark.

Hope this helps, good luck! -Onne
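A minimal sketch of that read-until-EWOULDBLOCK drain; drainFd and consume are illustrative names, not WebKit or libcurl API:

    #include <errno.h>
    #include <fcntl.h>
    #include <unistd.h>

    static void drainFd(int fd, void (*consume)(const char*, ssize_t))
    {
        /* Nonblocking mode, so read() returns instead of stalling. */
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

        char buffer[16 * 1024];
        for (;;) {
            ssize_t n = read(fd, buffer, sizeof buffer);
            if (n > 0) {
                consume(buffer, n);  /* keep going: the kernel may hold much more */
                continue;
            }
            if (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN))
                break;               /* kernel buffer empty: back to the mainloop */
            break;                   /* n == 0 (EOF) or a real error */
        }
    }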