WebKit Bugzilla
Bug 68160: curl backend has horrible performance
Status: UNCONFIRMED
https://bugs.webkit.org/show_bug.cgi?id=68160
Onne Gorter
Reported 2011-09-15 07:36:53 PDT
Created attachment 107496 [details]: fix

The curl backend always uses a poll timer and, worse, doesn't fully read all kernel buffers when there is work to do. Instead it always defers more work to the next poll interval. Loading large images takes a lot longer than it should. Attached is a fix that:
* schedules work immediately rather than waiting for the poll timeout
* never waits in select; it only asks the kernel for activity
* if select was useful, does the work and immediately tries select again
* polls every 0.02 seconds only if there is nothing left to do but wait on network traffic
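For illustration, a minimal sketch of the drive loop the patch describes, using the public libcurl multi API; the function driveCurl and its structure are assumptions for this sketch, not code from the attached patch:

    #include <curl/curl.h>
    #include <sys/select.h>

    /* Sketch: do all immediately available work, asking the kernel for
     * activity with a zero-timeout select instead of blocking. */
    static void driveCurl(CURLM* multiHandle)
    {
        int runningHandles = 0;
        for (;;) {
            /* Perform until curl has no more immediate work. */
            while (curl_multi_perform(multiHandle, &runningHandles) == CURLM_CALL_MULTI_PERFORM) { }

            fd_set readSet, writeSet, excSet;
            FD_ZERO(&readSet); FD_ZERO(&writeSet); FD_ZERO(&excSet);
            int maxFd = -1;
            curl_multi_fdset(multiHandle, &readSet, &writeSet, &excSet, &maxFd);

            /* Zero timeout: never wait in select, just ask for activity. */
            struct timeval zeroTimeout = { 0, 0 };
            if (select(maxFd + 1, &readSet, &writeSet, &excSet, &zeroTimeout) <= 0)
                break; /* nothing left but waiting: fall back to the 0.02 s poll timer */
        }
    }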
Attachments
fix (3.28 KB, patch), 2011-09-15 07:36 PDT, Onne Gorter, no flags
Gyuyoung Kim
Comment 1
2011-09-15 17:46:11 PDT
Comment on attachment 107496 [details] (fix)

I'm not sure whether the EFL port can use only the curl backend now. Are there any test cases that reproduce this problem? BTW, you are missing a ChangeLog. Please see this link:
http://www.webkit.org/coding/contributing.html#changelogs
Onne Gorter
Comment 2
2011-09-16 01:27:13 PDT
I doubt anybody uses curl as a "production" backend; it doesn't handle cookies nicely, and apparently it has horrible performance. EFL can use it, but it can also use libsoup if you enable glib support. A timed test that downloads large images (or other files) would show the problem. Before the patch, curl reads a maximum of BUFFER_SIZE * (1/0.05) bytes per second. My guess is BUFFER_SIZE = 16k, so about 320k per second, which is 3 seconds per MB, even from localhost.
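Spelling out that arithmetic as a runnable back-of-the-envelope check (BUFFER_SIZE = 16k is the comment's guess, not a confirmed value):

    #include <stdio.h>

    int main(void)
    {
        const double bufferSize = 16 * 1024;       /* bytes per read; assumed */
        const double pollsPerSecond = 1.0 / 0.05;  /* 20 ticks/s at a 50 ms interval */
        const double bytesPerSecond = bufferSize * pollsPerSecond;
        printf("max throughput: %.0f bytes/s (~320k)\n", bytesPerSecond);
        printf("time per MB: %.1f s\n", (1024.0 * 1024.0) / bytesPerSecond);
        return 0;
    }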
Gyuyoung Kim
Comment 3
2011-09-16 01:50:17 PDT
We are using libsoup by default, but I'm personally interested in the curl backend. I'm not an expert in curl, though. It looks like Alp Toker is the contact for the curl network backend, according to the WebKit Team wiki.
Raphael Kubo da Costa (:rakuco)
Comment 4
2011-09-16 08:39:35 PDT
Alp does not seem to be that active these days (his last commit is from 2009) according to git log. This is not a bug specific to EFL, so I'm changing the component to WebCore Misc and CC'ing Alp anyway. Onne, could you run Tools/Scripts/check-webkit-style on your patch and then resubmit?
ssseintr
Comment 5
2011-09-19 23:43:50 PDT
Hi all, we are also facing the curl performance issue; GMail is too slow. I'm currently investigating this issue. Do you face the GMail loading issue too? Regards, Vicky.
ssseintr
Comment 6
2011-09-21 01:04:30 PDT
Hi Onne, why have you used recursion here? Why can't we achieve it with a loop?

    if (again) {
        downloadTimerCallback(timer);
        return;
    }

Regards, Vicky.
Onne Gorter
Comment 7
2011-09-22 00:33:33 PDT
It could, but this way it has minimal impact on the existing code, and the diff stays small. Such simple recursion gets compiled away anyhow.
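For reference, a loop form of that tail call, as comment 6 suggests; Timer is a placeholder type and doOneRound a hypothetical stand-in for the rest of the callback body, not names from the patch:

    #include <stdbool.h>

    typedef struct Timer Timer;     /* placeholder for WebKit's timer type */
    bool doOneRound(Timer* timer);  /* hypothetical: one perform + select pass */

    /* Equivalent of `if (again) { downloadTimerCallback(timer); return; }`;
     * a compiler typically turns that tail call into exactly this loop. */
    static void downloadTimerCallback(Timer* timer)
    {
        bool again = true;
        while (again)
            again = doOneRound(timer);
    }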
ssseintr
Comment 8
2011-10-11 01:56:13 PDT
Hi, any updates? When will the patch land in the main tree? Regards, Vicky.
Onne Gorter
Comment 9
2011-10-11 02:39:49 PDT
Well, I don't really have a "normal" WebKit checkout, just something much smaller and patched for a specific use case. It is definitely not the latest upstream, so right now I cannot do what Raphael asks. Feel free to do it yourself: the patch does not fully comply with WebKit coding standards, but it is very small and simple, so it should be easy to adapt and land in the main tree. I will do it myself at some later point, when I get some time to invest in our WebKit and build environment, but no promises. Regards, -Onne
ssseintr
Comment 10
2011-10-11 23:42:21 PDT
Hi Onne, thanks. Yes, I did it myself; it is working for me without issues. Regards, Vicky.
Arunprasad
Comment 11
2012-08-03 01:52:54 PDT
If I move curl_multi_perform to the WebKit mainloop, will there be any performance improvement? Currently curl_multi_perform is called from a timer with a 50 ms interval. I think that design is behind the horrible performance, since most benchmarks say libcurl is far better than libsoup.
Onne Gorter
Comment 12
2012-08-06 01:58:06 PDT
> If I move curl_multi_perform to the WebKit mainloop

There is no WebKit mainloop; that is part of the problem. But yes, your GUI library, or whatever you run WebKit code from, probably has a mainloop. If you integrate curl_multi_perform there, that would totally fix the performance problem. It would still be wise to empty the kernel buffers before going back into the mainloop, but that is a minor detail.
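A sketch of that mainloop integration using libcurl's socket interface; watchFd, unwatchFd, and startTimer stand for whatever primitives the host event loop provides and are assumptions here, not libcurl API:

    #include <curl/curl.h>

    /* Hypothetical mainloop primitives, assumed to exist in the host app: */
    void watchFd(curl_socket_t s, int what);  /* register fd for read/write events */
    void unwatchFd(curl_socket_t s);
    void startTimer(long timeoutMs);          /* one-shot timer; -1 means cancel */

    static int socketCallback(CURL* easy, curl_socket_t s, int what, void* userp, void* socketp)
    {
        if (what == CURL_POLL_REMOVE)
            unwatchFd(s);
        else
            watchFd(s, what);  /* CURL_POLL_IN, CURL_POLL_OUT, or CURL_POLL_INOUT */
        return 0;
    }

    static int timerCallback(CURLM* multi, long timeoutMs, void* userp)
    {
        startTimer(timeoutMs);
        return 0;
    }

    /* Called by the mainloop when fd is ready, or with CURL_SOCKET_TIMEOUT
     * and a zero bitmask when the timer fires: */
    static void onSocketActivity(CURLM* multi, curl_socket_t fd, int eventBitmask)
    {
        int runningHandles = 0;
        curl_multi_socket_action(multi, fd, eventBitmask, &runningHandles);
    }

    /* Setup:
     *   curl_multi_setopt(multi, CURLMOPT_SOCKETFUNCTION, socketCallback);
     *   curl_multi_setopt(multi, CURLMOPT_TIMERFUNCTION, timerCallback); */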
Arunprasad
Comment 13
2012-08-06 02:05:49 PDT
> There is no WebKit mainloop; that is part of the problem.

Yes, wrongly phrased on my side; it is our implementation-specific event loop.

> It would still be wise to empty the kernel buffers before going back into the mainloop, but that is a minor detail.

I agree, but malfunctioning servers can be an issue here. Suppose a server keeps sending 1 byte per send; emptying that data outside the mainloop would cause responsiveness issues. Anyhow, this can be avoided by keeping track of the time spent emptying the buffer.
Onne Gorter
Comment 14
2012-08-06 02:27:49 PDT
> ... Anyhow, this can be avoided by keeping track of the time spent emptying the buffer.

I think you misunderstand the issue. When a file descriptor becomes readable, your select (or epoll/kqueue) will wake up, and then you actually read. You should read until the kernel buffer is empty (not until the remote is done sending!). So put the fd into nonblocking mode and read until EWOULDBLOCK.

A slow server means you read a few bytes and then go back to the mainloop. Much later the kernel receives the next few bytes and wakes your select again, you read some more... no issue there.

A fast server, however, means you read from the kernel using a 16k buffer while the kernel has maybe half a meg of data ready. The current implementation will read 16k, go back into the mainloop, immediately wake up because the same fd has more to read, finally get back to curl and read another 16k, then go back into the mainloop... and round and round. That is the behavior of curl as implemented in WebKit without this patch.

You can significantly reduce the CPU time spent in the fast-server case by reading in a loop until EWOULDBLOCK, instead of "looping" by going all the way back into the mainloop every time. That was the reason for my remark.

Hope this helps, good luck! -Onne
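A minimal sketch of that read-until-EWOULDBLOCK drain; drainFd and consume are illustrative names, not WebKit or libcurl API:

    #include <errno.h>
    #include <fcntl.h>
    #include <unistd.h>

    static void drainFd(int fd, void (*consume)(const char*, ssize_t))
    {
        /* Nonblocking mode, so read() returns instead of stalling. */
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

        char buffer[16 * 1024];
        for (;;) {
            ssize_t n = read(fd, buffer, sizeof buffer);
            if (n > 0) {
                consume(buffer, n);  /* keep going: the kernel may hold much more */
                continue;
            }
            if (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN))
                break;               /* kernel buffer empty: back to the mainloop */
            break;                   /* n == 0 (EOF) or a real error */
        }
    }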