RESOLVED FIXED74929
webkit-patch apply-attachment is very slow for big patches
https://bugs.webkit.org/show_bug.cgi?id=74929
Summary webkit-patch apply-attachment is very slow for big patches
Csaba Osztrogonác
Reported 2011-12-20 07:20:35 PST
I ran into this problem, because Qt EWS was very very slow today. And the slowness was caused by big patches in https://bugs.webkit.org/show_bug.cgi?id=73643 I measured a little bit and found that apply-attachment is absolutely non effective now. apply-attachment: 640 sec ($ Tools/Scripts/webkit-patch apply-attachment 120013 [details]) - Fetching: https://bugs.webkit.org/attachment.cgi?id=120013&action=edit : 100 sec - Fetching: https://bugs.webkit.org/show_bug.cgi?id=73643&ctype=xml : 420 sec - Processing patch 120013 from bug 73643 : 120 sec my simple method: 85 sec ($ wget https://bugs.webkit.org/attachment.cgi?id=120013 $ Tools/Scripts/svn-apply attachment.cgi?id=120013) - wget: 60 sec - svn-apply: 25 sec It would be great if we can make webkit-patch apply-attachment faster for huge patches to make EWS bots faster.
Attachments
show.xml.tmpl (6.77 KB, patch)
2012-04-12 07:45 PDT, Zoltan Arvai
ddkilzer: review-
eric: commit-queue-
Diff between default and custom show.xml.tmpl (695 bytes, patch)
2012-04-18 07:39 PDT, David Kilzer (:ddkilzer)
no flags
Untested fix to add "excludefield=attachmentdata" everywhere "ctype=xml" is used (4.01 KB, patch)
2012-04-18 08:19 PDT, David Kilzer (:ddkilzer)
no flags
r+-ed patch with changelog (5.28 KB, patch)
2012-05-09 00:59 PDT, Csaba Osztrogonác
no flags
Adam Barth
Comment 1 2011-12-20 11:41:11 PST
Assigned for investigation. Feel free to steal.
Eric Seidel (no email)
Comment 2 2011-12-20 12:11:18 PST
I suspect it's because we load the entire attachment into memory. Probably multiple times. see attachment.py: def contents(self): # FIXME: We shouldn't be grabbing at _bugzilla. return self._bug._bugzilla.fetch_attachment_contents(self.id()) Then we apply the contents via applypatch.py: self._tool.checkout().apply_patch(state["patch"], force=self._options.non_interactive or self._options.force_patch) which in turn pipes it through: self._executive.run_command(args, input=patch.contents()) I suspect that for starters, we should make sure contents() is never called more than once (And if so, cache the value).
Csaba Osztrogonác
Comment 3 2012-04-03 03:07:03 PDT
I ran into this bug again with a huge sized bug: EWS did the following things: Fetching: https://bugs.webkit.org/attachment.cgi?id=135006&action=edit Fetching: https://bugs.webkit.org/show_bug.cgi?id=25206&ctype=xml Fetching: https://bugs.webkit.org/attachment.cgi?id=135006&action=edit Fetching: https://bugs.webkit.org/show_bug.cgi?id=25206&ctype=xml Fetching: https://bugs.webkit.org/show_bug.cgi?id=25206&ctype=xml Unfortunately https://bugs.webkit.org/show_bug.cgi?id=25206&ctype=xml is 104Mb sized and it takes ~9.5 minutes to fetch it. I'm going to pick up this bug.
Nikolas Zimmermann
Comment 4 2012-04-03 03:12:42 PDT
(In reply to comment #3) > I ran into this bug again with a huge sized bug: > > EWS did the following things: > Fetching: https://bugs.webkit.org/attachment.cgi?id=135006&action=edit > Fetching: https://bugs.webkit.org/show_bug.cgi?id=25206&ctype=xml > Fetching: https://bugs.webkit.org/attachment.cgi?id=135006&action=edit > Fetching: https://bugs.webkit.org/show_bug.cgi?id=25206&ctype=xml > Fetching: https://bugs.webkit.org/show_bug.cgi?id=25206&ctype=xml > > Unfortunately https://bugs.webkit.org/show_bug.cgi?id=25206&ctype=xml > is 104Mb sized and it takes ~9.5 minutes to fetch it. > > I'm going to pick up this bug. Just a quick FYI: WildFox: Ossy: nope, I have no clue of webkitpy, I only know how to make it check the big file checkbox WildFox: Ossy: Tools/Scripts/webkitpy/common/net/bugzilla/bugzilla.py WildFox: Ossy: in "def _fill_attachment_form", right after "self.browser['ispatch'] = ("1",)" inject: WildFox: Ossy: self.browser['bigfile'] = ("bigfile",) This makes us always check the big file check box.. What we really should do is figure out the size before, and eventually check it, if needed (I'd even ask the user, your patch is XXX MB, do you really want to upload it...)
Eric Seidel (no email)
Comment 5 2012-04-03 03:20:35 PDT
We keep the patch in memory. Maybe more than one copy. :)
Csaba Osztrogonác
Comment 6 2012-04-03 06:13:21 PDT
https://bugs.webkit.org/show_bug.cgi?id=25206 This bug has many 7.5-10Mb sized attachments from Chromium EWS. It is overkiller for EWS bots fetch the whole bug 3 times. :( ( For Qt EWS it takes 30 minutes to process a patch from this bug. :( ) Can we make Chromium EWS not to upload huge diffs? URLs point to an external storage would be better.
Csaba Osztrogonác
Comment 7 2012-04-03 08:10:53 PDT
The size of the bug is 105Mb. :(( Our EWS always dies with OOM during processing this huge XML.
Rob Buis
Comment 8 2012-04-03 08:14:25 PDT
Hi Ossy, (In reply to comment #7) > The size of the bug is 105Mb. :(( > > Our EWS always dies with OOM during processing this huge XML. I was wondering, would obsoleting those archive attachments help? Cheers, Rob.
Csaba Osztrogonác
Comment 9 2012-04-03 08:23:09 PDT
(In reply to comment #8) > Hi Ossy, > > (In reply to comment #7) > > The size of the bug is 105Mb. :(( > > > > Our EWS always dies with OOM during processing this huge XML. > > I was wondering, would obsoleting those archive attachments help? > Cheers, > > Rob. I hope, let's see. I'll remove the r? from the patch, and then put it back after EWSs dropped it from their queue.
Eric Seidel (no email)
Comment 10 2012-04-03 12:34:15 PDT
I see. This has little to do with the attachment fetch, and all do to with the fact that bugzilla at some point started including attachments in the ctype=xml response. I've seen this for a while, but don't know how to fix it. Presumably we'll have to move off of ctype=xml, onto some custom BeautifulSoup parsing of the normal page. :( Ah! I see. It's a template option: https://bugzilla.mozilla.org/show_bug.cgi?id=302669 We'll just have to fix our template.
Eric Seidel (no email)
Comment 11 2012-04-03 12:36:36 PDT
http://trac.webkit.org/browser/trunk/Websites/bugs.webkit.org/template/en/default/bug/show.xml.tmpl#L106 I'm not sure how we disable the "displayfields.attachmentdata" option, but I'm sure we can figure it out. :) We can also copy that template over to the custom section and modify it as we see fit.
Zoltan Arvai
Comment 12 2012-04-12 07:45:06 PDT
Created attachment 136911 [details] show.xml.tmpl add bugs.webkit.org/template/en/custom/bug/show.xml.tmpl : I replaced this section [% IF displayfields.attachmentdata %] <data encoding="base64">[% a.data FILTER base64 %]</data> with this content: [% IF displayfields.attachmentdata %] [%# Return URL of attachment instead of data #%] <url>[% urlbase FILTER uri FILTER html %]attachment.cgi?id=[% a.id %]</url> Is it a valid solution for this problem? Is any other modification necessary to apply the new template file?
Eric Seidel (no email)
Comment 13 2012-04-12 10:37:06 PDT
Looks fine to me, but ddkilzer or wms would be better reviewers.
Eric Seidel (no email)
Comment 14 2012-04-12 10:37:50 PDT
Comment on attachment 136911 [details] show.xml.tmpl I'm going to r+, but please give them 24 hours to comment before changing to cq+. We're in no rush here.
David Kilzer (:ddkilzer)
Comment 15 2012-04-18 07:39:51 PDT
Created attachment 137684 [details] Diff between default and custom show.xml.tmpl
David Kilzer (:ddkilzer)
Comment 16 2012-04-18 07:53:02 PDT
(In reply to comment #10) > I see. This has little to do with the attachment fetch, and all do to with the fact that bugzilla at some point started including attachments in the ctype=xml response. I've seen this for a while, but don't know how to fix it. Presumably we'll have to move off of ctype=xml, onto some custom BeautifulSoup parsing of the normal page. :( > > Ah! I see. It's a template option: > https://bugzilla.mozilla.org/show_bug.cgi?id=302669 > > We'll just have to fix our template. Looking at the patches on that b.m.o. bug, it seems like adding this query parameter to the URL fetching the data would do the same thing: excludefield=attachmentdata Did anyone try modifying webkit-patch to add that query parameter to "ctype=xml"? I only ask because modifying templates in Bugzilla means that the template modification has to be reapplied every time we update Bugzilla, which tends to be labor-intensive. If we can just exclude the attachment data (and aren't using the attachment URL for anything), that seems like a better approach. Yes, I really think we should add the "excludefield=attachmentdata" query parameter instead.
David Kilzer (:ddkilzer)
Comment 17 2012-04-18 07:53:50 PDT
Comment on attachment 136911 [details] show.xml.tmpl r- since we should just add "excludefield=attachmentdata" to the URL to exclude this field from the XML. Compare these two URLs: https://bugs.webkit.org/show_bug.cgi?id=25206&ctype=xml https://bugs.webkit.org/show_bug.cgi?id=25206&ctype=xml&excludefield=attachmentdata
Csaba Osztrogonác
Comment 18 2012-04-18 07:56:54 PDT
(In reply to comment #17) > (From update of attachment 136911 [details]) > r- since we should just add "excludefield=attachmentdata" to the URL to exclude this field from the XML. Compare these two URLs: > > https://bugs.webkit.org/show_bug.cgi?id=25206&ctype=xml > https://bugs.webkit.org/show_bug.cgi?id=25206&ctype=xml&excludefield=attachmentdata Yayyy, it will be so simpler. Zoltán, could you prepare a patch to make webkit-patch add "&excludefield=attachmentdata" to the query?
David Kilzer (:ddkilzer)
Comment 19 2012-04-18 08:19:37 PDT
Created attachment 137695 [details] Untested fix to add "excludefield=attachmentdata" everywhere "ctype=xml" is used
David Kilzer (:ddkilzer)
Comment 20 2012-04-18 08:20:40 PDT
(In reply to comment #19) > Created an attachment (id=137695) [details] > Untested fix to add "excludefield=attachmentdata" everywhere "ctype=xml" is used I don't have time to test this, but this is basically the type of fix I'm suggesting.
Eric Seidel (no email)
Comment 21 2012-04-18 10:39:55 PDT
Comment on attachment 137695 [details] Untested fix to add "excludefield=attachmentdata" everywhere "ctype=xml" is used If this works, this sounds like the better fix, yes. I wonder what other options to show_bug.cgi I should know about. :)
Zoltan Arvai
Comment 22 2012-04-20 05:00:41 PDT
I ran some test with the patch, it seems ok.
Csaba Osztrogonác
Comment 23 2012-05-08 10:07:57 PDT
Oh ... I thought it is fixed long time ago. :) If you guys don't want to submit the full patch, I'll do it tomorrow.
Csaba Osztrogonác
Comment 24 2012-05-08 10:09:02 PDT
(In reply to comment #23) > Oh ... I thought it is fixed long time ago. :) If you guys don't want to submit the full patch, I'll do it tomorrow. We really need this patch, because Qt EWS bots are dying now, because they try to fetch https://bugs.webkit.org/show_bug.cgi?id=84614
Adam Barth
Comment 25 2012-05-08 10:11:53 PDT
Comment on attachment 137695 [details] Untested fix to add "excludefield=attachmentdata" everywhere "ctype=xml" is used Looks like this just needs a ChangeLog.
Csaba Osztrogonác
Comment 26 2012-05-09 00:59:40 PDT
Created attachment 140878 [details] r+-ed patch with changelog
Csaba Osztrogonác
Comment 27 2012-05-09 01:00:53 PDT
Comment on attachment 140878 [details] r+-ed patch with changelog David, is it OK for you? :)
Csaba Osztrogonác
Comment 28 2012-05-14 04:16:21 PDT
Comment on attachment 140878 [details] r+-ed patch with changelog There wasn't any objection in the last 3 weeks, so we shouldn't wait more (and block fixing this annoying bug).
WebKit Review Bot
Comment 29 2012-05-14 05:07:34 PDT
Comment on attachment 140878 [details] r+-ed patch with changelog Clearing flags on attachment: 140878 Committed r116934: <http://trac.webkit.org/changeset/116934>
WebKit Review Bot
Comment 30 2012-05-14 05:07:42 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.