WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
55859
[Chromium] The method that retrieves all resources from a page should be moved from Chromium to WebKit
https://bugs.webkit.org/show_bug.cgi?id=55859
Summary
[Chromium] The method that retrieves all resources from a page should be move...
Jay Civelli
Reported
2011-03-06 23:46:15 PST
In the Chromium WebKit glue code, dom_operations.[h, cc], a method called GetAllSavableResourceLinksForCurrentPage is used to retrieve all the resources from a page. It should be moved to WebKit. One of the key reasons is that we want that method to also report CSS resources, and keeping it in Chromium would required to expose unnecessarily the CSS info through the Chromium WebKit API.
Attachments
Initial patch
(24.73 KB, patch)
2011-03-06 23:59 PST
,
Jay Civelli
levin
: review-
Details
Formatted Diff
Diff
Fix build and style issue
(25.26 KB, patch)
2011-03-07 23:22 PST
,
Jay Civelli
levin
: review-
Details
Formatted Diff
Diff
Addressing David's remarks.
(25.17 KB, patch)
2011-03-15 17:02 PDT
,
Jay Civelli
no flags
Details
Formatted Diff
Diff
Fixed style.
(25.28 KB, patch)
2011-03-15 17:20 PDT
,
Jay Civelli
abarth
: review-
Details
Formatted Diff
Diff
Addressed Adam's comments
(27.51 KB, patch)
2011-03-16 13:54 PDT
,
Jay Civelli
no flags
Details
Formatted Diff
Diff
Fixed style...
(27.50 KB, patch)
2011-03-16 13:59 PDT
,
Jay Civelli
levin
: review-
Details
Formatted Diff
Diff
Addressed David's comments
(28.97 KB, patch)
2011-03-16 15:24 PDT
,
Jay Civelli
no flags
Details
Formatted Diff
Diff
Patch
(29.78 KB, patch)
2011-03-22 10:04 PDT
,
Jay Civelli
no flags
Details
Formatted Diff
Diff
One more attempt at landing this.
(27.75 KB, patch)
2011-03-23 16:21 PDT
,
Jay Civelli
no flags
Details
Formatted Diff
Diff
Show Obsolete
(8)
View All
Add attachment
proposed patch, testcase, etc.
Jay Civelli
Comment 1
2011-03-06 23:59:40 PST
Created
attachment 84924
[details]
Initial patch
WebKit Review Bot
Comment 2
2011-03-07 00:04:41 PST
Attachment 84924
[details]
did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebKit/chromium/ChangeLog', u'Sourc..." exit_code: 1 Last 3072 characters of output: izerTest.cpp:34: Alphabetical sorting problem. [build/include_order] [4] Source/WebKit/chromium/tests/WebPageSerializerTest.cpp:36: Found header this file implements after other header. Should be: config.h, primary header, blank line, and then alphabetically sorted. [build/include_order] [4] Source/WebKit/chromium/tests/WebPageSerializerTest.cpp:62: Should only a single space after a punctuation in a comment. [whitespace/comments] [5] Source/WebKit/chromium/tests/data/pageserialization/simple_iframe.html:1: One or more unexpected \r (^M) found; better to use only a \n [whitespace/carriage_return] [1] Suppressing further [whitespace/carriage_return] reports for this file. Source/WebKit/chromium/public/WebPageSerializer.h:77: The parameter name "view" adds no information, so it should be removed. [readability/parameter_name] [5] Source/WebKit/chromium/src/WebPageSerializer.cpp:62: Use 0 instead of NULL. [readability/null] [5] Source/WebKit/chromium/src/WebPageSerializer.cpp:69: Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side. [whitespace/operators] [4] Source/WebKit/chromium/src/WebPageSerializer.cpp:70: Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side. [whitespace/operators] [4] Source/WebKit/chromium/src/WebPageSerializer.cpp:71: Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side. [whitespace/operators] [4] Source/WebKit/chromium/src/WebPageSerializer.cpp:74: Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side. [whitespace/operators] [4] Source/WebKit/chromium/src/WebPageSerializer.cpp:75: Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side. [whitespace/operators] [4] Source/WebKit/chromium/src/WebPageSerializer.cpp:76: Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side. [whitespace/operators] [4] Source/WebKit/chromium/src/WebPageSerializer.cpp:106: Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side. [whitespace/operators] [4] Source/WebKit/chromium/src/WebPageSerializer.cpp:224: One line control clauses should not use braces. [whitespace/braces] [4] Source/WebKit/chromium/tests/data/pageserialization/simple_page.html:1: One or more unexpected \r (^M) found; better to use only a \n [whitespace/carriage_return] [1] Suppressing further [whitespace/carriage_return] reports for this file. Source/WebKit/chromium/tests/data/pageserialization/top_frame.html:1: One or more unexpected \r (^M) found; better to use only a \n [whitespace/carriage_return] [1] Suppressing further [whitespace/carriage_return] reports for this file. Total errors found: 266 in 11 files If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Review Bot
Comment 3
2011-03-07 00:10:26 PST
Attachment 84924
[details]
did not build on chromium: Build output:
http://queues.webkit.org/results/8102494
WebKit Review Bot
Comment 4
2011-03-07 01:03:00 PST
Attachment 84924
[details]
did not build on chromium: Build output:
http://queues.webkit.org/results/8107263
Dimitri Glazkov (Google)
Comment 5
2011-03-07 07:08:45 PST
Build breakage?
David Levin
Comment 6
2011-03-07 10:48:11 PST
Comment on
attachment 84924
[details]
Initial patch r- I may not be the one to r+ but it seems clear that this currently should get an r- due to lots of style issues and it broke the build (and the copyright years say 2010 instead of 2011).
Jay Civelli
Comment 7
2011-03-07 23:22:28 PST
Created
attachment 85026
[details]
Fix build and style issue
David Levin
Comment 8
2011-03-15 10:33:31 PDT
Comment on
attachment 85026
[details]
Fix build and style issue View in context:
https://bugs.webkit.org/attachment.cgi?id=85026&action=review
Just a few things to address and then this should be fine.
> Source/WebKit/chromium/src/WebDataSourceImpl.cpp:38 > +#include <googleurl/src/gurl.h>
Why do you need to include this?
> Source/WebKit/chromium/src/WebPageSerializer.cpp:34 > +#include "DocumentLoader.h"
Where is the document loaded used in here?
> Source/WebKit/chromium/src/WebPageSerializer.cpp:63 > + if (element.hasTagName(HTMLNames::imgTag) || element.hasTagName(HTMLNames::scriptTag)) {
Single line clause shouldn't have surrounding {}
> Source/WebKit/chromium/src/WebPageSerializer.cpp:82 > + // TODO(jnd): Add support for extracting links of sub-resources which
s/TODO({^)]*)/FIXME/
> Source/WebKit/chromium/src/WebPageSerializer.cpp:84 > + // See bug:
http://b/issue?id=1111667
.
Internal bug reference doesn't seem appropriate.
> Source/WebKit/chromium/src/WebPageSerializer.cpp:109 > + if (visitedFrames->find(frame) == notFound)
While find is more flexible, I wonder if there should be a contains method as well which just does this but would be more readable in these pretty common cases. Reading more of this code, I like this idea more and more.
> Source/WebKit/chromium/src/WebPageSerializer.cpp:120 > + if (!url.protocolIs("http") && !url.protocolIs("https") && !url.protocolIs("file"))
url.protocolInHTTPFamily url.protocolIsLocalFile
> Source/WebKit/chromium/src/WebPageSerializer.cpp:163 > + if (node->isElementNode()) {
if (!node->isElementNode()) continue; (Less indenting and "fail fast" approach.)
> Source/WebKit/chromium/tests/WebPageSerializerTest.cpp:55 > + WebPageSerializerTest() : m_webView(0), m_supportedSchemes(3U)
3U? Do we need the U?
> Source/WebKit/chromium/tests/WebPageSerializerTest.cpp:59 > + m_supportedSchemes[1] = "file";
Should this be [2] ? This seems to indicate a hole in the test coverage. Should it have https resources in it? What about file resources? What about unsupported schemes?
Jay Civelli
Comment 9
2011-03-15 17:02:57 PDT
Created
attachment 85879
[details]
Addressing David's remarks.
Jay Civelli
Comment 10
2011-03-15 17:07:33 PDT
(In reply to
comment #8
)
> (From update of
attachment 85026
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=85026&action=review
> > Just a few things to address and then this should be fine. > > > Source/WebKit/chromium/src/WebDataSourceImpl.cpp:38 > > +#include <googleurl/src/gurl.h> > > Why do you need to include this?
Removed.
> > Source/WebKit/chromium/src/WebPageSerializer.cpp:34 > > +#include "DocumentLoader.h" > > Where is the document loaded used in here?
At line 134: KURL frameURL = frame->loader()->documentLoader()->request().url();
> > Source/WebKit/chromium/src/WebPageSerializer.cpp:63 > > + if (element.hasTagName(HTMLNames::imgTag) || element.hasTagName(HTMLNames::scriptTag)) { > > Single line clause shouldn't have surrounding {}
I thought when following else statement had {} then we should use them on the "then" clause as well?
> > Source/WebKit/chromium/src/WebPageSerializer.cpp:82 > > + // TODO(jnd): Add support for extracting links of sub-resources which > > s/TODO({^)]*)/FIXME/
Done.
> > Source/WebKit/chromium/src/WebPageSerializer.cpp:84 > > + // See bug:
http://b/issue?id=1111667
. > > Internal bug reference doesn't seem appropriate.
Removed.
> > Source/WebKit/chromium/src/WebPageSerializer.cpp:109 > > + if (visitedFrames->find(frame) == notFound) > > While find is more flexible, I wonder if there should be a contains method as well which just does this but would be more readable in these pretty common cases.
Here I am using a Vector, not a WebVector, and it does not have a contain. Actually I now realize I am using the WebVector::find() in the test only, and I would probably only need a contains method. So I replaced find() with contains().
> Reading more of this code, I like this idea more and more. > > > Source/WebKit/chromium/src/WebPageSerializer.cpp:120 > > + if (!url.protocolIs("http") && !url.protocolIs("https") && !url.protocolIs("file")) > > url.protocolInHTTPFamily > url.protocolIsLocalFile
Done
> > Source/WebKit/chromium/src/WebPageSerializer.cpp:163 > > + if (node->isElementNode()) { > > if (!node->isElementNode()) > continue; > > (Less indenting and "fail fast" approach.)
OK.
> > Source/WebKit/chromium/tests/WebPageSerializerTest.cpp:55 > > + WebPageSerializerTest() : m_webView(0), m_supportedSchemes(3U) > > 3U? Do we need the U?
Yes, the fails to compile otherwise. (it's using the wrong constructor in that case).
> > Source/WebKit/chromium/tests/WebPageSerializerTest.cpp:59 > > + m_supportedSchemes[1] = "file"; > > Should this be [2] ?
Arg! Yes.
> This seems to indicate a hole in the test coverage. Should it have https resources in it? What about file resources? What about unsupported schemes?
Good idea, added.
WebKit Review Bot
Comment 11
2011-03-15 17:10:33 PDT
Attachment 85879
[details]
did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebKit/chromium/ChangeLog', u'Sourc..." exit_code: 1 Source/WebKit/chromium/public/WebVector.h:129: One or more unexpected \r (^M) found; better to use only a \n [whitespace/carriage_return] [1] Suppressing further [whitespace/carriage_return] reports for this file. Total errors found: 7 in 13 files If any of these errors are false positives, please file a bug against check-webkit-style.
Jay Civelli
Comment 12
2011-03-15 17:20:12 PDT
Created
attachment 85884
[details]
Fixed style.
Adam Barth
Comment 13
2011-03-15 17:33:45 PDT
Comment on
attachment 85884
[details]
Fixed style. View in context:
https://bugs.webkit.org/attachment.cgi?id=85884&action=review
WebPageSerializer.cpp is such great sadness. I re-wrote the whole thing, but never landed the new version. :( Almost every line of code in that file is wrong.
> Source/WebKit/chromium/src/WebPageSerializer.cpp:60 > +KURL getSubResourceURLFromElement(const Element& element)
I usually pass Elements by pointer, not by reference.
> Source/WebKit/chromium/src/WebPageSerializer.cpp:92 > + if (value.isNull() || value.isEmpty() || value.startsWith("javascript:", false))
value.startsWith("javascript:", false) is wrong. It could start with " javascript:" also, for example.
> Source/WebKit/chromium/src/WebPageSerializer.cpp:106 > + if ((element.hasTagName(HTMLNames::iframeTag) || element.hasTagName(HTMLNames::frameTag)) > + && element.isFrameOwnerElement()) {
What if it's an object or an embed element that has a frame?
> Source/WebKit/chromium/src/WebPageSerializer.cpp:114 > + if (url.isEmpty() || url.isNull() || !url.isValid())
isEmpty includes isNull, right?
Jay Civelli
Comment 14
2011-03-16 13:53:34 PDT
(In reply to
comment #13
)
> (From update of
attachment 85884
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=85884&action=review
> > WebPageSerializer.cpp is such great sadness. I re-wrote the whole thing, but never landed the new version. :( > > Almost every line of code in that file is wrong.
And it's only going to get worse, as I am planning to add some more code in that file :-) I looked at the other file briefly (WebPageSerializerImpl.cpp) and was also concerned. Any reason why you did not land your new version? Anything I can do to help?
> > Source/WebKit/chromium/src/WebPageSerializer.cpp:60 > > +KURL getSubResourceURLFromElement(const Element& element) > > I usually pass Elements by pointer, not by reference.
OK
> > Source/WebKit/chromium/src/WebPageSerializer.cpp:92 > > + if (value.isNull() || value.isEmpty() || value.startsWith("javascript:", false)) > > value.startsWith("javascript:", false) is wrong. It could start with " javascript:" also, for example.
OK, now stripping white-space.
> > Source/WebKit/chromium/src/WebPageSerializer.cpp:106 > > + if ((element.hasTagName(HTMLNames::iframeTag) || element.hasTagName(HTMLNames::frameTag)) > > + && element.isFrameOwnerElement()) { > > What if it's an object or an embed element that has a frame?
Now supported.
> > Source/WebKit/chromium/src/WebPageSerializer.cpp:114 > > + if (url.isEmpty() || url.isNull() || !url.isValid()) > > isEmpty includes isNull, right?
Oh, yeah.
Jay Civelli
Comment 15
2011-03-16 13:54:21 PDT
Created
attachment 85968
[details]
Addressed Adam's comments
WebKit Review Bot
Comment 16
2011-03-16 13:56:11 PDT
Attachment 85968
[details]
did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebKit/chromium/ChangeLog', u'Sourc..." exit_code: 1 Source/WebKit/chromium/tests/data/pageserialization/object_iframe.html:1: One or more unexpected \r (^M) found; better to use only a \n [whitespace/carriage_return] [1] Suppressing further [whitespace/carriage_return] reports for this file. Source/WebKit/chromium/src/WebPageSerializer.cpp:90: One line control clauses should not use braces. [whitespace/braces] [4] Source/WebKit/chromium/tests/data/pageserialization/embed_iframe.html:1: One or more unexpected \r (^M) found; better to use only a \n [whitespace/carriage_return] [1] Suppressing further [whitespace/carriage_return] reports for this file. Total errors found: 11 in 15 files If any of these errors are false positives, please file a bug against check-webkit-style.
Jay Civelli
Comment 17
2011-03-16 13:59:00 PDT
Created
attachment 85970
[details]
Fixed style...
WebKit Review Bot
Comment 18
2011-03-16 14:01:09 PDT
Attachment 85970
[details]
did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebKit/chromium/ChangeLog', u'Sourc..." exit_code: 1 Source/WebKit/chromium/src/WebPageSerializer.cpp:90: One line control clauses should not use braces. [whitespace/braces] [4] Total errors found: 1 in 15 files If any of these errors are false positives, please file a bug against check-webkit-style.
David Levin
Comment 19
2011-03-16 14:44:53 PDT
Comment on
attachment 85970
[details]
Fixed style... View in context:
https://bugs.webkit.org/attachment.cgi?id=85970&action=review
r- because I want Vector::contains :)
> Source/WebKit/chromium/src/WebPageSerializer.cpp:61 > +{
ASSERT(element);
> Source/WebKit/chromium/src/WebPageSerializer.cpp:63 > + if (element->hasTagName(HTMLNames::imgTag) || element->hasTagName(HTMLNames::scriptTag)) {
Style bot missed this one but it is still true: "One line control clauses should not use braces. [whitespace/braces] [4]"
> Source/WebKit/chromium/src/WebPageSerializer.cpp:65 > + } else if (element->hasTagName(HTMLNames::inputTag)) {
Leave the braces for this if.
> Source/WebKit/chromium/src/WebPageSerializer.cpp:72 > + || element->hasTagName(HTMLNames::tdTag)) {
Remove them here.
> Source/WebKit/chromium/src/WebPageSerializer.cpp:77 > + || element->hasTagName(HTMLNames::insTag)) {
Remove them here.
> Source/WebKit/chromium/src/WebPageSerializer.cpp:79 > + } else if (element->hasTagName(HTMLNames::linkTag)) {
Leave the braces for this if.
> Source/WebKit/chromium/src/WebPageSerializer.cpp:86 > + } else if (element->hasTagName(HTMLNames::objectTag)) {
Remove them here.
>> Source/WebKit/chromium/src/WebPageSerializer.cpp:90 >> + } > > One line control clauses should not use braces. [whitespace/braces] [4]
What Style bot said.
> Source/WebKit/chromium/src/WebPageSerializer.cpp:115 > + if (visitedFrames->find(frame) == notFound)
May I recommend adding contains to Vector (just implement in terms of find)?
> Source/WebKit/chromium/src/WebPageSerializer.cpp:130 > + if (resourceURLs->find(url) == notFound)
Use contains.
> Source/WebKit/chromium/src/WebPageSerializer.cpp:159 > + if (visitedFrames->find(frame) != notFound)
Use contains.
> Source/WebKit/chromium/src/WebPageSerializer.cpp:162 > + if (frameURLs->find(frameURL) == notFound)
Use contains.
Jay Civelli
Comment 20
2011-03-16 15:24:13 PDT
Created
attachment 85987
[details]
Addressed David's comments Addressed all comments.
David Levin
Comment 21
2011-03-16 15:28:28 PDT
Comment on
attachment 85987
[details]
Addressed David's comments View in context:
https://bugs.webkit.org/attachment.cgi?id=85987&action=review
> Source/JavaScriptCore/ChangeLog:10 > + (WTF::::contains):
Consider fixing the to be the right function names in the future but ok.
WebKit Commit Bot
Comment 22
2011-03-17 12:47:02 PDT
The commit-queue encountered the following flaky tests while processing
attachment 85987
[details]
: http/tests/inspector/network/network-size.html
bug 56581
(author:
pfeldman@chromium.org
) The commit-queue is continuing to process your patch.
WebKit Commit Bot
Comment 23
2011-03-17 12:49:43 PDT
Comment on
attachment 85987
[details]
Addressed David's comments Clearing flags on attachment: 85987 Committed
r81377
: <
http://trac.webkit.org/changeset/81377
>
WebKit Commit Bot
Comment 24
2011-03-17 12:49:50 PDT
All reviewed patches have been landed. Closing bug.
James Simonsen
Comment 25
2011-03-21 13:28:31 PDT
The new unit test from this patch is failing already. This is being rolled out in
bug 56765
. The failure is: [ RUN ] WebPageSerializerTest.MultipleFrames .\tests\WebPageSerializerTest.cpp(182): error: Value of: resources.size() Actual: 5 Expected: 6 .\tests\WebPageSerializerTest.cpp(186): error: Value of: webVectorContains(resources, "
http://www.test.com/music.mid
") Actual: false Expected: true [ FAILED ] WebPageSerializerTest.MultipleFrames (16 ms)
David Levin
Comment 26
2011-03-21 13:31:37 PDT
Reopened.
Jay Civelli
Comment 27
2011-03-22 10:04:21 PDT
Created
attachment 86467
[details]
Patch
Jay Civelli
Comment 28
2011-03-22 10:07:36 PDT
Another attempt at landing this patch. I am not entirely sure while the WebPageSerializerTest.MultipleFrames test failed, as it passes in both debug and release mode on my box. There was an unbalanced tag for the resource reported as missing: <embed src="music.mid"></object> that I fixed, which should hopefully fix the problem.
WebKit Commit Bot
Comment 29
2011-03-22 12:18:16 PDT
Comment on
attachment 86467
[details]
Patch Clearing flags on attachment: 86467 Committed
r81686
: <
http://trac.webkit.org/changeset/81686
>
WebKit Commit Bot
Comment 30
2011-03-22 12:18:24 PDT
All reviewed patches have been landed. Closing bug.
James Simonsen
Comment 31
2011-03-22 19:51:45 PDT
This broke the main Chrome Win bot again. I don't know why it only affects that one. The Win WebKit Canary and Deps bots run it just fine.
Jay Civelli
Comment 32
2011-03-23 16:21:20 PDT
Created
attachment 86718
[details]
One more attempt at landing this. Not sure why the test fails on one of the bot. One of the embed src is not reported. This could be caused by a plugin. Disabling that resource in the test and adding a FIXME: for now.
Jay Civelli
Comment 33
2011-03-23 16:33:56 PDT
Reopening for rerelanding.
WebKit Commit Bot
Comment 34
2011-03-23 19:40:36 PDT
The commit-queue encountered the following flaky tests while processing
attachment 86718
[details]
: http/tests/loading/bad-server-subframe.html
bug 56985
(author:
mjs@apple.com
) The commit-queue is continuing to process your patch.
WebKit Commit Bot
Comment 35
2011-03-23 19:44:15 PDT
Comment on
attachment 86718
[details]
One more attempt at landing this. Clearing flags on attachment: 86718 Committed
r81846
: <
http://trac.webkit.org/changeset/81846
>
WebKit Commit Bot
Comment 36
2011-03-23 19:44:21 PDT
All reviewed patches have been landed. Closing bug.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug