WebKit Bugzilla
New
Browse
Search+
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
47748
[chromium] WebPageSerializerImpl duplicates (poorly) a lot of code from MarkupAccumulator
https://bugs.webkit.org/show_bug.cgi?id=47748
Summary
[chromium] WebPageSerializerImpl duplicates (poorly) a lot of code from Marku...
Adam Barth
Reported
2010-10-15 15:52:23 PDT
WebPageSerializerImpl duplicates (poorly) a lot of code from MarkupAccumulator
Attachments
work in progress
(13.14 KB, patch)
2010-10-15 15:53 PDT
,
Adam Barth
no flags
Details
Formatted Diff
Diff
more code
(25.94 KB, patch)
2010-10-15 16:47 PDT
,
Adam Barth
no flags
Details
Formatted Diff
Diff
mostly there
(32.87 KB, patch)
2010-10-15 17:32 PDT
,
Adam Barth
no flags
Details
Formatted Diff
Diff
what-what the compile
(34.20 KB, patch)
2010-10-18 14:00 PDT
,
Adam Barth
no flags
Details
Formatted Diff
Diff
patch (seems to work)
(36.72 KB, patch)
2010-10-18 15:05 PDT
,
Adam Barth
abarth
: review-
abarth
: commit-queue-
Details
Formatted Diff
Diff
Show Obsolete
(4)
View All
Add attachment
proposed patch, testcase, etc.
Adam Barth
Comment 1
2010-10-15 15:53:08 PDT
Created
attachment 70910
[details]
work in progress
Adam Barth
Comment 2
2010-10-15 15:55:54 PDT
Whoever write this code didn't understand how the web works. You can see them digging themselves deeper and deeper with each layer of complexity they added to try solve there problems. This is something of a re-write. The major design differences are as follows: 1) Start serializing from document instead of documentElement. This avoids the need to manufacture a bunch of fake markup. 2) Always serialize in UTF-8. This avoid a bunch of complexity about guess the charset wrong. UTF-8 is the one true charset of the web. 3) Instead of trying to produce markup, use the editing code to do the heavy lifting. This code should have no concept of a angle braket.
Adam Barth
Comment 3
2010-10-15 16:47:17 PDT
Created
attachment 70918
[details]
more code
Adam Barth
Comment 4
2010-10-15 17:32:33 PDT
Created
attachment 70929
[details]
mostly there
Ryosuke Niwa
Comment 5
2010-10-15 21:08:23 PDT
Comment on
attachment 70929
[details]
mostly there View in context:
https://bugs.webkit.org/attachment.cgi?id=70929&action=review
> WebCore/editing/MarkupAccumulator.cpp:342 > - if (shouldResolveURLs() && !element->document()->url().isLocalFile()) > - appendQuotedURLAttributeValue(out, element->document()->completeURL(attribute.value()).string()); > + if (shouldResolveURLs()) > + appendQuotedURLAttributeValue(out, resolveURL(element, attribute)); > else > appendQuotedURLAttributeValue(out, attribute.value());
You may combine these lines and just do: appendQuotedURLAttributeValue(out, shouldResolveURLs() ? resolveURL(element, attribute.value()) : attribute.value());
> WebKit/chromium/src/WebPageSerializerImpl.cpp:153 > + if (element->hasTagName(metaTag) && isCharsetDeclaration(element)) > + return
Now you can move this to serializeNodes!
Adam Barth
Comment 6
2010-10-18 12:51:42 PDT
(In reply to
comment #5
)
> (From update of
attachment 70929
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=70929&action=review
> > > WebCore/editing/MarkupAccumulator.cpp:342 > > - if (shouldResolveURLs() && !element->document()->url().isLocalFile()) > > - appendQuotedURLAttributeValue(out, element->document()->completeURL(attribute.value()).string()); > > + if (shouldResolveURLs()) > > + appendQuotedURLAttributeValue(out, resolveURL(element, attribute)); > > else > > appendQuotedURLAttributeValue(out, attribute.value()); > > You may combine these lines and just do: > appendQuotedURLAttributeValue(out, shouldResolveURLs() ? resolveURL(element, attribute.value()) : attribute.value());
Done.
> > WebKit/chromium/src/WebPageSerializerImpl.cpp:153 > > + if (element->hasTagName(metaTag) && isCharsetDeclaration(element)) > > + return > > Now you can move this to serializeNodes!
Investigating now that I've rebased to top-of-tree where serializeNodes exists.
Adam Barth
Comment 7
2010-10-18 12:58:07 PDT
Comment on
attachment 70929
[details]
mostly there View in context:
https://bugs.webkit.org/attachment.cgi?id=70929&action=review
>>> WebKit/chromium/src/WebPageSerializerImpl.cpp:153 >>> + virtual void appendElement(Vector<UChar>& out, Element* element, Namespaces* namespaces) >>> + { >>> + if (element->hasTagName(metaTag) && isCharsetDeclaration(element)) >>> + return >> >> Now you can move this to serializeNodes! > > Investigating now that I've rebased to top-of-tree where serializeNodes exists.
So, that would work well for the <meta> case, but it doesn't work for the <base> case. In the <base> case, we actually and to re-write the serialization of the tag instead of omitting it. I'm not sure its worth adding that complexity to the machine just for the <meta> case.
Ryosuke Niwa
Comment 8
2010-10-18 13:28:49 PDT
(In reply to
comment #7
)
> (From update of
attachment 70929
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=70929&action=review
> > >>> WebKit/chromium/src/WebPageSerializerImpl.cpp:153 > >>> + virtual void appendElement(Vector<UChar>& out, Element* element, Namespaces* namespaces) > >>> + { > >>> + if (element->hasTagName(metaTag) && isCharsetDeclaration(element)) > >>> + return > >> > >> Now you can move this to serializeNodes! > > > > Investigating now that I've rebased to top-of-tree where serializeNodes exists. > > So, that would work well for the <meta> case, but it doesn't work for the <base> case. In the <base> case, we actually and to re-write the serialization of the tag instead of omitting it. I'm not sure its worth adding that complexity to the machine just for the <meta> case.
I see. I guess it's to okay to add it to appendElement then. But we might want to comment that this is fine only because meta & base do not have end tags.
Adam Barth
Comment 9
2010-10-18 14:00:50 PDT
Created
attachment 71078
[details]
what-what the compile
Adam Barth
Comment 10
2010-10-18 15:05:34 PDT
Created
attachment 71090
[details]
patch (seems to work)
Ryosuke Niwa
Comment 11
2010-10-18 15:16:32 PDT
Comment on
attachment 71090
[details]
patch (seems to work) View in context:
https://bugs.webkit.org/attachment.cgi?id=71090&action=review
Great to see a lot of code being removed!
> WebCore/editing/MarkupAccumulator.cpp:367 > + appendQuotedURLAttributeValue(out, shouldResolveURLs() ? resolveURL(element, attribute) : attribute.value().string());
Nice change.
> WebKit/chromium/src/WebPageSerializerImpl.cpp:117 > + bool contains(const String& url)
Can you make contains const?
> WebKit/chromium/src/WebPageSerializerImpl.cpp:161 > + if (element->hasTagName(metaTag) && isCharsetDeclaration(static_cast<HTMLMetaElement*>(element))) > + return; // Notice that this works because <meta> is self-closing!
Why did you remove assertion here?
> WebKit/chromium/src/WebPageSerializerImpl.cpp:207 > + String markOfTheWeb = WebPageSerializer::generateMarkOfTheWebDeclaration(url);
You can probably move generateMarkOfTheWebDeclaration into this class in the future. In fact, everything in WebPageSerializer.cpp seems to belong to this file. I don't even understand why we have interface + impl for this class given there's only one implementation of the interface.
> WebKit/chromium/src/WebPageSerializerImpl.cpp:305 > + KURL mainURL = m_webFrame->frame()->document()->url();
I don't like the name mainURL but I guess we should change that later.
Adam Barth
Comment 12
2010-10-18 15:27:13 PDT
Comment on
attachment 71090
[details]
patch (seems to work) View in context:
https://bugs.webkit.org/attachment.cgi?id=71090&action=review
>> WebKit/chromium/src/WebPageSerializerImpl.cpp:117 >> + bool contains(const String& url) > > Can you make contains const?
Sure.
>> WebKit/chromium/src/WebPageSerializerImpl.cpp:161 >> + return; // Notice that this works because <meta> is self-closing! > > Why did you remove assertion here?
Because it failed in testing. I think I was calling the wrong function to tell if <meta> gets a close tag, or maybe I was calling it in the wrong way?
>> WebKit/chromium/src/WebPageSerializerImpl.cpp:207 >> + String markOfTheWeb = WebPageSerializer::generateMarkOfTheWebDeclaration(url); > > You can probably move generateMarkOfTheWebDeclaration into this class in the future. In fact, everything in WebPageSerializer.cpp seems to belong to this file. I don't even understand why we have interface + impl for this class given there's only one implementation of the interface.
The interface is part of the Chromium WebKit API. I'm not sure why we have an cpp file for it though.
>> WebKit/chromium/src/WebPageSerializerImpl.cpp:305 >> + KURL mainURL = m_webFrame->frame()->document()->url(); > > I don't like the name mainURL but I guess we should change that later.
I'm happy to change it now if you have a suggestion for a better name.
Ryosuke Niwa
Comment 13
2010-10-18 15:33:13 PDT
(In reply to
comment #12
)
> >> WebKit/chromium/src/WebPageSerializerImpl.cpp:161 > >> + return; // Notice that this works because <meta> is self-closing! > > > > Why did you remove assertion here? > > Because it failed in testing. I think I was calling the wrong function to tell if <meta> gets a close tag, or maybe I was calling it in the wrong way?
Try ASSERT(!node->hasChildNodes() && elementCannotHaveEndTag(node)); shouldSelfClose checks a bunch of extra conditions. Still, it's weird that we the assertion failed. We might have some bug there...
> The interface is part of the Chromium WebKit API. I'm not sure why we have an cpp file for it though.
I see. I think we can still get rid of WebPageSerializer.cpp later.
> >> WebKit/chromium/src/WebPageSerializerImpl.cpp:305 > >> + KURL mainURL = m_webFrame->frame()->document()->url(); > > > > I don't like the name mainURL but I guess we should change that later. > > I'm happy to change it now if you have a suggestion for a better name.
urlOfMainFrame or urlOfMainDocument?
Adam Barth
Comment 14
2010-10-18 15:42:50 PDT
> Try ASSERT(!node->hasChildNodes() && elementCannotHaveEndTag(node)); shouldSelfClose checks a bunch of extra conditions. Still, it's weird that we the assertion failed. We might have some bug there...
That assert will fail if someone inserts an element as a child of <meta> using the DOM. Will we serialize an end tag in that case?
> > >> WebKit/chromium/src/WebPageSerializerImpl.cpp:305 > > >> + KURL mainURL = m_webFrame->frame()->document()->url(); > > > > > > I don't like the name mainURL but I guess we should change that later. > > > > I'm happy to change it now if you have a suggestion for a better name. > > urlOfMainFrame or urlOfMainDocument?
Ok. I'll try something like that.
Ryosuke Niwa
Comment 15
2010-10-18 15:43:45 PDT
(In reply to
comment #14
)
> > urlOfMainFrame or urlOfMainDocument? > > Ok. I'll try something like that.
How about urlOfRootDocument?
Eric Seidel (no email)
Comment 16
2010-10-28 13:45:54 PDT
Comment on
attachment 71090
[details]
patch (seems to work) LGTM.
Adam Barth
Comment 17
2010-11-07 16:28:01 PST
Comment on
attachment 71090
[details]
patch (seems to work) I'm going to land this by hand.
Eric Seidel (no email)
Comment 18
2010-12-20 22:42:41 PST
@abarth: ping?
Adam Barth
Comment 19
2010-12-20 23:40:02 PST
Comment on
attachment 71090
[details]
patch (seems to work) This patch turned out to be subtly wrong. I've lost context about how, sadly.
Eric Seidel (no email)
Comment 20
2012-02-08 16:08:47 PST
Sad to see this old patch die.
Ryosuke Niwa
Comment 21
2012-02-08 16:10:00 PST
(In reply to
comment #20
)
> Sad to see this old patch die.
I think abarth fixed this bug elsewhere. I see MarkupAccumulator used in WebPageSerializerImpl now.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug