RESOLVED FIXED 52085
Add Document::setContent()
https://bugs.webkit.org/show_bug.cgi?id=52085
Summary Add Document::setContent()
Patrick R. Gansterer
Reported 2011-01-07 14:56:26 PST
Add Document::setContent()
Attachments
Patch (7.50 KB, patch)
2011-01-07 15:10 PST, Patrick R. Gansterer
no flags
Patch (7.18 KB, patch)
2011-01-08 17:42 PST, Patrick R. Gansterer
no flags
Patch (7.13 KB, patch)
2011-01-11 09:19 PST, Patrick R. Gansterer
no flags
Patch (7.09 KB, patch)
2011-01-11 10:37 PST, Patrick R. Gansterer
no flags
Patch (7.14 KB, patch)
2011-01-11 11:54 PST, Patrick R. Gansterer
no flags
Patch (7.14 KB, patch)
2011-01-11 12:19 PST, Patrick R. Gansterer
no flags
Patch (7.06 KB, patch)
2011-01-11 13:55 PST, Patrick R. Gansterer
no flags
Patrick R. Gansterer
Comment 1 2011-01-07 15:10:15 PST
Darin Adler
Comment 2 2011-01-08 17:01:57 PST
Comment on attachment 78280 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=78280&action=review > WebCore/dom/Document.cpp:1170 > +bool Document::setContent(const Vector<char>& content) Is it really better to have something that takes a vector rather than a const char*, size_t pair?
Patrick R. Gansterer
Comment 3 2011-01-08 17:42:51 PST
Created attachment 78336 [details] Patch (In reply to comment #2) > (From update of attachment 78280 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=78280&action=review > > > WebCore/dom/Document.cpp:1170 > > +bool Document::setContent(const Vector<char>& content) > > Is it really better to have something that takes a vector rather than a const char*, size_t pair? No, Vector as argument is a clear mistake. I think you meant const char*, int pair? (see http://trac.webkit.org/browser/trunk/Source/WebCore/dom/DocumentParser.h?rev=75314#L49)
Eric Seidel (no email)
Comment 4 2011-01-10 11:44:05 PST
Comment on attachment 78336 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=78336&action=review I think this needs one more round. I like where this is going though! > Source/WebCore/dom/Document.cpp:1179 > + return wellFormed(); Is wellFormed what we'd want for non-XML content? Is this an XML-only method? > Source/WebCore/dom/Document.cpp:1186 > + // FIXME: We should pass content directly to the parser insted of decoding it here. > + // At the moment only SVGFonts use this method, so the xml mime type is ok for now. > + setDecoder(TextResourceDecoder::create("application/xml")); Why should we pass it directly to the parser? It seems it would be better to put this decoding code in the SVGImage code for now, unless you're about to post a follow-up patch to not decode? > Source/WebCore/dom/Document.h:353 > + bool setContent(const char*, int); Don't you want size_t here?
Adam Barth
Comment 5 2011-01-10 15:51:20 PST
Comment on attachment 78336 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=78336&action=review >> Source/WebCore/dom/Document.cpp:1186 >> + // FIXME: We should pass content directly to the parser insted of decoding it here. >> + // At the moment only SVGFonts use this method, so the xml mime type is ok for now. >> + setDecoder(TextResourceDecoder::create("application/xml")); > > Why should we pass it directly to the parser? > > It seems it would be better to put this decoding code in the SVGImage code for now, unless you're about to post a follow-up patch to not decode? It's confusing that this is only for XML. I'd expect this method to work for both HTML and XML. Also, have you looked at how the output of JavaScript URLs gets put into a document? I think it might benefit from using this API. You can follow the trail from executeIfJavaScriptURL.
Adam Barth
Comment 6 2011-01-10 15:51:57 PST
I wish I could upvote this patch as being awesome.
Patrick R. Gansterer
Comment 7 2011-01-11 09:19:27 PST
Patrick R. Gansterer
Comment 8 2011-01-11 09:31:15 PST
Comment on attachment 78336 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=78336&action=review >> Source/WebCore/dom/Document.cpp:1179 >> + return wellFormed(); > > Is wellFormed what we'd want for non-XML content? Is this an XML-only method? IMHO wellFomed() is what we want. Do you know a better alternative? At the moment it's used for XML-only, but it feeds the parser, so it can be used for other content too. >>> Source/WebCore/dom/Document.cpp:1186 >>> + setDecoder(TextResourceDecoder::create("application/xml")); >> >> Why should we pass it directly to the parser? >> >> It seems it would be better to put this decoding code in the SVGImage code for now, unless you're about to post a follow-up patch to not decode? > > It's confusing that this is only for XML. I'd expect this method to work for both HTML and XML. Also, have you looked at how the output of JavaScript URLs gets put into a document? I think it might benefit from using this API. You can follow the trail from executeIfJavaScriptURL. see 52036 (Feed libxml2 with raw data, relying on it to do character set decoding): This will avoid UTF-8->UTF-16->UTF-8 convertion. I only removed the code paths, which triggered the new assertion. I'll look for methos which might use this when I can set raw data on document. Step by step. :-) >> Source/WebCore/dom/Document.h:353 >> + bool setContent(const char*, int); > > Don't you want size_t here? see http://trac.webkit.org/browser/trunk/Source/WebCore/dom/DocumentParser.h?rev=75314#L49 (from https://bugs.webkit.org/show_bug.cgi?id=52085#c3 ;-))
Patrick R. Gansterer
Comment 9 2011-01-11 10:37:52 PST
Eric Seidel (no email)
Comment 10 2011-01-11 11:45:48 PST
Comment on attachment 78547 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=78547&action=review > Source/WebCore/dom/Document.cpp:1179 > + return wellFormed(); I feel like well-formedness has little to do with if the setContent succeeded or not, which is what the bool return from setContent feels like it should be. It seems the callers should check wellFormed separately. > Source/WebCore/loader/cache/CachedFont.cpp:144 > + > + if (!m_externalSVGDocument->setContent(decoder->decode(m_data->data(), m_data->size()) + decoder->flush())) > + m_externalSVGDocument = 0; This is wrong. You got rid of the decoder error detection. If we don't have a test for that, we should.
Eric Seidel (no email)
Comment 11 2011-01-11 11:46:06 PST
I think the wellFormed() check should be split out and setContent should just return void.
Patrick R. Gansterer
Comment 12 2011-01-11 11:54:19 PST
WebKit Review Bot
Comment 13 2011-01-11 12:14:29 PST
Patrick R. Gansterer
Comment 14 2011-01-11 12:19:25 PST
Eric Seidel (no email)
Comment 15 2011-01-11 12:19:48 PST
Comment on attachment 78569 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=78569&action=review In general this looks fine. Looks like it breaks cr-linux though? > Source/WebCore/loader/cache/CachedFont.cpp:-150 > - m_externalSVGDocument->finishParsing(); It's unclear to me how this change from document->finishParsing to m_parser->finish() is going to behave.
Build Bot
Comment 16 2011-01-11 12:23:28 PST
WebKit Review Bot
Comment 17 2011-01-11 12:25:04 PST
Patrick R. Gansterer
Comment 18 2011-01-11 12:25:11 PST
(In reply to comment #15) > (From update of attachment 78569 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=78569&action=review > > In general this looks fine. Looks like it breaks cr-linux though? missed the bool -> void change in patch :-/ > > Source/WebCore/loader/cache/CachedFont.cpp:-150 > > - m_externalSVGDocument->finishParsing(); > > It's unclear to me how this change from document->finishParsing to m_parser->finish() is going to behave. document->finishParsing does "if (m_parser) m_parser->finish();". I think it's ok, because m_parser can't be null ;-)
WebKit Review Bot
Comment 19 2011-01-11 13:05:01 PST
Early Warning System Bot
Comment 20 2011-01-11 13:06:49 PST
WebKit Review Bot
Comment 21 2011-01-11 13:52:02 PST
Patrick R. Gansterer
Comment 22 2011-01-11 13:55:03 PST
Eric Seidel (no email)
Comment 23 2011-01-11 13:58:32 PST
Comment on attachment 78593 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=78593&action=review LGTM. > Source/WebCore/loader/cache/CachedFont.cpp:143 > + m_externalSVGDocument->setContent(decoder->decode(m_data->data(), m_data->size()) + decoder->flush()); It's unfortunate that this is much less memory efficient now. We end up doing two (possibly large) allocations for the main data decoding instead of one.
Patrick R. Gansterer
Comment 24 2011-01-11 14:02:11 PST
Comment on attachment 78593 [details] Patch (In reply to comment #23) > (From update of attachment 78593 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=78593&action=review > > LGTM. Sorry for so much spam. > > Source/WebCore/loader/cache/CachedFont.cpp:143 > > + m_externalSVGDocument->setContent(decoder->decode(m_data->data(), m_data->size()) + decoder->flush()); > > It's unfortunate that this is much less memory efficient now. We end up doing two (possibly large) allocations for the main data decoding instead of one. IMHO flush() won't return anything in 99.9%. Usually decode returns all of the decoded data, so we pass the string directly to setContent. flush() will only return a non-empty string when we have "invalid" input enconding.
Build Bot
Comment 25 2011-01-11 14:06:04 PST
WebKit Commit Bot
Comment 26 2011-01-11 17:26:19 PST
Comment on attachment 78593 [details] Patch Clearing flags on attachment: 78593 Committed r75577: <http://trac.webkit.org/changeset/75577>
WebKit Commit Bot
Comment 27 2011-01-11 17:26:28 PST
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.