Bug 31144

Summary: [Qt] The XML tokenizer reports a parse error twice if it occurs before the document element is found.
Product: WebKit Reporter: Jakub Wieczorek <jwieczorek>
Component: XMLAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: zecke
Priority: P3    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: All   
Attachments:
Description Flags
test case
none
proposed patch
zecke: review-
proposed patch
none
proposed patch zecke: review+, eric: commit-queue-

Jakub Wieczorek
Reported 2009-11-04 15:14:22 PST
Created attachment 42528 [details] test case Attached is a test case that demonstrates the problem. The root of the problem seems to be r27045: http://trac.webkit.org/changeset/27045. I'm wondering why the logic in question was necessary? Is it expected that XMLTokenizer should report a failure when parsing an empty document? Or was the patch addressing some other issue? Nevertheless, it doesn't work correctly when the tokenizer hits a parse error before the root element is processed. Such an error would be handled in ::parse() and that logic in ::doEnd() would catch the same error and report it for the second time.
Attachments
test case (13 bytes, text/xml)
2009-11-04 15:14 PST, Jakub Wieczorek
no flags
proposed patch (2.85 KB, patch)
2009-11-04 15:25 PST, Jakub Wieczorek
zecke: review-
proposed patch (2.69 KB, patch)
2009-11-05 15:29 PST, Jakub Wieczorek
no flags
proposed patch (5.45 KB, patch)
2009-11-06 08:45 PST, Jakub Wieczorek
zecke: review+
eric: commit-queue-
Jakub Wieczorek
Comment 1 2009-11-04 15:25:54 PST
Created attachment 42530 [details] proposed patch Along with the fix, I added a manual test. I'm not sure how to write a layout test that would not depend on the locale? (the error messages provided by QXmlStream are localized)
Holger Freyther
Comment 2 2009-11-04 16:35:35 PST
(In reply to comment #1) > Created an attachment (id=42530) [details] > proposed patch > > Along with the fix, I added a manual test. > > I'm not sure how to write a layout test that would not depend on the locale? > (the error messages provided by QXmlStream are localized) Did you try to write a test?
Holger Freyther
Comment 3 2009-11-04 16:48:10 PST
Comment on attachment 42530 [details] proposed patch > - if (m_stream.error() == QXmlStreamReader::PrematureEndOfDocumentError > - || (m_wroteText && !m_sawFirstElement && !m_sawXSLTransform)) > + if (m_stream.error() == QXmlStreamReader::PrematureEndOfDocumentError) I'm not convinced. Why do you remove m_sawXSLTransform? What is with the originally mentioned XML Http Request test, is it still passing, why it passing without the change? So the question is the following: "libxml2 has the semantic that when writing an empty string and finishing it, it will report an error. For QXmlStreamReader this is valid." Is that still true, was that true, was the semantic of QXmlStreamReader changed? > diff --git a/WebCore/manual-tests/qt/parse-error-reported-twice.xml b/WebCore/manual-tests/qt/parse-error-reported-twice.xml > new file mode 100644 > index 0000000..0b97956 > --- /dev/null > +++ b/WebCore/manual-tests/qt/parse-error-reported-twice.xml > @@ -0,0 +1,5 @@ > +<?xml version="1.0" encoding="utf-8"?> > +<!-- Trigger a parse error in the doctype. --> > +<!DOCTYPE doc {}> > +<doc></doc> > +<!-- The tokenizer should report a parse failure _once_. --> > -- > 1.6.4.2 > Make that a proper layout test, don't worry about the locale it is the same for everyone and we can change it via the LayoutTestController
Holger Freyther
Comment 4 2009-11-04 17:00:27 PST
(In reply to comment #0) > Created an attachment (id=42528) [details] > test case > > Attached is a test case that demonstrates the problem. > > The root of the problem seems to be r27045: > http://trac.webkit.org/changeset/27045. > > I'm wondering why the logic in question was necessary? Is it expected that > XMLTokenizer should report a failure when parsing an empty document? Or was the > patch addressing some other issue? Yes, libxml2 is having this behavior and we need it too.
Holger Freyther
Comment 5 2009-11-04 17:04:38 PST
Looking at it a bit more you might want to use m_sawError in ::doEnd() to see if an error message has been sent during the parsing. And then it would be greatly appreciated if you could provide a layout test (your manual one with dumping as text would work) and finally see if we report an empty document as failure (or not).
Jakub Wieczorek
Comment 6 2009-11-05 15:29:33 PST
Created attachment 42606 [details] proposed patch (In reply to comment #5) > Looking at it a bit more you might want to use m_sawError in ::doEnd() to see > if an error message has been sent during the parsing. And then it would be > greatly appreciated if you could provide a layout test (your manual one with > dumping as text would work) and finally see if we report an empty document as > failure (or not). Thanks, indeed the previous patch was wrong. Here is a revised one that makes use of m_sawError to check if an error has been encountered during the parsing to make sure the same failure would not be reported twice. I glanced through the fast/parser tests and noticed that fast/parser/xml-declaration-missing-ending-mark.html demonstrates the same problem. Therefore, I believe no new test is needed for this patch? That test is currently on the Skipped list: it fails due to this bug but it also needs a Qt-specific result. I'll file another bug with the correct test result. The expected behaviour as for empty documents is preserved with this patch.
Holger Freyther
Comment 7 2009-11-05 20:54:59 PST
Okay, merge the test result into this bug... or the fix into the other...
Holger Freyther
Comment 8 2009-11-05 21:00:45 PST
To be more precise. The reason I want to have one patch is to avoid the following: If I land this patch first... - Oh yeah this is fixing a bug from THAT test case but we are skipping it If I land the result first: - Oh yeah this is the result but we are failing this test... What I want is: - Oh we are curently failing THAT test, here is a fix and a Qt specific result due a different error message...
Jakub Wieczorek
Comment 9 2009-11-06 08:45:58 PST
Created attachment 42654 [details] proposed patch The patch along with the test result.
Holger Freyther
Comment 10 2009-11-06 09:09:38 PST
Comment on attachment 42654 [details] proposed patch The ChangeLog could be more clear too...I will do it when landing... thanks for the fix!
Eric Seidel (no email)
Comment 11 2009-11-08 10:36:46 PST
Comment on attachment 42654 [details] proposed patch Marking cq- since holger had further comments and wanted to land this manually.
Holger Freyther
Comment 12 2009-11-08 23:35:11 PST
Thanks landed it in r50640.
Note You need to log in before you can comment on or make changes to this bug.