Created attachment 42528 [details] test case Attached is a test case that demonstrates the problem. The root of the problem seems to be r27045: http://trac.webkit.org/changeset/27045. I'm wondering why the logic in question was necessary? Is it expected that XMLTokenizer should report a failure when parsing an empty document? Or was the patch addressing some other issue? Nevertheless, it doesn't work correctly when the tokenizer hits a parse error before the root element is processed. Such an error would be handled in ::parse() and that logic in ::doEnd() would catch the same error and report it for the second time.
Created attachment 42530 [details] proposed patch Along with the fix, I added a manual test. I'm not sure how to write a layout test that would not depend on the locale? (the error messages provided by QXmlStream are localized)
(In reply to comment #1) > Created an attachment (id=42530) [details] > proposed patch > > Along with the fix, I added a manual test. > > I'm not sure how to write a layout test that would not depend on the locale? > (the error messages provided by QXmlStream are localized) Did you try to write a test?
Comment on attachment 42530 [details] proposed patch > - if (m_stream.error() == QXmlStreamReader::PrematureEndOfDocumentError > - || (m_wroteText && !m_sawFirstElement && !m_sawXSLTransform)) > + if (m_stream.error() == QXmlStreamReader::PrematureEndOfDocumentError) I'm not convinced. Why do you remove m_sawXSLTransform? What is with the originally mentioned XML Http Request test, is it still passing, why it passing without the change? So the question is the following: "libxml2 has the semantic that when writing an empty string and finishing it, it will report an error. For QXmlStreamReader this is valid." Is that still true, was that true, was the semantic of QXmlStreamReader changed? > diff --git a/WebCore/manual-tests/qt/parse-error-reported-twice.xml b/WebCore/manual-tests/qt/parse-error-reported-twice.xml > new file mode 100644 > index 0000000..0b97956 > --- /dev/null > +++ b/WebCore/manual-tests/qt/parse-error-reported-twice.xml > @@ -0,0 +1,5 @@ > +<?xml version="1.0" encoding="utf-8"?> > +<!-- Trigger a parse error in the doctype. --> > +<!DOCTYPE doc {}> > +<doc></doc> > +<!-- The tokenizer should report a parse failure _once_. --> > -- > 1.6.4.2 > Make that a proper layout test, don't worry about the locale it is the same for everyone and we can change it via the LayoutTestController
(In reply to comment #0) > Created an attachment (id=42528) [details] > test case > > Attached is a test case that demonstrates the problem. > > The root of the problem seems to be r27045: > http://trac.webkit.org/changeset/27045. > > I'm wondering why the logic in question was necessary? Is it expected that > XMLTokenizer should report a failure when parsing an empty document? Or was the > patch addressing some other issue? Yes, libxml2 is having this behavior and we need it too.
Looking at it a bit more you might want to use m_sawError in ::doEnd() to see if an error message has been sent during the parsing. And then it would be greatly appreciated if you could provide a layout test (your manual one with dumping as text would work) and finally see if we report an empty document as failure (or not).
Created attachment 42606 [details] proposed patch (In reply to comment #5) > Looking at it a bit more you might want to use m_sawError in ::doEnd() to see > if an error message has been sent during the parsing. And then it would be > greatly appreciated if you could provide a layout test (your manual one with > dumping as text would work) and finally see if we report an empty document as > failure (or not). Thanks, indeed the previous patch was wrong. Here is a revised one that makes use of m_sawError to check if an error has been encountered during the parsing to make sure the same failure would not be reported twice. I glanced through the fast/parser tests and noticed that fast/parser/xml-declaration-missing-ending-mark.html demonstrates the same problem. Therefore, I believe no new test is needed for this patch? That test is currently on the Skipped list: it fails due to this bug but it also needs a Qt-specific result. I'll file another bug with the correct test result. The expected behaviour as for empty documents is preserved with this patch.
Okay, merge the test result into this bug... or the fix into the other...
To be more precise. The reason I want to have one patch is to avoid the following: If I land this patch first... - Oh yeah this is fixing a bug from THAT test case but we are skipping it If I land the result first: - Oh yeah this is the result but we are failing this test... What I want is: - Oh we are curently failing THAT test, here is a fix and a Qt specific result due a different error message...
Created attachment 42654 [details] proposed patch The patch along with the test result.
Comment on attachment 42654 [details] proposed patch The ChangeLog could be more clear too...I will do it when landing... thanks for the fix!
Comment on attachment 42654 [details] proposed patch Marking cq- since holger had further comments and wanted to land this manually.
Thanks landed it in r50640.