Bug 31144 - [Qt] The XML tokenizer reports a parse error twice if it occurs before the document element is found.
Summary: [Qt] The XML tokenizer reports a parse error twice if it occurs before the do...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: XML (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC All
: P3 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-11-04 15:14 PST by Jakub Wieczorek
Modified: 2009-11-08 23:35 PST (History)
1 user (show)

See Also:


Attachments
test case (13 bytes, text/xml)
2009-11-04 15:14 PST, Jakub Wieczorek
no flags Details
proposed patch (2.85 KB, patch)
2009-11-04 15:25 PST, Jakub Wieczorek
zecke: review-
Details | Formatted Diff | Diff
proposed patch (2.69 KB, patch)
2009-11-05 15:29 PST, Jakub Wieczorek
no flags Details | Formatted Diff | Diff
proposed patch (5.45 KB, patch)
2009-11-06 08:45 PST, Jakub Wieczorek
zecke: review+
eric: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jakub Wieczorek 2009-11-04 15:14:22 PST
Created attachment 42528 [details]
test case

Attached is a test case that demonstrates the problem.

The root of the problem seems to be r27045: http://trac.webkit.org/changeset/27045.

I'm wondering why the logic in question was necessary? Is it expected that XMLTokenizer should report a failure when parsing an empty document? Or was the patch addressing some other issue?

Nevertheless, it doesn't work correctly when the tokenizer hits a parse error before the root element is processed. Such an error would be handled in ::parse() and that logic in ::doEnd() would catch the same error and report it for the second time.
Comment 1 Jakub Wieczorek 2009-11-04 15:25:54 PST
Created attachment 42530 [details]
proposed patch

Along with the fix, I added a manual test.

I'm not sure how to write a layout test that would not depend on the locale? (the error messages provided by QXmlStream are localized)
Comment 2 Holger Freyther 2009-11-04 16:35:35 PST
(In reply to comment #1)
> Created an attachment (id=42530) [details]
> proposed patch
> 
> Along with the fix, I added a manual test.
> 
> I'm not sure how to write a layout test that would not depend on the locale?
> (the error messages provided by QXmlStream are localized)

Did you try to write a test?
Comment 3 Holger Freyther 2009-11-04 16:48:10 PST
Comment on attachment 42530 [details]
proposed patch


> -    if (m_stream.error() == QXmlStreamReader::PrematureEndOfDocumentError
> -        || (m_wroteText && !m_sawFirstElement && !m_sawXSLTransform))
> +    if (m_stream.error() == QXmlStreamReader::PrematureEndOfDocumentError)

I'm not convinced. Why do you remove m_sawXSLTransform? What is with the originally mentioned XML Http Request test, is it still passing, why it passing without the change?

So the question is the following:
  "libxml2 has the semantic that when writing an empty string and finishing it, it will report an error. For QXmlStreamReader this is valid."

Is that still true, was that true, was the semantic of QXmlStreamReader changed?



> diff --git a/WebCore/manual-tests/qt/parse-error-reported-twice.xml b/WebCore/manual-tests/qt/parse-error-reported-twice.xml
> new file mode 100644
> index 0000000..0b97956
> --- /dev/null
> +++ b/WebCore/manual-tests/qt/parse-error-reported-twice.xml
> @@ -0,0 +1,5 @@
> +<?xml version="1.0" encoding="utf-8"?>
> +<!-- Trigger a parse error in the doctype. -->
> +<!DOCTYPE doc {}>
> +<doc></doc>
> +<!-- The tokenizer should report a parse failure _once_. -->
> -- 
> 1.6.4.2
> 

Make that a proper layout test, don't worry about the locale it is the same for everyone and we can change it via the LayoutTestController
Comment 4 Holger Freyther 2009-11-04 17:00:27 PST
(In reply to comment #0)
> Created an attachment (id=42528) [details]
> test case
> 
> Attached is a test case that demonstrates the problem.
> 
> The root of the problem seems to be r27045:
> http://trac.webkit.org/changeset/27045.
> 
> I'm wondering why the logic in question was necessary? Is it expected that
> XMLTokenizer should report a failure when parsing an empty document? Or was the
> patch addressing some other issue?

Yes, libxml2 is having this behavior and we need it too.
Comment 5 Holger Freyther 2009-11-04 17:04:38 PST
Looking at it a bit more you might want to use m_sawError in ::doEnd() to see if an error message has been sent during the parsing. And then it would be greatly appreciated if you could provide a layout test (your manual one with dumping as text would work) and finally see if we report an empty document as failure (or not).
Comment 6 Jakub Wieczorek 2009-11-05 15:29:33 PST
Created attachment 42606 [details]
proposed patch

(In reply to comment #5)
> Looking at it a bit more you might want to use m_sawError in ::doEnd() to see
> if an error message has been sent during the parsing. And then it would be
> greatly appreciated if you could provide a layout test (your manual one with
> dumping as text would work) and finally see if we report an empty document as
> failure (or not).

Thanks, indeed the previous patch was wrong.

Here is a revised one that makes use of m_sawError to check if an error has been encountered during the parsing to make sure the same failure would not be reported twice.

I glanced through the fast/parser tests and noticed that fast/parser/xml-declaration-missing-ending-mark.html demonstrates the same problem. Therefore, I believe no new test is needed for this patch? That test is currently on the Skipped list: it fails due to this bug but it also needs a Qt-specific result. I'll file another bug with the correct test result.

The expected behaviour as for empty documents is preserved with this patch.
Comment 7 Holger Freyther 2009-11-05 20:54:59 PST
Okay, merge the test result into this bug... or the fix into the other...
Comment 8 Holger Freyther 2009-11-05 21:00:45 PST
To be more precise. The reason I want to have one patch is to avoid the following:

If I land this patch first...
- Oh yeah this is fixing a bug from THAT test case but we are skipping it

If I land the result first:
- Oh yeah this is the result but we are failing this test...


What I want is:
- Oh we are curently failing THAT test, here is a fix and a Qt specific result due a different error message...
Comment 9 Jakub Wieczorek 2009-11-06 08:45:58 PST
Created attachment 42654 [details]
proposed patch

The patch along with the test result.
Comment 10 Holger Freyther 2009-11-06 09:09:38 PST
Comment on attachment 42654 [details]
proposed patch

The ChangeLog could be more clear too...I will do it when landing... thanks for the fix!
Comment 11 Eric Seidel (no email) 2009-11-08 10:36:46 PST
Comment on attachment 42654 [details]
proposed patch

Marking cq- since holger had further comments and wanted to land this manually.
Comment 12 Holger Freyther 2009-11-08 23:35:11 PST
Thanks landed it in r50640.