Bug 41427 - Add expat based XMLDocumentParser
Summary: Add expat based XMLDocumentParser
Status: RESOLVED WONTFIX
Alias: None
Product: WebKit
Classification: Unclassified
Component: XML
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks: 43085
Reported: 2010-06-30 14:03 PDT by Patrick R. Gansterer
Modified: 2010-07-27 15:13 PDT

Attachments
Patch (32.55 KB, patch)
2010-06-30 14:04 PDT, Patrick R. Gansterer
abarth: review-

Description Patrick R. Gansterer 2010-06-30 14:03:29 PDT
see patch
Comment 1 Patrick R. Gansterer 2010-06-30 14:04:13 PDT
Created attachment 60147 [details]
Patch
Comment 2 Darin Adler 2010-06-30 14:06:50 PDT
Comment on attachment 60147 [details]
Patch

Why?

We really don’t want a third one of these!
Comment 3 Patrick R. Gansterer 2010-07-01 00:00:15 PDT
(In reply to comment #2)
> (From update of attachment 60147 [details])
> Why?
The main reason was that I had problems getting libxml working.

The libxml-based parser contains many toString() calls, each of which does UTF-8 decoding. This isn't necessary with expat, because it can return UTF-16 strings directly.
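
To illustrate the conversion step (a rough sketch in Python, not WebKit code): libxml2 passes strings to its callbacks as UTF-8 bytes, so each one has to be decoded before it can be stored as UTF-16, while a parser that emits UTF-16 directly skips this step. The function name utf8_to_utf16_units and the sample string are made up for this example.

```python
def utf8_to_utf16_units(data):
    """Decode UTF-8 bytes and return the equivalent UTF-16 code units."""
    text = data.decode("utf-8")        # the per-string decoding step
    raw = text.encode("utf-16-le")     # two bytes per UTF-16 code unit
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


# Hypothetical attribute value as a UTF-8 based parser would deliver it.
name = "t\u00edtulo".encode("utf-8")
units = utf8_to_utf16_units(name)
print(len(name), "UTF-8 bytes ->", len(units), "UTF-16 code units")
```

Every element name, attribute and text node goes through a conversion like this when the parser output is UTF-8 and the string type is UTF-16.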

I didn't have time to run performance tests, but expat is usually a little faster than libxml.

> We really don’t want a third one of these!
Then you won't accept an MSXML implementation either? (It would save memory by using the Windows system library.)
What do you think about a "native WebKit XMLParser" with no third-party dependency?
Comment 4 Alexey Proskuryakov 2010-07-01 12:18:21 PDT
A native XML engine would be good in that it would allow for a much more efficient and compliant XSLT implementation, as well as sharing code with XPath. But it would take a lot of resources to write and maintain.
Comment 5 Patrick R. Gansterer 2010-07-22 08:59:40 PDT
Some performance data*:

            libxml2      expat       change
 5MB SVG:   0.7183 s     0.5356 s    -25%
10MB SVG:   1.6084 s     1.2298 s    -24%
20MB SVG:   5.4084 s     4.6952 s    -13%

* Time from the beginning of the first XMLDocumentParser::doWrite until the end of XMLDocumentParser::doEnd; average of 3 measurements.
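
For reference, the percentages in the table can be reproduced from the two timing columns. The snippet below is a minimal sketch in Python; the timings are copied from the table, while the helper names relative_change and average_seconds are hypothetical and only mirror the "average of 3 measurements" approach, not the actual instrumentation used in WebKit.

```python
import time

# Timings copied from the table above: (libxml2 seconds, expat seconds).
timings = {
    "5MB SVG": (0.7183, 0.5356),
    "10MB SVG": (1.6084, 1.2298),
    "20MB SVG": (5.4084, 4.6952),
}


def relative_change(baseline, candidate):
    """Percentage change of candidate relative to baseline (negative = faster)."""
    return (candidate - baseline) / baseline * 100.0


def average_seconds(run, repeats=3):
    """Hypothetical helper: mean wall-clock time of `run` over `repeats` calls."""
    total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        total += time.perf_counter() - start
    return total / repeats


changes = {name: round(relative_change(a, b)) for name, (a, b) in timings.items()}
for name, pct in changes.items():
    print(f"{name}: {pct}%")
```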
Comment 6 Alexey Proskuryakov 2010-07-22 10:14:32 PDT
A port that does not use libxml2 has no practical way to implement XSLT, is that correct?
Comment 7 Patrick R. Gansterer 2010-07-22 10:19:26 PDT
(In reply to comment #6)
> A port that does not use libxml2 has no practical way to implement XSLT, is that correct?
Yes, that is correct; libxslt needs libxml2. I don't think expat is a real option for "fat clients", but it could be an alternative without XSLT support (on small devices).
The table in comment #5 shows that there is much room for improvement in the XML parser.
Comment 8 Adam Barth 2010-07-27 07:34:04 PDT
Comment on attachment 60147 [details]
Patch

I really don't like the idea of adding a third (!) XML parser.  It's bad enough that we have two.  If we can improve perf by switching our XML parser, that's great, but we should kill the old one.  IMHO, adding the second XML parser was a mistake too.
Comment 9 Patrick R. Gansterer 2010-07-27 15:12:25 PDT
(In reply to comment #8)
> (From update of attachment 60147 [details])
> I really don't like the idea of adding a third (!) XML parser.  It's bad enough that we have two.  If we can improve perf by switching our XML parser, that's great, but we should kill the old one.  IMHO, adding the second XML parser was a mistake too.
I agree with you 100%. My expat version was mainly a quick hack to show the possible speed increase of the XMLParser. It can't be a full replacement because of the missing XSLT support :-(.
I've created a new bug for this issue at bug 43085.
I think there is no way around implementing a full native WebKit XMLParser?