Bug 43085 - libxml2 parser has a large performance overhead
Summary: libxml2 parser has a large performance overhead
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: XML
Version: 528+ (Nightly build)
Hardware: All All
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on: 45735 52036 41427 45488 45594 45990 50516 50517
Blocks:
Reported: 2010-07-27 15:07 PDT by Patrick R. Gansterer
Modified: 2011-06-29 06:16 PDT
Description Patrick R. Gansterer 2010-07-27 15:07:45 PDT
The current implementation of the XMLParser leaves much room for performance improvements.

An expat-based XMLParser (see bug 41427) showed up to 25% less parsing time:
            libxml2        expat     percent
 5MB SVG:  0.7183sec     0.5356sec    -25%
10MB SVG:  1.6084sec     1.2298sec    -24%
20MB SVG:  5.4084sec     4.6952sec    -13%
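
For reference, the percent column can be recomputed from the two timing columns. This is only a minimal sketch: the helper name `percent_change` is made up for illustration, and it assumes the column is the relative change of the expat time against the libxml2 time, rounded to a whole percent.

```python
# Timings in seconds, copied from the table above: (libxml2, expat).
timings = {
    "5MB SVG": (0.7183, 0.5356),
    "10MB SVG": (1.6084, 1.2298),
    "20MB SVG": (5.4084, 4.6952),
}


def percent_change(baseline: float, candidate: float) -> int:
    """Relative change of candidate versus baseline, rounded to a whole percent."""
    return round((candidate - baseline) / baseline * 100)


for name, (libxml2_sec, expat_sec) in timings.items():
    print(f"{name}: {percent_change(libxml2_sec, expat_sec)}%")
```

Under that assumption the three rows come out as -25%, -24% and -13%, matching the table.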
Comment 1 Eric Seidel (no email) 2010-07-31 09:56:27 PDT
Long ago we used Expat. I don't remember why we switched to libxml2.
Comment 2 Patrick R. Gansterer 2010-07-31 09:58:07 PDT
(In reply to comment #1)
> Long ago we used Expat. I don't remember why we switched to libxml2.
Because expat has no XSLT support?
Comment 3 Eric Seidel (no email) 2010-07-31 10:01:58 PDT
If you're interested in this question, I suggest reading the svn logs in the xml directory in webcore. Trac.webkit.org.
Comment 4 Patrick R. Gansterer 2010-07-31 10:14:41 PDT
(In reply to comment #3)
> If you're interested in this question, I suggest reading the svn logs in the xml directory in webcore. Trac.webkit.org.
Wow, that's really old code. ;-)
I don't think that expat will be better than libxml. IMHO only a "native" WebKit parser can avoid the time-consuming memcpy/strcpy that any 3rdparty parser has. My expat implementation avoids the UTF16->UTF8->UTF16 conversion of the libxml implementation, but there are unnecessary memcpy calls in the expat code anyway.
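
To illustrate the UTF16->UTF8->UTF16 round trip mentioned above, here is a rough sketch. It is not WebKit or libxml2 code; `round_trip` is a hypothetical helper that only shows that each conversion step produces a full new copy of the document before any parsing work is done.

```python
from typing import Tuple


def round_trip(utf16_input: bytes) -> Tuple[bytes, int]:
    """Convert UTF-16 -> UTF-8 -> UTF-16 and count the bytes written on the way."""
    # First copy: the UTF-16 source is re-encoded as UTF-8 for the parser.
    utf8_buffer = utf16_input.decode("utf-16-le").encode("utf-8")
    # Second copy: the parser's UTF-8 output is converted back to UTF-16.
    utf16_output = utf8_buffer.decode("utf-8").encode("utf-16-le")
    return utf16_output, len(utf8_buffer) + len(utf16_output)


source = "<svg><title>Grüße</title></svg>".encode("utf-16-le")
result, copied_bytes = round_trip(source)
print(result == source, copied_bytes, len(source))
```

The result is identical to the input, so the extra bytes written are pure overhead; a parser that works on UTF-16 directly would skip both copies.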
Comment 5 Konstantin Tokarev 2011-06-29 05:42:56 PDT
>IMHO only a "native" WebKit parser can avoid the time-consuming memcpy/strcpy that any 3rdparty parser has.

Rapidxml does not have any memcpy/strcpy calls
Comment 6 Patrick R. Gansterer 2011-06-29 06:16:27 PDT
(In reply to comment #5)
> >IMHO only a "native" WebKit parser can avoid the time-consuming memcpy/strcpy that any 3rdparty parser has.
> 
> Rapidxml does not have any memcpy/strcpy calls

Rapidxml (like expat) is missing many features, e.g. namespace support. So it's not a real alternative to libxml2.