Bug 43085

Summary: libxml2 parser has a large performance overhead
Product: WebKit
Reporter: Patrick R. Gansterer <paroga>
Component: XML
Assignee: Nobody <webkit-unassigned>
Status: NEW
Severity: Normal
CC: annulen, ap, darin, eric, mrowe
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on: 45735, 52036, 41427, 45488, 45594, 45990, 50516, 50517    
Bug Blocks:    

Description Patrick R. Gansterer 2010-07-27 15:07:45 PDT
The current implementation of the XMLParser leaves much room for performance improvements.

An expat-based XMLParser (see bug 41427) showed up to 25% less parsing time:
            libxml2        expat     percent
 5MB SVG:  0.7183sec     0.5356sec    -25%
10MB SVG:  1.6084sec     1.2298sec    -24%
20MB SVG:  5.4084sec     4.6952sec    -13%
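
For reference, the percent column is the relative change in parse time against libxml2. The following is a minimal sketch of that arithmetic using the timings from the table; the names `TIMINGS` and `percent_change` are made up for illustration and are not part of any patch.

```python
# Timings copied from the table above, in seconds: (libxml2, expat).
TIMINGS = {
    "5MB SVG": (0.7183, 0.5356),
    "10MB SVG": (1.6084, 1.2298),
    "20MB SVG": (5.4084, 4.6952),
}


def percent_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate versus baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


for name, (libxml2_sec, expat_sec) in TIMINGS.items():
    # Rounded to whole percent, as in the table.
    print(f"{name:>8}: {round(percent_change(libxml2_sec, expat_sec))}%")
```

Note that the relative gain shrinks as the file grows, from about 25% at 5MB to about 13% at 20MB.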
Comment 1 Eric Seidel (no email) 2010-07-31 09:56:27 PDT
Long ago we used Expat. I don't remember why we switched to libxml2.
Comment 2 Patrick R. Gansterer 2010-07-31 09:58:07 PDT
(In reply to comment #1)
> Long ago we used Expat. I don't remember why we switched to libxml2.
Because expat has no XSLT support?
Comment 3 Eric Seidel (no email) 2010-07-31 10:01:58 PDT
If you're interested in this question, I suggest reading the svn logs in the xml directory in webcore. Trac.webkit.org.
Comment 4 Patrick R. Gansterer 2010-07-31 10:14:41 PDT
(In reply to comment #3)
> If you're interested in this question, I suggest reading the svn logs in the xml directory in webcore. Trac.webkit.org.
Wow, that's really old code. ;-)
I don't think that expat will be better than libxml. IMHO only a "native" WebKit parser can avoid the time-consuming memcpy/strcpy that any 3rdparty parser has. My expat implementation avoids the UTF16->UTF8->UTF16 conversion of the libxml implementation, but there are unnecessary memcpy calls in the expat code anyway.
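
To illustrate the round trip mentioned above, here is a rough sketch in Python rather than WebKit code. It assumes the document arrives as UTF-16, is re-encoded to UTF-8 for the parser, and the parser's output is re-encoded to UTF-16. The function `parse_via_utf8` is hypothetical, and the decode step only stands in for a real parser.

```python
from typing import Tuple


def parse_via_utf8(utf16_source: bytes) -> Tuple[bytes, int]:
    """Mimic a UTF-16 -> UTF-8 -> UTF-16 round trip around a UTF-8 parser.

    Returns the UTF-16 result and the number of bytes written by the two
    conversions, which is pure overhead compared to parsing UTF-16 directly.
    """
    text = utf16_source.decode("utf-16-le")

    # Conversion 1: the whole document is copied into a UTF-8 buffer.
    utf8_for_parser = text.encode("utf-8")

    # Stand-in for the parser: it hands its strings back as UTF-8.
    parsed_text = utf8_for_parser.decode("utf-8")

    # Conversion 2: the parser output is copied back into UTF-16.
    utf16_result = parsed_text.encode("utf-16-le")

    bytes_copied = len(utf8_for_parser) + len(utf16_result)
    return utf16_result, bytes_copied


source = '<svg xmlns="http://www.w3.org/2000/svg"/>'.encode("utf-16-le")
result, copied = parse_via_utf8(source)
print(f"input {len(source)} bytes, extra bytes written by conversions: {copied}")
```

The content is unchanged after the round trip; the conversions only add work that grows linearly with the document size.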
Comment 5 Konstantin Tokarev 2011-06-29 05:42:56 PDT
>IMHO only a "native" WebKit parser can avoid the time-consuming memcpy/strcpy that any 3rdparty parser has.

Rapidxml does not have any memcpy/strcpy calls
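
As a rough illustration of the zero-copy idea (RapidXML parses in situ and hands out pointers into the source text), here is a toy Python sketch that yields start-tag names as views into the original buffer instead of copying them. The function `start_tag_names` is hypothetical, is not RapidXML's API, and ignores comments, CDATA and attribute values containing `<`.

```python
_NAME_END = frozenset(b" \t\r\n/>")  # bytes that terminate a tag name
_SKIP = frozenset(b"/?!")            # end tags, processing instructions, declarations


def start_tag_names(buffer: bytes):
    """Yield memoryview slices of start-tag names; no name bytes are copied."""
    view = memoryview(buffer)
    pos = 0
    while True:
        lt = buffer.find(b"<", pos)
        if lt == -1 or lt + 1 >= len(buffer):
            return
        start = lt + 1
        if buffer[start] in _SKIP:
            pos = start
            continue
        end = start
        while end < len(buffer) and buffer[end] not in _NAME_END:
            end += 1
        # A slice of a memoryview references the original buffer.
        yield view[start:end]
        pos = end


doc = b'<svg width="1"><g><path d="M0 0"/></g></svg>'
for name in start_tag_names(doc):
    print(bytes(name).decode("ascii"))
```

Each yielded name is only an offset and a length into `doc`; a copy is made only when the caller asks for one with `bytes(name)`.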
Comment 6 Patrick R. Gansterer 2011-06-29 06:16:27 PDT
(In reply to comment #5)
> >IMHO only a "native" WebKit parser can avoid the time-consuming memcpy/strcpy that any 3rdparty parser has.
> 
> Rapidxml does not have any memcpy/strcpy calls

Rapidxml (like expat) is missing many features, e.g. namespace support. So it's not a real alternative to libxml2.