Bug 43085
| Summary: | libxml2 parser has a large performance overhead | ||
|---|---|---|---|
| Product: | WebKit | Reporter: | Patrick R. Gansterer <paroga> | 
| Component: | XML | Assignee: | Nobody <webkit-unassigned> | 
| Status: | NEW | ||
| Severity: | Normal | CC: | annulen, ap, darin, eric, mrowe | 
| Priority: | P2 | ||
| Version: | 528+ (Nightly build) | ||
| Hardware: | All | ||
| OS: | All | ||
| Bug Depends on: | 45735, 52036, 41427, 45488, 45594, 45990, 50516, 50517 | ||
| Bug Blocks: | |||
Patrick R. Gansterer
The current XMLParser implementation leaves much room for performance improvement.
An expat-based XMLParser (see bug 41427) showed up to 25% less parsing time:
            libxml2        expat     percent
 5MB SVG:  0.7183sec     0.5356sec    -25%
10MB SVG:  1.6084sec     1.2298sec    -24%
20MB SVG:  5.4084sec     4.6952sec    -13%
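
For reference, the percent column can be reproduced from the two timing columns. This is only a small arithmetic sketch in Python; the timings are copied from the table above, and the names `timings` and `percent_change` are made up for illustration.

```python
# Timings copied from the table above, in seconds: (libxml2, expat).
timings = {
    "5MB SVG": (0.7183, 0.5356),
    "10MB SVG": (1.6084, 1.2298),
    "20MB SVG": (5.4084, 4.6952),
}


def percent_change(libxml2_sec, expat_sec):
    """Relative change in parse time going from libxml2 to expat, in percent."""
    return (expat_sec - libxml2_sec) / libxml2_sec * 100.0


# Negative values mean the expat-based parser was faster.
changes = {name: percent_change(a, b) for name, (a, b) in timings.items()}

for name, change in changes.items():
    print(f"{name}: {change:.1f}%")
```

Rounded to whole percent, these match the -25%, -24% and -13% figures reported above.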
Eric Seidel (no email)
      Long ago we used Expat. I don't remember why we switched to libxml2.
    
Patrick R. Gansterer
      (In reply to comment #1)
> Long ago we used Expat. I don't remember why we switched to libxml2.
Because expat has no XSLT support?
    
Eric Seidel (no email)
If you're interested in this question, I suggest reading the svn logs for the xml directory in WebCore at trac.webkit.org.
    
Patrick R. Gansterer
      (In reply to comment #3)
> If you're interested in this question, I suggest reading the svn logs in the xml directory in webcore. Trac.webkit.org.
Wow, that's really old code. ;-)
I don't think that expat will be better than libxml. IMHO only a "native" WebKit parser can avoid the time-consuming memcpy/strcpy that any third-party parser has. My expat implementation avoids the UTF-16 -> UTF-8 -> UTF-16 conversion of the libxml implementation, but there are unnecessary memcpy calls in the expat code anyway.
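
To make the conversion round trip concrete, here is a minimal Python sketch. It is not WebKit, libxml2 or expat code. It assumes the engine holds text as UTF-16 and the third-party parser works on UTF-8, as described above. The function name `round_trip` and the sample markup are hypothetical.

```python
def round_trip(utf16_bytes: bytes) -> bytes:
    """Mimic the UTF-16 -> UTF-8 -> UTF-16 path; each step allocates a new copy."""
    text = utf16_bytes.decode("utf-16-le")
    # Copy 1: the input is re-encoded as UTF-8 before the parser sees it.
    utf8_bytes = text.encode("utf-8")
    # Stand-in for the parser handing back UTF-8 strings (names, attribute values, text).
    parsed_text = utf8_bytes.decode("utf-8")
    # Copy 2: the result is converted back to the engine's UTF-16 representation.
    return parsed_text.encode("utf-16-le")


source = "<svg><title>Grüße</title></svg>".encode("utf-16-le")
result = round_trip(source)
print(result == source)
```

The output is byte-for-byte identical to the input, so both conversions are pure overhead. A parser that consumes UTF-16 directly would skip them.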
    
Konstantin Tokarev
      >IMHO only a "native" WebKit parser can avoid the time-consuming memcpy/strcpy that any 3rdparty parser has.
Rapidxml does not have any memcpy/strcpy calls
    
Patrick R. Gansterer
      (In reply to comment #5)
> >IMHO only a "native" WebKit parser can avoid the time-consuming memcpy/strcpy that any 3rdparty parser has.
> 
> Rapidxml does not have any memcpy/strcpy calls
Rapidxml (like expat) is missing many features, e.g. namespace support, so it's not a real alternative to libxml2.