Bug 52036 - Feed libxml2 with raw data, relying on it to do character set decoding
Summary: Feed libxml2 with raw data, relying on it to do character set decoding
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: XML
Version: 528+ (Nightly build)
Hardware: All All
Importance: P2 Normal
Assignee: Patrick R. Gansterer
URL:
Keywords:
Depends on: 52547 52085 53398
Blocks: 43085
Reported: 2011-01-06 17:20 PST by Patrick R. Gansterer
Modified: 2011-01-30 09:13 PST

See Also:


Attachments
Work in progress (6.95 KB, patch)
2011-01-06 17:20 PST, Patrick R. Gansterer
Work in progress (6.50 KB, patch)
2011-01-16 17:01 PST, Patrick R. Gansterer

Description Patrick R. Gansterer 2011-01-06 17:20:20 PST
Created attachment 78193
Work in progress

I created a patch from the work I've done so far. Maybe you can give me some early feedback.
At the moment only about 5 tests fail, because of missing encoding support (in its current state the patch does not teach libxml2 about all known TextEncodings).

XMLDocumentParser is a subclass of ScriptableDocumentParser which is a subclass of DecodedDataDocumentParser.
Normally DecodedDataDocumentParser handles the appendBytes method, but I've implemented it in XMLDocumentParser as well to get at the raw data.
IMHO this is a kind of "layer violation". Can you give me a tip on how to implement this correctly? Do I need to change the inheritance of all "parser classes"?
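
To make the question concrete, here is a minimal, self-contained sketch of the shape described above. It is not the code from the attached patch: the class names DecodedDataParserSketch and XMLParserSketch, the members rawBytes() and sawDecodedText(), and the placeholder decode() are all hypothetical stand-ins. It only shows a base class that decodes in appendBytes, and a subclass that overrides appendBytes so the bytes reach the XML library undecoded.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for DecodedDataDocumentParser: it decodes incoming
// bytes itself and hands only decoded text to its subclasses.
class DecodedDataParserSketch {
public:
    virtual ~DecodedDataParserSketch() = default;

    // Default path: decode the bytes, then forward the decoded text.
    virtual void appendBytes(const char* data, std::size_t length)
    {
        assert(data || !length);
        append(decode(data, length));
    }

protected:
    virtual void append(const std::string& decodedText) = 0;

private:
    // Placeholder decoder (assumption): a real one would convert from the
    // document's character set instead of copying the bytes.
    static std::string decode(const char* data, std::size_t length)
    {
        return length ? std::string(data, length) : std::string();
    }
};

// Hypothetical stand-in for XMLDocumentParser: it overrides appendBytes so
// that the raw bytes bypass the decoder entirely.
class XMLParserSketch : public DecodedDataParserSketch {
public:
    void appendBytes(const char* data, std::size_t length) override
    {
        assert(data || !length);
        // A real parser would hand this chunk to the XML library here (with
        // libxml2, for instance, through its push interface) and let the
        // library detect the encoding. The sketch only records the bytes.
        if (length)
            m_rawBytes.append(data, length);
    }

    const std::string& rawBytes() const { return m_rawBytes; }
    bool sawDecodedText() const { return m_sawDecodedText; }

protected:
    // Never reached through appendBytes in this sketch; kept only because
    // the base class requires it.
    void append(const std::string&) override { m_sawDecodedText = true; }

private:
    std::string m_rawBytes;
    bool m_sawDecodedText = false;
};
```

The override works, but the subclass now inherits a decoding layer it never uses, which is the "layer violation" in question.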
Comment 1 Patrick R. Gansterer 2011-01-16 17:01:15 PST
Created attachment 79118
Work in progress

I did some small performance tests (see bug 52547) with this new patch:

                    avg      median  stdev  min   max
HTML                6517.25  6770.5  505    5242  7286
XML (original)      6254.5   6366    573    5462  7118
XML (with patch)    5735.45  5385.5  704    4159  6853
Patch vs. original  -8.3%    -15.4%
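
The percentages in the last row are consistent with the relative change of the "XML (with patch)" row against the "XML (original)" row, rounded to one decimal place. A small sketch of that arithmetic follows. The helper name percentChange is hypothetical, and reading the last row this way is an assumption based on the numbers alone.

```cpp
#include <cassert>
#include <cmath>

// Relative change of `patched` against `baseline`, in percent,
// rounded to one decimal place. The name is hypothetical.
double percentChange(double baseline, double patched)
{
    assert(baseline != 0.0);
    const double percent = (patched - baseline) / baseline * 100.0;
    return std::round(percent * 10.0) / 10.0;
}
```

With the values from the table, percentChange(6254.5, 5735.45) gives -8.3 for the average and percentChange(6366, 5385.5) gives -15.4 for the median.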