Bug 13233 - Need to implement an optimizing XPath evaluator
Summary: Need to implement an optimizing XPath evaluator
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: XML (show other bugs)
Version: 523.x (Safari 3)
Hardware: Mac OS X 10.4
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-03-30 13:57 PDT by Alexey Proskuryakov
Modified: 2010-06-10 17:30 PDT (History)
3 users (show)

See Also:


Attachments
Romeo&Juliet performance test (67.27 KB, application/zip)
2007-03-30 14:03 PDT, Alexey Proskuryakov
no flags Details
first step (39.40 KB, patch)
2009-05-31 06:25 PDT, Alexey Proskuryakov
no flags Details | Formatted Diff | Diff
Don't needlessly sort location path results (6.40 KB, patch)
2009-05-31 10:45 PDT, Alexey Proskuryakov
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alexey Proskuryakov 2007-03-30 13:57:09 PDT
There are many opportunities for XPath optimization that we are still missing.

Some work is still needed on micro-optimizing our implementation. For example, it's wasteful to implement a node-set as Vector<RefPtr>, as nodes won't go away during evaluation anyway (except for lazily created Attr nodes).

Many ideas for higher-level optimizations can be apparently borrowed from Mozilla, see <https://bugzilla.mozilla.org/show_bug.cgi?id=208172> and <http://www.fiveanddime.net/firefox-1.5-source/mozilla/extensions/transformiix/docs/optimized-xpath.html.html>.
Comment 1 Alexey Proskuryakov 2007-03-30 14:03:50 PDT
Created attachment 13893 [details]
Romeo&Juliet performance test

Some XPath performance tests:

http://ejohn.org/apps/classname/xpath.html
http://www.andrewdupont.net/test/xpath/
http://www.asciiarmor.com/2004/11/13/java-xpath-10-engine-comparison-performance/

A simple JavaScript translation of the last one is attached; it could use some more attention, though.
Comment 2 Sasha Firsov 2009-02-19 09:44:36 PST
At the moment the WebKit XPath evaluation is outsider among of FF,IE,Opera. 
The performance is so bad, that exclude WebKit from Web 2.0 heavy service-oriented apps. Test for 100K "/root/el" in <root><el>123</el>...</root> takes several minutes in comparison to few seconds in other browsers. Accounting the browser freeze app looks failed. 
Accounting WekKit market share on such kind of apps, ways around will not be made and putting WebKit in row of unsupported browsers.
Comment 3 Alexey Proskuryakov 2009-02-19 10:00:17 PST
Realistic tests would go a long way towards improving WebKit XPath support. Iterating a single dimensional array (If I understood you correctly, that's your example) is hardly a task for XPath. Nor would be a topic for this bug, as it's likely a target for implementation micro-optimizations, not for an optimizing evaluator.
Comment 4 Sasha Firsov 2009-02-19 10:12:51 PST
Evaluator optimizing is the subject for Iterating a single dimensional array.
And it is a task for XPath evaluator optimization. Most of apps which utilizing web services working with flat array-like data XML. Picture galleries, search engines are just simple samples. You could use their JSON incarnation, but XPath is way more usable as filters over same data.

XPath optimisation is not so critical for UI DOM navigation, where numbers are sufficient(still few times slower of other browsers).
Comment 5 Alexey Proskuryakov 2009-05-31 06:25:03 PDT
Created attachment 30812 [details]
first step

Implement some optimizations:
- avoid building full NodeSets to evaluate predicates, if possible;
- optimize "//" if there are no context sensitive predicates.
Comment 6 Alexey Proskuryakov 2009-05-31 10:45:38 PDT
Created attachment 30817 [details]
Don't needlessly sort location path results
Comment 7 Darin Adler 2009-05-31 17:05:18 PDT
Comment on attachment 30812 [details]
first step

> +            void addSubExpression(Expression* expr)
> +            {
> +                m_subExpressions.append(expr);
> +                m_isContextNodeSensitive = m_isContextNodeSensitive || expr->m_isContextNodeSensitive;
> +                m_isContextPositionSensitive = m_isContextPositionSensitive || expr->m_isContextPositionSensitive;
> +                m_isContextSizeSensitive = m_isContextSizeSensitive || expr->m_isContextSizeSensitive;
> +            }

How about using |= here?

r=me
Comment 8 Darin Adler 2009-05-31 17:07:46 PDT
Comment on attachment 30817 [details]
Don't needlessly sort location path results

> +            NodeSet(const NodeSet& other) : m_isSorted(other.m_isSorted), m_subtreesAreDisjoint(other.m_subtreesAreDisjoint), m_nodes(other.m_nodes) { }
> +            NodeSet& operator=(const NodeSet& other) { m_isSorted = other.m_isSorted; m_subtreesAreDisjoint = other.m_subtreesAreDisjoint; m_nodes = other.m_nodes; return *this; }

You should instead delete these explicit definitions. The compiler defaults would work fine.

r=me
Comment 9 Alexey Proskuryakov 2009-05-31 21:32:09 PDT
Comment on attachment 30812 [details]
first step

Committed revision 44306, clearing review flag.

> How about using |= here?

Not sure about that - it feels strange to use a binary operation with booleans.
Comment 10 Darin Adler 2009-05-31 21:51:32 PDT
(In reply to comment #9)
> Not sure about that - it feels strange to use a binary operation with booleans.

I don't know why. The logical operations and binary operations both do the same thing with booleans, except for their short-circuiting properties. The binary operations aren't "raw bits" operations or anything like that.
Comment 11 Alexey Proskuryakov 2009-05-31 22:25:52 PDT
Comment on attachment 30817 [details]
Don't needlessly sort location path results

Committed revision 44307, clearing review flag.

> You should instead delete these explicit definitions. The compiler defaults
> would work fine.

Done.

> I don't know why. The logical operations and binary operations both do the same
> thing with booleans, except for their short-circuiting properties.

It's not so good for BOOLs and Booleans, but OK, changed it when landing this patch.