WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
33330
Web Inspector: Regex-based syntax highlighting is slow.
https://bugs.webkit.org/show_bug.cgi?id=33330
Summary
Web Inspector: Regex-based syntax highlighting is slow.
Pavel Feldman
Reported
2010-01-07 10:53:50 PST
As of today, we are executing too many regexes against too many substrings in highlighter. Every time regex does not match, we lose all the intermediate calculations and move to another regex. So if "String" did not match, we will try "StringStart" although they are very similar. Instead of doing regexes, we should try implementing more efficient lexer. Unfortunately, there are no good parser generators for JavaScript yet, I tried a bunch of them. There is also no good grammar for ECMAScript. So I ended up using re2c (fast regex-based lexer generator) post-processed with a handful of sed scripts to convert from C to JavaScript. It gave me >10x performance boost on highlighting 30K lines file. The footprint of the generated lexer is almost 100K though which is 10 times more than the source. Patch will follow shortly (it will be based on text editor from 33001 that is not yet landed).
Attachments
[PATCH] Proposed change
(122.81 KB, patch)
2010-01-07 11:04 PST
,
Pavel Feldman
timothy
: review+
Details
Formatted Diff
Diff
View All
Add attachment
proposed patch, testcase, etc.
Pavel Feldman
Comment 1
2010-01-07 11:04:04 PST
Created
attachment 46065
[details]
[PATCH] Proposed change
Timothy Hatcher
Comment 2
2010-01-07 11:56:54 PST
Comment on
attachment 46065
[details]
[PATCH] Proposed change
> + this._tokens = [];
Tab.
> + this._conditions = { > + DIV: 0, > + NODIV: 1, > + COMMENT: 2, > + DSTRING: 3, > + SSTRING: 4, > + REGEX: 5 > + }; > + > + this.case_DIV = 1000; > + this.case_NODIV = 1001; > + this.case_COMMENT = 1002; > + this.case_DSTRING = 1003; > + this.case_SSTRING = 1004; > + this.case_REGEX = 1005; > +}
Can we camelcase these? I suspect not.
> +WebInspector.JavaScriptTokenizer.prototype = { > + > + setLine: function(line) {
Remove extra line. Getter and setter?
> + getCondition: function() > + { > + return this._condition; > + }, > + > + setCondition: function(condition) > + { > + this._condition = condition; > + },
Getter and setter?
> + while (1) { > + switch (goto) > + /*!re2c > + re2c:define:YYCTYPE = "var";
Why is two space indented? You should add a JS comment before this like "Start of Autogenerated stuff.." (better wording.)
Pavel Feldman
Comment 3
2010-01-08 03:51:17 PST
(In reply to
comment #2
)
> Tab. >
Fixed.
> > > Can we camelcase these? I suspect not. >
In theory, we could. But we need to distinguish between rules (camel case) and conditions (all caps). In most re2c I've seen all caps for conditions.
> > > +WebInspector.JavaScriptTokenizer.prototype = { > > + > > + setLine: function(line) { > > Remove extra line. Getter and setter? >
Done.
> > > + getCondition: function() > > + { > > + return this._condition; > > + }, > > + > > + setCondition: function(condition) > > + { > > + this._condition = condition; > > + }, > > Getter and setter? >
Generated code is calling into these. It is using a C function template for that.
> > > + while (1) { > > + switch (goto) > > + /*!re2c > > + re2c:define:YYCTYPE = "var"; > > Why is two space indented? You should add a JS comment before this like "Start > of Autogenerated stuff.." (better wording.)
Done. Committing to
http://svn.webkit.org/repository/webkit/trunk
... D WebCore/inspector/front-end/JavaScriptHighlighterScheme.js M WebCore/ChangeLog M WebCore/WebCore.gypi M WebCore/WebCore.vcproj/WebCore.vcproj A WebCore/inspector/front-end/JavaScriptTokenizer.js A WebCore/inspector/front-end/JavaScriptTokenizer.re2js M WebCore/inspector/front-end/TextEditorHighlighter.js M WebCore/inspector/front-end/WebKit.qrc M WebCore/inspector/front-end/inspector.html Committed
r52985
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug