Bug 14360 - JavaScript source view should be syntax highlighted
Summary: JavaScript source view should be syntax highlighted
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Inspector (Deprecated) (show other bugs)
Version: 523.x (Safari 3)
Hardware: All All
: P2 Normal
Assignee: Keishi Hattori
URL:
Keywords: InRadar
Depends on:
Blocks: 19677
  Show dependency treegraph
 
Reported: 2007-06-24 00:13 PDT by Adam Roben (:aroben)
Modified: 2008-06-19 13:27 PDT (History)
3 users (show)

See Also:


Attachments
Adds syntax coloring to JS source code in the Resources panel. (7.02 KB, patch)
2008-06-13 23:14 PDT, Keishi Hattori
no flags Details | Formatted Diff | Diff
highlighter rewritten (12.42 KB, patch)
2008-06-15 20:49 PDT, Keishi Hattori
timothy: review-
Details | Formatted Diff | Diff
regexp detection improved (12.68 KB, patch)
2008-06-17 22:33 PDT, Keishi Hattori
no flags Details | Formatted Diff | Diff
lines reduced (8.76 KB, patch)
2008-06-18 10:47 PDT, Keishi Hattori
no flags Details | Formatted Diff | Diff
delete (9.19 KB, patch)
2008-06-18 13:28 PDT, Keishi Hattori
no flags Details | Formatted Diff | Diff
with change log (10.29 KB, patch)
2008-06-18 17:19 PDT, Keishi Hattori
timothy: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Adam Roben (:aroben) 2007-06-24 00:13:48 PDT
The JavaScript source view in the Inspector should be syntax highlighted, just like the HTML source view already is.
Comment 1 Adam Roben (:aroben) 2008-01-29 10:59:10 PST
<rdar://problem/5712818>
Comment 2 Adam Roben (:aroben) 2008-06-11 19:16:40 PDT
It would probably be good enough (at least as an initial implementation) to do the highlighting using JavaScript code in the Inspector (like Drosera did for syntax highlighting).
Comment 3 Keishi Hattori 2008-06-13 23:14:16 PDT
Created attachment 21693 [details]
Adds syntax coloring to JS source code in the Resources panel.

I didn't know where to stick my code so I stuck it at the end of SourceFrame.js.

keywords, operators(except ?:), regexp, numbers, strings, comments get colorized
Comment 4 Timothy Hatcher 2008-06-14 00:37:20 PDT
Comment on attachment 21693 [details]
Adds syntax coloring to JS source code in the Resources panel.

Over all this is looking great! I have many comments, mainly because I have been thinking about this recently, but most are very minor.

A few style things: The tabs need to be removed and changed to 4 spaces. We also avoid single string quotes in the Inspector's code. A few places like "i=0" which is missing spaces around the operator. Also in for loops use ++i and ++j etc. Also try to use ===/!== wherever you can, I notice you mixed it up a bit. In echoChar (and other places) you don't indent the line after the if/else statements.

Other comments:
I would remove the use of getElementsByClassName in syntaxHighlightJavascript, and add more assumptions about the structure of the elements. We control the content so this code can be more efficient. Take a look at sourceRow() and _addMessageToSource(), they access the table rows and cells directly.

Tracking state needs to be changed, using a class name to track being in a comment is heavy handed. You can just add a property to the previous <tr> line element, like "_syntaxHighlightState" which is an object that has a "comment" boolean property. I recommend an object because I think you will need more state tracking because RegExp's can span multiple lines, as well as strings.

No need to add WebInspector.JavascriptSyntaxHighlight, just add another function to SourceFrame's prototype like "_syntaxHighlightJavascriptLine".

I would not use innerHTML for this code. The syntax highlighting needs to be fast, so manipulating the DOM directly would be preferred. Use textContent of the webkit-line-content cell, then remove all the nodes of the cell, then scan and make text nodes and elements with a className and textContent for the highlighted parts and append them to the cell as you go. This would eliminate the need for echoChar, since you don't need to entity encode textContent.

I am curious if isDigit and isHex would be faster as:

function isDigit(c)
{
    return "0123456789".indexOf(c) !== -1;
}

function isHex(c)
{
    return "0123456789abcdefABCDEF".indexOf(c) !== -1;
}

You are missing a few operators like "%=", "*=", "/=", etc. Right now those would highlight as two separate operators. Also most syntax highlighters don't highlight operators, so maybe we should skip them too?

Your list of keywords is too broad and includes things that JavaScriptCore does not consider a keyword. See JavaScriptCore/kjs/keywords.table for the correct list. Why do you highlight boolean true and false differently? I would just label them as a keyword.

This code needs to be as fast as possible. Also if you store the state on the previous lines, we can make the highlighter lazy, so we only syntax highlight the visible lines until you scroll down. The Drosera syntax highlighting code was a real slow down for long scripts, so this lazy highlighting approach might be required.

Do you need to check for \n when scanning? Our view source code always makes a new <tr> for each line ending. Do we keep the \n in the text?

You will also want to syntax highlight in ScriptView.

Where did you get the colors for the highlights? Does it match Xcode or something else?
Comment 5 Adam Roben (:aroben) 2008-06-14 06:49:56 PDT
(In reply to comment #4)
> This code needs to be as fast as possible. Also if you store the state on the
> previous lines, we can make the highlighter lazy, so we only syntax highlight
> the visible lines until you scroll down. The Drosera syntax highlighting code
> was a real slow down for long scripts, so this lazy highlighting approach might
> be required.

I think we could probably make the code lazy in a followup patch.
Comment 6 Keishi Hattori 2008-06-15 20:49:58 PDT
Created attachment 21720 [details]
highlighter rewritten

I rewrote the highlighter. It's still slow(prototype.js takes a couple of seconds). 

bug: confuses division slash with regexp
todo: Highlighting in the scripts panel
Comment 7 Keishi Hattori 2008-06-15 21:13:56 PDT
by the way, colors now match dashcode.
Comment 8 Adam Roben (:aroben) 2008-06-16 06:36:39 PDT
Comment on attachment 21720 [details]
highlighter rewritten

Keishi, you should make sure to set the "review" flag to "?" when uploading patches that you want people to look at (I've just done it for you).
Comment 9 Timothy Hatcher 2008-06-17 14:18:48 PDT
Comment on attachment 21720 [details]
highlighter rewritten

This is looking better! I have a few comments, most are code style and formatting.

First there are many lines that are missing indentation, like:

+            var match = /^(-?\d+\.?\d*|0x\h+|Infinity|NaN)\W/.exec(str);
+            if (!match)
+            match = /^(-?\d+\.?\d*|0x\h+|Infinity|NaN)$/.exec(str);

The line after the if statement should be indented like:

+            var match = /^(-?\d+\.?\d*|0x\h+|Infinity|NaN)\W/.exec(str);
+            if (!match)
+                match = /^(-?\d+\.?\d*|0x\h+|Infinity|NaN)$/.exec(str);

While I am looking at this line (and the others like it with two matches each) these can be simplified into one RegExp (for speed). Like:

+            var match = /^(-?\d+\.?\d*|0x\h+|Infinity|NaN)(?:\W|$)/.exec(str);

Also notice that I used the non-capturing parenthesis (?:...) to prevent the regex from remembering the contents, since we don't need them for back references. Use these when you don't need a back reference in other places you are using regular parenthesis now.

Put the opening brace for your functions (a few of them) on the next line:

+        function createSpan(content, className) {

Like:

+        function createSpan(content, className)
+        {

For readability, add some extra empty lines in a few places, like after the closing brace for the nested functions.

+            var span = document.createElement('SPAN');

Use double quotes, and use lowercase "span".

+                node._length = match[0].length;

I am not too fond of adding _length to all the highlighted nodes in the DOM. This will be a fairly large memory impact for large syntax-highlighted JS documents. You can take advantage of the closure created by these nested functions to provide the length for the previous match like this:

	_syntaxHighlightJavascriptLine: function(line, prevLine) {
		// ...
		
		var previousMatchLength = 0;
		
		// ...
		
		function findNumber(str) {
			// ...
			previousMatchLength = match[0].length;
			return node;
			// ...
		}
		
		// ...
		
		findNumber(code);
		i += previousMatchLength;
		
		// ...		
	}


+        if (tmp < code.length) {
+            line.appendChild(document.createTextNode(code.substring(tmp, i)));
+        }

This should be a single line if statement (no braces.)


+    syntaxHighlightJavascript: function() {
+        var rowNum = 1, lineContainer, line, prevLine;
+        var totalLines = this.numOfLines();
+        for (var i = 1; i <= totalLines; ++i) {
+            lineContainer = this.sourceRow(i);
+            line = lineContainer.lastChild;
+            this._syntaxHighlightJavascriptLine(line, prevLine);
+            prevLine = line;
+        }

I would not call sourceRow, it does a lot of work to be used in a loop like this.  Something like this would be better:

    syntaxHighlightJavascript: function() {
        var table = this.element.contentDocument.getElementsByTagName("table")[0];
        if (!table)
            return;

        var rows = table.rows;
        var rowsLength = rows.length;
        var previousCell = null;
        for (var i = 0; i < rowsLength; ++i) {
            var row = rows[i];
            var cell = row.getElementsByTagName("td")[1];
            if (!cell)
                continue;
            this._syntaxHighlightJavascriptLine(cell, previousCell);
            previousCell = cell;
        }
    }

You can then remove numOfLines. 

+            if (this.resource.type === WebInspector.Resource.Type.Script) {
+                this.sourceFrame.syntaxHighlightJavascript();
+            }

This should not have braces.

You should add the syntaxHighlightJavascript() call to ScriptView.js too, just like you did for SourceView. In a similar setupSourceFrameIfNeeded function too! No need to check type, just add the call and you fix one of the bugs you listed.
Comment 10 Keishi Hattori 2008-06-17 22:33:47 PDT
Created attachment 21811 [details]
regexp detection improved

xenon, thank you for the review.

This time I've done:
- Script View highlighting
- Support for scientific notation number literals
- Improved regexp detection

regexp is still not 100%, but I think it works most of the time now. jquery-min, prototype.js looks good.
Comment 11 Adam Roben (:aroben) 2008-06-18 08:10:08 PDT
Comment on attachment 21811 [details]
regexp detection improved

+        function findNumber(str)
+        {
+            var match = /^(-?(\d+\.?\d*([eE][+-]\d+)?|0[xX]\h+|Infinity)|NaN)(?:\W|$)/.exec(str);
+            if (match) {
+                var node = createSpan(match[1], "webkit-javascript-number");
+                previousMatchLength = match[1].length;
+                return node;
+            }
+            return null;
+        }

There are many functions in this patch that have this same form. I have two thoughts on this:

1. It would be nicer to reverse the condition in the if and do an early return. This will reduce nesting in the function. For example:

if (!match)
    return null;
...do the work that was inside the if here...

2. There might be some way to reduce the amount of boilerplate code here. Something like this might work, at least for the simple cases:

function generateFinder(regex, matchNumber, class)
{
    return function(str) {
        var match = regex.exec(str);
        if (!match)
            return null;
        previousMatchLength = match[matchNumber].length;
        return createSpan(match[matchNumber], class);
    }
}
Comment 12 Keishi Hattori 2008-06-18 10:47:22 PDT
Created attachment 21820 [details]
lines reduced

Thanks aroben. I was feeling there was too much redundancy.
I reduced the number of lines to less than half the previous one. I hope I didn't compromise code readability. I was surprised to see that generating a function every time has hardly any overhead.
Comment 13 Timothy Hatcher 2008-06-18 11:05:03 PDT
(In reply to comment #12)
> Thanks aroben. I was feeling there was too much redundancy.
> I reduced the number of lines to less than half the previous one. I hope I
> didn't compromise code readability. I was surprised to see that generating a
> function every time has hardly any overhead. 

Neat! I would use multiple var statements instead of comma separating them. Might make things more readable, and match coding style of the inspector better.

I think it is time to write a ChangeLog entry and get this sucker landed!
Comment 14 Anthony Ricaud 2008-06-18 12:36:07 PDT
Maybe it is possible to delete all the _commentContinues, _regexpContinues, etc. If I've understood the patch well, they are not useful after the code is executed. 
Comment 15 Timothy Hatcher 2008-06-18 12:42:33 PDT
(In reply to comment #14)
> Maybe it is possible to delete all the _commentContinues, _regexpContinues,
> etc. If I've understood the patch well, they are not useful after the code is
> executed. 

It would be good to delete them, but in the future when we lazily syntax highlight, we will need at least the last line that was syntax highlighted to have this info.
Comment 16 Anthony Ricaud 2008-06-18 13:23:03 PDT
When a section ends, a while loop could delete all the properties for that section. This would also work with a lazy syntax highlighting. 

But the code only needs to remember the state of the previous line, not every line. After highlighting a line, the precious line could be cleaned. 
Comment 17 Timothy Hatcher 2008-06-18 13:26:45 PDT
Agreed. We should make that change (delete the previous line's _*Continues properties when done with the line.)
Comment 18 Keishi Hattori 2008-06-18 13:28:36 PDT
Created attachment 21822 [details]
delete

I think writing like this will make it easy to rewrite into lazy highlighting.
        function deleteContinueFlags(cell)
        {
            if (cell) {
                delete cell._commentContinues;
                delete cell._singleQuoteStringContinues;
                delete cell._doubleQuoteStringContinues;
                delete cell._regexpContinues;
            }
        }
        
        for (var i = 0; i < rowsLength; ++i) {
            var row = rows[i];
            var cell = row.getElementsByTagName("td")[1];
            if (!cell)
                continue;
            this._syntaxHighlightJavascriptLine(cell, previousCell);
            deleteContinueFlags(previousCell);
            previousCell = cell;
        }
        deleteContinueFlags(previousCell);
Comment 19 Timothy Hatcher 2008-06-18 13:57:22 PDT
Looks great! I would move the if (cell) to the caller of deleteContinueFlags or make it an early return:

if (!cell)
    return;

Still need a ChangeLog, and then this will be r+ed!
Comment 20 Adam Roben (:aroben) 2008-06-18 14:03:19 PDT
(In reply to comment #19)
> Still need a ChangeLog, and then this will be r+ed!

Instructions for writing a ChangeLog can be found at http://webkit.org/coding/contributing.html
Comment 21 Keishi Hattori 2008-06-18 17:19:38 PDT
Created attachment 21825 [details]
with change log

Added a change log entry.
Comment 22 Adam Roben (:aroben) 2008-06-19 07:48:56 PDT
Comment on attachment 21825 [details]
with change log

We normally reference the Bugzilla bug in the ChangeLog entry by including its title and URL. Also, your email address in the ChangeLog header is missing the enclosing angle brackets. I'll fix these before landing.
Comment 23 Timothy Hatcher 2008-06-19 10:02:01 PDT
Landed in r34657.
Comment 24 Timothy Hatcher 2008-06-19 10:08:07 PDT
I think we should use colors from Xcode. After using this for a bit in the Web Inspector I am not fond of the Dashcode colors.
Comment 25 Timothy Hatcher 2008-06-19 10:22:08 PDT
I changed the colors in r34658.
Comment 26 Anthony Ricaud 2008-06-19 12:53:21 PDT
I don't know if it should be part of another bug but javascript code in HTML content is not syntax highlighted. It should be. 
Comment 27 Adam Roben (:aroben) 2008-06-19 13:27:59 PDT
(In reply to comment #26)
> I don't know if it should be part of another bug but javascript code in HTML
> content is not syntax highlighted. It should be. 

There is at least bug 19351. I had thought there was another one as well, but I can't find it.