Bug 49576 - jsc does not ignore shebang
Summary: jsc does not ignore shebang
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Mac (Intel) OS X 10.6
: P2 Trivial
Assignee: Nobody
Depends on:
Reported: 2010-11-15 20:42 PST by John Tantalo
Modified: 2010-12-06 10:47 PST (History)
1 user (show)

See Also:

JavaScript source with shebang. (32 bytes, text/javascript)
2010-11-15 20:42 PST, John Tantalo
no flags Details
svn diff JavaScriptCore/parser/Lexer.cpp (1.59 KB, patch)
2010-11-15 20:55 PST, John Tantalo
no flags Details | Formatted Diff | Diff
proposed patch (2.68 KB, patch)
2010-11-15 21:49 PST, John Tantalo
ggaren: review-
Details | Formatted Diff | Diff
proposed patch (1.00 KB, patch)
2010-11-16 11:18 PST, John Tantalo
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description John Tantalo 2010-11-15 20:42:12 PST
Created attachment 73956 [details]
JavaScript source with shebang.

Various other JavaScript engines [1] correctly ignore a shebang [2] as the first line of a file. Invoking jsc with a source file that contains a shebang generates a lex error. It should instead ignore the shebang.

[1] nodejs, narwhal, ringo
[2] http://en.wikipedia.org/wiki/Shebang_(Unix)
Comment 1 John Tantalo 2010-11-15 20:55:41 PST
Created attachment 73957 [details]
svn diff JavaScriptCore/parser/Lexer.cpp

I added a new character type for a hash, i.e., #, ASCII 35.

When this character was previously encountered, it would have immediately triggered a lexical error.

My change is to first check if this character is at the start of the line. If not, it is an error.

Next, I check if the next character is a bang, i.e,. !, ASCII 33. If it is, then the rest of the line is treated as a comment. If not, then it is an error.

Since the hash character was previously illegal, the only potential danger of this patch is that source which previously failed to lex may now be executed

This patch does not enforce that the shebang occurs on the first line of the source. Any occurrence of a hash other than the first character of the source should be illegal. I don't know how to do this, so I will submit my current patch as a viable solution to the bug.
Comment 2 John Tantalo 2010-11-15 21:49:31 PST
Created attachment 73960 [details]
proposed patch

Replaced diff with a real patch, with a test, changelog, and correct style.
Comment 3 Geoffrey Garen 2010-11-16 10:58:58 PST
Comment on attachment 73960 [details]
proposed patch

I think you've made this change at the wrong level of abstraction. Accounting for shebang in the lexer applies this Unix-only rule to all web pages and operating systems.

I think a better fix would be for the jsc tool itself to strip out shebang before evaluating a source file.
Comment 4 John Tantalo 2010-11-16 11:18:27 PST
Created attachment 74017 [details]
proposed patch

This patch modifies jsc's file load function to detect a shebang in the first two characters of the file buffer. If it finds a shebang, it replaces it with "//", which causes the rest of the line to be interpreted as a JavaScript comment.
Comment 5 Geoffrey Garen 2010-11-29 11:40:03 PST
Comment on attachment 74017 [details]
proposed patch

Comment 6 WebKit Commit Bot 2010-12-06 10:47:50 PST
Comment on attachment 74017 [details]
proposed patch

Clearing flags on attachment: 74017

Committed r73377: <http://trac.webkit.org/changeset/73377>
Comment 7 WebKit Commit Bot 2010-12-06 10:47:56 PST
All reviewed patches have been landed.  Closing bug.