Bug 49576

Summary: jsc does not ignore shebang
Product: WebKit Reporter: John Tantalo <john.tantalo>
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Trivial CC: commit-queue
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Macintosh Intel   
OS: OS X 10.6   
Attachments:
Description Flags
JavaScript source with shebang.
none
svn diff JavaScriptCore/parser/Lexer.cpp
none
proposed patch
ggaren: review-
proposed patch none

Description John Tantalo 2010-11-15 20:42:12 PST
Created attachment 73956 [details]
JavaScript source with shebang.

Various other JavaScript engines [1] correctly ignore a shebang [2] as the first line of a file. Invoking jsc with a source file that contains a shebang generates a lex error. It should instead ignore the shebang.

[1] nodejs, narwhal, ringo
[2] http://en.wikipedia.org/wiki/Shebang_(Unix)
Comment 1 John Tantalo 2010-11-15 20:55:41 PST
Created attachment 73957 [details]
svn diff JavaScriptCore/parser/Lexer.cpp

I added a new character type for a hash, i.e., #, ASCII 35.

When this character was previously encountered, it would have immediately triggered a lexical error.

My change is to first check if this character is at the start of the line. If not, it is an error.

Next, I check if the next character is a bang, i.e,. !, ASCII 33. If it is, then the rest of the line is treated as a comment. If not, then it is an error.

Since the hash character was previously illegal, the only potential danger of this patch is that source which previously failed to lex may now be executed

This patch does not enforce that the shebang occurs on the first line of the source. Any occurrence of a hash other than the first character of the source should be illegal. I don't know how to do this, so I will submit my current patch as a viable solution to the bug.
Comment 2 John Tantalo 2010-11-15 21:49:31 PST
Created attachment 73960 [details]
proposed patch

Replaced diff with a real patch, with a test, changelog, and correct style.
Comment 3 Geoffrey Garen 2010-11-16 10:58:58 PST
Comment on attachment 73960 [details]
proposed patch

I think you've made this change at the wrong level of abstraction. Accounting for shebang in the lexer applies this Unix-only rule to all web pages and operating systems.

I think a better fix would be for the jsc tool itself to strip out shebang before evaluating a source file.
Comment 4 John Tantalo 2010-11-16 11:18:27 PST
Created attachment 74017 [details]
proposed patch

This patch modifies jsc's file load function to detect a shebang in the first two characters of the file buffer. If it finds a shebang, it replaces it with "//", which causes the rest of the line to be interpreted as a JavaScript comment.
Comment 5 Geoffrey Garen 2010-11-29 11:40:03 PST
Comment on attachment 74017 [details]
proposed patch

r=me
Comment 6 WebKit Commit Bot 2010-12-06 10:47:50 PST
Comment on attachment 74017 [details]
proposed patch

Clearing flags on attachment: 74017

Committed r73377: <http://trac.webkit.org/changeset/73377>
Comment 7 WebKit Commit Bot 2010-12-06 10:47:56 PST
All reviewed patches have been landed.  Closing bug.