Bug 166871

Summary: Lift template escape sequence restrictions in tagged templates
Product: WebKit Reporter: Kevin Gibbons <bakkot>
Component: JavaScriptCore Assignee: Yusuke Suzuki <ysuzuki>
Status: RESOLVED FIXED    
Severity: Normal CC: buildbot, commit-queue, darin, ggaren, keith_miller, mark.lam, msaboff, rniwa, saam, ysuzuki
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description                                                        Flags
WIP                                                                none
Patch                                                              none
Archive of layout-test-results from ews101 for mac-elcapitan       none
Archive of layout-test-results from ews107 for mac-elcapitan-wk2   none
Archive of layout-test-results from ews115 for mac-elcapitan       none
Patch                                                              saam: review+

Description Kevin Gibbons 2017-01-09 16:51:53 PST
The template literal revision proposal reached stage 3 in July.

This allows

(function(strs) {
  return strs[0] === undefined && strs.raw[0] === '\\u{invalid}\\1\\xGG';
})`\u{invalid}\1\xGG`; // true

etc. Invalid escapes remain a syntax error in untagged templates.
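
A minimal sketch of the behavior described above, assuming an engine that already implements the proposal. The names `inspect` and `compiles` are hypothetical helpers introduced only for illustration; the untagged case is compiled from a string via `new Function` so that the example file itself still parses.

```javascript
// Hypothetical tag function exposing both views of the first template segment.
function inspect(strs) {
  return { cooked: strs[0], raw: strs.raw[0] };
}

// Tagged: the invalid escapes are accepted. The cooked value is undefined,
// while the raw value keeps the source text verbatim.
const tagged = inspect`\u{invalid}\1\xGG`;
console.log(tagged.cooked); // undefined
console.log(tagged.raw);    // \u{invalid}\1\xGG

// Hypothetical helper: reports whether `source` parses as a function body.
function compiles(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Untagged: the same invalid escape is still rejected at parse time.
const untaggedOk = compiles('return `\\xGG`;');
// Tagged (here with the built-in String.raw): parses fine.
const taggedOk = compiles('return String.raw`\\xGG`;');
console.log(untaggedOk, taggedOk); // false true
```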
Comment 1 Yusuke Suzuki 2017-01-14 15:02:33 PST
Created attachment 298861 [details]
WIP
Comment 2 Yusuke Suzuki 2017-01-15 23:15:13 PST
Created attachment 298939 [details]
Patch
Comment 3 Build Bot 2017-01-16 00:29:46 PST
Comment on attachment 298939 [details]
Patch

Attachment 298939 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/2897828

New failing tests:
inspector/runtime/parse.html
Comment 4 Build Bot 2017-01-16 00:29:49 PST
Created attachment 298940 [details]
Archive of layout-test-results from ews101 for mac-elcapitan

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews101  Port: mac-elcapitan  Platform: Mac OS X 10.11.6
Comment 5 Build Bot 2017-01-16 00:33:24 PST
Comment on attachment 298939 [details]
Patch

Attachment 298939 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/2897834

New failing tests:
inspector/runtime/parse.html
Comment 6 Build Bot 2017-01-16 00:33:27 PST
Created attachment 298941 [details]
Archive of layout-test-results from ews107 for mac-elcapitan-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews107  Port: mac-elcapitan-wk2  Platform: Mac OS X 10.11.6
Comment 7 Build Bot 2017-01-16 00:48:52 PST
Comment on attachment 298939 [details]
Patch

Attachment 298939 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/2897846

New failing tests:
inspector/runtime/parse.html
Comment 8 Build Bot 2017-01-16 00:48:55 PST
Created attachment 298942 [details]
Archive of layout-test-results from ews115 for mac-elcapitan

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews115  Port: mac-elcapitan  Platform: Mac OS X 10.11.6
Comment 9 Yusuke Suzuki 2017-01-16 02:20:03 PST
Created attachment 298948 [details]
Patch
Comment 10 Saam Barati 2017-01-27 17:43:11 PST
Comment on attachment 298948 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=298948&action=review

Still looking. Some fly-by comments

> Source/JavaScriptCore/parser/Lexer.cpp:686
> +        // For raw template literal syntax, we consume `NotEscapeSequence`.
> +        //
> +        // NotEscapeSequence ::
> +        //     u [lookahead not one of HexDigit][lookahead != {]
> +        //     u HexDigit [lookahead not one of HexDigit]
> +        //     u HexDigit HexDigit [lookahead not one of HexDigit]
> +        //     u HexDigit HexDigit HexDigit [lookahead not one of HexDigit]
> +        while (isASCIIHexDigit(m_current))
> +            shift();

Shouldn't you just be parsing 4 characters here?

> Source/JavaScriptCore/parser/Lexer.cpp:1544
> +        tokenData->raw = makeIdentifier(m_bufferForRawTemplateString16.data(), m_bufferForRawTemplateString16.size());

Why do we unique this string?
Comment 11 Saam Barati 2017-01-27 18:28:07 PST
Comment on attachment 298948 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=298948&action=review

> Source/JavaScriptCore/parser/Lexer.cpp:1228
> +                shift();

Can you assert after here !isASCIIHexDigit(m_current)
Comment 12 Saam Barati 2017-01-27 18:28:26 PST
r=me
Comment 13 Yusuke Suzuki 2017-01-27 18:41:59 PST
Comment on attachment 298948 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=298948&action=review

Thanks!

>> Source/JavaScriptCore/parser/Lexer.cpp:686
>> +            shift();
> 
> Shouldn't you just be parsing 4 characters here?

I think it is OK. We already check the condition `UNLIKELY(!isASCIIHexDigit(m_current) || !isASCIIHexDigit(character2) || !isASCIIHexDigit(character3) || !isASCIIHexDigit(character4))`.
This condition means that at least one of the upcoming 4 characters is not an ASCII hex digit.
So this loop must stop after 0 to 3 iterations.
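
The bound can be illustrated with a small model, written in JavaScript for brevity rather than in the lexer's C++. It is only a sketch: `consumeNotEscapeSequence` and `isHexDigit` are hypothetical names, and `input` stands for the characters that follow `\u`. It is not the code in Lexer.cpp.

```javascript
// Hypothetical stand-in for isASCIIHexDigit; undefined models end of input.
const isHexDigit = (ch) => ch !== undefined && /^[0-9a-fA-F]$/.test(ch);

// Returns how many hex digits the catch-up loop consumes, or null when the
// next four characters form a valid \uXXXX escape (a different code path).
function consumeNotEscapeSequence(input) {
  const next4 = [input[0], input[1], input[2], input[3]];
  // Models the guard quoted above: at least one of the next four characters
  // must be a non-hex-digit for this path to be taken.
  if (next4.every(isHexDigit)) return null;
  let consumed = 0;
  // Models `while (isASCIIHexDigit(m_current)) shift();`
  while (isHexDigit(input[consumed])) consumed++;
  return consumed;
}

console.log(consumeNotEscapeSequence('G'));     // 0
console.log(consumeNotEscapeSequence('12G4'));  // 2
console.log(consumeNotEscapeSequence('123`'));  // 3
console.log(consumeNotEscapeSequence('1234'));  // null
```

Because the guard guarantees a non-hex-digit among the first four characters, the loop in this model can never consume more than three.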

>> Source/JavaScriptCore/parser/Lexer.cpp:1228
>> +                shift();
> 
> Can you assert after here !isASCIIHexDigit(m_current)

Yeah. Added.

>> Source/JavaScriptCore/parser/Lexer.cpp:1544
>> +        tokenData->raw = makeIdentifier(m_bufferForRawTemplateString16.data(), m_bufferForRawTemplateString16.size());
> 
> Why do we unique this string?

This follows the convention that we always call `makeIdentifier` on the content of a string literal.
There may be room for optimization: calling `makeIdentifier` only for strings that could be identifiers.
For now, this patch calls `makeIdentifier` even for raw strings.
Comment 14 Yusuke Suzuki 2017-01-27 19:10:36 PST
Committed r211319: <http://trac.webkit.org/changeset/211319>