Bug 217576 - [JSC] Accept arbitrary module namespace identifier names
Summary: [JSC] Accept arbitrary module namespace identifier names
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Yusuke Suzuki
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2020-10-11 03:31 PDT by Yusuke Suzuki
Modified: 2020-12-16 18:47 PST (History)
13 users (show)

See Also:


Attachments
Patch (18.46 KB, patch)
2020-12-15 00:22 PST, Yusuke Suzuki
no flags Details | Formatted Diff | Diff
Patch (18.50 KB, patch)
2020-12-15 02:26 PST, Yusuke Suzuki
no flags Details | Formatted Diff | Diff
Patch (20.56 KB, patch)
2020-12-15 18:41 PST, Yusuke Suzuki
no flags Details | Formatted Diff | Diff
Patch (20.56 KB, patch)
2020-12-15 18:43 PST, Yusuke Suzuki
darin: review+
ews-feeder: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Yusuke Suzuki 2020-10-11 03:31:24 PDT
https://github.com/tc39/ecma262/pull/2154
Comment 1 Radar WebKit Bug Importer 2020-10-18 03:32:15 PDT
<rdar://problem/70416104>
Comment 2 Yusuke Suzuki 2020-12-15 00:22:40 PST
Created attachment 416227 [details]
Patch
Comment 3 Yusuke Suzuki 2020-12-15 02:26:08 PST
Created attachment 416231 [details]
Patch
Comment 4 Darin Adler 2020-12-15 11:37:12 PST
Comment on attachment 416231 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=416231&action=review

> Source/JavaScriptCore/parser/Parser.cpp:3401
> +static inline bool isStringWellFormedUnicode(const String& string)
> +{
> +    if (string.is8Bit())
> +        return true;
> +    const UChar* characters = string.characters16();
> +    for (unsigned index = 0, length = string.length(); index < length; ++index) {
> +        UChar character = characters[index];
> +        if (!U16_IS_SURROGATE(character))
> +            continue;
> +        if (U16_IS_SURROGATE_TRAIL(character))
> +            return false;
> +        if (index + 1 == length)
> +            return false;
> +        ++index;
> +        UChar nextCharacter = characters[index];
> +        if (!U16_IS_TRAIL(nextCharacter))
> +            return false;
> +    }
> +    return true;
> +}

This looks like the same function as hasUnpairedSurrogate from JSDOMConvertStrings.cpp. It would be nice to share one function rather than having two separate implementations.
Comment 5 Yusuke Suzuki 2020-12-15 11:46:35 PST
Comment on attachment 416231 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=416231&action=review

>> Source/JavaScriptCore/parser/Parser.cpp:3401
>> +}
> 
> This looks like the same function as hasUnpairedSurrogate from JSDOMConvertStrings.cpp. It would be nice to share one function rather than having two separate implementations.

Sounds good!
Comment 6 Yusuke Suzuki 2020-12-15 18:41:51 PST
Created attachment 416307 [details]
Patch
Comment 7 Yusuke Suzuki 2020-12-15 18:43:47 PST
Created attachment 416308 [details]
Patch
Comment 8 Darin Adler 2020-12-16 09:36:03 PST
Comment on attachment 416308 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=416308&action=review

> Source/JavaScriptCore/parser/Parser.cpp:3538
> +template <class TreeBuilder> typename TreeBuilder::ExportSpecifier Parser<LexerType>::parseExportSpecifier(TreeBuilder& context, Vector<std::pair<const Identifier*, const Identifier*>>& maybeExportedLocalNames, bool& hasKeywordForLocalBindings, bool& hasReferencedModuleExportNames)

Could we make this easier to read by using a structure for return values instead of having so many out arguments?

> Source/JavaScriptCore/parser/Parser.cpp:3566
> -        failIfFalse(matchIdentifierOrKeyword(), "Expected an exported name for the export declaration");
> +        failIfFalse(matchIdentifierOrKeyword() || match(STRING), "Expected an exported name or a module export name string for the export declaration");
>          exportedName = m_token.m_data.ident;
> +        if (match(STRING))
> +            failIfTrue(hasUnpairedSurrogate(exportedName->string()), "Expected a well-formed-unicode string for the module export name");

Is match(STRING) expensive? I ask because we do the check twice. Pattern also recurs below.

> Source/WTF/wtf/text/StringView.h:1090
> +inline bool hasUnpairedSurrogate(StringView string)

Might be a little *too* inlined, but we can tweak that later.
Comment 9 Yusuke Suzuki 2020-12-16 18:26:25 PST
Comment on attachment 416308 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=416308&action=review

Thanks!!

>> Source/JavaScriptCore/parser/Parser.cpp:3538
>> +template <class TreeBuilder> typename TreeBuilder::ExportSpecifier Parser<LexerType>::parseExportSpecifier(TreeBuilder& context, Vector<std::pair<const Identifier*, const Identifier*>>& maybeExportedLocalNames, bool& hasKeywordForLocalBindings, bool& hasReferencedModuleExportNames)
> 
> Could we make this easier to read by using a structure for return values instead of having so many out arguments?

One of the thing we would like to have is that hasKeywordForLocalBindings and hasReferencedModuleExportNames is reused for multiple export specifier parsing.
And we set true to them when we find some export specifiers which has these features.
And we do some validation after parsing all the export specifiers.

I think returning structure from this makes the code more complicated like, in the receiver side,

if (hasKeywordForLocalBindings)
    hasKeywordForLocalBindingsInSomeExportSpecifier = true;
if (hasReferencedModuleExportNames)
    hasReferencedModuleExportNamesInSomeExportSpecifier = true;

things are necessary in the caller side. So compared to that, I think still this is cleaner.

>> Source/JavaScriptCore/parser/Parser.cpp:3566
>> +            failIfTrue(hasUnpairedSurrogate(exportedName->string()), "Expected a well-formed-unicode string for the module export name");
> 
> Is match(STRING) expensive? I ask because we do the check twice. Pattern also recurs below.

This is cheap This is just comparing one enum value.

>> Source/WTF/wtf/text/StringView.h:1090
>> +inline bool hasUnpairedSurrogate(StringView string)
> 
> Might be a little *too* inlined, but we can tweak that later.

Maybe, if it is too big, the compiler will just put as non-inlined function :)
I think this is still good since `if (string.is8Bit())` would be major cases.
Comment 10 Yusuke Suzuki 2020-12-16 18:47:21 PST
Committed r270923: <https://trac.webkit.org/changeset/270923>