RESOLVED FIXED 217576
[JSC] Accept arbitrary module namespace identifier names
https://bugs.webkit.org/show_bug.cgi?id=217576
Summary [JSC] Accept arbitrary module namespace identifier names
Yusuke Suzuki
Reported 2020-10-11 03:31:24 PDT
Attachments
Patch (18.46 KB, patch)
2020-12-15 00:22 PST, Yusuke Suzuki
no flags
Patch (18.50 KB, patch)
2020-12-15 02:26 PST, Yusuke Suzuki
no flags
Patch (20.56 KB, patch)
2020-12-15 18:41 PST, Yusuke Suzuki
no flags
Patch (20.56 KB, patch)
2020-12-15 18:43 PST, Yusuke Suzuki
darin: review+
ews-feeder: commit-queue-
Radar WebKit Bug Importer
Comment 1 2020-10-18 03:32:15 PDT
Yusuke Suzuki
Comment 2 2020-12-15 00:22:40 PST
Yusuke Suzuki
Comment 3 2020-12-15 02:26:08 PST
Darin Adler
Comment 4 2020-12-15 11:37:12 PST
Comment on attachment 416231 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=416231&action=review > Source/JavaScriptCore/parser/Parser.cpp:3401 > +static inline bool isStringWellFormedUnicode(const String& string) > +{ > + if (string.is8Bit()) > + return true; > + const UChar* characters = string.characters16(); > + for (unsigned index = 0, length = string.length(); index < length; ++index) { > + UChar character = characters[index]; > + if (!U16_IS_SURROGATE(character)) > + continue; > + if (U16_IS_SURROGATE_TRAIL(character)) > + return false; > + if (index + 1 == length) > + return false; > + ++index; > + UChar nextCharacter = characters[index]; > + if (!U16_IS_TRAIL(nextCharacter)) > + return false; > + } > + return true; > +} This looks like the same function as hasUnpairedSurrogate from JSDOMConvertStrings.cpp. It would be nice to share one function rather than having two separate implementations.
Yusuke Suzuki
Comment 5 2020-12-15 11:46:35 PST
Comment on attachment 416231 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=416231&action=review >> Source/JavaScriptCore/parser/Parser.cpp:3401 >> +} > > This looks like the same function as hasUnpairedSurrogate from JSDOMConvertStrings.cpp. It would be nice to share one function rather than having two separate implementations. Sounds good!
Yusuke Suzuki
Comment 6 2020-12-15 18:41:51 PST
Yusuke Suzuki
Comment 7 2020-12-15 18:43:47 PST
Darin Adler
Comment 8 2020-12-16 09:36:03 PST
Comment on attachment 416308 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=416308&action=review > Source/JavaScriptCore/parser/Parser.cpp:3538 > +template <class TreeBuilder> typename TreeBuilder::ExportSpecifier Parser<LexerType>::parseExportSpecifier(TreeBuilder& context, Vector<std::pair<const Identifier*, const Identifier*>>& maybeExportedLocalNames, bool& hasKeywordForLocalBindings, bool& hasReferencedModuleExportNames) Could we make this easier to read by using a structure for return values instead of having so many out arguments? > Source/JavaScriptCore/parser/Parser.cpp:3566 > - failIfFalse(matchIdentifierOrKeyword(), "Expected an exported name for the export declaration"); > + failIfFalse(matchIdentifierOrKeyword() || match(STRING), "Expected an exported name or a module export name string for the export declaration"); > exportedName = m_token.m_data.ident; > + if (match(STRING)) > + failIfTrue(hasUnpairedSurrogate(exportedName->string()), "Expected a well-formed-unicode string for the module export name"); Is match(STRING) expensive? I ask because we do the check twice. Pattern also recurs below. > Source/WTF/wtf/text/StringView.h:1090 > +inline bool hasUnpairedSurrogate(StringView string) Might be a little *too* inlined, but we can tweak that later.
Yusuke Suzuki
Comment 9 2020-12-16 18:26:25 PST
Comment on attachment 416308 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=416308&action=review Thanks!! >> Source/JavaScriptCore/parser/Parser.cpp:3538 >> +template <class TreeBuilder> typename TreeBuilder::ExportSpecifier Parser<LexerType>::parseExportSpecifier(TreeBuilder& context, Vector<std::pair<const Identifier*, const Identifier*>>& maybeExportedLocalNames, bool& hasKeywordForLocalBindings, bool& hasReferencedModuleExportNames) > > Could we make this easier to read by using a structure for return values instead of having so many out arguments? One of the thing we would like to have is that hasKeywordForLocalBindings and hasReferencedModuleExportNames is reused for multiple export specifier parsing. And we set true to them when we find some export specifiers which has these features. And we do some validation after parsing all the export specifiers. I think returning structure from this makes the code more complicated like, in the receiver side, if (hasKeywordForLocalBindings) hasKeywordForLocalBindingsInSomeExportSpecifier = true; if (hasReferencedModuleExportNames) hasReferencedModuleExportNamesInSomeExportSpecifier = true; things are necessary in the caller side. So compared to that, I think still this is cleaner. >> Source/JavaScriptCore/parser/Parser.cpp:3566 >> + failIfTrue(hasUnpairedSurrogate(exportedName->string()), "Expected a well-formed-unicode string for the module export name"); > > Is match(STRING) expensive? I ask because we do the check twice. Pattern also recurs below. This is cheap This is just comparing one enum value. >> Source/WTF/wtf/text/StringView.h:1090 >> +inline bool hasUnpairedSurrogate(StringView string) > > Might be a little *too* inlined, but we can tweak that later. Maybe, if it is too big, the compiler will just put as non-inlined function :) I think this is still good since `if (string.is8Bit())` would be major cases.
Yusuke Suzuki
Comment 10 2020-12-16 18:47:21 PST
Note You need to log in before you can comment on or make changes to this bug.