Bug 226912

Summary: Check for "xn--" in any subdomain when parsing URL hosts
Product: WebKit Reporter: Alex Christensen <achristensen>
Component: New BugsAssignee: Alex Christensen <achristensen>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, benjamin, cdumez, clopez, cmarcelo, darin, ews-watchlist, webkit-bug-importer, youennf
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch darin: review+

Alex Christensen
Reported 2021-06-10 21:17:07 PDT
Check for "xn--" in any subdomain when parsing URL hosts
Attachments
Patch (31.07 KB, patch)
2021-06-10 21:20 PDT, Alex Christensen
no flags
Patch (39.98 KB, patch)
2021-06-11 13:01 PDT, Alex Christensen
no flags
Patch (42.05 KB, patch)
2021-06-11 13:39 PDT, Alex Christensen
no flags
Patch (46.21 KB, patch)
2021-06-11 16:21 PDT, Alex Christensen
no flags
Patch (49.56 KB, patch)
2021-06-14 12:59 PDT, Alex Christensen
no flags
Patch (49.12 KB, patch)
2021-06-14 13:51 PDT, Alex Christensen
darin: review+
Alex Christensen
Comment 1 2021-06-10 21:20:06 PDT
EWS Watchlist
Comment 2 2021-06-10 21:21:07 PDT
This patch modifies the imported WPT tests. Please ensure that any changes on the tests (not coming from a WPT import) are exported to WPT. Please see https://trac.webkit.org/wiki/WPTExportProcess
Alexey Proskuryakov
Comment 3 2021-06-11 07:11:04 PDT
Comment on attachment 431178 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=431178&action=review > Source/WTF/ChangeLog:3 > + Check for "xn--" in any subdomain when parsing URL hosts This changes many WPT results from pass to fail, please explain why that’s for the best.
Alex Christensen
Comment 4 2021-06-11 13:01:08 PDT
Alex Christensen
Comment 5 2021-06-11 13:39:06 PDT
Alex Christensen
Comment 6 2021-06-11 16:21:10 PDT
Darin Adler
Comment 7 2021-06-12 15:23:09 PDT
Comment on attachment 431239 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=431239&action=review > Source/WTF/wtf/URLParser.cpp:2662 > + state = State::NotAtSubdomainBeginOrInXNDashDash; Does not seem to do the right thing if c is "." here, since we skip a character before starting to look for ".". Test case that would fail would be "x.xn--". > Source/WTF/wtf/URLParser.cpp:2668 > + state = State::NotAtSubdomainBeginOrInXNDashDash; Ditto. Test case that would fail would be "xn.xn--". > Source/WTF/wtf/URLParser.cpp:2673 > + state = State::NotAtSubdomainBeginOrInXNDashDash; Ditto. Test case that would fail would be "xn-.xn--".
Alex Christensen
Comment 8 2021-06-14 12:58:16 PDT
Good catch! Adding those tests.
Alex Christensen
Comment 9 2021-06-14 12:59:22 PDT
Darin Adler
Comment 10 2021-06-14 13:05:34 PDT
Comment on attachment 431356 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=431356&action=review > Source/WTF/wtf/URLParser.cpp:2655 > + if (c == 'x' || c == 'X') > + state = State::AfterX; > + else if (c == '.') > + state = State::AtSubdomainBegin; > + else > + state = State::NotAtSubdomainBeginOrInXNDashDash; > + break; More concise way to write this: case State::AtSubdomainBegin: if (c == 'x' || c == 'X') state = State::AfterX; continue; } break; case State::... ... } if (c == '.') state = State::AtSubdomainBegin; else state = State::NotAtSubdomainBeginOrInXNDashDash; Code would be less repetitive.
Alex Christensen
Comment 11 2021-06-14 13:50:54 PDT
I like it.
Alex Christensen
Comment 12 2021-06-14 13:51:21 PDT
Alex Christensen
Comment 13 2021-06-15 08:28:41 PDT
Comment on attachment 431360 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=431360&action=review > Source/WTF/wtf/URLParser.cpp:2546 > + WTFLogAlways("ERRORS %x", processingDetails.errors); Will remove before landing.
Darin Adler
Comment 14 2021-06-15 08:49:26 PDT
Comment on attachment 431360 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=431360&action=review > Source/WTF/wtf/URL.cpp:-406 > - // For host names bigger than this, we won't do IDN encoding, which is almost certainly OK. I feel like it’s a problem to lose this comment. Could write a better one, but just removing it seems wrong. Without the comment it’s not obvious why it’s OK to leave the hostname as-is when it’s super long, and the check just below is mysterious. > Source/WTF/wtf/URLHelpers.cpp:-43 > -// For host names bigger than this, we won't do IDN encoding, which is almost certainly OK. Again, losing this comment. > Source/WTF/wtf/URLHelpers.cpp:575 > + if (hostName.length() > URLParser::hostnameBufferLength) Mysterious code without a comment. Why is it safe to just do nothing for long strings.
Alex Christensen
Comment 15 2021-06-15 09:51:25 PDT
Comment on attachment 431360 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=431360&action=review >> Source/WTF/wtf/URL.cpp:-406 >> - // For host names bigger than this, we won't do IDN encoding, which is almost certainly OK. > > I feel like it’s a problem to lose this comment. Could write a better one, but just removing it seems wrong. > > Without the comment it’s not obvious why it’s OK to leave the hostname as-is when it’s super long, and the check just below is mysterious. I'll keep the comment here too. >> Source/WTF/wtf/URLHelpers.cpp:575 >> + if (hostName.length() > URLParser::hostnameBufferLength) > > Mysterious code without a comment. Why is it safe to just do nothing for long strings. I'll put the comment here. > Source/WTF/wtf/URLParser.h:52 > + // Needs to be big enough to hold an IDN-encoded name. > + // For host names bigger than this, we won't do IDN encoding, which is almost certainly OK. > + constexpr static size_t hostnameBufferLength = 2048; ... instead of moving it along with the definition of hostnameBufferLength to here.
Alex Christensen
Comment 16 2021-06-15 09:59:30 PDT
Radar WebKit Bug Importer
Comment 17 2021-06-15 10:00:34 PDT
Note You need to log in before you can comment on or make changes to this bug.