161390 – Implement IPv6 parsing in URLParser

RESOLVED FIXED 161390

Implement IPv6 parsing in URLParser

https://bugs.webkit.org/show_bug.cgi?id=161390

Summary Implement IPv6 parsing in URLParser

Alex Christensen

Reported 2016-08-30 10:14:05 PDT

Implement IPv6 parsing in URLParser

Attachments
Patch (10.44 KB, patch) 2016-08-30 10:38 PDT, Alex Christensen	no flags	Details Formatted Diff Diff
Patch (10.67 KB, patch) 2016-08-30 17:12 PDT, Alex Christensen	no flags	Details Formatted Diff Diff
Patch (10.85 KB, patch) 2016-08-31 15:43 PDT, Alex Christensen	no flags	Details Formatted Diff Diff
Patch (12.09 KB, patch) 2016-08-31 16:27 PDT, Alex Christensen	no flags	Details Formatted Diff Diff
Patch (12.45 KB, patch) 2016-08-31 16:44 PDT, Alex Christensen	no flags	Details Formatted Diff Diff
Show Obsolete (4) View All Add attachment proposed patch, testcase, etc.

Alex Christensen

Comment 1 2016-08-30 10:38:44 PDT

Created attachment 287407 [details] Patch

Alex Christensen

Comment 2 2016-08-30 10:39:24 PDT

Comment on attachment 287407 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=287407&action=review > Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp:104 > + // checkURL("http://[0:f::f:f:0:0]", {"http", "", "", "[0:f::f:f:0:0]", 0, "/", "", "", "http://[0:f::f:f:0:0]/"}); > + // checkURL("http://[0:f:0:0:f::]", {"http", "", "", "[0:f:0:0:f::]", 0, "/", "", "", "http://[0:f:0:0:f::]/"}); > + // checkURL("http://[::f:0:0:f:0:0]", {"http", "", "", "[::f:0:0:f:0:0]", 0, "/", "", "", "http://[::f:0:0:f:0:0]/"}); These tests don't pass yet, but they should if the parser were 100% correct.. > Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp:191 > + /* > + checkURLDifferences("http://[0:0:f:0:0:f:0:0]", ditto

Alex Christensen

Comment 3 2016-08-30 17:12:14 PDT

Created attachment 287460 [details] Patch

Alex Christensen

Comment 4 2016-08-30 22:18:49 PDT

I fixed the tests. Not every branch is tested, but I plan to go over all this code with code coverage analysis before turning it on and make sure all code paths are tested.

Brady Eidson

Comment 5 2016-08-31 15:31:08 PDT

Comment on attachment 287460 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=287460&action=review > Source/WebCore/platform/URLParser.cpp:591 > +static char nibbleToHex(uint8_t nibble) This method is weird. It doesn't specify which bits of the byte are the nibble in question. In person you said it's always the lower bits. This is confusing from a code clarity standpoint. Also, we already have lowerNibbleToASCIIHexDigit. In person you said the problem is that lowerNibbleToASCIIHexDigit returns uppercase characters whereas you need lowercase. May I suggest u_tolower(lowerNibbleToASCIIHexDigit(...)) or convertASCIIAlphaToLower(lowerNibbleToASCIIHexDigit(...))? Or if performance is critical, naming this lowerNibbleToLowercaseASCIIHexDigit? > Source/WebCore/platform/URLParser.cpp:601 > + uint8_t c0 = piece >> 12; > + uint8_t c1 = piece >> 8 & 0xF; > + uint8_t c2 = piece >> 4 & 0xF; > + uint8_t c3 = piece & 0xF; These characters are poorly named nibbles.

Alex Christensen

Comment 6 2016-08-31 15:43:48 PDT

Created attachment 287553 [details] Patch

Darin Adler

Comment 7 2016-08-31 16:09:18 PDT

Comment on attachment 287553 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=287553&action=review > Source/WebCore/platform/URLParser.cpp:594 > +static char lowerNibbleToLowercaseASCIIHexDigit(uint8_t nibble) > +{ > + return nibble + (nibble < 10 ? '0' : 'a' - 10); > +} Please put this in ASCIICType.h, not here. > Source/WebCore/platform/URLParser.cpp:598 > + uint8_t nibbles[4] = { static_cast<uint8_t>(piece >> 12), static_cast<uint8_t>(piece >> 8 & 0xF), static_cast<uint8_t>(piece >> 4 & 0xF), static_cast<uint8_t>(piece & 0xF) }; Are you sure these static_cast are needed? What happens if you omit them? It seems overkill to use an array and too specific to specify the type uint8_t. I would write this: if (auto nibble0 = piece >> 12) ... auto nibble1 = piece >> 8 & 0xF; if (printed || nibble1) { etc. > Source/WebCore/platform/URLParser.cpp:616 > + Optional<size_t> compressPointer = findLongestZeroSequence(address); I suggest auto here. > Source/WebCore/platform/URLParser.cpp:819 > + size_t swaps = piecePointer - compressPointer.value(); > + piecePointer = 7; > + while (swaps) > + std::swap(address[piecePointer--], address[compressPointer.value() + swaps-- - 1]); This seems so non-obvious to me; what does it even do? Can we do this with std::reverse instead? > Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp:104 > + checkURL("http://[0:f::f:f:0:0]", {"http", "", "", "[0:f::f:f:0:0]", 0, "/", "", "", "http://[0:f::f:f:0:0]/"}); > + checkURL("http://[0:f:0:0:f::]", {"http", "", "", "[0:f:0:0:f::]", 0, "/", "", "", "http://[0:f:0:0:f::]/"}); > + checkURL("http://[::f:0:0:f:0:0]", {"http", "", "", "[::f:0:0:f:0:0]", 0, "/", "", "", "http://[::f:0:0:f:0:0]/"}); Don’t we need more cases? Using only 0 and f seems to not cover enough of the parsing.

Alex Christensen

Comment 8 2016-08-31 16:18:00 PDT

(In reply to comment #7) > > Source/WebCore/platform/URLParser.cpp:598 > > + uint8_t nibbles[4] = { static_cast<uint8_t>(piece >> 12), static_cast<uint8_t>(piece >> 8 & 0xF), static_cast<uint8_t>(piece >> 4 & 0xF), static_cast<uint8_t>(piece & 0xF) }; > > Are you sure these static_cast are needed? What happens if you omit them? They are needed. It doesn't compile successfully without them. > > It seems overkill to use an array and too specific to specify the type > uint8_t. I would write this: > > if (auto nibble0 = piece >> 12) > ... > auto nibble1 = piece >> 8 & 0xF; > if (printed || nibble1) { > > etc. This would make the static_cast not necessary. > > Source/WebCore/platform/URLParser.cpp:819 > > + size_t swaps = piecePointer - compressPointer.value(); > > + piecePointer = 7; > > + while (swaps) > > + std::swap(address[piecePointer--], address[compressPointer.value() + swaps-- - 1]); > > This seems so non-obvious to me; what does it even do? Can we do this with > std::reverse instead? Maybe. This is what the spec says. I was quite aggressive with my use of postfix decrement, though. > > > Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp:104 > > + checkURL("http://[0:f::f:f:0:0]", {"http", "", "", "[0:f::f:f:0:0]", 0, "/", "", "", "http://[0:f::f:f:0:0]/"}); > > + checkURL("http://[0:f:0:0:f::]", {"http", "", "", "[0:f:0:0:f::]", 0, "/", "", "", "http://[0:f:0:0:f::]/"}); > > + checkURL("http://[::f:0:0:f:0:0]", {"http", "", "", "[::f:0:0:f:0:0]", 0, "/", "", "", "http://[::f:0:0:f:0:0]/"}); > > Don’t we need more cases? Using only 0 and f seems to not cover enough of > the parsing. Yes we need more cases. I am going to do much more thorough testing with code coverage in a second pass.

Alex Christensen

Comment 9 2016-08-31 16:27:14 PDT

Created attachment 287559 [details] Patch

Alex Christensen

Comment 10 2016-08-31 16:44:44 PDT

Created attachment 287567 [details] Patch

Alex Christensen

Comment 11 2016-08-31 17:01:22 PDT

(In reply to comment #8) > > Don’t we need more cases? Using only 0 and f seems to not cover enough of > > the parsing. > Yes we need more cases. I am going to do much more thorough testing with > code coverage in a second pass. To be clear, the tests with lots of f's are testing the location of the :: Entire features like the ipv4 parser inside of the ipv6 parser currently have 0% test coverage, but I'm trying to get the parser to a semi-usable state then polish it from there.

Alex Christensen

Comment 12 2016-08-31 17:13:11 PDT

> > This seems so non-obvious to me; what does it even do? Can we do this with > > std::reverse instead? > Maybe. This is what the spec says. I was quite aggressive with my use of > postfix decrement, though. std::reverse isn't what this does. std::reverse switches the order of the elements. This just shifts the elements over without changing their order. We could use a reverse version of std::swap_ranges, but such a thing does not exist.

Alex Christensen

Comment 13 2016-08-31 17:25:59 PDT

http://trac.webkit.org/changeset/205273

Darin Adler

Comment 14 2016-09-02 09:08:40 PDT

Comment on attachment 287567 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=287567&action=review > Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp:56 > + return s1.utf8() == s2.utf8(); This is no different than s1 == s2; the conversion to UTF-8 has no effect on equality. Not sure why this exists.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version WebKit Nightly Build

Hardware Unspecified

OS Unspecified

Product WebKit

Component New Bugs

Assignee

Alex Christensen

Reported

2016-08-30 10:14 PDT

Modified

2016-09-02 09:08 PDT History

CC List

5 users Show

URL

Keywords

Depends on

Blocks