165885 – REGRESSION (r208902) Null pointer dereference in wkIsPublicSuffix

RESOLVED FIXED 165885

REGRESSION (r208902) Null pointer dereference in wkIsPublicSuffix

https://bugs.webkit.org/show_bug.cgi?id=165885

Summary REGRESSION (r208902) Null pointer dereference in wkIsPublicSuffix

Alex Christensen

Reported 2016-12-14 18:12:20 PST

REGRESSION (r208902) Null pointer dereference in wkIsPublicSuffix

Attachments
Patch (3.32 KB, patch) 2016-12-14 18:21 PST, Alex Christensen	no flags	Details Formatted Diff Diff
Patch (4.09 KB, patch) 2016-12-14 19:56 PST, Alex Christensen	no flags	Details Formatted Diff Diff
Patch (7.47 KB, patch) 2016-12-14 20:24 PST, Alex Christensen	no flags	Details Formatted Diff Diff
Patch (8.17 KB, patch) 2016-12-15 00:12 PST, Alex Christensen	no flags	Details Formatted Diff Diff
Show Obsolete (3) View All Add attachment proposed patch, testcase, etc.

Alex Christensen

Comment 1 2016-12-14 18:21:28 PST

Created attachment 297154 [details] Patch

Alex Christensen

Comment 2 2016-12-14 18:24:58 PST

rdar://problem/29476917

Alex Christensen

Comment 3 2016-12-14 19:56:12 PST

Created attachment 297161 [details] Patch

Alex Christensen

Comment 4 2016-12-14 20:24:08 PST

Created attachment 297162 [details] Patch

Darin Adler

Comment 5 2016-12-14 21:44:12 PST

Comment on attachment 297162 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=297162&action=review > Source/WebCore/ChangeLog:9 > + wkIsPublicSuffix crashes if you give it a nil NSString*. This patch seems to do more than just check for nil; changes the topPrivatelyControlledDomain algorithm. > Source/WebCore/platform/mac/PublicSuffixMac.mm:49 > if (domain.isNull() || domain.isEmpty()) > return String(); Never need to check both isNull and isEmpty. Null strings will all compare as empty. if (domain.isEmpty()) return String(); But also, we should see if we can just leave this check out entirely and test that null and empty strings both work. > Source/WebCore/platform/mac/PublicSuffixMac.mm:52 > + if ([decodeHostName(domain) _web_looksLikeIPAddress]) > return domain; Do we really need to decode the host name before calling _web_looksLikeIPAddress? That does not seem necessary. The old code did it because the single call to decodeHostName was shared by the entire function, but now I suspect we can get the correct result by just calling it without decoding. > Source/WebCore/platform/mac/PublicSuffixMac.mm:67 > + Vector<size_t, 8> substringStarts; > + substringStarts.append(0); > + for (size_t i = 0; i < domain.length(); ++i) { > + if (domain[i] == '.') > + substringStarts.append(i + 1); > + } > + > // Match the longest possible suffix. > - bool hasTopLevelDomain = false; > - NSCharacterSet *dot = [[NSCharacterSet characterSetWithCharactersInString:@"."] retain]; > - NSRange nextDot = NSMakeRange(0, [host length]); > - for (NSRange previousDot = [host rangeOfCharacterFromSet:dot]; previousDot.location != NSNotFound; nextDot = previousDot, previousDot = [host rangeOfCharacterFromSet:dot options:0 range:NSMakeRange(previousDot.location + previousDot.length, [host length] - (previousDot.location + previousDot.length))]) { > - NSString *substring = [host substringFromIndex:previousDot.location + previousDot.length]; > - if (wkIsPublicSuffix(substring)) { > - hasTopLevelDomain = true; > - break; > - } > + for (size_t i = 0; i < substringStarts.size(); ++i) { > + NSString *host = decodeHostName(domain.substring(substringStarts[i])); > + if (host && wkIsPublicSuffix(host)) > + return i ? domain.substring(substringStarts[i - 1]) : String(); > } > - > - [dot release]; > - if (!hasTopLevelDomain) > - return String(); > - > - if (!nextDot.location) > - return domain; > - > - return encodeHostName([host substringFromIndex:nextDot.location + nextDot.length]); > + return String(); If we feel the need to rewrite this to use String, we don’t need to allocate a vector, even on the stack. We can just translate the old algorithm more directly. Here’s my cut at it, taking my lead from your function in changing how it works: if (isPublicSuffix(domain)) return String(); // FIXME: Do we really need this special case? If we really do need this, we should add a test case proving that we do. size_t separatorPosition; for (unsigned labelStart = 0; (separatorPosition = domain.find('.', labelStart)) != notFound; labelStart = separatorPosition + 1) { if (isPublicSuffix(domain.substring(separatorPosition + 1))) return domain.substring(labelStart); } return String(); Note that indices into a string are unsigned, not size_t, except for the return value from find, which is peculiar because it uses size_t. What’s the rationale for calling decodeHostName over and over again? Could that perhaps give incorrect results if there is some special treatment at the very start of a host name? Or maybe it might be bad for performance?

Alex Christensen

Comment 6 2016-12-15 00:09:45 PST

We do need to call decodeHostName multiple times for correct behavior in the case that caused me to look into this, a url like "r4---asdf.example.com" where the first part causes ICU to fail but the rest is fine. We could build up a Vector<String> of each section between the dots, but that would require possibly more calls to uidna_nameToUnicode. I'm not sure whether each call to uidna_nameToUnicode slows it down because it is setting things up, or if uidna_nameToUnicode is slow internally. I don't imagine there is much of a performance difference between the two possible implementations of this, both of which require many calls to uidna_nameToUnicode, so I'm going to commit the version you suggested.

Alex Christensen

Comment 7 2016-12-15 00:12:52 PST

Created attachment 297178 [details] Patch

Alex Christensen

Comment 8 2016-12-15 00:20:39 PST

http://trac.webkit.org/changeset/209857

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version WebKit Nightly Build

Hardware Unspecified

OS Unspecified

Product WebKit

Component New Bugs

Assignee

Alex Christensen

Reported

2016-12-14 18:12 PST

Modified

2016-12-15 00:20 PST History

CC List

2 users Show

URL

Keywords InRadar

Depends on

Blocks