Bug 143381

Summary: URLs containing tabs or newlines are parsed incorrectly
Product: WebKit Reporter: Maciej Stachowiak <mjs>
Component: WebCore Misc.Assignee: Jiewen Tan <jiewen_tan>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, ap, bfulgham, buildbot, commit-queue, jiewen_tan, rniwa, webkit-bug-importer, zcorpan
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
test cases
none
Patch
none
Archive of layout-test-results from ews100 for mac-yosemite
none
Archive of layout-test-results from ews106 for mac-yosemite-wk2
none
Archive of layout-test-results from ews125 for ios-simulator-wk2
none
Archive of layout-test-results from ews113 for mac-yosemite
none
Patch
none
Archive of layout-test-results from ews103 for mac-yosemite
none
Archive of layout-test-results from ews106 for mac-yosemite-wk2
none
Archive of layout-test-results from ews123 for ios-simulator-wk2
none
Archive of layout-test-results from ews114 for mac-yosemite
none
Patch
bfulgham: review+, bfulgham: commit-queue-
Patch for landing none

Description Maciej Stachowiak 2015-04-03 11:10:49 PDT
WebKit appears to treat raw newline or tab characters in a URL hostname as spaces. Other browsers appear to strip them (and the URL spec calls for this). Illustrated example here:

https://url.spec.whatwg.org/interop/test-results/d7d52bebd0?select=current&baseline=safari

Live test case here:

http://intertwingly.net/tmp/url0.html
Comment 1 Simon Pieters (:zcorpan) 2016-03-07 08:33:05 PST
Also in the scheme. (Gecko changed that recently.)

See https://github.com/whatwg/url/issues/101

and https://github.com/whatwg/html/issues/823#issuecomment-193236577
Comment 2 Simon Pieters (:zcorpan) 2016-03-21 05:22:46 PDT
Test in web-platform-tests: http://w3c-test.org/url/a-element.html

Parsing: <h	t t p://h	o s t:9	0 0 0/p	a t h?q	u e ry#f	r a g> against <about:blank>
Comment 3 Jiewen Tan 2016-05-31 17:54:05 PDT
<rdar://problem/26567214>
Comment 4 Jiewen Tan 2016-06-01 14:30:16 PDT
Created attachment 280267 [details]
test cases

Please put it under LayoutTests/fast/url, and run it.
Comment 5 Radar WebKit Bug Importer 2016-06-01 14:32:19 PDT
<rdar://problem/26586115>
Comment 6 Jiewen Tan 2016-06-02 18:40:07 PDT
(In reply to comment #5)
> <rdar://problem/26586115>

This radar is duplicated.
Comment 7 Jiewen Tan 2016-06-02 18:42:41 PDT
Created attachment 280395 [details]
Patch
Comment 8 WebKit Commit Bot 2016-06-02 18:44:24 PDT
Attachment 280395 [details] did not pass style-queue:


ERROR: Source/WebCore/platform/URL.cpp:117:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebCore/platform/URL.cpp:118:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebCore/platform/URL.cpp:119:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 3 in 9 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 9 Build Bot 2016-06-02 19:30:10 PDT
Comment on attachment 280395 [details]
Patch

Attachment 280395 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/1426035

New failing tests:
fast/url/tab-and-newline-stripping.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-setAttributeNS.html
imported/w3c/web-platform-tests/html/dom/reflection-text.html
fast/dom/HTMLAnchorElement/set-href-attribute-host.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-setAttributeNode.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-setAttribute.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-location-htmldom.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-setAttribute.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-getAttribute-value.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-setAttributeNodeNS.html
imported/w3c/web-platform-tests/html/dom/reflection-embedded.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-setAttributeNodeNS.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-setAttributeNS.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-getAttribute-value.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-setAttributeNode.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-htmldom.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-htmldom.html
Comment 10 Build Bot 2016-06-02 19:30:14 PDT
Created attachment 280404 [details]
Archive of layout-test-results from ews100 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews100  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 11 Build Bot 2016-06-02 19:33:57 PDT
Comment on attachment 280395 [details]
Patch

Attachment 280395 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/1426042

New failing tests:
fast/url/tab-and-newline-stripping.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-setAttributeNS.html
imported/w3c/web-platform-tests/html/dom/reflection-text.html
fast/dom/HTMLAnchorElement/set-href-attribute-host.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-setAttributeNode.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-setAttribute.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-location-htmldom.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-setAttribute.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-getAttribute-value.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-setAttributeNodeNS.html
imported/w3c/web-platform-tests/html/dom/reflection-embedded.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-setAttributeNodeNS.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-setAttributeNS.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-getAttribute-value.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-setAttributeNode.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-htmldom.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-htmldom.html
Comment 12 Build Bot 2016-06-02 19:34:01 PDT
Created attachment 280406 [details]
Archive of layout-test-results from ews106 for mac-yosemite-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews106  Port: mac-yosemite-wk2  Platform: Mac OS X 10.10.5
Comment 13 Build Bot 2016-06-02 19:38:07 PDT
Comment on attachment 280395 [details]
Patch

Attachment 280395 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/1426036

New failing tests:
fast/url/tab-and-newline-stripping.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-setAttributeNS.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-setAttribute.html
fast/dom/HTMLAnchorElement/set-href-attribute-host.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-setAttributeNode.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-setAttribute.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-location-htmldom.html
imported/w3c/web-platform-tests/html/dom/reflection-text.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-getAttribute-value.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-setAttributeNodeNS.html
imported/w3c/web-platform-tests/html/dom/reflection-embedded.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-setAttributeNodeNS.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-setAttributeNS.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-getAttribute-value.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-htmldom.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-htmldom.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-setAttributeNode.html
Comment 14 Build Bot 2016-06-02 19:38:12 PDT
Created attachment 280409 [details]
Archive of layout-test-results from ews125 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews125  Port: ios-simulator-wk2  Platform: Mac OS X 10.11.4
Comment 15 Build Bot 2016-06-02 19:41:12 PDT
Comment on attachment 280395 [details]
Patch

Attachment 280395 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/1426044

New failing tests:
fast/url/tab-and-newline-stripping.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-setAttributeNS.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-setAttribute.html
fast/dom/HTMLAnchorElement/set-href-attribute-host.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-setAttributeNode.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-setAttribute.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-location-htmldom.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-getAttribute-value.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-setAttributeNodeNS.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-setAttributeNodeNS.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-setAttributeNS.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-getAttribute-value.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-setAttributeNode.html
http/tests/security/javascriptURL/javascriptURL-execution-context-frame-src-htmldom.html
http/tests/security/javascriptURL/javascriptURL-execution-context-iframe-src-htmldom.html
Comment 16 Build Bot 2016-06-02 19:41:16 PDT
Created attachment 280412 [details]
Archive of layout-test-results from ews113 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews113  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 17 Jiewen Tan 2016-06-03 16:52:28 PDT
Created attachment 280484 [details]
Patch
Comment 18 WebKit Commit Bot 2016-06-03 16:54:01 PDT
Attachment 280484 [details] did not pass style-queue:


ERROR: Source/WebCore/platform/URL.cpp:117:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebCore/platform/URL.cpp:118:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebCore/platform/URL.cpp:119:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 3 in 22 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 19 Build Bot 2016-06-03 17:32:49 PDT
Comment on attachment 280484 [details]
Patch

Attachment 280484 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/1431113

New failing tests:
http/tests/security/xssAuditor/javascript-link-control-char2.html
Comment 20 Build Bot 2016-06-03 17:32:53 PDT
Created attachment 280488 [details]
Archive of layout-test-results from ews103 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews103  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 21 Build Bot 2016-06-03 17:44:18 PDT
Comment on attachment 280484 [details]
Patch

Attachment 280484 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/1431143

New failing tests:
http/tests/security/xssAuditor/javascript-link-control-char2.html
Comment 22 Build Bot 2016-06-03 17:44:21 PDT
Created attachment 280490 [details]
Archive of layout-test-results from ews106 for mac-yosemite-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews106  Port: mac-yosemite-wk2  Platform: Mac OS X 10.10.5
Comment 23 Build Bot 2016-06-03 17:48:23 PDT
Comment on attachment 280484 [details]
Patch

Attachment 280484 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/1431140

New failing tests:
http/tests/security/xssAuditor/javascript-link-control-char2.html
Comment 24 Build Bot 2016-06-03 17:48:27 PDT
Created attachment 280493 [details]
Archive of layout-test-results from ews123 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews123  Port: ios-simulator-wk2  Platform: Mac OS X 10.11.4
Comment 25 Build Bot 2016-06-03 17:51:53 PDT
Comment on attachment 280484 [details]
Patch

Attachment 280484 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/1431151

New failing tests:
http/tests/security/xssAuditor/javascript-link-control-char2.html
Comment 26 Build Bot 2016-06-03 17:51:57 PDT
Created attachment 280494 [details]
Archive of layout-test-results from ews114 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews114  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 27 Jiewen Tan 2016-06-06 12:47:18 PDT
Created attachment 280620 [details]
Patch
Comment 28 WebKit Commit Bot 2016-06-06 12:49:03 PDT
Attachment 280620 [details] did not pass style-queue:


ERROR: Source/WebCore/platform/URL.cpp:117:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebCore/platform/URL.cpp:118:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebCore/platform/URL.cpp:119:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebCore/platform/URL.cpp:1940:  An else statement can be removed when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
Total errors found: 4 in 23 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 29 Alex Christensen 2016-06-06 15:00:18 PDT
Comment on attachment 280620 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=280620&action=review

> Source/WebCore/platform/URL.cpp:485
> +    // Get rid of leading and trailing whitespace and control characters.
> +    String rel = relative.stripWhiteSpace(shouldTrimFromURL);
> +
> +    // Get rid of any tabs and newlines.
> +    rel = rel.removeCharacters(isTabNewline);

Could we do this in one step?
Comment 30 Jiewen Tan 2016-06-06 16:16:30 PDT
Comment on attachment 280620 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=280620&action=review

>> Source/WebCore/platform/URL.cpp:485
>> +    rel = rel.removeCharacters(isTabNewline);
> 
> Could we do this in one step?

We could build a customized function to do the trick, but I don't think there is any WTFString builtin support for doing it.

Are you aware of any tricks that could help to achieve this without replicating many of the WTFString codes?
Comment 31 Alex Christensen 2016-06-06 17:16:01 PDT
(In reply to comment #30)
> Are you aware of any tricks that could help to achieve this without
> replicating many of the WTFString codes?
I am not aware of such tricks.  They may exist, or they may need to be made.
I think we need to rewrite the URL parser.
Comment 32 Jiewen Tan 2016-06-06 17:41:24 PDT
(In reply to comment #31)
> (In reply to comment #30)
> > Are you aware of any tricks that could help to achieve this without
> > replicating many of the WTFString codes?
> I am not aware of such tricks.  They may exist, or they may need to be made.
> I think we need to rewrite the URL parser.

I have no objections of rewriting the URL parser, but we should try our best to harden the current one before that actually happens.
Comment 33 Brent Fulgham 2016-06-06 17:41:51 PDT
Comment on attachment 280620 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=280620&action=review

> Source/WebCore/platform/URL.cpp:1942
> +        else

This else isn't necessary. Once we stop 'continuing' in the loop, we are no longer leading. Just set 'leading = false' once we stop the continue branch.

> Source/WebCore/platform/URL.cpp:1951
> +        if (isLetterMatchIgnoringCase(url[i], protocol[j]))

Couldn't this have just been written:

if (!isLetterMatchIgnoringCase(url[i], protocol[j]))
    return false;

++j;

> LayoutTests/fast/url/segments-expected.txt:18
> +PASS segments('http://f:\n/c') is '["http:","f","","/c","",""]'

Are you sure this change is correct? Why do we expect your whitespace handling changes to result in a different segment output?

> LayoutTests/fast/url/segments-from-data-url-expected.txt:18
> +PASS segments('http://f:\n/c') is '["http:","f","","/c","",""]'

Ditto.

> LayoutTests/fast/url/segments-from-data-url.html:25
> +  ["http://f:\\n/c",                         ["http:","f","","/c","",""]],

Ditto.

> LayoutTests/fast/url/segments.html:26
> +  ["http://f:\\n/c",                         ["http:","f","","/c","",""]],

Ditto.
Comment 34 Jiewen Tan 2016-06-06 17:57:55 PDT
Comment on attachment 280620 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=280620&action=review

Thanks for reviewing my patch!

>> LayoutTests/fast/url/segments-expected.txt:18
>> +PASS segments('http://f:\n/c') is '["http:","f","","/c","",""]'
> 
> Are you sure this change is correct? Why do we expect your whitespace handling changes to result in a different segment output?

Yes, I think so. Before my patch, the URL parser will think any URL containing whitespace as invalid URLs. Therefore, there is no segments information this test can extract from the test URL.

Now, with my patch. The test URL becomes http://f:/c. So, the protocol = 'http:', the hostname = 'f', the port = '', the path = '/c', the query = '', and the segment = ''.
Comment 35 Brent Fulgham 2016-06-06 18:04:40 PDT
Comment on attachment 280620 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=280620&action=review

>>> LayoutTests/fast/url/segments-expected.txt:18
>>> +PASS segments('http://f:\n/c') is '["http:","f","","/c","",""]'
>> 
>> Are you sure this change is correct? Why do we expect your whitespace handling changes to result in a different segment output?
> 
> Yes, I think so. Before my patch, the URL parser will think any URL containing whitespace as invalid URLs. Therefore, there is no segments information this test can extract from the test URL.
> 
> Now, with my patch. The test URL becomes http://f:/c. So, the protocol = 'http:', the hostname = 'f', the port = '', the path = '/c', the query = '', and the segment = ''.

I see! That makes sense.
Comment 36 Brent Fulgham 2016-06-06 18:12:19 PDT
Comment on attachment 280620 [details]
Patch

r=me, but please fix the return logic in protocolIs as I suggested (and the style-bot told you to fix).
Comment 37 Jiewen Tan 2016-06-06 18:44:04 PDT
Comment on attachment 280620 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=280620&action=review

>> Source/WebCore/platform/URL.cpp:1942
>> +        else
> 
> This else isn't necessary. Once we stop 'continuing' in the loop, we are no longer leading. Just set 'leading = false' once we stop the continue branch.

Fixed.

>> Source/WebCore/platform/URL.cpp:1951
>> +        if (isLetterMatchIgnoringCase(url[i], protocol[j]))
> 
> Couldn't this have just been written:
> 
> if (!isLetterMatchIgnoringCase(url[i], protocol[j]))
>     return false;
> 
> ++j;

Fixed.
Comment 38 Jiewen Tan 2016-06-06 18:45:05 PDT
Created attachment 280658 [details]
Patch for landing
Comment 39 WebKit Commit Bot 2016-06-06 18:46:32 PDT
Attachment 280658 [details] did not pass style-queue:


ERROR: Source/WebCore/platform/URL.cpp:117:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebCore/platform/URL.cpp:118:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebCore/platform/URL.cpp:119:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 3 in 23 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 40 WebKit Commit Bot 2016-06-06 22:45:18 PDT
Comment on attachment 280658 [details]
Patch for landing

Clearing flags on attachment: 280658

Committed r201740: <http://trac.webkit.org/changeset/201740>