Bug 150346 - Do not sanitize user input for input[type=url]
Summary: Do not sanitize user input for input[type=url]
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Forms (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Keith Rollin
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2015-10-19 16:41 PDT by Keith Rollin
Modified: 2015-10-28 16:54 PDT (History)
9 users (show)

See Also:


Attachments
Patch (5.76 KB, patch)
2015-10-23 16:57 PDT, Keith Rollin
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews113 for mac-yosemite (902.62 KB, application/zip)
2015-10-23 17:38 PDT, Build Bot
no flags Details
Patch (7.12 KB, patch)
2015-10-26 15:17 PDT, Keith Rollin
no flags Details | Formatted Diff | Diff
Patch (7.02 KB, patch)
2015-10-27 10:01 PDT, Keith Rollin
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews104 for mac-mavericks-wk2 (708.45 KB, application/zip)
2015-10-27 10:53 PDT, Build Bot
no flags Details
Patch (1.68 KB, patch)
2015-10-28 14:33 PDT, Keith Rollin
no flags Details | Formatted Diff | Diff
Patch (2.08 KB, patch)
2015-10-28 15:56 PDT, Keith Rollin
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Keith Rollin 2015-10-19 16:41:26 PDT
See Bug 148864. In addressing that bug, input[type=url] values are now sanitized when they are set. However, as Kent Tamura (tkent@chromium.org) points out:

-----
The specification and the test ask to sanitize
 a) text set by |value| IDL attribute
 b) text set by |value| content attribute

but not 
 c) text set by user.

In WebKit and Blink, InputType::sanitizeValue is called in all of three cases for now.

The problem after your change is that selection API won't work well.
e.g.
  we have input[type=url].  A user puts "   http://apple.com/   " (23 characters).  JavaScript code runs:

  var length = input.value.length;  // 17 because whitepsaces are stripped.
  input.setSelectionRange(length, length);  // Move the caret at the end

This sets the caret on "o", not the end of the value.

So, I proposed we didn't strip whitespaces for user input.  input[type=email] doesn't have this issue because it doesn't support selection API.
-----

This proposal seems supported by the HTML spec, which says of input[type=url]: "The value attribute, if specified and not empty, must have a value that is a valid URL potentially surrounded by spaces that is also an absolute URL." Since text is sanitized when setting element.value or when initialized from the content attribute, the only way these potential spaces can appear is from user input.

As part of this, see also:

https://bugs.webkit.org/show_bug.cgi?id=148864
https://code.google.com/p/chromium/issues/detail?id=446108
https://www.w3.org/Bugs/Public/show_bug.cgi?id=28401
Comment 1 Radar WebKit Bug Importer 2015-10-23 15:45:10 PDT
<rdar://problem/23243240>
Comment 2 Keith Rollin 2015-10-23 16:57:07 PDT
Created attachment 263959 [details]
Patch
Comment 3 Build Bot 2015-10-23 17:38:34 PDT
Comment on attachment 263959 [details]
Patch

Attachment 263959 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/329157

New failing tests:
fast/forms/input-user-input-sanitization.html
Comment 4 Build Bot 2015-10-23 17:38:37 PDT
Created attachment 263968 [details]
Archive of layout-test-results from ews113 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews113  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 5 Darin Adler 2015-10-24 16:13:00 PDT
Comment on attachment 263959 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=263959&action=review

Looks like the mac-debug bot is crashing when running this test. That sounds like an insertion failure.

> Source/WebCore/html/TextFieldInputType.cpp:507
> +    // Do not sanitize input values if the element supports the selection interface:
> +    //      https://bugs.webkit.org/show_bug.cgi?id=150346
> +    // The selection interface is supported by BaseTextInputType, which is the parent
> +    // class for email, password, search, telephone, text, and url. However, email
> +    // explicitly turns off support, so only the latter five are affected by this check.

This comment seems too oblique. I think the point is that when we support the selection interface, the “sanitize” concept is not good. But is that part of HTML’s design or is it something we decided on our own? Where does the rationale for tying these two concepts together come from? I assume it has something to do with the fact that if you are able to use the selection interface then you require that offsets within the string you set match offsets you use with the selection API. So that explains why sanitization causes a problem, but it does not explain why it’s then optional. Why do we ever sanitize any field type, if there is sometimes a problem like this? We need a short comment, but it needs to focus on why the code does what it does. I don’t think the comment needs to mention that email is not affected; that’s important for testing this change, but it’s also something that could change if code elsewhere changed and so should be documented by the tests. The main purpose of this code would be to answer the question “why?” rather than saying what this code does. I can already read the code below and see that it says “don’t sanitize if the object supports the selection API”, so that is the thing that the comment does *not* need to say. What the comment needs to say is why.

> LayoutTests/fast/forms/input-user-input-sanitization-expected.txt:11
> +PASS input.value is result
> +PASS input.value is result
> +PASS input.value is result
> +PASS input.value is result
> +PASS input.value is result
> +PASS input.value is result

We try to structure our tests so the output makes it clear what the test is testing. That’s why it logs the expression. Thus we write test expressions that include the specifics of the test.

> LayoutTests/fast/forms/input-user-input-sanitization.html:38
> +function testOne(id, text, expected)
> +{
> +    input = focusAndType(id, text);
> +    result = expected || text;
> +    shouldBe('input.value', 'result');
> +}

We need to fix this so the shouldBe gets expressions that include what’s being tested and the actual expected result. Using it like this means we can’t see what’s being tested. Something like:

    shouldBe('focusAndType("' + id + '", "' + text + '").value', '"' + result + '"');

Not sure if exactly will work, but something like that.
Comment 6 Keith Rollin 2015-10-26 15:17:16 PDT
Created attachment 264080 [details]
Patch
Comment 7 Keith Rollin 2015-10-26 15:18:48 PDT
* Addressed crash.
* Reworded comment.
* Changed test as suggested.
Comment 8 Chris Dumez 2015-10-26 15:27:12 PDT
Comment on attachment 264080 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=264080&action=review

> Source/WebCore/html/HTMLInputElement.cpp:1046
> +    // values in order to retain parity between what's in the model and what's

The comment should indicate that it does not sanitize *user input* only.

> Source/WebCore/html/HTMLInputElement.cpp:1048
> +    if (!m_inputType->supportsSelectionAPI())

No if

> Source/WebCore/html/HTMLInputElement.cpp:1049
> +        ASSERT(value == sanitizeValue(value) || sanitizeValue(value).isEmpty());

ASSERT(m_inputType->supportsSelectionAPI() || value == sanitizeValue(value) || sanitizeValue(value).isEmpty());

> Source/WebCore/html/HTMLInputElement.cpp:1052
> +    // The assert macro above may also be simplified to: value == santizeValue(value)

TYPO!

> LayoutTests/fast/forms/input-user-input-sanitization.html:1
> +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">

<!DOCTYPE html>

> LayoutTests/fast/forms/input-user-input-sanitization.html:5
> +<script src="resources/common.js"></script>

What is this used for?

> LayoutTests/fast/forms/input-user-input-sanitization.html:10
> +<p id="description"></p>

Not needed

> LayoutTests/fast/forms/input-user-input-sanitization.html:11
> +<div id="console"></div>

Not needed.

> LayoutTests/fast/forms/input-user-input-sanitization.html:27
> +    for (var i = 0, len = text.length; i < len; i++) {

no abbreviations in variable names please.

> LayoutTests/fast/forms/input-user-input-sanitization.html:36
> +    shouldBe('focusAndType("' + id + '", "' + text + '").value', '"' + result + '"');

Could use shouldBeEqualToString()
Comment 9 Kent Tamura 2015-10-26 16:51:13 PDT
Comment on attachment 264080 [details]
Patch

The approach looks ok to me.
Comment 10 Keith Rollin 2015-10-27 10:01:41 PDT
Created attachment 264133 [details]
Patch
Comment 11 Keith Rollin 2015-10-27 10:03:44 PDT
Updated for Chris's comments.
Comment 12 Keith Rollin 2015-10-27 10:06:50 PDT
> no abbreviations in variable names please.

Arg. I missed this one. I hope that's OK.
Comment 13 Chris Dumez 2015-10-27 10:09:29 PDT
Comment on attachment 264133 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=264133&action=review

> LayoutTests/fast/forms/input-user-input-sanitization.html:1
> +<!DOCTYPE HTML>

FYI, we usually use lowercase html but it is case-insensitive as per the spec so fine:
https://html.spec.whatwg.org/#the-doctype
Comment 14 Build Bot 2015-10-27 10:53:34 PDT
Comment on attachment 264133 [details]
Patch

Attachment 264133 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/344926

New failing tests:
animations/multiple-backgrounds.html
Comment 15 Build Bot 2015-10-27 10:53:38 PDT
Created attachment 264140 [details]
Archive of layout-test-results from ews104 for mac-mavericks-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews104  Port: mac-mavericks-wk2  Platform: Mac OS X 10.9.5
Comment 16 WebKit Commit Bot 2015-10-27 10:54:43 PDT
Comment on attachment 264133 [details]
Patch

Clearing flags on attachment: 264133

Committed r191626: <http://trac.webkit.org/changeset/191626>
Comment 17 WebKit Commit Bot 2015-10-27 10:54:47 PDT
All reviewed patches have been landed.  Closing bug.
Comment 18 Ryan Haddad 2015-10-27 15:07:06 PDT
The test added with this change is failing on win:
https://build.webkit.org/builders/Apple%20Win%207%20Release%20%28Tests%29/builds/54767

--- /home/buildbot/slave/win-release-tests/build/layout-test-results/fast/forms/input-user-input-sanitization-expected.txt
+++ /home/buildbot/slave/win-release-tests/build/layout-test-results/fast/forms/input-user-input-sanitization-actual.txt
@@ -3,12 +3,12 @@
 On success, you will see a series of "PASS" messages, followed by "TEST COMPLETE".
 
 
-PASS focusAndType("email", "   foobar@example.com   ").value is "foobar@example.com"
+FAIL focusAndType("email", "   foobar@example.com   ").value should be foobar@example.com. Was foobar2example.com.
 PASS focusAndType("password", "   foobar   ").value is "   foobar   "
 PASS focusAndType("search", "   foobar   ").value is "   foobar   "
 PASS focusAndType("telephone", "   123-456-7890   ").value is "   123-456-7890   "
 PASS focusAndType("text", "   foobar   ").value is "   foobar   "
-PASS focusAndType("url", "   https://foobar.example.com   ").value is "   https://foobar.example.com   "
+FAIL focusAndType("url", "   https://foobar.example.com   ").value should be    https://foobar.example.com   . Was    https;//foobar.example.com   .
 PASS successfullyParsed is true
 
 TEST COMPLETE
Comment 19 Keith Rollin 2015-10-28 14:33:17 PDT
Created attachment 264247 [details]
Patch
Comment 20 Chris Dumez 2015-10-28 14:34:36 PDT
Comment on attachment 264247 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=264247&action=review

> Tools/ChangeLog:12
> +

Please explain what the behavior change is. Why is your approach different from isASCIIUpper()?
Comment 21 Chris Dumez 2015-10-28 14:42:59 PDT
Comment on attachment 264247 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=264247&action=review

> Tools/DumpRenderTree/win/EventSender.cpp:514
> +        const char* shiftedUSCharacters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ~!@#$%^&*()_+{}|:\"<>?";

const char shiftedUSCharacters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ~!@#$%^&*()_+{}|:\"<>?";

would be better since we don't modify it.

> Tools/DumpRenderTree/win/EventSender.cpp:517
> +        if (strchr(shiftedUSCharacters, charCode) != nullptr)

Also in WebKit, we usually don't compare to nullptr. This would be:
if (strchr(shiftedUSCharacters, charCode))
Comment 22 Keith Rollin 2015-10-28 15:56:26 PDT
Created attachment 264259 [details]
Patch
Comment 23 Chris Dumez 2015-10-28 15:58:05 PDT
Comment on attachment 264259 [details]
Patch

LGTM but I'll let Brent review since this is Windows.
Comment 24 Brent Fulgham 2015-10-28 16:04:27 PDT
Comment on attachment 264259 [details]
Patch

r=me. Thank you for fixing this!
Comment 25 WebKit Commit Bot 2015-10-28 16:54:02 PDT
Comment on attachment 264259 [details]
Patch

Clearing flags on attachment: 264259

Committed r191704: <http://trac.webkit.org/changeset/191704>
Comment 26 WebKit Commit Bot 2015-10-28 16:54:08 PDT
All reviewed patches have been landed.  Closing bug.