RESOLVED FIXED 148864
The value sanitization algorithm for input[type=url] should strip whitespaces
https://bugs.webkit.org/show_bug.cgi?id=148864
Summary The value sanitization algorithm for input[type=url] should strip whitespaces
Ryosuke Niwa
Reported 2015-09-04 18:24:47 PDT
The value sanitization algorithm for input type=url should strip leading and trailing whitespaces. See https://html.spec.whatwg.org/multipage/forms.html#url-state-(type=url) This bug was found by the newly added test: LayoutTests/http/tests/w3c/html/semantics/forms/the-input-element/url.html
Attachments
Patch (21.25 KB, patch)
2015-10-13 23:04 PDT, Keith Rollin
no flags
Patch (21.75 KB, patch)
2015-10-16 10:07 PDT, Keith Rollin
no flags
Patch (21.74 KB, patch)
2015-10-16 10:36 PDT, Keith Rollin
no flags
Radar WebKit Bug Importer
Comment 1 2015-09-04 18:26:01 PDT
Keith Rollin
Comment 2 2015-10-13 21:32:05 PDT
For reference, from https://html.spec.whatwg.org/multipage/forms.html#url-state-(type=url) : The value sanitization algorithm is as follows: Strip line breaks from the value, then strip leading and trailing whitespace from the value.
Keith Rollin
Comment 3 2015-10-13 23:03:02 PDT
Added URLInputType::sanitizeValue to strip leading and trailing spaces.
Keith Rollin
Comment 4 2015-10-13 23:04:29 PDT
Chris Dumez
Comment 5 2015-10-14 09:48:21 PDT
Comment on attachment 263059 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=263059&action=review > Source/WebCore/ChangeLog:8 > + Please add a link to the spec and indicate the behavior of other major browsers. > LayoutTests/imported/w3c/web-platform-tests/html/semantics/forms/the-input-element/type-change-state-expected.txt:87 > +FAIL change state from url to text assert_equals: input.value should be foobar after change of state expected " foobar " but got "foobar" There are new failures here. I think we should handle these cases properly in this patch.
Keith Rollin
Comment 6 2015-10-14 10:34:13 PDT
> There are new failures here. I think we should handle these cases properly in this patch. The failures are a confusing part of the test. They look like errors, but they seem to be part of capturing correct/expected results. What's happening in the test is that there is a double-nested loop, each loop iterating over each possible type of input (hidden, email, url, number, color, etc.). The core of the loop sets the type of the input element to the type controlled by the outer loop, sets the input element's value to a test string (" foobar "), sets the type of the input element to the type controlled by the inner loop, and checks the input element's new value. In order to check that the new value is correct, the test would need to encode the full matrix of initial-type -> new-type pairs, along with the expected new value for that pair (i.e., {"url/hidden/<result>", "url/text/<result>", "url/search/<result>", ... }). There are around 22 different types, meaning 484 pairs that would need to be captured in the test. This seems onerous and error-prone to generate and maintain. Therefore (and, since I didn't originally write this test, I'm only inferring here), it seems that the creator of this test captured the expected results in the *-expected.txt file by allowing both PASS and FAIL lines. The PASS lines cover test pairs where the result can trivially be validated. The FAIL lines cover test pairs where the result cannot be trivially be determined. But since the FAIL line contains information on what was actually produced in the conversion from one type to another, if a test run continues to match that FAIL line, then the test is correct and passes. If we start failing in a different way, then something has changed that needs to be investigated. That's what happened with the patch that sanitizes input[type=url].value. The resulting value is now different, leading to some FAIL tests to now fail in a different way, some PASS tests to now fail, and some FAIL tests to now PASS. I went through all the tests that now produce different results to verify that the new results are expected and that the *-expected.txt file contains the patterns we now want. Please let me know if it's OK to keep this current approach, or if we should implement a new approach (e.g., writing some script to generate and maintain the full set of 484 expected results).
Chris Dumez
Comment 7 2015-10-16 09:27:17 PDT
Comment on attachment 263059 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=263059&action=review r=me but please update the Change Log as suggested before landing. >> LayoutTests/imported/w3c/web-platform-tests/html/semantics/forms/the-input-element/type-change-state-expected.txt:87 >> +FAIL change state from url to text assert_equals: input.value should be foobar after change of state expected " foobar " but got "foobar" > > There are new failures here. I think we should handle these cases properly in this patch. Ok, never mind this comment.
Keith Rollin
Comment 8 2015-10-16 10:07:33 PDT
Chris Dumez
Comment 9 2015-10-16 10:09:27 PDT
Comment on attachment 263281 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=263281&action=review > Source/WebCore/ChangeLog:7 > + Reviewed by cdumez@apple.com. email won't work. > Source/WebCore/ChangeLog:12 > + Chrome also has the same issue with url.html. Firefox passes. All extra space before 'All'. > LayoutTests/imported/w3c/ChangeLog:7 > + Reviewed by cdumez@apple.com. ditto.
Keith Rollin
Comment 10 2015-10-16 10:36:06 PDT
WebKit Commit Bot
Comment 11 2015-10-16 11:24:45 PDT
Comment on attachment 263288 [details] Patch Clearing flags on attachment: 263288 Committed r191188: <http://trac.webkit.org/changeset/191188>
WebKit Commit Bot
Comment 12 2015-10-16 11:24:50 PDT
All reviewed patches have been landed. Closing bug.
Kent Tamura
Comment 13 2015-10-19 01:10:11 PDT
The landed patch looks wrong to me. Please refer to https://code.google.com/p/chromium/issues/detail?id=446108#c10
Kent Tamura
Comment 14 2015-10-19 01:47:17 PDT
Comment on attachment 263288 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=263288&action=review > Source/WebCore/html/URLInputType.cpp:69 > + return stripLeadingAndTrailingHTMLSpaces(BaseTextInputType::sanitizeValue(proposedValue)); TextFieldInputType::sanitizeValue() is not called. Now 'value' IDL property can have CR/LF wrongly.
Chris Dumez
Comment 15 2015-10-19 09:57:09 PDT
Comment on attachment 263288 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=263288&action=review >> Source/WebCore/html/URLInputType.cpp:69 >> + return stripLeadingAndTrailingHTMLSpaces(BaseTextInputType::sanitizeValue(proposedValue)); > > TextFieldInputType::sanitizeValue() is not called. Now 'value' IDL property can have CR/LF wrongly. I don't understand this comment. We're calling BaseTextInputType::sanitizeValue(), which ends up calling TextFieldInputType::sanitizeValue() (because BaseTextInputType is a subclass of TextFieldInputType and BaseTextInputType does not override sanitizeValue().
Kent Tamura
Comment 16 2015-10-19 15:44:56 PDT
Comment on attachment 263288 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=263288&action=review >>> Source/WebCore/html/URLInputType.cpp:69 >>> + return stripLeadingAndTrailingHTMLSpaces(BaseTextInputType::sanitizeValue(proposedValue)); >> >> TextFieldInputType::sanitizeValue() is not called. Now 'value' IDL property can have CR/LF wrongly. > > I don't understand this comment. We're calling BaseTextInputType::sanitizeValue(), which ends up calling TextFieldInputType::sanitizeValue() (because BaseTextInputType is a subclass of TextFieldInputType and BaseTextInputType does not override sanitizeValue(). My bad. You're right.
Note You need to log in before you can comment on or make changes to this bug.