WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED WONTFIX
46982
Setting href protocol attribute on malformed URL is incorrect when page is served over HTTP
https://bugs.webkit.org/show_bug.cgi?id=46982
Summary
Setting href protocol attribute on malformed URL is incorrect when page is se...
Steve Block
Reported
2010-10-01 07:04:26 PDT
LayoutTest fast/dom/HTMLAnchorElement/set-href-attribute-protocol.html tests setting the protocol attribute of the href of an anchor element. A particular part of this test is as follows ... debug("Set protocol to http on malformed URL"); a.href = "foo:??bar"; a.protocol = "http"; shouldBe("a.href", "'
http:??bar
'"); I noticed that when the test is served over HTTP, rather than as a file:// URL as is currently the case, this test fails as follows ... Chromium (including ToT WebKit): FAIL -
http://host/??bar
Safari (including ToT WebKit) : FAIL -
http://host/path??bar
Android (including ToT WebKit) : FAIL -
http://host/path??bar
The test passes on recent versions of FF and IE. A test case, which duplicates the existing test as an HTTP test, is attached. The spec -
http://dev.w3.org/html5/spec/urls.html
- suggests that the expected result in the test is correct irrespective of the scheme over which the page is served, so it looks to me like this should be fixed.
Attachments
Test case
(5.22 KB, text/plain)
2010-10-01 07:08 PDT
,
Steve Block
no flags
Details
View All
Add attachment
proposed patch, testcase, etc.
Steve Block
Comment 1
2010-10-01 07:08:59 PDT
Created
attachment 69464
[details]
Test case
Yael
Comment 2
2010-10-01 17:29:17 PDT
I can make the test case even smaller to show the error: var a = document.createElement('a'); a.href="
http:??bar
"; alert(a.href); The reason for the failure is in KURL.cpp, line 401. if (p[1] != '/' && equalIgnoringCase(base.protocol(), String(str, p - str)) && base.isHierarchical()) We compare the protocol of the base url to the protocol of the new utl, and if they match, and the base url is hierarchical, then we assume that either the new url is hierarchical or relative. I don't think that assumption is valid.
Steve Block
Comment 3
2010-10-04 05:28:10 PDT
Considering your simpler test case ... Served from
http://host/path
document.createElement('a').href = '
http:?query
'; '
http://host/path?query
' is the correct result provided we're following the non-strict parsing rules defined at
http://tools.ietf.org/html/rfc3986#section-5.2.2
. I think that the logic you mention at line 401 of KURL.cpp is providing exactly this non-strict parsing - we ignore the scheme if it's the same as that of the base URI. If we were to use the strict parsing rules, then the result should be '
http:?query
'. This matches the case when the scheme doesn't match that of the base URI, as is the case when the layout tests are served from file://. As for the test case I originally mentioned, this still looks like a bug to me, as the protocol is being set on the resolved URL, which should be 'foo:?query' ... Served from
http://host/path
a.href = "foo:?query"; a.protocol = "http"; shouldBe("a.href", "'
http:?query
'"); Chromium (including ToT WebKit): FAIL -
http://host/?query
Safari (including ToT WebKit) : FAIL -
http://host/path?query
Android (including ToT WebKit) : FAIL -
http://host/path?query
Note that the reverse case works as expected ... Served from
http://host/path
a.href = "
http:?query
"; a.protocol = "foo"; shouldBe("a.href", "'foo://host/path?query'");
Steve Block
Comment 4
2010-10-07 07:18:15 PDT
> As for the test case I originally mentioned, this still looks like a bug to me, as the protocol is being set on the > resolved URL, which should be 'foo:?query' ... > Served from
http://host/path
> a.href = "foo:?query"; > a.protocol = "http"; > shouldBe("a.href", "'
http:?query
'"); > > Chromium (including ToT WebKit): FAIL -
http://host/?query
> Safari (including ToT WebKit) : FAIL -
http://host/path?query
> Android (including ToT WebKit) : FAIL -
http://host/path?query
Having looked into this more carefully, I'm starting to think the Safari/Android behaviour is correct. The spec says that href attribute is subject to changes in base URL and should be re-resolved when the base URL changes. So the href attribute should 'store' the unresolved URL, and it is resolved relative to the base URL only when the getter is called. So it makes sense that .. a.href = "foo:?query"; a.protocol = "http"; shouldBe("a.href", "'
http://host/path?query
'"); ... should be identical in behaviour to ... a.href = "
http:?query
"; shouldBe("a.href", "'
http://host/path?query
'"); Can anybody confirm that this desired behaviour is correct? If so, then only Chromium is buggy. In any case, I think I've found a related, more fundamental bug, which should probably be fixed first -
Bug 47348
Adam Barth
Comment 5
2010-10-07 10:10:13 PDT
Sadly, this stuff is a mess. I'm interested to hear what Brett thinks.
Brett Wilson (Google)
Comment 6
2010-10-07 23:10:10 PDT
To answer the question, if you have href="
http://google.com
" and you ask for the path component, should it be empty or should it be "/"? Another question would he "
HTTP://google.com/
" should the protocol be "http" or "HTTP". Currently Chromium will definitely give you a slash and "http" because it resolves the full URL all the time. We have three different path canonicalization routines currently. One for hierarchical (http and the like), one for file URLs, and one for data/javascript URLs and the like. Say the page says a.path = "/foo?bar#baz" I think we have to escape the ? and the # or else this won't just be setting the path. Yet if the algorithm is supposed to be "parse this URL and do a string substitution without canonicalization" then we would need yet a fourth mode where we just escape characters that change the parsing of the URL. I think our behavior makes sense and I haven't heard of any real world problems with it. Changing this, even if it would make it more correct, would introduce a lot of complexity for almost no benefit.
Alexey Proskuryakov
Comment 7
2010-10-07 23:23:07 PDT
But isn't it even less desirable to change KURL to match?
Adam Barth
Comment 8
2010-10-07 23:24:37 PDT
(In reply to
comment #7
)
> But isn't it even less desirable to change KURL to match?
The most desirable thing, IMHO, is to unfork our URL code. As long as we have two pieces of code doing the same job, we're going to have two different behaviors.
Brett Wilson (Google)
Comment 9
2010-10-07 23:35:40 PDT
I wasn't arguing that KURL should be changed in this respect.
Steve Block
Comment 10
2010-10-08 02:25:35 PDT
OK, so it sounds like this should be WONTFIX. I'd appreciate your feedback on
Bug 47348
though, to see it that should be fixed.
Steve Block
Comment 11
2010-10-14 05:21:51 PDT
The thinking is that this isn't worth fixing due to the complexity it would introduce. However, this assumption may change once
Bug 47348
is fixed, as that may involve changing much of this code. Leaving this bug open until
Bug 47348
is fixed.
Ahmad Saleem
Comment 12
2022-08-05 07:52:02 PDT
I took the test case from attached and changed it into JSFiddle: Link -
https://jsfiddle.net/9drs1ofw/show
Following are results across browsers: *** Safari 15.6 on macOS 12.5 *** FAIL a.href should be
https://www.mydomain.com/path/
. Was null://www.mydomain.com/path/. ^ Safari fail only in this. *** Firefox Nightly 105 *** FAIL a.href should be
https://www.mydomain.com/path/
. Was null://www.mydomain.com/path/. *** Chrome Canary 106 *** FAIL a.href should be
https://www.mydomain.com/path/
. Was null://www.mydomain.com/path/. FAIL a.href should be
http:??bar
. Was
http:/??bar
. _______ Do we need to fix the failing test case? Just wanted to share updated test results. Thanks!
Brent Fulgham
Comment 13
2022-08-05 09:15:31 PDT
It seems like the standard has moved in this space, and all browsers agree on our current behavior. Seems like WONTFIX to me.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug