RESOLVED WONTFIX 61097
Implement HTML5 location resolveURL
https://bugs.webkit.org/show_bug.cgi?id=61097
Summary Implement HTML5 location resolveURL
Erik Arvidsson
Reported 2011-05-18 16:34:54 PDT
Implement the resolveURL method on the Location interface
Attachments
Patch (16.82 KB, patch)
2011-05-19 13:40 PDT, Erik Arvidsson
no flags
Patch: expanded the tests (18.24 KB, patch)
2011-05-19 16:16 PDT, Erik Arvidsson
no flags
Archive of layout-test-results from ec2-cr-linux-01 (1.28 MB, application/zip)
2011-05-19 17:13 PDT, WebKit Review Bot
no flags
Patch (20.57 KB, patch)
2011-05-19 19:19 PDT, Erik Arvidsson
no flags
Patch: Use KURL::string so that the return URL is not decoded (17.27 KB, patch)
2011-05-20 11:03 PDT, Erik Arvidsson
ap: review-
Erik Arvidsson
Comment 1 2011-05-19 13:40:22 PDT
Alexey Proskuryakov
Comment 2 2011-05-19 14:13:30 PDT
Comment on attachment 94108 [details] Patch I'm wondering why prettyURL is used. Could you please add some tests with percent escaped URLs (both base and relative part), as well as with non-ASCII ones? Does any other browser implement this, so that we could compare the behavior in edge cases?
Erik Arvidsson
Comment 3 2011-05-19 14:15:20 PDT
No other browser implements this yet. I'll add some more tests. Thanks for looking at the patch.
Erik Arvidsson
Comment 4 2011-05-19 15:41:16 PDT
It is unclear to me what we want here. location.href uses prettyURL which does the following /a/%2F/b => /a//b Do we want to do the same thing here or would it be better to use string() which would not decode the percent encoded parts?
Erik Arvidsson
Comment 5 2011-05-19 16:16:51 PDT
Created attachment 94139 [details] Patch: expanded the tests
Alexey Proskuryakov
Comment 6 2011-05-19 16:44:18 PDT
I don't know how to answer these questions. But I think that they need to be answered before adding this new feature. ResoleURL is mostly useful because of its promise to do "the right thing" - and if we are going to change behavior later, it can be a problem for developers using it.
Erik Arvidsson
Comment 7 2011-05-19 16:59:41 PDT
(In reply to comment #6) > I don't know how to answer these questions. But I think that they need to be answered before adding this new feature. ResoleURL is mostly useful because of its promise to do "the right thing" - and if we are going to change behavior later, it can be a problem for developers using it. I'm just following the current behavior of location.href in this patch. We already have open bugs that Safari does not follow any of the other browsers and as far as I know Adam already spent a lot of effort trying to resolve this. See https://bugs.webkit.org/show_bug.cgi?id=30225 for more details.
WebKit Review Bot
Comment 8 2011-05-19 17:13:07 PDT
Comment on attachment 94139 [details] Patch: expanded the tests Attachment 94139 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/8719220 New failing tests: fast/dom/location-resolve-url.html
WebKit Review Bot
Comment 9 2011-05-19 17:13:12 PDT
Created attachment 94149 [details] Archive of layout-test-results from ec2-cr-linux-01 The attached test failures were seen while running run-webkit-tests on the chromium-ews. Bot: ec2-cr-linux-01 Port: Chromium Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Erik Arvidsson
Comment 10 2011-05-19 17:33:55 PDT
Comment on attachment 94139 [details] Patch: expanded the tests I need to update the tests to support both percent encoded and non encoded
Adam Barth
Comment 11 2011-05-19 17:34:19 PDT
Comment on attachment 94139 [details] Patch: expanded the tests View in context: https://bugs.webkit.org/attachment.cgi?id=94139&action=review > Source/WebCore/page/Location.cpp:289 > + return resolvedURL.hasPath() ? resolvedURL.prettyURL() : resolvedURL.prettyURL() + "/"; This looks great, but can you make a static helper function to share this code with Location::href? > Source/WebCore/workers/WorkerLocation.cpp:89 > + return resolvedURL.hasPath() ? resolvedURL.prettyURL() : resolvedURL.prettyURL() + "/"; Same here. This code is copy/pasted just two lines above. Maybe this should be a function on KURL?
Alexey Proskuryakov
Comment 12 2011-05-19 17:37:33 PDT
> I'm just following the current behavior of location.href in this patch. I don't think that this is a good explanation of why an new API can be implemented with a behavior that's not understood well.
Adam Barth
Comment 13 2011-05-19 17:43:14 PDT
> I don't think that this is a good explanation of why an new API can be implemented with a behavior that's not understood well. I don't think we should block implementing this API on unifying URL behavior across the web. URLs have crappy interoperability. That's a problem we should fix, but it's not a problem that needs to be fixed before adding this API.
Dominic Cooney
Comment 14 2011-05-19 17:56:52 PDT
Comment on attachment 94139 [details] Patch: expanded the tests View in context: https://bugs.webkit.org/attachment.cgi?id=94139&action=review > Source/WebCore/page/Location.cpp:278 > +String Location::resolveURL(const String& urlString, ExceptionCode& ec) const Where’s the SOP check for this?
Alexey Proskuryakov
Comment 15 2011-05-19 17:59:13 PDT
I disagree. What the use case for having the API if we can't even say what results it's going to return? Clearly it's not needed for compatibility, and clearly it's not needed for implementation experience, since implementation is trivial. So why is it needed at all? Hopefully there is some middle ground between "we have no idea" and "unify URL behavior across the web". If we can say something like "always use resolveURL() to get an URL that can be safely used with such and such existing API", and maintain that across releases, then I would agree that it's a good thing. Some other practical use case would make me convinced, too.
Adam Barth
Comment 16 2011-05-19 18:14:43 PDT
> What the use case for having the API if we can't even say what results it's going to return? The use case is to let web sites resolve relative URLs without having to create a goofy HTMLAnchorElement. This feature is just missing from the web platform. > Clearly it's not needed for compatibility, and clearly it's not needed for implementation experience, since implementation is trivial. So why is it needed at all? Because it's useful to web site authors. By your logic, we shouldn't implement any new web platform feature that relates to URLs because it's not needed for compatibility and URL processing isn't well-defined.
Adam Barth
Comment 17 2011-05-19 18:33:00 PDT
To flesh out a use case: 1) I want to obtain an OAuth token. 2) I want to have a callback URL relative to the current page, but I don't want to hard code the base URL of the page. To generate the callback URL for OAuth, I use location.resolveURL("/oauthcallback.php"), which gives me the right URL to include in the initial message regardless of the current host. To work around the lack of this API today, folks use regular expressions over location.href, which is pretty ridiculous and error prone.
Erik Arvidsson
Comment 18 2011-05-19 18:38:29 PDT
(In reply to comment #14) > (From update of attachment 94139 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=94139&action=review > > > Source/WebCore/page/Location.cpp:278 > > +String Location::resolveURL(const String& urlString, ExceptionCode& ec) const > > Where’s the SOP check for this? SOP? Same Origin Principle? This does not request anything so there is no need for any check like that.
Erik Arvidsson
Comment 19 2011-05-19 19:19:27 PDT
WebKit Review Bot
Comment 20 2011-05-19 20:09:59 PDT
Comment on attachment 94160 [details] Patch Attachment 94160 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/8722050
Adam Barth
Comment 21 2011-05-19 21:28:44 PDT
Comment on attachment 94160 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=94160&action=review r- for compile break. One comment below: > Source/WebCore/platform/KURL.h:154 > + String prettyURLWithTrailingSlash() const; You probably need this in KURLGoogleWhateverThatForkedFileIsCalled.cpp as well. Is prettyURL called elsewhere without adding a trailing slash? prettyURLWithTrailingSlash isn't the greatest name because there isn't always a trailing slash, as in: http://example.com/foo?bar Also, does this code handle the following input correctly: http://example.com?bar I guess it depends on KURL canonicalizes things. I don't think hasPath() will ever be false for GURL because GURL is more agressive about canonicalizing. Another interesting test case is: http: but now we're getting into UA-specific URL behaviors.
WebKit Review Bot
Comment 22 2011-05-19 21:41:46 PDT
Comment on attachment 94160 [details] Patch Attachment 94160 [details] did not pass cr-mac-ews (chromium): Output: http://queues.webkit.org/results/8719260
Alexey Proskuryakov
Comment 23 2011-05-19 22:35:20 PDT
> To generate the callback URL for OAuth, I use location.resolveURL("/oauthcallback.php"), which gives me the right URL to include in the initial message regardless of the current host. Thank you, that's a valid use case, and it is good to have it documented here. There are some requirements that follow from it. 1) It should be reasonably easy to pass the resolved URL to a server. 2) The server should get a URL it can make a request to. The pretty URL mostly passes (1) by the virtue of being a String, and even a valid UTF-16 sequence (which is not trivial, as we decode an arbitrary sequence of percent escapes). However, the server needs an ASCII string to make an HTTP request, so either side would need to perform an encoding step to make ASCII. (2) is clearly not fulfilled though. The browser's best idea of what such a URL looks like is not the pretty URL, but what it does when trying to request a resource. From your use case, it appears that the definition of resolveURL(url) should be: the URL the browser would use to request a subresource (such as an image from an IMG element) with the argument as a URL (e.g. src attribute of the IMG element). if we can all agree with this, then the implementation needs to be corrected.
WebKit Review Bot
Comment 24 2011-05-19 23:24:57 PDT
Comment on attachment 94160 [details] Patch Attachment 94160 [details] did not pass cr-mac-ews (chromium): Output: http://queues.webkit.org/results/8722085
Ojan Vafai
Comment 25 2011-05-20 10:28:47 PDT
The value returned to resolveURL should also match the one returned by location.href and location.pathname, no? It seems like Safari should not be returning the pretty URL for those (Firefox, Chrome and IE don't).
Ojan Vafai
Comment 26 2011-05-20 10:50:16 PDT
(In reply to comment #25) > The value returned to resolveURL should also match the one returned by location.href and location.pathname, no? It seems like Safari should not be returning the pretty URL for those (Firefox, Chrome and IE don't). Thinking about this more, I think it's fine for everyone to do the desired behavior (i.e. not the pretty URL). Making location.href and location.pathname can be done as a followup and this is an incremental step in that direction that does not have backward-compat concerns.
Alexey Proskuryakov
Comment 27 2011-05-20 10:58:50 PDT
> The value returned to resolveURL should also match the one returned by location.href and location.pathname, no? I'm not sure how to interpret this requirement. Are you saying that location.resolveURL(location.pathname) should always be equal to location.href? If you think that it should, please tell us why. Not having this equivalence does not affect the single known use case for resolveURL. I don't see any other case where it would matter. > It seems like Safari should not be returning the pretty URL for those (Firefox, Chrome and IE don't). It is very possible that location.href and pathname need to be fixed. The former is tracked in bug 30225. There is no bug for location.pathname, but there is one for HTMLAnchorElement.pathname (bug 22899). Unless there is a strong reason why these should match (i.e. stronger than our engineering instincts), I suggest focusing on resolveURL alone in this particular bug.
Alexey Proskuryakov
Comment 28 2011-05-20 10:59:34 PDT
> Making location.href and location.pathname can be done as a followup and this is an incremental step in that direction that does not have backward-compat concerns. Agreed.
Erik Arvidsson
Comment 29 2011-05-20 11:03:28 PDT
Created attachment 94241 [details] Patch: Use KURL::string so that the return URL is not decoded
Darin Adler
Comment 30 2011-05-20 11:05:58 PDT
Comment on attachment 94241 [details] Patch: Use KURL::string so that the return URL is not decoded What about all other functions of prettyURL? Does it do nothing useful?
Adam Barth
Comment 31 2011-05-20 11:20:11 PDT
> What about all other functions of prettyURL? Does it do nothing useful? I've filed https://bugs.webkit.org/show_bug.cgi?id=61201 to investigate this.
Erik Arvidsson
Comment 32 2011-05-20 11:23:02 PDT
(In reply to comment #30) > (From update of attachment 94241 [details]) > What about all other functions of prettyURL? Does it do nothing useful? I'm not fully sure what you are referring to here? prettyURL is still used in places where the main use of the URL is to present it in a more readable form for humans.
Darin Adler
Comment 33 2011-05-20 11:23:45 PDT
(In reply to comment #32) > I'm not fully sure what you are referring to here? > > prettyURL is still used in places where the main use of the URL is to present it in a more readable form for humans. Was there any benefit to using prettyURL here, or was it only causing problems and offering no benefits?
Alexey Proskuryakov
Comment 34 2011-05-20 11:38:07 PDT
Looking good to me. I'd like to see a test that has non-ASCII characters in domain name, to verify that IDNA encoding is used as expected. Perhaps it would be better to make this an http test, just to be closer to normal usage. +document.querySelector('base').href = 'http://www.webkit.org/\u03B1/\u03B2/'; This is interesting! It would be good to test an escape sequence that cannot be decoded to a valid UTF-8 string here (like %d0%d0). Also, I suggest adding <meta charset="utf-8"> to the test to make it more predictable in browsers that have control over how URL paths are encoded. Is resolveURL expected to preserve credentials from main document's URL (like http://user:pass@example.org)? I can't really tell from the use case, and I'm not sure what we do for images or for XMLHttpRequest. This is worth a regression test once we have the answer.
Erik Arvidsson
Comment 35 2011-05-20 15:18:07 PDT
I'm going on sabbatical for 3 months so I unassigned myself. If anyone wants to take over, feel free to do so.
Alexey Proskuryakov
Comment 36 2011-05-23 12:02:40 PDT
Comment on attachment 94241 [details] Patch: Use KURL::string so that the return URL is not decoded r- to get this out of review queue, since some more work is still needed. > Is resolveURL expected to preserve credentials from main document's URL My thinking is that probably not. We shouldn't disclose user credentials to a 3rd party server.
Erik Arvidsson
Comment 37 2011-08-16 17:15:54 PDT
I'm back and I still want to resolve this. Adam, do you want me to take over bug 30225 as well?
Adam Barth
Comment 38 2011-08-16 17:18:11 PDT
> Adam, do you want me to take over bug 30225 as well? If you'd be willing. I've lost context on these issues.
Anne van Kesteren
Comment 39 2011-08-17 01:24:44 PDT
Should resolveURL not be removed now we get the global URL object?
Erik Arvidsson
Comment 40 2011-08-17 11:33:54 PDT
I agree that Location resolveURL is already better covered by the URL proposal. Adam, what is the status of the URL spec?
Adam Barth
Comment 41 2011-08-17 11:40:13 PDT
> Adam, what is the status of the URL spec? Draft is here: https://docs.google.com/document/d/1r_VTFKApVOaNIkocrg0z-t7lZgzisTuGTXkdzAk4gLU/edit?hl=en_US The W3C WebApps group wants to take up work on the spec. I need to find some time to learn how to upload the document to their servers.
Note You need to log in before you can comment on or make changes to this bug.