Bug 14555 - Spaces should be encoded as %20 in mailto URLs
Summary: Spaces should be encoded as %20 in mailto URLs
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Forms
Version: 523.x (Safari 3)
Hardware: All All
Importance: P3 Enhancement
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks: 37641
Reported: 2007-07-07 13:27 PDT by Michael A. Puls II
Modified: 2011-03-14 16:09 PDT

See Also:


Attachments
Descriptive TC. (Will need to minimalize) (2.58 KB, text/html)
2007-07-07 13:28 PDT, Michael A. Puls II

Description Michael A. Puls II 2007-07-07 13:27:26 PDT
When generating the encoded data set for an action="mailto:" form, a few things go wrong.

1. Values from the form fields are encoded twice. For example, with <input type="text" name="cc" value="test@site.com">, test@site.com will be encoded to test%2540site.com instead of just test%40site.com.

2. "body=" is inserted at the beginning of the data set, which causes other parts to become part of the body.

3. Spaces in the form fields are encoded to + instead of %20. Mail clients don't decode + to a space, so each space in the form fields will show up as + in the mail client.
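
Items 1 and 3 can be reproduced outside the browser. The following is only a minimal sketch, not WebKit code: it assumes Python's urllib.parse behaves like a generic percent-encoder, and the variable names are made up for illustration.

```python
from urllib.parse import quote, quote_plus

value = "test@site.com"

# Item 1: percent-encoding an already-encoded value turns '%' into '%25'.
once = quote(value, safe="")
twice = quote(once, safe="")
print(once)   # test%40site.com
print(twice)  # test%2540site.com

# Item 3: form-style encoding writes a space as '+',
# while plain percent-encoding writes it as '%20'.
subject = "hello world"
form_style = quote_plus(subject)
uri_style = quote(subject, safe="")
print(form_style)  # hello+world
print(uri_style)   # hello%20world
```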
Comment 1 Michael A. Puls II 2007-07-07 13:28:13 PDT
Created attachment 15436
Descriptive TC. (Will need to minimalize)
Comment 2 David Kilzer (:ddkilzer) 2007-07-07 14:38:41 PDT
This occurs with a local debug build of WebKit r24089 with Safari 3.0 (522.12) on Mac OS X 10.4.10 (8R218).

It also occurs with Safari 2.0.4 (419.3) with original WebKit on Mac OS X 10.4.10.

Note that Mail.app correctly parses the body out of the URL (unlike on Windows?), but everything is still improperly twice-encoded.

Comment 3 Alexey Proskuryakov 2008-01-02 09:40:57 PST
Fixed (1) and (2) in <http://trac.webkit.org/projects/webkit/changeset/29086>.

Since other browsers do not seem to encode spaces as suggested in (3), I have left this for later. The test case mentions efforts to get specs corrected - have those been successful? Also, is there a Firefox bug for (3)?

Re-titling and changing priority to match the current scope of the bug.
Comment 4 Michael A. Puls II 2008-01-02 17:59:23 PST
(In reply to comment #3)
> Fixed (1) and (2) in <http://trac.webkit.org/projects/webkit/changeset/29086>.

Thanks

> Since other browsers do not seem to encode spaces as suggested in (3), I have
> left this for later. The test case mentions efforts to get specs corrected -
> have those been successful? Also, is there a Firefox bug for (3)?

The issue raised at <http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2007-January/009210.html> and <http://lists.w3.org/Archives/Public/public-html/2007May/1061.html> is not yet resolved in HTML5 because the WebForms part of the spec hasn't been reviewed and put into HTML5. The editor of XForms has also seen those links, but no one has really given any feedback yet.

I've also mentioned this issue to the authors of RFC2368. They at least recognize the problem and believe that it's unfortunate that application/x-www-form-urlencoded uses '+' for spaces instead of %20.

For this issue I've suggested clarifications in the replacement for RFC2368 <http://tools.ietf.org/html/draft-duerst-mailto-bis-03>, but it has expired and I have not seen anything come of that yet. Instead, I've been making my own <http://shadow2531.com/opera/testcases/mailto/modern_mailto_uri_scheme.html>, which addresses this in the encoding section and in the forms section. (The forms section needs more work, but it has a js example of what Firefox does for both GET and POST with action="mailto:", with an included fix for encoding spaces as %20.)

I don't think there's a Mozilla bug on this, but I'm pretty sure those in the HTML working group from Mozilla have seen the HTML5 posts above.

I did file an Opera bug, but forms in Opera with action="mailto:" are still partially broken, so what Opera does for #3 will have to wait until forms with action="mailto:" are fixed in general.

For method="post" and action="mailto:", it might not make sense, but for method="get" and action="mailto:", it makes sense for sure to generate spaces as %20 instead of +. (You can easily see how things break when spaces are generated as + instead of %20 like they are now.)

However, you are right, other browsers don't get it right either. So, I guess we must wait.
Comment 5 Martin Dürst 2008-01-03 22:56:42 PST
(In reply to comment #3)

> Since other browsers do not seem to encode spaces as suggested in (3), I have
> left this for later. The test case mentions efforts to get specs corrected -
> have those been successful? Also, is there a Firefox bug for (3)?

I have just published http://www.ietf.org/internet-drafts/draft-duerst-mailto-bis-04.txt, intended to obsolete RFC 2368. I made it clear that spaces in mailto URIs have to be escaped as %20, not as +.

Comment 6 Martin Dürst 2010-08-30 21:15:56 PDT
(In reply to comment #5)

> I have just published http://www.ietf.org/internet-drafts/draft-duerst-mailto-bis-04.txt, intended to obsolete RFC 2368. I made it clear that spaces in mailto URIs have to be escaped as %20, not as +.

The above link doesn't work anymore, sorry. Please see https://datatracker.ietf.org/doc/draft-duerst-mailto-bis/. This draft has been approved by the IESG, and is now being worked on by the RFC Editor.
Comment 7 Martin Dürst 2010-08-30 21:18:51 PDT
(In reply to comment #4)

> However, you are right, other browsers don't get it right either. So, I guess we must wait.

I don't understand why you want to wait for other browsers to fix something before you fix yourself. It may make sense when "being liberal in what you accept". But it doesn't make sense here, where the browser produces something, and it's other software (mailers) that accept it (or not).
Comment 8 Adam Barth 2010-08-30 21:21:38 PDT
It's unclear whether we want to support syntax like the following:

mailto:addr1@an.example,addr2@an.example

The situation requires further study.
Comment 9 Adam Barth 2010-08-30 21:35:52 PDT
Sorry, looks like bugzilla ate the example.
Comment 10 Michael A. Puls II 2010-08-31 03:28:10 PDT
(In reply to comment #7)
> (In reply to comment #4)
> 
> > However, you are right, other browsers don't get it right either. So, I guess we must wait.
> 
> I don't understand why you want to wait for other browsers to fix something before you fix yourself. It may make sense when "being liberal in what you accept". But it doesn't make sense here, where the browser produces something, and it's other software (mailers) that accept it (or not).

I was just being patient etc. What I really think is that we should forget about what other browsers do and just do what makes sense and what makes sense is encoding spaces as %20. This is supported by <https://datatracker.ietf.org/doc/draft-duerst-mailto-bis/>.

This is also supported by HTML5 <http://www.whatwg.org/specs/web-apps/current-work/multipage/association-of-controls-and-forms.html#submit-mailto-headers> for GET, PUT and DELETE, but not POST.

Hixie didn't spec this for a POST dataset, though, because the whole dataset gets percent-encoded and becomes the body value, so it's not needed. And if authors decide to use + in @action, it's their fault for not using %20 when they want a space and %2B when they want a '+', if things then don't come out right in the client (local or http).

See:
<http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2007-January/009210.html>
<http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-October/016858.html>
<http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-December/017667.html>

But one thing was determined for certain: regardless of how browsers generate action=mailto: form data, they must convert that data to the format understood by the client they're passing it to. For example, if Thunderbird doesn't treat '+' as spaces (which it doesn't, of course), then the browser should use %20 for spaces.
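
A rough illustration of the client side, assuming the client applies plain percent-decoding and nothing form-specific. Python's urllib.parse.unquote stands in for the client here; this is not Thunderbird's actual code, and the sample string is made up.

```python
from urllib.parse import unquote, unquote_plus

received = "subject=hello+world"

# Plain percent-decoding, as a generic URI consumer would do: '+' is left alone.
plain_decoded = unquote(received)
print(plain_decoded)  # subject=hello+world

# Form-style decoding, which turns '+' back into a space.
form_decoded = unquote_plus(received)
print(form_decoded)   # subject=hello world
```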

What this means is that you can fix things with regex if you really wanted to with these rules:

1. Replace all '+' in @action with '%2B' before you do anything else (raw spaces should already be converted to %20 by the resolver). (In a copy of the @action value that you'll be working with. Don't modify the attribute's value.)

2. For method="not post", replace all '+' in the generated dataset with '%20'.

Then, everything's good to go and you don't have to modify the form dataset generation code to make an exception for 'mailto'. You can just do the "mailto:" stuff after the fact when @action contains a mailto URI.
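
The two rules above could be sketched roughly as follows. This is only an illustration in Python, not WebKit code; the function name fix_mailto_parts and its parameters are hypothetical, and it assumes the dataset was generated as application/x-www-form-urlencoded, so that a literal '+' in a field value is already '%2B' and any remaining '+' stands for a space.

```python
from typing import Tuple


def fix_mailto_parts(action: str, dataset: str, method: str) -> Tuple[str, str]:
    """Hypothetical helper: apply the two replacement rules after the fact."""
    # Rule 1: work on a copy of the action value; a literal '+' becomes '%2B'.
    # (Python strings are immutable, so the caller's value is never modified.)
    action_copy = action.replace("+", "%2B")

    # Rule 2: for any method other than POST, '+' in the generated dataset
    # represents a space, so rewrite it as '%20'.
    if method.lower() != "post":
        dataset = dataset.replace("+", "%20")

    return action_copy, dataset


fixed_action, fixed_data = fix_mailto_parts(
    "mailto:a+b@an.example",
    "subject=hello+world&cc=test%40site.com",
    "get",
)
print(fixed_action)  # mailto:a%2Bb@an.example
print(fixed_data)    # subject=hello%20world&cc=test%40site.com
```

How the fixed action and dataset are then joined is left to the existing submission code; the sketch only covers the replacements.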

With that said, I see no reason to wait on anything to fix this.
Comment 11 Michael A. Puls II 2010-08-31 03:30:31 PDT
> regex

Would be overkill. I meant a simple replacement function.
Comment 12 Adam Barth 2010-08-31 11:21:28 PDT
> What I really think is that we should forget about what other browsers do and just do what makes sense and what makes sense is encoding spaces as %20. This is supported by <https://datatracker.ietf.org/doc/draft-duerst-mailto-bis/>.

As I said above, I haven't studied this issue in detail.  However, in general, ignoring the behavior of other browsers is unwise.
Comment 13 Michael A. Puls II 2010-09-02 06:49:21 PDT
(In reply to comment #12)
> > What I really think is that we should forget about what other browsers do and just do what makes sense and what makes sense is encoding spaces as %20. This is supported by <https://datatracker.ietf.org/doc/draft-duerst-mailto-bis/>.
> 
> As I said above, I haven't studied this issue in detail.  However, in general, ignoring the behavior of other browsers is unwise.

Understood.

To clarify though:

As far as method!="post" and method="post" with enctype="application/x-www-form-urlencoded" (the default enctype) go, <http://shadow2531.com/opera/testcases/mailto/mailtoform.js> is what HTML5 is proposing.

Since browsers are broken and non-interoperable with each other, this feature currently can't be used or relied upon on the web very much. So, HTML5 specs a common ground of what browsers do and what makes sense for interoperability etc. 

There's only one line in there that's not part of HTML5, and it's my suggestion. But, ignoring my suggestion, implementing the rest would be like implementing any HTML5 feature: implement it and give feedback to the group.
Comment 14 Martin Dürst 2010-10-04 20:58:35 PDT
The new version of the mailto: URI/IRI scheme spec is out now, RFC 6068. Please see http://tools.ietf.org/html/rfc6068.