RESOLVED WONTFIX Bug 44641
Implement Base64 HTML entities
https://bugs.webkit.org/show_bug.cgi?id=44641
Adam Barth
Reported 2010-08-25 15:41:29 PDT
Implement Base64 HTML entities
Attachments
Work in progress (14.54 KB, patch)
2010-08-25 15:42 PDT, Adam Barth
no flags
Patch (23.67 KB, patch)
2010-08-25 16:23 PDT, Adam Barth
no flags
Patch (23.97 KB, patch)
2010-08-25 16:40 PDT, Adam Barth
no flags
Patch (12.78 KB, patch)
2010-09-06 03:20 PDT, Adam Barth
abarth: review-
Adam Barth
Comment 1 2010-08-25 15:42:49 PDT
Created attachment 65484: Work in progress
Adam Barth
Comment 2 2010-08-25 16:23:15 PDT
Eric Seidel (no email)
Comment 3 2010-08-25 16:33:03 PDT
Alexey Proskuryakov
Comment 4 2010-08-25 16:33:39 PDT
I got curious, and found this explanation: <http://www.mail-archive.com/whatwg@lists.whatwg.org/msg23193.html>. The idea seems to be that it will be slightly easier to use this new mechanism to escape untrusted content (but one would still have to remember to escape, and forgetting to do that is the most common issue AFAIK). An obvious downside is that inserted untrusted content will be unreadable by humans.
Adam Barth
Comment 5 2010-08-25 16:40:38 PDT
Adam Barth
Comment 6 2010-08-25 16:46:33 PDT
Yep. This is not a part of HTML5 (yet). The goal is to make it easier for folks to add untrusted content to their document while avoiding cross-site scripting. Here's a design document that shows some of the thinking that led to this design:

https://docs.google.com/document/edit?id=1Uye7FCE7sIouru_9ayiyYRDP_ibjY6ZcOeImWH1pFrE&hl=en&authkey=CLO4uYIN

The design in this patch is simpler than some of the other ideas in that document.
Adam Barth
Comment 7 2010-08-25 16:55:36 PDT
Here's the summary from the email if you don't want to click through.

== Summary ==

HTML should support Base64-encoded entities to make it easier for authors to include untrusted content in their documents without risking XSS. For example,

&%SFRNTDUncyA8Y2FudmFzPiBlbGVtZW50IGlzIGF3ZXNvbWUuCg==;

would decode to "HTML5's <canvas> element is awesome." Notice that the < and > characters get emitted by the parser as character tokens. That means they can't be used by an attacker for XSS. These entities can be used safely both in intertag content as well as in attribute values.
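To make the decoding step concrete, here is a minimal Python sketch of what a consumer of the proposed `&%...;` syntax might do. It is not the code from the attached patch: the regular expression, the function name `decode_base64_entity`, and the choice of UTF-8 for the payload (UTF-8 is only named later in this thread) are all assumptions made for illustration.

```python
import base64
import re

# Hypothetical pattern for the proposed "&%<base64>;" syntax (an assumption,
# not taken from the patch).
ENTITY_RE = re.compile(r"&%([A-Za-z0-9+/]*={0,2});")


def decode_base64_entity(entity: str) -> str:
    """Return the text carried by a single Base64 entity."""
    match = ENTITY_RE.fullmatch(entity)
    if match is None:
        raise ValueError("not a Base64 entity")
    # validate=True rejects characters outside the Base64 alphabet.
    raw = base64.b64decode(match.group(1), validate=True)
    # Assumption: the payload is UTF-8.
    return raw.decode("utf-8")


text = decode_base64_entity(
    "&%SFRNTDUncyA8Y2FudmFzPiBlbGVtZW50IGlzIGF3ZXNvbWUuCg==;"
)
print(repr(text))
```

The decoded value contains `<` and `>`, but a parser following the proposal would hand them on as plain character data rather than re-tokenizing them as markup. The sample payload also carries a trailing newline.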
Adam Barth
Comment 8 2010-09-06 03:20:40 PDT
Oliver Hunt
Comment 9 2010-09-06 11:30:07 PDT
(In reply to comment #7)
> Here's the summary from the email if you don't want to click through.
>
> == Summary ==
>
> HTML should support Base64-encoded entities to make it easier for
> authors to include untrusted content in their documents without
> risking XSS. For example,
>
> &%SFRNTDUncyA8Y2FudmFzPiBlbGVtZW50IGlzIGF3ZXNvbWUuCg==;
>
> would decode to "HTML5's <canvas> element is awesome." Notice that
> the < and > characters get emitted by the parser as character tokens.
> That means they can't be used by an attacker for XSS. These entities
> can be used safely both in intertag content as well as in attribute
> values.

What use cases does this solve that aren't already solved by innerText and/or innerStaticHTML?
Adam Barth
Comment 10 2010-09-06 11:37:23 PDT
> What use cases does this solve that aren't already solved by innerText and/or innerStaticHTML ?

They solve different problems. innerText/innerStaticHTML let you modify a DOM node safely, whereas base64 entities give you a safe way of transmitting untrusted data from the server to the client.

Put another way, if you want to use innerText/innerStaticHTML, you still need a safe way of getting the untrusted content you want to assign to those properties from the server to the client. That's the problem that base64 entities solve.
Oliver Hunt
Comment 11 2010-09-06 11:40:55 PDT
(In reply to comment #10)
> > What use cases does this solve that aren't already solved by innerText and/or innerStaticHTML ?
>
> They solve different problems. innerText/innerStaticHTML let you modify a DOM node safely, whereas base64 entities give you a safe way of transmitting untrusted data from the server to the client.
>
> Put another way, if you want to use innerText/innerStaticHTML, you still need a safe way of getting the untrusted content you want to assign to those properties from the server to the client. That's the problem that base64 entities solve.

There's already base64 decode support in JS (through atob).

Also what encoding is used in base64? Based on atob/btoa behaviour base64 doesn't support multibyte characters, so this needs to be specified.
Adam Barth
Comment 12 2010-09-06 12:16:46 PDT
> there's already base64 decode support in JS (through atob)

Imagine a PHP script that wants to send an untrusted string to the client at a particular point in the output stream. They can't do the following:

<?php echo "<script>document.write(atob('".base64_encode($untrusted_string)."'));</script>" ?>

because that's XSS: the decoded string is written into the document as markup. However, they can do:

<?php echo "&%'".base64_encode($untrusted_string)."';" ?>

That's safe.

> Also what encoding is used in base64?

UTF8.

> based on atob/btoa behaviour base64 doesn't support multibyte characters, so this needs to be specified.

The btoa behavior is really nutty and also needs to be specified. :)
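The multibyte concern can be illustrated outside the browser. The Python sketch below decodes the same Base64 payload two ways: one code point per byte, which is how atob-style "binary strings" behave in current browsers, and as UTF-8, which is the answer given above. The sample string and variable names are assumptions chosen for illustration.

```python
import base64

# Encode a string containing a non-ASCII character as UTF-8, then Base64.
payload = base64.b64encode("café".encode("utf-8")).decode("ascii")
raw = base64.b64decode(payload)

# One code point per byte, analogous to an atob-style binary string.
as_binary_string = raw.decode("latin-1")

# The interpretation proposed in this thread for Base64 entities.
as_utf8 = raw.decode("utf-8")

print(repr(as_binary_string), repr(as_utf8))
```

The two results differ in both content and length, which is why the payload encoding has to be stated explicitly rather than inherited from atob/btoa.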
Adam Barth
Comment 13 2010-09-06 12:18:08 PDT
Rather: <?php echo "&%".base64_encode($untrusted_string).";" ?> (removed extra ' characters that snuck in).
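For comparison, a rough Python analogue of the corrected PHP one-liner might look like the sketch below. The helper name `to_base64_entity` and the sample attack string are hypothetical. The point it tries to show is that the emitted entity only ever contains `&%`, characters from the Base64 alphabet, and `;`, whatever the untrusted input was.

```python
import base64
import re


def to_base64_entity(untrusted: str) -> str:
    """Wrap untrusted text in the proposed "&%...;" entity syntax."""
    # Assumption: the payload is encoded as UTF-8 before Base64 encoding.
    encoded = base64.b64encode(untrusted.encode("utf-8")).decode("ascii")
    return "&%" + encoded + ";"


entity = to_base64_entity('"><script>alert(1)</script>')

# Nothing from the input survives as a literal quote or angle bracket.
only_safe_chars = re.fullmatch(r"&%[A-Za-z0-9+/]*={0,2};", entity) is not None
print(entity, only_safe_chars)
```

Because the Base64 alphabet has no quotes, angle brackets, or ampersands, the server does not need context-specific escaping for the payload itself.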
Alexey Proskuryakov
Comment 14 2010-11-15 10:19:39 PST
> > Also what encoding is used in base64?
> UTF8.

I think that this needs to explicitly mention what happens to bad UTF-8 (unpaired surrogates, misplaced BOMs, overlong sequences etc). With tests!
Adam Barth
Comment 15 2010-11-15 13:02:52 PST
(In reply to comment #14)
> > > Also what encoding is used in base64?
> > UTF8.
>
> I think that this needs to explicitly mention what happens to bad UTF-8 (unpaired surrogates, misplaced BOMs, overlong sequences etc). With tests!

Sure. It should probably do the same thing as

http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#preprocessing-the-input-stream

without the CR/LF magic (and possibly without the null byte magic).
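One way to picture "bad UTF-8" handling is lenient decoding that substitutes U+FFFD for malformed sequences. The Python sketch below uses the standard `errors="replace"` handler as a stand-in. It is only an approximation of the idea: the exact number of replacement characters an HTML parser would emit is not something this sketch claims to match, and the helper name `decode_payload` is hypothetical.

```python
import base64


def decode_payload(b64: str) -> str:
    """Decode a Base64 payload, replacing malformed UTF-8 with U+FFFD."""
    raw = base64.b64decode(b64, validate=True)
    return raw.decode("utf-8", errors="replace")


# An overlong encoding of "/" and a UTF-8-encoded surrogate (U+D800).
overlong_slash = base64.b64encode(b"\xc0\xaf").decode("ascii")
encoded_surrogate = base64.b64encode(b"\xed\xa0\x80").decode("ascii")

print(repr(decode_payload(overlong_slash)))
print(repr(decode_payload(encoded_surrogate)))
```

Neither malformed input decodes to the character it tries to smuggle in. Cases like these would be natural candidates for the tests requested in comment 14.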
Adam Barth
Comment 16 2010-12-21 01:17:35 PST
Comment on attachment 66617: Patch

No love for this patch, apparently.
Adam Barth
Comment 17 2011-05-23 10:16:34 PDT
This idea never got enough traction.