WebKit Bugzilla
RESOLVED WONTFIX
Bug 44641
Implement Base64 HTML entities
https://bugs.webkit.org/show_bug.cgi?id=44641
Adam Barth
Reported
2010-08-25 15:41:29 PDT
Implement Base64 HTML entities
Attachments
Work in progress (14.54 KB, patch), 2010-08-25 15:42 PDT, Adam Barth, no flags
Patch (23.67 KB, patch), 2010-08-25 16:23 PDT, Adam Barth, no flags
Patch (23.97 KB, patch), 2010-08-25 16:40 PDT, Adam Barth, no flags
Patch (12.78 KB, patch), 2010-09-06 03:20 PDT, Adam Barth, abarth: review-
Adam Barth
Comment 1
2010-08-25 15:42:49 PDT
Created attachment 65484: Work in progress
Adam Barth
Comment 2
2010-08-25 16:23:15 PDT
Created attachment 65495: Patch
Eric Seidel (no email)
Comment 3
2010-08-25 16:33:03 PDT
Attachment 65495 did not build on mac. Build output: http://queues.webkit.org/results/3755634
Alexey Proskuryakov
Comment 4
2010-08-25 16:33:39 PDT
I got curious, and found this explanation: <http://www.mail-archive.com/whatwg@lists.whatwg.org/msg23193.html>. The idea seems to be that it will be slightly easier to use this new mechanism to escape untrusted content (but one would still have to remember to escape, and forgetting to do that is the most common issue AFAIK). An obvious downside is that inserted untrusted content will be unreadable by humans.
Adam Barth
Comment 5
2010-08-25 16:40:38 PDT
Created attachment 65499: Patch
Adam Barth
Comment 6
2010-08-25 16:46:33 PDT
Yep. This is not a part of HTML5 (yet). The goal is to make it easier for folks to add untrusted content to their document while avoiding cross-site scripting. Here's a design document that shows some of the thinking that led to this design:
https://docs.google.com/document/edit?id=1Uye7FCE7sIouru_9ayiyYRDP_ibjY6ZcOeImWH1pFrE&hl=en&authkey=CLO4uYIN
The design in this patch is simpler than some of the other ideas in that document.
Adam Barth
Comment 7
2010-08-25 16:55:36 PDT
Here's the summary from the email if you don't want to click through.

== Summary ==

HTML should support Base64-encoded entities to make it easier for authors to include untrusted content in their documents without risking XSS. For example,

&%SFRNTDUncyA8Y2FudmFzPiBlbGVtZW50IGlzIGF3ZXNvbWUuCg==;

would decode to "HTML5's <canvas> element is awesome." Notice that the < and > characters get emitted by the parser as character tokens. That means they can't be used by an attacker for XSS. These entities can be used safely both in intertag content as well as in attribute values.
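As an illustration only, the decoding step described in this summary can be approximated at the string level in JavaScript. This is a rough sketch, not the patch's tokenizer code: the function name decodeBase64Entities, the regular expression, and the choice to leave malformed payloads untouched are all assumptions. Only the "&%" prefix, the ";" terminator, and the sample payload come from the summary. A real implementation would live in the HTML tokenizer and emit character tokens rather than rewrite a string.

```javascript
// Hypothetical helper (not from the patch): expands &%...; entities to text.
// Assumes the payload is standard Base64 over UTF-8 bytes.
function decodeBase64Entities(input) {
  const utf8 = new TextDecoder("utf-8");
  return input.replace(/&%([A-Za-z0-9+\/]+={0,2});/g, (match, payload) => {
    let binary;
    try {
      binary = atob(payload); // yields one character per decoded byte
    } catch (e) {
      return match; // assumption: malformed payloads are left as-is
    }
    const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
    return utf8.decode(bytes);
  });
}

const sample = "&%SFRNTDUncyA8Y2FudmFzPiBlbGVtZW50IGlzIGF3ZXNvbWUuCg==;";
const text = decodeBase64Entities(sample);
// The result is plain text; note that the sample payload also encodes a
// trailing newline after the final period.
console.log(JSON.stringify(text));
```

Because the result is treated as text, assigning it to something like textContent would display the angle brackets literally instead of creating a canvas element.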
Adam Barth
Comment 8
2010-09-06 03:20:40 PDT
Created attachment 66617: Patch
Oliver Hunt
Comment 9
2010-09-06 11:30:07 PDT
(In reply to comment #7)
> Here's the summary from the email if you don't want to click through.
>
> == Summary ==
>
> HTML should support Base64-encoded entities to make it easier for
> authors to include untrusted content in their documents without
> risking XSS. For example,
>
> &%SFRNTDUncyA8Y2FudmFzPiBlbGVtZW50IGlzIGF3ZXNvbWUuCg==;
>
> would decode to "HTML5's <canvas> element is awesome." Notice that
> the < and > characters get emitted by the parser as character tokens.
> That means they can't be used by an attacker for XSS. These entities
> can be used safely both in intertag content as well as in attribute
> values.
What use cases does this solve that aren't already solved by innerText and/or innerStaticHTML ?
Adam Barth
Comment 10
2010-09-06 11:37:23 PDT
> What use cases does this solve that aren't already solved by innerText and/or innerStaticHTML ?
They solve different problems. innerText/innerStaticHTML let you modify a DOM node safely, whereas base64 entities give you a safe way of transmitting untrusted data from the server to the client.
Put another way, if you want to use innerText/innerStaticHTML, you still need a safe way of getting the untrusted content you want to assign to those properties from the server to the client. That's the problem that base64 entities solve.
Oliver Hunt
Comment 11
2010-09-06 11:40:55 PDT
(In reply to comment #10)
> > What use cases does this solve that aren't already solved by innerText and/or innerStaticHTML ?
>
> They solve different problems. innerText/innerStaticHTML let you modify a DOM node safely, whereas base64 entities give you a safe way of transmitting untrusted data from the server to the client.
>
> Put another way, if you want to use innerText/innerStaticHTML, you still need a safe way of getting the untrusted content you want to assign to those properties from the server to the client. That's the problem that base64 entities solve.
There's already base64 decode support in JS (through btoa).
Also, what encoding is used in base64? Based on atob/btoa behaviour, base64 doesn't support multibyte characters, so this needs to be specified.
Adam Barth
Comment 12
2010-09-06 12:16:46 PDT
> there's already base64 decode support in JS (through btoa)
Imagine a PHP script that wants to send an untrusted string to the client at a particular point in the output stream. They can't do the following:
<?php echo "<script>document.write(btoa('".base64_encode($untrusted_string)."'));</script>" ?>
because that's XSS. However, they can do:
<?php echo "&%'".base64_encode($untrusted_string)."';" ?>
That's safe.
> Also what encoding is used in base64?
UTF8.
> based on atob/btoa behaviour base64 doesn't support multibyte characters, so this needs to be specified.
The btoa behavior is really nutty and also needs to be specified. :)
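For reference, btoa is the encoding direction and atob the decoding direction, and both work on "binary strings" in which every character stands for one byte. The sketch below shows the multibyte limitation raised above and one common workaround: go through UTF-8 bytes first. The helper names utf8ToBase64 and base64ToUtf8 are hypothetical, and this illustrates today's atob/btoa and TextEncoder/TextDecoder behaviour rather than anything the patch specifies.

```javascript
// btoa only accepts code units up to 0xFF, so arbitrary text must first be
// turned into UTF-8 bytes. Helper names here are illustrative assumptions.
function utf8ToBase64(text) {
  const bytes = new TextEncoder().encode(text); // text -> UTF-8 bytes
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

function base64ToUtf8(b64) {
  const bytes = Uint8Array.from(atob(b64), (ch) => ch.charCodeAt(0));
  return new TextDecoder("utf-8").decode(bytes); // UTF-8 bytes -> text
}

// Calling btoa directly on a character outside Latin-1 throws.
let btoaThrew = false;
try {
  btoa("\u2603"); // U+2603 SNOWMAN
} catch (e) {
  btoaThrew = true;
}

const original = "caf\u00e9 \u2603";
const roundTrip = base64ToUtf8(utf8ToBase64(original));
console.log(btoaThrew, roundTrip === original);
```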
Adam Barth
Comment 13
2010-09-06 12:18:08 PDT
Rather: <?php echo "&%".base64_encode($untrusted_string).";" ?> (removed extra ' characters that snuck in).
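The same server-side step can be sketched in JavaScript for a Node.js server. This is a minimal sketch under stated assumptions: encodeBase64Entity is a hypothetical name, Buffer is Node-specific, and the "&%" and ";" delimiters are taken from the corrected PHP line above. The point it illustrates is that the Base64 alphabet contains no markup-significant characters, so the emitted entity cannot close a tag or an attribute value.

```javascript
// Node.js sketch mirroring the corrected PHP one-liner.
// encodeBase64Entity is a hypothetical helper name.
function encodeBase64Entity(untrusted) {
  // Encode the string's UTF-8 bytes as standard Base64.
  const payload = Buffer.from(untrusted, "utf8").toString("base64");
  return "&%" + payload + ";";
}

const untrusted = "<script>alert(1)</script>";
const entity = encodeBase64Entity(untrusted);

// The payload uses only A-Z, a-z, 0-9, "+", "/" and "=", so none of
// < > " ' can appear between "&%" and ";".
const onlySafeChars = /^&%[A-Za-z0-9+\/=]*;$/.test(entity);
console.log(entity, onlySafeChars);
```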
Alexey Proskuryakov
Comment 14
2010-11-15 10:19:39 PST
> > Also what encoding is used in base64?
> UTF8.
I think that this needs to explicitly mention what happens to bad UTF-8 (unpaired surrogates, misplaced BOMs, overlong sequences etc). With tests!
Adam Barth
Comment 15
2010-11-15 13:02:52 PST
(In reply to comment #14)
> > > Also what encoding is used in base64?
> > UTF8.
>
> I think that this needs to explicitly mention what happens to bad UTF-8 (unpaired surrogates, misplaced BOMs, overlong sequences etc). With tests!
Sure. It should probably do the same thing as http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#preprocessing-the-input-stream without the CR/LF magic (and possibly without the null byte magic).
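The thread does not settle how malformed UTF-8 would be handled. As one possible reading, the sketch below shows how a present-day TextDecoder treats the cases listed in comment 14: in its default non-fatal mode it substitutes U+FFFD for invalid sequences and strips a leading BOM. This is an assumption about what "the same thing" could look like, not the behaviour of the patch, and decodeLenient is a hypothetical helper name.

```javascript
// Hypothetical helper: decode bytes leniently, replacing bad input with U+FFFD.
function decodeLenient(byteValues) {
  return new TextDecoder("utf-8").decode(new Uint8Array(byteValues));
}

// Overlong two-byte encoding of "/" (0x2F); must not decode to "/".
const overlongText = decodeLenient([0xc0, 0xaf]);

// UTF-8-style bytes for the unpaired surrogate U+D800; not valid UTF-8.
const surrogateText = decodeLenient([0xed, 0xa0, 0x80]);

// A leading BOM is dropped; a BOM later in the input is kept as U+FEFF.
const leadingBomText = decodeLenient([0xef, 0xbb, 0xbf, 0x41]);
const midBomText = decodeLenient([0x41, 0xef, 0xbb, 0xbf]);

// In fatal mode the same malformed input throws instead of being replaced.
let strictThrew = false;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(new Uint8Array([0xc0, 0xaf]));
} catch (e) {
  strictThrew = true;
}

console.log(overlongText.includes("/"), leadingBomText, strictThrew);
```

Whichever policy is chosen, the overlong case matters most for security: an entity decoder that accepted overlong forms could let a filtered character slip through under a different byte encoding.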
Adam Barth
Comment 16
2010-12-21 01:17:35 PST
Comment on attachment 66617: Patch
No love for this patch, apparently.
Adam Barth
Comment 17
2011-05-23 10:16:34 PDT
This idea never got enough traction.