Bug 43579 - js-test-pre.js's escapeHTML should escape null characters so we can see them
Summary: js-test-pre.js's escapeHTML should escape null characters so we can see them
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Other OS X 10.5
: P2 Normal
Assignee: Adam Barth
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-08-05 14:38 PDT by Adam Barth
Modified: 2010-08-09 09:08 PDT (History)
3 users (show)

See Also:


Attachments
Patch (9.39 KB, patch)
2010-08-05 14:39 PDT, Adam Barth
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Adam Barth 2010-08-05 14:38:09 PDT
js-test-pre.js's escapeHTML should escape null characters so we can see them
Comment 1 Adam Barth 2010-08-05 14:39:04 PDT
Created attachment 63637 [details]
Patch
Comment 2 Darin Adler 2010-08-05 14:46:44 PDT
Comment on attachment 63637 [details]
Patch

> -    return text.replace(/&/g, "&amp;").replace(/</g, "&lt;");
> +    return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/\0/g, "\\0");

I was thinking we should chose a form that can be parsed such as &#0; but perhaps there is no such thing.

r=me
Comment 3 Adam Barth 2010-08-05 15:06:53 PDT
> I was thinking we should chose a form that can be parsed such as &#0; but perhaps there is no such thing.

The problem is that will get decoded by the HTML parser and turned into a null character again.  :)
Comment 4 Adam Barth 2010-08-05 16:52:12 PDT
Comment on attachment 63637 [details]
Patch

Clearing flags on attachment: 63637

Committed r64796: <http://trac.webkit.org/changeset/64796>
Comment 5 Adam Barth 2010-08-05 16:52:17 PDT
All reviewed patches have been landed.  Closing bug.
Comment 6 Darin Adler 2010-08-05 19:25:03 PDT
(In reply to comment #3)
> The problem is that will get decoded by the HTML parser and turned into a null character again.  :)

Are you sure? I don’t think it will.
Comment 7 Adam Barth 2010-08-05 19:40:53 PDT
> Are you sure? I don’t think it will.

http://trac.webkit.org/browser/trunk/WebCore/html/HTMLEntityParser.cpp#L66

Looks like the entity parser itself catches null and replaces it with U+FFFD.
Comment 8 Darin Adler 2010-08-05 19:43:25 PDT
The only downside with turning a null character into "\0" is that the characters "\0" also display as "\0". I guess I can’t think of any better thing to do, though, so I’m not sure what my point is!
Comment 9 Adam Barth 2010-08-05 19:47:24 PDT
> The only downside with turning a null character into "\0" is that the characters "\0" also display as "\0".

That seems like a bug too, though.  Isn't the output of these tests meant to look like a JavaScript string?
Comment 10 Darin Adler 2010-08-08 21:57:31 PDT
(In reply to comment #9)
> That seems like a bug too, though.  Isn't the output of these tests meant to look like a JavaScript string?

On reflection, I erred in my original suggestion.

The job of escapeHTML is to take a string and put it in a format where if it’s parsed as HTML you get the original string back. And it can almost do that but not quite, because of the special rule about null characters that makes it impossible in the general case. So I was wrong to suggest adding special handling of null characters to that function without renaming it. Tests should not need to serialize and then re-parse; fixing that would eliminate the need for escapeHTML entirely.

Escaping for human readability is another type of function. I don't think the test machinery tries to do that at all, but it could. If there was a function that tried to do that, it should handle more control characters than just the null character, and possibly many other non-ASCII characters as well.

That’s all from memory without studying js-test-pre.js further. I might have other ideas if I read it over.
Comment 11 Adam Barth 2010-08-08 23:42:41 PDT
Would you like me to revert this patch while we consider further?
Comment 12 Eric Seidel (no email) 2010-08-09 08:24:42 PDT
IIRC, ggaren is the (or at least one of the) original author(s) of js-test-pre.js
Comment 13 Darin Adler 2010-08-09 09:08:10 PDT
There’s no need to revert the change. But we can come up with something better for the future.

For this discussion it doesn’t matter who the original author of js-test-pre.js is. I’m pretty sure it was Maciej, though.