Bug 11947 - nbsps should be converted to entities in innerHTML
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM
Version: 420+
Hardware: All OS X 10.4
Importance: P4 Trivial
Assignee: Alexey Proskuryakov
URL: http://www.fredck.com/bugs/safari/nbs...
Duplicates: 18769 20654
Depends on:
Reported: 2006-12-23 06:40 PST by webkit
Modified: 2013-01-08 12:43 PST

proposed fix (7.97 KB, patch)
2008-05-05 00:17 PDT, Alexey Proskuryakov
darin: review+

Description webkit 2006-12-23 06:40:01 PST
The nbsp entity is replaced by a space when retrieving an element's innerHTML.

It seems that, while entities are converted to their corresponding characters, nbsp represents a special case.

There is no standard for innerHTML, so this could be debated at length. The fact is that Safari behaves differently from IE and Firefox. In fact, Firefox has addressed this issue too:
Comment 1 Alexey Proskuryakov 2006-12-24 00:08:04 PST
Actually, innerHTML correctly produces U+00A0, which is NO-BREAK SPACE in Unicode.

I'm confirming a difference with Firefox, but lowering the priority/severity, because I don't see how this can cause problems. Feel free to raise it if this does cause issues.
Comment 2 David Kilzer (:ddkilzer) 2006-12-24 08:13:05 PST
Frederico, how does this behavior affect the FCKeditor?

Comment 3 webkit 2006-12-27 05:08:36 PST
This is not visible in the current version of FCKeditor. It affects new development I'm doing in FCKeditor for the Enter key handler, which will make it possible to control the behavior of the Enter key with more precision.

Somewhere in the code, I need to check whether part of the DOM (a custom range implementation) is truly empty. To do that, I need to check whether the innerHTML of that range contains only "pure" spaces. It is something like innerHTML.Trim().length == 0 (Trim() is a custom function).
Comment 4 webkit 2007-01-07 09:14:36 PST
Based on Alexey's comment ("innerHTML correctly produces U+00A0"), I've been able to change the code to make it work for FCKeditor's specific need. So this bug is no longer blocking FCKeditor, but it is still a bug.
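
The kind of emptiness check discussed in the comments above could be made independent of how a browser serializes a no-break space. The following is only a minimal sketch, not FCKeditor's actual code: the helper name isBlankMarkup and the BLANK pattern are hypothetical. It treats ASCII whitespace, the U+00A0 character, and the literal "&nbsp;" entity as blank, so it gives the same answer whether innerHTML returns the character or the entity.

```javascript
// Hypothetical helper for illustration; not taken from FCKeditor.
// Accepts only ASCII whitespace, U+00A0 (NO-BREAK SPACE), and the
// literal "&nbsp;" entity, repeated any number of times.
var BLANK = /^(?:[ \t\r\n\f\u00A0]|&nbsp;)*$/;

function isBlankMarkup(html) {
  // True when the serialized markup holds nothing but blank content.
  return BLANK.test(html);
}
```

Listing U+00A0 explicitly in the character class avoids depending on whether a given engine's \s matches it.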
Comment 5 Alexey Proskuryakov 2008-05-05 00:15:33 PDT
*** Bug 18769 has been marked as a duplicate of this bug. ***
Comment 6 Alexey Proskuryakov 2008-05-05 00:17:46 PDT
Created attachment 20967
proposed fix

I'm not sure why the escaping logic is repeated several times in markup.cpp; perhaps this file could use some refactoring. I'm not quite ready to do it now, though.
Comment 7 Darin Adler 2008-05-05 07:42:11 PDT
Comment on attachment 20967
proposed fix

Since there are three code paths in markup.cpp, we would need three tests to ensure we tested all three.

> \ No newline at end of file

Should fix that.
Comment 8 Alexey Proskuryakov 2008-05-05 11:33:43 PDT
Committed revision 32879.

Added a test case for attributes. I'm not quite sure, but it looks like the third code path is for copy/paste, and cannot be easily tested.
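
As a rough illustration of the behavior this bug asks for, the sketch below escapes U+00A0 as "&nbsp;" for both text content and attribute values. It is not the code in markup.cpp (which is C++ and is not reproduced in this report); the function name escapeForSerialization and its inAttribute flag are hypothetical, and the set of escaped characters is an assumption modeled on common HTML serialization rules.

```javascript
// Hypothetical sketch; not WebKit's implementation.
// Escapes a string for serialization as text content or as a
// double-quoted attribute value.
function escapeForSerialization(value, inAttribute) {
  // Ampersands go first so the entities added below are not re-escaped.
  var out = value.replace(/&/g, "&amp;").replace(/\u00A0/g, "&nbsp;");
  if (inAttribute) {
    // Attribute values additionally escape the quote delimiter.
    return out.replace(/"/g, "&quot;");
  }
  // Text content additionally escapes angle brackets.
  return out.replace(/</g, "&lt;").replace(/>/g, "&gt;");
}
```

Keeping the two modes in one function mirrors the review comment above: each code path that escapes output needs its own test, so sharing the logic reduces the number of places that can diverge.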
Comment 9 Alexey Proskuryakov 2008-09-08 03:38:36 PDT
*** Bug 20654 has been marked as a duplicate of this bug. ***
Comment 10 danya.postfactum 2013-01-08 12:43:49 PST
Why, why a browser should convert " " to   ? I'm trying to find any explanation, any specific nbsp behavior in specifications, but I can't. So, could you explain guys why do it? And why don't you convert another specific whitespaces, such as  " ", " ", " ", " ", " ", " ", " ", " ", " ", " " to entities?