Summary: | contentEditable inserts non standard spaces | |
---|---|---|---
Product: | WebKit | Reporter: | Peer Bremer <peer>
Component: | HTML Editing | Assignee: | Nobody <webkit-unassigned>
Status: | RESOLVED INVALID | |
Severity: | Normal | CC: | ap, justin.garcia, mrowe, peer
Priority: | P2 | |
Version: | 523.x (Safari 3) | |
Hardware: | Mac | |
OS: | OS X 10.4 | |
URL: | http://www.smileCMS.com/webkit/strange_spaces.txt | |
Description
Peer Bremer
2007-08-02 04:50:10 PDT
These are non-breaking spaces, and they are often necessary when editing HTML. Since some code is broken by this, it's quite possible that this is not one of these cases. Could you please provide detailed steps to reproduce this issue on a live site (or alternatively an interactive test case)? It's unlikely that we'll be able to proceed with just the information in the bug description.
(In reply to comment #1)
> These are non-breaking spaces, and they are often necessary when editing HTML.
That is not the issue. If a correct "&nbsp;" were inserted, that would be perfectly acceptable; the browser, however, inserts non-ASCII characters which are not standard HTML.
Created attachment 15814
just an editable div
This is what Safari was always doing, so the difference between 3.0.2 and 3.0.3 betas must be in something else. You can use the attached test to verify what is inserted in an editable div.
The character that is inserted is a non-breaking space, just as Alexey mentioned. In the case of the URL you provided, the character on the second line between the quotation marks is a UTF-8 encoded non-breaking space. If you change Safari's encoding via View -> Text Encodings, you will notice that it renders as expected. It appears as an unexpected A-like character as it is being interpreted according to the default encoding of the browser, which is used when the server does not specify which encoding *should* be used.
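The encoding mix-up described above can be reproduced in a few lines. This is only a sketch: it assumes the browser's default encoding is Latin-1 (the report does not say which default was in effect), and the variable names are made up for illustration.

```python
import unicodedata

NBSP = "\u00a0"  # NO-BREAK SPACE, the character the editable div inserts

# The server sends the character as UTF-8, which takes two bytes.
utf8_bytes = NBSP.encode("utf-8")

# Assumption: with no declared encoding, the bytes are read as Latin-1,
# so each byte becomes a separate character.
misread = utf8_bytes.decode("latin-1")

names = [unicodedata.name(ch) for ch in misread]
print(utf8_bytes)
print(names)
```

Under that assumption the first byte turns into a visible A with a circumflex, which would match the "A-like character" in the report, and the second byte is itself a non-breaking space.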
In Alexey's example, entering "A  B" (that is, A followed by two spaces then a B) is displayed in the alert as "A  B (A%20%A0B)". Plugging this into a short code snippet to display the names of the characters that compose that string reveals:
>>> import unicodedata
>>> string = "A\x20\xA0B"
>>> map(unicodedata.name, string.decode('latin1'))
['LATIN CAPITAL LETTER A', 'SPACE', 'NO-BREAK SPACE', 'LATIN CAPITAL LETTER B']
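The session above is Python 2. For readers on Python 3, a rough equivalent is sketched below. It assumes each %XX escape in the alert stands for a single Latin-1 code point, as the `latin1` decoding in the original snippet does; the helper name `describe` is hypothetical.

```python
import unicodedata
from urllib.parse import unquote


def describe(escaped):
    """Return the Unicode names of the characters in a %XX-escaped string."""
    # Assumption: every %XX escape is one Latin-1 code point.
    text = unquote(escaped, encoding="latin-1")
    return [unicodedata.name(ch) for ch in text]


print(describe("A%20%A0B"))
```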
I think this bug should probably be closed as everything appears to be working as expected.
Thank you for the explanation. I might be a bit dumb, but I do not think it is a good idea for the browser to use non-standard characters when generating HTML in a contentEditable div. For the sake of compatibility, shouldn't the browser produce clean, standard HTML code? There are already a lot of Apple Style tags, colors specified as rgb values, and other silly HTML like breaks (<br>) wrapped inside DIV tags. But at least those are visible text characters and HTML. Anyhow, I accept that this is not a bug. Also, these invisible chars mess up the page structure, causing tables with width specifications to be wider than specified since the lines do not break.
Have looked at the test again and you are right about what happens if you use two spaces, but I think it would be much better to use &nbsp; in these cases and not some invisible non-HTML character.
The invisible character *is* a &nbsp;. I'm not sure why you keep saying that it is non-standard. Take a look at <http://www.w3.org/TR/REC-html40/sgml/entities.html>:
<!ENTITY nbsp CDATA "&#160;" -- no-break space = non-breaking space, U+00A0 ISOnum -->
Notice U+00A0? That's the character code you see in Alexey's demo. It is the *same* character. If you are seeing it rendered visibly as anything but a non-breaking space (e.g., the A-like character) then you have a character encoding issue, most likely in your web server or application configuration.
> these invisible chars mess up the page structure causing tables with width specifications to be wider than specified since the lines do not break
If you have a specific case where this is occurring, please provide an example. WebKit typically inserts alternating pairs of space and non-breaking space when editing, which is sufficient to be visually consistent with what the user has typed while also allowing wrapping to occur.
This report doesn't give steps to reproduce a bug or demonstrate an incompatibility with a web site; closing.
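To illustrate why alternating space and non-breaking space keeps both the visible width and the ability to wrap, here is a minimal sketch. It is not WebKit's code; the function name `alternate_spaces` and the exact rule (space first, then non-breaking space, within each run of two or more spaces) are assumptions based only on the "A%20%A0B" output quoted earlier.

```python
import re

NBSP = "\u00a0"  # NO-BREAK SPACE


def alternate_spaces(text):
    """Rewrite runs of 2+ spaces so no two ordinary spaces are adjacent."""

    def fix_run(match):
        length = len(match.group(0))
        # Even positions stay ordinary spaces (wrap points),
        # odd positions become non-breaking spaces (not collapsed by HTML).
        return "".join(" " if i % 2 == 0 else NBSP for i in range(length))

    return re.sub(r" {2,}", fix_run, text)


print(repr(alternate_spaces("A  B")))
```

Because every ordinary space is separated from the next by a non-breaking one, HTML whitespace collapsing has nothing to collapse, while each ordinary space still offers a line-break opportunity.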
Note that we track a request to serialize non-breaking spaces as &nbsp; in bug 11947. I'm not making this a duplicate, because this report also mentions some difference between 3.0.2 and 3.0.3 betas that we couldn't pin down.
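For anyone who needs the entity form in saved markup today, the conversion can be done after reading the text out of the editable element. The sketch below assumes the entity wanted is &nbsp; and uses a hypothetical helper name, `serialize_nbsp`; it shows the idea only and is not how WebKit serializes.

```python
import html

NBSP = "\u00a0"  # NO-BREAK SPACE


def serialize_nbsp(text):
    """Escape text for HTML and write U+00A0 as the &nbsp; entity."""
    # Escape &, < and > first so the ampersand of the entity added
    # afterwards is not escaped a second time.
    escaped = html.escape(text, quote=False)
    return escaped.replace(NBSP, "&nbsp;")


serialized = serialize_nbsp("A \u00a0B")
print(serialized)

# Parsing the entity gives back the same character that was typed.
assert html.unescape(serialized) == "A \u00a0B"
```

The round trip at the end reflects the point made in the thread: the entity and the raw U+00A0 character are the same character, written two different ways.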