Bug 9630

Summary: REGRESSION: many spaces typed in <textarea> are posted as non-breaking spaces
Product: WebKit Reporter: David Kilzer (:ddkilzer) <ddkilzer>
Component: FormsAssignee: Darin Adler <darin>
Status: RESOLVED FIXED    
Severity: Critical CC: adele, ap, darin, ian, jon, joost, justin.garcia, mitz
Priority: P1 Keywords: InRadar, Regression
Version: 420+   
Hardware: Mac   
OS: OS X 10.4   
URL: http://landfill.bugzilla.org/bugzilla-2.20-branch/
Attachments:
Description Flags
patch that almost works -- two known issues and no layout tests yet
none
a simple test file I've been using
none
patch that's closer to done, but still some known problems
none
still-newer patch (work in progress)
none
newer version of test file
none
newer patch -- almost there!
none
patch, including change log and layout tests (& all existing layout tests pass) justin.garcia: review+

David Kilzer (:ddkilzer)
Reported 2006-06-28 04:36:41 PDT
I've now noticed three times that Bugzilla has failed to create bug links for "Bug NNNN" text since the new textareas were activated, and it's driving me nuts!  This is more of a tracking bug since the real bug is probably in Bugzilla itself.  It'd be nice to figure out what's causing the problem in Bugzilla, though, and fix it. Bug 9579 Comment #0 Bug 9605 Comment #0 Bug 9611 Comment #2 Joost, can you query the Bugzilla database directly to find out exactly what the textareas are sending to the server? This will require a bug to be opened on http://bugzilla.mozilla.org/ for the fix.
Attachments
patch that almost works -- two known issues and no layout tests yet (50.50 KB, patch)
2006-07-16 12:24 PDT, Darin Adler
no flags
a simple test file I've been using (345 bytes, text/html)
2006-07-16 21:32 PDT, Darin Adler
no flags
patch that's closer to done, but still some known problems (61.44 KB, patch)
2006-07-18 17:26 PDT, Darin Adler
no flags
still-newer patch (work in progress) (65.50 KB, patch)
2006-07-19 08:36 PDT, Darin Adler
no flags
newer version of test file (431 bytes, text/html)
2006-07-19 08:36 PDT, Darin Adler
no flags
newer patch -- almost there! (70.48 KB, patch)
2006-07-21 17:53 PDT, Darin Adler
no flags
patch, including change log and layout tests (& all existing layout tests pass) (219.83 KB, patch)
2006-07-22 11:49 PDT, Darin Adler
justin.garcia: review+
Alexey Proskuryakov
Comment 1 2006-06-28 06:19:54 PDT
Would probably be useful to do a tcpdump capture here, but I couldn't reproduce the problem on landfill.bugzilla.org on purpose. Shouldn't this be P1 (regression), rather than P3/minor?
David Kilzer (:ddkilzer)
Comment 2 2006-06-28 08:07:42 PDT
(In reply to comment #1) > Would probably be useful to do a tcpdump capture here, but I couldn't reproduce > the problem on landfill.bugzilla.org on purpose. The thing about Bugzilla is that it adds the bug links when you view the page--it does NOT process the form input before storing it in the database.  That makes it "easy" to fix this bug in Bugzilla so that the bug links appear again, but it also means that the essense of the difference in what the native textareas are sending may be stored in the Bugzilla database. > Shouldn't this be P1 (regression), rather than P3/minor? Maybe.  I didn't want to jump the gun and blame WebKit right away for a change in behavior, although I guess that's the only possible explanation.
David Kilzer (:ddkilzer)
Comment 3 2006-06-28 08:09:34 PDT
BTW, all of my experience with this bug was done with Mac OS X 10.4.6 (8I127/PowerPC) with Safari 2.0.3 (417.9.3) and a local debug build of WebKit r150xx.
Alexey Proskuryakov
Comment 4 2006-06-28 08:34:36 PDT
(In reply to comment #2) > The thing about Bugzilla is that it adds the bug links when you view the > page--it does NOT process the form input before storing it in the database.  Oh, but I can see the problem even in the processed output! Some of the spaces end up as unbreakable (0xA0). Perhaps Bugzilla could support that one, too, but I wouldn't say that they have a bug. Upgrading to P1, but we need to find out when this happens.
David Kilzer (:ddkilzer)
Comment 5 2006-06-28 10:29:03 PDT
(In reply to comment #4) > Oh, but I can see the problem even in the processed output! Some of the spaces > end up as unbreakable (0xA0). Perhaps Bugzilla could support that one, too, but > I wouldn't say that they have a bug. Darin, how are you determining that the spaces are unbreakable? I was looking at view source, but they didn't jump out at me. I'll file a bug on http://bugzilla.mozilla.org/ about fixing this on the Bugzilla side unless someone beats me to it.
Darin Adler
Comment 6 2006-06-28 10:57:33 PDT
(In reply to comment #5) > Darin, how are you determining that the spaces are unbreakable? I was looking > at view source, but they didn't jump out at me. Alexey determined that -- see comment #4. > I'll file a bug on http://bugzilla.mozilla.org/ about fixing this on the > Bugzilla side unless someone beats me to it. Please don't. So far it looks like this is a WebKit bug, not a Bugzilla bug.
David Kilzer (:ddkilzer)
Comment 7 2006-06-28 11:28:15 PDT
(In reply to comment #5) > Darin, how are you determining that the spaces are unbreakable? I was looking > at view source, but they didn't jump out at me. Sorry, Alexey, I kept seeing Darin change the bug and thought he posted this comment. How did you spot the 0xA0 character? Using Ethereal or a similar network tracing tool?
Alexey Proskuryakov
Comment 8 2006-06-28 11:36:54 PDT
I just saved the page and looked at it in TextWrangler (with Show Invisibles on). Also, I've been getting these in some bugmail (pine displays them as question marks).
Justin Garcia
Comment 9 2006-06-28 14:23:27 PDT
We insert nbsps while editing.  We'd need to change nbsps to regular spaces when we serialize.  But, we wouldn't want to convert nbsps that were added intentionally by the user, so we'd need to wrap those spaces in a span with a special class on it so that we'd know which nbsps to convert and which to retain.   Alternatively, since white-space is pre-wrap in textareas, we could fix this by bailing in RebalanceWhitespace if we're in white-space: pre text.  Some might not like this because this would mean that edited whitespace would collapse if the white-space mode were toggled. 
Justin Garcia
Comment 10 2006-06-28 23:35:29 PDT
> Alternatively, since white-space is pre-wrap in textareas, we could fix this by > bailing in RebalanceWhitespace if we're in white-space: pre text.  Some might > not like this because this would mean that edited whitespace would collapse if > the white-space mode were toggled.  I think we should do this. It will create the problem mentioned above, but we want to fix this bug for textareas asap. What do we do when we create fragments? The functions that create fragments should be given a context (like Range::createContextualFragment) so that they'll know what to do with whitespace.
Alice Liu
Comment 11 2006-07-05 10:02:21 PDT
Darin Adler
Comment 12 2006-07-16 04:44:24 PDT
I've done most of what Justin recommends here. Just working to get the patch to 100%.
Darin Adler
Comment 13 2006-07-16 12:24:10 PDT
Created attachment 9502 [details] patch that almost works -- two known issues and no layout tests yet This patch passes all the existing layout tests (at least the editing and forms ones) but there are at least two outstanding issues: 1) Since there's no line box for the second line of a <pre> with "\n" as its text (or <textarea> equivalent), the caret draws in the wrong place (end of the first line) even though the height is computed correctly. This affects any <textarea> when the last character is a newline. 2) The editing code makes new text nodes for the \n characters rather than merging with surrounding text; I think we want a <textarea> to be a single text node.
Darin Adler
Comment 14 2006-07-16 21:32:35 PDT
Created attachment 9518 [details] a simple test file I've been using
Darin Adler
Comment 15 2006-07-18 17:26:54 PDT
Created attachment 9557 [details] patch that's closer to done, but still some known problems
Darin Adler
Comment 16 2006-07-19 08:36:34 PDT
Created attachment 9568 [details] still-newer patch (work in progress)
Darin Adler
Comment 17 2006-07-19 08:36:56 PDT
Created attachment 9569 [details] newer version of test file
Darin Adler
Comment 18 2006-07-21 17:53:23 PDT
Created attachment 9607 [details] newer patch -- almost there!
Darin Adler
Comment 19 2006-07-22 11:49:59 PDT
Created attachment 9615 [details] patch, including change log and layout tests (& all existing layout tests pass)
Adele Peterson
Comment 20 2006-07-22 21:50:10 PDT
Comment on attachment 9615 [details] patch, including change log and layout tests (& all existing layout tests pass) I looked through this, and it looks good, but I'd like Justin to review also.
Jonathan del Strother
Comment 21 2006-07-24 10:11:53 PDT
Is this responsible for r15993 sending 0xF0 instead of a space if it occurs at the end of a text field, or is that a separate problem?
Darin Adler
Comment 22 2006-07-24 10:40:55 PDT
(In reply to comment #21) > Is this responsible for r15993 sending 0xF0 instead of a space if it occurs at > the end of a text field, or is that a separate problem? I don't know. Maybe. Could you file a separate bug about that problem with steps to reproduce?
Justin Garcia
Comment 23 2006-07-24 19:05:39 PDT
Comment on attachment 9615 [details] patch, including change log and layout tests (& all existing layout tests pass)          // Insert an extra br if the inserted one will collapsed because of quirks mode. -        if (!document()->inStrictMode() && !(pos.downstream().node()->hasTagName(brTag) && pos.downstream().offset() == 0)) { -            insertNodeAt(nodeToInsert, pos.node(), pos.offset()); -            insertNodeAfter(createBreakElement(document()).get(), nodeToInsert); -        } else -            insertNodeAt(nodeToInsert, pos.node(), pos.offset()); +        bool haveBreak = pos.downstream().node()->hasTagName(brTag) && pos.downstream().offset() == 0; +        insertNodeAt(nodeToInsert.get(), pos.node(), pos.offset()); +        if (!haveBreak) +            insertNodeAfter(createBreakElement(document()).get(), nodeToInsert.get()); I guess you found that a br at the end of a block will collapse even in strict mode.  You should fix the comment (it has a spelling error too). @@ -156,6 +157,8 @@ void RenderTextControl::updateFromElemen          if (value != oldText || !m_div->hasChildNodes()) {              ExceptionCode ec = 0;              m_div->setInnerText(value, ec); +            if (value.endsWith("\n") || value.endsWith("\r")) +                m_div->appendChild(new HTMLBRElement(document()), ec);              if (document()->frame())                  document()->frame()->clearUndoRedoOperations();              setEdited(false); Why doesn't setInnerText add the extra br? +        Node* node = downstreamEnd.node(); +        int offset = downstreamEnd.offset(); +        if (node && node == upstreamOnePastEnd.node() && offset + 1 == upstreamOnePastEnd.offset() && node->isTextNode()) { +            // Special case for removing a newline character. +            Text* textNode = static_cast<Text*>(node); +            if (textNode->length() == 1) +                removeNodeAndPruneAncestors(textNode); +            else +                deleteTextFromNode(textNode, offset, 1); +        } else { +            // Merging two paragraphs will destroy the moved one's block styles.  Always move forward to preserve Is this special case just an optimization or is it necessary because moveParagraph will fail?  If it's the latter we should file a bug to fix moveParagraph. None of these are deal breakers.  Looks good.  r=me.
Darin Adler
Comment 24 2006-07-24 21:00:45 PDT
(In reply to comment #23) > I guess you found that a br at the end of a block will collapse even in strict > mode.  You should fix the comment (it has a spelling error too). I see your point. The issue is not so much that a br element at the end of a block will "collapse" in strict mode, but rather than HTML's model is that the last line in a block can have a terminator like a br element or a newline character (in the appropriate whitespace mode) and that doesn't create an additional line. Whereas in text editors, if the last character is a line terminator, then there's a position after that terminator. To get that effect with HTML we need an extra terminator. > @@ -156,6 +157,8 @@ void RenderTextControl::updateFromElemen >          if (value != oldText || !m_div->hasChildNodes()) { >              ExceptionCode ec = 0; >              m_div->setInnerText(value, ec); > +            if (value.endsWith("\n") || value.endsWith("\r")) > +                m_div->appendChild(new HTMLBRElement(document()), ec); >              if (document()->frame()) >                  document()->frame()->clearUndoRedoOperations(); >              setEdited(false); > > Why doesn't setInnerText add the extra br? Hyatt convinced me that setInnerText needs behavior that matches what other web browsers do. This concept that you need an extra br element comes from the needs of editing and matching standard editing behavior, but there's no reason for setInnerText to have editing-specific behavior of this type. But I could imagine going another way in the future -- need to do some additional testing I guess. One special trick here is that since the last thing is a br element and not a newline, it makes the text() function particularly simple. If setInnerText took care of this, I could imagine it using newline characters instead of br elements. > +        Node* node = downstreamEnd.node(); > +        int offset = downstreamEnd.offset(); > +        if (node && node == upstreamOnePastEnd.node() && offset + 1 == > upstreamOnePastEnd.offset() && node->isTextNode()) { > +            // Special case for removing a newline character. > +            Text* textNode = static_cast<Text*>(node); > +            if (textNode->length() == 1) > +                removeNodeAndPruneAncestors(textNode); > +            else > +                deleteTextFromNode(textNode, offset, 1); > +        } else { > +            // Merging two paragraphs will destroy the moved one's block > styles.  Always move forward to preserve > > Is this special case just an optimization or is it necessary because > moveParagraph will fail?  If it's the latter we should file a bug to fix > moveParagraph. The moveParagraph function will fail. I'm just not sure how to write a bug report about that. I'll think it over. > None of these are deal breakers.  Looks good.  r=me. Thanks for the review.
Darin Adler
Comment 25 2006-07-24 21:34:22 PDT
Committed revision 15617.
Note You need to log in before you can comment on or make changes to this bug.