Bug 5241

Summary: Space and tab characters "sent" by an input method give totally different results than typing them directly
Product: WebKit Reporter: Evan Gross <evan>
Component: HTML EditingAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, ap, eric, gyuyoung.kim, hyatt, ian, justin.garcia, morrita, rniwa, webkit.review.bot
Priority: P2    
Version: 420+   
Hardware: Mac   
OS: OS X 10.4   
Bug Depends on: 52799    
Bug Blocks: 52781    
Attachments:
Description Flags
Input method for testing various issues
none
automated test
none
Patch
none
Patch
none
Patch
none
Patch
none
updated to ToT
none
Patch
none
Patch
none
Patch
none
Patch rniwa: review+

Description Evan Gross 2005-10-02 20:41:03 PDT
If an input method sends space, tab, and perhaps other whitespace characters via an 
UpdateActiveInputArea TSM event, the results are totally different than typing them.

I'll upload an input method you can use to reproduce this problem.

Steps to reproduce:

1. In Blot (good because you can save the html to disk), just a keyboard selected, type 
"one<space>space<space><space>two<space>spaces<space><space><space>three<space>spaces
"
2. Save the file.
3. Close the document, create a new untitled one.
4. Activate the input method I've attached - it allows spaces and tabs in the inline input area.
5. Type the same text as in step 1.
6. Save the file.

Open each in a text editor so you can view the differences.
The basic issue is that spaces from UAIA don't go through the same transformation (wrt option-space) 
as when you type them.
Tab characters cause similar (perhaps worse!) problems. The resulting html is VERY different when the 
above test is done with tabs.

Don't know how bad it is with returns, as this input method uses them to fix the inline text. Most 
applications do the right thing here with spaces and tabs, few deal with returns well.
Comment 1 Evan Gross 2005-10-02 20:42:30 PDT
Created attachment 4164 [details]
Input method for testing various issues

Here's a modified version of the BasicInputMethod sample, enhanced to do the
following:

1. The Send Event Palette can send OffsetToPos with positive/zero/negative
offsets. Good for this particular bug.
2. Various enhancements to the inline input:

- Can do click-positioning of the caret via PosToOffset
- Arrow keys work
- Drag-selecting within inline area works (well, not in NSTextView, radar filed

long, long ago)
- Shift-arrow extends selection
- You can type spaces and tabs into the inline text (good for showing all the
bugs with input methods sending whitespace vs. typing spaces and tabs - radars
filed long, long ago).

A bit buggy, you may need to open and close a window to get it to activate
properly at times.

Good for testing with bug 4681, bug 4682, bug 5230, perhaps others.
Comment 2 Evan Gross 2005-10-02 20:44:55 PDT
Missed one perhaps not-so-obvious step to reproduce - before saving the file when the input method is 
active, confirm the text by hitting enter or return!
Comment 3 Alexey Proskuryakov 2005-10-03 01:40:37 PDT
Confirmed with 10.4.2 and ToT. It may be necessary to open the files with a hex editor to see the 
differences, actually.

Is this limited to kEventTextInputUpdateActiveInputArea, and kEventTextInputUnicodeText is unaffected? I 
mean, IMs rarely need to input spaces, let alone tabs, into inline input areas.
Comment 4 Evan Gross 2005-10-03 19:42:26 PDT
(In reply to comment #3)
You should be able to use BBEdit or even drag the files onto Xcode to at least see the difference (option 
spaces in the typed-in text, only spaces from the input method).

As for rarely needing to input spaces or tabs, well, perhaps not in the inline area, but that's not what 
this is about. It's about sending them via UAIA (and yes, I think kEventTextInputUnicodeText is also 
affected, I can check later to see). There are thousands of Spell Catcher, TypeIt4Me and Grammarian 
users out there that might not agree with the fact that spaces and tabs don't need to be sent to an app 
by an input method ;-). Consider some sort of auto-completing input method where typing "fyi" results 
in "for your information". Spell Catcher doesn't currently do this via inline input right now, too many 
apps don't support it. But that's not to say it won't ever work that way.

Doesn't really matter, the current behavior is wrong and causes trouble. Spaces, for certain, are 
accepted properly via UAIA by every text editing widget I've come across. I have to special-case Mail on 
Tiger just to work-around this. Time to fix it.

I tried stepping through the code with typing vs. UAIA, but without a whole lot of luck. We really need 
someone that knows this stuff inside-out to help us come up with the correct solution...
Comment 5 Alexey Proskuryakov 2005-10-10 11:35:29 PDT
Created attachment 4293 [details]
automated test

(goes to editing/input).

I haven't tried with "real" TSM events, but from this test, it looks like both
kEventTextInputUpdateActiveInputArea and kEventTextInputUnicodeText are
affected, indeed.
Comment 6 Alexey Proskuryakov 2006-01-06 15:06:38 PST
Hyatt says he's unlikely to look at this bug any time soon; unassigning.
Comment 7 Hajime Morrita 2011-01-11 00:42:49 PST
*** Bug 51918 has been marked as a duplicate of this bug. ***
Comment 8 Hajime Morrita 2011-01-11 00:54:37 PST
Created attachment 78502 [details]
Patch
Comment 9 Hajime Morrita 2011-01-11 00:56:04 PST
Adding hyatt and rniwa for possible reviewers.
Comment 10 Ryosuke Niwa 2011-01-11 01:04:33 PST
(In reply to comment #9)
> Adding hyatt and rniwa for possible reviewers.

Please address the comments I made on the bug 51918.
Comment 11 Hajime Morrita 2011-01-11 20:52:20 PST
Created attachment 78646 [details]
Patch
Comment 12 Hajime Morrita 2011-01-11 20:55:21 PST
(In reply to comment #10)
> Please address the comments I made on the bug 51918.
I'm sorry I missed it. I updated the patch to address it.

(In reply to comment #3)
> (From update of attachment 78211 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=78211&action=review
> 
> > LayoutTests/ChangeLog:7
> > +
> 
> You probably want to explain what kind of test you're adding.
Added an explanation.

> 
> > WebCore/editing/CompositeEditCommand.cpp:433
> > +String CompositeEditCommand::stringWithBalancedWhitespaceAt(const Position& position, const String& text)
> 
> Nit: the original function name "rebalanceWhitespaceAt" is better IMO.
Renamed to "rebalanceWhitespaceBetweenWordsAt", which indicates its intension more closely.

> 
> > WebCore/editing/CompositeEditCommand.cpp:438
> > +    String ret;
> 
> Nit: please don't use abbreviation.
Fixed.

> 
> > WebCore/editing/CompositeEditCommand.cpp:469
> > +    }
> 
> Why aren't we calling stringWithRebalancedWhitespace in htmlediting.cpp?
It is for dirrent purpose. stringWithRebalancedWhitespace is for leading/trailing whitespace.
new one is for whitespace other than leading/trailing. I renamed the function to indicate that.

> 
> > WebCore/editing/Editor.cpp:1604
> > +    // Some IME pass sequence of whitespace ' ' instead of nbsp even they expect
> > +    // multiple whitespaces inserted into the redering result. 
> > +    // So we mix some nbsp characters for preserving the whitespace.
> 
> Nit: I don't think this comment is necessary.  Anyone who knows about whitespace rebalancing would understand what this code does.
Removed.

> 
> > WebCore/editing/Editor.cpp:1696
> > +        // Some IME pass sequence of whitespace ' ' instead of nbsp even they expect
> > +        // multiple whitespaces inserted into the redering result. 
> > +        // So we mix some nbsp characters for preserving the whitespace.
> 
> Ditto.
Removed.
Comment 13 Ryosuke Niwa 2011-01-11 21:38:40 PST
Comment on attachment 78646 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=78646&action=review

> LayoutTests/editing/inserting/script-tests/insert-composition-whitespace.js:34
> +test("div", "A  B", "A \xA0B");
> +test("div", "A   B", "A \xA0 B");
> +test("div", "A    B", "A \xA0 \xA0B"); 

Mn... nbsp is 0xC2 0xA0 in UTF-8 but I guess it doesn't matter here.  But watch out for test failures when you're landing this.

> Source/WebCore/ChangeLog:24
> +        * editing/CompositeEditCommand.cpp:
> +        (WebCore::hasWhitespaceList):
> +        (WebCore::CompositeEditCommand::shouldBalanceWhitespaceAt):
> +        (WebCore::CompositeEditCommand::rebalanceWhitespaceAt):
> +        (WebCore::CompositeEditCommand::rebalanceWhitespaceBetweenWordsAt): Added.
> +        * editing/CompositeEditCommand.h:
> +        * editing/Editor.cpp:
> +        (WebCore::Editor::confirmComposition):
> +        (WebCore::Editor::setComposition):

Nit: Since you're making quite few modifications, it'll be nice if you also added inline description to each function.

> Source/WebCore/editing/CompositeEditCommand.cpp:411
> +bool CompositeEditCommand::shouldBalanceWhitespaceAt(const Position& position, bool willInsert)

I'm asking this function "should I balance whitespace at this position?"  What does willInsert mean in that semantics?

> Source/WebCore/editing/CompositeEditCommand.cpp:424
> +    if (node->isTextNode()) {
> +        Text* textNode = static_cast<Text*>(node);
> +        if (!textNode->length() && !willInsert)
> +            return false;
> +    } else {
> +        if (!willInsert)
> +            return false;
> +    }

Can't you just do:
if ((!node->isTextNode() || !static_cast<Text*>(node)->length()) && !willInsert)
    return false;

> Source/WebCore/editing/CompositeEditCommand.cpp:436
> +    if (!hasWhitespaceList(text) || !shouldBalanceWhitespaceAt(position, true))
> +        return text;

Per recent discussion on webkit-dev, we don't want to be calling a function with true/false. r- for this.
Consider making the argument enum type.  But also reconsider the name willInsert.  I don't think it's descriptive enough.

> Source/WebCore/editing/CompositeEditCommand.cpp:469
> +    String balancedText;
> +    unsigned whiteCount = 0;
> +
> +    for (unsigned i = 0; i < text.length(); ++i) {
> +        bool isWhitespace = text[i] == ' ';
> +        if (isWhitespace) {
> +            whiteCount++;
> +
> +            if (i + 1 < text.length())
> +                continue;
> +            // Preserves trailing whitespace, which will be balanced by rebalanceWhitespaceAt()
> +            balancedText.append(text.characters() + text.length() - whiteCount, whiteCount);
> +            break;
> +        }
> +
> +        // Preseves leading whitespace, which will be balanced by rebalanceWhitespaceAt()
> +        if (whiteCount == i)
> +            balancedText.append(text.characters(), whiteCount);
> +        else {
> +            // Interleaves nbsp and whitespace to preserve
> +            // space characters and line-breakability.
> +            if (1 == whiteCount)
> +                balancedText.append(' ');
> +            else {
> +                for (unsigned whiteIndex = 0; whiteIndex < whiteCount; ++whiteIndex)
> +                    balancedText.append(static_cast<UChar>(whiteIndex%2 ? noBreakSpace : ' '));
> +            }
> +        }
> +
> +        balancedText.append(text.characters()[i]);
> +        whiteCount = 0;
> +    }

It's really unfortunate that we can't share any code with rebalanceWhitespaceAt.  Is there any way we can add more general version of the two and call that one in rebalanceWhitespaceAt and rebalanceWhitespaceBetweenWordsAt?
Comment 14 Alexey Proskuryakov 2011-01-11 22:06:17 PST
Comment on attachment 78646 [details]
Patch

+        * editing/inserting/insert-composition-whitespace.html: Added.
+        * editing/inserting/script-tests/insert-composition-whitespace.js: Added.

Please don't split the test into .js and .html parts. Although this anti-pattern is used in many existing tests, it serves no purpose, and is actually harmful.
Comment 15 Alexey Proskuryakov 2011-01-11 22:07:51 PST
This bug is about both spaces and tabs. Are tabs being fixed here?
Comment 16 Hajime Morrita 2011-01-11 22:44:13 PST
(In reply to comment #15)
> This bug is about both spaces and tabs. Are tabs being fixed here?
Good point. I'll address it.
Comment 17 Hajime Morrita 2011-01-13 03:22:39 PST
Created attachment 78794 [details]
Patch
Comment 18 Hajime Morrita 2011-01-13 03:27:21 PST
Ryosuke, Alexey, thank you for reviewing!
I rewrote the patch to share the code path to rebalanceWhitespaceAt()
that also addressed other feedback which you were pointing out.
Could you take another look?
Comment 19 Gyuyoung Kim 2011-01-13 03:34:50 PST
Your last patch has a build break in WebKit EFL.
Please fix it. 


Linking CXX static library libwebcore_efl.a
[100%] Built target webcore_efl
Linking CXX shared library libewebkit.so
[100%] Built target ewebkit
Linking CXX executable Programs/EWebLauncher
WebKit/libewebkit.so.0.1.0: undefined reference to `WebCore::InsertTextCommand::input(WTF::String const&, bool, WebCore::InsertTextCommand::WhitespaceRebalanceType)'
collect2: ld returned 1 exit status
make[2]: *** [Programs/EWebLauncher] Error 1
make[1]: *** [CMakeFiles/Programs/EWebLauncher.dir/all] Error 2
make: *** [all] Error 2
Failed to build Efl port
Comment 20 Ryosuke Niwa 2011-01-13 09:49:37 PST
Comment on attachment 78794 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=78794&action=review

The change looks much saner but I still need answers to the following questions.

> LayoutTests/editing/inserting/insert-composition-whitespace.html:52
> +test("div", " AB", "\xA0AB");
> +test("div", "  AB", "\xA0\xA0AB");
> +test("div", "   AB", "\xA0\xA0 AB");
> +test("div", "    AB", "\xA0\xA0 \xA0AB");
> +test("div", "     AB", "\xA0\xA0 \xA0 AB");
> +test("div", "      AB", "\xA0\xA0 \xA0 \xA0AB");
> +test("div", "       AB", "\xA0\xA0 \xA0 \xA0 AB");

Huh. It's kind of odd that these expected results have two consecutive non-breaking spaces.  Is this really expected?

> Source/WebCore/dom/TextEvent.h:39
> +            InputTypeComposition, // input is done by IME, so whitespace characters should be preserved.

I'm not sure if we should mention about whitespace preservation at all here. That sounds rather arbitrary.

> Source/WebCore/editing/CompositeEditCommand.cpp:427
> +        int newOffset = startOffset - 1;
> +        if (newOffset < 0 || !isWhitespace(text[newOffset]))
>              return;
> +        startOffset = endOffset = newOffset;

Why do we do this off-by-one adjustment here?

> Source/WebCore/editing/CompositeEditCommand.cpp:435
> +    int downstream = std::max(startOffset, endOffset - 1);

I don't get this -1 either. Could you explain what they are doing?

> Source/WebCore/editing/InsertTextCommand.cpp:184
> +        if (whitespaceRebalance == WhitespaceRebalanceSurrounding) {
> +            // The insertion may require adjusting adjacent whitespace, if it is present.
> +            rebalanceWhitespaceAt(endPosition);
> +            // Rebalancing on both sides isn't necessary if we've inserted a space.
> +            if (text != " ") 
> +                rebalanceWhitespaceAt(startPosition);
> +        } else {
> +            ASSERT(whitespaceRebalance == WhitespaceRebalanceAll);
> +            rebalanceWhitespaceBetween(startPosition, endPosition);
> +        }

You could use switch but I guess that's an overkill for just two values. But I'm quite surprised about 'text != " "'.  Shouldn't we also be checking for the case where text had 2, 3, 4, ... whitespaces?

> Source/WebCore/editing/InsertTextCommand.h:37
> +        WhitespaceRebalanceSurrounding,
> +        WhitespaceRebalanceAll,

Mn... I think these should be RebalanceLeadingAndTrailingWhitespaces and RebalanceAllWhitespaces respectively.
Comment 21 Hajime Morrita 2011-01-14 05:15:52 PST
Created attachment 78929 [details]
Patch
Comment 22 Hajime Morrita 2011-01-14 06:02:32 PST
Hi Ryosuke, thank you for reviewing!
I updated the patch again for addressing the feedback.

(In reply to comment #20)
> (From update of attachment 78794 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=78794&action=review
> 
> The change looks much saner but I still need answers to the following questions.
> 
> > LayoutTests/editing/inserting/insert-composition-whitespace.html:52
> > +test("div", " AB", "\xA0AB");
> > +test("div", "  AB", "\xA0\xA0AB");
> > +test("div", "   AB", "\xA0\xA0 AB");
> > +test("div", "    AB", "\xA0\xA0 \xA0AB");
> > +test("div", "     AB", "\xA0\xA0 \xA0 AB");
> > +test("div", "      AB", "\xA0\xA0 \xA0 \xA0AB");
> > +test("div", "       AB", "\xA0\xA0 \xA0 \xA0 AB");
> 
> Huh. It's kind of odd that these expected results have two consecutive non-breaking spaces.  Is this really expected?
Honestly, I have no idea. This is the default behavior.

> 
> > Source/WebCore/dom/TextEvent.h:39
> > +            InputTypeComposition, // input is done by IME, so whitespace characters should be preserved.
> 
> I'm not sure if we should mention about whitespace preservation at all here. That sounds rather arbitrary.
> 
Removed the redundant part.

> > Source/WebCore/editing/CompositeEditCommand.cpp:427
> > +        int newOffset = startOffset - 1;
> > +        if (newOffset < 0 || !isWhitespace(text[newOffset]))
> >              return;
> > +        startOffset = endOffset = newOffset;
> 
> Why do we do this off-by-one adjustment here?
> 
I have no idea here...
But I tried to sweep out one-by-one tricks around here and tests passed.

> > Source/WebCore/editing/CompositeEditCommand.cpp:435
> > +    int downstream = std::max(startOffset, endOffset - 1);
> 
> I don't get this -1 either. Could you explain what they are doing?
Purged one-by-one.

> 
> > Source/WebCore/editing/InsertTextCommand.cpp:184
> > +        if (whitespaceRebalance == WhitespaceRebalanceSurrounding) {
> > +            // The insertion may require adjusting adjacent whitespace, if it is present.
> > +            rebalanceWhitespaceAt(endPosition);
> > +            // Rebalancing on both sides isn't necessary if we've inserted a space.
> > +            if (text != " ") 
> > +                rebalanceWhitespaceAt(startPosition);
> > +        } else {
> > +            ASSERT(whitespaceRebalance == WhitespaceRebalanceAll);
> > +            rebalanceWhitespaceBetween(startPosition, endPosition);
> > +        }
> 
> You could use switch but I guess that's an overkill for just two values. But I'm quite surprised about 'text != " "'.  Shouldn't we also be checking for the case where text had 2, 3, 4, ... whitespaces?
My guess is that this is for handling special case for typing the space key.
Updated patch generalized this though.

> 
> > Source/WebCore/editing/InsertTextCommand.h:37
> > +        WhitespaceRebalanceSurrounding,
> > +        WhitespaceRebalanceAll,
> 
> Mn... I think these should be RebalanceLeadingAndTrailingWhitespaces and RebalanceAllWhitespaces respectively.
Sure. Renamed.
Comment 23 Hajime Morrita 2011-01-17 00:00:57 PST
Created attachment 79131 [details]
updated to ToT
Comment 24 Ryosuke Niwa 2011-01-17 12:00:45 PST
Comment on attachment 79131 [details]
updated to ToT

View in context: https://bugs.webkit.org/attachment.cgi?id=79131&action=review

I was looking at trac logs and found out that enrica attempted a fix at a related bug 30946 (her patch was rolled out in http://trac.webkit.org/changeset/55271). Have you taken a look at that bug?

> Source/WebCore/editing/CompositeEditCommand.cpp:398
> +static inline bool isWhitespaces(const String& text)

How about containsOnlyWhitespace as in JavaScriptCore/wtf/text/StringImpl.h?

> Source/WebCore/editing/CompositeEditCommand.cpp:419
> +void CompositeEditCommand::rebalanceWhitespaceBetween(const Position& startPosition, const Position& endPosition)

"between" is kind of misleading because you're extending positions to include all surrounding whitespace.

How about rebalanceSurroundingWhitespace?

> Source/WebCore/editing/CompositeEditCommand.cpp:422
> +    if (startPosition.anchorType() != Position::PositionIsOffsetInAnchor || !node || !node->isTextNode())

You need to do the same check for endPosition. In fact, don't you need to assert that startPosition and endPosition point to the same text node? r- for this.

Maybe it's better to move this check into caller and have them pass a RefPtr to the text node, startOffset, and endOffset. You don't need to check this condition in InsertTextCommand::input for example.  Although if we're fixing this function to consider surrounding text nodes, taking two positions might make more sense.

> Source/WebCore/editing/CompositeEditCommand.cpp:-421
> -    int offset = position.deprecatedEditingOffset();
> -    // If neither text[offset] nor text[offset - 1] are some form of whitespace, do nothing.
> -    if (!isWhitespace(text[offset])) {
> -        offset--;
> -        if (offset < 0 || !isWhitespace(text[offset]))
> -            return;
> -    }

If you make the above change, then we can move this optimization into CompositeEditCommand::rebalanceWhitespaceAt.  But I'm still not sure that'll make sense if we extended this function to walk surrounding text nodes though.

> Source/WebCore/page/EventHandler.cpp:2656
> -bool EventHandler::handleTextInputEvent(const String& text, Event* underlyingEvent, bool isLineBreak, bool isBackTab)
> +bool EventHandler::handleTextInputEvent(const String& text, Event* underlyingEvent, bool isLineBreak, bool isBackTab, bool isComposition)

I really don't like the fact this function takes 3! boolean arguments.  Would you mind cleaning that up in a separate patch?
Comment 25 Hajime Morrita 2011-01-17 21:27:22 PST
Created attachment 79240 [details]
Patch
Comment 26 Hajime Morrita 2011-01-17 21:34:41 PST
Ryosuke, thank you for continuing the  review! I updated the patch.

(In reply to comment #24)
> (From update of attachment 79131 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=79131&action=review
> 
> I was looking at trac logs and found out that enrica attempted a fix at a related bug 30946 (her patch was rolled out in http://trac.webkit.org/changeset/55271). Have you taken a look at that bug?
I didn't notice  that. But I tried tests under fast/form and they passed.
Fortunately, the change looks  orthogonal, at least conceptually. 

> 
> > Source/WebCore/editing/CompositeEditCommand.cpp:398
> > +static inline bool isWhitespaces(const String& text)
> 
> How about containsOnlyWhitespace as in JavaScriptCore/wtf/text/StringImpl.h?
> 
Looks nice, So Replaced this with String::containsOnlyWhitespace()

> > Source/WebCore/editing/CompositeEditCommand.cpp:419
> > +void CompositeEditCommand::rebalanceWhitespaceBetween(const Position& startPosition, const Position& endPosition)
> 
> "between" is kind of misleading because you're extending positions to include all surrounding whitespace.
Renamed to rebalanceWhitespaceOnTextSubstring() with changing signature as 
your feedback below.

> 
> How about rebalanceSurroundingWhitespace?
> 
> > Source/WebCore/editing/CompositeEditCommand.cpp:422
> > +    if (startPosition.anchorType() != Position::PositionIsOffsetInAnchor || !node || !node->isTextNode())
> 
> You need to do the same check for endPosition. In fact, don't you need to assert that startPosition and endPosition point to the same text node? r- for this.
> 
> Maybe it's better to move this check into caller and have them pass a RefPtr to the text node, startOffset, and endOffset. You don't need to check this condition in InsertTextCommand::input for example.  Although if we're fixing this function to consider surrounding text nodes, taking two positions might make more sense.
Good point. I extracted the guard code and invoke it from call sites.

> 
> > Source/WebCore/editing/CompositeEditCommand.cpp:-421
> > -    int offset = position.deprecatedEditingOffset();
> > -    // If neither text[offset] nor text[offset - 1] are some form of whitespace, do nothing.
> > -    if (!isWhitespace(text[offset])) {
> > -        offset--;
> > -        if (offset < 0 || !isWhitespace(text[offset]))
> > -            return;
> > -    }
> 
> If you make the above change, then we can move this optimization into CompositeEditCommand::rebalanceWhitespaceAt.  But I'm still not sure that'll make sense if we extended this function to walk surrounding text nodes though.
> 
I'd like to keep this code removed because it will pull back off-by-one tricks...

> > Source/WebCore/page/EventHandler.cpp:2656
> > -bool EventHandler::handleTextInputEvent(const String& text, Event* underlyingEvent, bool isLineBreak, bool isBackTab)
> > +bool EventHandler::handleTextInputEvent(const String& text, Event* underlyingEvent, bool isLineBreak, bool isBackTab, bool isComposition)
> 
> I really don't like the fact this function takes 3! boolean arguments.  Would you mind cleaning that up in a separate patch?
Agreed and filed Bug 52608. I'll do it sometime soon.
Comment 27 Ryosuke Niwa 2011-01-17 22:11:38 PST
(In reply to comment #26)
> > > Source/WebCore/editing/CompositeEditCommand.cpp:398
> > > +static inline bool isWhitespaces(const String& text)
> > 
> > How about containsOnlyWhitespace as in JavaScriptCore/wtf/text/StringImpl.h?
> > 
> Looks nice, So Replaced this with String::containsOnlyWhitespace()

Oh I didn't mean to replace it with String::containsOnlyWhitespace.  That function calls isASCIISpace and that function doesn't return true for a non-breaking space.  We should add a test for that.

> > If you make the above change, then we can move this optimization into CompositeEditCommand::rebalanceWhitespaceAt.  But I'm still not sure that'll make sense if we extended this function to walk surrounding text nodes though.
> > 
> I'd like to keep this code removed because it will pull back off-by-one tricks...

But doesn't it undo optimization though?

> > > Source/WebCore/page/EventHandler.cpp:2656
> > > -bool EventHandler::handleTextInputEvent(const String& text, Event* underlyingEvent, bool isLineBreak, bool isBackTab)
> > > +bool EventHandler::handleTextInputEvent(const String& text, Event* underlyingEvent, bool isLineBreak, bool isBackTab, bool isComposition)
> > 
> > I really don't like the fact this function takes 3! boolean arguments.  Would you mind cleaning that up in a separate patch?
> Agreed and filed Bug 52608. I'll do it sometime soon.

Great.  We should probably fix the bug 52608 before fixing this bug though.
Comment 28 Hajime Morrita 2011-01-18 21:01:03 PST
Ryosuke, thank you for reviewing.

> > > How about containsOnlyWhitespace as in JavaScriptCore/wtf/text/StringImpl.h?
> > > 
> > Looks nice, So Replaced this with String::containsOnlyWhitespace()
> 
> Oh I didn't mean to replace it with String::containsOnlyWhitespace.  That function calls isASCIISpace and that function doesn't return true for a non-breaking space.  We should add a test for that.
I'm sorry but I don't think I understand what the point here is... What should happen?

> > > > +bool EventHandler::handleTextInputEvent(const String& text, Event* underlyingEvent, bool isLineBreak, bool isBackTab, bool isComposition)
> > > 
> > > I really don't like the fact this function takes 3! boolean arguments.  Would you mind cleaning that up in a separate patch?
> > Agreed and filed Bug 52608. I'll do it sometime soon.
> 
> Great.  We should probably fix the bug 52608 before fixing this bug though.
Fixed.
Comment 29 Ryosuke Niwa 2011-01-19 12:25:37 PST
Comment on attachment 79240 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=79240&action=review

> Source/JavaScriptCore/wtf/text/WTFString.h:331
> +    bool containsOnlyWhitespace() const { return m_impl ? m_impl->containsOnlyWhitespace() : false; }

So this function returns false for any string that contains a non-breaking space because containsOnlyWhitespace calls isASCIISpace and isASCIISpace returns false for 0xA0 for the obvious reason that 0xA0 isn't available in the ASCII encoding.

See also:
http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/wtf/ASCIICType.h
http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/wtf/text/StringImpl.cpp

> Source/WebCore/editing/InsertTextCommand.cpp:180
> +            // Rebalancing on both sides isn't necessary if we've inserted olly spaces.
> +            if (!text.containsOnlyWhitespace())
> +                rebalanceWhitespaceAt(startPosition);

And don't you still want to rebalance even if text contained a non-breaking space here?
Comment 30 Hajime Morrita 2011-01-19 17:52:46 PST
(In reply to comment #29)
> (From update of attachment 79240 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=79240&action=review
> 
> > Source/JavaScriptCore/wtf/text/WTFString.h:331
> > +    bool containsOnlyWhitespace() const { return m_impl ? m_impl->containsOnlyWhitespace() : false; }
> 
> So this function returns false for any string that contains a non-breaking space because containsOnlyWhitespace calls isASCIISpace and isASCIISpace returns false for 0xA0 for the obvious reason that 0xA0 isn't available in the ASCII encoding.
I realized that containsOnlyWhitespace() doesn't work.

What I'd like to discuss is your first feedback:
> > > How about containsOnlyWhitespace as in JavaScriptCore/wtf/text/StringImpl.h?
Could you clarify what the point here is ...?
Comment 31 Ryosuke Niwa 2011-01-19 18:13:53 PST
(In reply to comment #30)
> What I'd like to discuss is your first feedback:
> > > > How about containsOnlyWhitespace as in JavaScriptCore/wtf/text/StringImpl.h?
> Could you clarify what the point here is ...?

I meant to suggest a function name.  Nothing more.  Sorry about the confusion.
Comment 32 Hajime Morrita 2011-01-19 20:24:54 PST
Created attachment 79545 [details]
Patch
Comment 33 Hajime Morrita 2011-01-19 20:28:32 PST
> 
> I meant to suggest a function name.  Nothing more.  Sorry about the confusion.
Ah, OK. Thanks for the clarification! 

I updated patch for addressing #27. Could you take another (possibly the last ;-) look?

> > > If you make the above change, then we can move this optimization into CompositeEditCommand::rebalanceWhitespaceAt.  But I'm still not sure that'll make sense if we extended this function to walk surrounding text nodes though.
> > > 
> > I'd like to keep this code removed because it will pull back off-by-one tricks...
> 
> But doesn't it undo optimization though?
> 
So recovered this in the rebalanceWhitespaceAt().
Comment 34 Ryosuke Niwa 2011-01-19 20:48:01 PST
Comment on attachment 79545 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=79545&action=review

r=me provided you respond to the following two comments.

> LayoutTests/editing/inserting/insert-composition-whitespace.html:65
> +test("div", " ", "\xA0");
> +test("div", "  ", "\xA0\xA0");
> +test("div", "   ", "\xA0\xA0\xA0");

This isn't right.  Why are we having 3 nbsp's here without any regular spaces?

This might be a bug in stringWithRebalancedWhitespace.  It inserts nbsp at the beginning and the end AFTER replacing twoSpaces with a pattern.  We should probably putting nbsp's at each end first and then replace two spaces with the pattern.  That could be a separate fix but we should definitely file a bug, and refer to the bug in the test if we're not fixing it in this patch.

> Source/WebCore/editing/CompositeEditCommand.cpp:439
> +    String text = static_cast<Text*>(node)->data();

Nit: I think there's an extra space here before String.

> Source/WebCore/editing/CompositeEditCommand.cpp:446
> +    rebalanceWhitespaceOnTextSubstring(static_cast<Text*>(node), position.deprecatedEditingOffset(), position.deprecatedEditingOffset());

We should not be calling deprecatedEditingOffset().  Because we've already checked that position is PositionIsOffsetInAnchor, we should be calling offsetInContainerNode() instead.
Comment 35 Hajime Morrita 2011-01-20 01:19:20 PST
Committed r76215: <http://trac.webkit.org/changeset/76215>
Comment 36 Hajime Morrita 2011-01-20 01:20:58 PST
Hi Ryosuke thanks much for r+ ;-0

(In reply to comment #34)
> (From update of attachment 79545 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=79545&action=review
> 
> r=me provided you respond to the following two comments.
> 
> > LayoutTests/editing/inserting/insert-composition-whitespace.html:65
> > +test("div", " ", "\xA0");
> > +test("div", "  ", "\xA0\xA0");
> > +test("div", "   ", "\xA0\xA0\xA0");
Filed Bug 52781 for this.

> > Source/WebCore/editing/CompositeEditCommand.cpp:439
> > +    String text = static_cast<Text*>(node)->data();
It looks OK (I cannot find any extra spaces.) check-webkit-style should complain if any.

> 
> Nit: I think there's an extra space here before String.
> 
> > Source/WebCore/editing/CompositeEditCommand.cpp:446
> > +    rebalanceWhitespaceOnTextSubstring(static_cast<Text*>(node), position.deprecatedEditingOffset(), position.deprecatedEditingOffset());
> 
> We should not be calling deprecatedEditingOffset().  Because we've already checked that position is PositionIsOffsetInAnchor, we should be calling offsetInContainerNode() instead.
Fixed.
Comment 37 WebKit Review Bot 2011-01-20 01:57:55 PST
http://trac.webkit.org/changeset/76215 might have broken Qt Linux Release
The following tests are not passing:
editing/inserting/insert-composition-whitespace.html
Comment 38 Hajime Morrita 2011-01-23 17:38:39 PST
Created attachment 79886 [details]
Patch
Comment 39 Hajime Morrita 2011-01-23 17:44:46 PST
Hi, Ryosuke, can I ask you to rs+ this?

r76125 was reverted at r76226 due to chromium test failure and the test actually broken.
And Mac DRT did't catch the error but Chromium's did:

+    textInputController.setMarkedText(compositionText, 0, compositionText);
should be
+    textInputController.setMarkedText(compositionText, 0, compositionText.length);

After this change the test passed on Chromium DRT.
I'm sorry for disturbing you so much on this bug...
Comment 40 Ryosuke Niwa 2011-01-23 21:28:05 PST
Comment on attachment 79886 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=79886&action=review

> LayoutTests/editing/inserting/insert-composition-whitespace.html:37
> +    textInputController.unmarkText();
> +    confirmedText = node.innerText;
> +
> +    shouldBe("compositingText", "'" + expected + "'");

Why are we storing confirmedText here?  Don't we also want to check the value of confirmedText?
Comment 41 Hajime Morrita 2011-01-23 22:39:18 PST
Created attachment 79895 [details]
Patch
Comment 42 Hajime Morrita 2011-01-23 22:40:26 PST
(In reply to comment #40)
> 
> Why are we storing confirmedText here?  Don't we also want to check the value of confirmedText?
Oops, this is a junk during a debug...
But verifying the confirmed text looks reasonable. So I updated the patch to do it.
Comment 43 Ryosuke Niwa 2011-01-23 22:48:46 PST
(In reply to comment #42)
> (In reply to comment #40)
> > Why are we storing confirmedText here?  Don't we also want to check the value of confirmedText?
> Oops, this is a junk during a debug...
> But verifying the confirmed text looks reasonable. So I updated the patch to do it.

Ok. r=me.
Comment 44 Hajime Morrita 2011-01-23 22:53:56 PST
Committed r76482: <http://trac.webkit.org/changeset/76482>
Comment 45 WebKit Review Bot 2011-01-24 00:50:14 PST
http://trac.webkit.org/changeset/76482 might have broken GTK Linux 32-bit Release and GTK Linux 32-bit Debug
The following tests are not passing:
editing/inserting/insert-composition-whitespace.html
Comment 46 Ryosuke Niwa 2011-01-24 06:08:52 PST
Test disabled in http://trac.webkit.org/changeset/76489 on GTK due to the lack of DRT feature required by the test.