Bug 25119

Summary: IME modifies the DOM before keydown
Product: WebKit Reporter: Julie Parent <jparent@google.com>
Component: HTML EditingAssignee: Nobody <webkit-unassigned@lists.webkit.org>
Status: NEW    
Severity: Normal CC: agl@chromium.org, amanda@chromium.org, ap@webkit.org, eric@webkit.org, evan@chromium.org, hbono@chromium.org, joone@kldp.org, kai@granus.net, ojan@chromium.org, oliver@apple.com
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: Mac OS X 10.5   
URL: http://jparent.googlepages.com/IME.html
Attachments:
Description Flags
Fire keydown event before inserting text into the DOM on IME input.
eric: review-
Implements Eric's code review feedback.
ap: review-
Restructure after ap's feedback. none

Description From 2009-04-09 12:12:00 PST
Repro steps:
1. Add a keydown listener to any text input/contentEditable node, in the listener print out value/innerHTML of the text input/contentEditable node (or use demo page provided).
2. Type

Result:
In English, the value/innerHTML does not change to contain the character typed until after the keydown event.  When an IME is in use, the value/innerHTML change before the keydown event.

Expected Result:
DOM should not be changed until after the keydown event, just like non-IME input.

Notes:
* Seen in Safari 3, Safari 4, and WebKit nightly.
* Only effects Safari on Mac.  Safari on Windows functions as expected.
* I've excluded text areas from this report because they are broken in other ways, see https://bugs.webkit.org/show_bug.cgi?id=25061.
------- Comment #1 From 2009-04-10 01:19:42 PST -------
I seem to recall that this was done on purpose to better match IE. Oliver, could you confirm?

It's certainly wrong for Safari on Mac and Windows to behave differently.
------- Comment #2 From 2009-04-10 02:04:52 PST -------
honestly not sure -- i know we deliberately send keydown -- not sure about the dom content changes however.
------- Comment #3 From 2009-04-10 13:28:29 PST -------
The reason this matters is for applications that want to prepare for the character insertions, either by saving the state, modifying the dom, or calling preventDefault to prevent the insertion.  This works fine for all non-IME input, but since the dom is modified before the keydown for IME, there is no way for us to prepare for it, we only know about the input once it has already changed the dom, at which point it is too late.  Calling preventDefault() on the keydown should always cause the character insertion to not happen.

For what its worth, IE 7 has the same behavior as Safari 3/4 on Windows, which is the behavior I want for Safari on Mac too.
------- Comment #4 From 2009-04-10 22:46:14 PST -------
(In reply to comment #3)
> Calling preventDefault() on the keydown should always cause the character
> insertion to not happen.

This is a bit offtopic for this bug, but if character insertion happens due to e.g. drag&drop, Ink, or Character Palette, keydown obviously isn't sent, so preventDefault() won't prevent it.

> For what its worth, IE 7 has the same behavior as Safari 3/4 on Windows, which
> is the behavior I want for Safari on Mac too.

OK.
------- Comment #5 From 2009-04-13 12:56:49 PST -------
(In reply to comment #4)
> (In reply to comment #3)
> > Calling preventDefault() on the keydown should always cause the character
> > insertion to not happen.
> 
> This is a bit offtopic for this bug, but if character insertion happens due to
> e.g. drag&drop, Ink, or Character Palette, keydown obviously isn't sent, so
> preventDefault() won't prevent it.

Yup.  That *should* generate a cancelable textInput event though.

> > For what its worth, IE 7 has the same behavior as Safari 3/4 on Windows, which
> > is the behavior I want for Safari on Mac too.
> 
> OK.
> 
------- Comment #6 From 2009-05-01 04:28:16 PST -------
This is a comment in EventHandler::keyEvent():
    // Run input method in advance of DOM event handling.  This may result in the IM
    // modifying the page prior the keydown event, but this behaviour is necessary
    // in order to match IE:
    // 1. preventing default handling of keydown and keypress events has no effect on IM input;
    // 2. if an input method handles the event, its keyCode is set to 229 in keydown event.
------- Comment #7 From 2009-05-04 00:51:13 PST -------
The comment in EventHandler::keyEvent() does not match IE7 and was made before IE8 existed. I was not able to get a machine with IE6 and East Asian Language support, so I can't check what IE6 does.

In either case, the IE7 behavior of firing a keydown event that cancels the text insertion is more sane. It seems like we should match it.
------- Comment #8 From 2009-05-04 01:06:28 PST -------
OK. The comment is half false actually. In IE7:

1. keydown, keypress happen before the DOM is modified for IME
2. cancelling keydown (i.e. returning false) prevents IME and regular text insertion
3. cancelling keypress prevents only regular text insertion

That's a bit weird, but matching IE in that way seems reasonable to me.
------- Comment #9 From 2009-05-04 01:11:51 PST -------
OK. This makes a bit more sense now. IE and WebKit don't fire keypress events for IME at all. So it makes sense that canceling them does not affect IME. :) So, I come back to the conclusion that we should just match IE7. Keydown should fire before the DOM has been modified and should be cancelable.
------- Comment #10 From 2009-05-04 01:34:04 PST -------
Ojan, our behaviour is a "compatible" blend of IE, Firefox, and Mac IM behaviour -- it was very carefully designed so just "matching IE" is not really a good argument, esp. given none of the browsers really are consistent in this field.
------- Comment #11 From 2009-05-04 01:52:33 PST -------
The intention of this change is that web developers ought to be able to a) see what change a keyboard event is about to make and b) cancel it. The IE7 behavior affords this.

Firefox WIN/MAC modify the DOM after the keydown event as well. Canceling the event only seems to prevent text insertion on Windows Firefox though.

So, in terms of web compatibility, it seems clear that the IE/FF-Win model is best.

Is there somewhere I should look to get a better sense of why the current behavior is correct? What is different about the Mac IM behavior that make the current behavior correct?
------- Comment #12 From 2009-05-04 01:58:46 PST -------
I've filed https://bugs.webkit.org/show_bug.cgi?id=25543 for the bug of not canceling IME insertion when keydown in preventDefaulted. This bug should focus on the timing of the event (i.e. that they keydown should come before the text insertion has occurred).
------- Comment #13 From 2009-05-04 02:00:49 PST -------
I don't think that "careful design" is a particularly strong argument in this case. It is true that a lot of time and energy was spent on this issue, but it is also true that the current behavior is not perfect.

That said, it may be difficult or impossible to ensure that canceling keydown prevents IM from handling the event on Mac OS X. We need to pass the event through the IM before DOM event dispatch in order to determine which keycode to report (remember, events that are handled by IM have a keycode of 229) - and one we pass the event through the IM, we cannot un-pass it.
------- Comment #14 From 2009-05-17 21:04:28 PST -------
Created an attachment (id=30435) [details]
Fire keydown event before inserting text into the DOM on IME input.
------- Comment #15 From 2009-05-17 22:04:05 PST -------
Adding some Chromium folk as the Chromium mac/linux ports will likely need to do this as well once they get around to supporting IME.
------- Comment #16 From 2009-05-17 23:20:50 PST -------
(From update of attachment 30435 [details])
A couple style nits:

 1427 void Editor::applyIMECommand(IMECommand* command) {

     IMECommand(const String& confirmedText)
 63     : text(confirmedText)
 64     , isConfirmationCommand(true) {}
(indent and {} should be on new lines)

No arg name needed:
 281     void applyIMECommand(IMECommand* command);


 34 </body>
 35 </html>
036 \ No newline at end of file


     // If there's a key event, stow away the IME command to be processed in
 5717     // EventHandler::keyEvent after the keydown has been fired.
 5718     if (parameters && parameters->event)
 5719         parameters->event->setIMECommand(command);
 5720     else
 5721         coreFrame->editor()->applyIMECommand(command);

Could go into a static function in the file instead of copy/paste.
------- Comment #17 From 2009-05-17 23:26:33 PST -------
+    // IE sends VK_PROCESSKEY which has value 229;
Weird semicolon.


+        // Currently, only Mac Safari uses this. Eventually GTK, and Chromium Linux/Win

If it affects linux at the IME level, it'll likely affect all linux webkit ports (importantly Qt).  Might be worth rephrasing this so you don't need to enumerate them.  (This comment is duplicated across two locations.)
------- Comment #18 From 2009-05-17 23:42:29 PST -------
(In reply to comment #16)
Eric, all those sound reasonable. Will post a new patch soon.

(In reply to comment #17)
> +    // IE sends VK_PROCESSKEY which has value 229;
> Weird semicolon.
> +        // Currently, only Mac Safari uses this. Eventually GTK, and Chromium
> Linux/Win
> 
> If it affects linux at the IME level, it'll likely affect all linux webkit
> ports (importantly Qt).  Might be worth rephrasing this so you don't need to
> enumerate them.  (This comment is duplicated across two locations.)

I don't actually know if it needs to affect linux. Currently, Qt goes through the same codepath as Mac (at least, according to the comments). That said, the Mac code path is a side effect of the requirements of Mac IME. It's not clear to me that linux needs to go through all the same contortions. Mainly, it might be possible to cancel IME keydowns on Linux as it is on Windows (Safari Windows will ignore an IME keypress if the keydown event is canceled). Specifically, the Mac issue is that you need to tell the IME a keypress happened before you know if it will handle the keypress. But you need to know if it will handle the keypress in order to set the 229 keycode on the keydown event, at which point, it's too late to cancel the keydown as the IME has no undo (basically what Alexey said above). Does Linux have the same issue?
------- Comment #19 From 2009-05-18 01:12:17 PST -------
Created an attachment (id=30442) [details]
Implements Eric's code review feedback.
------- Comment #20 From 2009-05-18 01:32:24 PST -------
(From update of attachment 30435 [details])
> +        Fire the keydown event before inserting text into the DOM on IME input
<...>
> +        On Mac, we need to send the key event to the IME before firing keydown to
> +        know that it's an IME input (so we can set the keycode to 229). In order
> +        to fire the keydown after the DOM is modified we:

This probably should say "before", not "after".

> +        There's no automated way to test IME input, so I've added a manual test.

We have some tests for input methods in editing/input and platform/mac/editing/input, I think that an automated test can and should be added for this change.

>  #if PLATFORM(MAC)
>          // We only have this need to store keypress command info on the Mac.
>          Vector<KeypressCommand>& keypressCommands() { return m_keypressCommands; }
> +
> +        // Currently, only Mac Safari uses this. Eventually GTK, and Chromium Linux/Win
> +        // probably want to use this as well though. To ensure keydown fires before the
> +        // DOM is modified on IME key events.

There is no need to say that only Mac uses the code under a PLATFORM(MAC) guard. Also, as this becomes a part of normal text processing, it's not accurate to say that this code only ensures keydown firing at correct time - that's its original inspiration that can be discovered via svn history, but not the full set of its effects.

>  #if PLATFORM(MAC)        
>          Vector<KeypressCommand> m_keypressCommands;
> +        // Currently, only Mac Safari uses this. Eventually GTK, and Chromium Linux/Win
> +        // probably want to use this as well though. To ensure keydown fires before the
> +        // DOM is modified on IME key events.
> +        OwnPtr<IMECommand> m_imeCommand;
>  #endif

Ditto.

But here is also my major comment about this patch. We are already storing input method commands in the same object, and in a much more versatile way (Vector<KeypressCommand> m_keypressCommands). Why do do need to store another copy? The new IMECommand structure doesn't even adequately describe what commands can be sent back to us as a result of text processing.

Since Cocoa text bindings work in tandem with input methods, you may want to check out a nice description of those at <http://www.hcs.harvard.edu/~jrus/Site/cocoa-text.html>, especially an example binding they give for ^H.

Another consideration is that this approach breaks expectations for input methods that use two-way communication with the editing view. For example, after an IM sends setMarkedText:selectedRange:, it has every right to expect that hasMarkedText will return true.

> +void Editor::applyIMECommand(IMECommand* command) {

Brace goes to next line (and the comment about not using "IME" of course still applies).

> +struct IMECommand {
> +    IMECommand(const String& confirmedText)
> +    : text(confirmedText)
> +    , isConfirmationCommand(true) {}

Initializers should be indented 4 spaces more to the right, and we usually put braces each on its own line (although the latter doesn't seem to be mentioned in coding style guidelines).
------- Comment #21 From 2009-05-18 02:06:00 PST -------
(From update of attachment 30442 [details])
r- per above comments.
------- Comment #22 From 2009-05-19 00:57:44 PST -------
Created an attachment (id=30464) [details]
Restructure after ap's feedback.
------- Comment #23 From 2009-05-19 00:59:29 PST -------
This isn't quite ready for review, but the patch does work. It's basically done except the code in WebHTMLView is a bit ugly and the ChangeLogs are out of date. I'll wrap that up tomorrow, but I wanted to get this patch up in case you have any non-nits about this.
------- Comment #24 From 2009-05-19 06:06:29 PST -------
(From update of attachment 30464 [details])
1 CONSOLE MESSAGE: line 77: TypeError: Result of expression 'keyDowns[current]' [undefined] is not an object.

if (event.characters[0] == '\n') {
3     // Enter key is charCode 13.
4     if (event.characters[0] == String.fromCharCode(13)) {
and why is that clearer?
I figure \n is eaiser to read than fromCharCode(13)
WK style does not != null:
16         if (kotoeriState.compositionString != null) {
if (foo)
not if(foo != null)

Why do you call:
1862     keyDown->setDefaultHandled();
fro in  1857 void EventHandler::dispatchInputMethodKeyDownEvent(PassRefPtr<KeyboardEvent> keyDown)

I might have written:
 5311     if (parameters && parameters->event && !parameters->keyDownWasDispatched && parameters->shouldSaveCommand) {

to early return instead... but it's fine as is.

 5348         else if (hasKeypressCommand) {
probably needs a changelog comment.

If there ever was a need for a static function.. this copy/paste was it:
} else if ([character isEqualToString:@"escape"]) {
 469         const unichar ch = 0x1B;
 470         eventCharacter = [NSString stringWithCharacters:&ch length:1];
------- Comment #25 From 2009-05-20 04:30:35 PST -------
(From update of attachment 30464 [details])
I think the statement "There's no automated way to test IME input, so I've added a manual test." is incorrect? you even appear to have improved the testing :D
------- Comment #26 From 2009-05-20 04:35:37 PST -------
Ooh, Ojan, can you also test that all the standard japanese, simplified and traditional chinese, and korean input methods still work "correctly" (no apparent change in user visible behaviour) in editable regions in Safari, and also mail composition in Mail.app