RESOLVED FIXED 60170
[Chromium] Speech-enabled input fields need the ability to trigger without a click
https://bugs.webkit.org/show_bug.cgi?id=60170
Summary [Chromium] Speech-enabled input fields need the ability to trigger without a ...
Dominic Mazzoni
Reported 2011-05-04 08:23:43 PDT
It needs to be possible to start speech input on speech-enabled input fields with something other than a mouse click - like a keypress.
Attachments
Patch for initial feedback, still has debugging printfs - is the approach good? (8.04 KB, patch)
2011-05-04 08:29 PDT, Dominic Mazzoni
webkit.review.bot: commit-queue-
Updated patch with layout test, ready for review. (18.82 KB, patch)
2011-06-06 13:17 PDT, Dominic Mazzoni
no flags
Fixed style error. (18.82 KB, patch)
2011-06-06 13:45 PDT, Dominic Mazzoni
fishd: review-
Rename methods to be more clear. (19.12 KB, patch)
2011-06-17 17:41 PDT, Dominic Mazzoni
gustavo.noronha: commit-queue-
Fix gtk & qt errors (19.88 KB, patch)
2011-06-20 04:47 PDT, Dominic Mazzoni
no flags
Patch (23.87 KB, patch)
2011-09-01 01:19 PDT, Dominic Mazzoni
no flags
Patch for landing (23.87 KB, patch)
2011-09-01 12:16 PDT, Dominic Mazzoni
no flags
Patch (23.87 KB, patch)
2011-09-01 13:32 PDT, Dominic Mazzoni
no flags
Dominic Mazzoni
Comment 1 2011-05-04 08:29:39 PDT
Created attachment 92246 [details] Patch for initial feedback, still has debugging printfs - is the approach good?
WebKit Review Bot
Comment 2 2011-05-18 12:52:16 PDT
Attachment 92246 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1 Source/WebCore/html/shadow/TextControlInnerElements.cpp:495: Place brace on its own line for function definitions. [whitespace/braces] [4] Total errors found: 1 in 9 files If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Review Bot
Comment 3 2011-05-18 13:19:34 PDT
Comment on attachment 92246 [details] Patch for initial feedback, still has debugging printfs - is the approach good? Attachment 92246 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/8702872
Early Warning System Bot
Comment 4 2011-05-18 14:26:40 PDT
Comment on attachment 92246 [details] Patch for initial feedback, still has debugging printfs - is the approach good? Attachment 92246 [details] did not pass qt-ews (qt): Output: http://queues.webkit.org/results/8705839
Collabora GTK+ EWS bot
Comment 5 2011-05-18 17:40:52 PDT
Comment on attachment 92246 [details] Patch for initial feedback, still has debugging printfs - is the approach good? Attachment 92246 [details] did not pass gtk-ews (gtk): Output: http://queues.webkit.org/results/8702956
Gyuyoung Kim
Comment 6 2011-05-19 02:23:36 PDT
Comment on attachment 92246 [details] Patch for initial feedback, still has debugging printfs - is the approach good? Attachment 92246 [details] did not pass efl-ews (efl): Output: http://queues.webkit.org/results/8719031
Dominic Mazzoni
Comment 7 2011-06-06 13:17:57 PDT
Created attachment 96115 [details] Updated patch with layout test, ready for review.
WebKit Review Bot
Comment 8 2011-06-06 13:20:31 PDT
Attachment 96115 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/fast..." exit_code: 1 Source/WebCore/html/shadow/TextControlInnerElements.cpp:531: Place brace on its own line for function definitions. [whitespace/braces] [4] Total errors found: 1 in 20 files If any of these errors are false positives, please file a bug against check-webkit-style.
Dominic Mazzoni
Comment 9 2011-06-06 13:45:44 PDT
Created attachment 96120 [details] Fixed style error.
Satish Sampath
Comment 10 2011-06-16 04:09:39 PDT
Comment on attachment 96120 [details] Fixed style error. > InputFieldSpeechButtonElement::onToggle() onToggle doesn't seem like the right name for this function. It doesn't toggle between 2 states but moves the state from idle to recording and recording to waiting states. I would suggest moving the code under each case statement to separate functions and add a new toggleSpeechInputState() function which actually moves between Idle and Recording states. I think you'd want a pending recognition session to be cancelled and a new one started if this toggleSpeechInputState() function was called when state == Recognizing. > RenderTextControlSingleLine::toggleSpeechRecognition() could use the same name as mentioned above - toggleSpeechInputState() > WEBKIT_API bool isSpeechEnabled() const; suggest changing to isSpeechInputEnabled - "speech" is not the same as "speechInput" > WEBKIT_API void toggleSpeechRecognition(); > LayoutTestController::toggleSpeechInput change to toggleSpeechInputState() as above
Tony Gentilcore
Comment 11 2011-06-16 08:52:18 PDT
You'll also need a stamp from fishd since this touches Source/WebKit/chromium/public/*.
Darin Fisher (:fishd, Google)
Comment 12 2011-06-16 10:38:37 PDT
Comment on attachment 96120 [details] Fixed style error. View in context: https://bugs.webkit.org/attachment.cgi?id=96120&action=review > Source/WebKit/chromium/public/WebInputElement.h:80 > + WEBKIT_API bool isSpeechEnabled() const; it looks like toggleSpeechRecognition and isSpeechEnabled are related methods, but they do not use the same terms. if isSpeechEnabled is supposed to return true when toggleSpeechRecognition enabled the feature, then isSpeechEnabled should really be named isSpeechRecognitionEnabled. that way the names of the methods reinforces the relationship without need for further documentation. also, we don't have any other "toggle" APIs like this. normally, we have isFooEnabled and enableFoo(bool). why not follow that pattern here? > Tools/DumpRenderTree/LayoutTestController.h:112 > + void toggleSpeechInput(JSContextRef inputElement); same nit: seems like this would be better as a boolean setter. why does the caller have to guess or know the state in order to decide if they should call toggle or not?
Dominic Mazzoni
Comment 13 2011-06-17 16:50:22 PDT
> --- Comment #10 from Satish Sampath <satish@chromium.org> 2011-06-16 04:09:39 PST --- > onToggle doesn't seem like the right name for this function. It doesn't toggle between 2 states but moves the state from idle to recording and recording to waiting states. I would suggest moving the code under each case statement to separate functions and add a new toggleSpeechInputState() function which actually moves between Idle and Recording states. I think you'd want a pending recognition session to be cancelled and a new one started if this toggleSpeechInputState() function was called when state == Recognizing. I'm happy with using toggleSpeechInputState, but I don't want to change the behavior of what happens when you cancel while already in the recognition stage - that'd affect clicks as well. The goal is simply to abstract what happens when you click now so that the "click" could be triggered by a keypress instead - I don't think there's anything wrong with the current behavior, which is to start recording the first click, and stop recording the second click (but not cancel recognition). If you want different behavior for the second click, let's save that for another change. Or, if the current behavior is fine, maybe toggleSpeechInputState isn't the right name for the method. How about startOrStopSpeechInput? > Darin Fisher (:fishd, Google) <fishd@chromium.org> changed: >> Source/WebKit/chromium/public/WebInputElement.h:80 >> + WEBKIT_API bool isSpeechEnabled() const; > > it looks like toggleSpeechRecognition and isSpeechEnabled are related methods, but > they do not use the same terms. if isSpeechEnabled is supposed to return true isSpeechEnabled means, "does this input element have the x-webkit-speech attribute, meaning that speech input is allowed?" toggleSpeechRecognition means, "for an input that has speech input enabled, start recognizing if it's not already, or stop recognizing if you've already started." It doesn't change the result of isSpeechEnabled - that's controlled by the DOM element's attribute. My proposed change: isSpeechInputEnabled and startOrStopSpeechInput. (We could provide startSpeechInput and stopSpeechInput also, if we think they'd be needed or useful. However, a startOrStopSpeechInput is still needed because when the user triggers speech input by an action in the browser process, whether by a key, context menu, etc. it doesn't make sense to send one round-trip to the renderer process to determine the current state, then flip the state in another RPC, with a possible race condition. Having a startOrStopSpeechInput method exactly mimics the user behavior when they click on the microphone icon, which is what we want. - Dominic
Dominic Mazzoni
Comment 14 2011-06-17 17:41:13 PDT
Created attachment 97672 [details] Rename methods to be more clear.
Collabora GTK+ EWS bot
Comment 15 2011-06-17 18:55:16 PDT
Comment on attachment 97672 [details] Rename methods to be more clear. Attachment 97672 [details] did not pass gtk-ews (gtk): Output: http://queues.webkit.org/results/8872723
Early Warning System Bot
Comment 16 2011-06-20 01:29:17 PDT
Comment on attachment 97672 [details] Rename methods to be more clear. Attachment 97672 [details] did not pass qt-ews (qt): Output: http://queues.webkit.org/results/8913438
Dominic Mazzoni
Comment 17 2011-06-20 04:47:42 PDT
Created attachment 97778 [details] Fix gtk & qt errors
Satish Sampath
Comment 18 2011-06-21 03:26:55 PDT
> The goal is simply to abstract what happens when you > click now so that the "click" could be triggered by a > keypress instead I think we should not mimic the click implementation for hotkey. When early prototypes of speech input were being tested we saw many users didn't understand that they can just stop speaking and the browser would detect silence to trigger recognition. Instead they were clicking on the button again thinking it was necessary. That is the reason why we handle a click on the speech button and move to recognition state if audio was already recorded. In the hotkey case, we can message easily to the user that they need to press the hotkey only to start speaking and once they stop speaking results will be available for them. The implementation is also much simpler as you only need to start recording on hotkey press and don't have to toggle (and can ignore hotkey if audio was already being recorded).
Darin Fisher (:fishd, Google)
Comment 19 2011-08-25 15:46:01 PDT
Comment on attachment 97778 [details] Fix gtk & qt errors View in context: https://bugs.webkit.org/attachment.cgi?id=97778&action=review > Source/WebKit/chromium/public/WebInputElement.h:81 > + WEBKIT_API void startOrStopSpeechInput(); This startOrStop method is a bit awkward. It seems like the caller has no way to know what the current state is, which means that they have to just make a guess. This might be OK for certain kinds of UI, or embeddings, but it seems like it could be a pretty confusing API for others. How about just having separate methods for starting and stopping speech input?
Darin Fisher (:fishd, Google)
Comment 20 2011-08-25 15:48:22 PDT
(In reply to comment #13) > (We could provide startSpeechInput and stopSpeechInput also, if we > think they'd be needed or useful. However, a startOrStopSpeechInput is > still needed because when the user triggers speech input by an action > in the browser process, whether by a key, context menu, etc. it > doesn't make sense to send one round-trip to the renderer process to > determine the current state, then flip the state in another RPC, with > a possible race condition. Having a startOrStopSpeechInput method > exactly mimics the user behavior when they click on the microphone > icon, which is what we want. OK, I see that you addressed my question here. I don't think these IPC considerations have much to do with how the WebKit API is shaped. You could very easily have an IPC named StartOrStopSpeechInput, which could use local WebKit APIs to determine whether it makes sense to call startSpeechInput or stopSpeechInput.
Dominic Mazzoni
Comment 21 2011-08-25 16:24:34 PDT
(In reply to comment #20) > I don't think these IPC considerations have much to do with how the WebKit API is shaped. You could very easily have an IPC named StartOrStopSpeechInput, which could use local WebKit APIs to determine whether it makes sense to call startSpeechInput or stopSpeechInput. Ah, that's clear now. Thanks, I'll make that change.
Dominic Mazzoni
Comment 22 2011-09-01 01:19:39 PDT
Dominic Mazzoni
Comment 23 2011-09-01 12:16:34 PDT
Created attachment 106005 [details] Patch for landing
WebKit Review Bot
Comment 24 2011-09-01 12:17:27 PDT
Comment on attachment 106005 [details] Patch for landing Rejecting attachment 106005 [details] from commit-queue. dmazzoni@google.com does not have committer permissions according to http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/common/config/committers.py. - If you do not have committer rights please read http://webkit.org/coding/contributing.html for instructions on how to use bugzilla flags. - If you have committer rights please correct the error in Tools/Scripts/webkitpy/common/config/committers.py by adding yourself to the file (no review needed). The commit-queue restarts itself every 2 hours. After restart the commit-queue will correctly respect your committer rights.
Dimitri Glazkov (Google)
Comment 25 2011-09-01 13:25:44 PDT
Comment on attachment 106005 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=106005&action=review Sorry, I have an extra nit! :) > Source/WebCore/html/shadow/TextControlInnerElements.cpp:527 > + RefPtr<HTMLInputElement> input(static_cast<HTMLInputElement*>(shadowAncestorNode())); use input = foo syntax instead.
Dominic Mazzoni
Comment 26 2011-09-01 13:32:19 PDT
WebKit Review Bot
Comment 27 2011-09-01 14:14:02 PDT
Comment on attachment 106017 [details] Patch Clearing flags on attachment: 106017 Committed r94343: <http://trac.webkit.org/changeset/94343>
WebKit Review Bot
Comment 28 2011-09-01 14:14:08 PDT
All reviewed patches have been landed. Closing bug.
Leandro Graciá Gil
Comment 29 2011-09-05 05:19:54 PDT
Comment on attachment 106017 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=106017&action=review > Source/WebCore/html/shadow/TextControlInnerElements.cpp:530 > + IntRect rect = input->renderer()->absoluteBoundingBoxRect(); This line is introducing a regression that undoes this patch: https://bugs.webkit.org/show_bug.cgi?id=65333 I've created a separate bug to fix this: https://bugs.webkit.org/show_bug.cgi?id=67597
Note You need to log in before you can comment on or make changes to this bug.