Bug 81667 - Speech JavaScript API: Plumbing for Chromium
Summary: Speech JavaScript API: Plumbing for Chromium
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs
Version: 528+ (Nightly build)
Hardware: Unspecified
OS: Unspecified
Importance: P2 Normal
Assignee: Hans Wennborg
Depends on: 83126
Blocks: 80261
Reported: 2012-03-20 08:23 PDT by Hans Wennborg
Modified: 2012-04-04 03:57 PDT
CC: 8 users

Attachments
Patch (40.57 KB, patch) - 2012-03-20 08:38 PDT, Hans Wennborg (no flags)
Patch (40.67 KB, patch) - 2012-03-21 08:10 PDT, Hans Wennborg (no flags)
Patch (42.76 KB, patch) - 2012-03-23 07:41 PDT, Hans Wennborg (no flags)
Patch (49.89 KB, patch) - 2012-03-26 07:44 PDT, Hans Wennborg (no flags)
Patch (54.64 KB, patch) - 2012-03-29 07:27 PDT, Hans Wennborg (no flags)
Patch (54.12 KB, patch) - 2012-03-30 07:11 PDT, Hans Wennborg (no flags)
Patch (60.12 KB, patch) - 2012-04-02 06:24 PDT, Hans Wennborg (no flags)
Patch (58.32 KB, patch) - 2012-04-03 05:56 PDT, Hans Wennborg (fishd: review+, commit-queue-)

Description Hans Wennborg 2012-03-20 08:23:08 PDT
Speech JavaScript API: Plumbing for Chromium
Comment 1 Hans Wennborg 2012-03-20 08:38:11 PDT
Created attachment 132832 [details]
Patch
Comment 2 WebKit Review Bot 2012-03-20 08:40:55 PDT
Please wait for approval from fishd@chromium.org before submitting because this patch contains changes to the Chromium public API.
Comment 3 Hans Wennborg 2012-03-20 08:42:10 PDT
I've uploaded this as a single patch so it's easy to see how it all fits together. The bulk of the patch is in SpeechRecognitionProxy. If desired, I could break out the other wrapper classes to a separate patch.
Comment 4 Darin Fisher (:fishd, Google) 2012-03-20 12:41:54 PDT
Comment on attachment 132832 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=132832&action=review

> Source/WebKit/chromium/public/WebSpeechGrammar.h:45
> +    WEBKIT_EXPORT WebURL src() const;

nit: src -> sourceURL ?

> Source/WebKit/chromium/public/WebSpeechRecognitionClient.h:46
> +    virtual void visibilityHidden() { WEBKIT_ASSERT_NOT_REACHED(); }

is visibilityHidden a command?  it seems like it is a notification about the visibility state changing to hidden?

> Source/WebKit/chromium/public/WebViewClient.h:320
> +    // Scripted speech -----------------------------------------------------

does scripted speech really require a different section?  can we just extend
the "Speech" section?

> Source/WebKit/chromium/public/WebViewClient.h:323
> +    virtual WebSpeechRecognitionClient* speechRecognitionClient() { return 0; }

is the caller expected to delete this pointer when they are done with it?  i'm guessing
the answer is NO since this function is not named "createFoo".  assuming the answer is
NO, then WebSpeechRecognitionClient should have a protected destructor to prevent unwanted
deletion through this pointer.

> Source/WebKit/chromium/src/SpeechRecognitionClientProxy.cpp:50
> +    if (!m_recognitionMap.contains(recognition)) {

usually we put ID maps like this on the chromium side?  does WebKit need
this map?  it seems like it might be better to create an API wrapper for
WebSpeechRecognition instead.

the WebKit API is supposed to be a thin layer between chromium and webcore.
if chromium needs IDs for IPC purposes, then that is a chromium concern,
not a webkit concern.
Comment 5 Hans Wennborg 2012-03-21 08:09:57 PDT
(In reply to comment #4)
> (From update of attachment 132832 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=132832&action=review

Thank you very much for the review!

> > Source/WebKit/chromium/public/WebSpeechGrammar.h:45
> > +    WEBKIT_EXPORT WebURL src() const;
> 
> nit: src -> sourceURL ?

It's called src() in the wrapped object, which in turn matches the IDL.
I'm happy to rename if you think sourceURL is preferable, though.

> > Source/WebKit/chromium/public/WebSpeechRecognitionClient.h:46
> > +    virtual void visibilityHidden() { WEBKIT_ASSERT_NOT_REACHED(); }
> 
> is visibilityHidden a command?  it seems like it is a notification about the visibility state changing to hidden?

You're right. Renaming to notifyVisibilityHidden(). Happy to accept better naming suggestions too :)

> > Source/WebKit/chromium/public/WebViewClient.h:320
> > +    // Scripted speech -----------------------------------------------------
> 
> does scripted speech really require a different section?  can we just extend
> the "Speech" section?

Done, putting it in the existing speech section.

> > Source/WebKit/chromium/public/WebViewClient.h:323
> > +    virtual WebSpeechRecognitionClient* speechRecognitionClient() { return 0; }
> 
> is the caller expected to delete this pointer when they are done with it?  i'm guessing
> the answer is NO since this function is not named "createFoo".  assuming the answer is
> NO, then WebSpeechRecognitionClient should have a protected destructor to prevent unwanted
> deletion through this pointer.

Nice catch! Done.
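
For reference, the pattern looks roughly like this (a minimal sketch with the method signatures elided, not the exact header):

  class WebSpeechRecognitionClient {
  public:
      virtual void start() { WEBKIT_ASSERT_NOT_REACHED(); }  // other methods elided

  protected:
      // Protected and non-virtual: WebViewClient::speechRecognitionClient()
      // returns this pointer without transferring ownership, so callers must
      // not be able to delete through it.
      ~WebSpeechRecognitionClient() { }
  };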

> > Source/WebKit/chromium/src/SpeechRecognitionClientProxy.cpp:50
> > +    if (!m_recognitionMap.contains(recognition)) {
> 
> usually we put ID maps like this on the chromium side?  does WebKit need
> this map?  it seems like it might be better to create an API wrapper for
> WebSpeechRecognition instead.
> 
> the WebKit API is supposed to be a thin layer between chromium and webcore.
> if chromium needs IDs for IPC purposes, then that is a chromium concern,
> not a webkit concern.

Yes, passing a pointer to a WebSpeechRecognition object rather than an id seems nicer, I'll do that.

I still need to keep track of the mapping between the SpeechRecognition objects and the wrappers, though. It's important that the same wrapper gets used, e.g., for a start() and a subsequent stop() call.
Comment 6 Hans Wennborg 2012-03-21 08:10:25 PDT
Created attachment 133043 [details]
Patch
Comment 7 WebKit Review Bot 2012-03-21 08:13:36 PDT
Please wait for approval from abarth@webkit.org, dglazkov@chromium.org, fishd@chromium.org, jamesr@chromium.org or tkent@chromium.org before submitting, as this patch contains changes to the Chromium public API. See also https://trac.webkit.org/wiki/ChromiumWebKitAPI.
Comment 8 Darin Fisher (:fishd, Google) 2012-03-22 10:00:27 PDT
Comment on attachment 133043 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=133043&action=review

> Source/WebKit/chromium/public/WebSpeechGrammar.h:45
> +    WEBKIT_EXPORT WebURL src() const;

I understand that this is just a reflection of a DOM attribute.  What is
the purpose of exposing this attribute?  Are you going to use it for
security decisions?  Or, will you do some resource loading externally to
WebKit?

> Source/WebKit/chromium/public/WebSpeechRecognition.h:43
> +#if WEBKIT_IMPLEMENTATION

nit: we usually put WEBKIT_IMPLEMENTATION sections at the end of the 'public' section.

> Source/WebKit/chromium/public/WebSpeechRecognition.h:47
> +    WEBKIT_EXPORT void audioStartCallback();

nit: how about giving these "event style" names?

  didStartAudio
  didStartSound
  didStartSpeech
  didEndSpeech
  didEndAudio
  ...

It is not quite clear to me how to rename resultCallback or
noMatchCallback or resultDeletedCallback.  Are those three
possible responses to something?

It would also be helpful to document some of the event flow here.
How is the embedder expected to call these methods?

My point is that I think you should do more than just cleave your
code into two pieces.  I think you should try make the API make
sense on its own.  Imagine someone is trying to hook this up to
a different embedder.  What do they need to know?  Please explain
with comments and helpful function names.

> Source/WebKit/chromium/public/WebSpeechRecognition.h:60
> +    WebCore::SpeechRecognition* m_speechRecognition;

are you sure this shouldn't be a WebPrivatePtr?  SpeechRecognition is
a reference counted class, right?

> Source/WebKit/chromium/public/WebSpeechRecognitionClient.h:40
> +    virtual void start(WebSpeechRecognition*, const WebVector<WebSpeechGrammar>&, WebString language, bool continuous) { WEBKIT_ASSERT_NOT_REACHED(); }

are you sure you want to pass WebSpeechRecognition by pointer and not by const ref?
WebSpeechRecognition could just be a smart pointer around SpeechRecognition.

nit: "language" should be passed as "const WebString&"

nit: are you sure this interface wouldn't be better named WebSpeechRecognitionController?  with methods like 'start', 'stop', and 'abort', it sure seems like this interface is more of a controller and less of a client.  A client is an interface that receives notifications, observes a system, updates UI, implements policy, etc.  A controller is an interface that takes action, drives a system, updates state, etc.

> Source/WebKit/chromium/src/SpeechRecognitionClientProxy.cpp:51
> +        m_recognitionMap.set(recognition, adoptPtr(new WebSpeechRecognition(recognition)));

I don't understand why you have a map here.  A WebSpeechRecognition should just be
a wrapper around a SpeechRecognition.  You should be able to map back-n-forth between
WebSpeechRecognition and SpeechRecognition just like we can map back-n-forth between
WebNode and Node.

To support an IDMap on the Chromium side, you can give WebSpeechRecognition an equals
and lessThan method, like we have for WebNode.  Also, you can define operator== and
operator<.  Then you can use it as a key in a map.  Would that address your concern?
Comment 9 Hans Wennborg 2012-03-23 07:40:24 PDT
(In reply to comment #8)
> (From update of attachment 133043 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=133043&action=review

Thanks for the review!

> > Source/WebKit/chromium/public/WebSpeechGrammar.h:45
> > +    WEBKIT_EXPORT WebURL src() const;
> 
> I understand that this is just a reflection of a DOM attribute.  What is
> the purpose of exposing this attribute?  Are you going to use it for
> security decisions?  Or, will you do some resource loading externally to
> WebKit?

The latter. We will pass this URL in our request to the speech server, which I suppose will then load it.

> > Source/WebKit/chromium/public/WebSpeechRecognition.h:43
> > +#if WEBKIT_IMPLEMENTATION
> 
> nit: we usually put WEBKIT_IMPLEMENTATION sections at the end of the 'public' section.

Done.

> > Source/WebKit/chromium/public/WebSpeechRecognition.h:47
> > +    WEBKIT_EXPORT void audioStartCallback();
> 
> nit: how about giving these "event style" names?
> 
>   didStartAudio
>   didStartSound
>   didStartSpeech
>   didEndSpeech
>   didEndAudio
>   ...
> 
> It is not quite clear to me how to rename resultCallback or
> noMatchCallback or resultDeletedCallback.  Are those three
> possible responses to something?

Yes, those three are called as results are streamed back from the speech server. (As per your next comment, I'll try to document this better.)

The flow is that the user calls start() on the SpeechRecognitionClient, which gets forwarded via the proxy to the embedder. The embedder then calls back via the WebSpeechRecognition interface: audioStartCallback(), speechStartCallback(), some resultCallback()s, etc. The callback methods correspond exactly to the "startaudio", "result", etc. events from the spec: http://speech-javascript-api-spec.googlecode.com/git/speechapi.html#speechreco-events

I agree that the function names aren't great. "didStartAudio", etc. sound better, but I don't know how "resultCallback" would be renamed to fit with that.
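
To spell out the flow with the did*-style names (a rough sketch; the names for the result callbacks are still open):

  // WebKit -> embedder:
  //   start(...)
  //
  // embedder -> WebKit, roughly in this order (mirroring the spec's events):
  //   didStartAudio()    - "audiostart": audio capture has begun
  //   didStartSound()    - "soundstart": some sound, possibly speech, detected
  //   didStartSpeech()   - "speechstart": speech detected
  //   (result callbacks) - "result" / "nomatch" / "resultdeleted", zero or
  //                        more, as results stream back from the server
  //   didEndSpeech()     - "speechend"
  //   didEndSound()      - "soundend"
  //   didEndAudio()      - "audioend"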

> It would also be helpful to document some of the event flow here.
> How is the embedder expected to call these methods?
> 
> My point is that I think you should do more than just cleave your
> code into two pieces.  I think you should try make the API make
> sense on its own.  Imagine someone is trying to hook this up to
> a different embedder.  What do they need to know?  Please explain
> with comments and helpful function names.

I've tried to add comments to make it more clear.

> > Source/WebKit/chromium/public/WebSpeechRecognition.h:60
> > +    WebCore::SpeechRecognition* m_speechRecognition;
> 
> are you sure this shouldn't be a WebPrivatePtr?  SpeechRecognition is
> a reference counted class, right?

Done.
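
The shape follows other WebPrivatePtr-based classes such as WebNode (a sketch assuming the usual conventions; reset() is defined in src/, where SpeechRecognition is a complete type):

  class WebSpeechRecognition {
  public:
      ~WebSpeechRecognition() { reset(); }

      WEBKIT_EXPORT void reset();  // drops the reference to the WebCore object

  #if WEBKIT_IMPLEMENTATION
      WebSpeechRecognition(const WTF::PassRefPtr<WebCore::SpeechRecognition>&);
  #endif

  private:
      WebPrivatePtr<WebCore::SpeechRecognition> m_private;
  };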

> > Source/WebKit/chromium/public/WebSpeechRecognitionClient.h:40
> > +    virtual void start(WebSpeechRecognition*, const WebVector<WebSpeechGrammar>&, WebString language, bool continuous) { WEBKIT_ASSERT_NOT_REACHED(); }
> 
> are you sure you want to pass WebSpeechRecognition by pointer and not by const ref?
> WebSpeechRecognition could just be a smart pointer around SpeechRecognition.

I was worried about how I would insert the WebRecognition objects in an IDMap Chromium-side, which is why I passed a pointer. As you pointed out below, this can be fixed, so I've changed the code to pass it by const ref.

> nit: "language" should be passed as "const WebString&"

Done.

> nit: are you sure this interface wouldn't be better named WebSpeechRecognitionController?  with methods like 'start', 'stop', and 'abort', it sure seems like this interface is more of a controller and less of a client.  A client is an interface that receives notifications, observes a system, updates UI, implements policy, etc.  A controller is an interface that takes action, drives a system, updates state, etc.

Hmm, that does sound sensible. The problem is we already have a Controller class. In WebCore, there is both a SpeechRecognitionController object and the SpeechRecognitionClient interface. This was done to squeeze the design into the Supplement pattern, similarly to UserMediaController and UserMediaClient (from http://trac.webkit.org/changeset/108437).

> > Source/WebKit/chromium/src/SpeechRecognitionClientProxy.cpp:51
> > +        m_recognitionMap.set(recognition, adoptPtr(new WebSpeechRecognition(recognition)));
> 
> I don't understand why you have a map here.  A WebSpeechRecognition should just be
> a wrapper around a SpeechRecognition.  You should be able to map back-n-forth between
> WebSpeechRecognition and SpeechRecognition just like we can map back-n-forth between
> WebNode and Node.
> 
> To support an IDMap on the Chromium side, you can give WebSpeechRecognition an equals
> and lessThan method, like we have for WebNode.  Also, you can define operator== and
> operator<.  Then you can use it as a key in a map.  Would that address your concern?

Yes, that addresses my concerns and makes everything much nicer :) Thanks for helping me get this straight.
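
Following the WebNode precedent, that would look something like this (sketch; the implementations just compare the wrapped pointers):

  class WebSpeechRecognition {
  public:
      WEBKIT_EXPORT bool equals(const WebSpeechRecognition&) const;
      WEBKIT_EXPORT bool lessThan(const WebSpeechRecognition&) const;
      // rest of the class elided
  };

  inline bool operator==(const WebSpeechRecognition& a, const WebSpeechRecognition& b)
  {
      return a.equals(b);
  }

  inline bool operator<(const WebSpeechRecognition& a, const WebSpeechRecognition& b)
  {
      return a.lessThan(b);
  }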
Comment 10 Hans Wennborg 2012-03-23 07:41:01 PDT
Created attachment 133484 [details]
Patch
Comment 11 Satish Sampath 2012-03-23 07:50:06 PDT
Comment on attachment 133043 [details]
Patch

> I agree that the function names aren't great. "didStartAudio", etc. sound better, but I don't know how "resultCallback" would be renamed to fit with that.

In this case didReceiveResult, didReceiveNoMatch, didDeletePreviousResult and didReceiveError sound good to me.
I think there would be a bit of confusion between didStart, didStartAudio, didStartSound and didStartSpeech for a first-time viewer, so documenting them with a call flow and a URL to the spec would be useful.
Comment 12 Darin Fisher (:fishd, Google) 2012-03-23 17:32:58 PDT
(In reply to comment #11)
> (From update of attachment 133043 [details])
> > I agree that the function names aren't great. "didStartAudio", etc. sound better, but I don't know how "resultCallback" would be renamed to fit with that.
> 
> In this case didReceiveResult, didReceiveNoMatch, didDeletePreviousResult and didReceiveError sound good to me.
> I think there would be a bit of confusion between didStart, didStartAudio, didStartSound and didStartSpeech for a first-time viewer, so documenting them with a call flow and a URL to the spec would be useful.

^^^ I like Satish's suggestions.
Comment 13 Hans Wennborg 2012-03-26 07:44:12 PDT
(In reply to comment #11)
> (From update of attachment 133043 [details])
> > I agree that the function names aren't great. "didStartAudio", etc. sound better, but I don't know how "resultCallback" would be renamed to fit with that.
> 
> In this case didReceiveResult, didReceiveNoMatch, didDeletePreviousResult and didReceiveError sound good to me.
Done.

> I think there would be a bit of confusion between didStart, didStartAudio, didStartSound and didStartSpeech for a first-time viewer, so documenting them with a call flow and a URL to the spec would be useful.
Done.


(In reply to comment #12)
> ^^^ I like Satish's suggestions.
Done :)
Comment 14 Hans Wennborg 2012-03-26 07:44:39 PDT
Created attachment 133811 [details]
Patch
Comment 15 Darin Fisher (:fishd, Google) 2012-03-28 10:20:12 PDT
Comment on attachment 133811 [details]
Patch

Sorry for the high latency in getting feedback to you.  I think there are some
pretty considerable design problems in this API that should be resolved.


View in context: https://bugs.webkit.org/attachment.cgi?id=133811&action=review

> Source/WebKit/chromium/public/WebSpeechRecognition.h:42
> +// Wrapper around a SpeechRecognition pointer.

nit: this comment is not very helpful to WebKit API users.  the point of the
API is to hide WebCore details.  it would be better to have a comment that
explains what a WebSpeechRecognition object is.  it seems to be an object that
represents a "notification sink" for events related to a speech recognition request.
i think the class is probably poorly named since its name doesn't really match up
with its function.

> Source/WebKit/chromium/public/WebSpeechRecognition.h:107
> +    WEBKIT_EXPORT void didReceiveError(const WebString& msg, unsigned short code);

nit: spell it out... msg -> message

> Source/WebKit/chromium/public/WebSpeechRecognitionClient.h:38
> +class WebSpeechRecognitionClient {

i'm still not sure why this is named with a Client suffix.  its methods are clearly
about controlling something...

> Source/WebKit/chromium/public/WebSpeechRecognitionClient.h:49
> +    virtual void notifyVisibilityHidden() { WEBKIT_ASSERT_NOT_REACHED(); }

i don't understand why this method exists.  the embedder already knows when the
page is hidden.  after all, it is the embedder who communicates that to WebKit
via WebView::setVisibilityState.  Why does the embedder need WebKit to tell it
what it already knows?

perhaps you are just trying to add a method to tell the embedder to abort any
ongoing speech recognition?  in that case, the method name should be specific
to that.
Comment 16 Hans Wennborg 2012-03-29 07:26:54 PDT
(In reply to comment #15)
> (From update of attachment 133811 [details])
> Sorry for the high latency in getting feedback to you.  I think there are some
> pretty considerable design problems in this API that should be resolved.

Thanks for the comments. I've tried to rework the patch so that the API makes sense on its own, rather than just wrapping WebCore objects and taking names from them.

The two most interesting classes in the API are now:

 - WebSpeechRecognizer: the interface used for starting and stopping recognition,
   which should be implemented by the embedder. (This could be renamed to
   WebSpeechRecognitionController if you prefer.)

 - WebSpeechRecognitionClient: a client that is passed as an argument to the
   start, stop, and abort functions; used for reporting on progress.

Hopefully this makes things more clear.


> View in context: https://bugs.webkit.org/attachment.cgi?id=133811&action=review
> 
> > Source/WebKit/chromium/public/WebSpeechRecognition.h:42
> > +// Wrapper around a SpeechRecognition pointer.
> 
> nit: this comment is not very helpful to WebKit API users.  the point of the
> API is to hide WebCore details.  it would be better to have a comment that
> explains what a WebSpeechRecognition object is.  it seems to be an object that
> represents a "notification sink" for events related to a speech recognition request.
> i think the class is probably poorly named since its name doesn't really match up
> with its function.
This class has been renamed to WebSpeechRecognitionClient.
 
> > Source/WebKit/chromium/public/WebSpeechRecognition.h:107
> > +    WEBKIT_EXPORT void didReceiveError(const WebString& msg, unsigned short code);
> 
> nit: spell it out... msg -> message
Done.

> > Source/WebKit/chromium/public/WebSpeechRecognitionClient.h:38
> > +class WebSpeechRecognitionClient {
> 
> i'm still not sure why this is named with a Client suffix.  its methods are clearly
> about controlling something...
It's now WebSpeechRecognizer.

> > Source/WebKit/chromium/public/WebSpeechRecognitionClient.h:49
> > +    virtual void notifyVisibilityHidden() { WEBKIT_ASSERT_NOT_REACHED(); }
> 
> i don't understand why this method exists.  the embedder already knows when the
> page is hidden.  after all, it is the embedder who communicates that to WebKit
> via WebView::setVisibilityState.  Why does the embedder need WebKit to tell it
> what it already knows?
You're right, this can be handled in the embedder. Removing the function.
Comment 17 Hans Wennborg 2012-03-29 07:27:24 PDT
Created attachment 134578 [details]
Patch
Comment 18 Darin Fisher (:fishd, Google) 2012-03-29 11:42:11 PDT
Comment on attachment 134578 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=134578&action=review

> Source/WebKit/chromium/public/WebSpeechRecognitionClient.h:60
> +    WEBKIT_EXPORT bool equals(const WebSpeechRecognitionClient&) const;

I think it probably makes more sense for this to be an interface and for the
recognizer to be instantiated each time you need to start a speech recognition
task.

> Source/WebKit/chromium/public/WebSpeechRecognitionParameters.h:39
> +    WebSpeechRecognitionParameters(const WebVector<WebSpeechGrammar>&, const WebString& language, bool continuous);

nit: elsewhere, we use a suffix of "Params"

> Source/WebKit/chromium/public/WebSpeechRecognitionParameters.h:41
> +    WEBKIT_EXPORT const WebVector<WebSpeechGrammar>& grammars() const;

nit: here you can just define these as inline getters since they just simply
return member variables.

> Source/WebKit/chromium/public/WebViewClient.h:321
> +    virtual WebSpeechRecognizer* speechRecognizer() { return 0; }

I really like the WebSpeechRecognizer name!  However, maybe this should return
an instance and be named createSpeechRecognizer?

Maybe WebSpeechRecognitionClient should be named WebSpeechRecognizerClient, and
it should also be an interface, which gets passed to createSpeechRecognizer.
Comment 19 Hans Wennborg 2012-03-30 07:11:11 PDT
(In reply to comment #18)
> (From update of attachment 134578 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=134578&action=review
> 
> > Source/WebKit/chromium/public/WebSpeechRecognitionClient.h:60
> > +    WEBKIT_EXPORT bool equals(const WebSpeechRecognitionClient&) const;
> 
> I think it probably makes more sense for this to be an interface and for the
> recognizer to be instantiated each time you need to start a speech recognition
> task.

Hmm, I'm not sure about this one. If we instantiate a new recognizer each time we start a recognition, we need a mapping from SpeechRecognition objects to WebSpeechRecognizer objects, so that a stop() call can be propagated to the correct recognizer.

That brings us back to the state we had before, with SpeechRecognitionClientProxy having a HashMap member and logic to maintain the mapping. It would also need to manage the lifetime of the recognizers and destroy them at the right moment.

It seems to me that this would bring back some of the problems from previous versions of this patch. The current approach means WebKit doesn't have to worry about any mappings or lifetime issues; that responsibility shifts to the embedder, where it can be handled easily, since the embedder knows exactly how long it needs to keep things alive (until it has sent the last event).

> > Source/WebKit/chromium/public/WebSpeechRecognitionParameters.h:39
> > +    WebSpeechRecognitionParameters(const WebVector<WebSpeechGrammar>&, const WebString& language, bool continuous);
> 
> nit: elsewhere, we use a suffix of "Params"
Done.

> > Source/WebKit/chromium/public/WebSpeechRecognitionParameters.h:41
> > +    WEBKIT_EXPORT const WebVector<WebSpeechGrammar>& grammars() const;
> 
> nit: here you can just define these as inline getters since they just simply
> return member variables.
Done.
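
I.e., along these lines (member names assumed):

  class WebSpeechRecognitionParams {
  public:
      const WebVector<WebSpeechGrammar>& grammars() const { return m_grammars; }
      const WebString& language() const { return m_language; }
      bool continuous() const { return m_continuous; }

  private:
      WebVector<WebSpeechGrammar> m_grammars;
      WebString m_language;
      bool m_continuous;
  };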

> > Source/WebKit/chromium/public/WebViewClient.h:321
> > +    virtual WebSpeechRecognizer* speechRecognizer() { return 0; }
> 
> I really like the WebSpeechRecognizer name!
So do I :)

> However, maybe this should return
> an instance and be named createSpeechRecognizer?
I think creating multiple instances of WebSpeechRecognizer would complicate things, as mentioned above.

> Maybe WebSpeechRecognitionClient should be named WebSpeechRecognizerClient
Done.

> , and
> it should also be an interface, which gets passed to createSpeechRecognizer.
I'm not sure about this one, as mentioned above.
Comment 20 Hans Wennborg 2012-03-30 07:11:44 PDT
Created attachment 134803 [details]
Patch
Comment 21 Darin Fisher (:fishd, Google) 2012-03-30 10:03:45 PDT
(In reply to comment #19)
> (In reply to comment #18)
> > (From update of attachment 134578 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=134578&action=review
> > 
> > > Source/WebKit/chromium/public/WebSpeechRecognitionClient.h:60
> > > +    WEBKIT_EXPORT bool equals(const WebSpeechRecognitionClient&) const;
> > 
> > I think it probably makes more sense for this to be an interface and for the
> > recognizer to be instantiated each time you need to start a speech recognition
> > task.
> 
> Hmm, I'm not sure about this one. If we instantiate a new recognizer each time we start a recognition, we need a mapping from SpeechRecognition objects to WebSpeechRecognizer objects, so that a stop() call can be propagated to the correct recognizer.
> 
> That brings us back to the state we had before, with SpeechRecognitionClientProxy having a HashMap member and logic to maintain the mapping. It would also need to manage the lifetime of the recognizers and destroy them at the right moment.
> 
> It seems to me that this would bring back some of the problems from previous versions of this patch. The current approach means WebKit doesn't have to worry about any mappings or lifetime issues; that responsibility shifts to the embedder, where it can be handled easily, since the embedder knows exactly how long it needs to keep things alive (until it has sent the last event).

OK, then how about defining WebSpeechRecognizer and WebSpeechRecognizerClient as interfaces, but keeping WebSpeechRecognizer as a singleton, and then introducing WebSpeechRecognition as a wrapper for SpeechRecognition?  WebSpeechRecognition would just serve as a handle to identify a recognition task.  It would not have any other state or methods.  I think WebSpeechRecognizerClient could be implemented by SpeechRecognitionClientProxy.
Comment 22 Hans Wennborg 2012-04-02 06:23:29 PDT
(In reply to comment #21)
> OK, then how about defining WebSpeechRecognizer and WebSpeechRecognizerClient as interfaces, but keeping WebSpeechRecognizer as a singleton, and then introducing WebSpeechRecognition as a wrapper for SpeechRecognition?  WebSpeechRecognition would just serve as a handle to identify a recognition task.  It would not have any other state or methods.  I think WebSpeechRecognizerClient could be implemented by SpeechRecognitionClientProxy.

Yes, this sounds like a good approach! Patch coming up.
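
Roughly, the recognizer side would look like this (a sketch, not the final header; the stop/abort signatures are assumed analogous to start):

  // A WebSpeechRecognitionHandle identifies one recognition task; it is
  // passed to every recognizer call and echoed back in every client callback.
  class WebSpeechRecognizer {
  public:
      virtual void start(WebSpeechRecognitionHandle, const WebSpeechRecognitionParams&, WebSpeechRecognizerClient*) { WEBKIT_ASSERT_NOT_REACHED(); }
      virtual void stop(WebSpeechRecognitionHandle, WebSpeechRecognizerClient*) { WEBKIT_ASSERT_NOT_REACHED(); }
      virtual void abort(WebSpeechRecognitionHandle, WebSpeechRecognizerClient*) { WEBKIT_ASSERT_NOT_REACHED(); }

  protected:
      ~WebSpeechRecognizer() { }  // not owned through this interface
  };

SpeechRecognitionClientProxy would implement WebSpeechRecognizerClient and unwrap the handle back to the underlying recognition task when delivering events.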
Comment 23 Hans Wennborg 2012-04-02 06:24:01 PDT
Created attachment 135084 [details]
Patch
Comment 24 Darin Fisher (:fishd, Google) 2012-04-02 09:56:33 PDT
Comment on attachment 135084 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=135084&action=review

Interface structure looks pretty good.  Just some minor nits now.

> Source/WebKit/chromium/public/WebSpeechRecognitionHandle.h:41
> +// A handle to a SpeechRecognition object. As long as there is a handle, the

nit: avoid making the user think about WebCore objects.  mentioning the SpeechRecognition
type in this comment breaks that rule.  I would just drop the first two sentences here,
and then I would change the third sentence to begin as "WebSpeechRecognitionHandle is
used by the WebSpeechRecognizer..."

> Source/WebKit/chromium/public/WebSpeechRecognitionParams.h:39
> +    WebSpeechRecognitionParams(const WebVector<WebSpeechGrammar>&, const WebString& language, bool continuous);

nit: implement the constructor inline or else it needs to be tagged with WEBKIT_EXPORT.  usually we just implement inline.

> Source/WebKit/chromium/public/WebSpeechRecognizer.h:41
> +    virtual void start(WebSpeechRecognitionHandle, const WebSpeechRecognitionParams&, WebSpeechRecognizerClient*) { WEBKIT_ASSERT_NOT_REACHED(); }

nit: pass handles as "const WebSpeechRecognitionHandle&" ... you should see other classes that are implemented using WebPrivatePtr being passed as const ref.

> Source/WebKit/chromium/public/WebSpeechRecognizerClient.h:44
> +    virtual void didStartAudio(WebSpeechRecognitionHandle&) = 0;

nit: pass handles as "const WebSpeechRecognitionHandle&"

> Source/WebKit/chromium/src/SpeechRecognitionClientProxy.cpp:76
> +    RefPtr<SpeechRecognition> recognition = static_cast<PassRefPtr<SpeechRecognition> >(handle);

the explicit cast to PassRefPtr<SpeechRecognition> should become unnecessary once you
change the argument type to "const WebSpeechRecognitionHandle&" as the conversion
operator defined on WebSpeechRecognitionHandle will then be used implicitly.
Comment 25 Hans Wennborg 2012-04-03 05:56:05 PDT
(In reply to comment #24)
> (From update of attachment 135084 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=135084&action=review
> 
> Interface structure looks pretty good.  Just some minor nits now.
> 
> > Source/WebKit/chromium/public/WebSpeechRecognitionHandle.h:41
> > +// A handle to a SpeechRecognition object. As long as there is a handle, the
> 
> nit: avoid making the user think about WebCore objects.  mentioning the SpeechRecognition
> type in this comment breaks that rule.  I would just drop the first two sentences here,
> and then I would change the third sentence to begin as "WebSpeechRecognitionHandle is
> used by the WebSpeechRecognizer..."
Done.

> > Source/WebKit/chromium/public/WebSpeechRecognitionParams.h:39
> > +    WebSpeechRecognitionParams(const WebVector<WebSpeechGrammar>&, const WebString& language, bool continuous);
> 
> nit: implement the constructor inline or else it needs to be tagged with WEBKIT_EXPORT.  usually we just implement inline.
Done.

> > Source/WebKit/chromium/public/WebSpeechRecognizer.h:41
> > +    virtual void start(WebSpeechRecognitionHandle, const WebSpeechRecognitionParams&, WebSpeechRecognizerClient*) { WEBKIT_ASSERT_NOT_REACHED(); }
> 
> nit: pass handles as "const WebSpeechRecognitionHandle&" ... you should see other classes that are implemented using WebPrivatePtr being passed as const ref.
Done.

> > Source/WebKit/chromium/public/WebSpeechRecognizerClient.h:44
> > +    virtual void didStartAudio(WebSpeechRecognitionHandle&) = 0;
> 
> nit: pass handles as "const WebSpeechRecognitionHandle&"
Done.

> > Source/WebKit/chromium/src/SpeechRecognitionClientProxy.cpp:76
> > +    RefPtr<SpeechRecognition> recognition = static_cast<PassRefPtr<SpeechRecognition> >(handle);
> 
> the explicit cast to PassRefPtr<SpeechRecognition> should become unnecessary once you
> change the argument type to "const WebSpeechRecognitionHandle&" as the conversion
> operator defined on WebSpeechRecognitionHandle will then be used implicitly.

Hmm, I still can't do RefPtr<SpeechRecognition> recognition = handle;

RefPtr<T> doesn't have a ctor that takes a PassRefPtr<T>.
What it does have is a ctor template that takes a PassRefPtr<U>, but since handle isn't a PassRefPtr in itself, the compiler can't do template argument deduction for U.

I could do this:

  RefPtr<SpeechRecognition> recognition;
  recognition = handle;
  recognition->didReceiveResult(...);

Which might be less ugly, but feels more weird.

I could also use the PassRefPtr directly:

  PassRefPtr<SpeechRecognition> recognition = handle;
  recognition->didReceiveResult(...);

But that fails on the "Local variables should never be PassRefPtr" rule.

This would be another solution:

  static PassRefPtr<SpeechRecognition> unwrapHandle(const WebSpeechRecognitionHandle& handle)
  {
      return handle;
  }

  void SpeechRecognitionClientProxy::didStartAudio(const WebSpeechRecognitionHandle& handle)
  {
      RefPtr<SpeechRecognition> recognition = unwrapHandle(handle);
      recognition->didStartAudio();
  }

Still a bit weird, but it avoids the static_cast.

I'm not sure what the best way is here, so I've kept the static_casts in for now.
Comment 26 Hans Wennborg 2012-04-03 05:56:47 PDT
Created attachment 135316 [details]
Patch
Comment 27 Darin Fisher (:fishd, Google) 2012-04-03 09:01:19 PDT
Comment on attachment 135316 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=135316&action=review

> Source/WebKit/chromium/public/WebSpeechRecognizerClient.h:39
> +public:

this interface should probably have a protected destructor to ensure
that the WebSpeechRecognizer implementation does not try to delete
its client.

> Source/WebKit/chromium/src/SpeechRecognitionClientProxy.cpp:106
> +    RefPtr<SpeechRecognition> recognition = static_cast<PassRefPtr<SpeechRecognition> >(handle);

OK... looking around the codebase, the more common pattern in cases like this is to do:

  RefPtr<SpeechRecognition> recognition = PassRefPtr<SpeechRecognition>(handle);

^^^ slightly less verbosity

It seems like it would be cool if we had a global unwrap function that somehow
did the right thing here.

  unwrap<SpeechRecognition>(handle)->didEndAudio()

But, such a change would probably be a bigger effort.  I'd say it seems out-of-scope
for this patch.
Comment 28 Hans Wennborg 2012-04-04 00:52:06 PDT
(In reply to comment #27)
> (From update of attachment 135316 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=135316&action=review
> 
> > Source/WebKit/chromium/public/WebSpeechRecognizerClient.h:39
> > +public:
> 
> this interface should probably have a protected destructor to ensure
> that the WebSpeechRecognizer implementation does not try to delete
> its client.
Done.

> > Source/WebKit/chromium/src/SpeechRecognitionClientProxy.cpp:106
> > +    RefPtr<SpeechRecognition> recognition = static_cast<PassRefPtr<SpeechRecognition> >(handle);
> 
> OK... looking around the codebase, the more common pattern in cases like this is to do:
> 
>   RefPtr<SpeechRecognition> recognition = PassRefPtr<SpeechRecognition>(handle);
> 
> ^^^ slightly less verbosity
Ah, yes that's nicer. Done.

> It seems like it would be cool if we had a global unwrap function that somehow
> did the right thing here.
> 
>   unwrap<SpeechRecognition>(handle)->didEndAudio()
> 
> But, such a change would probably be a bigger effort.  I'd say it seems out-of-scope
> for this patch.
Yeah that sounds like it might be a good idea. I'll have a think about it.
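
For the record, one possible shape for such a helper (hypothetical, not part of this patch), leaning on the handle's conversion operator:

  template <typename T, typename Handle>
  PassRefPtr<T> unwrap(const Handle& handle)
  {
      // Uses Handle's "operator PassRefPtr<T>()"; the returned temporary
      // keeps the object alive for the duration of the full expression.
      return handle;
  }

  // Usage, e.g. in SpeechRecognitionClientProxy:
  //   unwrap<SpeechRecognition>(handle)->didEndAudio();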
Comment 29 Hans Wennborg 2012-04-04 01:37:36 PDT
Committed r113149: <http://trac.webkit.org/changeset/113149>
Comment 30 Hans Wennborg 2012-04-04 03:57:06 PDT
Committed r113164: <http://trac.webkit.org/changeset/113164>