Bug 47089 - Language tag in speech for HTML input elements
Summary: Language tag in speech for HTML input elements
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Forms (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Leandro Graciá Gil
URL:
Keywords:
Depends on: 47420 47989 48126
Blocks: 39485
  Show dependency treegraph
 
Reported: 2010-10-04 08:25 PDT by Leandro Graciá Gil
Modified: 2012-09-19 07:16 PDT (History)
10 users (show)

See Also:


Attachments
Patch (20.78 KB, patch)
2010-10-06 03:38 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (31.85 KB, patch)
2010-10-06 04:56 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (33.01 KB, patch)
2010-10-06 07:19 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (33.77 KB, patch)
2010-10-06 07:58 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (34.32 KB, patch)
2010-10-07 12:30 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (35.06 KB, patch)
2010-10-08 06:58 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (34.69 KB, patch)
2010-10-08 07:20 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (34.87 KB, patch)
2010-10-08 08:49 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (34.88 KB, patch)
2010-10-08 09:16 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (35.68 KB, patch)
2010-10-13 05:35 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (35.65 KB, patch)
2010-10-13 06:54 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (35.69 KB, patch)
2010-10-13 07:59 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (35.93 KB, patch)
2010-10-15 08:05 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (35.94 KB, patch)
2010-10-19 06:54 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (38.72 KB, patch)
2010-10-20 09:35 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (45.87 KB, patch)
2010-10-21 14:05 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (45.96 KB, patch)
2010-10-22 03:08 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (46.88 KB, patch)
2010-10-22 09:34 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (46.05 KB, patch)
2010-10-27 09:50 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (39.71 KB, patch)
2010-10-28 10:50 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff
Patch (39.67 KB, patch)
2010-10-29 04:49 PDT, Leandro Graciá Gil
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Leandro Graciá Gil 2010-10-04 08:25:01 PDT
The current implementation of the speech tags in HTML does not consider the language of the input element, its ancestor elements or the document. The objective of this bug is to patch the implementation to get the contents of the nearest 'lang' attribute, perform a brief verification and include them as a parameter in the startRecognition methods.

Related bug: https://bugs.webkit.org/show_bug.cgi?id=39485
Comment 1 Leandro Graciá Gil 2010-10-06 03:38:18 PDT
Created attachment 69914 [details]
Patch
Comment 2 Leandro Graciá Gil 2010-10-06 04:56:20 PDT
Created attachment 69925 [details]
Patch
Comment 3 WebKit Review Bot 2010-10-06 05:06:34 PDT
Attachment 69925 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/4219107
Comment 4 Leandro Graciá Gil 2010-10-06 07:19:51 PDT
Created attachment 69934 [details]
Patch
Comment 5 WebKit Review Bot 2010-10-06 07:27:15 PDT
Attachment 69934 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/4202096
Comment 6 Leandro Graciá Gil 2010-10-06 07:58:39 PDT
Created attachment 69940 [details]
Patch
Comment 7 mitz 2010-10-06 09:16:00 PDT
Comment on attachment 69940 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=69940&action=review

> WebCore/page/SpeechInput.cpp:96
> +bool SpeechInput::startRecognition(int listenerId, const CString& languageTag, const IntRect& elementRect)

Why are you using a CString for the language identifier?
Comment 8 Leandro Graciá Gil 2010-10-06 10:05:09 PDT
(In reply to comment #7)
> (From update of attachment 69940 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=69940&action=review
> 
> > WebCore/page/SpeechInput.cpp:96
> > +bool SpeechInput::startRecognition(int listenerId, const CString& languageTag, const IntRect& elementRect)
> 
> Why are you using a CString for the language identifier?

Because the expected input is a valid BCP47 language tag: http://tools.ietf.org/rfc/bcp/bcp47.txt. The spec includes a grammar for the valid ones but all of them are composed of alphanumeric ASCII characters and the '-' symbol. Any tag not passing this simple test is considered invalid and the languageTag variable becomes an empty string. Note that a full language tag parsing for validation is not intended here.

So, since any valid content can be safely represented with a 8-bit ASCII string (even with a 7-bit one) there's absolutely no need to use any wide-char strings for this. It's just a waste of memory.
Comment 9 Darin Fisher (:fishd, Google) 2010-10-06 10:28:35 PDT
Comment on attachment 69940 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=69940&action=review

> WebKit/chromium/public/WebSpeechInputController.h:47
> +    virtual bool startRecognition(int requestId, const WebCString&, const WebRect&)

this WebCString parameter needs a name.  you should give parameters a name
unless the name you would pick is redundant with the type name.
Comment 10 Jeremy Orlow 2010-10-06 10:56:29 PDT
Comment on attachment 69940 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=69940&action=review

A few initial thoughts...

> LayoutTests/fast/speech/input-text-language-tag.html:15
> +function onChange_none() {

this is not webkit style..fix here and elsewhere

>>> WebCore/page/SpeechInput.cpp:96
>>> +bool SpeechInput::startRecognition(int listenerId, const CString& languageTag, const IntRect& elementRect)
>> 
>> Why are you using a CString for the language identifier?
> 
> Because the expected input is a valid BCP47 language tag: http://tools.ietf.org/rfc/bcp/bcp47.txt. The spec includes a grammar for the valid ones but all of them are composed of alphanumeric ASCII characters and the '-' symbol. Any tag not passing this simple test is considered invalid and the languageTag variable becomes an empty string. Note that a full language tag parsing for validation is not intended here.
> 
> So, since any valid content can be safely represented with a 8-bit ASCII string (even with a 7-bit one) there's absolutely no need to use any wide-char strings for this. It's just a waste of memory.

For simplicity we generally stick with strings (UTF16) unless there's a good reason not to.  The memory saving his are negligible, so we should probably not be using a cstring here.

> WebCore/rendering/TextControlInnerElements.cpp:103
> +    

get rid of white space

> WebCore/rendering/TextControlInnerElements.cpp:286
> + 

get rid of white space

> WebKit/chromium/public/WebSpeechInputControllerMock.h:47
> +    virtual void setMockRecognitionResult(const WebString& result, const WebString& languageTag) = 0;

We don't do pure virtual methods in the WebKit API anymore.  Please make this (and others) have a WEBKIT_ASSERT_NOT_REACHED() default method body.

> WebKit/chromium/src/WebSpeechInputControllerMockImpl.h:72
> +    // TODO(leandrogracia): delete this temporary fix. Required to launch a two-sided patch.

This is not Webkit style.  Use FIXME.

> WebKit/chromium/src/WebViewImpl.cpp:279
> +    , m_speechInputClient(new SpeechInputClientImpl(client))

SpeechInputClientImpl should have a create factory method that returns a passownptr
Comment 11 Jeremy Orlow 2010-10-06 10:56:56 PDT
Satish, can you please do an informal review of this?
Comment 12 Satish Sampath 2010-10-07 02:55:01 PDT
Comment on attachment 69940 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=69940&action=review

> LayoutTests/fast/speech/input-text-language-tag.html:48
> +    var input_none = document.createElement('input');

Can you declare these input elements in html instead of having all this JS code to create and initialize them?

> LayoutTests/fast/speech/input-text-language-tag.html:50
> +    input_none.speech = 'speech';

This will no longer work, I renamed this attribute to webkitSpeech recently. So this will become:
  input.webkitSpeech = 'webkitSpeech'
Same for all other places in this file. And if you take the above suggestion and declare these in html, then use
  <input x-webkit-speech>

> LayoutTests/fast/speech/input-text-speechbutton.html:35
> +        layoutTestController.setMockSpeechInputResult('Pictures of the moon', 'en');

Is 'en' required here or can we pass an empty string? On a related note, does the c++ code get 'en' as the language if the web page did not specify any language at the document or node level?

> WebCore/ChangeLog:5
> +        Patch the current speech HTML tag implementation to use and validate

nit: replace 'speech HTML tag' with 'speech input' since speech input has not introduced any new HTML tags, instead it tries to work with existing ones.

>>>> WebCore/page/SpeechInput.cpp:96
>>>> +bool SpeechInput::startRecognition(int listenerId, const CString& languageTag, const IntRect& elementRect)
>>> 
>>> Why are you using a CString for the language identifier?
>> 
>> Because the expected input is a valid BCP47 language tag: http://tools.ietf.org/rfc/bcp/bcp47.txt. The spec includes a grammar for the valid ones but all of them are composed of alphanumeric ASCII characters and the '-' symbol. Any tag not passing this simple test is considered invalid and the languageTag variable becomes an empty string. Note that a full language tag parsing for validation is not intended here.
>> 
>> So, since any valid content can be safely represented with a 8-bit ASCII string (even with a 7-bit one) there's absolutely no need to use any wide-char strings for this. It's just a waste of memory.
> 
> For simplicity we generally stick with strings (UTF16) unless there's a good reason not to.  The memory saving his are negligible, so we should probably not be using a cstring here.

I agree, keep it simple and use a String

> WebCore/platform/mock/SpeechInputClientMock.cpp:53
> +bool SpeechInputClientMock::startRecognition(int requestId, const CString& languageTag, const IntRect&)

I suggest renaming 'languageTag' to 'language', because though BCP47 calls them as language tags it is confusing in the context of WebCore whether these are html tags or something else. If you agree, could also rename all other occurances of this name

> WebCore/rendering/TextControlInnerElements.cpp:464
> +CString InputFieldSpeechButtonElement::validatedInheritedLanguage(HTMLInputElement *input)

I think a better place for validating this would be inside the Element::computeInheritedLanguage() method you are calling below. That will allow all callers of that method to benefit from this validation instead of just speech input.

>> WebKit/chromium/public/WebSpeechInputController.h:47
>> +    virtual bool startRecognition(int requestId, const WebCString&, const WebRect&)
> 
> this WebCString parameter needs a name.  you should give parameters a name
> unless the name you would pick is redundant with the type name.

Can you also name the WebRect to 'element_rect' since that is not obvious as well?

>> WebKit/chromium/public/WebSpeechInputControllerMock.h:47
>> +    virtual void setMockRecognitionResult(const WebString& result, const WebString& languageTag) = 0;
> 
> We don't do pure virtual methods in the WebKit API anymore.  Please make this (and others) have a WEBKIT_ASSERT_NOT_REACHED() default method body.

I think pure virtual is used for all public APIs called by the embedder, and WEBKIT_ASSERT_NOT_REACHED() is used for all public APIs implemented by the embedder (and called by webkit). In that context, this interface is called by the embedder hence pure virtual is matching the style.

> WebKit/chromium/src/WebSpeechInputControllerMockImpl.cpp:59
>  void WebSpeechInputControllerMockImpl::setMockRecognitionResult(const WebString& result)

Is this old method still required?

>> WebKit/chromium/src/WebViewImpl.cpp:279
>> +    , m_speechInputClient(new SpeechInputClientImpl(client))
> 
> SpeechInputClientImpl should have a create factory method that returns a passownptr

For the reason behind this suggestion, please see https://lists.webkit.org/pipermail/webkit-dev/2010-August/014006.html
Comment 13 Leandro Graciá Gil 2010-10-07 04:07:07 PDT
(In reply to comment #12)
> > WebKit/chromium/src/WebSpeechInputControllerMockImpl.cpp:59
> >  void WebSpeechInputControllerMockImpl::setMockRecognitionResult(const WebString& result)
> 
> Is this old method still required?

I'm afraid that for the moment it is. It breaks chromium's test_shell otherwise. I'll remove this in a new patch once the chromium-side one has been applied.
Comment 14 Leandro Graciá Gil 2010-10-07 04:55:07 PDT
(In reply to comment #12)
> (From update of attachment 69940 [details])
> > LayoutTests/fast/speech/input-text-speechbutton.html:35
> > +        layoutTestController.setMockSpeechInputResult('Pictures of the moon', 'en');
> 
> Is 'en' required here or can we pass an empty string? On a related note, does the c++ code get 'en' as the language if the web page did not specify any language at the document or node level?

In this case we should pass 'en' there since the empty string is left for the invalid and unspecified language tag cases, which have their own tests. The c++ code in this patch won't get any language if none is specified. This will be done in the chromium side of the patch, where the application locale will be used when receiving an empty language tag string.
Comment 15 Leandro Graciá Gil 2010-10-07 12:30:30 PDT
Created attachment 70128 [details]
Patch
Comment 16 Leandro Graciá Gil 2010-10-07 12:39:13 PDT
Comment on attachment 69940 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=69940&action=review

>> LayoutTests/fast/speech/input-text-language-tag.html:15
>> +function onChange_none() {
> 
> this is not webkit style..fix here and elsewhere

Done.

>> LayoutTests/fast/speech/input-text-language-tag.html:48
>> +    var input_none = document.createElement('input');
> 
> Can you declare these input elements in html instead of having all this JS code to create and initialize them?

Done.

>> LayoutTests/fast/speech/input-text-language-tag.html:50
>> +    input_none.speech = 'speech';
> 
> This will no longer work, I renamed this attribute to webkitSpeech recently. So this will become:
>   input.webkitSpeech = 'webkitSpeech'
> Same for all other places in this file. And if you take the above suggestion and declare these in html, then use
>   <input x-webkit-speech>

Done.

>>> LayoutTests/fast/speech/input-text-speechbutton.html:35
>>> +        layoutTestController.setMockSpeechInputResult('Pictures of the moon', 'en');
>> 
>> Is 'en' required here or can we pass an empty string? On a related note, does the c++ code get 'en' as the language if the web page did not specify any language at the document or node level?
> 
> In this case we should pass 'en' there since the empty string is left for the invalid and unspecified language tag cases, which have their own tests. The c++ code in this patch won't get any language if none is specified. This will be done in the chromium side of the patch, where the application locale will be used when receiving an empty language tag string.

Sorry, my mistake. For some reason I though you where talking about the language tag layout test. You're right, in this case the most logical choice would be to pass an empty language string as no language is set in the test.

However I found today some segmentation fault bugs caused by the HashMap handling these results when the language was an empty string. Because of this, now any call to setMockSpeechInputResult with an empty language string is ignored. In the new patch I have set the appropriate language tags in this and other tests using 'en' as language (could be any valid one).

>> WebCore/ChangeLog:5
>> +        Patch the current speech HTML tag implementation to use and validate
> 
> nit: replace 'speech HTML tag' with 'speech input' since speech input has not introduced any new HTML tags, instead it tries to work with existing ones.

Done.

>>>>> WebCore/page/SpeechInput.cpp:96
>>>>> +bool SpeechInput::startRecognition(int listenerId, const CString& languageTag, const IntRect& elementRect)
>>>> 
>>>> Why are you using a CString for the language identifier?
>>> 
>>> Because the expected input is a valid BCP47 language tag: http://tools.ietf.org/rfc/bcp/bcp47.txt. The spec includes a grammar for the valid ones but all of them are composed of alphanumeric ASCII characters and the '-' symbol. Any tag not passing this simple test is considered invalid and the languageTag variable becomes an empty string. Note that a full language tag parsing for validation is not intended here.
>>> 
>>> So, since any valid content can be safely represented with a 8-bit ASCII string (even with a 7-bit one) there's absolutely no need to use any wide-char strings for this. It's just a waste of memory.
>> 
>> For simplicity we generally stick with strings (UTF16) unless there's a good reason not to.  The memory saving his are negligible, so we should probably not be using a cstring here.
> 
> I agree, keep it simple and use a String

Done.

>> WebCore/platform/mock/SpeechInputClientMock.cpp:53
>> +bool SpeechInputClientMock::startRecognition(int requestId, const CString& languageTag, const IntRect&)
> 
> I suggest renaming 'languageTag' to 'language', because though BCP47 calls them as language tags it is confusing in the context of WebCore whether these are html tags or something else. If you agree, could also rename all other occurances of this name

Done.

>> WebCore/rendering/TextControlInnerElements.cpp:103
>> +    
> 
> get rid of white space

Done.

>> WebCore/rendering/TextControlInnerElements.cpp:286
>> + 
> 
> get rid of white space

Done.

>> WebCore/rendering/TextControlInnerElements.cpp:464
>> +CString InputFieldSpeechButtonElement::validatedInheritedLanguage(HTMLInputElement *input)
> 
> I think a better place for validating this would be inside the Element::computeInheritedLanguage() method you are calling below. That will allow all callers of that method to benefit from this validation instead of just speech input.

Done.

>>> WebKit/chromium/public/WebSpeechInputController.h:47
>>> +    virtual bool startRecognition(int requestId, const WebCString&, const WebRect&)
>> 
>> this WebCString parameter needs a name.  you should give parameters a name
>> unless the name you would pick is redundant with the type name.
> 
> Can you also name the WebRect to 'element_rect' since that is not obvious as well?

Done.

>>> WebKit/chromium/public/WebSpeechInputControllerMock.h:47
>>> +    virtual void setMockRecognitionResult(const WebString& result, const WebString& languageTag) = 0;
>> 
>> We don't do pure virtual methods in the WebKit API anymore.  Please make this (and others) have a WEBKIT_ASSERT_NOT_REACHED() default method body.
> 
> I think pure virtual is used for all public APIs called by the embedder, and WEBKIT_ASSERT_NOT_REACHED() is used for all public APIs implemented by the embedder (and called by webkit). In that context, this interface is called by the embedder hence pure virtual is matching the style.

No changes here for the moment.

>> WebKit/chromium/src/WebSpeechInputControllerMockImpl.cpp:59
>>  void WebSpeechInputControllerMockImpl::setMockRecognitionResult(const WebString& result)
> 
> Is this old method still required?

Yes, for the moment it is. It breaks chromium's test_shell otherwise. I'll delete these in a future patch once the chromium side has landed.

>> WebKit/chromium/src/WebSpeechInputControllerMockImpl.h:72
>> +    // TODO(leandrogracia): delete this temporary fix. Required to launch a two-sided patch.
> 
> This is not Webkit style.  Use FIXME.

Done.

>>> WebKit/chromium/src/WebViewImpl.cpp:279
>>> +    , m_speechInputClient(new SpeechInputClientImpl(client))
>> 
>> SpeechInputClientImpl should have a create factory method that returns a passownptr
> 
> For the reason behind this suggestion, please see https://lists.webkit.org/pipermail/webkit-dev/2010-August/014006.html

Done. Thanks for the link.
Comment 17 WebKit Review Bot 2010-10-07 15:54:28 PDT
Attachment 70128 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/4222120
Comment 18 Leandro Graciá Gil 2010-10-08 06:58:47 PDT
Created attachment 70243 [details]
Patch
Comment 19 Leandro Graciá Gil 2010-10-08 07:20:08 PDT
Created attachment 70248 [details]
Patch
Comment 20 Satish Sampath 2010-10-08 07:38:40 PDT
Comment on attachment 70248 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=70248&action=review

> LayoutTests/fast/speech/input-text-language-tag.html:16
> +    shouldBeEqualToString('document.getElementById("speechInputNone").value', 'error: no result found for language \'\'');

Instead of treating this as an error case, can you set a mock result for the empty language tag and check that here?

> WebCore/platform/mock/SpeechInputClientMock.cpp:33
> +#include <wtf/text/CString.h>

Do you need this include in the latest code?

> WebCore/platform/mock/SpeechInputClientMock.cpp:88
> +        m_recognitionResult.set("_", result);

Define "_" as a constant at the top of the file and use it both here and in timerFired(). Also add a comment with this constant declaration explaining what it is used for.

> WebCore/platform/mock/SpeechInputClientMock.cpp:103
> +            String error("error: no result found for language '");

Since this is test code and from the test result we know exactly which portion of the code called this, perhaps we don't need this human-readable error string. Instead you can just skip calling setRecognitionResult(). That also matches what the real recognizer would do, if it was not able to recognize the user voice in the given language.

> WebCore/platform/mock/SpeechInputClientMock.h:55
> +    bool startRecognition(int, const WTF::String&, const IntRect&);

Remove the WTF:: specifier

> WebKit/chromium/src/SpeechInputClientImpl.h:39
> +#include <wtf/text/WTFString.h>

Remove this since you have a forward declaration for String below.

> WebKit/chromium/src/WebSpeechInputControllerMockImpl.cpp:60
> +    m_webcoreMock->setRecognitionResult(result, WebString::fromUTF8("en"));

Should you pass an empty string here instead of 'en' ?

> WebKit/chromium/src/WebSpeechInputControllerMockImpl.h:39
> +#include <wtf/text/WTFString.h>

Remove this since you have a forward declaration for String below.
Comment 21 Leandro Graciá Gil 2010-10-08 08:44:48 PDT
Comment on attachment 70248 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=70248&action=review

>> LayoutTests/fast/speech/input-text-language-tag.html:16
>> +    shouldBeEqualToString('document.getElementById("speechInputNone").value', 'error: no result found for language \'\'');
> 
> Instead of treating this as an error case, can you set a mock result for the empty language tag and check that here?

Discussed and rejected. We need a result in all cases to start the change event, and this is as good as any other.

>> WebCore/platform/mock/SpeechInputClientMock.cpp:33
>> +#include <wtf/text/CString.h>
> 
> Do you need this include in the latest code?

Sorry, my mistake during rebase merges.
Done.

>> WebCore/platform/mock/SpeechInputClientMock.cpp:88
>> +        m_recognitionResult.set("_", result);
> 
> Define "_" as a constant at the top of the file and use it both here and in timerFired(). Also add a comment with this constant declaration explaining what it is used for.

Done.

>> WebCore/platform/mock/SpeechInputClientMock.cpp:103
>> +            String error("error: no result found for language '");
> 
> Since this is test code and from the test result we know exactly which portion of the code called this, perhaps we don't need this human-readable error string. Instead you can just skip calling setRecognitionResult(). That also matches what the real recognizer would do, if it was not able to recognize the user voice in the given language.

Wee need some output even for the empty language case. Otherwise no change event will be generated and the test will timeout without any checking.

>> WebCore/platform/mock/SpeechInputClientMock.h:55
>> +    bool startRecognition(int, const WTF::String&, const IntRect&);
> 
> Remove the WTF:: specifier

Done.

>> WebKit/chromium/src/SpeechInputClientImpl.h:39
>> +#include <wtf/text/WTFString.h>
> 
> Remove this since you have a forward declaration for String below.

Done.

>> WebKit/chromium/src/WebSpeechInputControllerMockImpl.cpp:60
>> +    m_webcoreMock->setRecognitionResult(result, WebString::fromUTF8("en"));
> 
> Should you pass an empty string here instead of 'en' ?

Done. Now that mock results for empty language strings are again supported it makes sense.

>> WebKit/chromium/src/WebSpeechInputControllerMockImpl.h:39
>> +#include <wtf/text/WTFString.h>
> 
> Remove this since you have a forward declaration for String below.

Done.
Comment 22 Leandro Graciá Gil 2010-10-08 08:49:45 PDT
Created attachment 70252 [details]
Patch
Comment 23 Satish Sampath 2010-10-08 08:54:06 PDT
Comment on attachment 70252 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=70252&action=review

r=me
Looks good, small nit below.

> WebCore/platform/mock/SpeechInputClientMock.cpp:39
> +    // Empty language strings crash the HashMap. This value, an invalid BCP47 tag, is used for those cases.

I'd suggest rewording this as 'HashMap doesn't support empty strings as keys, so ...'
Comment 24 mitz 2010-10-08 09:09:58 PDT
Comment on attachment 70252 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=70252&action=review

> WebCore/dom/Element.cpp:1474
> +        CString languageTag = value.string().ascii();

The creation of a CString here is unnecessary. Please don’t do this.

> WebCore/page/SpeechInputClient.h:39
> +namespace WTF {
> +class String;
> +}
> +

Why is this needed?
Comment 25 Leandro Graciá Gil 2010-10-08 09:16:11 PDT
Created attachment 70258 [details]
Patch
Comment 26 Leandro Graciá Gil 2010-10-13 05:35:27 PDT
Created attachment 70602 [details]
Patch
Comment 27 Leandro Graciá Gil 2010-10-13 05:46:51 PDT
Here are the links to all the other parts of this patch (4-sided):
- 1st part: http://codereview.chromium.org/3615005/show
- 2nd part: https://bugs.webkit.org/show_bug.cgi?id=47089 (this one)
- 3rd part: http://codereview.chromium.org/3595018/show
- 4th part: https://bugs.webkit.org/show_bug.cgi?id=47420
Comment 28 Leandro Graciá Gil 2010-10-13 06:54:28 PDT
Created attachment 70608 [details]
Patch
Comment 29 Leandro Graciá Gil 2010-10-13 07:02:28 PDT
(In reply to comment #24)
> (From update of attachment 70252 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=70252&action=review
>  
> > WebCore/page/SpeechInputClient.h:39
> > +namespace WTF {
> > +class String;
> > +}
> > +
> 
> Why is this needed?

That's for the language string in the startRecognition method. I already tried to remove it but the build breaks if I do.
Comment 30 mitz 2010-10-13 07:34:17 PDT
(In reply to comment #29)
> (In reply to comment #24)
> > (From update of attachment 70252 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=70252&action=review
> >  
> > > WebCore/page/SpeechInputClient.h:39
> > > +namespace WTF {
> > > +class String;
> > > +}
> > > +
> > 
> > Why is this needed?
> 
> That's for the language string in the startRecognition method. I already tried to remove it but the build breaks if I do.

Please just include <wtf/Forward.h> and use unqualified String.
Comment 31 Leandro Graciá Gil 2010-10-13 07:59:01 PDT
Created attachment 70611 [details]
Patch
Comment 32 Leandro Graciá Gil 2010-10-13 08:19:33 PDT
(In reply to comment #30)
> (In reply to comment #29)
> > (In reply to comment #24)
> > > (From update of attachment 70252 [details] [details] [details])
> > > View in context: https://bugs.webkit.org/attachment.cgi?id=70252&action=review
> > >  
> > > > WebCore/page/SpeechInputClient.h:39
> > > > +namespace WTF {
> > > > +class String;
> > > > +}
> > > > +
> > > 
> > > Why is this needed?
> > 
> > That's for the language string in the startRecognition method. I already tried to remove it but the build breaks if I do.
> 
> Please just include <wtf/Forward.h> and use unqualified String.

Done there and in WebKit/chromium/src/SpeechInputClientImpl.h/cpp
Comment 33 Jeremy Orlow 2010-10-15 06:49:26 PDT
Comment on attachment 70611 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=70611&action=review

Looks fine to me.  Mitz, please speak up if you have any additional concerns?

> WebKit/chromium/src/SpeechInputClientImpl.cpp:38
>  #include "page/SpeechInputListener.h"

no page/

> WebKit/chromium/src/SpeechInputClientImpl.h:37
>  #include "page/SpeechInputClient.h"

Remove page/
Comment 34 Leandro Graciá Gil 2010-10-15 08:05:51 PDT
Created attachment 70867 [details]
Patch
Comment 35 Leandro Graciá Gil 2010-10-15 08:06:21 PDT
(In reply to comment #33)
> (From update of attachment 70611 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=70611&action=review
> 
> Looks fine to me.  Mitz, please speak up if you have any additional concerns?
> 
> > WebKit/chromium/src/SpeechInputClientImpl.cpp:38
> >  #include "page/SpeechInputListener.h"
> 
> no page/
> 
> > WebKit/chromium/src/SpeechInputClientImpl.h:37
> >  #include "page/SpeechInputClient.h"
> 
> Remove page/

Done.
Comment 36 Leandro Graciá Gil 2010-10-19 06:54:29 PDT
Created attachment 71163 [details]
Patch
Comment 37 Leandro Graciá Gil 2010-10-19 06:55:28 PDT
(In reply to comment #36)
> Created an attachment (id=71163) [details]
> Patch

Rebased for review and landing.
Comment 38 Leandro Graciá Gil 2010-10-19 06:57:22 PDT
Comment on attachment 71163 [details]
Patch

Rebase performed. Please review and commit if looks good.
Comment 39 Jeremy Orlow 2010-10-19 07:00:29 PDT
Comment on attachment 71163 [details]
Patch

r=me
Comment 40 WebKit Commit Bot 2010-10-19 07:50:36 PDT
Comment on attachment 71163 [details]
Patch

Rejecting patch 71163 from commit-queue.

Failed to run "['./WebKitTools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '--bot-id=abarth-cq-sl', 'build-and-test', '--no-clean', '--no-update', '--test', '--quiet', '--non-interactive']" exit_code: 2
Last 500 characters of output:
....ok
WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseSvnPropertyValue..........ok
WebKitTools/Scripts/webkitperl/VCSUtils_unittest/prepareParsedPatch.............ok
WebKitTools/Scripts/webkitperl/VCSUtils_unittest/runPatchCommand................ok
WebKitTools/Scripts/webkitperl/VCSUtils_unittest/setChangeLogDateAndReviewer....ok
All tests successful.
Files=14, Tests=304,  1 wallclock secs ( 0.63 cusr +  0.13 csys =  0.76 CPU)
Running build-dumprendertree
Compiling DumpRenderTree failed!

Full output: http://queues.webkit.org/results/4430115
Comment 41 Jeremy Orlow 2010-10-20 07:03:46 PDT
Comment on attachment 71163 [details]
Patch

once more
Comment 42 WebKit Commit Bot 2010-10-20 07:53:57 PDT
Comment on attachment 71163 [details]
Patch

Rejecting patch 71163 from commit-queue.

Failed to run "['./WebKitTools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '--bot-id=abarth-cq-sl', 'build-and-test', '--no-clean', '--no-update', '--test', '--quiet', '--non-interactive']" exit_code: 2
Last 500 characters of output:
....ok
WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseSvnPropertyValue..........ok
WebKitTools/Scripts/webkitperl/VCSUtils_unittest/prepareParsedPatch.............ok
WebKitTools/Scripts/webkitperl/VCSUtils_unittest/runPatchCommand................ok
WebKitTools/Scripts/webkitperl/VCSUtils_unittest/setChangeLogDateAndReviewer....ok
All tests successful.
Files=14, Tests=304,  0 wallclock secs ( 0.64 cusr +  0.14 csys =  0.78 CPU)
Running build-dumprendertree
Compiling DumpRenderTree failed!

Full output: http://queues.webkit.org/results/4517018
Comment 43 WebKit Review Bot 2010-10-20 09:15:39 PDT
http://trac.webkit.org/changeset/70149 might have broken SnowLeopard Intel Release (Tests) and Qt Linux Release
Comment 44 Leandro Graciá Gil 2010-10-20 09:35:45 PDT
Created attachment 71299 [details]
Patch
Comment 45 Leandro Graciá Gil 2010-10-20 09:37:40 PDT
Comment on attachment 71299 [details]
Patch

Added the missing patches to the Win, Mac and Wx versions of the LayoutTestController that were causing the build to break.
Comment 46 Jeremy Orlow 2010-10-20 09:42:07 PDT
Comment on attachment 71299 [details]
Patch

looks fine...hopefully ASSERT_UNUSEDs aren't necessary, but I guess the commit queue will let us know if it is?
Comment 47 WebKit Commit Bot 2010-10-20 11:55:10 PDT
Comment on attachment 71299 [details]
Patch

Rejecting patch 71299 from commit-queue.

Failed to run "['./WebKitTools/Scripts/webkit-patch', '--status-host=queues.webkit.org', 'apply-attachment', '--force-clean', '--non-interactive', '--quiet', 71299]" exit_code: 2
Cleaning working directory
Updating working directory
Failed to run "['WebKitTools/Scripts/update-webkit']" exit_code: 2

Full output: http://queues.webkit.org/results/4524017
Comment 48 Eric Seidel (no email) 2010-10-20 11:57:58 PDT
Comment on attachment 71299 [details]
Patch

I think this was a false rejection from a run-away commit-node.
Comment 49 WebKit Commit Bot 2010-10-21 02:24:37 PDT
Comment on attachment 71299 [details]
Patch

Rejecting patch 71299 from commit-queue.

Failed to run "['./WebKitTools/Scripts/webkit-patch', '--status-host=queues.webkit.org', 'apply-attachment', '--force-clean', '--non-interactive', '--quiet', 71299]" exit_code: 2
Cleaning working directory
Updating working directory
Logging in as commit-queue@webkit.org...
Fetching: https://bugs.webkit.org/attachment.cgi?id=71299&action=edit
Fetching: https://bugs.webkit.org/show_bug.cgi?id=47089&ctype=xml
Processing 1 patch from 1 bug.
Processing patch 71299 from bug 47089.
Failed to run "[u'/Projects/CommitQueue/WebKitTools/Scripts/svn-apply', u'--reviewer', u'Jeremy Orlow', u'--force']" exit_code: 1

Full output: http://queues.webkit.org/results/4616007
Comment 50 Leandro Graciá Gil 2010-10-21 14:05:36 PDT
Created attachment 71487 [details]
Patch
Comment 51 Leandro Graciá Gil 2010-10-21 15:03:44 PDT
Comment on attachment 71487 [details]
Patch

Rebase and patches to the Qt port.
Comment 52 Leandro Graciá Gil 2010-10-22 03:08:44 PDT
Created attachment 71540 [details]
Patch
Comment 53 Leandro Graciá Gil 2010-10-22 03:13:58 PDT
Comment on attachment 71540 [details]
Patch

Rebase.
Comment 54 Jeremy Orlow 2010-10-22 05:09:26 PDT
Comment on attachment 71540 [details]
Patch

Clearing flags on attachment: 71540

Committed r70301: <http://trac.webkit.org/changeset/70301>
Comment 55 Jeremy Orlow 2010-10-22 05:09:41 PDT
All reviewed patches have been landed.  Closing bug.
Comment 56 WebKit Review Bot 2010-10-22 05:46:51 PDT
http://trac.webkit.org/changeset/70301 might have broken Qt Linux Release
The following tests are not passing:
fast/css/lang-selector-empty-attribute.xhtml
fast/dom/SelectorAPI/resig-SelectorsAPI-test.xhtml
Comment 57 Leandro Graciá Gil 2010-10-22 09:34:28 PDT
Created attachment 71569 [details]
Patch
Comment 58 Leandro Graciá Gil 2010-10-22 09:38:05 PDT
Comment on attachment 71569 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=71569&action=review

Fixed problems in Element.cpp that caused some layout tests to fail.

> WebCore/dom/Element.cpp:1480
> +        CString asciiOnly = value.string().ascii();

The CString object is required here to ensure that the memory pointed by p is valid until the end of the if block. Note that value contains the object returned by string() but ascii creates a new CString with the ASCII-only representation of the UTF16 string value, replacing all non-ASCII characters with question mark symbols.
Comment 59 Alexey Proskuryakov 2010-10-22 10:52:04 PDT
Reopening, since r70301 has been rolled out.
Comment 60 Alexey Proskuryakov 2010-10-22 10:55:34 PDT
Comment on attachment 71569 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=71569&action=review

> WebCore/ChangeLog:18
> +        Fixed a problem in dom/Element.cpp that caused some layout tests to fail.

No need to mention that - this doesn't fix any problems in ToT.

> WebCore/ChangeLog:22
> +        (WebCore::Element::computeInheritedLanguage): includes a brief character
> +          validation for the BCP 47 language tag.

Please file a separate bug for this, and provide a patch that tests this change with default WebKit build options (i.e. outside of speech recognition context). r- for the need to split DOM changes out.
Comment 61 Leandro Graciá Gil 2010-10-27 09:50:08 PDT
Created attachment 72049 [details]
Patch
Comment 62 Leandro Graciá Gil 2010-10-27 09:59:46 PDT
Comment on attachment 72049 [details]
Patch

Language validation has been removed and handled separately in bug https://bugs.webkit.org/show_bug.cgi?id=48225 (currently a WONTFIX). The invalid language tag case has been removed from the layout test.
Comment 63 Jeremy Orlow 2010-10-27 10:03:03 PDT
Comment on attachment 72049 [details]
Patch

yup...looks pretty much like before minus those bits
Comment 64 Jeremy Orlow 2010-10-27 10:14:12 PDT
(In reply to comment #63)
> (From update of attachment 72049 [details])
> yup...looks pretty much like before minus those bits

r=me (AP's comments seem to be addressed, but please yell if that's not true)
Comment 65 Satish Sampath 2010-10-27 10:19:59 PDT
Comment on attachment 72049 [details]
Patch

Clearing flags on attachment: 72049

Committed r70665: <http://trac.webkit.org/changeset/70665>
Comment 66 Satish Sampath 2010-10-27 10:20:15 PDT
All reviewed patches have been landed.  Closing bug.
Comment 67 Alexey Proskuryakov 2010-10-27 10:29:51 PDT
Comment on attachment 72049 [details]
Patch

Sorry, I only looked at the latest patch now. I'm not happy with it.

+        http://codereview.chromium.org/3595018/show. The last of the 4 patches
+        depends also on the language tag validation provided by this patch:
+        https://bugs.webkit.org/show_bug.cgi?id=48225.
...
+description('Tests for language tag inheritance and validation in speech buttons.');
...
+        Patch the current speech input implementation to use and validate the
+        nearest language tag. The language is now passed to the startRecognition
...
+        (WebCore::Element::computeInheritedLanguage): includes a brief character
+          validation for the BCP 47 language tag.

We don't validate, correct?

+namespace {

As a matter of coding style, we use static keyword, not anonymous namespaces.

+    // HashMap doesn't support empty strings as keys, so this value (an invalid BCP47 tag) is used for those cases.
+    const String emptyLanguage = "_";

This will have an observable effect, now that we don't validate - the computed language can be "_". Perhaps not a big deal, but somewhat unclean nonetheless. This creates an untested and poorly understood code path - something that security bugs need to breed.

Also, we can not have global static objects with constructors and destructors. The rationale is that no WebKit code at all should run when WebKit library is loaded into an application, and no WebKit code should run when an application quits.

We have an automated test that fails the build for constructors and destructors on Mac, so this should have broken Mac build. I'm not sure if the check failed, or if you didn't try building on Mac.

Please consider using AtomicString for language codes - they are quite likely to all be the same in a particular engine instance, so atomizing should be highly beneficial.

I didn't read the whole patch, just the parts around instances of the word "valid".
Comment 68 Satish Sampath 2010-10-27 11:55:01 PDT
Reverted r70665 for reason:

Need to address Alexey's review comments.

Committed r70669: <http://trac.webkit.org/changeset/70669>
Comment 69 WebKit Review Bot 2010-10-27 14:04:36 PDT
http://trac.webkit.org/changeset/70665 might have broken GTK Linux 32-bit Debug
Comment 70 Leandro Graciá Gil 2010-10-28 10:50:08 PDT
Created attachment 72208 [details]
Patch
Comment 71 Leandro Graciá Gil 2010-10-28 10:55:47 PDT
(In reply to comment #67)
> (From update of attachment 72049 [details])
> Sorry, I only looked at the latest patch now. I'm not happy with it.
> 
> +        http://codereview.chromium.org/3595018/show. The last of the 4 patches
> +        depends also on the language tag validation provided by this patch:
> +        https://bugs.webkit.org/show_bug.cgi?id=48225.
> ...
> +description('Tests for language tag inheritance and validation in speech buttons.');
> ...
> +        Patch the current speech input implementation to use and validate the
> +        nearest language tag. The language is now passed to the startRecognition
> ...
> +        (WebCore::Element::computeInheritedLanguage): includes a brief character
> +          validation for the BCP 47 language tag.
> 
> We don't validate, correct?
> 
> +namespace {
> 
> As a matter of coding style, we use static keyword, not anonymous namespaces.
> 
> +    // HashMap doesn't support empty strings as keys, so this value (an invalid BCP47 tag) is used for those cases.
> +    const String emptyLanguage = "_";
> 
> This will have an observable effect, now that we don't validate - the computed language can be "_". Perhaps not a big deal, but somewhat unclean nonetheless. This creates an untested and poorly understood code path - something that security bugs need to breed.
> 
> Also, we can not have global static objects with constructors and destructors. The rationale is that no WebKit code at all should run when WebKit library is loaded into an application, and no WebKit code should run when an application quits.
> 
> We have an automated test that fails the build for constructors and destructors on Mac, so this should have broken Mac build. I'm not sure if the check failed, or if you didn't try building on Mac.
> 
> Please consider using AtomicString for language codes - they are quite likely to all be the same in a particular engine instance, so atomizing should be highly beneficial.
> 
> I didn't read the whole patch, just the parts around instances of the word "valid".

Changes done and AtomicString adopted for the language codes.

About solving the problem with the empty language case in the hash map, I think that handling the case separately is probably the simplest and cleanest way. Other possible solutions like STL maps required operator overloading, or using STL vectors required creating (language, result) tuples and checking for existing (language,*) entries. Could be done but this seems to be quite simpler.

Any suggestions are welcome, and thank you for the style information on the static objects.
Comment 72 Alexey Proskuryakov 2010-10-28 10:59:44 PDT
Thanks Leandro! I don't have any complaints about this patch.
Comment 73 Alexey Proskuryakov 2010-10-28 11:01:49 PDT
STL containers generally require exceptions to be enabled, so we cannot use them in WebKit (we sometimes use them in separate tools).
Comment 74 Leandro Graciá Gil 2010-10-28 11:12:43 PDT
(In reply to comment #73)
> STL containers generally require exceptions to be enabled, so we cannot use them in WebKit (we sometimes use them in separate tools).

Good and useful to know. Thank you for your help.
Comment 75 Leandro Graciá Gil 2010-10-29 04:49:04 PDT
Created attachment 72316 [details]
Patch
Comment 76 Jeremy Orlow 2010-10-29 04:53:26 PDT
Comment on attachment 72316 [details]
Patch

rubber stamp...again
Comment 77 Satish Sampath 2010-10-29 05:00:17 PDT
Comment on attachment 72316 [details]
Patch

Clearing flags on attachment: 72316

Committed r70863: <http://trac.webkit.org/changeset/70863>
Comment 78 Satish Sampath 2010-10-29 05:00:36 PDT
All reviewed patches have been landed.  Closing bug.