RESOLVED FIXED Bug 90269
[GTK] Add a new and reusable enchant-based spellchecker in WebCore
https://bugs.webkit.org/show_bug.cgi?id=90269
Summary [GTK] Add a new and reusable enchant-based spellchecker in WebCore
Mario Sanchez Prada
Reported 2012-06-29 04:08:59 PDT
The idea is to have a class in WebCore that we can use both for WebKit and WebKit2, which would take care of dealing with libenchant to implement a spellchecker using that library. This new class would allow simplifying code in Source/WebKit/gtk/webkit/webkitspellcheckerenchant.cpp and would ease the implementation of the new spell checker API for WebKit2GTK (see bug 90268)
Attachments
Patch proposal (24.29 KB, patch)
2012-06-29 05:35 PDT, Mario Sanchez Prada
no flags
Patch proposal (23.88 KB, patch)
2012-07-18 10:12 PDT, Mario Sanchez Prada
no flags
Patch proposal (26.16 KB, patch)
2012-07-20 08:47 PDT, Mario Sanchez Prada
no flags
Patch proposal (24.68 KB, patch)
2012-07-31 05:30 PDT, Mario Sanchez Prada
mrobinson: review+
Patch proposal (25.89 KB, patch)
2012-08-02 06:52 PDT, Mario Sanchez Prada
mrobinson: review+
mrobinson: commit-queue-
Mario Sanchez Prada
Comment 1 2012-06-29 05:35:51 PDT
Created attachment 150155 [details] Patch proposal Attaching proposal of new TextCheckerEnchant class under WebCore/platform/text/gtk
Martin Robinson
Comment 2 2012-06-29 09:01:47 PDT
Comment on attachment 150155 [details] Patch proposal View in context: https://bugs.webkit.org/attachment.cgi?id=150155&action=review Looks good, though I think we should make the interface more closely match WebCore style. > Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:57 > +void TextCheckerEnchant::ignoreWord(const char* word) > +{ > + for (Vector<EnchantDict*>::const_iterator iter = m_enchantDicts.begin(); iter != m_enchantDicts.end(); ++iter) > + enchant_dict_add_to_session(*iter, word, -1); > +} > + > +void TextCheckerEnchant::learnWord(const char* word) > +{ These should probably take Strings, which encapsulate encoding information better. > Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:62 > +void TextCheckerEnchant::checkSpellingOfString(const char* string, int* misspellingLocation, int* misspellingLength) It think it makes sense to rename this to be more precise. checkSpellingOfString could be getLocationAndLengthOfFirstMisspellilng. This aligns with the WebKit style of using the word "get" for functions and methods that have output arguments. > Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:86 > + while (attrs.get()[end].is_word_end < 1) > + end++; You're missing the fix that I recently made to the WebKit1 enchant code to handle contractions. > Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:114 > +char** TextCheckerEnchant::getGuessesForWord(const char* word) Since this is a WebCore interface, a Vector<String> makes more sense here. > Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:125 > + if (numberOfSuggestions > 0) { This looks like it should be an early continue. > Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:127 > + if (numberOfSuggestions > 10) > + numberOfSuggestions = 10; 10 should probably be a constant like: static const int maximumNumberOfSuggestions = 10; > Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:141 > +void TextCheckerEnchant::updateSpellCheckingLanguages(const char* languages) It makes sense for this to take a String instead of a const char*. > Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:146 > + char** langs = g_strsplit(languages, ",", -1); You could use String.split here and avoid having to free the list of languages. > Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:179 > +#endif /* ENABLE(SPELLCHECK) */ Nit: You should use a C++ comment here.
Mario Sanchez Prada
Comment 3 2012-06-29 15:06:06 PDT
(In reply to comment #2) > (From update of attachment 150155 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=150155&action=review > > Looks good, though I think we should make the interface more closely match WebCore style. Thanks for the quick review, Martin. As you can guess the patch is mostly about moving code down to WebCore, but I agree with the points you made and will work on that after coming from holidays. Actually I shouldn't be checking mail now already... just couldn't help it :-) Thanks, Mario
Mario Sanchez Prada
Comment 4 2012-07-18 07:23:42 PDT
Comment on attachment 150155 [details] Patch proposal View in context: https://bugs.webkit.org/attachment.cgi?id=150155&action=review Now working on this. See some comments in the meanwhile... >> Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:57 >> +{ > > These should probably take Strings, which encapsulate encoding information better. Makes sense. I'll change that. >> Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:62 >> +void TextCheckerEnchant::checkSpellingOfString(const char* string, int* misspellingLocation, int* misspellingLength) > > It think it makes sense to rename this to be more precise. checkSpellingOfString could be getLocationAndLengthOfFirstMisspellilng. This aligns with the WebKit style of using the word "get" for functions and methods that have output arguments. Even if I agree with you in that something like that could be a better name as a general thought, I don't think we should do it in this case since it would be confusing, as checkSpellingOfString() is the expected name according to what is defined in WebCore/platform/text/TextCheckerClient.h: class TextCheckerClient { public: virtual ~TextCheckerClient() {} [...] virtual void learnWord(const String&) = 0; virtual void checkSpellingOfString(const UChar*, int length, int* misspellingLocation, int* misspellingLength) = 0; virtual String getAutoCorrectSuggestionForMisspelledWord(const String& misspelledWord) = 0; [...] }; So, I think we should leave this name as is, to reflect that its related to that. >> Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:86 >> + end++; > > You're missing the fix that I recently made to the WebKit1 enchant code to handle contractions. I guess you mean SVN r121360. I'll fix that in an upcoming patch. >> Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:114 >> +char** TextCheckerEnchant::getGuessesForWord(const char* word) > > Since this is a WebCore interface, a Vector<String> makes more sense here. Makes sense. I'll change that. >> Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:125 >> + if (numberOfSuggestions > 0) { > > This looks like it should be an early continue. Ok. >> Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:127 >> + numberOfSuggestions = 10; > > 10 should probably be a constant like: > static const int maximumNumberOfSuggestions = 10; Just moved code from WebKit1, but I agree with this change too. I'll fix it. >> Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:141 >> +void TextCheckerEnchant::updateSpellCheckingLanguages(const char* languages) > > It makes sense for this to take a String instead of a const char*. Ok. >> Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:146 >> + char** langs = g_strsplit(languages, ",", -1); > > You could use String.split here and avoid having to free the list of languages. Ok. >> Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:179 >> +#endif /* ENABLE(SPELLCHECK) */ > > Nit: You should use a C++ comment here. Ok.
Mario Sanchez Prada
Comment 5 2012-07-18 10:12:14 PDT
Created attachment 153034 [details] Patch proposal New patch addressing issues pointed out by Martin
Mario Sanchez Prada
Comment 6 2012-07-20 08:47:41 PDT
Created attachment 153515 [details] Patch proposal I found today an issue in the previous patch in TextCheckerEnchant::~TextCheckerEnchant(), so this new patch basically fixes that. Thus, this is the one to be reviewed. Sorry for the noise.
Martin Robinson
Comment 7 2012-07-28 04:49:56 PDT
Comment on attachment 153515 [details] Patch proposal View in context: https://bugs.webkit.org/attachment.cgi?id=153515&action=review > Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:30 > +static const size_t maximumNumberOfSuggestions = 10; Since this is only used in one place, feel free to simply move this to right above where you use it. > Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:34 > + Vector<CString>* dicts = static_cast<Vector<CString>*>(data); dicts -> dictionaries > Source/WebCore/platform/text/gtk/TextCheckerEnchant.h:35 > +class TextCheckerEnchant : public RefCounted<TextCheckerEnchant> { > +public: > + static PassRefPtr<TextCheckerEnchant> create() { return adoptRef(new TextCheckerEnchant); } This shouldn't be shared so there's no reason to make it RefCounted. > Source/WebKit/gtk/webkit/webkitspellcheckerenchant.cpp:42 > + RefPtr<TextCheckerEnchant> textCheckerEnchant; This can just be TextCheckerEnchant textCheckerEnchant.
Mario Sanchez Prada
Comment 8 2012-07-28 11:28:47 PDT
WebKit Review Bot
Comment 9 2012-07-30 10:06:25 PDT
Re-opened since this is blocked by 92655
Mario Sanchez Prada
Comment 10 2012-07-31 05:30:04 PDT
Created attachment 155504 [details] Patch proposal There was an issue using the new TextCheckerEnchant class from WebKitSpellCheckerEnchant, where we should use String::fromUTF8() instead of plainly using the String (const char*) constructor. This new patch fixes that.
WebKit Review Bot
Comment 11 2012-07-31 05:44:12 PDT
Comment on attachment 153515 [details] Patch proposal Cleared Martin Robinson's review+ from obsolete attachment 153515 [details] so that this bug does not appear in http://webkit.org/pending-commit.
Martin Robinson
Comment 12 2012-07-31 08:25:46 PDT
Comment on attachment 155504 [details] Patch proposal Okay. Let's give it a shot. Sorry that I didn't see this issue in my previous reviews.
Mario Sanchez Prada
Comment 13 2012-07-31 08:51:52 PDT
WebKit Review Bot
Comment 14 2012-07-31 11:08:30 PDT
Re-opened since this is blocked by 92773
Mario Sanchez Prada
Comment 15 2012-08-02 06:52:44 PDT
Created attachment 156071 [details] Patch proposal Strike 3. So it seems the problem happened due to wrong usage of CString::length(). I was assuming it was returning the actual number of utf8 characters and it was actually returning the number of elements in CString's internal vector holding the data: size_t length = string.utf8().length(); ...and this was causing trouble when special characters were present in the layout test (as it was the case for those failing tests) The right version of this code (in TextEnchantChecker::checkSpellingOfString()) would be like this: size_t length = string.length(); This new patch fixed that.
Martin Robinson
Comment 16 2012-08-02 10:33:46 PDT
Comment on attachment 156071 [details] Patch proposal View in context: https://bugs.webkit.org/attachment.cgi?id=156071&action=review Looks good, but I have one tiny suggestion before landing. > Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp:165 > + GOwnPtr<gchar> currentLanguage(g_strdup(iter->utf8().data())); This can just be: CString currentLanguage = iter->utf8(); and you could avoid one memory allocation.
Mario Sanchez Prada
Comment 17 2012-08-03 01:04:14 PDT
Note You need to log in before you can comment on or make changes to this bug.