Summary: | Double Click to Select Words does not work on Windows | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Hunter L. Williams <hlwebkit> | ||||||||||
Component: | New Bugs | Assignee: | Nobody <webkit-unassigned> | ||||||||||
Status: | RESOLVED FIXED | ||||||||||||
Severity: | Normal | CC: | darin, mjs, mrowe | ||||||||||
Priority: | P2 | ||||||||||||
Version: | 420+ | ||||||||||||
Hardware: | PC | ||||||||||||
OS: | Windows XP | ||||||||||||
Attachments: |
|
Description
Hunter L. Williams
2006-10-04 17:17:51 PDT
Created attachment 10919 [details]
straw man patch
I am interested in WebKit developers' thoughts on this approach. This is not a complete patch (no Changelog or equivalent Mac project file work) but could be completed if desired.
Comment on attachment 10919 [details]
straw man patch
Looks fine.
But we don't want to add a new copy of the code from TextUtilities.mm. Instead, we need to delete currentTextBreakLocaleID, and the two sentence functions from TextUtilities.mm, and we need to update the project file on Macintosh as well as on Windows to include TextUtilities.cpp, with proper ifdefs. At some point we might even share the word boundary code, but for now we should probbably leave the Objective-C version on Macintosh in case it does something better than the ICU version.
review- because we need a patch that doesn't create two copies of the code.
Another thing I'm thinking is that the locale-determining code is probably in the wrong place. I actually don't know where the Locale().getLanguage() code is coming from, but wherever that is, that's where we should put currentTextBreakLocaleID or the equivalent. Anyway, looks good. Another thought: If this code is at all hot, we don't want to keep creating and destroying UBreakIterator objects. StringImpl.h has the getWordBreakIterator function, which returns a single global shared word break iterator. I suggest we create a BreakIterator.h/cpp set of files in platform/icu and use that in this code as well as in StringImpl.cpp (and remove getWordBreakIterator from StringImpl.h). Locale is a class provided by ICU. I used it to get the system language, since it provided it in a format that ubrk_open() could handle (ISO-639) OK I'll look into that thanks Darin. In StringImpl.cpp:
> ubrk_open(UBRK_WORD, "en_us", 0, 0, &status);
I'm guessing this should be ubrk_open(UBRK_WORD, Locale().getLanguage(), 0, 0, &status); ? ;-)
Question though - shouldn't word breaking on a page be defined by the language of the page, rather than the host OS? (Although that would assume the page had provided its language in the response header, and we had access to a response header abstraction at this point in the code).
Darin - there is no platform/icu directory, do you want me to create one with just BreakIterator.h/cpp in it or put those files in the platform directory? Created attachment 10933 [details]
more complete but still not done patch
Does the suggested code reshuffling. Not yet built on Mac, going to do that next (attaching so I can transfer it from machine to machine!)
Created attachment 10937 [details]
patch for review
Here's a patch that implements what was suggested. This works on Windows, doesn't seem to regress anything per the regression tests on 10.4.7.
Note: I changed the locale determination for non Mac platforms to use uloc_getDefault() instead of Locale().getLanguage(), since it seems that not all systems have the C++ header locid.h, and if icu ever becomes the canonical location on OS X for this information (instead of the plist it's read out of here) then it would probably be best to get the info from the same function call on all platforms. Comment on attachment 10937 [details]
patch for review
Looks great. A few nits:
We don't use relative paths in includes. So this:
+#include "icu/BreakIterator.h"
should just be:
#include "BreakIterator.h"
And we deal with the paths in the project setup.
The copyright in BreakIterator.h/cpp only needs to mention Apple and 2006. Use of ICU for word breaking came in 2006, so there's no need to copy the other copyrights from StringImpl.h since they didn't cover the break iterator. No way you could have known that, but should fix it.
+namespace WebCore {
+
+typedef void UBreakIterator;
UBreakIterator should be declared outside the WebCore namespace to match how it's defined in ICU. This declaration instead adds a new UBreakIterator inside the WebCore namespace.
Created attachment 11020 [details]
patch, comments addressed
Comment on attachment 11020 [details]
patch, comments addressed
r=me
First off, this patch is a great change! But I'm a little confused about what code goes where with this patch. It puts ICU specific code in the platform/ directory directly, while some other ICU-specific code is in platform/icu. Meanwhile there's also mac-specific code in platform/icu. Maybe that could go in platform/mac and the non-mac-related bits could be in #if !PLATFORM(MAC), although I guess that's a matter of taste. We do similar things in other cases though. Probably all of the implementation should go in platform/icu unless there's some truly generic portion. Maciej, I think Darin's suggestion was that the platform/icu part wasn't all code that _used_ icu, but rather ICU utilities for code that does use it - in other words code to create the shared break iterator. The Mac specific code that's in there is solely to get the system breaking language. I could move that one function to a new .mm/.h file in platform/mac, but platform/icu/BreakIterator.cpp would still need to #include that new header #ifdef PLATFORM(MAC) ... (other platforms don't need to use that specific function since they just rely on ICU to tell what the breaking language is - if you think Mac should do that too, then this is simplified a lot and the Mac code goes away). The Mac specific word boundary code remains unchanged in platform/mac/TextBoundaries.mm Does this patch need updated to address Maciej's comments, or is it fine to land as-is? Sounds like there's some confusion here. Are there any more changes needed to at least get the Mac/cross platform parts landed? Darin & Maciej, please comment on Comment #15, Comment #16 and Comment #17. I thik this patch is fine but I am not sure it still applies. Comment on attachment 11020 [details]
patch, comments addressed
Clearing darin's r+.
Sorry, this patch has bit-rotten. Please resubmit it against the current tree.
|