Bug 48625

Summary: [GTK] Optimize foldCase, toLower and toUpper methods in glib unicode backend
Product: WebKit
Reporter: Carlos Garcia Campos <cgarcia>
Component: WebKitGTK
Assignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED
Severity: Normal
CC: commit-queue
Priority: P2
Version: 528+ (Nightly build)
Hardware: PC
OS: Linux
Attachments:
Description | Flags
Patch | none
Fixed minor coding style issue in previous patch | mrobinson: review-
Updated patch according to review | none

Description Carlos Garcia Campos 2010-10-29 05:34:58 PDT
We could use our own methods to convert between UTF-8 and UTF-16 to avoid the final memcpy needed in every method.
Comment 1 Carlos Garcia Campos 2010-10-29 05:37:41 PDT
Created attachment 72318 [details]
Patch

GLib methods use UTF-8 strings, so we have to convert from UTF-16 to UTF-8 to perform the case operations and then convert the result back to UTF-16. GLib conversion methods return a newly allocated string, so we also have to memcpy the result into the destination buffer. By using our own UTF-8/UTF-16 conversion methods from wtf/unicode/UTF8.h we don't need that memcpy, since they write into an already allocated buffer rather than returning a new one.

There's another optimization for the case where the destination buffer is not large enough. In that case, the methods should return the required destination buffer size, and they are then called again with a new buffer. By pre-calculating the required size of the destination buffer, we can skip the conversion to UTF-16 when the buffer is too small.
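The size pre-calculation described above can be sketched roughly as follows. This is only an illustration, not the code from the patch: the helper name utf16LengthOfUTF8 is hypothetical, it uses std::string rather than WTF types, and it assumes the input is well-formed UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not the patch's code): counts the UTF-16 code units
// needed to hold a UTF-8 string, without performing the conversion.
// Assumes well-formed UTF-8; malformed input is not detected.
inline std::size_t utf16LengthOfUTF8(const std::string& utf8)
{
    std::size_t length = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) == 0x80)
            continue; // Continuation byte: its sequence was counted at the lead byte.
        // A 4-byte lead (11110xxx) encodes a code point >= 0x10000,
        // which needs a surrogate pair in UTF-16.
        length += ((byte & 0xF8) == 0xF0) ? 2 : 1;
    }
    return length;
}

// Small usage check: ASCII takes one unit each, U+1F600 takes two.
inline void checkUTF16LengthExamples()
{
    assert(utf16LengthOfUTF8("abc") == 3);
    assert(utf16LengthOfUTF8("\xF0\x9F\x98\x80") == 2);
}
```

A caller could compare this count against the destination buffer size and return early with the required size, converting to UTF-16 only when the result is known to fit.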
Comment 2 Carlos Garcia Campos 2010-10-29 05:45:08 PDT
Created attachment 72319 [details]
Fixed minor coding style issue in previous patch
Comment 3 Martin Robinson 2010-11-11 11:12:28 PST
Comment on attachment 72319 [details]
Fixed minor coding style issue in previous patch

View in context: https://bugs.webkit.org/attachment.cgi?id=72319&action=review

Looks good. It just needs a couple small cleanups.

> JavaScriptCore/wtf/unicode/glib/UnicodeGLib.cpp:58
> +        utf16Length += (character >= 0x10000) ? 2 : 1;

There's a macro in TextBreakIterator.h for this.
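For reference, the computation in the quoted line is the same one ICU's U16_LENGTH macro (unicode/utf16.h) performs. Whether that is the macro meant here is an assumption. A standalone sketch of the same logic, with a hypothetical function name:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not necessarily the macro referred to above: returns
// how many UTF-16 code units are needed to encode one Unicode code point.
constexpr int utf16LengthOfCodePoint(std::uint32_t character)
{
    // Code points above the Basic Multilingual Plane need a surrogate pair.
    return character >= 0x10000 ? 2 : 1;
}

// Small usage check covering both sides of the BMP boundary.
inline void checkCodePointLengths()
{
    assert(utf16LengthOfCodePoint(0xFFFF) == 1);
    assert(utf16LengthOfCodePoint(0x10000) == 2);
}
```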

> JavaScriptCore/wtf/unicode/glib/UnicodeGLib.cpp:83
> +    GOwnPtr<char> utf8Result;
> +    utf8Result.set(caseFunction(buffer.data(), buffer.size()));

I think it makes more sense for this to be:

GOwnPtr<char> utf8Result(caseFunction(buffer.data(), buffer.size()));
Comment 4 Carlos Garcia Campos 2010-11-12 00:19:47 PST
Created attachment 73710 [details]
Updated patch according to review
Comment 5 Xan Lopez 2010-11-24 04:29:56 PST
Comment on attachment 73710 [details]
Updated patch according to review

View in context: https://bugs.webkit.org/attachment.cgi?id=73710&action=review

> JavaScriptCore/wtf/unicode/glib/UnicodeGLib.cpp:30
> +

Perhaps this could be shared, but you can do that afterwards.
Comment 6 WebKit Commit Bot 2010-11-24 04:51:13 PST
Comment on attachment 73710 [details]
Updated patch according to review

Clearing flags on attachment: 73710

Committed r72662: <http://trac.webkit.org/changeset/72662>
Comment 7 WebKit Commit Bot 2010-11-24 04:51:18 PST
All reviewed patches have been landed.  Closing bug.