Bug 48625 - [GTK] Optimize foldCase, toLower and toUpper methods in glib unicode backend
Summary: [GTK] Optimize foldCase, toLower and toUpper methods in glib unicode backend
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK
Version: 528+ (Nightly build)
Hardware: PC Linux
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-10-29 05:34 PDT by Carlos Garcia Campos
Modified: 2010-11-24 04:51 PST

See Also:


Attachments
Patch (7.24 KB, patch)
2010-10-29 05:37 PDT, Carlos Garcia Campos
no flags
Fixed minor coding style issue in previous patch (7.24 KB, patch)
2010-10-29 05:45 PDT, Carlos Garcia Campos
mrobinson: review-
Updated patch according to review (7.30 KB, patch)
2010-11-12 00:19 PST, Carlos Garcia Campos
no flags

Description Carlos Garcia Campos 2010-10-29 05:34:58 PDT
We could use our own methods to convert between UTF-8 and UTF-16 to avoid the final memcpy needed in every method.
Comment 1 Carlos Garcia Campos 2010-10-29 05:37:41 PDT
Created attachment 72318
Patch

GLib methods use UTF-8 strings, so we have to convert from UTF-16 to UTF-8 to perform the case operations and then convert the result back to UTF-16. The GLib conversion methods return a newly allocated string, so we also have to memcpy the result into the destination buffer. Using our own conversion methods from wtf/unicode/UTF8.h, we don't need that memcpy, since they write into an already allocated buffer rather than returning a new one.

There's another optimization for the case where the destination buffer is not large enough. In that case, the methods should return the required destination buffer size, and they are called again with a new buffer. We can avoid the conversion to UTF-16 in that case by pre-calculating the required size of the destination buffer.
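The size pre-calculation described above could look roughly like the following. This is a minimal sketch rather than the code from the attached patch: the names utf16LengthOfUTF8 and requiredLengthOrError are hypothetical, std::string stands in for the UTF-8 result returned by GLib, and the input is assumed to be well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not the code from the patch): counts the UTF-16 code
// units needed to hold a well-formed UTF-8 string, without converting it.
static size_t utf16LengthOfUTF8(const std::string& utf8)
{
    size_t utf16Length = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) == 0x80)
            continue; // Continuation byte: belongs to a character already counted.
        // A lead byte of 0xF0 or above starts a 4-byte sequence, i.e. a code
        // point >= 0x10000, which needs a surrogate pair in UTF-16.
        utf16Length += (byte >= 0xF0) ? 2 : 1;
    }
    return utf16Length;
}

// Hypothetical caller-side check: if the destination cannot hold the result,
// flag the error and report the needed size so the caller can retry with a
// larger buffer. No UTF-16 conversion happens on this path.
static size_t requiredLengthOrError(const std::string& utf8Result, size_t destinationLength, bool& error)
{
    size_t needed = utf16LengthOfUTF8(utf8Result);
    error = needed > destinationLength;
    return needed;
}
```

Only when error is false would the caller go on to convert the UTF-8 result directly into the destination buffer.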
Comment 2 Carlos Garcia Campos 2010-10-29 05:45:08 PDT
Created attachment 72319
Fixed minor coding style issue in previous patch
Comment 3 Martin Robinson 2010-11-11 11:12:28 PST
Comment on attachment 72319
Fixed minor coding style issue in previous patch

View in context: https://bugs.webkit.org/attachment.cgi?id=72319&action=review

Looks good. It just needs a couple small cleanups.

> JavaScriptCore/wtf/unicode/glib/UnicodeGLib.cpp:58
> +        utf16Length += (character >= 0x10000) ? 2 : 1;

There's a macro in TextBreakIterator.h for this.
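The comment does not name the macro. For illustration only, a helper with the same shape as ICU's U16_LENGTH(c) would replace the open-coded ternary; utf16LengthOfCodePoint below is a hypothetical stand-in, not the macro the reviewer refers to.

```cpp
#include <cstdint>

// Hypothetical stand-in with the same shape as ICU's U16_LENGTH(c):
// code points up to 0xFFFF take one UTF-16 code unit, anything above
// takes two (a surrogate pair).
constexpr int utf16LengthOfCodePoint(uint32_t character)
{
    return character <= 0xFFFF ? 1 : 2;
}

static_assert(utf16LengthOfCodePoint(0x41) == 1, "BMP code point is one unit");
static_assert(utf16LengthOfCodePoint(0x10000) == 2, "supplementary code point is two units");
```

With such a helper, the quoted line would read utf16Length += utf16LengthOfCodePoint(character);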

> JavaScriptCore/wtf/unicode/glib/UnicodeGLib.cpp:83
> +    GOwnPtr<char> utf8Result;
> +    utf8Result.set(caseFunction(buffer.data(), buffer.size()));

I think it makes more sense for this to be:

GOwnPtr<char> utf8Result(caseFunction(buffer.data(), buffer.size()));
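The difference between the two forms can be sketched outside WebKit as follows. Everything here is a stand-in: std::unique_ptr with a free-based deleter plays the role of GOwnPtr<char>, and duplicateString is a hypothetical function that, like the GLib case functions, returns a newly allocated string.

```cpp
#include <cstdlib>
#include <cstring>
#include <memory>

// Stand-in for GOwnPtr<char>: owns a malloc'ed C string and frees it.
struct FreeDeleter {
    void operator()(char* pointer) const { std::free(pointer); }
};
using OwnedCString = std::unique_ptr<char, FreeDeleter>;

// Hypothetical function returning a newly allocated string, as the GLib
// case functions do. The caller takes ownership of the result.
static char* duplicateString(const char* source)
{
    size_t length = std::strlen(source) + 1;
    char* copy = static_cast<char*>(std::malloc(length));
    if (copy)
        std::memcpy(copy, source, length);
    return copy;
}

// Two steps: default-construct an empty owner, then assign the pointer.
static OwnedCString twoSteps(const char* source)
{
    OwnedCString result;
    result.reset(duplicateString(source));
    return result;
}

// One step: the owner adopts the pointer at construction, as suggested.
static OwnedCString oneStep(const char* source)
{
    OwnedCString result(duplicateString(source));
    return result;
}
```

Both forms behave the same; the one-step form simply never leaves the owner in an empty intermediate state.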
Comment 4 Carlos Garcia Campos 2010-11-12 00:19:47 PST
Created attachment 73710
Updated patch according to review
Comment 5 Xan Lopez 2010-11-24 04:29:56 PST
Comment on attachment 73710
Updated patch according to review

View in context: https://bugs.webkit.org/attachment.cgi?id=73710&action=review

> JavaScriptCore/wtf/unicode/glib/UnicodeGLib.cpp:30
> +

Perhaps this could be shared, but you can do that afterwards.
Comment 6 WebKit Commit Bot 2010-11-24 04:51:13 PST
Comment on attachment 73710
Updated patch according to review

Clearing flags on attachment: 73710

Committed r72662: <http://trac.webkit.org/changeset/72662>
Comment 7 WebKit Commit Bot 2010-11-24 04:51:18 PST
All reviewed patches have been landed.  Closing bug.