RESOLVED FIXED 47019
[chromium] FontLinux performance improvement
https://bugs.webkit.org/show_bug.cgi?id=47019
Summary [chromium] FontLinux performance improvement
Xiaomei Ji
Reported 2010-10-01 14:31:12 PDT
Reduce te number of calls for normalization function because converting to NFC form is very expensive. Combine space normalization and character mirroring into one text scan.
Attachments
patch (6.89 KB, patch)
2010-10-01 14:49 PDT, Xiaomei Ji
levin: review-
path (11.20 KB, patch)
2010-10-19 17:00 PDT, Xiaomei Ji
levin: review+
levin: commit-queue-
Xiaomei Ji
Comment 1 2010-10-01 14:49:40 PDT
Created attachment 69524 [details] patch Port Claire's work on Android.
David Levin
Comment 2 2010-10-13 23:49:54 PDT
Comment on attachment 69524 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=69524&action=review This is looking pretty good. Just a few changes and you'll get this in. Thanks. > WebCore/ChangeLog:8 > + Reduce te number of calls for normalization function because converting typo: te s/for/for the/ > WebCore/ChangeLog:13 > + No new tests since there is no functionality change. Is there any test that currently verifies this functionality? If not, please add one. (Hopefully that won't be hard.) > WebCore/platform/graphics/chromium/FontLinux.cpp:@ > public: Please update the copyright year at the top of the file. (Add 2010. Don't delete the other years.) Could you do a patch (without any other changes) to fix the layout of this code? These methods shouldn't be in the class itself as they shouldn't be inlined. This would also help the diff'ing tool to pick up function names better. > WebCore/platform/graphics/chromium/FontLinux.cpp:386 > + const TextRun& getNormalizedTextRun(const TextRun& originalRun) It would be nice it this function was static, since you are calling it from within the initialization of the member variables in the constructor. As the moment it is way too easy to use a member variable in here and that shouldn't be done in this function or any that it calls (except for special cases). It looks like it only uses m_normalizedBuffer and m_normalizedRun, so those could be passed in to it. (Or perhaps this method and these two member variables should be split off into a little class that is called at this point.) > WebCore/platform/graphics/chromium/FontLinux.cpp:402 > + // Convert to NFC form if text has diacritical marks s/if /if the / Please add a period to the end of the comment. > WebCore/platform/graphics/chromium/FontLinux.cpp:409 > + icu::Normalizer::normalize(icu::UnicodeString(originalRun.characters(), Why do we only need to call icu::Normalizer::normalize when UBLOCK_COMBINING_DIACRITICAL_MARKS? Are there any other cases? > WebCore/platform/graphics/chromium/FontLinux.cpp:410 > + originalRun.length()), UNORM_NFC, 0 /* no options */, alignment with ( of previous line. (Note that you don't need to wrap the line.) > WebCore/platform/graphics/chromium/FontLinux.cpp:419 > + int normalizedBufferLen; normalizedBufferLength (Avoid abbreviations in WebKit code.) > WebCore/platform/graphics/chromium/FontLinux.cpp:420 > + const UChar* srcText; Use sourceText. (Avoid abbreviations in WebKit code.) > WebCore/platform/graphics/chromium/FontLinux.cpp:430 > + normSpaceAndMirrorChars(originalRun.rtl(), m_normalizedBuffer.get(), srcText, normalizedBufferLen); The ordering of parameters seems odd. It almost follows a memcpy ordering but it also has a bool before them. I would go with input/output ordering. Like this normSpaceAndMirrorChars(srcText, originalRun.rtl(), m_normalizedBuffer.get(), normalizedBufferLen); > WebCore/platform/graphics/chromium/FontLinux.cpp:586 > + void normSpaceAndMirrorChars(bool rtl, UChar* destination, const UChar* source, int length) const "norm" avoid abbreviation. s/Space/Spaces/ > WebCore/platform/graphics/chromium/FontLinux.cpp:598 > + if (rtl) Do "else if" As opposed to else { if
Adam Langley
Comment 3 2010-10-14 07:33:00 PDT
Sorry. I could have sworn that I had already commented on this code but there's no record of it :( I think David hit all the points, above.
Xiaomei Ji
Comment 4 2010-10-19 16:59:33 PDT
Thanks for the review. Please see my replies inline. (In reply to comment #2) > (From update of attachment 69524 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=69524&action=review > > This is looking pretty good. Just a few changes and you'll get this in. > > Thanks. > > > WebCore/ChangeLog:8 > > + Reduce te number of calls for normalization function because converting > > typo: te > s/for/for the/ done. > > > WebCore/ChangeLog:13 > > + No new tests since there is no functionality change. > > Is there any test that currently verifies this functionality? > > If not, please add one. (Hopefully that won't be hard.) I added a simple one having \t (treated as space) and diacritical marks (will trigger normalization). There is test for mirrored character in layout test. convert from '\t' to ' ' is triggered in the code, but I do not know why dumpAsText() still shows '\t'. > > > WebCore/platform/graphics/chromium/FontLinux.cpp:@ > > public: > > Please update the copyright year at the top of the file. (Add 2010. Don't delete the other years.) done. > > Could you do a patch (without any other changes) to fix the layout of this code? > > These methods shouldn't be in the class itself as they shouldn't be inlined. > > This would also help the diff'ing tool to pick up function names better. done. > > > WebCore/platform/graphics/chromium/FontLinux.cpp:386 > > + const TextRun& getNormalizedTextRun(const TextRun& originalRun) > > It would be nice it this function was static, since you are calling it from within the initialization of the member variables in the constructor. > > As the moment it is way too easy to use a member variable in here and that shouldn't be done in this function or any that it calls (except for special cases). > > It looks like it only uses m_normalizedBuffer and m_normalizedRun, so those could be passed in to it. (Or perhaps this method and these two member variables should be split off into a little class that is called at this point.) changed to static. > > > WebCore/platform/graphics/chromium/FontLinux.cpp:402 > > + // Convert to NFC form if text has diacritical marks > > s/if /if the / > Please add a period to the end of the comment. done. > > > WebCore/platform/graphics/chromium/FontLinux.cpp:409 > > + icu::Normalizer::normalize(icu::UnicodeString(originalRun.characters(), > > Why do we only need to call icu::Normalizer::normalize when UBLOCK_COMBINING_DIACRITICAL_MARKS? that was in the previous logic and remained. from my tests, looks like icu::..normalize() only do NFC to those with combining marks, for example: a + \u308, not \u0915 + \u093c. seems that other decomposition/composition in complex script, such as both \u0958 and \u0915 + \u093c (which are equivalent composed and de-composed forms), are rendered correctly too. My guess is harfbuzz does the normalization. > > Are there any other cases? > > > WebCore/platform/graphics/chromium/FontLinux.cpp:410 > > + originalRun.length()), UNORM_NFC, 0 /* no options */, > > alignment with ( of previous line. (Note that you don't need to wrap the line.) done. > > > WebCore/platform/graphics/chromium/FontLinux.cpp:419 > > + int normalizedBufferLen; > > normalizedBufferLength > > (Avoid abbreviations in WebKit code.) done. > > > WebCore/platform/graphics/chromium/FontLinux.cpp:420 > > + const UChar* srcText; > > Use sourceText. done. > > (Avoid abbreviations in WebKit code.) > > > WebCore/platform/graphics/chromium/FontLinux.cpp:430 > > + normSpaceAndMirrorChars(originalRun.rtl(), m_normalizedBuffer.get(), srcText, normalizedBufferLen); > > The ordering of parameters seems odd. It almost follows a memcpy ordering but it also has a bool before them. > I would go with input/output ordering. > > Like this > normSpaceAndMirrorChars(srcText, originalRun.rtl(), m_normalizedBuffer.get(), normalizedBufferLen); done. > > > WebCore/platform/graphics/chromium/FontLinux.cpp:586 > > + void normSpaceAndMirrorChars(bool rtl, UChar* destination, const UChar* source, int length) const > > "norm" avoid abbreviation. done. > > s/Space/Spaces/ done. > > > WebCore/platform/graphics/chromium/FontLinux.cpp:598 > > + if (rtl) > > Do "else if" > > As opposed to > > else { > if done.
Xiaomei Ji
Comment 5 2010-10-19 17:00:20 PDT
David Levin
Comment 6 2010-10-20 19:41:26 PDT
Comment on attachment 71229 [details] path View in context: https://bugs.webkit.org/attachment.cgi?id=71229&action=review Please consider addressing the feedback when you land this. Thanks. > WebCore/platform/graphics/chromium/FontLinux.cpp:231 > + static void normalizeSpacesAndMirrorChars(const UChar*, bool, UChar*, int); Ideally there would be parameter names here that indicated what each of these parameters should be. (Use the parameter names from your function below.) The WebKit rule for parameter names is remove them if they don't add any information. > WebCore/platform/graphics/chromium/FontLinux.cpp:232 > + static const TextRun& getNormalizedTextRun(const TextRun&, OwnPtr<TextRun>&, OwnArrayPtr<UChar>&); Add parameter names here too. > WebCore/platform/graphics/chromium/FontLinux.cpp:444 > + Extra blank line. > WebCore/platform/graphics/chromium/FontLinux.cpp:449 > + Extra blank line. > WebCore/platform/graphics/chromium/FontLinux.cpp:598 > +void TextRunWalker::normalizeSpacesAndMirrorChars(const UChar* source, bool rtl, UChar* destination, int length) It would be nice if the ordering of these functions followed the ordering in the class. (You could just swap them in the class.) This isn't required but it is nice. > WebCore/platform/graphics/chromium/FontLinux.cpp:610 > + character = u_charMirror(character); The indentation is way off here. Perhaps there is a tab? > LayoutTests/platform/chromium/fast/text/font-linux-normalize.html:3 > +<script src=../editing.js language="JavaScript" type="text/JavaScript" ></script> It doesn't look like you use anything from editing.js. If you don't, then don't include it. > LayoutTests/platform/chromium/fast/text/font-linux-normalize.html:40 > + //assertEqual("devanagari + a with diaeresis", string, "\u0958\u0909 \u00e4"); Why is this commented out? Maybe you should just remove it.
Xiaomei Ji
Comment 7 2010-10-21 14:43:41 PDT
Note You need to log in before you can comment on or make changes to this bug.