RESOLVED FIXED Bug 63319
Refactor text iterator code respecting surrogate pairs from WidthIterator
https://bugs.webkit.org/show_bug.cgi?id=63319
Summary Refactor text iterator code respecting surrogate pairs from WidthIterator
Nikolas Zimmermann
Reported 2011-06-24 04:08:46 PDT
After the patch from bug 59085 landed we found out there's a regression with Acid3, that led to the rollout of the patch. In order to fix the Acid3 regression I need to handle surrogate pairs correctly in the SVGFonts GlyphPage. For this I need to make SVGGlyphMap UTF16 aware by using UChar32 instead of UChar in the glyph maps. This works just fine and fixed the Acid3 regression. To prepare resubmitting the patch for bug 59085, I'm uploading a preparation patch here soon that breaks out SurrogatePairAwareTextIterator out of WidthIterator so SVGGlyphMap can reuse it soon.
Attachments
Patch (33.06 KB, patch)
2011-06-24 04:14 PDT, Nikolas Zimmermann
webkit.review.bot: commit-queue-
Patch v2 (43.74 KB, patch)
2011-06-24 04:24 PDT, Nikolas Zimmermann
webkit.review.bot: commit-queue-
Patch v3 (43.97 KB, patch)
2011-06-24 05:12 PDT, Nikolas Zimmermann
krit: review+
webkit.review.bot: commit-queue-
Archive of layout-test-results from ec2-cr-linux-03 (1.41 MB, application/zip)
2011-06-24 05:22 PDT, WebKit Review Bot
no flags
Patch v4 (44.38 KB, patch)
2011-06-24 08:28 PDT, Nikolas Zimmermann
webkit.review.bot: commit-queue-
Archive of layout-test-results from ec2-cr-linux-01 (1.33 MB, application/zip)
2011-06-24 08:38 PDT, WebKit Review Bot
no flags
Patch v5 (45.98 KB, patch)
2011-06-24 11:30 PDT, Nikolas Zimmermann
krit: review+
Nikolas Zimmermann
Comment 1 2011-06-24 04:14:48 PDT
Created attachment 98479 [details] Patch Simple refactoring patch.
WebKit Review Bot
Comment 2 2011-06-24 04:17:57 PDT
Attachment 98479 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/CMakeLists.txt', u'Source/W..." exit_code: 1 Source/WebCore/platform/graphics/SurrogatePairAwareTextIterator.cpp:23: Found other header before a header this file implements. Should be: config.h, primary header, blank line, and then alphabetically sorted. [build/include_order] [4] Source/WebCore/platform/graphics/SurrogatePairAwareTextIterator.cpp:25: Alphabetical sorting problem. [build/include_order] [4] Source/WebCore/platform/graphics/SurrogatePairAwareTextIterator.cpp:128: Should have only a single space after a punctuation in a comment. [whitespace/comments] [5] Source/WebCore/platform/graphics/SurrogatePairAwareTextIterator.cpp:266: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] Total errors found: 4 in 10 files If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Review Bot
Comment 3 2011-06-24 04:20:32 PDT
Comment on attachment 98479 [details] Patch Attachment 98479 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/8937345
Nikolas Zimmermann
Comment 4 2011-06-24 04:24:19 PDT
Created attachment 98481 [details] Patch v2 Merging failed :( SurrogatePairAwareTextIterator.cpp contained a copy of WidthIterator.cpp in the first patch :( Hopefully it works now, otherwhise I'll have to clean my tree.
WebKit Review Bot
Comment 5 2011-06-24 04:27:17 PDT
Attachment 98481 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/CMakeLists.txt', u'Source/W..." exit_code: 1 Source/WebCore/platform/graphics/SurrogatePairAwareTextIterator.cpp:72: Should have only a single space after a punctuation in a comment. [whitespace/comments] [5] Source/WebCore/platform/graphics/SurrogatePairAwareTextIterator.cpp:106: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] Total errors found: 2 in 10 files If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Review Bot
Comment 6 2011-06-24 04:28:45 PDT
Comment on attachment 98481 [details] Patch v2 Attachment 98481 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/8934443
WebKit Review Bot
Comment 7 2011-06-24 04:57:29 PDT
Comment on attachment 98481 [details] Patch v2 Attachment 98481 [details] did not pass cr-mac-ews (chromium): Output: http://queues.webkit.org/results/8931531
Nikolas Zimmermann
Comment 8 2011-06-24 05:12:34 PDT
Created attachment 98484 [details] Patch v3
WebKit Review Bot
Comment 9 2011-06-24 05:15:54 PDT
Attachment 98484 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/CMakeLists.txt', u'Source/W..." exit_code: 1 Source/WebCore/platform/graphics/SurrogatePairAwareTextIterator.cpp:72: Should have only a single space after a punctuation in a comment. [whitespace/comments] [5] Source/WebCore/platform/graphics/SurrogatePairAwareTextIterator.cpp:106: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] Total errors found: 2 in 10 files If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Review Bot
Comment 10 2011-06-24 05:22:11 PDT
Comment on attachment 98484 [details] Patch v3 Attachment 98484 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/8937366 New failing tests: editing/selection/anchor-focus1.html editing/selection/4895428-2.html editing/selection/5007143-2.html editing/execCommand/insert-list-with-noneditable-content.html css2.1/t1604-c541-word-sp-01-b-a.html editing/execCommand/move-selection-back-line.html editing/selection/5232159.html editing/pasteboard/4947130.html editing/deleting/delete-line-012.html editing/selection/5057506.html editing/deleting/smart-delete-002.html editing/deleting/4916235-1.html editing/execCommand/findString-2.html editing/deleting/pruning-after-merge-1.html css1/text_properties/word_spacing.html editing/deleting/delete-line-011.html editing/pasteboard/4242293-1.html editing/selection/14971.html editing/selection/5007143.html css2.1/t1604-c541-word-sp-00-b-a.html
WebKit Review Bot
Comment 11 2011-06-24 05:22:16 PDT
Created attachment 98486 [details] Archive of layout-test-results from ec2-cr-linux-03 The attached test failures were seen while running run-webkit-tests on the chromium-ews. Bot: ec2-cr-linux-03 Port: Chromium Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Dirk Schulze
Comment 12 2011-06-24 05:42:10 PDT
Comment on attachment 98484 [details] Patch v3 View in context: https://bugs.webkit.org/attachment.cgi?id=98484&action=review The patch looks good. I wonder about the failing tests on chromium. I hope you run the complete DRT suite. r=me if it passes all tests on your machine. >> Source/WebCore/platform/graphics/SurrogatePairAwareTextIterator.cpp:72 >> + // Do we have a surrogate pair? If so, determine the full Unicode (32 bit) code point before glyph lookup. > > Should have only a single space after a punctuation in a comment. [whitespace/comments] [5] Thats right :-) Can you fix it before landing? >> Source/WebCore/platform/graphics/SurrogatePairAwareTextIterator.cpp:106 >> + if (resultLength == 1 && uStatus == 0) > > Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] Should also be a simple fix
Nikolas Zimmermann
Comment 13 2011-06-24 08:26:48 PDT
(In reply to comment #12) > (From update of attachment 98484 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=98484&action=review > > The patch looks good. I wonder about the failing tests on chromium. I hope you run the complete DRT suite. r=me if it passes all tests on your machine. Yes, there was still a bug with the patch, I'm uploading a new one to see if also the chromium bot thinks there are no regressions.
Nikolas Zimmermann
Comment 14 2011-06-24 08:28:07 PDT
Created attachment 98494 [details] Patch v4 Fix problem with stale variable, should fix all regressions. Please wait for cr-linux EWS before touching the r? state.
WebKit Review Bot
Comment 15 2011-06-24 08:38:10 PDT
Comment on attachment 98494 [details] Patch v4 Attachment 98494 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/8935411 New failing tests: editing/selection/anchor-focus1.html editing/selection/4895428-2.html editing/selection/5007143-2.html editing/execCommand/insert-list-with-noneditable-content.html editing/execCommand/move-selection-back-line.html editing/selection/5232159.html editing/pasteboard/4947130.html editing/deleting/delete-line-012.html editing/selection/caret-bidi-first-and-last-letters.html editing/selection/5057506.html editing/deleting/smart-delete-002.html editing/deleting/4916235-1.html editing/execCommand/findString-2.html editing/deleting/pruning-after-merge-1.html editing/selection/click-in-padding-with-multiple-line-boxes.html editing/deleting/delete-line-011.html editing/pasteboard/4242293-1.html editing/selection/14971.html editing/selection/5007143.html editing/selection/click-in-margins-inside-editable-div.html
WebKit Review Bot
Comment 16 2011-06-24 08:38:15 PDT
Created attachment 98495 [details] Archive of layout-test-results from ec2-cr-linux-01 The attached test failures were seen while running run-webkit-tests on the chromium-ews. Bot: ec2-cr-linux-01 Port: Chromium Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Dirk Schulze
Comment 17 2011-06-24 09:47:20 PDT
Looks like the css tests pass now but other edit tests fail as well.
Nikolas Zimmermann
Comment 18 2011-06-24 11:30:10 PDT
Created attachment 98516 [details] Patch v5 Fixed last merging problem :( Another stale variable, this time I ran all tests and verified none regressed, on my machine! Especially editing is fine now: editing/deleting ............................................................................................................................................................................................................... editing/editability .... editing/execCommand .................................................................................................................................................................................................................................. editing/input ............. editing/inserting ...................................................................................................................................................... editing/pasteboard .............................................................................................................................................................................................................................. editing/selection ..................................................................................................................................................................................................................................................................................................................... editing/spelling .............. editing/style ........................................................................................... editing/text-iterator ..... editing/undo ................................. editing/unsupported-content .........
Dirk Schulze
Comment 19 2011-06-24 11:43:41 PDT
Comment on attachment 98516 [details] Patch v5 r=me
Nikolas Zimmermann
Comment 20 2011-06-24 11:48:37 PDT
Landed in r89690.
Darin Adler
Comment 21 2011-06-24 14:04:35 PDT
Comment on attachment 98516 [details] Patch v5 Performance impact? Confusing to call this a "text iterator" when that is the name of a very different beast in WebCore. It's trivial to iterate surrogate pairs without needing an object. There are many code paths in many places in WebKit that do it; the macros in ICU’s UTF16.h make it almost trivial. As far as I can see what’s non trivial here is the additional normalization, specifically the handling of voicing marks. Because of that, I think the name of the class is wrong.
Nikolas Zimmermann
Comment 22 2011-06-24 14:07:34 PDT
(In reply to comment #21) > (From update of attachment 98516 [details]) > Performance impact? > > Confusing to call this a "text iterator" when that is the name of a very different beast in WebCore. I wanted to call it TextIterator first, but it clashed with that one. > > It's trivial to iterate surrogate pairs without needing an object. There are many code paths in many places in WebKit that do it; the macros in ICU’s UTF16.h make it almost trivial. As far as I can see what’s non trivial here is the additional normalization, specifically the handling of voicing marks. Because of that, I think the name of the class is wrong. You've got a very good point here. I'll be happy to change the name to anything better if you have an idea! I thought about it for a while, and both Dirk and me couldn't come up with a better name, help appreciated :-)
Note You need to log in before you can comment on or make changes to this bug.