RESOLVED FIXED Bug 160172
Remove stripLeadingAndTrailingWhitespace from MathMLElement.cpp
https://bugs.webkit.org/show_bug.cgi?id=160172
Summary Remove stripLeadingAndTrailingWhitespace from MathMLElement.cpp
David Kilzer (:ddkilzer)
Reported 2016-07-25 14:57:59 PDT
Based on Bug 160111 Comment #14, we should: - Move skipLeadingAndTrailingWhitespace() into the HTMLParserIdioms class, - Rename it to overload stripLeadingAndTrailingHTMLSpaces(), and - Change the argument from const StringView& to StringView. Eventually, we should remove the stripLeadingAndTrailingHTMLSpaces(const String&) method in favor of the StringView method.
Attachments
Patch v1 (7.60 KB, patch)
2016-07-25 16:33 PDT, David Kilzer (:ddkilzer)
no flags
Patch v2 (7.88 KB, patch)
2016-07-26 06:12 PDT, David Kilzer (:ddkilzer)
no flags
Patch v3 (8.35 KB, patch)
2016-07-26 08:00 PDT, David Kilzer (:ddkilzer)
no flags
Patch v4 (6.53 KB, patch)
2016-07-26 08:43 PDT, David Kilzer (:ddkilzer)
no flags
Patch v5 (13.67 KB, patch)
2016-08-07 06:30 PDT, David Kilzer (:ddkilzer)
no flags
Patch (2.35 KB, patch)
2019-02-22 00:47 PST, Rob Buis
no flags
Patch (4.36 KB, patch)
2019-02-22 02:59 PST, Rob Buis
no flags
Patch (4.41 KB, patch)
2019-02-22 03:39 PST, Rob Buis
no flags
Archive of layout-test-results from ews114 for mac-highsierra (2.30 MB, application/zip)
2019-02-22 05:26 PST, EWS Watchlist
no flags
Archive of layout-test-results from ews126 for ios-simulator-wk2 (2.47 MB, application/zip)
2019-02-22 05:41 PST, EWS Watchlist
no flags
Patch (5.09 KB, patch)
2019-02-22 05:50 PST, Rob Buis
no flags
David Kilzer (:ddkilzer)
Comment 1 2016-07-25 16:33:08 PDT
Created attachment 284544 [details] Patch v1
David Kilzer (:ddkilzer)
Comment 2 2016-07-25 16:33:40 PDT
Comment on attachment 284544 [details] Patch v1 Oops, not ready yet.
David Kilzer (:ddkilzer)
Comment 3 2016-07-26 06:12:54 PDT
Created attachment 284586 [details] Patch v2
David Kilzer (:ddkilzer)
Comment 4 2016-07-26 08:00:18 PDT
Created attachment 284589 [details] Patch v3
David Kilzer (:ddkilzer)
Comment 5 2016-07-26 08:43:13 PDT
Created attachment 284593 [details] Patch v4
David Kilzer (:ddkilzer)
Comment 6 2016-07-26 08:46:13 PDT
(In reply to comment #5) > Created attachment 284593 [details] > Patch v4 This has the minimal amount of changes to move the method into HTMLParserIdioms.
Chris Dumez
Comment 7 2016-07-26 09:23:59 PDT
Comment on attachment 284593 [details] Patch v4 View in context: https://bugs.webkit.org/attachment.cgi?id=284593&action=review > Source/WebCore/html/parser/HTMLParserIdioms.cpp:80 > + unsigned start = 0; start could be declared later, right before the first while loop. > Source/WebCore/html/parser/HTMLParserIdioms.cpp:84 > + return string.isNull() ? string : StringView::empty(); Is this ternary reversed? Right now it says, if string is null, return null. If string is empty, return empty. Seems like a no-op.
Darin Adler
Comment 8 2016-07-26 10:50:33 PDT
Comment on attachment 284593 [details] Patch v4 View in context: https://bugs.webkit.org/attachment.cgi?id=284593&action=review Should probably make at least some small improvements before landing. Also should probably have unit tests for this function. > Source/WebCore/html/parser/HTMLParserIdioms.cpp:78 > +StringView stripLeadingAndTrailingHTMLSpaces(StringView string) Should consider writing this more like the existing String version. Separate loops for 8-bit and 16-bit characters are likely to result in better performance, hoisting the check for the string type out of the two loops. On the other hand, maybe it’s not a measurable performance difference. > Source/WebCore/html/parser/HTMLParserIdioms.cpp:81 > + unsigned stringLength = string.length(); This should just be named length. After the value is modified during the execution of the function it’s no longer the string length. It becomes the substring length, so the name stringLength does more harm than good. >> Source/WebCore/html/parser/HTMLParserIdioms.cpp:84 >> + return string.isNull() ? string : StringView::empty(); > > Is this ternary reversed? Right now it says, if string is null, return null. If string is empty, return empty. Seems like a no-op. This special case for zero length strings can and should simply be removed. The loops below both handle zero length strings perfectly, as does the substring function. > Source/WebCore/html/parser/HTMLParserIdioms.cpp:86 > + while (stringLength > 0 && isHTMLSpace(string[start])) { Should just be while (stringLength && ...), not > 0. > Source/WebCore/html/parser/HTMLParserIdioms.cpp:88 > + start++; > + stringLength--; Pre-increment and pre-decrement are more our standard C++ style. > Source/WebCore/html/parser/HTMLParserIdioms.cpp:91 > + while (stringLength > 0 && isHTMLSpace(string[start + stringLength - 1])) Should be just while (stringLength && ...), not > 0. > Source/WebCore/html/parser/HTMLParserIdioms.cpp:92 > + stringLength--; Pre-decrement is more our standard C++ style.
Frédéric Wang (:fredw)
Comment 9 2016-08-03 02:46:53 PDT
@David: Are you still working on this?
David Kilzer (:ddkilzer)
Comment 10 2016-08-03 03:01:21 PDT
Yes, planning to do another patch to address Darin's comments and to add tests. Got side-tracked with other work for the past couple of days.
Frédéric Wang (:fredw)
Comment 11 2016-08-03 04:01:17 PDT
(In reply to comment #10) > Yes, planning to do another patch to address Darin's comments and to add > tests. > > Got side-tracked with other work for the past couple of days. OK, I was asking because I need to make skipLeadingAndTrailingWhitespace accessible from another file in bug 160462, so just wanted to warn you that the two patches may conflict.
David Kilzer (:ddkilzer)
Comment 12 2016-08-05 14:36:15 PDT
Comment on attachment 284593 [details] Patch v4 View in context: https://bugs.webkit.org/attachment.cgi?id=284593&action=review >> Source/WebCore/html/parser/HTMLParserIdioms.cpp:78 >> +StringView stripLeadingAndTrailingHTMLSpaces(StringView string) > > Should consider writing this more like the existing String version. Separate loops for 8-bit and 16-bit characters are likely to result in better performance, hoisting the check for the string type out of the two loops. On the other hand, maybe it’s not a measurable performance difference. Okay, I'll make this change when removing: String stripLeadingAndTrailingHTMLSpaces(const String& string) and replacing it with: StringView stripLeadingAndTrailingHTMLSpaces(StringView string) >>> Source/WebCore/html/parser/HTMLParserIdioms.cpp:84 >>> + return string.isNull() ? string : StringView::empty(); >> >> Is this ternary reversed? Right now it says, if string is null, return null. If string is empty, return empty. Seems like a no-op. > > This special case for zero length strings can and should simply be removed. The loops below both handle zero length strings perfectly, as does the substring function. I added this because stripLeadingAndTrailingHTMLSpaces(const String& string) has this early return. Seems like when I refactor to remove stripLeadingAndTrailingHTMLSpaces(const String& string), then I should bring this back.
David Kilzer (:ddkilzer)
Comment 13 2016-08-05 14:37:39 PDT
(In reply to comment #8) > Comment on attachment 284593 [details] > Patch v4 > > View in context: > https://bugs.webkit.org/attachment.cgi?id=284593&action=review > > Should probably make at least some small improvements before landing. Also > should probably have unit tests for this function. Going to write some test cases before landing (or maybe shortly afterwards in a separate patch). (In reply to comment #11) > (In reply to comment #10) > > Yes, planning to do another patch to address Darin's comments and to add > > tests. > > > > Got side-tracked with other work for the past couple of days. > > OK, I was asking because I need to make skipLeadingAndTrailingWhitespace > accessible from another file in bug 160462, so just wanted to warn you that > the two patches may conflict. Thanks for the head's up! I'll check to make sure this is working as expected before landing.
David Kilzer (:ddkilzer)
Comment 14 2016-08-07 06:30:55 PDT
Created attachment 285527 [details] Patch v5
David Kilzer (:ddkilzer)
Comment 15 2016-08-07 06:35:39 PDT
Comment on attachment 284593 [details] Patch v4 View in context: https://bugs.webkit.org/attachment.cgi?id=284593&action=review >> Source/WebCore/html/parser/HTMLParserIdioms.cpp:80 >> + unsigned start = 0; > > start could be declared later, right before the first while loop. With Patch v5 change, this is a moot point. >>>> Source/WebCore/html/parser/HTMLParserIdioms.cpp:84 >>>> + return string.isNull() ? string : StringView::empty(); >>> >>> Is this ternary reversed? Right now it says, if string is null, return null. If string is empty, return empty. Seems like a no-op. >> >> This special case for zero length strings can and should simply be removed. The loops below both handle zero length strings perfectly, as does the substring function. > > I added this because stripLeadingAndTrailingHTMLSpaces(const String& string) has this early return. > > Seems like when I refactor to remove stripLeadingAndTrailingHTMLSpaces(const String& string), then I should bring this back. After writing API tests, it's obvious that stripLeadingAndTrailingHTMLSpaces() is always supposed to return a null String in the case of a zero-length string, never an empty String. I suspect this is a micro-optimization since we must always check m_impl to see if it's nullptr first whenever comparing strings or getting their length or calling isNull() or isEmpty().
Darin Adler
Comment 16 2016-08-07 09:51:06 PDT
Comment on attachment 285527 [details] Patch v5 View in context: https://bugs.webkit.org/attachment.cgi?id=285527&action=review > Source/WebCore/html/parser/HTMLParserIdioms.cpp:91 > + return length ? string.substring(start, length) : emptyAtom.string(); I don’t understand why we are doing a special case here for 0 length. If this special case is needed, it should be in StringView::substring, not here. Unless this function has special requirements that StringView::substring does not. > Source/WebCore/html/parser/HTMLParserIdioms.h:50 > +// FIXME: Replace overload using const String& with method using StringView. > WEBCORE_EXPORT String stripLeadingAndTrailingHTMLSpaces(const String&); I don’t think this FIXME is exactly right. If I have a String that probably does *not* have leading and trailing HTML spaces, then converting it to a StringView and back to a String will have to allocate a new String! On the other hand, if you are starting with a String and do not need to convert the result to a String for storage, just need to look at it, then it’s better to have the result be a StringView. So there is value in having a version that produces a String that can reuse the existing string when there is nothing to strip. Not clear if overloading with the same name is the best design choice, since both the return type and the argument type matter. Implying that we will want to remove this overload entirely is probably incorrect, though.
Darin Adler
Comment 17 2016-08-07 16:23:00 PDT
Comment on attachment 284593 [details] Patch v4 View in context: https://bugs.webkit.org/attachment.cgi?id=284593&action=review >>>>> Source/WebCore/html/parser/HTMLParserIdioms.cpp:84 >>>>> + return string.isNull() ? string : StringView::empty(); >>>> >>>> Is this ternary reversed? Right now it says, if string is null, return null. If string is empty, return empty. Seems like a no-op. >>> >>> This special case for zero length strings can and should simply be removed. The loops below both handle zero length strings perfectly, as does the substring function. >> >> I added this because stripLeadingAndTrailingHTMLSpaces(const String& string) has this early return. >> >> Seems like when I refactor to remove stripLeadingAndTrailingHTMLSpaces(const String& string), then I should bring this back. > > After writing API tests, it's obvious that stripLeadingAndTrailingHTMLSpaces() is always supposed to return a null String in the case of a zero-length string, never an empty String. > > I suspect this is a micro-optimization since we must always check m_impl to see if it's nullptr first whenever comparing strings or getting their length or calling isNull() or isEmpty(). I don’t agree that stripLeadingAndTrailingHTMLSpaces should return a null string. How does writing API tests make this clear? I don’t understand what you mean with the micro optimization comment. To me it does not seem like we should add branches to optimize the empty or null string cases.
David Kilzer (:ddkilzer)
Comment 18 2016-08-08 17:11:26 PDT
Comment on attachment 285527 [details] Patch v5 View in context: https://bugs.webkit.org/attachment.cgi?id=285527&action=review >> Source/WebCore/html/parser/HTMLParserIdioms.cpp:91 >> + return length ? string.substring(start, length) : emptyAtom.string(); > > I don’t understand why we are doing a special case here for 0 length. If this special case is needed, it should be in StringView::substring, not here. > > Unless this function has special requirements that StringView::substring does not. In a separate comment, you (Darin) wrote: > I don’t agree that stripLeadingAndTrailingHTMLSpaces should return a null string. How does writing API tests make this clear? It made it clear insofar as that is the current behavior of stripLeadingAndTrailingHTMLSpaces(const String&). I wanted to match the behavior of stripLeadingAndTrailingHTMLSpaces(const String&) when implementing stripLeadingAndTrailingHTMLSpaces(StringView) while writing the tests. > I don’t understand what you mean with the micro optimization comment. Calling String[View].isEmpty() on a null string checks to see if m_impl is nullptr, which is (slightly) faster than checking if String.isEmpty() on an empty string: WTFString: bool isEmpty() const { return !m_impl || !m_impl->length(); } I guess StringView just checks length, though, and skips the NULL check, which seems like WTFString could do as well! (There's no perf win here, however minimal.) StringView: inline bool StringView::isEmpty() const { return !length(); } inline unsigned StringView::length() const { return m_length; } Although now that you point it out, this is changing null strings into empty strings, so it's not even doing what I expected, even though the tests pass. I need to go back and add more tests, probably for isNull() and isEmpty(). And you're saying that we should remove these special cases for stripLeadingAndTrailingHTMLSpaces(const String&) as well? > To me it does not seem like we should add branches to optimize the empty or null string cases. Okay, researching the origin of this code in stripLeadingAndTrailingHTMLSpaces(const String&), it was introduced by you in r68854 to make sure not to turn null strings into empty strings: * html/parser/HTMLParserIdioms.cpp: (WebCore::stripLeadingAndTrailingHTMLSpaces): Fixed bug where the function would turn the null string into the empty string. Fixed bug where the function would not strip all trailing spaces. <https://trac.webkit.org/changeset/68854/trunk/WebCore/html/parser/HTMLParserIdioms.cpp> I'm clearing the review flag and will work on a new patch. >> Source/WebCore/html/parser/HTMLParserIdioms.h:50 >> WEBCORE_EXPORT String stripLeadingAndTrailingHTMLSpaces(const String&); > > I don’t think this FIXME is exactly right. > > If I have a String that probably does *not* have leading and trailing HTML spaces, then converting it to a StringView and back to a String will have to allocate a new String! On the other hand, if you are starting with a String and do not need to convert the result to a String for storage, just need to look at it, then it’s better to have the result be a StringView. > > So there is value in having a version that produces a String that can reuse the existing string when there is nothing to strip. Not clear if overloading with the same name is the best design choice, since both the return type and the argument type matter. > > Implying that we will want to remove this overload entirely is probably incorrect, though. Okay, I'll remove the FIXME before landing. I think I read too much into your original comment about moving the method into HTMLParserIdioms.cpp.
Frédéric Wang (:fredw)
Comment 19 2016-08-23 06:40:10 PDT
(In reply to comment #13) > (In reply to comment #11) > > (In reply to comment #10) > > > Yes, planning to do another patch to address Darin's comments and to add > > > tests. > > > > > > Got side-tracked with other work for the past couple of days. > > > > OK, I was asking because I need to make skipLeadingAndTrailingWhitespace > > accessible from another file in bug 160462, so just wanted to warn you that > > the two patches may conflict. > > Thanks for the head's up! I'll check to make sure this is working as > expected before landing. Hi David, I landed the patch for bug 160462, so you'll have to rebase your patch here.
Rob Buis
Comment 20 2019-02-22 00:47:28 PST
Rob Buis
Comment 21 2019-02-22 02:59:54 PST
Frédéric Wang (:fredw)
Comment 22 2019-02-22 03:04:24 PST
Comment on attachment 362713 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=362713&action=review > Source/WebCore/ChangeLog:9 > + from HTTPParsers instead. If should be something like "no new tests, already covered by MathML tests".
Rob Buis
Comment 23 2019-02-22 03:39:12 PST
EWS Watchlist
Comment 24 2019-02-22 05:26:33 PST
Comment on attachment 362714 [details] Patch Attachment 362714 [details] did not pass mac-debug-ews (mac): Output: https://webkit-queues.webkit.org/results/11245928 New failing tests: imported/w3c/web-platform-tests/mathml/relations/css-styling/lengths-3.html mathml/mpadded-crash.html
EWS Watchlist
Comment 25 2019-02-22 05:26:35 PST
Created attachment 362716 [details] Archive of layout-test-results from ews114 for mac-highsierra The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews114 Port: mac-highsierra Platform: Mac OS X 10.13.6
EWS Watchlist
Comment 26 2019-02-22 05:41:06 PST
Comment on attachment 362714 [details] Patch Attachment 362714 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: https://webkit-queues.webkit.org/results/11245953 New failing tests: fast/viewport/ios/device-width-viewport-after-changing-view-scale.html
EWS Watchlist
Comment 27 2019-02-22 05:41:08 PST
Created attachment 362718 [details] Archive of layout-test-results from ews126 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews126 Port: ios-simulator-wk2 Platform: Mac OS X 10.13.6
Rob Buis
Comment 28 2019-02-22 05:50:42 PST
Frédéric Wang (:fredw)
Comment 29 2019-02-22 08:02:29 PST
Comment on attachment 362719 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=362719&action=review > Source/WebCore/mathml/MathMLPresentationElement.cpp:298 > + StringView stringView = string; Probably the next step will be to see if we can use a StringView for parseMathMLLength
WebKit Commit Bot
Comment 30 2019-02-22 08:31:23 PST
Comment on attachment 362719 [details] Patch Clearing flags on attachment: 362719 Committed r241947: <https://trac.webkit.org/changeset/241947>
WebKit Commit Bot
Comment 31 2019-02-22 08:31:25 PST
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 32 2019-02-22 08:39:27 PST
Darin Adler
Comment 33 2019-02-22 09:50:14 PST
I think we have both stripLeadingAndTrailingHTMLSpaces and stripLeadingAndTrailingHTTPSpaces, I am really surprised that this one uses the HTTP one, not the HTML one. Otherwise, I love this!
David Kilzer (:ddkilzer)
Comment 34 2019-02-22 15:46:19 PST
Rob, thanks for finishing this patch!
Note You need to log in before you can comment on or make changes to this bug.