Bug 161669 - parseHTMLInteger() should take a StringView in parameter
Summary: parseHTMLInteger() should take a StringView in parameter
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Chris Dumez
URL:
Keywords:
: 161315 (view as bug list)
Depends on: 161885
Blocks:
  Show dependency treegraph
 
Reported: 2016-09-06 18:45 PDT by Chris Dumez
Modified: 2017-10-30 08:03 PDT (History)
7 users (show)

See Also:


Attachments
Patch (18.86 KB, patch)
2016-09-06 19:38 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (22.12 KB, patch)
2016-09-06 20:05 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (22.12 KB, patch)
2016-09-06 20:27 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (22.10 KB, patch)
2016-09-08 19:20 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (21.40 KB, patch)
2016-09-09 09:04 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (22.94 KB, patch)
2016-09-09 09:38 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Chris Dumez 2016-09-06 18:45:46 PDT
parseHTMLInteger() should take a StringView in parameter instead of a const String&.
Comment 1 Chris Dumez 2016-09-06 19:38:20 PDT
Created attachment 288091 [details]
Patch
Comment 2 Chris Dumez 2016-09-06 20:05:52 PDT
Created attachment 288093 [details]
Patch
Comment 3 Chris Dumez 2016-09-06 20:27:47 PDT
Created attachment 288094 [details]
Patch
Comment 4 Chris Dumez 2016-09-08 19:20:28 PDT
Created attachment 288377 [details]
Patch
Comment 5 Alex Christensen 2016-09-09 08:50:37 PDT
Comment on attachment 288377 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=288377&action=review

> Source/JavaScriptCore/runtime/DateConversion.cpp:118
> -                builder.append(timeZoneName);
> +                builder.append(String(timeZoneName));

There's no ChangeLog entry for this.  Could we do this without copying into a String?

> Source/WTF/wtf/text/StringView.h:69
>      StringView(const LChar*, unsigned length);
>      StringView(const UChar*, unsigned length);
> +    StringView(const char*);

I think we want a length parameter here like we do with the other pointer-based constructors to avoid accidentally adding strlen calls.  Or we could do something like this:

template<unsigned charactersCount>
StringView(const char (&characters)[charactersCount]);
Comment 6 Chris Dumez 2016-09-09 09:00:31 PDT
Comment on attachment 288377 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=288377&action=review

>> Source/JavaScriptCore/runtime/DateConversion.cpp:118
>> +                builder.append(String(timeZoneName));
> 
> There's no ChangeLog entry for this.  Could we do this without copying into a String?

When timeZoneName is a char* then we could call append() directly. However, we run into trouble if timeZoneName is a const WCHAR*. I'll investigate further.

>> Source/WTF/wtf/text/StringView.h:69
>> +    StringView(const char*);
> 
> I think we want a length parameter here like we do with the other pointer-based constructors to avoid accidentally adding strlen calls.  Or we could do something like this:
> 
> template<unsigned charactersCount>
> StringView(const char (&characters)[charactersCount]);

The call sites I want to support do not have the length and want to pass a const char*.
This is consistent with StringBuilder::append(const char* characters) and String(const char* characters)
Comment 7 Chris Dumez 2016-09-09 09:04:03 PDT
Created attachment 288406 [details]
Patch
Comment 8 Chris Dumez 2016-09-09 09:31:12 PDT
(In reply to comment #6)
> Comment on attachment 288377 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=288377&action=review
> 
> >> Source/JavaScriptCore/runtime/DateConversion.cpp:118
> >> +                builder.append(String(timeZoneName));
> > 
> > There's no ChangeLog entry for this.  Could we do this without copying into a String?
> 
> When timeZoneName is a char* then we could call append() directly. However,
> we run into trouble if timeZoneName is a const WCHAR*. I'll investigate
> further.

As I thought, WCHAR* is causing trouble on Windows:
C:\cygwin\home\buildbot\WebKit\Source\JavaScriptCore\runtime\DateConversion.cpp(118): error C2668: 'WTF::StringBuilder::append': ambiguous call to overloaded function [C:\cygwin\home\buildbot\WebKit\WebKitBuild\Release\Source\JavaScriptCore\JavaScriptCore.vcxproj]
  C:\cygwin\home\buildbot\WebKit\WebKitBuild\Release\DerivedSources\ForwardingHeaders\wtf/text/StringBuilder.h(78): note: could be 'void WTF::StringBuilder::append(const WTF::StringBuilder &)' (compiling source file C:\cygwin\home\buildbot\WebKit\Source\JavaScriptCore\runtime\DateConversion.cpp)
  C:\cygwin\home\buildbot\WebKit\WebKitBuild\Release\DerivedSources\ForwardingHeaders\wtf/text/StringBuilder.h(152): note: or       'void WTF::StringBuilder::append(char)' (compiling source file C:\cygwin\home\buildbot\WebKit\Source\JavaScriptCore\runtime\DateConversion.cpp)
  C:\cygwin\home\buildbot\WebKit\WebKitBuild\Release\DerivedSources\ForwardingHeaders\wtf/text/StringBuilder.h(141): note: or       'void WTF::StringBuilder::append(LChar)' (compiling source file C:\cygwin\home\buildbot\WebKit\Source\JavaScriptCore\runtime\DateConversion.cpp)
  C:\cygwin\home\buildbot\WebKit\WebKitBuild\Release\DerivedSources\ForwardingHeaders\wtf/text/StringBuilder.h(125): note: or       'void WTF::StringBuilder::append(UChar)' (compiling source file C:\cygwin\home\buildbot\WebKit\Source\JavaScriptCore\runtime\DateConversion.cpp)
  C:\cygwin\home\buildbot\WebKit\WebKitBuild\Release\DerivedSources\ForwardingHeaders\wtf/text/StringBuilder.h(119): note: or       'void WTF::StringBuilder::append(const char *)' (compiling source file C:\cygwin\home\buildbot\WebKit\Source\JavaScriptCore\runtime\DateConversion.cpp)
  C:\cygwin\home\buildbot\WebKit\WebKitBuild\Release\DerivedSources\ForwardingHeaders\wtf/text/StringBuilder.h(97): note: or       'void WTF::StringBuilder::append(WTF::StringView)' (compiling source file C:\cygwin\home\buildbot\WebKit\Source\JavaScriptCore\runtime\DateConversion.cpp)
  C:\cygwin\home\buildbot\WebKit\WebKitBuild\Release\DerivedSources\ForwardingHeaders\wtf/text/StringBuilder.h(157): note: or       'void WTF::StringBuilder::append(UChar32)' (compiling source file C:\cygwin\home\buildbot\WebKit\Source\JavaScriptCore\runtime\DateConversion.cpp)
  C:\cygwin\home\buildbot\WebKit\WebKitBuild\Release\DerivedSources\ForwardingHeaders\wtf/text/StringBuilder.h(58): note: or       'void WTF::StringBuilder::append(const WTF::String &)' (compiling source file C:\cygwin\home\buildbot\WebKit\Source\JavaScriptCore\runtime\DateConversion.cpp)
  C:\cygwin\home\buildbot\WebKit\WebKitBuild\Release\DerivedSources\ForwardingHeaders\wtf/text/StringBuilder.h(53): note: or       'void WTF::StringBuilder::append(const WTF::AtomicString &)' (compiling source file C:\cygwin\home\buildbot\WebKit\Source\JavaScriptCore\runtime\DateConversion.cpp)
  C:\cygwin\home\buildbot\WebKit\Source\JavaScriptCore\runtime\DateConversion.cpp(118): note: while trying to match the argument list '(const WCHAR *)'
Comment 9 Chris Dumez 2016-09-09 09:38:21 PDT
Created attachment 288408 [details]
Patch
Comment 10 Ryosuke Niwa 2016-09-09 21:28:52 PDT
Comment on attachment 288408 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=288408&action=review

> Source/JavaScriptCore/runtime/DateConversion.cpp:122
> +#if OS(WINDOWS)
> +                builder.append(String(timeZoneName));
> +#else
>                  builder.append(timeZoneName);
> +#endif

Can we create timeZoneName as a String object in the above #if OS(WINDOWS) instead?

> Source/WebCore/html/parser/HTMLParserIdioms.cpp:194
> -Optional<int> parseHTMLInteger(const String& input)
> +Optional<int> parseHTMLInteger(StringView input)

Do we really want to pass by value here?  Why can't we just use a reference instead?

> Source/WebCore/html/parser/HTMLParserIdioms.cpp:210
> -Optional<unsigned> parseHTMLNonNegativeInteger(const String& input)
> +Optional<unsigned> parseHTMLNonNegativeInteger(StringView input)

Ditto.

> Source/WebCore/html/parser/HTMLParserIdioms.h:136
> -inline unsigned limitToOnlyHTMLNonNegativeNumbersGreaterThanZero(const String& stringValue, unsigned defaultValue = 1)
> +inline unsigned limitToOnlyHTMLNonNegativeNumbersGreaterThanZero(StringView stringValue, unsigned defaultValue = 1)

Ditto.

> Source/WebCore/html/parser/HTMLParserIdioms.h:154
> -inline unsigned limitToOnlyHTMLNonNegative(const String& stringValue, unsigned defaultValue = 0)
> +inline unsigned limitToOnlyHTMLNonNegative(StringView stringValue, unsigned defaultValue = 0)

Ditto.
Comment 11 Chris Dumez 2016-09-09 22:36:09 PDT
Comment on attachment 288408 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=288408&action=review

>> Source/JavaScriptCore/runtime/DateConversion.cpp:122
>> +#endif
> 
> Can we create timeZoneName as a String object in the above #if OS(WINDOWS) instead?

I thought about this but this will not reduce the ifdef'ing because of the if (timezoneName[0]) above.

>> Source/WebCore/html/parser/HTMLParserIdioms.cpp:194
>> +Optional<int> parseHTMLInteger(StringView input)
> 
> Do we really want to pass by value here?  Why can't we just use a reference instead?

This is intentional and the common pattern. StringView is a small object.
Comment 12 Ryosuke Niwa 2016-09-09 23:36:12 PDT
Well, I'm not talking about the size but the runtime cost of having to create & destroy StringView objects. It does a bunch of work in copy constructor.

Also, there is a bunch of other functions that take const StringView&.  So what's the benefit / reason for passing StringView by value versus passing as a reference?
Comment 13 Chris Dumez 2016-09-10 08:59:31 PDT
(In reply to comment #12)
> Well, I'm not talking about the size but the runtime cost of having to
> create & destroy StringView objects. It does a bunch of work in copy
> constructor.
> 
> Also, there is a bunch of other functions that take const StringView&.  So
> what's the benefit / reason for passing StringView by value versus passing
> as a reference?

I am also talking about runtime performance. StringView is small and contains on very simple data members (a bool, an unsigned and a pointer). Its implicit copy constructor only has to copy those data members which is cheap. Its destructor basically has nothing to do. The idea is that copying such small data types is likely cheaper than passing then by reference (similarly to IntPoint).

That said, I do not feel strongly about this. The reason I chose to pass by value honestly is because Darin asked me to do so in earlier reviews.
Comment 14 Ryosuke Niwa 2016-09-10 10:29:55 PDT
Okay. We might want to consider replacing const StringView& with StringView if that's really preferable then.
Comment 15 WebKit Commit Bot 2016-09-10 11:10:36 PDT
Comment on attachment 288408 [details]
Patch

Clearing flags on attachment: 288408

Committed r205787: <http://trac.webkit.org/changeset/205787>
Comment 16 WebKit Commit Bot 2016-09-10 11:10:42 PDT
All reviewed patches have been landed.  Closing bug.
Comment 17 Darin Adler 2016-09-10 17:14:45 PDT
(In reply to comment #14)
> Okay. We might want to consider replacing const StringView& with StringView
> if that's really preferable then.

Yes, Anders has said that we should do that. We should never use const StringView&. We could measure to make absolutely sure it’s more efficient and then do it globally.
Comment 18 Darin Adler 2016-09-12 14:05:29 PDT
Comment on attachment 288408 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=288408&action=review

Some coding style tweak ideas.

We could consider adding a StringBuilder::append for CFStringRef; it would be used in at 10 call sites in this patch and if there is reason to want to avoid copying the string an extra time at even one of them then it would probably be worth it.

> Source/WTF/wtf/text/StringView.h:199
> +#include <wtf/text/AtomicString.h>
>  #include <wtf/text/WTFString.h>

We should figure out what to do about the nest of string headers that all include one another. There may be a way to handle this that is better for compilation speed.

> Source/WebCore/html/parser/HTMLParserIdioms.cpp:366
> +    Optional<unsigned> number = parseHTMLNonNegativeInteger(StringView(numberStart, position - numberStart));

We may be able to write:

    parseHTMLNonNegativeInteger({ numberStart, position - numberStart })

If so, I think that’s nicer than StringView(). Maybe you agree?

> Source/WebCore/platform/sql/SQLiteStatement.cpp:291
> -    return equalLettersIgnoringASCIICase(StringView(reinterpret_cast<const UChar*>(sqlite3_column_decltype16(m_statement, col))), "blob");
> +    return equalLettersIgnoringASCIICase(StringView(sqlite3_column_decltype(m_statement, col)), "blob");

This is OK, but a bit peculiar. It seems strange to involve our string classes at when we want to compare two const char* in an ASCII case insensitive way.

> Source/WebCore/rendering/RenderThemeIOS.mm:1296
>          StringBuilder builder;
> -        builder.append([NSString stringWithContentsOfFile:[[NSBundle bundleForClass:[WebCoreRenderThemeBundle class]] pathForResource:@"mediaControlsiOS" ofType:@"css"] encoding:NSUTF8StringEncoding error:nil]);
> +        builder.append(String([NSString stringWithContentsOfFile:[[NSBundle bundleForClass:[WebCoreRenderThemeBundle class]] pathForResource:@"mediaControlsiOS" ofType:@"css"] encoding:NSUTF8StringEncoding error:nil]));
>          m_mediaControlsStyleSheet = builder.toString();

Why involve a string builder at all?

> Source/WebCore/rendering/RenderThemeIOS.mm:1312
> +        scriptBuilder.append(String([NSString stringWithContentsOfFile:[[NSBundle bundleForClass:[WebCoreRenderThemeBundle class]] pathForResource:@"mediaControlsLocalizedStrings" ofType:@"js"] encoding:NSUTF8StringEncoding error:nil]));
> +        scriptBuilder.append(String([NSString stringWithContentsOfFile:[[NSBundle bundleForClass:[WebCoreRenderThemeBundle class]] pathForResource:@"mediaControlsApple" ofType:@"js"] encoding:NSUTF8StringEncoding error:nil]));
> +        scriptBuilder.append(String([NSString stringWithContentsOfFile:[[NSBundle bundleForClass:[WebCoreRenderThemeBundle class]] pathForResource:@"mediaControlsiOS" ofType:@"js"] encoding:NSUTF8StringEncoding error:nil]));
>          m_mediaControlsScript = scriptBuilder.toString();

Does seem like a gratuitously inefficient way to read three files and decode UTF-8 all to make a string. Maybe doesn’t matter though since we do this only once.

> Source/WebCore/rendering/RenderThemeMac.mm:239
>          StringBuilder styleSheetBuilder;
> -        styleSheetBuilder.append([NSString stringWithContentsOfFile:[[NSBundle bundleForClass:[WebCoreRenderThemeBundle class]] pathForResource:@"mediaControlsApple" ofType:@"css"] encoding:NSUTF8StringEncoding error:nil]);
> +        styleSheetBuilder.append(String([NSString stringWithContentsOfFile:[[NSBundle bundleForClass:[WebCoreRenderThemeBundle class]] pathForResource:@"mediaControlsApple" ofType:@"css"] encoding:NSUTF8StringEncoding error:nil]));
>          m_mediaControlsStyleSheet = styleSheetBuilder.toString();

Same comment as above.

> Source/WebCore/rendering/RenderThemeMac.mm:253
> +        scriptBuilder.append(String([NSString stringWithContentsOfFile:[[NSBundle bundleForClass:[WebCoreRenderThemeBundle class]] pathForResource:@"mediaControlsLocalizedStrings" ofType:@"js"] encoding:NSUTF8StringEncoding error:nil]));
> +        scriptBuilder.append(String([NSString stringWithContentsOfFile:[[NSBundle bundleForClass:[WebCoreRenderThemeBundle class]] pathForResource:@"mediaControlsApple" ofType:@"js"] encoding:NSUTF8StringEncoding error:nil]));

And again.
Comment 19 Frédéric Wang (:fredw) 2017-10-30 08:03:32 PDT
*** Bug 161315 has been marked as a duplicate of this bug. ***