WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
240942
Make StringView(const char*) private
https://bugs.webkit.org/show_bug.cgi?id=240942
Summary
Make StringView(const char*) private
Chris Dumez
Reported
2022-05-25 20:52:10 PDT
Make StringView(const char*) private and update existing call sites to use either StringView(ASCIILiteral) or StringView::fromLatin1(const char*).
Attachments
Add attachment
proposed patch, testcase, etc.
Chris Dumez
Comment 1
2022-05-25 21:20:12 PDT
Pull request:
https://github.com/WebKit/WebKit/pull/1040
EWS
Comment 2
2022-05-26 17:50:39 PDT
Committed
r294916
(
251034@main
): <
https://commits.webkit.org/251034@main
> Reviewed commits have been landed. Closing PR #1040 and removing active labels.
Radar WebKit Bug Importer
Comment 3
2022-05-26 17:51:15 PDT
<
rdar://problem/94007259
>
Fujii Hironori
Comment 4
2023-01-27 04:16:10 PST
bug#250017
is going to reimplement makeString with using more StringView internally, and makeString accepts a char pointer. What is the reason you disallow constructing a StringView from a char pointer? Why do we have to construct ASCIILiteral just to pass into a StringView?
Chris Dumez
Comment 5
2023-01-27 07:46:24 PST
(In reply to Fujii Hironori from
comment #4
)
>
bug#250017
is going to reimplement makeString with using more StringView > internally, and makeString accepts a char pointer. > What is the reason you disallow constructing a StringView from a char > pointer? > Why do we have to construct ASCIILiteral just to pass into a StringView?
As I remember it, we merely replaced the `StringView(const char*)` constructor with a `StringView::fromLatin1(const char*)` function, to encourage people to use `StringView(ASCIILiteral)` whenever they are actually dealing with a string literal. If you not dealing with a string literal and just dealing with a `const char*`, just use `StringView::fromLatin1(const char*)`. Don't construct an ASCIILiteral from something that is not a string literal, that would lead to bugs.
Fujii Hironori
Comment 6
2023-01-27 12:16:00 PST
(In reply to Chris Dumez from
comment #5
)
> As I remember it, we merely replaced the `StringView(const char*)` > constructor with a `StringView::fromLatin1(const char*)` function, to > encourage people to use `StringView(ASCIILiteral)` whenever they are > actually dealing with a string literal.
In my understanding, we use ASCIILiteral to pass a string literal into a String.
> If you not dealing with a string literal and just dealing with a `const > char*`, just use `StringView::fromLatin1(const char*)`.
`StringView::fromLatin1(const char*)` just calls `StringView::StringView(const char*)` constructor.
> Don't construct an > ASCIILiteral from something that is not a string literal, that would lead to > bugs.
Right. But, it's not a problem to construct a StringView from a char* , something that is not a string literal.
Chris Dumez
Comment 7
2023-01-27 12:21:17 PST
(In reply to Fujii Hironori from
comment #6
)
> (In reply to Chris Dumez from
comment #5
) > > As I remember it, we merely replaced the `StringView(const char*)` > > constructor with a `StringView::fromLatin1(const char*)` function, to > > encourage people to use `StringView(ASCIILiteral)` whenever they are > > actually dealing with a string literal. > > In my understanding, we use ASCIILiteral to pass a string literal into a > String.
We use ASCIILiteral to represent any string literal in WebKit, whether to construct a String or something else. This allows the recipient code to make some optimization if appropriate.
> > > If you not dealing with a string literal and just dealing with a `const > > char*`, just use `StringView::fromLatin1(const char*)`. > > `StringView::fromLatin1(const char*)` just calls > `StringView::StringView(const char*)` constructor.
Yes, so what's the problem with you using it? It does what you want. It just makes it clearer that we're treating the input as latin1, which we thought was a good idea.
> > > Don't construct an > > ASCIILiteral from something that is not a string literal, that would lead to > > bugs. > > Right. > But, it's not a problem to construct a StringView from a char* , something > that is not a string literal.
It is not a problem. We just wanted to be explicit about it being treated as latin1 iirc, exactly like we did for String.
Chris Dumez
Comment 8
2023-01-27 12:22:46 PST
(In reply to Chris Dumez from
comment #7
)
> (In reply to Fujii Hironori from
comment #6
) > > (In reply to Chris Dumez from
comment #5
) > > > As I remember it, we merely replaced the `StringView(const char*)` > > > constructor with a `StringView::fromLatin1(const char*)` function, to > > > encourage people to use `StringView(ASCIILiteral)` whenever they are > > > actually dealing with a string literal. > > > > In my understanding, we use ASCIILiteral to pass a string literal into a > > String. > > We use ASCIILiteral to represent any string literal in WebKit, whether to > construct a String or something else. This allows the recipient code to make > some optimization if appropriate. > > > > > > If you not dealing with a string literal and just dealing with a `const > > > char*`, just use `StringView::fromLatin1(const char*)`. > > > > `StringView::fromLatin1(const char*)` just calls > > `StringView::StringView(const char*)` constructor. > > Yes, so what's the problem with you using it? It does what you want. It just > makes it clearer that we're treating the input as latin1, which we thought > was a good idea. > > > > > > Don't construct an > > > ASCIILiteral from something that is not a string literal, that would lead to > > > bugs. > > > > Right. > > But, it's not a problem to construct a StringView from a char* , something > > that is not a string literal. > > It is not a problem. We just wanted to be explicit about it being treated as > latin1 iirc, exactly like we did for String.
I added Darin in CC just in case. I *think* this change was one he suggested, with the reasoning I cited in my previous comment iirc.
Fujii Hironori
Comment 9
2023-01-27 12:40:14 PST
Please grep "makeString". WebKit is using a lot of string literals in it, not with ""_s.
https://searchfox.org/wubkat/search?q=makeString&path=&case=false®exp=false
If a function take a StringView, it'd be nice if we can call it with string literal without ""_s. That's the reason makeString accepts char*, I think.
Chris Dumez
Comment 10
2023-01-27 12:41:31 PST
(In reply to Fujii Hironori from
comment #9
)
> Please grep "makeString". WebKit is using a lot of string literals in it, > not with ""_s. >
https://searchfox.org/wubkat/
> search?q=makeString&path=&case=false®exp=false
Yes, because we haven't gotten around to fixing this code yet. Nowadays, the expectation in WebKit is that you should be using the _s suffix whenever you are dealing with a String literal.
> If a function take a StringView, it'd be nice if we can call it with string > literal without ""_s. That's the reason makeString accepts char*, I think.
Darin Adler
Comment 11
2023-01-27 12:50:16 PST
(In reply to Fujii Hironori from
comment #9
)
> If a function take a StringView, it'd be nice if we can call it with string > literal without ""_s. That's the reason makeString accepts char*, I think.
I am not sure I agree about this, going forward. We should probably move away from this. The reason makeString accepts const char* is historical. When we created it, ASCIILiteral wasn’t as easy to use as it is now. And I must admit that I often use the const char* version because I like not having to add "_s" to the ends of my string literals, even in new code. But a const char* is ambiguous about encoding. In WebKit code, it could easily be either UTF-8 or Latin-1, or, less-likely some other encoding. But makeString treats const char* as Latin-1. If the string if UTF-8, then it will actively do the wrong thing. One really great thing about ASCIILiteral is that it states explicitly that all characters will be ASCII. Such strings can be safely passed to functions that treat the bytes as Latin-1, functions that treat the bytes as UTF-8, or even functions that optimize by assuming all bytes are ASCII, and there are no 0x80-0xFF bytes. This is good for both optimizations and correctness. A direction where we stop using non-ASCIILiteral string literals entirely is likely a good one for WebKit. I think we should consider setting some guidelines for new code so we can make this transition. We have had many mistakes where we treat UTF-8 strings as Latin-1 accidentally and it would be good to make this error much more difficult in the future. We may even want to add UTF8Literal at some point!
Darin Adler
Comment 12
2023-01-27 12:51:01 PST
So, I approve of this change Chris made.
Fujii Hironori
Comment 13
2023-01-27 13:25:41 PST
Makes sense. Thank you.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug