Bug 171586 - NeverDestroyed<String>(ASCIILiteral(...)) is not thread safe.
Summary: NeverDestroyed<String>(ASCIILiteral(...)) is not thread safe.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Mark Lam
URL:
Keywords: InRadar
Depends on:
Blocks: 171702 171800
  Show dependency treegraph
 
Reported: 2017-05-02 18:01 PDT by Mark Lam
Modified: 2017-05-08 03:28 PDT (History)
16 users (show)

See Also:


Attachments
proposed patch. (48.08 KB, patch)
2017-05-02 18:30 PDT, Mark Lam
no flags Details | Formatted Diff | Diff
rebased to ToT (47.81 KB, patch)
2017-05-02 18:37 PDT, Mark Lam
fpizlo: review-
Details | Formatted Diff | Diff
proposed patch. (134.69 KB, patch)
2017-05-04 14:20 PDT, Mark Lam
ysuzuki: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Lam 2017-05-02 18:01:39 PDT
JavaScriptCore allows multiple VMs to be instantiated, and each of these should be able to run concurrently on different threads.  However, NeverDestroyed<String>(ASCIILiteral(...)) is not thread-safe because each thread will ref and deref the underlying StringImpl.  Since this ref and deref is down in a thread-safe way, the NeverDestroyed<String> may get destroyed due to the ref/deref races.

The fix is to use the StaticStringImpl class which is safe for ref/derefing concurrently from different threads.

An alternative solution would be to change all the uses of NeverDestroyed<String> to use per-VM strings.  However, this solution is cumbersome, and makes it harder to allocate the intended shared string.  It also uses more memory and takes more CPU time because it requires allocating the same string for each VM instance.  The StaticStringImpl solution wins out because it is more efficient and is easier to use.

<rdar://problem/31873190>
Comment 1 Mark Lam 2017-05-02 18:30:40 PDT
Created attachment 308877 [details]
proposed patch.
Comment 2 Mark Lam 2017-05-02 18:37:17 PDT
Created attachment 308879 [details]
rebased to ToT
Comment 3 Build Bot 2017-05-02 18:39:32 PDT
This patch modifies the WEB_REPLAY inputs generator. Please ensure that you have rebaselined any generator test results (i.e., by running `Tools/Scripts/run-input-generator-tests --reset-results`)
Comment 4 Oliver Hunt 2017-05-02 20:06:12 PDT
Comment on attachment 308879 [details]
rebased to ToT

View in context: https://bugs.webkit.org/attachment.cgi?id=308879&action=review

> Source/WTF/ChangeLog:13
> +

I would rather just have didReportCost check if it's a static string and simply not update (and return true) - that seems more sensible to me as it is (functionally) a static part of the binary

> Source/WTF/ChangeLog:23
> +            NeverDestroyed<String> myString(ASCIILiteral("myString"));

Why are there different ascii literal strings? Is this to allow non-atomic increment of literal strings in the dom?

> Source/WTF/ChangeLog:24
> +

I ask because the bulk of this patch is changing asciiliteral to staticasciiliteral, but the whole point of asciiliteral /is/ that it is a literal, so to me this is at best poorly named

> Source/WTF/wtf/text/StringImpl.h:573
> +        //    is no way to set/clear the s_hashFlag8BitBuffer flag other than at construction.

Wait, isn't m_data16 created lazily if it is ever needed?

> Source/WTF/wtf/text/StringImpl.h:887
> +static_assert(sizeof(StringImpl) == sizeof(StaticStringImpl), "");

I feel that it would be better to just wrap the sttingimpl members in a struct and embed that struct - why did you choose not to?
Comment 5 Yusuke Suzuki 2017-05-03 03:58:00 PDT
Comment on attachment 308879 [details]
rebased to ToT

View in context: https://bugs.webkit.org/attachment.cgi?id=308879&action=review

> Source/JavaScriptCore/ChangeLog:12
> +        meant to be shared by all VMs.

Right.

> Source/JavaScriptCore/ChangeLog:24
> +        However, NeverDestroyed<String>(ASCIILiteral(...)) is not thread-safe because
> +        each thread will ref and deref the underlying StringImpl.  Since this ref and
> +        deref is down in a thread-safe way, the NeverDestroyed<String> may get destroyed
> +        due to the ref/deref races.  Additionally, each thread may modify the StringImpl
> +        by setting its hash and also twiddling its flags.
> +
> +        The fix is to use the StaticStringImpl class which is safe for ref/derefing
> +        concurrently from different threads.  StaticStringImpl is also pre-set with a
> +        hash on construction, and its flags are set in such a way as to prevent twiddling
> +        at runtime.  Hence, we will be able to share a NeverDestroyed<String> between
> +        VMs, as long as it is backed by a StaticStringImpl.

How about just using `static StringImpl::StaticStringImpl StringValue("String");`, casting it to StringImpl* and returning it?

> Source/JavaScriptCore/ChangeLog:31
> +        to use.

Yeah, it is unacceptable.

>> Source/WTF/ChangeLog:23
>> +            NeverDestroyed<String> myString(ASCIILiteral("myString"));
> 
> Why are there different ascii literal strings? Is this to allow non-atomic increment of literal strings in the dom?

ASCIILiteral() is introduced to distinguish `const char*` from literal, it can represent some good features of literals.
For example, while `const char*` may be non-null-terminated, we can say that ASCIILiteral() is always null-terminated.

I think the current StaticASCIILiteral has a bug since makeStaticStringImpl does not work well. I suggest using StaticStringImpl directly in NeverDestroyed<String> places.
Like, `static StringImpl::StaticStringImpl myString("myString")`.

>> Source/WTF/wtf/text/StringImpl.h:573
>> +        //    is no way to set/clear the s_hashFlag8BitBuffer flag other than at construction.
> 
> Wait, isn't m_data16 created lazily if it is ever needed?

No, we do not create it lazily. StringImpl is basically immutable except for ref count and atomicity flag.

> Source/WTF/wtf/text/StringImpl.h:581
> +        //    a. StaticStringImpl's constructor sets the s_hashFlagDidReportCost flag to ensure
> +        //       that StringImpl::cost() returns early.
> +        //       This means StaticStringImpl costs are not counted. But since there should only
> +        //       be a finite set of StaticStringImpls, their cost can be aggregated into a single
> +        //       system cost if needed.

I think setting s_hashFlagDidReportCost is OK. Adding some document about static strings in ::cost() would be helpful.

> Source/WTF/wtf/text/StringImpl.h:583
> +        //       setIsAtomic() asserts !isStatic().

Right.

> Source/WTF/wtf/text/StringImpl.h:587
> +        //       Additionally, StringImpl::setHash() asserts hasHash() and !isStatic().

Right.

> Source/WTF/wtf/text/StringImpl.h:1164
> +template<unsigned charactersCount>
> +StaticStringImpl& makeStaticStringImpl(const char (&characters)[charactersCount])
> +{
> +    static StaticStringImpl impl(characters);
> +    return impl;
> +}
> +
> +template<unsigned charactersCount>
> +StaticStringImpl& makeStaticStringImpl(const char16_t (&characters)[charactersCount])
> +{
> +    static StaticStringImpl impl(characters);
> +    return impl;
> +}

I think it's not correct. It returns the same StaticStringImpl& for `makeStaticStringImpl("Hello")` and `makeStaticStringImpl("World")` (strlen("Hello") == strlen("World")), right?
Comment 6 Filip Pizlo 2017-05-03 08:36:39 PDT
Comment on attachment 308879 [details]
rebased to ToT

View in context: https://bugs.webkit.org/attachment.cgi?id=308879&action=review

r- because bug

> Source/JavaScriptCore/ChangeLog:16
> +        deref is down in a thread-safe way, the NeverDestroyed<String> may get destroyed

*done

>>> Source/WTF/ChangeLog:23
>>> +            NeverDestroyed<String> myString(ASCIILiteral("myString"));
>> 
>> Why are there different ascii literal strings? Is this to allow non-atomic increment of literal strings in the dom?
> 
> ASCIILiteral() is introduced to distinguish `const char*` from literal, it can represent some good features of literals.
> For example, while `const char*` may be non-null-terminated, we can say that ASCIILiteral() is always null-terminated.
> 
> I think the current StaticASCIILiteral has a bug since makeStaticStringImpl does not work well. I suggest using StaticStringImpl directly in NeverDestroyed<String> places.
> Like, `static StringImpl::StaticStringImpl myString("myString")`.

I agree.

>> Source/WTF/wtf/text/StringImpl.h:1164
>> +}
> 
> I think it's not correct. It returns the same StaticStringImpl& for `makeStaticStringImpl("Hello")` and `makeStaticStringImpl("World")` (strlen("Hello") == strlen("World")), right?

Agreed.
Comment 7 Mark Lam 2017-05-03 10:36:43 PDT
Comment on attachment 308879 [details]
rebased to ToT

View in context: https://bugs.webkit.org/attachment.cgi?id=308879&action=review

>> Source/WTF/ChangeLog:13
>> +
> 
> I would rather just have didReportCost check if it's a static string and simply not update (and return true) - that seems more sensible to me as it is (functionally) a static part of the binary

The s_hashFlagDidReportCost flag is used to avoid reporting the cost of the same StringImpl multiple times.  Option 1: Reporting 0 means under-counting the cost by a constant C.  Option 2: Computing the cost every time and without updating the flag means potentially over-counting the cost by an unknown multiple of C, and also using more (though probably negligible) CPU time both for calculating the cost as well as the check to determine whether to update the flag or not.  I think option 1 is the superior option both for a more accurate estimate of cost as well as minimizing CPU time.

>>>> Source/WTF/ChangeLog:23
>>>> +            NeverDestroyed<String> myString(ASCIILiteral("myString"));
>>> 
>>> Why are there different ascii literal strings? Is this to allow non-atomic increment of literal strings in the dom?
>> 
>> ASCIILiteral() is introduced to distinguish `const char*` from literal, it can represent some good features of literals.
>> For example, while `const char*` may be non-null-terminated, we can say that ASCIILiteral() is always null-terminated.
>> 
>> I think the current StaticASCIILiteral has a bug since makeStaticStringImpl does not work well. I suggest using StaticStringImpl directly in NeverDestroyed<String> places.
>> Like, `static StringImpl::StaticStringImpl myString("myString")`.
> 
> I agree.

@Yusuke, great catch.  I tried replacing with static StaticStringImpl before settling on the makeStaticStringImpl() idiom, but ran into some issues that made it not ideal ... though I've forgotten the reason now.  Maybe I was not doing it right.  I'll try again.  If not able to do so, I can fix the bug in makeStaticStringImpl().

>> Source/WTF/ChangeLog:24
>> +
> 
> I ask because the bulk of this patch is changing asciiliteral to staticasciiliteral, but the whole point of asciiliteral /is/ that it is a literal, so to me this is at best poorly named

Yes, the name is not one that I'm proud of, but it's the best one I can think of at the time.  Might be moot now if we can do without it.

>>> Source/WTF/wtf/text/StringImpl.h:573
>>> +        //    is no way to set/clear the s_hashFlag8BitBuffer flag other than at construction.
>> 
>> Wait, isn't m_data16 created lazily if it is ever needed?
> 
> No, we do not create it lazily. StringImpl is basically immutable except for ref count and atomicity flag.

... and the s_hashFlagDidReportCost flag.

>> Source/WTF/wtf/text/StringImpl.h:581
>> +        //       system cost if needed.
> 
> I think setting s_hashFlagDidReportCost is OK. Adding some document about static strings in ::cost() would be helpful.

OK.

>> Source/WTF/wtf/text/StringImpl.h:887
>> +static_assert(sizeof(StringImpl) == sizeof(StaticStringImpl), "");
> 
> I feel that it would be better to just wrap the sttingimpl members in a struct and embed that struct - why did you choose not to?

Because there is already a lot of StringImpl code that accesses the fields directly.  I didn't want to touch all that code just to make them reference thru an embedded named struct.  I don't see a way to use an anonymous struct.  What's wrong with private inheritance of the Shape class?

>>> Source/WTF/wtf/text/StringImpl.h:1164
>>> +}
>> 
>> I think it's not correct. It returns the same StaticStringImpl& for `makeStaticStringImpl("Hello")` and `makeStaticStringImpl("World")` (strlen("Hello") == strlen("World")), right?
> 
> Agreed.

Will fix (one way or another).
Comment 8 Geoffrey Garen 2017-05-03 10:48:03 PDT
> The fix is to use the StaticStringImpl class which is safe for ref/derefing
> concurrently from different threads.

StaticStringImpl is still not thread-safe in its initialization, since multiple threads can race to initialize.
Comment 9 Oliver Hunt 2017-05-03 10:49:09 PDT
Comment on attachment 308879 [details]
rebased to ToT

View in context: https://bugs.webkit.org/attachment.cgi?id=308879&action=review

>>> Source/WTF/ChangeLog:13
>>> +
>> 
>> I would rather just have didReportCost check if it's a static string and simply not update (and return true) - that seems more sensible to me as it is (functionally) a static part of the binary
> 
> The s_hashFlagDidReportCost flag is used to avoid reporting the cost of the same StringImpl multiple times.  Option 1: Reporting 0 means under-counting the cost by a constant C.  Option 2: Computing the cost every time and without updating the flag means potentially over-counting the cost by an unknown multiple of C, and also using more (though probably negligible) CPU time both for calculating the cost as well as the check to determine whether to update the flag or not.  I think option 1 is the superior option both for a more accurate estimate of cost as well as minimizing CPU time.

Ah, i thought that there was a simple way to identify persistent/permanent strings. So yeah this makes sense to me now.

>>>>> Source/WTF/ChangeLog:23
>>>>> +            NeverDestroyed<String> myString(ASCIILiteral("myString"));
>>>> 
>>>> Why are there different ascii literal strings? Is this to allow non-atomic increment of literal strings in the dom?
>>> 
>>> ASCIILiteral() is introduced to distinguish `const char*` from literal, it can represent some good features of literals.
>>> For example, while `const char*` may be non-null-terminated, we can say that ASCIILiteral() is always null-terminated.
>>> 
>>> I think the current StaticASCIILiteral has a bug since makeStaticStringImpl does not work well. I suggest using StaticStringImpl directly in NeverDestroyed<String> places.
>>> Like, `static StringImpl::StaticStringImpl myString("myString")`.
>> 
>> I agree.
> 
> @Yusuke, great catch.  I tried replacing with static StaticStringImpl before settling on the makeStaticStringImpl() idiom, but ran into some issues that made it not ideal ... though I've forgotten the reason now.  Maybe I was not doing it right.  I'll try again.  If not able to do so, I can fix the bug in makeStaticStringImpl().

My concern is that i don't fully understand why we would not consider /all/ ascii literals to be static -- i'm willing to accept that there could be performance concerns, but i'd like to know if it's deliberate.

Specifically why should these differ in semantics:

String(ASCIILiteral("foo"))
vs.
String(StaticASCIILiteral("foo"))

?
Comment 10 Mark Lam 2017-05-03 10:55:01 PDT
Comment on attachment 308879 [details]
rebased to ToT

View in context: https://bugs.webkit.org/attachment.cgi?id=308879&action=review

>>>>>> Source/WTF/ChangeLog:23
>>>>>> +            NeverDestroyed<String> myString(ASCIILiteral("myString"));
>>>>> 
>>>>> Why are there different ascii literal strings? Is this to allow non-atomic increment of literal strings in the dom?
>>>> 
>>>> ASCIILiteral() is introduced to distinguish `const char*` from literal, it can represent some good features of literals.
>>>> For example, while `const char*` may be non-null-terminated, we can say that ASCIILiteral() is always null-terminated.
>>>> 
>>>> I think the current StaticASCIILiteral has a bug since makeStaticStringImpl does not work well. I suggest using StaticStringImpl directly in NeverDestroyed<String> places.
>>>> Like, `static StringImpl::StaticStringImpl myString("myString")`.
>>> 
>>> I agree.
>> 
>> @Yusuke, great catch.  I tried replacing with static StaticStringImpl before settling on the makeStaticStringImpl() idiom, but ran into some issues that made it not ideal ... though I've forgotten the reason now.  Maybe I was not doing it right.  I'll try again.  If not able to do so, I can fix the bug in makeStaticStringImpl().
> 
> My concern is that i don't fully understand why we would not consider /all/ ascii literals to be static -- i'm willing to accept that there could be performance concerns, but i'd like to know if it's deliberate.
> 
> Specifically why should these differ in semantics:
> 
> String(ASCIILiteral("foo"))
> vs.
> String(StaticASCIILiteral("foo"))
> 
> ?

Consider this scenario:
String getString() {
    const char* str1 = "hello";
    const char* str2 = "world";
    const char* chosenString = reasons ? str1 : str2;
    return String(ASCIILiteral(chosenString));
}

This is legal with ASCIILiteral but not legal with StaticASCIILiteral.  And yes, we do have code that uses ASCIILiteral in that way.
Comment 11 Mark Lam 2017-05-03 10:57:56 PDT
(In reply to Geoffrey Garen from comment #8)
> > The fix is to use the StaticStringImpl class which is safe for ref/derefing
> > concurrently from different threads.
> 
> StaticStringImpl is still not thread-safe in its initialization, since
> multiple threads can race to initialize.

StaticStringImpl's initialization is a constexpr.
Comment 12 Mark Lam 2017-05-04 14:20:20 PDT
Created attachment 309103 [details]
proposed patch.
Comment 13 Yusuke Suzuki 2017-05-04 14:41:28 PDT
Comment on attachment 309103 [details]
proposed patch.

r=me
Comment 14 Mark Lam 2017-05-04 16:25:05 PDT
Thanks for the review.  Landed in r216217: <http://trac.webkit.org/r216217>.
Comment 15 Ryan Haddad 2017-05-04 17:39:47 PDT
(In reply to Mark Lam from comment #14)
> Thanks for the review.  Landed in r216217: <http://trac.webkit.org/r216217>.

This change broke the Windows build:

https://build.webkit.org/builders/Apple%20Win%20Release%20%28Build%29/builds/1222
Comment 16 Mark Lam 2017-05-04 17:47:18 PDT
(In reply to Ryan Haddad from comment #15)
> (In reply to Mark Lam from comment #14)
> > Thanks for the review.  Landed in r216217: <http://trac.webkit.org/r216217>.
> 
> This change broke the Windows build:
> 
> https://build.webkit.org/builders/Apple%20Win%20Release%20%28Build%29/builds/
> 1222

Build fix for Windows landed in r216220: <http://trac.webkit.org/r216220>.