Bug 145109 - Move AtomicStringImpl table related operations from AtomicString to AtomicStringImpl
Summary: Move AtomicStringImpl table related operations from AtomicString to AtomicStr...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Yusuke Suzuki
URL:
Keywords:
Depends on:
Blocks: 144848
  Show dependency treegraph
 
Reported: 2015-05-17 12:34 PDT by Yusuke Suzuki
Modified: 2015-05-19 21:19 PDT (History)
3 users (show)

See Also:


Attachments
Patch (82.79 KB, patch)
2015-05-17 12:36 PDT, Yusuke Suzuki
no flags Details | Formatted Diff | Diff
Patch (82.76 KB, patch)
2015-05-17 12:37 PDT, Yusuke Suzuki
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Yusuke Suzuki 2015-05-17 12:34:56 PDT
Move AtomicStringImpl table related operations from AtomicString to AtomicStringImpl
Comment 1 Yusuke Suzuki 2015-05-17 12:36:52 PDT
Created attachment 253292 [details]
Patch
Comment 2 Yusuke Suzuki 2015-05-17 12:37:53 PDT
Created attachment 253293 [details]
Patch
Comment 3 WebKit Commit Bot 2015-05-17 12:39:34 PDT
Attachment 253293 [details] did not pass style-queue:


ERROR: Source/WTF/wtf/text/AtomicString.h:45:  Should be indented on a separate line, with the colon or comma first on that line.  [whitespace/indent] [4]
ERROR: Source/WTF/wtf/text/AtomicString.h:46:  Should be indented on a separate line, with the colon or comma first on that line.  [whitespace/indent] [4]
ERROR: Source/WTF/wtf/text/AtomicString.h:47:  Should be indented on a separate line, with the colon or comma first on that line.  [whitespace/indent] [4]
ERROR: Source/WTF/wtf/text/AtomicString.h:48:  Should be indented on a separate line, with the colon or comma first on that line.  [whitespace/indent] [4]
ERROR: Source/WTF/wtf/text/AtomicString.h:49:  Should be indented on a separate line, with the colon or comma first on that line.  [whitespace/indent] [4]
ERROR: Source/WTF/wtf/text/AtomicString.h:50:  Should be indented on a separate line, with the colon or comma first on that line.  [whitespace/indent] [4]
ERROR: Source/WTF/wtf/text/AtomicString.h:61:  Should be indented on a separate line, with the colon or comma first on that line.  [whitespace/indent] [4]
ERROR: Source/WTF/wtf/text/AtomicString.h:164:  Should be indented on a separate line, with the colon or comma first on that line.  [whitespace/indent] [4]
ERROR: Source/WTF/wtf/text/AtomicString.h:167:  Should be indented on a separate line, with the colon or comma first on that line.  [whitespace/indent] [4]
ERROR: Source/WTF/wtf/text/WTFString.h:121:  Should be indented on a separate line, with the colon or comma first on that line.  [whitespace/indent] [4]
ERROR: Source/WTF/wtf/text/WTFString.h:122:  Should be indented on a separate line, with the colon or comma first on that line.  [whitespace/indent] [4]
ERROR: Source/WTF/wtf/text/WTFString.h:123:  Should be indented on a separate line, with the colon or comma first on that line.  [whitespace/indent] [4]
Total errors found: 12 in 29 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 4 Yusuke Suzuki 2015-05-18 08:53:55 PDT
Comment on attachment 253293 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=253293&action=review

Added comments

> Source/WTF/wtf/text/AtomicString.h:50
> +    AtomicString(const UChar* s) : m_string(AtomicStringImpl::add(s)) { }

These follow the original code's style. And it causes cpplint's warnings.
Is it better to change it to the following?
AtomicString(const LChar* s)
    : m_string(AtomicStringImpl::add(s))
{
}
Comment 5 Darin Adler 2015-05-19 08:56:29 PDT
(In reply to comment #4)
> These follow the original code's style. And it causes cpplint's warnings.
> Is it better to change it to the following?
> AtomicString(const LChar* s)
>     : m_string(AtomicStringImpl::add(s))
> {
> }

I like that style and yes that’s how I would do it. But, better, I would not put the function definition into the class definition. I would declare it in the class and define it after the class. The class definition itself is more readable if it’s not full of function bodies.
Comment 6 Darin Adler 2015-05-19 09:06:30 PDT
Comment on attachment 253293 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=253293&action=review

> Source/JavaScriptCore/runtime/TypeSet.cpp:355
> -        String property = String((*iter));
> +        String property = String(iter->get());

I don’t understand this change.

> Source/WTF/wtf/text/AtomicString.cpp:108
> +    if (RefPtr<AtomicStringImpl> impl = AtomicStringImpl::addFromUTF8(charactersStart, charactersEnd)) {

Should use auto here instead of writing out the RefPtr type.

> Source/WTF/wtf/text/AtomicString.cpp:110
> +        AtomicString atomicString(impl.get());
> +        return atomicString;

I don’t understand the reason for the local variable here. I would write:

    return impl.get();

Or if that doesn’t work:

    return AtomicString { impl.get() };

> Source/WTF/wtf/text/AtomicString.cpp:112
> +    return nullAtom;

I prefer early return for the error case, although that gets rid of the scoping.

>> Source/WTF/wtf/text/AtomicString.h:50
>> +    AtomicString(const UChar* s) : m_string(AtomicStringImpl::add(s)) { }
> 
> These follow the original code's style. And it causes cpplint's warnings.
> Is it better to change it to the following?
> AtomicString(const LChar* s)
>     : m_string(AtomicStringImpl::add(s))
> {
> }

I like that style and yes that’s how I would do it. But, better, I would not put the function definition into the class definition. I would declare it in the class and define it after the class. The class definition itself is more readable if it’s not full of function bodies.

> Source/WTF/wtf/text/AtomicStringImpl.h:33
> +private:
> +    AtomicStringImpl() = delete;

I suggest putting this down in the private section below.

> Source/WTF/wtf/text/AtomicStringImpl.h:39
> +    static AtomicStringImpl* lookup(LChar* characters, unsigned length)
> +    {
> +        return lookupInternal(characters, length);
> +    }

The verb "look up" is two words. So the function name should be lookUp. Not "lookup", which is a noun, not a verb. Should return a reference, not a pointer.

> Source/WTF/wtf/text/AtomicStringImpl.h:43
> +    static AtomicStringImpl* lookup(UChar* characters, unsigned length)
> +    {
> +        return lookupInternal(characters, length);
> +    }

The verb "look up" is two words. So the function name should be lookUp. Not "lookup", which is a noun, not a verb. Should return a reference, not a pointer.

> Source/WTF/wtf/text/AtomicStringImpl.h:49
> +    static AtomicStringImpl* lookup(StringImpl* string)
> +    {
> +        if (!string || string->isAtomic())
> +            return static_cast<AtomicStringImpl*>(string);
> +        return lookupSlowCase(*string);
> +    }

The verb "look up" is two words. So the function name should be lookUp. Not "lookup", which is a noun, not a verb. Should return a reference, not a pointer. Can we have one that takes and returns a reference? The one that takes and returns a pointer can call the one that takes and returns a reference after doing the null check.

> Source/WTF/wtf/text/AtomicStringImpl.h:60
> +    WTF_EXPORT_STRING_API static RefPtr<AtomicStringImpl> add(const LChar*);
> +    ALWAYS_INLINE static RefPtr<AtomicStringImpl> add(const char* s) { return add(reinterpret_cast<const LChar*>(s)); };
> +    WTF_EXPORT_STRING_API static RefPtr<AtomicStringImpl> add(const LChar*, unsigned length);
> +    WTF_EXPORT_STRING_API static RefPtr<AtomicStringImpl> add(const UChar*, unsigned length);
> +    ALWAYS_INLINE static RefPtr<AtomicStringImpl> add(const char* s, unsigned length) { return add(reinterpret_cast<const LChar*>(s), length); };
> +    WTF_EXPORT_STRING_API static Ref<AtomicStringImpl> add(const UChar*, unsigned length, unsigned existingHash);
> +    WTF_EXPORT_STRING_API static RefPtr<AtomicStringImpl> add(const UChar*);
> +    WTF_EXPORT_STRING_API static RefPtr<AtomicStringImpl> add(StringImpl*, unsigned offset, unsigned length);

Any of these that can’t return null should be returning Ref, not RefPtr. I guess maybe they already are.

> Source/WTF/wtf/text/AtomicStringImpl.h:67
> +    WTF_EXPORT_STRING_API static Ref<AtomicStringImpl> addFromLiteralData(const char* characters, unsigned length);

I’m not sure “add from” is the best naming here. Maybe just addLiteralData. But I’m also not sure what “literal data” is. I think maybe addLiteral is the best name.

> Source/WTF/wtf/text/AtomicStringImpl.h:70
> +    // AtomicStringImpl::addFromUTF8 will return a nullptr if
> +    // the input data contains invalid UTF-8 sequences.

Don’t need to say the name of the function in the comment:

    // Returns null if the input data contains an invalid UTF-8 sequence.

> Source/WTF/wtf/text/AtomicStringImpl.h:71
> +    WTF_EXPORT_STRING_API static RefPtr<AtomicStringImpl> addFromUTF8(const char* start, const char* end);

This should be addUTF8, because from doesn’t really make sense. I don’t understand why we use a start and end pointer here, though!

> Source/WTF/wtf/text/AtomicStringImpl.h:80
> +            return static_cast<AtomicStringImpl*>(string);

This should just be:

    return nullptr;

> Source/WTF/wtf/text/StringImpl.h:333
> +    // To allow generating the destructor in the base classes, we expose the destructor as protected.

Don’t need that comment. That’s what protected means.

Should move this down because we try to organize our classes with public first, then protected, then private, without interleaving them.

> Source/WTF/wtf/text/WTFString.h:119
>      String(Ref<StringImpl>&& impl) : m_impl(WTF::move(impl)) { }
>      String(RefPtr<StringImpl>&& impl) : m_impl(impl) { }

Doesn’t make sense that one of these uses WTF::move and the other does not!

> Source/WTF/wtf/text/WTFString.h:121
> +    String(PassRefPtr<AtomicStringImpl> impl) : m_impl(impl) { }

Do we really need to add this one? PassRefPtr is deprecated. Can we just fix all the call sites, or are there too many?

> Source/WTF/wtf/text/WTFString.h:123
> +    String(Ref<AtomicStringImpl>&& impl) : m_impl(WTF::move(impl)) { }
> +    String(RefPtr<AtomicStringImpl>&& impl) : m_impl(impl) { }

Doesn’t make sense that one of these uses WTF::move and the other does not!
Comment 7 Yusuke Suzuki 2015-05-19 09:33:57 PDT
(In reply to comment #5)
> (In reply to comment #4)
> > These follow the original code's style. And it causes cpplint's warnings.
> > Is it better to change it to the following?
> > AtomicString(const LChar* s)
> >     : m_string(AtomicStringImpl::add(s))
> > {
> > }
> 
> I like that style and yes that’s how I would do it. But, better, I would not
> put the function definition into the class definition. I would declare it in
> the class and define it after the class. The class definition itself is more
> readable if it’s not full of function bodies.

OK. So following the style, declaring the prototype in class body and define it with `inline` in the header.
Comment 8 Yusuke Suzuki 2015-05-19 11:18:26 PDT
Comment on attachment 253293 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=253293&action=review

Thank you for your review!

>> Source/JavaScriptCore/runtime/TypeSet.cpp:355
>> +        String property = String(iter->get());
> 
> I don’t understand this change.

This is due to ambiguity of the following things.
     String(RefPtr<AtomicStringImpl>&& impl) : m_impl(impl) { }
     String(PassRefPtr<AtomicStringImpl> impl) : m_impl(impl) { }
     String(PassRefPtr<StringImpl> impl) : m_impl(impl) { }

OK, so in this time, I've just changed it to `**iter` to extract StringImpl& reference.

>> Source/WTF/wtf/text/AtomicString.cpp:108
>> +    if (RefPtr<AtomicStringImpl> impl = AtomicStringImpl::addFromUTF8(charactersStart, charactersEnd)) {
> 
> Should use auto here instead of writing out the RefPtr type.

Make sense! Changed.

>> Source/WTF/wtf/text/AtomicString.cpp:110
>> +        return atomicString;
> 
> I don’t understand the reason for the local variable here. I would write:
> 
>     return impl.get();
> 
> Or if that doesn’t work:
> 
>     return AtomicString { impl.get() };

`return impl.get();` works. Fixed.

>> Source/WTF/wtf/text/AtomicString.cpp:112
>> +    return nullAtom;
> 
> I prefer early return for the error case, although that gets rid of the scoping.

OK, rewrite it to

auto impl = ...
if (!impl)
    return nullAtom;
return impl.get();

>>> Source/WTF/wtf/text/AtomicString.h:50
>>> +    AtomicString(const UChar* s) : m_string(AtomicStringImpl::add(s)) { }
>> 
>> These follow the original code's style. And it causes cpplint's warnings.
>> Is it better to change it to the following?
>> AtomicString(const LChar* s)
>>     : m_string(AtomicStringImpl::add(s))
>> {
>> }
> 
> I like that style and yes that’s how I would do it. But, better, I would not put the function definition into the class definition. I would declare it in the class and define it after the class. The class definition itself is more readable if it’s not full of function bodies.

Fixed :D

>> Source/WTF/wtf/text/AtomicStringImpl.h:33
>> +    AtomicStringImpl() = delete;
> 
> I suggest putting this down in the private section below.

Fixed.

>> Source/WTF/wtf/text/AtomicStringImpl.h:39
>> +    }
> 
> The verb "look up" is two words. So the function name should be lookUp. Not "lookup", which is a noun, not a verb. Should return a reference, not a pointer.

Ah, but, when the AtomicStringImpl's table doesn't contain AtomicStringImpl* for the given string, it returns nullptr.
Changed `lookup` to `lookUp`.

>> Source/WTF/wtf/text/AtomicStringImpl.h:43
>> +    }
> 
> The verb "look up" is two words. So the function name should be lookUp. Not "lookup", which is a noun, not a verb. Should return a reference, not a pointer.

Fixed the name.

>> Source/WTF/wtf/text/AtomicStringImpl.h:60
>> +    WTF_EXPORT_STRING_API static RefPtr<AtomicStringImpl> add(StringImpl*, unsigned offset, unsigned length);
> 
> Any of these that can’t return null should be returning Ref, not RefPtr. I guess maybe they already are.

Seeing the implementation, when the given char* (const UChar* etc.) is null, it returns nullptr.
Some (like `add(const UChar*, unsigned length, unsigned existingHash);`) returns Ref, because they require the given pointer is not nullptr. (since there's existingHash value, it is not nullptr in the used context I guess.)

>> Source/WTF/wtf/text/AtomicStringImpl.h:67
>> +    WTF_EXPORT_STRING_API static Ref<AtomicStringImpl> addFromLiteralData(const char* characters, unsigned length);
> 
> I’m not sure “add from” is the best naming here. Maybe just addLiteralData. But I’m also not sure what “literal data” is. I think maybe addLiteral is the best name.

Changed to `addLiteral`.

>> Source/WTF/wtf/text/AtomicStringImpl.h:70
>> +    // the input data contains invalid UTF-8 sequences.
> 
> Don’t need to say the name of the function in the comment:
> 
>     // Returns null if the input data contains an invalid UTF-8 sequence.

Fixed.

>> Source/WTF/wtf/text/AtomicStringImpl.h:71
>> +    WTF_EXPORT_STRING_API static RefPtr<AtomicStringImpl> addFromUTF8(const char* start, const char* end);
> 
> This should be addUTF8, because from doesn’t really make sense. I don’t understand why we use a start and end pointer here, though!

I've changed the API name to addUTF8.
Haha, I guess that the given UTF8 is user-input; it may contain \0 and to avoid performing strlen(str) accidentally, we manage the range with pointers.
But it's my guess.

>> Source/WTF/wtf/text/AtomicStringImpl.h:80
>> +            return static_cast<AtomicStringImpl*>(string);
> 
> This should just be:
> 
>     return nullptr;

Fixed.

>> Source/WTF/wtf/text/StringImpl.h:333
>> +    // To allow generating the destructor in the base classes, we expose the destructor as protected.
> 
> Don’t need that comment. That’s what protected means.
> 
> Should move this down because we try to organize our classes with public first, then protected, then private, without interleaving them.

OK, removed the comment and moved it down.

>> Source/WTF/wtf/text/WTFString.h:119
>>      String(RefPtr<StringImpl>&& impl) : m_impl(impl) { }
> 
> Doesn’t make sense that one of these uses WTF::move and the other does not!

OK, use WTF::move.

>> Source/WTF/wtf/text/WTFString.h:121
>> +    String(PassRefPtr<AtomicStringImpl> impl) : m_impl(impl) { }
> 
> Do we really need to add this one? PassRefPtr is deprecated. Can we just fix all the call sites, or are there too many?

At least, dropping this can be done!

>> Source/WTF/wtf/text/WTFString.h:123
>> +    String(RefPtr<AtomicStringImpl>&& impl) : m_impl(impl) { }
> 
> Doesn’t make sense that one of these uses WTF::move and the other does not!

Fixed.
Comment 9 Yusuke Suzuki 2015-05-19 11:22:46 PDT
Comment on attachment 253293 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=253293&action=review

>>> Source/JavaScriptCore/runtime/TypeSet.cpp:355
>>> +        String property = String(iter->get());
>> 
>> I don’t understand this change.
> 
> This is due to ambiguity of the following things.
>      String(RefPtr<AtomicStringImpl>&& impl) : m_impl(impl) { }
>      String(PassRefPtr<AtomicStringImpl> impl) : m_impl(impl) { }
>      String(PassRefPtr<StringImpl> impl) : m_impl(impl) { }
> 
> OK, so in this time, I've just changed it to `**iter` to extract StringImpl& reference.

Ah, it may be null, so use `String property = iter->get();`
Comment 10 Darin Adler 2015-05-19 13:39:33 PDT
Comment on attachment 253293 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=253293&action=review

>>> Source/WTF/wtf/text/AtomicStringImpl.h:71
>>> +    WTF_EXPORT_STRING_API static RefPtr<AtomicStringImpl> addFromUTF8(const char* start, const char* end);
>> 
>> This should be addUTF8, because from doesn’t really make sense. I don’t understand why we use a start and end pointer here, though!
> 
> I've changed the API name to addUTF8.
> Haha, I guess that the given UTF8 is user-input; it may contain \0 and to avoid performing strlen(str) accidentally, we manage the range with pointers.
> But it's my guess.

I was wondering more about why two pointers rather than start and length. We almost always use pointer and length for this kind of thing, as in addFromLiteralData above.
Comment 11 Yusuke Suzuki 2015-05-19 19:01:29 PDT
Comment on attachment 253293 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=253293&action=review

>>>> Source/WTF/wtf/text/AtomicStringImpl.h:71
>>>> +    WTF_EXPORT_STRING_API static RefPtr<AtomicStringImpl> addFromUTF8(const char* start, const char* end);
>>> 
>>> This should be addUTF8, because from doesn’t really make sense. I don’t understand why we use a start and end pointer here, though!
>> 
>> I've changed the API name to addUTF8.
>> Haha, I guess that the given UTF8 is user-input; it may contain \0 and to avoid performing strlen(str) accidentally, we manage the range with pointers.
>> But it's my guess.
> 
> I was wondering more about why two pointers rather than start and length. We almost always use pointer and length for this kind of thing, as in addFromLiteralData above.

Ah, seeing the caller side of this function, I've found the edge case.
In AtomicString::fromUTF8(const char* characters), we call it with the form,

fromUTF8Internal(characters, 0);

Here, 0 means nullptr.
When the end is nullptr, this `addUTF8` works specially.
(`calculateStringHashAndLengthFromUTF8MaskingTop8Bits` handles this)

This is a little bit error prone design. In the meantime, I'll fix it in the separate patch.
Comment 12 Yusuke Suzuki 2015-05-19 21:19:25 PDT
Committed r184612: <http://trac.webkit.org/changeset/184612>