Bug 60279 - Modify WTF::String and UString to use a more safe API for lookup functions.
Summary: Modify WTF::String and UString to use a more safe API for lookup functions.
Status: ASSIGNED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC Linux
: P2 Normal
Assignee: Alexis Menard (darktears)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-05-05 10:55 PDT by Alexis Menard (darktears)
Modified: 2011-05-05 12:58 PDT (History)
5 users (show)

See Also:


Attachments
Patch (89.70 KB, patch)
2011-05-05 11:11 PDT, Alexis Menard (darktears)
darin: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alexis Menard (darktears) 2011-05-05 10:55:20 PDT
Modify WTF::String and UString to use a more safe API for lookup functions.
Comment 1 Alexis Menard (darktears) 2011-05-05 11:11:28 PDT
Created attachment 92434 [details]
Patch
Comment 2 Alexis Menard (darktears) 2011-05-05 11:17:05 PDT
I'm seeking comments here...
Comment 3 Darin Adler 2011-05-05 12:44:28 PDT
Comment on attachment 92434 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=92434&action=review

What’s the real problem being solved here? Can you show me the specific call sites for find that have a problem? I think we may in fact want to move to a 64-bit interface to strings even though we have a 32-bit limit on their length at this time, so changing this function from size_t to uint32_t may be a step in the wrong direction.

> Source/JavaScriptCore/ChangeLog:8
> +        This patch change the API of WTF::String and UString to a more safe one.

Saying this is more safe seems to oversell it.

Anywhere we use notFound rather than String::notFound to check a result will compile correctly and work wrong; this adds a new kind of peril for programmers using the string class where they have to remember to use the correct notFound value.

Changing the index and size types of the string class to be more consistent is a good idea.

> Source/JavaScriptCore/runtime/UString.h:109
> -    size_t find(UChar c, unsigned start = 0) const
> -        { return m_impl ? m_impl->find(c, start) : notFound; }
> -    size_t find(const UString& str, unsigned start = 0) const
> -        { return m_impl ? m_impl->find(str.impl(), start) : notFound; }
> -    size_t find(const char* str, unsigned start = 0) const
> -        { return m_impl ? m_impl->find(str, start) : notFound; }
> +    uint32_t find(UChar c, unsigned start = 0) const
> +        { return m_impl ? m_impl->find(c, start) : String::notFound; }
> +    uint32_t find(const UString& str, unsigned start = 0) const
> +        { return m_impl ? m_impl->find(str.impl(), start) : String::notFound; }
> +    uint32_t find(const char* str, unsigned start = 0) const
> +        { return m_impl ? m_impl->find(str, start) : String::notFound; }
>  
>      // Find the last instance of a single character or string.
> -    size_t reverseFind(UChar c, unsigned start = UINT_MAX) const
> -        { return m_impl ? m_impl->reverseFind(c, start) : notFound; }
> -    size_t reverseFind(const UString& str, unsigned start = UINT_MAX) const
> -        { return m_impl ? m_impl->reverseFind(str.impl(), start) : notFound; }
> +    uint32_t reverseFind(UChar c, unsigned start = UINT_MAX) const
> +        { return m_impl ? m_impl->reverseFind(c, start) : String::notFound; }
> +    uint32_t reverseFind(const UString& str, unsigned start = UINT_MAX) const
> +        { return m_impl ? m_impl->reverseFind(str.impl(), start) : String::notFound; }

I don’t like a patch that makes a single function, find, use two distinct unsigned 32-bit integer types for the argument and result. Specifically, both uint32_t and unsigned.

> Source/JavaScriptCore/wtf/text/StringImpl.cpp:83
> +    // We do not support this length.
> +    if (length == 0xFFFFFFFF)
> +        CRASH();

I don’t think that the 3 create functions you patched here are the only places that can make strings that have lengths == 0xFFFFFFFF.

For example, StringImpl::adopt doesn't go through any of these checks.
Comment 4 Alexis Menard (darktears) 2011-05-05 12:58:40 PDT
(In reply to comment #3)
> (From update of attachment 92434 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=92434&action=review
> 
> What’s the real problem being solved here? Can you show me the specific call sites for find that have a problem? I think we may in fact want to move to a 64-bit interface to strings even though we have a 32-bit limit on their length at this time, so changing this function from size_t to uint32_t may be a step in the wrong direction.

Hi Darin,

Thanks for the feedback.

Yes well if the plan is to move to something that support both, well yes my patch is wrong.

What I'm trying to solve is a warning in StringPrototype.cpp line 465.

int ovector[2] = { matchPos, matchEnd };

matchPos being :

size_t matchPos = source.find(patternString);

and matchEnd being :

int matchLen = patternString.length();

One could say I can change ovector but it's a bigger change than it seems (it will modify RegExpConstructor and RegExp.

> 
> > Source/JavaScriptCore/ChangeLog:8
> > +        This patch change the API of WTF::String and UString to a more safe one.
> 
> Saying this is more safe seems to oversell it.

oops.

> 
> Anywhere we use notFound rather than String::notFound to check a result will compile correctly and work wrong; this adds a new kind of peril for programmers using the string class where they have to remember to use the correct notFound value.

Yes that is the risk. That's why I was looking for comments or better suggestions.

> 
> Changing the index and size types of the string class to be more consistent is a good idea.
> 
> > Source/JavaScriptCore/runtime/UString.h:109
> > -    size_t find(UChar c, unsigned start = 0) const
> > -        { return m_impl ? m_impl->find(c, start) : notFound; }
> > -    size_t find(const UString& str, unsigned start = 0) const
> > -        { return m_impl ? m_impl->find(str.impl(), start) : notFound; }
> > -    size_t find(const char* str, unsigned start = 0) const
> > -        { return m_impl ? m_impl->find(str, start) : notFound; }
> > +    uint32_t find(UChar c, unsigned start = 0) const
> > +        { return m_impl ? m_impl->find(c, start) : String::notFound; }
> > +    uint32_t find(const UString& str, unsigned start = 0) const
> > +        { return m_impl ? m_impl->find(str.impl(), start) : String::notFound; }
> > +    uint32_t find(const char* str, unsigned start = 0) const
> > +        { return m_impl ? m_impl->find(str, start) : String::notFound; }
> >  
> >      // Find the last instance of a single character or string.
> > -    size_t reverseFind(UChar c, unsigned start = UINT_MAX) const
> > -        { return m_impl ? m_impl->reverseFind(c, start) : notFound; }
> > -    size_t reverseFind(const UString& str, unsigned start = UINT_MAX) const
> > -        { return m_impl ? m_impl->reverseFind(str.impl(), start) : notFound; }
> > +    uint32_t reverseFind(UChar c, unsigned start = UINT_MAX) const
> > +        { return m_impl ? m_impl->reverseFind(c, start) : String::notFound; }
> > +    uint32_t reverseFind(const UString& str, unsigned start = UINT_MAX) const
> > +        { return m_impl ? m_impl->reverseFind(str.impl(), start) : String::notFound; }
> 
> I don’t like a patch that makes a single function, find, use two distinct unsigned 32-bit integer types for the argument and result. Specifically, both uint32_t and unsigned.

We can definitively be more consistent here. 

> 
> > Source/JavaScriptCore/wtf/text/StringImpl.cpp:83
> > +    // We do not support this length.
> > +    if (length == 0xFFFFFFFF)
> > +        CRASH();
> 
> I don’t think that the 3 create functions you patched here are the only places that can make strings that have lengths == 0xFFFFFFFF.
> 
> For example, StringImpl::adopt doesn't go through any of these checks.

Thanks. I will look at that.