Bug 44132

Summary: Implement virtual path utilities for FileSystem API
Product: WebKit Reporter: Kinuko Yasuda <kinuko>
Component: WebCore Misc.Assignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: dumi, ericu, fishd, michaeln
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: OS X 10.5   
Bug Depends on:    
Bug Blocks: 42903    
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch
none
Patch dumi: review+

Description Kinuko Yasuda 2010-08-17 14:43:12 PDT
Add file path utilities for HTML5 FileSystem API.
We need some utility class that takes care of path sanitization (cleaning up relative references), concatenation and etc for virtual filesystem paths.
Comment 1 Kinuko Yasuda 2010-08-17 15:11:43 PDT
Created attachment 64643 [details]
Patch
Comment 2 Kinuko Yasuda 2010-08-18 13:50:28 PDT
Created attachment 64767 [details]
Patch

Simplified the patch.
Comment 3 Kinuko Yasuda 2010-08-18 22:54:39 PDT
Comment on attachment 64767 [details]
Patch

(seems like it'll need more logic)
Comment 4 Kinuko Yasuda 2010-08-23 11:49:24 PDT
Created attachment 65145 [details]
Patch
Comment 5 Eric U. 2010-08-23 19:52:45 PDT
Comment on attachment 65145 [details]
Patch

WebCore/storage/DOMFilePath.cpp:43
 +  const char* DOMFilePath::root = "/";
const char DOMFilePath::root[] = "/"; ?

WebCore/storage/DOMFilePath.cpp:73
 +      return path;
This seems odd, when compared to getName.  Is returning path really the right thing to do in each case?

WebCore/storage/DOMFilePath.cpp:78
 +      return mayBeChild.startsWith(path);
/foo/bar is not the parent of /foo/barrister.

WebCore/storage/DOMFilePath.cpp:115
 +      const char* p = utf8String.data();
What happens here if the UTF-8 representation of a character in the path doesn't fit in a char?

WebCore/storage/DOMFilePath.cpp:125
 +      if (unallowedNamesRegExp1.match(path) >= 0)
Is this match case-sensitive?  It needs to be case-insensitive on all of these.

WebCore/storage/DOMFilePath.cpp:122
 +      DEFINE_STATIC_LOCAL(RegularExpression, unallowedNamesRegExp1, ("/(CON|PRN|AUX|NUL)([\\./]|$)", TextCaseInsensitive));
I've never used RegularExpressions from WebKit internally before.  Is that last '/' in the right place?

WebCore/storage/DOMFilePath.cpp:131
 +      DEFINE_STATIC_LOCAL(RegularExpression, endingRegExp, ("[\\. ](/|$)", TextCaseInsensitive));
What about other kinds of whitespace [tab, vtab, newline, cr, ...]

WebCore/storage/DOMFilePath.cpp:138
 +      DEFINE_STATIC_LOCAL(RegularExpression, unallowedCharsRegExp, ("[\\\\<><>:\\?\\*\"|]", TextCaseInsensitive));
You've got "<>" in there twice.

WebCore/storage/DOMFilePath.cpp:108
 +  bool DOMFilePath::isValid(const String& path)
Perhaps "isValidPath" instead of just "isValid"?  One might call this by accident for a name and let something containing a directory separator slip through.

WebCore/storage/DOMFilePath.h:44
 +      static const char* root;
How about static const char root[]?

WebCore/storage/DOMFilePath.cpp:74
 +  }
I think it might be better to return an empty string, or perhaps make sure not to be called in that case.  What's the directory of "foo"?  I suppose it could be ".".

WebCore/storage/DOMFilePath.h:52
 +      // Checks if a given path is a parent of mayBeChild. This method assumes given paths are absolute and do not have extra parent references.
Explain what a parent reference is.  Do you mean "../"?

WebCore/storage/DOMFilePath.h:55
 +      // Appends the separator at the end of the path.
...if it's not there already.

WebCore/storage/DOMFilePath.h:71
 +      // Removes extra parent ("..") references. This doesn't take all the references if it goes beyond the root directory.
It does more than that.  How about something like this?

// Evaluates all "../" and "./" segments.  Note that "/../" expands to "/", so you can't ever refer to anything above the root directory.
Comment 6 Kinuko Yasuda 2010-08-23 23:08:25 PDT
Created attachment 65211 [details]
Patch
Comment 7 Kinuko Yasuda 2010-08-23 23:11:38 PDT
Thanks for your careful review!

(In reply to comment #5)
> (From update of attachment 65145 [details])
> WebCore/storage/DOMFilePath.cpp:43
>  +  const char* DOMFilePath::root = "/";
> const char DOMFilePath::root[] = "/"; ?

Fixed.

> WebCore/storage/DOMFilePath.cpp:73
>  +      return path;
> This seems odd, when compared to getName.  Is returning path really the right thing to do in each case?

I wasn't sure what to return, but I changed this to return an empty string.  As you say probably "." or an empty string would be the expected value.

> WebCore/storage/DOMFilePath.cpp:78
>  +      return mayBeChild.startsWith(path);
> /foo/bar is not the parent of /foo/barrister.

Oh of course not... fixed.

> WebCore/storage/DOMFilePath.cpp:115
>  +      const char* p = utf8String.data();
> What happens here if the UTF-8 representation of a character in the path doesn't fit in a char?

Fixed.

> WebCore/storage/DOMFilePath.cpp:125
>  +      if (unallowedNamesRegExp1.match(path) >= 0)
> Is this match case-sensitive?  It needs to be case-insensitive on all of these.

All of RegExps are case-insensitive (CaseInsensitive).

> WebCore/storage/DOMFilePath.cpp:122
>  +      DEFINE_STATIC_LOCAL(RegularExpression, unallowedNamesRegExp1, ("/(CON|PRN|AUX|NUL)([\\./]|$)", TextCaseInsensitive));
> I've never used RegularExpressions from WebKit internally before.  Is that last '/' in the right place?

It's in the place I intended and my intension was: it should match if CON, PRN, AUX or NUL is followed by a period (\\.), slash (/) or the end of string ($).  Does that sound right?

> WebCore/storage/DOMFilePath.cpp:131
>  +      DEFINE_STATIC_LOCAL(RegularExpression, endingRegExp, ("[\\. ](/|$)", TextCaseInsensitive));
> What about other kinds of whitespace [tab, vtab, newline, cr, ...]

Replaced ' ' with '\s'.

> WebCore/storage/DOMFilePath.cpp:138
>  +      DEFINE_STATIC_LOCAL(RegularExpression, unallowedCharsRegExp, ("[\\\\<><>:\\?\\*\"|]", TextCaseInsensitive));
> You've got "<>" in there twice.

Fixed.

> WebCore/storage/DOMFilePath.cpp:108
>  +  bool DOMFilePath::isValid(const String& path)
> Perhaps "isValidPath" instead of just "isValid"?  One might call this by accident for a name and let something containing a directory separator slip through.

Fixed.

> WebCore/storage/DOMFilePath.h:44
>  +      static const char* root;
> How about static const char root[]?

Fixed.

> WebCore/storage/DOMFilePath.h:52
>  +      // Checks if a given path is a parent of mayBeChild. This method assumes given paths are absolute and do not have extra parent references.
> Explain what a parent reference is.  Do you mean "../"?

Yes I meant a reference to a parent, i.e. "..".  Changed the expression a bit.

> WebCore/storage/DOMFilePath.h:55
>  +      // Appends the separator at the end of the path.
> ...if it's not there already.

Fixed.

> WebCore/storage/DOMFilePath.h:71
>  +      // Removes extra parent ("..") references. This doesn't take all the references if it goes beyond the root directory.
> It does more than that.  How about something like this?
>
> // Evaluates all "../" and "./" segments.  Note that "/../" expands to "/", so you can't ever refer to anything above the root directory.

Changed the comment (and its behavior) as suggested.
Comment 8 Kinuko Yasuda 2010-08-23 23:44:06 PDT
> > WebCore/storage/DOMFilePath.cpp:125
> >  +      if (unallowedNamesRegExp1.match(path) >= 0)
> > Is this match case-sensitive?  It needs to be case-insensitive on all of these.
> 
> All of RegExps are case-insensitive (CaseInsensitive).

I meant... all of the regexps used here are declared with CaseInsensitive for case-insensitive match.
Comment 9 Eric U. 2010-08-24 11:02:40 PDT
Comment on attachment 65211 [details]
Patch

WebCore/storage/DOMFilePath.cpp:124
 +      DEFINE_STATIC_LOCAL(RegularExpression, unallowedNamesRegExp1, ("/(CON|PRN|AUX|NUL)([\\./]|$)", TextCaseInsensitive));
Ah, I think I see how this is supposed to work.  However, you're requiring a leading '/' before matching for e.g. "CON".  If this isn't an absolute path, that's not safe.  "CON" and "CON/safe/rest/of/path" will pass the test.  Or you could assert and document that the input path must be absolute.
Comment 10 Kinuko Yasuda 2010-08-24 11:12:33 PDT
Created attachment 65298 [details]
Patch
Comment 11 Kinuko Yasuda 2010-08-24 11:13:46 PDT
(In reply to comment #9)
> (From update of attachment 65211 [details])
> WebCore/storage/DOMFilePath.cpp:124
>  +      DEFINE_STATIC_LOCAL(RegularExpression, unallowedNamesRegExp1, ("/(CON|PRN|AUX|NUL)([\\./]|$)", TextCaseInsensitive));
> Ah, I think I see how this is supposed to work.  However, you're requiring a leading '/' before matching for e.g. "CON".  If this isn't an absolute path, that's not safe.  "CON" and "CON/safe/rest/of/path" will pass the test.  Or you could assert and document that the input path must be absolute.

Right, fixed.
Comment 12 Eric U. 2010-08-24 16:25:03 PDT
(In reply to comment #11)
> (In reply to comment #9)
> > (From update of attachment 65211 [details] [details])
> > WebCore/storage/DOMFilePath.cpp:124
> >  +      DEFINE_STATIC_LOCAL(RegularExpression, unallowedNamesRegExp1, ("/(CON|PRN|AUX|NUL)([\\./]|$)", TextCaseInsensitive));
> > Ah, I think I see how this is supposed to work.  However, you're requiring a leading '/' before matching for e.g. "CON".  If this isn't an absolute path, that's not safe.  "CON" and "CON/safe/rest/of/path" will pass the test.  Or you could assert and document that the input path must be absolute.
> 
> Right, fixed.

OK, LGTM.  Now you need a reviewer ;'>.
Comment 13 Dumitru Daniliuc 2010-08-24 16:32:57 PDT
Comment on attachment 65298 [details]
Patch

r=me. LGTM + i trust eric's review.
Comment 14 Kinuko Yasuda 2010-08-24 21:31:38 PDT
Committed r65975: <http://trac.webkit.org/changeset/65975>