Bug 35498

Summary: check-webkit-style: Path-specific logic should work when invoking with files syntax
Product: WebKit Reporter: Chris Jerdonek <cjerdonek>
Component: Tools / TestsAssignee: Chris Jerdonek <cjerdonek>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, cjerdonek, eric, hamaji, levin
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
Proposed patch
none
Proposed patch 2
cjerdonek: commit-queue-
Proposed patch 3
cjerdonek: commit-queue-
Proposed patch 4
hamaji: review-, cjerdonek: commit-queue-
Proposed patch 5
cjerdonek: commit-queue-
Proposed patch 6
cjerdonek: commit-queue-
Proposed patch 7
hamaji: review-
Proposed patch 8 hamaji: review+, cjerdonek: commit-queue-

Description Chris Jerdonek 2010-02-28 11:51:25 PST
It looks like in a lot of places where a path is checked for a certain sub-string (e.g. for path-specific filter rules, whether to skip the file, etc), the absolute form of the path is not always used.  This can cause those checks to fail if check-webkit-style is run from a sub-directory.  Adding the following in appropriate places should help resolve this:

            path = os.path.abspath(path)

We want to be careful though to continue *displaying* the path that was provided -- or at least the path relative to the source root.
Comment 1 Chris Jerdonek 2010-03-08 01:06:56 PST
Created attachment 50196 [details]
Proposed patch
Comment 2 Shinichiro Hamaji 2010-03-08 21:27:47 PST
Comment on attachment 50196 [details]
Proposed patch

OK.

If you checkout WebKit repository in a directory whose name is "ForwardingHeaders", build/header_guard will be not checked for all files, right? If so, I guess we may eventually want to modify the absolute path so that it doesn't contain directories where the WebKit repository is put. Anyway, I think this kind of situations are unlikely to happen for now.
Comment 3 Chris Jerdonek 2010-03-08 22:27:19 PST
(In reply to comment #2)

Thanks for reviewing, Shinichiro!

> (From update of attachment 50196 [details])
> If you checkout WebKit repository in a directory whose name is
> "ForwardingHeaders", build/header_guard will be not checked for all files,
> right? If so, I guess we may eventually want to modify the absolute path so
> that it doesn't contain directories where the WebKit repository is put. Anyway,
> I think this kind of situations are unlikely to happen for now.

That's a really good point.  It makes me realize that I don't think I have the right solution.  Actually, I see now that the issue I mention in this report is only an issue when the "file" syntax is used -- as opposed to the patch or --git-commit syntax.  When the latter syntax is used, the path relative to the source root is passed to the checker, making the conversion to absolute paths unnecessary.

To illustrate (executing check-webkit-style from the JavaScriptCore/wtf
directory)--

chris_g4@stonewall:wtf (tot-staging)> check-webkit-style 
JavaScriptCore/wtf/StringExtras.h:25:  Tab found; better to use spaces  [whitespace/tab] [1]
Total errors found: 1 in 1 files
chris_g4@stonewall:wtf (tot-staging)> check-webkit-style StringExtras.h
StringExtras.h:26:  #ifndef header guard has wrong style, please use: StringExtras_h  [build/header_guard] [5]
StringExtras.h:25:  Tab found; better to use spaces  [whitespace/tab] [1]
StringExtras.h:95:  Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons.  [readability/comparison_to_zero] [5]
StringExtras.h:98:  Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons.  [readability/comparison_to_zero] [5]
Total errors found: 4 in 1 files
chris_g4@stonewall:wtf (tot-staging)> 

So I think a better solution might be as follows:

If the file syntax is used, try to find the source root.  If the source root can be found, then pass the paths into the checker relative to the source root, and apply the path-specific rules as usual.  Otherwise, if the source root cannot be found, pass the paths relative to the current working directory, and do not apply any path-specific rules.

With that solution, we will never need to take absolute paths.  It will keep the code simpler.  It will render more nicely (since paths will be displayed relative to the source root).  And it will let people check entire directories, etc. using WebKit's style configuration rules -- even if they aren't part of a patch.

The only drawback is if someone has a copy of the repository that lacks the needed .svn files, etc to allow the source root to be detected.  In that case, checking style wouldn't take the path-specific rules into account.  If we used absolute paths, it would.  However, the correct solution in this case is probably to make the source root detection code smarter -- to find the source root based on file paths rather than on the presence of .svn files, etc.  (I haven't actually looked at the source root detection code.)  And if that can't be done, we could always allow a source root to be passed manually as a flag, but I think the need for that is probably pretty rare.

What do you think?
Comment 4 Shinichiro Hamaji 2010-03-09 00:02:04 PST
> So I think a better solution might be as follows:
> 
> If the file syntax is used, try to find the source root.  If the source root
> can be found, then pass the paths into the checker relative to the source root,
> and apply the path-specific rules as usual.  Otherwise, if the source root
> cannot be found, pass the paths relative to the current working directory, and
> do not apply any path-specific rules.
> 
> With that solution, we will never need to take absolute paths.  It will keep
> the code simpler.  It will render more nicely (since paths will be displayed
> relative to the source root).  And it will let people check entire directories,
> etc. using WebKit's style configuration rules -- even if they aren't part of a
> patch.

This plan sounds promising. I'm not sure if it's easy to convert relative paths into source root relative paths though. Maybe python provides a nice function for this?

> The only drawback is if someone has a copy of the repository that lacks the
> needed .svn files, etc to allow the source root to be detected.  In that case,
> checking style wouldn't take the path-specific rules into account.  If we used
> absolute paths, it would.  However, the correct solution in this case is
> probably to make the source root detection code smarter -- to find the source
> root based on file paths rather than on the presence of .svn files, etc.  (I
> haven't actually looked at the source root detection code.)  And if that can't
> be done, we could always allow a source root to be passed manually as a flag,
> but I think the need for that is probably pretty rare.

Yeah, it seems WebKit's nightly source doesn't contain .svn or .git (http://nightly.webkit.org/). I guess just checking WebKitTools instead of .svn or .git would solve this problem for almost all cases.
Comment 5 Chris Jerdonek 2010-03-09 00:13:20 PST
(In reply to comment #4)
> This plan sounds promising. I'm not sure if it's easy to convert relative paths
> into source root relative paths though. Maybe python provides a nice function
> for this?

I was just working on this.  It turns out they do: os.path.relpath.

http://docs.python.org/library/os.path.html#os.path.relpath

But after trying it, I found out it's new in Python 2.6. :(

There should be alternatives -- perhaps using os.path.commonprefix on the two paths made absolute:

http://docs.python.org/library/os.path.html#os.path.commonprefix
Comment 6 Chris Jerdonek 2010-03-09 00:20:04 PST
(In reply to comment #4)
> > The only drawback is if someone has a copy of the repository that lacks the
> > needed .svn files, etc to allow the source root to be detected.  In that case,
> > checking style wouldn't take the path-specific rules into account.  If we used
> > absolute paths, it would.  However, the correct solution in this case is
> > probably to make the source root detection code smarter -- to find the source
> > root based on file paths rather than on the presence of .svn files, etc.  (I
> > haven't actually looked at the source root detection code.)  And if that can't
> > be done, we could always allow a source root to be passed manually as a flag,
> > but I think the need for that is probably pretty rare.
> 
> Yeah, it seems WebKit's nightly source doesn't contain .svn or .git
> (http://nightly.webkit.org/). I guess just checking WebKitTools instead of .svn
> or .git would solve this problem for almost all cases.

By the way, it turns out that turning off path-specific logic isn't really possible after all (at least turning off all of it).  This is because not all of the path-specific logic is encapsulated by the configuration variables.  Some of it is in cpp.py, for instance.

So rather than trying to turning it part-way off, I'm just going to leave the path-specific logic turned on all the way, but display a warning that there may be unexpected results.  The benefit, though, is that check-webkit-style will still work correctly on the nightlies, provided that check-webkit-style is run from the root of the nightly.  Maybe I will include some advice to that effect in the warning as well.
Comment 7 Chris Jerdonek 2010-03-12 10:52:01 PST
Created attachment 50610 [details]
Proposed patch 2

Added patch with revised logic, as discussed in the comment thread.
Comment 8 Chris Jerdonek 2010-03-12 11:02:58 PST
Created attachment 50612 [details]
Proposed patch 3

Removed some trailing spaces, fixed a hanging indent, and renamed two unit tests: (un)converted -> (un)convertible.
Comment 9 Chris Jerdonek 2010-03-12 11:36:04 PST
Created attachment 50617 [details]
Proposed patch 4

Changed the abs_start_path parameter of _rel_path() to start_path.  This makes the function more similar to os.path.relpath() and harder to use incorrectly.
Comment 10 Shinichiro Hamaji 2010-03-14 11:14:46 PDT
Comment on attachment 50617 [details]
Proposed patch 4

Sorry for the latency... I've been somewhat sick and busy these days.

Overall this change looks pretty good! But I believe we can improve this a bit.

I think it's nice to rename UnitTestLogStream in separate change. Also, I'm not sure why TestLogStream is the better naming.

> -def configure_logging(stream, logger=None):
> +def configure_logging(stream, logger=None, logging_level=logging.INFO):

Why? It seems this change is used in this patch. If my understanding is correct, I'm appreciated to see this change in separated patch.

> +def _rel_path(path, start_path, os_path_abspath=None):
> +    """Return a path relative to the given start path, or None.
> +
> +    Returns None if the path is not contained in the directory start_path.
> +
> +    Args:
> +      path: An absolute or relative path to convert to a relative path.
> +      start_path: The path relative to which the given path should be
> +                  converted.
> +      os_path_abspath: A replacement function for unit testing.
> +                       Defaults to os.path.abspath.
> +
> +    """
> +    if os_path_abspath is None:
> +        os_path_abspath = os.path.abspath
> +
> +    start_path = os_path_abspath(start_path)
> +    path = os_path_abspath(path)
> +
> +    if not path.lower().startswith(start_path.lower()):
> +        # Then path is outside the directory given by start_path.
> +        return None
> +
> +    path = path[len(start_path):]
> +    # Trim off any leading slash from path in case start_path
> +    # does not end in a slash.
> +    path = path.lstrip("/\\")
> +
> +    return path

What happens if a user call this function like

_rel_path("/tmp/foobar", "/tmp/foo")

? I guess the return value will be "bar". I guess it's better to check if path.lstrip actually modified the string. Otherwise, this function should return None, I think.

> +def handle_scm(paths=None, mock_detect_scm=None, mock_os=None):

Why cannot we just do mock_os=os ? I know default_value=[] is bad, but I think it's OK to assign os module here. Sometimes I've seen this idiom in your patch, so I should have asked earlier.

> +        _log.warn("Source root not detected: path-specific logic may yield "
> +                  "unexpected results.\n"
> +                  " If you are running this script from a known source "
> +                  "root, ignore this warning.")

This message is for end-users, right? I guess this message isn't super helpful for end-users. I would say "WebKit source root" or something instead of "source root". Also, I would provide an example of unexpected results.

> +            if rel_path is None:
> +                # Then the path is outside the source tree.  Since all
> +                # paths should be interpreted relative to the same root,
> +                # do not interpret any of the paths as relative to the
> +                # source root.  Interpret all of them relative to the
> +                # current working directory, and do not change the current
> +                # working directory.
> +                _log.warn("One of the given files is outside the source "
> +                          "tree: path-specific logic may yield unexpected "
> +                          "results.\n Ensure that all files are within "
> +                          "the source tree to achieve expected results.")
> +                return (paths, scm)

I'd add the value of path into the warning message.

> +    def test_rel_path(self):
> +        rel_path = self._rel_path("/WebKit/test/Foo.txt", "/WebKit")
> +        self.assertEquals(rel_path, "test/Foo.txt")
> +
> +        rel_path = self._rel_path("/WebKit/test/Foo.txt", "/WebKit/")
> +        self.assertEquals(rel_path, "test/Foo.txt")
> +
> +        rel_path = self._rel_path(r"c:\WebKit\test\Foo.txt", r"c:\WebKit")
> +        self.assertEquals(rel_path, r"test\Foo.txt")
> +
> +        rel_path = self._rel_path(r"c:\WebKit\test\Foo.txt", "c:\\WebKit\\")
> +        self.assertEquals(rel_path, r"test\Foo.txt")

I'd check //WebKit//test//Foo.txt, too. I'd also check self._rel_path("/WebKitTools", "/WebKit") .
Comment 11 Chris Jerdonek 2010-03-14 12:51:48 PDT
(In reply to comment #10)
> (From update of attachment 50617 [details])
> Overall this change looks pretty good! But I believe we can improve this a bit.

Thanks a lot for the review and excellent comments!

> I think it's nice to rename UnitTestLogStream in separate change. Also, I'm not
> sure why TestLogStream is the better naming.

Okay, I'll do that separately.  I thought TestLogStream is preferable because (1) it is more concise, and (2) it is more consistent with the rest of the nomenclature in the module, which drops the word "unit" (logtesting module and LogTesting class rather than logunittesting and LogUnitTesting).

> > -def configure_logging(stream, logger=None):
> > +def configure_logging(stream, logger=None, logging_level=logging.INFO):
> 
> Why? It seems this change is used in this patch. If my understanding is
> correct, I'm appreciated to see this change in separated patch.

I think you mean "not" used in this patch?  Well, yes and no.  :)  I was using it when debugging by setting logging_level to logging.DEBUG in the call to configure_calling().  There is a FIXME in checker.py to add to configure_logging() support for more verbose logging, which is not quite implemented yet.

> What happens if a user call this function like
> 
> _rel_path("/tmp/foobar", "/tmp/foo")
> 
> ? I guess the return value will be "bar". I guess it's better to check if
> path.lstrip actually modified the string. Otherwise, this function should
> return None, I think.

Good catch!  Yes, and then in that case I will also need to rstrip() trailing slashes from start_path prior to removing from path.

> > +def handle_scm(paths=None, mock_detect_scm=None, mock_os=None):
> 
> Why cannot we just do mock_os=os ? I know default_value=[] is bad, but I think
> it's OK to assign os module here. Sometimes I've seen this idiom in your patch,
> so I should have asked earlier.

I got into this practice after reading Google's Python style guide:

http://google-styleguide.googlecode.com/svn/trunk/pyguide.html#Default_Argument_Values

where it says, "Do not use mutable objects as default values in the function or method definition."

And good suggestions on the warning messages.
Comment 12 Chris Jerdonek 2010-03-14 13:46:18 PDT
(In reply to comment #11)
> Okay, I'll do that separately.  I thought TestLogStream is preferable because
> (1) it is more concise, and (2) it is more consistent with the rest of the
> nomenclature in the module, which drops the word "unit" (logtesting module and
> LogTesting class rather than logunittesting and LogUnitTesting).

Created a report for this here:

https://bugs.webkit.org/show_bug.cgi?id=36099
Comment 13 Chris Jerdonek 2010-03-14 13:49:34 PDT
(In reply to comment #11)
> There is a FIXME in checker.py to add to
> configure_logging() support for more verbose logging, which is not quite
> implemented yet.

Also created a report this here:

https://bugs.webkit.org/show_bug.cgi?id=36100
Comment 14 Shinichiro Hamaji 2010-03-15 00:53:06 PDT
Thanks for trying to splitting the patch.

> I think you mean "not" used in this patch?

Yes.

> Well, yes and no.  :)  I was using
> it when debugging by setting logging_level to logging.DEBUG in the call to
> configure_calling().  There is a FIXME in checker.py to add to
> configure_logging() support for more verbose logging, which is not quite
> implemented yet.

So, I think it's better to add the parameter when it is actually used.

> I got into this practice after reading Google's Python style guide:
> 
> http://google-styleguide.googlecode.com/svn/trunk/pyguide.html#Default_Argument_Values
> 
> where it says, "Do not use mutable objects as default values in the function or
> method definition."

I believe this sentence is vague but trying to warn of code like following:

def add_to_list(element, array=[]):
  array.append(element)
  return array

print add_to_list(1)  # [1]
print add_to_list(2)  # [1, 2]... wtf!?

As no one will modify os or os.path, I think we can think they are immutable. However, I think the current code is also OK if you like it. It's verbose but consistent in all cases.
Comment 15 Eric Seidel (no email) 2010-03-15 15:54:55 PDT
Comment on attachment 50196 [details]
Proposed patch

Cleared Shinichiro Hamaji's review+ from obsolete attachment 50196 [details] so that this bug does not appear in http://webkit.org/pending-commit.
Comment 16 Chris Jerdonek 2010-03-21 12:11:32 PDT
Created attachment 51252 [details]
Proposed patch 5

Lots of changes in this patch.

Note that this patch may not apply correctly because it may depend on another patch being landed (one of the patches that was requested I split out: bug 36100).
Comment 17 Chris Jerdonek 2010-03-21 12:18:33 PDT
Created attachment 51253 [details]
Proposed patch 6

Re-attaching -- that wasn't the newest one.
Comment 18 Chris Jerdonek 2010-03-21 12:33:44 PDT
(In reply to comment #14)
> So, I think it's better to add the parameter when it is actually used.

I am taking care of this here--

https://bugs.webkit.org/show_bug.cgi?id=36100

> As no one will modify os or os.path, I think we can think they are immutable.
> However, I think the current code is also OK if you like it. It's verbose but
> consistent in all cases.

I decided to keep it because we never know what future developers will do.  We modify sys, for instance, in check-webkit-style.  Though it is something I don't favor, I could see a developer modifying os or os.path to change the behavior of, for example, os.chdir or os.path.abspath (especially for unit testing).  In this patch we avoid needing to do that by providing parameters to allow mock objects to be substituted.
Comment 19 Chris Jerdonek 2010-03-24 01:00:44 PDT
Created attachment 51482 [details]
Proposed patch 7

This should apply now that the patch for bug 36100 has landed.
Comment 20 Shinichiro Hamaji 2010-03-24 05:03:12 PDT
Comment on attachment 51482 [details]
Proposed patch 7

As usual, this patch looks very good :) r- for a few style nitpicks.

> +        _log.warn("WebKit checkout root not found:\n"
> +                  "  Path-dependent style checks may not work correctly.\n"
> +                  "  See the help documentation for more info.")
> +
> +        return paths
> +
> +    if paths:
> +        # Then try converting all of the paths to paths relative to
> +        # the checkout root.
> +        rel_paths = []
> +        for path in paths:
> +            rel_path = _rel_path(path, checkout_root)
> +            if rel_path is None:
> +                # Then the path is not below the checkout root.  Since all
> +                # paths should be interpreted relative to the same root,
> +                # do not interpret any of the paths as relative to the
> +                # checkout root.  Interpret all of them relative to the
> +                # current working directory, and do not change the current
> +                # working directory.
> +                _log.warn("\
> +Path-dependent style checks may not work correctly:\n\
> +\n\
> +  One of the given paths is outside the WebKit checkout of the current\n\
> +  working directory:\n\
> +\n\
> +    Path: %s\n\
> +    Checkout root: %s\n\
> +\n\
> +  Pass only files below the checkout root to ensure correct results.\n\
> +  See the help documentation for more info.\n"
> +                          % (path, checkout_root))

I'm not sure why we are using this style here. I prefer the style of above _log.error. Or, we may want to use a string literal with three double-quotes?

> +        expected_rel_path = os.path.join("test", "Foo.txt")
> +        path = os.path.join(start_path, expected_rel_path)

I'd create these values by ["test", "Foo.txt"].join("/") so the test case becomes OS independent. Of course, we may want one test case which uses backslashes as the path separator.

Also, I'd check rel_path("WebKit//test//Foo.txt", "WebKit") == "test//Foo.txt" .

> +        actual_rel_path = self._rel_path(path, start_path)
> +        self.assertEquals(expected_rel_path, actual_rel_path)

I'd remove actual_rel_path, but it's OK as is.

> +        rel_path = self._rel_path(path, start_path)
> +        self.assertTrue(rel_path is None)
> +
> +        rel_path = self._rel_path("WebKitTools", "WebKit")
> +        self.assertTrue(rel_path is None)

Same here.

> +        log_messages = ["""WARNING: WebKit checkout root not found:
> +  Path-dependent style checks may not work correctly.
> +  See the help documentation for more info.
> +"""]

> +        log_messages = [
> +"""WARNING: Path-dependent style checks may not work correctly:
> +
> +  One of the given paths is outside the WebKit checkout of the current
> +  working directory:
> +
> +    Path: /outside/foo2.txt
> +    Checkout root: /WebKit
> +
> +  Pass only files below the checkout root to ensure correct results.
> +  See the help documentation for more info.
> +
> +"""]

They look inconsistent, too.

> +        if not found_checkout and not paths:
> +            _log.error("WebKit checkout not found: You must run this script "
> +                       "from inside a WebKit checkout if you are not passing "
> +                       "specific paths to check.")
> +            sys.exit(1)

I'd put this in check-webkit-style and then move it to main.py when we remove FIXME in check-webkit-style you added in this patch. But it's not strong preference and please feel free to go ahead as is.

>      def _parse(self):
>          """Return a default parse() function for testing."""
> -        return self._create_parser().parse
> +        # We will test found_scm False separately.
> +        def parse(args, found_scm=True):
> +            parser = self._create_parser()
> +            return parser.parse(args, found_scm)
> +
> +        return parse

I didn't notice this function is a higher order function when I looked this before, but I'd just define this like

def _parse(self, args, found_scm=True):
    """..."""
    parser = self._create_parser()
    return parser.parse(args, found_scm)

and use it like

parse = self._parse
self.assertRaises(SystemExit, parse, [...])

because this would be simpler.
Comment 21 Shinichiro Hamaji 2010-03-24 05:15:00 PDT
(In reply to comment #18)
> (In reply to comment #14)
> > So, I think it's better to add the parameter when it is actually used.
> 
> I am taking care of this here--
> 
> https://bugs.webkit.org/show_bug.cgi?id=36100

Thanks a lot!

> I decided to keep it because we never know what future developers will do.  We
> modify sys, for instance, in check-webkit-style.  Though it is something I
> don't favor, I could see a developer modifying os or os.path to change the
> behavior of, for example, os.chdir or os.path.abspath (especially for unit
> testing).  In this patch we avoid needing to do that by providing parameters to
> allow mock objects to be substituted.

I see. Though I'm not fully convinced yet (I guess I like shorter code too
much... Or, maybe I just dislike Python's default parameters - if they are
evaluated every time we invoke functions, we didn't need this discussion), I
can understand your opinion. Please go ahead with the current way (assign None
first then assign actual values inside the function body).

Anyway, thanks for the detailed description!
Comment 22 Chris Jerdonek 2010-03-27 12:06:59 PDT
(In reply to comment #20)
> (From update of attachment 51482 [details])
> As usual, this patch looks very good :) r- for a few style nitpicks.

Thanks again for reviewing!  Sorry for the delay in resubmitting.

> > +        _log.warn("WebKit checkout root not found:\n"
> > +                  "  Path-dependent style checks may not work correctly.\n"
> > +                  "  See the help documentation for more info.")
> > +
> > +        return paths
> > +
> > +    if paths:
> > +        # Then try converting all of the paths to paths relative to
> > +        # the checkout root.
> > +        rel_paths = []
> > +        for path in paths:
> > +            rel_path = _rel_path(path, checkout_root)
> > +            if rel_path is None:
> > +                # Then the path is not below the checkout root.  Since all
> > +                # paths should be interpreted relative to the same root,
> > +                # do not interpret any of the paths as relative to the
> > +                # checkout root.  Interpret all of them relative to the
> > +                # current working directory, and do not change the current
> > +                # working directory.
> > +                _log.warn("\
> > +Path-dependent style checks may not work correctly:\n\
> > +\n\
> > +  One of the given paths is outside the WebKit checkout of the current\n\
> > +  working directory:\n\
> > +\n\
> > +    Path: %s\n\
> > +    Checkout root: %s\n\
> > +\n\
> > +  Pass only files below the checkout root to ensure correct results.\n\
> > +  See the help documentation for more info.\n"
> > +                          % (path, checkout_root))
> 
> I'm not sure why we are using this style here. I prefer the style of above
> _log.error. Or, we may want to use a string literal with three double-quotes?

Yes, of course.  Three double-quotes would be better here.  For some reason I was thinking format string weren't allowed with three double quotes.  (Also, I prefer the string literal over the non-literal because in order to see more clearly what it looks like in multi-line format.) 

> > +        expected_rel_path = os.path.join("test", "Foo.txt")
> > +        path = os.path.join(start_path, expected_rel_path)
> 
> I'd create these values by ["test", "Foo.txt"].join("/") so the test case
> becomes OS independent. Of course, we may want one test case which uses
> backslashes as the path separator.

I could be mistaken on the comments I'm making here, but isn't it preferable if we can eliminate the presence of forward and backward slashes from the tests (i.e. eliminate platform-specific path separators and generate all paths using cross-platform methods)?  One reason, for example, is that os.path.abspath() (which _rel_path() uses) won't work for paths with backward slashes when run from a Mac:

http://docs.python.org/library/os.path.html#os.path.abspath
http://docs.python.org/library/os.path.html#os.path.normpath

I guess it wouldn't matter for our tests if we were using mock os.path methods for all of our path processing (which we aren't really).  But if we were, then it doesn't seem like it would help to test both forward and backward slashes, etc. since then we would just be testing the mock methods rather than the code itself.

I think my preference is to use os.path methods everywhere possible rather than literal slashes.  Otherwise, it seems like we would have to explicitly import, for example, ntpath to test the case of backward slashes.

> Also, I'd check rel_path("WebKit//test//Foo.txt", "WebKit") == "test//Foo.txt"

See comment above.

> > +        actual_rel_path = self._rel_path(path, start_path)
> > +        self.assertEquals(expected_rel_path, actual_rel_path)
> 
> I'd remove actual_rel_path, but it's OK as is.

Thanks, you are right.  That was left over from previous.

> 
> > +        rel_path = self._rel_path(path, start_path)
> > +        self.assertTrue(rel_path is None)
> > +
> > +        rel_path = self._rel_path("WebKitTools", "WebKit")
> > +        self.assertTrue(rel_path is None)
> 
> Same here.

?

> > +        log_messages = ["""WARNING: WebKit checkout root not found:
> > +  Path-dependent style checks may not work correctly.
> > +  See the help documentation for more info.
> > +"""]
> 
> > +        log_messages = [
> > +"""WARNING: Path-dependent style checks may not work correctly:
> > +
> > +  One of the given paths is outside the WebKit checkout of the current
> > +  working directory:
> > +
> > +    Path: /outside/foo2.txt
> > +    Checkout root: /WebKit
> > +
> > +  Pass only files below the checkout root to ensure correct results.
> > +  See the help documentation for more info.
> > +
> > +"""]
> 
> They look inconsistent, too.

Will fix.

> > +        if not found_checkout and not paths:
> > +            _log.error("WebKit checkout not found: You must run this script "
> > +                       "from inside a WebKit checkout if you are not passing "
> > +                       "specific paths to check.")
> > +            sys.exit(1)
> 
> I'd put this in check-webkit-style and then move it to main.py when we remove
> FIXME in check-webkit-style you added in this patch. But it's not strong
> preference and please feel free to go ahead as is.

Okay, I'll revisit this later.

> >      def _parse(self):
> >          """Return a default parse() function for testing."""
> > -        return self._create_parser().parse
> > +        # We will test found_scm False separately.
> > +        def parse(args, found_scm=True):
> > +            parser = self._create_parser()
> > +            return parser.parse(args, found_scm)
> > +
> > +        return parse
> 
> I didn't notice this function is a higher order function when I looked this
> before, but I'd just define this like
> 
> def _parse(self, args, found_scm=True):
>     """..."""
>     parser = self._create_parser()
>     return parser.parse(args, found_scm)
> 
> and use it like
> 
> parse = self._parse
> self.assertRaises(SystemExit, parse, [...])
> 
> because this would be simpler.

Yes, thanks.
Comment 23 Chris Jerdonek 2010-03-27 12:13:03 PDT
(In reply to comment #22)
> > Also, I'd check rel_path("WebKit//test//Foo.txt", "WebKit") == "test//Foo.txt"
> 
> See comment above.

To clarify, I'm not sure cases like this help because they are testing the mock os.path.abspath() method (which in our case doesn't do anything) rather than the code itself.  In actual practice, os.path.abspath() (which calls normpath()) collapses consecutive slashes:

http://docs.python.org/library/os.path.html#os.path.normpath
Comment 24 Chris Jerdonek 2010-03-27 12:23:59 PDT
(In reply to comment #22)

> > > +        log_messages = ["""WARNING: WebKit checkout root not found:
> > > +  Path-dependent style checks may not work correctly.
> > > +  See the help documentation for more info.
> > > +"""]
> > 
> > > +        log_messages = [
> > > +"""WARNING: Path-dependent style checks may not work correctly:
> > > +
> > > +  One of the given paths is outside the WebKit checkout of the current
> > > +  working directory:
> > > +
> > > +    Path: /outside/foo2.txt
> > > +    Checkout root: /WebKit
> > > +
> > > +  Pass only files below the checkout root to ensure correct results.
> > > +  See the help documentation for more info.
> > > +
> > > +"""]
> > 
> > They look inconsistent, too.
> 
> Will fix.

FYI, this difference turned out to have been forced by the second case exceeding the 79 character limit (my preference was to start immediately after the bracket by default, if possible).  Maybe the solution is to always begin triple double-quote strings at the beginning of the following line, when used in cases other than doc strings.
Comment 25 Chris Jerdonek 2010-03-27 12:39:52 PDT
Created attachment 51837 [details]
Proposed patch 8

And to think the first patch in this report had an r+! :P
Comment 26 Shinichiro Hamaji 2010-03-27 19:56:10 PDT
Comment on attachment 51837 [details]
Proposed patch 8

> > > +        expected_rel_path = os.path.join("test", "Foo.txt")
> > > +        path = os.path.join(start_path, expected_rel_path)
> > 
> > I'd create these values by ["test", "Foo.txt"].join("/") so the test case
> > becomes OS independent. Of course, we may want one test case which uses
> > backslashes as the path separator.
> 
> I could be mistaken on the comments I'm making here, but isn't it preferable if
> we can eliminate the presence of forward and backward slashes from the tests
> (i.e. eliminate platform-specific path separators and generate all paths using
> cross-platform methods)?  One reason, for example, is that os.path.abspath()
> (which _rel_path() uses) won't work for paths with backward slashes when run
> from a Mac:
> 
> http://docs.python.org/library/os.path.html#os.path.abspath
> http://docs.python.org/library/os.path.html#os.path.normpath
> 
> I guess it wouldn't matter for our tests if we were using mock os.path methods
> for all of our path processing (which we aren't really).  But if we were, then
> it doesn't seem like it would help to test both forward and backward slashes,
> etc. since then we would just be testing the mock methods rather than the code
> itself.
> 
> I think my preference is to use os.path methods everywhere possible rather than
> literal slashes.  Otherwise, it seems like we would have to explicitly import,
> for example, ntpath to test the case of backward slashes.

I see. I didn't notice os.path.abspath depends on os.path.sep. So, now I think using os.path.join would be better. As you are using a slash and a backslash in _rel_path to lstrip the path, I thought testing without os.path.join would be better to check both slashes and backslashes will be lstripped.

> > > +        actual_rel_path = self._rel_path(path, start_path)
> > > +        self.assertEquals(expected_rel_path, actual_rel_path)
> > 
> > I'd remove actual_rel_path, but it's OK as is.
> 
> Thanks, you are right.  That was left over from previous.
> 
> > 
> > > +        rel_path = self._rel_path(path, start_path)
> > > +        self.assertTrue(rel_path is None)
> > > +
> > > +        rel_path = self._rel_path("WebKitTools", "WebKit")
> > > +        self.assertTrue(rel_path is None)
> > 
> > Same here.
> 
> ?

Sorry for this unclear comment. I tried to mean we can remove actual_rel_path and rel_path. Like


self.assertEquals(expected_rel_path, self._rel_path(path, start_path))

or

self.assertTrue(self._rel_path(path, start_path) is None)

But this change isn't important at all. Please go ahead as is if you like the current code.

> FYI, this difference turned out to have been forced by the second case
> exceeding the 79 character limit (my preference was to start immediately after
> the bracket by default, if possible).  Maybe the solution is to always begin
> triple double-quote strings at the beginning of the following line, when used
> in cases other than doc strings.

If you like, you may be able to use

  str = """\
Foo
  bar
  baz
"""

but your code is good as well as is.
Comment 27 Chris Jerdonek 2010-03-27 20:54:01 PDT
(In reply to comment #26)

> I see. I didn't notice os.path.abspath depends on os.path.sep. So, now I think
> using os.path.join would be better. As you are using a slash and a backslash in
> _rel_path to lstrip the path, I thought testing without os.path.join would be
> better to check both slashes and backslashes will be lstripped.

I see.  Good point.  I guess the ideal would be if the method could use os methods to do the lstrip instead of having to reference slash characters explicitly.  Alternatively, we could use os.altsep (only if it is not None on the system) to construct the other test case.
Comment 28 Chris Jerdonek 2010-03-28 00:58:56 PDT
Manually committed (via "git svn dcommit"):

http://trac.webkit.org/changeset/56682
Comment 29 Chris Jerdonek 2010-03-29 09:23:41 PDT
(In reply to comment #27)
> (In reply to comment #26)
> 
> > I see. I didn't notice os.path.abspath depends on os.path.sep. So, now I think
> > using os.path.join would be better. As you are using a slash and a backslash in
> > _rel_path to lstrip the path, I thought testing without os.path.join would be
> > better to check both slashes and backslashes will be lstripped.
> 
> I see.  Good point.  I guess the ideal would be if the method could use os
> methods to do the lstrip instead of having to reference slash characters
> explicitly.  Alternatively, we could use os.altsep (only if it is not None on
> the system) to construct the other test case.

I made a report for this issue here:

https://bugs.webkit.org/show_bug.cgi?id=36759