Support new format for test expectations
Created attachment 147716 [details] Patch
This patch is missing some pieces. e.g. garden-o-matic should be able to generate test expectations using new format, and flakiness dashboard also needs to support new format. I'm intending to update those tools in follow up patches.
Comment on attachment 147716 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=147716&action=review > Tools/ChangeLog:11 > + It replaces BUGWK12345, BUGCR67890, and BUGRNIWA by webkit.org/12345, crbug.com/67890, and Bug(rniwa), webkit.org/12345 <-- not webkit.org/b/12345 ?
(In reply to comment #2) > This patch is missing some pieces. e.g. garden-o-matic should be able to generate test expectations using new format, and flakiness dashboard also needs to support new format. I'm intending to update those tools in follow up patches. I'm not really a fan of breaking these tools for more than a couple hours. People depend on these. I'd rather you break the patch up by changing one bit at a time (e.g. uppercase->camelCase in one patch, BUGWK->urls in another, etc) and keep all the tools working. Also, I think it's just easier to reason about the patch if it's only changing one of those things at a time. There's no benefit to doing this in one big patch.
(In reply to comment #4) > (In reply to comment #2) > > This patch is missing some pieces. e.g. garden-o-matic should be able to generate test expectations using new format, and flakiness dashboard also needs to support new format. I'm intending to update those tools in follow up patches. > > I'm not really a fan of breaking these tools for more than a couple hours. My plan is make this change, fix those tools, and then convert test expectations. i.e. I don't intend on converting test expectations until I patch other tools. I wanted to land this patch first so that people can experiment with new format. > People depend on these. I'd rather you break the patch up by changing one bit at a time (e.g. uppercase->camelCase in one patch, BUGWK->urls in another, etc) and keep all the tools working. I don't think that's realistic because things like flakiness dashboard can't be updated synchronously. > Also, I think it's just easier to reason about the patch if it's only changing one of those things at a time. There's no benefit to doing this in one big patch. There is a big benefit that tools will be broken at most once, not 4-5 times.
Comment on attachment 147716 [details] Patch Thanks for working on this! Note that this probably duplicates bug 86796, so we should close one or the other (probably that one). (In reply to comment #2) > This patch is missing some pieces. e.g. garden-o-matic should be able to generate test expectations using new format, and flakiness dashboard also needs to support new format. I'm intending to update those tools in follow up patches. I don't think the flakiness dashboard actually looks at the test expectations syntax. We do need to make garden-o-matic work as part of this; it should be easy to do. On the other hand, I don't think I agree with Ojan that we should change the syntax in steps. That will just be really confusing over the course of the transition and I shudder to think of the merge conflicts that might arise. I'm a big-bang fan. Go bigger! (Actual patch comments): View in context: https://bugs.webkit.org/attachment.cgi?id=147716&action=review > Tools/ChangeLog:17 > + Finally, comments start with # instead of //. I don't recall agreeing on changing the comment delimiter, but whatever :) > Tools/Scripts/webkitpy/layout_tests/run_webkit_tests_integrationtest.py:261 > + port_obj.expectations_dict = lambda: {'': 'a # syntax error'} nit: I'd probably get rid of the '#' here since it might be confusing. It was not the intent of this test for '# syntax error' to represent a comment in any way. > Tools/Scripts/webkitpy/layout_tests/models/test_expectations.py:271 > + elif modifier.lower().startswith('bug') or '/' in modifier: just checking if '/' is in the modifier is probably a bit fast-and-loose for me, since it would allow 'foo/bar' to be legal. > Tools/Scripts/webkitpy/layout_tests/models/test_expectations.py:339 > + I would rather we not try to match this all as a single regex, since it looks like gibberish to me :) Can we split this into old-style match and new-style match and figure out which one to use? "description" does not seem like a great name to me; I'd probably use 'bugid' to be consistent with the variable names. 'platforms' is not the right name, since it can include things like 'release' and 'debug' and presumably things like 'qt-4.8' and 'wk2' and gpu/cpu. Also, you need to update the docstring in _tokenize() Lastly, since this is line-by-line, I'm guessing it will allow you to mix the two syntaxes in the file. I realize this is just a transitional thing, but I'm not sure if I like that. In particular it's hard for me to judge the correctness of this and it's hard to tell what the post-transition code would look like. Maybe it would help to post a version of the code that will only parse the new syntax? > Tools/Scripts/webkitpy/layout_tests/models/test_expectations.py:389 > + # FIXME: Should we support rebaseline modifier at all? Yes, we still need to support REBASELINE, but I've been saying it would move to the expectations (and no longer be a modifier) > Tools/Scripts/webkitpy/layout_tests/models/test_expectations.py:397 > + expectation_line.warnings.append("WontFix should not be used with any other expectation") I don't think we need to enforce this; I seem to recall arguing that skip and wontfix should be handled identically.
(In reply to comment #6) > (From update of attachment 147716 [details]) > Thanks for working on this! Note that this probably duplicates bug 86796, so we should close one or the other (probably that one). I think we can make this bug a blocker instead because we'll need to fix garden-o-matic as well. > (In reply to comment #2) > > This patch is missing some pieces. e.g. garden-o-matic should be able to generate test expectations using new format, and flakiness dashboard also needs to support new format. I'm intending to update those tools in follow up patches. > > I don't think the flakiness dashboard actually looks at the test expectations syntax. We do need to make garden-o-matic work as part of this; it should be easy to do. It does in javascript. It fetches TextExpectations via XHR and shows expectations. > > Tools/Scripts/webkitpy/layout_tests/run_webkit_tests_integrationtest.py:261 > > + port_obj.expectations_dict = lambda: {'': 'a # syntax error'} > > nit: I'd probably get rid of the '#' here since it might be confusing. It was not the intent of this test for '# syntax error' to represent a comment in any way. Will do. > > Tools/Scripts/webkitpy/layout_tests/models/test_expectations.py:271 > > + elif modifier.lower().startswith('bug') or '/' in modifier: > > just checking if '/' is in the modifier is probably a bit fast-and-loose for me, since it would allow 'foo/bar' to be legal. On the other hand, we don't want to do a full URL/URI match either. Also, I don't think we want to whitelist URLs because we want to be able to use URLs of Skia, V8, etc... issue trackers. > > Tools/Scripts/webkitpy/layout_tests/models/test_expectations.py:339 > > + > > I would rather we not try to match this all as a single regex, since it looks like gibberish to me :) Can we split this into old-style match and new-style match and figure out which one to use? A big advantage of using a regular expression here is so that the said javascript code in flakiness dashboard can use the regular expression. > "description" does not seem like a great name to me; I'd probably use 'bugid' to be consistent with the variable names. But that's not accurate because we allow things like Bug(rniwa). Also webkit.org/b/12345 isn't really a bug id. Maybe bug_info? > 'platforms' is not the right name, since it can include things like 'release' and 'debug' and presumably things like 'qt-4.8' and 'wk2' and gpu/cpu. How about configurations? It's definitely not modifiers. > In particular it's hard for me to judge the correctness of this and it's hard to tell what the post-transition code would look like. Maybe it would help to post a version of the code that will only parse the new syntax? In the post transition, we'll get rid of the statements after the if. > > Tools/Scripts/webkitpy/layout_tests/models/test_expectations.py:389 > > + # FIXME: Should we support rebaseline modifier at all? > > Yes, we still need to support REBASELINE, but I've been saying it would move to the expectations (and no longer be a modifier) Okay, so just remove this FIXME? > > Tools/Scripts/webkitpy/layout_tests/models/test_expectations.py:397 > > + expectation_line.warnings.append("WontFix should not be used with any other expectation") > > I don't think we need to enforce this; I seem to recall arguing that skip and wontfix should be handled identically. If WontFix and Skip are synonymous, then we certainly don't want other expectations to be there, right?
(In reply to comment #7) > (In reply to comment #6) > > (From update of attachment 147716 [details] [details]) > > Thanks for working on this! Note that this probably duplicates bug 86796, so we should close one or the other (probably that one). > > I think we can make this bug a blocker instead because we'll need to fix garden-o-matic as well. > Well, I want you to fix garden-o-matic as a part of this, if we can. Can you work up a patch that does that so we can see if it adds much complexity or not? > > (In reply to comment #2) > > > This patch is missing some pieces. e.g. garden-o-matic should be able to generate test expectations using new format, and flakiness dashboard also needs to support new format. I'm intending to update those tools in follow up patches. > > > > I don't think the flakiness dashboard actually looks at the test expectations syntax. We do need to make garden-o-matic work as part of this; it should be easy to do. > > It does in javascript. It fetches TextExpectations via XHR and shows expectations. > I see. I believe I've had conversations w/ Ojan where we need to stop this and probably just embed the appropriate text into the uploaded json file (or do something else). Especially with cascading expectations, getting this right gets harder and harder, duplicating more and more code. Maybe we need to fix that before landing this. > > > Tools/Scripts/webkitpy/layout_tests/models/test_expectations.py:271 > > > + elif modifier.lower().startswith('bug') or '/' in modifier: > > > > just checking if '/' is in the modifier is probably a bit fast-and-loose for me, since it would allow 'foo/bar' to be legal. > > On the other hand, we don't want to do a full URL/URI match either. Also, I don't think we want to whitelist URLs because we want to be able to use URLs of Skia, V8, etc... issue trackers. > I understand, but this is too loose. maybe look for at least something that looks like a domain name (.com/.org?) followed by a slash and ends in a number? > > > Tools/Scripts/webkitpy/layout_tests/models/test_expectations.py:339 > > > + > > > > I would rather we not try to match this all as a single regex, since it looks like gibberish to me :) Can we split this into old-style match and new-style match and figure out which one to use? > > A big advantage of using a regular expression here is so that the said javascript code in flakiness dashboard can use the regular expression. > good point, however, cf. above :) Do javascript regexps actually let you name subpattern matches? > > "description" does not seem like a great name to me; I'd probably use 'bugid' to be consistent with the variable names. > > But that's not accurate because we allow things like Bug(rniwa). Also webkit.org/b/12345 isn't really a bug id. Maybe bug_info? > I didn't realize bugid had an actual definition :) If you prefer bug_info that's fine, or bug_token, or bug, or something. > > 'platforms' is not the right name, since it can include things like 'release' and 'debug' and presumably things like 'qt-4.8' and 'wk2' and gpu/cpu. > > How about configurations? It's definitely not modifiers. > configurations works for me. > > In particular it's hard for me to judge the correctness of this and it's hard to tell what the post-transition code would look like. Maybe it would help to post a version of the code that will only parse the new syntax? > > In the post transition, we'll get rid of the statements after the if. Which 'if'? Also, the regex changes as well, right? > > > > Tools/Scripts/webkitpy/layout_tests/models/test_expectations.py:389 > > > + # FIXME: Should we support rebaseline modifier at all? > > > > Yes, we still need to support REBASELINE, but I've been saying it would move to the expectations (and no longer be a modifier) > > Okay, so just remove this FIXME? > Yes. > > > Tools/Scripts/webkitpy/layout_tests/models/test_expectations.py:397 > > > + expectation_line.warnings.append("WontFix should not be used with any other expectation") > > > > I don't think we need to enforce this; I seem to recall arguing that skip and wontfix should be handled identically. > > If WontFix and Skip are synonymous, then we certainly don't want other expectations to be there, right? I believe I've said before that I'm okay with other expectations being there (although not required), since it can be useful for when you run with --force/--skipped=only to see if tests are doing something different than what you expected).
(In reply to comment #8) > (In reply to comment #7) > > (In reply to comment #6) > > > (From update of attachment 147716 [details] [details] [details]) > > > Thanks for working on this! Note that this probably duplicates bug 86796, so we should close one or the other (probably that one). > > > > I think we can make this bug a blocker instead because we'll need to fix garden-o-matic as well. > > > > Well, I want you to fix garden-o-matic as a part of this, if we can. Can you work up a patch that does that so we can see if it adds much complexity or not? I guess we can try that. > > > > Tools/Scripts/webkitpy/layout_tests/models/test_expectations.py:271 > > > > + elif modifier.lower().startswith('bug') or '/' in modifier: > > > > > > just checking if '/' is in the modifier is probably a bit fast-and-loose for me, since it would allow 'foo/bar' to be legal. > > > > On the other hand, we don't want to do a full URL/URI match either. Also, I don't think we want to whitelist URLs because we want to be able to use URLs of Skia, V8, etc... issue trackers. > > > > I understand, but this is too loose. maybe look for at least something that looks like a domain name (.com/.org?) followed by a slash and ends in a number? Match r"(\w+\.\w+/)" ? > > > > Tools/Scripts/webkitpy/layout_tests/models/test_expectations.py:339 > > > > + > > > > > > I would rather we not try to match this all as a single regex, since it looks like gibberish to me :) Can we split this into old-style match and new-style match and figure out which one to use? > > > > A big advantage of using a regular expression here is so that the said javascript code in flakiness dashboard can use the regular expression. > > > > good point, however, cf. above :) Do javascript regexps actually let you name subpattern matches? I don't think it does name sub-pattern matches, or support verbose mode. However, that's much better not being able to share any logic at all. > > > In particular it's hard for me to judge the correctness of this and it's hard to tell what the post-transition code would look like. Maybe it would help to post a version of the code that will only parse the new syntax? > > > > In the post transition, we'll get rid of the statements after the if. > > Which 'if'? Also, the regex changes as well, right? The if that wraps all the code I'm adding in _tokenize_line. > > > > Tools/Scripts/webkitpy/layout_tests/models/test_expectations.py:397 > > > > + expectation_line.warnings.append("WontFix should not be used with any other expectation") > > > > > > I don't think we need to enforce this; I seem to recall arguing that skip and wontfix should be handled identically. > > > > If WontFix and Skip are synonymous, then we certainly don't want other expectations to be there, right? > > I believe I've said before that I'm okay with other expectations being there (although not required), since it can be useful for when you run with --force/--skipped=only to see if tests are doing something different than what you expected). Okay, will remove it for now then.
(In reply to comment #9) > Match r"(\w+\.\w+/)" ? maybe with .*\d+ on the end but I'm not too picky.
*** Bug 86796 has been marked as a duplicate of this bug. ***
for the record, see old discussion in https://bugs.webkit.org/show_bug.cgi?id=86796 that led up to this (also some threads on webkit-dev that I'm too lazy to link to right now).
Comment on attachment 147716 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=147716&action=review >>>>> Tools/Scripts/webkitpy/layout_tests/models/test_expectations.py:271 >>>>> + elif modifier.lower().startswith('bug') or '/' in modifier: >>>> >>>> just checking if '/' is in the modifier is probably a bit fast-and-loose for me, since it would allow 'foo/bar' to be legal. >>> >>> On the other hand, we don't want to do a full URL/URI match either. Also, I don't think we want to whitelist URLs because we want to be able to use URLs of Skia, V8, etc... issue trackers. >> >> I understand, but this is too loose. maybe look for at least something that looks like a domain name (.com/.org?) followed by a slash and ends in a number? > > Match r"(\w+\.\w+/)" ? Lets be more strict about it. Bugs should start with one of the following: bug crbug.com webkit.org/b I don't think we should allow the longform URLS or the http(s)://. If other ports have other bug database URLS they want to add, they can extend the list of valid prefixes (e.g. rdar://)
(In reply to comment #13) > (From update of attachment 147716 [details]) > Lets be more strict about it. Bugs should start with one of the following: > bug > crbug.com > webkit.org/b Why? What's the benefit in doing that? It'll be annoying having to modify test running scripts just to use new URL. It's both counter-intutitive and counter-productive.
(In reply to comment #14) > (In reply to comment #13) > > (From update of attachment 147716 [details] [details]) > > Lets be more strict about it. Bugs should start with one of the following: > > bug > > crbug.com > > webkit.org/b > > Why? What's the benefit in doing that? It'll be annoying having to modify test running scripts just to use new URL. It's both counter-intutitive and counter-productive. It helps people not make typos, e.g. the linter will catch if you type webkit.org/1234, which is a very common mistake. Also, I don't think we should allow the longform URLs as it would decrease the readability of the file.
(In reply to comment #15) > It helps people not make typos, e.g. the linter will catch if you type webkit.org/1234, which is a very common mistake. Also, I don't think we should allow the longform URLs as it would decrease the readability of the file. That doesn't prevent people from using a wrong bug number.
(In reply to comment #16) > (In reply to comment #15) > > It helps people not make typos, e.g. the linter will catch if you type webkit.org/1234, which is a very common mistake. Also, I don't think we should allow the longform URLs as it would decrease the readability of the file. > > That doesn't prevent people from using a wrong bug number. I'm not sure what you're saying here. Yes, that's true, but I don't see how that relates. Just because we can't warn about every possible error doesn't mean we shouldn't warn the ones that we can.
(In reply to comment #17) > I'm not sure what you're saying here. Yes, that's true, but I don't see how that relates. Just because we can't warn about every possible error doesn't mean we shouldn't warn the ones that we can. I'm saying that I don't see any value in verifying that bug URLs start with webkit.org/b/ or crbug.com/ because at the end of the day, people can introduce any typos there. And it'll be extremely annoying having to modify webkitpy code to support new URLs. That's unacceptable level of overhead.
It is conceivable that people will write scripts and tools that will parse the files and want to do something with the URLs (I've done this before and will likely do it again, e.g. for the LTTF dashboard we used a couple years ago), and having some restrictions on what URLs might be used is useful in that case. Also, there's a pretty strong disincentive to adding more trackers, because people want webkit-related issues to be tracked on bugs.webkit.org. I personally get grumpy when I see things listed in rdar:// instead of b.w.o, and I can imagine others feel similarly about crbug (to that end, most of the crbugs have moved or should be moved to b.w.o, for example). That said, I don't feel strongly about this so those who do can decide what to do.
(In reply to comment #18) > (In reply to comment #17) > > I'm not sure what you're saying here. Yes, that's true, but I don't see how that relates. Just because we can't warn about every possible error doesn't mean we shouldn't warn the ones that we can. > > I'm saying that I don't see any value in verifying that bug URLs start with webkit.org/b/ or crbug.com/ because at the end of the day, people can introduce any typos there. And it'll be extremely annoying having to modify webkitpy code to support new URLs. That's unacceptable level of overhead. How often do people need to add support for new URL types? It's at most once per new port (i.e. less than once per year). Modifying a list in a python file is simply not a big overhead. I'm picturing code like the following: VALID_BUG_PREFIXES = ['crbug.com/', 'webkit.org/b/'] for prefix in VALID_BUG_PREFIXES: ... All you need to do is add an extra value to the list. It's arguably not even modifying code at that point. How is that a lot of overhead? People put new bug URLs in TestExpectations files many times every day. You can look at the history of threads on webkit-dev to see how often people accidentally leave out the "b/" in the webkit.org bug URLs. It's a very common mistake. One case of fallout to consider is that the flakiness dashboard will autolinkify these URLs. If they are mistyped, then the links won't work. As Dirk pointed out, any tooling we write for this data will rely on the URLs being valid. I've said my piece here though. I don't think there's more to discuss. If you still disagree, I won't block you committing this. I do strongly disagree though. I don't think this is making the right tradeoff.
(In reply to comment #20) > How often do people need to add support for new URL types? It's at most once per new port (i.e. less than once per year). Modifying a list in a python file is simply not a big overhead. I'm picturing code like the following: > > VALID_BUG_PREFIXES = ['crbug.com/', 'webkit.org/b/'] > for prefix in VALID_BUG_PREFIXES: > ... > > All you need to do is add an extra value to the list. It's arguably not even modifying code at that point. How is that a lot of overhead? The fact it's buried in webkitpy code is the unacceptable part. Then we'll require adding a test, etc... I would be okay if it were in TestExpectations themselves as in: AllowedURLs: webkit.org/b/\d*, crbug.com/\d+ at the top of each file. Presumably, we can declare this at the top-level TestExpectations file once we support cascading. > People put new bug URLs in TestExpectations files many times every day. You can look at the history of threads on webkit-dev to see how often people accidentally leave out the "b/" in the webkit.org bug URLs. It's a very common mistake. We should just add webkit.org/12345 -> webkit.org/b/12345 alias then.
(In reply to comment #21) > I would be okay if it were in TestExpectations themselves as in: > > AllowedURLs: webkit.org/b/\d*, crbug.com/\d+ On my second thought, this is such an over-engineered piece of cr*p. So I take it back.
Sorry if my comments stalled this moving forward. As I said, I'm fine with you moving on without us coming to agreement on the strictness of bug listings. FWIW, I just stumbled across the following typos in platform/chromium/TestExpectations: BUGKW76557 : svg/custom/transform-with-shadow-and-gradient.svg = IMAGE // This typo is repeated 8 times BUGW81803 MAC : platform/chromium/compositing/rubberbanding/transform-overhang-e.html = PASS IMAGE
I'm happy to review this patch if you expand it to include the the flakiness dashboard and garden-o-matic changes. I'm not actually sure you need to change anything in garden-o-matic. garden-o-matic just relies on webkit-patch for everything. It doesn't read the TestExpectations format at all AFAIK.
> I'm not actually sure you need to change anything in garden-o-matic. garden-o-matic just relies on webkit-patch for everything. It doesn't read the TestExpectations format at all AFAIK. Correct.
I'm working on this now and will have a set of patches coming shortly.
Note that while I'm using the refactoring that rniwa already landed, I'm not looking at the patch he posted to this at all, and instead taking a different approach. The general path I'm planning on for getting to the new syntax: 1) Refactor the code slightly to prepare for handling two syntaxes simultaneously 2) Implement reading in the new syntax by mapping it back onto the old syntax, so that the new parser is isolated and easily reviewable. Parsing will be done line by line, so a given file will support a mixture of both syntaxes. This significantly eases migration complexity, and hopefully won't show up much in practice. 3) Modify the serializer code to write out the new syntax rather than the old (to support webkit-patch rebaseline-expectations) 4) Concurrently w/ #3 - reformat and check in the existing files in the new format 5) Stop reading/supporting the old format 6) Clean up / refactor the code once support for the old format is no longer needed. My intent is for the time between 3) and 5) to be only about as long as it'll take me to land the patches and make sure nothing breaks, so that the period of time when both syntaxes are supported is short (hopefully only slightly longer than the window of a chromium roll, which will be needed to convert the downstream chromium expectations). I am not planning to support migrating patches that might contain the old syntax (though I have not yet figure out how to prevent such from landing and confusing things). I'm open to suggestions here one way or another.
(In reply to comment #27) > I am not planning to support migrating patches that might contain the old syntax (though I have not yet figure out how to prevent such from landing and confusing things). I'm open to suggestions here one way or another. Once the linter starts complaining for the old syntax, the presubmit will catch this and prevent a commit, no?
(In reply to comment #28) > (In reply to comment #27) > > I am not planning to support migrating patches that might contain the old syntax (though I have not yet figure out how to prevent such from landing and confusing things). I'm open to suggestions here one way or another. > > Once the linter starts complaining for the old syntax, the presubmit will catch this and prevent a commit, no? Do the presubmit checks run on commit? I thought they only ran on upload. At any rate, I was more concerned about providing a tool to easily migrate old syntax to new that might be integrated into webkit-patch apply et al. It will be easy enough to add a separate tool (since I'll need one to migrate the files anyway) but I didn't want to have to integrate it into apply and (more importantly) I didn't want to set the expectation that people should expect it to work and as a result drag out the time when we were supporting both syntaxes.