Bug 193857

Summary: run-webkit-tests emits line about number of tests run for each booted simulator regardless of whether any tests were run
Product: WebKit Reporter: Daniel Bates <dbates>
Component: Tools / TestsAssignee: Jonathan Bedard <jbedard>
Status: NEW ---    
Severity: Enhancement CC: lforschler, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Local Build   
Hardware: iPhone / iPad   
OS: iOS 12   
See Also: https://bugs.webkit.org/show_bug.cgi?id=194133
https://bugs.webkit.org/show_bug.cgi?id=194158

Description Daniel Bates 2019-01-25 16:21:53 PST
run-webkit-tests's output is even more confusing. Even without the --force flag it will run the test in current simulator, iPad, but always emits a line saying that it did not run the test in iPhone SE. If --no-retry-failures is passed then the latter line replaces the former so you briefly see it say it is running the test on iPad (and it runs it in iPad - why? I didn't say force and I thought the policy was to prefer iPhone SE?) and then it overwrites the line to say it ran 0 tests in iPhone SE. Here's a run of what this looks like when the test is retried:

[[
$ Tools/Scripts/run-webkit-tests --no-build --debug --ios-simulator LayoutTests/fast/events/ios/keypress-keys-in-non-editable-element.html 
Using port 'ios-simulator'
Test configuration: <, x86_64, debug>
Placing test results in /Volumes/.../Debug-iphonesimulator/layout-test-results
Using Debug build
Pixel tests disabled
Regular timeout: 30000, slow test timeout: 150000
Command line: /Volumes/.../Debug-iphonesimulator/WebKitTestRunnerApp.app -

--lint-test-files warnings:                                    
LayoutTests/platform/ios/TestExpectations:12 Path does not exist. media/controls/ipad

--lint-test-files warnings:                        
LayoutTests/platform/ios/TestExpectations:12 Path does not exist. media/controls/ipad

Found 1 test; running 1, skipping 0.
                  
Verbose baseline search path: platform/ipad-(6th generation)-...-wk2 -> platform/ipad-(6th generation)-... -> platform/ipad-(6th generation)-simulator-wk2 -> platform/ipad-(6th generation)-simulator -> platform/ipad-...-wk2 -> platform/ipad-... -> platform/ipad-simulator-wk2 -> platform/ipad-simulator -> platform/ios-...-wk2 -> platform/ios-... -> platform/ios-simulator-wk2 -> platform/ios-simulator -> platform/ipad-(6th generation)-...-wk2 -> platform/ipad-(6th generation)-... -> platform/ipad-(6th generation)-wk2 -> platform/ipad-(6th generation) -> platform/ipad-...-wk2 -> platform/ipad-... -> platform/ipad-wk2 -> platform/ipad -> platform/ios-...-wk2 -> platform/ios-... -> platform/ios-wk2 -> platform/ios -> platform/wk2 -> generic

Baseline search path: platform/ios-simulator-wk2 -> platform/ios-simulator -> platform/ipad -> platform/ios-wk2 -> platform/ios -> platform/wk2 -> generic

Running 1 test for iPad (6th generation) running None

Running 1 WebKitTestRunnerApp.app.

[1/1] fast/events/ios/keypress-keys-in-non-editable-element.html failed unexpectedly (text diff)

Retrying 1 unexpected failure ...

Running 1 WebKitTestRunnerApp.app.

[1/1] fast/events/ios/keypress-keys-in-non-editable-element.html failed unexpectedly (text diff)

Verbose baseline search path: platform/iphone-se-...-wk2 -> platform/iphone-se-... -> platform/iphone-se-simulator-wk2 -> platform/iphone-se-simulator -> platform/iphone-...-wk2 -> platform/iphone-... -> platform/iphone-simulator-wk2 -> platform/iphone-simulator -> platform/ios-...-wk2 -> platform/ios-... -> platform/ios-simulator-wk2 -> platform/ios-simulator -> platform/iphone-se-...-wk2 -> platform/iphone-se-... -> platform/iphone-se-wk2 -> platform/iphone-se -> platform/iphone-...-wk2 -> platform/iphone-... -> platform/iphone-wk2 -> platform/iphone -> platform/ios-...-wk2 -> platform/ios-... -> platform/ios-wk2 -> platform/ios -> platform/wk2 -> generic

Baseline search path: platform/ios-simulator-wk2 -> platform/ios-simulator -> platform/ios-wk2 -> platform/ios -> platform/wk2 -> generic

Running 0 tests for iPhone SE running None


0 tests ran as expected, 1 didn't:


Regressions: Unexpected text-only failures (1)
  fast/events/ios/keypress-keys-in-non-editable-element.html [ Failure ]

]]

It's more confusing if --no-retry-failure is passed.

Also, what is going on with the "running None"?
Comment 1 Radar WebKit Bug Importer 2019-01-25 16:22:26 PST
<rdar://problem/47566197>
Comment 2 Daniel Bates 2019-01-25 16:24:43 PST
Never mind about the more confusing part and the overwriting. It just always emits "Running 0 tests for iPhone SE running None" when it ran the test using iPad.
Comment 3 Daniel Bates 2019-01-25 16:35:31 PST
*** Bug 193856 has been marked as a duplicate of this bug. ***
Comment 4 Jonathan Bedard 2019-01-31 08:56:32 PST
Was there a booted iPhone SE and a booted iPad when this was was observed? If they are both booted, I'm not sure that this is the wrong behavior. If only one was booted, this is definitely a problem.
Comment 5 Daniel Bates 2019-01-31 09:16:49 PST
(In reply to Jonathan Bedard from comment #4)
> Was there a booted iPhone SE and a booted iPad when this was was observed?
> If they are both booted, I'm not sure that this is the wrong behavior. If
> only one was booted, this is definitely a problem.

I had both booted as implied by my comment "it runs it in iPad - why? I didn't say force and I thought the policy was to prefer iPhone SE?". If both are booted why does it only run the test in one of the simulators AND why did it pick iPad to run the test in?
Comment 6 Jonathan Bedard 2019-01-31 11:33:33 PST
(In reply to Daniel Bates from comment #5)
> (In reply to Jonathan Bedard from comment #4)
> > Was there a booted iPhone SE and a booted iPad when this was was observed?
> > If they are both booted, I'm not sure that this is the wrong behavior. If
> > only one was booted, this is definitely a problem.
> 
> I had both booted as implied by my comment "it runs it in iPad - why? I
> didn't say force and I thought the policy was to prefer iPhone SE?". If both
> are booted why does it only run the test in one of the simulators AND why
> did it pick iPad to run the test in?

The 'policy of preferring iPhone SE' at the moment is in a bit of a weird state, since all of these tests should also be gardened for iPad (not that they are yet, but they should be, see <https://bugs.webkit.org/show_bug.cgi?id=193767>). That being said, there is a pretty good argument to be made about predictably ordering devices in the event a user has multiple booted, right now the order would be determined by the output of 'xcrun simctl list', and even then, I'm not 100% certain which booted device would come first.

I'm going to stand by the 'Running 0 tests for xxx' behavior because it makes it clear that there was another device that could have been used, although was not.

We only run the tests on one device because after discussing the techniques we could use to run a device multiple times in a single test run, the general consensus was that such an approach was both complicated and confusing and not worth the effort.
Comment 7 Daniel Bates 2019-01-31 12:10:56 PST
(In reply to Jonathan Bedard from comment #6)
> I'm going to stand by the 'Running 0 tests for xxx' behavior because it
> makes it clear that there was another device that could have been used,
> although was not.
> 

Great! Why should I care?
Comment 8 Jonathan Bedard 2019-01-31 14:37:31 PST
(In reply to Daniel Bates from comment #7)
> (In reply to Jonathan Bedard from comment #6)
> > ...
> 
> Great! Why should I care?

Because it is not obvious which tests will be run on which devices. Even if we establish a more predictable device precedence order, the inverse of your problem (ie, why did we pick an iPhone SE instead of an iPad) is just as likely.

When tests start failing, the user needs to care what device type was used because although tests should be device agnostic, plenty are not.
Comment 9 Daniel Bates 2019-01-31 15:23:55 PST
(In reply to Jonathan Bedard from comment #8)
> (In reply to Daniel Bates from comment #7)
> > (In reply to Jonathan Bedard from comment #6)
> > > ...
> > 
> > Great! Why should I care?
> 
> Because it is not obvious which tests will be run on which devices. Even if
> we establish a more predictable device precedence order, the inverse of your
> problem (ie, why did we pick an iPhone SE instead of an iPad) is just as
> likely.
> 
> When tests start failing, the user needs to care what device type was used
> because although tests should be device agnostic, plenty are not.

Why should I care about what devices were *not* used?
Comment 10 Jonathan Bedard 2019-01-31 17:23:36 PST
(In reply to Daniel Bates from comment #9)
> (In reply to Jonathan Bedard from comment #8)
> > ...
> 
> Why should I care about what devices were *not* used?

The devices listed were valid targets for the specified tests and configuration. Since it's pretty easy to lose track of which simulators you have booted (I think it was Xcode 9 that simulators could be headlessly booted), this lets users see that a device they booted is available and a valid target, even if it was not used.

Actually, I think this bug demonstrates why such logging can be useful. In this case, you expected the test to be run on an iPhone SE. It was not, but that device was both available and a valid target. Just from the log, we know that 2 devices were booted and matched the configuration, that information seems useful to me.
Comment 11 Daniel Bates 2019-01-31 17:54:14 PST
(In reply to Jonathan Bedard from comment #10)
> (In reply to Daniel Bates from comment #9)
> > (In reply to Jonathan Bedard from comment #8)
> > > ...
> > 
> > Why should I care about what devices were *not* used?
> 
> The devices listed were valid targets for the specified tests and
> configuration. Since it's pretty easy to lose track of which simulators you
> have booted (I think it was Xcode 9 that simulators could be headlessly
> booted), this lets users see that a device they booted is available and a
> valid target, even if it was not used.

Not useful. I know what simulators I booted. I guess I would appreciate this output more if I had alzheimer's :/ If that's the audience you're going for then success! This is a feature! Or if you're looking for affirmation that the code you wrote is working and that it didn't run the test twice then I can see the benefit of keeping this behavior and closing this bug. I'll be reminded of this each time I run a test and have multiple simulators open and then become confused as to why didn't run-webkit-tests not run the test in all open simulators.

> 
> Actually, I think this bug demonstrates why such logging can be useful. In
> this case, you expected the test to be run on an iPhone SE. 

I got this info just by reading the "Running 1 test for iPad (6th generation) running None" line. I didn't need to see "Running 0 tests for iPhone SE running None".

> It was not, but
> that device was both available and a valid target. Just from the log, we
> know that 2 devices were booted and matched the configuration, that
> information seems useful to me.

I think there is a saying, something like "less is more". This applies to this bug.
Comment 12 Daniel Bates 2019-01-31 17:55:27 PST
Shout out to the "...running None" line.
Comment 13 Daniel Bates 2019-01-31 18:28:10 PST
(In reply to Daniel Bates from comment #11)
> (In reply to Jonathan Bedard from comment #10)
> > (In reply to Daniel Bates from comment #9)
> > > (In reply to Jonathan Bedard from comment #8)
> > > > ...
> > > 
> > > Why should I care about what devices were *not* used?
> > 
> > The devices listed were valid targets for the specified tests and
> > configuration. Since it's pretty easy to lose track of which simulators you
> > have booted (I think it was Xcode 9 that simulators could be headlessly
> > booted), this lets users see that a device they booted is available and a
> > valid target, even if it was not used.
> 
> Not useful. I know what simulators I booted. I guess I would appreciate this
> output more if I had alzheimer's :/ If that's the audience you're going for
> then success! This is a feature! Or if you're looking for affirmation that
> the code you wrote is working and that it didn't run the test twice then I
> can see the benefit of keeping this behavior and closing this bug. I'll be
> reminded of this each time I run a test and have multiple simulators open
> and then become confused as to why didn't run-webkit-tests not run the test
> in all open simulators.
> 
> > 
> > Actually, I think this bug demonstrates why such logging can be useful. In
> > this case, you expected the test to be run on an iPhone SE. 
> 
> I got this info just by reading the "Running 1 test for iPad (6th
> generation) running None" line. I didn't need to see "Running 0 tests for
> iPhone SE running None".
> 
> > It was not, but
> > that device was both available and a valid target. Just from the log, we
> > know that 2 devices were booted and matched the configuration, that
> > information seems useful to me.
> 
> I think there is a saying, something like "less is more". This applies to
> this bug.

Reading this back, the tone sounds negative. I didn't mean to come across like that. I just find the behavior of this bug silly. I can understand if the verbosity is useful to other people. So, this isn't a regression per se as much as it is an intended behavior change and hence this bug represents an enhancement request. Updating bug title and fields to reflect this.
Comment 14 Daniel Bates 2019-01-31 18:31:26 PST
(In reply to Daniel Bates from comment #12)
> Shout out to the "...running None" line.

Filed bug #194133 for this.
Comment 15 Jonathan Bedard 2019-02-01 08:51:47 PST
I filed <https://bugs.webkit.org/show_bug.cgi?id=194158> because it seems that this bug is now entirely about logging concerns.