Full Text Bug Listing

Daniel Bates

Reported 2019-01-25 16:21:53 PST

run-webkit-tests's output is even more confusing. Even without the --force flag it will run the test in current simulator, iPad, but always emits a line saying that it did not run the test in iPhone SE. If --no-retry-failures is passed then the latter line replaces the former so you briefly see it say it is running the test on iPad (and it runs it in iPad - why? I didn't say force and I thought the policy was to prefer iPhone SE?) and then it overwrites the line to say it ran 0 tests in iPhone SE. Here's a run of what this looks like when the test is retried: [[ $ Tools/Scripts/run-webkit-tests --no-build --debug --ios-simulator LayoutTests/fast/events/ios/keypress-keys-in-non-editable-element.html Using port 'ios-simulator' Test configuration: <, x86_64, debug> Placing test results in /Volumes/.../Debug-iphonesimulator/layout-test-results Using Debug build Pixel tests disabled Regular timeout: 30000, slow test timeout: 150000 Command line: /Volumes/.../Debug-iphonesimulator/WebKitTestRunnerApp.app - --lint-test-files warnings: LayoutTests/platform/ios/TestExpectations:12 Path does not exist. media/controls/ipad --lint-test-files warnings: LayoutTests/platform/ios/TestExpectations:12 Path does not exist. media/controls/ipad Found 1 test; running 1, skipping 0. Verbose baseline search path: platform/ipad-(6th generation)-...-wk2 -> platform/ipad-(6th generation)-... -> platform/ipad-(6th generation)-simulator-wk2 -> platform/ipad-(6th generation)-simulator -> platform/ipad-...-wk2 -> platform/ipad-... -> platform/ipad-simulator-wk2 -> platform/ipad-simulator -> platform/ios-...-wk2 -> platform/ios-... -> platform/ios-simulator-wk2 -> platform/ios-simulator -> platform/ipad-(6th generation)-...-wk2 -> platform/ipad-(6th generation)-... -> platform/ipad-(6th generation)-wk2 -> platform/ipad-(6th generation) -> platform/ipad-...-wk2 -> platform/ipad-... -> platform/ipad-wk2 -> platform/ipad -> platform/ios-...-wk2 -> platform/ios-... -> platform/ios-wk2 -> platform/ios -> platform/wk2 -> generic Baseline search path: platform/ios-simulator-wk2 -> platform/ios-simulator -> platform/ipad -> platform/ios-wk2 -> platform/ios -> platform/wk2 -> generic Running 1 test for iPad (6th generation) running None Running 1 WebKitTestRunnerApp.app. [1/1] fast/events/ios/keypress-keys-in-non-editable-element.html failed unexpectedly (text diff) Retrying 1 unexpected failure ... Running 1 WebKitTestRunnerApp.app. [1/1] fast/events/ios/keypress-keys-in-non-editable-element.html failed unexpectedly (text diff) Verbose baseline search path: platform/iphone-se-...-wk2 -> platform/iphone-se-... -> platform/iphone-se-simulator-wk2 -> platform/iphone-se-simulator -> platform/iphone-...-wk2 -> platform/iphone-... -> platform/iphone-simulator-wk2 -> platform/iphone-simulator -> platform/ios-...-wk2 -> platform/ios-... -> platform/ios-simulator-wk2 -> platform/ios-simulator -> platform/iphone-se-...-wk2 -> platform/iphone-se-... -> platform/iphone-se-wk2 -> platform/iphone-se -> platform/iphone-...-wk2 -> platform/iphone-... -> platform/iphone-wk2 -> platform/iphone -> platform/ios-...-wk2 -> platform/ios-... -> platform/ios-wk2 -> platform/ios -> platform/wk2 -> generic Baseline search path: platform/ios-simulator-wk2 -> platform/ios-simulator -> platform/ios-wk2 -> platform/ios -> platform/wk2 -> generic Running 0 tests for iPhone SE running None 0 tests ran as expected, 1 didn't: Regressions: Unexpected text-only failures (1) fast/events/ios/keypress-keys-in-non-editable-element.html [ Failure ] ]] It's more confusing if --no-retry-failure is passed. Also, what is going on with the "running None"?

Radar WebKit Bug Importer

Comment 1 2019-01-25 16:22:26 PST

<rdar://problem/47566197>

Daniel Bates

Comment 2 2019-01-25 16:24:43 PST

Never mind about the more confusing part and the overwriting. It just always emits "Running 0 tests for iPhone SE running None" when it ran the test using iPad.

Daniel Bates

Comment 3 2019-01-25 16:35:31 PST

*** Bug 193856 has been marked as a duplicate of this bug. ***

Jonathan Bedard

Comment 4 2019-01-31 08:56:32 PST

Was there a booted iPhone SE and a booted iPad when this was was observed? If they are both booted, I'm not sure that this is the wrong behavior. If only one was booted, this is definitely a problem.

Daniel Bates

Comment 5 2019-01-31 09:16:49 PST

(In reply to Jonathan Bedard from comment #4) > Was there a booted iPhone SE and a booted iPad when this was was observed? > If they are both booted, I'm not sure that this is the wrong behavior. If > only one was booted, this is definitely a problem. I had both booted as implied by my comment "it runs it in iPad - why? I didn't say force and I thought the policy was to prefer iPhone SE?". If both are booted why does it only run the test in one of the simulators AND why did it pick iPad to run the test in?

Jonathan Bedard

Comment 6 2019-01-31 11:33:33 PST

(In reply to Daniel Bates from comment #5) > (In reply to Jonathan Bedard from comment #4) > > Was there a booted iPhone SE and a booted iPad when this was was observed? > > If they are both booted, I'm not sure that this is the wrong behavior. If > > only one was booted, this is definitely a problem. > > I had both booted as implied by my comment "it runs it in iPad - why? I > didn't say force and I thought the policy was to prefer iPhone SE?". If both > are booted why does it only run the test in one of the simulators AND why > did it pick iPad to run the test in? The 'policy of preferring iPhone SE' at the moment is in a bit of a weird state, since all of these tests should also be gardened for iPad (not that they are yet, but they should be, see <https://bugs.webkit.org/show_bug.cgi?id=193767>). That being said, there is a pretty good argument to be made about predictably ordering devices in the event a user has multiple booted, right now the order would be determined by the output of 'xcrun simctl list', and even then, I'm not 100% certain which booted device would come first. I'm going to stand by the 'Running 0 tests for xxx' behavior because it makes it clear that there was another device that could have been used, although was not. We only run the tests on one device because after discussing the techniques we could use to run a device multiple times in a single test run, the general consensus was that such an approach was both complicated and confusing and not worth the effort.

Daniel Bates

Comment 7 2019-01-31 12:10:56 PST

(In reply to Jonathan Bedard from comment #6) > I'm going to stand by the 'Running 0 tests for xxx' behavior because it > makes it clear that there was another device that could have been used, > although was not. > Great! Why should I care?

Jonathan Bedard

Comment 8 2019-01-31 14:37:31 PST

(In reply to Daniel Bates from comment #7) > (In reply to Jonathan Bedard from comment #6) > > ... > > Great! Why should I care? Because it is not obvious which tests will be run on which devices. Even if we establish a more predictable device precedence order, the inverse of your problem (ie, why did we pick an iPhone SE instead of an iPad) is just as likely. When tests start failing, the user needs to care what device type was used because although tests should be device agnostic, plenty are not.

Daniel Bates

Comment 9 2019-01-31 15:23:55 PST

(In reply to Jonathan Bedard from comment #8) > (In reply to Daniel Bates from comment #7) > > (In reply to Jonathan Bedard from comment #6) > > > ... > > > > Great! Why should I care? > > Because it is not obvious which tests will be run on which devices. Even if > we establish a more predictable device precedence order, the inverse of your > problem (ie, why did we pick an iPhone SE instead of an iPad) is just as > likely. > > When tests start failing, the user needs to care what device type was used > because although tests should be device agnostic, plenty are not. Why should I care about what devices were *not* used?

Jonathan Bedard

Comment 10 2019-01-31 17:23:36 PST

(In reply to Daniel Bates from comment #9) > (In reply to Jonathan Bedard from comment #8) > > ... > > Why should I care about what devices were *not* used? The devices listed were valid targets for the specified tests and configuration. Since it's pretty easy to lose track of which simulators you have booted (I think it was Xcode 9 that simulators could be headlessly booted), this lets users see that a device they booted is available and a valid target, even if it was not used. Actually, I think this bug demonstrates why such logging can be useful. In this case, you expected the test to be run on an iPhone SE. It was not, but that device was both available and a valid target. Just from the log, we know that 2 devices were booted and matched the configuration, that information seems useful to me.

Daniel Bates

Comment 11 2019-01-31 17:54:14 PST

(In reply to Jonathan Bedard from comment #10) > (In reply to Daniel Bates from comment #9) > > (In reply to Jonathan Bedard from comment #8) > > > ... > > > > Why should I care about what devices were *not* used? > > The devices listed were valid targets for the specified tests and > configuration. Since it's pretty easy to lose track of which simulators you > have booted (I think it was Xcode 9 that simulators could be headlessly > booted), this lets users see that a device they booted is available and a > valid target, even if it was not used. Not useful. I know what simulators I booted. I guess I would appreciate this output more if I had alzheimer's :/ If that's the audience you're going for then success! This is a feature! Or if you're looking for affirmation that the code you wrote is working and that it didn't run the test twice then I can see the benefit of keeping this behavior and closing this bug. I'll be reminded of this each time I run a test and have multiple simulators open and then become confused as to why didn't run-webkit-tests not run the test in all open simulators. > > Actually, I think this bug demonstrates why such logging can be useful. In > this case, you expected the test to be run on an iPhone SE. I got this info just by reading the "Running 1 test for iPad (6th generation) running None" line. I didn't need to see "Running 0 tests for iPhone SE running None". > It was not, but > that device was both available and a valid target. Just from the log, we > know that 2 devices were booted and matched the configuration, that > information seems useful to me. I think there is a saying, something like "less is more". This applies to this bug.

Daniel Bates

Comment 12 2019-01-31 17:55:27 PST

Shout out to the "...running None" line.

Daniel Bates

Comment 13 2019-01-31 18:28:10 PST

(In reply to Daniel Bates from comment #11) > (In reply to Jonathan Bedard from comment #10) > > (In reply to Daniel Bates from comment #9) > > > (In reply to Jonathan Bedard from comment #8) > > > > ... > > > > > > Why should I care about what devices were *not* used? > > > > The devices listed were valid targets for the specified tests and > > configuration. Since it's pretty easy to lose track of which simulators you > > have booted (I think it was Xcode 9 that simulators could be headlessly > > booted), this lets users see that a device they booted is available and a > > valid target, even if it was not used. > > Not useful. I know what simulators I booted. I guess I would appreciate this > output more if I had alzheimer's :/ If that's the audience you're going for > then success! This is a feature! Or if you're looking for affirmation that > the code you wrote is working and that it didn't run the test twice then I > can see the benefit of keeping this behavior and closing this bug. I'll be > reminded of this each time I run a test and have multiple simulators open > and then become confused as to why didn't run-webkit-tests not run the test > in all open simulators. > > > > > Actually, I think this bug demonstrates why such logging can be useful. In > > this case, you expected the test to be run on an iPhone SE. > > I got this info just by reading the "Running 1 test for iPad (6th > generation) running None" line. I didn't need to see "Running 0 tests for > iPhone SE running None". > > > It was not, but > > that device was both available and a valid target. Just from the log, we > > know that 2 devices were booted and matched the configuration, that > > information seems useful to me. > > I think there is a saying, something like "less is more". This applies to > this bug. Reading this back, the tone sounds negative. I didn't mean to come across like that. I just find the behavior of this bug silly. I can understand if the verbosity is useful to other people. So, this isn't a regression per se as much as it is an intended behavior change and hence this bug represents an enhancement request. Updating bug title and fields to reflect this.

Daniel Bates

Comment 14 2019-01-31 18:31:26 PST

(In reply to Daniel Bates from comment #12) > Shout out to the "...running None" line. Filed bug #194133 for this.

Jonathan Bedard

Comment 15 2019-02-01 08:51:47 PST

I filed <https://bugs.webkit.org/show_bug.cgi?id=194158> because it seems that this bug is now entirely about logging concerns.

Summary:	run-webkit-tests emits line about number of tests run for each booted simulator regardless of whether any tests were run
Product:	WebKit	Reporter:	Daniel Bates <dbates>
Component:	Tools / Tests	Assignee:	Jonathan Bedard <jbedard>
Status:	NEW
Severity:	Enhancement	CC:	lforschler, webkit-bug-importer
Priority:	P2	Keywords:	InRadar
Version:	WebKit Local Build
Hardware:	iPhone / iPad
OS:	iOS 12
See Also:	https://bugs.webkit.org/show_bug.cgi?id=194133 https://bugs.webkit.org/show_bug.cgi?id=194158

Bug 193857