Bug 133149 - REGRESSION(r169092 and r169102): Skip failing JSC tests poperly on non-x86 Darwin platforms
Summary: REGRESSION(r169092 and r169102): Skip failing JSC tests poperly on non-x86 Da...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P1 Critical
Assignee: Éva Balázsfalvi
URL:
Keywords:
Depends on:
Blocks: 133090
  Show dependency treegraph
 
Reported: 2014-05-21 01:51 PDT by Csaba Osztrogonác
Modified: 2014-06-03 04:41 PDT (History)
8 users (show)

See Also:


Attachments
Patch (3.64 KB, patch)
2014-05-21 06:56 PDT, Éva Balázsfalvi
no flags Details | Formatted Diff | Diff
Patch (5.22 KB, patch)
2014-05-22 04:20 PDT, Éva Balázsfalvi
no flags Details | Formatted Diff | Diff
Patch (5.63 KB, patch)
2014-05-28 01:33 PDT, Éva Balázsfalvi
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from webkit-ews-16 for mac-mountainlion-wk2 (518.89 KB, application/zip)
2014-05-28 02:39 PDT, Build Bot
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Csaba Osztrogonác 2014-05-21 01:51:20 PDT
https://trac.webkit.org/changeset/169092 and https://trac.webkit.org/changeset/169102
skipped many JSC tests everywhere excpet x86. (in a very-very strange way ...)

All the skipped tests pass on ARMv7 Traditional (ARM instruction set) 
and ARMv7 Thumb2, we don't want to loose test coverage on these platforms.

"$architecture != /x86/" doesn't mean ARM64, it should be fixed.
Comment 1 Carlos Alberto Lopez Perez 2014-05-21 03:57:43 PDT
(In reply to comment #0)
> https://trac.webkit.org/changeset/169092 and https://trac.webkit.org/changeset/169102
> skipped many JSC tests everywhere excpet x86. (in a very-very strange way ...)
> 
> All the skipped tests pass on ARMv7 Traditional (ARM instruction set) 
> and ARMv7 Thumb2, we don't want to loose test coverage on these platforms.
> 
> "$architecture != /x86/" doesn't mean ARM64, it should be fixed.

Indeed.

"uname -m" is a better option and works also on Mac.

I tested what values returns on some platforms:

* On AMD64 it returns "x86_64" both on Linux and Mac
* On x86 it returns "i686" on Linux, but I think it can also return "i386", "i486" or "i586".
* On ARM it returns "armv7l" on Linux, but I think it can also return "arm", "armv6" and "armv6t2"
* On ARM64 maybe returns "aarch64" on Linux (not sure)

Here are some more possible strings https://en.wikipedia.org/wiki/Uname#Examples
Comment 2 Csaba Osztrogonác 2014-05-21 06:41:52 PDT
Additionally these tests are skipped on non Mac x86 too, because there 
isn't otool, so $architecture is always nil on non Mac platforms.
Comment 3 Éva Balázsfalvi 2014-05-21 06:56:57 PDT
Created attachment 231825 [details]
Patch
Comment 4 Csaba Osztrogonác 2014-05-21 07:26:19 PDT
Comment on attachment 231825 [details]
Patch

LGTM, r=me. 

Now the $architerture is always nil on non Mac, because
machO magic is Mac only, on Linux we use ELF file format.

Maybe we should implement a similar thing for ELF too.
But it's not a problem now, because there are only skipped
tests on ARM64, and there aren't any ARM64 hardware to run
tests except the newest iPhone. :)
Comment 5 WebKit Commit Bot 2014-05-21 07:57:22 PDT
Comment on attachment 231825 [details]
Patch

Clearing flags on attachment: 231825

Committed r169159: <http://trac.webkit.org/changeset/169159>
Comment 6 WebKit Commit Bot 2014-05-21 07:57:26 PDT
All reviewed patches have been landed.  Closing bug.
Comment 7 Filip Pizlo 2014-05-21 09:27:14 PDT
It's ridiculous that you guys did this without allowing anyone who works on JavaScriptCore to chime in.

The point of this patch was to skip on not-X86 because we believe that these tests don't work on anything that isn't X86.  The profiler tests don't work in remote testing mode but the only way we have of testing this on Darwin right now is "you're not X86".  The regress tests - we're not sure about that one but we see failures on anything that's not X86.

As it stands, you've made our ARMv7 internal testing red.

Please don't make random changes to test gardening without giving time for feedback.
Comment 8 Filip Pizlo 2014-05-21 09:28:14 PDT
(In reply to comment #1)
> (In reply to comment #0)
> > https://trac.webkit.org/changeset/169092 and https://trac.webkit.org/changeset/169102
> > skipped many JSC tests everywhere excpet x86. (in a very-very strange way ...)
> > 
> > All the skipped tests pass on ARMv7 Traditional (ARM instruction set) 
> > and ARMv7 Thumb2, we don't want to loose test coverage on these platforms.
> > 
> > "$architecture != /x86/" doesn't mean ARM64, it should be fixed.
> 
> Indeed.
> 
> "uname -m" is a better option and works also on Mac.
> 
> I tested what values returns on some platforms:
> 
> * On AMD64 it returns "x86_64" both on Linux and Mac
> * On x86 it returns "i686" on Linux, but I think it can also return "i386", "i486" or "i586".
> * On ARM it returns "armv7l" on Linux, but I think it can also return "arm", "armv6" and "armv6t2"
> * On ARM64 maybe returns "aarch64" on Linux (not sure)
> 
> Here are some more possible strings https://en.wikipedia.org/wiki/Uname#Examples

No.  uname won't work because that won't support cross-testing where run-jsc-stress-tests creates a shellscript testing payload that later gets transferred to another machine (in this case an ARM device).
Comment 9 Filip Pizlo 2014-05-21 09:34:44 PDT
Rolled out in http://trac.webkit.org/changeset/169164.  I don't think that this bug and its description are valid.
Comment 10 Filip Pizlo 2014-05-21 09:42:17 PDT
My previous gardening of JSC stress tests was designed to green all of our embedded testing.  This change made it red again on everything except ARM64.

The issues here are:

- Some tests, like the profiler tests, cannot run on iOS.  I think I previously said that it's got to do with remote mode; that's wrong.  It's just that iOS can't support them.  Using uname is incorrect because run-jsc-stress-tests is running on a Mac that is driving the tests on an iOS box.  "Not x86" is the current best way of detecting this.

- Other tests are known to fail on slower processors.  "Not x86" is almost certainly the best way of detecting this.

If you want to investigate whether those tests pass on Linux on not-X86 and then enable them, then go ahead.  But please don't break the gardening that we did for Darwin in the process.  That is not acceptable behavior.

In particular, I think it's always acceptable for a committer to determine that a test should be skipped on a conservatively too big set of platforms.  But it's not acceptable to do the opposite: unskip a test on platforms you didn't test.
Comment 11 Csaba Osztrogonác 2014-05-22 01:44:33 PDT
(In reply to comment #7)
> It's ridiculous that you guys did this without allowing anyone who works on JavaScriptCore to chime in.

Calm down man! Is there a new rule about we should wait 
for your agreement for each patch? I don't know about it.

"Unreviewed, roll out ​http://trac.webkit.org/changeset/169159.
This was a unilateral change and wasn't properly reviewed."

I feel being insulted in this commit log. I protest against this harsh
and offended tone. Please be a little bit friendlier and more cooperative.

> The point of this patch was to skip on not-X86 because we believe that these tests don't work on anything that isn't X86.  The profiler tests don't work in remote testing mode but the only way we have of testing this on Darwin right now is "you're not X86".  The regress tests - we're not sure about that one but we see failures on anything that's not X86.

I can understand your point of view now, but the title of r169092 was very
misleading: "Take care of some ARM64 test failures" This sentence means for
me there are failure on ARM64 and not everywhere except x86. Additionally all
of the tests you skipped pass on ARMv7 Linux with ARM and Thumb2 instruction
set too as I mentioned previously. And r169092 skipped these tests on x86
Linux too, because otool is Mac only, and $architecture is nil on Linux now.
So we still _don't_ want to skip these _passing_ tests on x86/ARM Linux.

> As it stands, you've made our ARMv7 internal testing red.
As you mentioned now, this testing is Apple internal. We didn't break it
intentionally of course. But please don't expect us not to break a thing 
which we don't know about it, because it isn't publicly available at all.
 
> Please don't make random changes to test gardening without giving time for feedback.

But please, it wasn't random change at all. It seemed a logical 
and obviously proper change in the context what we see.

On the other hand, you broke the whole _public_ JSC Linux test infrastructure
with r169092 without caring about it at all. I could have considered it as
"random change" and I could have rolled it out and made harsh comments as
you did. But I simply considered it as a minor mistake what can be fixed
easily without wrangling and Éva fixed it quickly in r169156.

(In reply to comment #9)
> Rolled out in http://trac.webkit.org/changeset/169164.  I don't think that this bug and its description are valid.

The bug is still valid, I changed the title based on your feedback and reopend the bug.

(In reply to comment #10)
> The issues here are:
> 
> - Some tests, like the profiler tests, cannot run on iOS.  I think I previously said that it's got to do with remote mode; that's wrong.  It's just that iOS can't support them.  Using uname is incorrect because run-jsc-stress-tests is running on a Mac that is driving the tests on an iOS box.  "Not x86" is the current best way of detecting this.

Ah, in this case I don't think if the "not x86" is 
the proper condition, but "not x86 && os == darwin"
 
> - Other tests are known to fail on slower processors.  "Not x86" is almost certainly the best way of detecting this.

All of them work fine and pass on our ARM boards, so I still 
think if "not x86 && os == darwin" would be a better choice.
 
> If you want to investigate whether those tests pass on Linux on not-X86 and then enable them, then go ahead.  But please don't break the gardening that we did for Darwin in the process.  That is not acceptable behavior.

These tests pass and we are going to unskip them on x86 Linux and ARM Linux
too with paying more attention not to break your top secret internal testers.
 
> In particular, I think it's always acceptable for a committer to determine that a test should be skipped on a conservatively too big set of platforms.  But it's not acceptable to do the opposite: unskip a test on platforms you didn't test.

I don't agree with you at all. Passing tests shouldn't be skipped by anybody
on any platform. And failing tests should be skipped only on the platform,
where they fail, not everywhere. There are a sophisticated infrastructure
to be able to do it for LayoutTests, but not for JSC tests now. There isn't
a good reason to skip tests everywhere. For example your CStack branch merge
broke many JSC tests on x86 Linux (Bug12790) and ARM Thumb2 Linux (Bug132740)
too, but I don't think if it would be acceptable to skip the ~180 failing
tests on iPhone testers too, because there isn't a good way to skip them
only on Linux.

I can accept skipping the passing tests on Linux temporarily, but the
proper fix would be to find how to skip them only on non-x86 darwin.

Éva, could you possibly update your patch to make Linux and Darwin 
users happy too?
Comment 12 Csaba Osztrogonác 2014-05-22 01:59:31 PDT
(In reply to comment #11)
> broke many JSC tests on x86 Linux (Bug12790) and ARM Thumb2 Linux (Bug132740)
Typo fix: Bug127902
Comment 13 Éva Balázsfalvi 2014-05-22 04:20:04 PDT
Created attachment 231875 [details]
Patch

I added 'determineOS' function to run-jsc-stress-tests to be able determine if the
script runs on Darwin, Linux or Windows. Also  modified the conditions in the tests
to skip them only on (!x86 && Darwin)
Comment 14 Csaba Osztrogonác 2014-05-22 04:57:20 PDT
(In reply to comment #13)
> Created an attachment (id=231875) [details]
> Patch
> 
> I added 'determineOS' function to run-jsc-stress-tests to be able determine if the
> script runs on Darwin, Linux or Windows. Also  modified the conditions in the tests
> to skip them only on (!x86 && Darwin)

LGTM, I think it will be good for Darwin and Linux too.

But I let Filip to do the final review, we don't want
to break the internal iPhone Air bots accidentally. ;)
Comment 15 Csaba Osztrogonác 2014-05-23 00:36:03 PDT
Filip, any objection against this new fix?
Comment 16 Filip Pizlo 2014-05-23 08:45:51 PDT
Comment on attachment 231875 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=231875&action=review

It looks good but needs some stylistic changes.

> LayoutTests/js/script-tests/function-apply-many-args.js:1
> -//@ skip if $architecture !~ /x86/i
> +//@ skip if $architecture !~ /x86/i && $host_os == "darwin"

small nit: we typically say "and" instead of "&&" in our ruby code.

> PerformanceTests/SunSpider/profiler-test.yaml:32
> -      if $architecture =~ /x86/
> -          runProfiler
> -      else
> +      if $architecture !~ /x86/i && $host_os == "darwin"
>            skip
> +      else
> +          runProfiler

ditto

> Source/JavaScriptCore/tests/mozilla/mozilla-tests.yaml:2119
> -      if $architecture =~ /x86/i
> -          defaultRunMozillaTest :normal, "../shell.js"
> -      else
> +      if $architecture !~ /x86/i && $host_os == "darwin"
>            skip
> +      else
> +          defaultRunMozillaTest :normal, "../shell.js"

ditto

> Tools/Scripts/run-jsc-stress-tests:231
> +def determineOS
> +    case RbConfig::CONFIG["host_os"]
> +    when /darwin/i
> +        "darwin"
> +    when /linux/i
> +        "linux"
> +    when /mswin|mingw|cygwin/
> +        "windows"
> +    else
> +        $stderr.puts "Warning: unable to determine host operating system"
> +        nil
> +    end
> +end
> +
>  $architecture = determineArchitecture
> +$host_os = determineOS

We usually use camelCase rather than under_scores in our Ruby code.  Please call this $hostOS.
Comment 17 Éva Balázsfalvi 2014-05-28 01:33:37 PDT
Created attachment 232179 [details]
Patch

Thank you for the review, changed '&&' to 'and' and '' to '' everywhere.
Comment 18 Éva Balázsfalvi 2014-05-28 01:40:30 PDT
My comment on the patch turned out to be incomplete. What I meant is that I changed "&&" to "and" and "$host_os" to "$hostOS" everywhere.
Comment 19 Build Bot 2014-05-28 02:39:25 PDT
Comment on attachment 232179 [details]
Patch

Attachment 232179 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.appspot.com/results/6649064590409728

New failing tests:
media/W3C/video/networkState/networkState_during_loadstart.html
Comment 20 Build Bot 2014-05-28 02:39:28 PDT
Created attachment 232181 [details]
Archive of layout-test-results from webkit-ews-16 for mac-mountainlion-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: webkit-ews-16  Port: mac-mountainlion-wk2  Platform: Mac OS X 10.8.5
Comment 21 Éva Balázsfalvi 2014-05-29 01:01:07 PDT
Comment on attachment 232179 [details]
Patch

It seems the Mac WK2 EWS is flakey, this patch can't break any test, so cq? again.
Comment 22 Csaba Osztrogonác 2014-06-03 04:10:54 PDT
Comment on attachment 232179 [details]
Patch

All comments addressed, so r=me.
Comment 23 WebKit Commit Bot 2014-06-03 04:41:04 PDT
Comment on attachment 232179 [details]
Patch

Clearing flags on attachment: 232179

Committed r169559: <http://trac.webkit.org/changeset/169559>
Comment 24 WebKit Commit Bot 2014-06-03 04:41:09 PDT
All reviewed patches have been landed.  Closing bug.