Bug 38457 - [Qt] Enable passing Sputnik tests
Summary: [Qt] Enable passing Sputnik tests
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests
Version: 528+ (Nightly build)
Hardware: All
OS: All
Importance: P3 Normal
Assignee: QtWebKit Unassigned
URL:
Keywords: Qt
Depends on:
Blocks:
 
Reported: 2010-05-03 05:56 PDT by Csaba Osztrogonác
Modified: 2010-05-06 08:50 PDT
CC: 5 users

See Also:


Attachments
enable tests (3.21 KB, patch)
2010-05-03 05:56 PDT, Csaba Osztrogonác
ap: review+

Description Csaba Osztrogonác 2010-05-03 05:56:14 PDT
Sputnik test cases (~5000 tests) landed 4 days ago:
https://bugs.webkit.org/show_bug.cgi?id=38296

I tested them on the Qt bot; they are sufficiently
stable: most of them pass, only ~50 tests fail.

Runtime of LayoutTests would increase by ~40 sec on the bot.
(Currently the runtime is ~240 sec.)
Comment 1 Csaba Osztrogonác 2010-05-03 05:56:47 PDT
Created attachment 54919 [details]
enable tests
Comment 2 Alexey Proskuryakov 2010-05-03 08:15:56 PDT
Comment on attachment 54919 [details]
enable tests

rs=me

+# Failing Sputnik tests

You could also consider landing platform-specific results. Of course, these aren't all the tests that fail; we have many expected cross-platform failures checked in.
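For context, skipping tests on Qt is done via entries in the platform Skipped list, under the "# Failing Sputnik tests" section the patch adds. A hypothetical excerpt (the test paths are illustrative only, not taken from the actual patch):

```
# Failing Sputnik tests
fast/js/sputnik/example-failing-test-1.html
fast/js/sputnik/example-failing-test-2.html
```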
Comment 3 Csaba Osztrogonác 2010-05-05 23:18:24 PDT
(In reply to comment #2)
> (From update of attachment 54919 [details])
> rs=me
Thx, landed: http://trac.webkit.org/changeset/58863
 
> You could also consider landing platform specific results. Of course, these
> aren't all the tests that fail, we have many expected cross-platform failures
> checked in.

In general we don't land Qt-specific expected results for failures;
we prefer putting these tests into the skipped list and filing a bug.

I know disabling a complex test can be worse than checking in a couple of failures (e.g. FAIL instead of PASS) into the expected file. But I don't see yet how we could track which expected files are absolutely correct and which contain failures. We will think about how to do it.

I cc'd Simon, Tor Arne, Kenneth and András too. Guys, I think we
should discuss a policy on whether or not to check in failures sometime, somewhere.
Comment 4 Andras Becsi 2010-05-06 03:16:58 PDT
If I am correct, these tests are dumpAsText() tests, so my opinion is that we could check in results for tests which have any PASS hunks in their output. If someone then fixes an issue, he or she can easily decide whether the resulting test failure is due to wrong expected results or to a regression.
In general I think this kind of policy would be a step forward, but I am categorically against checking in wrong DRT tree dumps, because deciding afterwards whether a dump is correct is really hard.
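A minimal sketch of the policy proposed here, as a hypothetical Python filter (the heuristics, e.g. the "layer at"/"RenderView" prefixes used to spot render tree dumps, are assumptions for illustration, not WebKit tooling):

```python
def ok_to_check_in(expected_text: str) -> bool:
    """Hypothetical filter for the proposed policy: accept a
    dumpAsText-style expected result only if it contains at least one
    PASS hunk, and never accept a render tree dump (whose correctness
    is hard to judge after the fact)."""
    lines = expected_text.splitlines()
    # Render tree dumps start with lines like "layer at (0,0) ..." or
    # "RenderView ..." rather than PASS/FAIL test output.
    looks_like_tree_dump = any(
        line.lstrip().startswith(("layer at", "RenderView"))
        for line in lines
    )
    has_pass_hunk = any(line.startswith("PASS") for line in lines)
    return has_pass_hunk and not looks_like_tree_dump
```

Under this rule, a result consisting only of FAIL lines, or any render tree dump, would go to the skipped list instead of being checked in.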
 
Comment 5 Tor Arne Vestbø 2010-05-06 03:29:10 PDT
(In reply to comment #4)
I agree: for dumpAsText output it makes sense. For render tree output we prefer to skip tests for now.
Comment 6 Alexey Proskuryakov 2010-05-06 08:50:30 PDT
Sputnik tests are weird: they silently run subtests until the first failure, at which point they bail out and print an error message. PASS is printed only if there were no (unexpected) errors during the run.
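The bail-on-first-failure behavior described here can be modeled with a small Python sketch (purely illustrative; the real harness is JavaScript embedded in the test files, and the names below are invented):

```python
# Hypothetical model of the behavior described above: subtests run
# silently, the first failure aborts the run with an error message,
# and PASS appears only for a clean run.

def run_sputnik_style(subtests):
    """subtests: list of (name, callable); callables raise on failure."""
    for name, subtest in subtests:
        try:
            subtest()                      # silent when the subtest passes
        except AssertionError as err:
            return f"FAIL {name}: {err}"   # bail out at the first failure
    return "PASS"                          # reported only for a clean run

def good():
    pass

def bad():
    raise AssertionError("unexpected value")

print(run_sputnik_style([("s1", good), ("s2", good)]))               # PASS
print(run_sputnik_style([("s1", good), ("s2", bad), ("s3", good)]))  # FAIL s2: unexpected value
```

Note that s3 never runs in the second call, which is why a single regression can hide later subtest failures in the same file.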