NEW 20553
[Meta] Import Philip Taylor's HTML5 <canvas> test suite and pass it all
https://bugs.webkit.org/show_bug.cgi?id=20553
Eric Seidel (no email)
Reported 2008-08-27 17:27:31 PDT
Import Philip Taylor's HTML5 <canvas> test suite and pass it all. :) We really should add http://philip.html5.org/tests/canvas/suite/tests to our layout tests to cover more <canvas> cases.
Attachments
fix patch (deleted)
2010-05-10 13:22 PDT, Chang Shu
no flags
fix patch 2 (deleted)
2010-05-14 12:56 PDT, Chang Shu
oliver: review-
fix patch 3 (deleted)
2010-05-14 13:49 PDT, Chang Shu
no flags
Add expected results (448.74 KB, patch)
2010-05-24 08:06 PDT, Chang Shu
no flags
Add expected results, updated ChangeLog (389.63 KB, patch)
2010-05-25 06:41 PDT, Chang Shu
no flags
enable qt tests (9.10 KB, patch)
2010-05-28 07:20 PDT, Chang Shu
no flags
use failed results (77.56 KB, patch)
2010-05-28 14:17 PDT, Chang Shu
zimmermann: review-
Chang Shu
Comment 1 2010-05-06 06:33:53 PDT
Here are some statistics:
Total test cases: 789
Failed on both Safari and Firefox: 77
Failed on Safari but passed on Firefox: 39
Chang Shu
Comment 2 2010-05-06 11:30:01 PDT
I have the test cases ready for check-in. However, the patch I am about to submit is 12 MB. Would this be OK?
Oliver Hunt
Comment 3 2010-05-06 11:32:03 PDT
The tests should not be 12 MB -- more or less all of Philip's tests should be dumpAsText. There is no reason for them to be pixel tests, and so no reason for them to be anything like 12 MB in size.
Laszlo Gombos
Comment 4 2010-05-06 11:44:34 PDT
Are there any licensing issues around the test content?
Chang Shu
Comment 5 2010-05-06 11:45:43 PDT
(In reply to comment #3)
> The tests should not be 12 MB -- more or less all of Philip's tests should be
> dumpAsText. There is no reason for them to be pixel tests, and so no reason
> for them to be anything like 12 MB in size.

I double-checked the test suite: there are about 192 PNG files in the package, and the unzipped files total about 18 MB.
Oliver Hunt
Comment 6 2010-05-06 11:56:58 PDT
You should talk to Philip on IRC and get him to clarify the license. I would go through the images and just unify all the identical ones -- I suspect the suite generates many identical images.
Philip Taylor
Comment 7 2010-05-06 12:53:42 PDT
The test code and data is currently all MIT-licensed (I think it says that in various places in the source). Most of the PNGs are demonstrating the approximate expected output, for a human to interpret (they can't be exact matches because of antialiasing, rounding, etc), so I assume they're not useful for automated tests. (Most of the test cases also run scripts that determine pass/fail, which might miss some rendering bugs (since they only test a few output pixels) but are probably much easier to hook into whatever test harness there is.)
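To make Philip's point concrete, here is a minimal sketch (not taken from the suite; the drawing, coordinates, and expected values are hypothetical) of the script-based checking style he describes -- draw, then read back a few pixels with getImageData rather than comparing a full rendered image:

    // Hypothetical test in the style of the suite's script-based checks.
    var canvas = document.createElement('canvas');
    canvas.width = 100;
    canvas.height = 50;
    var ctx = canvas.getContext('2d');

    // Draw something with a known color.
    ctx.fillStyle = '#0f0';
    ctx.fillRect(0, 0, 100, 50);

    // Sample one pixel near the middle; a real test checks several.
    var px = ctx.getImageData(50, 25, 1, 1).data;
    var ok = px[0] === 0 && px[1] === 255 && px[2] === 0 && px[3] === 255;
    document.body.appendChild(document.createTextNode(ok ? 'Passed' : 'Failed'));

This catches wrong colors at the sampled points but, as Philip notes, can miss rendering bugs elsewhere on the canvas.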
Chang Shu
Comment 8 2010-05-07 09:03:25 PDT
Actually, the PNG files only contribute about 1 MB of the total size. I did remove a couple of big files, though:
current-work
tests/results.html
tests/spec.html
The patch file is now about 6.4 MB. The total size on the file system should reach 12 MB, because there are 2546 HTML files under the tests folder and each of them is several KB.
Oliver Hunt
Comment 9 2010-05-07 12:03:42 PDT
(In reply to comment #8)
> Actually, the PNG files only contribute about 1 MB of the total size. [...]
> The patch file is now about 6.4 MB. The total size on the file system should
> reach 12 MB, because there are 2546 HTML files under the tests folder and
> each of them is several KB.

Are all the tests set to use text-only results?
Chang Shu
Comment 10 2010-05-07 12:12:20 PDT
(In reply to comment #9)
> (In reply to comment #8)
> > Actually, the PNG files only contribute about 1 MB of the total size. [...]
> Are all the tests set to use text-only results?

Right, no image comparison is used to determine whether a test passes or fails. Instead, a test may check certain pixels, as some of the current layout tests do. Correct me if I'm wrong, Philip.
Chang Shu
Comment 11 2010-05-10 13:22:56 PDT
Created attachment 55597 [details] fix patch
Chang Shu
Comment 12 2010-05-12 07:22:10 PDT
If people feel the complete test suite is too big, an alternative is to check in the test cases one by one, on an as-needed basis.
Chang Shu
Comment 13 2010-05-14 12:56:37 PDT
Created attachment 56097 [details]
fix patch 2

I removed many HTML files that are only used for manual testing. This reduces the total size to 2.2 MB. Hopefully, this is good to go.
Dimitri Glazkov (Google)
Comment 14 2010-05-14 13:01:43 PDT
Don't forget to skip on Chromium, too. It's a slightly different syntax, but fairly intuitive: trac.webkit.org/browser/trunk/LayoutTests/chromium/test_expectations.txt

Also, these tests look fairly trivial to make text-only, i.e. by setting layoutTestController.dumpAsText(). Can you do that in tests.js?
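For reference, the text-only hook Dimitri is suggesting would look something like this in the suite's shared harness (tests.js is the file name from the suite; the exact placement inside it is an assumption):

    // In the shared harness (tests.js): when running under DumpRenderTree,
    // request a text-only dump so no pixel results are generated or compared.
    if (window.layoutTestController)
        layoutTestController.dumpAsText();

The window.layoutTestController guard keeps the tests working in a normal browser, where that object does not exist.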
Oliver Hunt
Comment 15 2010-05-14 13:03:46 PDT
(In reply to comment #14)
> Don't forget to skip on Chromium, too. [...]
> Also, these tests look fairly trivial to make text-only, i.e. by setting
> layoutTestController.dumpAsText(). Can you do that in tests.js?

I explicitly requested earlier that these not be pixel tests. Why are they still doing pixel dumps?
Oliver Hunt
Comment 16 2010-05-14 13:05:38 PDT
Comment on attachment 56097 [details]
fix patch 2

r- due to using pixel tests.
Chang Shu
Comment 17 2010-05-14 13:25:19 PDT
Thanks for the comments, guys. Is this the right syntax for the Chromium skip file?

// New canvas test suite from http://philip.html5.org/tests/canvas/suite/tests/
BUG20553 SKIP : fast/canvas/philip = TEXT
Chang Shu
Comment 18 2010-05-14 13:49:27 PDT
Created attachment 56106 [details]
fix patch 3

Based on the comments:
1. Dump text results only
2. Skip tests on Chromium
Chang Shu
Comment 19 2010-05-18 06:59:53 PDT
Can anybody review my patch? :)
Kent Hansen
Comment 20 2010-05-19 07:22:12 PDT
Cool!
What's stopping you from adding expected test results straight away, though? I'd just run run-webkit-tests --new-test-results and modify all the failing results to say "Passed". That should be a good starting point for creating the cross-platform and platform-specific skip lists for the individual tests.
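For concreteness, a hand-edited result file of the kind Kent describes would be an -expected.txt whose body simply claims success. Judging from comment 32 later in this bug, the files carry an "Expected output:" header; the exact contents below are illustrative, not copied from the suite:

    Expected output:

    Passed

A test that currently fails would then show up in run-webkit-tests as a diff against this file until the underlying bug is fixed.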
Chang Shu
Comment 21 2010-05-19 07:33:20 PDT
(In reply to comment #20)
> Cool!
> What's stopping you from adding expected test results straight away, though?
> I'd just run run-webkit-tests --new-test-results and modify all the failing
> results to say "Passed". [...]

My plan is to check in the test cases as the first step. The next step would be to generate expected results on Mac and update the skip list with individual tests (not the whole directory, as it is right now). Then move on to other platforms, like Qt, which I can take care of.
Eric Seidel (no email)
Comment 22 2010-05-21 12:46:10 PDT
Comment on attachment 56106 [details]
fix patch 3

I like new tests. Please add results for the various platforms in follow-up commits. If you need help from various port maintainers, ask on webkit-dev.
Chang Shu
Comment 23 2010-05-21 13:51:10 PDT
Comment on attachment 56106 [details]
fix patch 3

Committed manually, r59954. Cleaned up flags on attachment 56106 [details]. Will work on expected results.
James Robinson
Comment 24 2010-05-21 14:11:21 PDT
Can you move these from fast/canvas to canvas/? Also, do you know whether the results will (in general) be platform-specific? Without looking in detail, I would expect the dumpAsText results to mostly be the same. If not, that would probably be a problem.
Chang Shu
Comment 25 2010-05-21 14:18:16 PDT
(In reply to comment #24)
> Can you move these from fast/canvas to canvas/? [...]

That's a good point (on moving tests to canvas/); will do. Yes, I would also expect most of the expected results to be the same. Thanks!
Chang Shu
Comment 26 2010-05-24 07:56:22 PDT
I moved the test cases manually to LayoutTests/canvas. Unreviewed, r60072. I didn't generate a patch for this because it would be big, with all the HTML content marked as "-". Hopefully, that's acceptable. :)
Chang Shu
Comment 27 2010-05-24 08:06:12 PDT
Created attachment 56888 [details] Add expected results
James Robinson
Comment 28 2010-05-24 14:30:47 PDT
You should not skip tests just because they fail; only skip tests if they will cause problems running the rest of the suite. Tests that we currently fail should have the failures reflected in their -expected.txt results, and have bugs on file somewhere in bugs.webkit.org so that they can get triaged and fixed as needed. Several tests have output that says "Cannot automatically verify result" or something similar. Is there much value in running these anyway?
James Robinson
Comment 29 2010-05-24 14:58:48 PDT
We chatted a bit on IRC. I think this patch is fine to land if it just removes canvas/philip from platform/mac/Skipped altogether. The test failures are already encapsulated in the -expected.txt files, and since other ports still have this on their own Skipped lists, they should not turn red.
James Robinson
Comment 30 2010-05-24 15:45:16 PDT
Actually, it looks like the -expected.txt files here are not really the output from DRT but something else. That is definitely not right: the test expectations should always be the actual output currently produced, not something else. I don't think this patch should land until the expectations match what actually happens on at least one port.
Chang Shu
Comment 31 2010-05-24 19:24:24 PDT
(In reply to comment #30)
> Actually, it looks like the -expected.txt files here are not really the
> output from DRT but something else. [...]

All the -expected.txt files for succeeding cases are actual output from DRT. The ones that fail on Mac were modified manually to "passed". This was suggested by Kent in comment #20, which makes a lot of sense to me. I don't think the expected results should match the actual output on any platform unless all the test cases pass, which is not happening yet.
Kent Hansen
Comment 32 2010-05-25 02:55:49 PDT
(In reply to comment #31)
> (In reply to comment #30)
> > Actually, it looks like the -expected.txt files here are not really the
> > output from DRT but something else. [...]
> All the -expected.txt files for succeeding cases are actual output from DRT.
> The ones that fail on Mac were modified manually to "passed". [...]

Right. The skipped tests would be removed from the Skipped list as each issue is fixed, without having to modify the test data.

The expected results look good, except for a couple of the hand-edited ones that seem to be missing the "Expected output:\n\n" part, making the actual-vs-expected diff larger than necessary when running the tests.

When running the tests on Qt/Linux there are quite a few more failures than on Safari/Mac, so it makes sense to maintain per-platform Skipped lists for now. But once all those are in place, it's probably a good idea to go through the failures and try to sort out which ones are really platform-specific.
Chang Shu
Comment 33 2010-05-25 03:02:17 PDT
(In reply to comment #32)
> The expected results look good, except for a couple of the hand-edited ones
> that seem to be missing the "Expected output:\n\n" part, making the
> actual-vs-expected diff larger than necessary when running the tests.

Thanks, I noticed that too. My local workspace has them fixed. I will commit with the right text after r+.
Chang Shu
Comment 34 2010-05-25 06:41:10 PDT
Created attachment 57013 [details] Add expected results, updated ChangeLog
Kenneth Rohde Christiansen
Comment 35 2010-05-25 06:57:12 PDT
Comment on attachment 57013 [details]
Add expected results, updated ChangeLog

OK, looks good to me now. Way better ChangeLog.
Chang Shu
Comment 36 2010-05-25 07:33:43 PDT
Patch 57013 committed manually, r60162. Flags cleared.
WebKit Review Bot
Comment 37 2010-05-25 08:05:12 PDT
http://trac.webkit.org/changeset/60162 might have broken SnowLeopard Intel Release (Tests)
Chang Shu
Comment 38 2010-05-25 08:41:12 PDT
(In reply to comment #37)
> http://trac.webkit.org/changeset/60162 might have broken SnowLeopard Intel Release (Tests)

I added two additional cases that fail only on SnowLeopard to the Skipped list. The change was committed manually, r60166.
Adam Barth
Comment 39 2010-05-25 09:11:03 PDT
Looks like your change also caused problems on Tiger.
Antonio Gomes
Comment 40 2010-05-25 09:15:10 PDT
(In reply to comment #39)
> Looks like your change also caused problems on Tiger.

See bug 39677; it skipped the failing tests on Tiger.
Adam Roben (:aroben)
Comment 41 2010-05-25 09:16:36 PDT
(In reply to comment #32)
> (In reply to comment #31)
> > All the -expected.txt files for succeeding cases are actual output from
> > DRT. The ones that fail on Mac were modified manually to "passed". [...]
> Right. The skipped tests would be removed from the Skipped list as each
> issue is fixed, without having to modify the test data.

This isn't how we've traditionally done things in the WebKit project. The -expected files should reflect what actually happens today when you run the test, not what we wish happened. We should run all the tests on all platforms, rather than skipping them, and update the expected results as we fix bugs. One benefit of doing things this way is that you'll notice when you fix a bug in one of these tests, because the test will show up as not matching the expected results (whereas if it were on the Skipped list, you might not notice you had fixed the test at all!). Conversely, if you make a change that makes one of the already-failing tests even worse, you will notice that, too.

I recommend that we check in results for all the tests that match what happens in WebKit today, and remove them from the Skipped list.
Kenneth Rohde Christiansen
Comment 42 2010-05-25 09:19:37 PDT
So we land expected failures and file bug reports for them? Sounds good to me.
Chang Shu
Comment 43 2010-05-25 10:10:43 PDT
As long as we are on the same page, I don't mind either way. I submitted "failed" expected results before, but at that time some reviewers told me not to. So I guess this is still confusing to many contributors.
Kent Hansen
Comment 44 2010-05-25 10:26:22 PDT
(In reply to comment #41)
> (In reply to comment #32)
> > Right. The skipped tests would be removed from the Skipped list as each
> > issue is fixed, without having to modify the test data.
> This isn't how we've traditionally done things in the WebKit project. The
> -expected files should reflect what actually happens today when you run the
> test, not what we wish happened. [...]
> I recommend that we check in results for all the tests that match what
> happens in WebKit today, and remove them from the Skipped list.

I see. It would be nice if there were a way to express expected failures, rather than checking in incorrect results. (I thought the purpose of adding this test suite was to test conformance, not current behavior, but maybe that's not the case. Certainly an interesting point. :) )

In Qt, if a test is flagged as expected to fail, the bot stays green, but the expected failure is clearly visible in the test output (so it's not easily forgotten). If a test that's expected to fail suddenly starts to pass, the bot reports an "unexpected pass" and goes red. Such failures are readily distinguishable from "normal" test failures.
Chang Shu
Comment 45 2010-05-25 10:42:13 PDT
One benefit of having all the failing test cases in the Skipped list is that it is easy to read. Searching for failing test cases in the bug list doesn't seem to be as clear as reading the Skipped file.
Adam Roben (:aroben)
Comment 46 2010-05-25 10:46:25 PDT
(In reply to comment #44)
> I see. It would be nice if there were a way to express expected failures,
> rather than checking in incorrect results.

new-run-webkit-tests has this ability. Hopefully we'll be able to switch over soon.

> (I thought the purpose of adding this test suite was to test conformance,
> not current behavior, but maybe that's not the case. Certainly an
> interesting point. :) )

All of the tests in LayoutTests help us catch regressions. Many of them also help us test conformance to some specification(s). But the intent is that run-webkit-tests should report no failures on all platforms at all times. Any failure it reports should be either fixed immediately or put into a bug report. (If run-webkit-tests consistently reported failures on some platforms, it would be very hard to catch newly introduced regressions.)
Adam Roben (:aroben)
Comment 47 2010-05-25 10:47:43 PDT
(In reply to comment #45)
> One benefit of having all the failing test cases in the Skipped list is that
> it is easy to read. [...]

That is true. But I think that benefit is outweighed by the disadvantages I listed in comment 41. And it doesn't match our current policy. (The policy could of course be changed, but that would only happen after discussion with the rest of the WebKit community.)
Chang Shu
Comment 48 2010-05-25 10:54:08 PDT
A practical question: do we put failed expectations in a platform-specific folder? Let's say a test case fails on Mac and I originally put its result in the common folder. Then I run it on Qt and it passes. Should I put the passing result in platform/qt, or put it in the common folder and move the "failed" one to platform/mac?
Adam Roben (:aroben)
Comment 49 2010-05-25 11:06:39 PDT
(In reply to comment #48)
> A practical question: do we put failed expectations in a platform-specific
> folder? [...]

I guess the answer depends on what other platforms do. Do the tests also fail on Windows? GTK+? Chromium?

If the tests pass on more than one platform, then it seems likely that the failures are due to incorrect platform-specific code. In that case, it makes sense to me to put the failure results in the platform-specific directories. If the tests only pass on one platform, then it seems likely that the failures are due to incorrect cross-platform code. In that case, it makes sense to me to put the failure results in the cross-platform directory. (This is just a rule of thumb.)
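As a sketch of how that rule of thumb plays out on disk (the test name here is hypothetical): run-webkit-tests falls back from the port-specific directory to the cross-platform one, so a platform result shadows the common result for that port only.

    LayoutTests/canvas/philip/tests/2d.example-expected.txt               <- cross-platform result
    LayoutTests/platform/mac/canvas/philip/tests/2d.example-expected.txt  <- Mac-specific override
    LayoutTests/platform/qt/canvas/philip/tests/2d.example-expected.txt   <- Qt-specific override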
Chang Shu
Comment 50 2010-05-25 11:12:01 PDT
OK, understood. I will update the expected results and clean up the Skipped file.
Chang Shu
Comment 51 2010-05-28 07:20:20 PDT
Created attachment 57328 [details] enable qt tests
Chang Shu
Comment 52 2010-05-28 11:44:43 PDT
Comment on attachment 57328 [details]
enable qt tests

Committed manually, r60371. Removed flags.
WebKit Review Bot
Comment 53 2010-05-28 13:58:36 PDT
Chang Shu
Comment 54 2010-05-28 14:17:48 PDT
Created attachment 57375 [details] use failed results
Kent Hansen
Comment 55 2010-07-23 05:35:16 PDT
(In reply to comment #54)
> Created an attachment (id=57375) [details]
> use failed results

This needs rebasing now, and the expected results need to be updated (at least some of the expected results don't match what I'm getting on my local Mac).
Nikolas Zimmermann
Comment 56 2010-07-30 23:20:22 PDT
Comment on attachment 57375 [details]
use failed results

r- because of Kent Hansen's comment.
Sam Sneddon [:gsnedders]
Comment 57 2016-09-15 08:24:06 PDT
These tests are nowadays maintained as part of https://github.com/w3c/web-platform-tests/tree/master/2dcontext, so this bug is really obsoleted by just running more of web-platform-tests.