Bug 20553 - [Meta] Import Philip Taylor's HTML5 <canvas> test suite and pass it all
Summary: [Meta] Import Philip Taylor's HTML5 <canvas> test suite and pass it all
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests
Version: 528+ (Nightly build)
Hardware: Mac OS X 10.5
Importance: P2 Normal
Assignee: Nobody
URL: http://philip.html5.org/tests/canvas/...
Keywords:
Depends on: 38463 38845 38934 39027 39068 39161 39166 39168 44628
Blocks:
Reported: 2008-08-27 17:27 PDT by Eric Seidel (no email)
Modified: 2016-09-15 08:24 PDT

See Also:


Attachments
fix patch (deleted) | 2010-05-10 13:22 PDT, Chang Shu | no flags
fix patch 2 (deleted) | 2010-05-14 12:56 PDT, Chang Shu | oliver: review-
fix patch 3 (deleted) | 2010-05-14 13:49 PDT, Chang Shu | no flags
Add expected results (448.74 KB, patch) | 2010-05-24 08:06 PDT, Chang Shu | no flags
Add expected results, updated ChangeLog (389.63 KB, patch) | 2010-05-25 06:41 PDT, Chang Shu | no flags
enable qt tests (9.10 KB, patch) | 2010-05-28 07:20 PDT, Chang Shu | no flags
use failed results (77.56 KB, patch) | 2010-05-28 14:17 PDT, Chang Shu | zimmermann: review-

Description Eric Seidel (no email) 2008-08-27 17:27:31 PDT
Import Philip Taylor's HTML5 <canvas> test suite and pass it all :)

We really should add:
http://philip.html5.org/tests/canvas/suite/tests
to our layout tests to cover more <canvas> cases.
Comment 1 Chang Shu 2010-05-06 06:33:53 PDT
Here are some statistics:
Total test cases: 789
Test cases failed on both Safari and Firefox: 77
Test cases failed on Safari but passed on Firefox: 39
Comment 2 Chang Shu 2010-05-06 11:30:01 PDT
I have the test cases ready for check-in. However, the patch I am about to submit is 12 MB. Would this be OK?
Comment 3 Oliver Hunt 2010-05-06 11:32:03 PDT
The tests should not be 12 MB -- more or less all of Philip's tests should be dumpAsText; there is no reason for them to be pixel tests, and so no reason for them to be anything like 12 MB in size.
Comment 4 Laszlo Gombos 2010-05-06 11:44:34 PDT
Are there any licensing issues around the test content?
Comment 5 Chang Shu 2010-05-06 11:45:43 PDT
(In reply to comment #3)
> The tests should not be 12 MB -- more or less all of Philip's tests should be
> dumpAsText; there is no reason for them to be pixel tests, and so no reason for
> them to be anything like 12 MB in size.


I double-checked the test suite and there are about 192 PNG files in the package. The size of the unzipped files is about 18 MB.
Comment 6 Oliver Hunt 2010-05-06 11:56:58 PDT
You should talk to Philip on IRC and get him to clarify the license. I would go through the images and just unify all the identical ones -- I suspect the suite generates many identical images.
Comment 7 Philip Taylor 2010-05-06 12:53:42 PDT
The test code and data is currently all MIT-licensed (I think it says that in various places in the source).

Most of the PNGs are demonstrating the approximate expected output, for a human to interpret (they can't be exact matches because of antialiasing, rounding, etc), so I assume they're not useful for automated tests. (Most of the test cases also run scripts that determine pass/fail, which might miss some rendering bugs (since they only test a few output pixels) but are probably much easier to hook into whatever test harness there is.)
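For illustration, a script-based check of that kind boils down to probing a pixel with getImageData (a minimal sketch, not taken from the suite; the canvas size, color, and probe point are invented for the example):

var canvas = document.createElement('canvas');
canvas.width = 100;
canvas.height = 50;
var ctx = canvas.getContext('2d');
ctx.fillStyle = '#0f0';
ctx.fillRect(0, 0, 100, 50);
// getImageData returns RGBA bytes; probe the pixel at (50, 25).
var px = ctx.getImageData(50, 25, 1, 1).data;
var passed = (px[0] === 0 && px[1] === 255 && px[2] === 0 && px[3] === 255);
document.body.textContent = passed ? 'Passed' : 'Failed';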
Comment 8 Chang Shu 2010-05-07 09:03:25 PDT
Actually, the PNG files only contribute about 1 MB to the total size. I removed a couple of big files though:
current-work
tests/results.html
tests/spec.html
And now the patch file is about 6.4 MB. The total size on the file system should reach 12 MB. This is because there are 2546 HTML files under the tests folder and each of them is several KB.
Comment 9 Oliver Hunt 2010-05-07 12:03:42 PDT
(In reply to comment #8)
> Actually, the PNG files only contribute about 1 MB to the total size. I removed
> a couple of big files though:
> current-work
> tests/results.html
> tests/spec.html
> And now the patch file is about 6.4 MB. The total size on the file system should
> reach 12 MB. This is because there are 2546 HTML files under the tests folder and
> each of them is several KB.

Are all the tests set to use text-only results?
Comment 10 Chang Shu 2010-05-07 12:12:20 PDT
(In reply to comment #9)
> (In reply to comment #8)
> > Actually, the PNG files only contribute about 1 MB to the total size. I removed
> > a couple of big files though:
> > current-work
> > tests/results.html
> > tests/spec.html
> > And now the patch file is about 6.4 MB. The total size on the file system should
> > reach 12 MB. This is because there are 2546 HTML files under the tests folder and
> > each of them is several KB.
> 
> Are all the tests set to use text-only results?

Right, no image comparison is used to determine whether the tests pass or fail. Instead, a test may check certain pixels, like some of the current layout tests do. Correct me if I'm wrong, Philip.
Comment 11 Chang Shu 2010-05-10 13:22:56 PDT
Created attachment 55597 [details]
fix patch
Comment 12 Chang Shu 2010-05-12 07:22:10 PDT
If people feel the complete test suite is too big, an alternative is to check in the test cases one by one, on an as-needed basis.
Comment 13 Chang Shu 2010-05-14 12:56:37 PDT
Created attachment 56097 [details]
fix patch 2

I removed many HTML files that are only used for manual testing. This reduces the total size to 2.2 MB.
Hopefully, this is good to go.
Comment 14 Dimitri Glazkov (Google) 2010-05-14 13:01:43 PDT
Don't forget to skip on Chromium, too. It's a slightly different syntax, but it's fairly intuitive: trac.webkit.org/browser/trunk/LayoutTests/chromium/test_expectations.txt

Also, these tests look fairly trivial to make text-only, i.e. by setting layoutTestController.dumpAsText(). Can you do that in tests.js?
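A minimal sketch of the kind of guard such a tests.js change could add (hypothetical; the suite's actual harness entry points may differ):

if (window.layoutTestController) {
    // Ask DumpRenderTree for a text dump instead of a pixel dump.
    layoutTestController.dumpAsText();
    // Canvas tests can finish asynchronously, so hold the dump until the
    // test calls layoutTestController.notifyDone() when it is finished.
    layoutTestController.waitUntilDone();
}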
Comment 15 Oliver Hunt 2010-05-14 13:03:46 PDT
(In reply to comment #14)
> Don't forget to skip on Chromium, too. It's a slightly different syntax, but it's fairly intuitive: trac.webkit.org/browser/trunk/LayoutTests/chromium/test_expectations.txt
> 
> Also, these tests look fairly trivial to make text-only, i.e. by setting layoutTestController.dumpAsText(). Can you do that in tests.js?

I explicitly requested earlier that these not be pixel tests. Why are they still doing pixel dumps?
Comment 16 Oliver Hunt 2010-05-14 13:05:38 PDT
Comment on attachment 56097 [details]
fix patch 2

r- due to using pixel tests
Comment 17 Chang Shu 2010-05-14 13:25:19 PDT
Thanks for the comments, guys. Is this the right syntax for the Chromium skip file?
// New canvas test suite from http://philip.html5.org/tests/canvas/suite/tests/
BUG20553 SKIP : fast/canvas/philip = TEXT
Comment 18 Chang Shu 2010-05-14 13:49:27 PDT
Created attachment 56106 [details]
fix patch 3

Based on the comments:
1. Dump text results only
2. Skip tests on Chromium
Comment 19 Chang Shu 2010-05-18 06:59:53 PDT
Can anybody review my patch? :)
Comment 20 Kent Hansen 2010-05-19 07:22:12 PDT
Cool!
What's stopping you from adding expected test results straight away, though?
I'd just run run-webkit-tests --new-test-results and modify all the failing ones to say "Passed".
That should be a good starting point for creating the cross-platform and platform-specific skip lists for the individual tests.
Comment 21 Chang Shu 2010-05-19 07:33:20 PDT
(In reply to comment #20)
> Cool!
> What's stopping you from adding expected test results straight away, though?
> I'd just run run-webkit-tests --new-test-results and modify all the failing ones to say "Passed".
> That should be a good starting point for creating the cross-platform and platform-specific skip lists for the individual tests.

My plan is to check in the test cases as the first step. The next step would be to generate expected results on Mac and update the skip list with individual tests (not the whole directory, as it is right now). Then I'll move on to other platforms, like Qt, which I can take care of.
Comment 22 Eric Seidel (no email) 2010-05-21 12:46:10 PDT
Comment on attachment 56106 [details]
fix patch 3

I like new tests.  Please add results for the various platforms in follow-up commits.  If you need help from various port maintainers, ask on webkit-dev.
Comment 23 Chang Shu 2010-05-21 13:51:10 PDT
Comment on attachment 56106 [details]
fix patch 3

Committed manually, r59954.
Cleared flags on attachment 56106 [details].
Will work on expected results.
Comment 24 James Robinson 2010-05-21 14:11:21 PDT
Can you move these from fast/canvas to canvas/?  Also, do you know whether the results will (in general) be platform-specific?  Without looking in detail, I would expect the dumpAsText results to mostly be the same. If not, that would probably be a problem.
Comment 25 Chang Shu 2010-05-21 14:18:16 PDT
(In reply to comment #24)
> Can you move these from fast/canvas to canvas/?  Also, do you know whether the results will (in general) be platform-specific?  Without looking in detail, I would expect the dumpAsText results to mostly be the same. If not, that would probably be a problem.

That's a good point (on moving the tests to canvas/); will do. Yes, I would also expect most of the expected results to be the same. Thanks!
Comment 26 Chang Shu 2010-05-24 07:56:22 PDT
I moved the test cases manually to LayoutTests/canvas. Unreviewed. r60072.
I didn't generate a patch for this because it's big, with all the HTML content marked as "-".
Hopefully, it's acceptable. :)
Comment 27 Chang Shu 2010-05-24 08:06:12 PDT
Created attachment 56888 [details]
Add expected results
Comment 28 James Robinson 2010-05-24 14:30:47 PDT
You should not skip tests just because they fail; only skip tests if they will cause problems running the rest of the suite. Tests that we currently fail should have the failures reflected in their -expected.txt results and have bugs on file somewhere on bugs.webkit.org so that they can get triaged and fixed as needed.

Several tests have output that says "Cannot automatically verify result" or something similar.  Is there much value in running these anyway?
Comment 29 James Robinson 2010-05-24 14:58:48 PDT
We chatted a bit on IRC.  I think this patch is fine to land if it just removes canvas/philip from platform/mac/Skipped altogether.  The test failures are already encapsulated in the -expected.txt files, and since other ports still have this on their own Skipped lists, they should not turn red.
Comment 30 James Robinson 2010-05-24 15:45:16 PDT
Actually, it looks like the -expected.txt files here are not really the output from DRT but something else.  That is definitely not right: the test expectations should always be the actual output currently produced, not something else.  I don't think this patch should land until the expectations match what actually happens on at least one port.
Comment 31 Chang Shu 2010-05-24 19:24:24 PDT
(In reply to comment #30)
> Actually, it looks like the -expected.txt files here are not really the output from DRT but something else.  That is definitely not right: the test expectations should always be the actual output currently produced, not something else.  I don't think this patch should land until the expectations match what actually happens on at least one port.

All the -expected.txt files for passing cases are actual output from DRT. The failing ones on Mac were modified manually to "passed". This was suggested in comment #20 by Kent, and makes a lot of sense to me. I don't think the expected results should match the actual output on any platform unless all the test cases pass, which is not happening yet.
Comment 32 Kent Hansen 2010-05-25 02:55:49 PDT
(In reply to comment #31)
> (In reply to comment #30)
> > Actually, it looks like the -expected.txt files here are not really the output from DRT but something else.  That is definitely not right: the test expectations should always be the actual output currently produced, not something else.  I don't think this patch should land until the expectations match what actually happens on at least one port.
> 
> All the -expected.txt files for passing cases are actual output from DRT. The failing ones on Mac were modified manually to "passed". This was suggested in comment #20 by Kent, and makes a lot of sense to me. I don't think the expected results should match the actual output on any platform unless all the test cases pass, which is not happening yet.

Right. The skipped tests would be removed from the Skipped list as each issue is fixed, without having to modify the test data.

The expected results look good, except for a couple of the hand-edited ones that seem to be missing the "Expected output:\n\n" part, making the actual-vs-expected diff larger than necessary when running the tests.

When running the tests on Qt/Linux there are quite a few more failures than on Safari/Mac, so it makes sense to maintain per-platform Skipped lists for now. But once all those are in place it's probably a good idea to go through the failures and try to sort out which ones are really platform-specific.
Comment 33 Chang Shu 2010-05-25 03:02:17 PDT
> 
> The expected results look good, except for a couple of the hand-edited ones that seem to be missing the "Expected output:\n\n" part, making the actual-vs-expected diff larger than necessary when running the tests.
> 
Thanks, I noticed that too. My local workspace has them fixed. I will commit with the right text after r+.
Comment 34 Chang Shu 2010-05-25 06:41:10 PDT
Created attachment 57013 [details]
Add expected results, updated ChangeLog
Comment 35 Kenneth Rohde Christiansen 2010-05-25 06:57:12 PDT
Comment on attachment 57013 [details]
Add expected results, updated ChangeLog

OK, looks good to me now. Way better ChangeLog.
Comment 36 Chang Shu 2010-05-25 07:33:43 PDT
Patch 57013 committed manually, r60162. Flags cleared.
Comment 37 WebKit Review Bot 2010-05-25 08:05:12 PDT
http://trac.webkit.org/changeset/60162 might have broken SnowLeopard Intel Release (Tests)
Comment 38 Chang Shu 2010-05-25 08:41:12 PDT
(In reply to comment #37)
> http://trac.webkit.org/changeset/60162 might have broken SnowLeopard Intel Release (Tests)

I added two additional cases that failed only on SnowLeopard to the Skipped list.
The code change was committed manually, r60166.
Comment 39 Adam Barth 2010-05-25 09:11:03 PDT
Looks like your change also caused problems on Tiger.
Comment 40 Antonio Gomes 2010-05-25 09:15:10 PDT
(In reply to comment #39)
> Looks like your change also caused problems on Tiger.

See bug 39677; it skipped the failing tests on Tiger.
Comment 41 Adam Roben (:aroben) 2010-05-25 09:16:36 PDT
(In reply to comment #32)
> (In reply to comment #31)
> > (In reply to comment #30)
> > > Actually, it looks like the -expected.txt files here are not really the output from DRT but something else.  That is definitely not right: the test expectations should always be the actual output currently produced, not something else.  I don't think this patch should land until the expectations match what actually happens on at least one port.
> > 
> > All the -expected.txt files for passing cases are actual output from DRT. The failing ones on Mac were modified manually to "passed". This was suggested in comment #20 by Kent, and makes a lot of sense to me. I don't think the expected results should match the actual output on any platform unless all the test cases pass, which is not happening yet.
> 
> Right. The skipped tests would be removed from the Skipped list as each issue is fixed, without having to modify the test data.

This isn't how we've traditionally done things in the WebKit project. The -expected files should reflect what actually happens today when you run the test, not what we wish happened. We should run all the tests on all platforms, rather than skipping them, and update the expected results as we fix bugs. One benefit of doing things this way is that you'll notice when you fix a bug in one of these tests, because the test will show up as not matching the expected results (whereas if it were on the Skipped list, you might not notice you had fixed the test at all!). Conversely, if you make a change that makes one of the already-failing tests even worse, you will notice that, too.

I recommend that we check in results for all the tests that match what happens in WebKit today, and remove them from the Skipped list.
Comment 42 Kenneth Rohde Christiansen 2010-05-25 09:19:37 PDT
So we land expected failures and add bug reports for them? Sounds good with me.
Comment 43 Chang Shu 2010-05-25 10:10:43 PDT
As long as we are on the same page, I don't mind either way. I submitted "Failed" expected results before, but at that time some reviewers told me not to. So I guess this is still confusing to many contributors.
Comment 44 Kent Hansen 2010-05-25 10:26:22 PDT
(In reply to comment #41)
> (In reply to comment #32)
> > Right. The skipped tests would be removed from the Skipped list as each issue is fixed, without having to modify the test data.
> 
> This isn't how we've traditionally done things in the WebKit project. The -expected files should reflect what actually happens today when you run the test, not what we wish happened. We should run all the tests on all platforms, rather than skipping them, and update the expected results as we fix bugs. One benefit of doing things this way is that you'll notice when you fix a bug in one of these tests, because the test will show up as not matching the expected results (whereas if it were on the Skipped list, you might not notice you had fixed the test at all!). Conversely, if you make a change that makes one of the already-failing tests even worse, you will notice that, too.
> 
> I recommend that we check in results for all the tests that match what happens in WebKit today, and remove them from the Skipped list.

I see. It would be nice if there were a way to express expected failures, rather than checking in incorrect results. (I thought the purpose of adding this test suite was to test conformance, not current behavior, but maybe that's not the case. Certainly an interesting point :) )

In Qt, if a test is flagged as expected to fail, the bot will stay green, but the expected failure is clearly visible in the test output (so it's not so easily forgotten). If a test that's expected to fail suddenly starts to pass, the bot reports an "unexpected pass", and goes red. Such failures are readily distinguishable from "normal" test failures.
Comment 45 Chang Shu 2010-05-25 10:42:13 PDT
One benefit of having all the failing test cases in the skipped list is that it's easy to read. Searching for failed test cases in the bug list doesn't seem to be as clear as reading the Skipped file.
Comment 46 Adam Roben (:aroben) 2010-05-25 10:46:25 PDT
(In reply to comment #44)
> I see. It would be nice if there were a way to express expected failures, rather than checking in incorrect results.

new-run-webkit-tests has this ability. Hopefully we'll be able to switch over soon.

> (I thought the purpose of adding this test suite was to test conformance, not current behavior, but maybe that's not the case. Certainly an interesting point :) )

All of the tests in LayoutTests help us catch regressions. Many of them also help us test conformance to some specification(s). But the intent is that run-webkit-tests should report no failures on all platforms at all times. Any failures it reports should be either fixed immediately or put into a bug report. (If run-webkit-tests consistently reported failures on some platforms, it would be very hard to catch newly-introduced regressions.)
Comment 47 Adam Roben (:aroben) 2010-05-25 10:47:43 PDT
(In reply to comment #45)
> One benefit of having all the failing test cases in the skipped list is that it's easy to read. Searching for failed test cases in the bug list doesn't seem to be as clear as reading the Skipped file.

That is true. But I think that benefit is outweighed by the disadvantages I listed in comment 41. And it doesn't match our current policy. (The policy could of course be changed, but that would only happen after discussion with the rest of the WebKit community.)
Comment 48 Chang Shu 2010-05-25 10:54:08 PDT
A practical question: do we put failure expectations in a platform-specific folder? Say a test case fails on Mac and I originally put its result in the common folder. Then I run it on Qt and it passes. Should I put the passing result in platform/qt, or put it in the common folder after moving the "failed" one to platform/mac?
Comment 49 Adam Roben (:aroben) 2010-05-25 11:06:39 PDT
(In reply to comment #48)
> A practical question: do we put failure expectations in a platform-specific folder? Say a test case fails on Mac and I originally put its result in the common folder. Then I run it on Qt and it passes. Should I put the passing result in platform/qt, or put it in the common folder after moving the "failed" one to platform/mac?

I guess the answer depends on what other platforms do. Do the tests also fail on Windows? GTK+? Chromium?

If the tests pass on more than one platform, then it seems likely that the failures are due to incorrect platform-specific code. In that case, it makes sense to me to put the failure results in the platform-specific directories.

If the tests only pass on one platform, then it seems likely that the failures are due to incorrect cross-platform code. In that case, it makes sense to me to put the failure results in the cross-platform directory.

(This is just a rule-of-thumb.)
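As a concrete illustration of this rule-of-thumb (the test name here is invented), a result that is correct cross-platform but differs on Mac would be laid out so the platform directory overrides the common one:

LayoutTests/canvas/philip/tests/2d.example-expected.txt                (common result, used by all ports)
LayoutTests/platform/mac/canvas/philip/tests/2d.example-expected.txt   (Mac-specific override, found first)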
Comment 50 Chang Shu 2010-05-25 11:12:01 PDT
OK, understood. I will update the expected results and clean up the Skipped file.
Comment 51 Chang Shu 2010-05-28 07:20:20 PDT
Created attachment 57328 [details]
enable qt tests
Comment 52 Chang Shu 2010-05-28 11:44:43 PDT
Comment on attachment 57328 [details]
enable qt tests

Committed manually, r60371. Removed flags.
Comment 53 WebKit Review Bot 2010-05-28 13:58:36 PDT
http://trac.webkit.org/changeset/60371 might have broken GTK Linux 32-bit Debug
The following changes are on the blame list:
http://trac.webkit.org/changeset/60371
http://trac.webkit.org/changeset/60372
http://trac.webkit.org/changeset/60373
Comment 54 Chang Shu 2010-05-28 14:17:48 PDT
Created attachment 57375 [details]
use failed results
Comment 55 Kent Hansen 2010-07-23 05:35:16 PDT
(In reply to comment #54)
> Created an attachment (id=57375) [details]
> use failed results

This needs rebasing now, and the expected results need to be updated (at least, there are some expected results that don't match the results I'm getting on my local Mac).
Comment 56 Nikolas Zimmermann 2010-07-30 23:20:22 PDT
Comment on attachment 57375 [details]
use failed results

r- because of Kent Hansen's comment.
Comment 57 Sam Sneddon [:gsnedders] 2016-09-15 08:24:06 PDT
These are nowadays maintained as part of https://github.com/w3c/web-platform-tests/tree/master/2dcontext, so this is really obsoleted by just running more of web-platform-tests.