Bug 92653 - [Chromium-Android] Apply all Linux applicable layout test expectations
Summary: [Chromium-Android] Apply all Linux applicable layout test expectations
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: Android Android
: P2 Normal
Assignee: Xianzhu Wang
URL:
Keywords:
: 90515 90520 (view as bug list)
Depends on:
Blocks: 66687 93627
  Show dependency treegraph
 
Reported: 2012-07-30 09:51 PDT by Xianzhu Wang
Modified: 2012-08-09 10:19 PDT (History)
8 users (show)

See Also:


Attachments
Patch (104.83 KB, patch)
2012-07-30 10:00 PDT, Xianzhu Wang
no flags Details | Formatted Diff | Diff
Patch (89.32 KB, patch)
2012-07-30 10:43 PDT, Xianzhu Wang
no flags Details | Formatted Diff | Diff
Patch (81.25 KB, patch)
2012-08-08 11:31 PDT, Xianzhu Wang
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Xianzhu Wang 2012-07-30 09:51:40 PDT
We used to let chromium_android.py replace all 'LINUX' to 'LINUX ANDROID' in test expectations on Android to automatically inherit all test expectations of chromium-linux, by overriding Port.test_expectations() with ChromiumAndroidPort.test_expectations(). This has been not working since Port.test_expectations() became Port.expectations_dict(). We could convert ChromiumAndroidPort.test_expectations() to ChromiumAndroidPort.expectations_dict(), but I think it's still a temporary thing and we should go ahead of it: we should start to upstream the Android expectations:

1) actually replace 'LINUX' to 'LINUX ANDROID' in TestExpectations
2) upstream all the special expectations in test_expectations_android.txt (currently in downstream only), module by module when we enable the module in upstream bot
3) remove the code about special expectations in chromium_android.py

This bug is for the above 1).
Comment 1 Adam Barth 2012-07-30 09:57:44 PDT
My main concern here is maintainability.  What happens when someone adds a new entry to TestExpectations for LINUX.  Do we expect them to blindly add ANDROID too?  If they don't, will we need to fork TestExpectations downstream?

These problems all go way once we have chromium-android test bots running upstream so that folks can see whether the test actually fails on Android.
Comment 2 Xianzhu Wang 2012-07-30 10:00:59 PDT
Created attachment 155308 [details]
Patch
Comment 3 Adam Barth 2012-07-30 10:05:12 PDT
Comment on attachment 155308 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=155308&action=review

> LayoutTests/platform/chromium/TestExpectations:127
> +BUGCR89468 WIN LINUX ANDROID ANDROID LION : css3/selectors3/xml/css3-modsel-d1.xml = IMAGE IMAGE+TEXT
> +BUGCR89468 WIN LINUX ANDROID ANDROID LION : css3/selectors3/xml/css3-modsel-d1b.xml = IMAGE IMAGE+TEXT
> +BUGCR89468 WIN LINUX ANDROID ANDROID LION : css3/selectors3/xml/css3-modsel-d2.xml = IMAGE IMAGE+TEXT
> +BUGCR89468 WIN LINUX ANDROID ANDROID LION : css3/selectors3/xml/css3-modsel-d3.xml = IMAGE IMAGE+TEXT
> +BUGCR89468 WIN LINUX ANDROID ANDROID LION : css3/selectors3/xml/css3-modsel-d4.xml = IMAGE IMAGE+TEXT

These seem to list ANDROID already.
Comment 4 Xianzhu Wang 2012-07-30 10:10:44 PDT
Comment on attachment 155308 [details]
Patch

Remove review request. I think I should only add 'ANDROID' where the test is applicable.
Comment 5 Xianzhu Wang 2012-07-30 10:15:33 PDT
(In reply to comment #1)
> My main concern here is maintainability.  What happens when someone adds a new entry to TestExpectations for LINUX.  Do we expect them to blindly add ANDROID too?  If they don't, will we need to fork TestExpectations downstream?
> 

I think they can just ignore android for now. For new failures on Android, we will still use test_expectations_android.txt in downstream until it is completely upstreamed.

> These problems all go way once we have chromium-android test bots running upstream so that folks can see whether the test actually fails on Android.
Comment 6 Adam Barth 2012-07-30 10:20:34 PDT
> I think they can just ignore android for now. For new failures on Android, we will still use test_expectations_android.txt in downstream until it is completely upstreamed.

Ok.  When this patch lands, would you be willing to send a message to the webkit-gardening list to that effect?
Comment 7 Xianzhu Wang 2012-07-30 10:27:41 PDT
(In reply to comment #6)
> Ok.  When this patch lands, would you be willing to send a message to the webkit-gardening list to that effect?

Sure.
Comment 8 Xianzhu Wang 2012-07-30 10:43:18 PDT
Created attachment 155319 [details]
Patch
Comment 9 Peter Beverloo 2012-07-30 10:46:35 PDT
I feel it's a bit early to land this, given that we're not even able to run layout tests on a bot configuration yet, and it therefore would be a void change.. Given the number of updates to the TestExpectations file, it'd definitely need another round of changes when we do activate the tester. Should we wait until we do run layout tests upstream?

Furthermore, I expect a very, very big amount of failures related to other issues, i.e. the color and font issues we're seeing right now. There will be plenty of triaging to do. Do you think it may be better to start from zero, when doing that?
Comment 10 Xianzhu Wang 2012-07-30 11:15:24 PDT
(In reply to comment #9)
> I feel it's a bit early to land this, given that we're not even able to run layout tests on a bot configuration yet, and it therefore would be a void change.. Given the number of updates to the TestExpectations file, it'd definitely need another round of changes when we do activate the tester. Should we wait until we do run layout tests upstream?
>

I'm also hesitating. I made this change mainly for downstream to fix the current about 200 failures. There is also a downstream change to add the failure expectations in test_expectations_android.txt. I have no preference between them. Another way is to fix ChromiumAndroidPort.test_expectations() so that we can still automatically inherit Linux expectations, but I don't like it because it is still temporary. What's your opinion?

> Furthermore, I expect a very, very big amount of failures related to other issues, i.e. the color and font issues we're seeing right now. There will be plenty of triaging to do. Do you think it may be better to start from zero, when doing that?

Are these failures already in the downstream test_expectations_android.txt? If not, there must be caused by some difference between upstream/downstream.
Comment 11 Xianzhu Wang 2012-07-30 12:15:54 PDT
Comment on attachment 155319 [details]
Patch

Now I feel the same as Peter: it's early to do this. I will do it in downstream first.
Comment 12 Peter Beverloo 2012-07-30 12:20:12 PDT
(In reply to comment #10)
> I'm also hesitating. I made this change mainly for downstream to fix the current about 200 failures. There is also a downstream change to add the failure expectations in test_expectations_android.txt. I have no preference between them. Another way is to fix ChromiumAndroidPort.test_expectations() so that we can still automatically inherit Linux expectations, but I don't like it because it is still temporary. What's your opinion?

Downstream we have an Android-specific file with test expectations, but it's hidden by the diff tool. Before deciding which way to go, I think it's important to assess where exactly we are in terms of layout test "performance".

> Are these failures already in the downstream test_expectations_android.txt? If not, there must be caused by some difference between upstream/downstream.

Parts will be. I'm still in favor of removing some of the text-related code we have in place to mimic Linux as closely as possible. We should be testing code paths that are being used in our product, even if that comes at the cost of having to maintain more baselines. There are tools to do that anyway :).
Comment 13 Xianzhu Wang 2012-07-30 12:36:09 PDT
(In reply to comment #12)
> Downstream we have an Android-specific file with test expectations, but it's hidden by the diff tool. Before deciding which way to go, I think it's important to assess where exactly we are in terms of layout test "performance".
>

What do you mean about "layout test performance"?

> 
> Parts will be. I'm still in favor of removing some of the text-related code we have in place to mimic Linux as closely as possible. We should be testing code paths that are being used in our product, even if that comes at the cost of having to maintain more baselines. There are tools to do that anyway :).

In fact chromium-linux does the similar things about fonts to match chromium-win expectations. The difference that chromium-linux uses a special fontconfig file while chromium-android uses several lines of hard-coded font rendering style code and Android-specific font configuration files. Without this I guess the complexity of maintenance work would increase by an order of magnitude. Based on our data collected before (around 12/2010), we would need to rebaseline more than 4000 text expectations without the font configurations.
Comment 14 Peter Beverloo 2012-07-30 12:45:44 PDT
(In reply to comment #13)
> What do you mean about "layout test performance"?

Number of passing v.s. failing tests.

> In fact chromium-linux does the similar things about fonts to match chromium-win expectations. The difference that chromium-linux uses a special fontconfig file while chromium-android uses several lines of hard-coded font rendering style code and Android-specific font configuration files. Without this I guess the complexity of maintenance work would increase by an order of magnitude. Based on our data collected before (around 12/2010), we would need to rebaseline more than 4000 text expectations without the font configurations.

I'm not sure. Adam, what would you advise? It's the many custom baseline v.s. not testing exactly what we're using question.
Comment 15 Xianzhu Wang 2012-07-30 13:42:31 PDT
(In reply to comment #14)
> (In reply to comment #13)
> > What do you mean about "layout test performance"?
> 
> Number of passing v.s. failing tests.
> 

As of 6/12/2012, 1% tests were failing to be fixed/investigated, 2.5% tests were failing but won't be fixed (because they are not applicable on Android). Note these are Android-specific, i.e. tests that were also failing on Linux are not counted. We have 335 lines (containing '=') in downstream test_expectations_android.txt.
Comment 16 Xianzhu Wang 2012-08-08 11:31:38 PDT
Created attachment 157260 [details]
Patch
Comment 17 Xianzhu Wang 2012-08-08 11:34:55 PDT
I think we'd better make the change now otherwise we need to update individual expectations which will cost us more.

For the issues of font rendering and color, I think we can work on them concurrently.
Comment 18 Adam Barth 2012-08-08 12:42:30 PDT
Comment on attachment 157260 [details]
Patch

I'm fine with this as long as Peter is happy.
Comment 19 Peter Beverloo 2012-08-08 13:03:42 PDT
While I still feel a bit wary, go for it.

We should make up a final plan about layout tests/WebKit infrastructure offline, to make sure that we're all on the same line. I'll start a thread about it tomorrow.
Comment 20 Ojan Vafai 2012-08-08 13:21:52 PDT
Should we instead just make android a subset of linux? In other words, LINUX would imply LUCID and ANDROID instead of just LUCID. This is the same as MAC implying LION, SNOWLEOPARD, MOUNTAINLION.
Comment 21 Adam Barth 2012-08-08 13:27:51 PDT
> Should we instead just make android a subset of linux?

Nope.  There have been long discussions about that topic, and the result of those discussions is that considering Android to be a flavor of Linux causes more problems than it fixes.  We're just being consistent with that decision here.
Comment 22 Xianzhu Wang 2012-08-08 14:36:07 PDT
Comment on attachment 157260 [details]
Patch

Thanks all for review and comments. Just ran 'new-run-webkit-tests --no-pixel-tests fast' with the changed expectations. Results:
  8215 tests ran as expected, 169 didn't
  Expected to timeout, but passed (2)
  Expected to fail, but passed (6)
  Regressions: Unexpected text failures : (128)
  Regressions: Unexpected image-only failures : (1)
  Regressions: Unexpected crashes : (9)
  Regressions: Unexpected timeouts : (23)

In the 169 failed tests, 155 of them have been in downstream text_expectations_android.txt or have been rebaselined in downstream (which will be upstreamed). Only 14 of them are real unexpected which might be because of new changes not merged into downstream. I think some of the text mismatches Peter observed were because of this bug.

About the concern about not testing real product paths, my thoughts are:

1) about using the customized fonts, I think this is not another product path because the different Android devices might also use different fonts;

2) about font rendering styles, the differences are about hinting and sub-pixel positioning. This is a trade-off between running real code path and thousands of tests to be rebaselined. In addition, we could test product code path by explicitly set hinting and sub-pixel positioning in particular platform-specific tests, like platform/chromium-linux/fast/text/chromium-linux-text-subpixel-positioning.html.

3) about scroll bar overlay, my opinion is the same as 2).
Comment 23 WebKit Review Bot 2012-08-08 14:56:40 PDT
Comment on attachment 157260 [details]
Patch

Clearing flags on attachment: 157260

Committed r125094: <http://trac.webkit.org/changeset/125094>
Comment 24 WebKit Review Bot 2012-08-08 14:56:45 PDT
All reviewed patches have been landed.  Closing bug.
Comment 25 Wei James (wistoch) 2012-08-08 18:42:37 PDT
*** Bug 90515 has been marked as a duplicate of this bug. ***
Comment 26 Xingnan Wang 2012-08-08 20:16:38 PDT
*** Bug 90520 has been marked as a duplicate of this bug. ***