RESOLVED FIXED 96382
Android's mock scrollbars shows up as a difference in layout test results
https://bugs.webkit.org/show_bug.cgi?id=96382
Summary Android's mock scrollbars shows up as a difference in layout test results
Attachments
Patch (4.75 KB, patch)
2012-09-24 04:21 PDT, Peter Beverloo
no flags
Fix typos (4.75 KB, patch)
2012-09-24 04:29 PDT, Peter Beverloo
no flags
Peter Beverloo
Comment 1 2012-09-12 06:03:41 PDT
I'm not sure we can address this without needing new Android baselines for every test that has a scrollbar. Windows, Mac and Linux do the same (see links below). Our scrollbars are painted using the compositor, which we force to be on in the shipping version of the product itself. I think the best way forward would be to have these show up on the pixel results (i.e. what the product itself uses), and then submit a few massive commits with baselines. Ideas / confirmation before I do this would be most welcome! Another example output: http://build.webkit.org/results/Chromium%20Android%20Release%20(Tests)/r128190%20(72)/css2.1/t1202-counters-09-b-diffs.html Mac example: http://build.webkit.org/results/Chromium%20Mac%20Release%20%28Tests%29/r128297%20%2823853%29/fast/block/float/028-diffs.html Windows using mock scrollbars: http://build.webkit.org/results/Chromium%20Win%20Release%20%28Tests%29/r128290%20%2829919%29/fast/block/float/028-diffs.html
Peter Beverloo
Comment 2 2012-09-12 06:25:40 PDT
Android always uses the compositor, and forces it to be on. The scrollbars have been implemented in the compositor as well. For the most accurate results, we'd need to turn the compositor on for all tests. This means that we'd have to add new baselines for (a) all text baselines, as they would now start showing layers, (b) all image baselines with a scrollbar and (c) image baselines needing other adjustments. Adding everything up, this means that we'd need new baselines for pretty much every test. Maybe some intermediary result directory between chromium-linux and chromium-android, i.e. chromium-linux-compositing, could mean that we can still share many results with Linux (when/if they start forcing the compositor)...
Tony Chang
Comment 3 2012-09-12 10:15:34 PDT
FWIW, I don't think it's a big deal to land different pixel results for chromium-android. There are "only" 7218 chromium-linux png files. If there are the same number of chromium-android files, that's not much compared to the 183551 total files in the repository. The main problem is keeping the results up to date, but once the bots start running tests and we hook up garden-o-matic to it, that should be pretty easy.
Peter Beverloo
Comment 4 2012-09-12 10:17:39 PDT
(In reply to comment #3) > FWIW, I don't think it's a big deal to land different pixel results for chromium-android. There are "only" 7218 chromium-linux png files. If there are the same number of chromium-android files, that's not much compared to the 183551 total files in the repository. > > The main problem is keeping the results up to date, but once the bots start running tests and we hook up garden-o-matic to it, that should be pretty easy. Getting garden-o-matic to work is definitely a priority. One nit here: afaik turning on the compositor will also start showing the layers in the normal DumpRenderTree output, which means new text results as well.
James Robinson
Comment 5 2012-09-12 10:30:07 PDT
Layers don't show up in the text output unless the test asks for them by testRunner.layerTreeAsText(). Do you want scrollbars to show up in the .pngs at all on android? They don't occupy space (do they?) so maybe you should just leave them out. That means you can't share pixel baselines with other ports on tests with visible scrollbars, but you aren't sharing behavior either thanks to overlay so that seems appropriate.
Xianzhu Wang
Comment 6 2012-09-12 10:37:04 PDT
How does garden-o-matic distinguish tests to be rebaselined and bugs to be fixed, among the thousands of mismatches? The following are a bit off topic: Perhaps we can make the pixel results automatically rebaselined, but for crash/timeout/text mismatches, I think we still need to manually check one by one to distinguish bugs from rebaselines. So I think keeping the text expectations as many matching is important to reduce the cost. For the scrollbar case, the current code makes the text expectations matching chromium-linux/chromium-win. So does the code for font configurations. For those features, DumpRenderTree doesn't execute the actual code paths of chromium-android. I think we can add several Android-specific test cases for them with the actual paths selected with some Android-specific layout test API. For the other tests the DumpRenderTree code paths are used to reduce the number of mismatches.
Peter Beverloo
Comment 7 2012-09-12 10:37:25 PDT
(In reply to comment #5) > Layers don't show up in the text output unless the test asks for them by testRunner.layerTreeAsText(). > > Do you want scrollbars to show up in the .pngs at all on android? They don't occupy space (do they?) so maybe you should just leave them out. That means you can't share pixel baselines with other ports on tests with visible scrollbars, but you aren't sharing behavior either thanks to overlay so that seems appropriate. I see, that's good news, so we can enable forced compositing. Do you think it'd be fine to now show scrollbars altogether? They're layers in a parallel tree and are drawn a top of the content, so it seems to me that they indeed don't occupy space. My concern is that this would make it impossible to determine whether the content exceeds the result's viewport or not. Since we're not going to be able to share the results anyway, both (a) being closer to what we ship and (b) making it more obvious why a test may be failing in case of scrolling content make me think that we should enable them.
Peter Beverloo
Comment 8 2012-09-12 10:40:03 PDT
(In reply to comment #6) > How does garden-o-matic distinguish tests to be rebaselined and bugs to be fixed, among the thousands of mismatches? > > > The following are a bit off topic: > > Perhaps we can make the pixel results automatically rebaselined, but for crash/timeout/text mismatches, I think we still need to manually check one by one to distinguish bugs from rebaselines. So I think keeping the text expectations as many matching is important to reduce the cost. > > For the scrollbar case, the current code makes the text expectations matching chromium-linux/chromium-win. So does the code for font configurations. For those features, DumpRenderTree doesn't execute the actual code paths of chromium-android. I think we can add several Android-specific test cases for them with the actual paths selected with some Android-specific layout test API. For the other tests the DumpRenderTree code paths are used to reduce the number of mismatches. We shouldn't be having thousands of mismatches to start with, but rather a good baseline of the expected results. Crashes will show up independently. With our own larger set of baselines, we can also remove a number of these exceptions and start testing what we use. Fonts would be an exception here due to the shaping and dimensions, unfortunately.
Dirk Pranke
Comment 9 2012-09-12 10:42:52 PDT
(In reply to comment #6) > How does garden-o-matic distinguish tests to be rebaselined and bugs to be fixed, among the thousands of mismatches? > It doesn't. That is up the dev's discretion. > > The following are a bit off topic: > > Perhaps we can make the pixel results automatically rebaselined, but for crash/timeout/text mismatches, I think we still need to manually check one by one to distinguish bugs from rebaselines. So I think keeping the text expectations as many matching is important to reduce the cost. > I'm not sure I understand what you mean by "automatically rebaselined" here?
Dirk Pranke
Comment 10 2012-09-12 10:43:19 PDT
(In reply to comment #7) > My concern is that this would make it impossible to determine whether the content exceeds the result's viewport or not. The render tree output would at least tell you this.
Xianzhu Wang
Comment 11 2012-09-12 10:48:04 PDT
(In reply to comment #9) > (In reply to comment #6) > > How does garden-o-matic distinguish tests to be rebaselined and bugs to be fixed, among the thousands of mismatches? > > > > It doesn't. That is up the dev's discretion. > > > > > The following are a bit off topic: > > > > Perhaps we can make the pixel results automatically rebaselined, but for crash/timeout/text mismatches, I think we still need to manually check one by one to distinguish bugs from rebaselines. So I think keeping the text expectations as many matching is important to reduce the cost. > > > > I'm not sure I understand what you mean by "automatically rebaselined" here? This is related to my first question. Maybe I'm wrong but it seems impractical to me to check the thousands of mismatches one by one, so I guessed pixel mismatches are rebaselined without checking ("automatically rebaselined").
Dirk Pranke
Comment 12 2012-09-12 10:49:02 PDT
(In reply to comment #11) > > I'm not sure I understand what you mean by "automatically rebaselined" here? > > This is related to my first question. Maybe I'm wrong but it seems impractical to me to check the thousands of mismatches one by one, so I guessed pixel mismatches are rebaselined without checking ("automatically rebaselined"). Ah, I see. No, typically when this happens people actually at least glance at all of the results (or blindly approve them, though that is "discouraged" :).
Xianzhu Wang
Comment 13 2012-09-12 10:53:06 PDT
(In reply to comment #8) > We shouldn't be having thousands of mismatches to start with, but rather a good baseline of the expected results. Crashes will show up independently. > This is true for text expectations because we have created special code paths in layout test mode. However for pixel results, based on our experience in downstream, there would be thousands of mismatches because of the slight differences at the edges of the font strokes. Because of this we didn't run pixel tests in downstream to avoid the workload and hope this be resolved in upstream with the gardening process.
Xianzhu Wang
Comment 14 2012-09-12 11:36:26 PDT
FYI, the following layout test specific code paths have been created to reduce text mismatches of scrollbars (but not to reduce pixel mismatches): - let measurement of scrollbar match chromium-linux: http://trac.webkit.org/changeset/104139 - disable overlay scrollbar: http://trac.webkit.org/changeset/123825 - why the scrollbar is black in layout test mode: http://trac.webkit.org/browser/trunk/Source/WebCore/platform/chromium/ScrollbarThemeChromiumAndroid.cpp?annotate=blame#L165
James Robinson
Comment 15 2012-09-12 12:14:41 PDT
You should have DRT behave in as close to the same way as your actual browser does as possible. Disabling overlay scrollbars seems like a huge mistake, you want to test what users are actually experiencing.
Peter Beverloo
Comment 16 2012-09-12 12:21:09 PDT
We'll (In reply to comment #10) > (In reply to comment #7) > > My concern is that this would make it impossible to determine whether the content exceeds the result's viewport or not. > > The render tree output would at least tell you this. Yes, but is there a reason not to? What do you think about the following steps forward? 1) We'll force compositor mode and threaded compositing for DumpRenderTree on Android, matching the product itself. 2) We'll make the mock scrollbars light grey (so the scroller is visible and it's visually distinguishable from the test's content), if possible, keep the reserved space, and enable the overlay scrollbars again. 3) We'll diagnose further failures, to make sure there isn't any low hanging fruit that may cause major re-baselines after we're done. 4) We'll start a plan to get the tests re-based while verifying every result visually. There are various WebKit committers in London and with some tools we should be able to do this quickly and accurately. Note for (2): Some tests may be depending on the predictable inner layout width, so I don't think we should change that, but this is a triviality. The eventual goal is to match the product as closely as possible. Integration with garden-o-matic is planned and shouldn't be too long away, but we do need a good baseline in the first place in order for that to work.
Peter Beverloo
Comment 17 2012-09-12 12:22:57 PDT
(In reply to comment #16) > Note for (2): Some tests may be depending on the predictable inner layout width, so I don't think we should change that, but this is a triviality. Err, the "but this is a triviality" part is from a sentence I removed. I am not experienced enough with WebKit layout tests to judge what impact removing the reserved space would have, and would like to defer to one of you guys for making that decision.
Tony Chang
Comment 18 2012-09-12 12:24:28 PDT
You should be able to use garden-o-matic to examine and rebaseline the chromium-android results. I've used it in the past for rebaselining hundreds of pixel test changes (I think it took less than an hour).
Xianzhu Wang
Comment 19 2012-09-12 12:28:29 PDT
(In reply to comment #15) > You should have DRT behave in as close to the same way as your actual browser does as possible. Disabling overlay scrollbars seems like a huge mistake, you want to test what users are actually experiencing. I think this is a trade-off and we can reduce the mistake by adding a few tests with an layout test API to enable overlay scrollbars. For the existing tests, most of them are not actually testing scrollbars, keeping them matching with existing baselines can reduce the cost of managing the new baselines.
Peter Beverloo
Comment 20 2012-09-24 04:21:54 PDT
Sami Kyöstilä
Comment 21 2012-09-24 04:27:44 PDT
Comment on attachment 165351 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=165351&action=review > Source/WebCore/ChangeLog:10 > + bringing the tests closer to what we actually skip. s/skip/ship/ > Source/WebCore/ChangeLog:13 > + take and width on Android, they're rendered on top of the content. Therefore s/and/any/
Peter Beverloo
Comment 22 2012-09-24 04:29:43 PDT
Created attachment 165354 [details] Fix typos
Adam Barth
Comment 23 2012-09-24 11:22:01 PDT
Comment on attachment 165354 [details] Fix typos Removing calls to isRunningLayoutTest LGTM.
WebKit Review Bot
Comment 24 2012-09-24 11:42:03 PDT
Comment on attachment 165354 [details] Fix typos Clearing flags on attachment: 165354 Committed r129394: <http://trac.webkit.org/changeset/129394>
WebKit Review Bot
Comment 25 2012-09-24 11:42:07 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.