RESOLVED FIXED 85591
[Gtk] Many tests revealed as passing after moving from Skipped to test_expectations.txt
https://bugs.webkit.org/show_bug.cgi?id=85591
Summary [Gtk] Many tests revealed as passing after moving from Skipped to test_expect...
Zan Dobersek
Reported 2012-05-04 01:20:39 PDT
After moving to test_expectations.txt-like handling of test failures, meaning every test is now run and the result compared to expectations, a lot of tests are now passing. Fixes for these tests probably slipped into the source tree unnoticed or even unplanned (by accident one might say), making tests pass but keeping them skipped. This bug will cover the expectation changes for these tests and their status and numbers throughout this process. While most of these tests at the moment have wrong expectations for each build configuration that's covered by the GTK builders, some are skipped only on debug configurations. There are also test cases that fail only on 32-bit configurations - we should look into adding modifiers for test_expectations.txt that cover such configurations only (but this is probably a task for another bug).
Attachments
webkit-patch list-passing (7.56 KB, patch)
2012-05-04 01:37 PDT, Zan Dobersek
no flags
List of passing tests 05042012 0730UTC (27.55 KB, application/octet-stream)
2012-05-04 01:39 PDT, Zan Dobersek
no flags
List of passing tests 05042012 1520UTC (33.87 KB, application/octet-stream)
2012-05-04 09:51 PDT, Zan Dobersek
no flags
Patch (19.93 KB, patch)
2012-05-04 10:16 PDT, Zan Dobersek
no flags
Patch (2.43 KB, patch)
2012-05-09 12:05 PDT, Zan Dobersek
no flags
List of passing tests 05102012 1030UTC (57.92 KB, text/plain)
2012-05-10 03:44 PDT, Zan Dobersek
no flags
Patch (1.67 KB, patch)
2012-05-13 22:40 PDT, Zan Dobersek
no flags
Updated patch (2.90 KB, patch)
2012-05-18 12:55 PDT, Zan Dobersek
no flags
Updated patch (3.37 KB, patch)
2012-05-23 12:27 PDT, Zan Dobersek
no flags
Patch proposal (5.01 KB, patch)
2012-06-12 08:04 PDT, Mario Sanchez Prada
no flags
Patch (7.15 KB, patch)
2012-06-24 05:21 PDT, Zan Dobersek
no flags
Zan Dobersek
Comment 1 2012-05-04 01:37:01 PDT
Created attachment 140180 [details] webkit-patch list-passing This is just a simple command added to webkit-patch that lists tests from the GTK builders that are passing while expected to fail. It is just a one-off tool I've written specifically for this task. The GTK builders and the number of builds to check are all hardcoded, making this command in its current form useless for any other port. I don't intend to get this patch reviewed and landed unless there is some interest by any other ports in it. The command gathers passing tests with wrong expectations for the last 5 builds of each GTK builder and produces a list of such tests that is common to each of those builders. If wished the number of builds can be increased to get more consistent list of such tests, but this can get tricky sometimes as even the builders are unreliable (lost slaves, interrupted or broken builds etc). For each builder a common set of these tests through those last 5 builds is printed out along with a list of tests for each build that are not common. Finally, these three common sets are intersected to provide a test list of tests that passed in all the 15 builds (all the 5 builds on each of the 3 builders). Also printed out for each builder is the list of tests that passed in all the 5 builds on that builder but is not passing on other builders' builds.
Zan Dobersek
Comment 2 2012-05-04 01:39:45 PDT
Created attachment 140181 [details] List of passing tests 05042012 0730UTC This is the output of running `webkit-patch list-passing` on 4th of May at 7:30 UTC. For the last 5 builds of each builder, a list of 138 tests passing on all the builders was made.
Zan Dobersek
Comment 3 2012-05-04 01:51:10 PDT
I plan to remove incorrect expectations that are not CRASH and only have a BUGWKGTK modifier without review later today. If any of the tests backfires I'll create a proper bug entry for it and link the bug # here. For incorrect test expectations that have a BUGWK% modifier that points to a bug entry that's specific to the Gtk failure, I'll upload a patch for a review here and post a note in that bug entry about the test seeming to now pass. If, after removing incorrect expectations, the test seems OK, I'll be closing those bug entries. If the tests backfire, I'll note that in the bug entry and update the test expectations. For incorrect test expectations that have a BUGWK% modifier that points to a bug from which the offending commit originates, I'll remove them without review and, if a test backfires, create a bug entry, linking the bug # here. Incorrect test expectations that are CRASH will be handled with more care and removed all at once. I'll be posting the patch up here for a review. I'll then proceed with analyzing test sets that pass only on a specific build configuration with a similar approach.
Zan Dobersek
Comment 4 2012-05-04 07:15:48 PDT
Zan Dobersek
Comment 5 2012-05-04 07:17:00 PDT
(In reply to comment #4) > Committed r116094: <http://trac.webkit.org/changeset/116094> This is the first part, removing test expectations for tests that are passing across all the builders and had the BUGWKGTK modifier.
Martin Robinson
Comment 6 2012-05-04 08:58:26 PDT
Nice Zan. Perhaps the new tool would be useful to other ports...
Zan Dobersek
Comment 7 2012-05-04 09:51:08 PDT
Created attachment 140259 [details] List of passing tests 05042012 1520UTC The first removing of foul test expectations seems to have passed by without problems. I'm attaching the output of `webkit-patch list-passing` when run on 4th of May at 15:20 UTC. A list of 138 tests passes on all the builders but has incorrect expectations listed. The next patch will require a review and will remove these expectations for about 110 tests.
Zan Dobersek
Comment 8 2012-05-04 10:16:47 PDT
Zan Dobersek
Comment 9 2012-05-04 10:21:27 PDT
(In reply to comment #8) > Created an attachment (id=140263) [details] > Patch This patch removes incorrect expectations for most of the rest of these tests. Contrary to what I planned in comment #3, I'll be updating bug entries later, after the tests either keep passing or backfire, to keep the work volume at a low level (expectations for tests from 43 bugs are being changed).
Zan Dobersek
Comment 10 2012-05-04 10:40:45 PDT
Comment on attachment 140263 [details] Patch Clearing flags on attachment: 140263 Committed r116122: <http://trac.webkit.org/changeset/116122>
Zan Dobersek
Comment 11 2012-05-05 10:52:17 PDT
All the test cases whose expectations were removed in r116122 and had Gtk-specific bug entries had these entries updated and closed if possible. There are two bug entries (bug #54073 and bug #54136) that cover tests whose failures seem to be observable only by inspecting pixel output but had their expectations removed. I'll reinsert a PASS expectation into test_expectations.txt and note that while the test passes, it actually presents incorrect behavior.
Philippe Normand
Comment 12 2012-05-07 07:42:35 PDT
Good job, Zan. Thanks you!
Zan Dobersek
Comment 13 2012-05-09 12:05:07 PDT
Philippe Normand
Comment 14 2012-05-09 13:21:30 PDT
Comment on attachment 140990 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=140990&action=review > LayoutTests/platform/gtk/test_expectations.txt:-324 > -// This test crashes whatever test follows it. Perhaps it's related to the previous failure. > -BUGWKGTK : fast/dom/gc-10.html = CRASH I think I tried in the past to unflag this one but I had to back off. Let's try again anyway.
Zan Dobersek
Comment 15 2012-05-09 13:24:20 PDT
Comment on attachment 140990 [details] Patch Clearing flags on attachment: 140990 Committed r116553: <http://trac.webkit.org/changeset/116553>
Zan Dobersek
Comment 16 2012-05-10 03:44:10 PDT
Created attachment 141132 [details] List of passing tests 05102012 1030UTC The list of passing tests common to all the GTK builders is now 37 tests long. I'll remove their expectations shortly and observe their behavior, updating their bug entries appropriately. On a different note, all the tests that had CRASH expectations removed seem to be behaving well, their bug entries will be updated soon.
Zan Dobersek
Comment 17 2012-05-10 04:20:09 PDT
Zan Dobersek
Comment 18 2012-05-10 04:26:14 PDT
(In reply to comment #17) > Committed r116627: <http://trac.webkit.org/changeset/116627> This commit removes erroneous test expectations that are not CRASH for a couple of more tests. Their bug entries, if existent, will be updated accordingly. This currently leaves two more tests passing that have CRASH expectations and a couple of tests who pass but probably have pixel failures. I'll upload another patch for review for the former and compile another list of the latter.
Zan Dobersek
Comment 19 2012-05-13 22:32:26 PDT
Zan Dobersek
Comment 20 2012-05-13 22:40:54 PDT
Zan Dobersek
Comment 21 2012-05-13 22:58:32 PDT
Zan Dobersek
Comment 22 2012-05-18 12:55:32 PDT
Created attachment 142767 [details] Updated patch Remove redundant expectations for crashes in debug builds as well
Zan Dobersek
Comment 23 2012-05-23 12:27:41 PDT
Created attachment 143615 [details] Updated patch Removed two more crash expectations for two SVG tests that were apparently fixed somewhere between r117786 and r117795.
Philippe Normand
Comment 24 2012-05-23 15:52:42 PDT
Comment on attachment 143615 [details] Updated patch Ok. One minor thing though, while I think that for this specific case it's good to group the patches in a single bug... but we usually do one patch per bug :)
Zan Dobersek
Comment 25 2012-05-24 22:00:41 PDT
Comment on attachment 143615 [details] Updated patch Clearing flags on attachment: 143615 Committed r118474: <http://trac.webkit.org/changeset/118474>
Mario Sanchez Prada
Comment 26 2012-06-12 07:34:11 PDT
I have observed that the following list of tests have been consitently passing in GTK bots as well: Expected to fail, but passed: (38) fast/forms/number/spin-in-multi-column.html fast/multicol/span/span-as-immediate-child-property-removal.html fast/multicol/span/span-as-immediate-columns-child-removal.html fast/repaint/moving-shadow-on-container.html fast/repaint/moving-shadow-on-path.html fast/sub-pixel/column-clipping.html http/tests/misc/favicon-loads-with-images-disabled.html media/sources-fallback-codecs.html security/block-test.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.1_T1.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.1_T10.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.1_T11.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.1_T2.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.1_T3.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.1_T4.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.1_T5.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.1_T6.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.1_T7.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.1_T8.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.1_T9.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.2_T1.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.2_T10.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.2_T11.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.2_T2.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.2_T3.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.2_T4.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.2_T5.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.2_T6.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.2_T7.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.2_T8.html sputnik/Conformance/10_Execution_Contexts/10.2_Entering_An_Execution_Context/10.2.2_Eval_Code/S10.2.2_A1.2_T9.html sputnik/Conformance/15_Native_Objects/15.8_Math/15.8.2/15.8.2.16_sin/S15.8.2.16_A7.html sputnik/Conformance/15_Native_Objects/15.8_Math/15.8.2/15.8.2.18_tan/S15.8.2.18_A7.html svg/W3C-SVG-1.1/paths-data-03-f.svg svg/css/composite-shadow-example.html svg/css/composite-shadow-with-opacity.html svg/css/stars-with-shadow.html svg/hittest/svg-ellipse-non-scale-stroke.xhtml I've run those tests both in the GTK 64 debug and release bots with the following commands: * Tools/Scripts/run-webkit-tests --gtk * Tools/Scripts/run-webkit-tests --gtk --iterations=10 * Tools/Scripts/run-webkit-tests --gtk --iterations=10 --repeat-each=10 And I get always the same (positive) result. I'll attach a patch to unskip them promptly, asking for review.
Mario Sanchez Prada
Comment 27 2012-06-12 08:04:15 PDT
Created attachment 147088 [details] Patch proposal Attaching a patch that unskips the sputnik tests in that list only. I left other tests out of this patch because they were reported I can't be sure about them passing in the 32bit bot since it has been a while since that bot doesn't compile (it's fixed now, but still compiling atm) and so I can't be sure whether they are passing now or not. Sputnik tests, in the other hand, are consistently passing in GTK 64 bit bots now and also passed in the last build in the 32bit bot that we have results for (http://build.webkit.org/builders/GTK%20Linux%2032-bit%20Release/builds/24071/steps/layout-test/logs/stdio), so I'd say it's pretty safe to unskip them.
Martin Robinson
Comment 28 2012-06-12 08:07:59 PDT
Comment on attachment 147088 [details] Patch proposal Let's give it a shot!
Mario Sanchez Prada
Comment 29 2012-06-12 08:11:48 PDT
Mario Sanchez Prada
Comment 30 2012-06-12 09:18:40 PDT
(In reply to comment #27) > [...] > Sputnik tests, in the other hand, are consistently passing in GTK 64 bit bots > now and also passed in the last build in the 32bit bot that we have results for > (http://build.webkit.org/builders/GTK%20Linux%2032-bit%20Release/builds/24071/steps/layout-test/logs/stdio), > so I'd say it's pretty safe to unskip them. It seems the tests we unskipped are now doing well. From the list of remaining tests, I've observed only the following two ones are passing now in all the 3 GTK bots: fast/forms/number/spin-in-multi-column.html fast/multicol/span/span-as-immediate-child-property-removal.html However, not unskipping yet because I can't tell for sure whether they are flaky or not. Let's observe them for a while more
Zan Dobersek
Comment 31 2012-06-24 05:21:00 PDT
Zan Dobersek
Comment 32 2012-06-24 05:29:32 PDT
(In reply to comment #31) > Created an attachment (id=149191) [details] > Patch This patch suggests creating a new category for test failures that occur only on specific build architectures. Tests there are marked as flaky so the expectations cover build architectures where the test passes as well. Basically, this is a workaround for adding 32-bit and 64-bit modifiers which at the moment seems to me as an overkill (no other port seems to require it).
Martin Robinson
Comment 33 2012-07-02 10:37:08 PDT
Comment on attachment 149191 [details] Patch This seems reasonable!
Zan Dobersek
Comment 34 2012-07-02 10:56:15 PDT
Comment on attachment 149191 [details] Patch Clearing flags on attachment: 149191 Committed r121694: <http://trac.webkit.org/changeset/121694>
Zan Dobersek
Comment 35 2012-07-20 10:10:04 PDT
Zan Dobersek
Comment 36 2012-07-20 10:16:52 PDT
(In reply to comment #35) > Committed r123224: <http://trac.webkit.org/changeset/123224> Accidentally closed the bug while committing this, but there should be no more unexpected passes on the GTK builder. If there are still any, they will be handled outside of the scope of this bug. A bit sad that it took me more than three months to get this sorted, but still, at least now it's done.
Martin Robinson
Comment 37 2012-07-20 10:24:25 PDT
Incredible work!
Note You need to log in before you can comment on or make changes to this bug.