Bug 162814

Summary: Web Inspector: TestSuite test cases should have their own timeout to ensure tests fail with output instead of timeout by test runner
Product: WebKit Reporter: Joseph Pecoraro <joepeck>
Component: Web Inspector Assignee: Devin Rousso <hi>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, bburg, commit-queue, ews-watchlist, hi, inspector-bugzilla-changes, rniwa, sroberts, webkit-bot-watchers-bugzilla, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: All   
OS: All   
See Also: https://bugs.webkit.org/show_bug.cgi?id=174272
https://bugs.webkit.org/show_bug.cgi?id=158006
https://bugs.webkit.org/show_bug.cgi?id=155607
https://bugs.webkit.org/show_bug.cgi?id=196915

Joseph Pecoraro
Reported 2016-09-30 15:51:15 PDT
Currently, when an inspector test times out, we get almost no data, even though much of the test may have run successfully. An individual test case in a test suite can be missing an event and hose the entire test.

I think it would make a lot of sense to give individual test cases in a test suite their own timeout. When a test case times out, we can fail that individual test case (or ALL test cases, if they are dependent) and get useful failure output instead of generic test-runner timeout data. This has the added benefit of making timed-out tests fail more quickly, so it is easier to work on tests.
Attachments
Patch (126.01 KB, patch)
2019-04-04 17:32 PDT, Devin Rousso
no flags
Archive of layout-test-results from ews103 for mac-highsierra (2.47 MB, application/zip)
2019-04-04 18:21 PDT, EWS Watchlist
no flags
Archive of layout-test-results from ews106 for mac-highsierra-wk2 (2.85 MB, application/zip)
2019-04-04 18:40 PDT, EWS Watchlist
no flags
Archive of layout-test-results from ews115 for mac-highsierra (2.33 MB, application/zip)
2019-04-04 19:23 PDT, EWS Watchlist
no flags
Patch (133.19 KB, patch)
2019-04-04 21:49 PDT, Devin Rousso
no flags
Patch (133.46 KB, patch)
2019-04-04 21:51 PDT, Devin Rousso
no flags
Patch (140.90 KB, patch)
2019-04-04 23:09 PDT, Devin Rousso
no flags
Archive of layout-test-results from ews106 for mac-highsierra-wk2 (2.69 MB, application/zip)
2019-04-05 00:31 PDT, EWS Watchlist
no flags
Archive of layout-test-results from ews201 for win-future (13.17 MB, application/zip)
2019-04-05 01:42 PDT, EWS Watchlist
no flags
Patch (141.15 KB, patch)
2019-04-05 10:20 PDT, Devin Rousso
no flags
Radar WebKit Bug Importer
Comment 1 2016-09-30 16:02:15 PDT
Blaze Burg
Comment 2 2016-09-30 16:16:37 PDT
So, will it check elapsed time before continuing to the next test case? Or would we start a JS watchdog timer for each test case that will abort the entire suite?

I fear that if the real problem is that the bots are simply too contended or slow, then a watchdog timer in JS will just get stuck like the test and be ineffective. If it's a flake even on a fast machine, then this seems like a decent mitigation.

I think a separate issue is that we don't get partial test output when a test times out, even if multiple things have been sent back to the test page. It would be nice to get a warning event from the test runner when it's about to kill the test, so we could flush any buffered test results.
Joseph Pecoraro
Comment 3 2016-09-30 17:49:30 PDT
(In reply to comment #2)

> So, it will check elapsed time before continuing to the next test case? Or would we start a JS watchdog timer for each test case that will abort the entire suite?

Correct, I was thinking a setTimeout for each test case, aborting either the entire suite or a single test case. Both are fine, as long as we get some output. The safest thing to do is abort everything, but sometimes that isn't needed, and it would be nice to run other test cases that may pass afterwards.

> I fear that if the real problem is the bots are simply too contended or slow, then a watchdog timer in JS will just get stuck like the test and be ineffective. If it's a flake even on a fast machine, then this seems like a decent mitigation.

Almost always, especially locally when authoring a test, some event doesn't fire and the test times out. The only way I can debug that now is InspectorTest.debug() and a lot of waiting.

> I think a separate issue is that we don't get partial test output when a test times out, even if multiple things have been sent back to the test page. It would be nice to get a warning event sent from the test runner when it's about to kill the test so we could flush any buffered test results.

Correct. I think it would be easier for TestSuite to do this, especially considering this is really an issue for our tests; many other tests don't buffer output like we do.
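The per-test-case watchdog being discussed can be sketched as follows. This is a hypothetical illustration, not WebKit's actual TestSuite code; the `runWithTimeout` helper and its error message are invented for the example:

```javascript
// Hypothetical sketch of a per-test-case watchdog (not the actual TestSuite
// implementation): race the test case against a timer, so a hung case is
// reported as a failure with output instead of hitting the harness timeout.
function runWithTimeout(testCaseFn, timeoutMs) {
    let timer;
    const watchdog = new Promise((resolve, reject) => {
        timer = setTimeout(() => reject(new Error(`timeout after ${timeoutMs}ms`)), timeoutMs);
    });
    return Promise.race([Promise.resolve(testCaseFn()), watchdog])
        .finally(() => clearTimeout(timer)); // cancel the watchdog either way
}

// A test case that never settles (e.g. a missing event listener) now fails
// quickly with a message instead of hanging until the harness kills it.
runWithTimeout(() => new Promise(() => { /* event never fires */ }), 50)
    .then(() => console.log("PASS"))
    .catch((error) => console.log(`FAIL: ${error.message}`));
```

With a 50ms budget this prints `FAIL: timeout after 50ms`; a real suite would then decide whether to fail just that case or abort everything, as discussed above.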
Devin Rousso
Comment 4 2019-04-04 17:32:55 PDT
Created attachment 366781 [details]: Patch

Patch for bots to see what tests need rebasing.
Devin Rousso
Comment 5 2019-04-04 17:36:19 PDT
Comment on attachment 366781 [details] Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=366781&action=review

> Source/WebInspectorUI/ChangeLog:12
> + (AsyncTestSuite.prototype.runTestCases):
> + (SyncTestSuite.prototype.runTestCases):

I should probably mention that this change makes it so that a single test case failing won't prevent subsequent test cases from running.

> Source/WebInspectorUI/UserInterface/Test/TestSuite.js:144
> + let delay = testcase.timeout || 10000;

10000 was derived from the idea that 30000 is the default timeout, and most test suites have more than one or two test cases.
Alexey Proskuryakov
Comment 6 2019-04-04 17:41:02 PDT
30 seconds is the default timeout unless we are running GuardMalloc or ASan, in which case it's 3x longer (maybe leaks too, I don't remember off-hand). Probably fine to land for a specific investigation, but this will likely make tests more flaky on those queues.
EWS Watchlist
Comment 7 2019-04-04 18:21:14 PDT Comment hidden (obsolete)
EWS Watchlist
Comment 8 2019-04-04 18:21:16 PDT Comment hidden (obsolete)
EWS Watchlist
Comment 9 2019-04-04 18:40:51 PDT Comment hidden (obsolete)
EWS Watchlist
Comment 10 2019-04-04 18:40:53 PDT Comment hidden (obsolete)
EWS Watchlist
Comment 11 2019-04-04 19:23:52 PDT Comment hidden (obsolete)
EWS Watchlist
Comment 12 2019-04-04 19:23:54 PDT Comment hidden (obsolete)
Devin Rousso
Comment 13 2019-04-04 21:49:12 PDT
Devin Rousso
Comment 14 2019-04-04 21:51:41 PDT
Devin Rousso
Comment 15 2019-04-04 23:09:37 PDT
EWS Watchlist
Comment 16 2019-04-05 00:31:51 PDT Comment hidden (obsolete)
EWS Watchlist
Comment 17 2019-04-05 00:31:53 PDT Comment hidden (obsolete)
EWS Watchlist
Comment 18 2019-04-05 01:42:46 PDT Comment hidden (obsolete)
EWS Watchlist
Comment 19 2019-04-05 01:42:48 PDT Comment hidden (obsolete)
Devin Rousso
Comment 20 2019-04-05 10:20:41 PDT
Created attachment 366824 [details]: Patch

Fix the logic that determines when to add a newline for extra space.
Blaze Burg
Comment 21 2019-04-05 15:09:38 PDT
Comment on attachment 366824 [details] Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=366824&action=review

> Source/WebInspectorUI/ChangeLog:15
> + This change also relaxes the expectation that any individual test case failure will stop the

This patch is trying to do too many things. Given that it's making a risky infrastructure change that's likely to trip up the bumbling bots, I think you should split out the "continue on failure" part. That part seems to be responsible for most of the test and result diffs, yet it seems less contentious and less likely to be rolled out and back in repeatedly than the "add a timeout" part of the patch.

> Source/WebInspectorUI/ChangeLog:21
> + "out of nowhere" in the middle of other tests.

It sounds like the only way we can make test cases have non-flaky results is by disabling continue-on-fail, because if a failure is a timeout, any further tests run in the same context could be tainted by residual code/logging still running from the timed-out test.

On second thought, if all the nondeterminism is limited to just the one test file, then that's probably okay, because the file will be marked as failing anyway.

> Source/WebInspectorUI/UserInterface/Test/TestSuite.js:57
> + return this.runCount - (this.failCount - this.skipCount);

Why make skips count as passes?
Blaze Burg
Comment 22 2019-04-05 15:38:01 PDT
Comment on attachment 366824 [details] Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=366824&action=review

r=me

>> Source/WebInspectorUI/ChangeLog:15
>> + This change also relaxes the expectation that any individual test case failure will stop the
>
> This patch is trying to do too many things. Given that it's making a risky infrastructure change that's likely to trip up the bumbling bots, I think you should split out the "continue on failure" part. That part seems to be responsible for most of the test and result diffs, yet it seems less contentious and less likely to be rolled out and back in repeatedly than the "add a timeout" part of the patch.

OTOH, maybe it will work. If it has to be rolled out, let's be more cautious in followup attempts.

> LayoutTests/inspector/canvas/resources/shaderProgram-utilities.js:78
> + whenProgramAdded((program) => {

Nice.
Devin Rousso
Comment 23 2019-04-05 16:12:18 PDT
Comment on attachment 366824 [details] Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=366824&action=review

>>> Source/WebInspectorUI/ChangeLog:15
>>> + This change also relaxes the expectation that any individual test case failure will stop the
>>
>> This patch is trying to do too many things. Given that it's making a risky infrastructure change that's likely to trip up the bumbling bots, I think you should split out the "continue on failure" part. That part seems to be responsible for most of the test and result diffs, yet it seems less contentious and less likely to be rolled out and back in repeatedly than the "add a timeout" part of the patch.
>
> OTOH, maybe it will work. If it has to be rolled out, let's be more cautious in followup attempts.

None of the changes in this patch should ever be seen (ideally). In a perfect world, all the tests would pass, so continue-on-failure would never be necessary (the same is true for automatic timeouts). The point of this is to handle the cases where the "bumbling bots" are having issues, or when a test is being written (e.g. "I forgot to add a listener, so it times out"), so that we, as developers, can get more of the output to see what went wrong.

>> Source/WebInspectorUI/ChangeLog:21
>> + "out of nowhere" in the middle of other tests.
>
> It sounds like the only way we can make test cases have non-flaky results is by disabling continue-on-fail. For if a failure is a timeout, any further tests run in the same context could be tainted by residual code/logging still running from the timeout'd test.
>
> On second thought, if all the nondeterminism is limited to just the one test file, then that's probably okay because it will be marked as failing anyway.

The nondeterminism is limited to just that test file. My main point here is that if the test cases are independent of each other and one of them times out, the "extra" logging that appears in the output of later test cases could come from the one that timed out, since it's still running (there's no way to just stop JS execution).

>> Source/WebInspectorUI/UserInterface/Test/TestSuite.js:57
>> + return this.runCount - (this.failCount - this.skipCount);
>
> Why make skips count as passes?

This does the opposite of that: it makes skips not count as passes. Skips are treated as failures, so we only want to subtract the failures that aren't skips.
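The skip-counting arithmetic can be made concrete with a small hypothetical model. The `SuiteTally` class and its recording methods below are invented for illustration; only the `passCount` expression comes from the patch, and the assumption (per the comment above) is that skips are tallied as failures but are not run:

```javascript
// Hypothetical model of the patch's pass-count arithmetic: skipped cases are
// counted in failCount but not in runCount, so the pass count subtracts only
// the failures that actually ran.
class SuiteTally {
    constructor() { this.runCount = 0; this.failCount = 0; this.skipCount = 0; }
    recordPass() { this.runCount++; }
    recordFail() { this.runCount++; this.failCount++; }
    recordSkip() { this.failCount++; this.skipCount++; }
    get passCount() { return this.runCount - (this.failCount - this.skipCount); }
}

const tally = new SuiteTally();
tally.recordPass();
tally.recordFail();
tally.recordSkip();
console.log(tally.passCount); // 1: two cases ran, minus (2 failures - 1 skip)
```

The skip is still reported as a failure, but because it never ran, it must not be subtracted from the run count when computing passes.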
WebKit Commit Bot
Comment 24 2019-04-05 16:39:22 PDT
Comment on attachment 366824 [details] Patch Clearing flags on attachment: 366824 Committed r243953: <https://trac.webkit.org/changeset/243953>
WebKit Commit Bot
Comment 25 2019-04-05 16:39:24 PDT
All reviewed patches have been landed. Closing bug.
Shawn Roberts
Comment 27 2019-04-09 10:37:14 PDT
Found another test that is failing due to the timeout of an existing test that was marked as a timeout. inspector/console/console-time.html will fail because of the timeout, then pass on the retry. It appears this test was marked flaky in https://bugs.webkit.org/show_bug.cgi?id=158006

After this revision, instead of showing a timeout on the dashboard, I believe these have been replaced with the failure. I will update test expectations to skip this test as well. I also updated radars, as hopefully we can fix the initial timeout to resolve this failure too.

Dashboard: https://webkit-test-results.webkit.org/dashboards/flakiness_dashboard.html#showAllRuns=true&tests=inspector%2Fconsole%2Fconsole-time.html
Diff: https://build.webkit.org/results/Apple%20Mojave%20Release%20WK2%20(Tests)/r244076%20(3555)/inspector/console/console-time-diff.txt