| Summary: | [WPE] Debug bot timeouts with many unresponsive webprocess errors | ||
|---|---|---|---|
| Product: | WebKit | Reporter: | Lauro Moura <lmoura> |
| Component: | WPE WebKit | Assignee: | Nobody <webkit-unassigned> |
| Status: | RESOLVED CONFIGURATION CHANGED | ||
| Severity: | Normal | CC: | bugs-noreply |
| Priority: | P2 | ||
| Version: | Other | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| See Also: | https://bugs.webkit.org/show_bug.cgi?id=188048 | ||
Description
Lauro Moura
2020-03-31 21:44:04 PDT

Correction: Managed to reproduce the issue with jhbuild (which is still used on the bots).

It is hard to reproduce consistently on my setup. It sometimes happens when a lot of tests run in parallel (like 25 instances on my 8-core laptop), but rarely. Directly on the bot it fails more consistently with the following command line:

$ python ./Tools/Scripts/run-webkit-tests --no-build --no-show-results --no-new-test-results --clobber-old-results --exit-after-n-crashes-or-timeouts 2 --exit-after-n-failures 5 --debug --wpe --results-directory layout-test-results --debug-rwt-logging --child-processes=10 --iterations=20 --fully-parallel --no-http fast/dom/image-object.html

With WEBKIT_DEBUG=all, the crashing test gives this output, which up to the error message seems similar to what a normal run outputs:

```
UNIMPLEMENTED: ../../Source/WebKit/UIProcess/WebPreferences.cpp(201) : void WebKit::WebPreferences::platformInitializeStore()
UNIMPLEMENTED: ../../Source/WebKit/UIProcess/WebPreferences.cpp(244) : bool WebKit::WebPreferences::platformGetBoolUserValueForKey(const WTF::String&, bool&)
UNIMPLEMENTED: ../../Source/WebKit/UIProcess/WebPreferences.cpp(250) : bool WebKit::WebPreferences::platformGetUInt32UserValueForKey(const WTF::String&, uint32_t&)
UNIMPLEMENTED: ../../Source/WebKit/UIProcess/WebPreferences.cpp(213) : void WebKit::WebPreferences::platformUpdateBoolValueForKey(const WTF::String&, bool)
UNIMPLEMENTED: ../../Source/WebKit/UIProcess/WebPreferences.cpp(208) : void WebKit::WebPreferences::platformUpdateStringValueForKey(const WTF::String&, const WTF::String&)
(Back/Forward) Created WebBackForwardList 0x7fe9502ed3b8
UNIMPLEMENTED: ../../Source/WebKit/UIProcess/wpe/WebPageProxyWPE.cpp(44) : void WebKit::WebPageProxy::platformInitialize()
(NetworkProcess) synchronizing cache
(NetworkProcess) opened cache storage, success 1
(NetworkProcess) blob synchronization completed approximateSize=0
(NetworkProcess) cache synchronization completed size=0 recordCount=0
(NetworkProcess) synchronizing cache
(NetworkProcess) opened cache storage, success 1
(NetworkProcess) blob synchronization completed approximateSize=0
(NetworkProcess) cache synchronization completed size=0 recordCount=0
(ProcessSwapping) Removing process with pid 0 from the origin cache set
WebPageProxy 7 activityStateDidChange - mayHaveChanged loading
WebPageProxy 7 dispatchActivityStateChange - potentiallyChangedActivityStateFlags loading
WebPageProxy 7 dispatchActivityStateChange: state changed from active window, focused, visible, visible or occluded, in-window to active window, focused, visible, visible or occluded, in-window, loading
<unknown> - TestController::run - Failed to reset state to consistent values
#PROCESS UNRESPONSIVE - WPEWebProcess
```

Some tests in the wpe-debug-tests bot with different timeout values (the startup timeout is 1/4 of the regular timeout) and 20 processes:

30s - Always times out on startup (that means the WebProcess is taking more than 7.5s to start and load the about:blank page)
45s - Times out most of the time
60s - Never times out

Upping the number of processes to 25 made the 60s timeout trigger the startup issues too.

Some more pairs:

30s/10proc - timeout
30s/5proc - timeout sometimes
30s/1proc - works

Given we are planning to move the bots to a flatpak-based setup, we could try increasing the timeout limit or reducing the number of parallel jobs to allow the bot to run while a proper fix is found.

After reenabling the WPE debug bots with the number of processes reduced from 20 to 10:

12 builds (#1700 to #1711)
5 ran to completion (between 1h45m and ~1h51m)
7 exited early
8 runs had those timeouts in the beginning (one of them managed to run to the end)

This has not happened since the move to the Flatpak SDK. Closing for now.
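
For reference, the startup budgets implied by the timeouts tried above can be worked out with a few lines of Python. This is only an illustrative sketch: the 1/4 ratio and the observations are taken from the report, while the function name `startup_timeout` and the `observations` table are hypothetical and are not part of `run-webkit-tests`.

```python
# Illustrative arithmetic only; not how run-webkit-tests computes timeouts internally.
# The 1/4 ratio is the one stated in this report.
STARTUP_FRACTION = 0.25


def startup_timeout(regular_timeout_s):
    """Return the startup budget (seconds) for a given regular test timeout."""
    return regular_timeout_s * STARTUP_FRACTION


# (regular timeout in s, child processes, outcome as reported above)
observations = [
    (30, 20, "always times out on startup"),
    (45, 20, "times out most of the time"),
    (60, 20, "never times out"),
    (60, 25, "startup issues appear too"),
    (30, 10, "timeout"),
    (30, 5, "timeout sometimes"),
    (30, 1, "works"),
]

for regular, procs, outcome in observations:
    budget = startup_timeout(regular)
    print(f"{regular}s / {procs} procs: startup budget {budget:.2f}s, {outcome}")
```

Read this way, the WebProcess needed more than 7.5s to start under 20 parallel processes, while a 15s startup budget was enough until the process count went up to 25.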