| Summary: | REGRESSION: [ iOS ] http/tests/resourceLoadStatistics/standalone-web-application-exempt-from-website-data-deletion tests are flaky crashing, failing an timing out. | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | WebKit | Reporter: | Jason Lawrence <Lawrence.j> | ||||||||||||||
| Component: | Tools / Tests | Assignee: | Alex Christensen <achristensen> | ||||||||||||||
| Status: | RESOLVED FIXED | ||||||||||||||||
| Severity: | Normal | CC: | achristensen, beidson, cdumez, ggaren, katherine_cheney, webkit-bot-watchers-bugzilla, webkit-bug-importer | ||||||||||||||
| Priority: | P2 | Keywords: | InRadar | ||||||||||||||
| Version: | WebKit Nightly Build | ||||||||||||||||
| Hardware: | iPhone / iPad | ||||||||||||||||
| OS: | iOS 13 | ||||||||||||||||
| See Also: |
https://bugs.webkit.org/show_bug.cgi?id=209634 https://bugs.webkit.org/show_bug.cgi?id=211305 |
||||||||||||||||
| Attachments: |
|
||||||||||||||||
|
Description
Jason Lawrence
2020-04-29 10:05:15 PDT
reproduction command: run-webkit-tests --iterations 99 --exit-after-n-failures 2 --force -f --ios-simulator http/tests/resourceLoadStatistics/standalone-web-application-exempt-from-website-data-deletion-database.html It looks like there's two more resourceLoadStatistics tests affected by this issue: http/tests/resourceLoadStatistics/strip-referrer-to-origin-for-prevalent-subresource-redirects-database.html http/tests/resourceLoadStatistics/third-party-cookie-blocking-on-sites-without-user-interaction-database.html New History link: https://results.webkit.org/?suite=layout-tests&suite=layout-tests&suite=layout-tests&suite=layout-tests&test=http%2Ftests%2FresourceLoadStatistics%2Fstandalone-web-application-exempt-from-website-data-deletion-database.html&test=http%2Ftests%2FresourceLoadStatistics%2Fstandalone-web-application-exempt-from-website-data-deletion.html&test=http%2Ftests%2FresourceLoadStatistics%2Fstrip-referrer-to-origin-for-prevalent-subresource-redirects-database.html&test=http%2Ftests%2FresourceLoadStatistics%2Fthird-party-cookie-blocking-on-sites-without-user-interaction-database.html&platform=ios&limit=50000&style=debug&style=release We probably have a number of distinct issues called out in this bug, but r260841 looks like it landed right before we started seeing text failures with at least these two tests: https://results.webkit.org/?suite=layout-tests&suite=layout-tests&test=http%2Ftests%2FresourceLoadStatistics%2Fstrip-referrer-to-origin-for-prevalent-subresource-redirects-database.html&test=http%2Ftests%2FresourceLoadStatistics%2Fthird-party-cookie-blocking-on-sites-without-user-interaction-database.html&platform=ios&limit=50000&style=debug&style=release These two: http/tests/resourceLoadStatistics/standalone-web-application-exempt-from-website-data-deletion-database.html http/tests/resourceLoadStatistics/standalone-web-application-exempt-from-website-data-deletion.html were added 4 weeks ago with https://trac.webkit.org/changeset/259440/webkit (which was reverted then re-landed for test issues), and they seem to have been flaky timeouts on iOS bots from the beginning. http://trac.webkit.org/r260791 may be related to the WAL errors in the crashes called out in the description cc'ing some folks, but I think we need this to be split up into distinct bugs. It seems like the problem is multiple threads trying to access the same database file simultaneously. This likely has to do with the fact that the standalone web application tests create their own website data store at the same path as the WebKitTestRunner default store. This is probably related to http://trac.webkit.org/r260791 and more specifically this line in NetworkSession::setResourceLoadStatisticsEnabled: destroyResourceLoadStatistics([] { }); We probably need to wait for that completion. (In reply to Alex Christensen from comment #9) > This is probably related to http://trac.webkit.org/r260791 and more > specifically this line in NetworkSession::setResourceLoadStatisticsEnabled: > destroyResourceLoadStatistics([] { }); > We probably need to wait for that completion. Ah, this would explain it. Maybe we should pass a completion handler to this function then. Created attachment 398018 [details]
Patch
Created attachment 398019 [details]
Patch
My iOS SDK isn't doing too well, so I can't run tests with it but I'm pretty sure this will fix it. Could someone verify that they can reproduce the issue without but not with this patch? Created attachment 398039 [details]
Patch
Created attachment 398058 [details]
Patch
Comment on attachment 398058 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=398058&action=review I used this command to verify there were crashes before but not after this patch. run-webkit-tests --iterations 99 --exit-after-n-failures 2 --force -f --ios-simulator http/tests/resourceLoadStatistics/standalone-web-application-exempt-from-website-data-deletion-database.html http/tests/resourceLoadStatistics/strip-referrer-to-origin-for-prevalent-subresource-redirects-database.html http/tests/resourceLoadStatistics/third-party-cookie-blocking-on-sites-without-user-interaction-database.html http/tests/resourceLoadStatistics/standalone-web-application-exempt-from-website-data-deletion-database.html --no-build > Tools/WebKitTestRunner/TestController.cpp:536 > + WKWebsiteDataStoreConfigurationSetResourceLoadStatisticsDirectory(configuration, toWK(makeString(temporaryFolder, pathSeparator, "ResourceLoadStatistics", pathSeparator, cryptographicallyRandomNumber())).get()); This is necessary to prevent two WebsiteDataStores from opening databases at the same directory at the same time. Comment on attachment 398058 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=398058&action=review > Source/WebKit/UIProcess/API/C/WKWebsiteDataStoreRef.h:48 > +WK_EXPORT void WKWebsiteDataStoreSetResourceLoadStatisticsEnabled(WKWebsiteDataStoreRef dataStoreRef, bool enable, void* functionContext, WKWebsiteDataStoreStatisticsEnabledFunction completionHandler); If the client does: WKWebsiteDataStoreSetResourceLoadStatisticsEnabled(dataStore, false, nullptr, nullptr); WKWebsiteDataStoreSetResourceLoadStatisticsEnabled(dataStore, true, nullptr, nullptr); without waiting for completion handlers. Do I understand correctly that we will still crash? If so, I think this is super fragile and not the right design. Also, turning on or off a feature should really not need to be async IMO. (In reply to Chris Dumez from comment #17) > Comment on attachment 398058 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=398058&action=review > > > Source/WebKit/UIProcess/API/C/WKWebsiteDataStoreRef.h:48 > > +WK_EXPORT void WKWebsiteDataStoreSetResourceLoadStatisticsEnabled(WKWebsiteDataStoreRef dataStoreRef, bool enable, void* functionContext, WKWebsiteDataStoreStatisticsEnabledFunction completionHandler); > > If the client does: > WKWebsiteDataStoreSetResourceLoadStatisticsEnabled(dataStore, false, > nullptr, nullptr); > WKWebsiteDataStoreSetResourceLoadStatisticsEnabled(dataStore, true, nullptr, > nullptr); > > without waiting for completion handlers. Do I understand correctly that we > will still crash? If so, I think this is super fragile and not the right > design. > I think all crashes should be fixed using just the random addition to the ResourceLoadStatistics directory, per my comment about two databases being created at the same path. This change alone fixed all local crashes for me. Comment on attachment 398058 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=398058&action=review >>> Source/WebKit/UIProcess/API/C/WKWebsiteDataStoreRef.h:48 >>> +WK_EXPORT void WKWebsiteDataStoreSetResourceLoadStatisticsEnabled(WKWebsiteDataStoreRef dataStoreRef, bool enable, void* functionContext, WKWebsiteDataStoreStatisticsEnabledFunction completionHandler); >> >> If the client does: >> WKWebsiteDataStoreSetResourceLoadStatisticsEnabled(dataStore, false, nullptr, nullptr); >> WKWebsiteDataStoreSetResourceLoadStatisticsEnabled(dataStore, true, nullptr, nullptr); >> >> without waiting for completion handlers. Do I understand correctly that we will still crash? If so, I think this is super fragile and not the right design. >> >> Also, turning on or off a feature should really not need to be async IMO. > > I think all crashes should be fixed using just the random addition to the ResourceLoadStatistics directory, per my comment about two databases being created at the same path. This change alone fixed all local crashes for me. I think it's possible for both statements to be true: It's possible that a client that didn't wait for a completion handler would crash; and also that all our client code happens to wait for a completion handler. Do we need setting this preference to also do the work of creating and destroying databases before we consider the preference set? Would it would to treat setting the preference to true to be a simple set of a boolean, where database creation would happen on demand when we had new data to store? And for setting the preference to false, could we schedule an async task to delete all previous databases without the client needing to wait for that result? Created attachment 398068 [details]
Patch
This whole thing is problematic. We should make it so you can't turn resource load statistics on and off at runtime, you can't switch between database mode and plist mode at runtime, and we should remove the plist mode entirely. But my latest patch fixes the crashes for me. It is indeed invalid to have two persistent website data stores pointing at the same location at the same time, and none of our SPI clients do that plist mode changes sound reasonable; but for turning on/off at runtime, isn't that a Safari feature? [ Preferences -> Privacy -> Prevent cross-site tracking ] We should make that safari feature require a restart like so many other things do. Comment on attachment 398068 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=398068&action=review > Tools/WebKitTestRunner/TestController.cpp:536 > + WKWebsiteDataStoreConfigurationSetResourceLoadStatisticsDirectory(configuration, toWK(makeString(temporaryFolder, pathSeparator, "ResourceLoadStatistics", pathSeparator, cryptographicallyRandomNumber())).get()); Why are we doing this only for ITP? The issue is that we have a default store with those temporary paths set. Then, if the test uses options.standaloneWebApplicationURL, it will get its own data store. Currently, this new data store gets the exact same paths as the default data store, which always seems wrong, no? You are fixing it for ITP but do we have any reason to believe it is OK both other databases? I'm fixing ITP right now because we have tests crashing right now and that's urgent, but you're right, we should fix everything at some point. I think fixing everything else should be done in a separate patch. This is a targeted response to a recent regression. (In reply to Chris Dumez from comment #25) > Comment on attachment 398068 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=398068&action=review > > > Tools/WebKitTestRunner/TestController.cpp:536 > > + WKWebsiteDataStoreConfigurationSetResourceLoadStatisticsDirectory(configuration, toWK(makeString(temporaryFolder, pathSeparator, "ResourceLoadStatistics", pathSeparator, cryptographicallyRandomNumber())).get()); > > Why are we doing this only for ITP? The issue is that we have a default > store with those temporary paths set. Then, if the test uses > options.standaloneWebApplicationURL, it will get its own data store. > Currently, this new data store gets the exact same paths as the default data > store, which always seems wrong, no? You are fixing it for ITP but do we > have any reason to believe it is OK both other databases? Another option could be to set TestController::defaultWebsiteDataStore() to be the data store created by the standalone-web-application tests for those tests only. This will prevent two data stores from being initialized for the same test. (In reply to katherine_cheney from comment #28) > (In reply to Chris Dumez from comment #25) > > Comment on attachment 398068 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=398068&action=review > > > > > Tools/WebKitTestRunner/TestController.cpp:536 > > > + WKWebsiteDataStoreConfigurationSetResourceLoadStatisticsDirectory(configuration, toWK(makeString(temporaryFolder, pathSeparator, "ResourceLoadStatistics", pathSeparator, cryptographicallyRandomNumber())).get()); > > > > Why are we doing this only for ITP? The issue is that we have a default > > store with those temporary paths set. Then, if the test uses > > options.standaloneWebApplicationURL, it will get its own data store. > > Currently, this new data store gets the exact same paths as the default data > > store, which always seems wrong, no? You are fixing it for ITP but do we > > have any reason to believe it is OK both other databases? > > Another option could be to set TestController::defaultWebsiteDataStore() to > be the data store created by the standalone-web-application tests for those > tests only. This will prevent two data stores from being initialized for the > same test. TestController::defaultWebsiteDataStore() is a static data store which is used by most tests (except ephemeral ones and standalone-web-application ones). It is not created on a per-test basic, it outlives the tests. I think the bug here is that we create a new non-ephemeral data store that has the exact same paths as the default data store. As it stands, the patch looks like a partial fix / hack and it would be very little work to make it fully correct. Created attachment 398079 [details]
Patch
Committed r260961: <https://trac.webkit.org/changeset/260961> All reviewed patches have been landed. Closing bug and clearing flags on attachment 398079 [details]. We're still seeing tests crash with the following in stderr: ERROR: SQLite database failed to set journal_mode to WAL, error: database is locked ./platform/sql/SQLiteDatabase.cpp(175) : void WebCore::SQLiteDatabase::useWALJournalMode() ERROR: SQLite database failed to checkpoint: database is locked ./platform/sql/SQLiteDatabase.cpp(185) : void WebCore::SQLiteDatabase::useWALJournalMode() ERROR: Unable to prepare statement to fetch schema for the ObservedDomains table. /Volumes/Data/slave/ios-simulator-13-debug/build/Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp(507) : void WebKit::ResourceLoadStatisticsDatabaseStore::openAndUpdateSchemaIfNecessary() SHOULD NEVER BE REACHED /Volumes/Data/slave/ios-simulator-13-debug/build/Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp(508) : void WebKit::ResourceLoadStatisticsDatabaseStore::openAndUpdateSchemaIfNecessary() There are also 16 unassociated com.apple.WebKit.Networking.Development crashes on this run https://build.webkit.org/results/Apple%20iOS%2013%20Simulator%20Debug%20WK2%20(Tests)/r260998%20(3465)/results.html I think there should be a separate bug. It looks like this fix did address the standalone-web-application-exempt-from-website-data-deletion tests. The other tests might need some variation of Alex's original patch. (In reply to katherine_cheney from comment #33) > I think there should be a separate bug. It looks like this fix did address > the standalone-web-application-exempt-from-website-data-deletion tests. The > other tests might need some variation of Alex's original patch. https://bugs.webkit.org/show_bug.cgi?id=211305 |