| Summary: | PDFPlugin hangs in deallocation when data load contends with waiting for thread completion | ||||||
|---|---|---|---|---|---|---|---|
| Product: | WebKit | Reporter: | Daniel Jalkut <jalkut> | ||||
| Component: | Assignee: | Chris Dumez <cdumez> | |||||
| Status: | RESOLVED FIXED | ||||||
| Severity: | Normal | CC: | achristensen, beidson, cdumez, darin, simon.fraser, thorton, webkit-bug-importer | ||||
| Priority: | P2 | Keywords: | InRadar | ||||
| Version: | WebKit Local Build | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Attachments: |
|
||||||
|
Description
Daniel Jalkut
2022-10-13 05:40:48 PDT
The underlying NetNewsWire bug that inspired me to locate this bug: https://github.com/Ranchero-Software/NetNewsWire/issues/3716 The simplest fix I can think of is this inside the for loop at https://github.com/WebKit/WebKit/blob/main/Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm#L879: if (plugin->isBeingDestroyed()) dataSemaphore.signal(); else { plugin->getResourceBytesAtPosition( ... ... } Given both destroy and the for loop are running on the main thread, I don’t think we have to do anything more sophisticated. Unfortunately I don't think the proposed fix will work, because the deadlock prevents the code that might signal the dataSemaphore from ever being reached. So the RunLoop::main().dispatch(...) that gets invoked right before waiting on the semaphore, is prevented from having its block run if/when the main thread is already waiting on the thread to complete in PDFPlugin::destroy.
I tried to adapt the proposed solution to the semaphore wait phase, by replacing dataSemaphore.wait() with
while (true) {
if (monitorPluginRef->isBeingDestroyed()) {
break;
}
if (dataSemaphore.waitFor(100_ms)) {
break;
}
}
This seems to ameliorate the problem, and seems thread safe since the isBeingDestroyed property is only ever going to be true once, and never go back to not being true. But implementing this solution led me to other nuanced issues with exceptions being thrown when the main runloop based code did finally get dispatched, presumably because the plugin is truly gone by that point.
If I put the ->isBeingDestroyed check both inside the main runloop closure and in the semaphore wait construct as above, it seems to escape the code paths without hanging but eventually crashes in an assertion in WebPage::SandboxExtensionTracker::didStartProvisionalLoad. Not really sure how that is connected, but it's a good example of why this is difficult for me to pursue unless we can come up with a "silver bullet" like Darin tried to do here!
I think this line of thinking is the right idea: how to ensure when destroying the plugin that any current OR upcoming attempt to wait on data will be ignored, without leaving dangling closures that might have trouble with that.
I’ll keep thinking about this although eventually we will want to await the arrival of an expert when I almost certainly don’t succeed. One thing that probably isn’t right about the adapted code above is that it’s inherently racy to access isBeingDestroyed from a non-main thread. And of course we don’t want that type of polling loop in our completed solution. Pull request: https://github.com/WebKit/WebKit/pull/25448 Committed 275707@main (e604d7c70127): <https://commits.webkit.org/275707@main> Reviewed commits have been landed. Closing PR #25448 and removing active labels. |