| Summary: | WTF::StackTrace::captureStackTrace broken on aarch64 (at least when called from ResourceError::internalError) | ||||||
|---|---|---|---|---|---|---|---|
| Product: | WebKit | Reporter: | Alberto Garcia <berto> | ||||
| Component: | WebKitGTK | Assignee: | Nobody <webkit-unassigned> | ||||
| Status: | RESOLVED FIXED | ||||||
| Severity: | Normal | CC: | bugs-noreply, cgarcia, mcatanzaro, webkit, zan | ||||
| Priority: | P2 | ||||||
| Version: | WebKit Nightly Build | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| See Also: | https://bugs.webkit.org/show_bug.cgi?id=245576 https://bugs.webkit.org/show_bug.cgi?id=245826 | ||||||
|
Description
Alberto Garcia
2022-08-17 03:30:06 PDT
Another stack trace:
/usr/lib/aarch64-linux-gnu/webkit2gtk-4.0/WebKitNetworkProcess
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x0000ffff7cfdfaa0 in __GI_abort () at abort.c:79
#2 0x0000ffff7fa8ac50 in WTFCrashWithInfo(int, char const*, char const*, int) () at WTF/Headers/wtf/Assertions.h:741
#3 0x0000ffff80a2d5a8 in captureStackTrace () at ../Source/WTF/wtf/StackTrace.cpp:79
#4 0x0000ffff80a08ea0 in WTFReleaseLogStackTrace () at ../Source/WTF/wtf/Assertions.cpp:592
#5 0x0000ffff83c06550 in internalError () at ../Source/WebCore/platform/network/ResourceErrorBase.cpp:97
#6 0x0000ffff820e8d1c in preconnectTo () at ../Source/WebKit/NetworkProcess/NetworkConnectionToWebProcess.cpp:735
#7 0x0000ffff81fc62f4 in callMemberFunctionImpl<WebKit::NetworkConnectionToWebProcess, void (WebKit::NetworkConnectionToWebProcess::*)(std::optional<WTF::ObjectIdentifier<WebCore::ResourceLoader> >, WebKit::NetworkResourceLoadParameters&&), std::tuple<std::optional<WTF::ObjectIdentifier<WebCore::ResourceLoader> >, WebKit::NetworkResourceLoadParameters>, 0, 1> () at ../Source/WebKit/Platform/IPC/HandleMessage.h:125
#8 callMemberFunction<WebKit::NetworkConnectionToWebProcess, void (WebKit::NetworkConnectionToWebProcess::*)(std::optional<WTF::ObjectIdentifier<WebCore::ResourceLoader> >, WebKit::NetworkResourceLoadParameters&&), std::tuple<std::optional<WTF::ObjectIdentifier<WebCore::ResourceLoader> >, WebKit::NetworkResourceLoadParameters>, std::integer_sequence<unsigned long, 0, 1> > () at ../Source/WebKit/Platform/IPC/HandleMessage.h:131
#9 handleMessage<Messages::NetworkConnectionToWebProcess::PreconnectTo, WebKit::NetworkConnectionToWebProcess, void (WebKit::NetworkConnectionToWebProcess::*)(std::optional<WTF::ObjectIdentifier<WebCore::ResourceLoader> >, WebKit::NetworkResourceLoadParameters&&)> () at ../Source/WebKit/Platform/IPC/HandleMessage.h:196
#10 didReceiveNetworkConnectionToWebProcessMessage () at DerivedSources/WebKit/NetworkConnectionToWebProcessMessageReceiver.cpp:479
#11 0x0000ffff822543d0 in dispatchMessage () at ../Source/WebKit/Platform/IPC/Connection.cpp:1134
#12 0x0000ffff82254768 in dispatchOneIncomingMessage () at ../Source/WebKit/Platform/IPC/Connection.cpp:1203
#13 0x0000ffff80a2bf40 in operator() () at ../Source/WTF/wtf/Function.h:82
#14 performWork () at ../Source/WTF/wtf/RunLoop.cpp:133
#15 0x0000ffff80a85190 in operator() () at ../Source/WTF/wtf/glib/RunLoopGLib.cpp:80
#16 __invoke () at ../Source/WTF/wtf/glib/RunLoopGLib.cpp:79
#17 0x0000ffff80a84524 in operator() () at ../Source/WTF/wtf/glib/RunLoopGLib.cpp:53
#18 __invoke () at ../Source/WTF/wtf/glib/RunLoopGLib.cpp:45
#19 0x0000ffff7d551ab4 in g_main_context_dispatch () from /usr/lib/aarch64-linux-gnu/libglib-2.0.so.0
#20 0x0000ffff7d551e5c in ?? () from /usr/lib/aarch64-linux-gnu/libglib-2.0.so.0
#21 0x0000ffff7d5521b0 in g_main_loop_run () from /usr/lib/aarch64-linux-gnu/libglib-2.0.so.0
#22 0x0000ffff80a84b20 in run () at ../Source/WTF/wtf/glib/RunLoopGLib.cpp:108
#23 0x0000ffff822280d8 in run () at ../Source/WebKit/Shared/AuxiliaryProcessMain.h:70
#24 AuxiliaryProcessMain<WebKit::NetworkProcessMainSoup> () at ../Source/WebKit/Shared/AuxiliaryProcessMain.h:96
#25 0x0000ffff7cfdfe18 in __libc_start_main (main=0x400878 <__wrap_main>, argc=3, argv=0xfffff1c90058, init=<optimized out>, fini=<optimized out>,
rtld_fini=<optimized out>, stack_end=<optimized out>) at ../csu/libc-start.c:308
#26 0x0000000000400874 in _start ()
/usr/lib/aarch64-linux-gnu/webkit2gtk-4.0/WebKitWebProcess
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x0000ffff99831aa0 in __GI_abort () at abort.c:79
#2 0x0000ffff9c2dcc50 in WTFCrashWithInfo(int, char const*, char const*, int) () at WTF/Headers/wtf/Assertions.h:741
#3 0x0000ffff9d27f5a8 in captureStackTrace () at ../Source/WTF/wtf/StackTrace.cpp:79
#4 0x0000ffff9d25aea0 in WTFReleaseLogStackTrace () at ../Source/WTF/wtf/Assertions.cpp:592
#5 0x0000ffffa0458550 in internalError () at ../Source/WebCore/platform/network/ResourceErrorBase.cpp:97
#6 0x0000ffff9edead30 in internallyFailedLoadTimerFired () at ../Source/WebKit/WebProcess/Network/WebLoaderStrategy.cpp:495
#7 0x0000ffff9d2d723c in operator() () at ../Source/WTF/wtf/glib/RunLoopGLib.cpp:177
#8 __invoke () at ../Source/WTF/wtf/glib/RunLoopGLib.cpp:169
#9 0x0000ffff9d2d6524 in operator() () at ../Source/WTF/wtf/glib/RunLoopGLib.cpp:53
#10 __invoke () at ../Source/WTF/wtf/glib/RunLoopGLib.cpp:45
#11 0x0000ffff99da3ab4 in g_main_context_dispatch () from /usr/lib/aarch64-linux-gnu/libglib-2.0.so.0
#12 0x0000ffff99da3e5c in ?? () from /usr/lib/aarch64-linux-gnu/libglib-2.0.so.0
#13 0x0000ffff99da41b0 in g_main_loop_run () from /usr/lib/aarch64-linux-gnu/libglib-2.0.so.0
#14 0x0000ffff9d2d6b20 in run () at ../Source/WTF/wtf/glib/RunLoopGLib.cpp:108
#15 0x0000ffff9eea47c4 in run () at ../Source/WebKit/Shared/AuxiliaryProcessMain.h:70
#16 AuxiliaryProcessMain<WebKit::WebProcessMainGtk> () at ../Source/WebKit/Shared/AuxiliaryProcessMain.h:96
#17 0x0000ffff99831e18 in __libc_start_main (main=0x400878 <__wrap_main>, argc=3, argv=0xfffff7b85168, init=<optimized out>, fini=<optimized out>,
rtld_fini=<optimized out>, stack_end=<optimized out>) at ../csu/libc-start.c:308
#18 0x0000000000400874 in _start ()
The network process one hits this line:
https://github.com/WebKit/WebKit/blob/webkitgtk-2.36.6/Source/WebKit/NetworkProcess/NetworkConnectionToWebProcess.cpp#L735
Huh, there's a lot going on here.

First, WTFReleaseLogStackTrace is broken. It's a long function with a bunch of code, but the first line calls WTF::StackTrace::captureStackTrace, which is fatal and does not return, so the rest is all pointless. WTFReleaseLogStackTrace is clearly not intended to be fatal. Note that ResourceError::internalError is the only place where it is ever used for WPE/GTK. The only other uses are in PixelBufferConformerCV.cpp, which is platform-specific. So that's why we didn't notice.

As for the errors themselves, there are two different traces:

(1) Web process crash in WebLoaderStrategy::internallyFailedLoadTimerFired. It seems the web process is designed to call ResourceError::internalError whenever the network process crashes. So this crash is just a symptom of the network process crash. I don't think we need to investigate this further: fixing WTFReleaseLogStackTrace and fixing the network process crash would suffice.

(2) Network process crash when calling NetworkConnectionToWebProcess::preconnectTo. We should look closer to decide what to do here. Although fixing WTFReleaseLogStackTrace would avoid the crash, I think we should go further and ensure that ResourceError::internalError does not get called. Note this only happens when ENABLE_SERVER_PRECONNECT is disabled, so the crash is specific to libsoup 2 builds only. Probably we should drop the request in NetworkConnectionToWebProcess::preconnectTo with some different error, but another option would be to find everywhere that calls it and guard it behind ENABLE_SERVER_PRECONNECT.

(In reply to Michael Catanzaro from comment #2)
> Huh, there's a lot going on here.
>
> First, WTFReleaseLogStackTrace is broken. It's a long function with a bunch
> of code, but the first line calls WTF::StackTrace::captureStackTrace, which
> is fatal and does not return, so the rest is all pointless.
Oh, it looks like this is not expected, but rather a bug in StackTrace::captureStackTrace:

    WTFGetBacktrace(&trace->m_skippedFrame0, &numberOfFrames);
    if (numberOfFrames) {
        RELEASE_ASSERT(numberOfFrames >= framesToSkip);

That calls backtrace() from execinfo.h, see the manpage backtrace(3). I wonder if something goes wrong there only on aarch64.

I think we should never try a preconnect when ENABLE_SERVER_PRECONNECT is disabled.

Created attachment 462492 [details]
Patch
Could someone try this patch?
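
As an illustration of the idea stated above (never attempting a preconnect when ENABLE_SERVER_PRECONNECT is disabled), here is a minimal standalone sketch. It is not the content of attachment 462492 and it is not WebKit code: `PreconnectSender`, its `sentMessages` member, and the default value given to the macro are all hypothetical names and assumptions made for this example.

```cpp
#include <string>
#include <vector>

// Assumption for this sketch: the feature flag defaults to 0, mimicking a
// build where server preconnect is unavailable (the libsoup 2 case).
#ifndef ENABLE_SERVER_PRECONNECT
#define ENABLE_SERVER_PRECONNECT 0
#endif

// Hypothetical stand-in for the sending side of the preconnect message.
struct PreconnectSender {
    std::vector<std::string> sentMessages;

    // Returns true if a preconnect message was queued, false if it was skipped.
    bool preconnectTo(const std::string& url)
    {
#if ENABLE_SERVER_PRECONNECT
        sentMessages.push_back("PreconnectTo " + url);
        return true;
#else
        (void)url; // Feature compiled out: never send the message.
        return false;
#endif
    }
};
```

With the flag at 0 the sender never queues a message, so the receiving side never reaches the code path that reports an internal error for an unsupported request.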
(In reply to Carlos Garcia Campos from comment #5)
> Created attachment 462492 [details]
> Patch
>
> Could someone try this patch?

I have just tried it on top of 2.36.7 and it doesn't help - the network process still crashes in the same way.

Then I need to know where preconnectTo is called, and unfortunately that's not in the backtraces.

It's the Messages::NetworkConnectionToWebProcess::PreconnectTo message in WebLoaderStrategy::preconnectTo(). But the point of crash is the release assert in stack trace capturing, assuming some amount of frames that the libc's backtrace on this specific platform/configuration can't extract.

Let's split into two bugs:
* Created bug #245576 for the problem with preconnect that causes the internal error to be logged and stacktrace to print
* Retitled this bug to focus it on not crashing when printing the stacktrace

(In reply to Carlos Garcia Campos from comment #5)
> Could someone try this patch?

It doesn't solve the problem with 2.38.0 either.

According to Sebastian this patch fixes the crash: https://github.com/WebKit/WebKit/pull/4790

More news: there might be a compiler problem here, because all these crashes are happening when WebKit is compiled with clang. With gcc it seems stable (or more stable at least). I'm talking about gcc 10.2.1 and clang 11.0.

The RELEASE_ASSERT triggering these crashes is in the process of being removed:
https://bugs.webkit.org/show_bug.cgi?id=245826
https://github.com/WebKit/WebKit/pull/4830

(In reply to Zan Dobersek from comment #13)
> The RELEASE_ASSERT triggering these crashes is in the process of being
> removed:
> https://bugs.webkit.org/show_bug.cgi?id=245826
> https://github.com/WebKit/WebKit/pull/4830

This change has landed. |
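
For reference, the failure mode discussed in this thread can be sketched outside WebKit. The snippet below is a hedged illustration only, not the change that landed in the linked pull requests: it assumes a glibc-style `backtrace()` from execinfo.h, and the helper names `usableFrameCount` and `logStackTraceNonFatal` are hypothetical. It shows the general alternative to asserting that the captured frame count is at least the number of frames to skip: clamp the count, so a short or empty backtrace produces no output instead of a crash.

```cpp
#include <execinfo.h>
#include <unistd.h>
#include <algorithm>
#include <cstdio>

// Hypothetical helper: frames left after skipping the capture machinery's
// own frames. Clamps to zero rather than asserting captured >= skipped.
static int usableFrameCount(int capturedFrames, int framesToSkip)
{
    return std::max(0, capturedFrames - framesToSkip);
}

// Captures a backtrace with backtrace(3) and writes the usable frames to
// stderr. Never aborts; returns how many frames were written.
static int logStackTraceNonFatal(int framesToSkip)
{
    constexpr int maxFrames = 64;
    void* frames[maxFrames];
    int captured = backtrace(frames, maxFrames);
    int usable = usableFrameCount(captured, framesToSkip);
    std::fprintf(stderr, "captured %d frame(s), %d usable after skipping %d\n",
        captured, usable, framesToSkip);
    // usable > 0 implies framesToSkip < captured, so the offset is in range.
    if (usable > 0)
        backtrace_symbols_fd(frames + framesToSkip, usable, STDERR_FILENO);
    return usable;
}
```

Calling `logStackTraceNonFatal(2)` from a logging path prints whatever frames the platform could unwind. If the unwinder returns fewer frames than requested to skip, as the comments above suspect happens on this aarch64 configuration, the function prints only the summary line and returns 0.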