Bug 276312
| Summary: | [WPE][GTK] Internal error fired from WebLoaderStrategy.cpp(559) : internallyFailedLoadTimerFired | | |
|---|---|---|---|
| Product: | WebKit | Reporter: | Michael Catanzaro <mcatanzaro> |
| Component: | WebKitGTK | Assignee: | Simon Pena <spena> |
| Status: | RESOLVED FIXED | | |
| Severity: | Normal | CC: | aperez, berto, bugs-noreply, mcatanzaro, mcrha, paul, spena, zimmermann |
| Priority: | P2 | | |
| Version: | Other | | |
| Hardware: | PC | | |
| OS: | Linux | | |
| See Also: | https://bugs.webkit.org/show_bug.cgi?id=303704 | | |
Michael Catanzaro
In 278778@main I added logging to catch where the frequent network errors in the GTK port are coming from. It turns out 100% of our "internal errors" are coming from internallyFailedLoadTimerFired in WebLoaderStrategy.cpp:
ERROR: WebKit encountered an internal error. This is a WebKit bug.
/buildstream/gnome/sdk/webkitgtk-6.0.bst/Source/WebKit/WebProcess/Network/WebLoaderStrategy.cpp(559) : internallyFailedLoadTimerFired
Unfortunately I do not have a reliable reproducer, and the website this occurs most frequently on is internal to Red Hat. That's going to make debugging difficult. But there are three reasons for such an error:
* Network process crash (not happening in this case)
* WebLoaderStrategy::scheduleLoadFromNetworkProcess called with no sourceOrigin (*probably* not?)
* Messages::NetworkConnectionToWebProcess::ScheduleResourceLoad returns an error code (probably this?)
I don't see how NetworkConnectionToWebProcess::ScheduleResourceLoad could fail, though.
Not sure how to make further progress on this without a good reproducer.
Michael Catanzaro
I see this same error message mentioned in bug #286834, but that is supposed to be fixed in 289567@main, and I just hit this today on YouTube using 290567@main which is newer, so there must be multiple bugs that trigger this error message.
Michael Catanzaro
*** Bug 300321 has been marked as a duplicate of this bug. ***
Alberto Garcia
I can reproduce this problem very easily with WebKitGTK 2.50.2 using Epiphany 48.5 in application mode to open Discord:
$ /usr/bin/epiphany --application-mode --profile=$HOME/.local/share/org.gnome.Epiphany.WebApp_HASH https://discord.com/channels/...
ERROR: WebKit encountered an internal error. This is a WebKit bug.
Source/WebKit/WebProcess/Network/WebLoaderStrategy.cpp(618) : void WebKit::WebLoaderStrategy::internallyFailedLoadTimerFired()
ERROR: WebKit encountered an internal error. This is a WebKit bug.
Source/WebKit/WebProcess/Network/WebLoaderStrategy.cpp(618) : void WebKit::WebLoaderStrategy::internallyFailedLoadTimerFired()
ERROR: WebKit encountered an internal error. This is a WebKit bug.
Source/WebKit/WebProcess/Network/WebLoaderStrategy.cpp(618) : void WebKit::WebLoaderStrategy::internallyFailedLoadTimerFired()
Removing the cache directory solves the problem:
$ rm -rf ~/.cache/org.gnome.Epiphany.WebApp_HASH/
$ /usr/bin/epiphany --application-mode --profile=$HOME/.local/share/org.gnome.Epiphany.WebApp_HASH https://discord.com/channels/...
Alberto Garcia
The problem also happens with WebKitGTK 2.50.1
Nikolas Zimmermann
Also seen on the bots: https://build.webkit.org/results/GTK-Linux-64-bit-Debug-Tests/304475@main%20(17735)/http/tests/webrtc/filtering-ice-candidate-same-origin-frame2-crash-log.txt
Michael Catanzaro
(In reply to Nikolas Zimmermann from comment #5)
> Also seen on the bots:
> https://build.webkit.org/results/GTK-Linux-64-bit-Debug-Tests/304475@main%20(17735)/http/tests/webrtc/filtering-ice-candidate-same-origin-frame2-crash-log.txt
That log shows a use after free of SoupSession in the (unsandboxed!) network process. I'd say we have much bigger problems there than this loader issue. That's worth a separate security component bug report.
Simon Pena
I won't claim it's an exact duplicate of https://bugs.webkit.org/show_bug.cgi?id=308051, but could you retest with the fix and see if it has solved these issues?
Michael Catanzaro
Not worth testing, because this bug report predates the existence of NetworkMDNSRegisterGLib. :)
Milan Crha
webkit2gtk4.1-2.50.4:
> ERROR: WebKit encountered an internal error. This is a WebKit bug.
> /builddir/build/BUILD/webkitgtk-2.50.4-build/webkitgtk-2.50.4/Source/WebKit/WebProcess/Network/WebLoaderStrategy.cpp(618) : void WebKit::WebLoaderStrategy::internallyFailedLoadTimerFired()
I wouldn't call it a bug, or at least not an "internal error", because I closed a window that contained a WebKitWebView while it was still loading its content. The app itself was still running; the closed window was not the app window.
Michael Catanzaro
Ah, that explains a lot.
In that case, probably the solution will be to just not print the error. Of course, we do still want to print the error when the load failure is expected. But load failure because the web view is destroyed is expected and not something we should be printing.
Michael Catanzaro
> Of course, we do still want to print the error when the load failure is expected.
I meant: we do still want to print the error when the load failure is unexpected.
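The idea in the comments above (keep the error for unexpected failures, stay quiet when the web view was destroyed mid-load) can be sketched as follows. This is a minimal standalone illustration, not WebKit code and not necessarily what the eventual fix does: the `InternalFailureReason` enum, `shouldLogInternalError`, `describe`, and `reportInternalFailure` are all hypothetical names invented for this sketch. The first three enumerators mirror the three causes listed in the original report; the fourth is the assumed "view destroyed" case.

```cpp
#include <cstdio>
#include <string>

// Hypothetical failure classification; these names are illustrative only
// and are not WebKit's real types.
enum class InternalFailureReason {
    NetworkProcessCrashed,      // network process went away
    MissingSourceOrigin,        // load scheduled with no sourceOrigin
    ScheduleResourceLoadFailed, // ScheduleResourceLoad returned an error code
    WebViewDestroyed,           // assumed case: view closed while still loading
};

// Only unexpected failures should produce the "internal error" message.
bool shouldLogInternalError(InternalFailureReason reason)
{
    return reason != InternalFailureReason::WebViewDestroyed;
}

// Short human-readable label for a failure reason.
std::string describe(InternalFailureReason reason)
{
    switch (reason) {
    case InternalFailureReason::NetworkProcessCrashed:
        return "network process crashed";
    case InternalFailureReason::MissingSourceOrigin:
        return "no source origin";
    case InternalFailureReason::ScheduleResourceLoadFailed:
        return "ScheduleResourceLoad failed";
    case InternalFailureReason::WebViewDestroyed:
        return "web view destroyed";
    }
    return "unknown";
}

// Prints the error for unexpected failures. Returns true if a message was printed.
bool reportInternalFailure(InternalFailureReason reason)
{
    if (!shouldLogInternalError(reason))
        return false;
    std::fprintf(stderr, "ERROR: internal load failure (%s)\n", describe(reason).c_str());
    return true;
}
```

Under these assumptions, calling `reportInternalFailure(InternalFailureReason::WebViewDestroyed)` prints nothing, while the other three reasons still produce a message. The hard part in practice is knowing the reason at the point where the timer fires, which this sketch simply takes as an input.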
Paul van Tilburg
We've also been hit by #300321 since Debian updated from 2.48.6 to 2.50.1, and have been unable to render Salesforce dashboards properly since then.
The original post dismisses a network process crash, but I saw that its PID had been changing. On strace-ing it, I found that it does get SIGKILL-ed by something?
Once the loading/rendering of the dashboard runs into the internal error while loading its resources, it will never render the page again until we nuke the cache.
It feels like the cache gets corrupted somehow?
Even "refreshing" the page, either by destroying the web view and reconstructing it or by loading the original dashboard URL again, does not help.
Finally, when using the Document Viewer cache model, the dashboard loads and renders properly again, albeit in a much slower fashion.
Is there a way to find out what happens to the network process or cache here?
Alberto Garcia
FWIW I can no longer reproduce this crash, same computer, same website, similar conditions (well, I'm on WebKitGTK 2.50.4 now, but from what I read other people are having problems with that release). I'm also on a different Debian point release, although I don't think that fixed anything.
Paul van Tilburg
I should have been more clear about the versions we use:
We can still reproduce it with 2.50.4 (which is the version that Debian 13.4 (Trixie) has), but we haven't tested 2.50.6 yet as we await its backport.
Interestingly, we cannot reproduce it in Epiphany 48.5 by closing and reopening the tab (which I assume is similar to how our application does the refreshing).
For the moment we are stuck with WebKitGTK version 2.48.5 on Debian 12.13 (Bookworm).
Alberto Garcia
> we haven't tested 2.50.6 yet as we await its backport.
What a coincidence! :-)
https://people.debian.org/~berto/webkit2gtk-2.50.6-1/
I plan to make the official releases today or tomorrow, but unless I find surprises during testing, they're going to be identical to those.
Paul van Tilburg
Thanks for the heads up and your efforts! I tried those out, and it went immediately wrong on the first page "refresh".
I guess we'll move ahead and upgrade to Debian Trixie with the cache model set to Document Viewer for now, accept the sluggish page loads and keep looking into why this happens.
Michael Catanzaro
Do you have a URL that can be used to reproduce?
Paul van Tilburg
Unfortunately, I cannot share it, because it is a company's internal job hiring dashboard :(
And we mainly have issues with these Salesforce dashboards. I had hoped the issue would also be reproducible with Discord and the Tauri app from #300321.
I will be on the lookout for a public one!
Alberto Garcia
So it happens in both bookworm and trixie, is that correct?
If you manage to get a stack trace of the error that would be helpful.
Michael Catanzaro
A stack trace is not possible without a core dump.
OK, reviewing this issue report again, I think you're hitting this problem as a *side effect* of your network process dying, because it's expected that this error message should be printed in that case. You mentioned it was receiving SIGKILL, which is very weird. I think you need a separate bug report. (Could it be OOM killer, like systemd-oomd?)
Simon Pena
Pull request: https://github.com/WebKit/WebKit/pull/61528
EWS
Committed 310907@main (0831c81f23e3): <https://commits.webkit.org/310907@main>
Reviewed commits have been landed. Closing PR #61528 and removing active labels.