Bug 183348 - REGRESSION(r222772): [GTK][WPE] WebProcess from WebKitGtk+ 2.19.9x SIGSEVs in WebKit::WebProcess::ensureNetworkProcessConnection() at Source/WebKit/WebProcess/WebProcess.cpp:1127
Summary: REGRESSION(r222772): [GTK][WPE] WebProcess from WebKitGtk+ 2.19.9x SIGSEVs in...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK (show other bugs)
Version: WebKit Local Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
: 184006 184869 (view as bug list)
Depends on:
Blocks:
 
Reported: 2018-03-05 14:39 PST by Andres Gomez Garcia
Modified: 2018-07-17 22:49 PDT (History)
8 users (show)

See Also:


Attachments
BT from gdb for the WebProcess (106.74 KB, text/plain)
2018-03-05 14:39 PST, Andres Gomez Garcia
no flags Details
bt full from xreader (19.65 KB, text/x-log)
2018-04-08 21:41 PDT, Tomas Popela
no flags Details
Patch (5.56 KB, patch)
2018-04-09 02:58 PDT, Carlos Garcia Campos
no flags Details | Formatted Diff | Diff
Updated patch using platform ifdefs (6.07 KB, patch)
2018-04-28 03:07 PDT, Carlos Garcia Campos
mcatanzaro: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Andres Gomez Garcia 2018-03-05 14:39:49 PST
Created attachment 335030 [details]
BT from gdb for the WebProcess

I'm using epiphany 3.27.90-22-ge89d8de with WebKitGtk+ 2.19.91 from gnome-nightly's flatpak.

I see often crashed tabs. When inspecting in my system with coredumpctl I've found several cores.

In this case, the WebProcess is SIGSEVing. I do not know the conditions for this.
Comment 1 Andres Gomez Garcia 2018-03-05 14:40:56 PST
FTR, I have several duplicated cores of this type.
Comment 2 Michael Catanzaro 2018-03-05 17:03:01 PST
This isn't itself a bug: it just means the network process crashed. Probably because of bug #183347, I would assume.

*** This bug has been marked as a duplicate of bug 183347 ***
Comment 3 Michael Catanzaro 2018-03-05 17:07:42 PST
I meant bug #183346. The network process crash, of course.

*** This bug has been marked as a duplicate of bug 183346 ***
Comment 4 Andres Gomez Garcia 2018-03-06 01:18:17 PST
(In reply to Michael Catanzaro from comment #2)
> This isn't itself a bug: it just means the network process crashed. Probably
> because of bug #183347, I would assume.
> 
> *** This bug has been marked as a duplicate of bug 183347 ***

It is a bug since it generates a core after a SIGSEV.

Surely we can close the process in an ordered way (?).
Comment 5 Michael Catanzaro 2018-03-06 05:24:53 PST
I can change WTFCrash() to generate a core after a SIGABRT, if you prefer. I've wanted to do that for a long time, actually.

It definitely needs to generate a core, though. I don't think it's problematic that a crash in the network process triggers a corresponding crash in the web process.
Comment 6 Andres Gomez Garcia 2018-03-06 06:00:33 PST
(In reply to Michael Catanzaro from comment #5)
> I can change WTFCrash() to generate a core after a SIGABRT, if you prefer.
> I've wanted to do that for a long time, actually.
> 
> It definitely needs to generate a core, though. I don't think it's
> problematic that a crash in the network process triggers a corresponding
> crash in the web process.

Why does it have to crash the WebProcess? Why cannot you just generate an error condition and exit properly?
Comment 7 Michael Catanzaro 2018-03-06 06:25:59 PST
(In reply to Andres Gomez Garcia from comment #6)
> Why does it have to crash the WebProcess? Why cannot you just generate an
> error condition and exit properly?

Because then it's waaaay harder to figure out why your web process is dying unexpectedly. :)
Comment 8 Andres Gomez Garcia 2018-03-06 07:27:59 PST
(In reply to Michael Catanzaro from comment #7)
> Because then it's waaaay harder to figure out why your web process is dying
> unexpectedly. :)

Sorry, but I think it is pretty much the other way around.

Right now I have a WebProcess core and I have to inspect the core to try to figure out if it really crashed WebProcess or NetworkProcess (for which I already have a core anyway).

What I'm saying is that the WebProcess shoudl identify the error, meaning, the NetworkProcess died, and show a message accordingly to the user, giving also the option to reload or, whichever seems appropriate.

I think that makes much more sense than just crashing the WebProcess too.
Comment 9 Michael Catanzaro 2018-03-10 09:11:52 PST
I guess the difference is that WebKit is designed to allow graceful recovery from  web process crashes, but not from network process crashes. I'll let Carlos decide this one, but if we change the web process to exit gracefully, we should consider adding a new WebProcessTermination reason for that to our API. Personally, I don't think it's worth it.
Comment 10 Andres Gomez Garcia 2018-03-27 07:13:55 PDT
Still happening with epiphany 3.27.90-38-gdb76a7f with WebKitGtk+ 2.19.92 from gnome-nightly's flatpak.
Comment 11 Michael Catanzaro 2018-03-27 07:29:45 PDT
My preference is to keep the crash, so I'm not planning to change anything here. Unless Carlos wants to change this to quit with exit() instead, this is WONTFIX.

Would be great to get a bug report for the network process crash, of course.
Comment 12 Michael Catanzaro 2018-03-27 07:35:35 PDT
BTW, I got some report recently (on IRC? on Matrix? it definitely did not make its way to Bugzilla, so I guess it's lost to time) that this crash was being hit without a corresponding network process crash, meaning the network process is quitting without a crash to indicate why. Without a network process backtrace, that's going to be extremely difficult to debug. This is why crash signals are good. :)
Comment 13 Michael Catanzaro 2018-03-28 09:34:39 PDT
Actually it did: bug #184006
Comment 14 Michael Catanzaro 2018-04-07 10:42:26 PDT
Reopening this. We have a problem here in the network process, and this web process backtrace really is all we get.
Comment 15 Michael Catanzaro 2018-04-07 10:43:07 PDT
This is a regression in 2.20
Comment 16 Michael Catanzaro 2018-04-07 10:47:12 PDT
(In reply to Michael Catanzaro from comment #15)
> This is a regression in 2.20

2.20.0 reached stable yesterday in Fedora, and we already have 181 total reports from 70 users.

I have no clue what we can do to debug this, except try to reproduce following the steps in bug #184006, which is clearly related.
Comment 17 Michael Catanzaro 2018-04-07 10:47:15 PDT
*** Bug 184006 has been marked as a duplicate of this bug. ***
Comment 18 Tomas Popela 2018-04-08 21:41:21 PDT
Created attachment 337480 [details]
bt full from xreader

I'm attaching the bt full from the xreader crash. I was attached to WebProcess as well as NetworkProcess. At the time when WebProcess crashed the NetworkProcess was ok - no crash occurred there.
Comment 19 Carlos Garcia Campos 2018-04-08 23:50:41 PDT
(In reply to Tomas Popela from comment #18)
> Created attachment 337480 [details]
> bt full from xreader
> 
> I'm attaching the bt full from the xreader crash. I was attached to
> WebProcess as well as NetworkProcess. At the time when WebProcess crashed
> the NetworkProcess was ok - no crash occurred there.

But was it still alive? or did it exit normally?
Comment 20 Tomas Popela 2018-04-09 00:20:52 PDT
(In reply to Carlos Garcia Campos from comment #19)
> But was it still alive? or did it exit normally?

It was still alive.
Comment 21 Carlos Garcia Campos 2018-04-09 02:04:38 PDT
This regressed in r222772. This is not a network process crash, the network process is killed by the UI process while the web process is still initializing because the web process pool is destroyed. This happens if you create a web view and destroy it in the next run loop iteration. xreader always creates a web view, the loads the document and if it's not an epub the web view is destroyed. I have a unit test now to reproduce the issue. Before r222772 the web process returned with exit(0) before sendSync() returns, now it returns false and CRASH is called.
Comment 22 Carlos Garcia Campos 2018-04-09 02:58:19 PDT
Created attachment 337489 [details]
Patch
Comment 23 Tomas Popela 2018-04-09 03:22:30 PDT
I can confirm that this fixes the xreader crash. Thank you Carlos! Good work as always!
Comment 24 Michael Catanzaro 2018-04-09 12:49:10 PDT
Comment on attachment 337489 [details]
Patch

Wow.

Just... wow.

How did you pin this to r222772? I'd kind of favor reverting that instead, since I don't like the platform-specific behavior here, but I guess this is OK and I like the test. It needs an owner, though.
Comment 25 Michael Catanzaro 2018-04-09 12:54:13 PDT
(In reply to Michael Catanzaro from comment #16)
> (In reply to Michael Catanzaro from comment #15)
> > This is a regression in 2.20
> 
> 2.20.0 reached stable yesterday in Fedora, and we already have 181 total
> reports from 70 users.

Two days later and we're at 782 reports from 236 unique users... I guess if we leave it broken for a while, we can get a pretty good estimate of how many users we have in Fedora. :P
Comment 26 Carlos Garcia Campos 2018-04-10 00:25:49 PDT
(In reply to Michael Catanzaro from comment #25)
> (In reply to Michael Catanzaro from comment #16)
> > (In reply to Michael Catanzaro from comment #15)
> > > This is a regression in 2.20
> > 
> > 2.20.0 reached stable yesterday in Fedora, and we already have 181 total
> > reports from 70 users.
> 
> Two days later and we're at 782 reports from 236 unique users... I guess if
> we leave it broken for a while, we can get a pretty good estimate of how
> many users we have in Fedora. :P

And what is more concerning, why so many people use xreader?
Comment 27 Carlos Garcia Campos 2018-04-10 00:29:39 PDT
(In reply to Michael Catanzaro from comment #24)
> Comment on attachment 337489 [details]
> Patch
> 
> Wow.
> 
> Just... wow.
> 
> How did you pin this to r222772? I'd kind of favor reverting that instead,
> since I don't like the platform-specific behavior here, but I guess this is
> OK and I like the test. It needs an owner, though.

Reverting is another option, Miguel has confirmed that r222772 didn't fix the issue in all the cases and we are still not cleaning up everything related to GStreamer. I don't want to make big changes about this in 2.20, so I'm going to merge this commit there to release 2.20.1. For trunk, Miguel is checking it, but I'm afraid any solution will require platform ifdefs anyway. My other option was to go back to exit on sync failure, but changing the exit for a main loop quit, and explicitly release whatever we need (cookies, GStreamer, etc) in the main function after the main loop quits.
Comment 28 Tomas Popela 2018-04-10 00:32:41 PDT
(In reply to Carlos Garcia Campos from comment #26)
> And what is more concerning, why so many people use xreader?

I think that they are mostly users who migrated from Linux Mint to Fedora or Cinnamon users, where the xreader is the default document viewer.
Comment 29 Michael Catanzaro 2018-04-10 00:53:08 PDT
There's no evidence as to how many reports are coming from xreader... we'd only know that if it were a UI process crash.
Comment 30 Michael Catanzaro 2018-04-10 18:09:08 PDT
(In reply to Michael Catanzaro from comment #25)
> Two days later and we're at 782 reports from 236 unique users... I guess if
> we leave it broken for a while, we can get a pretty good estimate of how
> many users we have in Fedora. :P

One day later, 1605 reports from 409 users... I should never again second-guess the importance of bugs reported by Andres. :P
Comment 31 Michael Catanzaro 2018-04-11 19:59:18 PDT
(In reply to Michael Catanzaro from comment #30)
> (In reply to Michael Catanzaro from comment #25)
> > Two days later and we're at 782 reports from 236 unique users... I guess if
> > we leave it broken for a while, we can get a pretty good estimate of how
> > many users we have in Fedora. :P
> 
> One day later, 1605 reports from 409 users... I should never again
> second-guess the importance of bugs reported by Andres. :P

2298 reports from 511 users as of today... looks like a big drop in the increase in unique users. That's lower than I'd expected.

BTW to calculate these numbers, just add the totals from these two links:

https://retrace.fedoraproject.org/faf/reports/2077056/
https://retrace.fedoraproject.org/faf/reports/2112254/
Comment 32 Michael Catanzaro 2018-04-13 10:50:21 PDT
(In reply to Michael Catanzaro from comment #24)
> Comment on attachment 337489 [details]
> Patch
> 
> Wow.
> 
> Just... wow.
> 
> How did you pin this to r222772? I'd kind of favor reverting that instead,
> since I don't like the platform-specific behavior here, but I guess this is
> OK and I like the test. It needs an owner, though.

r=me if you really want to keep r222772
Comment 33 Carlos Garcia Campos 2018-04-14 01:30:34 PDT
(In reply to Michael Catanzaro from comment #32)
> (In reply to Michael Catanzaro from comment #24)
> > Comment on attachment 337489 [details]
> > Patch
> > 
> > Wow.
> > 
> > Just... wow.
> > 
> > How did you pin this to r222772? I'd kind of favor reverting that instead,
> > since I don't like the platform-specific behavior here, but I guess this is
> > OK and I like the test. It needs an owner, though.
> 
> r=me if you really want to keep r222772

Yes, that's what we already have in 2.20.1, but for trunk we need an owner. Alex? Note that this is actually dead code for other ports, because at that point Connection has already called exit(0).
Comment 34 Alex Christensen 2018-04-16 08:38:03 PDT
Comment on attachment 337489 [details]
Patch

I think it's a bad idea to add code that's implicitly port-specific like this.  Why are you so sure you want to keep r222772?  Why are these sync messages failing?  Could they be made more robust?
Comment 35 Michael Catanzaro 2018-04-16 11:33:29 PDT
(In reply to Alex Christensen from comment #34)
> Comment on attachment 337489 [details]
> Patch
> 
> I think it's a bad idea to add code that's implicitly port-specific like
> this. Why are you so sure you want to keep r222772? 

My preference is to roll out both r222772 and r230481, and carefully reconsider what we're doing here. Having special process termination behavior for GTK/WPE will lead to more platform-specific problems in the future.

I've already pointed out why the motivation for r222772 and r230481 is incompatible with the WebKit multiprocess design. The goal there was to prevent leaks of hardware resources that are not released by process termination, but this simply cannot be right because the web process is *expected* to crash before releasing its resources.

> Why are these sync
> messages failing?  Could they be made more robust?

The sync messages are failing because the network process is already gone. I guess we could try to ensure that all web processes have terminated cleanly before attempting to bring down the network process, but I'm not sure if that would really be a good idea.
Comment 36 Carlos Garcia Campos 2018-04-16 22:18:59 PDT
(In reply to Alex Christensen from comment #34)
> Comment on attachment 337489 [details]
> Patch
> 
> I think it's a bad idea to add code that's implicitly port-specific like
> this.  Why are you so sure you want to keep r222772?  Why are these sync
> messages failing?  Could they be made more robust?

I can add platform ifdefs instead, but the change is actually required when connection doesn't exit on sync message failure, it's specific to GTK and WPE, because we disable that option. I want to keep r222772 because it has worked fine since we added it (except for this very specific case of creating and destroying a web view very quickly). This particular sync message fails because the UI process closes the connections while the web page is still being initialized, so it can't connect to the network process. I don't see why we need to have the same behavior than Apple here, and we already discussed this in bug #168126. We really need to exit cleanly to clean up resources, while Apple doesn't. Most of those resources are multimedia related, not our fault and not under our control (GStreamer), but we end up leaking then and we can avoid it easily.
Comment 37 Michael Catanzaro 2018-04-19 11:49:46 PDT
This patch doesn't seem likely to be approved, and we still need to fix the issue:

(In reply to Michael Catanzaro from comment #35)
> My preference is to roll out both r222772 and r230481, and carefully
> reconsider what we're doing here. Having special process termination
> behavior for GTK/WPE will lead to more platform-specific problems in the
> future.
Comment 38 Carlos Garcia Campos 2018-04-19 22:27:55 PDT
(In reply to Michael Catanzaro from comment #37)
> This patch doesn't seem likely to be approved, and we still need to fix the
> issue:
> 
> (In reply to Michael Catanzaro from comment #35)
> > My preference is to roll out both r222772 and r230481, and carefully
> > reconsider what we're doing here. Having special process termination
> > behavior for GTK/WPE will lead to more platform-specific problems in the
> > future.

Why not? I still think this is the best fix, we are already using this in 2.20 without any issues so far.
Comment 39 Michael Catanzaro 2018-04-20 09:15:28 PDT
> (In reply to Michael Catanzaro from comment #35)
> Having special process termination
> behavior for GTK/WPE will lead to more platform-specific problems in the
> future.

With those reservations, I could r+ if you add platform #ifdefs, I guess, so that the code would at least not be secretly platform-specific and wouldn't need an owner. It just seems like such a shame to be using platform #ifdefs for code that has no reason to be platform-specific.
Comment 40 Carlos Garcia Campos 2018-04-20 22:54:35 PDT
(In reply to Michael Catanzaro from comment #39)
> > (In reply to Michael Catanzaro from comment #35)
> > Having special process termination
> > behavior for GTK/WPE will lead to more platform-specific problems in the
> > future.
> 
> With those reservations, I could r+ if you add platform #ifdefs, I guess, so
> that the code would at least not be secretly platform-specific and wouldn't
> need an owner. It just seems like such a shame to be using platform #ifdefs
> for code that has no reason to be platform-specific.

Because it doesn't really need platform ifdefs, this code depends on whether the connection is exiting in case of send sync message failure or not. It happens that only GTK and WPE change that setting in the web process to not exit in case of send sync failure, because we *need* to terminate the web process to ensure resources are properly freed. So, there are reasons for us to have a different behavior than other platforms, *not leaking resources*. If there are more problems in the future we will fix them, but not fixing this only because other platforms don't need to terminate the web process this way doesn't make sense either. Note that not exiting on send sync failure makes the process termination more consistent/deterministic, because without it sometimes we finish on exit(0) and sometimes we don't, depending on when the connection is closed. I've had to debug that process several times since WebKit2, and others too (ask Miguel), and it's a pain.
Comment 41 Michael Catanzaro 2018-04-21 20:04:43 PDT
(In reply to Carlos Garcia Campos from comment #38)
> we are already using this in 2.20 without any issues so far.

Bug #184869 looks suspicious. I wonder if that is going to be an issue.

(In reply to Carlos Garcia Campos from comment #40)
> Because it doesn't really need platform ifdefs, this code depends on whether
> the connection is exiting in case of send sync message failure or not.

Indeed....

Anyway, to commit without ifdefs, you know you need to convince Alex, not me.

> It
> happens that only GTK and WPE change that setting in the web process to not
> exit in case of send sync failure, because we *need* to terminate the web
> process to ensure resources are properly freed. So, there are reasons for us
> to have a different behavior than other platforms, *not leaking resources*.

I'll ask again: what's the plan to free these hardware resources when the web process crashes? Seems like any application expecting the web process to never crash is not going to last very long.

> If there are more problems in the future we will fix them, but not fixing
> this only because other platforms don't need to terminate the web process
> this way doesn't make sense either. Note that not exiting on send sync
> failure makes the process termination more consistent/deterministic, because
> without it sometimes we finish on exit(0) and sometimes we don't, depending
> on when the connection is closed. I've had to debug that process several
> times since WebKit2, and others too (ask Miguel), and it's a pain.

That's a good point.
Comment 42 Carlos Garcia Campos 2018-04-21 23:39:18 PDT
(In reply to Michael Catanzaro from comment #41)
> (In reply to Carlos Garcia Campos from comment #38)
> > we are already using this in 2.20 without any issues so far.
> 
> Bug #184869 looks suspicious. I wonder if that is going to be an issue.
> 
> (In reply to Carlos Garcia Campos from comment #40)
> > Because it doesn't really need platform ifdefs, this code depends on whether
> > the connection is exiting in case of send sync message failure or not.
> 
> Indeed....
> 
> Anyway, to commit without ifdefs, you know you need to convince Alex, not me.

Right, but you are not helping me on that either. I don't mind to add platform ifdefs, but I prefer not to precisely because it gives the message that the code is specific to GTK and WPE and not to the ports not failing on send sync message.

> > It
> > happens that only GTK and WPE change that setting in the web process to not
> > exit in case of send sync failure, because we *need* to terminate the web
> > process to ensure resources are properly freed. So, there are reasons for us
> > to have a different behavior than other platforms, *not leaking resources*.
> 
> I'll ask again: what's the plan to free these hardware resources when the
> web process crashes? Seems like any application expecting the web process to
> never crash is not going to last very long.

It's better to leak resources always on crash than leaking on crash but also randomly when normal terminating the web process.

> > If there are more problems in the future we will fix them, but not fixing
> > this only because other platforms don't need to terminate the web process
> > this way doesn't make sense either. Note that not exiting on send sync
> > failure makes the process termination more consistent/deterministic, because
> > without it sometimes we finish on exit(0) and sometimes we don't, depending
> > on when the connection is closed. I've had to debug that process several
> > times since WebKit2, and others too (ask Miguel), and it's a pain.
> 
> That's a good point.
Comment 43 Carlos Garcia Campos 2018-04-25 23:36:21 PDT
So Alex, are you ok with this patch, or do you want me to add platform ifdefs? We need to fix this in trunk, because it's currently fixed in 2.20.x but not in 2.21.x
Comment 44 Alex Christensen 2018-04-26 10:12:03 PDT
I don't think it's a good design to use a bool that is only set in platform-specific code for this sort of thing.  It gives no insight into why we should exit on sync message sending failure.  For example, it makes me wonder why it is only set in the WebProcess.  Don't the other processes need it for the same reason, or do we just not set it there because we haven't observed many crashes there?  What is the underlying reason why we need to exit?  It also looks platform-independent when it is not.
Comment 45 Carlos Garcia Campos 2018-04-27 01:07:52 PDT
(In reply to Alex Christensen from comment #44)
> I don't think it's a good design to use a bool that is only set in
> platform-specific code for this sort of thing.  It gives no insight into why
> we should exit on sync message sending failure.  For example, it makes me
> wonder why it is only set in the WebProcess.  Don't the other processes need
> it for the same reason, or do we just not set it there because we haven't
> observed many crashes there?  What is the underlying reason why we need to
> exit?  It also looks platform-independent when it is not.

Well, the reason why setShouldExitOnSyncMessageSendFailure() exists is probably out of scope of this bug, but I'll try to explain:

 - It was initially added in r78392 to fix a crash when the UI process closes the connection while the web process was handling a message. At that time (2011) only web and plugin process existed and both were implementing the connection client method to exit(0) with these comments:

// We were making a synchronous call to a web process that doesn't exist any more.
// Callers are unlikely to be prepared for an error like this, so it's best to exit immediately.

// We were making a synchronous call to a UI process that doesn't exist any more.
// Callers are unlikely to be prepared for an error like this, so it's best to exit immediately.

 - Then in r84288 (still 2011) the connection client method was changed to a simple flag (bool), the plugin process implementation was removed but not replaced with a call to setShouldExitOnSyncMessageSendFailure(true) like the web process. I don't know why because neither the changelog nor the bug report says why.

 - New processes were added and none of them call setShouldExitOnSyncMessageSendFailure(true). This probably proves that callers are indeed prepared to fail on send sync message, since all process are handling them (and now also the web process in GTK and WPE).

 - This behavior has always been a headache for me, I've had to debug the web process termination several times to fix different bugs and the inconsistent non-deterministic way of terminating the web process has always made it painful. But not only because of that, but because we have needed a proper termination of the web process multiple times in the past (to save the clipboard contents, to dump the old soup disk cache, and some other things we needed to to on shutdown).

 - I tried to fix it once and for all in bug #147036 (2015) but Darin opposed to it because that might slow down the web process termination (I agree it's important that a tab closes immediately, but that's still the case for us). The bug was closed as WONTFIX and I tried to workaround all the issues we had related tot that.

 - But new issues appeared related to the web process exiting on sync message failure. And I finally got an approval to not exit on sync message failure for GTK and WPE in bug #168126 (2018). Brady said that the change would slow down browser quitting when there are many tabs, and that it's not desirable for the platform that don't really needed it. And I agree, that's what makes this change platform specific, it's an optimization for platforms where all resources are guaranteed to be released no matter how the process is terminated. That's unfortunately not our case, and it's also probably not WebKit faults, but it's so easy to fix. 

 - So, after years I finally go the approval and landed the patch. WebKitGTK+ 2.20 was released with the patch and nobody has complained of browser being slow to close, I hadn't notice any slow down and I always have a lot of tabs open. So, I think this was the right fix to ensure resources are released and makes web process termination consistent with all other processes and deterministic.

 - But this bug showed up, because the existing code assumes that the web process always exits on sync message failure. But that's no longer the case for us. I fixed this bug in 2.20.1 and nobody has complained either.

 - So, this is not platform specific, but we decided to use a different behavior for GTK and WPE, because we need it, and to make sure other platforms that don't need it are not affected. I would be more than happy if we simply remove setShouldExitOnSyncMessageSendFailure(), but I don't know if that will have any implication in iOS/Mac.

Sorry for the long comment, I hope it helps.
Comment 46 Alex Christensen 2018-04-27 10:06:32 PDT
Bug 168126 made this platform-specific.  I'd prefer it have platform macros if you really want to make the change to linux ports here.
Comment 47 Carlos Garcia Campos 2018-04-28 03:06:29 PDT
*** Bug 184869 has been marked as a duplicate of this bug. ***
Comment 48 Carlos Garcia Campos 2018-04-28 03:07:02 PDT
Created attachment 339063 [details]
Updated patch using platform ifdefs
Comment 49 Carlos Garcia Campos 2018-05-03 01:45:44 PDT
Committed r231298: <https://trac.webkit.org/changeset/231298>
Comment 50 Michael Catanzaro 2018-05-07 07:33:21 PDT
Comment on attachment 339063 [details]
Updated patch using platform ifdefs

View in context: https://bugs.webkit.org/attachment.cgi?id=339063&action=review

> Source/WebKit/WebProcess/WebProcess.cpp:1241
> +#if PLATFORM(GTK) || PLATFORM(WPE)
> +            // GTK+ and WPE ports don't exit on send sync message failure.
> +            // In this particular case, the storage process can be terminated by the UI process while the
> +            // connection is being done, so we always want to exit instead of crashing.
> +            // See https://bugs.webkit.org/show_bug.cgi?id=183348.
> +#else

Oops, there's no exit() here... isn't this going to crash a few lines below?
Comment 51 Carlos Garcia Campos 2018-05-07 22:38:45 PDT
(In reply to Michael Catanzaro from comment #50)
> Comment on attachment 339063 [details]
> Updated patch using platform ifdefs
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=339063&action=review
> 
> > Source/WebKit/WebProcess/WebProcess.cpp:1241
> > +#if PLATFORM(GTK) || PLATFORM(WPE)
> > +            // GTK+ and WPE ports don't exit on send sync message failure.
> > +            // In this particular case, the storage process can be terminated by the UI process while the
> > +            // connection is being done, so we always want to exit instead of crashing.
> > +            // See https://bugs.webkit.org/show_bug.cgi?id=183348.
> > +#else
> 
> Oops, there's no exit() here... isn't this going to crash a few lines below?

I don't know what happened there, I guess I messed it up in a rebase/merge. I'll fix it, thanks.
Comment 52 Carlos Garcia Campos 2018-05-07 23:09:38 PDT
Committed r231482: <https://trac.webkit.org/changeset/231482>
Comment 53 Chris Dumez 2018-07-17 20:43:03 PDT
Comment on attachment 339063 [details]
Updated patch using platform ifdefs

View in context: https://bugs.webkit.org/attachment.cgi?id=339063&action=review

> Source/WebKit/WebProcess/WebProcess.cpp:1152
> +#if PLATFORM(GTK) || PLATFORM(WPE)

The way COCOA deals with this is:
#if !PLATFORM(GTK) && !PLATFORM(WPE)
    connection->setShouldExitOnSyncMessageSendFailure(true);
#endif

In the same file. Could we harmonize the ports. Seems weird that GTK / WPE is doing it differently.
Comment 54 Carlos Garcia Campos 2018-07-17 22:49:11 PDT
(In reply to Chris Dumez from comment #53)
> Comment on attachment 339063 [details]
> Updated patch using platform ifdefs
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=339063&action=review
> 
> > Source/WebKit/WebProcess/WebProcess.cpp:1152
> > +#if PLATFORM(GTK) || PLATFORM(WPE)
> 
> The way COCOA deals with this is:
> #if !PLATFORM(GTK) && !PLATFORM(WPE)
>     connection->setShouldExitOnSyncMessageSendFailure(true);
> #endif
> 
> In the same file. Could we harmonize the ports. Seems weird that GTK / WPE
> is doing it differently.

We need a proper web process termination to avoid leaking resources (not memory) but GStreamer cache files and some other things that depend on other platform libraries. See Comment #45, I explained the whole situation there.