Bug 228296

Summary: REGRESSION (iOS 15): Websocket connection instance in javascript client getting closed
Product: WebKit Reporter: ABDUL RAHIMAN MULLA <abdulrahiman575>
Component: Page Loading Assignee: Nobody <webkit-unassigned>
Status: RESOLVED MOVED    
Severity: Blocker CC: achristensen, alberto, aleckgibson, alexhultman, beidson, bfulgham, cdumez, eric.oconnell, firstcontact, gsnedders, isaac+webkit, jorge, Justin, mads, marc_aurel, mattwindwer, mwake, renchap, rychouwei, sathiam.m, sathiamoorthy.m, tmnousia, viesturs.proskins, webkit-bug-importer, youennf
Priority: P2 Keywords: InRadar
Version: Safari Technology Preview   
Hardware: iPhone / iPad   
OS: Other   
See Also: https://bugs.webkit.org/show_bug.cgi?id=228329
Attachments (Description: Flags):
Screencast of the bug: none
Screenshot of the error: none
Screenshot showing the web socket error when using chats in Basecamp: none

Description ABDUL RAHIMAN MULLA 2021-07-26 13:13:30 PDT
When a WebSocket server implemented with websocket-sharp (C#) sends a response that is written to the socket stream as multiple fragments, the connection on the client instance is closed.

WebSocket connection to 'ws://******:****/' failed: The operation couldn’t be completed. (kNWErrorDomainPOSIX error 100 - Protocol error)
Comment 1 ABDUL RAHIMAN MULLA 2021-07-26 13:15:10 PDT
This issue is observed only in iPadOS 15 beta; it worked as expected in iPadOS 14.6 and earlier versions.
Comment 2 Alexey Proskuryakov 2021-07-26 15:31:39 PDT
Could you please provide a test case (preferably live, as I'm not sure if anyone here would know how to run C# code).
Comment 3 Alex Christensen 2021-07-26 16:16:30 PDT
I'm quite interested in this.  What do you mean by "multiple fragments"?  Are you doing anything else interesting with your server?
Comment 4 ABDUL RAHIMAN MULLA 2021-07-27 00:19:25 PDT
(In reply to Alex Christensen from comment #3)
> I'm quite interested in this.  What do you mean by "multiple fragments"? 
> Are you doing anything else interesting with your server?

When sending a large message (> 70 KB), the websocket-sharp server sends it as multiple frames in the WebSocket protocol, and on the JS client the WebSocket API is expected to take care of those frames and deliver them as a single message. The issue is that when my server writes multiple frames to the socket stream, the host machine (JS client) abnormally closes the connection and reports this protocol error.
Comment 5 Alex Christensen 2021-07-27 09:53:40 PDT
Is there any way you could either provide the code of a server that hits this issue or provide an IP address of such a server running on the internet?  Feel free to send the IP address to my email privately if you don't want to post it publicly here.
Comment 6 Alex Christensen 2021-07-27 18:21:09 PDT
My initial investigation has found that large numbers of continuation frames work as expected.  I do this:
1. Send a text frame with fin=0 and length=1
2. Send many continuation frames with fin=0 and length=1
3. Send a continuation frame with fin=1 and length=1
Could you describe exactly what I'm doing differently than your server?  Sending a packet capture could also be enlightening.
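
For anyone who wants to replay this sequence against their own client, here is a minimal sketch of how such frames could be built on the server side. It assumes unmasked server-to-client frames as laid out in RFC 6455; `encodeFrame` and `fragmentedText` are hypothetical helper names for illustration, not part of any library or of the test server used above.

```javascript
// Sketch only: builds unmasked server-to-client frames (RFC 6455 framing).
// encodeFrame and fragmentedText are hypothetical names for illustration.
const OPCODE_CONTINUATION = 0x0;
const OPCODE_TEXT = 0x1;

function encodeFrame(opcode, fin, payload) {
  const length = payload.length;
  const firstByte = (fin ? 0x80 : 0x00) | opcode; // FIN bit plus opcode, RSV bits clear
  let header;
  if (length < 126) {
    header = Uint8Array.of(firstByte, length);
  } else if (length < 65536) {
    // 126 signals a 16-bit big-endian extended length
    header = Uint8Array.of(firstByte, 126, length >> 8, length & 0xff);
  } else {
    throw new RangeError("64-bit payload lengths are omitted from this sketch");
  }
  const frame = new Uint8Array(header.length + length);
  frame.set(header, 0);
  frame.set(payload, header.length);
  return frame;
}

// One byte per frame: a text frame first, then continuation frames,
// with fin=1 only on the last frame of the message.
function fragmentedText(message) {
  const bytes = new TextEncoder().encode(message);
  const frames = [];
  for (let i = 0; i < bytes.length; i++) {
    const opcode = i === 0 ? OPCODE_TEXT : OPCODE_CONTINUATION;
    const fin = i === bytes.length - 1;
    frames.push(encodeFrame(opcode, fin, bytes.subarray(i, i + 1)));
  }
  return frames;
}
```

Writing the frames returned by `fragmentedText("hello")` to the socket one at a time, after the HTTP upgrade handshake, should correspond to steps 1 to 3.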
Comment 7 Radar WebKit Bug Importer 2021-08-02 13:14:16 PDT
<rdar://problem/81425845>
Comment 8 isaac+webkit 2021-09-08 18:49:52 PDT
I've raised what could be a similar issue, affecting macOS Monterey here: https://bugs.webkit.org/show_bug.cgi?id=230076
Comment 9 Jay Charles 2021-09-10 13:59:51 PDT
I am encountering similar issues as well against golang gorilla websockets on iOS 15 beta 8. Websocket connections occasionally fail to complete, and frequently close unexpectedly.
Comment 10 Alex Christensen 2021-09-14 09:00:05 PDT
I would really like to look into this, but the descriptions do not contain enough information for me to reproduce.  If someone could send me a link to a server that is reproducing this issue, I'll look into what is going on.
Comment 11 youenn fablet 2021-09-14 09:24:33 PDT
rdar://81747517 is probably the bug tracking internal work.
Comment 12 Alex Christensen 2021-09-14 11:55:08 PDT
Or alternatively if someone provides me with actual C# or golang code that makes a server that reproduces the issue, I could also use that to look into what is going on.  I've made web socket servers that do what you describe and they work, but I definitely believe that something is going wrong.
Comment 13 Alex Christensen 2021-09-15 00:28:22 PDT
*** Bug 230076 has been marked as a duplicate of this bug. ***
Comment 14 Alex Christensen 2021-09-15 00:29:14 PDT
With some additional information provided by https://bugs.webkit.org/show_bug.cgi?id=230076 I found what is going on here, and prepared a fix for the underlying framework.
Comment 15 Alex Christensen 2021-09-15 00:29:58 PDT
I'm tracking my internal work with rdar://82917968
Comment 16 Viesturs 2021-09-28 06:28:50 PDT
This can be easily reproduced with a reference ws as pointed out in https://developer.apple.com/forums/thread/685403?login=true via https://libwebsockets.org/testserver/
Comment 17 Alex Christensen 2021-09-29 09:38:02 PDT
That is a different bug that is not a regression in iOS15.  I filed https://bugs.webkit.org/show_bug.cgi?id=230962
Comment 18 Mikael Nousiainen 2021-10-05 02:52:01 PDT
I've encountered this issue too with a WebSocket server using HTTP/2 + TLS 1.3. The connection succeeds at first and I'm able to send and receive some (short) WebSocket messages, but then the connection gets disconnected with the aforementioned "kNWErrorDomainPOSIX error 100 - Protocol error" error message. A longer message sent by the server seems to cause the disconnection.

Based on the discussion in this thread:
https://developer.apple.com/forums/thread/685403

I'm also suspecting that the reason might be what is described there:

"This error is caused by NSURLSession’s inability to process split messages normally. As long as the received WebSocket Message Frame is Fin=0, an error will occur."

There seems to be a workaround for this:

1. Navigate to: Settings > Safari > Advanced > Experimental Features
2. Set "NSURLSession WebSocket" to OFF and restart Safari (or the phone/tablet). This seems to fix the issue for now at least.
Comment 19 Mads Erik Forberg 2021-10-06 06:13:55 PDT
I can confirm that the disabling of "NSURLSession WebSocket" workaround works.
Comment 20 Matthew Windwer 2021-11-18 11:17:40 PST
We have a similar issue starting iOS 15 and also Safari 15.x on Mac.  The error we are getting is "[Error] WebSocket connection to 'wss://xxx' failed: WebSocket is closed before the connection is established." (xxx is our site).

Simply disconnecting from WiFi then reconnecting seems to fix it, but reloading the website does not. Our customers also report that restarting their devices fixes it. I do not know how to force reproduction of this bug but it is definitely happening to many of our customers since iOS 15 and the latest version of Safari. Is this still being investigated by the webkit team?
Comment 21 Alex Christensen 2021-11-18 13:13:48 PST
The "kNWErrorDomainPOSIX error 100" bug should've been fixed in iOS 15.1.  If it wasn't, please let me know, ideally with a way to reproduce the bug.

This is the first I've heard of the "WebSocket is closed before the connection is established" bug.  I'm happy to look into it if you have a way to reproduce the bug.  It sounds like something may be going on with TLS but I'd need more information to be sure.
Comment 22 Matthew Windwer 2021-11-22 16:15:53 PST
Created attachment 444989 [details]
Screencast of the bug

Alex, we are able to reproduce an issue on Safari 15.1 for Mac as well as the latest Safari Technology Preview Release 135 where "NSUrlSession WebSocket" (which is enabled by default) is breaking the ability to connect to our server at all using WebSockets after the computer sleeps in the following scenario:

1. Visit our website, which establishes an active WebSocket connection to our server.
2. Walk away from the laptop (letting it sleep) for some time.
3. Return to the laptop, and when it awakens, Safari loses the ability to connect to WebSockets completely on our website.  Even reloading the website no longer establishes a WebSocket connection.

In order to resolve this, we have found that the user needs to do one of two things:

1. Exit Safari and re-open Safari.
2. Disconnect from WiFi and then reconnect (no need to exit Safari).

We are pretty sure this bug is also in Safari for iOS 15.x (including 15.1) based on customer reports.

I have attached a screencast demonstrating how WebSockets becomes broken on our website, even after reload, with "NSUrlSession WebSocket" enabled when the above scenario occurs on Safari Technology Preview 135 using my M1 MacBook Air. The video shows the following:

1. Computer has woken from sleep after some time, and WebSocket connection (via the /cable endpoint) can no longer be established (client-side code continues to retry every 6 seconds).
2. After reloading the page, the connection still cannot be established.
3. After disabling "NSURLSession WebSocket", the connection is established immediately.
4. Reloading the page works fine with "NSURLSession WebSocket" disabled.
5. Re-enabling "NSURLSession WebSocket" kills the connection again, in perpetuity, until it is disabled.

Please ignore the Basic Auth prompts in the video as that is just for the JavaScript source maps since the web inspector is open.

I hope that this helps demonstrate the issue at hand. Please let me know if I can provide any further details to help in the investigation, or if a new bug report needs to be filed, and I will be happy to help in any way I can.
Comment 23 marc_aurel 2021-11-26 00:16:27 PST
This also happens when using macOS Safari 15.1 (15.0 works fine) when connecting to a C# WebSocket server from https://github.com/ngld/OverlayPlugin and large chunks of data are sent.

I’ve also attached an image with the specific error if that helps…
Comment 24 marc_aurel 2021-11-26 00:17:13 PST
Created attachment 445174 [details]
Screenshot of the error
Comment 25 Jorge Manrubia 2021-12-01 04:26:54 PST
I could reproduce the problem exactly as described by Matt Windwer in https://bugs.webkit.org/show_bug.cgi?id=228296#c22.

We have been experiencing this bug in our Basecamp iOS app since the upgrade to iOS 15: when opening a chat and trying to send a message, it would sometimes fail to deliver it. We got several reports from customers about this.

The cause is that the underlying web sockets fail to connect with the same error:

> [Error] WebSocket connection to 'wss://xxx' failed: WebSocket is closed before the connection is established.

1. It happens intermittently when restoring the app from the background.
2. The only way to fix is restarting the app. The app embeds a WKWebView instance, and reloading the page there won't fix the problem.

   I tried to debug the problem with Safari dev tools, with my unit plugged, and closing the web sockets and creating new ones from the console wouldn't fix the problem either.

3. Disconnecting the wifi, trying to send a message, and connecting it again fixed the issue.

Today, I was able to also reproduce the problem with Safari Technology Preview release 135 (macOS Monterey):

1. Open a campfire in Basecamp (you can create a free account in https://basecamp.com)
2. Put the computer to sleep.
3. Wait for like 15 minutes (you need to give it some time to happen, it won't fail if you wake up the computer right away).
4. Come back and try to send a message. You will see exactly the same problem described by Matt and shown in his screencast. I'm attaching a screenshot.
Comment 26 Jorge Manrubia 2021-12-01 04:28:09 PST
Created attachment 445552 [details]
Screenshot showing the web socket error when using chats in Basecamp
Comment 27 Matthew Windwer 2021-12-01 08:53:24 PST
Hey Jorge, as a work-around, I found that closing the WebSocket connection manually when the page is backgrounded prevents this issue from occurring (instead of waiting for the browser/OS to close the connection some time after sleep, which seems to trigger the bad state that results in the inability to reconnect to WebSockets even after page reload).

In our case, using ActionCable:

document.addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden') {
    cable.disconnect()
  } else {
    cable.connect()
  }
})

We are no longer getting customer reports after the above code went live. Still, this is a regression with "NSURLSession WebSocket" as previously mentioned.
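
For pages that use a plain WebSocket rather than ActionCable, the same idea might look roughly like the sketch below. `installVisibilityReconnect` is a hypothetical helper name, and the `doc` and `createSocket` parameters are assumptions added so the logic can be exercised outside a browser. It has not been verified against the underlying bug.

```javascript
// Sketch only: close the socket while the page is hidden and open a fresh one
// when the page becomes visible again. installVisibilityReconnect is a
// hypothetical helper; doc is expected to behave like the browser's document.
function installVisibilityReconnect(doc, createSocket) {
  let socket = createSocket();

  function onVisibilityChange() {
    if (doc.visibilityState === "hidden") {
      if (socket) {
        socket.close(); // close explicitly instead of letting the OS drop it after sleep
        socket = null;
      }
    } else if (!socket) {
      socket = createSocket();
    }
  }

  doc.addEventListener("visibilitychange", onVisibilityChange);

  return {
    current: () => socket,
    uninstall: () => doc.removeEventListener("visibilitychange", onVisibilityChange),
  };
}
```

In a page this could be called as `installVisibilityReconnect(document, () => new WebSocket(url))`, where `url` is your own endpoint.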
Comment 28 youenn fablet 2021-12-01 09:01:57 PST
@mattwinder, if you can reproduce, could you send me (youenn@apple.com) a sysdiagnose when you reproduce the issue (including the time the issue reproduced).
Comment 29 Sathiamoorthy 2021-12-02 01:53:00 PST
Facing the same issue in iOS 15.1, but not able to reproduce in iOS 14.1 (may be lesser than 15 version).

Any solution for this ?
Comment 30 Sathiamoorthy 2021-12-02 01:55:44 PST
(In reply to Sathiamoorthy from comment #29)
> Facing the same issue in iOS 15.1, but not able to reproduce in iOS 14.1
> (may be lesser than 15 version).
> 
> Any solution for this ?

Same issue happened in iOS 15.1 & Chrome (96.0.4664.53)
Comment 31 Jorge Manrubia 2021-12-02 03:30:21 PST
Matt, thank you so much for the workaround! We are giving a try to that patch. I'll follow up when we validate whether it works or not.
Comment 32 Jorge Manrubia 2021-12-03 04:28:37 PST
Matt, we are testing the workaround internally but we are still hitting the error. Could you confirm if it has fixed the problems for you for good?
Comment 33 Matthew Windwer 2021-12-03 10:52:52 PST
@Jorge

After deploying the workaround, our support requests regarding the issue stopped within a day. Previously we were getting a dozen or so requests daily of the nature "you need to force close and re-open Safari and/or the app" (our app is an embedded WKWebView using capacitor/cordova).

Since the workaround, I also haven't been able to reproduce the issue after many attempts on multiple devices (Mac & iPad) over the course of several days, except I think it did happen on one occasion, so I wouldn't say that it is foolproof.

I sent youenn fablet a sysdiagnose by reproducing the issue (without the workaround) along with the time of reproduction. Perhaps you can do the same to help them diagnose the issue.
Comment 34 Alberto Fernández-Capel 2021-12-16 09:34:02 PST
I've been able to consistently reproduce a NSURLSession crash on Safari with this POC code.

https://gist.github.com/afcapel/4e1012bd658f6818e70115178edd3489

The error only happens if these conditions are met:

* The NSURLSession is enabled

* The permessage-deflate websocket extension is enabled

* The message is big enough that the server splits it into different frames (this depends on the server implementation, so it's difficult to pinpoint exactly when it will happen)

The culprit seems to be that Safari is compressing the frame, instead of the whole message.
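
To make the message-versus-frame distinction concrete, here is a small Node.js sketch of how permessage-deflate (RFC 7692) is meant to treat a fragmented message: the whole message is compressed once, the compressed bytes are split across frames, and the receiver reassembles all fragments before inflating. `deflateMessage` and `inflateMessage` are hypothetical names, the sketch works on raw payload bytes without WebSocket framing, and it does not reproduce the framework bug itself.

```javascript
const zlib = require("zlib");

// RFC 7692: the sender strips these four trailing octets and the receiver restores them.
const DEFLATE_TAIL = Buffer.from([0x00, 0x00, 0xff, 0xff]);

// Sketch only: compress the WHOLE message once, then split the compressed
// bytes into fragments. On the wire, only the first fragment would carry
// the RSV1 "compressed" bit.
function deflateMessage(message, fragmentSize) {
  const compressed = zlib.deflateRawSync(Buffer.from(message, "utf8"), {
    finishFlush: zlib.constants.Z_SYNC_FLUSH,
  });
  const body = compressed.subarray(0, compressed.length - DEFLATE_TAIL.length);
  const fragments = [];
  for (let offset = 0; offset < body.length; offset += fragmentSize) {
    fragments.push(body.subarray(offset, offset + fragmentSize));
  }
  return fragments;
}

// Reassemble every fragment first, then inflate the message as a single stream.
function inflateMessage(fragments) {
  const body = Buffer.concat([...fragments, DEFLATE_TAIL]);
  return zlib
    .inflateRawSync(body, { finishFlush: zlib.constants.Z_SYNC_FLUSH })
    .toString("utf8");
}
```

A receiver that treats each fragment as an independently compressed unit cannot recover the message, because the fragments are arbitrary slices of one DEFLATE stream.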
Comment 35 youenn fablet 2021-12-16 09:35:48 PST
@Alberto, which OS version are you testing? Can you share the crash log (by email, for instance, at youenn@apple.com)?
Comment 36 Alberto Fernández-Capel 2021-12-16 09:39:59 PST
@youenn testing on Safari desktop, 15.1, MacOS Monterey.
Comment 37 Alex Christensen 2021-12-16 19:59:07 PST
Alberto, thank you for the excellent go program to reliably reproduce an issue with multiple fragments and the combination of extensions "permessage-deflate; server_no_context_takeover; client_no_context_takeover".  I found what I believe was a misplaced call to zlib's inflateReset in a non-WebKit framework and am working to fix it internally, tracked by rdar://85078119

I haven't looked into the issue having to do with WebSockets and computers that have gone to sleep yet, though.
Comment 38 Jorge Manrubia 2021-12-20 02:42:05 PST
We found a reliable way to reproduce the problem with web sockets getting in a broken state that only gets fixed by restarting the browser/app.

It has to do with closing Websockets that are in a "connecting" state.

You can try the following snippet in the Javascript console of any app with a WebSocket endpoint. Just update the URL. In this case, you can test in this demo page:

```
let socket = null
let iterations = 0
let intervalHandle = setInterval(function () {
  if (socket) {
    socket.close()
  }
  socket = new WebSocket("wss://libwebsockets.org/")
  iterations += 1

  if (iterations > 50) {
    clearInterval(intervalHandle)
  }
}, 25)
```

It tries several iterations because it's not totally deterministic. But with 50 it always reproduces the issue for me.

After running the script in the console, you will see some errors about "WebSocket is closed before the connection is established", which are expected. Then reload the page, and you can observe how the Websockets connections fail to establish. You need to restart Safari to fix.

Reproduced with Safari 15.1 on macOS Monterey, and also with Safari Technology Preview 135.
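
To check from script whether the page has ended up in the broken state, rather than watching the console, something like the following probe might help. It is a rough sketch: `probeWebSocket` is a hypothetical helper, the timeout value is up to you, and the constructor is injectable only so the logic can be exercised without a browser.

```javascript
// Sketch only: reports whether a fresh WebSocket reaches the open state in time.
// Resolves with "open", "error", "closed" or "timeout".
function probeWebSocket(url, timeoutMs, WebSocketCtor = globalThis.WebSocket) {
  return new Promise((resolve) => {
    const socket = new WebSocketCtor(url);
    let settled = false;

    const finish = (outcome) => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      resolve(outcome);
    };

    // On timeout the socket is deliberately left alone: closing a socket that
    // is still connecting is what the snippet above uses to trigger the problem.
    const timer = setTimeout(() => finish("timeout"), timeoutMs);

    socket.onopen = () => {
      finish("open");
      socket.close(); // the connection is established, so closing it is safe
    };
    socket.onerror = () => finish("error");
    socket.onclose = () => finish("closed");
  });
}
```

After running the reproduction snippet and reloading, `probeWebSocket("wss://libwebsockets.org/", 5000).then(console.log)` would be expected to print something other than "open" while the problem is present.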
Comment 39 Jorge Manrubia 2021-12-20 02:49:00 PST
Sorry, I forgot to include the link for the demo page to reproduce with the snippet above:

https://libwebsockets.org/testserver/

If you try a different one, just update the wss URL.
Comment 40 Chris Dumez 2022-01-10 10:43:45 PST
(In reply to Jorge Manrubia from comment #39)
> Sorry, I forgot to include the link for the demo page to reproduce with the
> snippet above:
> 
> https://libwebsockets.org/testserver/
> 
> If you try a different one, just update the wss URL.

Thank you for the reproduction steps. It looks like I can reproduce in Safari but not Chrome. I am investigating.
Comment 41 Chris Dumez 2022-01-10 10:47:53 PST
(In reply to Chris Dumez from comment #40)
> (In reply to Jorge Manrubia from comment #39)
> > Sorry, I forgot to include the link for the demo page to reproduce with the
> > snippet above:
> > 
> > https://libwebsockets.org/testserver/
> > 
> > If you try a different one, just update the wss URL.
> 
> Thank you for the reproduction steps. It looks like I can reproduce in
> Safari but not Chrome. I am investigating.

Disabling "NSURLSession WebSocket" experimental feature seems to address the issue.
Comment 42 Jorge Manrubia 2022-01-10 12:24:19 PST
> Disabling "NSURLSession WebSocket" experimental feature seems to address the issue.

Yes, I forgot to mention this. Before finding a workaround, we added a banner recommending this in our app for Safari users when the error happened.
Comment 43 Chris Dumez 2022-01-10 13:01:06 PST
Proper reproduction steps (since they were not clear to me initially):
1. Open https://libwebsockets.org/testserver/ and notice the counter is incrementing
2. Open Web Inspector and paste this code:
let socket = null
let iterations = 0
let intervalHandle = setInterval(function () {
  if (socket) {
    socket.close()
  }
  socket = new WebSocket("wss://libwebsockets.org/")
  iterations += 1

  if (iterations > 50) {
    clearInterval(intervalHandle)
  }
}, 25)
3. Open https://libwebsockets.org/testserver/ in a new tab and notice that the counter is not incrementing in this tab (because WebSocket fails to connect)

Note that after 1 minute or 2, the second tab does seem to recover and manages to connect. It looks to me like new connection requests are getting queued and something eventually times out, allowing the new requests to proceed.
Comment 44 Alex Hultman 2022-01-11 16:48:59 PST
I've tried to reach Apple since pretty much day 1 but Apple is unreachable and their Feedback Assistant goes straight to the shredder.

I've clients that report total meltdown for all clients on ALL versions of Safari since iOS 15 or macOS 12.

This is the analysis I have come up with: https://github.com/uNetworking/uWebSockets/issues/1347

I think it is a negotiation issue. Could someone please check if the issue I describe is also getting fixed as part of this?

Thank you...?
Comment 45 Brent Fulgham 2022-02-10 10:56:27 PST
The fix for this issue was needed outside the WebKit project, therefore this is being resolved as 'Moved'.

This issue should be fixed in an upcoming iOS 15.4 Beta and macOS 12.3 Beta.
Comment 46 Eric O'Connell 2022-06-27 11:25:32 PDT
Based on comment #45 from Brent Fulgham, it seems like this should be fixed as of iOS 15.4 release. However we are experiencing a similar failure still on 15.5. Is it possible the non-WebKit fix got delayed? Any insight to this would be greatly appreciated, as all of our iOS 15 clients are experiencing this issue right now.
Comment 47 Brent Fulgham 2022-06-27 11:29:43 PDT
(In reply to Eric O'Connell from comment #46)
> Based on comment #45 from Brent Fulgham, it seems like this should be fixed
> as of iOS 15.4 release. However we are experiencing a similar failure still
> on 15.5. Is it possible the non-WebKit fix got delayed? Any insight to this
> would be greatly appreciated, as all of our iOS 15 clients are experiencing
> this issue right now.

Eric: Would you mind filing a new bug describing the issue you are seeing. I double-checked, and the CFNetwork fix indeed shipped with iOS 15.4.

Therefore, I'd like us to start with a fresh log and understanding of the issue you are seeing.
Comment 48 Eric O'Connell 2022-06-27 16:54:50 PDT
(In reply to Brent Fulgham from comment #47)
> (In reply to Eric O'Connell from comment #46)
> > Based on comment #45 from Brent Fulgham, it seems like this should be fixed
> > as of iOS 15.4 release. However we are experiencing a similar failure still
> > on 15.5. Is it possible the non-WebKit fix got delayed? Any insight to this
> > would be greatly appreciated, as all of our iOS 15 clients are experiencing
> > this issue right now.
> 
> Eric: Would you mind filing a new bug describing the issue you are seeing. I
> double-checked, and the CFNetwork fix indeed shipped with iOS 15.4.
> 
> Therefore, I'd like us to start with a fresh log and understanding of the
> issue you are seeing.

Thank you Brent! We've just discovered the issue and need to mitigate immediately by implementing an HTTP fallback, but as we investigate the WebSocket issues more, I will definitely open a new bug if it looks like WebKit is implicated.
Comment 49 Alec Gibson 2022-10-31 05:06:02 PDT
(In reply to Eric O'Connell from comment #48)
> (In reply to Brent Fulgham from comment #47)
> > (In reply to Eric O'Connell from comment #46)
> > > Based on comment #45 from Brent Fulgham, it seems like this should be fixed
> > > as of iOS 15.4 release. However we are experiencing a similar failure still
> > > on 15.5. Is it possible the non-WebKit fix got delayed? Any insight to this
> > > would be greatly appreciated, as all of our iOS 15 clients are experiencing
> > > this issue right now.
> > 
> > Eric: Would you mind filing a new bug describing the issue you are seeing. I
> > double-checked, and the CFNetwork fix indeed shipped with iOS 15.4.
> > 
> > Therefore, I'd like us to start with a fresh log and understanding of the
> > issue you are seeing.
> 
> Thank you Brent! We've just discovered the issue and need to mitigate
> immediately by implementing an HTTP fallback, but as we investigate the
> WebSocket issues more, I will definitely open a new bug if it looks like
> WebKit is implicated.

Sorry about the noise on a resolved issue, but we're still running into something similar on iOS 16. 

Jorge's original repro steps don't seem to work for us, and we can't seem to reproduce this (I've run into it myself this weekend after months of being on the lookout).

Eric, did you ever manage to reproduce?