Summary: | REGRESSION(r204512): WebSocket errors with "Failed to send WebSocket frame." if too much data is sent | ||||||
---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Nathan Friedly <nathan.friedly+webkit> | ||||
Component: | WebCore JavaScript | Assignee: | Alex Christensen <achristensen> | ||||
Status: | RESOLVED FIXED | ||||||
Severity: | Normal | CC: | 569569569, achristensen, adamduren, ap, berkus, carlos, cgarcia, deepak, gneius, jmorris, khomenkoigor, kurteknikk, lisongli199019, manian, planauts, quint, rmay31, spalmer, thomas.schreiber, vandenbj, webkit-bug-importer, youennf | ||||
Priority: | P2 | Keywords: | InRadar | ||||
Version: | Safari 10 | ||||||
Hardware: | Mac | ||||||
OS: | macOS 10.12 | ||||||
URL: | https://watson-speech.mybluemix.net/file-url.html | ||||||
Attachments: |
|
Description
Nathan Friedly
2017-04-04 12:14:45 PDT
Op, the JSFiddle link should be https://jsfiddle.net/yta2mjuf/4/ Safari Technology Preview behaves the same as 10.1. WebKit Nightly (r214532) behaves a little differently - it logs the "Failed to send WebSocket frame" error three times and doesn't close the connection until the end. We have encountered this as well. Without a workaround our product which requires moving large amounts of data over WebSockets is totally broken in Safari. CFWriteStreamCanAcceptBytes seems to be returning false. We should just put the bytes in the buffer in this case instead of reporting a failure. Carlos, does this reproduce on linux, too? (In reply to Alex Christensen from comment #6) > Carlos, does this reproduce on linux, too? I'm not sure how to test it, but loading https://jsfiddle.net/yta2mjuf/4/ and sending console messages to stdout, this is what I get: https://fiddle.jshell.net/_display/:60:14: CONSOLE LOG sending first binary message https://fiddle.jshell.net/_display/:62:14: CONSOLE LOG bufferedAmount is 0 https://fiddle.jshell.net/_display/:65:14: CONSOLE LOG sending second binary message https://fiddle.jshell.net/_display/:67:14: CONSOLE LOG bufferedAmount is 0 https://fiddle.jshell.net/_display/:69:14: CONSOLE LOG sending third binary message https://fiddle.jshell.net/_display/:71:14: CONSOLE LOG bufferedAmount is 0 https://fiddle.jshell.net/_display/:56:14: CONSOLE LOG WebSocket closed with code: 1005, reason: And this is what I get with WebKitGTK+ 2.16 branch (before web sockets were moved to the network process) https://fiddle.jshell.net/_display/:60:14: CONSOLE LOG sending first binary message https://fiddle.jshell.net/_display/:62:14: CONSOLE LOG bufferedAmount is 6709 https://fiddle.jshell.net/_display/:65:14: CONSOLE LOG sending second binary message https://fiddle.jshell.net/_display/:67:14: CONSOLE LOG bufferedAmount is 29802 https://fiddle.jshell.net/_display/:69:14: CONSOLE LOG sending third binary message https://fiddle.jshell.net/_display/:71:14: CONSOLE LOG bufferedAmount is 52895 https://fiddle.jshell.net/_display/:56:14: CONSOLE LOG WebSocket closed with code: 1005, reason: We too suffer from this blocking issue. We transfer large amounts of data in our product using a WebSocket. We have emailed all of our customers and partners to postpone the update of Safari and in the meanwhile we are working on a hack/patch to limit the amount of data we transfer. This fix however is seriously affecting the user experience. Hope to receive additional feedback about this issue. This regressed with http://trac.webkit.org/r204512 @jona(In reply to Jonas from comment #9) Our workaround in the meantime was to delay between each call to send. Our service was already chunking data to 8000 bytes and a delay of 125ms seemed to mostly smooth over this issue in the meantime but with a hit to overall performance. It will most likely depend on individual message sizes, the bandwidth of your users, and other network factors. Created attachment 306397 [details]
Patch
The attached patch fixes your issue and restores our compatibility with other browsers. Carlos, could you see if it fixes the libsoup implementation? I'm having trouble writing a regression test that fails before this change and succeeds after using a local python server. What exactly is echo.websocket.org doing? Is the source available somewhere? Is it taking a long time to do something? Maybe it's just that it's a remote server, so sending and receiving takes longer than it ever would when testing on a local machine. Comment on attachment 306397 [details]
Patch
Um, shouldn't we fix the callers to expect std::nullopt instead? If I saw this when working on nearby code, I'd be tempted to change it back to return nullopt and call it good if it doesn't break any tests.
No, if CFWriteStreamWrite returns -1, then we do need an error because something has gone wrong with the socket. If the stream cannot accept bytes yet, then we report that we have written 0 bytes and there was no error. The bytes go into the buffer and they will be written later. (In reply to Alex Christensen from comment #13) > The attached patch fixes your issue and restores our compatibility with > other browsers. > Carlos, could you see if it fixes the libsoup implementation? > I'm having trouble writing a regression test that fails before this change > and succeeds after using a local python server. What exactly is > echo.websocket.org doing? Is the source available somewhere? Is it taking > a long time to do something? Maybe it's just that it's a remote server, so > sending and receiving takes longer than it ever would when testing on a > local machine. I think you're correct that the difference is just that it's a remote vs local server. You could try increasing the amount of data the test sends, but I'm not sure how much it would take to get a reliable test. Alternatively, is it possible to throttle the connection to the local server, or otherwise make it slow? echo.websocket.org is just supposed to respond with whatever data you sent it. My actual server is stream.watsonplatform.net - but that requires credentials. I don't think this should be committed without a regression test to make sure we never regress this behavior again. Using https://trac.webkit.org/browser/webkit/trunk/LayoutTests/http/tests/websocket/tests/hybi/send_wsh.py doesn't reproduce the issue. I committed this to https://trac.webkit.org/changeset/215102/webkit after manually verifying manually that Nathan's test case was fixed, meaning the error was no longer there and data was actually sent and received. Nathan, Adam, and Jonas, could you please do some testing with the upcoming Safari Technology Preview? We have released them every 2 weeks or so at https://developer.apple.com/safari/technology-preview/ Will do. When is this expected to drop into the main Safari builds? Yes, I'll check on the next Safari Technology Preview. Could you post an update here (or send me an email if you prefer) once it's available? (In reply to Carlos Garcia Campos from comment #7) > (In reply to Alex Christensen from comment #6) > > Carlos, does this reproduce on linux, too? > > I'm not sure how to test it, but loading https://jsfiddle.net/yta2mjuf/4/ > and sending console messages to stdout, this is what I get: > > https://fiddle.jshell.net/_display/:60:14: CONSOLE LOG sending first binary > message > https://fiddle.jshell.net/_display/:62:14: CONSOLE LOG bufferedAmount is 0 > https://fiddle.jshell.net/_display/:65:14: CONSOLE LOG sending second binary > message > https://fiddle.jshell.net/_display/:67:14: CONSOLE LOG bufferedAmount is 0 > https://fiddle.jshell.net/_display/:69:14: CONSOLE LOG sending third binary > message > https://fiddle.jshell.net/_display/:71:14: CONSOLE LOG bufferedAmount is 0 > https://fiddle.jshell.net/_display/:56:14: CONSOLE LOG WebSocket closed with > code: 1005, reason: How can I check this is fixed in soup? I'm still getting this output even after the patch. I used Nathan's code to verify that onerror was no longer being called. I modified it by adding an onmessage that counts the number of bytes received to verify that the bytes were actually being sent and received. We really need to figure out how to reproduce this issue locally with python. Our customers are currently experiencing the same issue as described by this defect and we have to put in a fixed delay between sending data frames in order to get around this problem and it doesn't always work. Since this has been marked as fixed. Is there an ETA on when the new version of Safari and iOS will be released with this fix so we can communicate with our customers? thx Folks mentioning impact, can you give us names of your products and the number of folks who are impacted? (In reply to Divya Manian from comment #24) > Folks mentioning impact, can you give us names of your products and the > number of folks who are impacted? Our product is called Klipfolio (www.klipfolio.com). We have 22 of our customers reporting this issue, before we put in a 50ms delay between sending messages. After the 50ms delay there are still 3 reported cases that are still experiencing this issue. (In reply to Divya Manian from comment #24) > Folks mentioning impact, can you give us names of your products and the > number of folks who are impacted? In my case it's the IBM Watson Speech to Text API. I can't give exact numbers, but we handled around 100~150k WebSocket connections from Safari users last month. (Presumably it would be quite a bit more if Safari supported microphone input via getUserMedia rather than just files, so maybe we got lucky here ;) Tested our application with r215468 and it appears to be fixed in this build. Hooray! Thanks! I still think we need to add an automated regression test using python somehow to prevent us from accidentally breaking this again. We will do refactoring or reimplementing in the future Our product is called Awingu and we emailed all of our customers warning them not to update to the latest Safari. Because of this, we only had to patch one customer environment with a fix that limits the package size to around 5000 bytes. How and where can we test the "r215468" build? Is it better to wait for the next technology preview? I'm guessing it's going to be released one of these days? Thanks for the update! > How and where can we test the "r215468" build? Is it better to wait for the next technology preview? Either the next Safari Technology preview or the current WebKit nightly build (https://webkit.org/downloads/) would work for this purpose. I've verified the fix on the Safari Technology Preview r28 of yesterday and I can confirm the issue is not present anymore. Now it's only a matter of knowing when this version is releasable to the public. Just an additional note, in addition to Safari, a portion of our mobile user who's updated to latest iOS is also affected by this defect. Our native mobile app also uses webkit behind it. It will be great to know the release schedule for iOS with this fix as well. I have the same issue with 10.1 but it works fine with 10.0.3 Here is my log from browser console: http://image.quickblox.com/459f0469f8c5af77d71775066988.injoit.png Our product https://qm.quickblox.com is completely broken in Safari 10.1. So waiting for you guys to release a new update.. Igor, could you also verify that the latest Safari Technology Preview has fixed your problem? It works fine with dev preview version. When are you going to push 10.1.1? Hi guys, When are you going to push fixed 10.1 safari latest version? Hi folks, I've just upgraded Safari to 10.1.1 and this issue can still happen... It looks like 10.1.1 only contains Security updates. https://support.apple.com/en-us/HT207804 Yes, still I am facing this issue... There is a Beta program https://beta.apple.com that would be good to sign up for to get early access on what goes into the next release of Safari ahead of time. (In reply to Divya Manian from comment #40) > There is a Beta program https://beta.apple.com that would be good to sign up > for to get early access on what goes into the next release of Safari ahead > of time. Hi, We have more and more customer experiencing this issue. So far the work around is to add some delay when sending data through websocket. The work around is not really acceptable by our customers as it adds a lot of overhead and reduce performance by a large margin. Is it possible to get an ETA on when the fix will make it into the release version of Safari and iOS so we can communicate that to our customers? TIA! qing Hi, you're unlikely to get a response from Apple developers here, as they're not generally allowed to comment about future product releases. This issue has been fixed in WebKit, and that's as much as you're going to be able to get here. Is this confirmed to be part of Safari 11? This is fixed in the latest updates to shipping Safari, in macOS 10.12.6 and iOS 10.3.3. This issue is still present in Safari 13.0.2 (14608.2.40.1.3) Log from the sample above in this version of Safari: [Log] sending first binary message (wstest.js, line 16) [Log] bufferedAmount is 0 (wstest.js, line 18) [Log] sending second binary message (wstest.js, line 21) [Log] bufferedAmount is 0 (wstest.js, line 23) [Log] sending third binary message (wstest.js, line 25) [Log] bufferedAmount is 0 (wstest.js, line 27) [Log] WebSocket closed with code: 1005, reason: (wstest.js, line 12) Log from the same code on Firefox: sending first binary message wstest.js:16:11 bufferedAmount is 23085 wstest.js:18:11 sending second binary message wstest.js:21:11 bufferedAmount is 46170 wstest.js:23:11 sending third binary message wstest.js:25:11 bufferedAmount is 69255 wstest.js:27:11 WebSocket closed with code: 1005, reason: That's macOS 10.14.6 Not sure if this is related or not, but we get on Safari 13.0.2 the following: WebSocket connection to 'wss://xxx/' failed: Failed to compress frame This seems related to larger size frames. I'm seeing "bufferedAmount is 0" in the console logs from jsfiddle. Last time it was an indicator that the WebSocket data was not being sent, which had huge consequences. This time something different is going on, but I've verified the WebSocket data is being sent. What is the actual consequence of the bufferedAmount being 0 in a real application? The "Failed to compress frame" is probably something unrelated. I'm also seeing issues when I send a large payload. It works on Safari 12 but not Safari 13. Is this being investigated again? The status on this bug states "RESOLVED FIXED", can we please update to "REOPENED" or "OPEN"? Please file a new bug. This one from 2017 has been fixed for a long time, and it is not appropriate to track new issues here, even if they have similar symptoms. What about comments #45 and #48 - are those not relevant as well? New bug for frame compress issue is created: https://bugs.webkit.org/show_bug.cgi?id=204237 |