Summary: | REGRESSION (iOS 16.4): Safari occasionally locks up and stops completing XHR requests | | |
---|---|---|---|
Product: | WebKit | Reporter: | Nick M <webkit-bugzilla> |
Component: | WebKit2 | Assignee: | Nobody <webkit-unassigned> |
Status: | NEW --- | | |
Severity: | Normal | CC: | achristensen, ap, bfulgham, fligosmate, kkinnunen, webkit-bug-importer, wilander, youennf |
Priority: | P2 | Keywords: | InRadar |
Version: | Safari 16 | | |
Hardware: | iPhone / iPad | | |
OS: | Unspecified | | |
Description
Nick M
2023-06-12 09:44:14 PDT
Thank you for the report! Would it be possible for you to provide steps to reproduce that we can follow, even if not 100%? If that's not possible, could you please file a report with a sysdiagnose (taken in state) at https://feedbackassistant.apple.com?

Hey Alexey, thanks for the response. Unfortunately I cannot share the site (which rules out reproduction steps). I spent time today trying to recreate the issue myself so I could file a report with a sysdiagnose, but without success. We realized today that one version of the app seems to hit the issue more often than the other two (this is one app that runs as 3 different brands). We're investigating this in the hope of narrowing down the specific root cause that triggers the bug.

I realize this is not really enough information for y'all to investigate, and I apologize. I plan to continue investigating this and trying to recreate it personally. I will be sure to update this thread as we discover more information.

Hey there, I think we may have a lead on what could be causing the issue. While researching the issue we found this article[1] talking about changes to how cookies are handled in iOS 16.4, specifically surrounding third party cookies and cookies using ‘CNAME cloaking.’ That article links to this WebKit PR[2], which discusses the change in detail.

Our application uses Okta as an Identity Provider and uses the function ‘getWithoutPrompt’ in their SDK. According to their docs[3], this function is known to cause issues when it comes to third party cookie tracking prevention. Additionally, we use a CNAME record on our domain to direct traffic to Okta. From what we’ve seen, the cookie is not being dropped from requests, so we don’t believe the Okta method is broken or just being caught by tracking prevention. We are wondering if a bug was introduced with the iOS 16.4 change that is being triggered by our usage of Okta.
We also found a post on the Apple community forums[4] that feels like the same behavior we’re seeing in our application.

[1] https://www.imore.com/security/apples-secret-safari-cookie-crackdown-could-have-unintended-consequences-for-your-logins
[2] https://github.com/WebKit/WebKit/pull/5347
[3] https://github.com/okta/okta-auth-js#third-party-cookies
[4] https://discussions.apple.com/thread/254879217

Ope, realized I missed an important detail. This behavior seems to regularly occur right after our call to Okta. This further leads us to believe that the call is in some way triggering the bug.

We have users experiencing very similar issues:

> At some point in the session, a API request would appear to hang, eventually timing out (the application times out the request, this timeout is not from iOS or Safari itself). From that point on, all XHR requests fail to resolve and are eventually timed out as well. After this happens, the tab seems to be "fouled" and all requests fail.

Same behavior observed by us, with the difference that the floodgates are released at an arbitrary point in the future, sometimes even within the 60 seconds, so that no timeouts occur.

> We are confident network conditions are fine in a majority of these cases are these are not "true" timeouts.

Same here, fetch requests are both starting and completing just fine during this "freeze".

> The first call to hang/timeout is not always the same call, or even in the same user flow in the application. We also haven't seen this behavior on any other browser. Given this, we're confident the issue isn't with the application itself and is most likely a Safari bug.

Same observed by us. Furthermore, we have client logs from the network tab, along with server logs, which together tell the same story Nick shared of XHR requests locking up. Summarized, they say:

1. Several XHR requests* are started, and stay in a pending state for a very long time.
2. Some fetch requests are performed in parallel, and they start and complete within reasonable time frames.
3. If 60 seconds pass, the XHR requests time out**. If not, a successful response is returned.
4. In case of a successful response, server logs indicate that no XHR request reached the server until very shortly before the response was sent.
5. Meanwhile, XHR requests from other clients (Windows Edge in this example) are handled just fine during the same time frame that the requests are held up, indicating no server issues.

*XHR requests towards one domain only; requests targeting another domain work fine in the same time frame. Perhaps this lends some credence to Nick's cookie theories?

**In one case, the OPTIONS request was recorded on the server, but then the request timed out without performing the follow-up request, indicating the floodgates were released just before 60 seconds passed. Most of the time, there is no such trace, indicating that the timeout occurred before anything reached the server.

We have the option to migrate from XHR to fetch, so we are probably going to set up slow request tracking, then migrate to fetch, and then see if there is any measurable difference.
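
For anyone wanting to do something similar, here is a rough sketch of what slow request tracking that separates XHR from fetch could look like. It is only an illustration under stated assumptions, not what anyone in this thread has actually deployed: `RequestTracker`, the 5 second threshold, and the `api.example.com` URLs are all made up, and the clock is injectable purely so the sketch can be exercised without a browser. Wiring `start`/`finish` into real XHR and fetch call sites is left out.

```typescript
// Hypothetical sketch: every name here (Transport, RequestTracker, ...) is illustrative.
type Transport = "xhr" | "fetch";

interface PendingRequest {
  transport: Transport;
  url: string;
  startedAt: number;
}

interface SlowRequest {
  transport: Transport;
  host: string;
  durationMs: number;
  timedOut: boolean;
}

class RequestTracker {
  private pending = new Map<number, PendingRequest>();
  private nextId = 1;
  private slowThresholdMs: number;
  private now: () => number;
  readonly slow: SlowRequest[] = [];

  // `now` is injectable so tests can use a fake clock; defaults to wall-clock time.
  constructor(slowThresholdMs: number, now: () => number = () => Date.now()) {
    this.slowThresholdMs = slowThresholdMs;
    this.now = now;
  }

  // Call when a request is sent. Assumes `url` is absolute.
  start(transport: Transport, url: string): number {
    const id = this.nextId++;
    this.pending.set(id, { transport, url, startedAt: this.now() });
    return id;
  }

  // Call when a request settles; pass timedOut = true if the app-level timeout fired.
  finish(id: number, timedOut = false): void {
    const req = this.pending.get(id);
    if (!req) return;
    this.pending.delete(id);
    const durationMs = this.now() - req.startedAt;
    if (timedOut || durationMs >= this.slowThresholdMs) {
      this.slow.push({
        transport: req.transport,
        host: new URL(req.url).host,
        durationMs,
        timedOut,
      });
    }
  }

  // Count slow requests per "transport host" pair, since the hangs reported
  // here affected XHR towards one domain only.
  summary(): Record<string, number> {
    const counts: Record<string, number> = {};
    for (const s of this.slow) {
      const key = `${s.transport} ${s.host}`;
      counts[key] = (counts[key] ?? 0) + 1;
    }
    return counts;
  }
}

// Usage with a fake clock: the fetch completes quickly, the XHR hits the 60 s timeout.
let clock = 0;
const tracker = new RequestTracker(5000, () => clock);
const xhrId = tracker.start("xhr", "https://api.example.com/orders");
const fetchId = tracker.start("fetch", "https://api.example.com/profile");
clock = 300;
tracker.finish(fetchId);
clock = 60000;
tracker.finish(xhrId, true);
console.log(tracker.summary());
```

Keying the summary by transport and host means that, if the XHR-only, single-domain pattern described above holds, it should show up directly in the counts before and after a migration to fetch.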