Bug 293286
| Summary: | REGRESSION(?): Timeouts observed with TLS upgrade | | |
|---|---|---|---|
| Product: | WebKit | Reporter: | gabor.bodo |
| Component: | Page Loading | Assignee: | rupin |
| Status: | RESOLVED FIXED | | |
| Severity: | Normal | CC: | achristensen, ahmad.saleem792, ap, beidson, karlcow, m_finkel, webkit-bug-importer |
| Priority: | P2 | Keywords: | InRadar |
| Version: | Safari 18 | | |
| Hardware: | All | | |
| OS: | macOS 15 | | |
gabor.bodo
Our internal testing recently started to report sporadic timeouts on a specific URL which might take 5-6 seconds to serve a request if its cache is stale. Steps to reproduce:
1. Clear browsing history and cache OR open a new private browsing session
2. Navigate to http://blackrock.com/prospectus (note the naked domain and the http protocol. We couldn't replicate the issue with the https protocol, but due to its intermittent nature, we're not certain that it cannot be replicated using https and/or www)
3. Most of the time the request will upgrade to TLS, redirect to www and then redirect to https://www.blackrock.com/us/individual/resources/regulatory-documents?cid=vanity:regulatory:blk. In some cases, the page will fail to load with a "Safari can't open page, server isn't responding" error.
Our server logs indicate that the requests are served with a 200 response code, and our CDN logs show ERR_CLIENT_ABORT errors. We could replicate the issue on both MacBooks and iPhones, but couldn't replicate it on a machine with macOS 14 (which is only indicative, given the sporadic nature of the problem). Our devices have either Safari 18.4 or 18.5 installed.
Looking at the release notes (https://developer.apple.com/documentation/safari-release-notes/safari-18_4-release-notes), we noticed a particular change about timeouts during optimistic TLS upgrades, and we are trying to understand whether it could be related.
https://github.com/WebKit/WebKit/commit/50f926d27862352367373c172fc56426b225789e
Alexey Proskuryakov
Thank you for the report! Just to clarify, are you saying that sporadic 5-6 second delays are expected for this server, but getting a timeout is the bug tracked here? I could reproduce the delays (even when using `curl` instead of Safari), but I could never reproduce the timeout.
The change in https://commits.webkit.org/283494@main is an interesting find indeed. However, the adaptive timeout aspect of it is tracked per network session, so each new private browsing window or tab would start with the default timeout, and this shouldn't be making the issue harder to reproduce.
Given that you can sometimes reproduce, could you please collect a sysdiagnose right after reproducing, and file a report with the sysdiagnose attached via https://feedbackassistant.apple.com? Please post the FB number here once you have it.
gabor.bodo
Hi Alexey,
Sporadic 5-6 second responses are expected from the server; the potential bug tracked here is the timeout. Is it possible that the issue here is the 3-second timeout the adaptive logic is initialised with?
We will submit a report and paste the FB number here.
Thank you,
Gabor
gabor.bodo
Quick update: it might take a day or two to submit the sysdiagnose logs due to the InfoSec process. Apologies for the delay.
Matthew Finkel
Thanks for reporting this and helping investigate. The expectation with the adaptive timeout is that if the request is not completed within that time (3 seconds by default), then WebKit will re-attempt the request using http://. Your report suggests that Safari does not always re-attempt the request after the timeout expires, so we don't fall back to using an http connection. Given the sporadic nature of this issue, which of these have you seen:
1) the fallback behavior to http is not reliable, or
2) the request's success/failure is completely dependent on how quickly the server responds?
I can definitely believe that 3 seconds is too aggressive for a default time out, even though most requests should finish within that time. The goal is to provide a good balance between user experience and best-effort secure connections with servers.
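The expected behavior described above (try https with a timeout, then re-attempt over http) can be sketched roughly as follows. This is an illustrative model only, not WebKit's implementation: `load_with_optimistic_upgrade`, `UpgradeTimeout`, and the injected `fetch` callable are hypothetical names, and the fake fetcher only simulates latency.

```python
from urllib.parse import urlsplit, urlunsplit


class UpgradeTimeout(Exception):
    """Raised by a fetcher when no response arrives within the timeout."""


def upgrade_to_https(url):
    """Return the same URL with its scheme replaced by https."""
    parts = urlsplit(url)
    return urlunsplit(("https",) + tuple(parts[1:]))


def load_with_optimistic_upgrade(url, fetch, timeout=3.0):
    """Try the https variant first; on timeout, retry the original http URL.

    `fetch(url, timeout)` is a hypothetical callable that returns a response
    or raises UpgradeTimeout. Returns (url_that_answered, response).
    """
    if urlsplit(url).scheme != "http":
        return url, fetch(url, None)
    upgraded = upgrade_to_https(url)
    try:
        return upgraded, fetch(upgraded, timeout)
    except UpgradeTimeout:
        # Fallback: the upgraded attempt was too slow, so use plain http.
        return url, fetch(url, None)


def make_fake_fetch(https_latency):
    """Build a fetcher that simulates an https endpoint with a fixed latency."""
    def fetch(url, timeout):
        too_slow = timeout is not None and https_latency > timeout
        if url.startswith("https://") and too_slow:
            raise UpgradeTimeout(url)
        return "response from " + url
    return fetch
```

With a simulated 5.5-second https response and the 3-second default, the model falls back to the http URL; the report in this bug is that Safari sometimes shows an error page instead of completing that fallback.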
gabor.bodo
Hi Matthew,
1) Can you please clarify what exactly 'fallback mechanism to http' means? Our servers only support https, http requests always get redirected
2) Yes, it seems that we can't reproduce the issue if the cache is warm and the server responds faster
The redirect chain set up on our infrastructure is the following:
http://blackrock.com/prospectus --- (301) ---> https://www.blackrock.com/prospectus -- (301) ---> https://www.blackrock.com/us/individual/resources/regulatory-documents?cid=vanity:regulatory:blk
I think an http to https upgrade might be happening (in parallel), i.e. http://blackrock.com/prospectus -> https://blackrock.com/prospectus; I'm not sure whether that can cause any issue. We saw the parallel upgrade behaviour in other browsers (Chrome). Safari's dev tools view collates the redirects, so it's a bit more difficult to track them.
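Since Safari's dev tools collapse the redirects, one way to inspect a chain like the one above hop by hop is a small script that never follows redirects automatically. The following is a minimal sketch using only the Python standard library; `head_lookup` and `trace_redirects` are hypothetical helper names, and it assumes the server answers HEAD requests the same way as GET.

```python
import http.client
from urllib.parse import urljoin, urlsplit


def head_lookup(url, timeout=10.0):
    """Issue one HEAD request without following redirects.

    Returns (status, location); location is None if no Location header is sent.
    """
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, lookup=head_lookup, max_hops=10):
    """Follow 3xx responses one hop at a time; return a list of (url, status)."""
    hops = []
    for _ in range(max_hops):
        status, location = lookup(url)
        hops.append((url, status))
        if status not in (301, 302, 303, 307, 308) or not location:
            break
        url = urljoin(url, location)  # Location may be a relative reference
    return hops
```

Calling `trace_redirects("http://blackrock.com/prospectus")` would list each hop with its status code; the `lookup` parameter exists so the chain logic can be exercised without network access.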
gabor.bodo
I've submitted a feedback report under FB17682317
Alexey Proskuryakov
rdar://151910984
gabor.bodo
Just checking in: have you managed to review the sysdiagnose report? Let me know if you need further information.
Karl Dubost
SSL Labs rates this server's TLS configuration as A+:
https://www.ssllabs.com/ssltest/analyze.html?d=www.blackrock.com
After clearing cache and history, I tried a couple of times on macOS 15.5 (24F74) and could not reproduce with STP 219.
gabor.bodo
The cache TTL on this endpoint is 15 minutes (there might be some extra delay as we're running multiple instances).
We can reproduce the issue fairly reliably during EMEA morning hours, when there hasn't been a recent call to keep the cache warm.
gabor.bodo
Please note that due to the regulatory nature of the endpoint, we're rolling out a proactive patch today which warms the cache in the background, and practically always serves clients from the cache. After the patch is applied, reproducing the error might be extremely difficult, if not impossible.
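For readers unfamiliar with the pattern, background cache warming can be modelled as below. This is a generic sketch, not a description of the reporter's infrastructure: `WarmedCache`, its `load` callable, and the injectable `clock` are hypothetical, and only the 15-minute TTL comes from this thread.

```python
import time


class WarmedCache:
    """TTL cache whose single entry can be refreshed by a background job."""

    def __init__(self, load, ttl_seconds=15 * 60, clock=time.monotonic):
        self._load = load            # slow origin call (hypothetical)
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the sketch is testable
        self._value = None
        self._expires_at = None

    def is_fresh(self):
        """True while the cached entry has not yet expired."""
        return self._expires_at is not None and self._clock() < self._expires_at

    def warm(self):
        """Reload the value; meant to be called periodically by a scheduler."""
        self._value = self._load()
        self._expires_at = self._clock() + self._ttl

    def get(self):
        """Return (value, served_from_cache)."""
        if self.is_fresh():
            return self._value, True
        # Cold path: the client waits for the slow origin call.
        self.warm()
        return self._value, False
```

If `warm()` runs more often than the TTL, client requests always take the fast path, which is why the slow 5-6 second responses, and with them the timeout, become hard to trigger once the patch is applied.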
rupin
Pull request: https://github.com/WebKit/WebKit/pull/47365
EWS
Committed 297128@main (cde2625ebe14): <https://commits.webkit.org/297128@main>
Reviewed commits have been landed. Closing PR #47365 and removing active labels.