When downloading, Safari fetches the file from the network and bypasses the ServiceWorker, so files that are generated inside the ServiceWorker can't be downloaded. See sample URL: https://stat-info.cz/safari-sw-download.html
<rdar://problem/55721934>
Any progress on this issue?
*** Bug 201580 has been marked as a duplicate of this bug. ***
I usually don't bump threads with "+1" or "how's it going?" because I know how annoying that can be. But I have been waiting for a year now for this to be resolved, with one failed attempt (#182848), and have gotten no feedback or progress on this topic. It feels like this has been kept in the dark and forgotten about. Could someone give this a slightly higher priority, please?
Note that if the file generation takes too long, the service worker might be stopped before the download finishes. I did not check Chrome and Firefox, but I would guess they have this behavior too?
I have been using Chrome/Firefox service worker downloads for some time and have no problem downloading hundreds of GB in a single download. It can take hours and the service worker will not be interrupted. It even keeps downloading on Android with Chrome in the background! Yes, there are edge cases where the service worker might be interrupted, but a "standard" download might be interrupted by any browser as well.
(In reply to mrskman from comment #6)
> I'm using Chrome/Firefox service worker downloads for some time and I have
> no problem to download hundreds of GB in a single download. It can take
> hours and the service worker will not be interrupted. It is still
> downloading even on Android with Chrome in background!

Thanks for the feedback. Can you tell me whether this is the case even if there is no client for that particular service worker? Say you have one page which is the sole service worker client, you start the download, and then you close the page.
I'm feeding service worker from the page so I can't tell if it works reliably this way. As a quick test I tried to close the page in Chrome while downloading. The download was not interrupted, but was waiting endlessly for data as expected.
I know for a fact that one way to keep the service worker alive is to keep sending postMessages to it every x seconds to prevent it from going idle. (I think this is stupid and the spec should change to allow workers to stay open for longer: https://github.com/w3c/ServiceWorker/issues/980#issuecomment-486827233) I have also read in some Chromium and Mozilla Bugzilla issues that a service worker should stay alive for longer periods if it is holding a `FetchEvent.respondWith()` and the FetchEvent's window is still open. If the FetchEvent's window is closed, the `respondWith()` keep-alive should be revoked. https://github.com/w3c/ServiceWorker/issues/882 https://bugzilla.mozilla.org/show_bug.cgi?id=1302715
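The postMessage keep-alive pattern described above can be sketched like this (a sketch only: the helper name, the ten-second period, and the message payload are all arbitrary assumptions, not from any spec or library):

```javascript
// Hypothetical sketch of the postMessage keep-alive pattern: ping the
// service worker on an interval so the browser never considers it idle.
// The function name, period, and payload are illustrative assumptions.
function startKeepAlive(serviceWorker, periodMs = 10000) {
  const id = setInterval(() => serviceWorker.postMessage('keepalive'), periodMs);
  return () => clearInterval(id); // call this once the download finishes
}

// Browser-only usage (navigator.serviceWorker does not exist outside pages):
// const stopKeepAlive = startKeepAlive(navigator.serviceWorker.controller);
// ...when the download completes or fails: stopKeepAlive();
```

This does nothing to guarantee the worker stays alive; it merely resets the browser's idle timer, which is why the discussion below focuses on what the heuristics should be.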
Another thing, as Jake mentions in https://github.com/w3c/ServiceWorker/issues/882#issuecomment-262754779: if a response is fulfilled with a native built-in stream (from fetch, the Cache API, or a transferable stream), then the service worker can be closed. But if it's a user-constructed ReadableStream created within the service worker, then the worker should be kept alive as long as there is a destination to pipe the data to. If the request is aborted from the controlling page, or the download is cancelled, then it is safe to revoke the keep-alive.
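A minimal sketch of the user-constructed-stream case described above (the function name, URL check, and chunk contents are illustrative assumptions): the response body is a ReadableStream built inside the worker, which is exactly the situation where the worker has to stay alive until the stream is closed or aborted.

```javascript
// Build a Response whose body is a ReadableStream constructed in the
// service worker itself. Until close() is called, only the worker can
// keep feeding this stream, so terminating the worker kills the download.
// Names and chunk contents are illustrative assumptions.
function makeGeneratedDownloadResponse() {
  const stream = new ReadableStream({
    start(controller) {
      controller.enqueue(new TextEncoder().encode('generated chunk\n'));
      controller.close(); // a worker that never calls close() could run forever
    },
  });
  return new Response(stream, {
    headers: {
      'Content-Type': 'application/octet-stream',
      'Content-Disposition': 'attachment; filename="generated.txt"',
    },
  });
}

// In a service worker (browser-only), this would be wired up roughly as:
// self.addEventListener('fetch', (event) => {
//   if (event.request.url.endsWith('/generated-download'))
//     event.respondWith(makeGeneratedDownloadResponse());
// });
```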
As long as the service worker has a client (page with the same origin), I think it is fine to keep the service worker running if it is fueling a download. The problem is really when the service worker has no remaining client. If you close all evil.com pages as a user, the expectation is that no evil.com script is being run. If we keep running the service worker alone, this breaks this assumption.
Regardless of the above discussion, even if the client page remains open, downloading a service worker generated response (Content-Disposition: attachment) still doesn't work. Downloading by clicking on a link simply doesn't trigger the service worker at all and downloads the URL from the network. Downloading by creating a hidden iframe does trigger the service worker, and I can see that the service worker responds with a stream and that this stream is actually read all the way through, but the download manager simply doesn't care and downloads the URL from the network. This is a show-stopper for my application so I'm forced to tell my users that downloading files this way is not supported by safari.
(In reply to Pedro Pedrosa from comment #12)
> Regardless of the above discussion, even if the client page remains open,
> downloading a service worker generated response (Content-Disposition:
> attachment) still doesn't work.
>
> Downloading by clicking on a link simply doesn't trigger the service worker
> at all and downloads the URL from the network.
>
> Downloading by creating a hidden iframe does trigger the service worker, and
> I can see that the service worker responds with a stream and that this
> stream is actually read all the way through, but the download manager simply
> doesn't care and downloads the URL from the network.
>
> This is a show-stopper for my application so I'm forced to tell my users
> that downloading files this way is not supported by safari.

Right, this is a known limitation that should definitely be fixed. If the service worker is generating the content (no opaque response), the workaround is to generate a blob from the Response and trigger the download as a blob.
(In reply to youenn fablet from comment #13)
> Right, this is a known limitation that should definitely be fixed.
> If the service worker is generating the content (no opaque response), the
> workaround is to generate a blob from the Response and trigger the download
> as a blob.

Thank you for the quick turnaround. This won't work for my application. I'm trying to download files greater than 20 GB by downloading them in multiple chunks and then using the SW to concatenate all the chunks together without keeping anything in memory. Using a blob defeats the purpose and blows up pretty much any machine's RAM.
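For reference, the blob workaround discussed above can be sketched like this (function names are illustrative assumptions); as noted, it buffers the entire response in memory, which is exactly what rules it out for multi-gigabyte downloads:

```javascript
// Sketch of the blob workaround: let the fetch go through the service
// worker, buffer the whole Response into a Blob, then download the Blob
// via an object URL. Function names are illustrative assumptions.
async function responseToObjectUrl(response) {
  const blob = await response.blob(); // buffers the entire body in memory
  return URL.createObjectURL(blob);
}

// Browser-only part: point a temporary anchor at the object URL.
// const url = await responseToObjectUrl(await fetch('/sw-generated-file'));
// const a = Object.assign(document.createElement('a'),
//                         { href: url, download: 'file.bin' });
// a.click();
// URL.revokeObjectURL(url);
```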
Just to chime in that we also use a service worker and writable streams for downloading arbitrary large files and video/audio streaming with seeking. A resolution to this bug would help us immensely. It has been possible in Chrome for a number of years and in Firefox behind a flag [1] for a while. Thanks for the work to enable WritableStreams in Safari [2]. [1] https://bugzilla.mozilla.org/show_bug.cgi?id=1474543 [2] https://bugs.webkit.org/show_bug.cgi?id=149842
I also just wanted to say this bug is a major limitation for my library (https://github.com/Touffy/client-zip) that creates Zip files in the browser. So far, going through a ServiceWorker is the only way I can think of to stream the file without holding it all in memory, but Safari is unable to download the stream from the worker. You can test it here: https://touffy.me/client-zip/demo/worker Just reload to make sure the Service Worker is activated, then input some URLs and submit the form. It works fine in Chrome and Firefox.
(In reply to youenn fablet from comment #11)
> As long as the service worker has a client (page with the same origin), I
> think it is fine to keep the service worker running if it is fueling a
> download.
>
> The problem is really when the service worker has no remaining client.
> If you close all evil.com pages as a user, the expectation is that no
> evil.com script is being run. If we keep running the service worker alone,
> this breaks this assumption.

I have had some people using my StreamSaver.js lib say that they have run into this technical limitation. The download happens in a background thread just like any other "normal" download does, so they thought: hey, it must be okay to close this tab while I download this 20 GB file, since it happens in the background. They had no knowledge that a service worker or a main thread was feeding the stream with data. I have given examples/instructions on some best practices, like using the onunload event to warn the user against leaving.

So I would like to change your first statement to something like: as long as the service worker has a stream filling the hard drive with data, is not idle (paused for too long), and the stream/pipeline isn't aborted or closed, then I think it is fine to keep the service worker running. It would be awesome if you weren't forced to stay on a page the whole time. Otherwise this has some confusing UX differences that may leave users thinking it's fine to close the page, because the download might happen in a background thread, when really it isn't.

---

A little off topic here, but... honestly, I think this whole limited lifetime is _almost_ completely stupid and not fully thought through. Sure, it had good intentions, but it's a pain! There are good use cases for long-lived service workers, and developers have asked for them for a long time. I think this could have been solved in some other, better way.
See my comment on this matter: https://github.com/w3c/ServiceWorker/issues/980#issuecomment-486827233 Sure, it's great that you can kill unneeded or wasteful workers, but if developers and users want to keep the worker alive for a good reason, then let them make that decision. How? Not sure; there are some thoughts floating around in w3c/ServiceWorker#980.
Oh, and it sucks that I can't edit my comment because I made some spelling mistakes :P Move to GitHub, where all the developers are :P It has better issue handling (it supports markdown, a nice newly added reply feature, and a bunch more).
Still a problem here. Safari is now the only desktop browser that has to load entire files in memory when they are downloaded through a Service Worker. This is one of the last limitations standing in the way of software-like websites; it's a real shame it isn't fixed yet.
FYI, they are going to specify how extendable events are given time to execute: https://github.com/w3c/ServiceWorker/issues/1566 In short, I think this will mean that we as developers can keep the service worker alive as long as any controlling same-origin tab is open, and that the hard lifetime timer will be loosened and not be as restrictive.
It'd be very nice for my application if this were fixed! I don't really want to single out Safari, but one weird thing is that when I try to download a file in Safari, I get an empty download.txt file with "Frame load interrupted" in the error log.
I just tested this issue with the latest version of Safari 15, and unfortunately it is still not fixed. Our application relies heavily on this feature, and it is a pity that we have to advise our clients to use Chromium-based browsers or Firefox instead.
We really miss this feature. It's a shame that Safari is always a step behind other browsers.
It's impossible to use files on the web without this functionality!
Could you fix this ASAP? We need it for a business feature.
Created attachment 446017 [details] Patch
Created attachment 446020 [details] Patch
Created attachment 446025 [details] Patch
Created attachment 446032 [details] Patch
Created attachment 446033 [details] Patch
Comment on attachment 446033 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=446033&action=review

r=me

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.cpp:306
> +    SWFETCH_RELEASE_LOG("cancelFromClient:%d", m_isDone);

We should probably make the logging clearer; it is really not obvious what %d is for. Something like:
SWFETCH_RELEASE_LOG("cancelFromClient: m_isDone=%d", m_isDone);

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.cpp:434
> +    // FIXME: We might want to keep the service worker alive until the download ends.

Probably.

> Source/WebKit/NetworkProcess/ServiceWorker/WebSWServerToContextConnection.cpp:202
> +    auto iterator = m_ongoingDownloads.find(makeObjectIdentifier<FetchIdentifierType>(decoder.destinationID()));

I think this would look nicer (more concise):

```
if (auto download = m_ongoingDownloads.get(makeObjectIdentifier<FetchIdentifierType>(decoder.destinationID())))
    download->didReceiveMessage(connection, decoder);
```

> Source/WebKit/Scripts/generate-unified-sources.sh:17
> +UnifiedSourceCppFileCount=112

What is this for? Why is it included in this patch?
Thanks for the review, I will update the patch accordingly.

> > Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.cpp:434
> > +    // FIXME: We might want to keep the service worker alive until the download ends.
>
> Probably.

Right, I have not yet fully thought through this case but could live with an extended fixed delay above the current delay. I don't think it is great to allow service workers to extend their lifetime at will once the user agrees to a download. I would hope, though, that most use cases will be covered without any specific handling, and that other cases could go to the network to finish downloads if need be. In general, we try to limit the cases of service workers without any client.

> > Source/WebKit/Scripts/generate-unified-sources.sh:17
> > +UnifiedSourceCppFileCount=112
>
> What is this for? Why is it included in this patch?

Build error due to more than 111 unified files being created by the script.
Comment on attachment 446033 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=446033&action=review

Needs file operations to be on a different run loop.

> Source/WebKit/NetworkProcess/NetworkLoad.cpp:59
> +NetworkLoad::NetworkLoad(NetworkLoadClient& client, NetworkSession& networkSession, const Function<RefPtr<NetworkDataTask>(NetworkDataTaskClient&)>& createTask)

nit: I think this can be a scoped lambda to save an allocation.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerDownloadTask.cpp:71
> +    FileSystem::deleteFile(m_pendingDownloadLocation);

I don't think cancelling deletes the existing file.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerDownloadTask.cpp:85
> +    m_state = State::Running;

Should we do this only if the state was suspended? Also, this doesn't do anything to actually start the download. Should it?

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerDownloadTask.cpp:135
> +    size_t bytesWritten = FileSystem::writeToFile(m_downloadFile, data.data(), data.size());

Does this happen on the main thread? We need to assert that this is not the main run loop or we will introduce bad hangs. Same with all file operations here.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerDownloadTask.cpp:162
> +    m_sandboxExtension->revoke();
> +    m_sandboxExtension = nullptr;

nit: I feel like std::exchange could be nicely used in several places in this file.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerDownloadTask.cpp:184
> +    FileSystem::deleteFile(m_pendingDownloadLocation);

ditto, I think you shouldn't delete the file here.

>> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.cpp:434
>> +    // FIXME: We might want to keep the service worker alive until the download ends.
>
> Probably.

Definitely.

>> Source/WebKit/Scripts/generate-unified-sources.sh:17
>> +UnifiedSourceCppFileCount=112
>
> What is this for? Why is it included in this patch?
This is needed to compile successfully with Xcode.

> Source/WebKit/WebKit.xcodeproj/project.pbxproj:15463
> +    2CAECB6627465AE400AB78D0 /* UnifiedSource112.cpp in Sources */,

nit: you may consider adding more than one so we don't need to add another one for a little while. Like 5.

> Tools/WebKitTestRunner/TestController.h:671
> +    long long m_downloadTotalBytesWritten { -1 };

nit: I'd put std::optional<uint64_t>
I think once the user agrees to download, we should be quite generous with service worker lifetime until the download completes.
(In reply to Alex Christensen from comment #34)
> I think once the user agrees to download, we should be quite generous with
> service worker lifetime until the download completes.

The service worker can easily decide to never end the download (never call controller.close()). AIUI, the main use case with ReadableStream bodies is to stream some large amount of data that is already available to the service worker. It should not take more than a couple of seconds. We can always update the heuristics as a follow-up.
(In reply to youenn fablet from comment #35)
> The service worker can easily decide to never end the download (never call
> controller.close()). AIUI, the main use case with ReadableStream bodies is
> to stream some large amount of data that is already available to the service
> worker.
> It should not take more than a couple of seconds.
> We can always update the heuristics as a follow-up.

We use a service worker to download and decrypt large files on the fly on users' devices. It may take hours, not seconds.
(In reply to youenn fablet from comment #35)
> AIUI, the main use case with ReadableStream bodies is to stream some
> large amount of data that is already available to the service worker.
> It should not take more than a couple of seconds.
> We can always update the heuristics as a follow-up.

There are common use cases (packaging separate downloads into a single response, possibly applying some transform) where the ServiceWorker's download is throttled by upstream network speed. My use case is creating a ZIP archive so the user doesn't need to click a thousand links to download a thousand files at once. Or the file could be generated locally by a computationally intensive process that also outputs lots of data (more than you'd want to store in a Blob). It's even possible that the download speed will be throttled at the other end (if the user decided to "download to…" a slow USB key or networked device).

In any case, a more general-purpose heuristic may be to time out the download if the ServiceWorker doesn't push any data for a while.
Such a heuristic would still allow a service worker to run forever if it writes 1 byte every 5 seconds. We don't want to allow a service worker to run forever when the user allows one download. Allowing downloads that take less than 10 seconds would certainly be more useful than nothing, and there will always be a demand for more. This is definitely an interesting problem.
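For illustration only, the idle-timeout heuristic under discussion (time out the download if the worker pushes no data for a while, with the caveat that trickling one byte per interval defeats it) could be sketched as a stream wrapper; all names here are hypothetical, not WebKit code:

```javascript
// Wrap a ReadableStream so that onIdle fires if no chunk arrives within
// idleMs. Every chunk resets the timer; closing the stream cancels it.
// A sketch of the heuristic only; names are hypothetical assumptions.
function withIdleTimeout(readable, idleMs, onIdle) {
  let timer;
  const reset = () => {
    clearTimeout(timer);
    timer = setTimeout(onIdle, idleMs);
  };
  reset(); // arm the watchdog before the first chunk
  return readable.pipeThrough(new TransformStream({
    transform(chunk, controller) {
      reset(); // any data counts as activity
      controller.enqueue(chunk);
    },
    flush() {
      clearTimeout(timer); // stream finished cleanly
    },
  }));
}
```

As noted above, this alone does not stop a worker from drip-feeding bytes to stay alive forever; a browser would need to combine it with some overall budget.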
(In reply to mrskman from comment #36)
> (In reply to youenn fablet from comment #35)
> We use service worker to download and decrypt large files on the fly on
> users' device. It may take hours, not seconds.

@David, @mrskman, thanks for these use cases. I think the current heuristic can support these cases as follows:
- Prepare the download data in the background in the service worker (or in a service worker client). If need be, persist it on disk (IDB, Cache API, filesystem API).
- Keep a web page open to keep the service worker open. This page can be used to show the progress of this background task.
- If the page gets closed (and it is the last page of the same origin), the service worker will be stopped. It might resume when a page of the given origin is loaded.
- When the background task is finished and the data is ready, trigger the download; the heuristic should be tuned to let the download finish, even if there is no service worker client remaining.

In the future, one could also use background fetch to do downloads in the background and decrypt once the fetch is completed to trigger the actual download and save the file on disk.

In general, I do not think long-lasting loads or downloads are a great fit for service workers. Service workers are assumed to work for short periods of time and to be stoppable/restartable at will without causing application breakage.

More specific to long downloads, users might want to pause/resume downloads through their user agent. Service worker downloads do not support this. Or users might want to stop any JS from running for a given origin (this is the contract between the user and their user agent when closing the last page of a given origin). Stopping the service worker would fail the download without a chance to resume it properly once the service worker gets relaunched. We would need a new API to properly cover this case.
(In reply to youenn fablet from comment #39)
> I think the current heuristic can support these cases as follows:
> - Prepare the download data in the background in the service worker (or in a
> service worker client). If need be, persist it on disk (IDB, Cache API,
> filesystem API).
> - Keep a web page open to keep the service worker open. This page can be
> used to show the progress of this background task.
> - If the page gets closed (and is the last of the same origin), service
> worker will be stopped. It might resume when a page of the given origin is
> loaded.
> - When background task is finished, data is ready: trigger the download, the
> heuristic should be tuned to let the download finish, even if there is no
> service worker client remaining.

The point of going through the ServiceWorker is sometimes precisely that we don't want to (or can't even) keep the whole payload in memory. It's called streaming. Without it, we could just make a Blob and provide an Object URL to download it, bypassing the ServiceWorker. The filesystem API would solve some use cases better than a download through the ServiceWorker, but it's far from ready.

> More specific to long downloads, users might want to pause/resume downloads
> through their user agent. Service worker downloads do not support this.

They could. The WHATWG Streams API supports backpressure.
(Also, not just backpressure: the ServiceWorker could support HTTP partial content headers, which is how browsers "resume" a download from an actual HTTP server, as long as the browser reactivates the ServiceWorker in response to the new download request.)
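For illustration, honoring partial-content requests in a service worker starts with parsing the Range request header; a minimal single-range sketch might look like this (a sketch under stated assumptions, not a full RFC 7233 implementation; the function name is hypothetical):

```javascript
// Parse a single-range "Range: bytes=start-end" header and build the
// matching 206 Partial Content response headers. A sketch only:
// multi-range requests and validators (ETag/If-Range) are ignored.
function partialContentHeaders(rangeHeader, totalSize) {
  const m = /^bytes=(\d*)-(\d*)$/.exec(rangeHeader || '');
  if (!m || (m[1] === '' && m[2] === '')) return null; // no usable range
  const start = m[1] === '' ? totalSize - Number(m[2]) : Number(m[1]);
  const end = m[1] !== '' && m[2] !== '' ? Number(m[2]) : totalSize - 1;
  if (start < 0 || start > end || end >= totalSize) return null;
  return {
    status: 206,
    start,
    end,
    headers: {
      'Content-Range': `bytes ${start}-${end}/${totalSize}`,
      'Content-Length': String(end - start + 1),
    },
  };
}
```

In a fetch handler, the worker would then slice its generated stream to the `[start, end]` window and respond with `new Response(body, { status: 206, headers })`, assuming the browser re-activates the worker for the resumed request.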
(In reply to youenn fablet from comment #39)
> - Prepare the download data in the background in the service worker (or in a
> service worker client). If need be, persist it on disk (IDB, Cache API,
> filesystem API).

That's what we do now using Blob download. This non-streaming approach has one big disadvantage: you need to store the same data twice on the user's device.
Created attachment 446142 [details] Patch
Created attachment 446149 [details] Patch
Created attachment 446336 [details] Patch
Comment on attachment 446336 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=446336&action=review

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerDownloadTask.cpp:50
> +    , m_writeQueue(WorkQueue::create("ServiceWorkerDownloadTask"))

Could we share the queue if there are multiple downloads happening at the same time?

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerDownloadTask.cpp:72
> +    m_writeQueue->dispatch([protectedThis = Ref { *this }, function = WTFMove(function)]() mutable {

nit: I don't think this needs to be mutable.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerDownloadTask.cpp:79
> +    m_writeQueue->dispatch([this, protectedThis = Ref { *this }] {

ASSERT(isMainRunLoop()); outside, ASSERT(!isMainRunLoop()); inside. Let's add these to all the functions here.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerDownloadTask.cpp:84
> +    FileSystem::deleteFile(m_pendingDownloadLocation);

?

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerDownloadTask.cpp:126
> +    if (allowOverwrite && FileSystem::fileExists(filename))
> +        FileSystem::deleteFile(filename);

Do we need to delete the file? Should we fail if we don't allow overwrite and the file exists?

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerDownloadTask.cpp:222
> +    FileSystem::deleteFile(m_pendingDownloadLocation);

I really don't think we should delete the file when failing or cancelling.
> > Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerDownloadTask.cpp:126
> > +    if (allowOverwrite && FileSystem::fileExists(filename))
> > +        FileSystem::deleteFile(filename);
>
> Do we need to delete the file? Should we fail if we don't allow overwrite
> and the file exists?
>
> > Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerDownloadTask.cpp:222
> > +    FileSystem::deleteFile(m_pendingDownloadLocation);
>
> I really don't think we delete the file when failing or cancelling.

That is what we are doing in NetworkDataTaskBlob; it seems good to keep both in sync. We can change that in both places and add tests, at least for the service worker case, but this seems fine to me as a follow-up. I do not know what NetworkDataTaskCocoa does when a cancel happens (probably worth writing a test if there is not already one, and aligning blob and SW according to NetworkDataTaskCocoa). The file is apparently deleted after URLSession:downloadTask:didFinishDownloadingToURL: is called, per NSURLSession.h.
(In reply to youenn fablet from comment #47)
> > > Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerDownloadTask.cpp:126
> > > +    if (allowOverwrite && FileSystem::fileExists(filename))
> > > +        FileSystem::deleteFile(filename);
> >
> > Do we need to delete the file? Should we fail if we don't allow overwrite
> > and the file exists?
> >
> > > Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerDownloadTask.cpp:222
> > > +    FileSystem::deleteFile(m_pendingDownloadLocation);
> >
> > I really don't think we delete the file when failing or cancelling.

I gave it a try, and file deletion actually fails in that case, so it is a no-op. I'll remove the call to deleteFile from the SW code path.
Created attachment 446520 [details] Patch
Created attachment 446524 [details] Patch
Comment on attachment 446524 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=446524&action=review

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerDownloadTask.cpp:63
> +}

Maybe call close here to unregister if we haven't already?

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerNavigationPreloader.h:30
> +#include "DownloadID.h"

nit: extra space
Created attachment 446977 [details] Patch for landing
Committed r286944 (245169@main): <https://commits.webkit.org/245169@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 446977 [details].