RESOLVED FIXED Bug 144259
Network Cache: Disk cache getting filled by YouTube video data
https://bugs.webkit.org/show_bug.cgi?id=144259
Summary Network Cache: Disk cache getting filled by YouTube video data
Antti Koivisto
Reported 2015-04-27 06:15:16 PDT
MSE media loads via the cache and tends to eventually fill it.
Attachments
patch (26.98 KB, patch)
2015-04-27 06:31 PDT, Antti Koivisto
no flags
patch (27.35 KB, patch)
2015-04-27 06:55 PDT, Antti Koivisto
no flags
patch (28.39 KB, patch)
2015-04-27 07:55 PDT, Antti Koivisto
no flags
patch (28.86 KB, patch)
2015-04-27 08:36 PDT, Antti Koivisto
darin: review+
Antti Koivisto
Comment 1 2015-04-27 06:15:41 PDT
Antti Koivisto
Comment 2 2015-04-27 06:31:05 PDT
WebKit Commit Bot
Comment 3 2015-04-27 06:32:42 PDT
Attachment 251736 [details] did not pass style-queue: ERROR: Source/WebCore/platform/network/ResourceRequestBase.h:150: enum members should use InterCaps with an initial capital letter. [readability/enum_casing] [4] Total errors found: 1 in 18 files If any of these errors are false positives, please file a bug against check-webkit-style.
Antti Koivisto
Comment 4 2015-04-27 06:55:32 PDT
WebKit Commit Bot
Comment 5 2015-04-27 06:57:08 PDT
Attachment 251738 [details] did not pass style-queue: ERROR: Source/WebCore/platform/network/ResourceRequestBase.h:150: enum members should use InterCaps with an initial capital letter. [readability/enum_casing] [4] Total errors found: 1 in 18 files If any of these errors are false positives, please file a bug against check-webkit-style.
Antti Koivisto
Comment 6 2015-04-27 07:55:32 PDT
WebKit Commit Bot
Comment 7 2015-04-27 07:57:20 PDT
Attachment 251739 [details] did not pass style-queue: ERROR: Source/WebCore/platform/network/ResourceRequestBase.h:150: enum members should use InterCaps with an initial capital letter. [readability/enum_casing] [4] Total errors found: 1 in 19 files If any of these errors are false positives, please file a bug against check-webkit-style.
Antti Koivisto
Comment 8 2015-04-27 08:36:01 PDT
WebKit Commit Bot
Comment 9 2015-04-27 08:38:12 PDT
Attachment 251743 [details] did not pass style-queue: ERROR: Source/WebCore/platform/network/ResourceRequestBase.h:150: enum members should use InterCaps with an initial capital letter. [readability/enum_casing] [4] Total errors found: 1 in 19 files If any of these errors are false positives, please file a bug against check-webkit-style.
Darin Adler
Comment 10 2015-04-27 09:06:33 PDT
Comment on attachment 251743 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=251743&action=review > Source/WebCore/platform/network/ResourceRequestBase.h:231 > + Requester m_requester { Requester::Unspecified }; The data members just above are going out of their way to use bitfields, and here for something that takes only 2 bits we are not using one. > Source/WebKit2/NetworkProcess/cache/NetworkCache.cpp:307 > + bool isLikelyStreamingMedia = isXHR && (response.mimeType().startsWith("video/") || response.mimeType().startsWith("audio/")); Is MIME type guaranteed to be lower case? I think it would be best to do this comparison in a case folding way.
Antti Koivisto
Comment 11 2015-04-27 10:35:16 PDT
> The data members just above are going out of their way to use bitfields, and > here for something that takes only 2 bits we are not using one. Those bitfields are an over-optimization. I was planning to get rid of them separately.
Geoffrey Garen
Comment 12 2015-04-27 10:53:16 PDT
Comment on attachment 251743 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=251743&action=review > Source/WebKit2/NetworkProcess/cache/NetworkCache.cpp:309 > + // Media loaded via XHR is likely being used for MSE streaming (YouTube and Netflix for example). > + // Streaming media fills the cache quickly and is unlikely to be reused. > + bool isXHR = originalRequest.requester() == WebCore::ResourceRequest::Requester::XHR; > + bool isLikelyStreamingMedia = isXHR && (response.mimeType().startsWith("video/") || response.mimeType().startsWith("audio/")); > + if (isLikelyStreamingMedia) > + return StoreDecision::NoDueToStreamingMedia; We don't think that people rewind, fast-forward, and/or revisit videos? I tend to do all of those things. :(
Alexey Proskuryakov
Comment 13 2015-04-27 21:14:57 PDT
We just removed "requestor" from ResourceRequest due to it being a layering violation, and deprecated isMainResourceRequest, so it's unfortunate that it's being added back. ResourceRequest is a network level concept that shouldn't know about "XMLHttpRequest" on "main resource" in my opinion. Should we have an abstraction for networking, or just use web concepts all the way through?
Antti Koivisto
Comment 14 2015-04-28 00:19:50 PDT
In practice we need this sort of information through the stack. We can pass it via side channels. This is bug-prone and cumbersome. We could add a new high level request type that knows about the web and wraps a low level type that doesn't. This is adds yet another layer to the already too layered networking code. In the end we are making a browser engine not a platonic networking platform. "Web concepts all the way through" is a decent description of WebKit architecture.
Antti Koivisto
Comment 15 2015-04-28 03:20:08 PDT
> We don't think that people rewind, fast-forward, and/or revisit videos? I > tend to do all of those things. :( At least the current YouTube player never seems to generate the same URL twice so cache entires are never hit. For rewind and fast-forward have 5min 1024p, 314MB memory cache on MSE level (bigger than the entire disk cache). We can't currently move data from the memory cache to MSE without copying so there is no mapped file benefit either. We should revisit this at some point. I think we want a separate more transient media cache section on disk and we should allow copy-free use of mapped buffers through the stack.
Antti Koivisto
Comment 16 2015-04-28 03:21:26 PDT
> We can't currently move data from the memory cache to MSE without copying so > there is no mapped file benefit either. from the disk cache I mean
Antti Koivisto
Comment 17 2015-04-28 05:51:07 PDT
mitz
Comment 18 2015-04-28 07:18:04 PDT
(In reply to comment #17) > https://trac.webkit.org/r183467 Looks like this broke the USE(CFNETWORK) iOS configuration: Source/WebCore/platform/network/ios/ResourceRequestIOS.mm:42:7: error: member initializer 'm_mainResourceRequest' does not name a non-static data member or base class
Andy Estes
Comment 19 2015-04-28 08:54:48 PDT
Mavericks Debug build fix: http://trac.webkit.org/changeset/183478
Darin Adler
Comment 20 2015-04-28 09:26:45 PDT
Regarding Alexey and Antti’s discussion about “web concepts” at the lower level: I think what we want for the layering here is that the lower level exposes the concepts it uses to make the cache work, and the higher level adopts/implements those. There is no reason, in my opinion, that the lower level concepts need to be “non-web”, but we do want them to be as simple as possible, not simply plumbing through the entire high level. And obviously, “nothing at all” is the simplest, but may not be sufficient.
Alexey Proskuryakov
Comment 21 2015-04-28 09:56:53 PDT
It's not really a choice about platonic vs. pragmatical, it's simple layered software design. If you want networking to implement a policy that is easy for other engineers to understand, if should have a descriptive name (not "whatever XHR needs"), and the policies should be as centralized as possible, not scattered across multiple frameworks and processes. In other words, policy decisions are made before a ResourceRequest is passed to NetworkProcess. It should be especially straightforward in this case, where you just want to inhibit caching.
Antti Koivisto
Comment 22 2015-04-28 13:20:09 PDT
If you look at the patch you will see that the requster is used for number of non-cache related things. Those used to be handled by a side-channel and by (already) having the requester type in ResourceRequest (on iOS). I want to keep cache decision-making in one place. I also don't want to be coy and use some other name when I really mean XHR. I would be surprised if this confused anyone. Generally less abstract code is more understandable.
Antti Koivisto
Comment 23 2015-04-28 13:35:50 PDT
Also "which layer knows about which concept" kind of layering is very soft when not reflected in code dependencies. I don't think there is any "simple layered software design" hard rules here.
Alexey Proskuryakov
Comment 24 2015-05-04 19:49:07 PDT
This caused bug 144533. For the record, I still think that this is poor design. Even your own words demonstrate that - you say that one needs to look at the patch to find out what "XMLHttpRequest" means to the low level networking code. I do not think that any human can fully understand the relationship between explicit behaviors passed from upper levels and implicit ones triggered by ResourceRequest::Requester::XHR.
Note You need to log in before you can comment on or make changes to this bug.