<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>207531</bug_id>
          
          <creation_ts>2020-02-11 01:05:02 -0800</creation_ts>
          <short_desc>CachedResource should purge SharedBuffer if it is a particular type</short_desc>
          <delta_ts>2020-03-05 18:33:11 -0800</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>Page Loading</component>
          <version>WebKit Nightly Build</version>
          <rep_platform>Unspecified</rep_platform>
          <op_sys>Unspecified</op_sys>
          <bug_status>NEW</bug_status>
          <resolution></resolution>
          
          <see_also>https://bugs.webkit.org/show_bug.cgi?id=208683</see_also>
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Yusuke Suzuki">ysuzuki</reporter>
          <assigned_to name="Yusuke Suzuki">ysuzuki</assigned_to>
          <cc>basuke</cc>
    
    <cc>beidson</cc>
    
    <cc>ggaren</cc>
    
    <cc>simon.fraser</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>1617300</commentid>
    <comment_count>0</comment_count>
    <who name="Yusuke Suzuki">ysuzuki</who>
    <bug_when>2020-02-11 01:05:02 -0800</bug_when>
    <thetext>The detailed memgraph data collected from Membuster shows that there are many large Vectors,
and they are data segments of the SharedBuffers of CachedResources, including CachedScript, CachedCSSStyleSheet, CachedImage, etc.
But the important thing is that these resources also hold decoded data! This means we basically keep double-sized data for as long as the CachedScript etc. is held by CachedScriptSourceProvider.

For example, a CachedScript has a decoded string.
This means we have duplicate data for this script: one copy in the SharedBuffer and one in the decoded String.
The same can be said for CachedCSSStyleSheet, CachedImage, etc.

We have no mechanism for destroying the SharedBuffer; instead, we have one for destroying the decoded data (destroyDecodedData).
But it is not called as long as CachedScriptSourceProvider is holding this CachedScript.
This basically means that we keep duplicate data for as long as we stay on this page.
If we navigate to another page, we could purge the decoded data (and we could purge the CachedResource too).
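
As a rough illustration of the idea (a Python toy model, not WebKit code; the class name CachedTextResource and its methods are made up for this sketch), a resource could drop its raw bytes as soon as the decoded form exists, so that only one copy stays resident:

```python
class CachedTextResource:
    """Toy model of a cached text resource; not the WebKit CachedResource class."""

    def __init__(self, encoded, encoding="utf-8"):
        self.encoded = encoded    # raw bytes, standing in for the SharedBuffer
        self.encoding = encoding
        self.decoded = None       # decoded text, standing in for the decoded String

    def text(self):
        # Decode lazily on first use.
        if self.decoded is None:
            self.decoded = self.encoded.decode(self.encoding)
            # Drop the raw bytes once the decoded form exists, so only one
            # representation of the content stays alive.
            self.encoded = None
        return self.decoded

    def resident_forms(self):
        # Report which representations are currently held.
        forms = []
        if self.encoded is not None:
            forms.append("encoded")
        if self.decoded is not None:
            forms.append("decoded")
        return forms


resource = CachedTextResource("var name = \"caf\u00e9\";".encode("utf-8"))
print(resource.resident_forms())
source = resource.text()
print(resource.resident_forms())
```

The sketch assumes the raw bytes are never needed again after decoding. Whether that holds for a given CachedResource type is exactly what would have to be checked.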

For some CachedResource types, we should keep the decoded data and purge the SharedBuffer instead.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1617301</commentid>
    <comment_count>1</comment_count>
    <who name="Yusuke Suzuki">ysuzuki</who>
    <bug_when>2020-02-11 01:12:54 -0800</bug_when>
    <thetext>I’ll try this tomorrow. The plan is to use Variant&lt;SharedBuffer, String, ...&gt; as the data.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1617309</commentid>
    <comment_count>2</comment_count>
    <who name="Yusuke Suzuki">ysuzuki</who>
    <bug_when>2020-02-11 01:53:42 -0800</bug_when>
    <thetext>It seems that the Blink folks are doing this. We should try it.
https://docs.google.com/document/d/1v0yTAZ6wkqX2U_M6BNIGUJpM1s0TIw1VsqpxoL7aciY/edit#heading=h.hydebxiwp5hv</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1617845</commentid>
    <comment_count>3</comment_count>
    <who name="Yusuke Suzuki">ysuzuki</who>
    <bug_when>2020-02-11 20:11:39 -0800</bug_when>
    <thetext>We have a path that uses the SharedBuffer as the content when it is ASCII. It seems that Membuster mainly takes this path, so this may not affect Membuster&apos;s memory usage.

But for the image case, we should do it.
We should still do it in general, but I&apos;ll check later since this would not affect the Membuster result.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1618415</commentid>
    <comment_count>4</comment_count>
    <who name="Yusuke Suzuki">ysuzuki</who>
    <bug_when>2020-02-12 23:40:21 -0800</bug_when>
    <thetext>For the CachedImage case:

// On Mac the NSData inside the SharedBuffer can be secretly appended to without the SharedBuffer&apos;s knowledge.
// We use SharedBuffer&apos;s ability to wrap itself inside CFData to get around this, ensuring that ImageIO is
// really looking at the SharedBuffer.

We are already doing this, cool.

Other possibilities are:

1. non-ASCII string source code
2. script source code compression
3. style sheet source code compression</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1618417</commentid>
    <comment_count>5</comment_count>
    <who name="Yusuke Suzuki">ysuzuki</who>
    <bug_when>2020-02-12 23:47:35 -0800</bug_when>
    <thetext>(In reply to Yusuke Suzuki from comment #4)
&gt; For CachedImage case,
&gt; 
&gt; // On Mac the NSData inside the SharedBuffer can be secretly appended to
&gt; without the SharedBuffer&apos;s knowledge.
&gt; // We use SharedBuffer&apos;s ability to wrap itself inside CFData to get around
&gt; this, ensuring that ImageIO is
&gt; // really looking at the SharedBuffer.
&gt; 
&gt; We are already doing this, cool.
&gt; 
&gt; Other possibility is,
&gt; 
&gt; 1. non-ASCII string source code
&gt; 2. script source code compression
&gt; 3. style sheet source code compression

Wait, I need to check whether ImageIO is using this buffer directly or doing some fancy buffering internally.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>