Bug 133207

Summary: GC should have fast-path for destroying strings.
Product: WebKit Reporter: Andreas Kling <kling>
Component: JavaScriptCore Assignee: Andreas Kling <kling>
Status: RESOLVED INVALID    
Severity: Normal CC: barraclough, ggaren, kling, mhahnenberg
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch idea none

Description Andreas Kling 2014-05-23 00:56:12 PDT
I have this little idea for making GC destroy JSStrings more efficiently.
Comment 1 Andreas Kling 2014-05-23 00:57:05 PDT
Created attachment 231948
Patch idea
Comment 2 Andreas Kling 2014-05-23 00:58:43 PDT
Benchmark report for Octane on CabMook (MacBookPro10,1).

VMs tested:
"ToT" at /Volumes/Data/Source/Safari/Reference-OpenSource/WebKitBuild/Release/jsc
"Hacks" at /Volumes/Data/Source/Safari/OpenSource/WebKitBuild/Release/jsc

Collected 4 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc()
between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the
jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution
times with 95% confidence intervals in milliseconds.

                              ToT                      Hacks                                       

encrypt                 1.08497+-0.01829          1.07993+-0.02012       
decrypt                19.30935+-0.24879         19.29161+-0.02956       
deltablue      x2       1.21815+-0.00833          1.21247+-0.01500       
earley                  2.19743+-0.05550          2.18668+-0.00854       
boyer                  23.90228+-0.06863    ?    23.96938+-0.14821       ?
navier-stokes  x2      23.86052+-0.01006    ?    23.86080+-0.00542       ?
raytrace       x2       8.10303+-0.12664    ?     8.15091+-0.17171       ?
richards       x2       0.60843+-0.04798          0.60282+-0.01505       
splay          x2       1.55989+-0.08370          1.49182+-0.01882         might be 1.0456x faster
regexp         x2     199.19518+-0.92605        197.05836+-2.10837         might be 1.0108x faster
pdfjs          x2     266.75545+-0.98525    ^   262.35065+-0.61502       ^ definitely 1.0168x faster
mandreel       x2     299.90476+-2.45375    ?   301.06825+-3.98014       ?
gbemu          x2     272.13303+-3.90926    ?   276.64499+-7.16828       ? might be 1.0166x slower
closure                 2.37485+-0.01098          2.37377+-0.01897       
jquery                 31.55491+-0.20196         31.54801+-0.22624       
box2d          x2     120.80826+-6.90088        120.15285+-5.52892       
zlib           x2    1798.62225+-5.93989       1780.48275+-88.35760        might be 1.0102x faster
typescript     x2    3297.27832+-59.34699   ?  3307.64124+-34.31568      ?

<arithmetic>          422.01728+-3.34998        421.39617+-6.21773         might be 1.0015x faster
<geometric> *          35.74942+-0.37086         35.56974+-0.26158         might be 1.0051x faster
<harmonic>              3.52166+-0.12538          3.48135+-0.04920         might be 1.0116x faster
Comment 3 Geoffrey Garen 2014-05-23 09:48:00 PDT
Comment on attachment 231948
Patch idea

I'm not a huge fan of this. The brittleness of the tight coupling between GC and string, and the fact that non-string has to do an extra branch, feels more significant than the speedup.

Maybe this would look better if there were an explicit String dtorType, and strings all got segregated into their own special MarkedBlock.
Comment 4 Andreas Kling 2014-05-23 10:05:59 PDT
(In reply to comment #3)
> (From update of attachment 231948)
> I'm not a huge fan of this. The brittleness of the tight coupling between GC and string, and the fact that non-string has to do an extra branch, feels more significant than the speedup.

I paid for the cell type branch by killing the branch on dtorType, no? ;)

> Maybe this would look better if there were an explicit String dtorType, and strings all got segregated into their own special MarkedBlock.

Yeah maybe. That seems like it could have other effects though. I can try it and see.
Comment 5 Geoffrey Garen 2014-05-23 11:46:47 PDT
How about this:

(1) A tiny patch to specialize the dtor call as a template parameter, so there's no branch on it. Is great good.

(2) A bigger patch to put all strings in their own MarkedAllocator. Specialize the sweeping of that fellow, too. Now, we've removed a branch from sweeping strings, and the system is still pretty flexible.

Eventually, we will probably build on this to allocate StringImpls inline in JSString, and remove malloc / free / destruction from most strings.
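
A minimal sketch of idea (1), assuming a simplified sweep loop. Cell, DtorType, destroyCell, sweepCells and sweep are hypothetical stand-ins here, not JavaScriptCore's actual types or the contents of the attached patch. The point is only that the destructor policy becomes a template parameter, so the choice is made once per block instead of once per cell.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; not JavaScriptCore's real declarations.
enum class DtorType { None, Normal };

struct Cell {
    bool live = false;
    int* destroyedCounter = nullptr; // bumped when the "destructor" runs
};

inline void destroyCell(Cell& cell)
{
    if (cell.destroyedCounter)
        ++*cell.destroyedCounter;
}

// dtorType is a compile-time constant, so the compiler can fold the test
// below away and emit a loop with no per-cell branch on the destructor policy.
template<DtorType dtorType>
std::size_t sweepCells(std::vector<Cell>& cells)
{
    std::size_t freed = 0;
    for (Cell& cell : cells) {
        if (cell.live)
            continue;
        if (dtorType != DtorType::None)
            destroyCell(cell);
        ++freed;
    }
    return freed;
}

// One runtime dispatch per block selects the specialized loop.
inline std::size_t sweep(std::vector<Cell>& cells, DtorType dtorType)
{
    switch (dtorType) {
    case DtorType::None:
        return sweepCells<DtorType::None>(cells);
    case DtorType::Normal:
        return sweepCells<DtorType::Normal>(cells);
    }
    return 0;
}
```

Idea (2) would extend the same pattern: if strings lived in their own allocator, a string-specific instantiation could be selected per block in the same way, without non-string cells paying for a cell type check.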
Comment 6 Andreas Kling 2017-03-14 13:08:02 PDT
GC actually does have a fast path for string destruction now that strings get their own space.