Summary: | References from code to Structures should be stronger than weak | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Filip Pizlo <fpizlo> | ||||||
Component: | JavaScriptCore | Assignee: | Filip Pizlo <fpizlo> | ||||||
Status: | RESOLVED FIXED | ||||||||
Severity: | Normal | CC: | commit-queue, ggaren, keith_miller, mark.lam, msaboff, saam | ||||||
Priority: | P2 | ||||||||
Version: | WebKit Nightly Build | ||||||||
Hardware: | All | ||||||||
OS: | All | ||||||||
Bug Depends on: | |||||||||
Bug Blocks: | 157334 | ||||||||
Attachments: |
|
Description
Filip Pizlo
2016-05-03 14:02:09 PDT
Created attachment 278026 [details]
possible patch
Created attachment 278044 [details]
the patch
Comment on attachment 278044 [details] the patch View in context: https://bugs.webkit.org/attachment.cgi?id=278044&action=review r=me with comments. > Source/JavaScriptCore/ChangeLog:23 > + This change introduces a more nuanced policy a more nuanced policy: if it's cheap to mark a duplicate "a more nuanced policy". > Source/JavaScriptCore/bytecode/PolymorphicAccess.cpp:562 > + if (m_structure) > + result &= m_structure->markIfCheap(visitor); If markIfCheap() returns true here, does it still make sense to continue below? We might end up appending m_structure twice, but I suppose that is harmless in terms of correctness. Out of curiosity, can Heap::isMarked(m_structure->previousID()) ever be false if m_structure->markIfCheap() is true? > Source/JavaScriptCore/runtime/Structure.cpp:1141 > + return (!m_globalObject || Heap::isMarked(m_globalObject.get())) > + && (!storedPrototypeObject() || Heap::isMarked(storedPrototypeObject())); Can you clarify why this is considered to be "cheap"? Is it because these are the conditions that do not require traversing the graphs of m_globalObject and storedPrototypeObject()? I see. So, "cheap" here means, the GC doesn't have to do much work for marking this structure, yes? (In reply to comment #3) > Comment on attachment 278044 [details] > the patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=278044&action=review > > r=me with comments. > > > Source/JavaScriptCore/ChangeLog:23 > > + This change introduces a more nuanced policy a more nuanced policy: if it's cheap to mark a > > duplicate "a more nuanced policy". Fixed. > > > Source/JavaScriptCore/bytecode/PolymorphicAccess.cpp:562 > > + if (m_structure) > > + result &= m_structure->markIfCheap(visitor); > > If markIfCheap() returns true here, does it still make sense to continue > below? We might end up appending m_structure twice, but I suppose that is > harmless in terms of correctness. You're right! I wanted to structure the code according to the intuition behind it: first handle all transitions that don't care about m_type and then switch on m_type to handle the per-type things. That's the structure we use elsewhere in AccessCase, so I felt like it would be helpful for others to be consistent. > > Out of curiosity, can Heap::isMarked(m_structure->previousID()) ever be > false if m_structure->markIfCheap() is true? Sure! We could have just marked the global object and prototype, but have yet to trace beyond this and mark either m_structure or m_structure->previousID(). The propagateTransitions() code could be called at any stage in tracing and we must assume that objects could be in any state so long as the stack is black and the tri-color invariant holds. As a reminder, the tri-color theory of marking says that: white = not marked, grey = marked but not yet visited, black = marked and visited. The invariant is that black objects cannot refer to white ones, because at the moment that we visit an object, we mark anything it refers to. Marking changes white objects to grey, but it doesn't change the color of objects that were already grey or black. In our GC, isMarked() is true when the object is not white. A grey object can be distinguished from a black one based on whether it's in the worklist. If it's not in the worklist anymore, then we have visited it, so it's black. This information can also be computed, under the right circumstances, using a combination of the mark bit and data in the object's header. A useful property of the tri-color invariant is that it's an exact property of our GC: just as the tri-color invariant doesn't prohibit grey or white objects from pointing to objects of any color, so too must our GC always assume that so long as the job of marking is not yet complete, it's possible for white or grey objects to refer to objects of any color. So, m_structure and m_structure->previousID() could both be white, in which case m_structure->previousID() would return false. markIfCheap() would return true if m_structure->globalObject and m_structure->storedPrototype are both grey or black. That's totally possible, since white objects are allowed to reference grey or black objects. It doesn't violate the tri-color invariant, therefore it is possible. > > > Source/JavaScriptCore/runtime/Structure.cpp:1141 > > + return (!m_globalObject || Heap::isMarked(m_globalObject.get())) > > + && (!storedPrototypeObject() || Heap::isMarked(storedPrototypeObject())); > > Can you clarify why this is considered to be "cheap"? Is it because these > are the conditions that do not require traversing the graphs of > m_globalObject and storedPrototypeObject()? I see. So, "cheap" here means, > the GC doesn't have to do much work for marking this structure, yes? I'll write a comment about this. Here's a detailed explanation. A structure by itself is not an expensive thing to have in the heap. We control how big it is, and we're quite good at ensuring that those structures that we cache in this code are unlikely to be large. Structures are a VM-internal data structure that the user cannot directly manipulate, so even if we keep alive one that isn't of use to the user, we can be sure that this "leak" is bounded in size: we control the bound because we control the object. But, a structure strongly references its objects' global object and prototype. The global object and prototype can be arbitrarily large, and they may refer to an arbitrarily large user-constructed graph of other arbitrarily large objects. Hence, if a VM-internal optimization like inline caching marks a structure even though there are no objects in the heap that have that structure as their structure, then the risk is that we will keep a global object or prototype alive too long. Ordinarily a structure would keep a global object or prototype alive because the structure is referenced from some object O and that object has G and P as their global object and prototype, respectively. We're semantically required to keep G and P alive if O is alive so the fact that this happens via some structure that acts as an intermediary is just fine. But if there does not exist any object that uses the structure and the VM keeps it alive due to caching, then it's possible that this will keep an otherwise-dead global object and prototype alive. That's bad, since this means that the VM's optimization may make a program that runs fine (because the ginormous global object got deallocated on next GC) into one that doesn't (it swaps or jetsams because the VM kept alive a ginormous global object across many GCs because the VM internally cached some otherwise-dead structure). I've argued that structures by themselves are cheap to mark because the structure object itself is small and completely under our control. I've also argued that structures may be extremely expensive to mark because the structure may refer to a ginormous global object or prototype, and the structure may be the last reference to those ginormous things - so marking the structure would be uniquely responsible for increasing our memory footprint by a lot. Hence, being "cheap" to mark means: marking this structure will definitely only add a small and bounded amount of bytes to the heap size. This is true if the global object and prototype are already marked, since in that case, marking the structure will only increase footprint by the size of the structure itself, which has a small and bounded size. Landed in http://trac.webkit.org/changeset/200405 *** Bug 128072 has been marked as a duplicate of this bug. *** |