RESOLVED FIXED 79191
[JSC] The end atom of the marked block should be considered to decide if the cell is live
https://bugs.webkit.org/show_bug.cgi?id=79191
Summary [JSC] The end atom of the marked block should be considered to decide if the ...
Hojong Han
Reported 2012-02-21 21:39:51 PST
It's necessary to check the end atom as well as the first atom to decide whether the cell is live, and GC logic could benefit from the assertion.
Attachments
Patch (1.47 KB, patch)
2012-02-22 01:09 PST, Hojong Han
no flags
Patch (1.48 KB, patch)
2012-03-07 16:30 PST, Hojong Han
no flags
Hojong Han
Comment 1 2012-02-22 01:09:08 PST
Filip Pizlo
Comment 2 2012-02-26 19:58:00 PST
Comment on attachment 128150 [details] Patch This feels strange. Either it is possible, due to the conservative nature of the stack scans, that we will see a pointer that passes the not-cell-middle test but is nonetheless beyond m_endAtom, or it isn't. If it is, this patch will make us crash in debug mode and do the right thing in release mode. If it is not possible, then this patch just adds noise. So which is it? Can you justify why you've added code that results in assertion failures for the case that you're claiming to handle?
Hojong Han
Comment 3 2012-02-26 20:55:47 PST
(In reply to comment #2) > (From update of attachment 128150 [details]) > This feels strange. Either it is possible, due to the conservative nature of the stack scans, that we will see a pointer that passes the not-cell-middle test but is nonetheless beyond m_endAtom, or it isn't. If it is, this patch will make us crash in debug mode and do the right thing in release mode. If it is not possible, then this patch just adds noise. > > So which is it? Can you justify why you've added code that results in assertion failures for the case that you're claiming to handle? The situation you described is the reason why I've added code. There were some crashes, I've analyzed, caused by a pointer beyond m_endAtom. That unusual pointer was added to registerFileRoots while gathering conservative roots, but I've not figured out how the pointers beyond m_endAtom were used and could be existed in the register file. I expect this patch will make crash with more information in debug mode and do the right this in release.
Filip Pizlo
Comment 4 2012-02-26 21:02:58 PST
(In reply to comment #3) > (In reply to comment #2) > > (From update of attachment 128150 [details] [details]) > > This feels strange. Either it is possible, due to the conservative nature of the stack scans, that we will see a pointer that passes the not-cell-middle test but is nonetheless beyond m_endAtom, or it isn't. If it is, this patch will make us crash in debug mode and do the right thing in release mode. If it is not possible, then this patch just adds noise. > > > > So which is it? Can you justify why you've added code that results in assertion failures for the case that you're claiming to handle? > > The situation you described is the reason why I've added code. > There were some crashes, I've analyzed, caused by a pointer beyond m_endAtom. > That unusual pointer was added to registerFileRoots while gathering conservative roots, but I've not figured out how the pointers beyond m_endAtom were used and could be existed in the register file. > I expect this patch will make crash with more information in debug mode and do the right this in release. Why not just do the right thing in both debug and release? I don't like the idea of our GC having ASSERT()'s that are motivated by us not knowing what our GC is doing. Seems like a bad precedent.
Hojong Han
Comment 5 2012-02-26 21:09:59 PST
(In reply to comment #4) > (In reply to comment #3) > > (In reply to comment #2) > > > (From update of attachment 128150 [details] [details] [details]) > > > This feels strange. Either it is possible, due to the conservative nature of the stack scans, that we will see a pointer that passes the not-cell-middle test but is nonetheless beyond m_endAtom, or it isn't. If it is, this patch will make us crash in debug mode and do the right thing in release mode. If it is not possible, then this patch just adds noise. > > > > > > So which is it? Can you justify why you've added code that results in assertion failures for the case that you're claiming to handle? > > > > The situation you described is the reason why I've added code. > > There were some crashes, I've analyzed, caused by a pointer beyond m_endAtom. > > That unusual pointer was added to registerFileRoots while gathering conservative roots, but I've not figured out how the pointers beyond m_endAtom were used and could be existed in the register file. > > I expect this patch will make crash with more information in debug mode and do the right this in release. > > Why not just do the right thing in both debug and release? > > I don't like the idea of our GC having ASSERT()'s that are motivated by us not knowing what our GC is doing. Seems like a bad precedent. I got it. Then, is it fine that the conditional check for m_endAtom only without ASSERT()'s ??
Filip Pizlo
Comment 6 2012-02-26 21:12:45 PST
(In reply to comment #5) > (In reply to comment #4) > > (In reply to comment #3) > > > (In reply to comment #2) > > > > (From update of attachment 128150 [details] [details] [details] [details]) > > > > This feels strange. Either it is possible, due to the conservative nature of the stack scans, that we will see a pointer that passes the not-cell-middle test but is nonetheless beyond m_endAtom, or it isn't. If it is, this patch will make us crash in debug mode and do the right thing in release mode. If it is not possible, then this patch just adds noise. > > > > > > > > So which is it? Can you justify why you've added code that results in assertion failures for the case that you're claiming to handle? > > > > > > The situation you described is the reason why I've added code. > > > There were some crashes, I've analyzed, caused by a pointer beyond m_endAtom. > > > That unusual pointer was added to registerFileRoots while gathering conservative roots, but I've not figured out how the pointers beyond m_endAtom were used and could be existed in the register file. > > > I expect this patch will make crash with more information in debug mode and do the right this in release. > > > > Why not just do the right thing in both debug and release? > > > > I don't like the idea of our GC having ASSERT()'s that are motivated by us not knowing what our GC is doing. Seems like a bad precedent. > > I got it. > Then, is it fine that the conditional check for m_endAtom only without ASSERT()'s ?? Only if we know that we actually need that check. Do you have some case where you believe that this check would be necessary? Should be easy enough to try to reproduce it with an ASSERT() placed in your local copy. If you can demonstrate that the check is necessary (and I'm willing to believe that it is if I have evidence) then certainly we should commit the check, along with a test case that demonstrates its need! But I don't want a check that isn't known to be necessary, because its presence may confuse us into later thinking that it *is* necessary. Makes maintenance hard.
Hojong Han
Comment 7 2012-02-26 21:22:15 PST
(In reply to comment #6) > Only if we know that we actually need that check. > > Do you have some case where you believe that this check would be necessary? Should be easy enough to try to reproduce it with an ASSERT() placed in your local copy. If you can demonstrate that the check is necessary (and I'm willing to believe that it is if I have evidence) then certainly we should commit the check, along with a test case that demonstrates its need! > > But I don't want a check that isn't known to be necessary, because its presence may confuse us into later thinking that it *is* necessary. Makes maintenance hard. I'm really appreciated for your kind review and comment. I'll find out the evidence you need and update status. Thank you.
Hojong Han
Comment 8 2012-02-28 01:04:40 PST
I've tried to find the steps to reproduce the same problem, but it's very difficult to artificially manipulate GC and memory allocation related to the system. Concurrently I've been analyzing the crash already occurred. It tells me that there were many abandoned addresses in the register file. Do you have ideas about the possibility that unnecessary addresses or values remain in the register file? The register file grows while processing in JIT code without shrink. I cannot find any shrink in JIT code as I've looked into it. If it just grows, there would be many unnecessary things after one of the callframes is done in JIT code. At last those would affect GC works. Any comment, advice, and feedback welcomed.
Filip Pizlo
Comment 9 2012-02-28 12:51:21 PST
(In reply to comment #8) > I've tried to find the steps to reproduce the same problem, > but it's very difficult to artificially manipulate GC and memory allocation related to the system. > > Concurrently I've been analyzing the crash already occurred. > It tells me that there were many abandoned addresses in the register file. > > Do you have ideas about the possibility that unnecessary addresses or values remain in the register file? My immediate instinct is that this is totally unsurprising. The register file can be quite sparse; for example a function may never touch its temporaries when it executes because it takes a path that doesn't need them. So stale values can remain on the register file for quite some time. > > The register file grows while processing in JIT code without shrink. > I cannot find any shrink in JIT code as I've looked into it. We rarely shrink the register file. Though its growth should not be unbounded - it should reach steady state. If you've found cases of unbounded stack growth, then that's a bug. > If it just grows, there would be many unnecessary things after one of the callframes is done in JIT code. > At last those would affect GC works. > > Any comment, advice, and feedback welcomed. The GC should be impervious to stale pointers in the register file. That's why the stack scan has various checks. I think the relevant question for this bug is: is the fact that we don't have a special check for a pointer being beyond m_endAtom lead to GC state being corrupted, or is it the case that either (a) the GC doesn't care or (b) the check would be redundant because our other checks (like in-middle-of-cell) already cover that case.
Hojong Han
Comment 10 2012-02-28 17:19:30 PST
Here is some information brought from the core dump. (JSC::MarkedBlock *) 0x47210000 = { <WTF::DoublyLinkedListNode<JSC::MarkedBlock>> = {<No data fields>}, members of JSC::MarkedBlock: static atomSize = 16, static atomShift = 5, static blockSize = 16384, static blockMask = 4294950912, static atomsPerBlock = 1024, static atomMask = 1023, static cardShift = 8, static bytesPerCard = 256, static cardCount = 64, static cardMask = 63, static atomAlignmentMask = 15, m_atomsPerCell = 7, m_endAtom = 1018, m_marks = { static wordSize = 32, static words = 32, static one = <optimized out>, bits = { m_data = {33818624, 270549121, 2164392968, 135274560, 1082196484, 67637280, 541098242, 33818640, 270549121, 2164392968, 135274560, 1082196484, 67637280, 541098242, 33818640, 270549121, 2164392968, 135274560, 1082196484, 67637280, 541098242, 33818640, 270549121, 2164392968, 135274560, 1082196484, 67637280, 541098242, 33818640, 270549121, 2164392968, 135274560} } }, m_state = JSC::MarkedBlock::Marked, m_allocation = { <WTF::PageBlock> = { m_realBase = 0x47210000, m_base = 0x47210000, m_size = 16384 }, members of WTF::PageAllocationAligned: m_reservation = { m_realBase = 0x4720e000, m_base = 0x4720e000, m_size = 28672 } }, m_heap = 0x5b4be8, m_prev = 0x46508000, m_next = 0x434b4000 } [Memory dump of the register file] 0x48550700: 0x432533f0 0xfffffffb 0x4640c514 0xfffffffb 0x48550710: 0x00000000 0xfffffffb 0x4640c458 0xfffffffb 0x48550720: 0x026004d0 0xfffffffb 0x428bbeb0 0xfffffffb 0x48550730: 0x428bbeb0 0xfffffffb 0x428bbef0 0xfffffffb 0x48550740: 0x428bbeb0 0xfffffffb 0x00000001 0xffffffff 0x48550750: 0x00000001 0xffffffff 0x41baf3b0 0xfffffffb 0x48550760: 0x426301b0 0xfffffffb 0x428bbe70 0xfffffffb 0x48550770: 0x428bbe30 0xfffffffb 0x00000000 0xfffffffc 0x48550780: 0x426301b0 0xfffffffb 0x00000001 0xffffffff 0x48550790: 0x47213fb0 0xfffffffb 0x00000000 0xfffffffc 0x485507a0: 0x465a88b0 0xfffffffb 0x00000001 0xffffffff 0x485507b0: 0x43eb46b0 0xfffffffb 0x438f2af0 0xfffffffb 0x485507c0: 0x452fd89c 0xfffffffb 0x00c988c8 0xfffffffc 0x485507d0: 0x43eb4670 0xfffffffb 0x43eb4670 0xfffffffb What I mentioned is 0x47213fb0 in the register file. This address is fit to the 1019th atom of that MarkedBlock, and cannot be filtered by in-middle-of-cell check.
Hojong Han
Comment 11 2012-03-04 16:51:06 PST
(In reply to comment #9) > (In reply to comment #8) > The GC should be impervious to stale pointers in the register file. That's why the stack scan has various checks. I think the relevant question for this bug is: is the fact that we don't have a special check for a pointer being beyond m_endAtom lead to GC state being corrupted, or is it the case that either (a) the GC doesn't care or (b) the check would be redundant because our other checks (like in-middle-of-cell) already cover that case. My point is that we don't have a special check for a pointer being beyond m_endAtom lead to GC state being corrupted. I wrote some information at comment #10. (I didn't attach the full dump of register file, because it's quite big.) Could you take a look into that, and let me know what is needed or missed?
Geoffrey Garen
Comment 12 2012-03-07 11:35:56 PST
Comment on attachment 128150 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=128150&action=review > Source/JavaScriptCore/heap/MarkedBlock.h:324 > + if (atomNumber > m_endAtom) { // Filters pointers into invalid cells out of the range. You need to check ">=", not ">". See the definition of m_endAtom: size_t m_endAtom; // This is a fuzzy end. Always test for < m_endAtom. > Source/JavaScriptCore/heap/MarkedBlock.h:325 > + ASSERT_NOT_REACHED(); Your data say this condition is possible and not an error, so please remove this ASSERT.
Hojong Han
Comment 13 2012-03-07 16:30:55 PST
Hojong Han
Comment 14 2012-03-07 16:34:29 PST
(In reply to comment #12) > (From update of attachment 128150 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=128150&action=review > > > Source/JavaScriptCore/heap/MarkedBlock.h:324 > > + if (atomNumber > m_endAtom) { // Filters pointers into invalid cells out of the range. > > You need to check ">=", not ">". See the definition of m_endAtom: > > size_t m_endAtom; // This is a fuzzy end. Always test for < m_endAtom. > > > Source/JavaScriptCore/heap/MarkedBlock.h:325 > > + ASSERT_NOT_REACHED(); > > Your data say this condition is possible and not an error, so please remove this ASSERT. I applied your review to the patch. Thank you very much.
Geoffrey Garen
Comment 15 2012-03-07 18:21:00 PST
Comment on attachment 130720 [details] Patch r=me
WebKit Review Bot
Comment 16 2012-03-07 19:15:16 PST
Comment on attachment 130720 [details] Patch Clearing flags on attachment: 130720 Committed r110134: <http://trac.webkit.org/changeset/110134>
WebKit Review Bot
Comment 17 2012-03-07 19:15:20 PST
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.