Bug 200149

Summary: Add crash diagnostics for debugging unexpected zapped cells.
Product: WebKit Reporter: Mark Lam <mark.lam>
Component: JavaScriptCoreAssignee: Mark Lam <mark.lam>
Status: RESOLVED FIXED    
Severity: Normal CC: commit-queue, ews-watchlist, keith_miller, mcatanzaro, msaboff, saam, tzagallo, webkit-bug-importer, ysuzuki
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 200214    
Bug Blocks:    
Attachments:
Description Flags
proposed patch.
none
proposed patch.
none
proposed patch.
none
proposed patch.
ysuzuki: review+
proposed patch.
ysuzuki: review+
patch for landing. ews-watchlist: commit-queue-

Description Mark Lam 2019-07-25 18:48:53 PDT
...
Comment 1 Radar WebKit Bug Importer 2019-07-25 18:58:51 PDT
<rdar://problem/53570112>
Comment 2 Mark Lam 2019-07-25 19:17:41 PDT
Created attachment 374932 [details]
proposed patch.

Let's get some testing on the EWS first.  I'm still in the process of perf testing the patch locally.
Comment 3 Mark Lam 2019-07-25 21:38:59 PDT
Created attachment 374938 [details]
proposed patch.
Comment 4 Mark Lam 2019-07-25 23:50:16 PDT
Created attachment 374944 [details]
proposed patch.
Comment 5 Michael Saboff 2019-07-26 11:46:50 PDT
Comment on attachment 374944 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=374944&action=review

r=me with some nits

> Source/JavaScriptCore/ChangeLog:10
> +        SlotVisitor::visitChildren().  If a zapped cell is detected, we we'll crash with

grammar *we well*.  May *we will* or *we'll*.

> Source/JavaScriptCore/heap/HeapCell.h:51
> +    // the rest of the cell header bits in tact for crash analysis uses.

*intact*
Comment 6 Saam Barati 2019-07-26 11:53:04 PDT
Comment on attachment 374944 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=374944&action=review

> Source/JavaScriptCore/heap/HeapCell.h:55
> +        uint32_t* cellWords = reinterpret_cast_ptr<uint32_t*>(this);

bitwise_cast

> Source/JavaScriptCore/heap/HeapCell.h:60
> +    bool isZapped() const { return !*reinterpret_cast_ptr<const uint32_t*>(this); }

ditto

> Source/JavaScriptCore/heap/SlotVisitor.cpp:293
> +#if PLATFORM(MAC)

why not X86_64?

> Source/JavaScriptCore/heap/SlotVisitor.cpp:852
> +NEVER_INLINE void SlotVisitor::reportZappedCellAndCrash(JSCell* cell)
> +{
> +    MarkedBlock::Handle* foundBlock = nullptr;
> +    uint32_t* cellWords = reinterpret_cast_ptr<uint32_t*>(this);
> +
> +    uintptr_t cellAddress = bitwise_cast<uintptr_t>(cell);
> +    uintptr_t headerWord = cellWords[1];
> +    uintptr_t zapReason = cellWords[2];
> +    unsigned subspaceHash = 0;
> +    size_t cellSize = 0;
> +
> +    m_heap.objectSpace().forEachBlock([&] (MarkedBlock::Handle* block) {
> +        if (block->contains(cell)) {
> +            foundBlock = block;
> +            return IterationStatus::Done;
> +        }
> +        return IterationStatus::Continue;
> +    });
> +
> +    if (foundBlock) {
> +        subspaceHash = bmalloc::stringHash(foundBlock->subspace()->name());
> +        cellSize = foundBlock->cellSize();
> +    }
> +
> +    RELEASE_ASSERT(!cell->isZapped(), cellAddress, headerWord, zapReason, subspaceHash, cellSize);
> +}

adding all of this worries more than just crashing. Do we not think the crash alone is good enough? At least in the appendToMarkedStack case.

If you do keep it, please:
- use bitwise_cast
- Use WTF's string hash instead of bmalloc
Comment 7 Saam Barati 2019-07-26 11:53:47 PDT
I didn't mean to clear Michael's r+, though I do have some hesitation. Last time we added a bunch of debugging code, we ended up in a worse place than just crashing.
Comment 8 Saam Barati 2019-07-26 11:57:28 PDT
Comment on attachment 374944 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=374944&action=review

> Source/JavaScriptCore/heap/SlotVisitor.cpp:851
> +    RELEASE_ASSERT(!cell->isZapped(), cellAddress, headerWord, zapReason, subspaceHash, cellSize);

please make this CRASH_WITH_INFO instead. Just to avoid any chance someone raced and allocated over this.
Comment 9 Mark Lam 2019-07-26 16:22:32 PDT
Created attachment 374997 [details]
proposed patch.

I've applied the feedback from Michael and Saam, and added some additional tooling to dump subspace hashes.
Comment 10 Mark Lam 2019-07-26 16:25:48 PDT
Comment on attachment 374997 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=374997&action=review

> Source/JavaScriptCore/heap/SlotVisitor.cpp:849
> +#endif // PLATFORM(MAC)

I'll fix this comment before landing.  Should be CPU(X86_64).
Comment 11 Yusuke Suzuki 2019-07-26 16:39:46 PDT
Comment on attachment 374997 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=374997&action=review

r=me with a nit.

> Source/JavaScriptCore/heap/SlotVisitor.cpp:834
> +    m_heap.objectSpace().forEachBlock([&] (MarkedBlock::Handle* block) {

This is potentially dangerous because m_blocks can be added by the main thread while the concurrent marker is working.
But maybe, it does not matter much since the crash trace itself offers much more profitable information already.
Can you add a comment about it with FIXME or invent something safer?
Comment 12 Mark Lam 2019-07-26 16:53:33 PDT
Thanks for the reviews.  Preliminary perf testing results looks promising i.e. no significant change.  I'm going to go ahead and land this.  We can roll it out later if we find that it does regress performance.
Comment 13 Mark Lam 2019-07-26 17:44:31 PDT
Landed in r247886: <http://trac.webkit.org/r247886>.
Comment 14 Michael Catanzaro 2019-07-28 18:32:45 PDT
[975/2287] Building CXX object Source/...sources/UnifiedSource-ee8a7a7a-7.cpp.o
In file included from DerivedSources/JavaScriptCore/unified-sources/UnifiedSource-ee8a7a7a-7.cpp:2:
/home/mcatanzaro/Projects/WebKit/Source/JavaScriptCore/heap/SlotVisitor.cpp: In member function ‘void JSC::SlotVisitor::reportZappedCellAndCrash(JSC::JSCell*)’:
/home/mcatanzaro/Projects/WebKit/Source/JavaScriptCore/heap/SlotVisitor.cpp:828:15: warning: unused variable ‘cellAddress’ [-Wunused-variable]
  828 |     uintptr_t cellAddress = bitwise_cast<uintptr_t>(cell);
      |               ^~~~~~~~~~~
/home/mcatanzaro/Projects/WebKit/Source/JavaScriptCore/heap/SlotVisitor.cpp:829:15: warning: unused variable ‘headerWord’ [-Wunused-variable]
  829 |     uintptr_t headerWord = cellWords[1];
      |               ^~~~~~~~~~
/home/mcatanzaro/Projects/WebKit/Source/JavaScriptCore/heap/SlotVisitor.cpp:830:15: warning: unused variable ‘zapReason’ [-Wunused-variable]
  830 |     uintptr_t zapReason = cellWords[2];
      |               ^~~~~~~~~
/home/mcatanzaro/Projects/WebKit/Source/JavaScriptCore/heap/SlotVisitor.cpp:831:14: warning: variable ‘subspaceHash’ set but not used [-Wunused-but-set-variable]
  831 |     unsigned subspaceHash = 0;
      |              ^~~~~~~~~~~~
/home/mcatanzaro/Projects/WebKit/Source/JavaScriptCore/heap/SlotVisitor.cpp:832:12: warning: variable ‘cellSize’ set but not used [-Wunused-but-set-variable]
  832 |     size_t cellSize = 0;
      |            ^~~~~~~~

So this is kinda annoying to fix.

The obvious solution is to just use the UNUSED_VARIABLE() macro for each of these in SlotVisitor.cpp. But that's annoying because the variables are passed to CRASH_WITH_INFO(), so one would expect CRASH_WITH_INFO() to use them. It does for Clang and MSVCC, but not for the fallback GCC implementation. And writing a macro that passes the __VA_ARGS__ to UNUSED_VARIABLE() doesn't seem simple. But I think we could create a macro that calls a variadic function and just does nothing with its va_list.
Comment 15 WebKit Commit Bot 2019-07-28 21:28:39 PDT
Re-opened since this is blocked by bug 200214
Comment 16 Mark Lam 2019-07-28 21:34:06 PDT
FYI, this patch has been rolled out in r247900: <https://trac.webkit.org/changeset/247900> because it regresses PLT5.
Comment 17 Tadeu Zagallo 2019-07-28 22:46:57 PDT
(In reply to Michael Catanzaro from comment #14)
> The obvious solution is to just use the UNUSED_VARIABLE() macro for each of
> these in SlotVisitor.cpp. But that's annoying because the variables are
> passed to CRASH_WITH_INFO(), so one would expect CRASH_WITH_INFO() to use
> them. It does for Clang and MSVCC, but not for the fallback GCC
> implementation. And writing a macro that passes the __VA_ARGS__ to
> UNUSED_VARIABLE() doesn't seem simple. But I think we could create a macro
> that calls a variadic function and just does nothing with its va_list.

FWIW, there's `WTF_LAZY_FOR_EACH_TERM` in wtf/Forward.h. I believe you could use `WTF_LAZY_FOR_EACH_TERM(UNUSED_VARIABLE,  __VA_ARGS__)`
Comment 18 Michael Catanzaro 2019-07-29 04:19:30 PDT
That seems good, thanks!
Comment 19 Mark Lam 2019-07-29 14:30:36 PDT
I'll fix the CRASH_WITH_INFO unused parameters issue separately in https://bugs.webkit.org/show_bug.cgi?id=200243.
Comment 20 Mark Lam 2019-08-01 14:15:29 PDT
Created attachment 375343 [details]
proposed patch.
Comment 21 EWS Watchlist 2019-08-01 14:17:34 PDT
Attachment 375343 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/assembler/CPU.cpp:74:  Use nullptr instead of NULL.  [readability/null] [5]
ERROR: Source/JavaScriptCore/assembler/CPU.cpp:84:  Use nullptr instead of NULL.  [readability/null] [5]
Total errors found: 2 in 20 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 22 Yusuke Suzuki 2019-08-01 14:20:08 PDT
Comment on attachment 375343 [details]
proposed patch.

r=me
Comment 23 Mark Lam 2019-08-01 14:28:59 PDT
Created attachment 375346 [details]
patch for landing.
Comment 24 EWS Watchlist 2019-08-01 18:06:38 PDT
Comment on attachment 375346 [details]
patch for landing.

Attachment 375346 [details] did not pass jsc-ews (mac):
Output: https://webkit-queues.webkit.org/results/12849056

New failing tests:
mozilla-tests.yaml/js1_5/Array/regress-101964.js.mozilla-ftl-eager-no-cjit-validate-phases
Comment 25 Mark Lam 2019-08-01 18:59:03 PDT
(In reply to Build Bot from comment #24)
> Comment on attachment 375346 [details]
> patch for landing.
> 
> Attachment 375346 [details] did not pass jsc-ews (mac):
> Output: https://webkit-queues.webkit.org/results/12849056
> 
> New failing tests:
> mozilla-tests.yaml/js1_5/Array/regress-101964.js.mozilla-ftl-eager-no-cjit-
> validate-phases

This is unrelated. I ran this test on my slowest MacBookAir, and it runs just fine.
Comment 26 Mark Lam 2019-08-01 18:59:37 PDT
Landed in r248143: <http://trac.webkit.org/r248143>.
Comment 27 Mark Lam 2019-08-02 11:17:53 PDT
+ Build fix in r248162: <http://trac.webkit.org/r248162>.