Bug 153996

Summary:

Changed WTFCrash to not trash the crash site register state.

Product:

WebKit

Reporter:

Mark Lam <mark.lam>

Component:

JavaScriptCore

Assignee:

Mark Lam <mark.lam>

Status:

RESOLVED FIXED

Severity:

Normal

CC:

achristensen, benjamin, cdumez, cmarcelo, commit-queue, ggaren, mitz

Priority:

Version:

WebKit Local Build

Hardware:

Unspecified

OS:

Unspecified

Attachments:

Description	Flags
proposed patch.	none
x86_64 benchmark result.	none
proposed patch 2: added volatile attribute to asm statements.	ggaren: review+
patch for landing.	none

Mark Lam

Reported 2016-02-08 11:31:40 PST

When doing post-mortem crash site analysis using data from crash reports, it is immensely valuable to be able to infer the crashing program's state from the register values at crash time. However, for RELEASE_ASSERT failures, we crash using WTFCrash(), and WTFCrash() is currently implemented as a function call that, in turn, calls a lot of other functions to do crash handling before actually crashing. As a result, the register values captured in the crash reports are not likely to still contain the values used by the caller function that failed the RELEASE_ASSERT. This patch aims to remedy this issue for non-debug builds on OS(DARWIN) ports. It does so by changing WTFCrash() into an inlined function that has an inlined asm statement to issues the CPU specific breakpoint trap instruction. As a result, for non-debug OS(DARWIN) builds, crashes due to failed RELEASE_ASSERTs will now show up in crash reports as crashing due to EXC_BREAKPOINT (SIGTRAP) instead of a EXC_BAD_ACCESS (SIGSEGV) on address 0xbbadbeef. For debug and non-DARWIN builds, WTFCrash() behavior currently remains unchanged.

Attachments
proposed patch. (3.35 KB, patch) 2016-02-08 11:36 PST, Mark Lam	no flags	Details Formatted Diff Diff
x86_64 benchmark result. (66.48 KB, text/plain) 2016-02-08 11:37 PST, Mark Lam	no flags	Details
proposed patch 2: added volatile attribute to asm statements. (3.35 KB, patch) 2016-02-10 10:03 PST, Mark Lam	ggaren: review+	Details Formatted Diff Diff
patch for landing. (3.37 KB, patch) 2016-02-10 15:00 PST, Mark Lam	no flags	Details Formatted Diff Diff
Show Obsolete (2) View All Add attachment proposed patch, testcase, etc.

Mark Lam

Comment 1 2016-02-08 11:36:53 PST

Created attachment 270871 [details] proposed patch.

Mark Lam

Comment 2 2016-02-08 11:37:53 PST

Created attachment 270872 [details] x86_64 benchmark result.

Mark Lam

Comment 3 2016-02-08 11:40:57 PST

A benchmark run (on a 13" Retina MacBook Pro) shows that perf is effectively neutral with this change. A size comparison on the jsc framework binary shows that this patch is a progression: Without patch: $ size old/WebKitBuild/Release/JavaScriptCore.framework/JavaScriptCore __TEXT __DATA __OBJC others dec hex 10776576 225280 0 4739072 15740928 f03000 With patch: $ size new/WebKitBuild/Release/JavaScriptCore.framework/JavaScriptCore __TEXT __DATA __OBJC others dec hex 10752000 225280 0 4730880 15708160 efb000

mitz

Comment 4 2016-02-08 12:01:44 PST

A nice feature of the current implementation of WTFCrash() is that if the debugger is attached to the program when it calls WTFCrash(), it’s possible to continue execution by doing something like “thread return” followed by “continue”. Is there a similar way to continue execution after hitting an int3 instruction?

Alex Christensen

Comment 5 2016-02-08 12:12:45 PST

Right now it's easy to look in a crash report for 0xbbadbeef and know with almost certainty that that crash was from WTFCrash. If I look at a crash report after this change, will there be any easy way to distinguish it from a non-WTFCrash-crash?

Mark Lam

Comment 6 2016-02-08 12:17:06 PST

(In reply to comment #4) > A nice feature of the current implementation of WTFCrash() is that if the > debugger is attached to the program when it calls WTFCrash(), it’s possible > to continue execution by doing something like “thread return” followed by > “continue”. Is there a similar way to continue execution after hitting an > int3 instruction? That's interesting. I don't believe there is (at least not without manually rigging some program state). We can step over the int3, but there's nothing to return from. The generated code implements the call to WTFCrash as a jump instead as a call because it is a NO_RETURN function. May I ask why this continuation feature is needed / useful (considering we just crashed due to a RELEASE_ASSERT failure)? (In reply to comment #5) > Right now it's easy to look in a crash report for 0xbbadbeef and know with > almost certainty that that crash was from WTFCrash. If I look at a crash > report after this change, will there be any easy way to distinguish it from > a non-WTFCrash-crash? Yes, look for EXC_BREAKPOINT (SIGTRAP) instead of a EXC_BAD_ACCESS (SIGSEGV) on address 0xbbadbeef. You shouldn't be getting SIGTRAP crashes from any other sources.

mitz

Comment 7 2016-02-08 12:55:26 PST

(In reply to comment #6) > (In reply to comment #4) > > A nice feature of the current implementation of WTFCrash() is that if the > > debugger is attached to the program when it calls WTFCrash(), it’s possible > > to continue execution by doing something like “thread return” followed by > > “continue”. Is there a similar way to continue execution after hitting an > > int3 instruction? > > That's interesting. I don't believe there is (at least not without manually > rigging some program state). We can step over the int3, but there's nothing > to return from. The generated code implements the call to WTFCrash as a > jump instead as a call because it is a NO_RETURN function. > > May I ask why this continuation feature is needed / useful (considering we > just crashed due to a RELEASE_ASSERT failure)? I’ve found it useful in debugging when either my changes or the bug are causing an assertion to fail, but I would still like to see what the code would have done if the ASSERT wasn’t there. I could recompile the code without the assertion(s), but doing this in the debugger is sometimes more convenient.

Michael Catanzaro

Comment 8 2016-02-08 13:20:31 PST

Comment on attachment 270871 [details] proposed patch. View in context: https://bugs.webkit.org/attachment.cgi?id=270871&action=review > Source/WTF/wtf/Assertions.h:172 > + asm("int3"); I think this should work on Linux as well, so it probably doesn't need to be Darwin-specific?

Mark Lam

Comment 9 2016-02-08 14:11:45 PST

(In reply to comment #8) > Comment on attachment 270871 [details] > proposed patch. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=270871&action=review > > > Source/WTF/wtf/Assertions.h:172 > > + asm("int3"); > > I think this should work on Linux as well, so it probably doesn't need to be > Darwin-specific? It should. But this patch changes the behavior of WTFCrash() such that it no longer invokes the globalHook nor calls WTFReportBacktrace(). My understanding is that those were added by linux folks. Is it OK to by pass those for linux as well?

Mark Lam

Comment 10 2016-02-08 18:22:58 PST

(In reply to comment #7) > (In reply to comment #6) > > (In reply to comment #4) > > > A nice feature of the current implementation of WTFCrash() is that if the > > > debugger is attached to the program when it calls WTFCrash(), it’s possible > > > to continue execution by doing something like “thread return” followed by > > > “continue”. Is there a similar way to continue execution after hitting an > > > int3 instruction? > > > > That's interesting. I don't believe there is (at least not without manually > > rigging some program state). We can step over the int3, but there's nothing > > to return from. The generated code implements the call to WTFCrash as a > > jump instead as a call because it is a NO_RETURN function. > > > > May I ask why this continuation feature is needed / useful (considering we > > just crashed due to a RELEASE_ASSERT failure)? > > I’ve found it useful in debugging when either my changes or the bug are > causing an assertion to fail, but I would still like to see what the code > would have done if the ASSERT wasn’t there. I could recompile the code > without the assertion(s), but doing this in the debugger is sometimes more > convenient. I just tried doing "thread return" after a WTFCrash() on a release build without my patch. With that, I see that the debugger assumes that the instruction after the call to WTFCrash() is the return address. However, this turns out to not be true. Because WTFCrash() has the attribute NO_RETURN_DUE_TO_CRASH, you were never actually able to just continue execution. The compiler emits the call to WTFCrash() out of line, and does not expect it to return to the mainline code. Hence, it is not followed by any code that transitions execution back to mainline. You may have been lucked out, and the program continue to executed, but that is just a fluke and it may not be executing what you expected. Note: this is on a non-debug build (which is where I'm applying this change). Based on this, I think there's no loss of functionality with this patch after all.

mitz

Comment 11 2016-02-08 18:42:08 PST

Makes sense. My experience is largely with Debug builds.

Geoffrey Garen

Comment 12 2016-02-09 11:09:53 PST

> I’ve found it useful in debugging when either my changes or the bug are > causing an assertion to fail, but I would still like to see what the code > would have done if the ASSERT wasn’t there. I could recompile the code > without the assertion(s), but doing this in the debugger is sometimes more > convenient. A point that I think has been lost here: The proposed int3 instruction is a breakpoint instruction, which the debugger should be able to catch and continue from directly, just like any other breakpoint.

Mark Lam

Comment 13 2016-02-09 11:16:18 PST

(In reply to comment #12) > > I’ve found it useful in debugging when either my changes or the bug are > > causing an assertion to fail, but I would still like to see what the code > > would have done if the ASSERT wasn’t there. I could recompile the code > > without the assertion(s), but doing this in the debugger is sometimes more > > convenient. > > A point that I think has been lost here: The proposed int3 instruction is a > breakpoint instruction, which the debugger should be able to catch and > continue from directly, just like any other breakpoint. You can continue, but there's nothing to make you continue in the caller of WTFCrash (which is what I think we hoped the continue would do). Since WTFCrash is a NO_RETURN function, the C++ compiler implements it as a jump to out of line code where it calls WTFCrash (or inline the int3 in this case). It is expected to never continue thereafter. Hence, the instructions that follow after it may have nothing to do with the call site of WTFCrash.

Geoffrey Garen

Comment 14 2016-02-09 11:18:09 PST

Comment on attachment 270871 [details] proposed patch. View in context: https://bugs.webkit.org/attachment.cgi?id=270871&action=review > Source/WTF/wtf/Assertions.h:176 > + asm("brk #0"); I think you need to mark your asm blocks volatile. Otherwise, the compiler might "optimize away" some assertions that they test for seemingly impossible or undefined behavior.

Geoffrey Garen

Comment 15 2016-02-09 11:32:57 PST

Comment on attachment 270871 [details] proposed patch. View in context: https://bugs.webkit.org/attachment.cgi?id=270871&action=review > Source/WTF/wtf/Assertions.h:165 > +#if defined(NDEBUG) && OS(DARWIN) Honestly, I think this behavior is better in debug builds too. The tradeoff is: You lose the printf backtrace, but you gain stopping in the debugger exactly at the point of the crash.

Mark Lam

Comment 16 2016-02-10 09:45:03 PST

(In reply to comment #15) > Comment on attachment 270871 [details] > proposed patch. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=270871&action=review > > > Source/WTF/wtf/Assertions.h:165 > > +#if defined(NDEBUG) && OS(DARWIN) > > Honestly, I think this behavior is better in debug builds too. The tradeoff > is: You lose the printf backtrace, but you gain stopping in the debugger > exactly at the point of the crash. With the pre-existing WTFCrash(): 1. Without the debugger attached, the crash trace will be conveniently printed for us on the console. 2. With the debugger attached, we can still easily inspect the crashing frame by going 1 frame up. I feel that we didn't lose much convenience in terms of getting to the point of crash, but we certainly would lose more convenience in getting the crash trace when the debugger is not attached (will have to consult crash logs to now find the trace). In both cases, there's no loss of functionality, just some tradeoff in convenience. Does anyone else have an opinion on this?

Mark Lam

Comment 17 2016-02-10 10:03:10 PST

Created attachment 270999 [details] proposed patch 2: added volatile attribute to asm statements. Applied Geoff's feedback on the need for the volatile attribute.

Geoffrey Garen

Comment 18 2016-02-10 13:57:40 PST

Comment on attachment 270999 [details] proposed patch 2: added volatile attribute to asm statements. r=me

Geoffrey Garen

Comment 19 2016-02-10 13:58:01 PST

You should email webkit-dev with instructions for identifying an ASSERT crash before you land this.

Mark Lam

Comment 20 2016-02-10 14:59:00 PST

Comment on attachment 270999 [details] proposed patch 2: added volatile attribute to asm statements. View in context: https://bugs.webkit.org/attachment.cgi?id=270999&action=review > Source/WTF/wtf/Assertions.h:176 > +#if CPU(X86_64) || CPU(X86) > + asm("int3"); > +#elif CPU(ARM_THUMB2) > + asm("bkpt #0"); > +#elif CPU(ARM64) > + asm("brk #0"); Eeek. I uploaded the wrong patch here without the "volatile" changes. Will land the proper one with the change included.

Mark Lam

Comment 21 2016-02-10 15:00:29 PST

Created attachment 271034 [details] patch for landing.

Mark Lam

Comment 22 2016-02-10 15:03:23 PST

Landed in r196397: <http://trac.webkit.org/r196397>.

Note You need to log in before you can comment on or make changes to this bug.