RESOLVED FIXED 190656
useProbeOSRExit causes failures for Win64 DFG JIT
https://bugs.webkit.org/show_bug.cgi?id=190656
Summary useProbeOSRExit causes failures for Win64 DFG JIT
Ross Kirsling
Reported 2018-10-16 18:38:09 PDT
About three dozen JSC tests fail for WinCairo in dfg-eager mode only, but all of these pass if useProbeOSRExit is switched off. Two-thirds appear to be failing the `!Heap::heap(value) || Heap::heap(value) == Heap::heap(this)` assert in JSObject::putDirectInternal: https://github.com/WebKit/webkit/blob/master/Source/JavaScriptCore/runtime/JSObjectInlines.h#L270 The rest are failing isObject() assertions. --- Here is a simple test case that reproduces the issue (distilled from stress/sparse-map-non-skip.js): test.js ``` function checkGetter(object) { if (object.foo !== 0) throw new Error(`bad value for object.foo! expected 0, found ${object.foo}`); } noInline(checkGetter); for (var i = 0; i < 2305; ++i) checkGetter({ get foo() { return 0; } }); checkGetter({ get foo() { return 0; } }); ``` dfg-whitelist.txt ``` <global>#Cuu2O0 ``` With this as input, `jsc --useConcurrentJIT=false --useProbeOSRExit=true --dfgWhitelist=dfg-whitelist.txt test.js` suffices to repro. Debug: > ASSERTION FAILED: getter.isObject() || setter.isObject() > ...\jit\JITOperations.cpp(1748) : JSC::operationPutGetterSetter > 1 00007FFB3E4321EA > 2 00007FFB2FC42F11 > 3 00007FFB30910027 > 4 000002B0398E2AE0 Release: > Exception: Error: bad value for object.foo! expected 0, found undefined > checkGetter@test.js:3:20 > global code@test.js:10:12
Attachments
Patch (3.79 KB, patch)
2018-10-29 17:44 PDT, Ross Kirsling
no flags
Ross Kirsling
Comment 1 2018-10-19 16:02:49 PDT
I'll try to list up what I know so far: - FWIW, testmasm has no failures on WinCairo. - If I dataLog on save and restore in DFGOSREntry.cpp, AssemblyHelpers.h, and DFGOSRExit.cpp (and use printEachOSRExit), it definitely looks like callee-save registers are saved and restored consistently whether or not the probe is used. - If I use dumpDFGDisassembly and blot out all raw hex values (and normalize Structure identifiers), then the output is consistent with or without the probe. There is a final large codeblock which is only spit out in the non-probe case, but that's also true on Mac, which isn't broken. - On the last run of operationPutGetterSetter (in JITOperations.cpp), the encodedGetterValue has an entirely different prefix (0x00007FFB... instead of 0x000001...) from the first 2305 times. If we look at the JSCell for the getter, there is one point of potential note... All cases without probe, first 2305 cases with probe: m_structureID 0x00000104 unsigned int m_indexingTypeAndMisc 0x00 '\0' unsigned char m_type JSFunctionType (0x18 '\x18') JSC::JSType m_flags 0x0e '\xe' unsigned char m_cellState DefinitelyWhite (0x01 '\x1') JSC::CellState Final case, after DFG OSR exit, with probe: m_structureID 0x0000000d unsigned int m_indexingTypeAndMisc 0x18 '\x18' unsigned char m_type CellType (0x00 '\0') JSC::JSType m_flags 0x00 '\0' unsigned char m_cellState PossiblyBlack (0x00 '\0') JSC::CellState ...namely, that 0x18 seems like it's jumped up a slot. But then, everything after it is zeroes, so this may be a red herring...? - One way or another, it's hard to imagine what other than Win64 calling convention could be at play here. Now, although we're saving and restoring RBX, RSI, RDI, R12, R13, R14, and R15, we're intentionally skipping over RBP and RSP -- RBP appears to be callee-save for x86_64 in general, so surely that one's a non-issue(?), but RSP is callee-save for Windows (according to MSDN, although it's not indicated as such in RegisterSet or GPRInfo) so perhaps the probe mucks with the stack pointer in a way that Win64 doesn't like?
Stephan Szabo
Comment 2 2018-10-23 15:24:43 PDT
Adding a little bit of information from what was seen today. The value that appears to come through for operationPutGetterSetter's encodedGetterValue seems to be the same value that was used in the prior call to operationPutByIdStrictOptimize's uid (which is also the fifth argument to that call). The timeline with probe on seems to be: operationNewFunction is called and seems to return in rax the value that we would want to use later as encodedGetterValue ctiMasmProbeTrampoline is started - goes into executeProbe and then executeOSRExit and some more before leaving operationPutByIdNonStrictOptimize is called and a value is placed into memory for the uid value. operationPutGetterSetter is called and the value on the stack for the fifth argument has the same value as what was passed to the previous call as uid.
Stephan Szabo
Comment 3 2018-10-23 16:55:23 PDT
In the case with probe on, when it goes to set up the arguments for operationPutByIdNonStrictOptimize I seem to see that rsp+20h and rbp-60h [in my test the last 4 digits of rsp were D220 and rbp was D300] are the same location, so setting the argument seems to overwrite the location that the getter was stored in. Later, when the call to operationPutGetterSetter goes to get the value, it reads the overwritten value and then puts that on the stack as the argument value and things fail. With the probe off, at the point the argument is set up, I seem to have values like rsp ending with D2E0 and rbp ending with D3A0, so the two values are in different places. When the call to operationPutGetterSetter goes to get the value it gets the value from operationNewFunction and puts that on the stack as the argument and things work.
Stephan Szabo
Comment 4 2018-10-23 16:59:06 PDT
(In reply to Stephan Szabo from comment #3) > In the case with probe on, when it goes to set up the arguments for > operationPutByIdNonStrictOptimize I seem to see that rsp+20h and rbp-60h [in > my test the last 4 digits of rsp were D220 and rbp was D300] are the same Correction D280 and D300
Ross Kirsling
Comment 5 2018-10-26 11:33:44 PDT
Update: This definitely is a stack pointer management problem. When the probe is off, RSP always has the same value at the first line of operationPutGetterSetter in JITOperations, whether before DFG entry or after DFG exit: https://github.com/WebKit/webkit/blob/master/Source/JavaScriptCore/jit/JITOperations.cpp#L1742 (* Technically, with or without the probe, RSP is raised by 0x10 just on the single run after calling prepareOSREntry but before actually switching out of baseline JIT. Apparently this isn't problematic though.) When the probe is on, RSP ends up raised by 0x40 after DFG exit. As such, if we hack in a `exitState.stackPointerOffset -= 0x40;` line right before setting RSP in executeOSRExit, the test case succeeds: https://github.com/WebKit/webkit/blob/master/Source/JavaScriptCore/dfg/DFGOSRExit.cpp#L437
Ross Kirsling
Comment 6 2018-10-26 14:22:30 PDT
(In reply to Ross Kirsling from comment #5) > As such, if we hack in a `exitState.stackPointerOffset -= 0x40;` line right > before setting RSP in executeOSRExit, the test case succeeds: > https://github.com/WebKit/webkit/blob/master/Source/JavaScriptCore/dfg/ > DFGOSRExit.cpp#L437 To put this last part a bit more clearly, if we add 8 to the variable count returned from the DFG::VariableEventStream::reconstruct call here: https://github.com/WebKit/webkit/blob/master/Source/JavaScriptCore/dfg/DFGOSRExit.cpp#L401 And comment out a sanity-check assert for Probe::Stack::lowWatermark here: https://github.com/WebKit/webkit/blob/master/Source/JavaScriptCore/assembler/ProbeStack.h#L156 Then we finally have a state in which all the failing dfg-eager tests pass.
Saam Barati
Comment 7 2018-10-26 15:32:25 PDT
Seems like the probe itself is probably making mistakes on windows ABI
Ross Kirsling
Comment 8 2018-10-29 16:40:35 PDT
A bit of progress: The issue doesn't seem to be with the return value of DFG::VariableEventStream::reconstruct, as this is the same regardless of probe and regardless of platform. the issue seems to be that this return value isn't appropriate as a stack pointer offset -- after all, it is simply thrown away in the non-probe path. It appears that if we mimic this line from the non-probe path: https://github.com/WebKit/webkit/blob/master/Source/JavaScriptCore/dfg/DFGOSRExit.cpp#L1379 That is, if we replace numVariables with `codeBlock->jitCode()->dfgCommon()->requiredRegisterCountForExit` on this line: https://github.com/WebKit/webkit/blob/master/Source/JavaScriptCore/dfg/DFGOSRExit.cpp#L402 Then all of the relevant test cases pass (even on Mac!). The catch is that we still need to remove the lowWatermark assert mentioned in my last comment: https://github.com/WebKit/webkit/blob/master/Source/JavaScriptCore/assembler/ProbeStack.h#L156 Not sure at the moment what correction needs to occur here.
Ross Kirsling
Comment 9 2018-10-29 17:44:57 PDT
Keith Miller
Comment 10 2018-10-30 11:03:58 PDT
Comment on attachment 353338 [details] Patch r=me.
WebKit Commit Bot
Comment 11 2018-10-30 11:32:29 PDT
Comment on attachment 353338 [details] Patch Clearing flags on attachment: 353338 Committed r237595: <https://trac.webkit.org/changeset/237595>
WebKit Commit Bot
Comment 12 2018-10-30 11:32:31 PDT
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 13 2018-10-30 11:34:02 PDT
Ross Kirsling
Comment 14 2018-11-01 10:42:52 PDT
Ugh, we might not be totally out of the woods yet... While this patch fixed all dfg-eager failures in Debug, apparently four failures remain in Release only: stress/destructuring-assignment-accepts-iterables.js.dfg-eager stress/map-constructor.js.dfg-eager stress/set-constructor.js.dfg-eager stress/weak-map-constructor.js.dfg-eager And sure enough, disabling the probe makes them pass.
Note You need to log in before you can comment on or make changes to this bug.