Bug 157537 - We should have one calleeSaveRegistersBuffer per VMEntryFrame, not one per VM.
Summary: We should have one calleeSaveRegistersBuffer per VMEntryFrame, not one per VM.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Local Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Mark Lam
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2016-05-10 14:56 PDT by Mark Lam
Modified: 2016-05-13 13:16 PDT (History)
9 users (show)

See Also:


Attachments
proposed patch. (17.33 KB, patch)
2016-05-12 16:22 PDT, Mark Lam
no flags Details | Formatted Diff | Diff
proposed patch: with ChangeLog fix and unskipped test. (18.62 KB, patch)
2016-05-12 16:45 PDT, Mark Lam
buildbot: commit-queue-
Details | Formatted Diff | Diff
Archive of layout-test-results from ews107 for mac-yosemite-wk2 (853.93 KB, application/zip)
2016-05-12 17:42 PDT, Build Bot
no flags Details
Archive of layout-test-results from ews112 for mac-yosemite (1.51 MB, application/zip)
2016-05-12 17:54 PDT, Build Bot
no flags Details
Archive of layout-test-results from ews101 for mac-yosemite (1.01 MB, application/zip)
2016-05-12 18:47 PDT, Build Bot
no flags Details
proposed patch: with revised ChangeLog and rebased test results. (23.69 KB, patch)
2016-05-12 22:16 PDT, Mark Lam
msaboff: review-
Details | Formatted Diff | Diff
proposed patch: take 3 with ChangeLog changes and lots of renames added. (41.69 KB, patch)
2016-05-13 12:06 PDT, Mark Lam
msaboff: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Lam 2016-05-10 14:56:31 PDT
Consider this scenario:

1. Some C++ code enters the VM to call into JS code.
2. JS code throws an exception, and saves callee saved regs (from step 1) into the VM calleeSaveRegistersBuffer.

3. The Inspector wants to inspect that exception, and calls some C++ inspector code.  With ASAN, this C++ code alters %rbx on x86_64.
4. C++ inspector code calls a JS inspector function.  This re-enters the VM.
5. The JS inspector code runs hot enough that we do an enterOptimizationCheck on it.
    The enterOptimizationCheck first saves callee saved regs (from step 4) into the VM calleeSaveRegistersBuffer.
    Note: the VM calleeSaveRegistersBuffer now contains callee saved values from step 4.

6. The Inspector eventually returns to the caller JS code (from step 2).
7. The JS code does not have a handler for the exception and treats it as an uncaught exception.
8. The _handleUncaughtException exit point in the LLINT thunks does a restoreCalleeSavesFromVMCalleeSavesBuffer to restore the callee saved regs from step 1.
    Unfortunately, the VM calleeSaveRegistersBuffer now contains callee saved values from step 4, not the ones from step 1.

9. _handleUncaughtException returns to the outer C++ code.
10. The C++ code tries to use %rbx and crashes because it contains a bad value.

To fix this, we'll allocate space in the VMEntryFrame for the calleeSaveRegistersBuffer, and used that for each VM entry session instead of a singleton buffer in the VM.
Comment 1 Mark Lam 2016-05-10 14:57:02 PDT
<rdar://problem/24794845>
Comment 2 Mark Lam 2016-05-12 16:22:52 PDT
Created attachment 278777 [details]
proposed patch.
Comment 3 Mark Lam 2016-05-12 16:23:48 PDT
Comment on attachment 278777 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=278777&action=review

> Source/JavaScriptCore/ChangeLog:58
> +        to guarantee without rigging the Inspector, or setting up an ASan.

typo: /ASan/ASan test run/.
Comment 4 Mark Lam 2016-05-12 16:45:35 PDT
Created attachment 278780 [details]
proposed patch: with ChangeLog fix and unskipped test.
Comment 5 Build Bot 2016-05-12 17:42:28 PDT
Comment on attachment 278780 [details]
proposed patch: with ChangeLog fix and unskipped test.

Attachment 278780 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/1311992

New failing tests:
inspector/debugger/regress-133182.html
Comment 6 Build Bot 2016-05-12 17:42:31 PDT
Created attachment 278788 [details]
Archive of layout-test-results from ews107 for mac-yosemite-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews107  Port: mac-yosemite-wk2  Platform: Mac OS X 10.10.5
Comment 7 Build Bot 2016-05-12 17:54:06 PDT
Comment on attachment 278780 [details]
proposed patch: with ChangeLog fix and unskipped test.

Attachment 278780 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/1312004

New failing tests:
inspector/debugger/regress-133182.html
Comment 8 Build Bot 2016-05-12 17:54:09 PDT
Created attachment 278792 [details]
Archive of layout-test-results from ews112 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews112  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 9 Build Bot 2016-05-12 18:47:04 PDT
Comment on attachment 278780 [details]
proposed patch: with ChangeLog fix and unskipped test.

Attachment 278780 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/1312289

New failing tests:
inspector/debugger/regress-133182.html
Comment 10 Build Bot 2016-05-12 18:47:08 PDT
Created attachment 278798 [details]
Archive of layout-test-results from ews101 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews101  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 11 Mark Lam 2016-05-12 18:53:49 PDT
Comment on attachment 278780 [details]
proposed patch: with ChangeLog fix and unskipped test.

Taking out of review while I investigate the EWS bot sadness.
Comment 12 Michael Saboff 2016-05-12 19:15:47 PDT
Comment on attachment 278780 [details]
proposed patch: with ChangeLog fix and unskipped test.

View in context: https://bugs.webkit.org/attachment.cgi?id=278780&action=review

> Source/JavaScriptCore/ChangeLog:12
> +        2. JS code throws an exception, and saves callee saved registers (from step 1)

Use callee *saves* in all places.

> Source/JavaScriptCore/ChangeLog:18
> +           C++ inspector code.  With ASan, this C++ code alters %rbx (a callee saved
> +           register) on x86_64.

ASan code using %rbx is only one example.  The real problem is that the buffer is overwritten in step 5

> Source/JavaScriptCore/ChangeLog:25
> +           Note: the callee saved register values from step 4 has now been written over
> +           the ones from step 1 in the VM calleeSaveRegistersBuffer.

I think this would read better if you said something like "Note: The callee saves registers copied in step 1 to the VM calleeSaveRegistersBuffer have now been overwritten."

> Source/JavaScriptCore/ChangeLog:37
> +        7. The JS code does not have a handler for the exception.  JS execution resumes
> +           in the _handleUncaughtException handler in the LLINT thunks which will exit
> +           the VM.
> +        8. Before _handleUncaughtException exits the VM, it first does a
> +           restoreCalleeSavesFromVMCalleeSavesBuffer with the intent of restoring the
> +           callee saved regs from step 1.  Unfortunately, the VM calleeSaveRegistersBuffer
> +           now contains callee saved values from step 4, not the ones from step 1.
> +
> +        9. _handleUncaughtException returns to the outer C++ code.
> +        10. The C++ code tries to use %rbx and crashes because it contains a bad value.

This is just one example.  The problem is that the callee saves are not properly restored wherever the exception is ultimately handled.

> Source/JavaScriptCore/interpreter/VMEntryRecord.h:80
> +struct VMEntryFrame {
> +#if ENABLE(JIT) && NUMBER_OF_CALLEE_SAVES_REGISTERS > 0
> +    static ptrdiff_t vmEntryRecordOffset()
> +    {
> +        VMEntryFrame* fakeVMEntryFrame = reinterpret_cast<VMEntryFrame*>(0x1000);
> +        VMEntryRecord* record = vmEntryRecord(fakeVMEntryFrame);
> +        return static_cast<ptrdiff_t>(
> +            reinterpret_cast<char*>(record) - reinterpret_cast<char*>(fakeVMEntryFrame));
> +    }
> +
> +    static ptrdiff_t calleeSaveRegistersBufferOffset()
> +    {
> +        return vmEntryRecordOffset() + OBJECT_OFFSETOF(VMEntryRecord, calleeSaveRegistersBuffer);
> +    }
> +#endif
> +};
> +

What is this used for?
Comment 13 Mark Lam 2016-05-12 19:29:19 PDT
Comment on attachment 278780 [details]
proposed patch: with ChangeLog fix and unskipped test.

View in context: https://bugs.webkit.org/attachment.cgi?id=278780&action=review

>> Source/JavaScriptCore/ChangeLog:12
>> +        2. JS code throws an exception, and saves callee saved registers (from step 1)
> 
> Use callee *saves* in all places.

Why?  A reader of this ChangeLog will only know that "callee saves" means a certain types of registers if they are already thinking about callee saved registers.  I don't think "callee saves" is a term of art either.  A quick web search shows use of  "callee-save register" or "calleee-saved register" as the terms of choice.  I think "callee saved register" is also more clear and unambiguous.

>> Source/JavaScriptCore/ChangeLog:18
>> +           register) on x86_64.
> 
> ASan code using %rbx is only one example.  The real problem is that the buffer is overwritten in step 5

I'm trying to provide a concrete example.  I'll rewrite this to say that it is an example.

>> Source/JavaScriptCore/ChangeLog:25
>> +           the ones from step 1 in the VM calleeSaveRegistersBuffer.
> 
> I think this would read better if you said something like "Note: The callee saves registers copied in step 1 to the VM calleeSaveRegistersBuffer have now been overwritten."

I wanted to make the point about what it is overwritten with.  I will rephrase the sentence.

>> Source/JavaScriptCore/ChangeLog:37
>> +        10. The C++ code tries to use %rbx and crashes because it contains a bad value.
> 
> This is just one example.  The problem is that the callee saves are not properly restored wherever the exception is ultimately handled.

I will change the comment to state that this is an example.

>> Source/JavaScriptCore/interpreter/VMEntryRecord.h:80
>> +
> 
> What is this used for?

Search for VMEntryFrame::calleeSaveRegistersBufferOffset() in this patch and you will see how it's used.
Comment 14 Mark Lam 2016-05-12 22:16:29 PDT
Created attachment 278817 [details]
proposed patch: with revised ChangeLog and rebased test results.

I've revised the ChangeLog for clarity, and did away with the mention of %rbx.  Instead, I gave the items at play names (R, A, and B) so that it's keep track of what's what.  I also went with "callee save registers" because that is a common way to describe those registers, and is consistent with the name "calleeSaveRegistersBuffer", which is used in pre-existing code to describe the buffer where they are saved to.

The EWS test failure was due to the unskipped test's expected results not having outdated line numbers.  I've rebased the expected results accordingly.
Comment 15 Michael Saboff 2016-05-13 10:42:43 PDT
Comment on attachment 278817 [details]
proposed patch: with revised ChangeLog and rebased test results.

View in context: https://bugs.webkit.org/attachment.cgi?id=278817&action=review

There is both a function and a LLInt macro called copyCalleeSavesToVMCalleeSavesBuffer.  Those names should be changed to be more descriptive.

The logic looks sound.  Please clean up a few things.

> Source/JavaScriptCore/ChangeLog:49
> +        1. Some C++ code makes use of a callee save register, R, to hold value A.
> +           This C++ code then enters the VM to call into JS code.
> +
> +        2. The JS code throws an exception, and saves value A into the slot for register
> +           R in the VM calleeSaveRegistersBuffer.
> +
> +        3. As part of handling the exception, the VM notifies the Inspector about the
> +           exception.  The Inspector decides to inspect that exception, and calls some
> +           C++ inspector code.  This C++ code happen to make use of the same callee
> +           save register, R, and stores value B into it.
> +
> +        4. The C++ inspector code then re-enters the VM to call a JS inspector function.
> +
> +        5. The JS inspector code runs hot enough that we do an enterOptimizationCheck
> +           on it.  The enterOptimizationCheck first saves all callee save registers
> +           into the VM calleeSaveRegistersBuffer.
> +
> +           Note: register R now contains B (from step 3).  As a result, the slot for
> +           register R in the VM calleeSaveRegistersBuffer is overwritten with value B.
> +
> +        6. The Inspector eventually returns to the caller JS code (from step 2).
> +
> +        7. The JS code does not have a handler for the exception.  JS execution resumes
> +           in the _handleUncaughtException handler in the LLINT thunks which will exit
> +           the VM.
> +
> +        8. Before _handleUncaughtException exits the VM, it first does a
> +           restoreCalleeSavesFromVMCalleeSavesBuffer with the intent of restoring the
> +           callee saved registers from step 1, and would expect it to set register R to
> +           value A.
> +           
> +           Unfortunately, the slot for register R in the VM calleeSaveRegistersBuffer
> +           now contains value B.  Hence, register R is set to value B.
> +
> +           Then, _handleUncaughtException returns to the outer C++ code.
> +
> +        9. The C++ code uses register R and expects it to hold value A.  But because R
> +           contains value B (which is not what the code expects), it has a bad time and
> +           crashes.

I think this explanation is too complicated.  The issue is that in step 2, we save all callee save registers to the VM calleeSaveRegistersBuffer.  And then in step 5 we overwrite VM calleeSaveRegistersBuffer with possibly different values.  After step 6, when the callee save registers are restored for the exception thrown in step 2, they are wrong.  I would replace step 7 with that text. It doesn't matter if the exception is caught or not or if the trashed register(s) are used by C++ or JS code.  The issue is that one or more callee save registers are trashed.  I think steps 1, 8 & 9 are not needed.  I also think that register R and values A & B confuse things.

> Source/JavaScriptCore/ChangeLog:71
> +        No new test is added because this bug requires that:
> +        a. we have C++ code that relies on values in a callee save register (step 1).
> +        b. that C++ code should show an observable difference in behavior if that
> +           callee save register value has been changed (step 9).
> +        c. we have another piece of C++ code in the Inspector that will alter that same
> +           callee save register (step 3).

Doesn't ASan testing catch this bug?  Isn't that sufficient test coverage?

> Source/JavaScriptCore/interpreter/VMEntryRecord.h:52
> +    static ptrdiff_t calleeSaveRegistersBufferOffset()
> +    {
> +        return OBJECT_OFFSETOF(VMEntryRecord, calleeSaveRegistersBuffer);
> +    }

If the other calleeSaveRegistersBufferOffset() is used, then eliminate this one.
Comment 16 Mark Lam 2016-05-13 10:47:52 PDT
Comment on attachment 278817 [details]
proposed patch: with revised ChangeLog and rebased test results.

View in context: https://bugs.webkit.org/attachment.cgi?id=278817&action=review

>> Source/JavaScriptCore/ChangeLog:49
>> +           crashes.
> 
> I think this explanation is too complicated.  The issue is that in step 2, we save all callee save registers to the VM calleeSaveRegistersBuffer.  And then in step 5 we overwrite VM calleeSaveRegistersBuffer with possibly different values.  After step 6, when the callee save registers are restored for the exception thrown in step 2, they are wrong.  I would replace step 7 with that text. It doesn't matter if the exception is caught or not or if the trashed register(s) are used by C++ or JS code.  The issue is that one or more callee save registers are trashed.  I think steps 1, 8 & 9 are not needed.  I also think that register R and values A & B confuse things.

OK, I will simplify as you suggest.

>> Source/JavaScriptCore/ChangeLog:71
>> +           callee save register (step 3).
> 
> Doesn't ASan testing catch this bug?  Isn't that sufficient test coverage?

I will remove this text.

>> Source/JavaScriptCore/interpreter/VMEntryRecord.h:52
>> +    }
> 
> If the other calleeSaveRegistersBufferOffset() is used, then eliminate this one.

VMEntryFrame::calleeSaveRegistersBufferOffset() is computed using (vmEntryRecordOffset() + VMEntryRecord::calleeSaveRegistersBufferOffset()).  Note: one is the offset into the VMEntryFrame; the other is into the VMEntryRecord.  Hence, we need both values.
Comment 17 Mark Lam 2016-05-13 10:48:57 PDT
(In reply to comment #15)
> Comment on attachment 278817 [details]
> proposed patch: with revised ChangeLog and rebased test results.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=278817&action=review
> 
> There is both a function and a LLInt macro called
> copyCalleeSavesToVMCalleeSavesBuffer.  Those names should be changed to be
> more descriptive.

Good catch.  I meant to rename these since the buffer isn't in the VM anymore, but I forgot.  Will take care of it.
Comment 18 Mark Lam 2016-05-13 12:06:44 PDT
Created attachment 278860 [details]
proposed patch: take 3 with ChangeLog changes and lots of renames added.
Comment 19 Michael Saboff 2016-05-13 12:36:08 PDT
Comment on attachment 278860 [details]
proposed patch: take 3 with ChangeLog changes and lots of renames added.

View in context: https://bugs.webkit.org/attachment.cgi?id=278860&action=review

r=me with one comment.

> Source/JavaScriptCore/interpreter/VMEntryRecord.h:52
> +    static ptrdiff_t calleeSaveRegistersBufferOffset()
> +    {
> +        return OBJECT_OFFSETOF(VMEntryRecord, calleeSaveRegistersBuffer);
> +    }

This is not used.  For VMEntryFrame::calleeSaveRegistersBufferOffset() you're using OBJECT_OFFSETOF() and not this function.  Either use this function or remove it.
Comment 20 Mark Lam 2016-05-13 13:07:47 PDT
(In reply to comment #19)
> > Source/JavaScriptCore/interpreter/VMEntryRecord.h:52
> > +    static ptrdiff_t calleeSaveRegistersBufferOffset()
> > +    {
> > +        return OBJECT_OFFSETOF(VMEntryRecord, calleeSaveRegistersBuffer);
> > +    }
> 
> This is not used.  For VMEntryFrame::calleeSaveRegistersBufferOffset()
> you're using OBJECT_OFFSETOF() and not this function.  Either use this
> function or remove it.

Oh, you're right.  I didn't realize that I wasn't using it.  My head was thinking one thing, and my fingers typed (or cut pasted) something else.  Since should only ever be needed in one place, I'll remove it and just use OFFSETOF in the VMEntryFrame version.  Thanks.
Comment 21 Mark Lam 2016-05-13 13:16:32 PDT
Thanks for the reviews.  Landed in r200879: <http://trac.webkit.org/r200879>.