Bug 167239 - ArityFixup should adjust SP first
Summary: ArityFixup should adjust SP first
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore JavaScript (show other bugs)
Version: WebKit Nightly Build
Hardware: PC Linux
: P2 Normal
Assignee: Yusuke Suzuki
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2017-01-20 05:59 PST by Kamil Frankowicz
Modified: 2017-02-17 06:24 PST (History)
12 users (show)

See Also:


Attachments
POC to heap out of bounds write (jsc) (117 bytes, application/javascript)
2017-01-20 05:59 PST, Kamil Frankowicz
no flags Details
Patch (5.77 KB, patch)
2017-01-31 13:01 PST, Yusuke Suzuki
msaboff: review+
buildbot: commit-queue-
Details | Formatted Diff | Diff
Archive of layout-test-results from ews122 for ios-simulator-wk2 (814.91 KB, application/zip)
2017-01-31 21:53 PST, Build Bot
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Kamil Frankowicz 2017-01-20 05:59:00 PST
Created attachment 299345 [details]
POC to heap out of bounds write (jsc)

Affected SVN revision: 210958

To reproduce the problem:
./jsc jsc_oobw_llint_entry.js

GDB Backtrace:

#0  0x00007ffff767a830 in llint_entry ()
   from XYZ/WebKitBuild/Release/lib/libJavaScriptCore.so.1
#1  0x00007ffff767fe91 in llint_entry ()
   from XYZ/WebKitBuild/Release/lib/libJavaScriptCore.so.1
#2  0x00007ffff767945f in vmEntryToJavaScript ()
   from XYZ/WebKitBuild/Release/lib/libJavaScriptCore.so.1
#3  0x00007ffff7603aae in JSC::JITCode::execute(JSC::VM*, JSC::ProtoCallFrame*) ()
   from XYZ/WebKitBuild/Release/lib/libJavaScriptCore.so.1
#4  0x00007ffff75d512f in JSC::Interpreter::execute(JSC::ProgramExecutable*, JSC::ExecState*, JSC::JSObject*) ()
   from XYZ/WebKitBuild/Release/lib/libJavaScriptCore.so.1
#5  0x00007ffff77cfd5a in JSC::evaluate(JSC::ExecState*, JSC::SourceCode const&, JSC::JSValue, WTF::NakedPtr<JSC::Exception>&) () from XYZ/WebKitBuild/Release/lib/libJavaScriptCore.so.1
#6  0x000000000040de87 in runJSC(JSC::VM*, CommandLine) ()
#7  0x000000000040c997 in jscmain(int, char**) ()
#8  0x000000000040c867 in main ()
#9  0x00007ffff3bdd830 in __libc_start_main (main=0x40c850 <main>, argc=0x2, argv=0x7fffffffdcd8, 
    init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdcc8)
    at ../csu/libc-start.c:291
#10 0x000000000040ad79 in _start ()

Regards,
Kamil Frankowicz
Comment 1 Alexey Proskuryakov 2017-01-22 00:13:25 PST
I cannot reproduce on Mac with ASan build of r211025.
Comment 2 Kamil Frankowicz 2017-01-30 09:54:03 PST
I tried on r211363 + ASAN and problem still exist.

My configuration:
CC = afl-clang-fast (Clang 3.9.1)
CXX = afl-clang-fast++ (Clang 3.9.1)
OS = Xubuntu 16.04 x64
Comment 3 Yusuke Suzuki 2017-01-30 23:06:21 PST
Reproduced in JSCOnly port.
I'm now investigating this.
Basically the problem is too many arguments.
More simplified POC for this issue.

var args = "y,".repeat(80000);
var g = Function(args, "return 0");
g();
Comment 4 Yusuke Suzuki 2017-01-30 23:29:35 PST
crashed position is functionArityCheck's copyLoop.
Comment 5 Yusuke Suzuki 2017-01-30 23:48:34 PST
Current my guess is that soft stack height check is failed in Linux due to the different stack height in each platform.
Comment 6 Yusuke Suzuki 2017-01-31 00:44:34 PST
soft stack height check should pass.
The problem is the arity check slow path function does not return failure.
Comment 7 Yusuke Suzuki 2017-01-31 01:21:06 PST
OK, it seems that Linux softStackLimit is incorrect.
Comment 8 Yusuke Suzuki 2017-01-31 02:09:49 PST
ok, it seems that LLInt code bug. quad signed words are handled with long size ops.
Comment 9 Yusuke Suzuki 2017-01-31 02:50:52 PST
OK, finally found the issue.
The problem is, in Linux, we use pthread_attr_getstack to obtain stack base and size.
But it does not return the correct value if it is called on the main thread.
Comment 10 Yusuke Suzuki 2017-01-31 12:05:05 PST
Aha! I've got the reason of this bug.
It is not related to loop operation and stack limits.
It is related to x64 red zone!

Now, we perform copy loop. It extends the C stack.
But we accidentally perform this operation before extending sp.
Logically it is ok. But this means that you now touch beyond the red zone of the stack!
Comment 11 Yusuke Suzuki 2017-01-31 12:31:19 PST
(In reply to comment #10)
> Aha! I've got the reason of this bug.
> It is not related to loop operation and stack limits.
> It is related to x64 red zone!
> 
> Now, we perform copy loop. It extends the C stack.
> But we accidentally perform this operation before extending sp.
> Logically it is ok. But this means that you now touch beyond the red zone of
> the stack!

I guess that some crashes I saw when enabling sampling profilers are also related to the red zone.
At that time, signals frequently comes. This breaks the stack that is beyond the red zone.
Comment 12 Yusuke Suzuki 2017-01-31 13:01:11 PST
Created attachment 300252 [details]
Patch
Comment 13 Michael Saboff 2017-01-31 13:17:04 PST
Comment on attachment 300252 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=300252&action=review

r=me
How does adjusting the stack pointer adjust the red zone?

> Source/JavaScriptCore/ChangeLog:8
> +        Arity fixup extends C stack and copy/fill the stack with

There is only one stack.  How about "... extends the stack and copy/fills the"?

> Source/JavaScriptCore/ChangeLog:10
> +        the values. At that time, we accidentally read/write stacks
> +        before extending the stack pointer. As a result, we touch

How about replacing "stacks before extending the stack pointer" with "stack space behind the stack pointer"

> Source/JavaScriptCore/ChangeLog:12
> +        are unsafe, especially in Linux. This patch extends the stack

How about changing "extends" to "changes"?
Comment 14 Mark Lam 2017-01-31 13:33:13 PST
Comment on attachment 300252 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=300252&action=review

> Source/JavaScriptCore/jit/ThunkGenerators.cpp:445
> +    // Adjust call frame register and stack pointer to account for missing args
> +    // We need to extend sp first before starting the copy/fill loops since it
> +    // may exceed the x64 red zone.

I think this comment is not entirely accurate and does not explain the core issue.  I think the issue is that the OS expects any space below the sp to be unused, and will allocate that region to be used by the interrupt / signal stack.  If we do not adjust the sp before filling in that region with content, the content may get trashed by an interrupt.  As a result, we need to adjust the sp first.  Incidentally, the reason that the interrupt handler can just use that region below the sp is because we are supposed to reserve an amount of memory (the red zone size) at the bottom of the stack that user code is not supposed to use.

Can you update the comments to clarify the issue please before landing?  Thanks.
Comment 15 Michael Catanzaro 2017-01-31 16:44:45 PST
Thanks Yusuke!
Comment 16 Build Bot 2017-01-31 21:53:03 PST
Comment on attachment 300252 [details]
Patch

Attachment 300252 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/2983099

New failing tests:
css3/filters/backdrop/dynamic-with-clip-path.html
Comment 17 Build Bot 2017-01-31 21:53:07 PST
Created attachment 300300 [details]
Archive of layout-test-results from ews122 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews122  Port: ios-simulator-wk2  Platform: Mac OS X 10.11.6
Comment 18 Yusuke Suzuki 2017-02-01 03:00:24 PST
Comment on attachment 300252 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=300252&action=review

>> Source/JavaScriptCore/ChangeLog:8
>> +        Arity fixup extends C stack and copy/fill the stack with
> 
> There is only one stack.  How about "... extends the stack and copy/fills the"?

OK, we dropped the JS stack for JIT / LLInt! Changed.

>> Source/JavaScriptCore/ChangeLog:10
>> +        before extending the stack pointer. As a result, we touch
> 
> How about replacing "stacks before extending the stack pointer" with "stack space behind the stack pointer"

Sounds nice. I merged the suggestion from Mark. Then, I think rephasing it "stack space below the stack pointer" is nice.

>> Source/JavaScriptCore/ChangeLog:12
>> +        are unsafe, especially in Linux. This patch extends the stack
> 
> How about changing "extends" to "changes"?

Sounds nice. Fixed.

>> Source/JavaScriptCore/jit/ThunkGenerators.cpp:445
>> +    // may exceed the x64 red zone.
> 
> I think this comment is not entirely accurate and does not explain the core issue.  I think the issue is that the OS expects any space below the sp to be unused, and will allocate that region to be used by the interrupt / signal stack.  If we do not adjust the sp before filling in that region with content, the content may get trashed by an interrupt.  As a result, we need to adjust the sp first.  Incidentally, the reason that the interrupt handler can just use that region below the sp is because we are supposed to reserve an amount of memory (the red zone size) at the bottom of the stack that user code is not supposed to use.
> 
> Can you update the comments to clarify the issue please before landing?  Thanks.

Yup. The problem is SP is not updated before modifying the area that is lower than SP.
In that case, OS can break these areas since OS thinks it is unused.
In Linux port, we can encounter this bug more frequently since we use signals for suspend and resume the threads (SIGUSR2).
In addition, the Linux kernel could not populate the stack pages if the area is beyond the SP and the red zone.
This is the direct reason of this issue.
When touching such a page, the kernel does not populate the backing physical pages and report SEGV instead.

I'll update these comments.
Comment 19 Yusuke Suzuki 2017-02-01 03:30:53 PST
Committed r211479: <http://trac.webkit.org/changeset/211479>
Comment 20 Radar WebKit Bug Importer 2017-02-01 11:18:45 PST
<rdar://problem/30311611>
Comment 21 Kamil Frankowicz 2017-02-17 06:24:43 PST
This is CVE-2017-5949.