Bug 20760

Summary: Implement support for x86 Linux in CTI
Product: WebKit Reporter: wesleyZeng <weihong.zeng>
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: alp, barraclough, loki, mrowe, sam
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: Linux   
Bug Depends on:    
Bug Blocks: 21571    
Attachments:
Description Flags
fix the linking error.
mrowe: review-
inline the assemble code and link ok.
mrowe: review-
nod to #8 #9 #10 and sumbit patch again
none
Dirty, dirty patch that makes SFX work on x86 Linux
none
See if this helps...
none
Enable SFX on Linux/GTK
mrowe: review-
Enable SFX on Linux/GTK, updated mrowe: review+

Description wesleyZeng 2008-09-09 23:02:45 PDT
The global assemble symbol _ctiTrampoline, _ctiVMThrowTrampoline and __ZN3JSC7Machine12cti_vm_throwEPv are genrated for objective code by the assembler, but the linker need ctiTrampoline, ctiVMThrowTrampoline, _ZN3JSC7Machine12cti_vm_throwEPv, so it's failed to link.

These symbol are listed as follow:
CTI.cpp: _ctiTrampoline
CTI.cpp: _ctiVMThrowTrampoline
CTI.cpp: __ZN3JSC7Machine12cti_vm_throwEPv
Comment 1 wesleyZeng 2008-09-09 23:05:53 PDT
Created attachment 23313 [details]
fix the linking error.
Comment 2 Mark Rowe (bdash) 2008-09-09 23:38:51 PDT
Comment on attachment 23313 [details]
fix the linking error.

This breaks the Mac build, so cannot go in as-is.  Which platform are you building on?

Please don't add a name to the "Reviewed by" line unless your patch has actually been reviewed by that person.
Comment 3 Sam Weinig 2008-09-09 23:57:31 PDT
What version of GCC are you using?
Comment 4 wesleyZeng 2008-09-10 01:03:47 PDT
GCC version is "gcc (GCC) 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)
Copyright (C) 2006 Free Software Foundation, Inc."

I'm building WebKit on QT.
Comment 5 Mark Rowe (bdash) 2008-09-10 01:16:17 PDT
You'll need to rework your change to fix your platform while not breaking the Mac build. Please submit an updated patch for review that includes that change.
Comment 6 wesleyZeng 2008-09-10 03:11:39 PDT
Created attachment 23316 [details]
inline the assemble code and link ok.

inline the assemble code and link ok.
Comment 7 wesleyZeng 2008-09-10 03:14:02 PDT
I'm not sure that linker feel good on Mac. I haven't toolchain and sdk on Mac.
Comment 8 Mark Rowe (bdash) 2008-09-10 11:58:01 PDT
The problem with this approach is that it introduces an extra function prologue and epilogue.  This will lead to incorrect behavior due to the prologue doing stack manipulation yet the epilogue being skipped, and, if the code was tweaked to fix the behavioral problem, increased overhead that would likely be a performance regression.
Comment 9 Mark Rowe (bdash) 2008-09-10 12:09:10 PDT
A much simpler approach would be to declare the symbol name in a different manner when we are compiling for ELF (on Linux) vs Mach-O (on Mac OS X) as they use different naming schemes for global symbols (Mach-O prepends a leading underscore, ELF does not).
Comment 10 Gavin Barraclough 2008-09-10 12:31:16 PDT
*nod to comments #8 & #9 by bdash, we don't want to be reliant to C compiler implementation; we make assumptions about the distance back up the stack from a Machine::cti_* function called from JIT code to the parameters passed to the ctiTrampoline, this is only safe because we have complete knowledge of the behaviour of the trampoline.  As bdash says, simply declaring the symbols without the _ on Linux ought to do the trick.
Comment 11 wesleyZeng 2008-09-10 19:57:45 PDT
Created attachment 23334 [details]
nod to #8 #9 #10 and sumbit patch again

Please check it on Mac.
Comment 12 Mark Rowe (bdash) 2008-09-10 20:08:46 PDT
Does this patch cause CTI to work correctly on Linux?
Comment 13 Mark Rowe (bdash) 2008-09-10 20:14:14 PDT
I just tested it and it does not.

Looking at the two different inline assembly code blocks (GCC vs MSVC)... why don't we just generate these functions at run-time using our own code generator?  That would avoid the compiler syntax differences and platform symbol naming differences.  Is there any down side to doing that instead of adding more #if's?
Comment 14 wesleyZeng 2008-09-10 20:19:24 PDT
No!
 It caused Illegal instruction, I  try to implement this function and fix this error.
Comment 15 Mark Rowe (bdash) 2008-09-10 20:25:48 PDT
Created attachment 23335 [details]
Dirty, dirty patch that makes SFX work on x86 Linux

The attached patch gets things working on Linux.  It's really dirty, and does the bare minimum to get it working without doing so in a nice manner.  It passes all of the JSCore tests with both WREC and CTI enabled.  On SunSpider, it gives a 1.29x speedup.

Things that need fixed:
1) The inline assembly mess that I mentioned in a previous comment.
2) I'm not sure why Linux needs the same emitRestoreArgumentReference madness that Windows needs.  Perhaps Gavin can shed some light on this?
3) We have no PLATFORM macro for Linux, only the generic UNIX macro.  Do all x86 Unix platforms use the same ABI?  Do we need to introduce a LINUX define and target some of these changes specifically at Linux?
Comment 16 Gavin Barraclough 2008-09-11 05:12:34 PDT
> Things that need fixed:
> 1) The inline assembly mess that I mentioned in a previous comment.

Hmmm, JIT generating these trampolines is certainly an option, though obviously adds a small unnecessary startup cost.  Fixing this for OS X / Linux is easy, since they use the same asm syntax,

    #if PLATFORM(DARWIN)
    #define SYMBOL_STRING(name) "_" #name
    #else
    #define SYMBOL_STRING(name) #name
    #endif

then just,

    ".globl " SYMBOL_STRING(ctiVMThrowTrampoline) "\n"

etc.  Which will be a lot less extra code than would be required to wind up a whole extra run through the JIT to generate these... that said, unifying the windows asm is a bit more of a problem.  May be worth waiting 'til we have x86_64, see how integrating it's trampolines works using either of these ideas.

> 2) I'm not sure why Linux needs the same emitRestoreArgumentReference madness
> that Windows needs.  Perhaps Gavin can shed some light on this?

This requirement is tied to this change:
+#if 1 || COMPILER(MSVC)
So, in order to access the args passed into the JIT code, on OS X we take the address of the first arg & peek back up the stack to read them.  On windows, we basically do the same thing, but the compiler seems to screw up.  To prevent it from seeing what is going on, we load the first arg with a pointer to... itself, so instead of taking the address of the first arg and indexing up from here, we can just read the value of the first arg, and index up from here.

This means that on OS X the value of the arg passed into Machine::cti_* functions is never set – since it is never read – we only every use the address of the arg, to find other things higher up on the stack.  On windows we have to maintain the first arg as a pointer to itself, an update it at any point it might have been overwritten.

The cleanest and fastest solution would be to get Linux working like OS X, if we can get the compiler to behave itself – so first I'd suggest seeing if this change ("1 || ") was absolutely necessary.  If this fails then one quick thing that might be worth a try would be to apply the patch I'm about to add & see if it helps.  We currently try to mark the args to the JIT code that are modified from Machine::cti_ methods as volatile to try to prevent the compiler from optimizing out the writes (on the misapprehension that the stack belongs to it ;-) ), but the volatile is in the wrong place.  Fixing this might make the compiler on Linux play nicer. 

> 3) We have no PLATFORM macro for Linux, only the generic UNIX macro.  Do all
> x86 Unix platforms use the same ABI?  Do we need to introduce a LINUX define
> and target some of these changes specifically at Linux?

I believe all x86 Unicies follow the SysV ABI (e.g. http://math-atlas.sourceforge.net/devel/assembly/abi386-4.pdf), and would think we should probably be able to assume so, unless we run into an evidence to the contrary.
Comment 17 Gavin Barraclough 2008-09-11 05:14:13 PDT
Created attachment 23344 [details]
See if this helps...
Comment 18 Mark Rowe (bdash) 2008-09-11 17:10:38 PDT
The volatile trick doesn't help.  GCC on Linux is doing some weirdness where it appears to be copying "args" on the stack, then addressing relative to the copy.  I think for now we should stick with using emitRestoreArgumentReference on Linux, get this cleaned up and checked in, then someone can find some way to remove the need for the emitRestoreArgumentReference hack at a later date.
Comment 19 Alp Toker 2008-10-06 02:05:45 PDT
Created attachment 24112 [details]
Enable SFX on Linux/GTK

 ChangeLog                          |   11 ++++++++++
 JavaScriptCore/ChangeLog           |   39 +++++++++++++++++++++++++++++++++++++
 JavaScriptCore/GNUmakefile.am      |    8 +++++++
 JavaScriptCore/VM/CTI.cpp          |   22 +++++++++++++++-----
 JavaScriptCore/VM/CTI.h            |    2 -
 JavaScriptCore/VM/Machine.cpp      |    3 ++
 JavaScriptCore/VM/Machine.h        |   13 ++++++++----
 JavaScriptCore/kjs/regexp.cpp      |    9 ++++----
 JavaScriptCore/kjs/regexp.h        |    3 --
 JavaScriptCore/masm/X86Assembler.h |    9 +++++---
 configure.ac                       |   22 ++++++++++++++++++--
 11 files changed, 119 insertions(+), 22 deletions(-)

Enables CTI/WREC in the GTK+ port and avoids some over-inclusion of headers (bug #3761).

The CTI_ARGUMENT preprocessor checks have been made consistent so the workaround can be enabled/disabled on both GCC and MSVC, though it seems to work fine now without this.

I think we're ready to enable CTI/WREC for GTK and see if anyone complains. QT/WX can trivially enable it in their build systems too after this lands.

Thanks to Mark and Gavin for inspiration.
Comment 20 Gavin Barraclough 2008-10-06 08:31:33 PDT
Generally, this looks great – reducing m_wrecFunction to a void* makes me a little sad, but to be honest we'll probably end up turning it into a richer type than a c function pointer anyway, so I don't think this is a problem.

One issue, it looks like you're defining CTI_ARGS in both CTI.h and Machine.h? – if so, this is a little scary – we are likely going to make the set of arguments more complex (so some values can be passed in registers), and if these were to get out of sync this would be bad.  I'd prefer they were only defined in CTI.h, but if this causes a circular dependency, then could CTI.h include Machine.h? - exactly where we want this to be defined may shift as we sort out the calling conventions, so I don't mind exactly where it ends up, but would prefer if there is only one copy.
Comment 21 Mark Rowe (bdash) 2008-10-07 11:14:59 PDT
Comment on attachment 24112 [details]
Enable SFX on Linux/GTK

This generally looks good, but I'm going to mark it as r- due to the duplication of the definitions for CTI_ARGS.  Changing m_wrecFunction to a void* doesn't seem great either.
Comment 22 Alp Toker 2008-10-07 14:00:37 PDT
Created attachment 24159 [details]
Enable SFX on Linux/GTK, updated

Now avoids duplication of CTI_ARGS.

-        WRECFunction m_wrecFunction;
+        void* m_wrecFunction;

You could put WRECFunction in its own header file but I'm not convinced it's worth it. The way JITted code is passed around is probably going to have to change soon anyway so splitting this typedef out into its own header will just make more work. Besides, this pointer is invoked only in one place and is mostly treated as data so the type safety argument seems bogus.
Comment 23 Alp Toker 2008-10-07 14:06:27 PDT
(In reply to comment #22)
> Created an attachment (id=24159) [edit]
> Enable SFX on Linux/GTK, updated

Could change it from (WRECFunction) to reinterpret_cast<WRECFunction> before landing if that's any consolation!
Comment 24 Mark Rowe (bdash) 2008-10-08 16:50:19 PDT
Comment on attachment 24159 [details]
Enable SFX on Linux/GTK, updated

In Machine.cpp, the #if should not be mixed in with the includes.  It should come afterwards.


r=me
Comment 25 Alp Toker 2008-10-09 18:26:38 PDT
Patch updated and landed in r37457 but not enabled yet due to VoidPtrPair regression from r37386. Need to figure out what to do with that (compiling with -freg-struct-return fixes it but may cause problems, other workarounds have been suggested).
Comment 26 Alp Toker 2008-10-13 11:19:27 PDT
Closing this since the patch was landed. See bug #21571 for continued discussion on getting this enabled.