Bug 164588

Summary: [JSC] Build broken for 32-bit x86 after r208306 with GCC 4.9
Product: WebKit Reporter: Carlos Alberto Lopez Perez <clopez>
Component: JavaScriptCoreAssignee: Carlos Alberto Lopez Perez <clopez>
Status: RESOLVED FIXED    
Severity: Normal CC: benjamin, bugs-noreply, cdumez, cgarcia, cmarcelo, commit-queue, dbates, fpizlo, ggaren, gustavo, keith_miller, mark.lam, mcatanzaro, ossy, ysuzuki
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=163562
Attachments:
Description Flags
Patch
none
Patch
none
Patch none

Carlos Alberto Lopez Perez
Reported 2016-11-10 05:45:38 PST
Building WebKitGTK+ (and I guess any other Linux port) after r208306 <http://trac.webkit.org/r208306> causes this: g++-4.9 [.....] -o Source/JavaScriptCore/CMakeFiles/JavaScriptCore.dir/heap/Heap.cpp.o -c ../../Source/JavaScriptCore/heap/Heap.cpp In file included from ../../Source/WTF/wtf/Lock.h:29:0, from ../../Source/WTF/wtf/HashTable.h:33, from ../../Source/WTF/wtf/HashMap.h:25, from ../../Source/JavaScriptCore/runtime/JSCJSValue.h:33, from ../../Source/JavaScriptCore/heap/HandleTypes.h:28, from ../../Source/JavaScriptCore/heap/Handle.h:28, from ../../Source/JavaScriptCore/heap/HandleSet.h:28, from ../../Source/JavaScriptCore/heap/Heap.h:27, from ../../Source/JavaScriptCore/heap/Heap.cpp:22: ../../Source/WTF/wtf/Atomics.h: In member function 'bool JSC::Heap::handleGCDidJIT(unsigned int)': ../../Source/WTF/wtf/Atomics.h:245:20: error: inconsistent operand constraints in an 'asm' : "memory"); ^ https://build.webkit.org/builders/GTK%20Linux%2032-bit%20Release/builds/65378/steps/compile-webkit/logs/stdio/text This is the function causing the issue: inline void x86_cpuid() { #if OS(WINDOWS) int info[4]; __cpuid(info, 0); #else intptr_t a = 0, b, c, d; asm volatile( "cpuid" : "+a"(a), "=b"(b), "=c"(c), "=d"(d) : : "memory"); #endif } Seems GCC 4.9 doesn't like it on 32-bit The build failure is not reproducible with newer GCC (tested with GCC 5.3.1 and built fine)
Attachments
Patch (1.79 KB, patch)
2016-11-10 11:38 PST, Carlos Alberto Lopez Perez
no flags
Patch (1.80 KB, patch)
2016-11-10 13:12 PST, Carlos Alberto Lopez Perez
no flags
Patch (1.87 KB, patch)
2016-11-15 13:09 PST, Carlos Alberto Lopez Perez
no flags
Carlos Alberto Lopez Perez
Comment 1 2016-11-10 06:02:42 PST
And this seems to be the right answer: http://stackoverflow.com/a/28976166 Removing -fPIC for that file from the compiler arguments makes it build
Michael Catanzaro
Comment 2 2016-11-10 06:55:31 PST
Does it work, though??? Is it really possible to have a shared library with position-dependent code? I do not think removing -fPIC could be right...?
Carlos Alberto Lopez Perez
Comment 3 2016-11-10 08:35:42 PST
(In reply to comment #2) > Does it work, though??? Is it really possible to have a shared library with > position-dependent code? I do not think removing -fPIC could be right...? What I tried was just removing the fPIC flag from JavaScriptCore/heap/Heap.cpp which is (right now) the only user of x86_cpuid() via crossModifyingCodeFence(). Doing this with cmake seems difficult (not even sure if its possible), because the fPIC argument gets passed automatically for any library via ${CMAKE_SHARED_LIBRARY_CXX_FLAGS} (for the whole library and not for a single file). So I managed to get it build by manually editing the ninja recipes generated to remove the fPIC variable from JavaScriptCore/heap/Heap.cpp It seems to work, but its hard to tell if something is broke as I don't even know what is the purpose of this crossModifyingCodeFence() thing or when its used. And removing -fPIC for the whole JavaScriptCore library seems not something we want to do in any case. I see crossModifyingCodeFence is defined as std::atomic_thread_fence(std::memory_order_seq_cst) for non x86 cpus. Maybe that could work also for an x86 cpu?. Another option is to modify that assembly to make it happy with gcc-4.9 on x86-32 bits. Here seems to be some interesting notes about this: http://sam.zoy.org/blog/2007-04-13-shlib-with-non-pic-code-have-inline-assembly-and-pic-mix-well But is hard to try to modify that without knowing what is x86_cpuid() actually doing, how to test it, or what is the purpose of crossModifyingCodeFence(). Filip could you advice us what to do here to fix this?
Carlos Alberto Lopez Perez
Comment 4 2016-11-10 10:42:16 PST
Ok.. i think now I understand better this. The purpose of executing the x86 assembly instruction cpuid on crossModifyingCodeFence() is just to force any modifications to registers, memory and flags to be completed and to drain all buffered writes to memory before the next instruction is fetched and executed. The value returned by cpuid is not used at all here. The idea is just to force this sync() on the cpu registers and memory.. right? Reference: https://jpassing.com/2015/01/19/runtime-code-modification-explained-part-2-cache-coherency-issues/
Carlos Alberto Lopez Perez
Comment 5 2016-11-10 11:38:58 PST
Michael Catanzaro
Comment 6 2016-11-10 12:21:01 PST
Comment on attachment 294391 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=294391&action=review > Source/WTF/wtf/Atomics.h:239 > +#elif CPU(X86) && defined(__PIC__) && __GNUC__ < 5 It will be a compiler warning on other compilers. Best check if it's defined before using it: && defined(__GNUC__) && __GNUC__ < 5 Do we want to run this code when using Clang? If not, then use this instead to filter out Clang: && COMPILER(GCC) && __GNUC__ < 5
Carlos Alberto Lopez Perez
Comment 7 2016-11-10 13:12:55 PST
Created attachment 294401 [details] Patch Address Michael comments
Carlos Alberto Lopez Perez
Comment 8 2016-11-15 07:33:45 PST
Ping reviewers? The GTK 32-bit buildbot has been red for several days already due to this <https://build.webkit.org/builders/GTK%20Linux%2032-bit%20Release>
Michael Catanzaro
Comment 9 2016-11-15 08:35:01 PST
Comment on attachment 294401 [details] Patch I'd rather another reviewer look at it because I do not understand the code, but here's your rubber stamp.
Michael Catanzaro
Comment 10 2016-11-15 08:36:00 PST
I'm suspicious that the code is only needed when using GCC.
Filip Pizlo
Comment 11 2016-11-15 09:31:36 PST
Comment on attachment 294401 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=294401&action=review > Source/WTF/wtf/Atomics.h:248 > +#elif CPU(X86) && defined(__PIC__) && COMPILER(GCC) && __GNUC__ < 5 > + intptr_t a = 0, b, c, d; > + asm volatile( > + "pushl %%ebx;" // save %ebx > + "cpuid;" > + "movl %%ebx, %0;" // save what cpuid put in %ebx into the first output (b) > + "popl %%ebx;" // restore the old %ebx value > + : "=r"(b), "+a"(a), "=c"(c), "=d"(d) > + : > + : "memory"); You should just make this conditional on CPU(X86). Because cpuid is so expensive, we don't use this on code paths where the extra push/pop would be expensive. It's better to simplify the conditional just for sanity. You don't need to movl %%ebx, %0. We never use the results of a, b, c, d. The only point of those variables is as a way to tell the compiler that cpuid clobbers ax, bx, cx, and dx. But it seems that this version of GCC doesn't grok our clobbering of bx. So, I suggest something like this for x86-32: intptr_t a = 0, c, d; asm volatile( "pushl %%ebx\n\t" "cpuid\n\t" "popl %%ebx\n\t" : "+a"(a), "=c"(c), "=d"(d) : : "memory"); So, we still tell the compiler that this reads and writes ax, writes cx, and writes dx, but we don't tell the compiler about bx, because we know that the compiler might not understand this and because we preserve its value using push/pop. r- because I think that the movl is unnecessary.
Carlos Alberto Lopez Perez
Comment 12 2016-11-15 12:10:14 PST
(In reply to comment #11) > Comment on attachment 294401 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=294401&action=review > > > Source/WTF/wtf/Atomics.h:248 > > +#elif CPU(X86) && defined(__PIC__) && COMPILER(GCC) && __GNUC__ < 5 > > + intptr_t a = 0, b, c, d; > > + asm volatile( > > + "pushl %%ebx;" // save %ebx > > + "cpuid;" > > + "movl %%ebx, %0;" // save what cpuid put in %ebx into the first output (b) > > + "popl %%ebx;" // restore the old %ebx value > > + : "=r"(b), "+a"(a), "=c"(c), "=d"(d) > > + : > > + : "memory"); > > You should just make this conditional on CPU(X86). Because cpuid is so > expensive, we don't use this on code paths where the extra push/pop would be > expensive. It's better to simplify the conditional just for sanity. > > You don't need to movl %%ebx, %0. We never use the results of a, b, c, d. > The only point of those variables is as a way to tell the compiler that > cpuid clobbers ax, bx, cx, and dx. But it seems that this version of GCC > doesn't grok our clobbering of bx. > > So, I suggest something like this for x86-32: > > intptr_t a = 0, c, d; > asm volatile( > "pushl %%ebx\n\t" > "cpuid\n\t" > "popl %%ebx\n\t" > : "+a"(a), "=c"(c), "=d"(d) > : > : "memory"); > > So, we still tell the compiler that this reads and writes ax, writes cx, and > writes dx, but we don't tell the compiler about bx, because we know that the > compiler might not understand this and because we preserve its value using > push/pop. > > r- because I think that the movl is unnecessary. The movl is necessary if you actually want to read what cpuid has returned, but since we don't care about that, we can just ignore it and avoid the movl. However, if you are concerned about the extra movl and about having more if-defs than necessary, why just don't use this both unconditionally for x86 and x64? I have checked what the compiler does, and doing it how you suggests avoids 2 movl call on x64 http://sprunge.us/SVKR (and 1 in x86)
Filip Pizlo
Comment 13 2016-11-15 12:13:56 PST
(In reply to comment #12) > (In reply to comment #11) > > Comment on attachment 294401 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=294401&action=review > > > > > Source/WTF/wtf/Atomics.h:248 > > > +#elif CPU(X86) && defined(__PIC__) && COMPILER(GCC) && __GNUC__ < 5 > > > + intptr_t a = 0, b, c, d; > > > + asm volatile( > > > + "pushl %%ebx;" // save %ebx > > > + "cpuid;" > > > + "movl %%ebx, %0;" // save what cpuid put in %ebx into the first output (b) > > > + "popl %%ebx;" // restore the old %ebx value > > > + : "=r"(b), "+a"(a), "=c"(c), "=d"(d) > > > + : > > > + : "memory"); > > > > You should just make this conditional on CPU(X86). Because cpuid is so > > expensive, we don't use this on code paths where the extra push/pop would be > > expensive. It's better to simplify the conditional just for sanity. > > > > You don't need to movl %%ebx, %0. We never use the results of a, b, c, d. > > The only point of those variables is as a way to tell the compiler that > > cpuid clobbers ax, bx, cx, and dx. But it seems that this version of GCC > > doesn't grok our clobbering of bx. > > > > So, I suggest something like this for x86-32: > > > > intptr_t a = 0, c, d; > > asm volatile( > > "pushl %%ebx\n\t" > > "cpuid\n\t" > > "popl %%ebx\n\t" > > : "+a"(a), "=c"(c), "=d"(d) > > : > > : "memory"); > > > > So, we still tell the compiler that this reads and writes ax, writes cx, and > > writes dx, but we don't tell the compiler about bx, because we know that the > > compiler might not understand this and because we preserve its value using > > push/pop. > > > > r- because I think that the movl is unnecessary. > > The movl is necessary if you actually want to read what cpuid has returned, > but since we don't care about that, we can just ignore it and avoid the movl. > > However, if you are concerned about the extra movl and about having more > if-defs than necessary, why just don't use this both unconditionally for x86 > and x64? I'm happy with that, but it gets awkward when telling the compiler that you want to push ebx. On x86-32 you would say pushl ebx, while on x86-64 you would say pushq rbx. If you can find a clever way around this then I'm happy with this code being shared. That would be the best. > > I have checked what the compiler does, and doing it how you suggests avoids > 2 movl call on x64 http://sprunge.us/SVKR (and 1 in x86)
Carlos Alberto Lopez Perez
Comment 14 2016-11-15 12:51:31 PST
(In reply to comment #13) > > > > The movl is necessary if you actually want to read what cpuid has returned, > > but since we don't care about that, we can just ignore it and avoid the movl. > > > > However, if you are concerned about the extra movl and about having more > > if-defs than necessary, why just don't use this both unconditionally for x86 > > and x64? > > I'm happy with that, but it gets awkward when telling the compiler that you > want to push ebx. On x86-32 you would say pushl ebx, while on x86-64 you > would say pushq rbx. If you can find a clever way around this then I'm > happy with this code being shared. That would be the best. > Mmmm, good point. I was overlooking that in 64-bit the instructions and the registers are different :| So, I guess that using this for x86 only is the best thing I can come with.
Carlos Alberto Lopez Perez
Comment 15 2016-11-15 13:09:21 PST
Created attachment 294870 [details] Patch Avoid a movl call and make it conditional for CPU(X86)
Carlos Alberto Lopez Perez
Comment 16 2016-11-16 12:03:06 PST
Filip please, could you review this patch? The GTK 32-bit buildbot continues to be broken due to this. I understand it should be quick to review. Thanks
Mark Lam
Comment 17 2016-11-16 12:50:01 PST
Comment on attachment 294870 [details] Patch r=me because Filip has already reviewed this previously and suggested this specific fix.
Carlos Alberto Lopez Perez
Comment 18 2016-11-16 12:59:46 PST
(In reply to comment #17) > Comment on attachment 294870 [details] > Patch > > r=me because Filip has already reviewed this previously and suggested this > specific fix. Thanks!
Carlos Alberto Lopez Perez
Comment 19 2016-11-16 13:01:05 PST
Comment on attachment 294870 [details] Patch Clearing flags on attachment: 294870 Committed r208806: <http://trac.webkit.org/changeset/208806>
Carlos Alberto Lopez Perez
Comment 20 2016-11-16 13:01:16 PST
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.