Summary: | [JSC] Build broken for 32-bit x86 after r208306 with GCC 4.9 | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Carlos Alberto Lopez Perez <clopez> | ||||||||
Component: | JavaScriptCore | Assignee: | Carlos Alberto Lopez Perez <clopez> | ||||||||
Status: | RESOLVED FIXED | ||||||||||
Severity: | Normal | CC: | benjamin, bugs-noreply, cdumez, cgarcia, cmarcelo, commit-queue, dbates, fpizlo, ggaren, gustavo, keith_miller, mark.lam, mcatanzaro, ossy, ysuzuki | ||||||||
Priority: | P2 | ||||||||||
Version: | WebKit Nightly Build | ||||||||||
Hardware: | Unspecified | ||||||||||
OS: | Unspecified | ||||||||||
See Also: | https://bugs.webkit.org/show_bug.cgi?id=163562 | ||||||||||
Attachments: |
|
Description
Carlos Alberto Lopez Perez
2016-11-10 05:45:38 PST
And this seems to be the right answer: http://stackoverflow.com/a/28976166 Removing -fPIC for that file from the compiler arguments makes it build Does it work, though??? Is it really possible to have a shared library with position-dependent code? I do not think removing -fPIC could be right...? (In reply to comment #2) > Does it work, though??? Is it really possible to have a shared library with > position-dependent code? I do not think removing -fPIC could be right...? What I tried was just removing the fPIC flag from JavaScriptCore/heap/Heap.cpp which is (right now) the only user of x86_cpuid() via crossModifyingCodeFence(). Doing this with cmake seems difficult (not even sure if its possible), because the fPIC argument gets passed automatically for any library via ${CMAKE_SHARED_LIBRARY_CXX_FLAGS} (for the whole library and not for a single file). So I managed to get it build by manually editing the ninja recipes generated to remove the fPIC variable from JavaScriptCore/heap/Heap.cpp It seems to work, but its hard to tell if something is broke as I don't even know what is the purpose of this crossModifyingCodeFence() thing or when its used. And removing -fPIC for the whole JavaScriptCore library seems not something we want to do in any case. I see crossModifyingCodeFence is defined as std::atomic_thread_fence(std::memory_order_seq_cst) for non x86 cpus. Maybe that could work also for an x86 cpu?. Another option is to modify that assembly to make it happy with gcc-4.9 on x86-32 bits. Here seems to be some interesting notes about this: http://sam.zoy.org/blog/2007-04-13-shlib-with-non-pic-code-have-inline-assembly-and-pic-mix-well But is hard to try to modify that without knowing what is x86_cpuid() actually doing, how to test it, or what is the purpose of crossModifyingCodeFence(). Filip could you advice us what to do here to fix this? Ok.. i think now I understand better this. The purpose of executing the x86 assembly instruction cpuid on crossModifyingCodeFence() is just to force any modifications to registers, memory and flags to be completed and to drain all buffered writes to memory before the next instruction is fetched and executed. The value returned by cpuid is not used at all here. The idea is just to force this sync() on the cpu registers and memory.. right? Reference: https://jpassing.com/2015/01/19/runtime-code-modification-explained-part-2-cache-coherency-issues/ Created attachment 294391 [details]
Patch
Comment on attachment 294391 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=294391&action=review > Source/WTF/wtf/Atomics.h:239 > +#elif CPU(X86) && defined(__PIC__) && __GNUC__ < 5 It will be a compiler warning on other compilers. Best check if it's defined before using it: && defined(__GNUC__) && __GNUC__ < 5 Do we want to run this code when using Clang? If not, then use this instead to filter out Clang: && COMPILER(GCC) && __GNUC__ < 5 Created attachment 294401 [details]
Patch
Address Michael comments
Ping reviewers? The GTK 32-bit buildbot has been red for several days already due to this <https://build.webkit.org/builders/GTK%20Linux%2032-bit%20Release> Comment on attachment 294401 [details]
Patch
I'd rather another reviewer look at it because I do not understand the code, but here's your rubber stamp.
I'm suspicious that the code is only needed when using GCC. Comment on attachment 294401 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=294401&action=review > Source/WTF/wtf/Atomics.h:248 > +#elif CPU(X86) && defined(__PIC__) && COMPILER(GCC) && __GNUC__ < 5 > + intptr_t a = 0, b, c, d; > + asm volatile( > + "pushl %%ebx;" // save %ebx > + "cpuid;" > + "movl %%ebx, %0;" // save what cpuid put in %ebx into the first output (b) > + "popl %%ebx;" // restore the old %ebx value > + : "=r"(b), "+a"(a), "=c"(c), "=d"(d) > + : > + : "memory"); You should just make this conditional on CPU(X86). Because cpuid is so expensive, we don't use this on code paths where the extra push/pop would be expensive. It's better to simplify the conditional just for sanity. You don't need to movl %%ebx, %0. We never use the results of a, b, c, d. The only point of those variables is as a way to tell the compiler that cpuid clobbers ax, bx, cx, and dx. But it seems that this version of GCC doesn't grok our clobbering of bx. So, I suggest something like this for x86-32: intptr_t a = 0, c, d; asm volatile( "pushl %%ebx\n\t" "cpuid\n\t" "popl %%ebx\n\t" : "+a"(a), "=c"(c), "=d"(d) : : "memory"); So, we still tell the compiler that this reads and writes ax, writes cx, and writes dx, but we don't tell the compiler about bx, because we know that the compiler might not understand this and because we preserve its value using push/pop. r- because I think that the movl is unnecessary. (In reply to comment #11) > Comment on attachment 294401 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=294401&action=review > > > Source/WTF/wtf/Atomics.h:248 > > +#elif CPU(X86) && defined(__PIC__) && COMPILER(GCC) && __GNUC__ < 5 > > + intptr_t a = 0, b, c, d; > > + asm volatile( > > + "pushl %%ebx;" // save %ebx > > + "cpuid;" > > + "movl %%ebx, %0;" // save what cpuid put in %ebx into the first output (b) > > + "popl %%ebx;" // restore the old %ebx value > > + : "=r"(b), "+a"(a), "=c"(c), "=d"(d) > > + : > > + : "memory"); > > You should just make this conditional on CPU(X86). Because cpuid is so > expensive, we don't use this on code paths where the extra push/pop would be > expensive. It's better to simplify the conditional just for sanity. > > You don't need to movl %%ebx, %0. We never use the results of a, b, c, d. > The only point of those variables is as a way to tell the compiler that > cpuid clobbers ax, bx, cx, and dx. But it seems that this version of GCC > doesn't grok our clobbering of bx. > > So, I suggest something like this for x86-32: > > intptr_t a = 0, c, d; > asm volatile( > "pushl %%ebx\n\t" > "cpuid\n\t" > "popl %%ebx\n\t" > : "+a"(a), "=c"(c), "=d"(d) > : > : "memory"); > > So, we still tell the compiler that this reads and writes ax, writes cx, and > writes dx, but we don't tell the compiler about bx, because we know that the > compiler might not understand this and because we preserve its value using > push/pop. > > r- because I think that the movl is unnecessary. The movl is necessary if you actually want to read what cpuid has returned, but since we don't care about that, we can just ignore it and avoid the movl. However, if you are concerned about the extra movl and about having more if-defs than necessary, why just don't use this both unconditionally for x86 and x64? I have checked what the compiler does, and doing it how you suggests avoids 2 movl call on x64 http://sprunge.us/SVKR (and 1 in x86) (In reply to comment #12) > (In reply to comment #11) > > Comment on attachment 294401 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=294401&action=review > > > > > Source/WTF/wtf/Atomics.h:248 > > > +#elif CPU(X86) && defined(__PIC__) && COMPILER(GCC) && __GNUC__ < 5 > > > + intptr_t a = 0, b, c, d; > > > + asm volatile( > > > + "pushl %%ebx;" // save %ebx > > > + "cpuid;" > > > + "movl %%ebx, %0;" // save what cpuid put in %ebx into the first output (b) > > > + "popl %%ebx;" // restore the old %ebx value > > > + : "=r"(b), "+a"(a), "=c"(c), "=d"(d) > > > + : > > > + : "memory"); > > > > You should just make this conditional on CPU(X86). Because cpuid is so > > expensive, we don't use this on code paths where the extra push/pop would be > > expensive. It's better to simplify the conditional just for sanity. > > > > You don't need to movl %%ebx, %0. We never use the results of a, b, c, d. > > The only point of those variables is as a way to tell the compiler that > > cpuid clobbers ax, bx, cx, and dx. But it seems that this version of GCC > > doesn't grok our clobbering of bx. > > > > So, I suggest something like this for x86-32: > > > > intptr_t a = 0, c, d; > > asm volatile( > > "pushl %%ebx\n\t" > > "cpuid\n\t" > > "popl %%ebx\n\t" > > : "+a"(a), "=c"(c), "=d"(d) > > : > > : "memory"); > > > > So, we still tell the compiler that this reads and writes ax, writes cx, and > > writes dx, but we don't tell the compiler about bx, because we know that the > > compiler might not understand this and because we preserve its value using > > push/pop. > > > > r- because I think that the movl is unnecessary. > > The movl is necessary if you actually want to read what cpuid has returned, > but since we don't care about that, we can just ignore it and avoid the movl. > > However, if you are concerned about the extra movl and about having more > if-defs than necessary, why just don't use this both unconditionally for x86 > and x64? I'm happy with that, but it gets awkward when telling the compiler that you want to push ebx. On x86-32 you would say pushl ebx, while on x86-64 you would say pushq rbx. If you can find a clever way around this then I'm happy with this code being shared. That would be the best. > > I have checked what the compiler does, and doing it how you suggests avoids > 2 movl call on x64 http://sprunge.us/SVKR (and 1 in x86) (In reply to comment #13) > > > > The movl is necessary if you actually want to read what cpuid has returned, > > but since we don't care about that, we can just ignore it and avoid the movl. > > > > However, if you are concerned about the extra movl and about having more > > if-defs than necessary, why just don't use this both unconditionally for x86 > > and x64? > > I'm happy with that, but it gets awkward when telling the compiler that you > want to push ebx. On x86-32 you would say pushl ebx, while on x86-64 you > would say pushq rbx. If you can find a clever way around this then I'm > happy with this code being shared. That would be the best. > Mmmm, good point. I was overlooking that in 64-bit the instructions and the registers are different :| So, I guess that using this for x86 only is the best thing I can come with. Created attachment 294870 [details]
Patch
Avoid a movl call and make it conditional for CPU(X86)
Filip please, could you review this patch? The GTK 32-bit buildbot continues to be broken due to this. I understand it should be quick to review. Thanks Comment on attachment 294870 [details]
Patch
r=me because Filip has already reviewed this previously and suggested this specific fix.
(In reply to comment #17) > Comment on attachment 294870 [details] > Patch > > r=me because Filip has already reviewed this previously and suggested this > specific fix. Thanks! Comment on attachment 294870 [details] Patch Clearing flags on attachment: 294870 Committed r208806: <http://trac.webkit.org/changeset/208806> All reviewed patches have been landed. Closing bug. |