Bug 150794 - Fix the ENABLE(B3_JIT) build on Linux
Summary: Fix the ENABLE(B3_JIT) build on Linux
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: Other
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Csaba Osztrogonác
URL:
Keywords:
Depends on:
Blocks: 150279 152248
  Show dependency treegraph
 
Reported: 2015-11-02 04:54 PST by Csaba Osztrogonác
Modified: 2016-01-28 14:11 PST (History)
6 users (show)

See Also:


Attachments
Patch (6.45 KB, patch)
2015-11-02 04:58 PST, Csaba Osztrogonác
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Csaba Osztrogonác 2015-11-02 04:54:47 PST
SSIA
Comment 1 Csaba Osztrogonác 2015-11-02 04:58:36 PST
Created attachment 264574 [details]
Patch
Comment 2 Csaba Osztrogonác 2015-11-02 05:17:14 PST
Comment on attachment 264574 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=264574&action=review

> Source/JavaScriptCore/b3/B3IndexSet.h:89
>              iterator& operator++()
>              {
> -                m_iter++;
> +                ++m_iter;
>                  return *this;
>              }

The build failure which this change fixes:

../../Source/JavaScriptCore/b3/B3IndexSet.h: In member function ‘JSC::B3::IndexSet<T>::Iterable<CollectionType>::iterator& JSC::B3::IndexSet<T>::Iterable<CollectionType>::iterator::operator++()’:
../../Source/JavaScriptCore/b3/B3IndexSet.h:87:23: error: no ‘operator++(int)’ declared for postfix ‘++’ [-fpermissive]
                 m_iter++;
                       ^
I'm wondering how does it build on Apple Mac, because there isn't postfix operator++ in wtf/BitVector.h : BitVector::SetBitsIterable::iterator.
Additionally using postfix operator++ isn't necessary here, since we don't use the previous value of m_iter.
Comment 3 Csaba Osztrogonác 2015-11-02 05:22:11 PST
Comment on attachment 264574 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=264574&action=review

> Source/JavaScriptCore/b3/B3HeapRange.h:31
> +#include <limits.h>

because we uses UINT_MAX in this file:
../../Source/JavaScriptCore/b3/B3HeapRange.h: In static member function 'static JSC::B3::HeapRange JSC::B3::HeapRange::top()':
../../Source/JavaScriptCore/b3/B3HeapRange.h:62:29: error: 'UINT_MAX' was not declared in this scope

> Source/JavaScriptCore/b3/air/AirSpecial.cpp:31
> +#include <limits.h>

ditto
Comment 4 Csaba Osztrogonác 2015-11-02 05:26:57 PST
Comment on attachment 264574 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=264574&action=review

> Source/JavaScriptCore/b3/B3Type.h:73
> +    RELEASE_ASSERT_NOT_REACHED();
> +    return 0;
>  }

The build failure which this change fixes:

../../Source/JavaScriptCore/b3/B3Type.h: In function 'size_t JSC::B3::sizeofType(JSC::B3::Type)':
../../Source/JavaScriptCore/b3/B3Type.h:71:1: error: control reaches end of non-void function [-Werror=return-type]
cc1plus: all warnings being treated as errors

Type isn't strongly typed enum, so we have to add this unreachable return to make the compiler happy.

> Source/JavaScriptCore/b3/air/AirArg.h:127
> +        RELEASE_ASSERT_NOT_REACHED();
> +        return false;
>      }

ditto

> Source/JavaScriptCore/b3/air/AirArg.h:142
> +        RELEASE_ASSERT_NOT_REACHED();
> +        return false;
>      }

ditto

> Source/JavaScriptCore/b3/air/AirArg.h:477
> +        RELEASE_ASSERT_NOT_REACHED();
> +        return false;
>      }

ditto

> Source/JavaScriptCore/b3/air/AirArg.h:501
> +        RELEASE_ASSERT_NOT_REACHED();
> +        return false;
>      }

ditto

> Source/JavaScriptCore/b3/air/AirArg.h:531
> +        RELEASE_ASSERT_NOT_REACHED();
> +        return false;
>      }

ditto

> Source/JavaScriptCore/b3/air/AirCode.h:76
> +        RELEASE_ASSERT_NOT_REACHED();
> +        return { };
>      }

ditto

> Source/JavaScriptCore/b3/air/AirCode.h:88
> +        RELEASE_ASSERT_NOT_REACHED();
> +        return 0;
>      }

ditto
Comment 5 WebKit Commit Bot 2015-11-03 07:51:29 PST
Comment on attachment 264574 [details]
Patch

Clearing flags on attachment: 264574

Committed r191952: <http://trac.webkit.org/changeset/191952>
Comment 6 WebKit Commit Bot 2015-11-03 07:51:33 PST
All reviewed patches have been landed.  Closing bug.
Comment 7 Filip Pizlo 2015-11-03 10:05:34 PST
Comment on attachment 264574 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=264574&action=review

>> Source/JavaScriptCore/b3/air/AirArg.h:127
>>      }
> 
> ditto

Can we please find some other way of doing these kinds of fixes?  These methods are called from hot paths, and on clang, this change causes the compiler to emit substantially more code.  I don't like it when fixes for Linux unnecessarily slow things down on OSX.
Comment 8 Csaba Osztrogonác 2015-11-03 10:15:34 PST
Comment on attachment 264574 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=264574&action=review

>>> Source/JavaScriptCore/b3/air/AirArg.h:127
>>>      }
>> 
>> ditto
> 
> Can we please find some other way of doing these kinds of fixes?  These methods are called from hot paths, and on clang, this change causes the compiler to emit substantially more code.  I don't like it when fixes for Linux unnecessarily slow things down on OSX.

Sure, I'll check if we can fix it in a different way.

I agree that the code size can be a little bit bigger, but unreachable code can't slow down anything.
Comment 9 Filip Pizlo 2015-11-03 10:21:04 PST
(In reply to comment #8)
> Comment on attachment 264574 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=264574&action=review
> 
> >>> Source/JavaScriptCore/b3/air/AirArg.h:127
> >>>      }
> >> 
> >> ditto
> > 
> > Can we please find some other way of doing these kinds of fixes?  These methods are called from hot paths, and on clang, this change causes the compiler to emit substantially more code.  I don't like it when fixes for Linux unnecessarily slow things down on OSX.
> 
> Sure, I'll check if we can fix it in a different way.
> 
> I agree that the code size can be a little bit bigger, but unreachable code
> can't slow down anything.

False.

I just wrote this program, and compiled it with clang -O3.

#include <stdlib.h>

int foo(int x)
{
    switch (x) {
    case 42:
        return 23;
    case 85:
        return 12;
    }
}

int bar(int x)
{
    switch (x) {
    case 42:
        return 23;
    case 85:
        return 12;
    }
    return 65;
}

int baz(int x)
{
    switch (x) {
    case 42:
        return 23;
    case 85:
        return 12;
    }
    abort();
    return 65;
}

The output is:

[pizlo@shakezilla OpenSource] otool -tv test.o
test.o:
(__TEXT,__text) section
_foo:
0000000000000000	pushq	%rbp
0000000000000001	movq	%rsp, %rbp
0000000000000004	cmpl	$0x55, %edi
0000000000000007	movl	$0xc, %ecx
000000000000000c	movl	$0x17, %eax
0000000000000011	cmovel	%ecx, %eax
0000000000000014	popq	%rbp
0000000000000015	retq
0000000000000016	nopw	%cs:(%rax,%rax)
_bar:
0000000000000020	pushq	%rbp
0000000000000021	movq	%rsp, %rbp
0000000000000024	movl	$0x17, %eax
0000000000000029	cmpl	$0x2a, %edi
000000000000002c	je	0x3f
000000000000002e	cmpl	$0x55, %edi
0000000000000031	jne	0x3a
0000000000000033	movl	$0xc, %eax
0000000000000038	popq	%rbp
0000000000000039	retq
000000000000003a	movl	$0x41, %eax
000000000000003f	popq	%rbp
0000000000000040	retq
0000000000000041	nopw	%cs:(%rax,%rax)
_baz:
0000000000000050	pushq	%rbp
0000000000000051	movq	%rsp, %rbp
0000000000000054	movl	$0x17, %eax
0000000000000059	cmpl	$0x2a, %edi
000000000000005c	je	0x68
000000000000005e	cmpl	$0x55, %edi
0000000000000061	jne	0x6a
0000000000000063	movl	$0xc, %eax
0000000000000068	popq	%rbp
0000000000000069	retq
000000000000006a	callq	0x6f

As you can see, foo is compiled into a super-efficient conditional move.  Bar and baz are both compiled into a pair of branches - you need to execute one branch to get to the first case, and two branches to get to the second case.  Even one branch is worse than a conditional move, but two branches are a lot worse.

It would actually be best if you reverted your change.  There is no value to having B3 ported to Linux right now, and there is definitely harm to introducing any overheads, even hypothetical ones.  We are working hard to turn B3 into a hotrod that generates code super quickly, and we'd rather not have any changes go in that have uncertain performance implications.
Comment 10 Csaba Osztrogonác 2015-11-03 10:22:55 PST
Filed new bug report for follow up: bug150842 . I'll investigate it soon.
Comment 11 Csaba Osztrogonác 2015-11-03 10:34:01 PST
(In reply to comment #9)
> As you can see, foo is compiled into a super-efficient conditional move. 
> Bar and baz are both compiled into a pair of branches - you need to execute
> one branch to get to the first case, and two branches to get to the second
> case.  Even one branch is worse than a conditional move, but two branches
> are a lot worse.
 
> It would actually be best if you reverted your change.  There is no value to
> having B3 ported to Linux right now, and there is definitely harm to
> introducing any overheads, even hypothetical ones.  We are working hard to
> turn B3 into a hotrod that generates code super quickly, and we'd rather not
> have any changes go in that have uncertain performance implications.

OK, you convinced me, I reverted unreachable returns
in http://trac.webkit.org/changeset/191957 .

Will find a better fix bug150842 which doesn't generate any code.
Comment 12 Yusuke Suzuki 2016-01-28 13:45:35 PST
Seeing https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62272, I noticed that we have very very easy (and no performance regression!!!) fix for this issue.
Just replacing `addEdge` in the nested lambda to `this->addEdge`.

Filip, how about this?
Comment 13 Yusuke Suzuki 2016-01-28 13:48:19 PST
Ah, oops, I miscommented.
https://bugs.webkit.org/show_bug.cgi?id=151624 is correct issue for me.
Comment 14 Filip Pizlo 2016-01-28 14:11:21 PST
(In reply to comment #12)
> Seeing https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62272, I noticed that we
> have very very easy (and no performance regression!!!) fix for this issue.
> Just replacing `addEdge` in the nested lambda to `this->addEdge`.
> 
> Filip, how about this?

Seems sensible!