Bug 172793 - GC should use scrambled free-lists
Summary: GC should use scrambled free-lists
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: All All
: P2 Normal
Assignee: Filip Pizlo
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2017-05-31 20:37 PDT by Filip Pizlo
Modified: 2017-06-09 11:27 PDT (History)
14 users (show)

See Also:


Attachments
it's a start (32.09 KB, patch)
2017-05-31 20:38 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
more (37.61 KB, patch)
2017-06-01 09:17 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
it grows (44.20 KB, patch)
2017-06-01 09:53 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
compiles in debug and runs some simple code (53.89 KB, patch)
2017-06-01 10:32 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
release builds and seems to pass some tests (54.38 KB, patch)
2017-06-01 10:41 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch (54.45 KB, patch)
2017-06-01 11:05 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch (54.42 KB, patch)
2017-06-01 11:22 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
removed some pointer chasing (55.22 KB, patch)
2017-06-01 13:12 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
switched to list-of-pointers instead of list-of-offsets (54.87 KB, patch)
2017-06-01 13:41 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
scrambled the free-list instead (57.47 KB, patch)
2017-06-01 14:35 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch (57.47 KB, patch)
2017-06-01 15:38 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch (57.82 KB, patch)
2017-06-01 15:44 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch (58.39 KB, patch)
2017-06-01 17:03 PDT, Filip Pizlo
mark.lam: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Filip Pizlo 2017-05-31 20:37:52 PDT
Patch forthcoming.
Comment 1 Filip Pizlo 2017-05-31 20:38:30 PDT
Created attachment 311676 [details]
it's a start
Comment 2 Filip Pizlo 2017-06-01 09:17:10 PDT
Created attachment 311709 [details]
more
Comment 3 Filip Pizlo 2017-06-01 09:53:46 PDT
Created attachment 311711 [details]
it grows
Comment 4 Filip Pizlo 2017-06-01 10:32:46 PDT
Created attachment 311722 [details]
compiles in debug and runs some simple code

Building and testing release now.
Comment 5 Build Bot 2017-06-01 10:34:42 PDT
Attachment 311722 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/heap/MarkedAllocator.cpp:159:  More than one command on the same line  [whitespace/newline] [4]
Total errors found: 1 in 23 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 6 Filip Pizlo 2017-06-01 10:41:53 PDT
Created attachment 311724 [details]
release builds and seems to pass some tests
Comment 7 Build Bot 2017-06-01 10:44:44 PDT
Attachment 311724 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/heap/MarkedAllocator.cpp:159:  More than one command on the same line  [whitespace/newline] [4]
Total errors found: 1 in 24 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 8 Filip Pizlo 2017-06-01 11:05:46 PDT
Created attachment 311728 [details]
the patch

Fix builds
Comment 9 Build Bot 2017-06-01 11:07:29 PDT
Attachment 311728 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/heap/MarkedAllocator.cpp:159:  More than one command on the same line  [whitespace/newline] [4]
Total errors found: 1 in 24 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 10 Filip Pizlo 2017-06-01 11:22:54 PDT
Created attachment 311733 [details]
the patch
Comment 11 Build Bot 2017-06-01 11:25:23 PDT
Attachment 311733 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/heap/MarkedAllocator.cpp:159:  More than one command on the same line  [whitespace/newline] [4]
Total errors found: 1 in 24 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 12 Oliver Hunt 2017-06-01 11:27:45 PDT
View in context: https://bugs.webkit.org/attachment.cgi?id=311728&action=review

lgtm (with comments), but i'd obviously prefer someone more familiar with ToT to actually r+

> Source/JavaScriptCore/heap/FreeList.cpp:71
> +    m_remaining = remaining;

can you add a RELEASE_ASSERT(remaining % cellSize == 0 ) ?

> Source/JavaScriptCore/heap/FreeListInlines.h:43
> +    unsigned remaining = m_remaining;

Can you see if making this Checked<unsigned> has a perf impact? This is simply paranoia and the code does not obviously overflow, but i'm increasingly paranoid about almost math these days :D

> Source/JavaScriptCore/heap/FreeListInlines.h:47
> +        m_remaining = remaining;

This being safe is dependent on remaining always being a multiple of m_cellSize.

> Source/JavaScriptCore/heap/MarkedAllocator.cpp:159
> +        [] () -> HeapCell* { RELEASE_ASSERT_NOT_REACHED(); return nullptr; });

nice
Comment 13 Mark Lam 2017-06-01 11:35:41 PDT
Comment on attachment 311724 [details]
release builds and seems to pass some tests

View in context: https://bugs.webkit.org/attachment.cgi?id=311724&action=review

> Source/JavaScriptCore/heap/FreeList.cpp:38
> +    static_assert(MarkedBlock::blockSize <= (1u << (sizeof(FreeOffset) * 8)), "FreeList assumes that FreeOffset is big enough to express all possible block offsets");

Why 8?  Why not MarkedBlock::atomSize?  According to MarkedBlock::Handle::cellSize(), the cellSize is m_atomsPerCell * atomSize.  That means the minimum cellSize is atomSize (which incidentally is 16 bytes).

> Source/JavaScriptCore/heap/FreeList.cpp:46
> +FreeList::~FreeList()
> +{
> +}

Can we delete this since it does nothing?

> Source/JavaScriptCore/heap/FreeList.h:-82
> -    bool operator==(const FreeList& other) const
> -    {
> -        return head == other.head
> -            && payloadEnd == other.payloadEnd
> -            && remaining == other.remaining
> -            && originalSize == other.originalSize;
> -    }
> -    
> -    bool operator!=(const FreeList& other) const
> -    {
> -        return !(*this == other);
> -    }
> -    
> -    explicit operator bool() const
> -    {
> -        return *this != FreeList();
> -    }

Are these unused?  Would you mind adding = delete versions of these in the private section just to make sure that we're not using the default versions of them unexpectedly?

> Source/JavaScriptCore/heap/FreeList.h:43
> +    ~FreeList();

Delete since it does nothing?

> Source/JavaScriptCore/heap/FreeList.h:80
> +    unsigned m_cellSize { UINT_MAX };

nit: can we group this with the other unsigneds below for better packing (though it happens to not matter right now because you have an odd number of unsigneds at the moment )?

> Source/JavaScriptCore/heap/FreeList.h:84
> +    unsigned m_listSize { 0 };
> +    unsigned m_index { 0 };

nit: Why not make these of type FreeOffset since they should be bounded by FreeOffset, no?

Oh wait ... is there an atomicity requirement that requires that these be 32-bit?  Or is it just because it's bothersome to have the JIT code read/write a 16bit field?
Comment 14 Filip Pizlo 2017-06-01 11:45:05 PDT
(In reply to Oliver Hunt from comment #12)
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=311728&action=review
> 
> lgtm (with comments), but i'd obviously prefer someone more familiar with
> ToT to actually r+
> 
> > Source/JavaScriptCore/heap/FreeList.cpp:71
> > +    m_remaining = remaining;
> 
> can you add a RELEASE_ASSERT(remaining % cellSize == 0 ) ?

I suppose.

> 
> > Source/JavaScriptCore/heap/FreeListInlines.h:43
> > +    unsigned remaining = m_remaining;
> 
> Can you see if making this Checked<unsigned> has a perf impact? This is
> simply paranoia and the code does not obviously overflow, but i'm
> increasingly paranoid about almost math these days :D

I don't think we want to use Checked<> everywhere.  This bump allocator is unchanged in this patch, so even if we did want to harden it, we'd want to do this in another patch.  I wouldn't want to see that, though.  This code is meant to be compact.

> 
> > Source/JavaScriptCore/heap/FreeListInlines.h:47
> > +        m_remaining = remaining;
> 
> This being safe is dependent on remaining always being a multiple of
> m_cellSize.

Which has always been true.  I'm not worried about that path.

> 
> > Source/JavaScriptCore/heap/MarkedAllocator.cpp:159
> > +        [] () -> HeapCell* { RELEASE_ASSERT_NOT_REACHED(); return nullptr; });
> 
> nice
Comment 15 Filip Pizlo 2017-06-01 11:49:35 PDT
(In reply to Mark Lam from comment #13)
> Comment on attachment 311724 [details]
> release builds and seems to pass some tests
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=311724&action=review
> 
> > Source/JavaScriptCore/heap/FreeList.cpp:38
> > +    static_assert(MarkedBlock::blockSize <= (1u << (sizeof(FreeOffset) * 8)), "FreeList assumes that FreeOffset is big enough to express all possible block offsets");
> 
> Why 8?  Why not MarkedBlock::atomSize?  According to
> MarkedBlock::Handle::cellSize(), the cellSize is m_atomsPerCell * atomSize. 
> That means the minimum cellSize is atomSize (which incidentally is 16 bytes).

sizeof(blah)*8 is a way of counting the bits in blah.  This is counting the number of bits in FreeOffset, which is 16.  The result is 16K, which is also blockSize.  The point is: if we made blockSize any bigger, then a 16-bit offset wouldn't work anymore.

> 
> > Source/JavaScriptCore/heap/FreeList.cpp:46
> > +FreeList::~FreeList()
> > +{
> > +}
> 
> Can we delete this since it does nothing?

It deallocates the m_list, which is a unique_ptr.  I like having those kinds of things get outlined.  Makes debugging easier IMO.

> 
> > Source/JavaScriptCore/heap/FreeList.h:-82
> > -    bool operator==(const FreeList& other) const
> > -    {
> > -        return head == other.head
> > -            && payloadEnd == other.payloadEnd
> > -            && remaining == other.remaining
> > -            && originalSize == other.originalSize;
> > -    }
> > -    
> > -    bool operator!=(const FreeList& other) const
> > -    {
> > -        return !(*this == other);
> > -    }
> > -    
> > -    explicit operator bool() const
> > -    {
> > -        return *this != FreeList();
> > -    }
> 
> Are these unused?  Would you mind adding = delete versions of these in the
> private section just to make sure that we're not using the default versions
> of them unexpectedly?

Don't need to =delete them because these don't get defaults.

> 
> > Source/JavaScriptCore/heap/FreeList.h:43
> > +    ~FreeList();
> 
> Delete since it does nothing?
> 
> > Source/JavaScriptCore/heap/FreeList.h:80
> > +    unsigned m_cellSize { UINT_MAX };
> 
> nit: can we group this with the other unsigneds below for better packing
> (though it happens to not matter right now because you have an odd number of
> unsigneds at the moment )?

Interesting, I'll look at this.

> 
> > Source/JavaScriptCore/heap/FreeList.h:84
> > +    unsigned m_listSize { 0 };
> > +    unsigned m_index { 0 };
> 
> nit: Why not make these of type FreeOffset since they should be bounded by
> FreeOffset, no?

32-bit is cheaper to load/store on x86 than 16-bit, just in the sense that 16-bit operations need the 16-bit prefix byte.  So, I only use 16-bit for the array, since there we get a nice space savings.

> 
> Oh wait ... is there an atomicity requirement that requires that these be
> 32-bit?  Or is it just because it's bothersome to have the JIT code
> read/write a 16bit field?

Yeah, the bothersome part.  Specifically, code bloat.
Comment 16 Mark Lam 2017-06-01 11:58:38 PDT
Comment on attachment 311724 [details]
release builds and seems to pass some tests

View in context: https://bugs.webkit.org/attachment.cgi?id=311724&action=review

>>> Source/JavaScriptCore/heap/FreeList.cpp:38
>>> +    static_assert(MarkedBlock::blockSize <= (1u << (sizeof(FreeOffset) * 8)), "FreeList assumes that FreeOffset is big enough to express all possible block offsets");
>> 
>> Why 8?  Why not MarkedBlock::atomSize?  According to MarkedBlock::Handle::cellSize(), the cellSize is m_atomsPerCell * atomSize.  That means the minimum cellSize is atomSize (which incidentally is 16 bytes).
> 
> sizeof(blah)*8 is a way of counting the bits in blah.  This is counting the number of bits in FreeOffset, which is 16.  The result is 16K, which is also blockSize.  The point is: if we made blockSize any bigger, then a 16-bit offset wouldn't work anymore.

Oh, I totally misread that.  Ignore me.
Comment 17 Filip Pizlo 2017-06-01 12:15:34 PDT
It looks like this is having some perf issues in splay.  Investigating...
Comment 18 Filip Pizlo 2017-06-01 13:12:19 PDT
Created attachment 311747 [details]
removed some pointer chasing

It's not much faster though
Comment 19 Filip Pizlo 2017-06-01 13:30:14 PDT
Hmmmm.  This might be a lost cause.  I cannot seem to be able to get it to be fast on splay.  I'm going to try a few more things...
Comment 20 Filip Pizlo 2017-06-01 13:41:06 PDT
Created attachment 311752 [details]
switched to list-of-pointers instead of list-of-offsets

Still slow
Comment 21 Filip Pizlo 2017-06-01 14:35:41 PDT
Created attachment 311759 [details]
scrambled the free-list instead

This *looks* like it will be perf-neutral!
Comment 22 Filip Pizlo 2017-06-01 15:38:55 PDT
Created attachment 311767 [details]
the patch
Comment 23 Build Bot 2017-06-01 15:41:44 PDT
Attachment 311767 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/heap/MarkedAllocator.cpp:159:  More than one command on the same line  [whitespace/newline] [4]
Total errors found: 1 in 27 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 24 Filip Pizlo 2017-06-01 15:44:18 PDT
Created attachment 311769 [details]
the patch
Comment 25 Build Bot 2017-06-01 15:45:57 PDT
Attachment 311769 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/heap/MarkedAllocator.cpp:159:  More than one command on the same line  [whitespace/newline] [4]
Total errors found: 1 in 27 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 26 Filip Pizlo 2017-06-01 17:03:22 PDT
Created attachment 311783 [details]
the patch
Comment 27 Build Bot 2017-06-01 17:05:28 PDT
Attachment 311783 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/heap/MarkedAllocator.cpp:159:  More than one command on the same line  [whitespace/newline] [4]
Total errors found: 1 in 28 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 28 Mark Lam 2017-06-01 17:13:52 PDT
Comment on attachment 311783 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=311783&action=review

r=me

> Source/JavaScriptCore/ChangeLog:3
> +        GC should use a scrambled free-list

Please update to match your changed bug title.

> Source/JavaScriptCore/jit/AssemblyHelpers.h:1494
> +        if (isX86())
> +            xorPtr(Address(allocatorGPR, MarkedAllocator::offsetOfFreeList() + FreeList::offsetOfSecret()), resultGPR);
> +        else {
> +            loadPtr(Address(allocatorGPR, MarkedAllocator::offsetOfFreeList() + FreeList::offsetOfSecret()), scratchGPR);
> +            xorPtr(scratchGPR, resultGPR);
> +        }

Do we still need to have this conditional on isX86() given that you've added the xorPtr impls for ARM64 and ARMv7?  Maybe make it #if CPU(...) to be nice to other ports that don't have the xorPtr(Address src, RegisterID dest) impl yet?  Otherwise, they get a build failure.
Comment 29 Filip Pizlo 2017-06-01 17:38:54 PDT
(In reply to Mark Lam from comment #28)
> Comment on attachment 311783 [details]
> the patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=311783&action=review
> 
> r=me
> 
> > Source/JavaScriptCore/ChangeLog:3
> > +        GC should use a scrambled free-list
> 
> Please update to match your changed bug title.
> 
> > Source/JavaScriptCore/jit/AssemblyHelpers.h:1494
> > +        if (isX86())
> > +            xorPtr(Address(allocatorGPR, MarkedAllocator::offsetOfFreeList() + FreeList::offsetOfSecret()), resultGPR);
> > +        else {
> > +            loadPtr(Address(allocatorGPR, MarkedAllocator::offsetOfFreeList() + FreeList::offsetOfSecret()), scratchGPR);
> > +            xorPtr(scratchGPR, resultGPR);
> > +        }
> 
> Do we still need to have this conditional on isX86() given that you've added
> the xorPtr impls for ARM64 and ARMv7?  Maybe make it #if CPU(...) to be nice
> to other ports that don't have the xorPtr(Address src, RegisterID dest) impl
> yet?  Otherwise, they get a build failure.

We need this because of the scratch register issue: this code cannot use the scratch register because it's called from FTL patchpoints that aren't marked as clobbering the scratch register.  Most patchpoints don't behave this way.  This one is special.

The rest of this function uses "if" instead of "#if", so I just went with the style.  That forced me to implement that function.  I think that eventually we want all variants of these opcodes in all masms, so that for normal JIT code (which doesn't get called from the super special kinds of patchpoints) we can just say op(addr, reg).
Comment 30 Radar WebKit Bug Importer 2017-06-01 17:39:12 PDT
<rdar://problem/32526204>
Comment 31 Mark Lam 2017-06-01 17:54:17 PDT
(In reply to Filip Pizlo from comment #29)
> (In reply to Mark Lam from comment #28)
> > > Source/JavaScriptCore/jit/AssemblyHelpers.h:1494
> > > +        if (isX86())
> > > +            xorPtr(Address(allocatorGPR, MarkedAllocator::offsetOfFreeList() + FreeList::offsetOfSecret()), resultGPR);
> > > +        else {
> > > +            loadPtr(Address(allocatorGPR, MarkedAllocator::offsetOfFreeList() + FreeList::offsetOfSecret()), scratchGPR);
> > > +            xorPtr(scratchGPR, resultGPR);
> > > +        }
> > 
> > Do we still need to have this conditional on isX86() given that you've added
> > the xorPtr impls for ARM64 and ARMv7?  Maybe make it #if CPU(...) to be nice
> > to other ports that don't have the xorPtr(Address src, RegisterID dest) impl
> > yet?  Otherwise, they get a build failure.
> 
> We need this because of the scratch register issue: this code cannot use the
> scratch register because it's called from FTL patchpoints that aren't marked
> as clobbering the scratch register.  Most patchpoints don't behave this way.
> This one is special.
> 
> The rest of this function uses "if" instead of "#if", so I just went with
> the style.  That forced me to implement that function.  I think that
> eventually we want all variants of these opcodes in all masms, so that for
> normal JIT code (which doesn't get called from the super special kinds of
> patchpoints) we can just say op(addr, reg).

By scratch register, I presume you meant the dataTempRegister and not the scratchGPR in the code?  Perhaps a comment to document this restriction would be good because it's not obvious from the code, and someone might come along later and break this code unknowingly.
Comment 32 Filip Pizlo 2017-06-02 08:58:23 PDT
(In reply to Mark Lam from comment #31)
> (In reply to Filip Pizlo from comment #29)
> > (In reply to Mark Lam from comment #28)
> > > > Source/JavaScriptCore/jit/AssemblyHelpers.h:1494
> > > > +        if (isX86())
> > > > +            xorPtr(Address(allocatorGPR, MarkedAllocator::offsetOfFreeList() + FreeList::offsetOfSecret()), resultGPR);
> > > > +        else {
> > > > +            loadPtr(Address(allocatorGPR, MarkedAllocator::offsetOfFreeList() + FreeList::offsetOfSecret()), scratchGPR);
> > > > +            xorPtr(scratchGPR, resultGPR);
> > > > +        }
> > > 
> > > Do we still need to have this conditional on isX86() given that you've added
> > > the xorPtr impls for ARM64 and ARMv7?  Maybe make it #if CPU(...) to be nice
> > > to other ports that don't have the xorPtr(Address src, RegisterID dest) impl
> > > yet?  Otherwise, they get a build failure.
> > 
> > We need this because of the scratch register issue: this code cannot use the
> > scratch register because it's called from FTL patchpoints that aren't marked
> > as clobbering the scratch register.  Most patchpoints don't behave this way.
> > This one is special.
> > 
> > The rest of this function uses "if" instead of "#if", so I just went with
> > the style.  That forced me to implement that function.  I think that
> > eventually we want all variants of these opcodes in all masms, so that for
> > normal JIT code (which doesn't get called from the super special kinds of
> > patchpoints) we can just say op(addr, reg).
> 
> By scratch register, I presume you meant the dataTempRegister and not the
> scratchGPR in the code?  Perhaps a comment to document this restriction
> would be good because it's not obvious from the code, and someone might come
> along later and break this code unknowingly.

There already is a comment:

        // NOTE: This is carefully written so that we can call it while we disallow scratch
        // register usage.
Comment 33 Filip Pizlo 2017-06-02 09:00:09 PDT
Landed in http://trac.webkit.org/changeset/217711/webkit
Comment 34 Ryan Haddad 2017-06-02 10:52:25 PDT
iOS build fix in https://trac.webkit.org/changeset/217717/webkit
Comment 35 Csaba Osztrogonác 2017-06-03 06:44:49 PDT
And ARM buildfix landed in https://trac.webkit.org/changeset/217756/webkit
Comment 36 Guillaume Emont 2017-06-09 11:27:20 PDT
Build fix for MIPS provided in bug#173170