Bug 212106

Summary: [Wasm] Reduce the amount of memory used by the Air register coloring allocator
Product: WebKit Reporter: Michael Saboff <msaboff>
Component: JavaScriptCoreAssignee: Michael Saboff <msaboff>
Status: RESOLVED FIXED    
Severity: Normal CC: ews-watchlist, keith_miller, mark.lam, sbarati, tzagallo, webkit-bug-importer, ysuzuki
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch with Build Fixes
ysuzuki: review+
Patch for Landing. Changed to using a templated class. none

Description Michael Saboff 2020-05-19 13:52:25 PDT
For large Wasm functions, the graph coloring register allocator uses O(N^2) space where N is the number of temps in the Air IR.  We need to reduce the space required by large functions.
Comment 1 Michael Saboff 2020-05-20 10:53:47 PDT
<rdar://problem/63454420>
Comment 2 Michael Saboff 2020-06-11 04:50:14 PDT
Created attachment 401633 [details]
Patch
Comment 3 Michael Saboff 2020-06-11 05:00:08 PDT
Created attachment 401634 [details]
Patch with Build Fixes
Comment 4 Yusuke Suzuki 2020-06-12 19:00:36 PDT
Comment on attachment 401634 [details]
Patch with Build Fixes

View in context: https://bugs.webkit.org/attachment.cgi?id=401634&action=review

r=me with comment.

> Source/JavaScriptCore/b3/air/AirAllocateRegistersByGraphColoring.cpp:191
> +class InterferenceEdge16Bit {
> +protected:
> +    using IndexType = unsigned;
> +
> +public:
> +    struct InterferenceEdgeHash {
> +        static unsigned hash(const InterferenceEdge16Bit& key) { return key.hash(); }
> +        static bool equal(const InterferenceEdge16Bit& a, const InterferenceEdge16Bit& b) { return a == b; }
> +        static constexpr bool safeToCompareToEmptyOrDeleted = true;
> +    };
> +    typedef SimpleClassHashTraits<InterferenceEdge16Bit> InterferenceEdgeHashTraits;
> +
> +    typedef HashSet<InterferenceEdge16Bit, InterferenceEdgeHash, InterferenceEdgeHashTraits> InterferenceSet;
> +
> +    InterferenceEdge16Bit()
> +    {
> +    }
> +
> +    InterferenceEdge16Bit(IndexType a, IndexType b)
> +    {
> +        ASSERT(a);
> +        ASSERT(b);
> +        ASSERT(a < std::numeric_limits<uint16_t>::max());
> +        ASSERT(b < std::numeric_limits<uint16_t>::max());
> +        ASSERT_WITH_MESSAGE(a != b, "A Tmp can never interfere with itself. Doing so would force it to be the superposition of two registers.");
> +
> +        if (b < a)
> +            std::swap(a, b);
> +        m_value = static_cast<uint32_t>(a) << 16 | b;
> +    }
> +
> +    InterferenceEdge16Bit(WTF::HashTableDeletedValueType)
> +        : m_value(std::numeric_limits<uint32_t>::max())
> +    {
> +    }
> +
> +    IndexType first() const
> +    {
> +        return m_value >> 16 & 0xffff;
> +    }
> +
> +    IndexType second() const
> +    {
> +        return m_value & 0xffff;
> +    }
> +
> +    bool operator==(const InterferenceEdge16Bit& other) const
> +    {
> +        return m_value == other.m_value;
> +    }
> +
> +    bool isHashTableDeletedValue() const
> +    {
> +        return *this == InterferenceEdge16Bit(WTF::HashTableDeletedValue);
> +    }
> +
> +    unsigned hash() const
> +    {
> +        return WTF::IntHash<uint32_t>::hash(m_value);
> +    }
> +
> +    void dump(PrintStream& out) const
> +    {
> +        out.print(first(), "<=>", second());
> +    }
> +
> +private:
> +    uint32_t m_value { 0 };
> +};
> +
> +class InterferenceEdge32Bit {
> +protected:
> +    using IndexType = unsigned;
> +
> +public:
> +    struct InterferenceEdgeHash {
> +        static unsigned hash(const InterferenceEdge32Bit& key) { return key.hash(); }
> +        static bool equal(const InterferenceEdge32Bit& a, const InterferenceEdge32Bit& b) { return a == b; }
> +        static constexpr bool safeToCompareToEmptyOrDeleted = true;
> +    };
> +    typedef SimpleClassHashTraits<InterferenceEdge32Bit> InterferenceEdgeHashTraits;
> +
> +    typedef HashSet<InterferenceEdge32Bit, InterferenceEdgeHash, InterferenceEdgeHashTraits> InterferenceSet;
> +
> +    InterferenceEdge32Bit()
> +    {
> +    }
> +
> +    InterferenceEdge32Bit(IndexType a, IndexType b)
> +    {
> +        ASSERT(a);
> +        ASSERT(b);
> +        ASSERT_WITH_MESSAGE(a != b, "A Tmp can never interfere with itself. Doing so would force it to be the superposition of two registers.");
> +
> +        if (b < a)
> +            std::swap(a, b);
> +        m_value = static_cast<uint64_t>(a) << 32 | b;
> +    }
> +
> +    InterferenceEdge32Bit(WTF::HashTableDeletedValueType)
> +        : m_value(std::numeric_limits<uint64_t>::max())
> +    {
> +    }
> +
> +    IndexType first() const
> +    {
> +        return m_value >> 32 & 0xffffffff;
> +    }
> +
> +    IndexType second() const
> +    {
> +        return m_value & 0xffffffff;
> +    }
> +
> +    bool operator==(const InterferenceEdge32Bit& other) const
> +    {
> +        return m_value == other.m_value;
> +    }
> +
> +    bool isHashTableDeletedValue() const
> +    {
> +        return *this == InterferenceEdge32Bit(WTF::HashTableDeletedValue);
> +    }
> +
> +    unsigned hash() const
> +    {
> +        return WTF::IntHash<uint64_t>::hash(m_value);
> +    }
> +
> +    void dump(PrintStream& out) const
> +    {
> +        out.print(first(), "<=>", second());
> +    }
> +
> +private:
> +    uint64_t m_value { 0 };
> +};

The implementation looks like very similar. Can we make them templatized one? Like,

template<typename IndexType, typename PairStorage>
class InterferenceEdge {
    static_assert(sizeof(IndexType) * 2 == sizeof(PairStorage));
    ...
};

using InferenceEdge16 = InterferenceEdge<uint16_t, uint32_t>;
using InferenceEdge32 = InterferenceEdge<uint32_t, uint64_t>;
Comment 5 Michael Saboff 2020-06-12 21:23:26 PDT
(In reply to Yusuke Suzuki from comment #4)

Thanks for the review.

> Comment on attachment 401634 [details]
> Patch with Build Fixes
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=401634&action=review
> 
> r=me with comment.
> 
> > Source/JavaScriptCore/b3/air/AirAllocateRegistersByGraphColoring.cpp:191
> > +class InterferenceEdge16Bit {
> > +protected:
> > +    using IndexType = unsigned;
> > +
> > +public:
> > +    struct InterferenceEdgeHash {
> > +        static unsigned hash(const InterferenceEdge16Bit& key) { return key.hash(); }
> > +        static bool equal(const InterferenceEdge16Bit& a, const InterferenceEdge16Bit& b) { return a == b; }
> > +        static constexpr bool safeToCompareToEmptyOrDeleted = true;
> > +    };
> > +    typedef SimpleClassHashTraits<InterferenceEdge16Bit> InterferenceEdgeHashTraits;
> > +
> > +    typedef HashSet<InterferenceEdge16Bit, InterferenceEdgeHash, InterferenceEdgeHashTraits> InterferenceSet;
> > +
> > +    InterferenceEdge16Bit()
> > +    {
> > +    }
> > +
> > +    InterferenceEdge16Bit(IndexType a, IndexType b)
> > +    {
> > +        ASSERT(a);
> > +        ASSERT(b);
> > +        ASSERT(a < std::numeric_limits<uint16_t>::max());
> > +        ASSERT(b < std::numeric_limits<uint16_t>::max());
> > +        ASSERT_WITH_MESSAGE(a != b, "A Tmp can never interfere with itself. Doing so would force it to be the superposition of two registers.");
> > +
> > +        if (b < a)
> > +            std::swap(a, b);
> > +        m_value = static_cast<uint32_t>(a) << 16 | b;
> > +    }
> > +
> > +    InterferenceEdge16Bit(WTF::HashTableDeletedValueType)
> > +        : m_value(std::numeric_limits<uint32_t>::max())
> > +    {
> > +    }
> > +
> > +    IndexType first() const
> > +    {
> > +        return m_value >> 16 & 0xffff;
> > +    }
> > +
> > +    IndexType second() const
> > +    {
> > +        return m_value & 0xffff;
> > +    }
> > +
> > +    bool operator==(const InterferenceEdge16Bit& other) const
> > +    {
> > +        return m_value == other.m_value;
> > +    }
> > +
> > +    bool isHashTableDeletedValue() const
> > +    {
> > +        return *this == InterferenceEdge16Bit(WTF::HashTableDeletedValue);
> > +    }
> > +
> > +    unsigned hash() const
> > +    {
> > +        return WTF::IntHash<uint32_t>::hash(m_value);
> > +    }
> > +
> > +    void dump(PrintStream& out) const
> > +    {
> > +        out.print(first(), "<=>", second());
> > +    }
> > +
> > +private:
> > +    uint32_t m_value { 0 };
> > +};
> > +
> > +class InterferenceEdge32Bit {
> > +protected:
> > +    using IndexType = unsigned;
> > +
> > +public:
> > +    struct InterferenceEdgeHash {
> > +        static unsigned hash(const InterferenceEdge32Bit& key) { return key.hash(); }
> > +        static bool equal(const InterferenceEdge32Bit& a, const InterferenceEdge32Bit& b) { return a == b; }
> > +        static constexpr bool safeToCompareToEmptyOrDeleted = true;
> > +    };
> > +    typedef SimpleClassHashTraits<InterferenceEdge32Bit> InterferenceEdgeHashTraits;
> > +
> > +    typedef HashSet<InterferenceEdge32Bit, InterferenceEdgeHash, InterferenceEdgeHashTraits> InterferenceSet;
> > +
> > +    InterferenceEdge32Bit()
> > +    {
> > +    }
> > +
> > +    InterferenceEdge32Bit(IndexType a, IndexType b)
> > +    {
> > +        ASSERT(a);
> > +        ASSERT(b);
> > +        ASSERT_WITH_MESSAGE(a != b, "A Tmp can never interfere with itself. Doing so would force it to be the superposition of two registers.");
> > +
> > +        if (b < a)
> > +            std::swap(a, b);
> > +        m_value = static_cast<uint64_t>(a) << 32 | b;
> > +    }
> > +
> > +    InterferenceEdge32Bit(WTF::HashTableDeletedValueType)
> > +        : m_value(std::numeric_limits<uint64_t>::max())
> > +    {
> > +    }
> > +
> > +    IndexType first() const
> > +    {
> > +        return m_value >> 32 & 0xffffffff;
> > +    }
> > +
> > +    IndexType second() const
> > +    {
> > +        return m_value & 0xffffffff;
> > +    }
> > +
> > +    bool operator==(const InterferenceEdge32Bit& other) const
> > +    {
> > +        return m_value == other.m_value;
> > +    }
> > +
> > +    bool isHashTableDeletedValue() const
> > +    {
> > +        return *this == InterferenceEdge32Bit(WTF::HashTableDeletedValue);
> > +    }
> > +
> > +    unsigned hash() const
> > +    {
> > +        return WTF::IntHash<uint64_t>::hash(m_value);
> > +    }
> > +
> > +    void dump(PrintStream& out) const
> > +    {
> > +        out.print(first(), "<=>", second());
> > +    }
> > +
> > +private:
> > +    uint64_t m_value { 0 };
> > +};
> 
> The implementation looks like very similar. Can we make them templatized
> one? Like,
> 
> template<typename IndexType, typename PairStorage>
> class InterferenceEdge {
>     static_assert(sizeof(IndexType) * 2 == sizeof(PairStorage));
>     ...
> };
> 
> using InferenceEdge16 = InterferenceEdge<uint16_t, uint32_t>;
> using InferenceEdge32 = InterferenceEdge<uint32_t, uint64_t>;

I did consider templated these classes.  I have 2 minor concerns.

 1. Templated classes could make it a little harder to read given that in addition to the two template parameters, the constants for shifting and masking will need to use derived values.

 2. These types are fed into a templated class hierarchy.  It is already difficult to understand and debug that hierarchy.  Making these templates could make that worse.

I'll give it a try and see what it looks like.
Comment 6 Michael Saboff 2020-06-16 17:25:30 PDT
Created attachment 402060 [details]
Patch for Landing.  Changed to using a templated class.
Comment 7 Michael Saboff 2020-06-17 07:41:18 PDT
Committed r263146: <https://trac.webkit.org/changeset/263146>