Bug 218406 - [Concurrent display lists] Encode display list items directly into shared memory
Summary: [Concurrent display lists] Encode display list items directly into shared memory
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Layout and Rendering (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Wenson Hsieh
URL:
Keywords: InRadar
Depends on: 218516 218517 218518
Blocks: 218425 218586
  Show dependency treegraph
 
Reported: 2020-10-30 16:06 PDT by Wenson Hsieh
Modified: 2020-11-06 12:10 PST (History)
17 users (show)

See Also:


Attachments
Part 1 WIP (367.81 KB, patch)
2020-10-30 16:54 PDT, Wenson Hsieh
no flags Details | Formatted Diff | Diff
More WIP (address Sam's feedback and try to fix builds) (381.42 KB, patch)
2020-11-02 15:30 PST, Wenson Hsieh
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
More WIP (feedback + more build fixes) (386.16 KB, patch)
2020-11-02 16:11 PST, Wenson Hsieh
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
More WIP (more build fixes) (386.29 KB, patch)
2020-11-02 16:35 PST, Wenson Hsieh
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
More WIP (387.62 KB, patch)
2020-11-02 17:10 PST, Wenson Hsieh
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
More WIP (387.50 KB, patch)
2020-11-02 17:37 PST, Wenson Hsieh
no flags Details | Formatted Diff | Diff
More WIP (358.56 KB, patch)
2020-11-03 16:04 PST, Wenson Hsieh
no flags Details | Formatted Diff | Diff
More WIP (360.01 KB, patch)
2020-11-03 17:34 PST, Wenson Hsieh
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
More WIP (365.21 KB, patch)
2020-11-03 17:50 PST, Wenson Hsieh
no flags Details | Formatted Diff | Diff
Rebase on trunk (361.64 KB, patch)
2020-11-05 12:47 PST, Wenson Hsieh
thorton: review+
Details | Formatted Diff | Diff
Address feedback (371.41 KB, patch)
2020-11-05 16:05 PST, Wenson Hsieh
no flags Details | Formatted Diff | Diff
Rebase on trunk (after r269503) (372.01 KB, patch)
2020-11-06 09:12 PST, Wenson Hsieh
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
Add missing NO_RETURN_DUE_TO_ASSERT after rebase (372.12 KB, patch)
2020-11-06 09:21 PST, Wenson Hsieh
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Wenson Hsieh 2020-10-30 16:06:24 PDT
The first step towards allowing the GPU and web processes to read and write display list items concurrently.

(more details coming soon)
Comment 1 Wenson Hsieh 2020-10-30 16:54:14 PDT
Created attachment 412811 [details]
Part 1 WIP
Comment 2 Sam Weinig 2020-11-01 10:48:28 PST
Comment on attachment 412811 [details]
Part 1 WIP

View in context: https://bugs.webkit.org/attachment.cgi?id=412811&action=review

I realize this is a WIP, but I thought I would give some initial feedback. Looking really good!

Given how composable these parts are, I think it would be useful to write some unit tests for things like ItemBuffer and for the round trip encoding/decoding of display list items in TestWebKitAPI.

> Source/WebCore/platform/graphics/displaylists/DisplayList.h:81
> +    WEBCORE_EXPORT void map(Function<void(ItemHandle, Optional<FloatRect>&&)>&&) const;

map feels a bit to abstract of a name for this to me, though if others like it I won't object. A more c++'y way of doing this would be to expose a pair of iterators (or a WTF::IteratorRange) that a client can use.

> Source/WebCore/platform/graphics/displaylists/DisplayListItemBuffer.h:94
> +    ItemType type() const
> +    {
> +        return static_cast<ItemType>(data[0]);
> +    }

I would add a comment here explaining this and in general make sure the data layout is as well documented as possible, as this is where things will get tricky.

> Source/WebKit/GPUProcess/graphics/RemoteImageBuffer.h:103
> +            if (auto item = IPC::Decoder::decodeObject<WebCore::DisplayList::ClipOutToPath>(data, length)) {
> +                new (handle.data + sizeof(WebCore::DisplayList::ItemType)) WebCore::DisplayList::ClipOutToPath(WTFMove(*item));
> +                return true;
> +            }
> +            return false;

I would replace this repeated pattern with a helper like:

template<typename T> bool decodeAndCreate(const uint8_t* data, size_t length, WebCore::DisplayList::ItemHandle& handle)
{
     if (auto item = IPC::Decoder::decodeObject<T>(data, length)) {
        new (handle.data + sizeof(WebCore::DisplayList::ItemType)) T(WTFMove(*item));
        return true;
    }
    return false;
}

Ideally, the whole function should probably return an Optional<WebCore::DisplayList::ItemHandle> as we have been trying to avoid out parameters.
Comment 3 Wenson Hsieh 2020-11-01 11:21:59 PST
Comment on attachment 412811 [details]
Part 1 WIP

View in context: https://bugs.webkit.org/attachment.cgi?id=412811&action=review

Thanks for taking a look! While this isn't quite ready for review, comments are certainly welcome and appreciated :)

I actually wrote a small standalone program to test some of the new display list functionality (namely by appending lots of random items and then dumping the results/sizes) that just links against WebCore. I should be able to adapt this into an API test suite, but I might want to split that out into a separate patch for review, since this part is already so huge :(

I think the WebKit encoding/decoding parts are more difficult to test in a standalone manner like this, but I'll look into ways to test it too.

>> Source/WebCore/platform/graphics/displaylists/DisplayList.h:81
>> +    WEBCORE_EXPORT void map(Function<void(ItemHandle, Optional<FloatRect>&&)>&&) const;
> 
> map feels a bit to abstract of a name for this to me, though if others like it I won't object. A more c++'y way of doing this would be to expose a pair of iterators (or a WTF::IteratorRange) that a client can use.

Sounds good — I'll give this a try.

>> Source/WebCore/platform/graphics/displaylists/DisplayListItemBuffer.h:94
>> +    }
> 
> I would add a comment here explaining this and in general make sure the data layout is as well documented as possible, as this is where things will get tricky.

Good call! Added.

>> Source/WebKit/GPUProcess/graphics/RemoteImageBuffer.h:103
>> +            return false;
> 
> I would replace this repeated pattern with a helper like:
> 
> template<typename T> bool decodeAndCreate(const uint8_t* data, size_t length, WebCore::DisplayList::ItemHandle& handle)
> {
>      if (auto item = IPC::Decoder::decodeObject<T>(data, length)) {
>         new (handle.data + sizeof(WebCore::DisplayList::ItemType)) T(WTFMove(*item));
>         return true;
>     }
>     return false;
> }
> 
> Ideally, the whole function should probably return an Optional<WebCore::DisplayList::ItemHandle> as we have been trying to avoid out parameters.

👍🏻 will refactor this to return Optional<WebCore::DisplayList::ItemHandle>.
Comment 4 Geoffrey Garen 2020-11-02 14:45:41 PST
Comment on attachment 412811 [details]
Part 1 WIP

View in context: https://bugs.webkit.org/attachment.cgi?id=412811&action=review

> Source/WebCore/platform/graphics/displaylists/DisplayList.cpp:188
> +    auto iterateItemsInRange = [&] (uint8_t* initialOffset, size_t length) {

There are *much* faster ways to iterate and execute a bytecode. Let's keep our eyes out for evidence of this function being a bottleneck. If it is, we can make it faster. (But maybe it's fine, and all our time will be in graphics code instead.)

> Source/WebCore/platform/graphics/displaylists/DisplayList.cpp:199
> +            if (client && client->requiresDecoding(itemType)) {

This kind of virtual function overhead is one example of needless cost.

> Source/WebCore/platform/graphics/displaylists/DisplayList.cpp:203
> +                    : reinterpret_cast<uint8_t*>(malloc(sizeOfTypeAndItem));

Let's use fastMalloc.

> Source/WebCore/platform/graphics/displaylists/DisplayListItemBuffer.cpp:160
> +void ItemHandle::destroy()
> +{
> +    switch (type()) {
> +    case ItemType::ClipOutToPath:
> +        return get<ClipOutToPath>().~ClipOutToPath();

This kind of destruction pattern is not great.

It's poor encapsulation because, if someone writes a new destructor and doesn't update this function, the destructor is broken and the error is not caught at compile time. Perhaps you can fix this by adding some kind of static_assert in the default case that a class has a deleted destructor.

It's also not great for performance that destruction must execute an independent switch statement. Perhaps you can combine decoding, mapped function execution, and destruction into a single member function in each display list item. Then the interpreter loop would just read the ItemType and call ItemType::doItAll(bytes, function). Inside doItAll(), any auxiliary data could be allocated as a local variable.

> Source/WebCore/platform/graphics/displaylists/DisplayListItemBuffer.cpp:299
> +    auto* buffer = reinterpret_cast<uint8_t*>(malloc(newBufferCapacity));

Ditto.
Comment 5 Wenson Hsieh 2020-11-02 15:30:50 PST
Created attachment 412979 [details]
More WIP (address Sam's feedback and try to fix builds)
Comment 6 Wenson Hsieh 2020-11-02 16:11:01 PST
Created attachment 412985 [details]
More WIP (feedback + more build fixes)
Comment 7 Wenson Hsieh 2020-11-02 16:11:58 PST
Comment on attachment 412811 [details]
Part 1 WIP

View in context: https://bugs.webkit.org/attachment.cgi?id=412811&action=review

>> Source/WebCore/platform/graphics/displaylists/DisplayList.cpp:188
>> +    auto iterateItemsInRange = [&] (uint8_t* initialOffset, size_t length) {
> 
> There are *much* faster ways to iterate and execute a bytecode. Let's keep our eyes out for evidence of this function being a bottleneck. If it is, we can make it faster. (But maybe it's fine, and all our time will be in graphics code instead.)

Ah, so I ended up rewriting this code to use C++ iterators, per Sam's earlier comment (though I think this comment still applies to my latest WIP patch that uses iterators). My gut feeling here is that the vast majority of work in these benchmarks is spent in other (graphics-related) parts of the code, though it's quite possible that we can squeeze a percent or so by tweaking the way we iterate.

>> Source/WebCore/platform/graphics/displaylists/DisplayList.cpp:199
>> +            if (client && client->requiresDecoding(itemType)) {
> 
> This kind of virtual function overhead is one example of needless cost.

So I introduced these client objects to keep as much of the encoding/decoding logic in WebKit2, and out of WebCore. I might be able to elide these virtual function calls by changing the client hooks a bit, so that the client (i.e. WebKit2) simply requests that "all inline items should require custom encoding/decoding". That said, I'm not sure if the overall cost of having these virtual client hooks is worth optimizing……at least, not yet :)

>> Source/WebCore/platform/graphics/displaylists/DisplayList.cpp:203
>> +                    : reinterpret_cast<uint8_t*>(malloc(sizeOfTypeAndItem));
> 
> Let's use fastMalloc.

Sounds good! Changed to use fastMalloc and fastFree.

>> Source/WebCore/platform/graphics/displaylists/DisplayListItemBuffer.cpp:160
>> +        return get<ClipOutToPath>().~ClipOutToPath();
> 
> This kind of destruction pattern is not great.
> 
> It's poor encapsulation because, if someone writes a new destructor and doesn't update this function, the destructor is broken and the error is not caught at compile time. Perhaps you can fix this by adding some kind of static_assert in the default case that a class has a deleted destructor.
> 
> It's also not great for performance that destruction must execute an independent switch statement. Perhaps you can combine decoding, mapped function execution, and destruction into a single member function in each display list item. Then the interpreter loop would just read the ItemType and call ItemType::doItAll(bytes, function). Inside doItAll(), any auxiliary data could be allocated as a local variable.

I intentionally avoided using `default` switch cases for this reason (such that forgetting to update this function when introducing a new DL item type will result in a compiler error).

That said, it's a good point that someone could still put a OOL item type in the wrong place and forget to call the destructor, and then we'd end up with mysterious leaks! I'll add static assertions here to check that the destructor is trivial in all the cases where we skip calling it. I also added static assertions to verify that every drawing item marked inline (T::isInlineItem is true) is also trivially destructible.

>> Source/WebCore/platform/graphics/displaylists/DisplayListItemBuffer.cpp:299
>> +    auto* buffer = reinterpret_cast<uint8_t*>(malloc(newBufferCapacity));
> 
> Ditto.

Fixed!
Comment 8 Wenson Hsieh 2020-11-02 16:35:13 PST Comment hidden (obsolete)
Comment 9 Geoffrey Garen 2020-11-02 16:43:45 PST
> >> Source/WebCore/platform/graphics/displaylists/DisplayList.cpp:199
> >> +            if (client && client->requiresDecoding(itemType)) {
> > 
> > This kind of virtual function overhead is one example of needless cost.
> 
> So I introduced these client objects to keep as much of the
> encoding/decoding logic in WebKit2, and out of WebCore. I might be able to
> elide these virtual function calls by changing the client hooks a bit, so
> that the client (i.e. WebKit2) simply requests that "all inline items should
> require custom encoding/decoding". That said, I'm not sure if the overall
> cost of having these virtual client hooks is worth optimizing……at least, not
> yet :)

Setting aside cost for a moment, what is the purpose of all this ceremony? Do we intend to support multiple clients doing arbitrary things that radically transform display list behavior?

It's pretty obfuscatory and anti-encapsulation for core functionality like reading and writing a display list item to run through an abstract client interface. Perhaps an abstract client should allocate/deallocate the underlying buffer and do the wakeup IPC. But abstracting core item functionality through a client interface makes things hard to understand. (And, returning to cost for a moment, much harder to optimize.)

I remember the bad old days when WebCore called out to WebKit for most of its core functionality, including all loading, through abstract client interfaces, and then the only client (WebKit) just turned around and called right back into WebCore. I see some of that kind of design here. For example, ItemBufferWritingClient::requiresEncoding has only one implementation, and it just calls back into WebCore::DisplayList. So much ceremony to end up right back where you started!
Comment 10 Wenson Hsieh 2020-11-02 16:50:44 PST
(In reply to Geoffrey Garen from comment #9)
> > >> Source/WebCore/platform/graphics/displaylists/DisplayList.cpp:199
> > >> +            if (client && client->requiresDecoding(itemType)) {
> > > 
> > > This kind of virtual function overhead is one example of needless cost.
> > 
> > So I introduced these client objects to keep as much of the
> > encoding/decoding logic in WebKit2, and out of WebCore. I might be able to
> > elide these virtual function calls by changing the client hooks a bit, so
> > that the client (i.e. WebKit2) simply requests that "all inline items should
> > require custom encoding/decoding". That said, I'm not sure if the overall
> > cost of having these virtual client hooks is worth optimizing……at least, not
> > yet :)
> 
> Setting aside cost for a moment, what is the purpose of all this ceremony?
> Do we intend to support multiple clients doing arbitrary things that
> radically transform display list behavior?
> 
> It's pretty obfuscatory and anti-encapsulation for core functionality like
> reading and writing a display list item to run through an abstract client
> interface. Perhaps an abstract client should allocate/deallocate the
> underlying buffer and do the wakeup IPC. But abstracting core item
> functionality through a client interface makes things hard to understand.
> (And, returning to cost for a moment, much harder to optimize.)
> 
> I remember the bad old days when WebCore called out to WebKit for most of
> its core functionality, including all loading, through abstract client
> interfaces, and then the only client (WebKit) just turned around and called
> right back into WebCore. I see some of that kind of design here. For
> example, ItemBufferWritingClient::requiresEncoding has only one
> implementation, and it just calls back into WebCore::DisplayList. So much
> ceremony to end up right back where you started!

That's true :P I agree the round trip just to tell WebCore that non-inline items require encoding is pretty unnecessary. That said, I think the client hook to actually do the encoding/decoding does need to be there, since WebCore shouldn't know about IPC/Encoder things.

I'll remove the client hooks for `requiresEncoding` and `requiresDecoding`, and replace it with a way for WebKit2 to simply tell the display list that it should delegate out to the client for non-inline items.
Comment 11 Wenson Hsieh 2020-11-02 17:02:32 PST
(In reply to Wenson Hsieh from comment #10)
> (In reply to Geoffrey Garen from comment #9)
> > > >> Source/WebCore/platform/graphics/displaylists/DisplayList.cpp:199
> > > >> +            if (client && client->requiresDecoding(itemType)) {
> > > > 
> > > > This kind of virtual function overhead is one example of needless cost.
> > > 
> > > So I introduced these client objects to keep as much of the
> > > encoding/decoding logic in WebKit2, and out of WebCore. I might be able to
> > > elide these virtual function calls by changing the client hooks a bit, so
> > > that the client (i.e. WebKit2) simply requests that "all inline items should
> > > require custom encoding/decoding". That said, I'm not sure if the overall
> > > cost of having these virtual client hooks is worth optimizing……at least, not
> > > yet :)
> > 
> > Setting aside cost for a moment, what is the purpose of all this ceremony?
> > Do we intend to support multiple clients doing arbitrary things that
> > radically transform display list behavior?
> > 
> > It's pretty obfuscatory and anti-encapsulation for core functionality like
> > reading and writing a display list item to run through an abstract client
> > interface. Perhaps an abstract client should allocate/deallocate the
> > underlying buffer and do the wakeup IPC. But abstracting core item
> > functionality through a client interface makes things hard to understand.
> > (And, returning to cost for a moment, much harder to optimize.)
> > 
> > I remember the bad old days when WebCore called out to WebKit for most of
> > its core functionality, including all loading, through abstract client
> > interfaces, and then the only client (WebKit) just turned around and called
> > right back into WebCore. I see some of that kind of design here. For
> > example, ItemBufferWritingClient::requiresEncoding has only one
> > implementation, and it just calls back into WebCore::DisplayList. So much
> > ceremony to end up right back where you started!
> 
> That's true :P I agree the round trip just to tell WebCore that non-inline
> items require encoding is pretty unnecessary. That said, I think the client
> hook to actually do the encoding/decoding does need to be there, since
> WebCore shouldn't know about IPC/Encoder things.
> 
> I'll remove the client hooks for `requiresEncoding` and `requiresDecoding`,
> and replace it with a way for WebKit2 to simply tell the display list that
> it should delegate out to the client for non-inline items.

(Or perhaps, the presence of the client itself can mean "ask the client to encode/decode OOL items").
Comment 12 Wenson Hsieh 2020-11-02 17:10:53 PST Comment hidden (obsolete)
Comment 13 Simon Fraser (smfr) 2020-11-02 17:28:26 PST
Comment on attachment 412991 [details]
More WIP

View in context: https://bugs.webkit.org/attachment.cgi?id=412991&action=review

Can you split out the non-virtual DisplayList item changes from the rest?

> Source/WebCore/platform/graphics/displaylists/DisplayList.cpp:115
> +    stream << " extent ";
> +    if (extent)
> +        stream << *extent;
> +    else
> +        stream << "unknown";
> +}

You can just pipe in the Optional<> and it will print "null opt" if not set.

> Source/WebCore/platform/graphics/displaylists/DisplayListItemBuffer.h:187
> +        if (!T::isInlineItem && m_writingClient) {
> +            static uint8_t itemBuffer[sizeOfTypeAndItem];
> +            itemBuffer[0] = static_cast<uint8_t>(T::itemType);
> +            new (itemBuffer + sizeof(ItemType)) T(std::forward<Args>(args)...);
> +            if (auto sharedBuffer = m_writingClient->encodeItem({ itemBuffer })) {
> +                auto maxAdditionalCapacityForItem = sharedBuffer->size() + sizeof(ItemType) + sizeof(size_t) + alignof(uint64_t);
> +                if (m_writtenNumberOfBytes + maxAdditionalCapacityForItem > m_writableBuffer.capacity) {
> +                    auto lastWritableBuffer = std::exchange(m_writableBuffer, { });
> +                    lastWritableBuffer.capacity = std::exchange(m_writtenNumberOfBytes, 0);
> +                    if (lastWritableBuffer.capacity)
> +                        m_readOnlyBuffers.append(WTFMove(lastWritableBuffer));
> +                    m_writableBuffer = createItemBuffer(maxAdditionalCapacityForItem);
> +                }
> +
> +                m_writableBuffer.data[m_writtenNumberOfBytes] = static_cast<uint8_t>(T::itemType);
> +                auto* startOfDataLength = m_writableBuffer.data + m_writtenNumberOfBytes + sizeof(ItemType);
> +                reinterpret_cast<size_t*>(startOfDataLength)[0] = sharedBuffer->size();
> +
> +                auto* startOfPadding = startOfDataLength + sizeof(size_t);
> +                size_t padding = (alignof(uint64_t) - reinterpret_cast<uintptr_t>(startOfPadding) % alignof(uint64_t)) % alignof(uint64_t);
> +                auto* startOfData = startOfDataLength + sizeof(size_t) + padding;
> +                memcpy(startOfData, sharedBuffer->dataAsUInt8Ptr(), sharedBuffer->size());
> +
> +                m_writtenNumberOfBytes += sizeof(ItemType) + sizeof(size_t) + padding + sharedBuffer->size();
> +            }
> +            return;
> +        }
> +
> +        if (m_writtenNumberOfBytes + sizeOfTypeAndItem > m_writableBuffer.capacity) {
> +            auto lastWritableBuffer = std::exchange(m_writableBuffer, { });
> +            lastWritableBuffer.capacity = std::exchange(m_writtenNumberOfBytes, 0);
> +            if (lastWritableBuffer.capacity)
> +                m_readOnlyBuffers.append(WTFMove(lastWritableBuffer));
> +            m_writableBuffer = createItemBuffer(sizeOfTypeAndItem);
> +        }
> +
> +        auto* startOfItem = &m_writableBuffer.data[m_writtenNumberOfBytes];
> +        startOfItem[0] = static_cast<uint8_t>(T::itemType);
> +        new (startOfItem + sizeof(ItemType)) T(std::forward<Args>(args)...);
> +        m_writtenNumberOfBytes += sizeof(ItemType) + sizeOfItemInBytes(T::itemType);
> +
> +        if (!T::isInlineItem)
> +            m_nonInlineItemsInAllocatedBuffers.append({ startOfItem });

It would be nice to pull as much of this code as possible out of the template. In general it would be best if the buffer management were isolated from code that knows about display list items.
Comment 14 Wenson Hsieh 2020-11-02 17:37:45 PST Comment hidden (obsolete)
Comment 15 Geoffrey Garen 2020-11-02 17:54:40 PST
> That's true :P I agree the round trip just to tell WebCore that non-inline
> items require encoding is pretty unnecessary. That said, I think the client
> hook to actually do the encoding/decoding does need to be there, since
> WebCore shouldn't know about IPC/Encoder things.

But WebCore already knows about IPC/Encoder things! All the display list items have encode() and decode() functions in them.

So, the real solution to this dilemma in the end is to stop treating out-of-line display list items as CoreIPC things, and start treating them as things the DisplayList class knows how to serialize / deserialize into a memory buffer. Perhaps we can't get all the way there in this patch; but that's what we want in the end.

> I'll remove the client hooks for `requiresEncoding` and `requiresDecoding`,
> and replace it with a way for WebKit2 to simply tell the display list that
> it should delegate out to the client for non-inline items.

I don't think this should be a setting.

WebCore's display list should be in charge of its own architecture. If some items are inline, and some are out of line, and we need the client to provide some special functionality for some out of line items, OK. But that should all be driven by the display list, and not by an abstract setting that, if fully traced through lots of hoops, one eventually discovers to always by 'true'.
Comment 16 Wenson Hsieh 2020-11-02 19:10:47 PST
Comment on attachment 412991 [details]
More WIP

View in context: https://bugs.webkit.org/attachment.cgi?id=412991&action=review

> Can you split out the non-virtual DisplayList item changes from the rest?

This patch was already my attempt to split the non-virtual DL item changes away from the rest :/

I'll see if I can split away more (e.g. the IPC Decoder adjustment), but most of the pieces that I could be split away from this patch are going to be very small, relative to the size of this patch.

>> Source/WebCore/platform/graphics/displaylists/DisplayList.cpp:115
>> +}
> 
> You can just pipe in the Optional<> and it will print "null opt" if not set.

Done!

>> Source/WebCore/platform/graphics/displaylists/DisplayListItemBuffer.h:187
>> +            m_nonInlineItemsInAllocatedBuffers.append({ startOfItem });
> 
> It would be nice to pull as much of this code as possible out of the template. In general it would be best if the buffer management were isolated from code that knows about display list items.

Sounds good — I think I can pull some of the "out-of-capacity" logic into a private helper.
Comment 17 Wenson Hsieh 2020-11-03 16:04:28 PST Comment hidden (obsolete)
Comment 18 Wenson Hsieh 2020-11-03 17:34:33 PST
Created attachment 413130 [details]
More WIP
Comment 19 Wenson Hsieh 2020-11-03 17:50:06 PST
Created attachment 413131 [details]
More WIP
Comment 20 Wenson Hsieh 2020-11-05 12:47:24 PST
Created attachment 413343 [details]
Rebase on trunk
Comment 21 Tim Horton 2020-11-05 12:54:09 PST
Comment on attachment 413131 [details]
More WIP

View in context: https://bugs.webkit.org/attachment.cgi?id=413131&action=review

I skimmed the middle mechanical bits but the rest looks good.

> Source/WebCore/platform/graphics/displaylists/DisplayList.cpp:224
> +    case ItemType::ClipToDrawingCommands:
> +        return; // FIXME: Support the ability to clone display lists (this is currently only used for display list tracking).

Should this fail more loudly?

> Source/WebCore/platform/graphics/displaylists/DisplayList.cpp:352
> +        m_currentItem = ItemHandle { m_cursor };

As you yourself suggested on slack, perhaps we should copy this out somewhere so it can't change between validation and application.

> Source/WebCore/platform/graphics/displaylists/DisplayListItemBuffer.cpp:579
> +    constexpr size_t defaultItemBufferCapacity = 1 << 10;

Should this be extracted out somewhere, or is it OK here?

> Source/WebCore/platform/graphics/displaylists/DisplayListItems.cpp:815
> +NO_RETURN

FWIW NO_RETURN_DUE_TO_ASSERT

> Source/WebCore/platform/graphics/displaylists/DisplayListItems.cpp:840
> +NO_RETURN

Ditto, etc.
Comment 22 Wenson Hsieh 2020-11-05 13:12:35 PST
Comment on attachment 413131 [details]
More WIP

View in context: https://bugs.webkit.org/attachment.cgi?id=413131&action=review

>> Source/WebCore/platform/graphics/displaylists/DisplayList.cpp:224
>> +        return; // FIXME: Support the ability to clone display lists (this is currently only used for display list tracking).
> 
> Should this fail more loudly?

Sounds good — I'll add an assertion here.

>> Source/WebCore/platform/graphics/displaylists/DisplayList.cpp:352
>> +        m_currentItem = ItemHandle { m_cursor };
> 
> As you yourself suggested on slack, perhaps we should copy this out somewhere so it can't change between validation and application.

👍🏻

>> Source/WebCore/platform/graphics/displaylists/DisplayListItemBuffer.cpp:579
>> +    constexpr size_t defaultItemBufferCapacity = 1 << 10;
> 
> Should this be extracted out somewhere, or is it OK here?

Hm..I think this should be okay here (as it should only be consulted in this method).

> Source/WebCore/platform/graphics/displaylists/DisplayListItems.cpp:449
> +#if !defined(NDEBUG)
> +NO_RETURN
> +#endif

(Replaced this with NO_RETURN_DUE_TO_ASSERT too)

>> Source/WebCore/platform/graphics/displaylists/DisplayListItems.cpp:815
>> +NO_RETURN
> 
> FWIW NO_RETURN_DUE_TO_ASSERT

V. nice — changed these three to use NO_RETURN_DUE_TO_ASSERT
Comment 23 Wenson Hsieh 2020-11-05 16:05:32 PST
Created attachment 413362 [details]
Address feedback
Comment 24 Wenson Hsieh 2020-11-06 09:12:16 PST
Created attachment 413436 [details]
Rebase on trunk (after r269503)
Comment 25 Wenson Hsieh 2020-11-06 09:21:04 PST
Created attachment 413437 [details]
Add missing NO_RETURN_DUE_TO_ASSERT after rebase
Comment 26 EWS 2020-11-06 10:55:18 PST
Committed r269525: <https://trac.webkit.org/changeset/269525>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 413437 [details].
Comment 27 Radar WebKit Bug Importer 2020-11-06 10:56:20 PST
<rdar://problem/71125743>
Comment 28 Fujii Hironori 2020-11-06 12:06:53 PST
Apple-Win-10-Debug-Build can't compile.

https://build.webkit.org/builders/Apple-Win-10-Debug-Build/builds/17046

> C:\cygwin\home\buildbot\worker\win10-debug\build\Source\WebCore\platform/graphics/displaylists/DisplayListItems.cpp(448,47): error C2381: 'WebCore::DisplayList::DrawImageBuffer::apply': redefinition; '__declspec(noreturn)' or '[[noreturn]]' differs (compiling source file C:\cygwin\home\buildbot\worker\win10-debug\build\WebKitBuild\Debug\DerivedSources\WebCore\unified-sources\UnifiedSource-3c72abbe-26.cpp) [C:\cygwin\home\buildbot\worker\win10-debug\build\WebKitBuild\Debug\Source\WebCore\WebCore.vcxproj]
> C:\cygwin\home\buildbot\worker\win10-debug\build\Source\WebCore\platform\graphics\displaylists\DisplayListItems.h(1331): message : see declaration of 'WebCore::DisplayList::DrawImageBuffer::apply' (compiling source file C:\cygwin\home\buildbot\worker\win10-debug\build\WebKitBuild\Debug\DerivedSources\WebCore\unified-sources\UnifiedSource-3c72abbe-26.cpp) [C:\cygwin\home\buildbot\worker\win10-debug\build\WebKitBuild\Debug\Source\WebCore\WebCore.vcxproj]
Comment 29 Wenson Hsieh 2020-11-06 12:10:57 PST
(In reply to Fujii Hironori from comment #28)
> Apple-Win-10-Debug-Build can't compile.
> 
> https://build.webkit.org/builders/Apple-Win-10-Debug-Build/builds/17046
> 
> > C:\cygwin\home\buildbot\worker\win10-debug\build\Source\WebCore\platform/graphics/displaylists/DisplayListItems.cpp(448,47): error C2381: 'WebCore::DisplayList::DrawImageBuffer::apply': redefinition; '__declspec(noreturn)' or '[[noreturn]]' differs (compiling source file C:\cygwin\home\buildbot\worker\win10-debug\build\WebKitBuild\Debug\DerivedSources\WebCore\unified-sources\UnifiedSource-3c72abbe-26.cpp) [C:\cygwin\home\buildbot\worker\win10-debug\build\WebKitBuild\Debug\Source\WebCore\WebCore.vcxproj]
> > C:\cygwin\home\buildbot\worker\win10-debug\build\Source\WebCore\platform\graphics\displaylists\DisplayListItems.h(1331): message : see declaration of 'WebCore::DisplayList::DrawImageBuffer::apply' (compiling source file C:\cygwin\home\buildbot\worker\win10-debug\build\WebKitBuild\Debug\DerivedSources\WebCore\unified-sources\UnifiedSource-3c72abbe-26.cpp) [C:\cygwin\home\buildbot\worker\win10-debug\build\WebKitBuild\Debug\Source\WebCore\WebCore.vcxproj]

Sorry about that — looking into this now.

From glancing at the error message, it looks like this version of MSVC requires NO_RETURN_DUE_TO_ASSERT in both the method definition and declaration...