Bug 237060 - Do not trigger didFail during FileReaderLoader Destruction
Summary: Do not trigger didFail during FileReaderLoader Destruction
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore JavaScript (show other bugs)
Version: Other
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2022-02-22 14:05 PST by Brandon
Modified: 2022-02-25 16:29 PST (History)
12 users (show)

See Also:


Attachments
Patch (5.89 KB, patch)
2022-02-22 14:37 PST, Brandon
no flags Details | Formatted Diff | Diff
Patch (7.37 KB, patch)
2022-02-23 09:06 PST, Brandon
no flags Details | Formatted Diff | Diff
Patch (1.52 KB, patch)
2022-02-23 13:59 PST, Brandon
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Brandon 2022-02-22 14:05:06 PST
<rdar://89239935>
Comment 1 Brandon 2022-02-22 14:37:16 PST
Created attachment 452903 [details]
Patch
Comment 2 Alexey Shvayka 2022-02-22 14:43:21 PST
Comment on attachment 452903 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=452903&action=review

> Source/WebCore/bindings/js/ReadableStreamDefaultController.h:50
> +    JSDOMGlobalObject& getGlobalObject() const 

Per WebKit code style guidelines, this should be named just "globalObject".

> Source/WebCore/fileapi/Blob.cpp:291
> +                context->eventLoop().queueTask(TaskSource::InternalAsyncTask, [this, code] {

Should we also pass `protectedThis = Ref { *this }`?

> Source/WebCore/fileapi/FileReaderLoader.h:92
> +    ScriptExecutionContext* getScriptExecutionContext() const { return m_scriptExecutionContext; }

Ditto: "scriptExecutionContext".
Comment 3 Alexey Shvayka 2022-02-22 14:50:43 PST
Comment on attachment 452903 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=452903&action=review

> Source/WebCore/fileapi/FileReaderLoader.cpp:85
> +    m_scriptExecutionContext = scriptExecutionContext;

It seems like DocumentThreadableLoader / WorkerThreadableLoader already keep a reference to context. Is there any way we can avoid adding another one?

Maybe we take context from ReadableStreamDefaultController::globalObject()?
Comment 4 Mark Lam 2022-02-22 14:52:46 PST
Comment on attachment 452903 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=452903&action=review

> Source/WebCore/ChangeLog:10
> +        Queue throwing an exception when destroying a blob object.
> +        Also, ensure that if we are destroying a vm we skip this step
> +        as it will try to allocate memory, which is not allowed during a sweep.

This is not quite correct.  Let's replace this with:
"BlobStreamSource::didFail() may be called from GC sweep.  Hence, we cannot throw an exception synchronously from there as this may result in object allocation (which is not allowed during sweep).  Instead we will enqueue a task to do the the throw later.  Additionally, if didFail() is called during VM shut down, then we skip the enqueuing of the task as the task queue may already have shut down.

> Source/WebCore/fileapi/Blob.cpp:284
> +            // During VM destruction we do not want to raise a dom exception
> +            // as this will trigger an allocation, which is not allowed as we
> +            // will be performing a sweep at that time.

This is not quite correct.  Let's replace this with:

"This function can be called doing Blob destruction, which in turn is called by GC sweep.  Since we cannot allocated objects while sweeping, we need to defer the call to error() which will allocate an exception object, by queueing a task.  Additionally, if this is being called during VM shut down, the task queue may already destructed.  So, we will skip the enqueuing of this task if we're in the midst of VM shut down."
Comment 5 Yusuke Suzuki 2022-02-22 14:55:56 PST
Comment on attachment 452903 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=452903&action=review

> Source/WebCore/fileapi/Blob.cpp:289
> +            auto& heap = globalObject.vm().heap;
> +
> +            if (!heap.isShuttingDown()) {

Let's add VM::isShuttingDown() getter and use it instead of touching heap directly here.
Heap and VM can become different after global GC. So for now, VM::isShuttingDown() implementation should be `return heap.isShuttingDown()`.
But we should avoid touching heap directly here.

> Source/WebCore/fileapi/Blob.cpp:290
> +                auto* context = m_loader->getScriptExecutionContext();

I think there is no guarantee that ScriptExecutionContext* is already destroyed. Is it ensured that this is still valid?

> Source/WebCore/fileapi/FileReaderLoader.h:127
> +    ScriptExecutionContext* m_scriptExecutionContext;

I think we cannot have this raw pointer since we have no mechanism to make this nullptr when ScriptExecutionContext is destroyed. Is this correct?
Comment 6 Mark Lam 2022-02-22 14:56:20 PST
Comment on attachment 452903 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=452903&action=review

>> Source/WebCore/fileapi/Blob.cpp:284
>> +            // will be performing a sweep at that time.
> 
> This is not quite correct.  Let's replace this with:
> 
> "This function can be called doing Blob destruction, which in turn is called by GC sweep.  Since we cannot allocated objects while sweeping, we need to defer the call to error() which will allocate an exception object, by queueing a task.  Additionally, if this is being called during VM shut down, the task queue may already destructed.  So, we will skip the enqueuing of this task if we're in the midst of VM shut down."

Hmmm, I wrote this blurb before the one in the ChangeLog.  I think the ChangeLog one is more concise (and doesn't run on).  Maybe just use that here instead?
Comment 7 Mark Lam 2022-02-22 15:06:33 PST
Comment on attachment 452903 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=452903&action=review

>> Source/WebCore/fileapi/Blob.cpp:289
>> +            if (!heap.isShuttingDown()) {
> 
> Let's add VM::isShuttingDown() getter and use it instead of touching heap directly here.
> Heap and VM can become different after global GC. So for now, VM::isShuttingDown() implementation should be `return heap.isShuttingDown()`.
> But we should avoid touching heap directly here.

Yeah, sorry about that: I told Brandon that he doesn't have to do this because I wasn't sure that the VM method should be called because the period of time for which Heap shut down flag is only a subset of the period of time during VM shut down, nor whether we do need to make a distinction between the two.  But on second thought, I suppose calling it `VM::isShuttingDown()` is appropriate for now.  We can always change its implementation later if we need to redefine it.
Comment 8 Brandon 2022-02-23 09:06:20 PST
Created attachment 452987 [details]
Patch
Comment 9 Brandon 2022-02-23 09:23:57 PST
The new patch should address reviewer comments from yesterday.


@Yusuke

I added a new variable m_shuttingDown that gets set when we enter vm destructor instead of calling out to the heap.

Would you be good with this implementation?
Comment 10 Mark Lam 2022-02-23 10:00:29 PST
Comment on attachment 452987 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=452987&action=review

> Source/WebCore/ChangeLog:10
> +        Instead, we will enqueue a task to do the the throw later. Additionally, if didFail() is called during 

typo: "the the"

> Source/WebCore/ChangeLog:11
> +        VM shut down, then we skip the enqueuing of the task as the task queue may already have shut down.

nit: "may have already" instead of "may already have"?

> Source/WebCore/fileapi/Blob.cpp:292
> +                context->eventLoop().queueTask(TaskSource::InternalAsyncTask, [this, protectedThis = Ref { *this }, code] {

The fact that we're needing to ref `this` here feels wrong because we're calling this from the destructor.  Why do we need to move the exception into m_exception if the object is already being destructed, granted, that didFail() may be called from places not in the destruction path.  This hints that this Blob code is itself not aware of what it's doing.  Let's get input from a WebCore engineer on this to determine if this is actually correct to do.
Comment 11 Chris Dumez 2022-02-23 10:18:02 PST
Comment on attachment 452987 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=452987&action=review

> Source/JavaScriptCore/runtime/VM.cpp:422
> +    setShuttingDown(true);

setIsShuttingDown()

> Source/JavaScriptCore/runtime/VM.h:292
> +    bool shuttingDown() const { return m_shuttingDown; }

Per WebKit coding style, booleans need to start with a prefix, I think this should be:
bool isShuttingDown() const { return m_isShuttingDown; }

> Source/JavaScriptCore/runtime/VM.h:293
> +    void setShuttingDown(bool value) { m_shuttingDown = value; } 

I don't think we need the boolean parameter, this could be setIsShuttingDown() or markAsShuttingDown(). Once you're shutting down, I don't think there is any coming back to running, is there?

> Source/WebCore/fileapi/Blob.cpp:290
> +            if (!vm.shuttingDown()) {

In WebKit, we prefer early returns than nesting. This should be:
if (vm.shuttingDown())
    return;

But also, I don't see how this is scalable. There are tons of functions in WebCore that end up running JS (by throwing exceptions or firing events). How is this didFail() function any different? Surely, we cannot expect EVERY dom function to check this flag. We would need a better choke point. Maybe in the bindings function that actually ends up firing the event or throwing the exception? There should really not be much JSC logic inside WebCore/DOM implementation.

>> Source/WebCore/fileapi/Blob.cpp:292
>> +                context->eventLoop().queueTask(TaskSource::InternalAsyncTask, [this, protectedThis = Ref { *this }, code] {
> 
> The fact that we're needing to ref `this` here feels wrong because we're calling this from the destructor.  Why do we need to move the exception into m_exception if the object is already being destructed, granted, that didFail() may be called from places not in the destruction path.  This hints that this Blob code is itself not aware of what it's doing.  Let's get input from a WebCore engineer on this to determine if this is actually correct to do.

Blob is an ActiveDOMObject, we should probably call queueTaskKeepingObjectAlive() from ActiveDOMObject base class?

queueTaskKeepingObjectAlive(*this, TaskSource::InternalAsyncTask, [this, code] {
  // ...
});

Note that you wouldn't need to capture protectedThis then since queueTaskKeepingObjectAlive does this for you (in addition to keeping the JS wrapper alive).
Comment 12 Chris Dumez 2022-02-23 10:21:20 PST
Comment on attachment 452987 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=452987&action=review

>>> Source/WebCore/fileapi/Blob.cpp:292
>>> +                context->eventLoop().queueTask(TaskSource::InternalAsyncTask, [this, protectedThis = Ref { *this }, code] {
>> 
>> The fact that we're needing to ref `this` here feels wrong because we're calling this from the destructor.  Why do we need to move the exception into m_exception if the object is already being destructed, granted, that didFail() may be called from places not in the destruction path.  This hints that this Blob code is itself not aware of what it's doing.  Let's get input from a WebCore engineer on this to determine if this is actually correct to do.
> 
> Blob is an ActiveDOMObject, we should probably call queueTaskKeepingObjectAlive() from ActiveDOMObject base class?
> 
> queueTaskKeepingObjectAlive(*this, TaskSource::InternalAsyncTask, [this, code] {
>   // ...
> });
> 
> Note that you wouldn't need to capture protectedThis then since queueTaskKeepingObjectAlive does this for you (in addition to keeping the JS wrapper alive).

Oh, I missed Mark's comment about this function getting called from the destructor. Protecting |this| here is definitely wrong then. Calling queueTaskKeepingObjectAlive() would be wrong too.

I'll need to see the call stack to try and suggest something better.
Comment 13 youenn fablet 2022-02-23 10:34:16 PST
IIUC, BlobStreamSource gets destroyed as part of GC and its file loader will get destroyed, which will call didFail.
Would it be possible to cancel the load in BlobStreamSource destructor if the load is not nullptr?
Comment 14 youenn fablet 2022-02-23 10:35:24 PST
If that is fixing the issue, we might want to add an ASSERT in FileReaderLoader destructor to ensure that it is not live at destruction time.
Comment 15 Mark Lam 2022-02-23 10:36:19 PST
(In reply to youenn fablet from comment #14)
> If that is fixing the issue, we might want to add an ASSERT in
> FileReaderLoader destructor to ensure that it is not live at destruction
> time.

This sounds like the right thing to do to me (but I don't know the devil in the details).
Comment 16 Mark Lam 2022-02-23 10:37:16 PST
(In reply to youenn fablet from comment #13)
> IIUC, BlobStreamSource gets destroyed as part of GC and its file loader will
> get destroyed, which will call didFail.
> Would it be possible to cancel the load in BlobStreamSource destructor if
> the load is not nullptr?

Oops, my previous comment:

"This sounds like the right thing to do to me (but I don't know the devil in the details)."

... was actually meant to reply to this one.
Comment 17 Mark Lam 2022-02-23 10:58:00 PST
Comment on attachment 452987 [details]
Patch

We're pretty sure this is not the right direction for the fix.
Comment 18 Brandon 2022-02-23 13:59:02 PST
Created attachment 453025 [details]
Patch
Comment 19 Darin Adler 2022-02-23 14:03:43 PST
Comment on attachment 453025 [details]
Patch

Can we construct a test for this?
Comment 20 Brandon 2022-02-25 15:41:01 PST
Was unable to reproduce the failing test in editing-word-with-marker-2.html for iOS Simulator locally, and it does not appear related to my patch.

Appears to be crashing consistently in the CIs.
https://results.webkit.org/?suite=layout-tests&test=editing%2Fspelling%2Fediting-word-with-marker-2.html
Comment 21 EWS 2022-02-25 16:29:02 PST
Committed r290537 (247815@main): <https://commits.webkit.org/247815@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 453025 [details].