Bug 18654 - [XHR] onProgress event needs to be dispatched according to what the specification states
Summary: [XHR] onProgress event needs to be dispatched according to what the specifica...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: XML (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Julien Chaffraix
URL:
Keywords: Qt
Depends on: 36531
Blocks: 36934
  Show dependency treegraph
 
Reported: 2008-04-21 07:31 PDT by Julien Chaffraix
Modified: 2010-03-31 23:53 PDT (History)
6 users (show)

See Also:


Attachments
Proposed fix: add an XMLHttpRequestProgressEventDispatcher to handle the throttling of progress events (22.16 KB, patch)
2010-02-24 21:43 PST, Julien Chaffraix
jchaffraix: commit-queue-
Details | Formatted Diff | Diff
Updated patch: corrected build systems and licenses (22.86 KB, patch)
2010-02-26 08:23 PST, Julien Chaffraix
ap: review-
jchaffraix: commit-queue-
Details | Formatted Diff | Diff
Take two: Renamed new file to XMLHttpRequestProgressEventThrottle, added suspend/resume support, refactored the code a bit to avoid duplication, made the test more reliable (29.59 KB, patch)
2010-03-16 08:11 PDT, Julien Chaffraix
no flags Details | Formatted Diff | Diff
Third version: use a more tailored version of suspend / resume. Also takes Alexey comments into account. (29.39 KB, patch)
2010-03-16 22:30 PDT, Julien Chaffraix
ap: review+
jchaffraix: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Julien Chaffraix 2008-04-21 07:31:32 PDT
We have to use the least frequent between when we receive data and every 350ms (±200ms). The current behaviour is to dispatch the event every time we receive the data (which induces big performance hit on big files).
Comment 1 Julien Chaffraix 2010-02-24 18:57:39 PST
The specification has changed an now it is:

"When it is said to make progress notifications, while the download is progressing, queue a task to dispatch a progress event called progress about every 50ms or for every byte received, whichever is least frequent."
Comment 2 Julien Chaffraix 2010-02-24 21:43:30 PST
Created attachment 49467 [details]
Proposed fix: add an XMLHttpRequestProgressEventDispatcher to handle the throttling of progress events
Comment 3 David Levin 2010-02-25 09:12:48 PST
Both GNUMakefile.am, WebCore.gypi, and WebCore.vcproj/WebCore.vcproj are all missing XMLHttpRequestProgressEventDispatcher.h
Comment 4 Julien Chaffraix 2010-02-25 15:04:37 PST
(In reply to comment #3)
> Both GNUMakefile.am, WebCore.gypi, and WebCore.vcproj/WebCore.vcproj are all
> missing XMLHttpRequestProgressEventDispatcher.h

Right, I will attach an updated patch with this corrected. Do you have other comments on this patch before I modify it?

Thanks!
Comment 5 Alexey Proskuryakov 2010-02-25 15:57:34 PST
+ * THIS SOFTWARE IS PROVIDED BY APPLE COMPUTER, INC. ``AS IS'' AND ANY

It's just "APPLE INC" now: <http://webkit.org/coding/bsd-license.html>.
Comment 6 Julien Chaffraix 2010-02-26 08:23:13 PST
Created attachment 49587 [details]
Updated patch: corrected build systems and licenses
Comment 7 Julien Chaffraix 2010-02-26 08:25:21 PST
(In reply to comment #5)
> + * THIS SOFTWARE IS PROVIDED BY APPLE COMPUTER, INC. ``AS IS'' AND ANY
> 
> It's just "APPLE INC" now: <http://webkit.org/coding/bsd-license.html>.

This snippet is copied from another file (can't remember which one it is). Maybe it is worth checking if there are no other discrepancies in WebKit's licenses?
Comment 8 Alexey Proskuryakov 2010-02-26 08:45:59 PST
Yes, there are lots of files with obsolete license text. For now, the approach was to fix them on a case by case basis - I don't know if it's right or wrong.
Comment 9 Julien Chaffraix 2010-02-26 11:31:40 PST
(In reply to comment #8)
> Yes, there are lots of files with obsolete license text. For now, the approach
> was to fix them on a case by case basis - I don't know if it's right or wrong.

Trying a small search (grep -R "APPLE COMPUTER, INC." .|cut -d':' -f1|uniq|wc -l) showed out that more than 1400 files have the obsolete licenses in WebCore. Some of them are idl files which are unlikely to be modified soon. Some are just new files that just replicated an existing obsolete license (as I was going to do).

It has been quite a long time since this switch occurred so I think it is worth opening a renaming bug if we want to stop this spreading.
Comment 10 Alexey Proskuryakov 2010-03-08 17:01:39 PST
Comment on attachment 49587 [details]
Updated patch: corrected build systems and licenses

+header("Cache-Control: no-cache, must-revalidate");
+header("Pragma: no-cache");

Maybe you also want no-store here?

+You should see a serie of 5 PASSED.

Typo: should be "series". I'd say "sequence", but I'm not a native speaker, so take that suggestion with a grain of salt.

I have some doubts whether this test will be stable on buildbots. But besides supporting this change, it tests a peculiar server-side timing, which would be nice to run through regression tests, even if we decide that it's impossible to test timing.

I'm not a fan of designs that involve "dispatcher" objects. Those are not objects, as they don't describe behavior of an entity a human can think about. I don't have any suggestions though.

+        m_progressEventDispatcher.dispatchEvent(expectedLength && m_receivedLength <= expectedLength, static_cast<unsigned>(m_receivedLength), static_cast<unsigned>(expectedLength));

Could you add parentheses here? Some ports' compilers will emit warning about ambiguous code, I think.

+    // FIXME: This is a temporary measure as we do not have the concept of task and thus cannot post a task.
+    void dispatchEventNow();

We have ScriptExecutionContext::postTask. Is that a different kind of task than needed here?

But I think that the spec is just being misleading here, and the event should be dispatched synchronously. The "task" it talks about is just the fact that network loading is asynchronous.

+    static const double progressEventDispatchingTime = .05; // 50 ms per specification.

I'd call this "minimumProgressEventDispatchingIntervalInSeconds".

The timer can fire behind an alert or another modal dialog. It shouldn't. Perhaps XHR should tell the dispatcher when it's suspended. It currently doesn't override ActiveDOMObject::suspend, because pages with active XHRs cannot go into b/f cache, and existing progress events do not fire simply because all loading is paused behind modal dialogs.

+        m_progressEventDispatcher.dispatchEvent(expectedLength && m_receivedLength <= expectedLength, static_cast<unsigned>(m_receivedLength), static_cast<unsigned>(expectedLength));

I believe that when expected length is not known, it's set to -1 (NSURLResponseUnknownLength), not to 0. Maybe there are platform differences here, and maybe even pre-existing bugs.
Comment 11 Alexey Proskuryakov 2010-03-08 17:48:00 PST
I've been talking about this patch with some others, and one idea is that the progress object could (or should?) incapsulate more state related to progress, and be responsible for firing load and readystatechange events.

Since these events are closely tied together, and it is important to fire them in a correct order, it makes sense to have a single class do it. And moving more state to that class could help us realize what it really is (as a noun, and not a verb disguised as a noun with "er" appended to the end).
Comment 12 Julien Chaffraix 2010-03-11 08:06:00 PST
(In reply to comment #10)
> (From update of attachment 49587 [details])
> +header("Cache-Control: no-cache, must-revalidate");
> +header("Pragma: no-cache");
> 
> Maybe you also want no-store here?

Right, we do not want the server to cache the response too.

> Typo: should be "series". I'd say "sequence", but I'm not a native speaker, so
> take that suggestion with a grain of salt.

I will change that (I was not aware that there is no singular for "series").

> I have some doubts whether this test will be stable on buildbots. But besides
> supporting this change, it tests a peculiar server-side timing, which would be
> nice to run through regression tests, even if we decide that it's impossible to
> test timing.

I think the test can be less timing depend. What we need is to check that we throttle the number of events and do not blindly dispatch for every bits we receive (the current behaviour). I will address that in the next iteration.

> I'm not a fan of designs that involve "dispatcher" objects. Those are not
> objects, as they don't describe behavior of an entity a human can think about.
> I don't have any suggestions though.

The purpose of this object is to dispatch events (thus this name) but it is also limiting the rate of dispatching. Perhaps something along this could work like XMLHTTPRequestProgressEventThrottle?

> +        m_progressEventDispatcher.dispatchEvent(expectedLength &&
> m_receivedLength <= expectedLength, static_cast<unsigned>(m_receivedLength),
> static_cast<unsigned>(expectedLength));
> 
> Could you add parentheses here? Some ports' compilers will emit warning about
> ambiguous code, I think.

I did not change the code here but worth making this change to avoid such warnings in the future.

> +    // FIXME: This is a temporary measure as we do not have the concept of
> task and thus cannot post a task.
> +    void dispatchEventNow();
> 
> We have ScriptExecutionContext::postTask. Is that a different kind of task than
> needed here?

I was under the false assumption that this was for webworkers only. It will fit the purpose indeed.

> But I think that the spec is just being misleading here, and the event should
> be dispatched synchronously. The "task" it talks about is just the fact that
> network loading is asynchronous.

I think the latest XHR level 2 WD is pretty clear:

"When it is said to make progress notifications, while the download is progressing, queue a task to dispatch a progress event called progress about every 50ms or for every byte received, whichever is least frequent."

where queue a task is defined in HTML5. Other timers may share the same queue or use seperated queues. IMHO we can just stay with a separate "queue" for XHR and just dispatch this synchronously as you were suggesting.

> +    static const double progressEventDispatchingTime = .05; // 50 ms per
> specification.
> 
> I'd call this "minimumProgressEventDispatchingIntervalInSeconds".

Seems fair!

> The timer can fire behind an alert or another modal dialog. It shouldn't.
> Perhaps XHR should tell the dispatcher when it's suspended. It currently
> doesn't override ActiveDOMObject::suspend, because pages with active XHRs
> cannot go into b/f cache, and existing progress events do not fire simply
> because all loading is paused behind modal dialogs.

Right, I will add the logic for suspending / resuming the events.

> +        m_progressEventDispatcher.dispatchEvent(expectedLength &&
> m_receivedLength <= expectedLength, static_cast<unsigned>(m_receivedLength),
> static_cast<unsigned>(expectedLength));
> 
> I believe that when expected length is not known, it's set to -1
> (NSURLResponseUnknownLength), not to 0. Maybe there are platform differences
> here, and maybe even pre-existing bugs.

Right, currently the code assumes that expectedLength >= 0. It is not strictly related to the current change so I will file a bug about that as the change is already large enough.
Comment 13 Alexey Proskuryakov 2010-03-11 09:14:08 PST
>> Maybe you also want no-store here?
> Right, we do not want the server to cache the response too.

HTTP is actually very misleading in this regard. No-cache means that intermediate proxies shouldn't cache the response, while no-store means that UA shouldn't cache it.

I didn't know this before, and made some tests that only had no-cache. This didn't seem to cause issues in practice, probably because DRT starts with an empty cache each time anyway.

> I think the latest XHR level 2 WD is pretty clear:

It's equally misleading. What we want to guarantee is that within a progress event handler, reported length matches the length of responseText. We also want readyState and other state accessors return results that match the data in progress event.

Dispatching progress events synchronously is not the only way to achieve this, but it's the simplest by far, and it's indistinguishable from JS code, since network callbacks themselves are async.

> The purpose of this object is to dispatch events (thus this name) but it is
> also limiting the rate of dispatching. Perhaps something along this could work
> like XMLHTTPRequestProgressEventThrottle?

I thought throttle was a verb, but dictionary suggests that it's a noun, too. With that understanding, it looks good to me ("XMLHttp" though). Maybe worth double-checking with someone else.
Comment 14 Alexey Proskuryakov 2010-03-11 10:16:22 PST
> Right, currently the code assumes that expectedLength >= 0. It is not strictly
> related to the current change so I will file a bug about that as the change is
> already large enough.

OK. Please double-check that this change doesn't introduce major new badness due to this issue though.
Comment 15 Julien Chaffraix 2010-03-15 23:10:32 PDT
> > I think the latest XHR level 2 WD is pretty clear:
> 
> It's equally misleading. What we want to guarantee is that within a progress
> event handler, reported length matches the length of responseText. We also want
> readyState and other state accessors return results that match the data in
> progress event.
> 
> Dispatching progress events synchronously is not the only way to achieve this,
> but it's the simplest by far, and it's indistinguishable from JS code, since
> network callbacks themselves are async.

Right, I had overlooked the XHR properties that have to be consistent. It is better to just dispatch synchronously.

> 
> > The purpose of this object is to dispatch events (thus this name) but it is
> > also limiting the rate of dispatching. Perhaps something along this could work
> > like XMLHTTPRequestProgressEventThrottle?
> 
> I thought throttle was a verb, but dictionary suggests that it's a noun, too.
> With that understanding, it looks good to me ("XMLHttp" though). Maybe worth
> double-checking with someone else.

I was thinking of the noun's meaning when I made this proposition (more precisely the motorcycle's throttle as it is also used to limit the speed).

>> Right, currently the code assumes that expectedLength >= 0. It is not strictly
>> related to the current change so I will file a bug about that as the change is
>> already large enough.

> OK. Please double-check that this change doesn't introduce major new badness
> due to this issue though.

PHP automatically fills the "Content-Length" so it is unlikely to be hit at this point. We will need some as-is magic to avoid sending the header.

Filed https://bugs.webkit.org/show_bug.cgi?id=36156 for the expected length assumption.
Comment 16 Julien Chaffraix 2010-03-16 08:11:06 PDT
Created attachment 50790 [details]
Take two: Renamed new file to XMLHttpRequestProgressEventThrottle, added suspend/resume support, refactored the code a bit to avoid duplication, made the test more reliable

This should address all Alexey's comments.
Comment 17 Alexey Proskuryakov 2010-03-16 14:21:05 PDT
Comment on attachment 50790 [details]
Take two: Renamed new file to XMLHttpRequestProgressEventThrottle, added suspend/resume support, refactored the code a bit to avoid duplication, made the test more reliable

> +header("Cache-Control: no-cache, must-revalidate");
> +header("Pragma: no-cache");
> +header("Pragma: no-store");

I think no-store should go into Cache-Control. Did such a pragma even exist in HTTP 1.0? 

> +void XMLHttpRequest::suspend()
> +{
> +    ActiveDOMObject::suspend();
> +    m_progressEventThrottle.suspend();
> +}
> +
> +void XMLHttpRequest::resume()
> +{
> +    ActiveDOMObject::resume();
> +    m_progressEventThrottle.resume();
> +}

We don't call the base class methods in other overrides of suspend/resume, I think it's better to be consistent.

+     EventTarget* m_target;

This is correct, but it looks suspicious - what guarantees that m_target doesn't get destroyed? Perhaps having type be XMLHttpRequest would make it less surprising to people looking at this.

+     Vector<RefPtr<Event> > m_pausedEvents;

Does this need to be vector? Isn't it that we can have no more than one pending event?

r=me
Comment 18 Julien Chaffraix 2010-03-16 15:29:28 PDT
(In reply to comment #17)
> (From update of attachment 50790 [details])
> > +header("Cache-Control: no-cache, must-revalidate");
> > +header("Pragma: no-cache");
> > +header("Pragma: no-store");
> 
> I think no-store should go into Cache-Control. Did such a pragma even exist in
> HTTP 1.0?

Right, I read your comment wrong. It will be changed before landing.

> We don't call the base class methods in other overrides of suspend/resume, I
> think it's better to be consistent.

I also think consistence is better.

> +     EventTarget* m_target;
> 
> This is correct, but it looks suspicious - what guarantees that m_target
> doesn't get destroyed? Perhaps having type be XMLHttpRequest would make it less
> surprising to people looking at this.

Actually this is intentional: it makes it possible to implement the upload throttling very easily as both XMLHttpRequest and XMLHttpRequestUpload inherit from EventTarget. I guess adding a comment about that could help. I was thinking of something along those line (from now):

// Weak pointer to our XMLHttpRequest object as it is the one holding us.
EventTarget* m_target;
  
Alternatively, it could be changed as you said then just changed back to what I am proposing later on.

> +     Vector<RefPtr<Event> > m_pausedEvents;
> 
> Does this need to be vector? Isn't it that we can have no more than one pending
> event?

I think we can have multiple events queued. The example I am thinking of is if we suspend the throttle and then get the last bit of data from the network:
- the code in XMLHttpRequestProgressEventThrottle::dispatchEvent will queue 2 events (ready state 'DONE' and a progress event).
- XMLHttpRequest::callReadyStateChangeListener will queue another 'load' event.
Comment 19 Alexey Proskuryakov 2010-03-16 15:51:56 PDT
> // Weak pointer to our XMLHttpRequest object as it is the one holding us.
> EventTarget* m_target;

OK.

> I think we can have multiple events queued. The example I am thinking of is if
> we suspend the throttle and then get the last bit of data from the network:
> - the code in XMLHttpRequestProgressEventThrottle::dispatchEvent will queue 2
> events (ready state 'DONE' and a progress event).
> - XMLHttpRequest::callReadyStateChangeListener will queue another 'load' event.

We shouldn't be getting any data from network while suspended. Perhaps some platforms don't implement that correctly, so the failure mode should be non-catastrophic, but the same PageLoadDeferrer that calls suspend() also pauses loading.
Comment 20 Julien Chaffraix 2010-03-16 16:10:15 PDT
> We shouldn't be getting any data from network while suspended. Perhaps some
> platforms don't implement that correctly, so the failure mode should be
> non-catastrophic, but the same PageLoadDeferrer that calls suspend() also
> pauses loading.

I was not sure about that, which is why I played safely and went for a Vector. This means that the code could be simplified:
- we should not call dispatchProgressEvent, dispatchEvent or fired if we are suspended.
- we can use a single slot for a pending progress event.
Comment 21 Julien Chaffraix 2010-03-16 22:30:00 PDT
Created attachment 50875 [details]
Third version: use a more tailored version of suspend / resume. Also takes Alexey comments into account.

This patch implements the previous items.
Comment 22 Alexey Proskuryakov 2010-03-17 09:53:16 PDT
Comment on attachment 50875 [details]
Third version: use a more tailored version of suspend / resume. Also takes Alexey comments into account.

r=me
Comment 23 Julien Chaffraix 2010-03-23 09:08:50 PDT
Committed fix in r56394. For the record, the windows build bot choked on:

static const double minimumProgressEventDispatchingIntervalInSeconds = 0.5;

It only accept such initialization for type 'int' so we need to put the initialization in the cpp files (see fix in r56397).
Comment 24 Gustavo Noronha (kov) 2010-03-23 12:37:20 PDT
This change seems to have regressed http/tests/xmlhttprequest/xmlhttprequest-onProgress-open-should-zero-length.html on GTK+. It was passing consistently before, and is failing consistently after the change. Any idea what may be causing it?
Comment 25 Julien Chaffraix 2010-03-23 14:57:43 PDT
(In reply to comment #24)
> This change seems to have regressed
> http/tests/xmlhttprequest/xmlhttprequest-onProgress-open-should-zero-length.html
> on GTK+. It was passing consistently before, and is failing consistently after
> the change. Any idea what may be causing it?

I really have no clue about that. When I saw this test failing, I was wondering if it was related to the land or just flaky.

As GTK+ is the only platform impacted, I would look into the network stack (wrong initialization of the new response or maybe deeper in libsoup). Running the http tests with --random could maybe help.

If you want me to add it to the skipped list and file a bug, just state it.
Comment 26 Csaba Osztrogonác 2010-03-24 02:41:55 PDT
(In reply to comment #24)
> This change seems to have regressed
> http/tests/xmlhttprequest/xmlhttprequest-onProgress-open-should-zero-length.html
> on GTK+. It was passing consistently before, and is failing consistently after
> the change. Any idea what may be causing it?

Unfortunately http/tests/xmlhttprequest/xmlhttprequest-onProgress-open-should-zero-length.html fails on Qt port too. I reverted r56394 and r56397 locally, and it passes again as previously, so these patches caused this bug, or they revealed a hidden bug.

If I run only http/tests/xmlhttprequest/xmlhttprequest-onProgress-open-should-zero-length.html, it passes.

If I run WebKitTools/Scripts/run-webkit-tests http/tests/xmlhttprequest/xmlhttprequest-50ms-download-dispatch.html http/tests/xmlhttprequest/xmlhttprequest-onProgress-open-should-zero-length.html
the test sometimes passes and sometimes fails.

But skipping http/tests/xmlhttprequest/xmlhttprequest-50ms-download-dispatch.html didn't solve the problem.

Skipping http/tests/xmlhttprequest/xmlhttprequest-onProgress-open-should-zero-length.html works, the bots would be happy, but it won't fix the bug.

I reopened the bug. We should determine if it is a regression or it is a new bug revealed by this patch.
Comment 27 Eric Seidel (no email) 2010-03-24 14:31:22 PDT
Comment on attachment 50790 [details]
Take two: Renamed new file to XMLHttpRequestProgressEventThrottle, added suspend/resume support, refactored the code a bit to avoid duplication, made the test more reliable

Cleared Alexey Proskuryakov's review+ from obsolete attachment 50790 [details] so that this bug does not appear in http://webkit.org/pending-commit.
Comment 28 Julien Chaffraix 2010-03-26 07:15:44 PDT
Closing the bug now that bug 36531 is solved.

Opened a follow-up bug 36661 to deal with upload progress event that should be handled in the same way.
Comment 29 Eric Seidel (no email) 2010-03-31 15:58:19 PDT
http/tests/xmlhttprequest/xmlhttprequest-50ms-download-dispatch.html has failed on the Tiger bot ever since this commit.  Please fix. :)