Bug 24758 - [Gtk] Implement a WebDataSource for the gtk port
Summary: [Gtk] Implement a WebDataSource for the gtk port
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC Linux
: P2 Normal
Assignee: Jan Alonzo
URL:
Keywords: Gtk
: 16104 (view as bug list)
Depends on:
Blocks:
 
Reported: 2009-03-23 08:49 PDT by Jan Alonzo
Modified: 2009-09-02 06:07 PDT (History)
4 users (show)

See Also:


Attachments
initial implementation (34.13 KB, patch)
2009-03-23 08:51 PDT, Jan Alonzo
no flags Details | Formatted Diff | Diff
WebKitWebResource implementation (24.83 KB, patch)
2009-04-04 22:21 PDT, Jan Alonzo
gustavo: review-
Details | Formatted Diff | Diff
web data source and documentloader impl for gtk port (41.72 KB, patch)
2009-04-04 22:22 PDT, Jan Alonzo
gustavo: review-
Details | Formatted Diff | Diff
implement data source API in webkitwebframe (14.78 KB, patch)
2009-04-04 22:24 PDT, Jan Alonzo
gustavo: review-
Details | Formatted Diff | Diff
updated patch (25.23 KB, patch)
2009-07-18 01:02 PDT, Jan Alonzo
xan.lopez: review-
Details | Formatted Diff | Diff
updated datasource patch (41.77 KB, patch)
2009-07-18 01:03 PDT, Jan Alonzo
xan.lopez: review-
Details | Formatted Diff | Diff
Updated frame data source API patch (12.48 KB, patch)
2009-07-18 01:04 PDT, Jan Alonzo
xan.lopez: review-
Details | Formatted Diff | Diff
Updated WebResource patch (24.69 KB, patch)
2009-08-15 01:01 PDT, Jan Alonzo
xan.lopez: review+
jmalonzo: commit-queue-
Details | Formatted Diff | Diff
updated patch (42.99 KB, patch)
2009-08-15 22:28 PDT, Jan Alonzo
xan.lopez: review-
Details | Formatted Diff | Diff
Update web resource patch (25.48 KB, patch)
2009-08-28 00:29 PDT, Jan Alonzo
gustavo: review+
gustavo: commit-queue-
Details | Formatted Diff | Diff
updated patch (37.78 KB, patch)
2009-08-28 00:30 PDT, Jan Alonzo
xan.lopez: review+
Details | Formatted Diff | Diff
Updated patch with working tests (12.69 KB, patch)
2009-08-28 00:31 PDT, Jan Alonzo
xan.lopez: review+
xan.lopez: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jan Alonzo 2009-03-23 08:49:24 PDT
We need an implementation of WebDataSource so we can get the data, request, response, resources, etc.. associated with a web frame. Sample use cases are:
* Viewing page's source
* Saving a frame/page on disk
* ...

Of course this is not handled by WebDataSource alone, but a WebDataSource implementation is a good start to start implementing the other stuff required to make these happen.
Comment 1 Jan Alonzo 2009-03-23 08:51:27 PDT
Created attachment 28853 [details]
initial implementation

This is just an an initial implementation. Feedback appreciated.

Thanks.
Comment 2 Gustavo Noronha (kov) 2009-03-23 14:27:13 PDT
Comment on attachment 28853 [details]
initial implementation

> +DocumentLoaderGtk::DocumentLoaderGtk(const ResourceRequest& request, const SubstituteData& substituteData)
> +    : DocumentLoader(request, substituteData)
> +    , m_isDataSourceReffed(false)
> +    , m_dataSource(0)

Although mac seems to do it, I think GTK+ classes usually do not append 'Gtk' to their name. If there is no reason to do this rename, I think we should go with DocumentLoader::DocumentLoader.

> +void DocumentLoaderGtk::attachToFrame()
> +{
> +    DocumentLoader::attachToFrame();
> +
> +    refDataSource();
> +}

Is this the cause for the rename?

> -void FrameLoaderClient::dispatchWillSendRequest(DocumentLoader*, unsigned long, ResourceRequest&, const ResourceResponse&)
> +void FrameLoaderClient::dispatchWillSendRequest(DocumentLoader* loader, unsigned long identifier, ResourceRequest& request, const ResourceResponse& redirectResponse)
>  {
> -    notImplemented();
> +    if (redirectResponse.isNull())
> +        static_cast<DocumentLoaderGtk*>(loader)->increaseLoadCount(identifier);
>  }

These casts are not something I see in our other webcore support classes. Are they needed? If they are, perhaps it would be better to cook up a core()/kit() conversion pair, such as the ones that exist for things such as WebView and WebFrame.

>      WebKitWebNavigationReason kit(WebCore::NavigationType type);
>      WebCore::NavigationType core(WebKitWebNavigationReason reason);
> +
>  }

Here's the rogue blank line I mentioned.

> +    // WebKitWebDataSource private
> +    WEBKIT_API WebKitWebDataSource*
> +    webkit_web_data_source_init_with_loader(PassRefPtr<DocumentLoaderGtk>);
> +

I would make this webkit_web_data_source_new_with_core_loader, to match the already-existing history item one. If I remember correctly, I am using _with_core_request in my proposed NetworkRequest implementation, too.

> +G_CONST_RETURN gchar* webkit_web_data_source_get_unreachable_url(WebKitWebDataSource* webDataSource)
> +{
> +    g_return_val_if_fail(WEBKIT_IS_WEB_DATA_SOURCE(webDataSource), NULL);
> +
> +    WebKitWebDataSourcePrivate* priv = webDataSource->priv;
> +    const KURL& unreachableURL = priv->loader->unreachableURL();
> +
> +    if (unreachableURL.isEmpty())
> +        return NULL;
> +    else
> +        return unreachableURL.string().utf8().data();
> +}

What is this used for?

> + * webkit_web_frame_get_provisional_data_source:
> + * @frame: a #WebKitWebFrame
> + *
> + * Returns the provisional data source. You use the
> + * webkit_web_frame_load_request method to initiate an asynchronous client
> + * request which will create a provisional data source. The provisional data
> + * source will transition to a committed data source once any data has been
> + * received. Therefore this method returns NULL if either a load request is
> + * not in progress, or a load request has completed.

I think this comment seems to imply that the user needs to take action to cause loads to initiate. It would be better to rephrase this, so that it is easily understandable that any loads initiated by WebKit or by the user explicitly with load_request will cause a provisional data source to be created.

Looking good to me. I think this is a very important missing piece in our API.
Comment 3 Christian Dywan 2009-03-23 14:36:28 PDT
This is already looking interesting. I have a few comments for you :)

There are three files which don't attribute copyright to you and it looks as
though an Apple employée wrote them. You might want to add your name there, even
if they are based on existing code.

+static void webkit_web_data_source_init(WebKitWebDataSource* webDataSource)
+{
+    JSC::initializeThreading();
+    webDataSource->priv = WEBKIT_WEB_DATA_SOURCE_GET_PRIVATE(webDataSource);
+}

The initialisation call here looks out of place. Would it make sense to put
that inside webkit_init?

+WebKitWebDataSource* webkit_web_data_source_new(WebKitNetworkRequest* request)
+{
+    const gchar* uri = webkit_network_request_get_uri(request);
+    return webkit_web_data_source_init_with_loader(DocumentLoaderGtk::create(ResourceRequest(KURL(KURL(), String::fromUTF8(uri))), SubstituteData()));
+}

This line is just too long, please add a line break here ;)

+WebKitWebFrame* webkit_web_data_source_get_web_frame(WebKitWebDataSource* webDataSource)
+{
+    g_return_val_if_fail(WEBKIT_IS_WEB_DATA_SOURCE(webDataSource), NULL);
+
+    WebKitWebDataSourcePrivate* priv = webDataSource->priv;
+    FrameLoader* frameLoader = priv->loader->frameLoader();
+    if (!frameLoader)
+        return NULL;

Do we usually expect a NULL value? In that case, we should document it,
otherwise it should be an ASSERT.

+WebKitNetworkRequest* webkit_web_data_source_get_request(WebKitWebDataSource* webDataSource)
+{
+    g_return_val_if_fail(WEBKIT_IS_WEB_DATA_SOURCE(webDataSource), NULL);
+
+    WebKitWebDataSourcePrivate* priv = webDataSource->priv;
+    FrameLoader* frameLoader = priv->loader->frameLoader();
+    if (!frameLoader || !frameLoader->frameHasLoaded())
+        return NULL;
+
+    ResourceRequest request = priv->loader->request();
+    WebKitNetworkRequest* webRequest = webkit_network_request_new(request.url().string().utf8().data());
+
+    return webRequest;
+}

I'm guessing this depends on the improvements of WebKitNetworkRequest so we can
use the actual request rather than creating a new one, so I suggest a FIXME
here. And we want to store the request, otherwise subsequent calls will lead
to unexpected results, ie. memory leaks and failing pointer comparisons.

+ * Returns %TRUE if @data_source is in the process of loading its content,
+ * otherwise it returns %FALSE
+ *
+ * Return value: %TRUE if the @data_source is still loading, %FALSE if it's
+ * already loaded
+ *
+ * Since: 1.1.4

You want to avoid the similar wording here. The main description could be
something like "Determines whether the @data_source is currently in the process
of loading its content" and the return value perhaps "%TRUE if the @data_source
is currently loading, %FALSE otherwise".

+WEBKIT_API WebKitNetworkRequest *
+webkit_web_data_source_get_request            (WebKitWebDataSource  *data_source);

This should likely be _get_network_request.

+WEBKIT_API G_CONST_RETURN gchar *
+webkit_web_data_source_get_text_encoding_name (WebKitWebDataSource  *data_source);

We have functions _get_encoding and _set_encoding, I think we should use
_get_encoding here accordingly.

+WebKitWebDataSource* webkit_web_frame_get_provisional_data_source(WebKitWebFrame* frame)
+{
+    g_return_val_if_fail(WEBKIT_IS_WEB_FRAME(frame), NULL);
+
+    Frame* coreFrame = core(frame);
+    WebKitWebDataSource* dataSource = webkit_web_frame_get_data_source_from_core_loader(coreFrame->loader()->provisionalDocumentLoader());
+
+    g_return_val_if_fail(WEBKIT_IS_WEB_DATA_SOURCE(dataSource), NULL);
+
+    return dataSource;
+}

The return_if_fail in there is wrong. The comment says NULL is returned if there
is no provisional data source. And if you want to make sure WebCore doesn't
return anything else that must be an ASSERT.
The same for the non-provisional data source.
Comment 4 Jan Alonzo 2009-04-04 22:21:20 PDT
Created attachment 29261 [details]
WebKitWebResource implementation

Add a web resource implementation for the gtk port which is needed by some data source API.
Comment 5 Jan Alonzo 2009-04-04 22:22:53 PDT
Created attachment 29263 [details]
web data source and documentloader impl for gtk port

This patch adds an implementation of web data source and the corresponding document loader for the gtk port.
Comment 6 Jan Alonzo 2009-04-04 22:24:23 PDT
Created attachment 29264 [details]
implement data source API in webkitwebframe

This patch adds a couple of APIs needed to retrieve a data source from a web frame. This patch also adds a unit test for the web data source impl.
Comment 7 Jan Alonzo 2009-04-04 22:31:05 PDT
> > +    : DocumentLoader(request, substituteData)
> > +    , m_isDataSourceReffed(false)
> > +    , m_dataSource(0)
> 
> Although mac seems to do it, I think GTK+ classes usually do not append 'Gtk'
> to their name. If there is no reason to do this rename, I think we should go
> with DocumentLoader::DocumentLoader.

Ok, renamed. The Gtk port impl is now under the WebKit namespace.

> 
> > -void FrameLoaderClient::dispatchWillSendRequest(DocumentLoader*, unsigned long, ResourceRequest&, const ResourceResponse&)
> > +void FrameLoaderClient::dispatchWillSendRequest(DocumentLoader* loader, unsigned long identifier, ResourceRequest& request, const ResourceResponse& redirectResponse)
> >  {
> > -    notImplemented();
> > +    if (redirectResponse.isNull())
> > +        static_cast<DocumentLoaderGtk*>(loader)->increaseLoadCount(identifier);
> >  }
> 
> These casts are not something I see in our other webcore support classes. Are
> they needed? If they are, perhaps it would be better to cook up a core()/kit()
> conversion pair, such as the ones that exist for things such as WebView and
> WebFrame.

Yup it's needed. AFAIK core/kit is only for public API and since WebKit::DocumentLoader is not public, I don't think it's required.

> >      WebKitWebNavigationReason kit(WebCore::NavigationType type);
> >      WebCore::NavigationType core(WebKitWebNavigationReason reason);
> > +
> >  }
> 
> Here's the rogue blank line I mentioned.

Thanks fixed.

> 
> > +    // WebKitWebDataSource private
> > +    WEBKIT_API WebKitWebDataSource*
> > +    webkit_web_data_source_init_with_loader(PassRefPtr<DocumentLoaderGtk>);
> > +
> 
> I would make this webkit_web_data_source_new_with_core_loader, to match the
> already-existing history item one. If I remember correctly, I am using
> _with_core_request in my proposed NetworkRequest implementation, too.

Done.

> 
> > +G_CONST_RETURN gchar* webkit_web_data_source_get_unreachable_url(WebKitWebDataSource* webDataSource)
> > +{
> > +    g_return_val_if_fail(WEBKIT_IS_WEB_DATA_SOURCE(webDataSource), NULL);
> > +
> > +    WebKitWebDataSourcePrivate* priv = webDataSource->priv;
> > +    const KURL& unreachableURL = priv->loader->unreachableURL();
> > +
> > +    if (unreachableURL.isEmpty())
> > +        return NULL;
> > +    else
> > +        return unreachableURL.string().utf8().data();
> > +}
> 
> What is this used for?

It's used when a load was requested on a web frame using one of the methods that accept an unreachableURL parameter (n.b. we don't have this yet but i'm looking into this now).

> 
> > + * webkit_web_frame_get_provisional_data_source:
> > + * @frame: a #WebKitWebFrame
> > + *
> > + * Returns the provisional data source. You use the
> > + * webkit_web_frame_load_request method to initiate an asynchronous client
> > + * request which will create a provisional data source. The provisional data
> > + * source will transition to a committed data source once any data has been
> > + * received. Therefore this method returns NULL if either a load request is
> > + * not in progress, or a load request has completed.
> 
> I think this comment seems to imply that the user needs to take action to cause
> loads to initiate. It would be better to rephrase this, so that it is easily
> understandable that any loads initiated by WebKit or by the user explicitly
> with load_request will cause a provisional data source to be created.

Doc reworded in the updated patch.

Cheers
Comment 8 Jan Alonzo 2009-04-04 22:35:41 PDT
(In reply to comment #3)
> This is already looking interesting. I have a few comments for you :)
> 
> There are three files which don't attribute copyright to you and it looks as
> though an Apple employée wrote them. You might want to add your name there,
> even
> if they are based on existing code.

Thanks I've updated the licence headers. Can you please check again?

> 
> +static void webkit_web_data_source_init(WebKitWebDataSource* webDataSource)
> +{
> +    JSC::initializeThreading();
> +    webDataSource->priv = WEBKIT_WEB_DATA_SOURCE_GET_PRIVATE(webDataSource);
> +}
> 
> The initialisation call here looks out of place. Would it make sense to put
> that inside webkit_init?

I removed it for now. We already call it in webkit_init.

> 
> +WebKitWebDataSource* webkit_web_data_source_new(WebKitNetworkRequest* request)
> +{
> +    const gchar* uri = webkit_network_request_get_uri(request);
> +    return
> webkit_web_data_source_init_with_loader(DocumentLoaderGtk::create(ResourceRequest(KURL(KURL(),
> String::fromUTF8(uri))), SubstituteData()));
> +}
> 
> This line is just too long, please add a line break here ;)

Done - please see updated patch.

> 
> +WebKitWebFrame* webkit_web_data_source_get_web_frame(WebKitWebDataSource*
> webDataSource)
> +{
> +    g_return_val_if_fail(WEBKIT_IS_WEB_DATA_SOURCE(webDataSource), NULL);
> +
> +    WebKitWebDataSourcePrivate* priv = webDataSource->priv;
> +    FrameLoader* frameLoader = priv->loader->frameLoader();
> +    if (!frameLoader)
> +        return NULL;
> 
> Do we usually expect a NULL value? In that case, we should document it,
> otherwise it should be an ASSERT.

Ok, added an ASSERT.

> 
> +WebKitNetworkRequest* webkit_web_data_source_get_request(WebKitWebDataSource*
> webDataSource)
> +{
> +    g_return_val_if_fail(WEBKIT_IS_WEB_DATA_SOURCE(webDataSource), NULL);
> +
> +    WebKitWebDataSourcePrivate* priv = webDataSource->priv;
> +    FrameLoader* frameLoader = priv->loader->frameLoader();
> +    if (!frameLoader || !frameLoader->frameHasLoaded())
> +        return NULL;
> +
> +    ResourceRequest request = priv->loader->request();
> +    WebKitNetworkRequest* webRequest =
> webkit_network_request_new(request.url().string().utf8().data());
> +
> +    return webRequest;
> +}
> 
> I'm guessing this depends on the improvements of WebKitNetworkRequest so we can
> use the actual request rather than creating a new one, so I suggest a FIXME
> here. And we want to store the request, otherwise subsequent calls will lead
> to unexpected results, ie. memory leaks and failing pointer comparisons.

Done.

> 
> + * Returns %TRUE if @data_source is in the process of loading its content,
> + * otherwise it returns %FALSE
> + *
> + * Return value: %TRUE if the @data_source is still loading, %FALSE if it's
> + * already loaded
> + *
> + * Since: 1.1.4
> 
> You want to avoid the similar wording here. The main description could be
> something like "Determines whether the @data_source is currently in the process
> of loading its content" and the return value perhaps "%TRUE if the @data_source
> is currently loading, %FALSE otherwise".

Thanks I've updated the doc.

> 
> +WEBKIT_API WebKitNetworkRequest *
> +webkit_web_data_source_get_request            (WebKitWebDataSource 
> *data_source);
> 
> This should likely be _get_network_request.
> 
> +WEBKIT_API G_CONST_RETURN gchar *
> +webkit_web_data_source_get_text_encoding_name (WebKitWebDataSource 
> *data_source);
> 
> We have functions _get_encoding and _set_encoding, I think we should use
> _get_encoding here accordingly.

I missed this. I'll update the patch once I get more feedback here (or pre-commit)

> 
> +WebKitWebDataSource*
> webkit_web_frame_get_provisional_data_source(WebKitWebFrame* frame)
> +{
> +    g_return_val_if_fail(WEBKIT_IS_WEB_FRAME(frame), NULL);
> +
> +    Frame* coreFrame = core(frame);
> +    WebKitWebDataSource* dataSource =
> webkit_web_frame_get_data_source_from_core_loader(coreFrame->loader()->provisionalDocumentLoader());
> +
> +    g_return_val_if_fail(WEBKIT_IS_WEB_DATA_SOURCE(dataSource), NULL);
> +
> +    return dataSource;
> +}
> 
> The return_if_fail in there is wrong. The comment says NULL is returned if
> there
> is no provisional data source. And if you want to make sure WebCore doesn't
> return anything else that must be an ASSERT.
> The same for the non-provisional data source.

Updated in the patch.

Thanks!
Comment 9 Jan Alonzo 2009-04-19 01:18:50 PDT
*** Bug 16104 has been marked as a duplicate of this bug. ***
Comment 10 Eric Seidel (no email) 2009-05-21 19:17:58 PDT
Comment on attachment 29261 [details]
WebKitWebResource implementation

No comments in over a month.  Please contact a Gtk reviewer directly for review.  Marking these attachments r- to remove them from the review queue since it's unlikely this patch will get reviewed any time soon.
Comment 11 Eric Seidel (no email) 2009-05-21 19:18:12 PDT
Comment on attachment 29263 [details]
web data source and documentloader impl for gtk port

See comment above.
Comment 12 Eric Seidel (no email) 2009-05-21 19:18:28 PDT
Comment on attachment 29264 [details]
implement data source API in webkitwebframe

See comment above.
Comment 13 Maciej Stachowiak 2009-05-21 22:47:43 PDT
Comment on attachment 29261 [details]
WebKitWebResource implementation

Reflagging for review - this still needs substantive review, ideally from a Gtk reviewer.
Comment 14 Maciej Stachowiak 2009-05-21 22:47:53 PDT
Comment on attachment 29263 [details]
web data source and documentloader impl for gtk port

Reflagging for review - this still needs substantive review, ideally from a Gtk reviewer.
Comment 15 Maciej Stachowiak 2009-05-21 22:48:01 PDT
Comment on attachment 29264 [details]
implement data source API in webkitwebframe

Reflagging for review - this still needs substantive review, ideally from a Gtk reviewer.
Comment 16 Gustavo Noronha (kov) 2009-05-25 19:18:22 PDT
Comment on attachment 29261 [details]
WebKitWebResource implementation

> +                                    "url",
> +                                    "URL",
> +                                    "The url of the resource",
> +                                    NULL,
> +                                    WEBKIT_PARAM_READABLE));

Missing the i18n calls here.

> +WEBKIT_API G_CONST_RETURN gchar *
> +webkit_web_resource_get_data               (WebKitWebResource  *web_resource);

Hmm. Is it guaranteed that the data will always be null-terminated? Most probably not, since we'll have WebResources for all kinds of resources, right? Such as images, for instance? I think we may want to use GString, or something like:

WEBKIT_API G_CONST_RETURN gchar *
webkit_web_resource_get_data               (WebKitWebResource  *web_resource, gsize* length);

 159         const gchar* data;

Length here.

 160     };
 161     WebKitWebResource*
 162     webkit_web_resource_init_with_core_resource(PassRefPtr<WebCore::ArchiveResource>);

new_with_core_resource instead of init_with to match history item.

I'd r+ with those changes, let me ping xan so that he can check on the API along with me =). r- for now, because of those issues.
Comment 17 Gustavo Noronha (kov) 2009-05-25 19:44:15 PDT
Comment on attachment 29263 [details]
web data source and documentloader impl for gtk port

> +void DocumentLoader::setDataSource(WebKitWebDataSource* dataSource, WebKitWebView* webView)
> +{
> +    ASSERT(!m_dataSource);
> +
> +    m_dataSource = dataSource;
> +    refDataSource();
> +}

What is the webView used for?

> +#include <wtf/HashSet.h>
> +
> +#include "DocumentLoader.h"
> +
> +#include "webkitdefines.h"

I think you should order these, and make remove the empty lines.

> +namespace WebCore {
> +
> +    class ResourceRequest;
> +    class SubstituteData;
> +}

Needless empty line, or add an additional one at the end.

> -FrameLoaderClient::shouldUseCredentialStorage(DocumentLoader*, unsigned long  identifier)
> +FrameLoaderClient::shouldUseCredentialStorage(WebCore::DocumentLoader*, unsigned long  identifier)

Why these changes? There's a using statement already. Please remove these, they just add noise to the patch, IMO.

> +#include "config.h"
> +
> +#include "webkitwebdatasource.h"

Join those.

> +#include "webkitwebresource.h"
> +#include "webkitprivate.h"
> +
> +#include <glib.h>
> +#include <runtime/InitializeThreading.h>
> +#include <wtf/Assertions.h>
> +
> +#include "ArchiveResource.h"
> +#include "DocumentLoaderGtk.h"
> +#include "FrameLoaderClientGtk.h"
> +#include "FrameLoader.h"
> +#include "KURL.h"
> +#include "PlatformString.h"
> +#include "ResourceRequest.h"
> +#include "SharedBuffer.h"
> +#include "SubstituteData.h"

Join and sort the rest.

> +extern "C" {

Unnecessary, right?

> + * Creates a new #WebKitWebDataSource from a #WebKitNetworkRequest. Normally,
> + * #WebKitWebFrame objects create their data sources, so don't invoke this
> + * method directly.

I think s/don't invoke/you will almost never want to invoke/, to match similar documentation on GTK+.

> + * webkit_web_data_source_get_text_encoding_name:

Renaming these is still pending.

> +G_CONST_RETURN gchar* webkit_web_data_source_get_data(WebKitWebDataSource* webDataSource)

My point about these still stands. We are not sure there are no \0's in the data, so we better return the length, too. So I suggest:

G_CONST_RETURN gchar* webkit_web_data_source_get_data(WebKitWebDataSource* webDataSource, gsize* length)

> +    if (!priv->mainResource)
> +        priv->mainResource = webkit_web_resource_init_with_core_resource(coreResource.release());

new_with_core, like history (still stands)

> + * Return value: a #GArray of #WebKitWebResource. You need to free the array
> + * and its subresources once you are done with it.

Uh? Why would the user need to free the GArray?

Also, Mac seems to use the identifier of the resource for various APIs. Should we return a GHashTable that maps identifiers to the actual sub resources, and allow for things such as requesting for a specific resource using the identifier?

> +    // WebKitWebDataSource private
> +    WebKitWebDataSource*
> +    webkit_web_data_source_new_with_loader(PassRefPtr<WebKit::DocumentLoader>);
> +

new_with_core_resource should be here, too?

>      Frame* coreFrame = core(frame);
> -    DocumentLoader* docLoader = coreFrame->loader()->documentLoader();
> +    WebCore::DocumentLoader* docLoader = coreFrame->loader()->documentLoader();
>      String mimeType = docLoader->responseMIMEType();
>      return g_strdup(mimeType.utf8().data());

Not necessary?

r- for now
Comment 18 Gustavo Noronha (kov) 2009-05-25 19:50:03 PDT
Comment on attachment 29264 [details]
implement data source API in webkitwebframe

> Index: WebKit/WebKit/gtk/tests/testwebdatasource.c

I think you should add the tests to the other patch. They look good to me, but will need to be adapted after the API changes are done.

> +//    g_test_add("/webkit/webdatasource/get_text_encoding_name",
> +//               WebDataSourceFixture, 0, web_data_source_fixture_setup,
> +//               test_webkit_web_data_source_get_text_encoding_name, web_data_source_fixture_teardown);

?

> +WebKitWebDataSource* webkit_web_frame_get_provisional_data_source(WebKitWebFrame* frame)

Hmm. Why would anyone want the provisional data source? Isn't it just a blank thing?

> Index: WebKit/WebKit/gtk/ChangeLog
> ===================================================================
> --- WebKit.orig/WebKit/gtk/ChangeLog	2009-04-05 15:13:18.000000000 +1000
> +++ WebKit/WebKit/gtk/ChangeLog	2009-04-05 15:16:12.000000000 +1000

This part of the patch looks like something wrong happened hehe.
Comment 19 Jan Alonzo 2009-07-18 00:28:56 PDT
> > +WEBKIT_API G_CONST_RETURN gchar *
> > +webkit_web_resource_get_data               (WebKitWebResource  *web_resource);
> 
> Hmm. Is it guaranteed that the data will always be null-terminated? Most
> probably not, since we'll have WebResources for all kinds of resources, right?
> Such as images, for instance? I think we may want to use GString, or something
> like:

Would using a GString be Ok here? I prefer to go that route as it simplifies things a bit.

> I'd r+ with those changes, let me ping xan so that he can check on the API
> along with me =). r- for now, because of those issues.

I will be loading an updated patch soon with the changes you proposed. Thanks for the feedback.
Comment 20 Jan Alonzo 2009-07-18 00:35:37 PDT
> What is the webView used for?

Removed the param.

> > -FrameLoaderClient::shouldUseCredentialStorage(DocumentLoader*, unsigned long  identifier)
> > +FrameLoaderClient::shouldUseCredentialStorage(WebCore::DocumentLoader*, unsigned long  identifier)
> 
> Why these changes? There's a using statement already. Please remove these, they
> just add noise to the patch, IMO.

No. We have two DocumentLoader's: one from WebCore and one from WebKit so the namespace prefix is necessary here.

> > +G_CONST_RETURN gchar* webkit_web_data_source_get_data(WebKitWebDataSource* webDataSource)
> 
> My point about these still stands. We are not sure there are no \0's in the
> data, so we better return the length, too. So I suggest:

I've changed this to GString just like the WebResource.

> Also, Mac seems to use the identifier of the resource for various APIs. Should
> we return a GHashTable that maps identifiers to the actual sub resources, and
> allow for things such as requesting for a specific resource using the
> identifier?

Indeed, I had a quick browse on the Mac API and the identifier/resource map is used a lot. Though I think it's outside of the scope of this datasource enhancement. I'm happy to work on that in a different bug, if you want.

> >      Frame* coreFrame = core(frame);
> > -    DocumentLoader* docLoader = coreFrame->loader()->documentLoader();
> > +    WebCore::DocumentLoader* docLoader = coreFrame->loader()->documentLoader();
> >      String mimeType = docLoader->responseMIMEType();
> >      return g_strdup(mimeType.utf8().data());
> 
> Not necessary?

It is. See my reason above. 

I've updated the patch with your proposed changes. Thanks a lot for the feedback.

> 
> r- for now
Comment 21 Jan Alonzo 2009-07-18 00:45:10 PDT
(In reply to comment #18)
> (From update of attachment 29264 [details])
> > Index: WebKit/WebKit/gtk/tests/testwebdatasource.c
> 
> I think you should add the tests to the other patch. They look good to me, but
> will need to be adapted after the API changes are done.

It needs to be here as it uses the frame API to get to the data source.

> > +WebKitWebDataSource* webkit_web_frame_get_provisional_data_source(WebKitWebFrame* frame)
> 
> Hmm. Why would anyone want the provisional data source? Isn't it just a blank
> thing?

If you want the initial request of the data source, you would want the provisional one and not a committed data source.

Thanks for the feedback. Again, really appreciate it :) 

I'll upload the updates in a bit...
Comment 22 Jan Alonzo 2009-07-18 01:02:26 PDT
Created attachment 33014 [details]
updated patch
Comment 23 Jan Alonzo 2009-07-18 01:03:34 PDT
Created attachment 33015 [details]
updated datasource patch
Comment 24 Jan Alonzo 2009-07-18 01:04:35 PDT
Created attachment 33016 [details]
Updated frame data source API patch
Comment 25 Gustavo Noronha (kov) 2009-07-20 06:16:02 PDT
Comment on attachment 33015 [details]
updated datasource patch

> +// helper methos to avoid ref count churn

-> methods?

> +WebKitWebFrame* webkit_web_data_source_get_web_frame(WebKitWebDataSource* webDataSource)
> +{
> +    g_return_val_if_fail(WEBKIT_IS_WEB_DATA_SOURCE(webDataSource), NULL);
> +
> +    WebKitWebDataSourcePrivate* priv = webDataSource->priv;
> +    FrameLoader* frameLoader = priv->loader->frameLoader();
> +
> +    ASSERT(frameLoader);
> +
> +    WebKit::FrameLoaderClient* frameLoaderClient = static_cast<WebKit::FrameLoaderClient*>(frameLoader->client());
> +    return frameLoaderClient->webFrame();

It seems to me like a loader may be dettached from the frame. Should we check for this case and return NULL, here?

> + * Return value: a #GArray of #WebKitWebResource. The array if owned by WebKit
> + * and should not be freed or destroyed.

I keep thinking if we should not have a GHashtable here, instead, using the long identifier as key, and the WebResource object as value. Then we could have webkit_web_data_source_get_subresource(WebKitWebDataSource*, long identifier) to go straight to a subresource, if we know the resource we want (because, say, we're tracking it from willSendRequest).

Except for that, the patch looks very good to me. Let me poke Xan to look at it.
Comment 26 Gustavo Noronha (kov) 2009-07-20 06:35:46 PDT
Comment on attachment 33016 [details]
Updated frame data source API patch

Also looks fine to me.
Comment 27 Xan Lopez 2009-07-24 00:24:26 PDT
Comment on attachment 33014 [details]
updated patch

> +static void test_webkit_web_resource_get_url(WebResourceFixture* fixture, gconstpointer data)
> +{
> +    gchar* url;
> +    g_object_get(G_OBJECT(fixture->webResource), "url", &url, NULL);
> +    g_assert_cmpstr(url, ==, "http://example.com/");
> +    g_assert_cmpstr(webkit_web_resource_get_url(fixture->webResource) ,==,"http://example.com/");
> +}

You are leaking 'url' here, g_object_get will dup the string.

> +static void test_webkit_web_resource_get_mime_type(WebResourceFixture* fixture, gconstpointer data)
> +{
> +    gchar* mime_type;
> +    g_object_get(G_OBJECT(fixture->webResource), "mime-type", &mime_type, NULL);
> +    g_assert_cmpstr(mime_type, ==, "text/html");
> +    g_assert_cmpstr(webkit_web_resource_get_mime_type(fixture->webResource),==,"text/html");
> +}

Leaking 'mime_type'; same issue for all remaining tests.

> +static void webkit_web_resource_dispose(GObject* object)
> +{
> +    WebKitWebResource* webResource = WEBKIT_WEB_RESOURCE(object);
> +    WebKitWebResourcePrivate* priv = webResource->priv;
> +
> +    if (priv->resource)
> +        priv->resource->deref();

Don't you need a priv->resource = 0; here?

> +
> +    G_OBJECT_CLASS(webkit_web_resource_parent_class)->dispose(object);
> +}
> +
> +static void webkit_web_resource_finalize(GObject* object)
> +{
> +    WebKitWebResource* webResource = WEBKIT_WEB_RESOURCE(object);
> +    WebKitWebResourcePrivate* priv = webResource->priv;
> +
> +    priv->url = CString();
> +    priv->mimeType = CString();
> +    priv->textEncoding = CString();
> +    priv->frameName = CString();

IIRC we had a discussion about this and we decided it was not a good method to free memory? Is the mac port doing something like this? I'd say you'll be still leaking a full CString object this way...

> +
> +    if (priv->data)
> +        g_string_free(priv->data, FALSE);

I think you want to pass TRUE, not FALSE.

> +
> +/**
> + * webkit_web_resource_new:
> + * @data: the data to initialize the #WebKitWebResource
> + * @url: the url of the #WebKitWebResource
> + * @mime_type: the MIME type of the #WebKitWebResource
> + * @text_encoding_name: the text encoding name of the #WebKitWebResource
> + * @frame_name: the frame name of the #WebKitWebResource
> + *
> + * Returns a new #WebKitWebResource. The @text_encoding_name can be %NULL. The
> + * @frame_name argument can be used if the resource represents contents of an
> + * entire HTML frame, otherwise pass %NULL.
> + *
> + * Return value: a new #WebKitWebResource
> + *
> + * Since: 1.1.12
> + */
> +WebKitWebResource* webkit_web_resource_new(const gchar* data,
> +                                           const gchar* url,
> +                                           const gchar* mimeType,
> +                                           const gchar* textEncodingName,
> +                                           const gchar* frameName)
> +{
> +    ASSERT(data);
> +    ASSERT(url);
> +    ASSERT(mime_type);
> +
> +    RefPtr<SharedBuffer> buffer = SharedBuffer::create(data, strlen(data));

Mmm, data can't have explicit '0'? If it can strlen is wrong here (and I guess you'd need an extra 'length' parameter).

> +    WebKitWebResource* webResource = webkit_web_resource_new_with_core_resource(ArchiveResource::create(buffer, KURL(KURL(),String::fromUTF8(url)), String::fromUTF8(mimeType), String::fromUTF8(textEncodingName), String::fromUTF8(frameName)));
> +
> +    return webResource;
> +}
> +
> +/**
> + * webkit_web_resource_get_data:
> + * @web_resource: a #WebKitWebResource
> + *
> + * Returns the data of the @webResource.
> + *
> + * Return value: a #GString containing the character data of the @webResource.
> + * The string is owned by WebKit and should not be freed or destroyed.
> + *
> + * Since: 1.1.12
> + */
> +G_CONST_RETURN GString* webkit_web_resource_get_data(WebKitWebResource* webResource)

I believe in general 'const pointer object' makes little sense in C APIs, so I'd just return GString* with the note.

> +/**
> + * webkit_web_resource_get_url:
> + * @web_resource: a #WebKitWebResource
> + *
> + * Returns the URL of the resource
> + *
> + * Return value: the URL of the resource
> + *
> + * Since: 1.1.12
> + */
> +G_CONST_RETURN gchar* webkit_web_resource_get_url(WebKitWebResource* webResource)
> +{
> +    g_return_val_if_fail(WEBKIT_IS_WEB_RESOURCE(webResource), NULL);
> +
> +    WebKitWebResourcePrivate* priv = webResource->priv;
> +    if (!priv->resource)
> +        return NULL;
> +
> +    priv->url = priv->resource->url().string().utf8();
> +
> +    return priv->url.data();

Shouldn't you put the URL in priv->url once and then return always that? You are doing the same for the other properties.


> + * webkit_web_resource_get_text_encoding_name:
> + * @web_resource: a #WebKitWebResource
> + *
> + * Returns the text encoding name of the resource
> + *
> + * Return value: the text encoding name of the resource
> + *
> + * Since: 1.1.12
> + */
> +G_CONST_RETURN gchar* webkit_web_resource_get_text_encoding_name(WebKitWebResource* webResource)

We have other APIs to get the encoding name and we call them simply "get_encoding", so I think we should be consistent here probably.

Going to r- while we work on the issues.
Comment 28 Xan Lopez 2009-07-29 02:19:18 PDT
Comment on attachment 33015 [details]
updated datasource patch

> +void DocumentLoader::detachDataSource()
> +{
> +    ASSERT(!m_isDataSourceReffed);
> +    unrefDataSource();
> +}

Mmm, I don't think this ASSERT can be right, since unrefDataSource won't do anything if !m_isDataSourceReffed.

> +
> +void DocumentLoader::detachFromFrame()
> +{
> +    WebCore::DocumentLoader::detachFromFrame();
> +
> +    if (m_loadingResources.isEmpty())
> +        unrefDataSource();
> +
> +}

So this will silently not do stuff if there are resources still being loaded. I suppose that's OK?

> +
> +void DocumentLoader::decreaseLoadCount(unsigned long identifier)
> +{
> +    HashSet<unsigned long>::iterator it = m_loadingResources.find(identifier);
> +
> +    // It is valid for a load to be cancelled before it's started.
> +    if (it == m_loadingResources.end())
> +        return;
> +
> +    m_loadingResources.remove(it);
> +
> +    if (m_loadingResources.isEmpty() && !frame())
> +        unrefDataSource();
> +
> +}

Mmm, this seems to not do one unref per call, as opposed to the increaseLoadCount function which does that (for different resources). Is this done elsewhere?


>  WTF::PassRefPtr<WebCore::DocumentLoader> FrameLoaderClient::createDocumentLoader(const WebCore::ResourceRequest& request, const SubstituteData& substituteData)
>  {
> -    return DocumentLoader::create(request, substituteData);
> +    RefPtr<WebKit::DocumentLoader> loader = WebKit::DocumentLoader::create(request, substituteData);
> +
> +    WebKitWebDataSource* webDataSource = webkit_web_data_source_new_with_loader(loader.get());
> +    loader->setDataSource(webDataSource);
> +    g_object_unref(webDataSource);

Mmm, this seems a bit pointless:

- Create a loader.
- Create a datasource from the loader (and nothing else).
- Set the datasource in the loader.

Couldn't this happen automatically with a 'create' of our own or something like that?

> +#define WEBKIT_WEB_DATA_SOURCE_GET_PRIVATE(obj)        (G_TYPE_INSTANCE_GET_PRIVATE((obj), WEBKIT_TYPE_WEB_DATA_SOURCE, WebKitWebDataSourcePrivate))
> +
> +G_DEFINE_TYPE(WebKitWebDataSource, webkit_web_data_source, G_TYPE_OBJECT);
> +
> +static void webkit_web_data_source_dispose(GObject* object)
> +{
> +    WebKitWebDataSource* webDataSource = WEBKIT_WEB_DATA_SOURCE(object);
> +    WebKitWebDataSourcePrivate* priv = webDataSource->priv;
> +
> +    ASSERT(priv->loader);
> +    ASSERT(!priv->loader->isLoading());
> +    priv->loader->detachDataSource();
> +    priv->loader->deref();
> +
> +    if (priv->initialRequest)
> +        g_object_unref(priv->initialRequest);
> +
> +    if (priv->networkRequest)
> +        g_object_unref(priv->networkRequest);
> +
> +    if (priv->mainResource)
> +        g_object_unref(priv->mainResource);
> +
> +    if (priv->data)
> +        g_string_free(priv->data, FALSE);

You need to set all those to NULL, dispose can be called multiple times.

> +
> +    if (priv->subresources) {
> +        for (unsigned i=0; i < priv->subresources->len; i++) {
> +            WebKitWebResource* res = &g_array_index(priv->subresources, WebKitWebResource, i);
> +            if (res)
> +                g_object_unref(res);
> +        }
> +
> +        g_array_free(priv->subresources, TRUE);
> +    }

What's the point of using an array here? A list seems much more natural (seems Gustavo suggests a hash to be able to get easily a resource by its identifier, that might make sense too).

> +
> +    G_OBJECT_CLASS(webkit_web_data_source_parent_class)->dispose(object);
> +}
> +
> +static void webkit_web_data_source_finalize(GObject* object)
> +{
> +    WebKitWebDataSource* dataSource = WEBKIT_WEB_DATA_SOURCE(object);
> +    WebKitWebDataSourcePrivate* priv = dataSource->priv;
> +
> +    priv->unreachableURL = CString();

Same comment about this than in previous patch.

> +/**
> + * webkit_web_data_source_new:
> + * @request: the #WebKitNetworkRequest to use to create this data source
> + *
> + * Creates a new #WebKitWebDataSource from a #WebKitNetworkRequest. Normally,
> + * #WebKitWebFrame objects create their data sources so you will almost never
> + * want to invoke this method directly.
> + *
> + * Returns: a new #WebKitWebDataSource
> + *
> + * Since: 1.1.12
> + */
> +WebKitWebDataSource* webkit_web_data_source_new(WebKitNetworkRequest* request)
> +{
> +    ASSERT(request);
> +
> +    const gchar* uri = webkit_network_request_get_uri(request);
> +
> +    WebKitWebDataSource* datasource;
> +    datasource = webkit_web_data_source_new_with_loader(
> +        WebKit::DocumentLoader::create(ResourceRequest(KURL(KURL(), String::fromUTF8(uri))),
> +                                       SubstituteData()));
> +
> +    WebKitWebDataSourcePrivate* priv = datasource->priv;
> +    priv->initialRequest = request;
> +
> +    return datasource;
> +}

_new functions should, when possible, only be a call to g_object_new (otherwise bindings have to override them). This could probably be fixed by making the request/uri a property of the object and doing the needed setup there?

> +
> +/**
> + * webkit_web_data_source_get_text_encoding_name:

Same than in the other patch, we might want to just call thes "get_text_encoding" to be consistent with existing APIs.

> +
> +/**
> + * webkit_web_data_source_get_data:
> + * @data_source: a #WebKitWebDataSource
> + *
> + * Returns the raw data that represents the the frame's content.The data will
> + * be incomplete until the data has finished loading. Returns %NULL if the web
> + * frame hasn't loaded any data. Use webkit_web_data_source_is_loading to test
> + * if data source is in the process of loading.
> + *
> + * Return value: a #GString which contains the raw data that represents the @data_source or %NULL if the
> + * @data_source hasn't loaded any data.
> + *
> + * Since: 1.1.12
> + */
> +G_CONST_RETURN GString* webkit_web_data_source_get_data(WebKitWebDataSource* webDataSource)
> +{
> +    g_return_val_if_fail(WEBKIT_IS_WEB_DATA_SOURCE(webDataSource), NULL);
> +
> +    WebKitWebDataSourcePrivate* priv = webDataSource->priv;
> +
> +    RefPtr<SharedBuffer> mainResourceData = priv->loader->mainResourceData();
> +
> +    if (!mainResourceData)
> +        return NULL;
> +
> +    if (!priv->data)
> +        priv->data = g_string_new(mainResourceData->data());

Can the data have '0'? If yes, then you need the length and do g_string_new_with_len.

> +
> +/**
> + * webkit_web_data_source_get_subresources:
> + * @data_source: a #WebKitWebDataSource
> + *
> + * Returns the @data_source subresources that have finished downloading
> + *
> + * Return value: a #GArray of #WebKitWebResource. The array if owned by WebKit
> + * and should not be freed or destroyed.
> + *
> + * Since: 1.1.12
> + */
> +GArray* webkit_web_data_source_get_subresources(WebKitWebDataSource* webDataSource)

See previous comment about using a list or a hash table here.

> +
> +/**
> + * webkit_web_data_source_get_resource_for_url:
> + * @data_source: a #WebKitWebDataSource
> + * @url: the url of a resource
> + *
> + * Returns the #WebKitWebResource for the given URL
> + *
> + * Return value: the #WebKitWebResource for @url
> + *
> + * Since: 1.1.12
> + */
> +WebKitWebResource* webkit_web_data_source_get_resource_for_url(WebKitWebDataSource* webDataSource, const gchar* url)

Mmm, shouldn't this return one of the elements in the array we have?

Marking r- while we address some of the issues.
Comment 29 Xan Lopez 2009-07-29 05:10:20 PDT
Comment on attachment 33016 [details]
Updated frame data source API patch

> +static void web_data_source_fixture_teardown(WebDataSourceFixture* fixture,
> +                                             gconstpointer data)
> +{
> +    g_assert(fixture->mainFrame != NULL);
> +    g_assert(fixture->webView != NULL);

Just g_assert (fixture->mainFrame);, etc?


>  /**
> + * webkit_web_frame_get_data_source:
> + * @frame: a #WebKitWebFrame
> + *
> + * Returns the committed data source.
> + *
> + * Return value: the committed #WebKitWebDataSource or %NULL if the
> + * provisional data source is not done loading.
> + *
> + * Since: 1.1.5

1.1.5? :)

> + */
> +WebKitWebDataSource* webkit_web_frame_get_data_source(WebKitWebFrame* frame)
> +{
> +    g_return_val_if_fail(WEBKIT_IS_WEB_FRAME(frame), NULL);
> +
> +    Frame* coreFrame = core(frame);
> +    WebKitWebDataSource* dataSource;

You need to initialize this to NULL here, otherwise you could crash below.

> +
> +    if (coreFrame && coreFrame->loader()->frameHasLoaded())
> +        dataSource = webkit_web_frame_get_data_source_from_core_loader(coreFrame->loader()->documentLoader());
> +
> +    if (!dataSource)
> +        return NULL;
> +
> +    return dataSource;
> +}
> +
> +/**
> + * webkit_web_frame_get_provisional_data_source:
> + * @frame: a #WebKitWebFrame
> + *
> + * You use the webkit_web_frame_load_request method to initiate a request that
> + * creates a provisional data source. The provisional data source will
> + * transition to a committed data source once any data has been received. Use
> + * webkit_web_frame_get_data_source to get the committed data source.
> + *
> + * Return value: the provisional #WebKitWebDataSource or %NULL if a load
> + * request is not in progress or has been completed.
> + *
> + * Since: 1.1.5

Version again.

> + */
> +WebKitWebDataSource* webkit_web_frame_get_provisional_data_source(WebKitWebFrame* frame)
> +{
> +    g_return_val_if_fail(WEBKIT_IS_WEB_FRAME(frame), NULL);
> +
> +    Frame* coreFrame = core(frame);
> +    WebKitWebDataSource* dataSource = webkit_web_frame_get_data_source_from_core_loader(coreFrame->loader()->provisionalDocumentLoader());
> +
> +    if (!dataSource)
> +        return NULL;
> +
> +    return dataSource;

You can jsut return the dataSource, this does not really add anything.

Marking as r- for now.
Comment 30 Jan Alonzo 2009-07-30 18:45:48 PDT
(In reply to comment #25)
> > + * Return value: a #GArray of #WebKitWebResource. The array if owned by WebKit
> > + * and should not be freed or destroyed.
> 
> I keep thinking if we should not have a GHashtable here, instead, using the
> long identifier as key, and the WebResource object as value. Then we could have
> webkit_web_data_source_get_subresource(WebKitWebDataSource*, long identifier)
> to go straight to a subresource, if we know the resource we want (because, say,
> we're tracking it from willSendRequest).

Do you have a use case in mind for the identifier? For me, adding an identifier sounds like I need to keep track of the resources by their ID. How do I know which resource corresponds to which identifier (or vice-versa)? 

(In reply to comment #27)
> > +
> > +static void webkit_web_resource_finalize(GObject* object)
> > +{
> > +    WebKitWebResource* webResource = WEBKIT_WEB_RESOURCE(object);
> > +    WebKitWebResourcePrivate* priv = webResource->priv;
> > +
> > +    priv->url = CString();
> > +    priv->mimeType = CString();
> > +    priv->textEncoding = CString();
> > +    priv->frameName = CString();
> 
> IIRC we had a discussion about this and we decided it was not a good method to
> free memory? Is the mac port doing something like this? I'd say you'll be still
> leaking a full CString object this way...

The strings are owned by WebCore and the interface to get these strings return a const char*. Mac strings are NSStrings which I'm guessing does its own stuff.

So since they're owned by WebCore, we could just store them as const char*?

(In reply to comment #28)
> > +void DocumentLoader::detachFromFrame()
> > +{
> > +    WebCore::DocumentLoader::detachFromFrame();
> > +
> > +    if (m_loadingResources.isEmpty())
> > +        unrefDataSource();
> > +
> > +}
> 
> So this will silently not do stuff if there are resources still being loaded. I
> suppose that's OK?

I think it's fine but I'm not aware of any issues, if any. We don't want to unref the DS while some resources are still loading either. Should I add a FIXME?

> > +
> > +void DocumentLoader::decreaseLoadCount(unsigned long identifier)
> > +{
> > +    HashSet<unsigned long>::iterator it = m_loadingResources.find(identifier);
> > +
> > +    // It is valid for a load to be cancelled before it's started.
> > +    if (it == m_loadingResources.end())
> > +        return;
> > +
> > +    m_loadingResources.remove(it);
> > +
> > +    if (m_loadingResources.isEmpty() && !frame())
> > +        unrefDataSource();
> > +
> > +}
> 
> Mmm, this seems to not do one unref per call, as opposed to the
> increaseLoadCount function which does that (for different resources). Is this
> done elsewhere?

Mac does the same thing. We're checking if loadingResources is empty the same reason we're checking in ::detachFromFrame. 
> 
> >  WTF::PassRefPtr<WebCore::DocumentLoader> FrameLoaderClient::createDocumentLoader(const WebCore::ResourceRequest& request, const SubstituteData& substituteData)
> >  {
> > -    return DocumentLoader::create(request, substituteData);
> > +    RefPtr<WebKit::DocumentLoader> loader = WebKit::DocumentLoader::create(request, substituteData);
> > +
> > +    WebKitWebDataSource* webDataSource = webkit_web_data_source_new_with_loader(loader.get());
> > +    loader->setDataSource(webDataSource);
> > +    g_object_unref(webDataSource);
> 
> Mmm, this seems a bit pointless:
> 
> - Create a loader.
> - Create a datasource from the loader (and nothing else).
> - Set the datasource in the loader.
> 
> Couldn't this happen automatically with a 'create' of our own or something like
> that?

Both depend on one another that's why it's like that. I can create a convenience function but I don't think it will simplify things. And besides you only need to do  it in this place.

Thanks for the reviews. I'll update the patches once we've discussed the issues above.
Comment 31 Gustavo Noronha (kov) 2009-08-11 11:40:12 PDT
(In reply to comment #30)
> (In reply to comment #25)
> > I keep thinking if we should not have a GHashtable here, instead, using the
> > long identifier as key, and the WebResource object as value. Then we could have
> > webkit_web_data_source_get_subresource(WebKitWebDataSource*, long identifier)
> > to go straight to a subresource, if we know the resource we want (because, say,
> > we're tracking it from willSendRequest).
> 
> Do you have a use case in mind for the identifier? For me, adding an identifier
> sounds like I need to keep track of the resources by their ID. How do I know
> which resource corresponds to which identifier (or vice-versa)? 

It's not adding an identifier. It's using the identifier that webcore assigns to it, already. Say we will implement a new signal to notify the application that a resource has been loaded through a XMLHttpRequest (by implementing FrameLoaderClient::dispatchDidLoadResourceByXMLHttpRequest).

That method gets as its first argument the identifier, so I can get the Resource object from the DataSource in O(1), and emit the signal with it, or I can emit the signal with the identifier, and the application can grab the Resource object in O(1). If we don't map the identifiers to the Resource objects, we don't have a way of getting hold of the object at all when on FrameLoaderClient::dispatchDidLoadResourceByXMLHttpRequest.

Makes sense? Mac seems to do something similar to this, btw. It stores objects in a mapping, which is indexed by the identifiers. See_objectForIdentifier and surrounding code in their WebView.mm.
Comment 32 Jan Alonzo 2009-08-11 13:17:53 PDT
(In reply to comment #31)
> (In reply to comment #30)
> > (In reply to comment #25)
> > > I keep thinking if we should not have a GHashtable here, instead, using the
> > > long identifier as key, and the WebResource object as value. Then we could have
> > > webkit_web_data_source_get_subresource(WebKitWebDataSource*, long identifier)
> > > to go straight to a subresource, if we know the resource we want (because, say,
> > > we're tracking it from willSendRequest).
> > 
> > Do you have a use case in mind for the identifier? For me, adding an identifier
> > sounds like I need to keep track of the resources by their ID. How do I know
> > which resource corresponds to which identifier (or vice-versa)? 
> 
> It's not adding an identifier. It's using the identifier that webcore assigns
> to it, already. Say we will implement a new signal to notify the application
> that a resource has been loaded through a XMLHttpRequest (by implementing
> FrameLoaderClient::dispatchDidLoadResourceByXMLHttpRequest).

I meant you were sending the identifier as part of of the event. Why?

> That method gets as its first argument the identifier, so I can get the
> Resource object from the DataSource in O(1), and emit the signal with it, or I
> can emit the signal with the identifier, and the application can grab the
> Resource object in O(1). If we don't map the identifiers to the Resource
> objects, we don't have a way of getting hold of the object at all when on
> FrameLoaderClient::dispatchDidLoadResourceByXMLHttpRequest.
> 
> Makes sense? Mac seems to do something similar to this, btw. It stores objects
> in a mapping, which is indexed by the identifiers. See_objectForIdentifier and
> surrounding code in their WebView.mm.

Yeah and I think we can do the same by making the lookup internal and send the resource needed by the user. Isn't sending the identifier + "get resource by id" the same as doing the lookup internally and just send the resource as part of outgoing-request?
Comment 33 Gustavo Noronha (kov) 2009-08-12 18:40:39 PDT
(In reply to comment #32)
> > It's not adding an identifier. It's using the identifier that webcore assigns
> > to it, already. Say we will implement a new signal to notify the application
> > that a resource has been loaded through a XMLHttpRequest (by implementing
> > FrameLoaderClient::dispatchDidLoadResourceByXMLHttpRequest).
> 
> I meant you were sending the identifier as part of of the event. Why?

Oh. Just because we didn't have Resource. I plan to send the actual object, now, indeed, which is why I decided to not push it before DataSource is decided on and committed.

> > in a mapping, which is indexed by the identifiers. See_objectForIdentifier and
> > surrounding code in their WebView.mm.
> 
> Yeah and I think we can do the same by making the lookup internal and send the
> resource needed by the user. Isn't sending the identifier + "get resource by
> id" the same as doing the lookup internally and just send the resource as part
> of outgoing-request?

So yeah, we can make the lookup internally, that sounds good to me. I am just saying we need this to be doable, thus my proposal of having the identifier as a mapping key to the resource object.
Comment 34 Jan Alonzo 2009-08-15 01:01:20 PDT
Created attachment 34892 [details]
Updated WebResource patch
Comment 35 Jan Alonzo 2009-08-15 22:28:37 PDT
Created attachment 34920 [details]
updated patch
Comment 36 Xan Lopez 2009-08-19 03:26:22 PDT
Comment on attachment 34892 [details]
Updated WebResource patch


> +
> +static void test_webkit_web_resource_get_data(WebResourceFixture* fixture, gconstpointer data)
> +{
> +    GString* charData = webkit_web_resource_get_data(fixture->webResource);
> +    g_assert_cmpstr(charData->str, ==, "<html></html>");
> +}

It would be good to do a test with data that has zeros in it.

> +    g_object_class_install_property(gobject_class,
> +                                    PROP_URL,
> +                                    g_param_spec_string(
> +                                    "url",
> +                                    _("URL"),
> +                                    _("The url of the resource"),
> +                                    NULL,
> +                                    WEBKIT_PARAM_READABLE));
> +
> +    g_object_class_install_property(gobject_class,
> +                                    PROP_MIME_TYPE,
> +                                    g_param_spec_string(
> +                                    "mime-type",
> +                                    _("MIME Type"),
> +                                    _("The MIME type of the resource"),
> +                                    NULL,
> +                                    WEBKIT_PARAM_READABLE));
> +
> +    g_object_class_install_property(gobject_class,
> +                                    PROP_TEXT_ENCODING_NAME,
> +                                    g_param_spec_string(
> +                                    "text-encoding-name",
> +                                    _("Text Encoding Name"),
> +                                    _("The text encoding name of the resource"),
> +                                    NULL,
> +                                    WEBKIT_PARAM_READABLE));
> +
> +    g_object_class_install_property(gobject_class,
> +                                    PROP_FRAME_NAME,
> +                                    g_param_spec_string(
> +                                    "frame-name",
> +                                    _("Frame Name"),
> +                                    _("The frame name of the resource"),
> +                                    NULL,
> +                                    WEBKIT_PARAM_READABLE));

Please document the properties before landing this. Otherwise looks great, r=me.
Comment 37 Jan Alonzo 2009-08-19 03:35:56 PDT
Comment on attachment 34892 [details]
Updated WebResource patch

I'll land the patches in this bug.
Comment 38 Xan Lopez 2009-08-19 04:01:36 PDT
(In reply to comment #37)
> (From update of attachment 34892 [details])
> I'll land the patches in this bug.

Actually, I have one more comment for the resource bug:

in webkit_web_resource_new, the size parameter should be a gssize and '-1' would mean the data is zero terminated and the code would just use strlen itself.
Comment 39 Xan Lopez 2009-08-19 05:58:04 PDT
Have been playing with the patches a bit, some comments:

- the frame APIs to get the provisional and normal DS shouldn't check if they are valid or not, they should rely on the loaders lifecycle and just try to create a DS from them. If it's NULL then we'll return NULL.

- the provisional DS seems to be missing all the data its loader has. On closer inspection the mainResource it has inside is also empty, but if you ask the loader for, for example, the URL, you'll get the right one. Not sure if the mainResource being empty or not is expected, but in any case the DS should return as much data as it can.

- the doc says the normal DS should be NULL until the load is committed, but this does not seem to happen.
Comment 40 Gustavo Noronha (kov) 2009-08-19 14:21:12 PDT
FYI, I have written the 'view source' functionality for Epiphany based on this API, to serve as a simple test-drive: http://bugzilla.gnome.org/show_bug.cgi?id=503968

I plan to also implement 'save as' using the datasource API.
Comment 41 Xan Lopez 2009-08-19 23:33:44 PDT
(In reply to comment #39)
> inspection the mainResource it has inside is also empty, but if you ask the
> loader for, for example, the URL, you'll get the right one. Not sure if the
> mainResource being empty or not is expected, but in any case the DS should
> return as much data as it can.
> 

So, this is because mainResource gets its data from the stored response, which in the case of the provisionalLoader will be completely empty for obvious reasons. Using the initialRequest (the request does not work, I'm not completely sure of the difference here) has at least the URL, which is good enough for my needs in Epiphany in order to fix http://bugzilla.gnome.org/show_bug.cgi?id=591294
Comment 42 Xan Lopez 2009-08-22 23:45:06 PDT
Comment on attachment 34920 [details]
updated patch

I had already r- this... hopefully jan still has the mail around :)
Comment 43 Jan Alonzo 2009-08-28 00:29:34 PDT
Created attachment 38720 [details]
Update web resource patch

Added documentation and some refinements.
Comment 44 Jan Alonzo 2009-08-28 00:30:15 PDT
Created attachment 38721 [details]
updated patch
Comment 45 Jan Alonzo 2009-08-28 00:31:14 PDT
Created attachment 38722 [details]
Updated patch with working tests

Patch updated with (now) working unit test!
Comment 46 Xan Lopez 2009-08-28 02:33:25 PDT
Comment on attachment 38720 [details]
Update web resource patch

> +/**
> + * webkit_web_resource_new:
> + * @data: the data to initialize the #WebKitWebResource
> + * @length: the length of @data

Please mention that length can be < 0 to indicate that 'data' is null-terminated.

> + * @uri: the uri of the #WebKitWebResource
> + * @mime_type: the MIME type of the #WebKitWebResource
> + * @text_encoding_name: the text encoding name of the #WebKitWebResource
> + * @frame_name: the frame name of the #WebKitWebResource
> + *
> + * Returns a new #WebKitWebResource. The @text_encoding_name can be %NULL. The
> + * @frame_name argument can be used if the resource represents contents of an
> + * entire HTML frame, otherwise pass %NULL.
> + *
> + * Return value: a new #WebKitWebResource
> + *
> + * Since: 1.1.14
> + */
> +WebKitWebResource* webkit_web_resource_new(const gchar* data,
> +                                           gssize size,
> +                                           const gchar* uri,
> +                                           const gchar* mimeType,
> +                                           const gchar* encoding,
> +                                           const gchar* frameName)
> +{
> +    ASSERT(data);
> +    ASSERT(uri);
> +    ASSERT(mime_type);

I think this should be g_return_val_if_fail, not ASSERT. You don't want to crash at runtime in any case IMHO.

Still looks good to me with these two changes, 1/2 of r=me and kov can do the final r+ since it's new API.
Comment 47 Gustavo Noronha (kov) 2009-08-28 06:36:19 PDT
Comment on attachment 38720 [details]
Update web resource patch

> +                                           const gchar* frameName)
> +{
> +    ASSERT(data);
> +    ASSERT(uri);
> +    ASSERT(mime_type);

+1 for Xan's comments; also notice that this check would only have any effect in debug builds, and we do want checks in runtime for release builds =).
Comment 48 Gustavo Noronha (kov) 2009-08-31 05:43:23 PDT
Comment on attachment 38721 [details]
updated patch

Looks good to me. Xan? =)
Comment 49 Gustavo Noronha (kov) 2009-08-31 05:44:35 PDT
Comment on attachment 38722 [details]
Updated patch with working tests

Same here, looks good.
Comment 50 Xan Lopez 2009-08-31 06:48:27 PDT
Comment on attachment 38721 [details]
updated patch

> +    if (priv->data) {
> +        g_string_free(priv->data, TRUE);
> +        priv->data = NULL;
> +    }

Being pedantic, priv->data should be freed in finalize, since it's not a reference counted object :)

> +
> +    G_OBJECT_CLASS(webkit_web_data_source_parent_class)->dispose(object);
> +}
> +


> +/**
> + * webkit_web_data_source_new_with_request:
> + * @request: the #WebKitNetworkRequest to use to create this data source
> + *
> + * Creates a new #WebKitWebDataSource from a #WebKitNetworkRequest. Normally,
> + * #WebKitWebFrame objects create their data sources so you will almost never
> + * want to invoke this method directly.

Do you have any use case for users where it would make sense to use this? It would be good to mention it I think.

> +/**
> + * webkit_web_data_source_get_initial_request:
> + * @data_source: a #WebKitWebDataSource
> + *
> + * Returns a reference to the original request that was used to load the web
> + * content

I think I would use this to expand the introductory section at the begining of the file.

> + *
> + * Return value: the original #WebKitNetworkRequest
> + *
> + * Since: 1.1.14
> + */

> +
> +/**
> + * webkit_web_data_source_get_network_request:
> + * @data_source: a #WebKitWebDataSource
> + *
> + * Returns a #WebKitNetworkRequest that was used to create this #WebKitWebDataSource
> + *
> + * Return value: the #WebKitNetworkRequest that created the @data_source or
> + * %NULL if the @data_source is not attached to the frame or the frame hasn't been loaded.

It's still not clear to me from the doc what's the difference between this and the initial request. Only that if this function returns !NULL it means the resource has been loaded and attached to a frame? Ie, something similar to the provisional vs non-provisional data source? Could you clarify it in the documentation? And add tests for it :)

> + *
> + * Since 1.1.14
> + */

I'm going to r+, but let's clarify the documentation a bit and work on adding tests ASAP.
+        Reviewed by NOBODY (OOPS!).
> +
> +        [Gtk] Implement a WebDataSource for the gtk port
> +        https://bugs.webkit.org/show_bug.cgi?id=24758
> +
>          Implement WebKitWebResource for the resource-related API for
>          WebKitWebDataSource.
>
Comment 51 Xan Lopez 2009-08-31 07:09:25 PDT
Comment on attachment 38722 [details]
Updated patch with working tests

r=me