Bug 89873 - [WK2][GTK] Implement new API to save a web page using MHTML
Summary: [WK2][GTK] Implement new API to save a web page using MHTML
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Mario Sanchez Prada
URL:
Keywords: Gtk
Depends on: 89872
Blocks:
  Show dependency treegraph
 
Reported: 2012-06-25 06:05 PDT by Mario Sanchez Prada
Modified: 2012-08-10 02:51 PDT (History)
7 users (show)

See Also:


Attachments
Patch proposal + Unit test + Documentation (19.60 KB, patch)
2012-06-25 07:01 PDT, Mario Sanchez Prada
cgarcia: review-
gustavo: commit-queue-
Details | Formatted Diff | Diff
Patch Proposal + Unit Tests + Documentation (19.30 KB, patch)
2012-06-26 03:09 PDT, Mario Sanchez Prada
gustavo: commit-queue-
Details | Formatted Diff | Diff
Patch Proposal + Unit Tests + Documentation (19.31 KB, patch)
2012-06-26 10:14 PDT, Mario Sanchez Prada
no flags Details | Formatted Diff | Diff
Patch Proposal + Unit Tests + Documentation (18.14 KB, patch)
2012-08-08 00:23 PDT, Mario Sanchez Prada
cgarcia: review-
Details | Formatted Diff | Diff
Patch Proposal + Unit Tests + Documentation (18.88 KB, patch)
2012-08-09 05:05 PDT, Mario Sanchez Prada
cgarcia: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Mario Sanchez Prada 2012-06-25 06:05:01 PDT
It would be great if we could ask a WebKitWebView to save the current web page, and its associated external resources, into a self-contained file we browse later (maybe offline) or send to someone else, for instance.

In this regard, we could initially use the MHTML format[1] (already supported  in WebKit since some time ago [2][3]), but it could probably be good if we did not tied ourselves up into it and left the door open to some other possibilities that might be interesting as well (e.g. embedding external resources as 'data URIs'[4]).

Last, it would be interesting both to have a way to easily save the page into disk (a save_to_file() function) but also a way to receive the serialized page in some other way, so the end application can decide what to do with it.

After talking to Carlos, we think a proposal like this one could make sense:

    typedef enum {
        WEBKIT_SAVE_MODE_MHTML // The only supported 'Save Mode' at the moment
    } WebKitWebViewSaveMode;


    WEBKIT_API void
    webkit_web_view_save                               (WebKitWebView             *web_view,
                                                        WebKitWebViewSaveMode      save_mode,
                                                        GCancellable              *cancellable,
                                                        GAsyncReadyCallback        callback,
                                                        gpointer                   user_data);

    WEBKIT_API GInputStream*
    webkit_web_view_save_finish                        (WebKitWebView             *web_view,
                                                        GAsyncResult              *result,
                                                        GError                   **error);

    WEBKIT_API void
    webkit_web_view_save_to_file                       (WebKitWebView             *web_view,
                                                        const gchar               *filepath,
                                                        WebKitWebViewSaveMode      save_mode,
                                                        GCancellable              *cancellable,
                                                        GAsyncReadyCallback        callback,
                                                        gpointer                   user_data);

    WEBKIT_API gboolean
    webkit_web_view_save_to_file_finish                (WebKitWebView             *web_view,
                                                        GAsyncResult              *result,
                                                        GError                   **error)


I will send a mail to webkit-gtk ML and will start attaching patches soon.

[1] http://en.wikipedia.org/wiki/MHTML
[2] https://bugs.webkit.org/show_bug.cgi?id=7168
[3] https://bugs.webkit.org/show_bug.cgi?id=7169
[4] http://en.wikipedia.org/wiki/Data_URI_scheme
Comment 1 Mario Sanchez Prada 2012-06-25 06:56:51 PDT
Depending on bug 89872, since we need a (currently missing) C API to implement this feature
Comment 2 Mario Sanchez Prada 2012-06-25 07:01:40 PDT
Created attachment 149284 [details]
Patch proposal + Unit test + Documentation

Here comes the (first) patch
Comment 3 WebKit Review Bot 2012-06-25 07:04:05 PDT
Thanks for the patch. If this patch contains new public API please make sure it follows the guidelines for new WebKit2 GTK+ API. See http://trac.webkit.org/wiki/WebKitGTK/AddingNewWebKit2API
Comment 4 Gustavo Noronha (kov) 2012-06-25 07:09:26 PDT
Comment on attachment 149284 [details]
Patch proposal + Unit test + Documentation

Attachment 149284 [details] did not pass gtk-ews (gtk):
Output: http://queues.webkit.org/results/13098099
Comment 5 Carlos Garcia Campos 2012-06-26 00:41:07 PDT
Comment on attachment 149284 [details]
Patch proposal + Unit test + Documentation

View in context: https://bugs.webkit.org/attachment.cgi?id=149284&action=review

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2184
> +static void webViewFileSavedCallback(GObject *object, GAsyncResult *result, gpointer data)

The * are placed wrongly here. Also, I would call this something like replaceContentsCallback to make it clear this is the callback of g_file_replace_contents_async()

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2190
> +    if (!g_file_replace_contents_finish(file, result, NULL, &error)) {

Use 0 instead of NULL

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2192
> +        return;

Don't return here, or the operation will never complete.

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2198
> +static void webViewMHTMLDataGotCallback(WKDataRef wkData, WKErrorRef, void* context)

getContentsAsMHTMLDataCallback?

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2208
> +        /* We need to retain the data until the asyncronous process
> +           initiated by the user has finished completely. */

Use C++ style comments

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2211
> +        /* If we are saving to a file we need to write the data on disk before finishing. */

Ditto.

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2215
> +            GFile* file = g_file_new_for_path(data->filePath.get());

Use GRefPtr for the file, since you are leaking it. g_file_replace_contents_async() will take a reference of the file, so you can just adopt the ref here.

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2217
> +            g_file_replace_contents_async(file, reinterpret_cast<const gchar*>(WKDataGetBytes(data->wkData.get())), WKDataGetSize(data->wkData.get()), NULL, FALSE, G_FILE_CREATE_REPLACE_DESTINATION,
> +                                          data->cancellable.get(), webViewFileSavedCallback, g_object_ref(result.get()));

Use 0 instead of NULL.

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2241
> +void webkit_web_view_save(WebKitWebView *webView, WebKitWebViewSaveMode saveMode, GCancellable* cancellable, GAsyncReadyCallback callback, gpointer userData)

WebKitWebView *webView -> WebKitWebView* webView

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2247
> +    /* We only support MHTML at the moment */
> +    if (saveMode != WEBKIT_SAVE_MODE_MHTML)
> +        return;

Use g_return_if_fail for this check.

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2256
> +    WKPageGetContentsAsMHTMLData(wkPage, false, result, webViewMHTMLDataGotCallback);

Just curiosity, why not using binary mode?

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2288
> +    GInputStream* dataStream = 0;
> +    ViewSaveAsyncData* data = static_cast<ViewSaveAsyncData*>(g_simple_async_result_get_op_res_gpointer(simple));
> +    if (data) {
> +        gsize length = WKDataGetSize(data->wkData.get());
> +        dataStream = g_memory_input_stream_new_from_data(g_memdup(WKDataGetBytes(data->wkData.get()), length), length, g_free);
> +    }
> +
> +    return dataStream;

We are saying that the function returns NULL in case of error, but here it can return NULL for a valid empty page. If data is NULL, we could just return an empty memory stream

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2294
> + * @path: a string containing a relative or absolute path for the file where the current web page should be saved to. The string must be encoded in the glib filename encoding.

I'm wondering whether we should receive a GFile instead, and use it directly in g_file_replace_contents

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2316
> +    /* We only support MHTML at the moment */
> +    if (saveMode != WEBKIT_SAVE_MODE_MHTML)
> +        return;

Use g_return_if_fail() for this check

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.h:130
> +typedef enum {
> +    WEBKIT_SAVE_MODE_MHTML
> +} WebKitWebViewSaveMode;

Maybe WebKitSaveMode? or WebKitWebSaveMode and WEBKIT_WEB_SAVE_MODE_MHTML?

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:867
> +static void webViewSavedToFileCallback(GObject *object, GAsyncResult *res, WebViewTest* test)

GObject *object -> GObject* object
GAsyncResult *res -> GAsyncResult* result

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:873
> +    g_main_loop_quit(test->m_mainLoop);

We have test->quitMainLoop() now.

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:879
> +    GInputStream* inputStream = webkit_web_view_save_finish(test->m_webView, res, &error);

Use GRefPtr and adopt the ref to not leak it.

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:884
> +    GFile* file = g_file_new_for_path(kSaveDestinationFilePath);

Ditto.

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:885
> +    GInputStream* savedFileInputStream = G_INPUT_STREAM(g_file_read(file, NULL, &error));

Ditto. And use 0 instead of NULL

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:894
> +    GOwnPtr<gchar> buffer1(g_new0(gchar, 512));
> +    GOwnPtr<gchar> buffer2(g_new0(gchar, 512));

Why not stack allocated buffers? 512 is small enough to use the stack, I'd say.

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:898
> +    while (readBytes = g_input_stream_read(inputStream, buffer1.outPtr(), 512, NULL, &error)) {

Use 0 instead of NULL

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:901
> +        totalBytesFromFile += g_input_stream_read(savedFileInputStream, buffer2.outPtr(), 512, NULL, &error);

Ditto.

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:913
> +    // Check that the same minimum amount of bytes are received.
> +    g_assert_cmpint(totalBytesFromStream, ==, totalBytesFromFile);
> +    g_assert_cmpint(totalBytesFromStream, >=, 2048);
> +
> +    g_input_stream_close(inputStream, NULL, NULL);
> +    g_input_stream_close(savedFileInputStream, NULL, NULL);

Since you are not going to compare the contents, I think this could be a lot simpler if you just get the size of the file without reading it and only read from the input stream

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:937
> +    test->waitUntilMainLoopFinishes();
> +}

You should unlink and free the kSaveDestinationFilePath, or g_rmdir(kTempDirectory); will fail
Comment 6 Mario Sanchez Prada 2012-06-26 03:09:41 PDT
Created attachment 149501 [details]
Patch Proposal + Unit Tests + Documentation

Attaching new patch addressing these issues. See comments below...

(In reply to comment #5)
> (From update of attachment 149284 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=149284&action=review
> 
> > Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2184
> > +static void webViewFileSavedCallback(GObject *object, GAsyncResult *result, gpointer data)
> 
> The * are placed wrongly here. Also, I would call this something like replaceContentsCallback to make it clear this is the callback of g_file_replace_contents_async()

Fixed.

> > Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2190
> > +    if (!g_file_replace_contents_finish(file, result, NULL, &error)) {
> 
> Use 0 instead of NULL

Fixed.
 
> > Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2192
> > +        return;
> 
> Don't return here, or the operation will never complete.

Fixed.

> > Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2198
> > +static void webViewMHTMLDataGotCallback(WKDataRef wkData, WKErrorRef, void* context)
> 
> getContentsAsMHTMLDataCallback?

Fixed.
 
> > Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2208
> > +        /* We need to retain the data until the asyncronous process
> > +           initiated by the user has finished completely. */
> 
> Use C++ style comments

Fixed.
 
> > Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2211
> > +        /* If we are saving to a file we need to write the data on disk before finishing. */
> 
> Ditto.

Fixed.

> > Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2215
> > +            GFile* file = g_file_new_for_path(data->filePath.get());
> 
> Use GRefPtr for the file, since you are leaking it. g_file_replace_contents_async() will take a reference of the file, so you can just adopt the ref here.
> 

Fixed.

> > Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2217
> > +            g_file_replace_contents_async(file, reinterpret_cast<const gchar*>(WKDataGetBytes(data->wkData.get())), WKDataGetSize(data->wkData.get()), NULL, FALSE, G_FILE_CREATE_REPLACE_DESTINATION,
> > +                                          data->cancellable.get(), webViewFileSavedCallback, g_object_ref(result.get()));
> 
> Use 0 instead of NULL.
> 

Fixed.

> > Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2241
> > +void webkit_web_view_save(WebKitWebView *webView, WebKitWebViewSaveMode saveMode, GCancellable* cancellable, GAsyncReadyCallback callback, gpointer userData)
> 
> WebKitWebView *webView -> WebKitWebView* webView
> 

Fixed.

> > Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2247
> > +    /* We only support MHTML at the moment */
> > +    if (saveMode != WEBKIT_SAVE_MODE_MHTML)
> > +        return;
> 
> Use g_return_if_fail for this check.
> 

Fixed.

> > Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2256
> > +    WKPageGetContentsAsMHTMLData(wkPage, false, result, webViewMHTMLDataGotCallback);
> 
> Just curiosity, why not using binary mode?
> 

It will work either way in WebKit based browers, both for writing a valid .mht file and for being able to read it later on with a MHTML capable browser. However, considering the .mht is a file with content type text/html, I think it's better not to use binary encoding for the resources, so you can easily inspect it with a text editor (as the content type seems to suggest you should be able to do it).

Also, as per the following comment in Source/WebKit/public/WebPageSerializer.h, it seems binary encoding could be unsupported in some other browsers:

    [...]
    // Serializes the WebView contents to a MHTML representation.
    WEBKIT_EXPORT static WebCString serializeToMHTML(WebView*);

    // Similar to serializeToMHTML but uses binary encoding for the MHTML parts.
    // This results in a smaller MHTML file but it might not be supported by other browsers.
    WEBKIT_EXPORT static WebCString serializeToMHTMLUsingBinaryEncoding(WebView*);
    [...]

So, my vote goes for using printable characters for the .mht file.

> > Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2288
> > +    GInputStream* dataStream = 0;
> > +    ViewSaveAsyncData* data = static_cast<ViewSaveAsyncData*>(g_simple_async_result_get_op_res_gpointer(simple));
> > +    if (data) {
> > +        gsize length = WKDataGetSize(data->wkData.get());
> > +        dataStream = g_memory_input_stream_new_from_data(g_memdup(WKDataGetBytes(data->wkData.get()), length), length, g_free);
> > +    }
> > +
> > +    return dataStream;
> 
> We are saying that the function returns NULL in case of error, but here it can return NULL for a valid empty page. If data is NULL, we could just return an empty memory stream
> 

Fixed.

> > Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2294
> > + * @path: a string containing a relative or absolute path for the file where the current web page should be saved to. The string must be encoded in the glib filename encoding.
> 
> I'm wondering whether we should receive a GFile instead, and use it directly in g_file_replace_contents
>  

Fixed.

> > Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2316
> > +    /* We only support MHTML at the moment */
> > +    if (saveMode != WEBKIT_SAVE_MODE_MHTML)
> > +        return;
> 
> Use g_return_if_fail() for this check
>  

Fixed.
 
> > Source/WebKit2/UIProcess/API/gtk/WebKitWebView.h:130
> > +typedef enum {
> > +    WEBKIT_SAVE_MODE_MHTML
> > +} WebKitWebViewSaveMode;
> 
> Maybe WebKitSaveMode? or WebKitWebSaveMode and WEBKIT_WEB_SAVE_MODE_MHTML?
>   

Fixed. I used WebKitSaveMode.

> > Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:867
> > +static void webViewSavedToFileCallback(GObject *object, GAsyncResult *res, WebViewTest* test)
> 
> GObject *object -> GObject* object
> GAsyncResult *res -> GAsyncResult* result
>   

Fixed.

> > Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:873
> > +    g_main_loop_quit(test->m_mainLoop);
> 
> We have test->quitMainLoop() now.
>   

Fixed.

> > Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:879
> > +    GInputStream* inputStream = webkit_web_view_save_finish(test->m_webView, res, &error);
> 
> Use GRefPtr and adopt the ref to not leak it.
>   

Fixed.

> > Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:884
> > +    GFile* file = g_file_new_for_path(kSaveDestinationFilePath);
> 
> Ditto.
>   

Fixed.

> > Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:885
> > +    GInputStream* savedFileInputStream = G_INPUT_STREAM(g_file_read(file, NULL, &error));
> 
> Ditto. And use 0 instead of NULL
>   

Fixed.

> > Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:894
> > +    GOwnPtr<gchar> buffer1(g_new0(gchar, 512));
> > +    GOwnPtr<gchar> buffer2(g_new0(gchar, 512));
> 
> Why not stack allocated buffers? 512 is small enough to use the stack, I'd say.
> 

When I first started writing this unit test I considered bigger sizes for that buffer and so that's why I ended up using the heap here. But I agree that 512 bytes is not that big, specially considering this is just an unit test, and code is simpler.  

Fixed.

> > Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:898
> > +    while (readBytes = g_input_stream_read(inputStream, buffer1.outPtr(), 512, NULL, &error)) {
> 
> Use 0 instead of NULL
>   

Fixed.

> > Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:901
> > +        totalBytesFromFile += g_input_stream_read(savedFileInputStream, buffer2.outPtr(), 512, NULL, &error);
> 
> Ditto.
>   

Fixed.

> > Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:913
> > +    // Check that the same minimum amount of bytes are received.
> > +    g_assert_cmpint(totalBytesFromStream, ==, totalBytesFromFile);
> > +    g_assert_cmpint(totalBytesFromStream, >=, 2048);
> > +
> > +    g_input_stream_close(inputStream, NULL, NULL);
> > +    g_input_stream_close(savedFileInputStream, NULL, NULL);
> 
> Since you are not going to compare the contents, I think this could be a lot simpler if you just get the size of the file without reading it and only read from the input stream
> 

Ok. I changed this test following your suggestions.

> > Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:937
> > +    test->waitUntilMainLoopFinishes();
> > +}
> 
> You should unlink and free the kSaveDestinationFilePath, or g_rmdir(kTempDirectory); will fail

Fixed.
Comment 7 Gustavo Noronha (kov) 2012-06-26 03:17:43 PDT
Comment on attachment 149501 [details]
Patch Proposal + Unit Tests + Documentation

Attachment 149501 [details] did not pass gtk-ews (gtk):
Output: http://queues.webkit.org/results/13094437
Comment 8 Mario Sanchez Prada 2012-06-26 10:14:07 PDT
Created attachment 149551 [details]
Patch Proposal + Unit Tests + Documentation

There was a wrong line in the previous patch, so attaching a new patch with the following difference:

  -            ASSERT(G_IS_FILE(data->file));
  +            ASSERT(G_IS_FILE(data->file.get()));
Comment 9 Gustavo Noronha (kov) 2012-06-26 10:20:03 PDT
Comment on attachment 149551 [details]
Patch Proposal + Unit Tests + Documentation

Attachment 149551 [details] did not pass gtk-ews (gtk):
Output: http://queues.webkit.org/results/13098490
Comment 10 Mario Sanchez Prada 2012-06-27 03:39:44 PDT
(In reply to comment #9)
> (From update of attachment 149551 [details])
> Attachment 149551 [details] did not pass gtk-ews (gtk):
> Output: http://queues.webkit.org/results/13098490

This is normal: it depends on the patch for bug 89872 for those missing symbols to be found
Comment 11 Mario Sanchez Prada 2012-08-08 00:23:08 PDT
Created attachment 157139 [details]
Patch Proposal + Unit Tests + Documentation

Attaching a newly rebase version of this patch (so it passes in EWS) now that bug 89872 has been fixed
Comment 12 Carlos Garcia Campos 2012-08-08 02:18:02 PDT
Comment on attachment 157139 [details]
Patch Proposal + Unit Tests + Documentation

View in context: https://bugs.webkit.org/attachment.cgi?id=157139&action=review

Patch looks good, r- only because of the unit tests

> Source/WebKit2/ChangeLog:8
> +        Implemented new asynchronous API in WebKitWebView.

This sounds too generic, please explain better what this patch is about.

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2458
> + * specified in #WebKitSaveMode.

specified in @save_mode, no?

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2508
> +    ViewSaveAsyncData* data = static_cast<ViewSaveAsyncData*>(g_simple_async_result_get_op_res_gpointer(simple));
> +    if (data) {
> +        gsize length = WKDataGetSize(data->wkData.get());
> +        g_memory_input_stream_add_data(G_MEMORY_INPUT_STREAM(dataStream), g_memdup(WKDataGetBytes(data->wkData.get()), length), length, g_free);
> +    }

data can't be null here, you should probably check that length is not 0 before adding data to the stream. 

ViewSaveAsyncData* data = static_cast<ViewSaveAsyncData*>(g_simple_async_result_get_op_res_gpointer(simple));
gsize length = WKDataGetSize(data->wkData.get());
if (length)
    g_memory_input_stream_add_data(G_MEMORY_INPUT_STREAM(dataStream), g_memdup(WKDataGetBytes(data->wkData.get()), length), length, g_free);

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2524
> + * specified in #WebKitSaveMode and writing it to @file.

specified in @save_mode, no?

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2557
> + * Returns: %TRUE if the web page was successfully saved on disk or %FALSE otherwise.

I would said save saved to file, and avoid the word "disk"

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.h:124
> + * @WEBKIT_SAVE_MODE_MHTML: Save the current page using the MHTML
> + * format.

This could be a single line

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.h:127
> + * Enum values to specify the different ways in which a #WebKitWebView
> + * will be saved the current web page into a self-contained file.

"can save its current web page into a self-contained file" maybe

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.h:341
> +WEBKIT_API GInputStream*

GInputStream* -> GInputStream *

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:953
> +static void webViewSavedToFileCallback(GObject *object, GAsyncResult *res, WebViewTest* test)

GObject *object -> GObject* object
GAsyncResult *res -> GAsyncResult *result

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:988
> +    // Check that the a minimum amount of bytes are received in the stream.
> +    g_assert_cmpint(totalBytesFromStream, >=, 2048);

Why 2048?

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:992
> +    // Check that the file exists and that it contains at least the same amount of bytes.
> +    GRefPtr<GFileInfo> fileInfo = adoptGRef(g_file_query_info(file.get(), G_FILE_ATTRIBUTE_STANDARD_SIZE, static_cast<GFileQueryInfoFlags>(0), 0, 0));
> +    g_assert_cmpint(g_file_info_get_size(fileInfo.get()), >=, 2048);

It's a bit weird that you are checking the file exists in the save callback instead of the save_to_file one.

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:999
> +static void testWebViewSave(UIClientTest* test, gconstpointer)

Why is this a UIClientTest?

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:1018
> +    kSaveDestinationFilePath = g_build_filename(kTempDirectory, "testWebViewSaveResult.mht", NULL);
> +    GRefPtr<GFile> file = adoptGRef(g_file_new_for_path(kSaveDestinationFilePath));
> +    webkit_web_view_save_to_file(test->m_webView, file.get(), WEBKIT_SAVE_MODE_MHTML, 0, reinterpret_cast<GAsyncReadyCallback>(webViewSavedToFileCallback), test);
> +    test->waitUntilMainLoopFinishes();
> +
> +    webkit_web_view_save(test->m_webView, WEBKIT_SAVE_MODE_MHTML, 0, reinterpret_cast<GAsyncReadyCallback>(webViewSavedCallback), test);
> +    test->waitUntilMainLoopFinishes();

I would add a new class deriving from WebViewTest withj helper methods:

GInputStream* save();
GFile* saveToFile();

and here in the test function you can check the results, and compare them or whatever.

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:1050
> +    g_rmdir(kTempDirectory);

Here you should free kTempDirectory and kSaveDestinationFilePath I think
Comment 13 Mario Sanchez Prada 2012-08-09 05:05:52 PDT
Created attachment 157435 [details]
Patch Proposal + Unit Tests + Documentation

Attaching new version of the patch addressing issues pointed out by Carlos.

Notice that, as per the creation of a new class SaveWebViewText : WebViewTest, the global variables for the name of the temporary file and directory are not needed anymore, since those now belong to that new class (and so will be created and freed in the normal lifecycle for that test).
Comment 14 Carlos Garcia Campos 2012-08-09 05:23:28 PDT
Comment on attachment 157435 [details]
Patch Proposal + Unit Tests + Documentation

View in context: https://bugs.webkit.org/attachment.cgi?id=157435&action=review

> Source/WebKit2/UIProcess/API/gtk/WebKitWebView.cpp:2458
> + * specified in #WebKitSaveMode.

specified in @save_mode, no?

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:956
> +    SaveWebViewTest()
> +    {
> +        m_tempDirectory.set(g_dir_make_tmp("WebKit2SaveViewTest-XXXXXX", 0));

C++ initializer syntax:

SaveWebViewTest()
    : m_tempDirectory(g_dir_make_tmp("WebKit2SaveViewTest-XXXXXX", 0))

or isn't that possible with GOwnPtr?

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:962
> +        g_file_delete(m_file.get(), 0, 0);

Don't assume saveAndWaitForFile() has been called, even thought it's the case now, this class might be used in the future for other tests.

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:964
> +        g_input_stream_close(m_inputStream.get(), 0, 0);

Ditto.

> Source/WebKit2/UIProcess/API/gtk/tests/TestWebKitWebView.cpp:1037
> +    // Check that the file exists and that it contains at least the same amount of bytes.

the same amount of bytes than the stream?
Comment 15 Mario Sanchez Prada 2012-08-10 02:51:54 PDT
Committed r125267: <http://trac.webkit.org/changeset/125267>