WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
18343
[GTK] Soup backend must handle upload of multiple files
https://bugs.webkit.org/show_bug.cgi?id=18343
Summary
[GTK] Soup backend must handle upload of multiple files
Luca Bruno
Reported
2008-04-07 09:06:32 PDT
It's a missing feature. The curl backend currently handle this, but soup doesn't yet.
Attachments
proposed patch
(5.34 KB, patch)
2009-01-28 17:41 PST
,
Gustavo Noronha (kov)
no flags
Details
Formatted Diff
Diff
updated proposed patch
(5.63 KB, patch)
2009-01-29 12:13 PST
,
Gustavo Noronha (kov)
mrowe
: review-
Details
Formatted Diff
Diff
fixed proposed patch
(6.85 KB, patch)
2009-01-30 09:07 PST
,
Gustavo Noronha (kov)
no flags
Details
Formatted Diff
Diff
patch ready to be committed
(6.86 KB, patch)
2009-02-01 09:54 PST
,
Gustavo Noronha (kov)
no flags
Details
Formatted Diff
Diff
Show Obsolete
(4)
View All
Add attachment
proposed patch, testcase, etc.
Dan Winship
Comment 1
2008-04-07 09:56:45 PDT
"multipart" branch of libsoup git (
http://gnome.org/~danw/libsoup.git
) has various bits of support for multipart/form-data. (SoupMultipart, plus some new methods in soup-forms.h) that's probably ready to merge in to libsoup svn once I branch for GNOME 2.22 later today
Luca Bruno
Comment 2
2008-04-07 11:03:04 PDT
WebCore gives us already the elements with multipart headers, but not the data in case of files. In curl I just ended up by writing a function (readCallback if not mispelled) that wrote the whole contents of the form to a buffer given by curl. I think this can be done in soup too? Or do you think it's better to have the multipart API in first?
Dan Winship
Comment 3
2008-04-07 14:07:44 PDT
ah, ok, you're right, you don't need SoupMultipart at all. basically what you want to do is iterate over httpBody()->elements(), and soup_message_body_append() each chunk to the request_body. (As far as I understand the C++ bits, you should be able to use SOUP_MEMORY_TEMPORARY when appending, because the pointers you pass will remain valid until after the SoupMessage is destroyed. [This avoids needing to copy the data.]) For the files to be uploaded, you can either read them into memory like in curl, or you can copy the trick from libsoup/tests/simple-httpd.c of using mmap-based SoupBuffers. You should also call "soup_message_body_set_accumulate (msg->request_body, FALSE)", to tell libsoup that you don't need it to concatenate everything together into msg->request_body->data for you when it's done. (That's only available in libsoup 2.4.1, released today, so maybe forget about that for now...)
Gustavo Noronha (kov)
Comment 4
2009-01-28 17:41:03 PST
Created
attachment 27134
[details]
proposed patch I implemented what Dan suggested, using the libsoup test as an example. It worked in my tests using GtkLauncher and a simple test CherryPy application, uploading a single file.
Dan Winship
Comment 5
2009-01-28 18:48:41 PST
(In reply to
comment #4
)
> I implemented what Dan suggested
I'd said to skip the soup_message_body_set_accumulate call "for now", since it was only available in the very-latest libsoup when I wrote the comment. But that was a long time ago, and WebKit currently depends on a much newer libsoup than that anyway, so you should definitely add that. (Though this is just a performance thing, not a correctness thing.)
Gustavo Noronha (kov)
Comment 6
2009-01-29 02:25:14 PST
(In reply to
comment #5
)
> (In reply to
comment #4
) > > I implemented what Dan suggested > > I'd said to skip the soup_message_body_set_accumulate call "for now", since it > was only available in the very-latest libsoup when I wrote the comment. But > that was a long time ago, and WebKit currently depends on a much newer libsoup > than that anyway, so you should definitely add that. (Though this is just a > performance thing, not a correctness thing.)
Yeah, I considered that, actually. I tried adding the call to just before the for loop, and it caused the code to not work, I'm not really sure why. I believe, though, that having soup accumulate the data to request_body may be sound, since we are thinking of providing the SoupMessage to signals fired after the request has been prepared and sent. What's your opinion on that? This might actually exhaust virtual memory, I guess?
Dan Winship
Comment 7
2009-01-29 04:49:20 PST
(In reply to
comment #6
)
> Yeah, I considered that, actually. I tried adding the call to just before the > for loop, and it caused the code to not work, I'm not really sure why.
Hm... maybe something else expects msg->request_body to be filled in later on? Actually, now that I think about it, I'm not sure anyone else is using accumulate=FALSE on the *request* body, so maybe there's a bug in that code in libsoup...
> I > believe, though, that having soup accumulate the data to request_body may be > sound, since we are thinking of providing the SoupMessage to signals fired > after the request has been prepared and sent. What's your opinion on that? This > might actually exhaust virtual memory, I guess?
Well, the curl backend always behaves that way, right?, so it wouldn't be any worse than that. But there's not much reason to do the mmap hack in that case.
Dan Winship
Comment 8
2009-01-29 06:22:36 PST
(In reply to
comment #7
)
> But there's not much reason to do the mmap hack in that case.
Oops, strike that part. Leaving accumulate on means it will keep the mmapped buffers around longer than it would otherwise, but it won't actually concatenate everything into a single buffer unless someone explicitly calls soup_message_body_flatten(msg->request_body), so mmapping is still probably better memory-usage-wise.
Gustavo Noronha (kov)
Comment 9
2009-01-29 08:52:28 PST
(In reply to
comment #8
)
> (In reply to
comment #7
) > > But there's not much reason to do the mmap hack in that case. > > Oops, strike that part. Leaving accumulate on means it will keep the mmapped > buffers around longer than it would otherwise, but it won't actually > concatenate everything into a single buffer unless someone explicitly calls > soup_message_body_flatten(msg->request_body), so mmapping is still probably > better memory-usage-wise.
Yeah, the tests I've done show that only virtual memory and IO caches are affected, while the "private" memory actually allocated stays low, so I don't think a flatten is happening anywhere; using mmap is useful.
Gustavo Noronha (kov)
Comment 10
2009-01-29 08:55:19 PST
(In reply to
comment #7
)
> Hm... maybe something else expects msg->request_body to be filled in later on? > Actually, now that I think about it, I'm not sure anyone else is using > accumulate=FALSE on the *request* body, so maybe there's a bug in that code in > libsoup...
Maybe. I'll do some further testing.
> > believe, though, that having soup accumulate the data to request_body may be > > sound, since we are thinking of providing the SoupMessage to signals fired > > after the request has been prepared and sent. What's your opinion on that? This > > might actually exhaust virtual memory, I guess? > > Well, the curl backend always behaves that way, right?, so it wouldn't be any > worse than that. But there's not much reason to do the mmap hack in that case.
Yeah, we'd match curl functionality, but I am afraid we may be needlessly limiting the size of uploads this way, even though the bar would be set pretty high. We can always investigate this further down the line, though, when the need actually arises. Having some form of file upload for Soup right now is a great win already IMO.
Gustavo Noronha (kov)
Comment 11
2009-01-29 12:13:04 PST
Created
attachment 27154
[details]
updated proposed patch This patch only adds a TODO comment regarding the usage of the accumulate flag. We would have to require a development version of libsoup, so I'd rather wait 'till we are far enough into the future, or have ways of #ifdef'ing the line before adding it. As Dan says, this would improve performance, but not setting it to FALSE doesn't make the code incorrect.
Gustavo Noronha (kov)
Comment 12
2009-01-29 12:22:40 PST
(In reply to
comment #11
)
> This patch only adds a TODO comment regarding the usage of the accumulate flag.
when compared to my previous one, that is =)
Mark Rowe (bdash)
Comment 13
2009-01-30 05:11:29 PST
Comment on
attachment 27154
[details]
updated proposed patch Some brief style comments since I'm not familiar with soup at all.
> +#include <fcntl.h> > +#include <sys/types.h> > +#include <sys/stat.h> > +#include <unistd.h> > + > #include "config.h" > #include "CString.h" > #include "ResourceHandle.h"
Per our style guidelines, config.h and ResourceHandle.h should be included first in this file and grouped together, followed by other WebCore headers, then other headers. That is to say that the new includes should be moved down by the exiting gio / libsoup includes, and CString.h moved down into the other WebCore includes.
> @@ -58,6 +64,20 @@ enum > ERROR_BAD_NON_HTTP_METHOD > }; > > +struct mapping > +{ > + gpointer ptr; > + gsize length; > +}; > + > +static void > +free_mapping(gpointer data) > +{ > + struct mapping* mapping = (struct mapping*)data; > + munmap (mapping->ptr, mapping->length); > + g_slice_free (struct mapping, mapping); > +} > + > static void cleanupGioOperation(ResourceHandleInternal* handle);
Types should be named with an initial capital letter. The name "mapping" is rather generic and could perhaps better describe what the type represents. The return type and linkage specifier of a function declaration should be on the same line as the rest of the declaration. Function names in WebCore should use camelCase rather than lower_case_with_underscores. The extra "struct" qualifier when referring to the type should be omitted as it is not needed in C++. The type cast should use a C++-style cast rather than a C-style cast.
> + gsize num_elements = httpBody->elements().size(); > +
This should be of type size_t rather than gsize since it is only used to index in to a WebCore type. It should also be named in camelCase.
> + for (gsize count = 0; count < num_elements; count++) {
Should be size_t for the same reason as above. "i" would probably be a more idiomatic name than count.
> + const FormDataElement& element = httpBody->elements()[count]; > + > + if (element.m_type == FormDataElement::data) { > + soup_message_body_append(msg->request_body, SOUP_MEMORY_TEMPORARY, element.m_data.data(), element.m_data.size()); > + } else {
The one-line body of the if should not have {}s around it.
> + /* > + * mapping for uploaded files code inspired by technique used in > + * libsoup's simple-httpd test > + */ > + SoupBuffer* soupBuffer; > + struct stat statbuf;
These variables should be declared immediately before they're first needed.
> + struct mapping* mapping = g_slice_new (struct mapping); > + int fd = open(element.m_filename.utf8().data(), O_CLOEXEC|O_RDONLY); > + > + fstat(fd, &statbuf); > + > + mapping->ptr = mmap(NULL, statbuf.st_size, PROT_READ, MAP_PRIVATE, fd, 0); > + mapping->length = statbuf.st_size; > + > + close(fd);
This code really needs some error-handling. If the file has been deleted since the user selected it in the form control, opening the file will fail and we'll probably end up segfaulting after mmap returns an error and we treat it as a pointer. I'm going to mark this as r- due to the lack of error handling and style issues. If someone familiar with libsoup can chime in on whether the usage is correct then this can be r+'d when those issues are addressed.
Gustavo Noronha (kov)
Comment 14
2009-01-30 09:07:10 PST
Created
attachment 27182
[details]
fixed proposed patch Fxed style issues reported by bdash, and also added error checking for open() and mmap(). As I said on IRC, I believe we have soup covered since Dan worked with me to figure out why _set_accumulate(msg, FALSE) wasn't working, and verified the code in the process.
Mark Rowe (bdash)
Comment 15
2009-01-31 16:44:39 PST
Comment on
attachment 27182
[details]
fixed proposed patch I'm going to say r=me, though there are a few issues that should be addressed before landing.
> #if PLATFORM(GTK) > #if GLIB_CHECK_VERSION(2,12,0) > @@ -55,9 +61,24 @@ enum > { > ERROR_TRANSPORT, > ERROR_UNKNOWN_PROTOCOL, > - ERROR_BAD_NON_HTTP_METHOD > + ERROR_BAD_NON_HTTP_METHOD, > + ERROR_UNABLE_TO_MAP_FILE, > + ERROR_UNABLE_TO_OPEN_FILE, > +};
Do we need the two different error codes here? The "map file" one seems like an implementation detail that adds little benefit over the "open file" error code.
> + > +struct FileMapping > +{ > + gpointer ptr; > + gsize length; > }; > > +static void freeFileMapping(gpointer data) > +{ > + FileMapping* fileMapping = static_cast<FileMapping*>(data); > + munmap (fileMapping->ptr, fileMapping->length); > + g_slice_free (FileMapping, fileMapping); > +}
Shouldn't have a space between the function name and parenthesis in a function call.
> + > + struct stat statBuf; > + fstat(fd, &statBuf);
The struct keyword is unnecessary here.
> + > + FileMapping* fileMapping = g_slice_new (FileMapping);
There is extra space after the function name.
> + fileMapping->ptr = mmap(NULL, statBuf.st_size, PROT_READ, MAP_PRIVATE, fd, 0); > + if (fileMapping->ptr == MAP_FAILED) { > + ResourceError error("webkit-network-error", ERROR_UNABLE_TO_MAP_FILE, urlString, strerror(errno)); > + d->client()->didFail(this, error); > + freeFileMapping(fileMapping);
Calling freeFileMapping here will attempt to munmap a pointer of MAP_FAILED. Is that ok? I think we also need to close the file descriptor in this error path to avoid leaking it.
> + g_object_unref(msg); > + return false; > + } > + fileMapping->length = statBuf.st_size; > + > + close(fd); > + > + SoupBuffer* soupBuffer;
This declaration should be combined with the initialisation on the next line.
> + > + soupBuffer = soup_buffer_new_with_owner(fileMapping->ptr, fileMapping->length, fileMapping, freeFileMapping);
Gustavo Noronha (kov)
Comment 16
2009-01-31 18:22:54 PST
(In reply to
comment #15
)
> (From update of
attachment 27182
[details]
[review]) > I'm going to say r=me, though there are a few issues that should be addressed > before landing.
Thanks, I'll attach a patch ready for landing in a minute.
> Do we need the two different error codes here? The "map file" one seems like > an implementation detail that adds little benefit over the "open file" error > code.
Agreed. I'm removing the MAP one.
> Calling freeFileMapping here will attempt to munmap a pointer of MAP_FAILED. > Is that ok?
My understanding from reading the manpage is that it should be OK, but I have added a guard to make sure we will only munmap if it is not MAP_FAILED.
Gustavo Noronha (kov)
Comment 17
2009-02-01 09:54:32 PST
Created
attachment 27232
[details]
patch ready to be committed I'm attaching a patch ready to be commited (except for ChangeLog changes). I have left struct stat as is because my C++ compiler was confusing it with the stat(2) call without the struct prefix, it seems.
Christian Dywan
Comment 18
2009-02-12 10:19:09 PST
Comment on
attachment 27232
[details]
patch ready to be committed 2009-02-12 Gustavo Noronha Silva <
gns@gnome.org
> Reviewed by Mark Rowe. [GTK] Soup backend must handle upload of multiple files
https://bugs.webkit.org/show_bug.cgi?id=18343
* platform/network/soup/ResourceHandleSoup.cpp: (WebCore::): (WebCore::freeFileMapping): (WebCore::ResourceHandle::startHttp): Support multipart request bodies mmap'ing files to be uploaded.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug