RESOLVED FIXED Bug 117735
[curl] Improve multipart response handling
https://bugs.webkit.org/show_bug.cgi?id=117735
Summary [curl] Improve multipart response handling
Peter Gal
Reported 2013-06-18 05:32:38 PDT
The curl backend does not know anything about multipart responses so all of the returned data from the server will be displayed.
Attachments
patch (25.15 KB, patch)
2013-07-16 04:55 PDT, Peter Gal
bfulgham: review-
patch v2 (25.17 KB, patch)
2013-08-14 03:01 PDT, Peter Gal
no flags
patch v3 (25.53 KB, patch)
2013-09-04 10:24 PDT, Peter Gal
no flags
Peter Gal
Comment 1 2013-06-26 09:36:37 PDT
Patch should arrive in this week or so.
Peter Gal
Comment 2 2013-07-16 04:55:42 PDT
peavo
Comment 3 2013-07-16 07:51:55 PDT
Good stuff, this patch builds just fine on WinCairo.
Peter Gal
Comment 4 2013-07-16 07:53:32 PDT
(In reply to comment #3) > Good stuff, this patch builds just fine on WinCairo. Cool thanks for testing!
Peter Gal
Comment 5 2013-07-23 05:40:27 PDT
ping?
Brent Fulgham
Comment 6 2013-07-23 10:18:40 PDT
Comment on attachment 206765 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=206765&action=review Overall this looks good. There are a few minor style problems that need to be corrected. I'm also not sure about the meaning of 'strict' header parsing. It would be helpful to know if there are documented rules for when something should be parsed strictly or not. > Source/WebCore/platform/network/HTTPParsers.cpp:608 > +size_t parseHTTPHeader(const char* start, size_t length, String& failureReason, AtomicString& nameStr, String& valueStr, bool strict) Who governs whether we are parsing in 'strict' mode or not? Is this documented by an RFC? > Source/WebCore/platform/network/curl/MultipartHandle.cpp:37 > +bool MultipartHandle::extractBoundary(const String &contentType, String &boundary) We write this as "const String& contentType" and "String& boundary". The reference is part of the type, not the argument name. > Source/WebCore/platform/network/curl/MultipartHandle.cpp:77 > +bool MultipartHandle::matchForBoundary(const char *data, size_t pos, size_t &matched) We write this as "const char* data" and "size_t& matched". The pointer (and reference) token is part of the type, not the argument name. > Source/WebCore/platform/network/curl/MultipartHandle.cpp:81 > + for (size_t i = 0; i < m_boundaryLength; ++i) When the contents of a for statement spans multiple lines, I prefer to see them wrapped in brackets to aid readability: for (size_t i = 0; i < m_boundaryLength; ++i) { > Source/WebCore/platform/network/curl/MultipartHandle.cpp:85 > + } ... and close here. > Source/WebCore/platform/network/curl/MultipartHandle.cpp:91 > +bool MultipartHandle::checkForBoundary(size_t &boundaryStart, size_t &lastPartialMatch) Argument names... > Source/WebCore/platform/network/curl/MultipartHandle.cpp:94 > + const char *content = m_buffer.data(); const char* content... > Source/WebCore/platform/network/curl/MultipartHandle.cpp:116 > + const char *content = m_buffer.data(); const char* content ... > Source/WebCore/platform/network/curl/MultipartHandle.cpp:119 > + return false; // Not enough data, skipping for now. This comment isn't helpful; go ahead and remove it. > Source/WebCore/platform/network/curl/MultipartHandle.cpp:136 > + for (; p < end; p++) { It's good to be in the habit of using prefix operator ++/--, since there is a performance benefit to using them for iterators. for (; p < end; ++p) { > Source/WebCore/platform/network/curl/MultipartHandle.cpp:156 > +void MultipartHandle::contentReceived(const char *data, size_t length) const char* data > Source/WebCore/platform/network/curl/MultipartHandle.cpp:182 > +*/ Nice comment! > Source/WebCore/platform/network/curl/MultipartHandle.cpp:293 > + return true; // There is still things to process, so go for it. This *is* still *a* thing, or There *are* still things. > Source/WebCore/platform/network/curl/MultipartHandle.cpp:322 > + const char *data = m_buffer.data(); const char* data > Source/WebCore/platform/network/curl/MultipartHandle.h:40 > + static bool extractBoundary(const String &contentType, String &boundary); &'s should be connected to the type declarations, not the argument names. > Source/WebCore/platform/network/curl/MultipartHandle.h:42 > + MultipartHandle(ResourceHandle* handle, const String &boundary) Ditto. > Source/WebCore/platform/network/curl/MultipartHandle.h:68 > + bool matchForBoundary(const char *data, size_t pos, size_t &matched); &'s (and *'s) should be connected to the type declarations, not the argument names. > Source/WebCore/platform/network/curl/MultipartHandle.h:69 > + inline bool checkForBoundary(size_t &boundaryStart, size_t &lastPartialMatch); &'s should be connected to the type declarations, not the argument names. > Source/WebCore/platform/network/curl/MultipartHandle.h:73 > + ResourceHandle *m_resourceHandle; * should be part of the type. Since ResourceHandle is not 'owned' by this object, and the MultipartHandle cannot exist without a resource handle, I wonder if this should be a reference?
Peter Gal
Comment 7 2013-07-25 08:04:19 PDT
(In reply to comment #6) > (From update of attachment 206765 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=206765&action=review > > Overall this looks good. There are a few minor style problems that need to be corrected. I'm also not sure about the meaning of 'strict' header parsing. It would be helpful to know if there are documented rules for when something should be parsed strictly or not. > > > Source/WebCore/platform/network/HTTPParsers.cpp:608 > > +size_t parseHTTPHeader(const char* start, size_t length, String& failureReason, AtomicString& nameStr, String& valueStr, bool strict) > > Who governs whether we are parsing in 'strict' mode or not? Is this documented by an RFC? > In this case if the strict is true it'll mean that we strictly enforce the HTTP headers to end with \r\n. If it is false we'll allow headers which are only separated by a \n. The RFC states that the headers should be separated by a \r\n but there are servers and testcases that doesn't follow this rule (sadly). Actually I'm not sure if we should allow such headers, but in there are other places in the curl backend where it is allowed, and also I think the other network backends (libsoup for example) allows such headers.
Peter Gal
Comment 8 2013-08-13 04:37:57 PDT
(In reply to comment #6) > (From update of attachment 206765 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=206765&action=review > > > Source/WebCore/platform/network/curl/MultipartHandle.h:73 > > + ResourceHandle *m_resourceHandle; > > * should be part of the type. > > Since ResourceHandle is not 'owned' by this object, and the MultipartHandle cannot exist without a resource handle, I wonder if this should be a reference? In case of the FormDataStream the ResourceHandle is also a simple pointer (http://trac.webkit.org/browser/trunk/Source/WebCore/platform/network/curl/FormDataStreamCurl.h#L54) that's why I did the same here. Also note that in the headerCallback method the ResourceHandle ('job' variable) is pointer and for the 'client()->didReceive*' calls a pointer is required. Still I could change this to be a reference if you think that'll be better.
Peter Gal
Comment 9 2013-08-14 03:01:18 PDT
Created attachment 208710 [details] patch v2 updated patch.
Chris Dumez
Comment 10 2013-08-14 04:28:11 PDT
Comment on attachment 208710 [details] patch v2 View in context: https://bugs.webkit.org/attachment.cgi?id=208710&action=review > Source/WebCore/platform/network/curl/MultipartHandle.cpp:63 > + boundaryEnd = contentType.find(";", boundaryStart); ';' > Source/WebCore/platform/network/curl/MultipartHandle.cpp:77 > +bool MultipartHandle::matchForBoundary(const char* data, size_t pos, size_t& matchedLength) We don't use abbreviations in WebKit: pos -> position > Source/WebCore/platform/network/curl/MultipartHandle.cpp:82 > + if (data[pos + i] != m_boundary[i]) { how do we know data[pos + i] does not go out of bounds? > Source/WebCore/platform/network/curl/MultipartHandle.cpp:92 > +bool MultipartHandle::checkForBoundary(size_t& boundaryStartPos, size_t& lastPartialMatchPos) "Pos" -> "Position' > Source/WebCore/platform/network/curl/MultipartHandle.cpp:99 > + lastPartialMatchPos = contentLength; Ditto. > Source/WebCore/platform/network/curl/MultipartHandle.cpp:117 > + const char* content = m_buffer.data(); This can be moved after the if check for contentLength. > Source/WebCore/platform/network/curl/MultipartHandle.cpp:135 > + char* end = const_cast<char*>(content) + contentLength; Cannot this one be const? > Source/WebCore/platform/network/curl/MultipartHandle.cpp:249 > + if (!m_buffer.size()) m_buffer.isEmpty() > Source/WebCore/platform/network/curl/MultipartHandle.h:77 > + Vector<char> m_buffer; Considering the operations you are doing (append and remove from beginning), a Vector may not be the best data structure. You might want to take a look at StreamBuffer (or Deque) > Source/WebCore/platform/network/curl/ResourceHandleManager.cpp:406 > + d->m_multipartHandle = adoptPtr(new MultipartHandle(job, boundary)); We usually use static factory functions such as PassOwnPtr<MultipartHandle> MultipartHandle::create() for such cases so that we never deal with raw pointers externally.
Peter Gal
Comment 11 2013-09-04 01:31:16 PDT
(In reply to comment #10) > (From update of attachment 208710 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=208710&action=review > > > Source/WebCore/platform/network/curl/MultipartHandle.cpp:82 > > + if (data[pos + i] != m_boundary[i]) { > > how do we know data[pos + i] does not go out of bounds? Ahh yeah, didn't checked for that. I'll fix that. > > Source/WebCore/platform/network/curl/MultipartHandle.cpp:135 > > + char* end = const_cast<char*>(content) + contentLength; > > Cannot this one be const? > It can be, and will be :) > > Source/WebCore/platform/network/curl/MultipartHandle.h:77 > > + Vector<char> m_buffer; > > Considering the operations you are doing (append and remove from beginning), a Vector may not be the best data structure. You might want to take a look at StreamBuffer (or Deque) > The StreamBuffer operates with the concept of blocks (which are essentially Vectors) and these blocks are in a Deque. Also it is only possible to access the first block only, so if the boundary string is split between 2 block, to correctly handle that case that'll require more code and/or complexity in the MultipartHandler. The Deque uses internally the same VectorBuffer as the Vector. Also note that if I would use Deque<char> then when appending characters I would need to iterate over all input chars to call append for each one, in Vector there is already such method so I don't need to write it (I know not a big excuse but still less code to type this way :)) > > Source/WebCore/platform/network/curl/ResourceHandleManager.cpp:406 > > + d->m_multipartHandle = adoptPtr(new MultipartHandle(job, boundary)); > > We usually use static factory functions such as PassOwnPtr<MultipartHandle> MultipartHandle::create() for such cases so that we never deal with raw pointers externally. Okay, that makes sense.
Peter Gal
Comment 12 2013-09-04 10:24:50 PDT
Created attachment 210473 [details] patch v3 Updated patch (Sorry for the big delay)
WebKit Commit Bot
Comment 13 2013-09-04 10:27:50 PDT
Attachment 210473 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCore/WebCore.vcxproj/WebCore.vcxproj', u'Source/WebCore/WebCore.vcxproj/WebCore.vcxproj.filters', u'Source/WebCore/platform/network/HTTPParsers.cpp', u'Source/WebCore/platform/network/HTTPParsers.h', u'Source/WebCore/platform/network/ResourceHandleInternal.h', u'Source/WebCore/platform/network/curl/MultipartHandle.cpp', u'Source/WebCore/platform/network/curl/MultipartHandle.h', u'Source/WebCore/platform/network/curl/ResourceHandleManager.cpp']" exit_code: 1 Source/WebCore/platform/network/curl/MultipartHandle.h:41: The parameter name "handle" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 1 in 9 files If any of these errors are false positives, please file a bug against check-webkit-style.
Peter Gal
Comment 14 2013-09-11 07:25:52 PDT
(In reply to comment #12) > Created an attachment (id=210473) [details] > patch v3 > ping?
Brent Fulgham
Comment 15 2013-09-12 11:02:24 PDT
Comment on attachment 210473 [details] patch v3 r=me
WebKit Commit Bot
Comment 16 2013-09-12 11:25:20 PDT
Comment on attachment 210473 [details] patch v3 Clearing flags on attachment: 210473 Committed r155633: <http://trac.webkit.org/changeset/155633>
WebKit Commit Bot
Comment 17 2013-09-12 11:25:23 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.