WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
63310
MHTML archive loader should support binary parts
https://bugs.webkit.org/show_bug.cgi?id=63310
Summary
MHTML archive loader should support binary parts
Jay Civelli
Reported
2011-06-23 21:40:11 PDT
The MHTML archive loader only supports the following encodings: QuotedPrintable, Base64 and and 7bit. It should also support binary for writing/reading archives, as it provides more compact files that are faster to load.
Attachments
Patch
(17.69 KB, patch)
2011-06-26 10:00 PDT
,
Jay Civelli
no flags
Details
Formatted Diff
Diff
Patch
(17.69 KB, patch)
2011-06-26 21:19 PDT
,
Jay Civelli
no flags
Details
Formatted Diff
Diff
Patch
(17.96 KB, patch)
2011-06-27 12:31 PDT
,
Jay Civelli
no flags
Details
Formatted Diff
Diff
Show Obsolete
(2)
View All
Add attachment
proposed patch, testcase, etc.
Jay Civelli
Comment 1
2011-06-26 10:00:18 PDT
Created
attachment 98629
[details]
Patch
Adam Barth
Comment 2
2011-06-26 15:48:15 PDT
Comment on
attachment 98629
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98629&action=review
> Source/WebCore/loader/archive/mhtml/MHTMLParser.cpp:168 > + endOfArchiveReached = (nextChars[0] == '-' && nextChars[1] == '-');
Do you want to ASSERT(nextChars.size() == 2) before this line (to help the casual reader)?
> Source/WebCore/loader/archive/mhtml/MHTMLParser.cpp:180 > + if (checkBoundary && (line == endOfPartBoundary || line == endOfDocumentBoundary)) { > + endOfArchiveReached = (line == endOfDocumentBoundary);
nit: Can we avoid doing this strcmp twice?
> Source/WebCore/platform/SharedBuffer.cpp:171 > + if (!data.isEmpty()) > + append(data.data(), data.size());
It looks like append(const char*, unsigned) might already understand length == 0, so this branch might not be needed.
> Source/WebKit/chromium/public/WebPageSerializer.h:64 > // Serializes the WebView contents to a MHTML representation. > - WEBKIT_API static WebCString serializeToMHTML(WebView*); > + // Passing true for |useBinaryEncoding| will result in a smaller HTML file, > + // but it might not work on other browsers. > + WEBKIT_API static WebCString serializeToMHTML(WebView*, bool useBinaryEncoding = false);
Should this be an enum for the various supported encodings (even if we just support two today)?
Adam Barth
Comment 3
2011-06-26 15:48:34 PDT
+fishd for WebKit API review.
Jay Civelli
Comment 4
2011-06-26 21:18:47 PDT
Comment on
attachment 98629
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98629&action=review
>> Source/WebCore/loader/archive/mhtml/MHTMLParser.cpp:168 >> + endOfArchiveReached = (nextChars[0] == '-' && nextChars[1] == '-'); > > Do you want to ASSERT(nextChars.size() == 2) before this line (to help the casual reader)?
Done.
>> Source/WebCore/loader/archive/mhtml/MHTMLParser.cpp:180 >> + endOfArchiveReached = (line == endOfDocumentBoundary); > > nit: Can we avoid doing this strcmp twice?
Good idea, done.
>> Source/WebCore/platform/SharedBuffer.cpp:171 >> + append(data.data(), data.size()); > > It looks like append(const char*, unsigned) might already understand length == 0, so this branch might not be needed.
Removed if.
>> Source/WebKit/chromium/public/WebPageSerializer.h:64 >> + WEBKIT_API static WebCString serializeToMHTML(WebView*, bool useBinaryEncoding = false); > > Should this be an enum for the various supported encodings (even if we just support two today)?
I don't expect any other encodings. We could change to an enum in the future in the unlikely event that we need an extra encoding.
Jay Civelli
Comment 5
2011-06-26 21:19:43 PDT
Created
attachment 98660
[details]
Patch
Adam Barth
Comment 6
2011-06-26 21:23:04 PDT
Comment on
attachment 98660
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98660&action=review
> Source/WebKit/chromium/public/WebPageSerializer.h:64 > // Serializes the WebView contents to a MHTML representation. > - WEBKIT_API static WebCString serializeToMHTML(WebView*); > + // Passing true for |useBinaryEncoding| will result in a smaller HTML file, > + // but it might not work on other browsers. > + WEBKIT_API static WebCString serializeToMHTML(WebView*, bool useBinaryEncoding = false);
Ok. I'm going to let fishd do the final review. (Changing the WebKit API is a PITA, which is why I suggested an enum.)
Darin Fisher (:fishd, Google)
Comment 7
2011-06-27 10:18:00 PDT
Comment on
attachment 98660
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98660&action=review
>> Source/WebKit/chromium/public/WebPageSerializer.h:64 >> + WEBKIT_API static WebCString serializeToMHTML(WebView*, bool useBinaryEncoding = false); > > Ok. I'm going to let fishd do the final review. (Changing the WebKit API is a PITA, which is why I suggested an enum.)
Ordinarily, raw bool parameters like this don't help much with readability at the callsite. It would make the callsites more self-documenting if you passed an enum value, or if you created a separate function: serializeToMHTMLUsingBinaryEncoding.
Jay Civelli
Comment 8
2011-06-27 12:31:14 PDT
Created
attachment 98766
[details]
Patch
Jay Civelli
Comment 9
2011-06-27 12:32:43 PDT
Comment on
attachment 98660
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98660&action=review
>>> Source/WebKit/chromium/public/WebPageSerializer.h:64 >>> + WEBKIT_API static WebCString serializeToMHTML(WebView*, bool useBinaryEncoding = false); >> >> Ok. I'm going to let fishd do the final review. (Changing the WebKit API is a PITA, which is why I suggested an enum.) > > Ordinarily, raw bool parameters like this don't help much with readability at the > callsite. It would make the callsites more self-documenting if you passed an > enum value, or if you created a separate function: serializeToMHTMLUsingBinaryEncoding.
OK, I created a separate function as I don't expect any other encoding in the future.
Darin Fisher (:fishd, Google)
Comment 10
2011-06-27 12:41:21 PDT
Comment on
attachment 98766
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98766&action=review
> Source/WebKit/chromium/src/WebPageSerializer.cpp:211 > +WebCString WebPageSerializer::serializeToMHTMLUsingBinaryEncoding(WebView* view)
why don't you make this method return WebData? (that addresses the FIXME)
Jay Civelli
Comment 11
2011-06-27 12:55:24 PDT
Comment on
attachment 98766
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98766&action=review
>> Source/WebKit/chromium/src/WebPageSerializer.cpp:211 >> +WebCString WebPageSerializer::serializeToMHTMLUsingBinaryEncoding(WebView* view) > > why don't you make this method return WebData? (that addresses the FIXME)
I have that in another patch, that got r+ed, landed and broke a test and got rolled back. In the meantime I landed a patch in Chromium that uses the API. So my plan is to go the "#ifdef bug_1234_fixed" road to address that in an upcoming patch.
Darin Fisher (:fishd, Google)
Comment 12
2011-06-27 14:22:53 PDT
Comment on
attachment 98766
[details]
Patch OK, LGTM for WebKit API change.
Darin Fisher (:fishd, Google)
Comment 13
2011-06-27 14:25:05 PDT
Comment on
attachment 98766
[details]
Patch Based on Adam's previous comments that this was just waiting for me to give final review, I'm marking this R=me. I did not review the WebCore portions of this patch carefully, but I defer to Adam for that part.
Adam Barth
Comment 14
2011-06-27 14:26:59 PDT
I'm happy to take responsibility for the WebCore parts.
WebKit Review Bot
Comment 15
2011-06-27 15:18:31 PDT
Comment on
attachment 98766
[details]
Patch Clearing flags on attachment: 98766 Committed
r89869
: <
http://trac.webkit.org/changeset/89869
>
WebKit Review Bot
Comment 16
2011-06-27 15:18:37 PDT
All reviewed patches have been landed. Closing bug.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug