RESOLVED FIXED Bug 63231
SharedBufferChunkReader does not deal properly with non printable characters
https://bugs.webkit.org/show_bug.cgi?id=63231
Summary SharedBufferChunkReader does not deal properly with non printable characters
Jay Civelli
Reported 2011-06-22 23:59:46 PDT
The SharedBUfferChunkReader class currently returns a String when reading a chunk of data. This does not work well with binary data. To deal properly with binary data it should use Vector<char>.
Attachments
Patch (10.34 KB, patch)
2011-06-23 00:50 PDT, Jay Civelli
no flags
Patch (10.50 KB, patch)
2011-06-23 10:54 PDT, Jay Civelli
no flags
Unit-tests (11.43 KB, application/octet-stream)
2011-06-23 13:07 PDT, Jay Civelli
no flags
Jay Civelli
Comment 1 2011-06-23 00:50:56 PDT
Adam Barth
Comment 2 2011-06-23 09:52:46 PDT
Comment on attachment 98320 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=98320&action=review > Source/WebCore/ChangeLog:8 > + Fixing the SharedBufferChunkReader API so it works correctly with > + binary data (non printable characters). > + Also adding a method to peek at the data (this is needed for MHTML > + with binary parts). I'm slightly confused as to what this patch is doing. Is this changing any observable behavior? If so, we should add a test. This patch also appears to introduce some functions that aren't used. Is there a subsequent patch that will use these functions? > Source/WebCore/platform/SharedBufferChunkReader.cpp:84 > + chunk.append(m_separator.data(), m_separatorIndex); Can you add an assert that m_separatorIndex <= m_separator.size() ? > Source/WebCore/platform/SharedBufferChunkReader.cpp:110 > + return false; // Compiler is unhappy without this. Rather than have this comment, we usually ASSERT_NOT_REACHED() > Source/WebCore/platform/SharedBufferChunkReader.cpp:113 > +String SharedBufferChunkReader::nextChunkAsUTF8String(bool includeSeparator) Should we say nextChunkAsUTF8StringWithLatin1Fallback ? > Source/WebCore/platform/SharedBufferChunkReader.h:57 > + // Reads size bytes at the current location in the buffer, without changing the buffer position. > + // Returns the number of bytes read. That number might be less than the specified size if the end of the buffer was reached. > + size_t peek(Vector<char>&, size_t); This function doesn't appear to be used yet.
Jay Civelli
Comment 3 2011-06-23 10:35:02 PDT
Comment on attachment 98320 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=98320&action=review >> Source/WebCore/platform/SharedBufferChunkReader.cpp:84 >> + chunk.append(m_separator.data(), m_separatorIndex); > > Can you add an assert that m_separatorIndex <= m_separator.size() ? Done. >> Source/WebCore/platform/SharedBufferChunkReader.cpp:110 >> + return false; // Compiler is unhappy without this. > > Rather than have this comment, we usually ASSERT_NOT_REACHED() Done. >> Source/WebCore/platform/SharedBufferChunkReader.cpp:113 >> +String SharedBufferChunkReader::nextChunkAsUTF8String(bool includeSeparator) > > Should we say nextChunkAsUTF8StringWithLatin1Fallback ? Done. >> Source/WebCore/platform/SharedBufferChunkReader.h:57 >> + size_t peek(Vector<char>&, size_t); > > This function doesn't appear to be used yet. Yes, I need it for the MHTML binary part support. I have local unittest testing it. As soon as we have the unit-tests for WebCore working I'll add them. I'll upload them my unit-test to this bug so they are not lost.
Jay Civelli
Comment 4 2011-06-23 10:38:04 PDT
(In reply to comment #2) > (From update of attachment 98320 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=98320&action=review > > > Source/WebCore/ChangeLog:8 > > + Fixing the SharedBufferChunkReader API so it works correctly with > > + binary data (non printable characters). > > + Also adding a method to peek at the data (this is needed for MHTML > > + with binary parts). > > I'm slightly confused as to what this patch is doing. Is this changing any observable behavior? If so, we should add a test. This patch also appears to introduce some functions that aren't used. Is there a subsequent patch that will use these functions? Sorry for the confusion. My goal here is to change the API so it deals correctly with binary data. This is needed for an upcoming patch that allows binary parts in MHTML files. There are several issues with using String in this API when dealing with binary data: - each byte is transformed to a UTF16 char, using 2 bytes when it could use only one - when getting the data out of the string non UTF characters get messed up The peek method is also something I introduce as I'll need it for my upcoming patch.
Jay Civelli
Comment 5 2011-06-23 10:54:46 PDT
Jay Civelli
Comment 6 2011-06-23 10:58:25 PDT
(In reply to comment #4) > (In reply to comment #2) > > (From update of attachment 98320 [details] [details]) > > View in context: https://bugs.webkit.org/attachment.cgi?id=98320&action=review > > > > > Source/WebCore/ChangeLog:8 > > > + Fixing the SharedBufferChunkReader API so it works correctly with > > > + binary data (non printable characters). > > > + Also adding a method to peek at the data (this is needed for MHTML > > > + with binary parts). > > > > I'm slightly confused as to what this patch is doing. Is this changing any observable behavior? If so, we should add a test. This patch also appears to introduce some functions that aren't used. Is there a subsequent patch that will use these functions? > > Sorry for the confusion. > My goal here is to change the API so it deals correctly with binary data. > This is needed for an upcoming patch that allows binary parts in MHTML files. > There are several issues with using String in this API when dealing with binary data: > - each byte is transformed to a UTF16 char, using 2 bytes when it could use only one > - when getting the data out of the string non UTF characters get messed up > > The peek method is also something I introduce as I'll need it for my upcoming patch. And also, this does not change any current behavior and the mhtml layout-tests ensure this did not break the MHTML support.
Adam Barth
Comment 7 2011-06-23 11:01:08 PDT
Comment on attachment 98370 [details] Patch Ok.
WebKit Review Bot
Comment 8 2011-06-23 12:29:49 PDT
Comment on attachment 98370 [details] Patch Clearing flags on attachment: 98370 Committed r89599: <http://trac.webkit.org/changeset/89599>
WebKit Review Bot
Comment 9 2011-06-23 12:29:53 PDT
All reviewed patches have been landed. Closing bug.
Jay Civelli
Comment 10 2011-06-23 13:07:26 PDT
Created attachment 98384 [details] Unit-tests Unit-tests that should be added to TestWebKitAPI once it has the WebCore dependency.
Note You need to log in before you can comment on or make changes to this bug.