WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 63231
SharedBufferChunkReader does not deal properly with non printable characters
https://bugs.webkit.org/show_bug.cgi?id=63231
Summary
SharedBufferChunkReader does not deal properly with non printable characters
Jay Civelli
Reported
2011-06-22 23:59:46 PDT
The SharedBUfferChunkReader class currently returns a String when reading a chunk of data. This does not work well with binary data. To deal properly with binary data it should use Vector<char>.
Attachments
Patch
(10.34 KB, patch)
2011-06-23 00:50 PDT
,
Jay Civelli
no flags
Details
Formatted Diff
Diff
Patch
(10.50 KB, patch)
2011-06-23 10:54 PDT
,
Jay Civelli
no flags
Details
Formatted Diff
Diff
Unit-tests
(11.43 KB, application/octet-stream)
2011-06-23 13:07 PDT
,
Jay Civelli
no flags
Details
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Jay Civelli
Comment 1
2011-06-23 00:50:56 PDT
Created
attachment 98320
[details]
Patch
Adam Barth
Comment 2
2011-06-23 09:52:46 PDT
Comment on
attachment 98320
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98320&action=review
> Source/WebCore/ChangeLog:8 > + Fixing the SharedBufferChunkReader API so it works correctly with > + binary data (non printable characters). > + Also adding a method to peek at the data (this is needed for MHTML > + with binary parts).
I'm slightly confused as to what this patch is doing. Is this changing any observable behavior? If so, we should add a test. This patch also appears to introduce some functions that aren't used. Is there a subsequent patch that will use these functions?
> Source/WebCore/platform/SharedBufferChunkReader.cpp:84 > + chunk.append(m_separator.data(), m_separatorIndex);
Can you add an assert that m_separatorIndex <= m_separator.size() ?
> Source/WebCore/platform/SharedBufferChunkReader.cpp:110 > + return false; // Compiler is unhappy without this.
Rather than have this comment, we usually ASSERT_NOT_REACHED()
> Source/WebCore/platform/SharedBufferChunkReader.cpp:113 > +String SharedBufferChunkReader::nextChunkAsUTF8String(bool includeSeparator)
Should we say nextChunkAsUTF8StringWithLatin1Fallback ?
> Source/WebCore/platform/SharedBufferChunkReader.h:57 > + // Reads size bytes at the current location in the buffer, without changing the buffer position. > + // Returns the number of bytes read. That number might be less than the specified size if the end of the buffer was reached. > + size_t peek(Vector<char>&, size_t);
This function doesn't appear to be used yet.
Jay Civelli
Comment 3
2011-06-23 10:35:02 PDT
Comment on
attachment 98320
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98320&action=review
>> Source/WebCore/platform/SharedBufferChunkReader.cpp:84 >> + chunk.append(m_separator.data(), m_separatorIndex); > > Can you add an assert that m_separatorIndex <= m_separator.size() ?
Done.
>> Source/WebCore/platform/SharedBufferChunkReader.cpp:110 >> + return false; // Compiler is unhappy without this. > > Rather than have this comment, we usually ASSERT_NOT_REACHED()
Done.
>> Source/WebCore/platform/SharedBufferChunkReader.cpp:113 >> +String SharedBufferChunkReader::nextChunkAsUTF8String(bool includeSeparator) > > Should we say nextChunkAsUTF8StringWithLatin1Fallback ?
Done.
>> Source/WebCore/platform/SharedBufferChunkReader.h:57 >> + size_t peek(Vector<char>&, size_t); > > This function doesn't appear to be used yet.
Yes, I need it for the MHTML binary part support. I have local unittest testing it. As soon as we have the unit-tests for WebCore working I'll add them. I'll upload them my unit-test to this bug so they are not lost.
Jay Civelli
Comment 4
2011-06-23 10:38:04 PDT
(In reply to
comment #2
)
> (From update of
attachment 98320
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=98320&action=review
> > > Source/WebCore/ChangeLog:8 > > + Fixing the SharedBufferChunkReader API so it works correctly with > > + binary data (non printable characters). > > + Also adding a method to peek at the data (this is needed for MHTML > > + with binary parts). > > I'm slightly confused as to what this patch is doing. Is this changing any observable behavior? If so, we should add a test. This patch also appears to introduce some functions that aren't used. Is there a subsequent patch that will use these functions?
Sorry for the confusion. My goal here is to change the API so it deals correctly with binary data. This is needed for an upcoming patch that allows binary parts in MHTML files. There are several issues with using String in this API when dealing with binary data: - each byte is transformed to a UTF16 char, using 2 bytes when it could use only one - when getting the data out of the string non UTF characters get messed up The peek method is also something I introduce as I'll need it for my upcoming patch.
Jay Civelli
Comment 5
2011-06-23 10:54:46 PDT
Created
attachment 98370
[details]
Patch
Jay Civelli
Comment 6
2011-06-23 10:58:25 PDT
(In reply to
comment #4
)
> (In reply to
comment #2
) > > (From update of
attachment 98320
[details]
[details]) > > View in context:
https://bugs.webkit.org/attachment.cgi?id=98320&action=review
> > > > > Source/WebCore/ChangeLog:8 > > > + Fixing the SharedBufferChunkReader API so it works correctly with > > > + binary data (non printable characters). > > > + Also adding a method to peek at the data (this is needed for MHTML > > > + with binary parts). > > > > I'm slightly confused as to what this patch is doing. Is this changing any observable behavior? If so, we should add a test. This patch also appears to introduce some functions that aren't used. Is there a subsequent patch that will use these functions? > > Sorry for the confusion. > My goal here is to change the API so it deals correctly with binary data. > This is needed for an upcoming patch that allows binary parts in MHTML files. > There are several issues with using String in this API when dealing with binary data: > - each byte is transformed to a UTF16 char, using 2 bytes when it could use only one > - when getting the data out of the string non UTF characters get messed up > > The peek method is also something I introduce as I'll need it for my upcoming patch.
And also, this does not change any current behavior and the mhtml layout-tests ensure this did not break the MHTML support.
Adam Barth
Comment 7
2011-06-23 11:01:08 PDT
Comment on
attachment 98370
[details]
Patch Ok.
WebKit Review Bot
Comment 8
2011-06-23 12:29:49 PDT
Comment on
attachment 98370
[details]
Patch Clearing flags on attachment: 98370 Committed
r89599
: <
http://trac.webkit.org/changeset/89599
>
WebKit Review Bot
Comment 9
2011-06-23 12:29:53 PDT
All reviewed patches have been landed. Closing bug.
Jay Civelli
Comment 10
2011-06-23 13:07:26 PDT
Created
attachment 98384
[details]
Unit-tests Unit-tests that should be added to TestWebKitAPI once it has the WebCore dependency.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug