Bug 106919

Summary:

Make AtomicMarkupTokenBase use a bare UChar* for external characters

Product:

WebKit

Reporter:

Tony Gentilcore <tonyg>

Component:

New Bugs

Assignee:

Tony Gentilcore <tonyg>

Status:

RESOLVED FIXED

Severity:

Normal

CC:

abarth, benjamin, msaboff, ojan.autocc, webkit.review.bot

Priority:

Version:

528+ (Nightly build)

Hardware:

Unspecified

OS:

Unspecified

Bug Depends on:

Bug Blocks:

106127

Attachments:

Description	Flags
Patch	none

Tony Gentilcore

Reported 2013-01-15 10:06:07 PST

Make AtomicMarkupTokenBase use a bare UChar* for external characters

Attachments
Patch (5.32 KB, patch) 2013-01-15 10:07 PST, Tony Gentilcore	no flags	Details Formatted Diff Diff
View All Add attachment proposed patch, testcase, etc.

Tony Gentilcore

Comment 1 2013-01-15 10:07:26 PST

Created attachment 182800 [details] Patch

Eric Seidel (no email)

Comment 2 2013-01-15 10:35:36 PST

Comment on attachment 182800 [details] Patch ok.

WebKit Review Bot

Comment 3 2013-01-15 11:07:51 PST

Comment on attachment 182800 [details] Patch Clearing flags on attachment: 182800 Committed r139760: <http://trac.webkit.org/changeset/139760>

WebKit Review Bot

Comment 4 2013-01-15 11:07:54 PST

All reviewed patches have been landed. Closing bug.

Benjamin Poulain

Comment 5 2013-01-15 23:33:12 PST

Comment on attachment 182800 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=182800&action=review > Source/WebCore/xml/parser/MarkupTokenBase.h:529 > + size_t charactersLength() const > + { > + ASSERT(m_type == Token::Type::Character); > + return m_externalCharactersLength; > } While you are at it: this would be better as unsigned instead of size_t. Character buffers are usually unsigned (WTFString included). > Source/WebCore/xml/parser/MarkupTokenBase.h:588 > + size_t m_externalCharactersLength; ditto.

Tony Gentilcore

Comment 6 2013-01-16 10:37:44 PST

Comment on attachment 182800 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=182800&action=review >> Source/WebCore/xml/parser/MarkupTokenBase.h:529 >> } > > While you are at it: this would be better as unsigned instead of size_t. > Character buffers are usually unsigned (WTFString included). abarth tells me that on a 64bit system, unsigned will still be 32 bits but size_t will be 64. Since Vectors are arbitrary length, using size_t is appropriate here.

Adam Barth

Comment 7 2013-01-16 11:02:15 PST

I wonder if we should crash if we get a character token that's larger than 32 bits. I bet we do later anyway due to overflow checks in StringImpl.

Benjamin Poulain

Comment 8 2013-01-16 13:44:21 PST

> abarth tells me that on a 64bit system, unsigned will still be 32 bits but size_t will be 64. That is correct. > Since Vectors are arbitrary length, using size_t is appropriate here. Well, that's all nice when creating the Token, but the method you call will get you un trouble. E.g.: String characters = String(token->characters(), token->charactersLength()); Your patch changed from using Vector types to Character types, which is why I suggest we move to string conventions. > I wonder if we should crash if we get a character token that's larger than 32 bits. I bet we do later anyway due to overflow checks in StringImpl. Yep, I think you are right. Seeing what is here, I think that's the way to make this correct.

Adam Barth

Comment 9 2013-01-16 14:22:24 PST

Yeah. Would you be willing to file a bug about this issue? The issue pre-dates this patch since the old code used Vector::size as well.

Note You need to log in before you can comment on or make changes to this bug.