WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
11108
Replace usage of split by proper parsers
https://bugs.webkit.org/show_bug.cgi?id=11108
Summary
Replace usage of split by proper parsers
Rob Buis
Reported
2006-10-01 04:48:58 PDT
See
http://www.kevlindev.com/gui/utilities/viewbox/viewBox.svg
. When the aspect ratio params change to empty, it crashes.
Attachments
First attempt
(8.04 KB, patch)
2006-10-01 05:05 PDT
,
Rob Buis
no flags
Details
Formatted Diff
Diff
Probably better long-term solution
(24.57 KB, patch)
2006-10-02 13:49 PDT
,
Rob Buis
no flags
Details
Formatted Diff
Diff
Using UChar
(30.72 KB, patch)
2006-10-03 06:03 PDT
,
Rob Buis
no flags
Details
Formatted Diff
Diff
Improved patch
(134.37 KB, patch)
2006-10-04 12:23 PDT
,
Rob Buis
no flags
Details
Formatted Diff
Diff
Slightly cleaned up patch
(177.32 KB, patch)
2006-10-06 01:10 PDT
,
Rob Buis
mitz: review-
Details
Formatted Diff
Diff
Compiles with ToT
(121.51 KB, patch)
2006-12-22 13:45 PST
,
Rob Buis
no flags
Details
Formatted Diff
Diff
Completely remove split usage on ToT
(134.34 KB, patch)
2006-12-24 05:46 PST
,
Rob Buis
no flags
Details
Formatted Diff
Diff
Smaller patch
(50.73 KB, patch)
2006-12-27 12:26 PST
,
Rob Buis
no flags
Details
Formatted Diff
Diff
Better diff
(44.48 KB, patch)
2006-12-27 12:54 PST
,
Rob Buis
eric
: review+
Details
Formatted Diff
Diff
Show Obsolete
(8)
View All
Add attachment
proposed patch, testcase, etc.
Rob Buis
Comment 1
2006-10-01 05:05:26 PDT
Created
attachment 10854
[details]
First attempt A simple fix... Cheers, Rob.
Eric Seidel (no email)
Comment 2
2006-10-01 05:10:33 PDT
Comment on
attachment 10854
[details]
First attempt Looks fine. You should probably search for all other uses of DeprecatedString::split. There may be other crashing cases. Do we adequately protect against having only 1 or 2 or 3 params? Using DeprecatedStringList::split is really bad practice, since it copies the string for parsing, instead of just walking it. But at least this fixes the crash for now.
Rob Buis
Comment 3
2006-10-02 13:47:21 PDT
Hi, (In reply to
comment #2
)
> (From update of
attachment 10854
[details]
[edit]) > Looks fine. > > You should probably search for all other uses of DeprecatedString::split. > There may be other crashing cases. Do we adequately protect against having > only 1 or 2 or 3 params? > > Using DeprecatedStringList::split is really bad practice, since it copies the > string for parsing, instead of just walking it. But at least this fixes the > crash for now.
Nice to see the r+, however I think we may as well go for the real fix, which is removing use of split() and do proper parsers. I'll attach a patch for this, then maybe we can discuss the proper action. I think it is good that we have this bug reminding us of one of the current (split()) problems though. cheers, Rob.
Rob Buis
Comment 4
2006-10-02 13:49:56 PDT
Created
attachment 10872
[details]
Probably better long-term solution Here is the current state of my patch to get rid of split(). It needs speed testing, but it should be quicker since the copies are gone and less methods (like endsWith()) are called. Cheers, Rob.
Darin Adler
Comment 5
2006-10-02 19:39:29 PDT
Comment on
attachment 10872
[details]
Probably better long-term solution Why the conversion to char? The other functions could just as well take UChar* as char*, and converting to char just means you have to do additional memory allocation.
Rob Buis
Comment 6
2006-10-03 06:03:46 PDT
Created
attachment 10880
[details]
Using UChar Now I use UChar * instead of char* as suggested by Darin. Cheers, Rob.
Eric Seidel (no email)
Comment 7
2006-10-03 06:18:25 PDT
Comment on
attachment 10880
[details]
Using UChar These: + if (*currParam == 'd') { + if (!((eoString - currParam) > 4) && + currParam[1] == 'e' && + currParam[2] == 'f' && + currParam[3] == 'e' && + currParam[4] == 'r') + goto bail_out; + currParam += 5; + skipOptionalSpaces(currParam); } or at least part, should be abstracted into a static inline. maybe just this part: + if (!((eoString - currParam) > 4) && + currParam[1] == 'e' && + currParam[2] == 'f' && + currParam[3] == 'e' && + currParam[4] == 'r') Sleep now. I can look at the rest later.
Rob Buis
Comment 8
2006-10-04 12:23:53 PDT
Created
attachment 10908
[details]
Improved patch This patch is much more complete and removes split from use in ksvg. Also added and changed some testcases. Let me know whether there should be more tests. Cheers, Rob.
Rob Buis
Comment 9
2006-10-06 01:10:08 PDT
Created
attachment 10944
[details]
Slightly cleaned up patch This one has some nicer code in some parsing sections. Also I fixed obvious code style bugs. Cheers, Rob.
Maciej Stachowiak
Comment 10
2006-10-09 05:26:03 PDT
Don't forget to remove the conflict marker: +>>>>>>> .
r16800
mitz
Comment 11
2006-10-20 07:30:53 PDT
Comment on
attachment 10944
[details]
Slightly cleaned up patch I didn't read the entire patch, but here are some observations: Conflict markers: +>>>>>>> .
r16800
Typo: + - use functions from SVGParserUtils instead of spli Some cleanup opportunities: else if (attr->name() == SVGNames::valuesAttr) { This: String parse = m_rgbColor.stripWhiteSpace(); followed by: + const UChar* ptr = reinterpret_cast<const UChar*>(parse.charactersWithNullTermination()) + 4; always copies the buffer once (I think in many cases it copies it twice). I think you can avoid one copy using a combination of skipOptionalSpaces and String::find() (or a single condition that checks the next 4 characters) starting after the run of spaces. I can't say if it's worth it. Is it necessary to cast the result of charactersWithNullTermination()? Shouldn't you set ec before returning in these cases? + if (!tryParseCoord(ptr, r)) + return; The use of skipOptionalSpaces in the non-percentage case is redundant, as parseCoord always ends with skipOptionalSpacesOrDelimiter which ends with skipOptionalSpaces: + if (*ptr == '%') { + percentages++; + ptr++; + skipOptionalSpacesOrDelimiter(ptr); + } else { + normal++; + skipOptionalSpaces(ptr); It also seems that this will successfully parse things like "rgb(25 , % , 50%, 75%)". tryParseNumberList is only used in this pattern: + int nr = tryParseNumberList(value, x, y); + if (nr > 0) { + setKernelUnitLengthXBaseValue(x); + if (nr == 1) + setKernelUnitLengthYBaseValue(x); + else + setKernelUnitLengthYBaseValue(y); + } I think you can factor out the common logic so that you'll have a function bool tryParseNumberOptionalNumber(const String&, double& h, double& v) that sets h and v to the same value if there is only one value, and returns false if there is no value. I don't know how it should behave for more than 2 values (or any junk after the second value). SVGNumberList::parse() allows things like " , 1, 2, , 3 ," (since tryParseCoord allows one comma, and then skipOptionalSpacesOrDelimiter allows another one; it is also what allows the leading comma). This is a funny variable name: + const UChar* eoString = currSegment + points.length(); This can be changed to use tryParseCoord: currSegment = parseCoord(currSegment, xPos); if (currSegment == prevSegment) Some redundant parameter names: + const UChar* parseCoord(const UChar* ptr, double& number); + void parsePoints(const String& points) const; Extra spaces inside the parentheses; "d" is unnecessary: + void parseSVG( const String &d, bool process = false ); skipOptionalSpaces(" ") tests the first character twice. Does the compiler optimize it? skipOptionalSpacesOrDelimiter(" ") tests the first character three times. Same question. This should be FIXME: + // TODO : more error checking/reporting
mitz
Comment 12
2006-10-20 13:47:06 PDT
Comment on
attachment 10944
[details]
Slightly cleaned up patch Additional observations: In SVGColor::setRGBColor(), normal == 3 - percentages, so you don't need two variables. Brace should be indented: + static inline bool skipOptionalSpaces(const UChar*& ptr) // true means "found space" +{ Brace should be on the switch line: + switch (type) + {
Darin Adler
Comment 13
2006-11-09 08:37:42 PST
Comment on
attachment 10854
[details]
First attempt Clear review flag on this obsolete patch.
Rob Buis
Comment 14
2006-12-22 13:45:30 PST
Created
attachment 11972
[details]
Compiles with ToT This patch is done sort of from scratch. It is not complete, but does a lot of useful things already. Although it is not complete I am attaching it already since I am away tomorrow and someone may want to shark it (Eric?) :) I should be able to easily finish it sunday. Also note that it should help with
bug 11908
. Cheers, Rob.
Rob Buis
Comment 15
2006-12-24 05:46:29 PST
Created
attachment 11993
[details]
Completely remove split usage on ToT This one should be complete and quite clean. Cheers, Rob.
Rob Buis
Comment 16
2006-12-24 07:26:41 PST
After discussing with Eric, it is clear that there are two important issues with the patch. One is performance, other are tests. Here are some ways we can improve tests for transform lists and preserve aspect ratio: - multiple comma's - missing (, ) etc. - having too many/few params in the transform lists - maybe wrong ordering (defer as last one for example) - entering letter where number is expected Cheers, Rob. (In reply to
comment #15
)
> Created an attachment (id=11993) [edit] > Completely remove split usage on ToT > > This one should be complete and quite clean. > Cheers, > > Rob. >
Rob Buis
Comment 17
2006-12-27 12:26:53 PST
Created
attachment 12069
[details]
Smaller patch This patch solves just the transform passing and should be easier to review. Also note that it gives a big speed improvement for the svg PLT. Cheers, Rob.
Rob Buis
Comment 18
2006-12-27 12:54:34 PST
Created
attachment 12070
[details]
Better diff As requested by Eric. Cheers, Rob.
Eric Seidel (no email)
Comment 19
2006-12-27 13:24:00 PST
Comment on
attachment 12070
[details]
Better diff This looks great. We definitely need more invalid path tests. I think you could rename SVGPathUtils to SVGPathUtilities for clarity, but that's not a huge deal. r=me
Eric Seidel (no email)
Comment 20
2006-12-27 13:26:11 PST
Oh, I almost forgot, a couple comments made over IRC: 1. skip = true, so the last parseNumber in parseNumberOptionalNumber might need to force that to false. 2. the x array may need to be cleared every time through the loop, otherwise optional parameters (like those for ROTATE) might end up being re-used from a previous parse.
David Kilzer (:ddkilzer)
Comment 21
2006-12-28 15:36:42 PST
Landed in
r18440
by rwlbuis.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug