22949 – Refactor HTMLFormElement code

RESOLVED FIXED 22949

Refactor HTMLFormElement code

https://bugs.webkit.org/show_bug.cgi?id=22949

Summary Refactor HTMLFormElement code

Nikolas Zimmermann

Reported 2008-12-20 17:14:12 PST

HTMLFormElement contains useful code regarding GET/POST to url handling, multi-part form handling, boundary string generation etc. This should be abstracted in a helper class usable by both HTMLFormElement & WMLGoElement - to share the whole fragile form-building logic in a central place.

Attachments
Initial patch (50.38 KB, patch) 2008-12-20 17:19 PST, Nikolas Zimmermann	no flags	Details Formatted Diff Diff
Updated patch (52.34 KB, patch) 2008-12-20 19:39 PST, Nikolas Zimmermann	darin: review-	Details Formatted Diff Diff
Updated patch v2 (51.59 KB, patch) 2008-12-21 15:33 PST, Nikolas Zimmermann	staikos: review+	Details Formatted Diff Diff
Show Obsolete (2) View All Add attachment proposed patch, testcase, etc.

Nikolas Zimmermann

Comment 1 2008-12-20 17:19:55 PST

Created attachment 26173 [details] Initial patch

Nikolas Zimmermann

Comment 2 2008-12-20 18:48:50 PST

Comment on attachment 26173 [details] Initial patch Clearing review flag for now. As suggested by Holger, a better approach would be to create a FormDataBuilder class, which encapsulates the Vector<char> instead of having to pass it to several functions, as it is in this patch. Uploading a revised version soon.

Nikolas Zimmermann

Comment 3 2008-12-20 19:39:43 PST

Created attachment 26179 [details] Updated patch

Nikolas Zimmermann

Comment 4 2008-12-20 20:02:03 PST

Comment on attachment 26179 [details] Updated patch Okay, I've realized memory is not freed immediately with the current patch. The simple fix is to move "m_formDataBuilder.clear()" from the beginning of the createFormData() to the end. Note: the memory is not leaked, it's just freed too late - upon HTMLFormElement destruction, or if another createFormData() call is invoked. This is not a good idea, the m_encodedData / m_multiPartData fields of FormDataBuilder, aren't needed anymore after processing, as FormData:appendString() memcpy's the char ptr, in it's internal data buffer. > +PassRefPtr<FormData> HTMLFormElement::createFormData(const CString& boundary) > { > + m_formDataBuilder.clear(); Move the call > result->appendData(encodedData.data(), encodedData.size()); -> here. > return result; > }

Darin Adler

Comment 5 2008-12-21 09:54:14 PST

Comment on attachment 26179 [details] Updated patch I think the basic idea on this patch is OK, but making the data used to build form data into a member of HTMLFormElement is a step backward. Your troubles with the lifetime of the contents of m_encodedData and m_multiPartData demonstrate this. It's just not good to take local data and turn it into a data member. There are plenty of ways around this. One is to pass the vectors in to all the relevant functions. The other is to make this two classes instead of one. > + * Copyright (C) 2008 Nikolas Zimmermann <nikolas.zimmermann@torchmobile.com> When you're grabbing a bunch of existing code and refactoring it, you should not assume that the original copyright holders are relinquishing copyright on all the code. You need to list the appropriate original authors here too. > +static inline void appendString(Vector<char>& buffer, const char string) > +{ > + buffer.append(string); > +} It's strange to call this function "appendString" since it appends a character, not a string. The "string" in "appendString" is what's being appended, not what you're appending to. Maybe we should just name the helper function "append" instead of "appendString". Also, the type of the argument should be "char", not "const char". > + static const char s_hexDigits[17] = "0123456789ABCDEF"; > + > + // Same safe characters as Netscape for compatibility. > + static const char s_safeCharacters[] = "-._*"; I don't think these s_ prefixes add anything; since these are so local I think it reads great to just use normal names. I also think the code read well when the definitions were right where these things were used. > + // The RFC 2046 spec says the AlphaNumeric characters plus the following characters The word "alphanumeric" does not have a capital "n" nor should it be capitalized. > + // are legal for boundaries: '()+_,-./:=? > + // However the following characters, though legal, cause some sites to fail: > + // (),./:= > + // http://bugs.webkit.org/show_bug.cgi?id=13352 > + static const char s_alphaNumericEncodingMap[64] = > + { > + 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47, 0x48, Braces go on the same line as the definition. Indentation is 4 spaces, not 2 spaces. > + bool isPostMethod() const { return m_isPostMethod; } > + void setIsPostMethod(bool value) { m_isPostMethod = value; } > + > + bool isMultiPartForm() const { return m_isMultiPartForm; } > + void setIsMultiPartForm(bool value) { m_isMultiPartForm = value; } > + > + String encodingType() const { return m_encodingType; } > + void setEncodingType(const String& value) { m_encodingType = value; } > + > + String acceptCharset() const { return m_acceptCharset; } > + void setAcceptCharset(const String& value) { m_acceptCharset = value; } > + > + void parseEncodingType(const String&); > + void parseMethodType(const String&); > + > + TextEncoding dataEncoding(Document*) const; > + > + // Helper functions used by HTMLFormElement/WMLGoElement for multi-part form data > + void beginMultiPartHeader(Vector<char>&, const CString& boundary, const CString& name) const; > + void addBoundaryToMultiPartHeader(Vector<char>& header, const CString& boundary, bool isLastBoundary = false) const; > + void addFileNameToMultiPartHeader(Vector<char>&, const TextEncoding& encoding, const String& fileName) const; > + void addContentTypeToMultiPartHeader(Vector<char>&, const CString& mimeType) const; > + void finishMultiPartHeader(Vector<char>&) const; > + > + // Functions used by HTMLFormElement/WMLGoElement for non multi-part form data > + void addKeyValuePairAsFormData(Vector<char>&, const CString& key, const CString& value) const; > + > + // Encode the input string as HTML form compatible data > + void encodeAsFormData(Vector<char>&, const CString&) const; > + > + // Returns a 0-terminated C string in the vector > + void generateUniqueBoundaryString(Vector<char>&) const; Can any of these functions be private? When you make a new class with so many public functions and so few private ones, I worry that the architecture isn't right. There's no need for the name "encoding" for the const TextEncoding& argument. Do we use the spelling "fileName" or "filename" in WebKit? It looks like you preferred "fileName", treating it as two separate words, and renamed many things from "fileName" to "filename" but I see "filename" in all sorts of other places in the WebKit code; maybe we should standardize on that. I originally preferred "fileName", but now I'm leaning toward "filename". I'm going to say review- because I don't think it's good to add two Vector members to HTMLFormElement that used to be local variables.

Nikolas Zimmermann

Comment 6 2008-12-21 10:40:31 PST

(In reply to comment #5) > (From update of attachment 26179 [details] [review]) > I think the basic idea on this patch is OK, but making the data used to build > form data into a member of HTMLFormElement is a step backward. Your troubles > with the lifetime of the contents of m_encodedData and m_multiPartData > demonstrate this. It's just not good to take local data and turn it into a data > member. Hey Darin, okay, this is really stupid. This was exactly my previous approach - as seen in "Initial patch". Holger convinced me this was not the right way to go, and I updated the patch.... Would you prefer a solution, as it's done in the 'Initial patch'? Greetings, Niko

Darin Adler

Comment 7 2008-12-21 11:09:50 PST

(In reply to comment #6) > okay, this is really stupid. This was exactly my previous approach - as seen in > "Initial patch". Holger convinced me this was not the right way to go, and I > updated the patch.... Sorry, I don't mean to make you go in circles like this. > Would you prefer a solution, as it's done in the 'Initial patch'? Yes, I do prefer that aspect of the initial patch. Holger might be right that it's easier to read when the code has fewer arguments. That could be addressed by making those arguments be a class that groups them together or some other similar simplification. But having state in the DOM tree all the time that has to do only with the encoding process both makes the DOM tree bigger and makes bugs possible that are otherwise impossible. It's better to make bugs due to leftover state completely impossible, by leaving that state out. Sorry I haven't been on IRC with you and Holger so we could all discuss it together.

Nikolas Zimmermann

Comment 8 2008-12-21 12:11:13 PST

Good evening, > Sorry, I don't mean to make you go in circles like this. No problem, you couldn't know it - there are no comments in the bug report that indicate Holgers influence :-) > Yes, I do prefer that aspect of the initial patch. > > Holger might be right that it's easier to read when the code has fewer > arguments. That could be addressed by making those arguments be a class that > groups them together or some other similar simplification. Okay, I readded the "Vector<char>&" as first parameter to the affected functions, and made them static. > But having state in the DOM tree all the time that has to do only with the > encoding process both makes the DOM tree bigger and makes bugs possible that > are otherwise impossible. It's better to make bugs due to leftover state > completely impossible, by leaving that state out. Yes, you're definately right. It's better to combine the both patch variations, so after all it wasn't the worst idea to code both :-) > > Sorry I haven't been on IRC with you and Holger so we could all discuss it > together. No worries, a new patch will be uploaded later.

Nikolas Zimmermann

Comment 9 2008-12-21 15:33:25 PST

Created attachment 26190 [details] Updated patch v2

George Staikos

Comment 10 2008-12-21 17:42:41 PST

r+ with comment about incorrect copyright notices

Nikolas Zimmermann

Comment 11 2008-12-21 17:46:37 PST

Thanks for the review. As Darins issues appear to be solved - landed in r39430. Darin, in case you still see issues, please comment and I'll revert and/or fix them.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Mac

OS OS X 10.5

Product WebKit

Component XML

Assignee

Nikolas Zimmermann

Reported

2008-12-20 17:14 PST

Modified

2008-12-21 17:46 PST History

CC List

1 user Show

URL

Keywords

Depends on

Blocks

20393 22638

Dependencies

tree graph