WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
22949
Refactor HTMLFormElement code
https://bugs.webkit.org/show_bug.cgi?id=22949
Summary
Refactor HTMLFormElement code
Nikolas Zimmermann
Reported
2008-12-20 17:14:12 PST
HTMLFormElement contains useful code regarding GET/POST to url handling, multi-part form handling, boundary string generation etc. This should be abstracted in a helper class usable by both HTMLFormElement & WMLGoElement - to share the whole fragile form-building logic in a central place.
Attachments
Initial patch
(50.38 KB, patch)
2008-12-20 17:19 PST
,
Nikolas Zimmermann
no flags
Details
Formatted Diff
Diff
Updated patch
(52.34 KB, patch)
2008-12-20 19:39 PST
,
Nikolas Zimmermann
darin
: review-
Details
Formatted Diff
Diff
Updated patch v2
(51.59 KB, patch)
2008-12-21 15:33 PST
,
Nikolas Zimmermann
staikos
: review+
Details
Formatted Diff
Diff
Show Obsolete
(2)
View All
Add attachment
proposed patch, testcase, etc.
Nikolas Zimmermann
Comment 1
2008-12-20 17:19:55 PST
Created
attachment 26173
[details]
Initial patch
Nikolas Zimmermann
Comment 2
2008-12-20 18:48:50 PST
Comment on
attachment 26173
[details]
Initial patch Clearing review flag for now. As suggested by Holger, a better approach would be to create a FormDataBuilder class, which encapsulates the Vector<char> instead of having to pass it to several functions, as it is in this patch. Uploading a revised version soon.
Nikolas Zimmermann
Comment 3
2008-12-20 19:39:43 PST
Created
attachment 26179
[details]
Updated patch
Nikolas Zimmermann
Comment 4
2008-12-20 20:02:03 PST
Comment on
attachment 26179
[details]
Updated patch Okay, I've realized memory is not freed immediately with the current patch. The simple fix is to move "m_formDataBuilder.clear()" from the beginning of the createFormData() to the end. Note: the memory is not leaked, it's just freed too late - upon HTMLFormElement destruction, or if another createFormData() call is invoked. This is not a good idea, the m_encodedData / m_multiPartData fields of FormDataBuilder, aren't needed anymore after processing, as FormData:appendString() memcpy's the char ptr, in it's internal data buffer.
> +PassRefPtr<FormData> HTMLFormElement::createFormData(const CString& boundary) > { > + m_formDataBuilder.clear();
Move the call
> result->appendData(encodedData.data(), encodedData.size());
-> here.
> return result; > }
Darin Adler
Comment 5
2008-12-21 09:54:14 PST
Comment on
attachment 26179
[details]
Updated patch I think the basic idea on this patch is OK, but making the data used to build form data into a member of HTMLFormElement is a step backward. Your troubles with the lifetime of the contents of m_encodedData and m_multiPartData demonstrate this. It's just not good to take local data and turn it into a data member. There are plenty of ways around this. One is to pass the vectors in to all the relevant functions. The other is to make this two classes instead of one.
> + * Copyright (C) 2008 Nikolas Zimmermann <
nikolas.zimmermann@torchmobile.com
>
When you're grabbing a bunch of existing code and refactoring it, you should not assume that the original copyright holders are relinquishing copyright on all the code. You need to list the appropriate original authors here too.
> +static inline void appendString(Vector<char>& buffer, const char string) > +{ > + buffer.append(string); > +}
It's strange to call this function "appendString" since it appends a character, not a string. The "string" in "appendString" is what's being appended, not what you're appending to. Maybe we should just name the helper function "append" instead of "appendString". Also, the type of the argument should be "char", not "const char".
> + static const char s_hexDigits[17] = "0123456789ABCDEF"; > + > + // Same safe characters as Netscape for compatibility. > + static const char s_safeCharacters[] = "-._*";
I don't think these s_ prefixes add anything; since these are so local I think it reads great to just use normal names. I also think the code read well when the definitions were right where these things were used.
> + // The RFC 2046 spec says the AlphaNumeric characters plus the following characters
The word "alphanumeric" does not have a capital "n" nor should it be capitalized.
> + // are legal for boundaries: '()+_,-./:=? > + // However the following characters, though legal, cause some sites to fail: > + // (),./:= > + //
http://bugs.webkit.org/show_bug.cgi?id=13352
> + static const char s_alphaNumericEncodingMap[64] = > + { > + 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47, 0x48,
Braces go on the same line as the definition. Indentation is 4 spaces, not 2 spaces.
> + bool isPostMethod() const { return m_isPostMethod; } > + void setIsPostMethod(bool value) { m_isPostMethod = value; } > + > + bool isMultiPartForm() const { return m_isMultiPartForm; } > + void setIsMultiPartForm(bool value) { m_isMultiPartForm = value; } > + > + String encodingType() const { return m_encodingType; } > + void setEncodingType(const String& value) { m_encodingType = value; } > + > + String acceptCharset() const { return m_acceptCharset; } > + void setAcceptCharset(const String& value) { m_acceptCharset = value; } > + > + void parseEncodingType(const String&); > + void parseMethodType(const String&); > + > + TextEncoding dataEncoding(Document*) const; > + > + // Helper functions used by HTMLFormElement/WMLGoElement for multi-part form data > + void beginMultiPartHeader(Vector<char>&, const CString& boundary, const CString& name) const; > + void addBoundaryToMultiPartHeader(Vector<char>& header, const CString& boundary, bool isLastBoundary = false) const; > + void addFileNameToMultiPartHeader(Vector<char>&, const TextEncoding& encoding, const String& fileName) const; > + void addContentTypeToMultiPartHeader(Vector<char>&, const CString& mimeType) const; > + void finishMultiPartHeader(Vector<char>&) const; > + > + // Functions used by HTMLFormElement/WMLGoElement for non multi-part form data > + void addKeyValuePairAsFormData(Vector<char>&, const CString& key, const CString& value) const; > + > + // Encode the input string as HTML form compatible data > + void encodeAsFormData(Vector<char>&, const CString&) const; > + > + // Returns a 0-terminated C string in the vector > + void generateUniqueBoundaryString(Vector<char>&) const;
Can any of these functions be private? When you make a new class with so many public functions and so few private ones, I worry that the architecture isn't right. There's no need for the name "encoding" for the const TextEncoding& argument. Do we use the spelling "fileName" or "filename" in WebKit? It looks like you preferred "fileName", treating it as two separate words, and renamed many things from "fileName" to "filename" but I see "filename" in all sorts of other places in the WebKit code; maybe we should standardize on that. I originally preferred "fileName", but now I'm leaning toward "filename". I'm going to say review- because I don't think it's good to add two Vector members to HTMLFormElement that used to be local variables.
Nikolas Zimmermann
Comment 6
2008-12-21 10:40:31 PST
(In reply to
comment #5
)
> (From update of
attachment 26179
[details]
[review]) > I think the basic idea on this patch is OK, but making the data used to build > form data into a member of HTMLFormElement is a step backward. Your troubles > with the lifetime of the contents of m_encodedData and m_multiPartData > demonstrate this. It's just not good to take local data and turn it into a data > member.
Hey Darin, okay, this is really stupid. This was exactly my previous approach - as seen in "Initial patch". Holger convinced me this was not the right way to go, and I updated the patch.... Would you prefer a solution, as it's done in the 'Initial patch'? Greetings, Niko
Darin Adler
Comment 7
2008-12-21 11:09:50 PST
(In reply to
comment #6
)
> okay, this is really stupid. This was exactly my previous approach - as seen in > "Initial patch". Holger convinced me this was not the right way to go, and I > updated the patch....
Sorry, I don't mean to make you go in circles like this.
> Would you prefer a solution, as it's done in the 'Initial patch'?
Yes, I do prefer that aspect of the initial patch. Holger might be right that it's easier to read when the code has fewer arguments. That could be addressed by making those arguments be a class that groups them together or some other similar simplification. But having state in the DOM tree all the time that has to do only with the encoding process both makes the DOM tree bigger and makes bugs possible that are otherwise impossible. It's better to make bugs due to leftover state completely impossible, by leaving that state out. Sorry I haven't been on IRC with you and Holger so we could all discuss it together.
Nikolas Zimmermann
Comment 8
2008-12-21 12:11:13 PST
Good evening,
> Sorry, I don't mean to make you go in circles like this.
No problem, you couldn't know it - there are no comments in the bug report that indicate Holgers influence :-)
> Yes, I do prefer that aspect of the initial patch. > > Holger might be right that it's easier to read when the code has fewer > arguments. That could be addressed by making those arguments be a class that > groups them together or some other similar simplification.
Okay, I readded the "Vector<char>&" as first parameter to the affected functions, and made them static.
> But having state in the DOM tree all the time that has to do only with the > encoding process both makes the DOM tree bigger and makes bugs possible that > are otherwise impossible. It's better to make bugs due to leftover state > completely impossible, by leaving that state out.
Yes, you're definately right. It's better to combine the both patch variations, so after all it wasn't the worst idea to code both :-)
> > Sorry I haven't been on IRC with you and Holger so we could all discuss it > together.
No worries, a new patch will be uploaded later.
Nikolas Zimmermann
Comment 9
2008-12-21 15:33:25 PST
Created
attachment 26190
[details]
Updated patch v2
George Staikos
Comment 10
2008-12-21 17:42:41 PST
r+ with comment about incorrect copyright notices
Nikolas Zimmermann
Comment 11
2008-12-21 17:46:37 PST
Thanks for the review. As Darins issues appear to be solved - landed in
r39430
. Darin, in case you still see issues, please comment and I'll revert and/or fix them.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug