Bug 11882

Summary: Need a way to regression test .webarchive output files
Product: WebKit Reporter: David Kilzer (:ddkilzer) <ddkilzer>
Component: Tools / TestsAssignee: David Kilzer (:ddkilzer) <ddkilzer>
Status: RESOLVED FIXED    
Severity: Enhancement CC: beidson, darin, jim.correia, mitz
Priority: P2    
Version: 420+   
Hardware: Mac   
OS: OS X 10.4   
Attachments:
Description Flags
Patch v1
mitz: review-
Patch v2
none
Patch v3
none
Patch v4
none
Patch v4.1
none
Patch v5
none
Patch v6 darin: review+

David Kilzer (:ddkilzer)
Reported 2006-12-19 14:12:27 PST
We need a way to regression test .webarchive files output from WebKit for bugs like Bug 7266, Bug 11839 and Bug 11850.
Attachments
Patch v1 (32.50 KB, patch)
2006-12-24 19:40 PST, David Kilzer (:ddkilzer)
mitz: review-
Patch v2 (32.44 KB, patch)
2006-12-26 09:44 PST, David Kilzer (:ddkilzer)
no flags
Patch v3 (78.61 KB, patch)
2006-12-28 18:28 PST, David Kilzer (:ddkilzer)
no flags
Patch v4 (76.07 KB, patch)
2006-12-29 13:57 PST, David Kilzer (:ddkilzer)
no flags
Patch v4.1 (76.12 KB, patch)
2007-01-07 22:31 PST, David Kilzer (:ddkilzer)
no flags
Patch v5 (76.72 KB, patch)
2007-01-29 22:27 PST, David Kilzer (:ddkilzer)
no flags
Patch v6 (76.76 KB, patch)
2007-01-29 22:48 PST, David Kilzer (:ddkilzer)
darin: review+
David Kilzer (:ddkilzer)
Comment 1 2006-12-19 14:20:02 PST
I had two thoughts on how to do this: 1. Make a dumpAsWebArchive() method available through the layoutTestController, then teach DumpRenderTree how to "compare" two .webarchive files for differences (ignoring timestamp differences as needed). This would allow for testing of the actual structure of the .webarchive files, but wouldn't test whether they could be rendered properly after being loaded. 2. Make a dumpWebArchiveRenderTree() method available through the layoutTestController that loads in the .webarchive file and dumps out its render tree. This would allow for testing of the rendering of the .webarchive file. Or both? Darin, do you have any other ideas or comments on the above?
Darin Adler
Comment 2 2006-12-19 16:12:02 PST
Those both sound pretty good; I like (2) better. Maybe there are some other even-better ideas.
David Kilzer (:ddkilzer)
Comment 3 2006-12-24 19:40:29 PST
Created attachment 12015 [details] Patch v1 Patch v1 to implement webarchive tests.
mitz
Comment 4 2006-12-25 00:08:19 PST
Comment on attachment 12015 [details] Patch v1 +static NSString* dumpWebArchiveAsText(WebArchive *webArchive) The first star should be next to the function name. I think the function name is misleading, since it doesn't actually dump. + NSPropertyListFormat format; Since you don't use the format you can remove this variable and just pass NULL to propertyListFromData:mutabilityOption:format:errorDescription:. + NSString *result = [[NSString alloc] initWithData:xmlData encoding:NSUTF8StringEncoding]; + [result release]; This is wrong: you're releasing result so there's nothing keeping it from being deallocated. You should either autorelease or delay the release until after you're done with the string. + NSMutableString *normalizedResult = [result mutableCopy]; + return normalizedResult; This is leaking: -mutableCopy gives you a retained object, so you should autorelease (but not release) normalizedResult before returning it. Anyway, there is no need to allocate and initialize an immutable string and then make a mutable copy. This is simpler: NSMutableString *normalizedResult = [[NSMutableString alloc] initWithData:xmlData encoding:NSUTF8StringEncoding]; // normalize the string return [normalizedResult autorelease]; I think it's slightly safer to prepend "file://" to the search string, but I wonder if it wouldn't be safer to not wait until after serialization to do the normalization, i.e. walk the dictionary and apply the normalization only to WebResourceURL values. r- because of the memory management issues.
David Kilzer (:ddkilzer)
Comment 5 2006-12-25 20:36:59 PST
(In reply to comment #4) > (From update of attachment 12015 [details] [edit]) > +static NSString* dumpWebArchiveAsText(WebArchive *webArchive) > > The first star should be next to the function name. > I think the function name is misleading, since it doesn't actually dump. I've thought some more about this. Should the name of the method on the layoutTestController change as well? I think it'd be better for the LTC method to match the internal method used to create the string.
David Kilzer (:ddkilzer)
Comment 6 2006-12-26 09:44:36 PST
Created attachment 12036 [details] Patch v2 (In reply to comment #4) > (From update of attachment 12015 [details] [edit]) > +static NSString* dumpWebArchiveAsText(WebArchive *webArchive) > > The first star should be next to the function name. Fixed. > I think the function name is misleading, since it doesn't actually dump. Renamed to serializeWebArchiveToXML(). Ignore my remarks in Comment #5. > + NSPropertyListFormat format; > > Since you don't use the format you can remove this variable and just pass NULL > to propertyListFromData:mutabilityOption:format:errorDescription:. Fixed to use nil since it's ObjC code. > + NSString *result = [[NSString alloc] initWithData:xmlData > encoding:NSUTF8StringEncoding]; > + [result release]; > > This is wrong: you're releasing result so there's nothing keeping it from being > deallocated. You should either autorelease or delay the release until after > you're done with the string. > > + NSMutableString *normalizedResult = [result mutableCopy]; > > + return normalizedResult; > > This is leaking: -mutableCopy gives you a retained object, so you should > autorelease (but not release) normalizedResult before returning it. > > Anyway, there is no need to allocate and initialize an immutable string and > then make a mutable copy. This is simpler: > > NSMutableString *normalizedResult = [[NSMutableString alloc] > initWithData:xmlData encoding:NSUTF8StringEncoding]; > // normalize the string > return [normalizedResult autorelease]; Fixed. Thanks for the suggestion! Guess I need to better familiarize myself with the NSMutableString API. > I think it's slightly safer to prepend "file://" to the search string, but I > wonder if it wouldn't be safer to not wait until after serialization to do the > normalization, i.e. walk the dictionary and apply the normalization only to > WebResourceURL values. I prepended "file://" to the search string and the replacement string, but I think this code is going to be MUCH clearer to understand than walking the WebArchive data structure, finding NSURL objects, and replacing them with new NSURL objects (since the originals are immutable). Also, I'm not sure if the original WebArchive uses Mutable versions of data structures, which means I may have to recreate part or all of the data structure. Finally, the odds that the WebKit checkout path will appear "randomly" in the XML is highly unlikely, and even if it did, this is only used for tests and such an issue could be worked around easily.
mitz
Comment 7 2006-12-26 10:09:13 PST
Comment on attachment 12036 [details] Patch v2 (In reply to comment #6) > Fixed to use nil since it's ObjC code. This is a nitpick, but actually "In Objective-C, [the null pointer] should be written as nil if it is being used as a null pointer of type id or another Objective-C object type, otherwise NULL". Here's it's a pointer to an NSPropertyListFormat, which is not an object. > I prepended "file://" to the search string and the replacement string, but I > think this code is going to be MUCH clearer to understand than walking the > WebArchive data structure, finding NSURL objects, and replacing them with new > NSURL objects (since the originals are immutable). Also, I'm not sure if the > original WebArchive uses Mutable versions of data structures, which means I may > have to recreate part or all of the data structure. I think that if you specify mutabilityOption:NSPropertyListMutableContainersAndLeaves you will get a "deeply-mutable" version (the objects won't be NSURLs, though, they would be NSMutableStrings keyed by WebResourceURL). Having said that, I'm comfortable with the string search/replace approach, even more so now that you've added the file:// prefix. I think Darin or Brady should review this too.
Darin Adler
Comment 8 2006-12-27 22:33:59 PST
Comment on attachment 12036 [details] Patch v2 Typo: "excercise" in ChangeLog. I think it would be better to not implement dumpAsWebArchive in DumpRenderTree on platforms that don't have web archive support. That way, the test can check the existence of the function and indicate the error. It would be way better if the data was dumped in text format for resource types that are text/ -- we should probably teach DumpRenderTree to do that. Fine first step! r=me
David Kilzer (:ddkilzer)
Comment 9 2006-12-28 18:28:33 PST
Created attachment 12094 [details] Patch v3 Changes from Patch v2: - Removed DumpRenderTreeQt changes. - Broke up two webarchive tests into 10 tests. - Changed nil to NULL in NSPropertyListSerialization propertyListFromData:mutabilityOption:format:errorDescription: call. - Fixed error message for webarchive. - Updated ChangeLog entries.
David Kilzer (:ddkilzer)
Comment 10 2006-12-28 18:29:21 PST
Comment on attachment 12036 [details] Patch v2 Cleared darin's r+ on obsolete patch.
Darin Adler
Comment 11 2006-12-28 20:18:25 PST
Comment on attachment 12094 [details] Patch v3 r=me
David Kilzer (:ddkilzer)
Comment 12 2006-12-28 20:56:18 PST
Committed revision 18466.
David Kilzer (:ddkilzer)
Comment 13 2006-12-29 00:20:20 PST
Ugh. The WebResourceData objects differ on each machine (or depend on where WebKit was checked out). Backing out r18466 until this is solved.
David Kilzer (:ddkilzer)
Comment 14 2006-12-29 00:31:20 PST
(In reply to comment #13) > Ugh. The WebResourceData objects differ on each machine (or depend on where > WebKit was checked out). Backing out r18466 until this is solved. Committed revision 18468.
David Kilzer (:ddkilzer)
Comment 15 2006-12-29 07:19:08 PST
Comment on attachment 12094 [details] Patch v3 Removing darin's r+ on this patch which failed all tests on the buildbot.
David Kilzer (:ddkilzer)
Comment 16 2006-12-29 07:21:04 PST
(In reply to comment #13) > Ugh. The WebResourceData objects differ on each machine (or depend on where > WebKit was checked out). Backing out r18466 until this is solved. http://build.webkit.org/post-commit-powerpc-mac-os-x/builds/4868
David Kilzer (:ddkilzer)
Comment 17 2006-12-29 07:26:08 PST
(In reply to comment #13) > Ugh. The WebResourceData objects differ on each machine (or depend on where > WebKit was checked out). Backing out r18466 until this is solved. That should be WebResourceResponse, not WebResourceData.
David Kilzer (:ddkilzer)
Comment 18 2006-12-29 13:57:04 PST
Created attachment 12113 [details] Patch v4 Changes from Patch v3: - Now also updates URL embedded in NSURLResponse, keyed as WebResourceResponse in WebArchive data structures. - Updated expected results. - Reviewer, PLEASE apply patch and run webarchive tests locally to make sure all dependencies on working directory path have been removed!
David Kilzer (:ddkilzer)
Comment 19 2006-12-29 15:17:29 PST
(In reply to comment #18) > Changes from Patch v3: > > - Now also updates URL embedded in NSURLResponse, keyed as WebResourceResponse > in WebArchive data structures. After writing this, I wondered if I should try to modify the WebArchive data structure before serializing it. I had to copy the unarchiving/archiving code from WebArchive.m for the normalizeWebResourceResponse() method in DumpRenderTree. > - Reviewer, PLEASE apply patch and run webarchive tests locally to make sure > all dependencies on working directory path have been removed! I also realized after posting this that I could test it myself! Will post the results when completed.
David Kilzer (:ddkilzer)
Comment 20 2006-12-29 16:33:30 PST
(In reply to comment #19) > > - Reviewer, PLEASE apply patch and run webarchive tests locally to make sure > > all dependencies on working directory path have been removed! > > I also realized after posting this that I could test it myself! Will post the > results when completed. I tested this on my PB G4 (patch developed on the MBP C2D), and it worked great! This issue has been resolved. Another question I had was whether these keys could be exported from WebKit to prevent duplication in DumpRenderTree.m as _WebMainResourceKey is already exported in WebKit.exp: _WebSubresourcesKey _WebSubframeArchivesKey
David Kilzer (:ddkilzer)
Comment 21 2006-12-29 17:12:57 PST
(In reply to comment #19) > (In reply to comment #18) > > Changes from Patch v3: > > > > - Now also updates URL embedded in NSURLResponse, keyed as WebResourceResponse > > in WebArchive data structures. > > After writing this, I wondered if I should try to modify the WebArchive data > structure before serializing it. I had to copy the unarchiving/archiving code > from WebArchive.m for the normalizeWebResourceResponse() method in > DumpRenderTree. Looking at the WebArchive and WebResource classes, these classes are immutable, so I'd have to create WebArchive and WebResource objects just to change their URLs. I can do this, but I'd like some more direction before I go implement that.
David Kilzer (:ddkilzer)
Comment 22 2006-12-30 07:33:01 PST
(In reply to comment #19) > After writing this, I wondered if I should try to modify the WebArchive data > structure before serializing it. I had to copy the unarchiving/archiving code > from WebArchive.m for the normalizeWebResourceResponse() method in > DumpRenderTree. Of course, I could also simply delete the WebResponseResource from the WebArchive before it's serialized to remove the dependency as well.
David Kilzer (:ddkilzer)
Comment 23 2007-01-07 22:31:20 PST
Created attachment 12293 [details] Patch v4.1 Patch v4 was created using a pre-release version of svn-create-patch from Bug 12023. The format of the last diff in the file (apple.gif) was updated to work with the released version of svn-create-patch.
Darin Adler
Comment 24 2007-01-09 22:41:37 PST
Comment on attachment 12293 [details] Patch v4.1 I'd love to see a version of this where the contents of the textual resources were readable in the dump. But this is good to have even without that: r=me
David Kilzer (:ddkilzer)
Comment 25 2007-01-11 04:37:01 PST
Comment on attachment 12293 [details] Patch v4.1 Going to rework the patch once more to make the HTML content of the archive show up as HTML (instead of text-encoded data). Clearing darin's r+ flag.
David Kilzer (:ddkilzer)
Comment 26 2007-01-29 22:27:08 PST
Created attachment 12775 [details] Patch v5 Changes since Patch v4/v4.1: - WebResourceData is converted from data to a string before being output.
David Kilzer (:ddkilzer)
Comment 27 2007-01-29 22:29:28 PST
Comment on attachment 12775 [details] Patch v5 Hold the phone! I'm missing converting WebResourceData in WebSubresources.
David Kilzer (:ddkilzer)
Comment 28 2007-01-29 22:48:26 PST
Created attachment 12776 [details] Patch v6 Changes since Patch v5: - Patch v5 correctly converted WebSubresources (I was looking data for image/gif). - Added "application/x-javascript" to render WebResourceData as text.
Darin Adler
Comment 29 2007-01-30 15:11:30 PST
Comment on attachment 12776 [details] Patch v6 r=me
David Kilzer (:ddkilzer)
Comment 30 2007-01-30 19:50:59 PST
Committed revision 19284.
Note You need to log in before you can comment on or make changes to this bug.