Bug 11882 - Need a way to regression test .webarchive output files
Summary: Need a way to regression test .webarchive output files
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 420+
Hardware: Mac OS X 10.4
: P2 Enhancement
Assignee: David Kilzer (:ddkilzer)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-12-19 14:12 PST by David Kilzer (:ddkilzer)
Modified: 2007-01-30 19:50 PST (History)
4 users (show)

See Also:


Attachments
Patch v1 (32.50 KB, patch)
2006-12-24 19:40 PST, David Kilzer (:ddkilzer)
mitz: review-
Details | Formatted Diff | Diff
Patch v2 (32.44 KB, patch)
2006-12-26 09:44 PST, David Kilzer (:ddkilzer)
no flags Details | Formatted Diff | Diff
Patch v3 (78.61 KB, patch)
2006-12-28 18:28 PST, David Kilzer (:ddkilzer)
no flags Details | Formatted Diff | Diff
Patch v4 (76.07 KB, patch)
2006-12-29 13:57 PST, David Kilzer (:ddkilzer)
no flags Details | Formatted Diff | Diff
Patch v4.1 (76.12 KB, patch)
2007-01-07 22:31 PST, David Kilzer (:ddkilzer)
no flags Details | Formatted Diff | Diff
Patch v5 (76.72 KB, patch)
2007-01-29 22:27 PST, David Kilzer (:ddkilzer)
no flags Details | Formatted Diff | Diff
Patch v6 (76.76 KB, patch)
2007-01-29 22:48 PST, David Kilzer (:ddkilzer)
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description David Kilzer (:ddkilzer) 2006-12-19 14:12:27 PST
We need a way to regression test .webarchive files output from WebKit for bugs like Bug 7266, Bug 11839 and Bug 11850.
Comment 1 David Kilzer (:ddkilzer) 2006-12-19 14:20:02 PST
I had two thoughts on how to do this:

1. Make a dumpAsWebArchive() method available through the layoutTestController, then teach DumpRenderTree how to "compare" two .webarchive files for differences (ignoring timestamp differences as needed).

This would allow for testing of the actual structure of the .webarchive files, but wouldn't test whether they could be rendered properly after being loaded.

2. Make a dumpWebArchiveRenderTree() method available through the layoutTestController that loads in the .webarchive file and dumps out its render tree.

This would allow for testing of the rendering of the .webarchive file.

Or both?  Darin, do you have any other ideas or comments on the above?
Comment 2 Darin Adler 2006-12-19 16:12:02 PST
Those both sound pretty good; I like (2) better. Maybe there are some other even-better ideas.
Comment 3 David Kilzer (:ddkilzer) 2006-12-24 19:40:29 PST
Created attachment 12015 [details]
Patch v1

Patch v1 to implement webarchive tests.
Comment 4 mitz 2006-12-25 00:08:19 PST
Comment on attachment 12015 [details]
Patch v1

+static NSString* dumpWebArchiveAsText(WebArchive *webArchive)

The first star should be next to the function name.
I think the function name is misleading, since it doesn't actually dump.

+    NSPropertyListFormat format;

Since you don't use the format you can remove this variable and just pass NULL to propertyListFromData:mutabilityOption:format:errorDescription:.

+    NSString *result = [[NSString alloc] initWithData:xmlData encoding:NSUTF8StringEncoding];
+    [result release];

This is wrong: you're releasing result so there's nothing keeping it from being deallocated. You should either autorelease or delay the release until after you're done with the string.

+    NSMutableString *normalizedResult = [result mutableCopy];

+    return normalizedResult;

This is leaking: -mutableCopy gives you a retained object, so you should autorelease (but not release) normalizedResult before returning it.

Anyway, there is no need to allocate and initialize an immutable string and then make a mutable copy. This is simpler:

    NSMutableString *normalizedResult = [[NSMutableString alloc]  initWithData:xmlData encoding:NSUTF8StringEncoding];
    // normalize the string
    return [normalizedResult autorelease];

I think it's slightly safer to prepend "file://" to the search string, but I wonder if it wouldn't be safer to not wait until after serialization to do the normalization, i.e. walk the dictionary and apply the normalization only to WebResourceURL values.

r- because of the memory management issues.
Comment 5 David Kilzer (:ddkilzer) 2006-12-25 20:36:59 PST
(In reply to comment #4)
> (From update of attachment 12015 [details] [edit])
> +static NSString* dumpWebArchiveAsText(WebArchive *webArchive)
> 
> The first star should be next to the function name.
> I think the function name is misleading, since it doesn't actually dump.

I've thought some more about this.  Should the name of the method on the layoutTestController change as well?  I think it'd be better for the LTC method to match the internal method used to create the string.

Comment 6 David Kilzer (:ddkilzer) 2006-12-26 09:44:36 PST
Created attachment 12036 [details]
Patch v2

(In reply to comment #4)
> (From update of attachment 12015 [details] [edit])
> +static NSString* dumpWebArchiveAsText(WebArchive *webArchive)
> 
> The first star should be next to the function name.

Fixed.

> I think the function name is misleading, since it doesn't actually dump.

Renamed to serializeWebArchiveToXML().  Ignore my remarks in Comment #5.

> +    NSPropertyListFormat format;
> 
> Since you don't use the format you can remove this variable and just pass NULL
> to propertyListFromData:mutabilityOption:format:errorDescription:.

Fixed to use nil since it's ObjC code.

> +    NSString *result = [[NSString alloc] initWithData:xmlData
> encoding:NSUTF8StringEncoding];
> +    [result release];
> 
> This is wrong: you're releasing result so there's nothing keeping it from being
> deallocated. You should either autorelease or delay the release until after
> you're done with the string.
> 
> +    NSMutableString *normalizedResult = [result mutableCopy];
> 
> +    return normalizedResult;
> 
> This is leaking: -mutableCopy gives you a retained object, so you should
> autorelease (but not release) normalizedResult before returning it.
> 
> Anyway, there is no need to allocate and initialize an immutable string and
> then make a mutable copy. This is simpler:
> 
>     NSMutableString *normalizedResult = [[NSMutableString alloc] 
> initWithData:xmlData encoding:NSUTF8StringEncoding];
>     // normalize the string
>     return [normalizedResult autorelease];

Fixed.  Thanks for the suggestion!  Guess I need to better familiarize myself with the NSMutableString API.

> I think it's slightly safer to prepend "file://" to the search string, but I
> wonder if it wouldn't be safer to not wait until after serialization to do the
> normalization, i.e. walk the dictionary and apply the normalization only to
> WebResourceURL values.

I prepended "file://" to the search string and the replacement string, but I think this code is going to be MUCH clearer to understand than walking the WebArchive data structure, finding NSURL objects, and replacing them with new NSURL objects (since the originals are immutable).  Also, I'm not sure if the original WebArchive uses Mutable versions of data structures, which means I may have to recreate part or all of the data structure.

Finally, the odds that the WebKit checkout path will appear "randomly" in the XML is highly unlikely, and even if it did, this is only used for tests and such an issue could be worked around easily.
Comment 7 mitz 2006-12-26 10:09:13 PST
Comment on attachment 12036 [details]
Patch v2

(In reply to comment #6)

> Fixed to use nil since it's ObjC code.

This is a nitpick, but actually "In Objective-C, [the null pointer] should be written as nil if it is being used as a null pointer of type id or another Objective-C object type, otherwise NULL". Here's it's a pointer to an NSPropertyListFormat, which is not an object.

> I prepended "file://" to the search string and the replacement string, but I
> think this code is going to be MUCH clearer to understand than walking the
> WebArchive data structure, finding NSURL objects, and replacing them with new
> NSURL objects (since the originals are immutable).  Also, I'm not sure if the
> original WebArchive uses Mutable versions of data structures, which means I may
> have to recreate part or all of the data structure.

I think that if you specify mutabilityOption:NSPropertyListMutableContainersAndLeaves you will get a "deeply-mutable" version (the objects won't be NSURLs, though, they would be NSMutableStrings keyed by WebResourceURL). Having said that, I'm comfortable with the string search/replace approach, even more so now that you've added the file:// prefix.

I think Darin or Brady should review this too.
Comment 8 Darin Adler 2006-12-27 22:33:59 PST
Comment on attachment 12036 [details]
Patch v2

Typo: "excercise" in ChangeLog.

I think it would be better to not implement dumpAsWebArchive in DumpRenderTree on platforms that don't have web archive support. That way, the test can check the existence of the function and indicate the error.

It would be way better if the data was dumped in text format for resource types that are text/ -- we should probably teach DumpRenderTree to do that.

Fine first step!

r=me
Comment 9 David Kilzer (:ddkilzer) 2006-12-28 18:28:33 PST
Created attachment 12094 [details]
Patch v3

Changes from Patch v2:

- Removed DumpRenderTreeQt changes.
- Broke up two webarchive tests into 10 tests.
- Changed nil to NULL in NSPropertyListSerialization propertyListFromData:mutabilityOption:format:errorDescription: call.
- Fixed error message for webarchive.
- Updated ChangeLog entries.
Comment 10 David Kilzer (:ddkilzer) 2006-12-28 18:29:21 PST
Comment on attachment 12036 [details]
Patch v2

Cleared darin's r+ on obsolete patch.
Comment 11 Darin Adler 2006-12-28 20:18:25 PST
Comment on attachment 12094 [details]
Patch v3

r=me
Comment 12 David Kilzer (:ddkilzer) 2006-12-28 20:56:18 PST
Committed revision 18466.

Comment 13 David Kilzer (:ddkilzer) 2006-12-29 00:20:20 PST
Ugh.  The WebResourceData objects differ on each machine (or depend on where WebKit was checked out).  Backing out r18466 until this is solved.

Comment 14 David Kilzer (:ddkilzer) 2006-12-29 00:31:20 PST
(In reply to comment #13)
> Ugh.  The WebResourceData objects differ on each machine (or depend on where
> WebKit was checked out).  Backing out r18466 until this is solved.

Committed revision 18468.
Comment 15 David Kilzer (:ddkilzer) 2006-12-29 07:19:08 PST
Comment on attachment 12094 [details]
Patch v3

Removing darin's r+ on this patch which failed all tests on the buildbot.
Comment 16 David Kilzer (:ddkilzer) 2006-12-29 07:21:04 PST
(In reply to comment #13)
> Ugh.  The WebResourceData objects differ on each machine (or depend on where
> WebKit was checked out).  Backing out r18466 until this is solved.

http://build.webkit.org/post-commit-powerpc-mac-os-x/builds/4868

Comment 17 David Kilzer (:ddkilzer) 2006-12-29 07:26:08 PST
(In reply to comment #13)
> Ugh.  The WebResourceData objects differ on each machine (or depend on where
> WebKit was checked out).  Backing out r18466 until this is solved.

That should be WebResourceResponse, not WebResourceData.

Comment 18 David Kilzer (:ddkilzer) 2006-12-29 13:57:04 PST
Created attachment 12113 [details]
Patch v4

Changes from Patch v3:

- Now also updates URL embedded in NSURLResponse, keyed as WebResourceResponse in WebArchive data structures.
- Updated expected results.
- Reviewer, PLEASE apply patch and run webarchive tests locally to make sure all dependencies on working directory path have been removed!
Comment 19 David Kilzer (:ddkilzer) 2006-12-29 15:17:29 PST
(In reply to comment #18)
> Changes from Patch v3:
> 
> - Now also updates URL embedded in NSURLResponse, keyed as WebResourceResponse
> in WebArchive data structures.

After writing this, I wondered if I should try to modify the WebArchive data structure before serializing it.  I had to copy the unarchiving/archiving code from WebArchive.m for the normalizeWebResourceResponse() method in DumpRenderTree.

> - Reviewer, PLEASE apply patch and run webarchive tests locally to make sure
> all dependencies on working directory path have been removed!

I also realized after posting this that I could test it myself!  Will post the results when completed.

Comment 20 David Kilzer (:ddkilzer) 2006-12-29 16:33:30 PST
(In reply to comment #19)
> > - Reviewer, PLEASE apply patch and run webarchive tests locally to make sure
> > all dependencies on working directory path have been removed!
> 
> I also realized after posting this that I could test it myself!  Will post the
> results when completed.

I tested this on my PB G4 (patch developed on the MBP C2D), and it worked great!  This issue has been resolved.

Another question I had was whether these keys could be exported from WebKit to prevent duplication in DumpRenderTree.m as _WebMainResourceKey is already exported in WebKit.exp:

_WebSubresourcesKey
_WebSubframeArchivesKey

Comment 21 David Kilzer (:ddkilzer) 2006-12-29 17:12:57 PST
(In reply to comment #19)
> (In reply to comment #18)
> > Changes from Patch v3:
> > 
> > - Now also updates URL embedded in NSURLResponse, keyed as WebResourceResponse
> > in WebArchive data structures.
> 
> After writing this, I wondered if I should try to modify the WebArchive data
> structure before serializing it.  I had to copy the unarchiving/archiving code
> from WebArchive.m for the normalizeWebResourceResponse() method in
> DumpRenderTree.

Looking at the WebArchive and WebResource classes, these classes are immutable, so I'd have to create WebArchive and WebResource objects just to change their URLs.  I can do this, but I'd like some more direction before I go implement that.

Comment 22 David Kilzer (:ddkilzer) 2006-12-30 07:33:01 PST
(In reply to comment #19)
> After writing this, I wondered if I should try to modify the WebArchive data
> structure before serializing it.  I had to copy the unarchiving/archiving code
> from WebArchive.m for the normalizeWebResourceResponse() method in
> DumpRenderTree.

Of course, I could also simply delete the WebResponseResource from the WebArchive before it's serialized to remove the dependency as well.

Comment 23 David Kilzer (:ddkilzer) 2007-01-07 22:31:20 PST
Created attachment 12293 [details]
Patch v4.1

Patch v4 was created using a pre-release version of svn-create-patch from Bug 12023.  The format of the last diff in the file (apple.gif) was updated to work with the released version of svn-create-patch.
Comment 24 Darin Adler 2007-01-09 22:41:37 PST
Comment on attachment 12293 [details]
Patch v4.1

I'd love to see a version of this where the contents of the textual resources were readable in the dump.

But this is good to have even without that: r=me
Comment 25 David Kilzer (:ddkilzer) 2007-01-11 04:37:01 PST
Comment on attachment 12293 [details]
Patch v4.1

Going to rework the patch once more to make the HTML content of the archive show up as HTML (instead of text-encoded data).  

Clearing darin's r+ flag.
Comment 26 David Kilzer (:ddkilzer) 2007-01-29 22:27:08 PST
Created attachment 12775 [details]
Patch v5

Changes since Patch v4/v4.1:

- WebResourceData is converted from data to a string before being output.
Comment 27 David Kilzer (:ddkilzer) 2007-01-29 22:29:28 PST
Comment on attachment 12775 [details]
Patch v5

Hold the phone!  I'm missing converting WebResourceData in WebSubresources.
Comment 28 David Kilzer (:ddkilzer) 2007-01-29 22:48:26 PST
Created attachment 12776 [details]
Patch v6

Changes since Patch v5:

- Patch v5 correctly converted WebSubresources (I was looking data for image/gif).
- Added "application/x-javascript" to render WebResourceData as text.
Comment 29 Darin Adler 2007-01-30 15:11:30 PST
Comment on attachment 12776 [details]
Patch v6

r=me
Comment 30 David Kilzer (:ddkilzer) 2007-01-30 19:50:59 PST
Committed revision 19284.