RESOLVED FIXED 9299
Teach svn-create-patch and friends to work with binary files
https://bugs.webkit.org/show_bug.cgi?id=9299
Summary Teach svn-create-patch and friends to work with binary files
David Kilzer (:ddkilzer)
Reported 2006-06-03 21:39:17 PDT
It would be nice if svn-create-patch, svn-apply and svn-unapply could handle binary file content.
Attachments
Patch v1 (8.94 KB, patch)
2006-06-03 22:05 PDT, David Kilzer (:ddkilzer)
darin: review+
Example test patch (87.32 KB, patch)
2006-06-04 04:19 PDT, David Kilzer (:ddkilzer)
no flags
David Kilzer (:ddkilzer)
Comment 1 2006-06-03 22:05:15 PDT
Created attachment 8686 [details] Patch v1 Patch v1 is my first pass at implementing this feature. There is no changelog yet since I figure I'll have to make some adjustments as the first pass isn't perfect. Here's a run-down of the changes: svn-create-patch - Added 2006 to copyright year. - When creating a patch, look for "Cannot display: file marked as a binary type." to denote a binary file. - The contents of the binary file are appended at the end of each binary file segment as a base64-encoded chunk of text. svn-apply - Again uses "Cannot display: file marked as a binary type." to denote a binary file. - Handle binary files separately using handleBinaryChange() subroutine. - Renamed local $file variable to $contents (which made more sense to me). - handleBinaryChange() uses a regex to pull the base64-encoded text out of the patch contents. svn-unapply - Again uses "Cannot display: file marked as a binary type." to denote a binary file. - Simplified unapply logic to removing an existing file if it's there, then call "svn revert" regardless. (This may be a bit too simplistic for the other goals outlined in the comments, though.) Tested with a patch for r14715 as well as a local test where one image file each was added, deleted and modified. Please test locally, though.
David Kilzer (:ddkilzer)
Comment 2 2006-06-04 04:14:51 PDT
Reassigning back to webkit-unassigned for a wider audience. (In reply to comment #1) > Created an attachment (id=8686) [edit] > Patch v1 Random design decision information: One downside to setting the binary file contents this way (e.g., relying on the presence or absence of the base64-encoded chunk of text) is that any "old" patches which include patch segments for added or modified binary files will actually look like deletions. The additions will fail out, but the modifications will work as if they were deletions. I'd have to add extra contextual information with the chunk o' text to keep this from happening. I also thought about putting some kind of markers around the chunks o' text, but couldn't decide whether to go old-style with uuencode-like begin/end lines, or new-style with MIME delimiters. Once I figured out that I could just use a regex to pull it back out, I skipped the markers for the first patch. Note that the other nice thing about simply adding the chunks o' text in with the binary patches is that this format will degrade gracefully for anyone using just the patch(1) command. The patch(1) command is smart enough to ignore content that isn't a patch, so it will simply skip the chunks o' text when it runs into them. Another downside to including the chunks o' text inline with each binary patch is that it will make it harder to review a patch simply by looking at it if the images are interspersed with the text file changes. This could be fixed by sorting all of the binary patches to the end in svn-create-patch. I also thought about trying to compress binary files before base64-encoding them, but I couldn't come up with a nice way to communicate this information (that all of the binary files are also compressed) to someone who saw the format for the first time, was familiar with plain diff(1) output, but wanted to grok the new format through trial-and-error. I suppose the file(1) command could help them determine that the base64-decoded text was compressed by a particular program, though. It's trivial to add gzip or bzip2 compression before base64-encoding.
David Kilzer (:ddkilzer)
Comment 3 2006-06-04 04:19:57 PDT
Created attachment 8687 [details] Example test patch (In reply to comment #1) > Tested with a patch for r14715 as well as a local test where one image file > each was added, deleted and modified. Please test locally, though. This is the local example test patch where one image file was added, modified and deleted that I used to test svn-apply and svn-unapply. Simply apply Patch v1, then download this file to try out the new functionality.
Darin Adler
Comment 4 2006-06-04 09:55:07 PDT
Comment on attachment 8686 [details] Patch v1 Hooray! Hooray! r=me
Darin Adler
Comment 5 2006-06-04 10:51:00 PDT
I think this is a fine start and it's good to just land what you have here, even though it could use some refinement. Compressing the binary data is a great idea. We should not be shy about adding some additional lines of text to make the format easier to understand. I don't think we should use a "MIME multipart" approach. Instead we should use something that's human readable and consistent in style with standard "svn di" output. I think that we should try to be completely unambiguous in the format. The base64 data for the file (before and after in the case of a modification, before for deletion, after for addition) should be preceded by something that says in plain English whether it's a file being added, deleted, or modified. In fact, I think we should omit "Cannot display: file marked as a binary type.", because it's misleading. I like the economy of staying as close as possible to the original patch, but I don't want anything actively misleading in the patch. So the text should say something like "Binary file: Base64-encoded data follows." instead of "Cannot display." But some variation on that which is clear for the 3 cases of addition, deletion, and modification. Putting all the binary changes at the bottom is a great idea -- makes it easier to review the patch. Since "svn diff" gives changes in random order, it would be good to sort them in the generated patch. I suggest sorting filenames the same way that prepare-ChangeLog does, but using the "binary vs. text" as the primary key. I'd like to see the apply and unapply scripts check the binary against the data in the patch and do something safe when the file doesn't match. I'd suggest a rule that says "complain and do nothing for the entire patch if binary files are present but do not match". Probably a pass before applying the patch that checks that everything applies successfully. For diff patches we could use --dry-run to check that they apply, and for binary patches just compare what's on disk with what's in the patch. As a special case, we could allow entirely-missing files for a patch that modifies a file. That way, you can get a patch to apply just by removing some files explicitly.
David Kilzer (:ddkilzer)
Comment 6 2006-06-04 11:49:57 PDT
Committed revision 14718. Now the real work begins!
Note You need to log in before you can comment on or make changes to this bug.