Bug 67628 - PrettyPatch should handle delta patch mechanism in git binary patches
Summary: PrettyPatch should handle delta patch mechanism in git binary patches
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Ben Wells
URL:
Keywords:
Depends on: 67631
Blocks:
  Show dependency treegraph
 
Reported: 2011-09-05 22:23 PDT by Ben Wells
Modified: 2011-09-14 21:07 PDT (History)
8 users (show)

See Also:


Attachments
Patch (4.15 KB, patch)
2011-09-05 22:37 PDT, Ben Wells
no flags Details | Formatted Diff | Diff
Patch (12.70 KB, patch)
2011-09-07 04:41 PDT, Ben Wells
no flags Details | Formatted Diff | Diff
Patch (12.59 KB, patch)
2011-09-07 04:58 PDT, Ben Wells
no flags Details | Formatted Diff | Diff
Patch (12.62 KB, patch)
2011-09-08 21:31 PDT, Ben Wells
no flags Details | Formatted Diff | Diff
Patch (12.22 KB, patch)
2011-09-09 00:21 PDT, Ben Wells
no flags Details | Formatted Diff | Diff
Patch (12.16 KB, patch)
2011-09-11 18:34 PDT, Ben Wells
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ben Wells 2011-09-05 22:23:18 PDT
PrettyPatch doesn't handle delta patch mechanism in git binary patches
Comment 1 Ben Wells 2011-09-05 22:37:22 PDT
Created attachment 106389 [details]
Patch
Comment 2 Ben Wells 2011-09-05 22:51:17 PDT
Link to mail archive where delta patch mechanism is mentioned: http://marc.info/?l=git&m=114682417113315&w=2
Comment 3 Shinichiro Hamaji 2011-09-05 23:25:46 PDT
Comment on attachment 106389 [details]
Patch

Looks good.
Comment 4 WebKit Review Bot 2011-09-06 00:25:20 PDT
Comment on attachment 106389 [details]
Patch

Clearing flags on attachment: 106389

Committed r94554: <http://trac.webkit.org/changeset/94554>
Comment 5 WebKit Review Bot 2011-09-06 00:25:25 PDT
All reviewed patches have been landed.  Closing bug.
Comment 6 Ben Wells 2011-09-06 01:05:08 PDT
This doesn't fix the problem when applied to the website.

Instead of getting "Binary file, nothing to see here" you get errors applying the patch. This worked locally when running webkit-patch upload, but not when deployed on the server. Will look more tomorrow.
Comment 7 Ben Wells 2011-09-06 18:57:56 PDT
I've had a look into this and have some findings. 

Summary: as far as I can see, this can't be fixed. We can give better feedback.

Details: binary patch generation in git works like this:

GenerateBinaryPatch: 
    write a header to the patch for this file
    GenerateBinaryDiff(oldfile, newfile)
    GenerateBinaryDiff(newfile, oldfile)

GenerateBinaryDiff(fromfile, tofile):
    compress / encode tofile and see how big it is
    generate a compress / encoded delta between fromfile and tofile 
    if compressed delta is smaller than compressed tofile:
        write 'delta SIZE'
        write the compressed delta
    else:
        write 'literal SIZE'
        write the compressed tofile

(paraphrased from http://git.kernel.org/?p=git/git.git;a=blob;f=diff.c;h=fcc0078074c364d0a4c2bd75a6d390e517eb7f87;hb=HEAD)

Currently, if the to file uses literal and the from file uses delta, the file is recognized as an image but extracting the from file fails. This can be seen for LayoutTests/platform/chromium-win/fast/backgrounds/gradient-background-leakage-expected.png in attachment 106368 [details]

Recognising image files when the to file is delta makes these errors show up more.

Not much can be done if both use delta encoding as the files can't be extracted from deltas alone. A simple change is to print a better message if either file uses delta encoding. Ideally we would recognise if one is literal and the other is delta but this probably isn't worth it as the delta/delta case is more common.
Comment 8 Eric Seidel (no email) 2011-09-06 19:56:43 PDT
Can't we show the resulting file by applying to the version available at trac.webkit.org/browse?
Comment 9 Ben Wells 2011-09-06 20:04:37 PDT
(In reply to comment #8)
> Can't we show the resulting file by applying to the version available at trac.webkit.org/browse?

That would be good. I'm unsure of the mechanics of this on the server; I'll dig around in those script.
Comment 10 Ben Wells 2011-09-06 20:24:19 PDT
(In reply to comment #9)
> (In reply to comment #8)
> > Can't we show the resulting file by applying to the version available at trac.webkit.org/browse?
> 
> That would be good. I'm unsure of the mechanics of this on the server; I'll dig around in those script.

The PrettyPatch.rb script could be made to download the 'from' file to a temp location (using the open-uri import) and apply the delta to that. That should work. Would there be any problem doing this locally, e.g. when running webkit-patch upload? If so it could not do this if you're in a git repo.
Comment 11 Eric Seidel (no email) 2011-09-06 20:27:54 PDT
I think the original dream was to do this all in JavaScript.  I don't know how tenable such a dream is.
Comment 12 Ben Wells 2011-09-07 04:41:43 PDT
Created attachment 106569 [details]
Patch
Comment 13 Ben Wells 2011-09-07 04:58:55 PDT
Created attachment 106571 [details]
Patch
Comment 14 Ben Wells 2011-09-07 05:02:34 PDT
This is my first Ruby coding, so there are probably some dumb things in there. Doing this in javascript sounds difficult, although if you could keep the git apply stuff in cgi-land it might be a lot simpler.

If this is r+, I will put it through the cq tomorrow Sydney time.
Comment 15 Adam Roben (:aroben) 2011-09-07 08:42:14 PDT
Maybe we should be downloading files from svn.webkit.org instead of trac.webkit.org?

CCing Mark Rowe, who knows a lot more about Ruby than I do.
Comment 16 Ben Wells 2011-09-07 17:10:11 PDT
(In reply to comment #15)
> Maybe we should be downloading files from svn.webkit.org instead of trac.webkit.org?

I can use something like this instead if it is preferable: http://svn.webkit.org/repository/webkit/!svn/bc/93233/trunk/....
Comment 17 Ben Wells 2011-09-08 18:38:34 PDT
Any more feedback? Would be great to get this reviewed as it is making rebaselining from a git checkout and going through the cq pretty painful.

Also, I meant to ask explicitly: should i use svn.webkit.org or trac.webkit.org? Either way the URL is just a constant; if svn.webkit.org is preferable let me know.
Comment 18 Adam Barth 2011-09-08 19:17:06 PDT
aroben is probably your best reviewer for this patch.

I would use svn.webkit.org.  It's the horse's mouth, so to speak.
Comment 19 Ben Wells 2011-09-08 21:31:15 PDT
Created attachment 106839 [details]
Patch
Comment 20 Ben Wells 2011-09-08 21:32:25 PDT
Updated to use svn.webkit.org.

You can see the output of this version of PrettyPatch.rb (and the problem without it) at bug 67830.
Comment 21 Shinichiro Hamaji 2011-09-08 22:41:02 PDT
Comment on attachment 106839 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=106839&action=review

> Websites/bugs.webkit.org/PrettyPatch/PrettyPatch.rb:713
> +        def self.extract_contents_from_git_binary_delta_chunk(from_filepath, from_git_index, encoded_chunk, to_git_index)

Maybe we can add a helper function for this and download_from_revision_from_svn ? They look very similar.

> Websites/bugs.webkit.org/PrettyPatch/PrettyPatch.rb:721
> +                    file << open(from_filepath).read

The opened file is not closed here? Maybe

file << open(from_filepath){|f|f.read}

or something like this. (with recent ruby (>=1.8.7) we can do the same thing by open(from_filepath, &:read), but the above way would be safer considering older interpreters)

> Websites/bugs.webkit.org/PrettyPatch/PrettyPatch.rb:723
> +                    file.close

I think we don't need this begin-ensure because Kernel#open should close the file automatically even if an exception is raised.
Comment 22 Ben Wells 2011-09-09 00:21:14 PDT
Created attachment 106845 [details]
Patch
Comment 23 Ben Wells 2011-09-09 00:23:37 PDT
Updated per Hamaji's feedback - factored out temp filename and file copying. Looks much nicer now, hope I got the file copy stuff correct.
Comment 24 Adam Roben (:aroben) 2011-09-09 06:01:56 PDT
Comment on attachment 106845 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=106845&action=review

It would be nice if we could avoid touching the disk. Is there a way to run git-apply on stdin?

> Websites/bugs.webkit.org/ChangeLog:14
> +        When reconstructing the images from the patches, if we have a delta patch
> +        we may download the previous revision from trac to get the image data.

"from trac" is no longer accurate.

> Websites/bugs.webkit.org/PrettyPatch/PrettyPatch.rb:23
> +        $git_svn_revision = 0

This is just a Subversion revision number. It has nothing to do with Git (even though it's only present for git patches). I'd just call it $svn_revision.

> Websites/bugs.webkit.org/PrettyPatch/PrettyPatch.rb:672
> +                to_file << open(from_path) { |from_file| from_file.read}

Missing a space before }

> Websites/bugs.webkit.org/PrettyPatch/PrettyPatch.rb:678
> +            svn_uri =  get_svn_uri(repository_path)

Extra space after =

> Websites/bugs.webkit.org/PrettyPatch/PrettyPatch.rb:719
> +            # For literal encoded, simply reconstruct. For delta encoded, download from svn

Missing a period at the end of this comment.

> Websites/bugs.webkit.org/PrettyPatch/PrettyPatch.rb:723
> +            if (GIT_LITERAL_FORMAT.match(encoded_chunk[0])) then
> +                return extract_contents_from_git_binary_literal_chunk(encoded_chunk, git_index)
> +            elsif (GIT_DELTA_FORMAT.match(encoded_chunk[0])) then
> +                return download_from_revision_from_svn(repository_path)

Ruby, like Python, typically omits parentheses around conditionals like these.

WebKit coding style says not to say "else" after "return". So these could both just be "if"s. No need for "elsif".

Like in Perl, you can use syntax like this:

return extract_contents_from_git_binary_literal_chunk(encoded_chunk, git_index) if GIT_LITERAL_FORMAT.match(encoded_chunk[0])
return download_from_revision_from_svn(repository_path) if GIT_DELTA_FORMAT.match(encoded_chunk[0])

That might be clearer.
Comment 25 Ben Wells 2011-09-11 18:34:26 PDT
Created attachment 107012 [details]
Patch
Comment 26 Ben Wells 2011-09-11 18:43:39 PDT
(In reply to comment #24)
> It would be nice if we could avoid touching the disk. Is there a way to run git-apply on stdin?

git-apply takes the patch on stdin. It's also hard coded to output to disk as far as I can see. I've updated the way git-apply is used for the delta patches so that it doesn't require a disk copy before calling git-apply; now the disk is used for delta patches the same way as for literal patches.

The only way I can see to avoid touching disk is to use some sort of RAM disk or virtual disk on the server, but I think that's an orthogonal thing.

BTW, something which has snuck in with this change is that the tmp files should be cleaned up now; there is a bug in the current version of PrettyPatch where the filename was used in the erase instead of the filepath so the tmp files hang around (line 660).

> "from trac" is no longer accurate.

Fixed.

> > Websites/bugs.webkit.org/PrettyPatch/PrettyPatch.rb:23
> > +        $git_svn_revision = 0
> 
> This is just a Subversion revision number. It has nothing to do with Git (even though it's only present for git patches). I'd just call it $svn_revision.

Renamed.

> > Websites/bugs.webkit.org/PrettyPatch/PrettyPatch.rb:672
> > +                to_file << open(from_path) { |from_file| from_file.read}
> 
> Missing a space before }

Fixed.

> > Websites/bugs.webkit.org/PrettyPatch/PrettyPatch.rb:678
> > +            svn_uri =  get_svn_uri(repository_path)
> 
> Extra space after =

Fixed.

> > Websites/bugs.webkit.org/PrettyPatch/PrettyPatch.rb:719
> > +            # For literal encoded, simply reconstruct. For delta encoded, download from svn
> 
> Missing a period at the end of this comment.
> 
> > Websites/bugs.webkit.org/PrettyPatch/PrettyPatch.rb:723
> > +            if (GIT_LITERAL_FORMAT.match(encoded_chunk[0])) then
> > +                return extract_contents_from_git_binary_literal_chunk(encoded_chunk, git_index)
> > +            elsif (GIT_DELTA_FORMAT.match(encoded_chunk[0])) then
> > +                return download_from_revision_from_svn(repository_path)
> 
> Ruby, like Python, typically omits parentheses around conditionals like these.
> 
> WebKit coding style says not to say "else" after "return". So these could both just be "if"s. No need for "elsif".
> 
> Like in Perl, you can use syntax like this:
> 
> return extract_contents_from_git_binary_literal_chunk(encoded_chunk, git_index) if GIT_LITERAL_FORMAT.match(encoded_chunk[0])
> return download_from_revision_from_svn(repository_path) if GIT_DELTA_FORMAT.match(encoded_chunk[0])
> 
> That might be clearer.

I've removed the elsif, but haven't used the return X if Y construct as I thought the lines got too long (>160 characters for the to file / delta case).
Comment 27 WebKit Review Bot 2011-09-14 21:07:46 PDT
Comment on attachment 107012 [details]
Patch

Clearing flags on attachment: 107012

Committed r95159: <http://trac.webkit.org/changeset/95159>
Comment 28 WebKit Review Bot 2011-09-14 21:07:52 PDT
All reviewed patches have been landed.  Closing bug.