Bug 34826

Summary: new-run-webkit-tests --platform=mac-leopard does bogus image diffing
Product: WebKit Reporter: Eric Seidel (no email) <eric>
Component: Tools / TestsAssignee: Dirk Pranke <dpranke>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, dpranke, jamesr, levin, ojan
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: OS X 10.5   
Bug Depends on:    
Bug Blocks: 34984    
Attachments:
Description Flags
first draft of patch - still buggy
none
updated patch - this should work correctly and safely
none
split server_process.read() into two calls for clearer usage
none
fix ChangeLog comments
none
back out PEP-8 reformatting to minimize confusion
none
revise w/ review feedback from eric and ojan
none
revise w/ more feedback from ojan none

Eric Seidel (no email)
Reported 2010-02-10 18:13:26 PST
run-chromium-webkit-tests --platform=mac-leopard does bogus image diffing def _path_to_image_diff(self): return self._build_path('image_diff') # FIXME: This is wrong and should be "ImageDiff", but having the correct path causes other parts of the script to hang. turns out that "image_diff" expects command line arguments and is not long-running. ImageDiff is long running. We'll have to teach RCWT how to deal with a long-running ImageDiff (similar to how we deal with a long-running DumpRenderTree) via the "Driver" class.
Attachments
first draft of patch - still buggy (10.77 KB, patch)
2010-02-26 19:36 PST, Dirk Pranke
no flags
updated patch - this should work correctly and safely (28.22 KB, patch)
2010-03-24 19:44 PDT, Dirk Pranke
no flags
split server_process.read() into two calls for clearer usage (29.41 KB, patch)
2010-03-25 14:13 PDT, Dirk Pranke
no flags
fix ChangeLog comments (29.21 KB, patch)
2010-03-25 14:36 PDT, Dirk Pranke
no flags
back out PEP-8 reformatting to minimize confusion (24.56 KB, patch)
2010-03-25 18:15 PDT, Dirk Pranke
no flags
revise w/ review feedback from eric and ojan (26.30 KB, patch)
2010-03-26 15:10 PDT, Dirk Pranke
no flags
revise w/ more feedback from ojan (27.37 KB, patch)
2010-03-26 16:05 PDT, Dirk Pranke
no flags
Dirk Pranke
Comment 1 2010-02-26 19:36:14 PST
Created attachment 49671 [details] first draft of patch - still buggy This isn't ready to land yet, but it's close. The test script is hanging occasionally, and I think it's due to a bug in the diff_image() implementation but I haven't tracked it down yet. I'm posting the patch so others know it's in progress.
Dirk Pranke
Comment 2 2010-03-24 19:44:20 PDT
Created attachment 51582 [details] updated patch - this should work correctly and safely
Dirk Pranke
Comment 3 2010-03-25 14:13:19 PDT
Created attachment 51675 [details] split server_process.read() into two calls for clearer usage
Eric Seidel (no email)
Comment 4 2010-03-25 14:21:36 PDT
Drive-by nit: ChangeLog looks not quite right.
Dirk Pranke
Comment 5 2010-03-25 14:36:07 PDT
Created attachment 51676 [details] fix ChangeLog comments
Ojan Vafai
Comment 6 2010-03-25 17:56:51 PDT
Can you split the style changes off into a separate page? That would make reviewing the code changes considerably easier.
Dirk Pranke
Comment 7 2010-03-25 18:15:35 PDT
Created attachment 51706 [details] back out PEP-8 reformatting to minimize confusion
Dirk Pranke
Comment 8 2010-03-25 18:15:47 PDT
(In reply to comment #6) > Can you split the style changes off into a separate page? That would make > reviewing the code changes considerably easier. Done.
Ojan Vafai
Comment 9 2010-03-26 14:25:18 PDT
Comment on attachment 51706 [details] back out PEP-8 reformatting to minimize confusion > + def diff_image(self, expected_filename, actual_filename, > + diff_filename=None): > + """Compare two image files and produce a delta image file. > + > + Return 1 if the two files are different, 0 if they are the same. Why not return a boolean? > + actual_file = open(actual_filename).read() > + input = 'Content-Length: %d\n' % actual_length > + input += actual_file > + input += 'Content-Length: %d\n' % expected_length > + input += expected_file How about using a single format string here? > + cmd = [self._path_to_image_diff(), '--tolerance', '0.1'] Doesn't this need to be controlled via the commandline? Also, I'm pretty sure tiger, leopard and snow-leopard have different default tolerances. > cmd += self._options.wrapper.split() > - # FIXME: Using arch here masks any possible file-not-found errors from a non-existant driver executable. > - cmd += ['arch', '-i386', port._path_to_driver(), '-'] Is this FIXME no longer true? > +++ b/WebKitTools/Scripts/webkitpy/layout_tests/port/server_process.py > +class ServerProcess: I like the concept of this class, but not so much the name. There's not really a server involved. ReaderProcess? ReadLineProcess? ReadWriteProcess? > + """This class provides a wrapper around a subprocess that > + implements a simple read/write usage model. The primary benefit > + is that reading responses takes a timeout, so that we don't ever block > + indefinitely. The class also handles transparently restarting processes > + as necessary to keep issuing commands.""" > + def read_line(self, deadline): > + """Read a single line from the subprocess, waiting until the deadline. > + If the deadline passes, the call times out. Note that even if the > + subprocess has crashed or the deadline has passed, if there is output > + pending, it will be returned. > + > + Args: > + deadline: timestamp that the read must complete before Passing in a timestamp is weird. Can you have it be a timeout and then have this code deal with calculating what the deadline should be? > + Returns: > + output: data returned, if any. If no data is available and the > + call times out or crashes, an empty string is returned. Note > + that the returned string includes the newline ('\n').""" > + return self._read(deadline, 0) > + > + def read(self, deadline, size): > + """Attempts to read size characters from the subprocess, waiting until > + the deadline passes. If the deadline passes, any available data will be > + returned. Note that even if the deadline has passed or if the > + subprocess has crashed, any available data will still be returned. > + > + Args: > + deadline: timestamp that the read must complete before > + size: amount of data to read. > + Returns: > + output: data returned, if any. If no data is available, an empty > + string is returned. > + """ > + return self._read(deadline, size) Why does this function exist? Why not just make _read public? > + def _read(self, deadline, size): > + """Internal routine that actually does the read.""" > + idx = -1 Can you give this a more descriptive name? > diff --git a/WebKitTools/Scripts/webkitpy/layout_tests/test_types/image_diff.py b/WebKitTools/Scripts/webkitpy/layout_tests/test_types/image_diff.py > index 42928ba..3a3d8a6 100644 > --- a/WebKitTools/Scripts/webkitpy/layout_tests/test_types/image_diff.py > +++ b/WebKitTools/Scripts/webkitpy/layout_tests/test_types/image_diff.py > @@ -90,6 +90,7 @@ class ImageDiff(test_type_base.TestTypeBase): > Args: > filename: the name of the test > target: Debug or Release > + Returns 1 if the files are different, 0 if they match > """ I know you didn't make it this way, but can we make this return a boolean? Then we can make the above code do the same?
Dirk Pranke
Comment 10 2010-03-26 14:38:31 PDT
(In reply to comment #9) > (From update of attachment 51706 [details]) > > + def diff_image(self, expected_filename, actual_filename, > > + diff_filename=None): > > + """Compare two image files and produce a delta image file. > > + > > + Return 1 if the two files are different, 0 if they are the same. > > Why not return a boolean? You saw below why not; I'll change the two files to return booleans instead. > > > + actual_file = open(actual_filename).read() > > + input = 'Content-Length: %d\n' % actual_length > > + input += actual_file > > + input += 'Content-Length: %d\n' % expected_length > > + input += expected_file > > How about using a single format string here? > Will do. > > + cmd = [self._path_to_image_diff(), '--tolerance', '0.1'] > > Doesn't this need to be controlled via the commandline? Also, I'm pretty sure > tiger, leopard and snow-leopard have different default tolerances. Theoretically. I think the whole "tolerance" concept needs to go away, and we should only allow exact pixel matches. > > cmd += self._options.wrapper.split() > > - # FIXME: Using arch here masks any possible file-not-found errors from a non-existant driver executable. > > - cmd += ['arch', '-i386', port._path_to_driver(), '-'] > > Is this FIXME no longer true? > That is correct; we now detect and report if the driver is missing properly. > > +++ b/WebKitTools/Scripts/webkitpy/layout_tests/port/server_process.py > > +class ServerProcess: > > I like the concept of this class, but not so much the name. There's not really > a server involved. ReaderProcess? ReadLineProcess? ReadWriteProcess? I agree with you that ServerProcess isn't great, but I don't like any of your suggestions any better. RpcProcess? > Passing in a timestamp is weird. Can you have it be a timeout and then have > this code deal with calculating what the deadline should be? I agree that this is a little weird (compared to select, etc), but I think it's actually easier to get right on the caller side than the normal API design pattern. I can change it if you think following established (IMO, bad) API conventions is better here. > > + """ > > + return self._read(deadline, size) > > Why does this function exist? Why not just make _read public? > There are two public use cases supported, <read one line of unknown length> and <read an exact number of bytes>). Combining those into a single function means you have to use some sort of a magic value to indicate which you want. I thought having two public functions was better. However, as you saw below, implementing them separately would be silly, so I combined them into a single internal routine. > > + def _read(self, deadline, size): > > + """Internal routine that actually does the read.""" > > + idx = -1 > > Can you give this a more descriptive name? Suggestions? > I know you didn't make it this way, but can we make this return a boolean? Then > we can make the above code do the same? will do.
Ojan Vafai
Comment 11 2010-03-26 15:02:13 PDT
(In reply to comment #10) > (In reply to comment #9) > > > + cmd = [self._path_to_image_diff(), '--tolerance', '0.1'] > > > > Doesn't this need to be controlled via the commandline? Also, I'm pretty sure > > tiger, leopard and snow-leopard have different default tolerances. > > Theoretically. I think the whole "tolerance" concept needs to go away, and we > should only allow exact pixel matches. I'm inclined to agree, but for now this should at least have a FIXME. > > > +++ b/WebKitTools/Scripts/webkitpy/layout_tests/port/server_process.py > > > +class ServerProcess: > > > > I like the concept of this class, but not so much the name. There's not really > > a server involved. ReaderProcess? ReadLineProcess? ReadWriteProcess? > > I agree with you that ServerProcess isn't great, but I don't like any of your > suggestions any better. RpcProcess? How about IpcProcess? RPC is about actually executing code on the remote process. This is more about communicating with the remote process. > > Passing in a timestamp is weird. Can you have it be a timeout and then have > > this code deal with calculating what the deadline should be? > > I agree that this is a little weird (compared to select, etc), but I think it's > actually easier to get right on the caller side than the normal API design > pattern. I can change it if you think following established (IMO, bad) API > conventions is better here. I think the confusion of this violating established API conventions is more likely to cause bugs. I'd rather it just be a timeout. > > > + """ > > > + return self._read(deadline, size) > > > > Why does this function exist? Why not just make _read public? > > There are two public use cases supported, <read one line of unknown length> and > <read an exact number of bytes>). Combining those into a single function means > you have to use some sort of a magic value to indicate which you want. I > thought having two public functions was better. However, as you saw below, > implementing them separately would be silly, so I combined them into a single > internal routine. Sorry, I phrased poorly. I'm suggesting read_line just call the public read function. No need for a private one. > > > + def _read(self, deadline, size): > > > + """Internal routine that actually does the read.""" > > > + idx = -1 > > > > Can you give this a more descriptive name? > > Suggestions? Even just "index" would be fine.
Dirk Pranke
Comment 12 2010-03-26 15:10:18 PDT
Created attachment 51785 [details] revise w/ review feedback from eric and ojan
Dirk Pranke
Comment 13 2010-03-26 16:05:09 PDT
(In reply to comment #11) > (In reply to comment #10) > > (In reply to comment #9) > > > > + cmd = [self._path_to_image_diff(), '--tolerance', '0.1'] > > > > > > Doesn't this need to be controlled via the commandline? Also, I'm pretty sure > > > tiger, leopard and snow-leopard have different default tolerances. > > > > Theoretically. I think the whole "tolerance" concept needs to go away, and we > > should only allow exact pixel matches. > > I'm inclined to agree, but for now this should at least have a FIXME. > FIXME added. > > > > +++ b/WebKitTools/Scripts/webkitpy/layout_tests/port/server_process.py > > > > +class ServerProcess: > > > > > > I like the concept of this class, but not so much the name. There's not really > > > a server involved. ReaderProcess? ReadLineProcess? ReadWriteProcess? > > > > I agree with you that ServerProcess isn't great, but I don't like any of your > > suggestions any better. RpcProcess? > > How about IpcProcess? RPC is about actually executing code on the remote > process. This is more about communicating with the remote process. > The intent of the api is that there is a request followed by a reply. Unfortunately due to the way ImageDiff and DRT produce output, you can't easily encapsulate this into a single reply() method, and so I was left with the more generic read() and write() routines. However, IPC is too vague to capture the intent. Per face-to-face discussion w/ Ojan, we agreed to leave it as ServerProcess until we could think of something better. > > > Passing in a timestamp is weird. Can you have it be a timeout and then have > > > this code deal with calculating what the deadline should be? > > > > I agree that this is a little weird (compared to select, etc), but I think it's > > actually easier to get right on the caller side than the normal API design > > pattern. I can change it if you think following established (IMO, bad) API > > conventions is better here. > > I think the confusion of this violating established API conventions is more > likely to cause bugs. I'd rather it just be a timeout. Okay, I've changed it to be a timeout. Judging by the fact that I introduced half a dozen bugs while change this, and by the fact that the code got longer on both sides, I think I was right. For a relevant discussion of the merits of API design in select() and read APIs, see http://cacm.acm.org/magazines/2009/5/24646-api-design-matters/fulltext , although he repeats the error of not being able to distinguish between "don't wait", "wait for X", and "wait forever" in a delta timestamp. Using a deadline is clearer still. > > > > > + """ > > > > + return self._read(deadline, size) > > > > > > Why does this function exist? Why not just make _read public? > > > > There are two public use cases supported, <read one line of unknown length> and > > <read an exact number of bytes>). Combining those into a single function means > > you have to use some sort of a magic value to indicate which you want. I > > thought having two public functions was better. However, as you saw below, > > implementing them separately would be silly, so I combined them into a single > > internal routine. > > Sorry, I phrased poorly. I'm suggesting read_line just call the public read > function. No need for a private one. As we discussed face-to-face, read() probably should've raised an exception if called with a non-positive number for size. I have added that, and you were okay with the change. > > > > > + def _read(self, deadline, size): > > > > + """Internal routine that actually does the read.""" > > > > + idx = -1 > > > > > > Can you give this a more descriptive name? > > > > Suggestions? > > Even just "index" would be fine. Ah, sorry, I thought you were complaining about "_read". I think "idx" is pretty obvious and shorter, but I've changed it to "index".
Dirk Pranke
Comment 14 2010-03-26 16:05:34 PDT
Created attachment 51791 [details] revise w/ more feedback from ojan
Dirk Pranke
Comment 15 2010-03-26 16:07:52 PDT
(In reply to comment #13) > > > > > +++ b/WebKitTools/Scripts/webkitpy/layout_tests/port/server_process.py > > > > > +class ServerProcess: > > > > > > > > I like the concept of this class, but not so much the name. There's not really > > > > a server involved. ReaderProcess? ReadLineProcess? ReadWriteProcess? > > > > > > I agree with you that ServerProcess isn't great, but I don't like any of your > > > suggestions any better. RpcProcess? > > > > How about IpcProcess? RPC is about actually executing code on the remote > > process. This is more about communicating with the remote process. > > > > The intent of the api is that there is a request followed by a reply. > Unfortunately due to the way ImageDiff and DRT produce output, you can't easily > encapsulate this into a single reply() method, and so I was left with the more > generic read() and write() routines. However, IPC is too vague to capture the > intent. Per face-to-face discussion w/ Ojan, we agreed to leave it as > ServerProcess until we could think of something better. > Originally when I wrote this my idea was to have ImageDiff and DumpRenderTree derive from ServerProcess, and then the complexity in parsing the responses would be hidden. Unfortunately, since they return such weird sets of arguments, even trying to fit them both into a single reply() method was awkward, and so I abandoned that idea.
Eric Seidel (no email)
Comment 16 2010-03-26 16:15:51 PDT
Comment on attachment 51791 [details] revise w/ more feedback from ojan I'm told that you and Ojan reached a happy place. This looks OK to me. I think this new architecture is much cleaner. Two letter variable names make my brain kink. But I don't think it's worth another round of changes to fix them. In general I'd rather write out full english names and phrases to describe variables. Then again, I come from a non-80 column limited world. :) Thanks for all your hard work!
Dirk Pranke
Comment 17 2010-03-26 16:38:45 PDT
Comment on attachment 51791 [details] revise w/ more feedback from ojan Clearing flags on attachment: 51791 Committed r56647: <http://trac.webkit.org/changeset/56647>
Dirk Pranke
Comment 18 2010-03-26 16:38:50 PDT
All reviewed patches have been landed. Closing bug.
James Robinson
Comment 19 2010-04-19 12:21:35 PDT
This killed the chromium windows canary: http://build.chromium.org/buildbot/waterfall.fyi/builders/Webkit%20(webkit.org)/builds/25766/steps/webkit_tests/logs/stdio The failure is: Traceback (most recent call last): File "C:\b\slave\webkit-rel-webkit-org\build\src\third_party\WebKit\WebKitTools\Scripts\new-run-webkit-tests", line 37, in <module> sys.exit(run_webkit_tests.main(options, args)) File "C:\b\slave\webkit-rel-webkit-org\build\src\third_party\WebKit\WebKitTools\Scripts\webkitpy\layout_tests\run_webkit_tests.py", line 1401, in main port_obj = port.get(options.platform, options) File "C:\b\slave\webkit-rel-webkit-org\build\src\third_party\WebKit\WebKitTools\Scripts\webkitpy\layout_tests\port\factory.py", line 84, in get import chromium_win File "C:\b\slave\webkit-rel-webkit-org\build\src\third_party\WebKit\WebKitTools\Scripts\webkitpy\layout_tests\port\chromium_win.py", line 39, in <module> import chromium File "C:\b\slave\webkit-rel-webkit-org\build\src\third_party\WebKit\WebKitTools\Scripts\webkitpy\layout_tests\port\chromium.py", line 43, in <module> import webkit File "C:\b\slave\webkit-rel-webkit-org\build\src\third_party\WebKit\WebKitTools\Scripts\webkitpy\layout_tests\port\webkit.py", line 46, in <module> import webkitpy.layout_tests.port.server_process as server_process File "C:\b\slave\webkit-rel-webkit-org\build\src\third_party\WebKit\WebKitTools\Scripts\webkitpy\layout_tests\port\server_process.py", line 32, in <module> import fcntl ImportError: No module named fcntl program finished with exit code 1 I think the 'import fcntl' line needs to be guarded. Can you take a look at this Dirk? If you don't have time I can try to fix it later today. Thanks.
Dirk Pranke
Comment 20 2010-04-19 13:59:45 PDT
(In reply to comment #19) > This killed the chromium windows canary: > http://build.chromium.org/buildbot/waterfall.fyi/builders/Webkit%20(webkit.org)/builds/25766/steps/webkit_tests/logs/stdio > This whole module won't work on Windows, because the non-blocking I/O model is totally different. We need to do much more invasive surgery than that. I've checked in a fix to webkit.py to not pull the file in if we're on windows, which fixes this for now.
Note You need to log in before you can comment on or make changes to this bug.