Bug 81659

Summary:

[GTK] crash log reports support

Product:

WebKit

Reporter:

Philippe Normand <pnormand>

Component:

Tools / Tests

Assignee:

Nobody <webkit-unassigned>

Status:

RESOLVED FIXED

Severity:

Normal

CC:

abarth, aroben, dpranke, eric, ojan, webkit.review.bot

Priority:

Version:

528+ (Nightly build)

Hardware:

Unspecified

OS:

Unspecified

Attachments:

Description	Flags
Patch	dpranke: review-
Path	none
Patch	dpranke: review+
Patch	mrobinson: review+

Philippe Normand

Reported 2012-03-20 06:53:25 PDT

When a test crash the "crash log" link in the results page should provide the stacktrace and state of all the running threads.

Attachments
Patch (47.13 KB, patch) 2012-03-20 09:02 PDT, Philippe Normand	dpranke: review-	Details Formatted Diff Diff
Path (17.82 KB, patch) 2012-03-27 10:20 PDT, Philippe Normand	no flags	Details Formatted Diff Diff
Patch (23.04 KB, patch) 2012-03-28 01:13 PDT, Philippe Normand	dpranke: review+	Details Formatted Diff Diff
Patch (21.96 KB, patch) 2012-03-30 02:58 PDT, Philippe Normand	mrobinson: review+	Details Formatted Diff Diff
Show Obsolete (3) View All Add attachment proposed patch, testcase, etc.

Philippe Normand

Comment 1 2012-03-20 09:02:55 PDT

Created attachment 132836 [details] Patch

Philippe Normand

Comment 2 2012-03-20 10:25:57 PDT

I'm not sure about the move of the CrashLog logic to the Port infrastructure, I can undo this change if needed but I'd need the new crashmon/daemontools code to be called only in GTK for now and I don't see how to reliably test this (sys.platform == "linux2" won't work, other ports run on Linux).

Dirk Pranke

Comment 3 2012-03-20 13:08:59 PDT

Comment on attachment 132836 [details] Patch I am a bit of two minds as to whether these routines belong on the Port object or under common/system somewhere. I think I agree that figuring out what the right crash log is is a property of the Port and not the operating system, since, for example, chromium uses different mechanisms than apple on the mac or Gtk on Linux. Adam, Eric, any thoughts? Apart from that, this approach seems reasonable but we need to merge it with the clean up I'm doing in bugs 81600 and 81603. Also, you might care about the change in bug 81575. View in context: https://bugs.webkit.org/attachment.cgi?id=132836&action=review > Tools/Scripts/webkitpy/common/system/executive.py:264 > + return e.errno == errno.EPERM Why is this change needed? I don't understand what was failing before or is fixed now ... > Tools/Scripts/webkitpy/layout_tests/controllers/test_result_writer.py:65 > + writer.write_crash_report(crashed_driver_output.crashed_process_name, crashed_driver_output.error, driver_output.pid) We should coordinate this change with my patches in bug 81603 and 81600 - the driver should return the crash log and the test result writer should just use that. > Tools/Scripts/webkitpy/layout_tests/controllers/test_result_writer.py:166 > + fs.write_text_file(filename, log if log else error.decode("utf-8")) see above. > Tools/Scripts/webkitpy/layout_tests/port/driver.py:67 > + test_time=0, timeout=False, error='', crashed_process_name=None, pid=None): see above; I think this should be crashed_pid, not pid. > Tools/Scripts/webkitpy/layout_tests/port/gtk.py:146 > + return process_name == 'gdb-dump' This function goes away w/ the patch in bug 81600 ... > Tools/Scripts/webkitpy/tool/commands/queries.py:375 > + for port_name in tool.port_factory.all_port_names(): You probably don't want to iterate across all_port_names(), since that could presumably get you the same crash log multiple times.

Philippe Normand

Comment 4 2012-03-21 02:41:12 PDT

(In reply to comment #3) > (From update of attachment 132836 [details]) > I am a bit of two minds as to whether these routines belong on the Port object or under common/system somewhere. I think I agree that figuring out what the right crash log is is a property of the Port and not the operating system, since, for example, chromium uses different mechanisms than apple on the mac or Gtk on Linux. > > Adam, Eric, any thoughts? > > Apart from that, this approach seems reasonable but we need to merge it with the clean up I'm doing in bugs 81600 and 81603. Also, you might care about the change in bug 81575. > Ok I'll hold my horse on these bugs then :) > > View in context: https://bugs.webkit.org/attachment.cgi?id=132836&action=review > > > Tools/Scripts/webkitpy/common/system/executive.py:264 > > + return e.errno == errno.EPERM > > Why is this change needed? I don't understand what was failing before or is fixed now ... > When I was testing our crashmon script and running it as root I was hitting this issue. Running the tests as a normal user I wasn't able to send the 0 signal to the crashmon process executed by root. > > Tools/Scripts/webkitpy/layout_tests/controllers/test_result_writer.py:65 > > + writer.write_crash_report(crashed_driver_output.crashed_process_name, crashed_driver_output.error, driver_output.pid) > > We should coordinate this change with my patches in bug 81603 and 81600 - the driver should return the crash log and the test result writer should just use that. > > > Tools/Scripts/webkitpy/layout_tests/controllers/test_result_writer.py:166 > > + fs.write_text_file(filename, log if log else error.decode("utf-8")) > > see above. > > > Tools/Scripts/webkitpy/layout_tests/port/driver.py:67 > > + test_time=0, timeout=False, error='', crashed_process_name=None, pid=None): > > see above; I think this should be crashed_pid, not pid. > > > Tools/Scripts/webkitpy/layout_tests/port/gtk.py:146 > > + return process_name == 'gdb-dump' > > This function goes away w/ the patch in bug 81600 ... > > > Tools/Scripts/webkitpy/tool/commands/queries.py:375 > > + for port_name in tool.port_factory.all_port_names(): > > You probably don't want to iterate across all_port_names(), since that could presumably get you the same crash log multiple times. So when I find the first, print it and exit early?

Dirk Pranke

Comment 5 2012-03-21 13:37:17 PDT

(In reply to comment #4) > > > Tools/Scripts/webkitpy/common/system/executive.py:264 > > > + return e.errno == errno.EPERM > > > > Why is this change needed? I don't understand what was failing before or is fixed now ... > > > > When I was testing our crashmon script and running it as root I was hitting this issue. Running the tests as a normal user I wasn't able to send the 0 signal to the crashmon process executed by root. > OK. > > > Tools/Scripts/webkitpy/tool/commands/queries.py:375 > > > + for port_name in tool.port_factory.all_port_names(): > > > > You probably don't want to iterate across all_port_names(), since that could presumably get you the same crash log multiple times. > > So when I find the first, print it and exit early? That's probably as good as anything else I can think of.

Philippe Normand

Comment 6 2012-03-27 10:20:35 PDT

Created attachment 134094 [details] Path

Martin Robinson

Comment 7 2012-03-27 14:52:14 PDT

Comment on attachment 134094 [details] Path View in context: https://bugs.webkit.org/attachment.cgi?id=134094&action=review Looks good, but I'd rather someone more familiar with the crash dumping code do the final review. I have left some comments below. > Tools/BuildSlaveSupport/gtk/README:-23 > -============================================== > - Running a GTK+ build slave under daemontools > -============================================== > - > -This directory contains several scripts which can be used to run a WebKitGTK+ > -build slave under daemontools [1]. This is convenient because daemontools > -will automatically restart services when they die, and that means less human > -intervention is needed. > - > - > -Dependencies > -============ > - > -In order to use the provided service control files, you will need the > -following: > - > -* The GNU Bash shell (the scripts contain some bash-isms) > - > -* The daemontools package (or one of its drop-in replacements, like runit > - or freedt; but only daemontools has been tested so far). > - > -* The crash dump monitor also uses "inotifywait" (part of inotify-tools [2]) > - Did you mean to remove all of this README? Not all of it deals with crashmon. > Tools/Scripts/new-run-webkit-tests:61 > + jhbuild = os.path.join(script_dir, '..', 'gtk', 'run-with-jhbuild') > + cmd.insert(1, jhbuild) This can be one line. > Tools/Scripts/webkitpy/layout_tests/port/gtk.py:156 > + name_str = name or '<unknown process name>' > + pid_str = str(pid or '<unknown>') Okay, just a nit but you should use full words, unless the abbreviation would be more canonical. So IMO pid is good and str is bad. In what cases is pid == None? If pid can be None and name can be None shouldn't you handle that case explicitly and do an early return? > Tools/Scripts/webkitpy/layout_tests/port/gtk.py:164 > + expected_crash_dump_filename = "core-pid_%s-_-process_%s" % (pid_str, name_str) > + if pid: > + match_filename = lambda name: name == expected_crash_dump_filename > + else: > + match_filename = lambda name: name.find(name_str) > -1 I think would be more clear as: def match_filename(name): if pid: return name == expected_crash_dump_filename return name.find(name_str) > -1 If it's frequent that pid is None, wouldn't it be better to simply find the dump with the most recent timestamp? > Tools/Scripts/webkitpy/layout_tests/port/gtk.py:166 > + for path in reversed(sorted(dumps)): Perhaps the sorting here ensures that the most recent dump is selected? I'm just guessing, so maybe it deserves a comment. > Tools/Scripts/webkitpy/layout_tests/port/gtk.py:170 > + stderr_lines = errors + (stderr or '<empty>').decode('utf8').splitlines() What guarantee do we have that stderr is in UTF-8 format?

Dirk Pranke

Comment 8 2012-03-27 19:12:36 PDT

Comment on attachment 134094 [details] Path The patch generally looks pretty good to me, just a few nits ... >> Tools/Scripts/new-run-webkit-tests:61 >> + cmd.insert(1, jhbuild) > > This can be one line. Ew, this feels like a hack; then again, so is much of this file :( >> Tools/Scripts/webkitpy/layout_tests/port/gtk.py:156 >> + pid_str = str(pid or '<unknown>') > > Okay, just a nit but you should use full words, unless the abbreviation would be more canonical. So IMO pid is good and str is bad. > > In what cases is pid == None? If pid can be None and name can be None shouldn't you handle that case explicitly and do an early return? pid is None if WebProcess crashes but we're not told the pid of the WebProcess (which I think is currently always on the GTk, and will be until you add support for that). name is probably never None now. > Tools/Scripts/webkitpy/layout_tests/port/gtk.py:165 > + dumps = self._filesystem.files_under(log_directory) note that you can pass match_filename as a filter to files_under() directly. >> Tools/Scripts/webkitpy/layout_tests/port/gtk.py:170 >> + stderr_lines = errors + (stderr or '<empty>').decode('utf8').splitlines() > > What guarantee do we have that stderr is in UTF-8 format? I'm not actually sure that it's always (or even ever) in UTF-8; I saw a crash earlier today on the chromium mac 10.6 bot when we failed to decode something as utf-8, so we might want to trap a codec error here and try it the other way as well.

Philippe Normand

Comment 9 2012-03-28 01:13:10 PDT

Created attachment 134234 [details] Patch

Dirk Pranke

Comment 10 2012-03-28 12:15:53 PDT

Comment on attachment 134234 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=134234&action=review > Tools/Scripts/webkitpy/layout_tests/port/gtk.py:173 > + stderr_lines = errors + (stderr or '<empty>').decode('utf8', errors='ignore').splitlines() errors='ignore', huh? didn't know about that ...

Philippe Normand

Comment 11 2012-03-30 01:12:03 PDT

Thanks for the review! However I think I'll upload a new iteration of the patch, removing the --coredump-dir command line and using an environment variable instead. It should ease deployment on the bots.

Philippe Normand

Comment 12 2012-03-30 02:58:02 PDT

Created attachment 134760 [details] Patch

Philippe Normand

Comment 13 2012-04-03 08:23:30 PDT

Committed r113037: <http://trac.webkit.org/changeset/113037>

Note You need to log in before you can comment on or make changes to this bug.