|Summary:||REGRESSION (??? - r50171): inspector tests crashing at JSC::TypeInfo::type()|
|Product:||WebKit||Reporter:||Eric Seidel (no email) <eric>|
|Component:||Web Inspector (Deprecated)||Assignee:||Nobody <webkit-unassigned>|
|Severity:||Normal||CC:||ap, aroben, atwilson, barraclough, dimich, ggaren, joepeck, knorton, levin, oliver, pfeldman, timothy|
|Version:||528+ (Nightly build)|
|OS:||OS X 10.5|
|Bug Depends on:||31817, 31615|
|Bug Blocks:||30916, 31268|
Description Eric Seidel (no email) 2009-10-27 14:39:52 PDT
Comment 1 Eric Seidel (no email) 2009-10-27 14:50:09 PDT
I wonder if this could be related to the crash just seen on the Tiger bot in inspector/console- tests: http://build.webkit.org/results/Tiger%20Intel%20Release/r50171%20(5612)/results.html
Comment 2 Eric Seidel (no email) 2009-10-28 09:37:10 PDT
Looks like it's crashing again this morning. This is a real bug: http://build.webkit.org/results/Leopard%20Intel%20Debug%20(Tests)/r50217%20(6552)/results.html
Comment 3 Timothy Hatcher 2009-10-28 09:38:58 PDT
Where can we get the crash logs? Why is the stderr output file a 404? (That would help, since this is likely an ASSERT.)
Comment 4 Timothy Hatcher 2009-10-28 09:39:10 PDT
Comment 5 Eric Seidel (no email) 2009-10-28 09:45:45 PDT
We don't have an easy way to get crash logs off the bots yet. :( bug 14861.
Comment 6 Eric Seidel (no email) 2009-10-28 09:46:30 PDT
I expect that run-webkit-tests --iterations 100 inspector/console-format.html might reproduce it in a local debug build. This crash seems flakey, since it doesn't crash every time.
Comment 7 Eric Seidel (no email) 2009-10-28 15:19:35 PDT
This console failure (internal timeout): http://build.webkit.org/results/Leopard%20Intel%20Debug%20(Tests)/r50239%20(6569)/inspector/console-format-collections-pretty-diff.html may also be related. I ran the inspector tests 100 times (--iterations 100) locally in debug mode and saw no crashes.
Comment 8 Eric Seidel (no email) 2009-10-28 15:20:51 PDT
My --iterations 100 run was with a rather old build of WebKit. I'm going to update and run the tests again since these failures only recently started on the leopard bot. Sadly inspector/ is currently the most flakey set of tests on the leopard debug bot. :(
Comment 9 Pavel Feldman 2009-10-28 15:30:52 PDT
There are two things that we can try: 1) Change DRT so that the inspector is enabled for LayoutTests/inspector tests. 2) Replace setTimeout(0) with a direct call in the tests.
Comment 10 Eric Seidel (no email) 2009-10-28 15:33:26 PDT
I worry this might be an http-induced crasher leaking into the inspector tests. I remember we've had some flakiness with XHR tests in the past causing random crashes. I'll dig up the bugs. I'm right now running: run-webkit-tests --iterations 100 http inspector to see if this could be related to being run after the http tests.
Comment 11 Eric Seidel (no email) 2009-10-28 15:34:55 PDT
bug 29344, bug 30726, bug 30519, bug 30392 and bug 29090 are all about flakey http tests, some of which involve unexplained crashers. It is possible the inspector tests are just the most recent victim here.
Comment 12 Eric Seidel (no email) 2009-10-28 15:37:08 PDT
Just had console-dirxml.html fail on the Tiger bot: http://build.webkit.org/results/Tiger%20Intel%20Release/r50240%20(5657)/results.html Clearly something is wrong here that's causing flakey inspector/ tests on multiple bots. :(
Comment 13 Eric Seidel (no email) 2009-10-28 15:48:30 PDT
Created attachment 42064 [details] crash report from running "run-webkit-tests --iterations 100 http inspector" I guess I'll CC some of the JSC guys.
Comment 14 Eric Seidel (no email) 2009-10-28 15:49:10 PDT
CCing a couple JIT guys in case this crash dump looks familiar to them.
Comment 15 Eric Seidel (no email) 2009-10-28 23:09:18 PDT
Created attachment 42082 [details] 13 crash reports from another run of "run-webkit-tests --iterations 100 http inspector" I let the run complete. Here are 13 crash reports from the run. Obviously this bug is reproducible. :) Now I guess we just need a reduction...
Comment 16 Eric Seidel (no email) 2009-10-28 23:09:43 PDT
The 13 reports don't all look the same, but I think there are only two separate stack traces.
Comment 17 Eric Seidel (no email) 2009-10-29 13:50:49 PDT
I'm currently trying to reduce the number of tests required to cause this to fail. Right now I have the set down to about 140. I'll post the list when I have it down to a more reasonable number. Right now that 140 is a subset of the http tests and all of the inspector tests.
Comment 18 Eric Seidel (no email) 2009-10-29 14:04:42 PDT
If I'm reading the stack traces correctly, it looks like something is trying to toString() a bad JSValue pointer. Is that a correct reading?
Comment 19 Pavel Feldman 2009-10-29 14:28:26 PDT
There has been a bug where quarantine wrappers were not holding wrapped objects and those were collected on the go. That was causing random crap to take place, including the one you describe. The quarantine code seemed to be right though - at least it had appropriate mark methods. Could we run this with GC disabled? Or do some stressful GC on inspector tests only? Quarantined objects are only used in the inspector and should soon go away.
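As a loose analogy for the failure mode described above (a wrapper that does not keep its wrapped object alive), here is a minimal Python sketch. The class names are hypothetical and this is not the WebKit quarantine code. In Python a collected target shows up as None, whereas in C++ the equivalent mistake leaves a dangling pointer.

```python
import gc
import weakref


class Wrapped:
    """Stand-in for an object that lives on a garbage-collected heap."""


class WeakWrapper:
    """Hypothetical wrapper that does NOT keep its target alive."""

    def __init__(self, target):
        self._ref = weakref.ref(target)

    def target(self):
        # Returns None once the target has been collected.
        return self._ref()


class StrongWrapper:
    """Hypothetical wrapper that holds (in GC terms, 'marks') its target."""

    def __init__(self, target):
        self._target = target

    def target(self):
        return self._target


def demo():
    obj_a, obj_b = Wrapped(), Wrapped()
    weak, strong = WeakWrapper(obj_a), StrongWrapper(obj_b)
    # Drop the only other references, then force a collection.
    del obj_a, obj_b
    gc.collect()
    # (target lost through the weak wrapper, target kept by the strong one)
    return weak.target() is None, strong.target() is not None
```

Forcing a collection right after dropping the references is the same idea as the "stressful GC" run suggested above: it makes a lifetime bug show up deterministically instead of depending on GC timing.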
Comment 20 Eric Seidel (no email) 2009-10-29 14:35:57 PDT
Created attachment 42147 [details] 180 tests when run which are known to crash. I'm still trying to reduce this set. cat known_to_crash.txt | xargs run-webkit-tests --iterations 100 will lead to the crash.
Comment 21 Eric Seidel (no email) 2009-10-29 14:42:20 PDT
When this crashes, it seems to crash on the very first inspector test. It's possible that a GC is triggered during an inspector test and that's the reason why it's crashing. I guess I could try sprinkling gc() calls in inspector/console-dir.html and see what happens.
Comment 22 Eric Seidel (no email) 2009-10-29 15:13:49 PDT
Comment 23 Eric Seidel (no email) 2009-10-29 15:29:02 PDT
Why do all of those crash points only have the low 8 bytes set?
Comment 24 Oliver Hunt 2009-10-30 12:37:32 PDT
We're either getting passed bogus values or (and this seems more likely) we're truncating the tag bits from a JSValue. It would be good to see if we can find exactly what revision this started at.
Comment 25 Oliver Hunt 2009-10-30 13:08:09 PDT
Eric, are you running on Leopard or Snow Leopard?
Comment 26 Eric Seidel (no email) 2009-10-30 13:19:22 PDT
I'm running Leopard. So are the bots we've seen this crash on. I believe I've only ever seen this crash on Leopard Debug, although it's possible it crashes on other configurations.
Comment 27 Oliver Hunt 2009-10-30 13:21:24 PDT
As yet I have been unable to repro -- if you can get a narrower revision range that would be great.
Comment 28 David Levin 2009-10-30 13:35:41 PDT
Created attachment 42230 [details] crash log
Comment 29 Pavel Feldman 2009-10-30 14:59:54 PDT
(In reply to comment #27)
> As yet i have been unable to repro -- if you can get a narrower revision range
> that would be great
The tests (as well as the testing harness) for inspector were introduced not so long ago, so I think it might be hard to narrow the revision range. It might have been there for a while.
Comment 30 Oliver Hunt 2009-10-31 01:16:12 PDT
I've added a myriad of assertions but have yet to hit anything prior to the actual crash. It's really bizarre. I may start adding assertions looking for these specific bad values.
Comment 31 Eric Seidel (no email) 2009-11-03 00:14:01 PST
This seems to be less common on the bots today, but is not gone. I suspect that xmlhttprequest tests were added and thus changed what objects were live at the time gc() ended up being called during the crashing console tests.
Comment 32 Mark Rowe (bdash) 2009-11-03 22:36:14 PST
Comment 33 Eric Seidel (no email) 2009-11-04 00:03:43 PST
inspector/console-format-collections.html is crashing consistently on the Leopard Debug Test Bot this evening. I assume it's just this bug. I assume that the stars aligned with the addition of some new test such that the gc timing is correct to trigger this bug more frequently again. Or at least that's my (uninformed) theory. :)
Comment 34 Pavel Feldman 2009-11-09 13:23:52 PST
Have not seen Leopard bots failing since I queued things carefully in https://bugs.webkit.org/show_bug.cgi?id=30884. Or was it failing?
Comment 35 Eric Seidel (no email) 2009-11-09 13:28:12 PST
I haven't seen the bots crash due to this in a while either. But I haven't been paying super-close attention. If it's fixed, do we have any idea what could have fixed it?
Comment 36 Pavel Feldman 2009-11-09 13:41:43 PST
I've queued all the interaction between the inspected page and the frontend more carefully. In particular, this excluded re-entrancy from within the timer fire.
Comment 37 Pavel Feldman 2009-11-13 05:49:28 PST
Created attachment 43152 [details] another crash log on r50935
Comment 38 Eric Seidel (no email) 2009-11-13 12:02:06 PST
Created attachment 43183 [details] crash report from console-dir.html when trying to land bug 31474 https://bugs.webkit.org/show_bug.cgi?id=31474#c5 saw this crash again. There have been a rash of GC-related crashes the last few days, so this may not be related to this particular bug, but is the same test. bug 31460 is one example of the other JSC crashes seen in the last 48 hours.
Comment 39 Eric Seidel (no email) 2009-11-13 12:06:33 PST
Created attachment 43186 [details] Crash report from console-dir.html when trying to land bug 31456
Comment 40 Eric Seidel (no email) 2009-11-13 12:13:08 PST
Created attachment 43188 [details] Another crash report from console-dir.html when trying to land bug 31406
Comment 41 Eric Seidel (no email) 2009-11-13 12:19:29 PST
I'm not sure that:
https://bugs.webkit.org/attachment.cgi?id=43183
https://bugs.webkit.org/attachment.cgi?id=43186
https://bugs.webkit.org/attachment.cgi?id=43188
are actually related to the original bug in question. They just happen to be console-dir.html crashes of the last couple days. They may be of a different origin.
Comment 42 Eric Seidel (no email) 2009-11-13 12:25:29 PST
Looks like we're seeing console-dir.html crashes on the build bots too: http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r50956%20(7276)/results.html http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r50933%20(7258)/results.html
Comment 43 Eric Seidel (no email) 2009-11-17 13:38:55 PST
Attempting to reduce the set of tests required to produce a crash.
Comment 44 Eric Seidel (no email) 2009-11-17 14:03:35 PST
I've plugged the set of test cases into the automated test minimizer "tmin": http://code.google.com/p/tmin/wiki/TminManual and we're just gonna hope. :)
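The kind of reduction being attempted here can be sketched as a simple chunk-removal loop. This is a minimal illustration of the general idea, not how tmin itself works. The names `minimize` and `still_crashes` are hypothetical. In practice `still_crashes` would run the candidate list through run-webkit-tests with enough --iterations to be reasonably sure about a flakey crash.

```python
from typing import Callable, List, Sequence


def minimize(tests: Sequence[str],
             still_crashes: Callable[[List[str]], bool]) -> List[str]:
    """Shrink a list of tests while the crash predicate keeps holding.

    Tries to drop chunks of the list, halving the chunk size each round.
    Test order is preserved, since the crash may depend on it.
    """
    current = list(tests)
    chunk = max(len(current) // 2, 1)
    while True:
        i = 0
        while i < len(current):
            candidate = current[:i] + current[i + chunk:]
            if candidate and still_crashes(candidate):
                # The chunk was not needed; the next chunk slides into place.
                current = candidate
            else:
                # The chunk is needed (or the predicate was unlucky); keep it.
                i += chunk
        if chunk == 1:
            break
        chunk = max(chunk // 2, 1)
    return current
```

With a flakey crash, a false negative from the predicate only makes the result larger than necessary, while a false positive cannot happen if the predicate reports a crash only when one was actually observed. The result is not guaranteed to be the smallest possible set.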
Comment 45 Eric Seidel (no email) 2009-11-17 15:15:02 PST
Created attachment 43383 [details] 27 tests which are known to crash when run together cat known_to_crash.txt | xargs run-webkit-tests --iterations 10 --no-launch-safari --debug is the command I'm using.
Comment 46 Eric Seidel (no email) 2009-11-17 15:30:04 PST
OK. I've reduced it to 4 tests required to produce the crash:
http/tests/xmlhttprequest/workers/shared-worker-close.html
http/tests/xmlhttprequest/workers/shared-worker-methods.html
inspector/console-dir.html
inspector/console-format.html
This command:
run-webkit-tests --debug --iterations 20 --no-launch-safari http/tests/xmlhttprequest/workers/shared-worker-close.html http/tests/xmlhttprequest/workers/shared-worker-methods.html inspector/console-dir.html inspector/console-format.html
reliably produces a crash for me. Looking now to see if I can condense this down into a single test case.
Comment 47 Eric Seidel (no email) 2009-11-17 15:38:43 PST
Looks like this is caused by Shared Workers + gc().
This command crashes reliably for me:
run-webkit-tests --debug --iterations 100 --no-launch-safari http/tests/xmlhttprequest/workers/shared-worker-methods.html
I'll see if I can reduce that single test further. Although at this point I would expect one of the JSC experts should be able to give some theories as to what's going wrong here. :)
Comment 48 Dmitry Titov 2009-11-17 19:25:06 PST
I think I have a fix for this. Will create a patch later today. This particular test (xhr in shared workers) fails because of this change: http://trac.webkit.org/changeset/50919 (landed 11/12)
Comment 49 Oliver Hunt 2009-11-17 20:23:27 PST
(In reply to comment #47)
> Looks like this is caused by Shared Workers + gc()
>
> This command crashes reliably for me:
> run-webkit-tests --debug --iterations 100 --no-launch-safari
> http/tests/xmlhttprequest/workers/shared-worker-methods.html
>
> I'll see if I can reduce that single test further. Although at this point I
> would expect one of the JSC experts should be able to give some theories as to
> what's going wrong here. :)
Based on that, I think the crash we're currently looking at is a different issue from the one this bug refers to (the revision you refer to is after the date this bug was filed).
Comment 50 Dmitry Titov 2009-11-17 22:59:14 PST
Perhaps there is more than one cause. One of them is bug 31615. Let's see what remains after that one lands.
Comment 51 Dmitry Titov 2009-11-17 23:22:07 PST
With patch for bug 31615 applied, this command does not crash (before it did): run-webkit-tests --iterations 1000 --no-launch-safari http/tests/xmlhttprequest/workers/shared-worker-methods.html
Comment 52 Eric Seidel (no email) 2009-11-18 12:22:48 PST
Inspired by the diagnosis made in bug 31615 I looked back through the changes just before r50171 again. I wonder if http://trac.webkit.org/changeset/50167 could be related to this at all? XHRs are used from multiple threads, no? Is it safe to call those inspector methods from XHR?
Comment 53 Pavel Feldman 2009-11-18 13:03:23 PST
Timeline can only receive events on main thread. I can see timeline being called from within callReadyStateChangeListener only. This one is presumably dispatching events on the main thread for Document context. Workers' contexts should have no timeline agent instances due to the logic in InspectorTimelineAgent::retrieve. So things should be ok, unless marshalling of these events from XHR to main thread is happening later.
Comment 54 Adam Roben (:aroben) 2009-12-02 07:36:52 PST
*** Bug 31999 has been marked as a duplicate of this bug. ***
Comment 55 Timothy Hatcher 2012-03-17 09:27:26 PDT
Does not seem to be there anymore.