Bug 30835 - REGRESSION (??? - r50171): inspector tests crashing at JSC::TypeInfo::type()
Status: RESOLVED WORKSFORME
Product: WebKit
Component: Web Inspector (Deprecated)
Version: 528+ (Nightly build)
Hardware: Macintosh Mac OS X 10.5
Importance: P1 Normal
Assigned To:
Keywords: InRadar
Depends on: 31615 31817
Blocks: 30916 31268
 
Reported: 2009-10-27 14:39 PST by
Modified: 2012-03-17 09:27 PST (History)


Attachments
crash report from running "run-webkit-tests --iterations 100 http inspector" (29.40 KB, text/plain)
2009-10-28 15:48 PST, Eric Seidel
no flags Details
13 crash reports from another run of "run-webkit-tests --iterations 100 http inspector" (88.35 KB, application/x-gzip)
2009-10-28 23:09 PST, Eric Seidel
no flags Details
180 tests when run which are known to crash. (10.03 KB, text/plain)
2009-10-29 14:35 PST, Eric Seidel
no flags Details
crash log (27.04 KB, text/plain)
2009-10-30 13:35 PST, David Levin
no flags Details
another crash log on r50935 (1.15 KB, text/plain)
2009-11-13 05:49 PST, Pavel Feldman
no flags Details
crash report from console-dir.html when trying to land bug 31474 (28.68 KB, text/plain)
2009-11-13 12:02 PST, Eric Seidel
no flags Details
Crash report from console-dir.html when trying to land bug 31456 (29.34 KB, text/plain)
2009-11-13 12:06 PST, Eric Seidel
no flags Details
Another crash report from console-dir.html when trying to land bug 31406 (28.76 KB, text/plain)
2009-11-13 12:13 PST, Eric Seidel
no flags Details
27 tests which are known to crash when run together (1.31 KB, text/plain)
2009-11-17 15:15 PST, Eric Seidel
no flags Details




Description From 2009-10-27 14:39:52 PST
inspector/console-format.html crashed on the Leopard Debug Bot

http://build.webkit.org/results/Leopard%20Intel%20Debug%20(Tests)/r50171%20(6519)/results.html

Unfortunately I don't have a crash log for you.  I've only seen it happen once so far.
------- Comment #1 From 2009-10-27 14:50:09 PST -------
I wonder if this could be related to the crash just seen on the Tiger bot in inspector/console- tests:
http://build.webkit.org/results/Tiger%20Intel%20Release/r50171%20(5612)/results.html
------- Comment #2 From 2009-10-28 09:37:10 PST -------
Looks like it's crashing again this morning.  This is a real bug:
http://build.webkit.org/results/Leopard%20Intel%20Debug%20(Tests)/r50217%20(6552)/results.html
------- Comment #3 From 2009-10-28 09:38:58 PST -------
Where can we get the crash logs? Why isn't the stderr output file a 404? (That would help, since this is likely an ASSERT.)
------- Comment #4 From 2009-10-28 09:39:10 PST -------
"why is"
------- Comment #5 From 2009-10-28 09:45:45 PST -------
We don't have an easy way to get crash logs off the bots yet. :(  bug 14861.
------- Comment #6 From 2009-10-28 09:46:30 PST -------
I expect that
run-webkit-tests --iterations 100 inspector/console-format.html
might reproduce it in a local debug build.  Since this crash seems flakey, it doesn't crash every time.
------- Comment #7 From 2009-10-28 15:19:35 PST -------
This console failure (internal timeout):
http://build.webkit.org/results/Leopard%20Intel%20Debug%20(Tests)/r50239%20(6569)/inspector/console-format-collections-pretty-diff.html
may also be related.

I ran the inspector tests 100 times (--iterations 100) locally in debug mode and saw no crashes.
------- Comment #8 From 2009-10-28 15:20:51 PST -------
My --iterations 100 run was with a rather old build of WebKit.  I'm going to update and run the tests again since these failures only recently started on the leopard bot.

Sadly inspector/ is currently the most flakey set of tests on the leopard debug bot. :(
------- Comment #9 From 2009-10-28 15:30:52 PST -------
There are two things that we can try:
1) Change DRT so that inspector is enabled for LayoutTests/inspector tests
2) Replace setTimeout(0) with the direct call in tests.
------- Comment #10 From 2009-10-28 15:33:26 PST -------
I worry this might be an http-induced crasher leaking into the inspector tests.  I remember we've had some flakiness with XHR tests in the past causing random crashes.  I'll dig up the bugs.

I'm right now running:
run-webkit-tests --iterations 100 http inspector
to see if this could be related to being run after the http tests.
------- Comment #11 From 2009-10-28 15:34:55 PST -------
bug 29344, bug 30726, bug 30519, bug 30392 and bug 29090 are all about flakey http tests, some of which involve unexplained crashers.  It is possible the inspector tests are just the most recent victim here.
------- Comment #12 From 2009-10-28 15:37:08 PST -------
Just had console-dirxml.html fail on the Tiger bot:
http://build.webkit.org/results/Tiger%20Intel%20Release/r50240%20(5657)/results.html
Clearly something is wrong here that's causing flakey inspector/ tests on multiple bots. :(
------- Comment #13 From 2009-10-28 15:48:30 PST -------
Created an attachment (id=42064) [details]
crash report from running "run-webkit-tests --iterations 100 http inspector"

I guess I'll CC some of the JSC guys.
------- Comment #14 From 2009-10-28 15:49:10 PST -------
CCing a couple JIT guys in case this crash dump looks familiar to them.
------- Comment #15 From 2009-10-28 23:09:18 PST -------
Created an attachment (id=42082) [details]
13 crash reports from another run of  "run-webkit-tests --iterations 100 http inspector"

I let the run complete.  Here are 13 crash reports from the run.  Obviously this bug is reproducible. :)  Now I guess we just need a reduction...
------- Comment #16 From 2009-10-28 23:09:43 PST -------
The 13 reports don't all look the same, but I think there are only two separate stack traces.
------- Comment #17 From 2009-10-29 13:50:49 PST -------
I'm currently trying to reduce the number of tests required to cause this to fail.  Right now I have the set down to about 140.  I'll post the list when I have it down to a more reasonable number.  Right now that 140 is a subset of the http tests and all of the inspector tests.
------- Comment #18 From 2009-10-29 14:04:42 PST -------
If I'm reading the stack traces correctly, it looks like something is trying to toString() a bad JSValue pointer?  Is that a correct reading?
------- Comment #19 From 2009-10-29 14:28:26 PST -------
There has been a bug where quarantine wrappers were not holding wrapped objects and those were collected on the go. That was causing random crap to take place including the one you describe. Quarantine code seemed to be right though - at least it had appropriate mark methods. Could we run this with GC disabled? Or do some stressful GC on inspector tests only? Quarantined objects are only used in inspector and should soon go away.
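
To illustrate the failure mode described above, here is a toy mark-and-sweep sketch. It is not JavaScriptCore or quarantine-wrapper code; every class and function name is hypothetical. It only shows that a wrapper which does not report its wrapped object to the collector lets that object be swept while the wrapper still points at it.

```python
# Toy mark-and-sweep collector. Hypothetical names throughout; this is an
# illustration of the failure mode, not a model of the real JSC collector.

class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []       # references the collector knows how to follow
        self.marked = False


class Wrapper(Obj):
    """Wraps another object; reports it to the collector only if marks_wrapped."""

    def __init__(self, name, wrapped, marks_wrapped):
        super().__init__(name)
        self.wrapped = wrapped          # the wrapper always uses this pointer
        if marks_wrapped:
            self.refs.append(wrapped)   # ...but the collector only sees refs


def collect(heap, roots):
    """Run one mark phase from roots and return the names that survive."""
    for obj in heap:
        obj.marked = False
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj.marked:
            continue
        obj.marked = True
        stack.extend(obj.refs)
    return {obj.name for obj in heap if obj.marked}


def survivors(marks_wrapped):
    target = Obj("inspected-object")
    wrapper = Wrapper("wrapper", target, marks_wrapped)
    return collect(heap=[target, wrapper], roots=[wrapper])


good = survivors(marks_wrapped=True)    # wrapped object is kept alive
bad = survivors(marks_wrapped=False)    # wrapped object is swept
print(sorted(good))
print(sorted(bad))
```

In the second case the wrapper would be left holding a dangling reference, which in a real engine could show up as a crash at an arbitrary later point.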
------- Comment #20 From 2009-10-29 14:35:57 PST -------
Created an attachment (id=42147) [details]
180 tests when run which are known to crash.

I'm still trying to reduce this set.

cat known_to_crash.txt | xargs run-webkit-tests --iterations 100

will lead to the crash.
------- Comment #21 From 2009-10-29 14:42:20 PST -------
When this crashes, it seems to crash on the very first inspector test.  It's possible that a GC is triggered during an inspector test and that's the reason why it's crashing.  I guess I could try sprinkling gc() calls in inspector/console-dir.html and see what happens.
------- Comment #22 From 2009-10-29 15:13:49 PST -------
Alexey suggested I try using COLLECT_ON_EVERY_ALLOCATION from Collector.cpp.

I built a copy of WebKit with that, and ran all of the inspector/*.html tests under DumpRenderTree.  I was not able to produce a crash.

I wonder if one of the http tests is smashing memory in some way?  It's strange that all of the crashes seem to have very similar crash points:

0x00000000fffffff0
0x0000000000000001
0x0000000000000fe4
0x0000000000000002
0x00000000fffffff6

Do these values look familiar to anyone in JIT land?  The crash point is always:
0   com.apple.JavaScriptCore          0x0052bb81 JSC::TypeInfo::type() const + 9 (JSTypeInfo.h:60)
------- Comment #23 From 2009-10-29 15:29:02 PST -------
Why do all of those crash points only have the low 32 bits set?
------- Comment #24 From 2009-10-30 12:37:32 PST -------
We're either getting passed bogus values or (and this seems more likely) we're truncating the tag bits from a JSValue.  It would be good to see if we can find exactly what revision this started at.
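
For reference, a small sketch that checks the five addresses from comment #22. It only does the arithmetic: it confirms the upper 32 bits are zero in every case and shows what each value looks like when its low 32 bits are read as a signed integer. That pattern is consistent with the truncation theory but does not prove it; the helper names are made up for this example.

```python
# Crash addresses copied from comment #22.
CRASH_ADDRESSES = [
    0x00000000FFFFFFF0,
    0x0000000000000001,
    0x0000000000000FE4,
    0x0000000000000002,
    0x00000000FFFFFFF6,
]


def upper_32(value):
    """Return the upper 32 bits of a 64-bit value."""
    return value >> 32


def as_signed_32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    low = value & 0xFFFFFFFF
    return low - (1 << 32) if low & 0x80000000 else low


all_fit_in_32_bits = all(upper_32(a) == 0 for a in CRASH_ADDRESSES)
signed_values = [as_signed_32(a) for a in CRASH_ADDRESSES]

print(all_fit_in_32_bits)
print(signed_values)
```

None of the values looks like an ordinary heap address: they are all small positive or small negative integers.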
------- Comment #25 From 2009-10-30 13:08:09 PST -------
Eric, are you running on Leopard or Snow Leopard?
------- Comment #26 From 2009-10-30 13:19:22 PST -------
I'm running Leopard.  So are the bots we've seen this crash on.  I believe I've only ever seen this crash on Leopard Debug, although it's possible it crashes on other configurations.
------- Comment #27 From 2009-10-30 13:21:24 PST -------
As yet I have been unable to repro -- if you can get a narrower revision range that would be great.
------- Comment #28 From 2009-10-30 13:35:41 PST -------
Created an attachment (id=42230) [details]
crash log
------- Comment #29 From 2009-10-30 14:59:54 PST -------
(In reply to comment #27)
> As yet i have been unable to repro -- if you can get a narrower revision range
> that would be great

The tests (as well as the testing harness) for inspector were introduced not so long ago, so I think it might be hard to narrow the revision range. It might have been there for a while.
------- Comment #30 From 2009-10-31 01:16:12 PST -------
I've added a myriad of assertions but have yet to hit anything prior to the actual crash.  It's really bizarre.  I may start adding assertions looking for these specific bad values.
------- Comment #31 From 2009-11-03 00:14:01 PST -------
This seems to be less common on the bots today, but is not gone.  I suspect that xmlhttprequest tests were added and thus changed what objects were live at the time gc() ended up being called during the crashing console tests.
------- Comment #32 From 2009-11-03 22:36:14 PST -------
<rdar://problem/7363589>
------- Comment #33 From 2009-11-04 00:03:43 PST -------
inspector/console-format-collections.html is crashing consistently on the Leopard Debug Test Bot this evening.  I assume it's just this bug.  I assume that the stars aligned with the addition of some new test such that the gc timing is correct to trigger this bug more frequently again.  Or at least that's my (uninformed) theory. :)
------- Comment #34 From 2009-11-09 13:23:52 PST -------
Have not seen Leopard bots failing since I queued things carefully in https://bugs.webkit.org/show_bug.cgi?id=30884. Or was it failing?
------- Comment #35 From 2009-11-09 13:28:12 PST -------
I haven't seen the bots crash due to this in a while either.  But I haven't been paying super-close attention.  If it's fixed, do we have any idea what could have fixed it?
------- Comment #36 From 2009-11-09 13:41:43 PST -------
I've queued all the interaction between the inspected page and the frontend more carefully. In particular, this rules out re-entrancy from within the timer fire.
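
A minimal sketch of the general idea, assuming nothing about the actual patch in bug 30884: calls made while a callback is already running are queued and run afterwards instead of being executed re-entrantly. The DeferredDispatcher class and the callback names are hypothetical.

```python
from collections import deque


class DeferredDispatcher:
    """Runs callbacks one at a time; calls posted during dispatch are queued."""

    def __init__(self):
        self._queue = deque()
        self._dispatching = False
        self._depth = 0
        self.max_depth = 0   # deepest nesting of callbacks ever observed

    def post(self, callback):
        self._queue.append(callback)
        if self._dispatching:
            return  # already draining: runs after the current callback returns
        self._dispatching = True
        try:
            while self._queue:
                task = self._queue.popleft()
                self._depth += 1
                self.max_depth = max(self.max_depth, self._depth)
                try:
                    task()
                finally:
                    self._depth -= 1
        finally:
            self._dispatching = False


log = []
dispatcher = DeferredDispatcher()


def on_timer_fire():
    log.append("timer: start")
    # Posted from inside a callback, so it is deferred, not run inline.
    dispatcher.post(lambda: log.append("frontend: update"))
    log.append("timer: end")


dispatcher.post(on_timer_fire)
print(log)
```

The frontend update runs only after the timer callback has finished, so no callback ever observes a half-finished one.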
------- Comment #37 From 2009-11-13 05:49:28 PST -------
Created an attachment (id=43152) [details]
another crash log on r50935
------- Comment #38 From 2009-11-13 12:02:06 PST -------
Created an attachment (id=43183) [details]
crash report from bug 31474

https://bugs.webkit.org/show_bug.cgi?id=31474#c5 saw this crash again.
There has been a rash of GC-related crashes over the last few days, so this may not be related to this particular bug, but it is the same test.  bug 31460 is one example of the other JSC crashes seen in the last 48 hours.
------- Comment #39 From 2009-11-13 12:06:33 PST -------
Created an attachment (id=43186) [details]
Crash report from trying to land bug 31456
------- Comment #40 From 2009-11-13 12:13:08 PST -------
Created an attachment (id=43188) [details]
Another crash report from console-dir.html when trying to land bug 31406
------- Comment #41 From 2009-11-13 12:19:29 PST -------
I'm not sure that:
https://bugs.webkit.org/attachment.cgi?id=43183
https://bugs.webkit.org/attachment.cgi?id=43186
https://bugs.webkit.org/attachment.cgi?id=43188
are actually related to the original bug in question.  They just happen to be console-dir.html crashes of the last couple days.  They may be of a different origin.
------- Comment #43 From 2009-11-17 13:38:55 PST -------
Attempting to reduce the set of tests required to produce a crash.
------- Comment #44 From 2009-11-17 14:03:35 PST -------
I've plugged the set of test cases into the automated test minimizer "tmin": http://code.google.com/p/tmin/wiki/TminManual and we're just gonna hope. :)
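
For anyone who wants to try the reduction by hand, here is a rough sketch of a chunk-removal reducer. It is not how tmin works internally; it is a simple greedy loop written for this comment. The still_crashes predicate is an assumption: in practice it would have to run the candidate list through run-webkit-tests and report whether a crash occurred, and the test names below are placeholders.

```python
def minimize(tests, still_crashes):
    """Greedily drop chunks of tests while still_crashes(remaining) stays True."""
    assert still_crashes(tests), "the full list must reproduce the crash"
    chunk = max(len(tests) // 2, 1)
    while True:
        i = 0
        while i < len(tests):
            candidate = tests[:i] + tests[i + chunk:]
            if candidate and still_crashes(candidate):
                tests = candidate   # chunk was not needed; retry at same index
            else:
                i += chunk          # chunk is needed; move past it
        if chunk == 1:
            return tests
        chunk = max(chunk // 2, 1)


# Placeholder data: 27 made-up test names, two of which are "needed".
all_tests = [f"test-{n:02d}.html" for n in range(27)]
needed = {"test-03.html", "test-19.html"}


def fake_still_crashes(subset):
    """Stand-in predicate: crashes only when both needed tests are present."""
    return needed <= set(subset)


reduced = minimize(all_tests, fake_still_crashes)
print(reduced)
```

The result is a reduced list, not necessarily the smallest possible one, and a flaky crash would need several iterations per candidate before a negative answer can be trusted.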
------- Comment #45 From 2009-11-17 15:15:02 PST -------
Created an attachment (id=43383) [details]
27 tests which are known to crash when run together

cat known_to_crash.txt | xargs run-webkit-tests --iterations 10 --no-launch-safari --debug

is the command I'm using.
------- Comment #46 From 2009-11-17 15:30:04 PST -------
OK.  I've reduced it to 4 tests required to produce the crash:

http/tests/xmlhttprequest/workers/shared-worker-close.html
http/tests/xmlhttprequest/workers/shared-worker-methods.html
inspector/console-dir.html
inspector/console-format.html

This command:
run-webkit-tests --debug --iterations 20 --no-launch-safari http/tests/xmlhttprequest/workers/shared-worker-close.html http/tests/xmlhttprequest/workers/shared-worker-methods.html inspector/console-dir.html inspector/console-format.html

Reliably produces a crash for me.

Looking now to see if I can condense this down into a single test case.
------- Comment #47 From 2009-11-17 15:38:43 PST -------
Looks like this is caused by Shared Workers + gc()

This command crashes reliably for me:
run-webkit-tests --debug --iterations 100 --no-launch-safari  http/tests/xmlhttprequest/workers/shared-worker-methods.html

I'll see if I can reduce that single test further.  Although at this point I would expect one of the JSC experts should be able to give some theories as to what's going wrong here. :)
------- Comment #48 From 2009-11-17 19:25:06 PST -------
I think I have a fix for this. Will create a patch later today.

This particular test (xhr in shared workers) fails because of this change: http://trac.webkit.org/changeset/50919 (landed 11/12)
------- Comment #49 From 2009-11-17 20:23:27 PST -------
(In reply to comment #47)
> Looks like this is caused by Shared Workers + gc()
> 
> This command crashes reliably for me:
> run-webkit-tests --debug --iterations 100 --no-launch-safari 
> http/tests/xmlhttprequest/workers/shared-worker-methods.html
> 
> I'll see if I can reduce that single test further.  Although at this point I
> would expect one of the JSC experts should be able to give some theories as to
> what's going wrong here. :)

Based on that, I think the crash we're currently looking at is a different issue from the one this bug refers to (the revision you refer to is after the date this bug was filed).
------- Comment #50 From 2009-11-17 22:59:14 PST -------
Perhaps there is more than one cause. One of them is bug 31615. Let's see what remains after that one lands.
------- Comment #51 From 2009-11-17 23:22:07 PST -------
With patch for bug 31615 applied, this command does not crash (before it did):
run-webkit-tests --iterations 1000 --no-launch-safari http/tests/xmlhttprequest/workers/shared-worker-methods.html
------- Comment #52 From 2009-11-18 12:22:48 PST -------
Inspired by the diagnosis made in bug 31615 I looked back through the changes just before r50171 again.  I wonder if http://trac.webkit.org/changeset/50167 could be related to this at all?  XHRs are used from multiple threads, no?  Is it safe to call those inspector methods from XHR?
------- Comment #53 From 2009-11-18 13:03:23 PST -------
Timeline can only receive events on main thread. I can see timeline being called from within callReadyStateChangeListener only. This one is presumably dispatching events on the main thread for Document context. Workers' contexts should have no timeline agent instances due to the logic in InspectorTimelineAgent::retrieve. So things should be ok, unless marshalling of these events from XHR to main thread is happening later.
------- Comment #54 From 2009-12-02 07:36:52 PST -------
*** Bug 31999 has been marked as a duplicate of this bug. ***
------- Comment #55 From 2012-03-17 09:27:26 PST -------
Does not seem to be there anymore.