r95912, r95914(buildfix), r95917(buildfix) and r96068(regression fix)
made sputnik tests flaky on the Qt platform.
I think it isn't a QtWebKit bug, but a hidden JSC bug revealed by QtWebKit bots.
r96068 fixed a zillion crashes, but some sputnik tests still time out:
On Qt ARM bot: http://build.webkit.sed.hu/results/ARMv5%20Linux%20Qt%20Release%20%28Test%29/r96068%20%283791%29/results.html
On Qt 4.8 bot:
On Qt 32-bit debug bot:
You can reproduce timeouts with Qt 4.7.4 in release mode 32-bit (Same as our official bot on build.webkit.org) with ORWT if you run all tests. (Unfortunately NRWT runs tests in different order and the bug doesn't occur with it on the official bot.)
I tried reverting r95912, r95914, r95917 and r96068 locally, and then all tests pass for me.
One more thing: this bug only appears on 32-bit x86 platforms and on ARM.
I managed to reproduce it in a small example:
$ Tools/Scripts/old-run-webkit-tests LayoutTests/sputnik/Conformance/15_Native_Objects/15.1_The_Global_Object/15.1.3/15.1.3.1_decodeURI --exit-after-n-failures 1 --verbose
running sputnik/Conformance/15_Native_Objects/15.1_The_Global_Object/15.1.3/15.1.3.1_decodeURI/S15.1.3.1_A1.1_T1.html -> succeeded
running sputnik/Conformance/15_Native_Objects/15.1_The_Global_Object/15.1.3/15.1.3.1_decodeURI/S15.1.3.1_A1.2_T1.html -> succeeded
running sputnik/Conformance/15_Native_Objects/15.1_The_Global_Object/15.1.3/15.1.3.1_decodeURI/S15.1.3.1_A1.2_T2.html -> succeeded
running sputnik/Conformance/15_Native_Objects/15.1_The_Global_Object/15.1.3/15.1.3.1_decodeURI/S15.1.3.1_A1.3_T1.html -> timed out
Any GC expert volunteer for this bug? :)
Is there a way to reproduce this on a non-Qt system?
(In reply to comment #6)
> Is there a way to reproduce this on a non-Qt system?
I don't know. But Zoltan started to fix it, he confirmed that it is a GC related bug. I think he will provide the fix tomorrow.
(In reply to comment #7)
> (In reply to comment #6)
> > Is there a way to reproduce this on a non-Qt system?
> I don't know. But Zoltan started to fix it, he confirmed that it is a GC related bug. I think he will provide the fix tomorrow.
We might be able to fix it if we had any information about what is going wrong -- currently we can't repro, but zoltan has found the bug and hasn't commented on what that bug is so we can't help in any way :-/
It seems http://trac.webkit.org/changeset/96354 fixed the bug. But let's wait for Zoltan's confirmation.
> We might be able to fix it if we had any information about what is going wrong -- currently we can't repro, but zoltan has found the bug and hasn't commented on what that bug is so we can't help in any way :-/
I will check the fix Ossy mentioned. I probably still need to debug it to see whether the fix merely hides the issue or really fixes it. But it is a good lead at least.
for (const ClassInfo* ci = this; ci; ci = ci->parentClass)
in this case ci == ci->parentClass, so it is an infinite loop.
This happens after a GC run. The 'this' pointer points to a JSDOMWindow, namely the JSDOMWindow of S15.1.3.1_A1.2_T1.html. The GC call and the infinite loop happen during the run of S15.1.3.1_A1.2_T2.html. The parentClass in question is the 3rd parent.
p structure()->classInfo()->parentClass->parentClass == 0xf12d3fb0
p structure()->classInfo()->parentClass->parentClass->parentClass == 0xf12d3fb0
and this repeats forever.
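To make the failure mode concrete, here is a minimal sketch of why a self-referencing parent link turns that loop into an infinite one, and how a depth limit could detect it while debugging. The ClassInfo struct below is a simplified stand-in with only two fields, and chainLength and its maxDepth parameter are hypothetical helpers; none of this is the actual JavaScriptCore code.

```cpp
#include <cassert>

// Simplified stand-in for a class-info record; only the parent link matters here.
struct ClassInfo {
    const char* className;
    const ClassInfo* parentClass;
};

// Walks the parent chain the same way as the loop quoted above, but gives up
// after maxDepth steps. Returns the number of entries in the chain, or -1 when
// the limit is exceeded, which indicates a cycle or a corrupted chain.
// (Hypothetical debugging helper, not part of any real API.)
int chainLength(const ClassInfo* start, int maxDepth = 64)
{
    assert(maxDepth > 0);
    int depth = 0;
    for (const ClassInfo* ci = start; ci; ci = ci->parentClass) {
        if (++depth > maxDepth)
            return -1; // without this check the loop would never terminate
    }
    return depth;
}
```

With a healthy chain the walk ends at the null parent. With a record whose parentClass points back at itself, as in the gdb output above, only the depth limit stops the walk.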
Created attachment 109288 [details]
I did the debugging. The Structure was freed, but it was still referenced by an object that should itself have been freed (an unused JSDOMWindow). The cell was then allocated again, and the new data in that memory caused the infinite loop (it could just as well have been a crash). After the signed chars were changed to int, both of them are collected correctly.
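The general pattern described here (a cell is released while something still points at it, then handed out again with different contents) can be sketched with a toy pool. Everything below is hypothetical: Cell, CellPool and staleReferenceSeesNewData are illustrative names, and the pool only flips a flag rather than freeing memory, so the example stays well-defined. It is not how the JSC heap is implemented.

```cpp
#include <array>
#include <cassert>

// Toy cell with a payload standing in for whatever the real cell stores.
struct Cell {
    int payload = 0;
    bool live = false;
};

// Hypothetical fixed-size pool that reuses the first free slot.
class CellPool {
public:
    // Returns the first free cell, or nullptr when the pool is full.
    Cell* allocate(int payload)
    {
        for (Cell& cell : m_cells) {
            if (!cell.live) {
                cell.live = true;
                cell.payload = payload;
                return &cell;
            }
        }
        return nullptr;
    }

    // Marks the cell as reusable; pointers held elsewhere are NOT updated.
    void release(Cell* cell)
    {
        assert(cell);
        cell->live = false;
    }

private:
    std::array<Cell, 4> m_cells{};
};

// Returns true when a pointer kept across release() ends up observing
// data written by a later, unrelated allocation.
bool staleReferenceSeesNewData()
{
    CellPool pool;
    Cell* structure = pool.allocate(1); // plays the role of the Structure
    Cell* stale = structure;            // reference kept by an object that should have died
    pool.release(structure);            // cell is reclaimed while still referenced
    Cell* reused = pool.allocate(2);    // the same slot is handed out again
    return reused == stale && stale->payload == 2;
}
```

In this sketch the stale pointer silently reads the new payload. In the real bug the reinterpreted data happened to form a self-referencing parent chain, hence the hang rather than a crash.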
Comment on attachment 109288 [details]
View in context: https://bugs.webkit.org/attachment.cgi?id=109288&action=review
> + Changint signed char to int in r96354 solved the
typo: Changint -> Changing
Comment on attachment 109288 [details]
r=me with typo fix.
Landed in http://trac.webkit.org/changeset/96483