Bug 77419: run-webkit-tests: This machine could support 16 child processes, but only has enough memory for 15

Created: 2012-01-31 03:21:53 -0800
Last modified: 2012-06-19 14:44:48 -0700
Classification: Unclassified
Product: WebKit
Component: Tools / Tests
Version: 528+ (Nightly build)
Platform: Unspecified
OS: Unspecified
Status: RESOLVED FIXED
Keywords: NRWT
Priority: P2
Severity: Normal
Target milestone: ---
Related bugs: 73847, 74021, 74650
Reporter: koivisto
Assignee: webkit-unassigned
CC: ap, aroben, dglazkov, dpranke, eric, lforschler

Comments (oldest to newest)

Comment 0 [545904] koivisto 2012-01-31 03:21:53 -0800
wat. This MacPro has 14GB of RAM.

Comment 1 [545908] eric 2012-01-31 03:28:37 -0800
It's a heuristic. :)
http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/layout_tests/port/base.py#L166

It uses the results of vm_stat:
http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/common/system/platforminfo.py#L75

I suspect you have a lot of other things running?

Comment 2 [545910] eric 2012-01-31 03:30:45 -0800
I'm very happy to tune the heuristic further. It can be wrong in both directions.

ORWT didn't have this problem because it only ran one copy of DRT. NRWT can run as many as we'd like it to (you can manually control it with --child-processes=N as you like). Right now it tries to run one per core if we have the RAM to support them.

The RAM requirement was added because we were breaking the Mac builders, which only had 3GB of RAM :) (Which was about all you needed back in the day to link WebKit and run one DRT at a time.)

Comment 3 [545912] eric 2012-01-31 03:32:38 -0800
See bug 74021, bug 74650 and bug 73847 for more of the history.

Comment 4 [545963] koivisto 2012-01-31 05:17:58 -0800
No, the machine is not under memory pressure.

You are likely to get better results by addressing the problem directly (only allow n threads per GB of physical memory in the machine). Any attempt to base heuristics on free-memory figures is almost certain to go wrong. Explaining why this is the case would require lengthy discussions. It suffices to say that the OS memory subsystem is complex.
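
The proposal in comment 4 (allow only n workers per GB of physical memory) can be sketched roughly as below. This is an illustration only, not the code in webkitpy: the function name, its parameters, and the default of one worker per GB are assumptions made for the example, and the caller is assumed to already know the core count and the physical memory size in bytes.

```python
# Hypothetical sketch of the cap proposed in comment 4; not webkitpy code.
# The name max_child_processes and the workers_per_gb default are assumptions.

BYTES_PER_GB = 1024 ** 3


def max_child_processes(cpu_count, physical_memory_bytes, workers_per_gb=1):
    """Return the number of workers to run, capped by cores and by physical RAM."""
    whole_gb = physical_memory_bytes // BYTES_PER_GB
    # Always allow at least one worker, even on a machine with under 1 GB.
    allowed_by_memory = max(1, whole_gb * workers_per_gb)
    return min(cpu_count, allowed_by_memory)


if __name__ == "__main__":
    # A 16-core machine with 14 GB of RAM, as in the report.
    print(max_child_processes(16, 14 * BYTES_PER_GB, workers_per_gb=1))
    print(max_child_processes(16, 14 * BYTES_PER_GB, workers_per_gb=2))
```

Because the cap depends only on installed memory, the result does not change with whatever else happens to be running when the tests start.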

Comment 5 [546121] ap 2012-01-31 08:58:35 -0800
The heuristic is not based on free memory any more - see bug 74650.

Comment 6 [546200] koivisto 2012-01-31 10:21:24 -0800
(In reply to comment #5)
> The heuristic is not based on free memory any more - see bug 74650.

Yes it is, with a slightly altered definition of "free memory".

Comment 7 [546226] eric 2012-01-31 10:53:15 -0800
I'm happy to accept alternate proposals. Ideally in Python form. :)

I'm not at all wedded to the current heuristic. The very first attempt at this (bug 73847) used physical memory (sysctl -n hw.memsize). That was deemed not good enough and changed to free memory (vm_stat "Pages free") in bug 74021. That was again decided to be insufficient and changed to use free + inactive (vm_stat "Pages free" + "Pages inactive") in bug 74650.

Again, totally open to changing the algorithm. But we'll need a concrete suggestion. See bug 73847 for why we moved away from sysctl -n hw.memsize to vm_stat "Pages free".

Comment 8 [546236] eric 2012-01-31 10:59:31 -0800
(In reply to comment #7)
> Again, totally open to changing the algorithm. But we'll need a concrete suggestion. See bug 73847 for why we moved away from sysctl -n hw.memsize to vm_stat "Pages free".

Sorry, I meant to say bug 74021, but I realize now the discussion was all in private mail about the Mac bots. I'm happy to forward you the (not very exciting) discussion.

I'm open to changing this back to using hw.memsize with a smaller expected-RAM-per-DRT value.

Another way would be to not pick a number of DRTs to spawn at the beginning and dynamically control them based on free memory. That's a larger change, but perhaps a better system design. Dirk might be able to comment on how difficult that might be.

Comment 9 [546306] lforschler 2012-01-31 11:42:09 -0800
(In reply to comment #8)
> (In reply to comment #7)
> > Again, totally open to changing the algorithm. But we'll need a concrete suggestion. See bug 73847 for why we moved away from sysctl -n hw.memsize to vm_stat "Pages free".
>
> Sorry, I meant to say bug 74021, but I realize now the discussion was all in private mail about the Mac bots. I'm happy to forward you the (not very exciting) discussion.
>
> I'm open to changing this back to using hw.memsize with a smaller expected-RAM-per-DRT value.
>
> Another way would be to not pick a number of DRTs to spawn at the beginning and dynamically control them based on free memory. That's a larger change, but perhaps a better system design. Dirk might be able to comment on how difficult that might be.

We should ensure that all the bots have enough memory to run as many DRT processes as cores; otherwise we are just wasting CPU capacity. If this means upgrading the memory in the bots, that is what we should do. Our EWS bots are a mix of 4, 8, and 16 core machines. What is the memory requirement for DRT? Obviously a 16 core machine will need more memory than a 4 core machine, but how much more I am unsure.

Comment 10 [546434] dpranke 2012-01-31 13:28:09 -0800
(In reply to comment #9)
> We should ensure that all the bots have enough memory to run as many DRT processes as cores; otherwise we are just wasting CPU capacity. If this means upgrading the memory in the bots, that is what we should do. Our EWS bots are a mix of 4, 8, and 16 core machines. What is the memory requirement for DRT? Obviously a 16 core machine will need more memory than a 4 core machine, but how much more I am unsure.

I agree with Lucas. Memory is (relatively) cheap, and this is the approach we've been using since time immemorial on the Chromium bots. I think roughly speaking we tend to have about 768MB of physical memory per DRT instance (i.e., per virtual core). Dunno if 512MB/DRT would be enough, but it sure seems like it should be. I can pull the stats from all of the Chromium bots if need be.

Comment 11 [652839] dpranke 2012-06-19 14:44:48 -0700
This should have been fixed in http://trac.webkit.org/changeset/120738 ; please reopen if you still see issues.
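
For reference, the free + inactive estimate described in comment 7, combined with the roughly 768MB-per-DRT figure from comment 10, might look something like the sketch below. It is not the webkitpy implementation and does not reflect what changeset 120738 did. The function names, the sample text, and the 4096-byte fallback page size are assumptions for illustration, and the parser works on captured vm_stat text rather than invoking the tool.

```python
import re

# Hypothetical sketch; names and the sample output below are illustrative only.
BYTES_PER_DRT = 768 * 1024 * 1024  # rough per-DRT figure quoted in comment 10

SAMPLE_VM_STAT = """Mach Virtual Memory Statistics: (page size of 4096 bytes)
Pages free:                          100000.
Pages active:                        500000.
Pages inactive:                      200000.
Pages wired down:                    150000.
"""


def free_plus_inactive_bytes(vm_stat_output):
    """Sum "Pages free" and "Pages inactive" from vm_stat text, in bytes."""
    size_match = re.search(r"page size of (\d+) bytes", vm_stat_output)
    # Assumption: fall back to 4096-byte pages if the header line is missing.
    page_size = int(size_match.group(1)) if size_match else 4096
    pages = 0
    for line in vm_stat_output.splitlines():
        match = re.match(r"Pages (free|inactive):\s+(\d+)", line.strip())
        if match:
            pages += int(match.group(2))
    return pages * page_size


def child_processes_for(cpu_count, available_bytes, bytes_per_drt=BYTES_PER_DRT):
    """Cap workers at one per core, limited by the available-memory estimate."""
    return max(1, min(cpu_count, available_bytes // bytes_per_drt))


if __name__ == "__main__":
    available = free_plus_inactive_bytes(SAMPLE_VM_STAT)
    print(available, child_processes_for(16, available))
```

The result here moves with the instantaneous page counts, which is the sensitivity koivisto objects to in comments 4 and 6.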