Bug 77419 - run-webkit-tests: This machine could support 16 child processes, but only has enough memory for 15
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords: NRWT
Depends on: 73847 74021 74650
Blocks:
Reported: 2012-01-31 03:21 PST by Antti Koivisto
Modified: 2012-06-19 14:44 PDT
Description Antti Koivisto 2012-01-31 03:21:53 PST
wat. This MacPro has 14GB of RAM.
Comment 1 Eric Seidel (no email) 2012-01-31 03:28:37 PST
It's a heuristic. :)

http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/layout_tests/port/base.py#L166

It uses the results of vm_stat:
http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/common/system/platforminfo.py#L75

I suspect you have a lot of other things running?
Comment 2 Eric Seidel (no email) 2012-01-31 03:30:45 PST
I'm very happy to tune the heuristic further. It can be wrong in both directions. ORWT didn't have this problem because it only ran one copy of DRT. NRWT can run as many as we'd like it to (you can control it manually with --child-processes=N). Right now it tries to run one per core if we have the RAM to support them. The RAM requirement was added because we were breaking the Mac builders, which only had 3GB of RAM :) (that was about all you needed back in the day to link WebKit and run one DRT at a time).
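
To make the shape of the heuristic concrete, here is a minimal sketch of "one per core, if the RAM supports it". It is not the webkitpy code linked above: the function name, the 400 MB per-DRT budget, and the usable-memory figure are all hypothetical and chosen only so that the output matches the message in this bug's title.

```python
def default_child_processes(cpu_count, usable_bytes, bytes_per_drt):
    """Cap one-worker-per-core by how many DRTs fit in usable memory.

    Returns (worker_count, warning_or_None). All three inputs are
    supplied by the caller; none of the values come from webkitpy.
    """
    by_memory = max(1, usable_bytes // bytes_per_drt)  # always run at least one
    if by_memory < cpu_count:
        warning = ("This machine could support %d child processes, "
                   "but only has enough memory for %d"
                   % (cpu_count, by_memory))
        return by_memory, warning
    return cpu_count, None


# Hypothetical figures: 16 cores, 400 MB budgeted per DRT, and 6000 MB
# counted as usable by whatever measurement the heuristic relies on.
MB = 1024 * 1024
workers, warning = default_child_processes(16, 6000 * MB, 400 * MB)
print(workers, warning)
```

Whatever the heuristic picks, --child-processes=N still overrides it.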
Comment 3 Eric Seidel (no email) 2012-01-31 03:32:38 PST
See bug 74021, bug 74650 and bug 73847 for more of the history.
Comment 4 Antti Koivisto 2012-01-31 05:17:58 PST
No, the machine is not under memory pressure.

You are likely to get better results by addressing the problem directly (only allow n threads per GB of physical memory in the machine). Any attempt to base heuristics on free memory figures is almost certain to go wrong.

Explaining why this is the case would require lengthy discussions. It suffices to say that the OS memory subsystem is complex.
Comment 5 Alexey Proskuryakov 2012-01-31 08:58:35 PST
The heuristic is not based on free memory any more - see bug 74650.
Comment 6 Antti Koivisto 2012-01-31 10:21:24 PST
(In reply to comment #5)
> The heuristic is not based on free memory any more - see bug 74650.

Yes it is, with a slightly altered definition of "free memory".
Comment 7 Eric Seidel (no email) 2012-01-31 10:53:15 PST
I'm happy to accept alternate proposals. Ideally in Python form. :) I'm not at all wedded to the current heuristic.

The very first attempt at this (bug 73847) used physical memory (sysctl -n hw.memsize). That was deemed not good enough and was changed to free memory (vm_stat "Pages free") in bug 74021. That in turn was decided to be insufficient and was changed to free + inactive (vm_stat "Pages free" + "Pages inactive") in bug 74650.

Again, totally open to changing the algorithm.  But we'll need a concrete suggestion.  See bug 73847 for why we moved away from sysctl -n hw.memsize to vm_stat "Pages free".
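
For reference, the two vm_stat-based measures mentioned above can be sketched as follows. This is an illustration only, not the platforminfo.py code: the function names are hypothetical and the sample output uses made-up numbers in the usual vm_stat layout. On a Mac the text would come from running vm_stat itself, for example through subprocess.check_output(["vm_stat"]).

```python
import re

# Illustrative vm_stat output; the numbers are made up for this example.
SAMPLE_VM_STAT = """\
Mach Virtual Memory Statistics: (page size of 4096 bytes)
Pages free:                         5538.
Pages active:                     142583.
Pages inactive:                    71309.
Pages wired down:                  60542.
"""


def parse_vm_stat(text):
    """Return (page_size_bytes, {counter_name: page_count})."""
    match = re.search(r"page size of (\d+) bytes", text)
    if match is None:
        raise ValueError("no page size found in vm_stat output")
    page_size = int(match.group(1))
    counters = {}
    for line in text.splitlines():
        # Counter lines look like "Pages free:      5538."
        m = re.match(r"^(Pages [^:]+):\s+(\d+)\.?\s*$", line)
        if m:
            counters[m.group(1)] = int(m.group(2))
    return page_size, counters


def free_bytes(text):
    """The measure described for bug 74021: "Pages free" only."""
    page_size, counters = parse_vm_stat(text)
    return counters["Pages free"] * page_size


def free_plus_inactive_bytes(text):
    """The measure described for bug 74650: free + inactive pages."""
    page_size, counters = parse_vm_stat(text)
    return (counters["Pages free"] + counters["Pages inactive"]) * page_size


print(free_bytes(SAMPLE_VM_STAT), free_plus_inactive_bytes(SAMPLE_VM_STAT))
```

Both figures depend on what else the machine happens to be caching or running at the moment of the call, which is the objection raised in comment 4 and comment 6.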
Comment 8 Eric Seidel (no email) 2012-01-31 10:59:31 PST
(In reply to comment #7)
> Again, totally open to changing the algorithm.  But we'll need a concrete suggestion.  See bug 73847 for why we moved away from sysctl -n hw.memsize to vm_stat "Pages free".

Sorry, I meant to say bug 74021, but I realize now the discussion was all in private mail about the mac bots.  I'm happy to forward you the (not very exciting) discussion.

I'm open to changing this back to using hw.memsize with a smaller expected-ram-per-DRT value.

Another way would be to not pick a number of DRTs to spawn at the beginning and dynamically control them based on free memory.  That's a larger change, but perhaps a better system design.  Dirk might be able to comment on how difficult that might be.
Comment 9 Lucas Forschler 2012-01-31 11:42:09 PST
(In reply to comment #8)
> (In reply to comment #7)
> > Again, totally open to changing the algorithm.  But we'll need a concrete suggestion.  See bug 73847 for why we moved away from sysctl -n hw.memsize to vm_stat "Pages free".
> 
> Sorry, I meant to say bug 74021, but I realize now the discussion was all in private mail about the mac bots.  I'm happy to forward you the (not very exciting) discussion.
> 
> I'm open to changing this back to using hw.memsize with a smaller expected-ram-per-DRT value.
> 
> Another way would be to not pick a number of DRTs to spawn at the beginning and dynamically control them based on free memory.  That's a larger change, but perhaps a better system design.  Dirk might be able to comment on how difficult that might be.

We should ensure that all the bots have enough memory to run as many DRT processes as cores, otherwise we are just wasting cpu capacity.  If this means upgrading the memory in the bots, that is what we should do.  Our EWS bots are a mix of 4, 8, and 16 core machines.  What is the memory requirement for DRT?  Obviously a 16 core machine will need more memory than a 4 core machine, but how much more I am unsure.
Comment 10 Dirk Pranke 2012-01-31 13:28:09 PST
(In reply to comment #9)
> We should ensure that all the bots have enough memory to run as many DRT processes as cores, otherwise we are just wasting cpu capacity.  If this means upgrading the memory in the bots, that is what we should do.  Our EWS bots are a mix of 4, 8, and 16 core machines.  What is the memory requirement for DRT?  Obviously a 16 core machine will need more memory than a 4 core machine, but how much more I am unsure.

I agree with Lucas. Memory is (relatively) cheap, and this is the approach we've been using since time immemorial on the Chromium bots. I think roughly speaking we tend to have about 768MB of physical memory per DRT instance (i.e., per virtual core). Dunno if 512MB/DRT would be enough, but it sure seems like it should be.

I can pull the stats from all of the Chromium bots if need be.
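
As a rough illustration of what a physical-memory rule would mean for the bots, the sketch below uses the 768MB and 512MB per-DRT figures from this comment and the 4, 8, and 16 core configurations from comment 9. The function names are hypothetical, and the budget ignores memory used by the OS and by other processes, so treat the results as lower bounds rather than a sizing recommendation.

```python
MB = 1024 * 1024
GB = 1024 * MB


def workers_for_physical_memory(cpu_count, physical_bytes, bytes_per_drt):
    """One DRT per core, capped by physical memory / per-DRT budget."""
    return max(1, min(cpu_count, physical_bytes // bytes_per_drt))


def memory_to_saturate(cpu_count, bytes_per_drt):
    """Physical memory needed before memory stops being the limit."""
    return cpu_count * bytes_per_drt


for per_drt_mb in (768, 512):
    for cores in (4, 8, 16):
        need_gb = memory_to_saturate(cores, per_drt_mb * MB) / float(GB)
        print("%2d cores at %d MB/DRT: %.1f GB" % (cores, per_drt_mb, need_gb))
```

Under this rule the 16-core, 14GB machine from the description would get all 16 workers at either budget, since workers_for_physical_memory(16, 14 * GB, 768 * MB) is limited by the core count rather than by memory. A 3GB machine at 768MB per DRT would be capped at 4 workers regardless of how many cores it has.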
Comment 11 Dirk Pranke 2012-06-19 14:44:48 PDT
This should have been fixed in http://trac.webkit.org/changeset/120738 ; please reopen if you still see issues.