NEW 163470
run-webkit-tests consumes gigabytes of memory with --iterations 4294967300
https://bugs.webkit.org/show_bug.cgi?id=163470
David Kilzer (:ddkilzer)
Reported 2016-10-14 16:36:04 PDT
run-webkit-tests consumes gigabytes of memory with --iterations 4294967300:

    $ ./Tools/Scripts/run-webkit-tests --release --no-build -1 --iterations 4294967300 --no-sample-on-timeout --no-timeout --child-processes=1 --batch-size=4294967300 --no-show-results compositing/color-matching/pdf-image-match.html

Are we hitting pathological behavior by specifying that many iterations?
David Kilzer (:ddkilzer)
Comment 1 2016-10-14 16:36:40 PDT
Dean Johnson
Comment 2 2016-10-14 16:55:34 PDT
I suspect the major issue here is this function:

OpenSource/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py

    ...
    class Manager(object):
        ...
        def _get_test_inputs(self, tests_to_run, repeat_each, iterations):
            test_inputs = []
            for _ in xrange(iterations):
                for test in tests_to_run:
                    for _ in xrange(repeat_each):
                        test_inputs.append(self._test_input_for_file(test))  # This line
            return test_inputs

Since it builds the complete test_inputs list before any test is actually run, you'll see multiple gigabytes of data stored in memory with large iteration numbers. Is this really an issue? When do we ever run millions of iterations? If we do need to do this for some reason, can we limit it to 1000000?
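To put a rough number on "multiple gigabytes", here is a minimal sketch of the arithmetic. It assumes repeat_each is 1 (the reported command does not pass a repeat option) and counts only the pointer slot that a Python list keeps per element, so the result is a lower bound that ignores the test input objects themselves. The helper name count_test_inputs is hypothetical and not part of webkitpy.

```python
import struct


def count_test_inputs(num_tests, repeat_each, iterations):
    # Mirrors the three nested loops in _get_test_inputs: one list entry
    # is appended per (iteration, test, repeat) combination.
    return iterations * num_tests * repeat_each


# Size of one pointer on this platform (8 bytes on typical 64-bit builds).
POINTER_SIZE = struct.calcsize("P")

# One test file, repeat_each assumed to be 1, iterations from the report.
total_entries = count_test_inputs(1, 1, 4294967300)

# Lower bound only: the list's pointer array, not the objects it points to.
lower_bound_gib = total_entries * POINTER_SIZE / float(2 ** 30)
print("%d entries, at least %.1f GiB of list slots" % (total_entries, lower_bound_gib))
```

On a 64-bit build this comes out to roughly 32 GiB for the list slots alone, which is consistent with the process growing by gigabytes before the first test starts.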
David Kilzer (:ddkilzer)
Comment 3 2016-10-14 20:42:05 PDT
We could also make a smarter data structure that uses memory more efficiently, such as an iterator object that just returns the next test when asked, but internally stores the repetitive iteration state. This is not a critical bug to fix, but I wanted to document the behavior that I saw when I ran into it.
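The iterator-object idea above could be sketched roughly as follows. This is an illustration under assumptions, not a patch: the class name RepeatedTestInputs and the make_input callback are hypothetical, with make_input standing in for whatever builds a test input from a test name. The sketch uses Python 3's range; under Python 2 the equivalent lazy call would be xrange.

```python
class RepeatedTestInputs(object):
    """Hypothetical lazy stand-in for the fully built list of test inputs.

    Stores only the test names and the two repeat counts, and produces
    each test input on demand instead of materializing every entry.
    """

    def __init__(self, tests_to_run, repeat_each, iterations, make_input):
        self._tests = list(tests_to_run)
        self._repeat_each = repeat_each
        self._iterations = iterations
        self._make_input = make_input  # assumed factory: test name -> input

    def __len__(self):
        # The total count is known without building anything.
        return self._iterations * len(self._tests) * self._repeat_each

    def __iter__(self):
        # Each call starts a fresh pass, so the object can be iterated
        # more than once, unlike a bare generator.
        for _ in range(self._iterations):
            for test in self._tests:
                for _ in range(self._repeat_each):
                    yield self._make_input(test)


# Usage: an identity callback keeps the example self-contained.
inputs = RepeatedTestInputs(["a.html", "b.html"], 2, 2, lambda test: test)
first_pass = list(inputs)
second_pass = list(inputs)
```

Because __iter__ returns a new generator each time, callers that loop over the inputs twice or ask for len() would keep working, while memory use stays proportional to the number of distinct tests rather than to the number of iterations.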
Dean Johnson
Comment 4 2017-08-07 17:42:38 PDT
Python has a construct called "generators" that works very similarly to what you describe. As an example, the code I pasted before could be written as follows, which would evaluate the "test" to return at the time it is accessed, as opposed to generating the whole set beforehand:

OpenSource/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py

    ...
    class Manager(object):
        ...
        def _get_test_inputs_LIST(self, tests_to_run, repeat_each, iterations):
            test_inputs = []
            for _ in xrange(iterations):
                for test in tests_to_run:
                    for _ in xrange(repeat_each):
                        test_inputs.append(self._test_input_for_file(test))  # This line
            return test_inputs

        def _get_test_inputs_GENERATOR(self, tests_to_run, repeat_each, iterations):
            for _ in xrange(iterations):
                for test in tests_to_run:
                    for _ in xrange(repeat_each):
                        yield self._test_input_for_file(test)

Now, calling _get_test_inputs_GENERATOR(args) would give you a generator object, which evaluates and returns the next item in the "list" at access time. This *should* take the memory consumption from O(NUM_TESTS * NUM_ITERATIONS) to O(1). The only reason I have not just written a patch for this is that I suspect we use the result of _get_test_inputs in a repeatedly-accessible way, which would mean that naively adopting this new paradigm could break existing test infrastructure.
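The "repeatedly-accessible" concern can be shown with a small standalone sketch. The function name inputs_as_generator is hypothetical and yields plain test names rather than real test input objects; it uses Python 3's range where the original code uses xrange. Whether the callers of _get_test_inputs actually do any of the following is not established here.

```python
def inputs_as_generator(tests_to_run, repeat_each, iterations):
    # Same loop nesting as _get_test_inputs_GENERATOR, minus the Manager.
    for _ in range(iterations):
        for test in tests_to_run:
            for _ in range(repeat_each):
                yield test


gen = inputs_as_generator(["a.html"], 1, 3)

# A generator can be consumed exactly once.
first_pass = list(gen)
second_pass = list(gen)  # already exhausted, so this is empty

# A generator has no length and cannot be indexed.
try:
    len(inputs_as_generator(["a.html"], 1, 3))
    has_len = True
except TypeError:
    has_len = False
```

Any caller that takes len() of the inputs, indexes or slices them, or loops over them a second time would need to change before a generator could replace the list.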