Bug 163470
| Summary: | run-webkit-tests consumes gigabytes of memory with --iterations 4294967300 | | |
|---|---|---|---|
| Product: | WebKit | Reporter: | David Kilzer (:ddkilzer) <ddkilzer> |
| Component: | Tools / Tests | Assignee: | Nobody <webkit-unassigned> |
| Status: | NEW | | |
| Severity: | Normal | CC: | ap, dean_johnson, lforschler, webkit-bug-importer |
| Priority: | P2 | Keywords: | InRadar |
| Version: | Safari 10 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
David Kilzer (:ddkilzer)
run-webkit-tests consumes gigabytes of memory with --iterations 4294967300:
```
$ ./Tools/Scripts/run-webkit-tests --release --no-build -1 --iterations 4294967300 --no-sample-on-timeout --no-timeout --child-processes=1 --batch-size=4294967300 --no-show-results compositing/color-matching/pdf-image-match.html
```
Are we hitting pathological behavior by specifying that many iterations?

David Kilzer (:ddkilzer)
<rdar://problem/28783669>
Dean Johnson
I would suspect the major issue here being this function:
OpenSource/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py

```python
class Manager(object):
    ...
    def _get_test_inputs(self, tests_to_run, repeat_each, iterations):
        test_inputs = []
        for _ in xrange(iterations):
            for test in tests_to_run:
                for _ in xrange(repeat_each):
                    test_inputs.append(self._test_input_for_file(test))  # This line
        return test_inputs
```
Since it builds up all test_inputs before the tests are actually run, you'll see multiple gigabytes of data stored in memory for large iteration counts.
Is this really an issue? When would we ever run millions of iterations? If we do need to support this for some reason, can we cap it at 1000000?
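For a sense of scale, a rough back-of-the-envelope sketch of what the list-building approach would allocate for the command above (the per-entry byte size is an assumption for illustration; real TestInput objects are likely larger):

```python
# One test file on the command line, default repeat_each of 1.
iterations = 4294967300
num_tests = 1
repeat_each = 1

# Number of TestInput objects the list would hold before any test runs.
entries = iterations * num_tests * repeat_each

# Assuming ~100 bytes per entry (hypothetical lower bound), size in GiB:
approx_gib = entries * 100 / 2**30
print(entries, round(approx_gib))  # billions of entries, hundreds of GiB
```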
David Kilzer (:ddkilzer)
We could also make a smarter data structure that uses memory more efficiently, such as an iterator object that just returns the next test when asked, but internally stores the repetitive iteration state.
This is not a critical bug to fix, but I wanted to document the behavior that I saw when I ran into it.
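One way to sketch such an iterator without a hand-written class is to compose it from itertools. This is a standalone illustration, not the actual WebKit code; the function name is hypothetical:

```python
import itertools

def iter_test_inputs(tests_to_run, repeat_each, iterations):
    # One lazy pass over tests_to_run, each test repeated repeat_each times.
    one_pass = lambda: itertools.chain.from_iterable(
        itertools.repeat(test, repeat_each) for test in tests_to_run)
    # Chain `iterations` fresh passes together; nothing is materialized,
    # so memory stays constant regardless of the iteration count.
    return itertools.chain.from_iterable(
        one_pass() for _ in range(iterations))

# Usage: items are produced one at a time, on demand.
stream = iter_test_inputs(['a.html', 'b.html'], repeat_each=2, iterations=2)
print(next(stream))  # 'a.html'
```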
Dean Johnson
Python has a language feature called "generators" that works very much like what you describe.
As an example, the code I pasted before could be rewritten as follows, which evaluates each "test" at the time it is accessed instead of generating the whole set beforehand:
OpenSource/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py

```python
class Manager(object):
    ...
    def _get_test_inputs_LIST(self, tests_to_run, repeat_each, iterations):
        test_inputs = []
        for _ in xrange(iterations):
            for test in tests_to_run:
                for _ in xrange(repeat_each):
                    test_inputs.append(self._test_input_for_file(test))  # This line
        return test_inputs

    def _get_test_inputs_GENERATOR(self, tests_to_run, repeat_each, iterations):
        for _ in xrange(iterations):
            for test in tests_to_run:
                for _ in xrange(repeat_each):
                    yield self._test_input_for_file(test)
```
Now, calling _get_test_inputs_GENERATOR(args) gives you a generator object, which evaluates and returns the next item in the "list" at access time. This *should* take the memory consumption from O(NUM_ITERATIONS * NUM_TESTS * REPEAT_EACH) down to O(1).
The only reason I have not just written a patch for this is that I suspect we use _get_test_inputs in places that iterate over the result more than once; a generator can only be consumed once, so adopting the new paradigm naively could break existing test infrastructure.
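The single-pass pitfall is easy to demonstrate in isolation: once a generator has been exhausted, iterating it again silently yields nothing, so any caller that walks the result twice would see an empty second pass. A standalone sketch:

```python
def numbers():
    # Minimal generator, standing in for _get_test_inputs_GENERATOR.
    for i in range(3):
        yield i

gen = numbers()
first_pass = list(gen)   # [0, 1, 2]
second_pass = list(gen)  # [] -- the generator is already exhausted
print(first_pass, second_pass)
```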