WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
84008
Add public page loading performance tests using web-page-replay
https://bugs.webkit.org/show_bug.cgi?id=84008
Summary
Add public page loading performance tests using web-page-replay
Ryosuke Niwa
Reported
2012-04-15 23:27:20 PDT
Google's page cycler and Apple's PLT test suites are both private due to copyright law restrictions, and people outside of these two organizations cannot see the contents. This severely limits the utility and the effectiveness of these test suites because not all contributors can run them locally. We should add a publicly distributable version of these two test suites that tests page loading time of real websites. The idea here is to load pages out of archive.org. Because archive.org provides a content snapshot at a specific time, we can always pull the same data by simply specifying the URL, eliminating the need for distributing the page contents with the test suite but still allowing all contributors to obtain the same page test data when running the tests. Credit: Greg Simon for this clever idea. In order to avoid DoS'ing archive.org, we can use web-page-replay:
http://code.google.com/p/web-page-replay/
to cache data locally and make any other necessary modifications to the page for the purpose of performance tests.
Attachments
Adds basic replay mechanism
(944.62 KB, patch)
2012-04-15 23:31 PDT
,
Ryosuke Niwa
no flags
Details
Formatted Diff
Diff
Missing replaytest.py
(4.99 KB, text/x-python-script)
2012-04-15 23:43 PDT
,
Ryosuke Niwa
no flags
Details
work in progress
(71.56 KB, patch)
2012-05-03 16:17 PDT
,
Ryosuke Niwa
no flags
Details
Formatted Diff
Diff
work in progress (got rid of the useless change log entry)
(13.78 KB, patch)
2012-05-03 16:19 PDT
,
Ryosuke Niwa
no flags
Details
Formatted Diff
Diff
work in progress
(15.30 KB, patch)
2012-05-10 15:13 PDT
,
Ryosuke Niwa
no flags
Details
Formatted Diff
Diff
work in progress 4
(12.22 KB, patch)
2012-05-25 18:40 PDT
,
Ryosuke Niwa
no flags
Details
Formatted Diff
Diff
Initial implementation
(46.35 KB, patch)
2012-05-30 17:44 PDT
,
Ryosuke Niwa
no flags
Details
Formatted Diff
Diff
Addressed Dirk's comments.
(51.85 KB, patch)
2012-05-31 18:21 PDT
,
Ryosuke Niwa
dpranke
: review+
Details
Formatted Diff
Diff
Show Obsolete
(7)
View All
Add attachment
proposed patch, testcase, etc.
Ryosuke Niwa
Comment 1
2012-04-15 23:31:12 PDT
Created
attachment 137280
[details]
Adds basic replay mechanism
Ryosuke Niwa
Comment 2
2012-04-15 23:32:38 PDT
Everything in Tools/WebPageReplay has been imported from
http://code.google.com/p/web-page-replay/
. I still need to do some polishing but the basic machinery is there already. You can add a bunch of URLs to Replay/replaytest.list and run-perf-tests will automatically create replay tests for you :)
Ryosuke Niwa
Comment 3
2012-04-15 23:43:13 PDT
Created
attachment 137283
[details]
Missing replaytest.py Note: to use this patch, you have to go to Mac's system preference and setup http/https proxies at 8080 and 8413
Zoltan Horvath
Comment 4
2012-05-03 01:38:43 PDT
(In reply to
comment #2
)
> Everything in Tools/WebPageReplay has been imported from
http://code.google.com/p/web-page-replay/
.
Is there any chance that it will work on linux if I install the dependencies?
> I still need to do some polishing but the basic machinery is there already. You can add a bunch of URLs to Replay/replaytest.list and run-perf-tests will automatically create replay tests for you :)
Hmm, is the ResourceLoadDelegate.mm the only mac specific change? Btw, do you have a recent version of the patch or the latest sits here? :) Thanks!
Ryosuke Niwa
Comment 5
2012-05-03 03:27:02 PDT
(In reply to
comment #4
)
> > I still need to do some polishing but the basic machinery is there already. You can add a bunch of URLs to Replay/replaytest.list and run-perf-tests will automatically create replay tests for you :) > > Hmm, is the ResourceLoadDelegate.mm the only mac specific change?
I suspect a similar change needs to be made on all ports.
> Btw, do you have a recent version of the patch or the latest sits here? :)
I'm working on it now.
Ryosuke Niwa
Comment 6
2012-05-03 16:17:58 PDT
Created
attachment 140121
[details]
work in progress Something is broken on web-page-replay side, and I can't get it to work. Everything results in 404 :(
Ryosuke Niwa
Comment 7
2012-05-03 16:19:13 PDT
Created
attachment 140122
[details]
work in progress (got rid of the useless change log entry)
Ryosuke Niwa
Comment 8
2012-05-10 15:13:50 PDT
Created
attachment 141276
[details]
work in progress
Ryosuke Niwa
Comment 9
2012-05-23 16:29:26 PDT
Here's a list of popular 60+ web pages (except ad provider, porn, etc...) that were available on archive.org:
http://web.archive.org/web/20110729013436/http://www.google.com/
http://web.archive.org/web/20110101045027/http://www.youtube.com/
http://web.archive.org/web/20101029215059/http://www.yahoo.com/
http://web.archive.org/web/20110726050652/http://www.baidu.com/
http://web.archive.org/web/20110713210358/http://news.baidu.com/
http://web.archive.org/web/20110729021258/http://en.wikipedia.org/wiki/Main_Page
http://web.archive.org/web/20110716004235/http://en.wikipedia.org/wiki/The_Beatles
http://web.archive.org/web/20110131013707/http://www.qq.com/
http://web.archive.org/web/20110729032327/http://www.amazon.com/
http://web.archive.org/web/20110729033222/http://googleblog.blogspot.com/
http://web.archive.org/web/20100209183352/http://www.taobao.com/index_global.php
http://web.archive.org/web/20110726224242/http://www.sina.com.cn/
http://web.archive.org/web/20110728224518/http://www.yahoo.co.jp/
http://web.archive.org/web/20110728062744/http://www.msn.com/
http://web.archive.org/web/20110729040307/http://wordpress.com/
http://web.archive.org/web/20110721134904/http://en.blog.wordpress.com/
http://web.archive.org/web/20110721020325/http://www.google.com.hk/
http://web.archive.org/web/20110729062331/http://www.ebay.com/
http://web.archive.org/web/20110721133155/http://www.yandex.ru/
http://web.archive.org/web/20110720113906/http://www.163.com/
http://web.archive.org/web/20110728224522/http://weibo.com/
http://web.archive.org/web/20110426114921/http://weibo.com/yaochen
http://web.archive.org/web/20110630155134/http://www.bing.com/news
http://web.archive.org/web/20110712090452/http://windows.microsoft.com/en-US/windows/home
http://web.archive.org/web/20110723131535/http://news.soso.com/
http://web.archive.org/web/20110721002024/http://staff.tumblr.com/
http://web.archive.org/web/20110728143621/https://www.paypal.com/
http://web.archive.org/web/20110608070205/https://www.apple.com/
http://web.archive.org/web/20110729034928/http://www.google.ru/
http://web.archive.org/web/20110728224803/http://www.sohu.com/
http://web.archive.org/web/20110725235527/http://sfbay.craigslist.org/
http://web.archive.org/web/20110624121311/http://sfbay.craigslist.org/apa/
http://web.archive.org/web/20110728165802/http://www.imdb.com/
http://web.archive.org/web/20110729013439/http://www.bbc.co.uk/
http://web.archive.org/web/20110728190248/http://www.tudou.com/
http://web.archive.org/web/20100613214720/http://sl65amg.blog32.fc2.com/
http://web.archive.org/web/20110708184935/http://blog.goo.ne.jp/isehakusandou
http://web.archive.org/web/20110721233757/http://www.ifeng.com/
http://web.archive.org/web/20110728092214/http://www.ask.com/
http://liveweb.archive.org/http://www.ask.com/wiki
http://web.archive.org/web/20110726103516/http://www.youku.com/
http://web.archive.org/web/20110710095406/http://imgur.com/gallery
http://web.archive.org/web/20110729013512/http://www.cnn.com/
http://web.archive.org/web/20110716175917/http://www.hao123.com/
http://web.archive.org/web/20110728094718/http://www.aol.com/
http://web.archive.org/web/20110728190308/http://espn.go.com/
http://web.archive.org/web/20110724020539/http://www.alibaba.com/
http://web.archive.org/web/20110725075627/http://www.avg.com/us-en/homepage
http://web.archive.org/web/20110723060303/http://www.rakuten.co.jp/
http://web.archive.org/web/20110707075549/http://www.about.com/travel/
http://web.archive.org/web/20110727034228/http://www.chinaz.com/
http://web.archive.org/web/20110729082516/http://wordpress.org/
http://web.archive.org/web/20110722032538/http://official.ameba.jp/ranking/day/officialAccessRankingTop.html
http://web.archive.org/web/20110709073905/https://www.alipay.com/
http://web.archive.org/web/20110722061518/http://www.uol.com.br/
http://web.archive.org/web/20110721223556/http://www.amazon.co.jp/
http://web.archive.org/web/20110512210519/http://stackoverflow.com/
http://web.archive.org/web/20110729043303/http://www.huffingtonpost.com/
http://web.archive.org/web/20100627183250/http://www.cnet.com/
http://web.archive.org/web/20110416045031/http://www.dailymotion.com/us
Ryosuke Niwa
Comment 10
2012-05-25 18:40:23 PDT
Created
attachment 144180
[details]
work in progress 4
Ryosuke Niwa
Comment 11
2012-05-30 10:51:10 PDT
It appears that some of the pages can't be loaded properly in DRT using web-page-replay.
Ryosuke Niwa
Comment 12
2012-05-30 17:44:46 PDT
Created
attachment 144957
[details]
Initial implementation
Dirk Pranke
Comment 13
2012-05-30 18:48:31 PDT
Comment on
attachment 144957
[details]
Initial implementation View in context:
https://bugs.webkit.org/attachment.cgi?id=144957&action=review
Patch generally looks good, just some minor comments
> Tools/Scripts/webkitpy/layout_tests/port/webkit.py:533 > + command = driver_input.test_name
Seems like this should just go in is_http_test()? I understand that you might be thinking 'that's just for tests in the http/ directory', but it's probably more confusing this way.
> Tools/Scripts/webkitpy/performance_tests/perftest.py:72 > + if not output or output.text == None or output.error:
when would output be None here? driver.run_test() should always return something; can you call run_failed() outside of run()?
> Tools/Scripts/webkitpy/performance_tests/perftest.py:176 > + def run(self, port, driver, time_out_ms):
Does it make more sense to pass the port to __init__()?
> Tools/Scripts/webkitpy/performance_tests/perftest.py:210 > + return driver.run_test(DriverInput(self.path_or_url(), time_out_ms, None, False))
Maybe this should go up into PerfTest() and then PerfTest.run() can just call this to avoid the repetition?
> Tools/Scripts/webkitpy/performance_tests/perftest.py:219 > + replay_path = os.path.join(os.path.dirname(webkitpy.thirdparty.autoinstalled.webpagereplay.__file__), 'replay.py')
can you get this from just webkitpy.thirdparty.autoinstalled.webpagereplay.replay.__file__ ?
> Tools/Scripts/webkitpy/performance_tests/perftest.py:230 > + connection = socket.create_connection(('localhost', '8080'), timeout=100)
timeouts are in seconds, are you really trying to wait for 100 seconds and retrying 100 times, or did you mean for this to be msecs?
> Tools/Scripts/webkitpy/performance_tests/perftest.py:239 > + self._process.send_signal(signal.SIGINT)
curious why this is SIGINT instead of TERM or KILL?
> Tools/Scripts/webkitpy/performance_tests/perftest.py:311 > + filesystem = driver._port.host.filesystem
You can just access port.host.filesystem directly.
Ryosuke Niwa
Comment 14
2012-05-31 17:55:44 PDT
Comment on
attachment 144957
[details]
Initial implementation View in context:
https://bugs.webkit.org/attachment.cgi?id=144957&action=review
>> Tools/Scripts/webkitpy/layout_tests/port/webkit.py:533 >> + command = driver_input.test_name > > Seems like this should just go in is_http_test()? I understand that you might be thinking 'that's just for tests in the http/ directory', but it's probably more confusing this way.
That's problematic because of code like: def test_to_uri(self, test_name): """Convert a test name to a URI.""" if not self.is_http_test(test_name): return path.abspath_to_uri(self._port.abspath_for_test(test_name)) I've added is_external_http_test to base.py instead.
>> Tools/Scripts/webkitpy/performance_tests/perftest.py:72 >> + if not output or output.text == None or output.error: > > when would output be None here? driver.run_test() should always return something; can you call run_failed() outside of run()?
run_single will return None if the web page replay didn't start properly. Moved the condition to PageLoadingPerfTest.run.
>> Tools/Scripts/webkitpy/performance_tests/perftest.py:176 >> + def run(self, port, driver, time_out_ms): > > Does it make more sense to pass the port to __init__()?
That's a good idea how that both run and prepare take port.
>> Tools/Scripts/webkitpy/performance_tests/perftest.py:210 >> + return driver.run_test(DriverInput(self.path_or_url(), time_out_ms, None, False)) > > Maybe this should go up into PerfTest() and then PerfTest.run() can just call this to avoid the repetition?
Done.
>> Tools/Scripts/webkitpy/performance_tests/perftest.py:219 >> + replay_path = os.path.join(os.path.dirname(webkitpy.thirdparty.autoinstalled.webpagereplay.__file__), 'replay.py') > > can you get this from just webkitpy.thirdparty.autoinstalled.webpagereplay.replay.__file__ ?
Done.
>> Tools/Scripts/webkitpy/performance_tests/perftest.py:230 >> + connection = socket.create_connection(('localhost', '8080'), timeout=100) > > timeouts are in seconds, are you really trying to wait for 100 seconds and retrying 100 times, or did you mean for this to be msecs?
Fixed. It appears that create_connection bails out immediately when there's nobody listening on the specified port. So I've added an explicit sleep of 1s to avoid the busy loop.
>> Tools/Scripts/webkitpy/performance_tests/perftest.py:239 >> + self._process.send_signal(signal.SIGINT) > > curious why this is SIGINT instead of TERM or KILL?
Changed it to SIGTERM.
>> Tools/Scripts/webkitpy/performance_tests/perftest.py:311 >> + filesystem = driver._port.host.filesystem > > You can just access port.host.filesystem directly.
Yeah, I forgot to update that. Fixed.
Ryosuke Niwa
Comment 15
2012-05-31 18:21:38 PDT
Created
attachment 145188
[details]
Addressed Dirk's comments.
Dirk Pranke
Comment 16
2012-05-31 19:38:49 PDT
Comment on
attachment 145188
[details]
Addressed Dirk's comments. View in context:
https://bugs.webkit.org/attachment.cgi?id=145188&action=review
> Tools/Scripts/webkitpy/layout_tests/port/driver.py:134 > +
Turns out I don't want this ... see below (and sorry!)
> Tools/Scripts/webkitpy/layout_tests/port/webkit.py:533 > + command = driver_input.test_name
Okay, I see the problem here, which is that driver_input.test_name isn't actually a test_name, it's a full URL. DriverInputs aren't supposed to be full URLs, they're supposed to be test names :). However, it would be lame if that kept you from using the driver, so we need some other way of distinguishing that this is a URL and not a test name. I'm not sure what the best way to do that is yet, and I don't want to block you on this ... Can you add a #FIXME for now, flip this clause so that it's tested first, and go back to what you were doing before? e.g.: # FIXME: perf tests can pass in full URLs, not just test names ... if driver_input.test_name.startswith('http://') or driver_input.test_name.startswith('https://'): command = driver_input.test_name else: ... sorry for the back-and-forth on this. Everything else looks fine. Thanks!
Ryosuke Niwa
Comment 17
2012-05-31 19:43:23 PDT
Comment on
attachment 145188
[details]
Addressed Dirk's comments. View in context:
https://bugs.webkit.org/attachment.cgi?id=145188&action=review
>> Tools/Scripts/webkitpy/layout_tests/port/webkit.py:533 >> + command = driver_input.test_name > > Okay, I see the problem here, which is that driver_input.test_name isn't actually a test_name, it's a full URL. DriverInputs aren't supposed to be full URLs, they're supposed to be test names :). > > However, it would be lame if that kept you from using the driver, so we need some other way of distinguishing that this is a URL and not a test name. I'm not sure what the best way to do that is yet, and I don't want to block you on this ... Can you add a #FIXME for now, flip this clause so that it's tested first, and go back to what you were doing before? e.g.: > > # FIXME: perf tests can pass in full URLs, not just test names ... > if driver_input.test_name.startswith('http://') or driver_input.test_name.startswith('https://'): > command = driver_input.test_name > else: > ... > > sorry for the back-and-forth on this. > > Everything else looks fine. Thanks!
Okay. WIll do that. I think we should just rename test_name to path_or_url.
Ryosuke Niwa
Comment 18
2012-05-31 20:19:37 PDT
Committed
r119188
: <
http://trac.webkit.org/changeset/119188
>
Ryosuke Niwa
Comment 19
2012-05-31 20:22:36 PDT
Thanks for the review, Dirk. I'm thrilled to land this patch :) An open performance test suite is coming!
Jeff Hammel
Comment 20
2012-06-01 09:59:58 PDT
Mozilla-side work recorded here:
https://bugzilla.mozilla.org/show_bug.cgi?id=760574
Ryosuke Niwa
Comment 21
2012-07-27 00:14:53 PDT
FYI, I've added some explanation about this on
http://trac.webkit.org/wiki/Writing%20Performance%20Tests
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug