WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
20097
[Qt] 20% Sunspider slow-down
https://bugs.webkit.org/show_bug.cgi?id=20097
Summary
[Qt] 20% Sunspider slow-down
Zoltan Herczeg
Reported
2008-07-18 11:19:05 PDT
I noticed a major slow-down for sunspider between
r35202
-
r35208
Repository : Root:
http://svn.webkit.org/repository/webkit
UUID: 268f45cc-cd09-0410-ab3c-d52691b4dbfc System: Linux, Qt-4.3.4, gcc-4.1.2 (20061115) CPU: AMD opteron, Dual-core 2.2 GHz (dedicated for benchmark measurement) Revision: 35033: 3168.0ms +/- 0.8% 35123: 3172.0ms +/- 0.7% 35170: 3197.8ms +/- 0.9% 35190: 3191.5ms +/- 0.6% 35202: 3193.7ms +/- 0.7% <<< HERE ~20% slow >>> 35208: 3782.8ms +/- 0.6% 35231: 3779.2ms +/- 0.6%
Attachments
Detailed sunspider results (.tgz)
(3.81 KB, application/octet-stream)
2008-07-18 11:25 PDT
,
Zoltan Herczeg
no flags
Details
compare r35202 and r35203
(3.64 KB, text/plain)
2008-07-19 00:06 PDT
,
Zoltan Herczeg
no flags
Details
bubbleSort.js
(1.79 KB, text/javascript)
2008-07-20 11:01 PDT
,
David Kilzer (:ddkilzer)
no flags
Details
Add missing NDEBUG
(941 bytes, patch)
2008-07-21 08:22 PDT
,
Gabor Loki
hausmann
: review+
Details
Formatted Diff
Diff
View All
Add attachment
proposed patch, testcase, etc.
Zoltan Herczeg
Comment 1
2008-07-18 11:25:25 PDT
Created
attachment 22366
[details]
Detailed sunspider results (.tgz)
Alexey Proskuryakov
Comment 2
2008-07-18 11:34:56 PDT
The only JavaScriptCore change in this range is <
http://trac.webkit.org/projects/webkit/changeset/35203
>.
David Kilzer (:ddkilzer)
Comment 3
2008-07-18 13:13:03 PDT
Could you attach all of the additional tests you're running with SunSpider?
Cameron Zwarich (cpst)
Comment 4
2008-07-18 14:19:46 PDT
I don't see a negative impact on performance from
r35203
on my machine, a 2.16 GHz MacBook Pro with Apple GCC 4.0.
Mark Rowe (bdash)
Comment 5
2008-07-18 14:31:22 PDT
I see
r35203
as being around 2% faster than
r35202
on my Mac with GCC 4.0. If I force JavaScriptCore to build with GCC 4.2, I see
r35203
as being around 2% slower than
r35202
. This regression may be specific to either your compiler version or perhaps even to the manner in which the Qt port builds JavaScriptCore.
Geoffrey Garen
Comment 6
2008-07-18 14:36:17 PDT
Can you verify that the ALWAYS_INLINE macro works in your port?
Zoltan Herczeg
Comment 7
2008-07-18 15:06:45 PDT
(In reply to
comment #6
)
> Can you verify that the ALWAYS_INLINE macro works in your port?
Yes, both ALWAYS_INLINE and NEVER_INLINE works (I checked an assembly dump) Maybe it was not clear, but our extended benchmark set is separated from Sunspider. Those tests are not finalized and may changed in the near future. However, if you are still interested, I can attach them. Is it possible, that the AMD cpu, which has a slower fpu than the intel counterpart, is responsible for the slow? Maybe I should try out my binaries on other computers.
Zoltan Herczeg
Comment 8
2008-07-18 15:31:41 PDT
I simply copied my binaries to a P4 2.40 GHz (jsc-35202, jsc-35208 and libQtCore.so.4 because it was missing from that system), and it seems the results are the same as on the other archtecture. By the way, the size is also increased: jsc-35202 - 1820153 jsc-35208 - 1878409
David Kilzer (:ddkilzer)
Comment 9
2008-07-18 18:02:35 PDT
(In reply to
comment #7
)
> Maybe it was not clear, but our extended benchmark set is separated from > Sunspider. Those tests are not finalized and may changed in the near future. > However, if you are still interested, I can attach them.
Please attach them. The only way to reproduce the results on other platforms is to use the same tests. You should also try the original SunSpider tests to see how they perform on your platform, e.g., if you see a slowdown or a speedup.
Zoltan Herczeg
Comment 10
2008-07-19 00:04:52 PDT
I am sorry, it seems it was a bad thing to mention the other benchmark set. Naturally, the results I sent you are measured by the original sunsipder tests without any kind of tweaks. In the meantime, I have compiled the 35203 revision as well. Now I can enclose the comparistation between
r35202
and
r35203
. The worst case: access-fannkuch: *1.67x as slow* The reason why I mentioned bubble sort, because it was more than 70% slower, and that is bigger than access-fannkuch. The file sizes as also interesting: jsc-35123: 1814130 bytes jsc-35170: 1820097 bytes jsc-35190: 1820143 bytes jsc-35202: 1820153 bytes << BIG JUMP >> jsc-35203: 1878409 bytes jsc-35208: 1878409 bytes (you mentionad there was no changes here) jsc-35231: 1862421 bytes (little better)
Zoltan Herczeg
Comment 11
2008-07-19 00:06:02 PDT
Created
attachment 22382
[details]
compare
r35202
and
r35203
Mark Rowe (bdash)
Comment 12
2008-07-19 00:18:38 PDT
It may be worth testing a different compiler version or a different port to see what result you get. The GTK port should compile and run on Linux, and since it uses a different build system it may give different performance figures. it would be interesting to see whether it also shows the same regression.
Zoltan Herczeg
Comment 13
2008-07-19 05:55:28 PDT
I downloaded WebKit to my home computer: Core 2 Duo, Ubuntu-Linux, GTK port, gcc-4.2.1 sunspider
r35202
- 2614.9ms +/- 0.2% sunspider
r35203
- 2601.3ms +/- 0.2% It seems a little bit faster! To my shock, the jsc binaries are only 63% of the QT port:
r35202
: GTK: 1147693 bytes, QT: 1820153 bytes
r35203
: GTK: 1168479 bytes, QT: 1878409 bytes Do you have any idea? Is the QT port causes the difference?
Mark Rowe (bdash)
Comment 14
2008-07-19 22:32:41 PDT
I would look into the compiler flags that are used when building JavaScriptCore. Is the Qt build system building JavaScriptCore with "-fstrict-aliasing"? Are they both using "-O3"?
David Kilzer (:ddkilzer)
Comment 15
2008-07-20 11:01:17 PDT
Created
attachment 22392
[details]
bubbleSort.js Additional bubbleSort.js test metioned in Zoltan's email to webkit-dev.
Zoltan Herczeg
Comment 16
2008-07-20 14:24:26 PDT
I have successfully compiled the QT port on my home computer: QT-port-35202: sunspider: 2665.1ms +/- 0.3% binary size: 1841368 QT-port-35203: sunspider: 3238.2ms +/- 0.2% binary size: 1904808 As you can see, this is exactly the same thing what we saw before! So this thing is QT port specific. These are the CFLAGS for the QT-port: g++ [...] -pipe -Wreturn-type -fno-strict-aliasing -O2 -fvisibility=hidden -fvisibility-inlines-hidden [...] I will try to change these to -O3 and -f-strict-aliasing, and will send you the results.
Zoltan Herczeg
Comment 17
2008-07-21 01:59:09 PDT
It seems the QT port applies the same CXXFLAGS to all projects. We only need -O3 and -fstrict-aliasing flags for JavaScriptCore. Is there a simple way to do that, or do I need to write a new "GENERATOR" rule in JavaScriptCore.pri? Any other idea (perhaps change the global CXXFLAGS)?
David Kilzer (:ddkilzer)
Comment 18
2008-07-21 02:26:23 PDT
Copying some Qt folks regarding
Comment #17
.
Simon Hausmann
Comment 19
2008-07-21 03:54:56 PDT
(In reply to
comment #17
)
> It seems the QT port applies the same CXXFLAGS to all projects. We only need > -O3 and -fstrict-aliasing flags for JavaScriptCore. Is there a simple way to do > that, or do I need to write a new "GENERATOR" rule in JavaScriptCore.pri? Any > other idea (perhaps change the global CXXFLAGS)?
With the way JavaScriptCore is built into the same library the CXXFLAGS are the same for the entire QtWebKit library. But for testing purposes you could try something along the lines of QMAKE_CXXFLAGS_RELEASE ~= s/-O2/-O3 -fstrict-aliasing/ in JavaScriptCore.pri or WebCore.pro.
Mark Rowe (bdash)
Comment 20
2008-07-21 04:26:30 PDT
It is almost certainly unsafe for WebCore to be built with strict aliasing enabled.
Zoltan Herczeg
Comment 21
2008-07-21 05:06:48 PDT
> QMAKE_CXXFLAGS_RELEASE ~= s/-O2/-O3 -fstrict-aliasing/
I did something similar, just to see what happend with jsc. The resulted binary was a bit bigger, and a bit faster than the version with -O2, but the difference is small. I looked at the resulting binaries, and there are strange strings in QT-
r35203
(-O2 and -O3 as well). They seems kinda debug info : [...]void WTF::HashTable<Key, Value, Extractor, HashFunctions, Traits, KeyTraits>::checkKey(const T&) [with T = KJS::UString::Rep*, HashTranslator = WTF::RefPtrHashMapRawKeyTranslator<KJS::UString::Rep*, std::pair<WTF::RefPtr<KJS::UString::Rep>, KJS::SymbolTableEntry>, WTF::PairHash[...] These lines are neither in QT-
r35202
nor in GTK binaries. Is it possible that some recently triggered macros made these lines?
Zoltan Herczeg
Comment 22
2008-07-21 07:20:35 PDT
We belive the jsc always compiles in debug mode for QT (and the asserts in Registers.h slows down the sunspider). It seems WebKit/JavaScriptCore/kjs/jsc.pro does not have 'release' mode.
Simon Hausmann
Comment 23
2008-07-21 07:28:14 PDT
(In reply to
comment #22
)
> We belive the jsc always compiles in debug mode for QT (and the asserts in > Registers.h slows down the sunspider). It seems > WebKit/JavaScriptCore/kjs/jsc.pro does not have 'release' mode.
If you build with build-webkit --release for example then jsc is built in release mode. If you run just qmake in JavaScriptCore/kjs then qmake will pick whatever Qt was built with by default, which would normally be a release build, unless you configure your version of Qt with --debug.
Gabor Loki
Comment 24
2008-07-21 08:22:56 PDT
Created
attachment 22404
[details]
Add missing NDEBUG The problem was that the jsc.pro was not contain any release specific rule. So there was no chance to pass the NDEBUG define to the compiler. The patch contains the similar rule which is in WebKit.pri file. Note: I have to include USE_SYSTEM_MALLOC define in case of release, because there was linking problem (WebKit.pri uses USE_SYSTEM_MALLOC in case of debug also).
Mark Rowe (bdash)
Comment 25
2008-07-21 08:27:17 PDT
Does the patch address the performance issue?
Gabor Loki
Comment 26
2008-07-21 12:01:28 PDT
(In reply to
comment #25
)
> Does the patch address the performance issue?
Of course. Here is the SunSpider result: ** TOTAL **: 1.37x as fast 2897.3ms +/- 1.9% 2107.5ms +/- 2.6% significant
Zoltan Herczeg
Comment 27
2008-07-21 23:43:05 PDT
Here is the summarization of the above bug in one sentence: In the QT jsc build, the jsc.pro qmake file never defines the NDEBUG symbol, so the releases contains all asserts and other debug related codes, which heavily slows down all measurements.
Simon Hausmann
Comment 28
2008-07-22 00:27:45 PDT
Comment on
attachment 22404
[details]
Add missing NDEBUG Patch looks good to me
David Kilzer (:ddkilzer)
Comment 29
2008-07-23 08:00:33 PDT
$ git svn dcommit Committing to
http://svn.webkit.org/repository/webkit/trunk
... M JavaScriptCore/ChangeLog M JavaScriptCore/kjs/jsc.pro Committed
r35294
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug