Bug 109597 - Cache timer heap pointer to timers
Summary: Cache timer heap pointer to timers
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Platform (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-02-12 10:48 PST by Antti Koivisto
Modified: 2013-02-13 06:20 PST (History)
4 users (show)

See Also:


Attachments
patch (5.13 KB, patch)
2013-02-12 10:51 PST, Antti Koivisto
kling: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Antti Koivisto 2013-02-12 10:48:31 PST
Accessing timer heap through thread global storage is slow (~0.1% in PLT3). We can cache the heap pointer to each TimerBase. There are not huge numbers of timers around so memory is not an issue and many timers are heavily reused.
Comment 1 Antti Koivisto 2013-02-12 10:51:31 PST
Created attachment 187896 [details]
patch
Comment 2 Andreas Kling 2013-02-12 10:57:09 PST
Comment on attachment 187896 [details]
patch

Neat hack! r=me
Comment 3 Antti Koivisto 2013-02-12 12:25:04 PST
http://trac.webkit.org/changeset/142652
Comment 4 Alexey Proskuryakov 2013-02-12 18:36:31 PST
Comment on attachment 187896 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=187896&action=review

It is worrisome that these functions are so incredibly hot that TLS access - a very fast operation - shows up on the profile.

> Source/WebCore/platform/Timer.h:85
> +    const Vector<TimerBase*>& timerHeap() const { ASSERT(m_cachedThreadGlobalTimerHeap); return *m_cachedThreadGlobalTimerHeap; }
> +    Vector<TimerBase*>& timerHeap() { ASSERT(m_cachedThreadGlobalTimerHeap); return *m_cachedThreadGlobalTimerHeap; }

I don't think that differentiating between the two is meaningful. The first function casts away constness from "this" anyway, so const safety is violated despite having a separate const version of this function.

I suggest always returning a non-const reference, the timer heap is never immutable.
Comment 5 Antti Koivisto 2013-02-13 06:20:36 PST
(In reply to comment #4)
> (From update of attachment 187896 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=187896&action=review
> 
> It is worrisome that these functions are so incredibly hot that TLS access - a very fast operation - shows up on the profile.

Part of the problem was that timerHeap() is accessed from many places during a singe call. Here is the profile before the patch:

Running Time	Self		Symbol Name
134.0ms    0.5%	9.0	 	 WebCore::TimerBase::setNextFireTime(double)
47.0ms    0.1%	2.0	 	  WebCore::TimerBase::heapPopMin()
41.0ms    0.1%	31.0	 	   void std::__1::__push_heap_front<WebCore::TimerHeapLessThanFunction&, WebCore::TimerHeapIterator>(WebCore::TimerHeapIterator, WebCore::TimerHeapIterator, WebCore::TimerHeapLessThanFunction&, std::__1::iterator_traits<WebCore::TimerHeapIterator>::difference_type)
10.0ms    0.0%	5.0	 	    WebCore::timerHeap()
3.0ms    0.0%	2.0	 	     WTF::ThreadSpecific<WebCore::ThreadGlobalData>::operator WebCore::ThreadGlobalData*()
1.0ms    0.0%	1.0	 	      pthread_getspecific
2.0ms    0.0%	2.0	 	     DYLD-STUB$$pthread_getspecific
4.0ms    0.0%	1.0	 	   WebCore::timerHeap()
3.0ms    0.0%	3.0	 	    WTF::ThreadSpecific<WebCore::ThreadGlobalData>::operator WebCore::ThreadGlobalData*()
34.0ms    0.1%	0.0	 	  WebCore::setSharedTimerFireInterval(double)
26.0ms    0.0%	20.0	 	  void std::__1::__push_heap_back<WebCore::TimerHeapLessThanFunction&, WebCore::TimerHeapIterator>(WebCore::TimerHeapIterator, WebCore::TimerHeapIterator, WebCore::TimerHeapLessThanFunction&, std::__1::iterator_traits<WebCore::TimerHeapIterator>::difference_type)
6.0ms    0.0%	1.0	 	   WebCore::timerHeap()
4.0ms    0.0%	3.0	 	    WTF::ThreadSpecific<WebCore::ThreadGlobalData>::operator WebCore::ThreadGlobalData*()
1.0ms    0.0%	1.0	 	     pthread_getspecific
1.0ms    0.0%	1.0	 	    DYLD-STUB$$pthread_getspecific
10.0ms    0.0%	2.0	 	  WebCore::timerHeap()
8.0ms    0.0%	6.0	 	   WTF::ThreadSpecific<WebCore::ThreadGlobalData>::operator WebCore::ThreadGlobalData*()
2.0ms    0.0%	2.0	 	    pthread_getspecific
3.0ms    0.0%	0.0	 	  WebCore::stopSharedTimer()
2.0ms    0.0%	2.0	 	  WebCore::TimerBase::alignedFireTime(double) const
2.0ms    0.0%	2.0	 	  WTF::ThreadSpecific<WebCore::ThreadGlobalData>::operator WebCore::ThreadGlobalData*()
1.0ms    0.0%	0.0	 	  WebCore::DOMTimer::alignedFireTime(double) const

Focusing on timerHeap():

Running Time	Self		Symbol Name
32.0ms    0.1%	11.0	 	 WebCore::timerHeap()
18.0ms    0.0%	14.0	 	  WTF::ThreadSpecific<WebCore::ThreadGlobalData>::operator WebCore::ThreadGlobalData*()
4.0ms    0.0%	4.0	 	   pthread_getspecific
3.0ms    0.0%	3.0	 	  DYLD-STUB$$pthread_getspecific

Call to pthread_getspecific is indeed pretty fast in itself. Assembly level profile shows that time is spent in indirection and branchiness related to fetching the right piece of data.

> I suggest always returning a non-const reference, the timer heap is never immutable.

You are right.