Bug 96042 - Specialize nextBreakablePosition depending on breakNBSP
Summary: Specialize nextBreakablePosition depending on breakNBSP
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Layout and Rendering (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Benjamin Poulain
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-09-06 17:12 PDT by Benjamin Poulain
Modified: 2012-09-10 16:40 PDT (History)
5 users (show)

See Also:


Attachments
Patch (5.84 KB, patch)
2012-09-06 17:33 PDT, Benjamin Poulain
no flags Details | Formatted Diff | Diff
Patch (5.63 KB, patch)
2012-09-07 22:38 PDT, Benjamin Poulain
eric: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Benjamin Poulain 2012-09-06 17:12:34 PDT
This inner loop must be tighter!!!
Comment 1 Benjamin Poulain 2012-09-06 17:33:26 PDT
Created attachment 162630 [details]
Patch
Comment 2 Eric Seidel (no email) 2012-09-06 17:54:17 PDT
Comment on attachment 162630 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=162630&action=review

That's a surprisingly small gain.  Did you compare the instrument samples to see what the % change of this function is?

> Source/WebCore/rendering/break_lines.cpp:149
> +template<>
> +inline bool needsLineBreakIterator<false>(UChar ch)
>  {
>      return ch > asciiLineBreakTableLastChar && ch != noBreakSpace;
>  }

Can't we write this as a single function?  the compiler should optimize out useless branches you should jsut be able to have if (treatNoBreakSpaceAsBreak) in the shared implementation and have the compiler do the right thing?

> Source/WebCore/rendering/break_lines.cpp:194
> +int nextBreakablePositionBreakOnNBSP(LazyLineBreakIterator& lazyBreakIterator, int pos)

By OnNBSP? do you mean IncludingNBSP?  I'm not quite sure what this function name means.
Comment 3 Benjamin Poulain 2012-09-06 18:12:08 PDT
> That's a surprisingly small gain. 

How much do you want? :)
There are not really low hanging fruits left.

> Did you compare the instrument samples to see what the % change of this function is?

On the function itself, Instruments give ~9-10% faster.

I guess could get better numbers with DTrace if you want exact numbers.

> > Source/WebCore/rendering/break_lines.cpp:149
> > +template<>
> > +inline bool needsLineBreakIterator<false>(UChar ch)
> >  {
> >      return ch > asciiLineBreakTableLastChar && ch != noBreakSpace;
> >  }
> 
> Can't we write this as a single function?  the compiler should optimize out useless branches you should jsut be able to have if (treatNoBreakSpaceAsBreak) in the shared implementation and have the compiler do the right thing?

It could in one case but I prefer to be explicit.
IMHO we should not rely blindly on compiler optimizations since this is a hotspot.


> > Source/WebCore/rendering/break_lines.cpp:194
> > +int nextBreakablePositionBreakOnNBSP(LazyLineBreakIterator& lazyBreakIterator, int pos)
> 
> By OnNBSP? do you mean IncludingNBSP?  I'm not quite sure what this function name means.

Yep indeed, this name is bad.

nextBreakablePositionBreakIncludingBreakOnNBSP ?
Long but explicit.
Comment 4 Eric Seidel (no email) 2012-09-06 20:30:47 PDT
(In reply to comment #3)
> > That's a surprisingly small gain. 
> 
> How much do you want? :)
> There are not really low hanging fruits left.
> 
> > Did you compare the instrument samples to see what the % change of this function is?
> 
> On the function itself, Instruments give ~9-10% faster.
> 
> I guess could get better numbers with DTrace if you want exact numbers.

Nah, that's fine. :)  I remember seeing this function as being about 5% of total time.  With your change I would expect it to be much less than 5% if we're seeing a 3% overall win.

> > > Source/WebCore/rendering/break_lines.cpp:149
> > > +template<>
> > > +inline bool needsLineBreakIterator<false>(UChar ch)
> > >  {
> > >      return ch > asciiLineBreakTableLastChar && ch != noBreakSpace;
> > >  }
> > 
> > Can't we write this as a single function?  the compiler should optimize out useless branches you should jsut be able to have if (treatNoBreakSpaceAsBreak) in the shared implementation and have the compiler do the right thing?
> 
> It could in one case but I prefer to be explicit.
> IMHO we should not rely blindly on compiler optimizations since this is a hotspot.

I guess. :)  It would be an extremely dumb compiler to not remove unreachable code!

> > > Source/WebCore/rendering/break_lines.cpp:194
> > > +int nextBreakablePositionBreakOnNBSP(LazyLineBreakIterator& lazyBreakIterator, int pos)
> > 
> > By OnNBSP? do you mean IncludingNBSP?  I'm not quite sure what this function name means.
> 
> Yep indeed, this name is bad.
> 
> nextBreakablePositionBreakIncludingBreakOnNBSP ?
> Long but explicit.

Thinking about it more, I might just go the other way.  Have the "including NBSP" one just be "nextBreakablePosition" and the non-NBSP one be "nextBreakablePositionIgnoringNBSP".  Your call.
Comment 5 Benjamin Poulain 2012-09-07 22:14:47 PDT
> > It could in one case but I prefer to be explicit.
> > IMHO we should not rely blindly on compiler optimizations since this is a hotspot.
> 
> I guess. :)  It would be an extremely dumb compiler to not remove unreachable code!

Reading back, I realize I totally misunderstood you.

I originally thought you were asking to rewrite this in a way that the compiler know the ch test is useless. I figured, sure, that is gonna be tricky but should be possible. :)

Now that I see what you really asked... sure, I can change to if (treatNoBreakSpaceAsBreak) :-D

> Thinking about it more, I might just go the other way.  Have the "including NBSP" one just be "nextBreakablePosition" and the non-NBSP one be "nextBreakablePositionIgnoringNBSP".  Your call.

Changing to nextBreakablePositionIgnoringNBSP() is ok with me.
Comment 6 Benjamin Poulain 2012-09-07 22:38:01 PDT
Created attachment 162942 [details]
Patch
Comment 7 Eric Seidel (no email) 2012-09-07 23:03:34 PDT
Comment on attachment 162942 [details]
Patch

Assuming this is still a similar win, sounds great!
Comment 8 Benjamin Poulain 2012-09-08 17:35:27 PDT
Committed r127974: <http://trac.webkit.org/changeset/127974>
Comment 9 Benjamin Poulain 2012-09-10 16:40:02 PDT
The chrome bots get the expected improvements: http://webkit-perf.appspot.com/graph.html#tests=[[2947022,2001,3001]]&sel=none&displayrange=7&datatype=running

The Mac ones do not: http://webkit-perf.appspot.com/graph.html#tests=[[2947022,2001,32196]]&sel=none&displayrange=7&datatype=running

Locally, on Mac, I get the same results as the Chrome bots.
I have no explanation for the bot results. :(