This inner loop must be tighter!!!
Created attachment 162630 [details] Patch
Comment on attachment 162630 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=162630&action=review That's a surprisingly small gain. Did you compare the instrument samples to see what the % change of this function is? > Source/WebCore/rendering/break_lines.cpp:149 > +template<> > +inline bool needsLineBreakIterator<false>(UChar ch) > { > return ch > asciiLineBreakTableLastChar && ch != noBreakSpace; > } Can't we write this as a single function? the compiler should optimize out useless branches you should jsut be able to have if (treatNoBreakSpaceAsBreak) in the shared implementation and have the compiler do the right thing? > Source/WebCore/rendering/break_lines.cpp:194 > +int nextBreakablePositionBreakOnNBSP(LazyLineBreakIterator& lazyBreakIterator, int pos) By OnNBSP? do you mean IncludingNBSP? I'm not quite sure what this function name means.
> That's a surprisingly small gain. How much do you want? :) There are not really low hanging fruits left. > Did you compare the instrument samples to see what the % change of this function is? On the function itself, Instruments give ~9-10% faster. I guess could get better numbers with DTrace if you want exact numbers. > > Source/WebCore/rendering/break_lines.cpp:149 > > +template<> > > +inline bool needsLineBreakIterator<false>(UChar ch) > > { > > return ch > asciiLineBreakTableLastChar && ch != noBreakSpace; > > } > > Can't we write this as a single function? the compiler should optimize out useless branches you should jsut be able to have if (treatNoBreakSpaceAsBreak) in the shared implementation and have the compiler do the right thing? It could in one case but I prefer to be explicit. IMHO we should not rely blindly on compiler optimizations since this is a hotspot. > > Source/WebCore/rendering/break_lines.cpp:194 > > +int nextBreakablePositionBreakOnNBSP(LazyLineBreakIterator& lazyBreakIterator, int pos) > > By OnNBSP? do you mean IncludingNBSP? I'm not quite sure what this function name means. Yep indeed, this name is bad. nextBreakablePositionBreakIncludingBreakOnNBSP ? Long but explicit.
(In reply to comment #3) > > That's a surprisingly small gain. > > How much do you want? :) > There are not really low hanging fruits left. > > > Did you compare the instrument samples to see what the % change of this function is? > > On the function itself, Instruments give ~9-10% faster. > > I guess could get better numbers with DTrace if you want exact numbers. Nah, that's fine. :) I remember seeing this function as being about 5% of total time. With your change I would expect it to be much less than 5% if we're seeing a 3% overall win. > > > Source/WebCore/rendering/break_lines.cpp:149 > > > +template<> > > > +inline bool needsLineBreakIterator<false>(UChar ch) > > > { > > > return ch > asciiLineBreakTableLastChar && ch != noBreakSpace; > > > } > > > > Can't we write this as a single function? the compiler should optimize out useless branches you should jsut be able to have if (treatNoBreakSpaceAsBreak) in the shared implementation and have the compiler do the right thing? > > It could in one case but I prefer to be explicit. > IMHO we should not rely blindly on compiler optimizations since this is a hotspot. I guess. :) It would be an extremely dumb compiler to not remove unreachable code! > > > Source/WebCore/rendering/break_lines.cpp:194 > > > +int nextBreakablePositionBreakOnNBSP(LazyLineBreakIterator& lazyBreakIterator, int pos) > > > > By OnNBSP? do you mean IncludingNBSP? I'm not quite sure what this function name means. > > Yep indeed, this name is bad. > > nextBreakablePositionBreakIncludingBreakOnNBSP ? > Long but explicit. Thinking about it more, I might just go the other way. Have the "including NBSP" one just be "nextBreakablePosition" and the non-NBSP one be "nextBreakablePositionIgnoringNBSP". Your call.
> > It could in one case but I prefer to be explicit. > > IMHO we should not rely blindly on compiler optimizations since this is a hotspot. > > I guess. :) It would be an extremely dumb compiler to not remove unreachable code! Reading back, I realize I totally misunderstood you. I originally thought you were asking to rewrite this in a way that the compiler know the ch test is useless. I figured, sure, that is gonna be tricky but should be possible. :) Now that I see what you really asked... sure, I can change to if (treatNoBreakSpaceAsBreak) :-D > Thinking about it more, I might just go the other way. Have the "including NBSP" one just be "nextBreakablePosition" and the non-NBSP one be "nextBreakablePositionIgnoringNBSP". Your call. Changing to nextBreakablePositionIgnoringNBSP() is ok with me.
Created attachment 162942 [details] Patch
Comment on attachment 162942 [details] Patch Assuming this is still a similar win, sounds great!
Committed r127974: <http://trac.webkit.org/changeset/127974>
The chrome bots get the expected improvements: http://webkit-perf.appspot.com/graph.html#tests=[[2947022,2001,3001]]&sel=none&displayrange=7&datatype=running The Mac ones do not: http://webkit-perf.appspot.com/graph.html#tests=[[2947022,2001,32196]]&sel=none&displayrange=7&datatype=running Locally, on Mac, I get the same results as the Chrome bots. I have no explanation for the bot results. :(