WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
99213
[ARMv7] Neither linkCall() nor linkPointer() should flush code
https://bugs.webkit.org/show_bug.cgi?id=99213
Summary
[ARMv7] Neither linkCall() nor linkPointer() should flush code
Yong Li
Reported
2012-10-12 15:04:31 PDT
LinkBuffer calls linkCall() and linkPointer(), which ends up with code flushing on ARMv7. However, this doesn't look necessary, as the code will be finally flushed anyway. A simple solution is to pass an argument "flush" to linkCall() and linkPointer(), and suppress code flushing when it is called from LinkBuffer. This gives "noticeable" boost on sunspider result. But I'm not sure if this is the best solution.
Attachments
the patch
(3.80 KB, patch)
2012-10-13 14:16 PDT
,
Yong Li
no flags
Details
Formatted Diff
Diff
Add the fix to replaceWith* methods
(4.92 KB, patch)
2012-10-15 11:10 PDT
,
Yong Li
no flags
Details
Formatted Diff
Diff
again
(4.92 KB, patch)
2012-10-17 09:25 PDT
,
Yong Li
no flags
Details
Formatted Diff
Diff
Show Obsolete
(2)
View All
Add attachment
proposed patch, testcase, etc.
Yong Li
Comment 1
2012-10-12 15:11:45 PDT
A better plan: For ARMv7 only, make linkCall/linkPointer not flash code by default. make other calls to linkCall/linkPointer in ARMv7 use explicit flags to flush code...
Yong Li
Comment 2
2012-10-13 14:16:32 PDT
Created
attachment 168565
[details]
the patch
Yong Li
Comment 3
2012-10-15 08:38:45 PDT
I also found replaceWithLoad and replaceWithAddressComputation can unnecessarily flush code. going to fix it in another patch
Yong Li
Comment 4
2012-10-15 10:23:38 PDT
Comment on
attachment 168565
[details]
the patch hm... Not only LinkBuffer can call linkCall
Yong Li
Comment 5
2012-10-15 11:10:28 PDT
Created
attachment 168744
[details]
Add the fix to replaceWith* methods actually it was false alarm. Also, linkCall never flushes code in ARM/SHA/MIPS ports. This is pure ARMv7 bug
Geoffrey Garen
Comment 6
2012-10-15 12:21:27 PDT
> A better plan: For ARMv7 only, make linkCall/linkPointer not flash code by default.
Why make this optimization CPU-specific?
Yong Li
Comment 7
2012-10-15 12:23:15 PDT
(In reply to
comment #6
)
> > A better plan: For ARMv7 only, make linkCall/linkPointer not flash code by default. > > Why make this optimization CPU-specific?
This is only an issue in ARMv7Assembler. Other assemblers don't have this problem. x86/x64 don't need to flush code. ARM-traditional/mips/sh4 don't flush code in linkCall()
Geoffrey Garen
Comment 8
2012-10-15 16:06:49 PDT
> > Why make this optimization CPU-specific? > > This is only an issue in ARMv7Assembler. Other assemblers don't have this problem. x86/x64 don't need to flush code. ARM-traditional/mips/sh4 don't flush code in linkCall()
It seems brittle, and a layering violation, for the ARMv7 assembler to guess that calls like linkCall() and linkPointer() will always be followed by a call to cacheFlush(). Even though you believe, based on reading the implementation details of all our assemblers, that only ARMv7 needs this particular optimization at this time, and that the optimization is correct for the current implementation of LinkBuffer, the goal of our MacroAssembler and LinkBuffer interfaces is to abstract away those details so we can make future changes without thinking about all platforms at once. Is there a major blocker that forces this code to be ARMv7-specific?
Yong Li
Comment 9
2012-10-16 07:40:28 PDT
(In reply to
comment #8
)
> > > Why make this optimization CPU-specific? > > > > This is only an issue in ARMv7Assembler. Other assemblers don't have this problem. x86/x64 don't need to flush code. ARM-traditional/mips/sh4 don't flush code in linkCall() > > It seems brittle, and a layering violation, for the ARMv7 assembler to guess that calls like linkCall() and linkPointer() will always be followed by a call to cacheFlush(). > > Even though you believe, based on reading the implementation details of all our assemblers, that only ARMv7 needs this particular optimization at this time, and that the optimization is correct for the current implementation of LinkBuffer, the goal of our MacroAssembler and LinkBuffer interfaces is to abstract away those details so we can make future changes without thinking about all platforms at once. > > Is there a major blocker that forces this code to be ARMv7-specific?
Maybe the title is confusing. I thought It was a more generic issue at the very beginning. However after reading more code, I found that it is not LinkBuffer's fault at all. I believe that none of the link*() calls need/should flush code. LinkBuffer will flush the entire assembly at the finalization step. However, those relink*/repatch*/replace* calls must flush code (except for x86 families). None of other assemblers flush code in linkCall() and linkPointer(), and ARMv7Assembler doesn't flush in linkJump(). So I think we can call it a bug in ARMv7Assembler, and it is an accident that linkCall/linkPointer finally flushes code through setInt32.
Yong Li
Comment 10
2012-10-17 09:25:13 PDT
Created
attachment 169195
[details]
again
Yong Li
Comment 11
2012-10-18 10:42:08 PDT
Ping reviewers... has anyone tried it on iOS?
Gavin Barraclough
Comment 12
2012-10-26 17:51:46 PDT
Hi Yong, Nice catch – but I think there's a slightly better fix if you feel like taking it on. Your patch fixes the problem that we'll flush ints/pointers more than once during the link phase, but there is a related issue that we sometimes flush the same cache line more than once when repatching code. I'd suggest changing the MacroAssembler::repatch* etc methods in all assemblers to return the range of code modified (start address & end/size as a pair), and have the RepatchBuffer cache all required flushes & only perform them once all modifications have been performed. When buffering up the flushes, we can round to cache line size, then ditch redundant flushes & coalesce flushes to adjacent cache lines. (minor detail: JITWriteBarrier & JumpReplacementWatchpoint currently aren't using RepatchBuffer & should be – that will also need to be fixed). I won't r+ for now, since I'd rather see the fix I've described, but if you don't feel like taking this on we could take your patch as is & revisit – would just be unfortunate since some of this change would be redundant & end up being reverted.
Benjamin Poulain
Comment 13
2012-10-26 19:08:09 PDT
Comment on
attachment 169195
[details]
again View in context:
https://bugs.webkit.org/attachment.cgi?id=169195&action=review
I remove the review flag for now. I think this is a nice patch but I would really like it to be completely generic. ARM is not the only platform without completely consistent caches. I think we should follow Gavin's advices here. Yong, any opinion?
> Source/JavaScriptCore/ChangeLog:10 > + LinkBuffer doesn't need to flush code during linking. It will > + eventually flush the whole executable. Fixing this gives >%5 > + sunspider boost (on QNX).
After profiling on hardware, I find it strange you get so much improvement from this. It is pretty far from a hotspot. It is still a good idea to fix this though.
Yong Li
Comment 14
2012-10-26 21:15:45 PDT
(In reply to
comment #12
)
> Hi Yong, > > Nice catch – but I think there's a slightly better fix if you feel like taking it on. > > Your patch fixes the problem that we'll flush ints/pointers more than once during the link phase, but there is a related issue that we sometimes flush the same cache line more than once when repatching code. I'd suggest changing the MacroAssembler::repatch* etc methods in all assemblers to return the range of code modified (start address & end/size as a pair), and have the RepatchBuffer cache all required flushes & only perform them once all modifications have been performed. When buffering up the flushes, we can round to cache line size, then ditch redundant flushes & coalesce flushes to adjacent cache lines.
Actually I did try a very similar solution with RepatchBuffer. However it didn't show any boost on benchmarks. I could try improving that patch when time allowing. But I think the bug should be separated as this one is only an issue in ARMv7Assembler, and repatching is generic. Even now I had a ready patch for the repatching one, I still think it should be a separate bug.
> > (minor detail: JITWriteBarrier & JumpReplacementWatchpoint currently aren't using RepatchBuffer & should be – that will also need to be fixed).
Thanks. Very helpful info :)
Yong Li
Comment 15
2012-10-29 08:04:34 PDT
(In reply to
comment #13
)
> (From update of
attachment 169195
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=169195&action=review
> > I remove the review flag for now. > I think this is a nice patch but I would really like it to be completely generic. ARM is not the only platform without completely consistent caches. > > I think we should follow Gavin's advices here. Yong, any opinion? >
I still think it is an ARMV-v7 only bug, as none of other assemblers' linkCall/linkPointer does cache flushing, nor ARMv7's linkJump() does. So it is very clear to me it is just an independent bug rather than a generic optimization. I could modify the changelog if needed. We can create another bug for the potential duplicate cache flushing during "repatching". So I'll add back the review request.
Yong Li
Comment 16
2012-11-13 09:18:40 PST
The patch is pending so long here. It seems there is no objection with this patch itself. Although it doesn't give so much boost on other platforms, it is still a proper fix, right?
George Staikos
Comment 17
2012-11-19 20:53:35 PST
Comment on
attachment 169195
[details]
again No complaints, it looks good, and gives improvements on at least one platform.
WebKit Review Bot
Comment 18
2012-11-20 08:33:12 PST
Comment on
attachment 169195
[details]
again Clearing flags on attachment: 169195 Committed
r135286
: <
http://trac.webkit.org/changeset/135286
>
WebKit Review Bot
Comment 19
2012-11-20 08:33:17 PST
All reviewed patches have been landed. Closing bug.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug