Math.random function implementation in async-fs repeatedly emits CheckAdd. And this makes PutGlobalVariable un-sunkable, and bloats many code that is not inherently necessary. 35 return function() { 36 // Robert Jenkins' 32 bit integer hash function. 37 seed = ((seed + 0x7ed55d16) + (seed << 12)) & 0xffffffff; 38 seed = ((seed ^ 0xc761c23c) ^ (seed >>> 19)) & 0xffffffff; 39 seed = ((seed + 0x165667b1) + (seed << 5)) & 0xffffffff; 40 seed = ((seed + 0xd3a2646c) ^ (seed << 9)) & 0xffffffff; 41 seed = ((seed + 0xfd7046c5) + (seed << 3)) & 0xffffffff; 42 seed = ((seed ^ 0xb55a4f09) ^ (seed >>> 16)) & 0xffffffff; 43 return (seed & 0xfffffff) / 0x10000000; 44 }; But seems that `& 0xffffffff` is saying that we only care lower 32bit. I would like to check how to remove the above check-add and make this efficient.
(In reply to Yusuke Suzuki from comment #0) > Math.random function implementation in async-fs repeatedly emits CheckAdd. > And this makes PutGlobalVariable un-sunkable, and bloats many code that is > not inherently necessary. > > 35 return function() { > 36 // Robert Jenkins' 32 bit integer hash function. > 37 seed = ((seed + 0x7ed55d16) + (seed << 12)) & 0xffffffff; > 38 seed = ((seed ^ 0xc761c23c) ^ (seed >>> 19)) & 0xffffffff; > 39 seed = ((seed + 0x165667b1) + (seed << 5)) & 0xffffffff; > 40 seed = ((seed + 0xd3a2646c) ^ (seed << 9)) & 0xffffffff; > 41 seed = ((seed + 0xfd7046c5) + (seed << 3)) & 0xffffffff; > 42 seed = ((seed ^ 0xb55a4f09) ^ (seed >>> 16)) & 0xffffffff; > 43 return (seed & 0xfffffff) / 0x10000000; > 44 }; > > But seems that `& 0xffffffff` is saying that we only care lower 32bit. I > would like to check how to remove the above check-add and make this > efficient. Maybe, we should do integer-range-optimization in B3 to convert CheckAdd to Add. Then, CSE should work.
Or, extend existing DFG IntegerRangeOptimization Phase.
After looking through the code, I think, 1. We can improve things in B3 by analyzing integer-range. 2. But in this case, we can easily improve it in DFG since DFG knows Int52Rep(Int32) semantics while it is encoded into shift in B3 And the most common case in general JS code with Int52 is `ArithAdd(Int52Rep(Int32), Int52Constant)`, which will be shown when we decide upgrading Int32 to Int52. After considering about it, I'm guessing this is soooooooooo common than I expected (maybe, the most common pattern for Int52 calculation?). If we can remove CheckOverflow from this, we could improve many programs (even in non-FTL) very easily. While (1) is attractive, just adding (2) adhocly also looks nice to me. (1) looks nice to me too, it could potentially improve bound-checked Wasm code...? I'm not sure.
Or, we can do this easily in B3, checking.
Do it in B3 first. I found that this is super easy.
Created attachment 379305 [details] Patch
Created attachment 379306 [details] Patch
<rdar://problem/55583016>
Comment on attachment 379306 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=379306&action=review Nice. r=me > Source/JavaScriptCore/ChangeLog:25 > + This converts many CheckAdd in JetStream2/async-fs's hot function simple Add, removes bunch of unnecessary instructions which exist because of this OSR exit. /function simple Add, removes bunch/function to simple Add, and removes a bunch/. > Source/JavaScriptCore/b3/testb3_5.cpp:950 > + CHECK(invoke<int64_t>(*code, value) == 2ll * static_cast<int32_t>(value)); Why is this static_cast necessary?
Comment on attachment 379306 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=379306&action=review >> Source/JavaScriptCore/b3/testb3_5.cpp:950 >> + CHECK(invoke<int64_t>(*code, value) == 2ll * static_cast<int32_t>(value)); > > Why is this static_cast necessary? This code is precisely simulating the B3's code, like, first performing SExt8, just for readability.
Committed r250198: <https://trac.webkit.org/changeset/250198>
Comment on attachment 379306 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=379306&action=review >>> Source/JavaScriptCore/b3/testb3_5.cpp:950 >>> + CHECK(invoke<int64_t>(*code, value) == 2ll * static_cast<int32_t>(value)); >> >> Why is this static_cast necessary? > > This code is precisely simulating the B3's code, like, first performing SExt8, just for readability. Yeah, but you don't do this for the SExt16 case below. I noticed the inconsistency and wondered why. It's not a big deal as the code behaves the same either way.
Comment on attachment 379306 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=379306&action=review >>>> Source/JavaScriptCore/b3/testb3_5.cpp:950 >>>> + CHECK(invoke<int64_t>(*code, value) == 2ll * static_cast<int32_t>(value)); >>> >>> Why is this static_cast necessary? >> >> This code is precisely simulating the B3's code, like, first performing SExt8, just for readability. > > Yeah, but you don't do this for the SExt16 case below. I noticed the inconsistency and wondered why. It's not a big deal as the code behaves the same either way. I see, I'll add the static_cast to SExt16 case too!
Committed r250199: <https://trac.webkit.org/changeset/250199>