| Summary: | WebKit crashes at operationPutByIdNonStrictBuildList when logging into http://roxtecbuild.factor10labs.com | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | WebKit | Reporter: | Matthew Mirman <mmirman> | ||||||
| Component: | JavaScriptCore | Assignee: | Matthew Mirman <mmirman> | ||||||
| Status: | RESOLVED FIXED | ||||||||
| Severity: | Major | CC: | dbates, ddkilzer, fpizlo, ggaren, mmirman, ossy | ||||||
| Priority: | P1 | Keywords: | InRadar | ||||||
| Version: | 528+ (Nightly build) | ||||||||
| Hardware: | iPhone / iPad | ||||||||
| OS: | All | ||||||||
| URL: | http://roxtecbuild.factor10labs.com | ||||||||
| Bug Depends on: | 143368 | ||||||||
| Bug Blocks: | |||||||||
| Attachments: |
|
||||||||
|
Description
Matthew Mirman
2015-02-11 13:36:34 PST
Created attachment 247009 [details]
removed register r9 from arm callee save register list
This should be a conservative change.
Comment on attachment 247009 [details] removed register r9 from arm callee save register list View in context: https://bugs.webkit.org/attachment.cgi?id=247009&action=review > Source/JavaScriptCore/ChangeLog:3 > + r9 isn't necessarily a callee save register on ARMv7. So, is it or isn't it? > Source/JavaScriptCore/jit/RegisterSet.cpp:-89 > - result.set(ARMRegisters::r9); Currently, in theory, we could use the callee save list to control two separate things: 1) Deciding which registers don't need to be saved at callsites. 2) Deciding which registers must be saved inside a callee. If it's possible for some of our callers to ever think that r9 is callee-save, then removing it from the list will only introduce more subtle bugs: we will incorrectly fail to save r9 in some cases. So, you need to determine if the iOS ARM64 ABI treat r9 as never-callee-save or sometimes-callee-save. This fix is only correct if the iOS ARM64 ABI treats r9 as never-callee-save. I'd also check if our Linux friends are doing ARM64, and if they are, then you need to CC them. This fix would break them if Linux was r9-always-callee-save or sometimes-callee-save. This patch needs a regression test. Comment on attachment 247009 [details]
removed register r9 from arm callee save register list
I'm going to mark this r- because:
1) As ggaren says, this needs a regression test.
2) If by "isn't necessarily callee-save" you mean that it sometimes might be a callee-save, then this isn't the right fix. It will only introduce other crashes, because in places where we use the callee save list to decide which registers to save inside a callee, this will fail to save r9 and if some callers might expect us to save it them we'll break them. Maybe the right answer is to split the calleeSaveRegisters into a set of things that are guaranteed preserved and a set of things that must be saved. Or, this needs to be conditionalized to still do the right thing on those platforms or configurations where r9 is always (or sometimes) callee save. Or, if you can prove that r9 is actually never callee-save on those iOS ARM64 then maybe this fix is fine so long as we CC our Linux friends to that they can also investigate what to do with r9.
Comment on attachment 247009 [details] removed register r9 from arm callee save register list View in context: https://bugs.webkit.org/attachment.cgi?id=247009&action=review >> Source/JavaScriptCore/jit/RegisterSet.cpp:-89 >> - result.set(ARMRegisters::r9); > > Currently, in theory, we could use the callee save list to control two separate things: > > 1) Deciding which registers don't need to be saved at callsites. > 2) Deciding which registers must be saved inside a callee. > > If it's possible for some of our callers to ever think that r9 is callee-save, then removing it from the list will only introduce more subtle bugs: we will incorrectly fail to save r9 in some cases. > > So, you need to determine if the iOS ARM64 ABI treat r9 as never-callee-save or sometimes-callee-save. This fix is only correct if the iOS ARM64 ABI treats r9 as never-callee-save. > > I'd also check if our Linux friends are doing ARM64, and if they are, then you need to CC them. This fix would break them if Linux was r9-always-callee-save or sometimes-callee-save. Oh, this has nothing to do with ARM64. It's ARMv7. Ossy: we're about to remove r9 from the callee save register list for ARMv7 because our documentation says that it's not a callee-save. Can you guys check if it is a callee-save on your ABIs? The documentation for this is very confusing. (In reply to comment #6) > Ossy: we're about to remove r9 from the callee save register list for ARMv7 > because our documentation says that it's not a callee-save. Can you guys > check if it is a callee-save on your ABIs? The documentation for this is > very confusing. Sure, will check on monday morning (CET timezone) Something is weird here. ARM traditional doesn't have calleeSaveRegisters() implementation (else case). And then I tried to remove everything from ARM Thumb2 case and add an UNREACHABLE_FOR_PLATFORM() macro, and tests still pass. It seems this code isn't used at all on ARM Linux. Will check it in detail soon. (In reply to comment #8) > Something is weird here. > > ARM traditional doesn't have calleeSaveRegisters() implementation (else > case). > And then I tried to remove everything from ARM Thumb2 case and add an > UNREACHABLE_FOR_PLATFORM() macro, and tests still pass. It seems this > code isn't used at all on ARM Linux. Will check it in detail soon. Ah, UNREACHABLE_FOR_PLATFORM() is debug only crash. But removing the register list on ARM Thumb2 Linux doesn't cause any regression on the JSC stress tests. (In reply to comment #9) > (In reply to comment #8) > > Something is weird here. > > > > ARM traditional doesn't have calleeSaveRegisters() implementation (else > > case). > > And then I tried to remove everything from ARM Thumb2 case and add an > > UNREACHABLE_FOR_PLATFORM() macro, and tests still pass. It seems this > > code isn't used at all on ARM Linux. Will check it in detail soon. > > Ah, UNREACHABLE_FOR_PLATFORM() is debug only crash. But removing > the register list on ARM Thumb2 Linux doesn't cause any regression > on the JSC stress tests. Ugh. So many things wrong with this picture. 1) Why isn't this crashing in debug? Maybe we don't have good enough test coverage for this path. It is admittedly a rare path, but still, we should be testing it. 2) why isn't UNREACHABLE_FOR_PLATFORM a release assert? I had changed it to be that way and had been using it that way - it's a shortcut for RELEASE_ASSERT_NOT_REACHED that also turns off clang's reachability warnings. 3) you're actually right that currently if we disable FTL then calleeSaveRegisters can totally be the empty set. This is because there are two possible uses of the set: (a) knowing which registers must be preserved inside a caller and (b) knowing which registers a caller can rely on the caller preserving. Without FTL, we happen to only use the set for (b). It's of course safe to assume that the caller preserves nothing, but you risk bad performance. Also this could turn into a very hard to debug issue if we wrote some non-FTL code that needed to use the set for (a). Can you help with this? We should fix these things. Can you confirm that we really don't see debug failures on that unreachable path? (In reply to comment #10) > Ugh. So many things wrong with this picture. > > 1) Why isn't this crashing in debug? Maybe we don't have good enough test > coverage for this path. It is admittedly a rare path, but still, we should > be testing it. I haven't checked full debug tests, because it is very slow.The problem was that I thought previously it is a release assert. I started a debug test now, let's see the real coverage. > 2) why isn't UNREACHABLE_FOR_PLATFORM a release assert? I had changed it to > be that way and had been using it that way - it's a shortcut for > RELEASE_ASSERT_NOT_REACHED that also turns off clang's reachability > warnings. I agree, it should be release assert, but it isn't. UNREACHABLE_FOR_PLATFORM is CRASH() if COMPILER(CLANG), but ASSERT_NOT_REACHED() otherwise which is simple ((void)0) in release mode. It seems you changed only the clang case to CRASH() in r168459. If you agree, we could change the not-clang case too. I would prefer RELEASE_ASSERT_NOT_REACHED() in both cases, because it has more verbose output in debug mode than a simple CRASH(). > 3) you're actually right that currently if we disable FTL then > calleeSaveRegisters can totally be the empty set. This is because there are > two possible uses of the set: (a) knowing which registers must be preserved > inside a caller and (b) knowing which registers a caller can rely on the > caller preserving. Without FTL, we happen to only use the set for (b). It's > of course safe to assume that the caller preserves nothing, but you risk bad > performance. Also this could turn into a very hard to debug issue if we > wrote some non-FTL code that needed to use the set for (a). Can you help > with this? Now I understand why it doesn't cause any problem now, there is no FTL on 32 bit and the non-FTL code doesn't use the case (a). Sure, I'll check the details of ARM Thumb2 and Traditional on Linux and add proper register list here to avoid possible bugs in the future. > We should fix these things. Can you confirm that we really don't see debug > failures on that unreachable path? I agree, I'm on it. (In reply to comment #11) > (In reply to comment #10) > > Ugh. So many things wrong with this picture. > > > > 1) Why isn't this crashing in debug? Maybe we don't have good enough test > > coverage for this path. It is admittedly a rare path, but still, we should > > be testing it. > > I haven't checked full debug tests, because it is very slow.The > problem was that I thought previously it is a release assert. I > started a debug test now, let's see the real coverage. Gotcha. As an aside, it's for this reason that I think that in core parts of the engine - especially JSC - it's super profitable to have release asserts whenever possible. Running tests on embedded systems is so expensive that sometimes we only get meaningful coverage from release asserts. I'm in favor of any change that turns a debug assert into a release assert if it comes with some evidence that performance is unaffected. > > > 2) why isn't UNREACHABLE_FOR_PLATFORM a release assert? I had changed it to > > be that way and had been using it that way - it's a shortcut for > > RELEASE_ASSERT_NOT_REACHED that also turns off clang's reachability > > warnings. > > I agree, it should be release assert, but it isn't. UNREACHABLE_FOR_PLATFORM > is CRASH() if COMPILER(CLANG), but ASSERT_NOT_REACHED() otherwise which is > simple ((void)0) in release mode. It seems you changed only the clang case > to CRASH() in r168459. If you agree, we could change the not-clang case too. > I would prefer RELEASE_ASSERT_NOT_REACHED() in both cases, because it has > more verbose output in debug mode than a simple CRASH(). I agree with this. It should be RELEASE_ASSERT_NOT_REACHED(). > > > 3) you're actually right that currently if we disable FTL then > > calleeSaveRegisters can totally be the empty set. This is because there are > > two possible uses of the set: (a) knowing which registers must be preserved > > inside a caller and (b) knowing which registers a caller can rely on the > > caller preserving. Without FTL, we happen to only use the set for (b). It's > > of course safe to assume that the caller preserves nothing, but you risk bad > > performance. Also this could turn into a very hard to debug issue if we > > wrote some non-FTL code that needed to use the set for (a). Can you help > > with this? > > Now I understand why it doesn't cause any problem now, there is > no FTL on 32 bit and the non-FTL code doesn't use the case (a). > Sure, I'll check the details of ARM Thumb2 and Traditional on Linux > and add proper register list here to avoid possible bugs in the future. Thanks! > > > We should fix these things. Can you confirm that we really don't see debug > > failures on that unreachable path? > > I agree, I'm on it. I ran a full debug test on ARM Traditional and Thumb2 too at night, everything work fine, except UNREACHABLE_FOR_PLATFORM() hit in calleeSaveRegisters() on ARM Traditional in ~30% of the tests. So the test coverage seems to be good here. :) And I checked the documentation about r9. AAPCS doensn't specify clearly if it is callee-saved or not, but I think it's safe to consider to be callee-saved on Linux. I haven't found any example for the opposite case. (+info: https://bugs.webkit.org/show_bug.cgi?id=141903#c2 ) Additionally I filed new bug reports to - add ARM traditional implementation of calleeSaveRegisters() - bug141903 - change UNREACHABLE_FOR_PLATFORM() to release assert - bug141904 Created attachment 247139 [details]
removed register r9 from arm callee save register list
Added a test case.
Comment on attachment 247139 [details]
removed register r9 from arm callee save register list
r=me.
Please rename the test "regress-141489.js". Feel free to add a comment to the test that you are checking that r9 is no longer a callee save.
Comment on attachment 247139 [details] removed register r9 from arm callee save register list Patch landed: http://trac.webkit.org/changeset/180516 Closing bug. |