Summary: | 8-bit version of weakCompareAndSwap() can cause an infinite loop | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Mark Lam <mark.lam> | ||||||||||||||||
Component: | Web Template Framework | Assignee: | Mark Lam <mark.lam> | ||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||
Severity: | Normal | CC: | benjamin, bfulgham, cmarcelo, commit-queue, fpizlo, ggaren, kevleyski, mmirman, msaboff, ossy, rniwa, webkit-bug-importer | ||||||||||||||||
Priority: | P2 | Keywords: | InRadar | ||||||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||||
OS: | Unspecified | ||||||||||||||||||
Bug Depends on: | |||||||||||||||||||
Bug Blocks: | 141290 | ||||||||||||||||||
Attachments: |
|
Description
Mark Lam
2015-03-09 16:18:04 PDT
Created attachment 248298 [details]
the patch.
Comment on attachment 248298 [details]
the patch.
Need to add test file to other ports.
Created attachment 248303 [details]
take 2: with fixes for windows project. Make doesn't seem to build testapi.
Comment on attachment 248303 [details] take 2: with fixes for windows project. Make doesn't seem to build testapi. View in context: https://bugs.webkit.org/attachment.cgi?id=248303&action=review > Source/WTF/wtf/Atomics.h:335 > + struct CompositeValue { > + public: > + CompositeValue(volatile unsigned* baseWordAddr, uint8_t replacementByte, uintptr_t replacementIndex) > + { > + u.word = *baseWordAddr; > + u.bytes[replacementIndex] = replacementByte; > + } > + > + unsigned word() { return u.word; } > + > + private: > + union { > + unsigned word; > + uint8_t bytes[sizeof(unsigned)]; > + } u; > + }; > + > + CompositeValue oldAlignedValue(alignedLocation, expected, locationOffset); > + CompositeValue newAlignedValue(alignedLocation, newValue, locationOffset); This is wrong. You load from *alignedLocation twice: once t get the side-bytes of the old value and once to get the side-bytes of the new value. This has a race. If the bytes other than the ones you are CASing inside the aligned word change between the two loads, you could cause badness. I suggest not using the CompositeValue helper. Instead, all that this needs is a compiler fence before the load of the oldAlignedValue in the original code. We're missing the path to "config.h" on the Windows project settings for testapi. I'll land a fix for that independent of this change. (In reply to comment #5) > Comment on attachment 248303 [details] > take 2: with fixes for windows project. Make doesn't seem to build testapi. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=248303&action=review > > > Source/WTF/wtf/Atomics.h:335 > > + struct CompositeValue { > > + public: > > + CompositeValue(volatile unsigned* baseWordAddr, uint8_t replacementByte, uintptr_t replacementIndex) > > + { > > + u.word = *baseWordAddr; > > + u.bytes[replacementIndex] = replacementByte; > > + } > > + > > + unsigned word() { return u.word; } > > + > > + private: > > + union { > > + unsigned word; > > + uint8_t bytes[sizeof(unsigned)]; > > + } u; > > + }; > > + > > + CompositeValue oldAlignedValue(alignedLocation, expected, locationOffset); > > + CompositeValue newAlignedValue(alignedLocation, newValue, locationOffset); > > This is wrong. You load from *alignedLocation twice: once t get the > side-bytes of the old value and once to get the side-bytes of the new value. > This has a race. If the bytes other than the ones you are CASing inside the > aligned word change between the two loads, you could cause badness. > > I suggest not using the CompositeValue helper. Instead, all that this needs > is a compiler fence before the load of the oldAlignedValue in the original > code. Here's the race that breaks your code: char buf[4]; // initial state buf[0] = 0; buf[1] = 0; buf[2] = 0; buf[3] = 0; Thread #1 takes the following steps: T1: 1) CAS(&buf[0], 0, 1); Thread #2 takes the following steps: T2: 1) buf[1] = 1; T2: 2) buf[1] = 0; We would expect the following to be valid outcomes of executing this program, regardless of the three possible interleavings. Here, "T1.1->T2.1" implies a particular sequencing of executions, so in particular it means that "T1 executes step 1 (i.e. the CAS) and then T2 executes step 1 (i.e. buf[1] = 1)". T1.1->T2.1->T2.2: buf == {1, 0, 0, 0} T2.1->T1.1->t2.2: buf == {1, 0, 0, 0} T2.1->T2.2->T1.1: buf == {1, 0, 0, 0} In particular there is no valid outcome with buf[1] != 0. But if the 1-byte CAS does what you do, then it actually has two loads and a CAS. The first load influences what we compare and the second load influences what we set the word to. Let's say that the CAS is decomposed into three steps: a) load the whole word, put in the expected byte b) load the whole word, put in the newValue byte c) do the word CAS This means that we will have the following evil interleaving: T1.1.a->T2.1->T2.2->T1.1.b->T1.1.c After T1.1.a executes, the expected value word will be {0, 0, 0, 0}. After T2.1, buf[1] == 1, but then T2.2 runs and then we have buf[1] == 0 again. Now T1.1.b runs and sets the new value word to {1, 1, 0, 0}. Then T1.1.c runs, and we CAS with expected = {0, 0, 0, 0} and newValue = {1, 1, 0, 0}. So the result is {1, 1, 0, 0,} which is wrong. I believe the composite value thing doesn't produce a net benefit. I think it makes it easier to make the sort of mistake that you made. CAS implementations should be as straight-line and free of abstractions as possible because we want to be able to see, by glancing at the code, how many loads there are. I believe that a correct implementation must only involve one load. The abstraction via CompositeLoad means that the code appears to have one load but you only realize that it executes twice by thinking about it a bit more carefully. There are two possible changes that make the current trunk code correct: 1) Put a compiler fence before and after the whole 8-byte CAS function - that is, the early return would instead become a control flow diamond and the whole 8-byte CAS function would look something like this: compilerFence(); ... address math oldAlignedValue = *load; if (u.bytes[thingy] == stuff) result = false; else { ... result = CAS(...); } compilerFence(); return result; The reason for both compiler fences is to prevent code motion into or out of the CAS implementation. The top compiler fence prevents hoisting the load. The bottom compiler fence prevents hoisting a store into the fast-exit path. This doesn't prevent any optimizations other than those that would be invalid. 2) Use an unconditional CAS like your change does, but without CompositeValue. You could just construct oldValue and newValue by storing a byte into the union and then loading the word out. This kind of very C-like simple implementation is trivial to verify by visual inspection. The upside is that it's easier to reason about why it's right. But there is a downside: the splicing of the old value, which trunk does not do, is probably just as expensive as that byte check. And if the check would fail (early return) in trunk, this new modified code would have to execute a full CAS. Also, I suspect that we still would need a compiler fence at the top to prevent removing the load. I worry that a clever compiler could CSE the load with an earlier load, and that could cause some subtle issues. I don't really care much if we use (1) or (2). But the CompositeValue abstraction feels dangerous to me - the less abstraction this code has, the more obvious it will be what the interleavings could be. Created attachment 248307 [details]
take 3: splice the expectedValue and newValue using the same oldValue
Created attachment 248308 [details]
take 4: fixed missing volatile
Comment on attachment 248308 [details] take 4: fixed missing volatile View in context: https://bugs.webkit.org/attachment.cgi?id=248308&action=review > Source/WTF/wtf/Atomics.h:-316 > - // Make sure that this load is always issued and never optimized away. I should probably restore this comment. Comment on attachment 248308 [details] take 4: fixed missing volatile View in context: https://bugs.webkit.org/attachment.cgi?id=248308&action=review > Source/WTF/wtf/Atomics.h:-316 > - // Make sure that this load is always issued and never optimized away. Why remove this comment? I think it's useful. Thanks for the review. The comment has been restored. Landed in r181305: <http://trac.webkit.org/r181305>. (In reply to comment #13) > Thanks for the review. The comment has been restored. > > Landed in r181305: <http://trac.webkit.org/r181305>. This patch broke Windows builds: CompareAndSwapTest.cpp ..\..\API\tests\CompareAndSwapTest.cpp(95): error C3861: 'pthread_exit': identifier not found [C:\cygwin\home\buildbot\slave\win-debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi.vcxproj] ..\..\API\tests\CompareAndSwapTest.cpp(102): error C2065: 'pthread_t' : undeclared identifier [C:\cygwin\home\buildbot\slave\win-debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi.vcxproj] ..\..\API\tests\CompareAndSwapTest.cpp(102): error C2146: syntax error : missing ';' before identifier 'threadIDs' [C:\cygwin\home\buildbot\slave\win-debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi.vcxproj] ..\..\API\tests\CompareAndSwapTest.cpp(102): error C2065: 'threadIDs' : undeclared identifier [C:\cygwin\home\buildbot\slave\win-debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi.vcxproj] ..\..\API\tests\CompareAndSwapTest.cpp(110): error C2065: 'threadIDs' : undeclared identifier [C:\cygwin\home\buildbot\slave\win-debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi.vcxproj] ..\..\API\tests\CompareAndSwapTest.cpp(110): error C3861: 'pthread_create': identifier not found [C:\cygwin\home\buildbot\slave\win-debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi.vcxproj] ..\..\API\tests\CompareAndSwapTest.cpp(115): error C2065: 'threadIDs' : undeclared identifier [C:\cygwin\home\buildbot\slave\win-debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi.vcxproj] ..\..\API\tests\CompareAndSwapTest.cpp(115): error C3861: 'pthread_join': identifier not found [C:\cygwin\home\buildbot\slave\win-debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi.vcxproj] (In reply to comment #14) > (In reply to comment #13) > > Thanks for the review. The comment has been restored. > > > > Landed in r181305: <http://trac.webkit.org/r181305>. > > This patch broke Windows builds: > > CompareAndSwapTest.cpp > ..\..\API\tests\CompareAndSwapTest.cpp(95): error C3861: 'pthread_exit': > identifier not found > [C:\cygwin\home\buildbot\slave\win- > debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi. > vcxproj] > ..\..\API\tests\CompareAndSwapTest.cpp(102): error C2065: 'pthread_t' : > undeclared identifier > [C:\cygwin\home\buildbot\slave\win- > debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi. > vcxproj] > ..\..\API\tests\CompareAndSwapTest.cpp(102): error C2146: syntax error : > missing ';' before identifier 'threadIDs' > [C:\cygwin\home\buildbot\slave\win- > debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi. > vcxproj] > ..\..\API\tests\CompareAndSwapTest.cpp(102): error C2065: 'threadIDs' : > undeclared identifier > [C:\cygwin\home\buildbot\slave\win- > debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi. > vcxproj] > ..\..\API\tests\CompareAndSwapTest.cpp(110): error C2065: 'threadIDs' : > undeclared identifier > [C:\cygwin\home\buildbot\slave\win- > debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi. > vcxproj] > ..\..\API\tests\CompareAndSwapTest.cpp(110): error C3861: 'pthread_create': > identifier not found > [C:\cygwin\home\buildbot\slave\win- > debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi. > vcxproj] > ..\..\API\tests\CompareAndSwapTest.cpp(115): error C2065: 'threadIDs' : > undeclared identifier > [C:\cygwin\home\buildbot\slave\win- > debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi. > vcxproj] > ..\..\API\tests\CompareAndSwapTest.cpp(115): error C3861: 'pthread_join': > identifier not found > [C:\cygwin\home\buildbot\slave\win- > debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi. > vcxproj] Yeah, Brent just told me this on the way out. I’m working on a fix. Created attachment 248314 [details]
follow up: Windows doesn't like pthreads
Windows fix landed in r181308: <http://trac.webkit.org/r181308>. Created attachment 248316 [details]
Speculative fix for Windows build.
The EWS bots are not cooperating. I went ahead and landed the speculative Windows fix in r181309: <http://trac.webkit.org/r181309>. Created attachment 248318 [details]
Build fix for Windows. This time, I've tested this on a local Windows box.
Final build fix for Windows landed in r181315: <http://trac.webkit.org/r181315>. (In reply to comment #17) > Windows fix landed in r181308: <http://trac.webkit.org/r181308>. It broke the JSC API tests on the Apple bots. (In reply to comment #22) > (In reply to comment #17) > > Windows fix landed in r181308: <http://trac.webkit.org/r181308>. > > It broke the JSC API tests on the Apple bots. Which ones? Can you provide a link please? (In reply to comment #23) > (In reply to comment #22) > > (In reply to comment #17) > > > Windows fix landed in r181308: <http://trac.webkit.org/r181308>. > > > > It broke the JSC API tests on the Apple bots. > > Which ones? Can you provide a link please? All Apple Mac bots on build.webkit.org (In reply to comment #24) > All Apple Mac bots on build.webkit.org I wasn’t seeing this on the dashboard except for the Mavericks debug bots. Anyway, it’s a trivial test only fix. Landed in r181319: <http://trac.webkit.org/r181319>. (In reply to comment #25) > (In reply to comment #24) > > All Apple Mac bots on build.webkit.org > > I wasn’t seeing this on the dashboard except for the Mavericks debug bots. > Anyway, it’s a trivial test only fix. Ah, because the JSC test coverage is poor on the dashboard: - CLOOP and 32 bit bots aren't included - Yosemite WK1 and WK2 bots don't run JSC tests and the dedicated JSC tester bot isn't on dashboard - Only the Mavericks WK1 bots can be found on it > Landed in r181319: <http://trac.webkit.org/r181319>. Thanks for the quick fix. Thanks also |