WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
142513
8-bit version of weakCompareAndSwap() can cause an infinite loop
https://bugs.webkit.org/show_bug.cgi?id=142513
Summary
8-bit version of weakCompareAndSwap() can cause an infinite loop
Mark Lam
Reported
2015-03-09 16:18:04 PDT
Presently, Bitmap::concurrentTestAndSet() uses the 8-bit version of weakCompareAndSwap() (which compares and swaps an uint8_t value). Bitmap::concurrentTestAndSet() has a loop that checks if a bit in the byte of interest has been set. If not, it will call the 8-bit CAS function to set the bit. Under the covers, the 8-bit CAS function actually works with a 32-bit CAS. The 8-bit CAS will first fetch the 32-bit value in memory that contains the 8-bit value, and check if it contains the expected byte. If the value in memory doesn't have the expected byte, it will return early to its caller. The expectation is that the caller will reload the byte from memory and call the 8-bit CAS again. Unfortunately, this code path that returns early does not have a compiler fence. Without that compiler fence, the C++ compiler can optimize away the reloading of the expected byte value, leaving it unchanged. As a result, we'll have a infinite loop here that checks a value that will never change, and the loop will not terminate until the value changes. The fix is to eliminate the early return check in the 8-bit CAS, and have it always call down to the 32-bit CAS. The 32-bit CAS has a compiler fence which will prevent this issue.
Attachments
the patch.
(13.04 KB, patch)
2015-03-09 17:36 PDT
,
Mark Lam
mark.lam
: review-
Details
Formatted Diff
Diff
take 2: with fixes for windows project. Make doesn't seem to build testapi.
(14.48 KB, patch)
2015-03-09 17:59 PDT
,
Mark Lam
fpizlo
: review-
fpizlo
: commit-queue-
Details
Formatted Diff
Diff
take 3: splice the expectedValue and newValue using the same oldValue
(14.47 KB, patch)
2015-03-09 19:09 PDT
,
Mark Lam
no flags
Details
Formatted Diff
Diff
take 4: fixed missing volatile
(14.41 KB, patch)
2015-03-09 19:11 PDT
,
Mark Lam
fpizlo
: review+
Details
Formatted Diff
Diff
follow up: Windows doesn't like pthreads
(2.54 KB, patch)
2015-03-09 21:05 PDT
,
Mark Lam
achristensen
: review+
Details
Formatted Diff
Diff
Speculative fix for Windows build.
(1.04 KB, patch)
2015-03-09 21:37 PDT
,
Mark Lam
no flags
Details
Formatted Diff
Diff
Build fix for Windows. This time, I've tested this on a local Windows box.
(7.65 KB, patch)
2015-03-09 23:40 PDT
,
Mark Lam
achristensen
: review+
Details
Formatted Diff
Diff
Show Obsolete
(4)
View All
Add attachment
proposed patch, testcase, etc.
Mark Lam
Comment 1
2015-03-09 16:19:09 PDT
<
rdar://problem/20028910
>
Mark Lam
Comment 2
2015-03-09 17:36:41 PDT
Created
attachment 248298
[details]
the patch.
Mark Lam
Comment 3
2015-03-09 17:51:33 PDT
Comment on
attachment 248298
[details]
the patch. Need to add test file to other ports.
Mark Lam
Comment 4
2015-03-09 17:59:28 PDT
Created
attachment 248303
[details]
take 2: with fixes for windows project. Make doesn't seem to build testapi.
Filip Pizlo
Comment 5
2015-03-09 18:20:46 PDT
Comment on
attachment 248303
[details]
take 2: with fixes for windows project. Make doesn't seem to build testapi. View in context:
https://bugs.webkit.org/attachment.cgi?id=248303&action=review
> Source/WTF/wtf/Atomics.h:335 > + struct CompositeValue { > + public: > + CompositeValue(volatile unsigned* baseWordAddr, uint8_t replacementByte, uintptr_t replacementIndex) > + { > + u.word = *baseWordAddr; > + u.bytes[replacementIndex] = replacementByte; > + } > + > + unsigned word() { return u.word; } > + > + private: > + union { > + unsigned word; > + uint8_t bytes[sizeof(unsigned)]; > + } u; > + }; > + > + CompositeValue oldAlignedValue(alignedLocation, expected, locationOffset); > + CompositeValue newAlignedValue(alignedLocation, newValue, locationOffset);
This is wrong. You load from *alignedLocation twice: once t get the side-bytes of the old value and once to get the side-bytes of the new value. This has a race. If the bytes other than the ones you are CASing inside the aligned word change between the two loads, you could cause badness. I suggest not using the CompositeValue helper. Instead, all that this needs is a compiler fence before the load of the oldAlignedValue in the original code.
Brent Fulgham
Comment 6
2015-03-09 18:23:27 PDT
We're missing the path to "config.h" on the Windows project settings for testapi. I'll land a fix for that independent of this change.
Filip Pizlo
Comment 7
2015-03-09 18:49:11 PDT
(In reply to
comment #5
)
> Comment on
attachment 248303
[details]
> take 2: with fixes for windows project. Make doesn't seem to build testapi. > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=248303&action=review
> > > Source/WTF/wtf/Atomics.h:335 > > + struct CompositeValue { > > + public: > > + CompositeValue(volatile unsigned* baseWordAddr, uint8_t replacementByte, uintptr_t replacementIndex) > > + { > > + u.word = *baseWordAddr; > > + u.bytes[replacementIndex] = replacementByte; > > + } > > + > > + unsigned word() { return u.word; } > > + > > + private: > > + union { > > + unsigned word; > > + uint8_t bytes[sizeof(unsigned)]; > > + } u; > > + }; > > + > > + CompositeValue oldAlignedValue(alignedLocation, expected, locationOffset); > > + CompositeValue newAlignedValue(alignedLocation, newValue, locationOffset); > > This is wrong. You load from *alignedLocation twice: once t get the > side-bytes of the old value and once to get the side-bytes of the new value. > This has a race. If the bytes other than the ones you are CASing inside the > aligned word change between the two loads, you could cause badness. > > I suggest not using the CompositeValue helper. Instead, all that this needs > is a compiler fence before the load of the oldAlignedValue in the original > code.
Here's the race that breaks your code: char buf[4]; // initial state buf[0] = 0; buf[1] = 0; buf[2] = 0; buf[3] = 0; Thread #1 takes the following steps: T1: 1) CAS(&buf[0], 0, 1); Thread #2 takes the following steps: T2: 1) buf[1] = 1; T2: 2) buf[1] = 0; We would expect the following to be valid outcomes of executing this program, regardless of the three possible interleavings. Here, "T1.1->T2.1" implies a particular sequencing of executions, so in particular it means that "T1 executes step 1 (i.e. the CAS) and then T2 executes step 1 (i.e. buf[1] = 1)". T1.1->T2.1->T2.2: buf == {1, 0, 0, 0} T2.1->T1.1->t2.2: buf == {1, 0, 0, 0} T2.1->T2.2->T1.1: buf == {1, 0, 0, 0} In particular there is no valid outcome with buf[1] != 0. But if the 1-byte CAS does what you do, then it actually has two loads and a CAS. The first load influences what we compare and the second load influences what we set the word to. Let's say that the CAS is decomposed into three steps: a) load the whole word, put in the expected byte b) load the whole word, put in the newValue byte c) do the word CAS This means that we will have the following evil interleaving: T1.1.a->T2.1->T2.2->T1.1.b->T1.1.c After T1.1.a executes, the expected value word will be {0, 0, 0, 0}. After T2.1, buf[1] == 1, but then T2.2 runs and then we have buf[1] == 0 again. Now T1.1.b runs and sets the new value word to {1, 1, 0, 0}. Then T1.1.c runs, and we CAS with expected = {0, 0, 0, 0} and newValue = {1, 1, 0, 0}. So the result is {1, 1, 0, 0,} which is wrong.
Filip Pizlo
Comment 8
2015-03-09 18:59:52 PDT
I believe the composite value thing doesn't produce a net benefit. I think it makes it easier to make the sort of mistake that you made. CAS implementations should be as straight-line and free of abstractions as possible because we want to be able to see, by glancing at the code, how many loads there are. I believe that a correct implementation must only involve one load. The abstraction via CompositeLoad means that the code appears to have one load but you only realize that it executes twice by thinking about it a bit more carefully. There are two possible changes that make the current trunk code correct: 1) Put a compiler fence before and after the whole 8-byte CAS function - that is, the early return would instead become a control flow diamond and the whole 8-byte CAS function would look something like this: compilerFence(); ... address math oldAlignedValue = *load; if (u.bytes[thingy] == stuff) result = false; else { ... result = CAS(...); } compilerFence(); return result; The reason for both compiler fences is to prevent code motion into or out of the CAS implementation. The top compiler fence prevents hoisting the load. The bottom compiler fence prevents hoisting a store into the fast-exit path. This doesn't prevent any optimizations other than those that would be invalid. 2) Use an unconditional CAS like your change does, but without CompositeValue. You could just construct oldValue and newValue by storing a byte into the union and then loading the word out. This kind of very C-like simple implementation is trivial to verify by visual inspection. The upside is that it's easier to reason about why it's right. But there is a downside: the splicing of the old value, which trunk does not do, is probably just as expensive as that byte check. And if the check would fail (early return) in trunk, this new modified code would have to execute a full CAS. Also, I suspect that we still would need a compiler fence at the top to prevent removing the load. I worry that a clever compiler could CSE the load with an earlier load, and that could cause some subtle issues. I don't really care much if we use (1) or (2). But the CompositeValue abstraction feels dangerous to me - the less abstraction this code has, the more obvious it will be what the interleavings could be.
Mark Lam
Comment 9
2015-03-09 19:09:44 PDT
Created
attachment 248307
[details]
take 3: splice the expectedValue and newValue using the same oldValue
Mark Lam
Comment 10
2015-03-09 19:11:43 PDT
Created
attachment 248308
[details]
take 4: fixed missing volatile
Mark Lam
Comment 11
2015-03-09 19:12:52 PDT
Comment on
attachment 248308
[details]
take 4: fixed missing volatile View in context:
https://bugs.webkit.org/attachment.cgi?id=248308&action=review
> Source/WTF/wtf/Atomics.h:-316 > - // Make sure that this load is always issued and never optimized away.
I should probably restore this comment.
Filip Pizlo
Comment 12
2015-03-09 19:12:55 PDT
Comment on
attachment 248308
[details]
take 4: fixed missing volatile View in context:
https://bugs.webkit.org/attachment.cgi?id=248308&action=review
> Source/WTF/wtf/Atomics.h:-316 > - // Make sure that this load is always issued and never optimized away.
Why remove this comment? I think it's useful.
Mark Lam
Comment 13
2015-03-09 19:21:18 PDT
Thanks for the review. The comment has been restored. Landed in
r181305
: <
http://trac.webkit.org/r181305
>.
Ryosuke Niwa
Comment 14
2015-03-09 20:13:16 PDT
(In reply to
comment #13
)
> Thanks for the review. The comment has been restored. > > Landed in
r181305
: <
http://trac.webkit.org/r181305
>.
This patch broke Windows builds: CompareAndSwapTest.cpp ..\..\API\tests\CompareAndSwapTest.cpp(95): error C3861: 'pthread_exit': identifier not found [C:\cygwin\home\buildbot\slave\win-debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi.vcxproj] ..\..\API\tests\CompareAndSwapTest.cpp(102): error C2065: 'pthread_t' : undeclared identifier [C:\cygwin\home\buildbot\slave\win-debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi.vcxproj] ..\..\API\tests\CompareAndSwapTest.cpp(102): error C2146: syntax error : missing ';' before identifier 'threadIDs' [C:\cygwin\home\buildbot\slave\win-debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi.vcxproj] ..\..\API\tests\CompareAndSwapTest.cpp(102): error C2065: 'threadIDs' : undeclared identifier [C:\cygwin\home\buildbot\slave\win-debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi.vcxproj] ..\..\API\tests\CompareAndSwapTest.cpp(110): error C2065: 'threadIDs' : undeclared identifier [C:\cygwin\home\buildbot\slave\win-debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi.vcxproj] ..\..\API\tests\CompareAndSwapTest.cpp(110): error C3861: 'pthread_create': identifier not found [C:\cygwin\home\buildbot\slave\win-debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi.vcxproj] ..\..\API\tests\CompareAndSwapTest.cpp(115): error C2065: 'threadIDs' : undeclared identifier [C:\cygwin\home\buildbot\slave\win-debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi.vcxproj] ..\..\API\tests\CompareAndSwapTest.cpp(115): error C3861: 'pthread_join': identifier not found [C:\cygwin\home\buildbot\slave\win-debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi.vcxproj]
Mark Lam
Comment 15
2015-03-09 20:22:39 PDT
(In reply to
comment #14
)
> (In reply to
comment #13
) > > Thanks for the review. The comment has been restored. > > > > Landed in
r181305
: <
http://trac.webkit.org/r181305
>. > > This patch broke Windows builds: > > CompareAndSwapTest.cpp > ..\..\API\tests\CompareAndSwapTest.cpp(95): error C3861: 'pthread_exit': > identifier not found > [C:\cygwin\home\buildbot\slave\win- > debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi. > vcxproj] > ..\..\API\tests\CompareAndSwapTest.cpp(102): error C2065: 'pthread_t' : > undeclared identifier > [C:\cygwin\home\buildbot\slave\win- > debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi. > vcxproj] > ..\..\API\tests\CompareAndSwapTest.cpp(102): error C2146: syntax error : > missing ';' before identifier 'threadIDs' > [C:\cygwin\home\buildbot\slave\win- > debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi. > vcxproj] > ..\..\API\tests\CompareAndSwapTest.cpp(102): error C2065: 'threadIDs' : > undeclared identifier > [C:\cygwin\home\buildbot\slave\win- > debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi. > vcxproj] > ..\..\API\tests\CompareAndSwapTest.cpp(110): error C2065: 'threadIDs' : > undeclared identifier > [C:\cygwin\home\buildbot\slave\win- > debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi. > vcxproj] > ..\..\API\tests\CompareAndSwapTest.cpp(110): error C3861: 'pthread_create': > identifier not found > [C:\cygwin\home\buildbot\slave\win- > debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi. > vcxproj] > ..\..\API\tests\CompareAndSwapTest.cpp(115): error C2065: 'threadIDs' : > undeclared identifier > [C:\cygwin\home\buildbot\slave\win- > debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi. > vcxproj] > ..\..\API\tests\CompareAndSwapTest.cpp(115): error C3861: 'pthread_join': > identifier not found > [C:\cygwin\home\buildbot\slave\win- > debug\build\Source\JavaScriptCore\JavaScriptCore.vcxproj\testapi\testapi. > vcxproj]
Yeah, Brent just told me this on the way out. I’m working on a fix.
Mark Lam
Comment 16
2015-03-09 21:05:09 PDT
Created
attachment 248314
[details]
follow up: Windows doesn't like pthreads
Mark Lam
Comment 17
2015-03-09 21:23:11 PDT
Windows fix landed in
r181308
: <
http://trac.webkit.org/r181308
>.
Mark Lam
Comment 18
2015-03-09 21:37:39 PDT
Created
attachment 248316
[details]
Speculative fix for Windows build.
Mark Lam
Comment 19
2015-03-09 21:43:03 PDT
The EWS bots are not cooperating. I went ahead and landed the speculative Windows fix in
r181309
: <
http://trac.webkit.org/r181309
>.
Mark Lam
Comment 20
2015-03-09 23:40:04 PDT
Created
attachment 248318
[details]
Build fix for Windows. This time, I've tested this on a local Windows box.
Mark Lam
Comment 21
2015-03-09 23:57:48 PDT
Final build fix for Windows landed in
r181315
: <
http://trac.webkit.org/r181315
>.
Csaba Osztrogonác
Comment 22
2015-03-10 01:04:19 PDT
(In reply to
comment #17
)
> Windows fix landed in
r181308
: <
http://trac.webkit.org/r181308
>.
It broke the JSC API tests on the Apple bots.
Mark Lam
Comment 23
2015-03-10 01:06:25 PDT
(In reply to
comment #22
)
> (In reply to
comment #17
) > > Windows fix landed in
r181308
: <
http://trac.webkit.org/r181308
>. > > It broke the JSC API tests on the Apple bots.
Which ones? Can you provide a link please?
Csaba Osztrogonác
Comment 24
2015-03-10 01:07:07 PDT
(In reply to
comment #23
)
> (In reply to
comment #22
) > > (In reply to
comment #17
) > > > Windows fix landed in
r181308
: <
http://trac.webkit.org/r181308
>. > > > > It broke the JSC API tests on the Apple bots. > > Which ones? Can you provide a link please?
All Apple Mac bots on build.webkit.org
Mark Lam
Comment 25
2015-03-10 01:48:53 PDT
(In reply to
comment #24
)
> All Apple Mac bots on build.webkit.org
I wasn’t seeing this on the dashboard except for the Mavericks debug bots. Anyway, it’s a trivial test only fix. Landed in
r181319
: <
http://trac.webkit.org/r181319
>.
Csaba Osztrogonác
Comment 26
2015-03-10 04:09:11 PDT
(In reply to
comment #25
)
> (In reply to
comment #24
) > > All Apple Mac bots on build.webkit.org > > I wasn’t seeing this on the dashboard except for the Mavericks debug bots. > Anyway, it’s a trivial test only fix.
Ah, because the JSC test coverage is poor on the dashboard: - CLOOP and 32 bit bots aren't included - Yosemite WK1 and WK2 bots don't run JSC tests and the dedicated JSC tester bot isn't on dashboard - Only the Mavericks WK1 bots can be found on it
> Landed in
r181319
: <
http://trac.webkit.org/r181319
>.
Thanks for the quick fix.
Kevin Staunton-Lambert
Comment 27
2015-11-18 10:00:39 PST
Thanks also
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug