Bug 162095 - Speed up Heap::isMarkedConcurrently
Summary: Speed up Heap::isMarkedConcurrently
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: JF Bastien
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-09-16 15:32 PDT by JF Bastien
Modified: 2016-09-27 16:44 PDT (History)
11 users (show)

See Also:


Attachments
patch (9.52 KB, patch)
2016-09-16 22:21 PDT, JF Bastien
no flags Details | Formatted Diff | Diff
patch (9.79 KB, patch)
2016-09-17 09:58 PDT, JF Bastien
no flags Details | Formatted Diff | Diff
patch (9.79 KB, patch)
2016-09-17 18:10 PDT, JF Bastien
no flags Details | Formatted Diff | Diff
test (3.94 KB, application/octet-stream)
2016-09-17 18:11 PDT, JF Bastien
no flags Details
patch (11.32 KB, patch)
2016-09-17 18:13 PDT, JF Bastien
no flags Details | Formatted Diff | Diff
patch (11.97 KB, patch)
2016-09-19 23:30 PDT, JF Bastien
no flags Details | Formatted Diff | Diff
patch (12.20 KB, patch)
2016-09-21 10:08 PDT, JF Bastien
no flags Details | Formatted Diff | Diff
patch (11.27 KB, patch)
2016-09-26 11:03 PDT, JF Bastien
fpizlo: review+
Details | Formatted Diff | Diff
patch (19.01 KB, patch)
2016-09-26 16:21 PDT, JF Bastien
no flags Details | Formatted Diff | Diff
patch (19.00 KB, patch)
2016-09-26 16:27 PDT, JF Bastien
no flags Details | Formatted Diff | Diff
patch (18.88 KB, patch)
2016-09-26 16:39 PDT, JF Bastien
no flags Details | Formatted Diff | Diff
patch (20.74 KB, patch)
2016-09-27 11:38 PDT, JF Bastien
no flags Details | Formatted Diff | Diff
patch (20.99 KB, patch)
2016-09-27 12:08 PDT, JF Bastien
no flags Details | Formatted Diff | Diff
patch (20.94 KB, patch)
2016-09-27 13:42 PDT, JF Bastien
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description JF Bastien 2016-09-16 15:32:33 PDT
Heap::isMarkedConcurrently has a fairly expensive load-load fence.

This fence is there because:
 1. If the read of m_version in MarkedBlock::needsFlip isn't what's expected then;
 2. The read of m_marks in MarkedBlock::isMarked needs to observe the value that was stored *before* m_version was stored.

This ordering isn't guaranteed on ARM, which has a weak memory model.

There are 3 ways to guarantee this ordering:
 A. Use a barrier instruction.
 B. Use a load-acquire (new in ARMv8).
 C. use ARM's address dependency rule, which C++ calls memory_order_consume.

In general:
 A. is slow but orders all of memory in an intuitive manner.
 B. is faster-ish and has the same property-ish.
 C. should be faster still, but *only orders dependent loads*. This last part is critical! Consume isn't an all-out replacement for acquire (acquire is rather a superset of consume).


ARM explains the address dependency rule in their document "barrier litmus tests and cookbook":

> *Resolving by the use of barriers and address dependency*
>
> There is a rule within the ARM architecture that:
>
> Where the value returned by a read is used to compute the virtual address of a subsequent read or write (this is known as an address dependency), then these two memory accesses will be observed in program order. An address dependency exists even if the value read by the first read has no effect in changing the virtual address (as might be the case if the value returned is masked off before it is used, or if it had no effect on changing a predicted address value).
> 
> This restriction applies only when the data value returned from one read is used as a data value to calculate the address of a subsequent read or write. This does not apply if the data value returned from
one read is used to determine the condition code flags, and the values of the flags are used for condition code evaluation to determine the address of a subsequent reads, either through conditional execution or
the evaluation of a branch. This is known as a control dependency.
>
> Where both a control and address dependency exist, the ordering behaviour is consistent with the address dependency. 


C++'s memory_order_consume is unfortunately unimplemented by C++ compilers, and maybe unimplementable as spec'd. I'm working with interested folks in the committee to fix this situation: http://wg21.link/p0190r2

You'll note that this paper has a bunch of proposed solutions, no C++ implementation of any of them, and that Linux uses consume ordering in RCU successfully. I therefore intend to:
 1. implement our own special-purpose dependency + consume in WebKit's Atomics.h;
 2. Benchmark and use it in this location;
 3. If that works out, commit and slowly start using it in other locations (which may require improving the hacky API I have in mind).
 4. Feed this information back to the C++ standards committee so that C++Next has a proper way to use consume ordering.
Comment 1 JF Bastien 2016-09-16 15:38:32 PDT
So far I've:
 - Compiled the code for A64 with barrier, and with dependency ordering;
 - Changed Heap.cpp:measurePhaseTiming to return true;
 - Ran jsc a few times on-device with a self-contained splay.js (from Octane);
 - grep "Heap::markRoots convergence took" | tail -n2

This gives the last GC:Eden and GC:Full entries for each run, which I then averaged to get the following numbers:
  GC:Eden is 93% average runtime with consume ordering than it is with the barrier.
  GC:Full is 76% average runtime with consume ordering than it is with the barrier.
These measurements are fairly noisy, though!

I ran Octane and Kraken, it reports:
  31.0664+-0.2420 30.9790+-0.1619 might be 1.0028x faster

I'm running them for longer, and added JSBench.
Comment 2 JF Bastien 2016-09-16 20:58:38 PDT
Results from running octane+kraken with --outer 50 are neutral-ish. This is OK: the GC's own self measurements say it got better, but overall it's hard to notice (even splay seems slightly *slower* in these measurements).

                                                 original                 dependency                                    
Octane:
   encrypt                                   0.26637+-0.00281    ?     0.26663+-0.00314       ?
   decrypt                                   4.59402+-0.05826    ?     4.62301+-0.04497       ?
   deltablue                        x2       0.22070+-0.00171          0.22006+-0.00331       
   earley                                    0.50421+-0.00415          0.50098+-0.00498       
   boyer                                    10.51093+-0.08624         10.41645+-0.06887       
   navier-stokes                    x2       9.95340+-0.09436          9.91315+-0.09518       
   raytrace                         x2       1.51839+-0.01214    ?     1.52155+-0.01326       ?
   richards                         x2       0.15778+-0.00258    ?     0.15873+-0.00263       ?
   splay                            x2       1.05021+-0.01220    ?     1.05662+-0.01293       ?
   regexp                           x2      43.77781+-0.25201         43.26305+-0.45005         might be 1.0119x faster
   pdfjs                            x2      81.78266+-0.95327         80.39971+-1.06762         might be 1.0172x faster
   mandreel                         x2     107.16854+-1.15268        106.28571+-1.06227       
   gbemu                            x2      84.47725+-1.36278         83.21173+-1.26753         might be 1.0152x faster
   closure                                   0.75797+-0.00697          0.74811+-0.00958         might be 1.0132x faster
   jquery                                   11.70750+-0.11316         11.62195+-0.14425       
   box2d                            x2      24.69607+-0.30160         24.23807+-0.40581         might be 1.0189x faster
   zlib                             x2     666.82827+-12.50878   ?   685.25164+-16.85821      ? might be 1.0276x slower
   typescript                       x2    1460.83906+-13.60317   ?  1461.27808+-13.61377      ?

   <geometric>                              10.81353+-0.03277         10.77419+-0.03512         might be 1.0037x faster

                                                 original                 dependency                                    
Kraken:
   ai-astar                                  213.139+-1.663      ?     214.018+-2.360         ?
   audio-beat-detection                       73.722+-0.899             73.513+-1.162         
   audio-dft                                 160.656+-1.419            160.038+-1.366         
   audio-fft                                  54.096+-0.415      ?      54.140+-0.552         ?
   audio-oscillator                           70.751+-0.703      ?      71.216+-0.585         ?
   imaging-darkroom                           95.893+-1.023      ?      96.225+-1.048         ?
   imaging-desaturate                        101.466+-0.978      ?     102.377+-0.830         ?
   imaging-gaussian-blur                     124.911+-1.589            123.726+-1.553         
   json-parse-financial                       59.564+-0.415             59.434+-0.506         
   json-stringify-tinderbox                   38.598+-0.432             38.557+-0.437         
   stanford-crypto-aes                        85.026+-0.770             84.783+-0.864         
   stanford-crypto-ccm                        62.169+-1.002             61.935+-1.066         
   stanford-crypto-pbkdf2                    193.870+-1.840            191.803+-2.080           might be 1.0108x faster
   stanford-crypto-sha256-iterative           63.815+-0.547             63.426+-0.694         

   <arithmetic>                               99.834+-0.296             99.657+-0.346           might be 1.0018x faster

                                                 original                 dependency                                    
Geomean of preferred means:
   <scaled-result>                           32.8558+-0.0744           32.7665+-0.0780          might be 1.0027x faster
Comment 3 JF Bastien 2016-09-16 21:42:26 PDT
Sample function where this gets inlined JSC::CodeBlock::visitWeakly(JSC::SlotVisitor&):

Before:
	ldr	w10, [x8,#136]         # Original read of m_version in MarkedBlock::needsFlip
	cmp	w10, w9
	b.ne	<JSC::CodeBlock::visitWeakly(JSC::SlotVisitor&)+0x88>
	dmb	ish                    # This barrier is gone in the below version
	sub	x9, x20, x8
	lsr	x10, x9, #7            # Address computation
	ldrb	w8, [x8,x10]           # This load must observe stores which occured before the read of m_version above
	ubfx	w9, w9, #4, #3
	orr	w10, wzr, #0x1
	lsl	w9, w10, w9
	and	w8, w8, w9
	cbnz	w8, <JSC::CodeBlock::visitWeakly(JSC::SlotVisitor&)+0x188>
	b	<JSC::CodeBlock::visitWeakly(JSC::SlotVisitor&)+0x88>

Dependency:
	ldr	w11, [x8,#136]         # Original read of m_version in MarkedBlock::needsFlip
	eor	x9, x11, x11           # Create a dependency: x9 now holds a 0 which depends on the above load
	cmp	w11, w10
	b.ne	<JSC::CodeBlock::visitWeakly(JSC::SlotVisitor&)+0x8c>
	sub	x10, x20, x8
	add	x9, x9, x10, lsr #4   # Address computation using the dependent zero
	lsr	x10, x9, #3           # Ditto, dependent zero carries through here
	ldrb	w8, [x8,x10]          # Dependent load, ARM's address dependency rule guarantees the value loaded will see stores which occured before m_version was stored
	and	w9, w9, #0x7
	orr	w10, wzr, #0x1
	lsl	w9, w10, w9
	and	w8, w8, w9
	cbnz	w8, <JSC::CodeBlock::visitWeakly(JSC::SlotVisitor&)+0x18c>
	b	<JSC::CodeBlock::visitWeakly(JSC::SlotVisitor&)+0x8c>

The same Heap::isMarkedConcurrently code and its callees get inlined in 12 places total. Here are the other ones:
  <JSC::CodeBlock::propagateTransitions(JSC::SlotVisitor&)>: 3x inlined
  <JSC::CodeBlock::determineLiveness(JSC::SlotVisitor&)>: 2x inlined
  <JSC::CodeBlock::shouldJettisonDueToOldAge()>:
  <JSC::AccessCase::propagateTransitions(JSC::SlotVisitor&) const>:
  <JSC::SlotVisitor::markAuxiliary(void const*)>:
  <JSC::Structure::isCheapDuringGC()>: 2x inlined
  <JSC::Structure::markIfCheap(JSC::SlotVisitor&)>:
Comment 4 JF Bastien 2016-09-16 22:21:31 PDT
Created attachment 289161 [details]
patch

A few open questions.

Which instruction is better for creating the dependency:
 - eor %xdep %xin %xin
 - and %xdep %xin #0
That's really a μarch question, I'll ask around. Either could be decoded intelligently at a zero-cycle cost by the front-end and rewritten to XZR (which I assume still keeps the dependency, though that's worth double-checking).

Should I specialize the template for different input sizes? Right now my inline asm uses %x (the full 64-bit X registers) and it's in a template, so it'll use the full register even if the input is a 32-bit W register.

Should I implement this for A32 / T32 as well? I figure it's safer to do one architecture at a time, so I'd do it in a separate patch.

I just use a typedef for ConsumeDependency, and it's set to size_t. Should I encapsulate to reduce the operations which can be done with this value?

Overall I think we'll want to evolve this to a real C++ class as we start using it. I'll ping Paul and see what he thinks for RCU.
Comment 5 WebKit Commit Bot 2016-09-16 22:24:06 PDT
Attachment 289161 [details] did not pass style-queue:


ERROR: Source/WTF/wtf/Atomics.h:236:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/heap/MarkedBlock.h:250:  The parameter name "dependency" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/heap/MarkedBlock.h:476:  More than one command on the same line  [whitespace/newline] [4]
Total errors found: 3 in 5 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 6 Filip Pizlo 2016-09-17 08:25:03 PDT
Comment on attachment 289161 [details]
patch

What about armv7? Wouldn't almost exactly the same code work?
Comment 7 JF Bastien 2016-09-17 09:58:56 PDT
Created attachment 289172 [details]
patch

(In reply to comment #6)
> Comment on attachment 289161 [details]
> patch
> 
> What about armv7? Wouldn't almost exactly the same code work?

Added. I was wary of rolling this change to 2 architectures at the same time.
Comment 8 WebKit Commit Bot 2016-09-17 09:59:55 PDT
Attachment 289172 [details] did not pass style-queue:


ERROR: Source/WTF/wtf/Atomics.h:236:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:242:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/heap/MarkedBlock.h:250:  The parameter name "dependency" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/heap/MarkedBlock.h:476:  More than one command on the same line  [whitespace/newline] [4]
Total errors found: 4 in 5 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 9 JF Bastien 2016-09-17 11:40:50 PDT
Forgot to say: this is the general idea. I need to expand on it and add a few tests. Specifically, I want to SFINAE on is_trivially_copyable, and I want to switch template instances based on size (8-, 16-, 32-, 64-bit versions) because the load and xor should be different in some cases (especially the A32/T32 version with 64-bit, needs paired registers and needs dependency on both loaded 32-bit registers if I'm not mistaken).

So, this is good for preliminary review, but definitely not commit yet.
Comment 10 JF Bastien 2016-09-17 18:10:29 PDT
Created attachment 289184 [details]
patch

I think the patch is pretty much good to go! I'll upload a test as well, not sure where it should go.
Comment 11 JF Bastien 2016-09-17 18:11:22 PDT
Created attachment 289185 [details]
test

A test. Which directory should this go in?
Comment 12 WebKit Commit Bot 2016-09-17 18:12:24 PDT
Attachment 289184 [details] did not pass style-queue:


ERROR: Source/WTF/wtf/Atomics.h:236:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:242:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/heap/MarkedBlock.h:250:  The parameter name "dependency" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/heap/MarkedBlock.h:476:  More than one command on the same line  [whitespace/newline] [4]
Total errors found: 4 in 5 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 13 JF Bastien 2016-09-17 18:13:39 PDT
Created attachment 289186 [details]
patch

(was the old patch again)
Comment 14 WebKit Commit Bot 2016-09-17 18:15:15 PDT
Attachment 289186 [details] did not pass style-queue:


ERROR: Source/WTF/wtf/Atomics.h:221:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:227:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:261:  Should have only a single space after a punctuation in a comment.  [whitespace/comments] [5]
ERROR: Source/WTF/wtf/Atomics.h:262:  Should have only a single space after a punctuation in a comment.  [whitespace/comments] [5]
ERROR: Source/WTF/wtf/Atomics.h:274:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/WTF/wtf/Atomics.h:276:  Should have only a single space after a punctuation in a comment.  [whitespace/comments] [5]
ERROR: Source/JavaScriptCore/heap/MarkedBlock.h:250:  The parameter name "dependency" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/heap/MarkedBlock.h:476:  More than one command on the same line  [whitespace/newline] [4]
Total errors found: 8 in 5 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 15 Geoffrey Garen 2016-09-19 13:41:47 PDT
Comment on attachment 289186 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=289186&action=review

> Source/JavaScriptCore/heap/HeapInlines.h:96
> +    auto f = block.needsFlipWithDependency(objectSpaceVersion);

"f" should have a name.

> Source/JavaScriptCore/heap/MarkedBlock.h:250
> +    bool isMarked(const void*, WTF::ConsumeDependency dependency = 0 /* No dependency */);

It is better for code to document itself. If 0 means "no dependency", it should be an enum-style value or a named const.

> Source/JavaScriptCore/heap/MarkedBlock.h:474
> +    auto c = WTF::consumeLoad(&m_version);

"c" needs a name. If "c" is the value of m_version, you can call it "version".

> Source/JavaScriptCore/heap/MarkedBlock.h:475
> +    struct { bool needsFlip; WTF::ConsumeDependency dependency; } ret;

Since this struct is produced in one file an consumed in another, it should be declared in the header.

> Source/WTF/ChangeLog:42
> +        (WTF::makeMagicalZero): a 0 which carries a dependency

It's a warning flag when the comment describing a function is short and uses words that are not the same as the words in the name of the function. Usually, the words you would use to describe a function should appear in its name. I would call this function "zeroWithConsumeDependency".

> Source/WTF/wtf/Atomics.h:211
> +    // doesn't know it. It's magical because it creates an address dependency

Notice that this line of commentary fully explains how to better name this function.

> Source/WTF/wtf/Atomics.h:221
> +                 // Lie about touching memory, freak the optimizer out a bit.
> +                 : "memory");

This comment isn't helpful. You should explain the effect you're trying to achieve, and exactly which kinds of optimizations you want to avoid.

> Source/WTF/wtf/Atomics.h:229
> +#else

I think this #else ought to specify its valid CPU architectures, and produce an #error for an unrecognized CPU architecture. Otherwise, new CPU bring-ups will get super-weird errors that could have been caught at compile time.

> Source/WTF/wtf/Atomics.h:249
> +#else

Should specify CPU architecture.
Comment 16 JF Bastien 2016-09-19 23:30:07 PDT
Created attachment 289330 [details]
patch

Patch updated to address all comments, except as noted below:

> > Source/WTF/wtf/Atomics.h:229
> > +#else
> 
> I think this #else ought to specify its valid CPU architectures, and produce
> an #error for an unrecognized CPU architecture. Otherwise, new CPU bring-ups
> will get super-weird errors that could have been caught at compile time.
> 
> > Source/WTF/wtf/Atomics.h:249
> > +#else
> 
> Should specify CPU architecture.

I don't think we should do this: as I coded it the behavior reverts to the one I'm replacing (from consume to acquire). That's a pessimization but it's valid (and is what all C++ implementations do to consume already). Eventually I want to get this back into C++, so we're just making other weak architectures not as fast as they could be, there shouldn't be weird errors.
Comment 17 WebKit Commit Bot 2016-09-19 23:31:08 PDT
Attachment 289330 [details] did not pass style-queue:


ERROR: Source/WTF/wtf/Atomics.h:223:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:250:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:284:  Should have only a single space after a punctuation in a comment.  [whitespace/comments] [5]
ERROR: Source/WTF/wtf/Atomics.h:285:  Should have only a single space after a punctuation in a comment.  [whitespace/comments] [5]
ERROR: Source/WTF/wtf/Atomics.h:293:  Missing space before {  [whitespace/braces] [5]
ERROR: Source/WTF/wtf/Atomics.h:295:  Should have only a single space after a punctuation in a comment.  [whitespace/comments] [5]
ERROR: Source/JavaScriptCore/heap/MarkedBlock.h:251:  The parameter name "dependency" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/heap/MarkedBlock.h:270:  More than one command on the same line  [whitespace/newline] [4]
Total errors found: 8 in 5 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 18 Alexey Proskuryakov 2016-09-20 10:13:10 PDT
The style checker warnings all seem valid, except for the last one. Could you please address them?

Additionally, the design in WTF is that all public symbols are supposed to be pulled into global namespace, so one should almost never use the WTF:: prefix.
Comment 19 JF Bastien 2016-09-21 10:08:07 PDT
Created attachment 289464 [details]
patch

- Format patch with clang-format -style=WebKit. Error left is regex getting confused by inline assembly constraints (it doesn't like the colon because it doesn't understand inline asm).
- Remove WTF:: from bitwise_cast (leave other WTF:: uses which follow the surrounding code's uses of WTF::).
- Make sure 64-bit values don't tear.

Any opinions on where the test should go?
Comment 20 Alexey Proskuryakov 2016-09-21 10:43:11 PDT
> - Format patch with clang-format -style=WebKit.

That's not always correct, Jonathan noticed some mistakes in clang formatter rules while looking at it. Note that the latest patch doesn't apply, so it's not being processed.

> Any opinions on where the test should go?

The only place I can think of is Tools/TestWebKitAPI
Comment 21 JF Bastien 2016-09-26 11:03:12 PDT
Created attachment 289837 [details]
patch

Rebased. Still need to move test to Tools/TestWebKitAPI/ but otherwise review is ready.
Comment 22 WebKit Commit Bot 2016-09-26 11:06:19 PDT
Attachment 289837 [details] did not pass style-queue:


ERROR: Source/WTF/wtf/Atomics.h:227:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:255:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 2 in 4 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 23 Geoffrey Garen 2016-09-26 11:08:32 PDT
> > > Source/WTF/wtf/Atomics.h:249
> > > +#else
> > 
> > Should specify CPU architecture.
> 
> I don't think we should do this: as I coded it the behavior reverts to the
> one I'm replacing (from consume to acquire).

Sounds good!
Comment 24 Filip Pizlo 2016-09-26 11:08:37 PDT
Comment on attachment 289837 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=289837&action=review

> Source/JavaScriptCore/heap/MarkedBlock.h:277
> +        WTF::ConsumeDependency dependency;

Remove the WTF::

> Source/JavaScriptCore/heap/MarkedBlock.h:484
> +    auto consumed = WTF::consumeLoad(&m_markingVersion);

Remove the WTF::

> Source/WTF/wtf/Atomics.h:205
> +typedef size_t ConsumeDependency;

You should do a "using WTF::ConsumeDependency" at the bottom of this file.  It's perverse, but the WTF convention is: WTF headers put their stuff in the WTF namespace and then use the public parts from the global namespace.  It's a bug that some of the Atomics stuff doesn't do this (like the fences).  I think that your new code should follow the general WTF convention.  Feel free to fix any other parts of Atomics.h in this patch if you like, like the fact that loadStoreFence() and friend are not usinged from the global namespace.

> Source/WTF/wtf/Atomics.h:238
> +ALWAYS_INLINE ConsumeDependency zeroWithConsumeDependency(T value)

Ditto.

> Source/WTF/wtf/Atomics.h:264
> +template <typename T, typename std::enable_if<sizeof(T) == 2>::type* = nullptr>

Ditto.

> Source/WTF/wtf/Atomics.h:272
> +ALWAYS_INLINE ConsumeDependency zeroWithConsumeDependency(T value)

Ditto.

> Source/WTF/wtf/Atomics.h:279
> +struct Consumed {

Ditto

> Source/WTF/wtf/Atomics.h:299
> +ALWAYS_INLINE auto consumeLoad(const T* location)

Ditto
Comment 25 JF Bastien 2016-09-26 16:21:04 PDT
Created attachment 289886 [details]
patch

Patch with test. Remove WTF:: prefix.
Comment 26 WebKit Commit Bot 2016-09-26 16:23:31 PDT
Attachment 289886 [details] did not pass style-queue:


ERROR: Tools/TestWebKitAPI/Tests/WTF/Consume.cpp:28:  Alphabetical sorting problem.  [build/include_order] [4]
ERROR: Tools/TestWebKitAPI/Tests/WTF/Consume.cpp:33:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Tools/TestWebKitAPI/Tests/WTF/Consume.cpp:34:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Tools/TestWebKitAPI/Tests/WTF/Consume.cpp:35:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Tools/TestWebKitAPI/Tests/WTF/Consume.cpp:36:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Tools/TestWebKitAPI/Tests/WTF/Consume.cpp:37:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Tools/TestWebKitAPI/Tests/WTF/Consume.cpp:113:  When wrapping a line, only indent 4 spaces.  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:227:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:255:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 9 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 27 JF Bastien 2016-09-26 16:27:00 PDT
Created attachment 289888 [details]
patch

Fix format of test.
Comment 28 WebKit Commit Bot 2016-09-26 16:29:08 PDT
Attachment 289888 [details] did not pass style-queue:


ERROR: Source/WTF/wtf/Atomics.h:227:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:255:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 2 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 29 JF Bastien 2016-09-26 16:39:52 PDT
Created attachment 289891 [details]
patch

Fix asm warning on ARM build.
Make code cleaner by ordering function SFINAEs by size.
Comment 30 WebKit Commit Bot 2016-09-26 16:41:05 PDT
Attachment 289891 [details] did not pass style-queue:


ERROR: Source/WTF/wtf/Atomics.h:227:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:233:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:253:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:259:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 4 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 31 JF Bastien 2016-09-26 17:16:27 PDT
Building the test I added is failing in an unexpected manner:

Tools/TestWebKitAPI/Tests/WTF/Consume.cpp: In member function 'virtual void TestWebKitAPI::WTF_ConsumePtr_Test::TestBody()':
../../Tools/TestWebKitAPI/Tests/WTF/Consume.cpp:92:18: error: 'main' was not declared in this scope
../../Tools/TestWebKitAPI/Tests/WTF/Consume.cpp:94:179: error: template argument 1 is invalid
../../Tools/TestWebKitAPI/Tests/WTF/Consume.cpp:95:184: error: template argument 1 is invalid

It looks like I didn't set up the xcode project file properly?
Comment 32 JF Bastien 2016-09-26 17:25:12 PDT
The main difference seems to be:

Tools/TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:            ADCEBBA51D99D4CF002E283A /* Consume.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = Consume.cpp; sourceTree = "<group>"; };
Tools/TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:                            ADCEBBA51D99D4CF002E283A /* Consume.cpp */,


Tools/TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:            7C83DF1D1D0A590C00FEBCF3 /* Lock.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0FFC45A41B73EBE20085BD62 /* Lock.cpp */; };
Tools/TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:            0FFC45A41B73EBE20085BD62 /* Lock.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = Lock.cpp; sourceTree = "<group>"; };
Tools/TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:                            0FFC45A41B73EBE20085BD62 /* Lock.cpp */,
Tools/TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:                            7C83DF1D1D0A590C00FEBCF3 /* Lock.cpp in Sources */,


I'm missing two entries, and the first should have a fileRef instead of this weird lastKnownFileType. This UI thing confuses and frightens me. I'll keep clicking around.
Comment 33 JF Bastien 2016-09-27 11:38:52 PDT
Created attachment 289986 [details]
patch

Updated patch should fix xcode issue.
Comment 34 WebKit Commit Bot 2016-09-27 11:40:58 PDT
Attachment 289986 [details] did not pass style-queue:


ERROR: Source/WTF/wtf/Atomics.h:227:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:233:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:253:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:259:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 4 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 35 JF Bastien 2016-09-27 12:08:54 PDT
Created attachment 289993 [details]
patch

Some builds were sad, fix them (add static_cast, and use globals in the test instead of &main).
Comment 36 WebKit Commit Bot 2016-09-27 12:10:52 PDT
Attachment 289993 [details] did not pass style-queue:


ERROR: Source/WTF/wtf/Atomics.h:227:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:232:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:251:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:256:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 4 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 37 JF Bastien 2016-09-27 13:42:39 PDT
Created attachment 290003 [details]
patch

Previous patch was missing a typo fix in the Consume.cpp test.
Comment 38 WebKit Commit Bot 2016-09-27 13:45:36 PDT
Attachment 290003 [details] did not pass style-queue:


ERROR: Source/WTF/wtf/Atomics.h:227:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:232:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:251:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WTF/wtf/Atomics.h:256:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 4 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 39 Filip Pizlo 2016-09-27 15:57:52 PDT
Comment on attachment 290003 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=290003&action=review

> Source/WTF/wtf/Atomics.h:252
> +#elif CPU(ARM)

Does this kick in for what we call CPU(ARM_THUMB2)?
Comment 40 JF Bastien 2016-09-27 16:20:34 PDT
Comment on attachment 290003 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=290003&action=review

>> Source/WTF/wtf/Atomics.h:252
>> +#elif CPU(ARM)
> 
> Does this kick in for what we call CPU(ARM_THUMB2)?

IIUC, yes: THUMB2 is only set if WTF_THUMB_ARCH_VERSION == 4, which means arch is 6T2 or 7 and above (otherwise it's ARM_TRADITIONAL), but ARM is always set.

So I think this does what we want.

See: https://godbolt.org/g/ST3PzR
Comment 41 WebKit Commit Bot 2016-09-27 16:43:55 PDT
Comment on attachment 290003 [details]
patch

Clearing flags on attachment: 290003

Committed r206470: <http://trac.webkit.org/changeset/206470>
Comment 42 WebKit Commit Bot 2016-09-27 16:44:03 PDT
All reviewed patches have been landed.  Closing bug.