Bug 20422 - Patch to allow custom memory allocation control
Summary: Patch to allow custom memory allocation control
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Template Framework (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Enhancement
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-08-18 01:26 PDT by Paul Pedriana
Modified: 2009-05-14 02:07 PDT (History)
12 users (show)

See Also:


Attachments
Implements operator new utilities (14.40 KB, text/plain)
2008-08-18 01:28 PDT, Paul Pedriana
mrowe: review-
Details
Unit test for New.h (5.30 KB, text/plain)
2008-08-18 01:29 PDT, Paul Pedriana
mrowe: review-
Details
Implements a new file: JavaScriptCore/wtf/New.h. (16.82 KB, patch)
2008-09-02 17:44 PDT, Paul Pedriana
darin: review-
Details | Formatted Diff | Diff
Test application for New.h (7.45 KB, text/plain)
2008-09-02 17:47 PDT, Paul Pedriana
no flags Details
proposed patch for JavaScripCore (138.80 KB, patch)
2008-10-01 02:25 PDT, Zoltan Horvath
darin: review-
Details | Formatted Diff | Diff
proposed patch for WebCore (12.34 KB, patch)
2008-10-01 02:28 PDT, Zoltan Horvath
no flags Details | Formatted Diff | Diff
proposed patch (41.24 KB, patch)
2008-10-06 04:20 PDT, Balazs Kelemen
darin: review-
Details | Formatted Diff | Diff
Patch against revision 38334 (476.22 KB, patch)
2008-11-20 21:19 PST, Paul Pedriana
darin: review-
Details | Formatted Diff | Diff
Unit test for latest FastAllocBase patch. (9.31 KB, text/plain)
2008-11-20 21:20 PST, Paul Pedriana
no flags Details
JavaScriptCore FastMalloc-related patch. (51.48 KB, patch)
2008-12-11 01:19 PST, Paul Pedriana
darin: review-
Details | Formatted Diff | Diff
Build file patch. (6.55 KB, patch)
2008-12-11 01:22 PST, Paul Pedriana
darin: review-
Details | Formatted Diff | Diff
Updated build systems patch (4.98 KB, patch)
2009-01-06 06:46 PST, Zoltan Horvath
darin: review-
Details | Formatted Diff | Diff
Updated FastMalloc patch. (61.06 KB, patch)
2009-01-28 00:51 PST, Paul Pedriana
darin: review+
Details | Formatted Diff | Diff
Same patch as before but updated to work with r42235 (April 6, 2009) (65.50 KB, patch)
2009-04-06 01:21 PDT, Paul Pedriana
no flags Details | Formatted Diff | Diff
Same patch as before, updated (44.59 KB, patch)
2009-04-06 11:48 PDT, Zoltan Horvath
darin: review+
Details | Formatted Diff | Diff
proposed patch for HashTable inheritance (1.38 KB, patch)
2009-04-17 08:17 PDT, Zoltan Horvath
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Paul Pedriana 2008-08-18 01:26:53 PDT
This submission consists of two files: New.h and OperatorNewTest.cpp. The former implements a base header file which allows for custom control of memory allocation in WebKit. The latter implements a basic unit test for New.h. 

The usefulness of this feature was discussed in June on the webkit-dev mailing list. See the following for context: https://lists.webkit.org/pipermail/webkit-dev/2008-June/004089.html. In summary, this system allows for the external redirection of memory allocation in WebKit to a user-specified heap or heaps. It provides for this by two means: overriding operator new in C++ classes (usually via a base class), and by overriding global operator new. Users have responded that the first means is preferred, though there may be cases where classes aren't used or where malloc is being used instead of new and thus the second means would be needed. These means are mutually compatible. 

Initially this submission does not include a WebKit-wide patch to convert code to using this system, nor does it need to at this time. In theory, WebKit code can be migrated to take advantage of this in a gradual manner, though the intended ability for the user to control memory allocation WebKit-wide won't be possibly until the migration is complete. This initial submission is a more formal RFC than that proposed on the mailing list and after discussion modifications and a full WebKit-wide patch to use it can be considered. New.h could be checked into the codebase and undergo initial testing before any patch to use it WebKit-wide was constructed and submitted. 

This submission is motivated by the necessity to tightly control memory on embedded platforms, especially where WebKit is not the primary application but instead is a library used within another application. Additionally, the submission significantly benefits the KDE project by allowing memory optimizations that they otherwise are unable to implement. I will defer to a representative from KDE to step in and clarify this further. 

This submission should not affect the optimized WebKit runtime in any significant way. It is unlikely to affect the WebKit build speed or memory usage either. The primary downside to the proposed system is that it will require a number of modifications to the WebKit code (adding base class operators) to take advantage of.
Comment 1 Paul Pedriana 2008-08-18 01:28:25 PDT
Created attachment 22847 [details]
Implements operator new utilities
Comment 2 Paul Pedriana 2008-08-18 01:29:19 PDT
Created attachment 22848 [details]
Unit test for New.h
Comment 3 Mark Rowe (bdash) 2008-08-18 07:30:45 PDT
Can you please take a look at http://webkit.org/coding/contributing.html for information about contributing code changes.  In particular, take note of the coding style guidelines, the need for a ChangeLog entry, and that changes should be provided in patch form.
Comment 4 Adam Treat 2008-08-18 09:34:43 PDT
I think this would be a good solution to allow the Linux ports to use FastMalloc and further it does provide a nice way for ports to provide customized allocators.  

Definitely prefer the new base class option to the global operator new for reasons discussed here by Maciej:

https://lists.webkit.org/pipermail/webkit-dev/2008-June/004089.html
Comment 5 Paul Pedriana 2008-08-18 11:39:46 PDT
I can convert these to patches and make new attachments for that (along with a ChangeLog entry). Before I submitted the above files I walked over the coding guidelines and did any changes to make the code compliant. Did I possibly miss something?
Comment 6 Mark Rowe (bdash) 2008-08-19 21:55:21 PDT
(In reply to comment #5)
> I can convert these to patches and make new attachments for that (along with a
> ChangeLog entry). Before I submitted the above files I walked over the coding
> guidelines and did any changes to make the code compliant. Did I possibly miss
> something?

There are a few issues that I see:
* Lack of license header
* Inconsistent placement of the opening brace in a function definition.
* Functions named with an initial upper-case letter.
* Use of C-style casts in C++ code.
* Use of NULL in C++ code.
* "Filled" block header comments.
* Large numbers of blank lines, such as at the end of the file.

Some of these aren't explicitly mentioned in the coding style guidelines, but are style rules that we apply consistently throughout WebKit.  If something isn't explicitly mentioned it's a good idea to match the predominant style of existing code.
Comment 7 Mark Rowe (bdash) 2008-08-19 21:57:06 PDT
Comment on attachment 22847 [details]
Implements operator new utilities

Marking as r- for now.  When you attach the updated patch someone can review for substance rather than style :-)
Comment 8 Paul Pedriana 2008-09-02 17:44:48 PDT
Created attachment 23132 [details]
Implements a new file: JavaScriptCore/wtf/New.h.

I have revised the file to better follow WebKit coding conventions than the previous version, made it a formal patch instead of a .h file, and made a couple small code changes.
Comment 9 Paul Pedriana 2008-09-02 17:47:16 PDT
Created attachment 23133 [details]
Test application for New.h

This file is incidental and is not itself currently a proposed patch to WebKit.
Comment 10 Zoltan Horvath 2008-09-08 06:39:43 PDT
Our goal is to change standard malloc to tcmalloc under linux-qt port but we couldn't solve operator new and delete problems.  As we see this solution can solve this, therefore if the community accepts this change then we would gladly help to replace all new and delete operators to the new ones.
Comment 11 Adam Treat 2008-09-15 09:02:01 PDT
Zoltan, to be clear I think we would not be replacing all the 'new' and 'delete' operators, rather we'd be changing WebKit so that all classes inherit AllocBase.  At least that is my preferred solution.
Comment 12 Darin Adler 2008-09-21 14:25:10 PDT
Comment on attachment 23132 [details]
Implements a new file: JavaScriptCore/wtf/New.h.

Seems good.

I think the term "Object" might be confusing here. I hope we can come up with better names than deleteObject and deleteArray.

AllocBase seems like a reasonable name, but if we're going to be using it everywhere, do we really have to use the abbreviation "alloc"? I'd prefer a name that was a tiny bit more expressive about what it does exactly. Not that I'm a fan of the prefix "fast" in "fastMalloc", but it would at least be consistent to use that motif everywhere; we should consider that.

I think it's probably a mistake to name this <wtf/New.h> since there's a system header called <new.h> on some platforms. We have trouble with <WebCore/String.h> and we've had to name it PlatformString.h because of conflict with <string.h> on some platforms.

The main thing I'd like to see that's not here is a way to detect if people use new/delete directly and forget to use these functions. One of the advantages of the current approach is that there's no room for error. You can't forget and mismatch these. We would like a compile time error or a runtime error if someone does delete instead of deleteObject or delete [] instead of deleteArray. I'd like to come up with a way of doing that, even if only on certain platforms. We can be creative and use macros if necessary. I also think we consider putting debugging sentinels in versions when NDEBUG is not set so we detect objects that are allocated with new/newObject and deleted with delete[]/deleteArray or vice versa.

It's fine to use the underscore naming convention for things where we're trying to match something in the std namespace. But I don't think that includes integral_constant, true_type, and false_type. Unless perhaps those are coming in a future standard?

+        template <> struct has_trivial_constructor<float>                    : public true_type{};
+        template <> struct has_trivial_constructor<const float>              : public true_type{};

Why are these versions with "const" needed? Are they really needed? If so, then why aren't versions with "volatile" and "const volatile" also needed?

HashTraits.h has a template called IsInteger. There are some differences in idiom between this new code and that existing code that might be worth resolving. And we may want to share with that template.

The has_trivial_constructor and has_trivial_destructor need to be specialized for raw pointers too. What does "trivial" mean for a constructor? Does it really mean you can use uninitialized memory for the type, as opposed to needing zero-filled memory?

We normally put spaces inside empty braces. We normally try to avoid lining up code, like the lined-up colons. It's "high maintenance" formatting that tends to fall apart if you rename something.

+            fastFree(p);  // We don't need to check for a null pointer; the compiler does this.

I don't understand this comment. fastFree checks for 0 and quietly does nothing. What does the compiler have to do with it?

+        // This handles the case whereby T has a trivial ctor and a trivial dtor.

I think you mean "wherein", not "whereby".

+                T* const p = static_cast<T*>(fastMalloc(sizeof(T) * count));

We normally don't mark local variables const, even if we don't plan to modify them. Is there a reason you're doing that here and not elsewhere?

+                return reinterpret_cast<T*>(p);

Why is there a cast here (in NewArrayImpl<T, false, true>? I believe p already has the right type.

+                uint64_t* p = static_cast<uint64_t*>(fastMalloc(sizeof(uint64_t) + (sizeof(T) * count)));

The use of uint64_t here is a little strange. Is that really what we want, even on 32-bit platforms? Seems a bit inefficient. I guess you're trying to not perturb the alignment so you're trying to chose an integer size that's big enough everywhere. If so, then "unsigned long long" may be the best type to use. I don't think explicitly specifying 64 bits is necessarily helpful.

I think a better approach would be to put the pointer into a char*, since that has a special case to allow pointer aliasing. I believe the implementation you have here would not compile with strict aliasing. We can't just cast directly from uint64_t* to another arbitrary type. But we can cast from char* to two different types. We should find an answer to this and test it. It's critical that we continue to compile with the strict aliasing compiler option on.

If you were using the char technique, then we could advance by a size that preserves alignment, while still using a 32-bit integer for the array size on 32-bit platforms, which is probably more efficient.

+        if (p) { // As per the C++ standard, test for null pointer.

I think this comment is inaccurate. Clearly the C++ standard doesn't tell us what a function template named deleteObject should do! What you're really trying to say here is something more like this: "Since we intend to use this wherever we'd use C++ delete, allow a null pointer as it does."

Since we plan to use this everywhere we might be using it in performance-critical code paths. Perhaps we should consider adding version of delete without a null check that instead assert? It might be good to save a branch in the case of a trivial destructor. We could also consider a version of fastFree without a null check for the same reason.

+                if (p) { // As per the C++ standard, test for null pointer.
+                    // No need to destruct the objects in this case.
+                    fastFree(p);
+                }

There's no need to add an extra check for null in this case since fastFree already does this.

I think I made enough comments to justify review- for now. I'm really interested in figuring out a strategy for using this in all the code that will help us catch places where we forget to use it.
Comment 13 Darin Adler 2008-09-21 14:28:03 PDT
(In reply to comment #10)
> we would gladly
> help to replace all new and delete operators to the new ones.

I believe the proposal is this:

    1) Add AllocBase (looking for better name) as a base class for every class we new/delete or one of its base classes.
    2) For objects we new/delete that are not classes that can be changed to derive from AllocBase, use newObject/deleteObject and newArray/deleteArray.

One trick is coming up with a good technique for making sure we do this and catch mistakes either at compile time or at least at runtime if we do it wrong. It's going to be hard to review patches knowing that new/delete is sometimes right and sometimes wrong, depending on the type of object it's being called on.
Comment 14 Paul Pedriana 2008-09-21 22:54:45 PDT
Thanks. That is very good feedback. I will work towards addressing the items over the next week or so.

The uint64_t is to cause it to act like malloc as per the C99 standard section 7.20.3 p1 ("suitably aligned"). But as you say, uint64_t isn't the best way. I wonder if the best way is to create a union of long long and long double and use the size of that. I'm unaware of any compiler that would support uint64_t but have it be larger than long long. I have however seen compilers that implement long long as a 128 bit vector-friendly type, and then there is PowerPC's 16 bit vector types themselves. FWIW, FastMalloc.cpp states that it returns 8 byte aligned memory.

I have no idea where the "We don't need to check for a null pointer; the compiler does this." statement came from. It's not a true statement in that context. I think it might have come in from a paste of some other code where it made sense. 

Comment 15 Zoltan Horvath 2008-09-24 04:13:06 PDT
We are working on trying this solution in JSCore. I have a question: Does it need to inherit all classes from AllocBase, or just those whitch currently created by new call? (I think over the future allocations of objects.)
Comment 16 Zoltan Horvath 2008-10-01 02:22:16 PDT
We have implemented the experimental version of this solution for JavaScriptCore (r37048) therefore all classes inherit from the common base class (AllocBase). We had some problems during the implementation:
 - AllocBase was extended with another &#8220;new&#8221; operator which accepts pointer as well
 - There were several non-public inheritances therefore we changed their visibility to public (the main problem is that classes non-publicly inherit from Noncopyable)
 - Constructor was added to several structs and they are called when the structs are instantiated
 - Code-generator scripts were changed to use the constructors of the structs when instantiating them in static arrays 
    for example:
     -const ClassInfo InternalFunction::info = { "Function", 0, 0, 0 };
     +const ClassInfo InternalFunction::info("Function", 0, 0, 0);
 - We corrected same compilation errors in WebCore which came from these but this solution is not implemented for the whole WebCore

We tested JavaScriptCore under linux-qt and it worked with this solution.
Comment 17 Zoltan Horvath 2008-10-01 02:25:40 PDT
Created attachment 23971 [details]
proposed patch for JavaScripCore

This is a patch for JavaScriptCore only (use it inside JavaScriptCore directory).
Comment 18 Zoltan Horvath 2008-10-01 02:28:09 PDT
Created attachment 23972 [details]
proposed patch for WebCore

This patch corrects the compilation errors in WebCore caused by the previous patch.
Comment 19 Darin Adler 2008-10-01 09:42:24 PDT
Comment on attachment 23971 [details]
proposed patch for JavaScripCore

(In reply to comment #16)
>  - There were several non-public inheritances therefore we changed their
> visibility to public (the main problem is that classes non-publicly inherit
> from Noncopyable)

Why did you change the visibility to public? That sounds wrong. Using private inheritance is a valid technique and you shouldn't just change the inheritance.

>  - Constructor was added to several structs and they are called when the
> structs are instantiated

Why?

>  - Code-generator scripts were changed to use the constructors of the structs
> when instantiating them in static arrays 
>     for example:
>      -const ClassInfo InternalFunction::info = { "Function", 0, 0, 0 };
>      +const ClassInfo InternalFunction::info("Function", 0, 0, 0);

This change is unacceptable.

On many platforms, including Mac OS X, it will cause the code of the JavaScriptCore framework to be paged in during load time so the constructors can be run. Struct initialization doesn't require code execution and is handled by the loader, but constructor execution does. This has a dramatic effect on the load time performance of an application that links with the library before it uses the library.

It seems that there are multiple unrelated changes here. I don't see how a change to ClassInfo has anything to do with the New.h header.
Comment 20 Darin Adler 2008-10-01 09:44:49 PDT
Comment on attachment 23972 [details]
proposed patch for WebCore

Clearing review flag on this. Next time around, the WebCore and JavaScriptCore changes should go into a single patch.
Comment 21 Darin Adler 2008-10-01 09:50:18 PDT
Comment on attachment 23971 [details]
proposed patch for JavaScripCore

Thanks for tackling this.

I think the main problem with this patch is that it went overboard in making AllocBase a base class for classes that are never used with new/delete. Noncopyable and ClassInfo are both examples of that. By only using AllocBase as a base class for classes where we use new/delete the size of the patch would be smaller and we wouldn't have to make unwanted changes like the loading slowdown with static constructors.

Other examples of classes that are not ever allocated at all and so don't need to inherit from AllocBase are the various hash functions. These are simply traits and don't ever get allocated at all, so there's no reason to change them. Same thing for many other classes, such as AVLTreeAbstractorForArrayCompare. Same thing for classes where we only use them on the stack and never use new.

It makes no sense to add lots of empty constructors for classes where it's not legitimate to construct them that way, so we should redesign the AllocBase class so it doesn't require an empty constructor to compile.

There are also changes like this one:

-    NodeFeatureInfo<T> result = {node, info, numConstants};
+    NodeFeatureInfo<T> result = { node, info, numConstants };

That's just a formatting tweak so it should be in a separate patch. We don't want a patch that makes a small change to a file but also strips all the trailing spaces from that file. Those should be done separately.
Comment 22 Ariya Hidayat 2008-10-02 02:58:17 PDT
Do you happen to have benchmark results (SunSpider) comparison with and without the patch? At least, this should be easy for JSC part only.
Comment 23 Balazs Kelemen 2008-10-06 04:16:07 PDT
We reimplemented our solution for JavaScriptCore by taking into account your comments. Now we only inherit classes from AllocBase which are ever allocated on the heap. Visibility of inheritances are not changed and we did not change structs.

We present the results of SunSpider on the original and on the modified WebKit (more precisely, on JSC). As you can see, there is no significant difference between the two versions.

official:Total:                 1597.4ms +/- 0.2%
patched: Total:                 1606.0ms +/- 0.9%
====================
official:Total:                 1602.8ms +/- 0.3%
patched: Total:                 1598.9ms +/- 0.2%
====================
official:Total:                 1600.1ms +/- 0.2%
patched: Total:                 1601.6ms +/- 0.1%
====================
official:Total:                 1601.1ms +/- 0.3%
patched: Total:                 1599.4ms +/- 0.2%
====================
official:Total:                 1602.6ms +/- 0.6%
patched: Total:                 1606.0ms +/- 0.9%
====================
official:Total:                 1610.3ms +/- 1.1%
patched: Total:                 1601.5ms +/- 0.2%
====================
official:Total:                 1602.2ms +/- 0.3%
patched: Total:                 1600.8ms +/- 0.1%
====================
official:Total:                 1606.4ms +/- 0.8%
patched: Total:                 1599.5ms +/- 0.2%
====================
official:Total:                 1600.2ms +/- 0.1%
patched: Total:                 1600.7ms +/- 0.1%
====================
official:Total:                 1608.4ms +/- 0.9%
patched: Total:                 1600.0ms +/- 0.2%
Comment 24 Balazs Kelemen 2008-10-06 04:20:36 PDT
Created attachment 24113 [details]
proposed patch

Implementing the solution for JavaScriptCore.
Comment 25 Darin Adler 2008-10-07 09:55:50 PDT
Comment on attachment 24113 [details]
proposed patch

This patch depends on Paul Pedriana's New.h, and if you read the bug report you'll see many unaddressed comments from me about that. So we can't just land that file until the comments are addressed. I'm not going to repeat my comments.

This patch contains a double-copy of New.h; the whole file pasted twice. We won't land that,

+        HashEntry()
+		{
+			initialize(0, 0, 0, 0);
+		}

We don't want to add code like this. We need to fix AllocBase to eliminate that spurious requirement rather than adding dead code to the project to work around it.

-    class ExpressionNode : public Node {
+    class ExpressionNode : public Node, public WTF::AllocBase {

Where is the code that does "new ExpressionNode"?

+#include <wtf/New.h>

Why are you adding this include to ustring.h?

-    class Vector {
+    class Vector : public AllocBase {

Where is the code that does "new Vector"?

-class MainThreadInvoker : public QObject {
+class MainThreadInvoker : public QObject, public AllocBase {

Where is the code that does "new MainThreadInvoker"?
Comment 26 Paul Pedriana 2008-10-07 14:18:42 PDT
I think that Darin's comment #12 needs to be addressed before proceeding much further. I will be doing that, but must apologize because I've been busy meeting deadlines in my effort to write a new graphics/font/transport back-end for my target platforms. I'm gaining traction on that my deadlines so I should be able to work on this within a few days.
Comment 27 Balazs Kelemen 2008-10-09 01:14:01 PDT
(In reply to comment #25)
> (From update of attachment 24113 [details] [edit])
> This patch depends on Paul Pedriana's New.h, and if you read the bug report
> you'll see many unaddressed comments from me about that. So we can't just land
> that file until the comments are addressed. I'm not going to repeat my
> comments.

Ok, wee stop this work until AllocBase is finished.

> This patch contains a double-copy of New.h; the whole file pasted twice.
Sorry, it was a silly mistake.

> 
> -    class ExpressionNode : public Node {
> +    class ExpressionNode : public Node, public WTF::AllocBase {
> 
> Where is the code that does "new ExpressionNode"?

There is no new call to ExpressionNode directly but its descendants are
instantiated by new. Here is the list of descendants which are instantiated:
JavaScriptCore/kjs/nodes.cpp: line 1303 - AssignResolveNode : public
                            ExpressionNode, public ThrowableExpressionData
JavaScriptCore/kjs/grammar.y: line 1249 - ReadModifyBracketNode : public
                            ExpressionNode, public ThrowableSubExpressionData
JavaScriptCore/kjs/grammar.y: line 1259 - ReadModifyDotNode : public
                            ExpressionNode, public ThrowableSubExpressionData
JavaScriptCore/kjs/grammar.y: line 1275 - PrefixBracketNode : public 
                     ExpressionNode, public ThrowablePrefixedSubExpressionData
JavaScriptCore/kjs/grammar.y: line 1281 - PrefixDotNode : public
                     ExpressionNode, public ThrowablePrefixedSubExpressionData
JavaScriptCore/kjs/grammar.y: line 1297 - PostfixBracketNode : public
                     ExpressionNode, public ThrowableSubExpressionData
JavaScriptCore/kjs/grammar.y: line 1304 - PostfixDotNode : public
                     ExpressionNode, public ThrowableSubExpressionData
JavaScriptCore/kjs/grammar.y: line 1324 - FunctionCallBracketNode : public
                     ExpressionNode, public ThrowableSubExpressionData
JavaScriptCore/kjs/grammar.y: line 1330 - FunctionCallDotNode : public 
                     ExpressionNode, public ThrowableSubExpressionData


> +#include <wtf/New.h>
> 
> Why are you adding this include to ustring.h?
Another mistake.

> -    class Vector {
> +    class Vector : public AllocBase {
> 
> Where is the code that does "new Vector"?

JavaScriptCore/VM/SegmentedVector.h: line 149 - "Segment* segment = new Segment;"
And as you can see Segment is a Vector: "typedef Vector<T, SegmentSize> Segment;"

> 
> -class MainThreadInvoker : public QObject {
> +class MainThreadInvoker : public QObject, public AllocBase {
> 
> Where is the code that does "new MainThreadInvoker"?
> 
JavaScriptCore/wtf/qt/MainThreadQt.cpp: line 59 - Q_GLOBAL_STATIC(MainThreadInvoker, webkit_main_thread_invoker)
I think this line call new, but I am not sure. We will analyse it.
Comment 28 Paul Pedriana 2008-10-21 01:38:08 PDT
The following discussion follows Darin's comment #12. It references a new revision of the proposed allocation interface which is not yet posted. This is something I can work on right away and which I've done some work on already. But some of the items below might require a little feedback before a revision is completed and posted with some confidence. 

Regarding the name AllocBase and fastMalloc, are you suggesting a base class called FastMallocBase? That seems fine and consistent with the use of fastMalloc. It would make it somewhat clear that its intent is to use fastMalloc. I will AllocBase to FastMallocBase for the new revision. 

Since the file name New.h needs to be changed, I can follow suit with this and rename it FastNew.h for the new revision. Or should it be FastMallocNew.h? 

Instead of newObject/deleteObject, I wonder if there is a way to use the fastMalloc name here. Such a name might be fastNew/fastDelete/fastNewArray/fastDeleteArray. I can't think of a simple way to avoid having special names for Array versions, as both the object and array delete functions take the same argument: T*. For the new revision I will switch it to fastNew, etc. and refer to it as such from here on. 

Being able to detect misuse (e.g. new/delete mismatches) at compile time is tough (given how C++ works), though detecting it at runtime isn't very difficult. For some users it's enough to simply have global operator new and delete assert(false), and that's what some embedded and game developers do application-wide. But that won't work for most users. A practical solution could involve a magic number approach whereby the four bytes prior to an allocation identifies how the allocation was done. Such an approach could tell not just if there was a mismatch between fastNew and fastDelete, but also could potentially be used to tell if there was a mistaken use of fastDelete instead of fastFree or vice versa. I can implement this for the new revision without much trouble. It can be controlled by a define which defaults to disabled in NDEBUG and which is tested by #if ENABLE(FAST_NEW_VALIDATION) or a suitable alternative. 

Regarding underscores in integral_constant, true_type, and false_type, these match the C++ TR1, as documented at http://msdn.microsoft.com/en-us/library/bb982910.aspx, for example. The functionality there is defined to be identical to the new C++ standard, with the only difference being that it is in the WTF namespace. And the new C++ functionality is imported into WTF if present. So I think it's a good idea to match the new C++ standard functionality.

Regarding the const version of has_trivial_constructor, yes volatile versions are needed to be complete. I left that out of the first proposal for expediency but left a comment in there about volatile. I'm not aware of a way to avoid having the four types defined, but I'm pretty sure there are only these four variations and no others. Granted, the code will execute correctly regardless of the omission of the volatile variations; it just would use a little extra memory. I will nevertheless add the volatile versions to be complete. Note that the most recent versions of GCC and VC++ (and IBM, CodeWarrior, and Dinkumware) for desktop platforms have built-in support for these type traits as per the new C++ standard. 

In looking at HashTraits.h I can see the aforementioned IsInteger type traits. It's just checking for int and not const int, volatile int, const volatile int. So it will misdetect for the latter three types. Since compilers are now providing explicit support for C++ standard type traits, I wonder if it would be useful to have HashTraits.h also defer to the compiler/standard library when possible. Would it be useful to have a TypeTraits.h header file which is shared by HashTraits.h and FastNew.h. I can implement that and add the const/volatile is_integral variations (recall that TR1 calls it is_integral).

I can add a space in between the braces and remove the lining up easily enough.

I will get rid of that mistaken "We don't need to check for a null pointer" comment.

I suppose "whereby" isn't right; that can be easily changed.

Ditto for the 'const' in local variable declarations. 

Yes the cast in NewArrayImpl<T, false, true> is redundant.

Regarding the uint64_t thing, it's there to provide an array size prefix to arrays of non-POD types, while preserving C++'s "suitably aligned" allocation requirement. I can switch that to use a type which is a union of void*/long long/double/int64_t, and that would be reasonably portable. The built-in compiler generated array new is going to be doing something similar if not identical to this in a lot of cases, so adding a prefix of 8 bytes shouldn't be adding any more space than the compiler or standard library is already adding on its own (I just verified with GCC 4.1 on PowerPC). The reason it is 8 bytes even on a 32 bit platform is for this alignment preservation. Exactly how compilers deal with this is compiler-specific, but they all face the problem that non-PODs must be aligned. 

Should the delete function (previously deleteObject, to be changed to fastDelete) check for a NULL pointer? It seems to me that it needs to do so by default. I agree that using "Since we intend to use this wherever we'd use C++ delete, allow a null pointer as it does" is a better comment to have there. I can pretty easily add a version that avoids the NULL pointer check. It's only a matter of what to call it. Is there a precedent in WebKit for this kind of naming? fastDeleteSafe? fastDeleteValid? fastDeletePtr? It will be useful to document that fastFree or the user's equivalent can handle NULL so that FastNew.h doesn't need to. 
Comment 29 Darin Adler 2008-10-21 11:32:16 PDT
Comments just on your comments. I didn't reread my original comments to make sure you addressed all of them.

(In reply to comment #28)
> Regarding the name AllocBase and fastMalloc, are you suggesting a base class
> called FastMallocBase? That seems fine and consistent with the use of
> fastMalloc.

Sounds good. I don't love the name fastMalloc, but I do love consistency and clarity about the relationship between these.

Maybe FastAllocBase would be better -- the "m" makes it even more jargony without adding much -- but FastMallocBase seems fine. Maybe there's an even better name that makes it clear that deriving fromthis makes new/delete use FastMalloc by default for objects of this class. Because, as you know, that's all it does: Saves you having to use fastNew/fastDelete explicitly. We don't use the word "Base" in classes like "Noncopyable", so I think it's worth thinking at least a tiny bit further about what a more straightforward name would be; maybe it wouldn't include Base at all.

> Since the file name New.h needs to be changed, I can follow suit with this and
> rename it FastNew.h for the new revision. Or should it be FastMallocNew.h? 

I think you should name the file after the class. So FastAllocBase.h (or whatever we end up naming it) even though it will contain helpful function templates too, not just the class. The general pattern in the WebKit project overall is to name files after classes, and it seems we can follow that here.

> Instead of newObject/deleteObject, I wonder if there is a way to use the
> fastMalloc name here. Such a name might be
> fastNew/fastDelete/fastNewArray/fastDeleteArray. I can't think of a simple way
> to avoid having special names for Array versions, as both the object and array
> delete functions take the same argument: T*. For the new revision I will switch
> it to fastNew, etc. and refer to it as such from here on.

Seems fine. While I don't love the "fast" word, I think that using it consistently as a theme will help people understand how to use it.

> Being able to detect misuse (e.g. new/delete mismatches) at compile time is
> tough (given how C++ works), though detecting it at runtime isn't very
> difficult.

Yes, that's my impression too. But lets not give up on compile time just yet.

But I am still hoping to find a creative solution to detecting the misuse at compile time, although I haven't thought of anything. In the past we have done some clever things with macros that worked just fine. For example, we could do something that involves defining new and delete macros to prevent accidental direct use of built-in new, and somehow exempt code that needs to do this. I don't want to give up on this yet, so please do think about it while you do the rest of the project. Keep in mind, too, that something to detect misuse at compile time is useful even if it isn't possible on all platforms.

Lets definitely do the work to detect it at runtime.

> For some users it's enough to simply have global operator new and
> delete assert(false), and that's what some embedded and game developers do
> application-wide.

And this will probably work on Mac OS X, since the tool chain supports a global operator new that is not actually global, but is instead local to the WebKit, WebCore, and JavaScriptCore frameworks. So we should do it on that platform, where a lot of the WebKit development is done.

> A practical solution
> could involve a magic number approach whereby the four bytes prior to an
> allocation identifies how the allocation was done. Such an approach could tell
> not just if there was a mismatch between fastNew and fastDelete, but also could
> potentially be used to tell if there was a mistaken use of fastDelete instead
> of fastFree or vice versa. I can implement this for the new revision without
> much trouble. It can be controlled by a define which defaults to disabled in
> NDEBUG and which is tested by #if ENABLE(FAST_NEW_VALIDATION) or a suitable
> alternative. 

Yes, worth doing. There are many misuses it won't detect, though. For example, if a programmer use fastNew to allocate and then uses the standard compiler delete to deallocate there's no guarantee it would be caught; those are some of the most likely types of mistakes. But it would be great to catch things like using fastNew and then pairing it with fastFree, so I think we want to do this in FastMalloc too, not just the new functions.

We should think further about how to catch more types of mistake.

> Regarding underscores in integral_constant, true_type, and false_type, these
> match the C++ TR1, as documented at
> http://msdn.microsoft.com/en-us/library/bb982910.aspx, for example.

Sounds fine. There should be a comment explaining this.

> Regarding the const version of has_trivial_constructor, yes volatile versions
> are needed to be complete. I left that out of the first proposal for expediency
> but left a comment in there about volatile.

Sorry, I overlooked the comment about volatile.

> Note that
> the most recent versions of GCC and VC++ (and IBM, CodeWarrior, and Dinkumware)
> for desktop platforms have built-in support for these type traits as per the
> new C++ standard.

I think the best way to handle this is to make a header that supplies the new standard stuff as needed for compilers that lack it. Analogous to the MathExtras.h header. Then we can just use this almost as if the standard was present on all the platforms. If we can get away with actually defining it inside the std namespace on the non-compliant platforms, that's good. Or we can import it into the WTF namespace on the platforms that have it, and define it in the WTF namespace for platforms that don't.

> In looking at HashTraits.h I can see the aforementioned IsInteger type traits.
> It's just checking for int and not const int, volatile int, const volatile int.
> So it will misdetect for the latter three types. Since compilers are now
> providing explicit support for C++ standard type traits, I wonder if it would
> be useful to have HashTraits.h also defer to the compiler/standard library when
> possible. Would it be useful to have a TypeTraits.h header file which is shared
> by HashTraits.h and FastNew.h. I can implement that and add the const/volatile
> is_integral variations (recall that TR1 calls it is_integral).

Yes, a shared implementation would be better than just leaving this existing one in place while we add a new one.

> Regarding the uint64_t thing, it's there to provide an array size prefix to
> arrays of non-POD types, while preserving C++'s "suitably aligned" allocation
> requirement. I can switch that to use a type which is a union of void*/long
> long/double/int64_t, and that would be reasonably portable.

Why not use "unsigned long long" instead of uint64_t? Is uint64_t more portable in some concrete way? Also, is there any platform where double is not sufficient? I think the specific "64" in uint64_t places the wrong emphasis. It's not the concrete size that matters, but rather "aligned enough for all built in types", and uint64_t does not say that to me any more than double would.

> I can pretty easily add a version that avoids the NULL pointer check. It's only a
> matter of what to call it. Is there a precedent in WebKit for this kind of
> naming? fastDeleteSafe? fastDeleteValid? fastDeletePtr?

I couldn't find any good precedent. There's gcProtect and gcProtectNullTolerant, but: 1) those names are too long, 2) that's backwards because the standard new/delete are null tolerant. What we need names for are versions of delete and free that are *not* null-tolerant. I would suggest these:

    fastNonNullFree
    fastNonNullDelete

Clarity is more important than brevity, I think. The above are ugly, but not too terrible.

An aside: We don't want the word "safe" in function names, because that implies that all other functions are possibly "unsafe"; I don't want to create a sense of vague dread about all the other functions ;-)

> It will be useful to
> document that fastFree or the user's equivalent can handle NULL so that
> FastNew.h doesn't need to.

I think it should already be clear that since fastFree is an implementation of free, and free requires handling of null pointers, that fastFree is obligated to handle null pointers. But maybe you're just saying we need a comment to that effect in FastMalloc.h; you could include that in your patch. Or are you saying something else?
Comment 30 Paul Pedriana 2008-11-20 21:13:06 PST
We (Zoltan Horvath and I) have produced a new patch (AllAllocation_v3.patch) which addresses the previous review comments. This patch includes the primary allocator and type traits-related code as well as a working patch of WebKit (revision 38334) to use the allocator code and which serves to demonstrate feasibility. The core of the patch involves the following files:

FastAllocBase.h 
    Primary header
TypeTraits.h
    New file which includes improved traits moved from HashTraits.h
FastMalloc.h
    Implements FAST_MALLOC_MATCH_VALIDATION
FastMalloc.cpp
    Implements FAST_MALLOC_MATCH_VALIDATION
Platform.h
    Provides definition of ENABLE(FAST_MALLOC_MATCH_VALIDATION)
HashTraits.h
    moved is_integer to to TypeTraits.h

Additionally there is FastAllocBaseTest.cpp which is a standalone test of the functionality.

A possibility is to apply this patch in two phases: 1) just the changes above with 2) other changes to follow. We felt it was useful to produce a demonstration of the patch which is carried through to all affected WebKit files to demonstrate ease and practicality of implementation. Zoltan can discuss the implementation of it better than me, since he did most of the work there and was the producer of the current .patch file. 

Zoltan could also speak more about observed performance effects, though I'd be surprised if performance changed in a measurable way. When FAST_MALLOC_MATCH_VALIDATION is enabled I expect that memory usage could increase by perhaps as much as 1-2% for patterns that include a lot of small allocations.

Most of the following discussion refers to FastAllocatorBase.h. Hopefully it isn't too verbose.

  >> I wrote:
  >> Regarding the name AllocBase and fastMalloc, are you suggesting a base 
  >> class called FastMallocBase? That seems fine and consistent with the 
  >> use of fastMalloc.

  > Darin wrote:
  > Sounds good. I don't love the name fastMalloc, but I do love consistency 
  > and clarity about the relationship between these.
  > 
  > Maybe FastAllocBase would be better -- the "m" makes it even more jargony
  > without adding much -- but FastMallocBase seems fine. Maybe there's an 
  > even better name that makes it clear that deriving from this makes 
  > new/delete use FastMalloc by default for objects of this class. Because, 
  > as you know, that's all it does: Save you having to use fastNew/fastDelete
  > explicitly. We don't use the word "Base" in classes like "Noncopyable", 
  > so I think it's worth thinking at least a tiny bit further about what a 
  > more straightforward name would be; maybe it wouldn't include Base at all.

I've got it as FastAllocBase now. I'm trying to think of a name that doesn't 
include "Base" but conveys some meaning. "Noncopyable" works well for its 
purpose, and an analog for an allocation base class might be something like 
"FastAllocated". I did notice that GenericHashTraitsBase is the base for 
HashTraits in HashTraits.h.


  >> Since the file name New.h needs to be changed, I can follow suit with 
  >> this and rename it FastNew.h for the new revision. Or FastMallocNew.h? 

  > I think you should name the file after the class. So FastAllocBase.h 
  > (or whatever we end up naming it) even though it will contain helpful 
  > function templates too, not just the class. The general pattern in the 
  > WebKit project overall is to name files after classes, and it seems we 
  > can follow that here.

I've got it as FastAllocBase.h, though it could be changed to something 
like FastAllocated.h.


  >> Instead of newObject/deleteObject, I wonder if there is a way to use the
  >> fastMalloc name here. Such a name might be
  >> fastNew/fastDelete/fastNewArray/fastDeleteArray. I can't think of a 
  >> simple way to avoid having special names for Array versions, as both the 
  >> object and array delete functions take the same argument: T*. For the 
  >> new revision I will switch it to fastNew, etc. and refer to it as such 
  >> from here on.

  > Seems fine. While I don't love the "fast" word, I think that using it
  > consistently as a theme will help people understand how to use it.

It is now implemented as fastNew, etc.


  >> Being able to detect misuse (e.g. new/delete mismatches) at compile time 
  >> is tough (given how C++ works), though detecting it at runtime isn't 
  >> very difficult.

  > Yes, that's my impression too. But lets not give up on compile time 
  > just yet. But I am still hoping to find a creative solution to detecting 
  > the misuse at compile time, although I haven't thought of anything. 
  > In the past we have done some clever things with macros that worked 
  > just fine. For example, we could do something that involves defining new 
  > and delete macros to prevent accidental direct use of built-in new, and 
  > somehow exempt code that needs to do this. I don't want to give up on this
  > yet, so please do think about it while you do the rest of the project. 
  > Keep in mind, too, that something to detect misuse at compile time is 
  > useful even if it isn't possible on all platforms.

Lets definitely do the work to detect it at runtime.

There's now code to detect mismatches of allocation. It's enabled via 
#if ENABLE(FAST_MALLOC_MATCH_VALIDATION) and has undergone some testing. 
The implementation is fairly neat in that it allows any heap implementation
to support it; it isn't limited to the FastMalloc heap.

  >> For some users it's enough to simply have global operator new and
  >> delete assert(false), and that's what some embedded and game developers 
  >> do application-wide.

  > And this will probably work on Mac OS X, since the tool chain supports a 
  > global operator new that is not actually global, but is instead local to 
  > the WebKit, WebCore, and JavaScriptCore frameworks. So we should do it 
  > on that platform, where a lot of the WebKit development is done.

I think this is something that would be done outside this patch in the 
application's or library's new/delete implementation. My testing has included
testing this and it works as expected.

  >> A practical solution could involve a magic number approach whereby the 
  >> four bytes prior to an allocation identifies how the allocation was done.
  >> Such an approach could tell not just if there was a mismatch between 
  >> fastNew and fastDelete, but also could potentially be used to tell if 
  >> there was a mistaken use of fastDelete instead of fastFree or vice versa.
  >> I can implement this for the new revision without much trouble. It can 
  >> be controlled by a define which defaults to disabled in NDEBUG and which 
  >> is tested by #if ENABLE(FAST_NEW_VALIDATION) or a suitable alternative. 

  > Yes, worth doing. There are many misuses it won't detect, though. 
  > For example, if a programmer use fastNew to allocate and then uses the 
  > standard compiler delete to deallocate there's no guarantee it would be 
  > caught; those are some of the most likely types of mistakes. But it would 
  > be great to catch things like using fastNew and then pairing it with 
  > fastFree, so I think we want to do this in FastMalloc too, not just the 
  > new functions. We should think further about how to catch more types of 
  > mistake.

It turns out that fastNew followed be delete or free is being detected at 
the glibc level and msvc CRT level in our tests. We've tested the other
combinations and they do the right thing. 

  >> Regarding underscores in integral_constant, true_type, and false_type, 
  >> these match the C++ TR1, as documented at:
  >> http://msdn.microsoft.com/en-us/library/bb982910.aspx, for example.

  > Sounds fine. There should be a comment explaining this.

Done

  >> In looking at HashTraits.h I can see the aforementioned IsInteger type 
  >> traits. It's just checking for int and not const int, volatile int, 
  >> const volatile int. So it will misdetect for the latter three types. 
  >> Since compilers are now providing explicit support for C++ standard 
  >> type traits, I wonder if it would be useful to have HashTraits.h also 
  >> defer to the compiler/standard library when possible. Would it be useful 
  >> to have a TypeTraits.h header file which is shared by HashTraits.h and 
  >> FastNew.h. I can implement that and add the const/volatile is_integral 
  >> variations (recall that TR1 calls it is_integral).

  > Yes, a shared implementation would be better than just leaving this 
  > existing one in place while we add a new one.

This is implemented in TypeTraits.h and it includes const/volatile traits
for both HashTraits' is_integral and FastMallocBase's has_trivial_constructor.


  >> Regarding the uint64_t thing, it's there to provide an array size prefix 
  >> to arrays of non-POD types, while preserving C++'s "suitably aligned" 
  >> allocation requirement. I can switch that to use a type which is a union 
  >> of void*/long long/double/int64_t, and that would be reasonably portable.

  > Why not use "unsigned long long" instead of uint64_t? Is uint64_t more 
  > portable in some concrete way? Also, is there any platform where double 
  > is not sufficient? I think the specific "64" in uint64_t places the wrong 
  > emphasis. It's not the concrete size that matters, but rather "aligned 
  > enough for all built in types", and uint64_t does not say that to me any 
  > more than double would.

I've used a templated union of unsigned long long and T. This should address
size and aliasing issues.


  >> Should the delete function (previously deleteObject, to be changed to
  >> fastDelete) check for a NULL pointer? It seems to me that it needs to do
  >> so by default. I agree that using "Since we intend to use this wherever 
  >> we'd use C++ delete, allow a null pointer as it does" is a better comment
  >> to have there. I can pretty easily add a version that avoids the NULL 
  >> pointer check. It's only a matter of what to call it. Is there a 
  >> precedent in WebKit for this kind of naming? fastDeleteSafe? 
  >> fastDeleteValid? fastDeletePtr?

  > I couldn't find any good precedent. There's gcProtect and
  > gcProtectNullTolerant, but: 1) those names are too long, 2) that's 
  > backwards because the standard new/delete are null tolerant. What we need 
  > names for are versions of delete and free that are *not* null-tolerant. 
  > I would suggest these:
  >     fastNonNullFree
  >     fastNonNullDelete
  > Clarity is more important than brevity, I think. The above are ugly, but 
  > not too terrible.
  > 
  > An aside: We don't want the word "safe" in function names, because that 
  > implies that all other functions are possibly "unsafe"; I don't want to 
  > create a sense of vague dread about all the other functions ;-)

I've added fastNonNullDelete to match fastDelete. I wonder if having this 
is necessary or if it's better to simply require that the allocator 
implementation always act like fastFree and check for a NULL pointer. 


  >> It will be useful to document that fastFree or the user's equivalent can
  >> handle NULL so that FastNew.h doesn't need to.

  > I think it should already be clear that since fastFree is an 
  > implementation of free, and free requires handling of null pointers, 
  > that fastFree is obligated to handle null pointers. But maybe you're just
  > saying we need a comment to that effect in FastMalloc.h; you could include 
  > that in your patch. Or are you saying something else?

I've added a comment clarifying that the underlying allocator must check 
for NULL for the use of FastDelete.


  >> + T* const p = static_cast<T*>(fastMalloc(sizeof(T) * count));

  > We normally don't mark local variables const, even if we don't plan to 
  > modify them. Is there a reason you're doing that here and not elsewhere?

This is now fixed.


  >> + return reinterpret_cast<T*>(p);

  > Why is there a cast here (in NewArrayImpl<T, false, true>? I believe p 
  > already has the right type.

This is now fixed.
Comment 31 Paul Pedriana 2008-11-20 21:19:18 PST
Created attachment 25333 [details]
Patch against revision 38334
Comment 32 Paul Pedriana 2008-11-20 21:20:41 PST
Created attachment 25334 [details]
Unit test for latest FastAllocBase patch.
Comment 33 Darin Adler 2008-11-21 09:38:51 PST
Comment on attachment 25333 [details]
Patch against revision 38334

I started reviewing this, but I'm only part way through it. I'll post my comments soon.
Comment 34 Darin Adler 2008-11-23 20:24:49 PST
Comment on attachment 25333 [details]
Patch against revision 38334

> -    class HeavyProfile : public Profile {
> +    class HeavyProfile : public Profile, public WTF::FastAllocBase {

Instead of having all the classes that derive from Profile derive from FastAllocBase, I suggest having Profile derive from FastAllocBase.

> -    struct CallIdentifier {
> +    struct CallIdentifier : public WTF::FastAllocBase {

This class is never used with new/delete, so I'm not sure it's helpful to derive from FastAllocBase. What's the rule of thumb you're using to decide when to derive from it? I'm thinking we should start with classes that are used with "new".

> -    class StructureID : public RefCounted<StructureID> {
> +    class StructureID : public RefCounted<StructureID>, public WTF::FastAllocBase {

This class has been renamed to Structure, so this patch is a bit stale.

> Index: JavaScriptCore/runtime/Identifier.h
> ===================================================================
> --- JavaScriptCore/runtime/Identifier.h	(revision 38334)
> +++ JavaScriptCore/runtime/Identifier.h	(working copy)
> @@ -23,12 +23,13 @@
>  
>  #include "JSGlobalData.h"
>  #include "UString.h"
> +#include <wtf/FastAllocBase.h>

Since UString.h already includes this header, the new include is redundant. There's a script, find-extra-includes, that you can use to find redundant includes of this file. You could do "find-extra-includes | grep FastAllocBase" and eliminate any redundant includes before you put the next patch up for review.

> -    Rep* r = new Rep;
> +    Rep* r = fastNew<Rep>();

The Rep struct should be changed to inherit from FastAllocBase.

> -    , arrayTable(new HashTable(JSC::arrayTable))
> +    , arrayTable(fastNew<HashTable>(JSC::arrayTable))

The HashTable struct should be changed to inherit from FastAllocBase.

> -    class JSValue : Noncopyable {
> +    class JSValue : Noncopyable, public WTF::FastAllocBase {

No JSValue object or object of any class derived from it can be allocated with a plain "new". These objects are garbage-collected. So there's no need to add the base class here.

> -class MainThreadInvoker : public QObject {
> +class MainThreadInvoker : public QObject, public FastAllocBase {

Is this class used with "new"? I couldn't find any place where it was.

> -class ThreadCondition : Noncopyable {
> +class ThreadCondition : Noncopyable, public WTF::FastAllocBase {

I can't find any use of this class with "new".

> +    template <typename T> struct has_trivial_constructor : public integral_constant<bool, __has_trivial_constructor(T)>{ };

Missing spaces between the ">" and the "{" on many lines in this file.

> +namespace WTF {
> +
> +    COMPILE_ASSERT(is_integral<bool>::value, WTF_IsInteger_bool_true);
> +    COMPILE_ASSERT(is_integral<const bool>::value, WTF_IsInteger_bool_true);
> +    COMPILE_ASSERT(is_integral<volatile bool>::value, WTF_IsInteger_bool_true);
> +    COMPILE_ASSERT(is_integral<const volatile bool>::value, WTF_IsInteger_bool_true);
> +    COMPILE_ASSERT(is_integral<char>::value, WTF_IsInteger_char_true);
> +    COMPILE_ASSERT(is_integral<signed char>::value, WTF_IsInteger_signed_char_true);
> +    COMPILE_ASSERT(is_integral<unsigned char>::value, WTF_IsInteger_unsigned_char_true);
> +    COMPILE_ASSERT(is_integral<short>::value, WTF_IsInteger_short_true);
> +    COMPILE_ASSERT(is_integral<unsigned short>::value, WTF_IsInteger_unsigned_short_true);
> +    COMPILE_ASSERT(is_integral<int>::value, WTF_IsInteger_int_true);
> +    COMPILE_ASSERT(is_integral<unsigned int>::value, WTF_IsInteger_unsigned_int_true);
> +    COMPILE_ASSERT(is_integral<long>::value, WTF_IsInteger_long_true);
> +    COMPILE_ASSERT(is_integral<unsigned long>::value, WTF_IsInteger_unsigned_long_true);
> +    COMPILE_ASSERT(is_integral<long long>::value, WTF_IsInteger_long_long_true);
> +    COMPILE_ASSERT(is_integral<unsigned long long>::value, WTF_IsInteger_unsigned_long_long_true);
> +
> +#if !defined(_MSC_VER) || defined(_NATIVE_WCHAR_T_DEFINED)
> +    COMPILE_ASSERT(is_integral<wchar_t>::value, WTF_IsInteger_wchar_t_true);
> +#endif
> +
> +    COMPILE_ASSERT(!is_integral<char*>::value, WTF_IsInteger_char_pointer_false);
> +    COMPILE_ASSERT(!is_integral<const char*>::value, WTF_IsInteger_const_char_pointer_false);
> +    COMPILE_ASSERT(!is_integral<volatile char*>::value, WTF_IsInteger_volatile_char_pointer__false);
> +    COMPILE_ASSERT(!is_integral<double>::value, WTF_IsInteger_double_false);
> +    COMPILE_ASSERT(!is_integral<float>::value, WTF_IsInteger_float_false);
> +
> +} // namespace WTF

Could we put these assertions in a .cpp file instead of in the header? I don't think we need to compile these compile time assertions in every single file that includes TypeTraits.h.

> +    // This defines a type which holds an unsigned integer and is the same 
> +    // size as the minimally aligned memory allocation. It can be used as 
> +    // a size prefix for array allocations.
> +    typedef unsigned long long AlignedAllocSize;

I think that the word "Size" in this name makes it sound like it's a block size. That's not what this is at all. Maybe MaximallyAlignedInteger would be a better name for this type? Or AllocAlignmentInteger?

Also, I don't understand what the comment "can be used as a size prefix" means, since you're storying the AllocType byte there, not a size.

> +    inline AllocType getFastMallocMatchValidationType(const void* p)
> +    {
> +        const char* pType = static_cast<const char*>(p) - sizeof(AlignedAllocSize);
> +        return static_cast<AllocType>(*pType);
> +    }

WebKit style guidelines onit the word "get" in a function name like this one.

Since AllocType is used for validation, I think you should use longer values instead of just a byte, and more unusual numbers, to decrease the chance of a false negative.

Why not put AllocType and these functions into the Internal namespace?

> +#include <wtf/TypeTraits.h>
> +
> +
> +namespace WTF {

We normally don't put two blank lines in a place like this. I noticed otehr examples of that, and I think you should omit those also.

> +#if ENABLE(FAST_MALLOC_MATCH_VALIDATION)
> +            if (p)
> +                setFastMallocMatchValidationType(p, AllocTypeClassNew);
> +#endif

I'd like to see this ditty in an inline function so that FastAllocBase can share it with fastNew without repeating the code twice. This would also have the benefit of moving the #if out of those functions.

> +#if ENABLE(FAST_MALLOC_MATCH_VALIDATION)
> +            if (p) {
> +                if (getFastMallocMatchValidationType(p) != AllocTypeClassNew)
> +                    fastMallocMatchFailed(p);
> +                setFastMallocMatchValidationType(p, AllocTypeMalloc);  // Set it to this so that fastFree thinks it's OK.
> +            }
> +#endif

Same comment here.

> +    /* Disabled until it's known that we want to actually do this.
> +    template <typename T, typename Arg1>
> +    inline T* fastNew(Arg1 arg1)
> +    {
> +        void* p = fastMalloc(sizeof(T));
> +
> +        if (p) {
> +#if ENABLE(FAST_MALLOC_MATCH_VALIDATION)
> +            if (p)
> +                setFastMallocMatchValidationType(p, AllocTypeFastNew);
> +#endif
> +            return ::new(p) T(arg1);
> +        }
> +
> +        return 0;
> +    }
> +    */

It looks like you have this commented out, but later in the same file you have one that's not commented out.

> +                void* p = fastMalloc(sizeof(AlignedAllocSize) + (sizeof(T) * count));
> +                ArraySize<T> a = { static_cast<AlignedAllocSize*>(p) };
> +
> +                if (p) {
> +#if ENABLE(FAST_MALLOC_MATCH_VALIDATION)
> +                    setFastMallocMatchValidationType(p, AllocTypeFastNewArray);
> +#endif
> +                    *a.size++ = count;

I don't understand how you can store both the array size and the validation type in the same place; "p" and "a.size" both point to the same address. Maybe I'm missing something obvious here.

> +        if(p) { // We need to test p here because we dereference it.

Needs a space after the "if".

> +                if(p) { // We need to test p here because we dereference it.

Needs a space after the "if".

> +                    while(pEnd-- != p)

Needs a space after the "while".

> +    template <typename T>
> +    inline void fastNonNullDelete(T* p)
> +    {
> +        if (p) {

Given "NonNull" I don't understand why there's a null check here. Maybe "NonNull" means something different than I think it does.

> +                    while(pEnd-- != p)

Needs a space after the "while".

> +    n += sizeof(AlignedAllocSize);  // A pathologically high n value could result in this overflowing.

I think we need code to handle that and return a 0, not just a comment.
 
> +    totalBytes += sizeof(AlignedAllocSize); // A pathologically high totalBytes value could result in this overflowing.

I think we need code to handle that and return a 0, not just a comment.

>  void cfree(void* ptr) {
> -#ifndef WTF_CHANGES
> -    MallocHook::InvokeDeleteHook(ptr);
> -#endif
> -  do_free(ptr);
> +  free(ptr);
>  }

Do you really need to change this?

>  
>  #ifndef WTF_CHANGES
> @@ -3324,25 +3430,35 @@
>      return realloc<false>(old_ptr, new_size);
>  }
>  
> +
>  template <bool abortOnFailure>
>  ALWAYS_INLINE
>  #endif
>  void* realloc(void* old_ptr, size_t new_size) {
>    if (old_ptr == NULL) {
> +#if ENABLE(FAST_MALLOC_MATCH_VALIDATION)
> +    void* result = malloc(new_size);
> +#else
>      void* result = do_malloc(new_size);
>  #ifndef WTF_CHANGES
>      MallocHook::InvokeNewHook(result, new_size);
>  #endif
> +#endif
>      return result;
>    }
>    if (new_size == 0) {
> -#ifndef WTF_CHANGES
> -    MallocHook::InvokeDeleteHook(old_ptr);
> -#endif
>      free(old_ptr);
>      return NULL;
>    }

I don't understand what the value is of removing InvokeDeleteHook here.

> +#if !defined(ENABLE_FAST_MALLOC_MATCH_VALIDATION)
> +#define ENABLE_FAST_MALLOC_MATCH_VALIDATION 1
> +#endif

This seems wrong. I think this is something we'd want off in production builds.

> -    class NullNode : public ExpressionNode {
> +    class NullNode : public ExpressionNode, public WTF::FastAllocBase {

We can avoid adding so much separate derivation of every node class from FastAllocBase by making ParserRefCounted inherit from it.

> Index: WebKit/qt/tests/qwebframe/tst_qwebframe.cpp
> ===================================================================
> --- WebKit/qt/tests/qwebframe/tst_qwebframe.cpp	(revision 38334)
> +++ WebKit/qt/tests/qwebframe/tst_qwebframe.cpp	(working copy)
> @@ -27,6 +27,8 @@
>  #include <qwebhistory.h>
>  #include <QRegExp>
>  #include <QNetworkRequest>
> +
> +#include <wtf/FastAllocBase.h>
>  //TESTED_CLASS=
>  //TESTED_FILES=
>  
> @@ -69,7 +71,7 @@
>  Q_DECLARE_METATYPE(QVariantList)
>  Q_DECLARE_METATYPE(QVariantMap)
>  
> -class MyQObject : public QObject
> +class MyQObject : public QObject, public WTF::FastAllocBase
>  {
>      Q_OBJECT
>  
> @@ -2037,7 +2039,7 @@
>      QCOMPARE(m_view->page()->mainFrame()->toHtml(), html);
>  }
>  
> -class TestNetworkManager : public QNetworkAccessManager
> +class TestNetworkManager : public QNetworkAccessManager, public WTF::FastAllocBase
>  {
>  public:
>      TestNetworkManager(QObject* parent) : QNetworkAccessManager(parent) {}

Given this is just a test, I think it's OK to just leave these classes alone.

> Index: WebKit/qt/tests/qwebframe/qwebframe.pro
> ===================================================================
> --- WebKit/qt/tests/qwebframe/qwebframe.pro	(revision 38334)
> +++ WebKit/qt/tests/qwebframe/qwebframe.pro	(working copy)
> @@ -4,3 +4,6 @@
>  SOURCES  += tst_qwebframe.cpp
>  QT += testlib network
>  QMAKE_RPATHDIR = $$OUTPUT_DIR/lib $$QMAKE_RPATHDIR
> +
> +LIBS += -L../../../../JavaScriptCore
> +LIBS += -lJavaScriptCore
> Index: WebKit/qt/tests/qwebpage/qwebpage.pro
> ===================================================================
> --- WebKit/qt/tests/qwebpage/qwebpage.pro	(revision 38334)
> +++ WebKit/qt/tests/qwebpage/qwebpage.pro	(working copy)
> @@ -4,3 +4,6 @@
>  SOURCES  += tst_qwebpage.cpp
>  QT += testlib network
>  QMAKE_RPATHDIR = $$OUTPUT_DIR/lib $$QMAKE_RPATHDIR
> +
> +LIBS += -L../../../../JavaScriptCore
> +LIBS += -lJavaScriptCore

Are you sure these are correct? Why wasn't this needed before now? It this a good/reasonable new requirement?

I can't review all the rest of this. I think it's a mistake to make adoption of this across the entire project one giant patch. It would be much smarter to convert a class in JavaScriptCore and another in WebCore and get the kinks out before adding the code to hundreds of files all at the same time.

In general, I think we should put FastAllocBase into some base classes such as RefCounted and thereby avoid having to explicitly put it in so many classes individually, and in various other cases put it into a class as far up the inheritance tree as possible.

I'm concerned that the patch has modifications to the Qt build system but not others. Is that because you've only tried this with the Qt build system, or is there something Qt-specific? I think we should get the build system changes out of the way first, so a small patch that just adds the dependency and gets the build right would be good to do first before the giant patch.

review- for now.

Lets come up with a smaller patch that introduces the new header and adopts it in enough places so we can test it, rather than an "all at once" approach. Then we can follow up with adopting it elsewhere and then some kind of "check that it's used everywhere" final step.
Comment 35 Zoltan Horvath 2008-11-24 07:32:57 PST
>> -    class HeavyProfile : public Profile {
>> +    class HeavyProfile : public Profile, public WTF::FastAllocBase {
>
>Instead of having all the classes that derive from Profile derive from
>FastAllocBase, I suggest having Profile derive from FastAllocBase.

Okay, I'll do this.

>> -    struct CallIdentifier {
>> +    struct CallIdentifier : public WTF::FastAllocBase {
>
>This class is never used with new/delete, so I'm not sure it's helpful to
>derive from FastAllocBase. What's the rule of thumb you're using to decide when
>to derive from it? I'm thinking we should start with classes that are used with
>"new".

new call: JavaScriptCore/profiler/CallIdentifier.h: line 84

We use a static analyzer tool called Columbus (http://www.inf.u-szeged.hu/sed/softwarequality) to find out which classes are instantiated with "new" so we know exactly which classes have to be inherited from FastAllocBase. (With the help of Columbus we can filter out unneccessary inheritances and we can handle diamond inheritances properly.)

>> -    class StructureID : public RefCounted<StructureID> {
>> +    class StructureID : public RefCounted<StructureID>, public WTF::FastAllocBase {
>
>This class has been renamed to Structure, so this patch is a bit stale.

It will be updated in the next patch. I did this patch for r38344.

>> Index: JavaScriptCore/runtime/Identifier.h
>> ===================================================================
>> --- JavaScriptCore/runtime/Identifier.h       (revision 38334)
>> +++ JavaScriptCore/runtime/Identifier.h       (working copy)
>> @@ -23,12 +23,13 @@
>>  
>>  #include "JSGlobalData.h"
>>  #include "UString.h"
>> +#include <wtf/FastAllocBase.h>
>
>Since UString.h already includes this header, the new include is redundant.
>There's a script, find-extra-includes, that you can use to find redundant
>includes of this file. You could do "find-extra-includes | grep FastAllocBase"
>and eliminate any redundant includes before you put the next patch up for
>review.

I'll remove redundant headers, thanks for the tip.

>> -    Rep* r = new Rep;
>> +    Rep* r = fastNew<Rep>();
>
>The Rep struct should be changed to inherit from FastAllocBase.
>

If I inherit Rep from FastAllocBase it'll be no longer a POD and it can't be initalized with a C-style initializer list. (causes compile error)
Should I add new/delete member functions to Rep or use fastNew/fastDelete?

>> -    , arrayTable(new HashTable(JSC::arrayTable))
>> +    , arrayTable(fastNew<HashTable>(JSC::arrayTable))
>
>The HashTable struct should be changed to inherit from FastAllocBase.

I had the same problem with Rep. I'll check it again in the new patch.

>> -    class JSValue : Noncopyable {
>> +    class JSValue : Noncopyable, public WTF::FastAllocBase {
>
>No JSValue object or object of any class derived from it can be allocated with
>a plain "new". These objects are garbage-collected. So there's no need to add
>the base class here.

Okay, I'll modify.

>> -class MainThreadInvoker : public QObject {
>> +class MainThreadInvoker : public QObject, public FastAllocBase {
>
>Is this class used with "new"? I couldn't find any place where it was.

new call: MainThreadInvoker: JavaScriptCore/wtf/qt/MainThreadQt.cpp: line 59
Qt's Q_GLOBAL_STATIC macro causes: MainThreadInvoker *x = new MainThreadInvoker;

>> -class ThreadCondition : Noncopyable {
>> +class ThreadCondition : Noncopyable, public WTF::FastAllocBase {
>
>I can't find any use of this class with "new".

new call: WebCore/storage/DatabaseTask.cpp: line 74

>> -    class NullNode : public ExpressionNode {
>> +    class NullNode : public ExpressionNode, public WTF::FastAllocBase {
>
>We can avoid adding so much separate derivation of every node class from
>FastAllocBase by making ParserRefCounted inherit from it.

Okay. I'll inherit ParserRefCounted. I didn't do this because there are some classes which inherited from ParserRefCounted but not constructed by new.

>Given this is just a test, I think it's OK to just leave these classes alone.

Ok.

>> Index: WebKit/qt/tests/qwebframe/qwebframe.pro
>> Index: WebKit/qt/tests/qwebpage/qwebpage.pro
>>...
>> +LIBS += -L../../../../JavaScriptCore
>> +LIBS += -lJavaScriptCore
>
>Are you sure these are correct? Why wasn't this needed before now? It this a
>good/reasonable new requirement?

We'll leave tests alone so it won't be needed.

>I can't review all the rest of this. I think it's a mistake to make adoption of
>this across the entire project one giant patch. It would be much smarter to
>convert a class in JavaScriptCore and another in WebCore and get the kinks out
>before adding the code to hundreds of files all at the same time.

Ok. First we'll produce a patch just for JavaScriptCore.

>In general, I think we should put FastAllocBase into some base classes such as
>RefCounted and thereby avoid having to explicitly put it in so many classes
>individually, and in various other cases put it into a class as far up the
>inheritance tree as possible.

Okay, but how far up? There are a lot of classes which are inherited from (for example) Noncopyable but only a part of them instantiated with "new". And what can we do when the inheritance is private?

>I'm concerned that the patch has modifications to the Qt build system but not
>others. Is that because you've only tried this with the Qt build system, or is
>there something Qt-specific? I think we should get the build system changes out
>of the way first, so a small patch that just adds the dependency and gets the
>build right would be good to do first before the giant patch.

We've only tried this under linux-qt port. For the next version we need to modify the build system just for QtLauncher.
Comment 36 Darin Adler 2008-11-24 09:24:29 PST
(In reply to comment #35)
> If I inherit Rep from FastAllocBase it'll be no longer a POD and it can't be
> initalized with a C-style initializer list. (causes compile error)
> Should I add new/delete member functions to Rep or use fastNew/fastDelete?

Probably fastNew/fastDelete. If we wanted to add new/delete member functions, I think we'd want to do it with a macro, not writing it by hand.

> Okay. I'll inherit ParserRefCounted. I didn't do this because there are some
> classes which inherited from ParserRefCounted but not constructed by new.

I think we can be a bit liberal about adding FastAllocBase as a base class for various classes even if they aren't being used with new/delete. I don't think the inheritance will do any harm.

> Okay, but how far up? There are a lot of classes which are inherited from (for
> example) Noncopyable but only a part of them instantiated with "new". And what
> can we do when the inheritance is private?

Currently we inherit from Noncopyable privately because there's no difference between inheriting it privately and publicly. I think it would be fine to add FastAllocBase as a base class for Noncopyable, just for convenience, and then change inheritance of Noncopyable to be public inheritance -- it wasn't important to inherit from it privately, and it's reasonable to change it.

> We've only tried this under linux-qt port. For the next version we need to
> modify the build system just for QtLauncher.

We'll want a version of this patch that tries to make all ports work, even if it's not tested with all of them. I suggest making a small patch that concentrates on the build issues of adding new source files and getting them compiled in all ports, before doing a larger patch that tries to adopt this class in all the relevant places.

It's easier to get things right if we do things in smaller pieces.

In particular, you want a small, easy-to-review patch that will exercise the issues. For example, a patch that adds the new files and adds one use of each aspect of the header in WebCore as well as in JavaScriptCore would be a good idea. Then a larger patch that simply adopts the mechanism more widely.
Comment 37 Paul Pedriana 2008-12-01 22:33:14 PST
I'm working on a revision of the allocator code that addresses Darin's comments in https://bugs.webkit.org/show_bug.cgi?id=20422#c34. It might be useful to know that I have been working on the allocator implementation and its unit test, while Zoltan has been working on the application code to use it.

In the meantime, I have a response to three items.

    >> -  do_free(ptr);
    >> +  free(ptr);
    >>  }

    > Do you really need to change this?

I believe, the former behavior is a bug, as it bypasses the InvokeDeleteHook mechanism. cfree is a sibling to free and a complement to calloc. Without the above fix, a call to calloc would result in a call to InvokeNewHook, but the eventual call to cfree would never call the InvokeDeleteHook.

    >>    if (new_size == 0) {
    >> -#ifndef WTF_CHANGES
    >> -    MallocHook::InvokeDeleteHook(old_ptr);
    >> -#endif
    >>      free(old_ptr);
    >>      return NULL;
    >>    }

    > I don't understand what the value is of removing InvokeDeleteHook here.

The InvokeDeleteHook call is redundant and is called again within free. Aside from being less efficient, it complicates the user's implementation requirements and makes it harder to do validation since the user would have to keep track of pointers being 'double-deleted'.

If for some reason people want to keep the existing (IMO buggy) behavior, then I can undo the above.

    >> I don't understand how you can store both the array size and
    >> the validation type in the same place; "p" and "a.size" both point
    >> to the same address. Maybe I'm missing something obvious here.

They aren't stored in the same place. The validation type is stored prior
in memory to the array size. First the validation type is written in some
reserved prefix bytes, then the array size is written in its own reserved
prefix bytes. The array code doesn't care about the location of the
validation type bytes. The unit tests exercise this and execute OK, FWIW.
Comment 38 Darin Adler 2008-12-03 12:39:34 PST
(In reply to comment #37)
>     >> -  do_free(ptr);
>     >> +  free(ptr);
>     >>  }
> 
>     > Do you really need to change this?
> 
> I believe, the former behavior is a bug, as it bypasses the InvokeDeleteHook
> mechanism. cfree is a sibling to free and a complement to calloc. Without the
> above fix, a call to calloc would result in a call to InvokeNewHook, but the
> eventual call to cfree would never call the InvokeDeleteHook.

OK. Lets fix this separately, then. Doesn't seem closely related to the rest of what's going on here.

>     >>    if (new_size == 0) {
>     >> -#ifndef WTF_CHANGES
>     >> -    MallocHook::InvokeDeleteHook(old_ptr);
>     >> -#endif
>     >>      free(old_ptr);
>     >>      return NULL;
>     >>    }
> 
>     > I don't understand what the value is of removing InvokeDeleteHook here.
> 
> The InvokeDeleteHook call is redundant and is called again within free. Aside
> from being less efficient, it complicates the user's implementation
> requirements and makes it harder to do validation since the user would have to
> keep track of pointers being 'double-deleted'.
> 
> If for some reason people want to keep the existing (IMO buggy) behavior, then
> I can undo the above.

Same comment. Lets fix these without including all the other changes.

>     >> I don't understand how you can store both the array size and
>     >> the validation type in the same place; "p" and "a.size" both point
>     >> to the same address. Maybe I'm missing something obvious here.
> 
> They aren't stored in the same place. The validation type is stored prior
> in memory to the array size. First the validation type is written in some
> reserved prefix bytes, then the array size is written in its own reserved
> prefix bytes. The array code doesn't care about the location of the
> validation type bytes. The unit tests exercise this and execute OK, FWIW.

There are two ways to resolve this then:

    1) Make me smarter.
    2) Make the code easier to read.

Somehow I thought that "p" and "a.size" were the same address. I'm just guessing, but there's probably a way to write this that makes it clearer.
Comment 39 Paul Pedriana 2008-12-11 01:19:24 PST
Created attachment 25944 [details]
JavaScriptCore FastMalloc-related patch.

This addresses Darin's comments in https://bugs.webkit.org/show_bug.cgi?id=20422#c34 and https://bugs.webkit.org/show_bug.cgi?id=20422#c38. This is merely the basic JavaScriptCore patch for memory allocation control and doesn't include changes to JavaScriptCore, WebCore, etc. to use this. That is to be done in a succeeding patch.
Comment 40 Paul Pedriana 2008-12-11 01:22:36 PST
Created attachment 25945 [details]
Build file patch.

This is Zoltan Horvath's pbxproj/GNUmakefile.am/vcproj file patch.
Comment 41 Zoltan Horvath 2009-01-06 06:46:20 PST
Created attachment 26460 [details]
Updated build systems patch

I've updated the build patch. Both patches works with latest revisions.
Comment 42 Darin Adler 2009-01-10 15:38:11 PST
Comment on attachment 25944 [details]
JavaScriptCore FastMalloc-related patch.

The work looks good and about ready to land.

> +            fastFree(p);  // We don't need to check for a null pointer; the compiler does this.

The comment is curious, because fastFree also checks for a null pointer. I'm not really sure what we're saying with the comment.

> +        void* p = fastMalloc(sizeof(T));
> +
> +        if (p) {
> +            fastMallocMatchValidateMalloc(p, Internal::AllocTypeFastNew);
> +            return ::new(p) T;
> +        }
> +
> +        return 0;

In general in this project, we prefer the "early return" style, so we'd check for null and return 0, then put the success case code in a straight line rather than nested in an if statement.

> +    if(static_cast<size_t>(~0) - sizeof(AllocAlignmentInteger) <= n)  // If overflow would occur...

We put a space after the if and before the "(" character.

It's OK to use static_cast<size_t>(~0), but it doesn't seem better than numeric_limits<size_t>::max() to me.

> +    n += sizeof(AllocAlignmentInteger);
> +    char* result = static_cast<char*>(malloc(n));

I think it's slightly better to write this as:

    char* result = static_cast<char*>(malloc(sizeof(AllocAlignmentInteger) + n));

rather than modifying the argument. On the minus side, it makes the line of code a little more complex, but on the plus side it avoids modifying an argument which can sometimes be confusing. I'm also not sure that making the result be a char* is all that valuable, since we have to cast it with reinterpret_cast. If we made it a void* we could use static_cast instead.

> +        *reinterpret_cast<AllocAlignmentInteger*>(result) = Internal::AllocTypeMalloc;
> +        result += sizeof(AllocAlignmentInteger);

With void* this would be:

    *static_cast<AllocAlignmentInteger*>(result) = Internal::AllocTypeMalloc;
    result = static_cast<AllocAlignmentInteger*>(result) + 1;

Or:

    AllocAlignmentInteger* header = static_cast<AllocAlignmentInteger*>(result);
    *header = Internal::AllocTypeMalloc;
    result = header + 1;

> +        AllocAlignmentInteger* pValue = Internal::fastMallocMatchValidationValue(p);
> +        if (*pValue != Internal::AllocTypeMalloc)
> +            Internal::fastMallocMatchFailed(p);
> +        free(pValue);

We almost never use the "Hungarian" style where a varible is named "pType", which you're using here. I'd personally prefer a letter or a word, maybe "header".

> +    new_ptr = static_cast<char*>(new_ptr) + sizeof(AllocAlignmentInteger);

I don't see a good reason to involve char* here. I'd use this:

    new_ptr = static_cast<AllocAlignmentInteger*>(new_ptr) + 1;

> +            const AllocAlignmentInteger* pType = reinterpret_cast<const AllocAlignmentInteger*>(static_cast<const char*>(p) - sizeof(AllocAlignmentInteger));

Again, unnecessary use of char* and reinterpret_cast here.

    const AllocAlignmentInteger* pType = static_cast<const AllocAlignmentInteger*>(p) - 1;


> +    // These could alternatively be inline functions which do nothing.
> +    #define fastMallocMatchValidateMalloc(p, type)
> +    #define fastMallocMatchValidateFree(p, type)

And they should be. We'd prefer empty inline functions to macros for this sort of thing.

>  /* ENABLE macro defaults */
> +#if !defined(ENABLE_FAST_MALLOC_MATCH_VALIDATION)
> +#define ENABLE_FAST_MALLOC_MATCH_VALIDATION 0
> +#endif

Our platform header is really getting out of hand. A long list of things with no documentation about what they are! Not that your single addition makes it significantly worse.

I think there were just enough comments that I'll say review-, although I think this is in great shape and nearly ready to go. I don't think there's any one comment above that's important, but I'm hoping you'll want to take at least one of my suggestions above.
Comment 43 Darin Adler 2009-01-10 15:38:54 PST
Comment on attachment 25945 [details]
Build file patch.

Please include the build file patch along with the cpp file patch and the ChangeLog next time around. Separate patches make it slightly harder for the committer.
Comment 44 Darin Adler 2009-01-10 15:39:26 PST
Comment on attachment 26460 [details]
Updated build systems patch

Looks fine. Ideally we want a single patch including the whole thing next time around.

I think we're getting really close!
Comment 45 Paul Pedriana 2009-01-28 00:51:55 PST
Created attachment 27097 [details]
Updated FastMalloc patch.

We have an updated patch, which has the source and build file 
changes in one patch. Some comments, FWIW:

    > +            fastFree(p);  // We don't need to check for a null pointer;
    the compiler does this.

    The comment is curious, because fastFree also checks for a null 
    pointer. I'm not really sure what we're saying with the comment.

It has been removed. It didn't really need to be said.


    > +        void* p = fastMalloc(sizeof(T));
    > +
    > +        if (p) {
    > +            fastMallocMatchValidateMalloc(p, Internal::AllocTypeFastNew);
    > +            return ::new(p) T;
    > +        }
    > +
    > +        return 0;

    In general in this project, we prefer the "early return" style, 
    so we'd check for null and return 0, then put the success case 
    code in a straight line rather than nested in an if statement.

I changed the FastAllocBase code to use the early return style. 
I might point out that the early return style results in slightly 
slower code with most compilers, including GCC and VC++. The reason 
is that branch prediction expects the 'true' case to be executed
and optimizes for that. Code with if statements which are false pay 
a small branch prediction penalty. Some processors (e.g. PowerPC) 
have instruction variants which tell the processor to reverse this 
expectation, but you need to use things like GCC's __builtin_expect 
or assembly language to access them. I realize you're probably aware
of this already, but I'm mentioning it because that's why in the past
I've often written code that way.


    > +    if(static_cast<size_t>(~0) - sizeof(AllocAlignmentInteger) <= n)  
      // If overflow would occur...

    We put a space after the if and before the "(" character.

    It's OK to use static_cast<size_t>(~0), but it doesn't seem better 
    than numeric_limits<size_t>::max() to me.

OK. Done on both accounts.



    > +    n += sizeof(AllocAlignmentInteger);
    > +    char* result = static_cast<char*>(malloc(n));

    I think it's slightly better to write this as:

        char* result = static_cast<char*>(malloc(sizeof(AllocAlignmentInteger) +
    n));

    rather than modifying the argument. On the minus side, it makes 
    the line of code a little more complex, but on the plus side it 
    avoids modifying an argument which can sometimes be confusing. 

I changed the code to use the (sizeof(AllocAlignmentInteger) + n) form.

    I'm also not sure that making the result be a char* is 
    all that valuable, since we have to cast it with reinterpret_cast. 
    If we made it a void* we could use static_cast instead.

    > +        *reinterpret_cast<AllocAlignmentInteger*>(result) = 
         Internal::AllocTypeMalloc;
    > +        result += sizeof(AllocAlignmentInteger);

    With void* this would be:

        *static_cast<AllocAlignmentInteger*>(result) = 
            Internal::AllocTypeMalloc;
        result = static_cast<AllocAlignmentInteger*>(result) + 1;

    Or:

        AllocAlignmentInteger* header = static_cast<AllocAlignmentInteger*>
            (result);
        *header = Internal::AllocTypeMalloc;
        result = header + 1;

OK. Done.


    > +        AllocAlignmentInteger* pValue =
          Internal::fastMallocMatchValidationValue(p);
    > +        if (*pValue != Internal::AllocTypeMalloc)
    > +            Internal::fastMallocMatchFailed(p);
    > +        free(pValue);

    We almost never use the "Hungarian" style where a varible is 
    named "pType", which you're using here. I'd personally prefer 
    a letter or a word, maybe "header".

header seems fine. Done.


    > +    new_ptr = static_cast<char*>(new_ptr) + 
        sizeof(AllocAlignmentInteger);

    I don't see a good reason to involve char* here. I'd use this:

        new_ptr = static_cast<AllocAlignmentInteger*>(new_ptr) + 1;

OK. Done.


    > + const AllocAlignmentInteger* pType = reinterpret_cast<const 
         AllocAlignmentInteger*>(static_cast<const char*>(p) - 
         sizeof(AllocAlignmentInteger));

    Again, unnecessary use of char* and reinterpret_cast here.

        const AllocAlignmentInteger* pType = static_cast<const 
            AllocAlignmentInteger*>(p) - 1;

Yep. The latter is significantly better. 


    > +    // These could alternatively be inline functions which do nothing.
    > +    #define fastMallocMatchValidateMalloc(p, type)
    > +    #define fastMallocMatchValidateFree(p, type)

    And they should be. We'd prefer empty inline functions 
    to macros for this sort of thing.

OK. Done.


    >  /* ENABLE macro defaults */
    > +#if !defined(ENABLE_FAST_MALLOC_MATCH_VALIDATION)
    > +#define ENABLE_FAST_MALLOC_MATCH_VALIDATION 0
    > +#endif

    Our platform header is really getting out of hand. 
    A long list of things with no documentation about what 
    they are! Not that your single addition makes it
    significantly worse.

I'm a fan of more in-code documentation. I didn't put anything 
there because there wasn't existing similar documentation. 
I put a two line comment explaining what this feature does.


    I think there were just enough comments that I'll say 
    review-, although I think this is in great shape and 
    nearly ready to go. I don't think there's any one comment 
    above that's important, but I'm hoping you'll want to take 
    at least one of my suggestions above.

I did them all.
Comment 46 Darin Adler 2009-02-23 16:38:47 PST
(In reply to comment #45)
> I might point out that the early return style results in slightly 
> slower code with most compilers, including GCC and VC++. The reason 
> is that branch prediction expects the 'true' case to be executed
> and optimizes for that.

Very interesting. I did not know that.
Comment 47 Darin Adler 2009-02-23 16:40:23 PST
(In reply to comment #45)
> Some processors (e.g. PowerPC) 
> have instruction variants which tell the processor to reverse this 
> expectation, but you need to use things like GCC's __builtin_expect 
> or assembly language to access them.

That's not quite right. The compiler doesn't put the code in the same order you typed it, so the issue here is not about instruction variants for the processor. The compiler can order the code however it wants.

We do indeed use __builtin_expect in various places with the UNLIKELY and LIKELY macros.
Comment 48 Darin Adler 2009-02-23 16:42:28 PST
Comment on attachment 27097 [details]
Updated FastMalloc patch.

r=me
Comment 49 Adam Treat 2009-02-26 06:05:46 PST
Congratulations  on getting this done!  Any ideas when this will be landing in the tree?
Comment 50 Pam Greene (IRC:pamg) 2009-03-11 10:51:10 PDT
I'll give this a couple of hours for any last-minute retractions, then land it this afternoon.
Comment 51 Pam Greene (IRC:pamg) 2009-03-11 17:44:08 PDT
OK, maybe not. With all these active patches, I'm not positive which ones need landing. If you can let me know that, I'll get 'em in.
Comment 52 Pam Greene (IRC:pamg) 2009-03-11 17:45:12 PDT
(In reply to comment #51)
> OK, maybe not. With all these active patches, I'm not positive which ones need
> landing. If you can let me know that, I'll get 'em in.

More accurately, I'm not positive that the last, r+ patch stands alone, and that all the others are obsolete rather than still in progress.
Comment 53 Darin Adler 2009-03-11 17:47:12 PDT
(In reply to comment #52)
> More accurately, I'm not positive that the last, r+ patch stands alone, and
> that all the others are obsolete rather than still in progress.

I'm pretty sure that last one does stand alone.
Comment 54 Paul Pedriana 2009-03-12 01:39:52 PDT
That's correct. 
Comment 55 Pam Greene (IRC:pamg) 2009-03-12 13:18:26 PDT
Sorry, this patch has suffered bitrot.  It no longer applies cleanly, and I'm not confident I can manually patch it in correctly (especially the JSC XCode project). Paul, please refresh it and I'll land.
Comment 56 Paul Pedriana 2009-04-06 01:21:12 PDT
Created attachment 29272 [details]
Same patch as before but updated to work with r42235 (April 6, 2009)

This is simply an updated patch which works with the latest trunk. The primary code of the patch is nearly unchanged from the previous version. The primary difference is that this new patch modifies the recently added TypeTraits.h instead of this patch creating TypeTraits.h itself. Thanks again to Zoltan Horvath for helping test this and making a build patch.
Comment 57 David Levin 2009-04-06 09:22:49 PDT
Could be handled on landing, but
* The change in WebCore/ForwardingHeaders/wtf/TypeTraits.h looks wrong and seems like it should not be done (along with the corresponding ChangeLog).

* The // -*- mode: c++; c-basic-offset: 4 -*- at the top of JavaScriptCore/wtf/FastAllocBase.h isn't typical style.
Comment 58 Darin Adler 2009-04-06 09:32:04 PDT
Comment on attachment 29272 [details]
Same patch as before but updated to work with r42235 (April 6, 2009)

I’ll clear the review flag since we’d ideally want a new patch without the two problems David Levin pointed out. If someone wants to fix those and land, that's OK; but if a non-committer wants to post a revised patch that would be good too.
Comment 59 Zoltan Horvath 2009-04-06 11:48:41 PDT
Created attachment 29285 [details]
Same patch as before, updated

> * The change in WebCore/ForwardingHeaders/wtf/TypeTraits.h looks wrong and
> seems like it should not be done (along with the corresponding ChangeLog).

Done.

> * The // -*- mode: c++; c-basic-offset: 4 -*- at the top of
> JavaScriptCore/wtf/FastAllocBase.h isn't typical style.

Done.

Three other smart correction in FastAllocBase.h: 
- in line 291: "WTF::has_trivial_constructor" replaced to  "WTF::HasTrivialConstructor" and 
"WTF::has_trivial_destructor" replaced to "WTF::HasTrivialDestructor"
- in line 347: "WTF::has_trivial_destructor" replaced to "WTF::HasTrivialDestructor"
- in line 396: "WTF::has_trivial_destructor" replaced to "WTF::HasTrivialDestructor"

Others are unchanged.
Comment 60 David Levin 2009-04-08 17:19:05 PDT
Working on landing.
Comment 61 David Levin 2009-04-08 18:15:23 PDT
Committed <http://trac.webkit.org/changeset/42344>
Comment 62 Xan Lopez 2009-04-09 05:30:53 PDT
I had to commit r42356 to fix the build with GCC after this patch. AFAICT checking for a given date in __GLIBCXX__ is not enough, you also have to check that C++0x is being used. I did this, but I'm not sure if it's the ideal solution in this case, feel free to change it.
Comment 63 Zoltan Horvath 2009-04-17 02:07:03 PDT
The FastAllocBase class is in the trunk. The next step is to inherite every classes from FastAllocBase which has been instantiated by 'new'. Darin mentioned before put FastAllocBase into inheritance tree as far up as possible. Is this still the convention? 

I think first we need to produce a patch for JavaScriptCore. Should I create a new bug report for this and file a patch for JSCore?
Comment 64 Darin Adler 2009-04-17 05:54:10 PDT
(In reply to comment #63)
> The FastAllocBase class is in the trunk. The next step is to inherite every
> classes from FastAllocBase which has been instantiated by 'new'. Darin
> mentioned before put FastAllocBase into inheritance tree as far up as possible.
> Is this still the convention?

I think it's still the best approach, yes.

> I think first we need to produce a patch for JavaScriptCore. Should I create a
> new bug report for this and file a patch for JSCore?

I think the best next step would be to do a single class, not all of JavaScriptCore.
Comment 65 Zoltan Horvath 2009-04-17 08:17:38 PDT
Created attachment 29578 [details]
proposed patch for HashTable inheritance

I've inherited HashTable from FastAllocBase. (Instantiated by 'new' in JavaScriptCore/runtime/JSGlobalData.cpp)
Comment 66 Zoltan Horvath 2009-05-14 02:07:25 PDT
I created a new bugreport for JavaScriptCore inheritance and I filed the HashTable patch there.
bug #25782 https://bugs.webkit.org/show_bug.cgi?id=25782