RESOLVED FIXED 5243
Numerous performance improvements to new FastMalloc to fix regression
https://bugs.webkit.org/show_bug.cgi?id=5243
Summary Numerous performance improvements to new FastMalloc to fix regression
Maciej Stachowiak
Reported 2005-10-02 21:22:37 PDT
The new threadsafe fast allocator caused a performance regression relative to dlmalloc, however by optimizing it a bit I have fixed this. Also I reduced memory consumption to be as good as the system malloc. Here is what I did: - use fastMalloc for everything - it now gets applied to all new/delete allocations via a private inline operator new that is now included into every file via config.h. - tweaked some of the numeric parameters for size classes and amount of wasted memory allowed per allocation - this saves on memory use and consequently improves speed. - so long as the allocator is not being used on background threads, get the per-thread cache from a global variable instead of from pthread_getspecific, since the latter is slow. - inline more functions, and force the ones GCC refuses to inline with attribute(always_inline), nearly all of these have one call site so inlining them has to be a win. - use some tricks to calculate allocation size more efficiently and fewer times for small allocations, to avoid hitting the huge size table array. - avoid hitting the per-thread cache on code paths that don't need it. - implement inline assembly version of spinlock for PowerPC (was already done for x86)
Attachments
numerous optimizations to the new malloc (242.18 KB, patch)
2005-10-02 21:23 PDT, Maciej Stachowiak
darin: review-
addressed darin's comments (260.84 KB, patch)
2005-10-03 00:35 PDT, Maciej Stachowiak
darin: review+
Maciej Stachowiak
Comment 1 2005-10-02 21:23:24 PDT
Created attachment 4165 [details] numerous optimizations to the new malloc
Maciej Stachowiak
Comment 2 2005-10-02 21:45:18 PDT
Comment on attachment 4165 [details] numerous optimizations to the new malloc Uh, I meant "?"
Darin Adler
Comment 3 2005-10-02 21:52:03 PDT
Comment on attachment 4165 [details] numerous optimizations to the new malloc I'm puzzled by the use of "config.h" -- did we discover that was the best practice and better than a prefix file? Also, it seems that if we do use "config.h" it needs to be the first include in each file, because other headers might have inline code that depends on it, but it instead seems to be the second include in every file. Is this intentional or perhaps a bug in the script you used to add it? To echo a complaint you've made to me many times, it was really a pain to search through all the boilerplate changes to the other files to find the substantive changes to review. I'm concerned that changes to constants in the FastMalloc.cpp source file seem to be near comments that have not been updated. Are there any comments here that need to be updated? The use of __attribute__((always_inline)) is not inside an ifdef. Lets instead do this in a more-portable way using a macro and __GNUC__ ifdef as we do with PRIVATE_INLINE. I suggest that PRIVATE_INLINE use a KXMLCORE prefix since it's a macro and in a header. Should remove the FIXME comment talking about PPC. Is there an easy way to catch the mistake of pairing fastMalloc with plain old free or vice versa in a build where NDEBUG is not defined? If not, we should try to come up with something.
Maciej Stachowiak
Comment 4 2005-10-02 23:57:03 PDT
It sounds from the comments on webkit-dev like the prefix solution will not work for Symbian or Linux. Changes made in response to comments: - config.h is now the first file everywhere - if you don't include config.h but use new or delete, you now get a compile failure (on OS X at least) - comments updated to match constants - made an ifdef for use of ALWAYS_INLINE in the file - put prefix on PRIVATE_INLINE - removed now unneeded FIXME
Maciej Stachowiak
Comment 5 2005-10-03 00:35:26 PDT
Created attachment 4171 [details] addressed darin's comments
Darin Adler
Comment 6 2005-10-03 09:38:25 PDT
Comment on attachment 4171 [details] addressed darin's comments Why remove the include of config.h from fpconst.cpp? It needs the definition of WORDS_BIGENDIAN. Otherwise looks fine. I'll mark review+, but please consider the fpconst.cpp thing.
Maciej Stachowiak
Comment 7 2005-10-03 13:59:00 PDT
This is also <rdar://problem/4283967>
Note You need to log in before you can comment on or make changes to this bug.