Summary: | Port bmalloc to Windows | ||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Alex Christensen <achristensen> | ||||||||||||||||||||||||||
Component: | Web Template Framework | Assignee: | Basuke Suzuki <Basuke.Suzuki> | ||||||||||||||||||||||||||
Status: | NEW --- | ||||||||||||||||||||||||||||
Severity: | Normal | CC: | achristensen, annulen, Basuke.Suzuki, benjamin, bfulgham, cdumez, cmarcelo, commit-queue, daewoong.jang, don.olmstead, ews-watchlist, ggaren, gyuyoung.kim, hyungwook.lee, mitz, pvollan, ryuan.choi, sergio, sungmann.cho, ysuzuki | ||||||||||||||||||||||||||
Priority: | P2 | ||||||||||||||||||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||||||||||||||
OS: | Unspecified | ||||||||||||||||||||||||||||
Bug Depends on: | 255945, 255968, 256011, 173567, 173622, 174232, 175299, 255943, 255944, 255996, 256010, 256314 | ||||||||||||||||||||||||||||
Bug Blocks: | |||||||||||||||||||||||||||||
Attachments: |
|
Description
Alex Christensen
2015-04-01 11:29:01 PDT
Created attachment 260481 [details]
Patch
Attachment 260481 [details] did not pass style-queue:
ERROR: Source/bmalloc/bmalloc/Algorithm.h:55: bitwise_cast is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4]
Total errors found: 1 in 25 files
If any of these errors are false positives, please file a bug against check-webkit-style.
I left the style error in Algorithm.h unchanged because it was caused by the code block I copied from WTF/StdLibExtras.h. If I should have fixed the style error, please let me know. Comment on attachment 260481 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=260481&action=review > Source/bmalloc/bmalloc/Algorithm.h:64 > +/* > +* C++'s idea of a reinterpret_cast lacks sufficient cojones. > +*/ > +template<typename ToType, typename FromType> > +inline ToType bitwise_cast(FromType from) > +{ > + static_assert(sizeof(FromType) == sizeof(ToType), "bitwise_cast size of FromType and ToType must be equal!"); > + union { > + FromType from; > + ToType to; > + } u; > + u.from = from; > + return u.to; > +} Is this function as efficient as reinterpret_cast? What does the generated assembly look like? > Source/bmalloc/bmalloc/Allocator.cpp:81 > +#else > + BCRASH(); > +#endif This doesn't look right. You want the Windows aligned allocation API. > Source/bmalloc/bmalloc/Allocator.cpp:84 > - > + Please revert this whitespace change. > Source/bmalloc/bmalloc/Chunk.h:76 > #if BPLATFORM(IOS) > char m_memory[] __attribute__((aligned(16384))); > static_assert(vmPageSize == 16384, "vmPageSize and alignment must be same"); > -#else > +#elif !BOS(WINDOWS) > char m_memory[] __attribute__((aligned(4096))); > static_assert(vmPageSize == 4096, "vmPageSize and alignment must be same"); > +#else > + char* m_memory { mask(reinterpret_cast<char*>(&m_memory + 1) + (vmPageSize - 1), vmPageMask) }; > #endif It is bad to clutter the code with #ifdefs. That's a general problem in this patch. We want some better options for abstracting platform differences. Please add this to the CMake build, too. There are several places where I added if (NOT WIN32) with a comment referencing this bug. Also, I think it would probably be best if we turned bmalloc on for the WinCairo port only at first. Comment on attachment 260481 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=260481&action=review >> Source/bmalloc/bmalloc/Algorithm.h:64 >> +} > > Is this function as efficient as reinterpret_cast? What does the generated assembly look like? The generated assembly I got from MSVC: return reinterpret_cast<uintptr_t>(ptr); 00402760 mov eax,ecx return bitwise_cast<uintptr_t>(ptr); 00402760 mov eax,ecx I believe we get same results from both of them. I will check on Mac and post the result later. >> Source/bmalloc/bmalloc/Allocator.cpp:81 >> +#endif > > This doesn't look right. You want the Windows aligned allocation API. I know, but the equivalent Windows API _aligned_malloc() has a constraint that the memory block allocated by it must be freed by _aligned_free(). And I believe bmalloc::Allocator doesn't treat aligned memory block as special for now since POSIX APIs don't. In my opinion, it would take more effort to replace BCRASH() with proper implementation and I would love to do that, but I think it is beyond this patch's work. I would appreciate if you give me a suggestion. >> Source/bmalloc/bmalloc/Allocator.cpp:84 >> + > > Please revert this whitespace change. OK. I'll do it. >> Source/bmalloc/bmalloc/Chunk.h:76 >> #endif > > It is bad to clutter the code with #ifdefs. That's a general problem in this patch. We want some better options for abstracting platform differences. I totally agree. Could you give me any suggestions for improving code? (In reply to comment #5) > Please add this to the CMake build, too. There are several places where I > added if (NOT WIN32) with a comment referencing this bug. > Also, I think it would probably be best if we turned bmalloc on for the > WinCairo port only at first. Alright I'll update the patch. And actually I didn't turned bmalloc on neither AppleWin nor WinCairo. Do you think it is okay to turn it on WinCairo port by default? (In reply to comment #5) > Please add this to the CMake build, too. There are several places where I > added if (NOT WIN32) with a comment referencing this bug. > Also, I think it would probably be best if we turned bmalloc on for the > WinCairo port only at first. Really? I think it would be better to turn it on across the board. The Apple Windows port is the only one with any meaningful testing, and the only place we are likely to see if anything is going wrong. If it starts failing, we can always roll it out. (In reply to comment #8) > (In reply to comment #5) > > Please add this to the CMake build, too. There are several places where I > > added if (NOT WIN32) with a comment referencing this bug. > > Also, I think it would probably be best if we turned bmalloc on for the > > WinCairo port only at first. > > Really? I think it would be better to turn it on across the board. The Apple > Windows port is the only one with any meaningful testing, and the only place > we are likely to see if anything is going wrong. > > If it starts failing, we can always roll it out. I think all the ports should use it, but we would have to submit a project to build it internally, too. If you think that won't cause a problem, then let's just use bmalloc everywhere. Comment on attachment 260481 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=260481&action=review >>> Source/bmalloc/bmalloc/Algorithm.h:64 >>> +} >> >> Is this function as efficient as reinterpret_cast? What does the generated assembly look like? > > The generated assembly I got from MSVC: > > return reinterpret_cast<uintptr_t>(ptr); > 00402760 mov eax,ecx > > return bitwise_cast<uintptr_t>(ptr); > 00402760 mov eax,ecx > > I believe we get same results from both of them. I will check on Mac and post the result later. From XCode: pushq %rbp Ltmp0: .cfi_def_cfa_offset 16 Ltmp1: .cfi_offset %rbp, -16 movq %rsp, %rbp I got the assemply above from code: uintptr_t a = reinterpret_cast<uintptr_t>(p); and uintptr_t a = bitwise_cast<uintptr_t>(p); Created attachment 260964 [details]
Patch 2
Attachment 260964 [details] did not pass style-queue:
ERROR: Source/bmalloc/bmalloc/Threading.h:59: The parameter name "threadHandle" adds no information, so it should be removed. [readability/parameter_name] [5]
ERROR: Source/bmalloc/bmalloc/Threading.h:61: The parameter name "destructor" adds no information, so it should be removed. [readability/parameter_name] [5]
ERROR: Source/bmalloc/bmalloc/Threading.h:62: The parameter name "key" adds no information, so it should be removed. [readability/parameter_name] [5]
ERROR: Source/bmalloc/bmalloc/Threading.h:63: The parameter name "key" adds no information, so it should be removed. [readability/parameter_name] [5]
ERROR: Source/bmalloc/bmalloc/Threading.h:63: The parameter name "object" adds no information, so it should be removed. [readability/parameter_name] [5]
ERROR: Source/bmalloc/bmalloc/Algorithm.h:55: bitwise_cast is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4]
ERROR: Source/bmalloc/bmalloc/VMAllocate.h:33: Alphabetical sorting problem. [build/include_order] [4]
Total errors found: 7 in 34 files
If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 260965 [details]
Patch 2
Have you measured any benchmark improvement with this patch? Created attachment 260968 [details]
Patch 2
(In reply to comment #14) > Have you measured any benchmark improvement with this patch? For Windows ports? Not yet. Could you give me any leads, to test it myself? I'm also interested about the performance gain bmalloc would give. And I turned bmalloc on by default for all Windows ports. If you have diffrent idea, please let me know. Thank you. Attachment 260968 [details] did not pass style-queue:
ERROR: Source/bmalloc/bmalloc/Algorithm.h:55: bitwise_cast is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4]
Total errors found: 1 in 34 files
If any of these errors are false positives, please file a bug against check-webkit-style.
(In reply to comment #16) > (In reply to comment #14) > > Have you measured any benchmark improvement with this patch? > > For Windows ports? Not yet. Could you give me any leads, to test it myself? > I'm also interested about the performance gain bmalloc would give. http://browserbench.org has a dom and a javascript benchmark. Created attachment 260977 [details]
Patch 3
Fixed compilation error.
Attachment 260977 [details] did not pass style-queue:
ERROR: Source/bmalloc/bmalloc/Algorithm.h:55: bitwise_cast is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4]
Total errors found: 1 in 34 files
If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 260982 [details]
Patch 4
Fix compilation error.
Attachment 260982 [details] did not pass style-queue:
ERROR: Source/bmalloc/bmalloc/Algorithm.h:55: bitwise_cast is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4]
Total errors found: 1 in 34 files
If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 260995 [details]
Patch 4
Created attachment 260996 [details]
Patch 4
Attachment 260996 [details] did not pass style-queue:
ERROR: Source/bmalloc/bmalloc/Algorithm.h:55: bitwise_cast is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4]
Total errors found: 1 in 35 files
If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 261053 [details]
Patch 5
Attachment 261053 [details] did not pass style-queue:
ERROR: Source/bmalloc/bmalloc/Algorithm.h:55: bitwise_cast is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4]
Total errors found: 1 in 37 files
If any of these errors are false positives, please file a bug against check-webkit-style.
(In reply to comment #18) > (In reply to comment #16) > > (In reply to comment #14) > > > Have you measured any benchmark improvement with this patch? > > > > For Windows ports? Not yet. Could you give me any leads, to test it myself? > > I'm also interested about the performance gain bmalloc would give. > http://browserbench.org has a dom and a javascript benchmark. I ran Jet Stream benchmark on Windows MiniBrowser(AppleWin for x86). PC: i7-2600 @ 3.4GHz, 16GB ram, Windows 7 64bit OS Tested three times on both MiniBrowsers with/without bmalloc, in turn: Result ------ MiniBrowser: w/o bmalloc | w/ bmalloc ------------------------------------------------- Set 1: 87.676 +- 2.5816 | 94.044 +- 0.33255 Set 2: 85.505 +- 5.8075 | 93.971 +- 1.3106 Set 3: 86.152 +- 3.3151 | 93.873 +- 2.7638 I couldn't complete speedometer test. They refused to continue running after a short while. I believe recent Windows ports have problem. Comment on attachment 261053 [details]
Patch 5
There's still far too much platform-specific conditional code in the core logic of bmalloc in this patch. We should strive for core logic that works on all platforms, with helper classes or helper functions that call through to platform-specific APIs. bmalloc on Windows is not worth cluttering up the code this much.
As an aside is it really true that Windows does not support partial decommit of a memory region? I've read that it does.
I don't think we should just copy existing WTF code to do the platform abstraction here. Some of that code is old enough that it deals with platform problems that don't exist anymore.
(In reply to comment #29) > Comment on attachment 261053 [details] > Patch 5 > > There's still far too much platform-specific conditional code in the core > logic of bmalloc in this patch. We should strive for core logic that works > on all platforms, with helper classes or helper functions that call through > to platform-specific APIs. bmalloc on Windows is not worth cluttering up the > code this much. > I would like to know which ifdefs made you unsatisfied. I looked into my patch again and I think I can remove those ifdefs in AsyncTask.h and SegregatedFreeList.cpp with modifying existing code, but the rest of ifdefs looks necessary to me. I agree that bmalloc on Windows is not worth making code ugly since nobody makes useful product from Windows port. > As an aside is it really true that Windows does not support partial decommit > of a memory region? I've read that it does. > OK, I will look into that matter again. > I don't think we should just copy existing WTF code to do the platform > abstraction here. Some of that code is old enough that it deals with > platform problems that don't exist anymore. I didn't copy code from WTF. I just adopted class, method and function names form WTF and filled them with new code for Windows after I moved existing code into them. I agree that 'We should strive for core logic that works on all platforms', but I hardly modified meaningful logic in this patch. What I did was redefinition of helper classes and helper functions. (In reply to comment #30) > > I agree that bmalloc on Windows is not worth making code ugly since nobody > makes useful product from Windows port. That's not true. Apple ships products using the Windows port. So do a number of other companies. (In reply to comment #31) > (In reply to comment #30) > > > > I agree that bmalloc on Windows is not worth making code ugly since nobody > > makes useful product from Windows port. > > That's not true. Apple ships products using the Windows port. So do a number > of other companies. Sorry, my selection of words was inappropriate. I was thinking of something like Safari browser. Of course WebKit Windows ports are useful for various matters. This has been sitting here for over a month. It would be nice to have this reviewed, Geoff. I think I did review it? I understand that I've set a high bar here, and that it's going to be challenging to eliminate as much platform-specific code from core logic as possible, and that it might be frustrating too. I'm not sure how to guide this work more clearly without writing the patch myself. I might be able to find some time to do that. Created attachment 289663 [details] Updated patch based on r203260. Created attachment 403094 [details]
WIP Patch
Created attachment 403100 [details]
WIP Patch
|