RESOLVED FIXED 174232
Implement __builtin_clzl for MSVC
https://bugs.webkit.org/show_bug.cgi?id=174232
Summary Implement __builtin_clzl for MSVC
Daewoong Jang
Reported 2017-07-06 19:06:39 PDT
Rewrite bmalloc::log2() in a recursive fashion as MSVC does not have __builtin_clzl nor exact equivalent of it. This change is needed for supporting bmalloc on Windows(https://bugs.webkit.org/show_bug.cgi?id=143310).
Attachments
patch (1.23 KB, patch)
2017-07-06 19:08 PDT, Daewoong Jang
no flags
patch (2.62 KB, patch)
2017-07-10 18:48 PDT, Daewoong Jang
no flags
patch-does-not-hurt-gcc-compilation (2.29 KB, patch)
2017-07-12 17:11 PDT, Daewoong Jang
no flags
patch (2.23 KB, patch)
2017-07-27 16:31 PDT, Daewoong Jang
no flags
Archive of layout-test-results from ews122 for ios-simulator-wk2 (966.59 KB, application/zip)
2017-07-27 19:01 PDT, Build Bot
no flags
Daewoong Jang
Comment 1 2017-07-06 19:08:36 PDT
Yusuke Suzuki
Comment 2 2017-07-06 19:13:04 PDT
Comment on attachment 314788 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=314788&action=review > Source/bmalloc/bmalloc/Algorithm.h:137 > - return bitCount<unsigned long>() - 1 - __builtin_clzl(value); > + return value < 2 ? 0 : 1 + log2(value / 2); We should keep the current efficient log2 for non-Windows environment. If this is not constexpr, in Windows, you can use _BitScanReverse64 / _BitScanReverse. But maybe they are not constexpr.
Daewoong Jang
Comment 3 2017-07-06 21:46:36 PDT
(In reply to Yusuke Suzuki from comment #2) > Comment on attachment 314788 [details] > patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=314788&action=review > > > Source/bmalloc/bmalloc/Algorithm.h:137 > > - return bitCount<unsigned long>() - 1 - __builtin_clzl(value); > > + return value < 2 ? 0 : 1 + log2(value / 2); > > We should keep the current efficient log2 for non-Windows environment. > If this is not constexpr, in Windows, you can use _BitScanReverse64 / > _BitScanReverse. But maybe they are not constexpr. _BitScanReverse is not constexpr, so I did it this way. If we have to keep current implementation, then I'll put a compiler guard around the code. Thank you for reviewing!
Geoffrey Garen
Comment 4 2017-07-07 08:22:35 PDT
I'm curious: Do you have a working version of bmalloc on Windows? If so, have you measured its performance? It would be helpful to see the full design of the finished work, and to know its performance. Some of the decisions you've made seem to be compromises that harm performance. To avoid muddying up the code, I think it's preferable to emulate missing functions on Windows. So, in this case, you could define __builtin_clzl only on windows, and use a recursive function or a tree of branches to compute it.
Daewoong Jang
Comment 5 2017-07-08 02:15:44 PDT
(In reply to Geoffrey Garen from comment #4) > I'm curious: Do you have a working version of bmalloc on Windows? If so, > have you measured its performance? > > It would be helpful to see the full design of the finished work, and to know > its performance. > Of course I have a working version of bmalloc on Windows in my working tree and you can see its performance gain here: https://bugs.webkit.org/show_bug.cgi?id=143310#c28 It's old but bmalloc on Windows gives definite 10%+ performance gain on JavaScript benchmarks run on Windows. Never mind the cluttered patch you could see on the page above. It won't be cluttered like that in the finished work and won't be upstreamed until you're happy with the change :) > Some of the decisions you've made seem to be compromises that harm > performance. > > To avoid muddying up the code, I think it's preferable to emulate missing > functions on Windows. So, in this case, you could define __builtin_clzl only > on windows, and use a recursive function or a tree of branches to compute it. It could be done in that way for this small case, but how about other POSIX functions like pthread/mman APIs? I am thinking of putting an abstraction layer instead of emulation. Will it be OK for you? Thank you for your comment.
Daewoong Jang
Comment 6 2017-07-10 18:48:00 PDT
Build Bot
Comment 7 2017-07-10 18:49:54 PDT
Attachment 315060 [details] did not pass style-queue: ERROR: Source/bmalloc/bmalloc/Algorithm.h:141: __builtin_clzl is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4] Total errors found: 1 in 4 files If any of these errors are false positives, please file a bug against check-webkit-style.
Daewoong Jang
Comment 8 2017-07-12 17:11:21 PDT
Created attachment 315311 [details] patch-does-not-hurt-gcc-compilation
Build Bot
Comment 9 2017-07-12 17:13:32 PDT
Attachment 315311 [details] did not pass style-queue: ERROR: Source/bmalloc/bmalloc/Algorithm.h:152: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] Total errors found: 1 in 3 files If any of these errors are false positives, please file a bug against check-webkit-style.
Daewoong Jang
Comment 10 2017-07-27 16:31:19 PDT
Build Bot
Comment 11 2017-07-27 16:33:06 PDT
Attachment 316578 [details] did not pass style-queue: ERROR: Source/bmalloc/bmalloc/Algorithm.h:152: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] Total errors found: 1 in 3 files If any of these errors are false positives, please file a bug against check-webkit-style.
Daewoong Jang
Comment 12 2017-07-27 16:33:54 PDT
This patch stays here for too long. Can I get a review?
Build Bot
Comment 13 2017-07-27 19:01:37 PDT
Comment on attachment 316578 [details] patch Attachment 316578 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: http://webkit-queues.webkit.org/results/4199491 New failing tests: imported/w3c/IndexedDB-private-browsing/idbfactory_open12.html
Build Bot
Comment 14 2017-07-27 19:01:39 PDT
Created attachment 316598 [details] Archive of layout-test-results from ews122 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews122 Port: ios-simulator-wk2 Platform: Mac OS X 10.12.5
Geoffrey Garen
Comment 15 2017-08-01 08:27:57 PDT
> It could be done in that way for this small case, but how about other POSIX > functions like pthread/mman APIs? I am thinking of putting an abstraction > layer instead of emulation. Will it be OK for you? I think we'll have to consider each Windows API, and its difference from POSIX, individually. For example, VirtualAllocEx is very similar to mmap+madvise.
Geoffrey Garen
Comment 16 2017-08-01 08:28:09 PDT
Comment on attachment 316578 [details] patch r=me
WebKit Commit Bot
Comment 17 2017-08-01 08:56:23 PDT
Comment on attachment 316578 [details] patch Clearing flags on attachment: 316578 Committed r220097: <http://trac.webkit.org/changeset/220097>
WebKit Commit Bot
Comment 18 2017-08-01 08:56:25 PDT
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 19 2017-08-01 08:56:48 PDT
Daewoong Jang
Comment 20 2017-08-01 16:19:01 PDT
(In reply to Geoffrey Garen from comment #15) > > It could be done in that way for this small case, but how about other POSIX > > functions like pthread/mman APIs? I am thinking of putting an abstraction > > layer instead of emulation. Will it be OK for you? > > I think we'll have to consider each Windows API, and its difference from > POSIX, individually. For example, VirtualAllocEx is very similar to > mmap+madvise. All right. Thank you for review.
Note You need to log in before you can comment on or make changes to this bug.