Bug 250112
Summary: | [WTF] Integrate simdutf | | |
---|---|---|---|
Product: | WebKit | Reporter: | Jarred Sumner <jarred> |
Component: | JavaScriptCore | Assignee: | Yusuke Suzuki <ysuzuki> |
Status: | RESOLVED FIXED | | |
Severity: | Normal | CC: | ap, cdumez, darin, mark.lam, mmaxfield, webkit-bug-importer, ysuzuki |
Priority: | P2 | Keywords: | InRadar |
Version: | WebKit Nightly Build | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Jarred Sumner
simdutf is a fast text encoding/decoding library that supports UTF-8 -> UTF-16 and UTF-16 -> UTF-8 conversion as well as ASCII validation. GitHub: https://github.com/simdutf/simdutf
Bun uses simdutf for TextEncoder and TextDecoder (in the happy path).
For this code:
```
var decoder = new TextDecoder();
var encoder = new TextEncoder();
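// Mixed ASCII and non-ASCII input, plus two copies with trailing bytes removed.
var buf = encoder.encode(
    "not all ascii 🙂 🙃 1-2-3-4 abcdefghklmnopqrstuvwxyz".repeat(9999)
  ),
  buf1 = buf.slice(0, buf.length - 1),
  buf2 = buf.slice(0, buf.length - 2);
// Decode each buffer once before timing; the strings are reused as encode input.
var decoded = decoder.decode(buf),
  decoded1 = decoder.decode(buf1),
  decoded2 = decoder.decode(buf2);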
console.time("TextDecoder.decode");
decoder.decode(buf);
console.timeEnd("TextDecoder.decode");
console.time("TextDecoder.decode");
decoder.decode(buf1);
console.timeEnd("TextDecoder.decode");
console.time("TextDecoder.decode");
decoder.decode(buf2);
console.timeEnd("TextDecoder.decode");
console.time("TextEncoder.encode");
encoder.encode(decoded);
console.timeEnd("TextEncoder.encode");
console.time("TextEncoder.encode");
encoder.encode(decoded1);
console.timeEnd("TextEncoder.encode");
console.time("TextEncoder.encode");
encoder.encode(decoded2);
console.timeEnd("TextEncoder.encode");
```
On macOS arm64 in Bun v0.4.1:
```
[0.69ms] TextDecoder.decode
[0.54ms] TextDecoder.decode
[0.59ms] TextDecoder.decode
[0.21ms] TextEncoder.encode
[0.20ms] TextEncoder.encode
[0.25ms] TextEncoder.encode
```
Safari Technology Preview:
```
TextDecoder.decode: 0.621ms
TextDecoder.decode: 0.589ms
TextDecoder.decode: 0.604ms
TextEncoder.encode: 2.317ms
TextEncoder.encode: 1.945ms
TextEncoder.encode: 1.966ms
```
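For input containing non-ASCII characters, TextEncoder.encode runs roughly 8x to 11x faster in Bun than in Safari (0.20-0.25ms versus 1.9-2.3ms above), mostly because Bun uses simdutf. TextDecoder.decode timings are about the same in both.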
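A single console.time sample per case is fairly noisy, so anyone re-running this comparison may want a median over several runs. Below is a minimal sketch of that, assuming a runtime with global `performance`, `TextEncoder` and `TextDecoder`. `medianTimeMs` is a hypothetical helper written for this example (it is not part of WebKit, Bun, or simdutf), and the run count of 25 is arbitrary.

```javascript
// Hypothetical helper: time fn over several runs and return the median in ms.
// The median is less sensitive to one-off outliers (GC, JIT tiering) than a
// single sample.
function medianTimeMs(fn, runs = 25) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)];
}

const encoder = new TextEncoder();
const decoder = new TextDecoder();
// Same test string as in the report.
const text = "not all ascii 🙂 🙃 1-2-3-4 abcdefghklmnopqrstuvwxyz".repeat(9999);
const bytes = encoder.encode(text);

// Sanity check: the round trip should be lossless before timings mean anything.
const roundTripOk = decoder.decode(bytes) === text;

const encodeMs = medianTimeMs(() => encoder.encode(text));
const decodeMs = medianTimeMs(() => decoder.decode(bytes));
console.log({ roundTripOk, encodeMs, decodeMs });
```

The absolute numbers will differ by machine and engine; the encode/decode medians are only meant to be compared across runtimes on the same hardware.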
Radar WebKit Bug Importer
<rdar://problem/104145576>
Yusuke Suzuki
Pull request: https://github.com/WebKit/WebKit/pull/9990
EWS
Committed 281011@main (68eced08c0e3): <https://commits.webkit.org/281011@main>
Reviewed commits have been landed. Closing PR #9990 and removing active labels.