Bug 250112 - [WTF] Integrate simdutf
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Yusuke Suzuki
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2023-01-04 16:38 PST by Jarred Sumner
Modified: 2023-02-11 17:37 PST

Description Jarred Sumner 2023-01-04 16:38:42 PST
simdutf is a fast text encoding/decoding library that supports UTF-8 -> UTF-16 and UTF-16 -> UTF-8 transcoding, as well as ASCII validation. GitHub: https://github.com/simdutf/simdutf

Bun uses simdutf for TextEncoder and TextDecoder (in the happy path).

For this code:
```
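// Encode a mostly-ASCII string that also contains multi-byte emoji.
var decoder = new TextDecoder();
var encoder = new TextEncoder();
var buf = encoder.encode(
  "not all ascii 🙂 🙃 1-2-3-4 abcdefghklmnopqrstuvwxyz".repeat(9999)
);
// Variants with the last 1 and 2 bytes dropped (both trailing bytes are ASCII).
var buf1 = buf.slice(0, buf.length - 1);
var buf2 = buf.slice(0, buf.length - 2);
// Decode once up front so the strings are available for the encode timings.
var decoded = decoder.decode(buf);
var decoded1 = decoder.decode(buf1);
var decoded2 = decoder.decode(buf2);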

console.time("TextDecoder.decode");
decoder.decode(buf);
console.timeEnd("TextDecoder.decode");
console.time("TextDecoder.decode");
decoder.decode(buf1);
console.timeEnd("TextDecoder.decode");
console.time("TextDecoder.decode");
decoder.decode(buf2);
console.timeEnd("TextDecoder.decode");
console.time("TextEncoder.encode");
encoder.encode(decoded);
console.timeEnd("TextEncoder.encode");
console.time("TextEncoder.encode");
encoder.encode(decoded1);
console.timeEnd("TextEncoder.encode");
console.time("TextEncoder.encode");
encoder.encode(decoded2);
console.timeEnd("TextEncoder.encode");

```
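
Single-shot console.time measurements like the ones above can be noisy. As a rough cross-check, the same workload could be timed over repeated runs and summarized by its median. This is only a sketch: `medianTimeMs` is a hypothetical helper that is not part of the original report, and it assumes a global `performance.now()` is available (as it is in current browsers, Bun, and recent Node.js).

```javascript
// Hypothetical helper (not from the original report): runs fn repeatedly
// and returns the median wall-clock duration in milliseconds.
function medianTimeMs(fn, iterations = 50) {
  const samples = [];
  for (let i = 0; i < iterations; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  const mid = Math.floor(samples.length / 2);
  return samples.length % 2
    ? samples[mid]
    : (samples[mid - 1] + samples[mid]) / 2;
}

const encoder = new TextEncoder();
const decoder = new TextDecoder();
// Same input string as in the report.
const text = "not all ascii 🙂 🙃 1-2-3-4 abcdefghklmnopqrstuvwxyz".repeat(9999);
const bytes = encoder.encode(text);

const encodeMs = medianTimeMs(() => encoder.encode(text));
const decodeMs = medianTimeMs(() => decoder.decode(bytes));
console.log(`TextEncoder.encode median: ${encodeMs.toFixed(3)}ms`);
console.log(`TextDecoder.decode median: ${decodeMs.toFixed(3)}ms`);
```

The median is used here instead of the mean so that a single slow run (for example, one interrupted by garbage collection) does not dominate the result.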

On macOS arm64 in Bun v0.4.1:

```
[0.69ms] TextDecoder.decode
[0.54ms] TextDecoder.decode
[0.59ms] TextDecoder.decode
[0.21ms] TextEncoder.encode
[0.20ms] TextEncoder.encode
[0.25ms] TextEncoder.encode
```

Safari Technology Preview:

```
TextDecoder.decode: 0.621ms
TextDecoder.decode: 0.589ms
TextDecoder.decode: 0.604ms
TextEncoder.encode: 2.317ms
TextEncoder.encode: 1.945ms
TextEncoder.encode: 1.966ms
```

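For non-ASCII input, TextEncoder.encode runs roughly 8x to 11x faster in Bun than in Safari Technology Preview (0.20-0.25ms versus 1.9-2.3ms in the runs above), and that's mostly because Bun is using simdutf. TextDecoder.decode timings are comparable between the two.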
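
For reference, the per-run speedup can be computed directly from the TextEncoder.encode timings listed above. The snippet pairs the runs by position (first Bun run against first Safari run, and so on), which is an assumption about how the numbers should be compared.

```javascript
// TextEncoder.encode timings in milliseconds, copied from the output above.
const bunMs = [0.21, 0.20, 0.25];
const safariMs = [2.317, 1.945, 1.966];

// Speedup of Bun over Safari Technology Preview for each run.
const speedups = safariMs.map((ms, i) => ms / bunMs[i]);
console.log(speedups.map((x) => x.toFixed(1) + "x").join(", "));
// prints "11.0x, 9.7x, 7.9x"
```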
Comment 1 Radar WebKit Bug Importer 2023-01-11 16:39:41 PST
<rdar://problem/104145576>
Comment 2 Yusuke Suzuki 2023-02-11 17:37:44 PST
Pull request: https://github.com/WebKit/WebKit/pull/9990