simdutf is a fast text encoding/decoding library that supports UTF-8 -> UTF-16 and UTF-16 -> UTF-8 transcoding, as well as ASCII validation.

GitHub: https://github.com/simdutf/simdutf

Bun uses simdutf for TextEncoder & TextDecoder (in the happy path).

For this code:

```
var decoder = new TextDecoder();
var encoder = new TextEncoder();

// Mostly-ASCII input with a few multi-byte (emoji) code points
var buf = encoder.encode(
    "not all ascii 🙂 🙃 1-2-3-4 abcdefghklmnopqrstuvwxyz".repeat(9999)
  ),
  // Same bytes with the last 1 and 2 bytes dropped, to vary the input length
  buf1 = buf.slice(0, buf.length - 1),
  buf2 = buf.slice(0, buf.length - 2);

// Decode once up front; the resulting strings are the inputs for encode()
var decoded = decoder.decode(buf),
  decoded1 = decoder.decode(buf1),
  decoded2 = decoder.decode(buf2);

console.time("TextDecoder.decode");
decoder.decode(buf);
console.timeEnd("TextDecoder.decode");

console.time("TextDecoder.decode");
decoder.decode(buf1);
console.timeEnd("TextDecoder.decode");

console.time("TextDecoder.decode");
decoder.decode(buf2);
console.timeEnd("TextDecoder.decode");

console.time("TextEncoder.encode");
encoder.encode(decoded);
console.timeEnd("TextEncoder.encode");

console.time("TextEncoder.encode");
encoder.encode(decoded1);
console.timeEnd("TextEncoder.encode");

console.time("TextEncoder.encode");
encoder.encode(decoded2);
console.timeEnd("TextEncoder.encode");
```

On macOS arm64 in Bun v0.4.1:

```
[0.69ms] TextDecoder.decode
[0.54ms] TextDecoder.decode
[0.59ms] TextDecoder.decode
[0.21ms] TextEncoder.encode
[0.20ms] TextEncoder.encode
[0.25ms] TextEncoder.encode
```

Safari Technology Preview:

```
TextDecoder.decode: 0.621ms
TextDecoder.decode: 0.589ms
TextDecoder.decode: 0.604ms
TextEncoder.encode: 2.317ms
TextEncoder.encode: 1.945ms
TextEncoder.encode: 1.966ms
```

TextDecoder.decode performs about the same in both. For input containing non-ASCII characters, TextEncoder.encode runs roughly 8x to 11x faster in Bun than in Safari Technology Preview in these runs (about 0.2ms vs. about 2ms), and that's mostly because Bun is using simdutf.
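The timings above come from single `console.time` calls, which can be noisy. As a rough sketch (not part of the original measurement), the same comparison could be repeated and summarized with a median. `medianTime` is a hypothetical helper introduced here for illustration, and the iteration count of 50 is an arbitrary assumption.

```javascript
// Hypothetical helper: run fn repeatedly and return the median duration in ms.
function medianTime(fn, iterations = 50) {
  const samples = [];
  for (let i = 0; i < iterations; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)];
}

const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Same test string as in the report above.
const input = "not all ascii 🙂 🙃 1-2-3-4 abcdefghklmnopqrstuvwxyz".repeat(9999);
const bytes = encoder.encode(input);

const encodeMs = medianTime(() => encoder.encode(input));
const decodeMs = medianTime(() => decoder.decode(bytes));

console.log(`TextEncoder.encode median: ${encodeMs.toFixed(3)}ms`);
console.log(`TextDecoder.decode median: ${decodeMs.toFixed(3)}ms`);
```

Running the same script in Bun and in a browser console should give numbers that are easier to compare than one-off timings, although absolute values will still depend on the machine and build.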
<rdar://problem/104145576>
Pull request: https://github.com/WebKit/WebKit/pull/9990