WebKit Bugzilla
RESOLVED FIXED
280593
TextDecoder raises "RangeError: Bad value" exception after 2GB of text
https://bugs.webkit.org/show_bug.cgi?id=280593
Jacob Bandes-Storch
Reported
2024-09-29 16:52:00 PDT
Created attachment 472732
Test page demonstrating error after 2GB of text decoding

TextDecoder seems to crash after processing 2GB of text in streaming mode.

## Steps to reproduce:
1. Create a 3GB test file using: truncate -s 3G test.txt
2. Open textdecoder-test.html (attached to this bug) in Safari
3. Click "choose file" and select the test.txt created in step 1
4. Observe the progress bar stops at 2.00GB and then an error is logged to the console: "Unhandled Promise Rejection: RangeError: Bad value"

## Expected behavior:
No error -- should be able to continue parsing text beyond the 2GB range.

## Notes:
Works as expected in Chrome and Firefox.
Attachments
Test page demonstrating error after 2GB of text decoding (1.21 KB, text/html), 2024-09-29 16:52 PDT, Jacob Bandes-Storch
Radar WebKit Bug Importer
Comment 1
2024-10-06 16:52:13 PDT
<rdar://problem/137394167>
wyhaya
Comment 2
2025-03-04 22:57:20 PST
Minimal replication:

```
const text = new TextDecoder()
const buff = new ArrayBuffer(100)
for (let i = 0; i < 21474837; i++) {
    text.decode(buff)
}
```

This triggers the same error. If you change 21474837 to 21474836, it runs normally.
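The two loop counts in the replication sit on either side of 2^31 - 1 bytes of cumulative input. The check below works through that arithmetic. The limit of 2^31 - 1 is an assumption inferred from these numbers and is not stated in this thread; `ASSUMED_LIMIT` and the other names are illustrative.

```javascript
// Assumption: the cumulative limit is 2^31 - 1 bytes (inferred, not confirmed here).
const ASSUMED_LIMIT = 2 ** 31 - 1; // 2147483647
const BYTES_PER_CALL = 100; // size of the ArrayBuffer in the replication

const passingTotal = 21474836 * BYTES_PER_CALL; // 2147483600
const failingTotal = 21474837 * BYTES_PER_CALL; // 2147483700

console.log(passingTotal <= ASSUMED_LIMIT); // true: stays under the assumed limit
console.log(failingTotal > ASSUMED_LIMIT); // true: the final call crosses it
```

If the assumption holds, the error depends only on the running total of bytes passed to one TextDecoder instance, not on the size of any single call.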
Darin Adler
Comment 3
2025-03-05 09:09:17 PST
There is currently code in TextDecoder that keeps track of the total number of decoded bytes passed in, and throws a range error if that total number of decoded bytes is greater than the maximum string length supported by WebKit and its JavaScript engine. I’m not sure why that check is present. The simplest way to start fixing this bug is to remove TextDecoder::m_decodedBytes and the code that checks it entirely. Next, we have to figure out if TextDecoder handles cases where the passed in data is very large. This check may have been protecting the underlying code from being tested in various edge cases, and we’d want to fix those. But it’s possible that just removing m_decodedBytes will take care of the whole problem without requiring any additional work.
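To make the described behavior concrete, here is a toy JavaScript model of a decoder that keeps a running byte total and throws once the total passes a limit. It is only a sketch of the behavior described in this comment, not WebKit's C++ code. `CountingDecoderModel` and `ASSUMED_MAX_STRING_LENGTH` are hypothetical names, and the value 2^31 - 1 and the point at which the counter is checked are assumptions.

```javascript
// Toy model only: mimics the described behavior, not WebKit's implementation.
const ASSUMED_MAX_STRING_LENGTH = 2 ** 31 - 1; // assumption, not stated in this thread

class CountingDecoderModel {
  constructor(limit = ASSUMED_MAX_STRING_LENGTH) {
    this.limit = limit;
    this.decodedBytes = 0; // plays the role of TextDecoder::m_decodedBytes
    this.inner = new TextDecoder();
  }

  // input: ArrayBuffer or typed array; options are passed through unchanged.
  decode(input, options) {
    this.decodedBytes += input.byteLength;
    if (this.decodedBytes > this.limit) {
      // The total across all calls is checked, not the size of this one call.
      throw new RangeError("Bad value");
    }
    return this.inner.decode(input, options);
  }
}

// With a small limit, the third 100-byte call fails although each call is tiny.
const model = new CountingDecoderModel(250);
model.decode(new ArrayBuffer(100));
model.decode(new ArrayBuffer(100));
try {
  model.decode(new ArrayBuffer(100));
} catch (error) {
  console.log(error.name); // RangeError
}
```

In this model the counter belongs to the instance, so a newly constructed decoder starts again from zero.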
Darin Adler
Comment 4
2025-03-05 09:17:44 PST
This limit was added to try to address a security problem with producing output strings that are too long. But the code change instead limited the total number of characters passed in to each codec, which is stricter than is needed. To fix this we need to correctly address the original security issue by making sure codecs can never produce a string larger than the maximum string length, and remove TextDecoder::m_decodedBytes since it will no longer be necessary. It’s unsafe to remove TextDecoder::m_decodedBytes without addressing the underlying issue of TextCodec decode functions producing strings that are too long.
Darin Adler
Comment 5
2025-03-07 16:41:22 PST
Not to "try" to address a security problem. It successfully addressed the security problem.
Jacob Bandes-Storch
Comment 6
2025-03-17 21:22:55 PDT
Here is an example workaround that I ended up using for this issue – just create a new TextDecoder every so often:
https://github.com/jtbandes/mbox.wtf/blob/e039291e70160700c47a6876a41db967375e631e/src/readLines.ts#L20-L32
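The linked code is not reproduced here. The following is an independent sketch of the same idea for streaming UTF-8, where the decoder must be swapped at a character boundary so that a sequence split across chunks is not lost. `RESET_THRESHOLD`, `utf8BoundaryIndex`, and `ResettingStreamDecoder` are hypothetical names, and the sketch assumes valid UTF-8 input delivered in chunks much smaller than the threshold.

```javascript
// Hypothetical threshold: 1 GiB, chosen to stay well under the 2GB failure point.
const RESET_THRESHOLD = 1024 * 1024 * 1024;

// Returns an index i such that chunk.subarray(0, i) does not end inside a
// UTF-8 sequence that started in this chunk (assumes valid UTF-8).
function utf8BoundaryIndex(chunk) {
  // An incomplete trailing sequence has its lead byte within the last 3 bytes.
  for (let i = chunk.length - 1; i >= 0 && i >= chunk.length - 3; i--) {
    const byte = chunk[i];
    if ((byte & 0xc0) === 0x80) continue; // continuation byte, keep looking back
    const needed = byte >= 0xf0 ? 4 : byte >= 0xe0 ? 3 : byte >= 0xc0 ? 2 : 1;
    return i + needed > chunk.length ? i : chunk.length;
  }
  return chunk.length;
}

class ResettingStreamDecoder {
  constructor(threshold = RESET_THRESHOLD) {
    this.threshold = threshold;
    this.decoder = new TextDecoder("utf-8");
    this.bytesFed = 0; // bytes given to the current decoder instance
  }

  // chunk: Uint8Array holding the next slice of the stream.
  decode(chunk) {
    // Chunks under 4 bytes may lie entirely inside one sequence, so defer the swap.
    if (this.bytesFed + chunk.length < this.threshold || chunk.length < 4) {
      this.bytesFed += chunk.length;
      return this.decoder.decode(chunk, { stream: true });
    }
    // Finish the old decoder at a character boundary, then start a fresh one.
    const split = utf8BoundaryIndex(chunk);
    const head = this.decoder.decode(chunk.subarray(0, split));
    // ignoreBOM keeps a mid-stream U+FEFF from being stripped by the new decoder.
    this.decoder = new TextDecoder("utf-8", { ignoreBOM: true });
    this.bytesFed = chunk.length - split;
    return head + this.decoder.decode(chunk.subarray(split), { stream: true });
  }

  // Call once after the last chunk to flush any pending bytes.
  finish() {
    return this.decoder.decode();
  }
}
```

A caller would feed each chunk read from the file to `decode` and append the returned strings, then call `finish` at the end. A single chunk larger than the limit is outside what this sketch handles.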
Erik Zivkovic
Comment 7
2025-04-05 22:49:32 PDT
We have a long-running wasm application that hits this problem very often, affecting thousands of Safari users every week. wasm-bindgen (Rust) generates this piece of code for every app using it:

```
const lTextEncoder = typeof TextEncoder === 'undefined' ? (0, module.require)('util').TextEncoder : TextEncoder;

let cachedTextEncoder = new lTextEncoder('utf-8');

function getStringFromWasm0(ptr, len) {
    ptr = ptr >>> 0;
    return cachedTextDecoder.decode(getUint8ArrayMemory0().subarray(ptr, ptr + len));
}
```

And then getStringFromWasm0 gets used for everything that needs to do JS things with Rust strings.
Erik Zivkovic
Comment 8
2025-04-06 00:09:01 PDT
Sorry, I copied the wrong snippet, this is the correct one:

```
const lTextDecoder = typeof TextDecoder === 'undefined' ? (0, module.require)('util').TextDecoder : TextDecoder;

let cachedTextDecoder = new lTextDecoder('utf-8', { ignoreBOM: true, fatal: true });

function getStringFromWasm0(ptr, len) {
    ptr = ptr >>> 0;
    return cachedTextDecoder.decode(getUint8ArrayMemory0().subarray(ptr, ptr + len));
}
```
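Because each call in the snippet above decodes a complete string without streaming, the decoder carries no state between calls, and it can be replaced at any point. The sketch below shows one hypothetical way to do that. It is not code generated by wasm-bindgen; `createStringReader`, `MAX_BYTES_PER_DECODER`, and the `getMemory` callback (standing in for getUint8ArrayMemory0) are illustrative names and values.

```javascript
// Hypothetical threshold: 1 GiB, chosen to stay well under the 2GB failure point.
const MAX_BYTES_PER_DECODER = 1024 * 1024 * 1024;
const DECODER_OPTIONS = { ignoreBOM: true, fatal: true };

// getMemory: function returning a Uint8Array view of the wasm memory.
function createStringReader(getMemory, maxBytes = MAX_BYTES_PER_DECODER) {
  let decoder = new TextDecoder("utf-8", DECODER_OPTIONS);
  let bytesDecoded = 0; // bytes given to the current decoder instance
  let resets = 0;

  function getString(ptr, len) {
    ptr = ptr >>> 0;
    if (bytesDecoded + len >= maxBytes) {
      // Swap in a fresh decoder before the running total can reach the limit.
      decoder = new TextDecoder("utf-8", DECODER_OPTIONS);
      bytesDecoded = 0;
      resets += 1;
    }
    bytesDecoded += len;
    return decoder.decode(getMemory().subarray(ptr, ptr + len));
  }

  return { getString, resetCount: () => resets };
}
```

For example, `createStringReader(getUint8ArrayMemory0).getString(ptr, len)` would take the place of the direct `cachedTextDecoder.decode(...)` call.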
Darin Adler
Comment 9
2025-04-07 11:14:42 PDT
Understood. The reason this hasn’t been fixed quickly is that it’s challenging to fix this problem without reintroducing the security vulnerability.
Alex Christensen
Comment 10
2025-04-07 11:52:41 PDT
Pull request:
https://github.com/WebKit/WebKit/pull/43753
EWS
Comment 11
2025-04-08 07:41:04 PDT
Committed 293416@main (8c5a1e5c6db5): <https://commits.webkit.org/293416@main>

Reviewed commits have been landed. Closing PR #43753 and removing active labels.