RESOLVED FIXED 254021
Compression Streams not handling large outputs during the flush stage
https://bugs.webkit.org/show_bug.cgi?id=254021
Summary Compression Streams not handling large outputs during the flush stage
bzugmeyer@gmail.com
Reported 2023-03-16 07:17:08 PDT
I am testing the new `CompressionStream` API in Safari Technology Preview. When trying to compress a moderately large buffer, the data it produces seems to be invalid.

To reproduce: compress a large enough buffer of non-repeating data. In the following example, I use a buffer of approximately 48 kB. The functions come from the WebKit CompressionStream test suite[1]:

const input = new TextEncoder().encode(
  JSON.stringify(Array.from({ length: 10_000 }, (_, i) => i))
);
const output = await compressArrayBuffer(input, "deflate");
assert_array_equals(input, pako.inflate(output));

Expected: `pako.inflate` returns a buffer equal to the input buffer.

Actual: the decompression fails, with `pako.inflate` returning `undefined`.

Notes: with this particular example, I noticed that it works correctly when compressing up to 35_578 bytes. The issue only occurs when I try to compress more bytes (>= 35_579). Please see this page[2] for a practical way to reproduce the issue. In Chrome, every test succeeds, but in Safari the tests with larger inputs fail.

[1]: https://github.com/WebKit/WebKit/blob/20329b62061b40d5a423a1d75b67779945b84729/LayoutTests/imported/w3c/web-platform-tests/compression/compression-stream.tentative.any.js
[2]: https://safari-compressionstream-issue.benoitzugmeyer.repl.co
Attachments
Alexey Proskuryakov
Comment 1 2023-03-16 08:59:08 PDT
Hopefully this is a dupe of bug 252474.
Radar WebKit Bug Importer
Comment 2 2023-03-23 07:18:14 PDT
Brandon
Comment 3 2023-03-23 11:41:36 PDT
Thanks for the bug report! I can reproduce this in Safari, Safari Technology Preview, and MiniBrowser. From what I can tell at first glance, this is not a dupe of the other one.

<!DOCTYPE html>
<html>
<script type="module">
  import * as pako from "https://cdnjs.cloudflare.com/ajax/libs/pako/2.1.0/pako.esm.mjs";
  import * as fflate from "https://unpkg.com/fflate@0.7.4/esm/browser.js";

  // JSON-encoded array of 10 thousand numbers ("[0,1,2,...]"). This produces 48_891 bytes of data.
  const fullData = new TextEncoder().encode(
    JSON.stringify(Array.from({ length: 10_000 }, (_, i) => i))
  );

  await test(10)
  await test(1_000)
  await test(10_000)
  await test(30_000)
  await test(40_000) // fails
  await test(35_578) // succeeds
  await test(35_579) // fails

  async function test(bytesLength) {
    const data = fullData.subarray(0, bytesLength)
    const compressedData = await compressArrayBuffer(data, "deflate");

    // Decompress with pako, and check that we got the same result as our original string
    // Similar to https://github.com/WebKit/WebKit/blob/20329b62061b40d5a423a1d75b67779945b84729/LayoutTests/imported/w3c/web-platform-tests/compression/compression-stream.tentative.any.js#L54-L55
    try {
      assert_array_equals(data, pako.inflate(compressedData));
      console.log(`[pako] Succeeded with ${bytesLength} bytes`)
    } catch (error) {
      console.error(`[pako] Failed with ${bytesLength} bytes:`, error)
    }

    // Double-check with another library
    try {
      assert_array_equals(data, fflate.unzlibSync(compressedData));
      console.log(`[fflate] Succeeded with ${bytesLength} bytes`)
    } catch (error) {
      console.error(`[fflate] Failed with ${bytesLength} bytes:`, error)
    }

    // And round-trip through DecompressionStream itself
    try {
      assert_array_equals(data, await decompress(compressedData));
      console.log(`[deflate] Succeeded with ${bytesLength} bytes`);
    } catch (error) {
      console.error(`[deflate] Failed with ${bytesLength} bytes:`, error);
    }
  }

  async function compressArrayBuffer(input, format) {
    const cs = new CompressionStream(format);
    const writer = cs.writable.getWriter();
    writer.write(input);
    const closePromise = writer.close();
    const out = [];
    const reader = cs.readable.getReader();
    let totalSize = 0;
    while (true) {
      const { value, done } = await reader.read();
      if (done)
        break;
      out.push(value);
      totalSize += value.byteLength;
    }
    await closePromise;
    const concatenated = new Uint8Array(totalSize);
    let offset = 0;
    for (const array of out) {
      concatenated.set(array, offset);
      offset += array.byteLength;
    }
    return concatenated;
  }

  function assert_array_equals(a, b) {
    if (!a) {
      throw new Error(`Arrays not equal: a is falsy (${a})`)
    }
    if (!b) {
      throw new Error(`Arrays not equal: b is falsy (${b})`)
    }
    if (a.length !== b.length) {
      throw new Error(`Arrays not equal: a.length !== b.length (${a.length} !== ${b.length})`)
    }
    a.forEach((v, i) => {
      if (a[i] !== b[i]) {
        throw new Error(`Arrays not equal: a[${i}] !== b[${i}] (${a[i]} !== ${b[i]})`)
      }
    })
  }

  async function concatenateStream(readableStream) {
    const reader = readableStream.getReader();
    let totalSize = 0;
    const buffers = [];
    while (true) {
      const { value, done } = await reader.read();
      if (done) {
        break;
      }
      buffers.push(value);
      totalSize += value.byteLength;
    }
    reader.releaseLock();
    const concatenated = new Uint8Array(totalSize);
    let offset = 0;
    for (const buffer of buffers) {
      concatenated.set(buffer, offset);
      offset += buffer.byteLength;
    }
    return concatenated;
  }

  async function decompress(view) {
    const ds = new DecompressionStream('deflate');
    const writer = ds.writable.getWriter();
    writer.write(view);
    writer.close();
    return await concatenateStream(ds.readable);
  }
</script>
</html>
Brandon
Comment 4 2023-05-10 23:14:49 PDT
So what was happening here is that we missed an edge case: during the flush step we may have more data than the allocated output can hold. Since avail_in was set to 0, we would just exit. We need to verify that the stream has ended before exiting.

The reason the 35_579-byte case failed is that the allocated output during the flush step was 16384 bytes, while the output data was just slightly larger. The first compression pass output only 2 bytes, while the remaining 13686 were only emitted during the flush step.

This fixes it:

@@ -132,8 +131,8 @@ ExceptionOr<RefPtr<JSC::ArrayBuffer>> CompressionStreamEncoder::compress(const u
         result = deflate(&m_zstream, (m_finish) ? Z_FINISH : Z_NO_FLUSH);
         if (result != Z_OK && result != Z_STREAM_END && result != Z_BUF_ERROR)
             return Exception { TypeError, "Failed to compress data."_s };
-
-        if (!m_zstream.avail_in) {
+
+        if (!m_zstream.avail_in && (!m_finish || (m_finish && result == Z_STREAM_END))) {
             shouldCompress = false;
             output.resize(allocateSize - m_zstream.avail_out);
         }

Working on the PR now.
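For additional context on the exit condition, here is a standalone zlib sketch (not the WebKit implementation; the helper name and structure are illustrative) showing why a flush loop must keep calling deflate(Z_FINISH) until it returns Z_STREAM_END, rather than stopping once avail_in reaches 0:

// Minimal zlib flush loop, assuming only the public zlib API.
#include <zlib.h>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

std::vector<unsigned char> deflateAll(const unsigned char* data, size_t size)
{
    z_stream stream;
    std::memset(&stream, 0, sizeof(stream));
    int result = deflateInit(&stream, Z_DEFAULT_COMPRESSION);
    assert(result == Z_OK);

    stream.next_in = const_cast<unsigned char*>(data);
    stream.avail_in = static_cast<unsigned>(size);

    std::vector<unsigned char> output;
    unsigned char chunk[16384]; // same chunk size as the flush-step allocation mentioned above

    do {
        stream.next_out = chunk;
        stream.avail_out = sizeof(chunk);
        result = deflate(&stream, Z_FINISH);
        assert(result == Z_OK || result == Z_STREAM_END || result == Z_BUF_ERROR);
        output.insert(output.end(), chunk, chunk + (sizeof(chunk) - stream.avail_out));
        // Exiting when avail_in == 0 would drop any flushed data that did not fit in
        // the first chunk; Z_STREAM_END is the only safe termination condition.
    } while (result != Z_STREAM_END);

    deflateEnd(&stream);
    return output;
}

With a loop like this, a final block larger than one output chunk simply spans multiple iterations, which is exactly the case the 35_579-byte input was hitting.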
Brandon
Comment 5 2023-05-10 23:18:29 PDT
EWS
Comment 6 2023-05-11 20:07:13 PDT
Committed 263997@main (a30d9284f5b9): <https://commits.webkit.org/263997@main>

Reviewed commits have been landed. Closing PR #13741 and removing active labels.
Brent Fulgham
Comment 7 2023-05-17 13:35:17 PDT
*** Bug 256330 has been marked as a duplicate of this bug. ***