Bug 254021
| Summary: | Compression Streams not handling large outputs during the flush stage | | |
|---|---|---|---|
| Product: | WebKit | Reporter: | bzugmeyer <bzugmeyer> |
| Component: | WebCore Misc. | Assignee: | Brandon <brandonstewart> |
| Status: | RESOLVED FIXED | | |
| Severity: | Normal | CC: | brandonstewart, cdumez, gildas.lormeau, kurt, webkit-bug-importer |
| Priority: | P2 | Keywords: | InRadar |
| Version: | Safari Technology Preview | | |
| Hardware: | Mac (Apple Silicon) | | |
| OS: | macOS 12 | | |
| See Also: | https://bugs.webkit.org/show_bug.cgi?id=252474 | | |
bzugmeyer@gmail.com
I am testing the new `CompressionStream` API in Safari Technology Preview. When I try to compress a moderately large buffer, the data it produces seems to be invalid.
To reproduce:
Compress a sufficiently large buffer of non-repeating data. In the following example, I use a buffer of approximately 48 kB. The functions come from the WebKit CompressionStream test suite[1]:
const input = new TextEncoder().encode(
    JSON.stringify(Array.from({ length: 10_000 }, (_, i) => i))
);
const output = await compressArrayBuffer(input, "deflate");
assert_array_equals(input, pako.inflate(output));
Expected:
`pako.inflate` returns a buffer equal to the input buffer.
Actual:
The decompression fails with `pako.inflate` returning `undefined`.
Notes:
With this particular example, I noticed that it works correctly when compressing up to 35_578 bytes. The issue only occurs when I try to compress more bytes (>= 35_579).
Please see this page[2] for a practical way to reproduce the issue. In Chrome, every test succeeds, but in Safari the tests with larger inputs fail.
[1]: https://github.com/WebKit/WebKit/blob/20329b62061b40d5a423a1d75b67779945b84729/LayoutTests/imported/w3c/web-platform-tests/compression/compression-stream.tentative.any.js
[2]: https://safari-compressionstream-issue.benoitzugmeyer.repl.co
Alexey Proskuryakov
Hopefully this is a dupe of bug 252474.
Radar WebKit Bug Importer
<rdar://problem/107133345>
Brandon
Thanks for the bug report. I can reproduce this in Safari, Safari Technology Preview, and MiniBrowser. From what I can tell at first glance, this is not a dupe of the other bug.
<!DOCTYPE html>
<html>
<script type="module">
import * as pako from "https://cdnjs.cloudflare.com/ajax/libs/pako/2.1.0/pako.esm.mjs";
import * as fflate from "https://unpkg.com/fflate@0.7.4/esm/browser.js";

// JSON-encoded array of 10 thousand numbers ("[0,1,2,...]"). This produces 48_891 bytes of data.
const fullData = new TextEncoder().encode(
    JSON.stringify(Array.from({ length: 10_000 }, (_, i) => i))
);

await test(10)
await test(1_000)
await test(10_000)
await test(30_000)
await test(40_000) // fails
await test(35_578) // succeeds
await test(35_579) // fails

async function test(bytesLength) {
    const data = fullData.subarray(0, bytesLength)
    const compressedData = await compressArrayBuffer(data, "deflate");

    // Decompress with pako, and check that we got the same result as our original string.
    // Similar to https://github.com/WebKit/WebKit/blob/20329b62061b40d5a423a1d75b67779945b84729/LayoutTests/imported/w3c/web-platform-tests/compression/compression-stream.tentative.any.js#L54-L55
    try {
        assert_array_equals(data, pako.inflate(compressedData));
        console.log(`[pako] Succeeded with ${bytesLength} bytes`)
    } catch (error) {
        console.error(`[pako] Failed with ${bytesLength} bytes:`, error)
    }

    // Double check with another library.
    try {
        assert_array_equals(data, fflate.unzlibSync(compressedData));
        console.log(`[fflate] Succeeded with ${bytesLength} bytes`)
    } catch (error) {
        console.error(`[fflate] Failed with ${bytesLength} bytes:`, error)
    }

    // Triple check with the native DecompressionStream.
    try {
        assert_array_equals(data, await decompress(compressedData));
        console.log(`[DecompressionStream] Succeeded with ${bytesLength} bytes`);
    } catch (error) {
        console.error(`[DecompressionStream] Failed with ${bytesLength} bytes:`, error);
    }
}

async function compressArrayBuffer(input, format) {
    const cs = new CompressionStream(format);
    const writer = cs.writable.getWriter();
    writer.write(input);
    const closePromise = writer.close();
    const out = [];
    const reader = cs.readable.getReader();
    let totalSize = 0;
    while (true) {
        const { value, done } = await reader.read();
        if (done)
            break;
        out.push(value);
        totalSize += value.byteLength;
    }
    await closePromise;
    const concatenated = new Uint8Array(totalSize);
    let offset = 0;
    for (const array of out) {
        concatenated.set(array, offset);
        offset += array.byteLength;
    }
    return concatenated;
}

function assert_array_equals(a, b) {
    if (!a) {
        throw new Error(`Arrays not equal: a is falsy (${a})`)
    }
    if (!b) {
        throw new Error(`Arrays not equal: b is falsy (${b})`)
    }
    if (a.length !== b.length) {
        throw new Error(`Arrays not equal: a.length !== b.length (${a.length} !== ${b.length})`)
    }
    a.forEach((v, i) => {
        if (a[i] !== b[i]) {
            throw new Error(`Arrays not equal: a[${i}] !== b[${i}] (${a[i]} !== ${b[i]})`)
        }
    })
}

async function concatenateStream(readableStream) {
    const reader = readableStream.getReader();
    let totalSize = 0;
    const buffers = [];
    while (true) {
        const { value, done } = await reader.read();
        if (done) {
            break;
        }
        buffers.push(value);
        totalSize += value.byteLength;
    }
    reader.releaseLock();
    const concatenated = new Uint8Array(totalSize);
    let offset = 0;
    for (const buffer of buffers) {
        concatenated.set(buffer, offset);
        offset += buffer.byteLength;
    }
    return concatenated;
}

async function decompress(view) {
    const ds = new DecompressionStream('deflate');
    const writer = ds.writable.getWriter();
    writer.write(view);
    writer.close();
    return await concatenateStream(ds.readable);
}
</script>
</html>
Brandon
What was happening here is that we missed an edge case: during the flush step, the compressor can produce more data than the allocated output buffer holds. Since `avail_in` was 0 at that point, we would just exit the loop. We need to verify that the stream has actually ended before exiting.
The reason the 35_579-byte input failed is that the output buffer allocated during the flush step was 16384 bytes, while the flushed data was slightly larger. The first compression call output only 2 bytes; the remaining 13686 were only emitted during the flush step.
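For context, this is the standard zlib pattern: `deflate` only reports `Z_STREAM_END` once the final bits of the stream have been written, and a single `deflate(..., Z_FINISH)` call can fill the output buffer before that happens, so the loop must keep going even though `avail_in` is already 0. Below is a minimal standalone sketch of that pattern against plain zlib; the function name `deflateAll` and the buffer handling are illustrative, not WebKit's actual encoder:

```cpp
#include <zlib.h>
#include <cstddef>
#include <vector>

// Minimal sketch of a correct deflate flush loop against plain zlib.
// Key point: when finishing, keep calling deflate(Z_FINISH) until it
// returns Z_STREAM_END. `avail_in == 0` only means the *input* has been
// consumed; compressed output may still be buffered inside zlib.
std::vector<unsigned char> deflateAll(const unsigned char* input, size_t inputSize)
{
    z_stream stream { };
    if (deflateInit(&stream, Z_DEFAULT_COMPRESSION) != Z_OK)
        return { };

    stream.next_in = const_cast<unsigned char*>(input);
    stream.avail_in = static_cast<uInt>(inputSize);

    std::vector<unsigned char> output;
    unsigned char chunk[16384]; // same size as the flush-step buffer mentioned above

    int result = Z_OK;
    do {
        stream.next_out = chunk;
        stream.avail_out = sizeof(chunk);
        result = deflate(&stream, Z_FINISH);
        if (result != Z_OK && result != Z_STREAM_END && result != Z_BUF_ERROR) {
            deflateEnd(&stream);
            return { };
        }
        // Each call may fill `chunk` completely before the stream trailer
        // has been emitted, which is exactly the >16384-byte flush case.
        output.insert(output.end(), chunk, chunk + (sizeof(chunk) - stream.avail_out));
    } while (result != Z_STREAM_END); // not `while (stream.avail_in)`

    deflateEnd(&stream);
    return output;
}
```

This mirrors the fixed condition in the patch below: exit the loop only when the input is consumed and, when finishing, `deflate` has returned `Z_STREAM_END`.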
The following change fixes it:
@@ -132,8 +131,8 @@ ExceptionOr<RefPtr<JSC::ArrayBuffer>> CompressionStreamEncoder::compress(const u
         result = deflate(&m_zstream, (m_finish) ? Z_FINISH : Z_NO_FLUSH);
         if (result != Z_OK && result != Z_STREAM_END && result != Z_BUF_ERROR)
             return Exception { TypeError, "Failed to compress data."_s };
-
-        if (!m_zstream.avail_in) {
+
+        if (!m_zstream.avail_in && (!m_finish || (m_finish && result == Z_STREAM_END))) {
             shouldCompress = false;
             output.resize(allocateSize - m_zstream.avail_out);
         }
Working on the PR now.
Brandon
Pull request: https://github.com/WebKit/WebKit/pull/13741
EWS
Committed 263997@main (a30d9284f5b9): <https://commits.webkit.org/263997@main>
Reviewed commits have been landed. Closing PR #13741 and removing active labels.
Brent Fulgham
*** Bug 256330 has been marked as a duplicate of this bug. ***