Bug 292925
| Summary: | [WTR] Replace invalid UTF-8 bytes instead of crashing | ||
|---|---|---|---|
| Product: | WebKit | Reporter: | Alicia Boya García <aboya> |
| Component: | Tools / Tests | Assignee: | Alicia Boya García <aboya> |
| Status: | RESOLVED FIXED | ||
| Severity: | Normal | CC: | webkit-bug-importer |
| Priority: | P2 | Keywords: | InRadar |
| Version: | WebKit Nightly Build | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
Alicia Boya García
Currently, if invalid UTF-8 is printed to stderr, the test runner crashes with an error like this:
UnicodeDecodeError raised: 'utf-8' codec can't decode byte 0xd6 in position 179: invalid continuation byte
This is a particular problem when running tests with the debugging environment variables that various libraries honor: any improperly encoded output will not only crash the runner, but also leave you with very few clues about what caused it.
This patch makes the test runner code that reads stderr use errors="replace" when decoding UTF-8: any invalid UTF-8 sequences will be replaced by U+FFFD � REPLACEMENT CHARACTER.
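To illustrate the difference, here is a minimal sketch using plain `bytes.decode`. The `raw_stderr` payload is made up for the example and this is not the actual test runner code; it only shows how strict decoding fails while `errors="replace"` substitutes U+FFFD.

```python
# Hypothetical stderr payload: 0xd6 starts a two-byte sequence but is
# followed by a space, which is not a valid continuation byte.
raw_stderr = b"library debug: caf\xd6 loaded\n"

# Strict decoding (the default) raises UnicodeDecodeError.
try:
    raw_stderr.decode("utf-8")
    strict_failed = False
    strict_reason = None
except UnicodeDecodeError as error:
    strict_failed = True
    strict_reason = error.reason

# Lenient decoding replaces the undecodable byte with U+FFFD and keeps going.
decoded = raw_stderr.decode("utf-8", errors="replace")
```

The rest of the line survives intact, so the surrounding log output remains readable.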
This allows users to continue debugging in the presence of invalid UTF-8 in stderr logs. Any invalid UTF-8 sequences can still be found by searching for the replacement character.
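As a rough sketch of that search, the helper below lists the lines of a decoded log that contain the replacement character. The function name `lines_with_replacements` and the sample log are hypothetical, not part of the patch.

```python
REPLACEMENT_CHARACTER = "\ufffd"


def lines_with_replacements(log_text):
    """Return (line_number, line) pairs for lines containing U+FFFD."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if REPLACEMENT_CHARACTER in line
    ]


# Hypothetical decoded stderr log with one damaged line.
sample_log = "starting test\nlibrary debug: caf\ufffd loaded\ntest finished\n"
hits = lines_with_replacements(sample_log)
```

A plain text search for `�` in the `-stderr.txt` file accomplishes the same thing without any code.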
The specific invalid sequence is lost. Personally, I would prefer that stderr be collected as a bytestring, so that the -stderr.txt file contains a byte-for-byte copy of what the test runner emitted, but the refactoring needed to accomplish that is outside the scope of this patch.
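For comparison, a byte-preserving capture might look roughly like the sketch below. It uses only the standard `subprocess` module; the helper names, the child command, and the file handling are assumptions made for illustration and do not reflect how the test runner is structured.

```python
import subprocess
import sys


def capture_stderr_bytes(argv):
    """Run a command and return its stderr as undecoded bytes."""
    result = subprocess.run(argv, stderr=subprocess.PIPE, check=False)
    return result.stderr


def write_stderr_file(path, data):
    """Write the captured bytes verbatim, with no decoding step."""
    with open(path, "wb") as stderr_file:
        stderr_file.write(data)


# Hypothetical child process that emits one invalid UTF-8 byte on stderr.
child_code = r"import sys; sys.stderr.buffer.write(b'caf\xd6 loaded\n')"
captured = capture_stderr_bytes([sys.executable, "-c", child_code])
```

Because nothing is decoded, the invalid byte reaches the output file unchanged, at the cost of every consumer having to handle bytes rather than text.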
Alicia Boya García
Pull request: https://github.com/WebKit/WebKit/pull/45307
EWS
Committed 295135@main (3d5b2ebc6280): <https://commits.webkit.org/295135@main>
Reviewed commits have been landed. Closing PR #45307 and removing active labels.
Radar WebKit Bug Importer
<rdar://problem/151656136>