The TextCodecUTF8 and TextCodecLatin1 decoding routines have a calculation error in the update to the 'destination16' pointer. This was found by static analysis of the code. In both files, 'destination16' is a UChar* (a pointer to 16-bit characters), while 'source' points to 8-bit values. We advance 'source' by sizeof(MachineWord), which is the number of 8-bit source characters consumed during the decode. If we advance 'destination16' by that same number of bytes, we move past twice as many 16-bit characters as we should. We should be incrementing by sizeof(MachineWord) / sizeof(UChar).
Created attachment 244576 [details] Patch
I misunderstood what copyASCIIMachineWord was doing here. Incrementing 'destination16' by sizeof(MachineWord) is correct, and the static analyzer warning is spurious.
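For readers following along, here is a minimal sketch of why the existing increment holds up. It assumes the fast path widens one machine word of 8-bit input into the same number of 16-bit characters; widenOneWord and widenAscii are hypothetical stand-ins, not the actual WebKit code, and the type aliases are simplified. The key point is that C++ pointer arithmetic on a UChar* counts elements, not bytes, so adding sizeof(MachineWord) advances by exactly the number of characters just written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified aliases for illustration only (assumptions, not WebKit's definitions).
using MachineWord = std::uintptr_t; // one native word of 8-bit input
using UChar = std::uint16_t;        // 16-bit code unit
using LChar = std::uint8_t;         // 8-bit code unit

// Hypothetical stand-in for the word-at-a-time copy: reads sizeof(MachineWord)
// bytes and writes the same number of 16-bit characters.
inline void widenOneWord(UChar* destination, const LChar* source)
{
    for (std::size_t i = 0; i < sizeof(MachineWord); ++i)
        destination[i] = source[i];
}

// Widens 'length' ASCII bytes into 16-bit characters.
// Returns the number of UChars written.
inline std::size_t widenAscii(UChar* destination16, const LChar* source, std::size_t length)
{
    assert(destination16 && source);
    UChar* const start = destination16;
    const LChar* const end = source + length;

    while (static_cast<std::size_t>(end - source) >= sizeof(MachineWord)) {
        widenOneWord(destination16, source);
        source += sizeof(MachineWord);        // sizeof(MachineWord) bytes consumed
        destination16 += sizeof(MachineWord); // same count of UChars produced;
                                              // the compiler scales by sizeof(UChar)
    }

    // Tail: fewer than one word of input remains.
    while (source < end)
        *destination16++ = *source++;

    return static_cast<std::size_t>(destination16 - start);
}
```

Dividing the increment by sizeof(UChar), as the patch proposed, would advance the destination by only half the characters written, so each later word would overwrite part of the previous output.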
Comment on attachment 244576 [details] Patch
These three fixes look good. Why no regression tests for any of them? Didn’t these bugs cause any symptoms? We normally require regression tests for all bug fixes.
Comment on attachment 244576 [details] Patch
Oops, as you said, the warning was wrong for the destination16 lines!