Created attachment 244137 [details] Test case 1. Check the function TextCodecLatin1::decode() 2. Suppose the following text is decoded: "%41%42%43%44%45%46%47%48%82%82". 3. The URL precent encoded text will be converted to 10 bytes array and sent to TextCodecLatin1::decode(), 4. Since the first byte is ASCII and since its index is 8-bytes aligned, we use the fast decoding path. 5. Since the first eight bytes are all ASCII, we do the decoding in one step and and we copy the whole word from the source byte array to the result buffer. 6. Since after the first eight bytes, what is left is less than a machine word, we exit the fast decoding path. 7. The problem is after breaking the fast decoding code path, we copy the current byte as is even if it is a non ASCII character. A test case is attached. It works correctly in FireFox and Chrome and shows two non ASCII 0x82 characters at the end. But in Safari, it shows a single 0x82 character at the end.
Created attachment 244140 [details] Patch
Comment on attachment 244140 [details] Patch This makes the all-ASCII-but-not-aligned-to-machine-word case slower unnecessarily. I think a better fix might be to just add: if (!isASCII(*source)) goto useLookupTable; right after the code that says: if (source == end) break; at the end of the isAlignedToMachineWord(source) block.
Comment on attachment 244140 [details] Patch Great catch, by the way! Please consider my suggested alternate fix.
Created attachment 244172 [details] Patch
(In reply to comment #3) > Comment on attachment 244140 [details] > Patch > > Great catch, by the way! Please consider my suggested alternate fix. Done. Thanks for the suggestion.
Comment on attachment 244172 [details] Patch Clearing flags on attachment: 244172 Committed r178099: <http://trac.webkit.org/changeset/178099>
All reviewed patches have been landed. Closing bug.