140173 – The ASCII decoding for non ASCII character is incorrect if this character comes after the going through fast decoding code path and before the end of the text by less than a machine word size of characters

RESOLVED FIXED 140173

The ASCII decoding for non ASCII character is incorrect if this character comes after the going through fast decoding code path and before the end of the text by less than a machine word size of characters

https://bugs.webkit.org/show_bug.cgi?id=140173

Summary The ASCII decoding for non ASCII character is incorrect if this character com...

Said Abou-Hallawa

Reported 2015-01-06 21:09:53 PST

Created attachment 244137 [details] Test case 1. Check the function TextCodecLatin1::decode() 2. Suppose the following text is decoded: "%41%42%43%44%45%46%47%48%82%82". 3. The URL precent encoded text will be converted to 10 bytes array and sent to TextCodecLatin1::decode(), 4. Since the first byte is ASCII and since its index is 8-bytes aligned, we use the fast decoding path. 5. Since the first eight bytes are all ASCII, we do the decoding in one step and and we copy the whole word from the source byte array to the result buffer. 6. Since after the first eight bytes, what is left is less than a machine word, we exit the fast decoding path. 7. The problem is after breaking the fast decoding code path, we copy the current byte as is even if it is a non ASCII character. A test case is attached. It works correctly in FireFox and Chrome and shows two non ASCII 0x82 characters at the end. But in Safari, it shows a single 0x82 character at the end.

Attachments
Test case (73 bytes, text/html) 2015-01-06 21:09 PST, Said Abou-Hallawa	no flags	Details
Patch (5.83 KB, patch) 2015-01-06 21:25 PST, Said Abou-Hallawa	no flags	Details Formatted Diff Diff
Patch (4.61 KB, patch) 2015-01-07 12:04 PST, Said Abou-Hallawa	no flags	Details Formatted Diff Diff
Show Obsolete (1) View All Add attachment proposed patch, testcase, etc.

Said Abou-Hallawa

Comment 1 2015-01-06 21:25:26 PST

Created attachment 244140 [details] Patch

Darin Adler

Comment 2 2015-01-06 23:50:29 PST

Comment on attachment 244140 [details] Patch This makes the all-ASCII-but-not-aligned-to-machine-word case slower unnecessarily. I think a better fix might be to just add: if (!isASCII(*source)) goto useLookupTable; right after the code that says: if (source == end) break; at the end of the isAlignedToMachineWord(source) block.

Darin Adler

Comment 3 2015-01-06 23:51:39 PST

Comment on attachment 244140 [details] Patch Great catch, by the way! Please consider my suggested alternate fix.

Said Abou-Hallawa

Comment 4 2015-01-07 12:04:04 PST

Created attachment 244172 [details] Patch

Said Abou-Hallawa

Comment 5 2015-01-07 12:13:08 PST

(In reply to comment #3) > Comment on attachment 244140 [details] > Patch > > Great catch, by the way! Please consider my suggested alternate fix. Done. Thanks for the suggestion.

WebKit Commit Bot

Comment 6 2015-01-07 21:18:26 PST

Comment on attachment 244172 [details] Patch Clearing flags on attachment: 244172 Committed r178099: <http://trac.webkit.org/changeset/178099>

WebKit Commit Bot

Comment 7 2015-01-07 21:18:29 PST

All reviewed patches have been landed. Closing bug.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Unspecified

OS Unspecified

Product WebKit

Component Page Loading

Assignee

Said Abou-Hallawa

Reported

2015-01-06 21:09 PST

Modified

2015-01-07 21:18 PST History

CC List

2 users Show

URL

Keywords

Depends on

Blocks