Bug 15814 - fast/js/kde/encode_decode_uri.html fails
Summary: fast/js/kde/encode_decode_uri.html fails
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore
Version: 523.x (Safari 3)
Hardware: Mac OS X 10.4
: P2 Normal
Assignee: Darin Adler
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2007-11-03 00:44 PDT by Darin Adler
Modified: 2007-11-04 16:31 PST

See Also:


Attachments
patch (125.24 KB, patch)
2007-11-03 00:48 PDT, Darin Adler
mjs: review+

Description Darin Adler 2007-11-03 00:44:32 PDT
The checked-in results include failures.

Because of a failure in fast/js/resources/js-test-pre.js, this test used to pass when it was really failing.
Comment 1 Darin Adler 2007-11-03 00:48:36 PDT
Created attachment 17015
patch
Comment 2 Darin Adler 2007-11-03 00:49:11 PDT
<rdar://problem/5536644>
Comment 3 Maciej Stachowiak 2007-11-03 01:36:25 PDT
Comment on attachment 17015
patch

r=me
Comment 4 Alexey Proskuryakov 2007-11-03 02:13:50 PDT
+    return throwError(exec, URIError);

I think it would be good to add a string explaining the failure to throwError.
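
To illustrate what script authors see when decoding fails, here is a minimal sketch in JavaScript. The helper name `decodeFailure` and the malformed input are hypothetical, chosen only for illustration. The wording of the message, if any, is engine-specific and is not taken from the patch.

```javascript
// Hypothetical helper: returns the error thrown while decoding `input`,
// or null when decoding succeeds.
function decodeFailure(input) {
  try {
    decodeURI(input);
    return null;
  } catch (e) {
    return e;
  }
}

// "%E0%A4%A" ends in a truncated percent-escape, so decoding must fail.
const err = decodeFailure("%E0%A4%A");
console.log(err instanceof URIError); // true
console.log(err.name);                // "URIError"
console.log(err.message);             // engine-specific text, possibly empty
```

A descriptive string passed to the error would show up in `err.message`, which is the only detail a script has for telling one decoding failure from another.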

-    // Backwards BOM and U+FFFF should never appear in UTF-8 data.
-    if (c == 0xFFFE || c == 0xFFFF)
-      return -1;

FWIW, Firefox gives a different result for these characters on this test:

FAIL decodeURI(encodeURI(String.fromCharCode(65534))) should be &#65534;. Was &#65533;.
FAIL decodeURI(encodeURI(String.fromCharCode(65535))) should be &#65535;. Was &#65533;.
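
The two failing cases can be reproduced in isolation. The sketch below assumes an engine that decodes these sequences without substituting U+FFFD (65533). The helper name `roundTrips` is hypothetical.

```javascript
// Hypothetical helper: true when a single UTF-16 code unit survives
// encodeURI followed by decodeURI unchanged.
function roundTrips(codeUnit) {
  const original = String.fromCharCode(codeUnit);
  return decodeURI(encodeURI(original)) === original;
}

// U+FFFE and U+FFFF each encode to a three-byte UTF-8 sequence.
console.log(encodeURI(String.fromCharCode(0xFFFE))); // "%EF%BF%BE"
console.log(encodeURI(String.fromCharCode(0xFFFF))); // "%EF%BF%BF"

// An engine that replaces them with U+FFFD on decode would report false here.
console.log(roundTrips(0xFFFE), roundTrips(0xFFFF));
```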
Comment 5 Darin Adler 2007-11-03 09:42:27 PDT
(In reply to comment #4)
> FAIL decodeURI(encodeURI(String.fromCharCode(65534))) should be &#65534;. Was
> &#65533;.
> FAIL decodeURI(encodeURI(String.fromCharCode(65535))) should be &#65535;. Was
> &#65533;.

This behavior seems a bit strange and it's not mentioned by the standard. What do you think we should do?
Comment 6 Darin Adler 2007-11-03 09:52:45 PDT
Committed revision 27406.
Comment 7 Alexey Proskuryakov 2007-11-04 02:44:16 PST
(In reply to comment #5)
> This behavior seems a bit strange and it's not mentioned by the standard. What
> do you think we should do?

I think the new behavior is correct, although I do not know why a comment in the removed code said that "Backwards BOM and U+FFFF should never appear in UTF-8 data." Of course, those are non-characters, but there are 64 more of those, and according to the Unicode FAQ, UTF encoding should round-trip those.
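
For reference, the full set is 66 noncharacters: U+FDD0 through U+FDEF, plus the last two code points of each of the 17 planes. A minimal sketch that enumerates them and checks the round trip follows. The function name `noncharacters` is hypothetical, and the check assumes an engine that does not special-case these code points.

```javascript
// Hypothetical helper: lists every Unicode noncharacter code point.
function noncharacters() {
  const result = [];
  // The contiguous block U+FDD0..U+FDEF (32 code points).
  for (let cp = 0xFDD0; cp <= 0xFDEF; cp++) {
    result.push(cp);
  }
  // U+nFFFE and U+nFFFF for each of the 17 planes (34 code points).
  for (let plane = 0; plane <= 0x10; plane++) {
    result.push(plane * 0x10000 + 0xFFFE, plane * 0x10000 + 0xFFFF);
  }
  return result;
}

const all = noncharacters();
console.log(all.length); // 66

// Collect any noncharacter that does not survive encodeURI + decodeURI.
const failures = all.filter((cp) => {
  const s = String.fromCodePoint(cp);
  return decodeURI(encodeURI(s)) !== s;
});
console.log(failures.length);
```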
Comment 8 Darin Adler 2007-11-04 16:31:21 PST
(In reply to comment #7)
> I think the new behavior is correct, although I do not know why a comment in
> the removed code said that "Backwards BOM and U+FFFF should never appear in
> UTF-8 data." Of course, those are non-characters, but there are 64 more of
> those, and according to the Unicode FAQ, UTF encoding should round-trip those.

I wrote that comment. And I think I was wrong when I wrote it.