Bug 12643

Summary: NPN_Status is using latin-1 encoding for the message instead of UTF-8
Product: WebKit Reporter: Alexey Proskuryakov <ap>
Component: Plug-insAssignee: Alexey Proskuryakov <ap>
Status: RESOLVED FIXED    
Severity: Normal CC: darin
Priority: P2    
Version: 420+   
Hardware: Mac   
OS: OS X 10.4   
Attachments:
Description Flags
proposed patch
mjs: review+
fix the fix darin: review+

Alexey Proskuryakov
Reported 2007-02-06 21:49:22 PST
I believe the convention is that char* pointers are UTF-8.
Attachments
proposed patch (1.34 KB, patch)
2007-02-06 21:52 PST, Alexey Proskuryakov
mjs: review+
fix the fix (1.28 KB, patch)
2007-02-13 21:28 PST, Alexey Proskuryakov
darin: review+
Alexey Proskuryakov
Comment 1 2007-02-06 21:52:41 PST
Created attachment 12996 [details] proposed patch
Geoffrey Garen
Comment 2 2007-02-06 22:08:58 PST
Comments in the file indicate that we use Latin1 on purpose to handle invalid input (since there's no such thing as invalid Latin1). Darin's check-in comment (from svn) was: Use Windows Latin-1 rather than MacRoman for messages. I guess we should test later to find out which is used by MacIE and conform to that, but for now this seems like a better default. Have you done any testing to see what other browsers support? I'd like Darin to review this, but I think it should be a - unless we're diverging from other browsers here.
Alexey Proskuryakov
Comment 3 2007-02-06 22:17:37 PST
Maciej Stachowiak
Comment 4 2007-02-06 22:45:42 PST
Comment on attachment 12996 [details] proposed patch r=me
Geoffrey Garen
Comment 5 2007-02-07 06:45:27 PST
In that case, should we convert all the calls from Latin1?
Alexey Proskuryakov
Comment 6 2007-02-07 09:15:57 PST
That's a fair question! While investigating this bug, I looked for other places in WebBaseNetscapePluginView.mm that handled C strings as Latin-1, and found GetURL/PostURL. However, URLs are formally ASCII anyway, so the behavior for non-ASCII ones may be more tricky, and I didn't investigate it yet.
Alexey Proskuryakov
Comment 7 2007-02-10 06:44:36 PST
Darin, Geoff has suggested asking you to look at this patch before landing it - could you please take a look?
Darin Adler
Comment 8 2007-02-12 10:39:53 PST
I did look and talked to Geoff. I don't have anything definitive to say here except that: 1) we should not break existing plug-ins, even ones that have assumptions about our strange behavior 2) we should act like other browsers do, to make plug-ins work as well as possible 3) we should behave in a way that "makes sense" and works with any characters 4) if we are encoding or decoding in a way that can have failures, we need to carefully define what we do in those failure cases -- converting arbitrary bytes interpreted as Latin-1 to UTF-16 is an example of something that can never fail, as is encoding valid UTF-16 to UTF-8 bytes, where as converting to Latin-1 or decoding UTF-8 are examples of things that can fail But I assume you already know all of that.
Alexey Proskuryakov
Comment 9 2007-02-13 11:44:03 PST
Committed revision 19608. I believe that in this particular case an outright failure to set status text is better than displaying garbage that may result from processing it as Latin-1, thus landing as is.
Alexey Proskuryakov
Comment 10 2007-02-13 21:28:27 PST
Created attachment 13160 [details] fix the fix Darin has pointed out that I forgot the most obvious problem: CFRelease will crash when given a null parameter. So, it's still necessary to explicitly check for conversion failure.
Darin Adler
Comment 11 2007-02-14 08:57:16 PST
Comment on attachment 13160 [details] fix the fix r=me
Alexey Proskuryakov
Comment 12 2007-02-14 10:12:09 PST
Committed revision 19625.
Note You need to log in before you can comment on or make changes to this bug.