Summary: | NPN_Status is using latin-1 encoding for the message instead of UTF-8 | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Alexey Proskuryakov <ap> | ||||||
Component: | Plug-ins | Assignee: | Alexey Proskuryakov <ap> | ||||||
Status: | RESOLVED FIXED | ||||||||
Severity: | Normal | CC: | darin | ||||||
Priority: | P2 | ||||||||
Version: | 420+ | ||||||||
Hardware: | Mac | ||||||||
OS: | OS X 10.4 | ||||||||
Attachments: |
|
Description
Alexey Proskuryakov
2007-02-06 21:49:22 PST
Created attachment 12996 [details]
proposed patch
Comments in the file indicate that we use Latin1 on purpose to handle invalid input (since there's no such thing as invalid Latin1). Darin's check-in comment (from svn) was: Use Windows Latin-1 rather than MacRoman for messages. I guess we should test later to find out which is used by MacIE and conform to that, but for now this seems like a better default. Have you done any testing to see what other browsers support? I'd like Darin to review this, but I think it should be a - unless we're diverging from other browsers here. Mozilla uses UTF-8: <http://lxr.mozilla.org/mozilla1.8/source/layout/generic/nsObjectFrame.cpp#2547> Comment on attachment 12996 [details]
proposed patch
r=me
In that case, should we convert all the calls from Latin1? That's a fair question! While investigating this bug, I looked for other places in WebBaseNetscapePluginView.mm that handled C strings as Latin-1, and found GetURL/PostURL. However, URLs are formally ASCII anyway, so the behavior for non-ASCII ones may be more tricky, and I didn't investigate it yet. Darin, Geoff has suggested asking you to look at this patch before landing it - could you please take a look? I did look and talked to Geoff. I don't have anything definitive to say here except that: 1) we should not break existing plug-ins, even ones that have assumptions about our strange behavior 2) we should act like other browsers do, to make plug-ins work as well as possible 3) we should behave in a way that "makes sense" and works with any characters 4) if we are encoding or decoding in a way that can have failures, we need to carefully define what we do in those failure cases -- converting arbitrary bytes interpreted as Latin-1 to UTF-16 is an example of something that can never fail, as is encoding valid UTF-16 to UTF-8 bytes, where as converting to Latin-1 or decoding UTF-8 are examples of things that can fail But I assume you already know all of that. Committed revision 19608. I believe that in this particular case an outright failure to set status text is better than displaying garbage that may result from processing it as Latin-1, thus landing as is. Created attachment 13160 [details]
fix the fix
Darin has pointed out that I forgot the most obvious problem: CFRelease will crash when given a null parameter. So, it's still necessary to explicitly check for conversion failure.
Comment on attachment 13160 [details]
fix the fix
r=me
Committed revision 19625. |