I believe the convention is that char* pointers are UTF-8.
Created attachment 12996 [details]
Comments in the file indicate that we use Latin1 on purpose to handle invalid input (since there's no such thing as invalid Latin1).
Darin's check-in comment (from svn) was:
Use Windows Latin-1 rather than MacRoman for messages. I guess we should test later to find out which is used by MacIE and conform to that, but for now this seems like a better default.
Have you done any testing to see what other browsers support?
I'd like Darin to review this, but I think it should be a - unless we're diverging from other browsers here.
Mozilla uses UTF-8: <http://lxr.mozilla.org/mozilla1.8/source/layout/generic/nsObjectFrame.cpp#2547>
Comment on attachment 12996 [details]
In that case, should we convert all the calls from Latin1?
That's a fair question! While investigating this bug, I looked for other places in WebBaseNetscapePluginView.mm that handled C strings as Latin-1, and found GetURL/PostURL. However, URLs are formally ASCII anyway, so the behavior for non-ASCII ones may be more tricky, and I didn't investigate it yet.
Darin, Geoff has suggested asking you to look at this patch before landing it - could you please take a look?
I did look and talked to Geoff. I don't have anything definitive to say here except that:
1) we should not break existing plug-ins, even ones that have assumptions about our strange behavior
2) we should act like other browsers do, to make plug-ins work as well as possible
3) we should behave in a way that "makes sense" and works with any characters
4) if we are encoding or decoding in a way that can have failures, we need to carefully define what we do in those failure cases -- converting arbitrary bytes interpreted as Latin-1 to UTF-16 is an example of something that can never fail, as is encoding valid UTF-16 to UTF-8 bytes, where as converting to Latin-1 or decoding UTF-8 are examples of things that can fail
But I assume you already know all of that.
Committed revision 19608.
I believe that in this particular case an outright failure to set status text is better than displaying garbage that may result from processing it as Latin-1, thus landing as is.
Created attachment 13160 [details]
fix the fix
Darin has pointed out that I forgot the most obvious problem: CFRelease will crash when given a null parameter. So, it's still necessary to explicitly check for conversion failure.
Comment on attachment 13160 [details]
fix the fix
Committed revision 19625.