Reproduction steps:
1. Go to www.wo99.com
Issue: The password label ("密 码") is displayed with a rectangle in the middle.
Expected: No rectangle should be displayed between the characters.
Other browsers: IE, FF, Opera: work fine.
Reason: This looks like a GB2312/GBK converter issue. Firefox's converter maps the two bytes between 密 and 码 in the original document to U+3000, but the ICU converter used by Safari apparently maps them to U+E5E5.
Nightly tested: WebKit r29785
Attached are the screenshot and reduction.
Created attachment 18694 [details] screenshot
Created attachment 18695 [details] Reduction
I do not think that this is a general problem with PUAs, so I am renaming the bug to match its scope as I understand it. Please correct me if I'm wrong.
Some history: A3A0 (or 0300 in unencoded form) was undefined in the original GB2312, GB2312-80, GBK, and Microsoft's version of the latter. However, due to what looks like a bug, browsers mapped it to Unicode U+3000. WebKit also used to have a workaround for this, added for <rdar://problem/3225472> "www.sina.com.cn uses A3A0 for full-width space". This workaround was lost when switching to ICU.
GB18030, which is the next iteration of GBK, maps it to a private use character, U+E5E5, but browsers do not follow the spec on this point.
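To make the history above concrete, here is a minimal Python sketch of the two ideas involved. It is not the committed patch, and all names in it are hypothetical. It assumes that "0300" refers to the row/cell position obtained by subtracting 0xA0 from each byte, and that a decoded page never uses U+E5E5 for anything other than the A3A0 full-width space.

```python
IDEOGRAPHIC_SPACE = "\u3000"  # what browsers historically produced for A3A0
GB18030_PUA_A3A0 = "\ue5e5"   # what a GB18030-conformant decoder produces

def row_cell(lead, trail):
    """Convert an EUC-style byte pair to its (row, cell) position."""
    return lead - 0xA0, trail - 0xA0

def in_gb2312_grid(lead, trail):
    """GB2312 is a 94x94 grid; rows and cells are numbered 1..94."""
    row, cell = row_cell(lead, trail)
    return 1 <= row <= 94 and 1 <= cell <= 94

def remap_fullwidth_space(decoded):
    """Post-process decoded text: treat U+E5E5 as a full-width space.

    Assumption: U+E5E5 only ever appears because the source bytes were A3A0.
    """
    return decoded.replace(GB18030_PUA_A3A0, IDEOGRAPHIC_SPACE)

print(row_cell(0xA3, 0xA0))        # (3, 0)
print(in_gb2312_grid(0xA3, 0xA0))  # False
```

Cell 0 lies outside the 1..94 range, which is consistent with A3A0 being undefined in GB2312. Applying the remap after decoding keeps the converter itself untouched, at the cost of also rewriting any intentional use of U+E5E5.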
Created attachment 18721 [details] proposed fix
I am not sure if this code is actually needed in TextCodecMac, but I do not see any compelling reason to remove it, either.
Comment on attachment 18721 [details] proposed fix
r=me
Committed revision 29826.