Glyphs for what I suspect is anything above the Basic Multilingual Plane (U+0000 to U+FFFF) get rendered as a diamond with a question mark in the middle, even when a glyph exists at that value.
Could you please provide a test case? Given that you mention the diamond with a question mark, I suspect that you're seeing this problem on Mac, but then I'm confused, because Mac WebKit certainly supports non-BMP characters. Are you seeing this in Safari on Mac, some other WebKit-based browser on Mac, or some entirely different platform?
Created attachment 119817 [details]
A zip file containing a demonstration of the bug

The zip file contains an HTML file that displays characters at specific Unicode values. The table shows the corresponding Unicode value in the right-most column.
I've checked this on Safari v5.1, Chrome v16 and the latest WebKit build, all on the Mac. Let me know if there's anything else I can provide.
This is specifically an issue with parsing strings like '\01f3a4'. It will work if you just paste a Unicode character in your CSS (and add a @charset rule to make sure it's decoded correctly). I don't know if we're matching the spec or not here.
Created attachment 120122 [details] demo of literal character working fine
Thanks Alexey. That works for me in Safari 5.1 and WebKit (Chrome doesn't seem to support it). I poked around as well and couldn't discern if this isn't appropriate. Pragmatically, it may prove frustrating for people looking at the CSS file with a typeface that doesn't contain those glyphs. But that's not technically your problem. ;)
My lord, apologies for such a poorly-written comment. It's been a long day...
(In reply to comment #4)
> This is specifically an issue with parsing strings like '\01f3a4'. It will work if you just paste a Unicode character in your CSS (and add a @charset rule to make sure it's decoded correctly).
>
> I don't know if we're matching the spec or not here.

FWIW, the spec is here: http://www.w3.org/TR/CSS21/syndata.html#characters / http://www.w3.org/TR/css3-syntax/#characters

It doesn’t mention anything about UTF-16 or surrogate pairs in escapes (which are thus non-standard, although they happen to be supported in WebKit); only Unicode / ISO 10646 code points are allowed in CSS escape sequences. This kind of CSS escape sequence doesn’t work in WebKit for characters outside the BMP, which is what this bug is about.

For more info, see this mailing list discussion: http://lists.w3.org/Archives/Public/www-style/2012Jan/thread.html#msg536

For example, `\1d306 ` or `\01d306` are supposed to be CSS escape sequences for the “tetragram for centre” symbol (U+1D306), but they currently don’t work in WebKit.

(In reply to comment #5)
> Created an attachment (id=120122) [details]
> reduced test case

I’m not sure how that test case helps, as it doesn’t contain a CSS escape sequence, just the literal character. Am I missing something?

Here’s an appropriate test case: http://jsfiddle.net/mathias/jY7ra/ The first escape sequence (used with `html:before`) is the standard one. WebKit is the only engine this fails in.
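For reference, here is a minimal sketch of how a supplementary code point such as U+1D306 maps to the UTF-16 surrogate pair that an engine with 16-bit strings has to store. The helper name `toSurrogatePair` is made up for illustration; it is not part of WebKit or of any spec.

```javascript
// Hypothetical helper: split a supplementary code point
// (U+10000..U+10FFFF) into its UTF-16 high and low surrogates.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
    throw new RangeError('not a supplementary code point');
  }
  const offset = codePoint - 0x10000;      // 20-bit value
  const high = 0xD800 + (offset >> 10);    // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);   // bottom 10 bits
  return [high, low];
}

// U+1D306 is the code point used in the escape sequences above.
const [high, low] = toSurrogatePair(0x1D306);
console.log(high.toString(16), low.toString(16)); // d834 df06
```

So the single standard escape `\1d306 ` has to end up as two UTF-16 code units internally, which is presumably where an engine that only handles 16-bit values in escapes goes wrong.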
Created attachment 122271 [details] reduced test case
Comment on attachment 120122 [details]
demo of literal character working fine

> I’m not sure how that test case helps

It was meant as a demonstration that the issue is more limited in scope than originally reported. I chose a poor description for the attachment, sorry for the confusion.
This is a tokenizer-level issue (AP, thanks for CC'ing me). It would not be much trouble to fix in the custom-written tokenizer after it has landed; it just needs some extra parsing for the escape sequences.
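As a rough illustration of the kind of escape parsing involved, here is a small JavaScript sketch that decodes CSS escapes (a backslash, one to six hex digits, and an optional single whitespace character) into full code points. It is not WebKit's tokenizer; `decodeCssEscapes` is a hypothetical name, and mapping zero, surrogate, and out-of-range values to U+FFFD is an assumption borrowed from later CSS syntax drafts rather than something stated in this thread.

```javascript
// Hypothetical helper, not WebKit code: decode CSS escape sequences
// in a string into the characters they stand for.
function decodeCssEscapes(input) {
  // Either: backslash + 1-6 hex digits + one optional whitespace char,
  // or: backslash + any other character except a newline.
  const escapePattern =
    /\\([0-9a-fA-F]{1,6})(?:\r\n|[ \t\r\n\f])?|\\([^\r\n\f0-9a-fA-F])/gu;

  return input.replace(escapePattern, (match, hex, literal) => {
    if (literal !== undefined) {
      return literal; // e.g. "\#" stands for "#"
    }
    const codePoint = parseInt(hex, 16);
    const isSurrogate = codePoint >= 0xD800 && codePoint <= 0xDFFF;
    // Assumption: invalid values become U+FFFD instead of being kept.
    if (codePoint === 0 || isSurrogate || codePoint > 0x10FFFF) {
      return '\uFFFD';
    }
    // fromCodePoint emits a surrogate pair for values above U+FFFF.
    return String.fromCodePoint(codePoint);
  });
}

console.log(decodeCssEscapes('ab\\a9 de\\1d306 fg')); // ab©de𝌆fg
```

The important step is the last one: the parsed value is treated as a whole code point and only then converted to UTF-16, rather than being truncated to a single 16-bit unit.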
Note that this also affects `document.querySelector` and `document.querySelectorAll`. Failing test case:

data:text/html;charset=utf-8,%3C!DOCTYPE%20html%3E%3Ctitle%3EMothereffing%20CSS%20escapes%20example%3C%2Ftitle%3E%3Cstyle%3Epre%7Bbackground%3A%23eee%3Bpadding%3A.5em%7Dp%7Bdisplay%3Anone%7D%23ab%5Ca9%20de%5C1d306%20fg%7Bdisplay%3Ablock%7D%3C%2Fstyle%3E%3Ch1%3E%3Ca%20href%3D%22http%3A%2F%2Fmothereff.in%2Fcss-escapes%23ab%25C2%25A9de%25F0%259D%258C%2586fg%22%3EMothereffing%20CSS%20escapes%3C%2Fa%3E%20example%3C%2Fh1%3E%3Cpre%3E%3Ccode%3Eab%C2%A9de%F0%9D%8C%86fg%3C%2Fcode%3E%3C%2Fpre%3E%3Cp%20id%3D%22ab%C2%A9de%F0%9D%8C%86fg%22%3EIf%20you%20can%20read%20this%2C%20the%20escaped%20CSS%20selector%20worked.%20%3C%2Fp%3E%3Cscript%3Edocument.getElementById('ab%C2%A9de%F0%9D%8C%86fg').innerHTML%20%2B%3D%20'%20%3Ccode%3Edocument.getElementById%3C%2Fcode%3E%20worked.'%3Bdocument.querySelector('%23ab%5C%5Ca9%20de%5C%5C1d306%20fg').innerHTML%2B%3D'%20%3Ccode%3Edocument.querySelector%3C%2Fcode%3E%20worked.'%3C%2Fscript%3E

(In reply to comment #11)
> This is a tokenizer level issue (AP thanks for CC'ing me). Would not be much trouble to fix it in the custom written tokenizer after it is landed, just adding some extra parsing to the escape sequences.

Out of curiosity, when will the custom-written tokenizer land (if it hasn’t already)? Any bug tickets I can subscribe to?
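To make the `querySelector` case above easier to reproduce, here is a minimal sketch of how the escaped selector in that test case can be derived from the raw `id`. The function `escapeNonAscii` is a hypothetical helper written for this example. It assumes the ASCII part of the identifier is already valid in a selector and only escapes characters above U+007F, each as a hex escape followed by a space.

```javascript
// Hypothetical helper: replace every non-ASCII character in an
// identifier with a CSS hex escape followed by a terminating space.
// Assumes the ASCII characters need no escaping.
function escapeNonAscii(identifier) {
  let result = '';
  // for...of walks code points, so U+1D306 arrives as one item
  // rather than as two separate surrogates.
  for (const ch of identifier) {
    const codePoint = ch.codePointAt(0);
    result += codePoint > 0x7F
      ? '\\' + codePoint.toString(16) + ' '
      : ch;
  }
  return result;
}

// The id used in the failing test case: "ab©de𝌆fg".
const selector = '#' + escapeNonAscii('ab\u00A9de\u{1D306}fg');
console.log(selector); // #ab\a9 de\1d306 fg
```

In a browser, the resulting string would be passed as `document.querySelector(selector)`. With this bug present, the lookup for the `\1d306 ` part is expected to fail while `document.getElementById` with the raw `id` still works.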
> Out of curiosity, when will the custom-written tokenizer land (if it hasn’t already)? Any bug tickets I can subscribe to?

https://bugs.webkit.org/show_bug.cgi?id=70107

I just got an r+ to it, but I will land it tomorrow because I want to see the bots.
FWIW, I’ve just deployed some changes to my CSS escaper tool to make it easier to create test cases for this bug. E.g. click the “example” link on http://mothereff.in/css-escapes#1%F0%9D%8C%86.

(In reply to comment #13)
> I just got an r+ to it, but I will land it tomorrow because I want to see the bots.

That’s awesome news!
https://bugs.webkit.org/show_bug.cgi?id=70107 is now RESOLVED FIXED, landed here: http://trac.webkit.org/changeset/106217
Better test case that will show a red/lime background depending on success/failure: data:text/html;charset=utf-8,<!DOCTYPE%20html><title>Mothereffing%20CSS%20escapes%20example<%2Ftitle><style>pre%7Bbackground%3A%23eee%3Bpadding%3A.5em%7D.test%7Bdisplay%3Anone%7D%23b%5Ca9%20de%5C1d306%20fg%7Bdisplay%3Ablock%7D.pass%7Bbackground%3Alime%7D.fail%7Bbackground%3Ared%7D<%2Fstyle><h1><a%20href%3D"http%3A%2F%2Fmothereff.in%2Fcss-escapes%231b%25C2%25A9de%25F0%259D%258C%2586fg">Mothereffing%20CSS%20escapes<%2Fa>%20example<%2Fh1><pre><code>b%C2%A9de%F0%9D%8C%86fg<%2Fcode><%2Fpre><p%20id%3D"b%C2%A9de%F0%9D%8C%86fg"%20class%3Dtest>If%20you%20can%20read%20this%2C%20the%20escaped%20CSS%20selector%20worked.%20<%2Fp><p>Standard%20CSS%20character%20escape%20sequences%20for%20supplementary%20Unicode%20characters%20aren%E2%80%99t%20currently%20supported%20in%20WebKit.%20<strong>This%20test%20case%20will%20fail%20in%20those%20browsers.<%2Fstrong>%20It%E2%80%99s%20better%20to%20leave%20these%20characters%20unescaped.<%2Fp><script>var%20el%3Ddocument.getElementsByTagName('p')%5B0%5D%3Btry%7Bdocument.getElementById('b%5Cxa9de%5Cud834%5Cudf06fg').innerHTML%20%2B%3D%20'%20<code>document.getElementById<%2Fcode>%20worked.'%3Bdocument.querySelector('%23b%5C%5Ca9%20de%5C%5C1d306%20fg').innerHTML%2B%3D'%20<code>document.querySelector<%2Fcode>%20worked.'%3Bel.className%3D'pass'%7Dcatch(e)%7Bel.innerHTML%3D'FAIL'%3Bel.className%3D'fail'%7D<%2Fscript> Short URL: http://mths.be/bel
This seems fixed. Feel free to mark this bug as RESOLVED FIXED.
*** This bug has been marked as a duplicate of bug 76152 ***