Bug 28938 - CSS string value is not correctly serialized when it contains binary characters
Summary: CSS string value is not correctly serialized when it contains binary characters
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: CSS (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
: P2 Normal
Assignee: Yuta Kitamura
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-09-03 01:33 PDT by Yuta Kitamura
Modified: 2010-04-05 17:08 PDT (History)
4 users (show)

See Also:


Attachments
Escape control characters in CSS string value when it is serialized. (11.80 KB, patch)
2009-09-03 03:09 PDT, Yuta Kitamura
no flags Details | Formatted Diff | Diff
Escape control characters in CSS string value when it is serialized. (11.67 KB, patch)
2009-09-04 03:57 PDT, Yuta Kitamura
no flags Details | Formatted Diff | Diff
Escape control characters in CSS string value when it is serialized. (12.02 KB, patch)
2009-09-04 04:38 PDT, Yuta Kitamura
no flags Details | Formatted Diff | Diff
Escape control characters in CSS string value when it is serialized. (12.07 KB, patch)
2009-09-04 04:51 PDT, Yuta Kitamura
no flags Details | Formatted Diff | Diff
Patch v5. (13.32 KB, patch)
2010-04-01 21:48 PDT, Yuta Kitamura
no flags Details | Formatted Diff | Diff
Patch v6 (Applying darin's comment). (16.76 KB, patch)
2010-04-04 19:45 PDT, Yuta Kitamura
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Yuta Kitamura 2009-09-03 01:33:40 PDT
When WebKit serializes a CSS string value, it does not escape binary characters in that value.

For example, suppose that we have an element with the following style:
    font-family: '\0\1\2';
When we obtain this style using JavaScript (like element.style.fontFamily), it becomes an invisible string, instead of the correctly escaped string. WebKit should escape these characters as specified in CSS spec.
Comment 1 Yuta Kitamura 2009-09-03 03:09:36 PDT
Created attachment 38979 [details]
Escape control characters in CSS string value when it is serialized.

When WebKit serializesa CSS string value that contains binary characters
('\0\1\2' for example), it did not escape these characters. As a result,
users got (invisible) control characters through scripts, which was not good.
This change fixes this issue.

As a side effect, two separate codes for escaping CSS strings are merged, and
become a public function (escapeCSSString).
---
 7 files changed, 185 insertions(+), 52 deletions(-)
Comment 2 Eric Seidel (no email) 2009-09-04 00:29:11 PDT
Comment on attachment 38979 [details]
Escape control characters in CSS string value when it is serialized.

This would be much better as a js test/dom only test:
http://trac.webkit.org/wiki/Writing%20Layout%20Tests%20for%20DumpRenderTree

YOu'd get shouldBeEqualToString for fre, instead of having to make your own pass/fail functions.

This also doesn't need to be run from onload, you can run this from an inline script tag (which making this a js test will do automatically).

No argument names needed here since tehy don't add clarity:
6 String quoteCSSString(const String& string);
 217 String quoteCSSStringIfNeeded(const String& string);
 218 String quoteCSSURLIfNeeded(const String& string);
Comment 3 Yuta Kitamura 2009-09-04 00:41:04 PDT
Thank you for your review, Eric. I'm going to try again.
Comment 4 Yuta Kitamura 2009-09-04 03:57:19 PDT
Created attachment 39050 [details]
Escape control characters in CSS string value when it is serialized.

When WebKit serializesa CSS string value that contains binary characters
('\0\1\2' for example), it did not escape these characters. As a result,
users got (invisible) control characters through scripts, which was not good.
This change fixes this issue.

As a side effect, two separate codes for escaping CSS strings are merged, and
become a public function (escapeCSSString).
---
 8 files changed, 144 insertions(+), 53 deletions(-)
Comment 5 Yuta Kitamura 2009-09-04 04:01:24 PDT
Comment on attachment 39050 [details]
Escape control characters in CSS string value when it is serialized.

Oops, I overlooked the first part of Eric's comment. Let me do this again.
Comment 6 Yuta Kitamura 2009-09-04 04:38:57 PDT
Created attachment 39051 [details]
Escape control characters in CSS string value when it is serialized.

When WebKit serializesa CSS string value that contains binary characters
('\0\1\2' for example), it did not escape these characters. As a result,
users got (invisible) control characters through scripts, which was not good.
This change fixes this issue.

As a side effect, two separate codes for escaping CSS strings are merged, and
become a public function (escapeCSSString).
---
 9 files changed, 151 insertions(+), 53 deletions(-)
Comment 7 Yuta Kitamura 2009-09-04 04:51:18 PDT
Created attachment 39052 [details]
Escape control characters in CSS string value when it is serialized.

When WebKit serializesa CSS string value that contains binary characters
('\0\1\2' for example), it did not escape these characters. As a result,
users got (invisible) control characters through scripts, which was not good.
This change fixes this issue.

As a side effect, two separate codes for escaping CSS strings are merged, and
become a public function (escapeCSSString).
---
 9 files changed, 156 insertions(+), 53 deletions(-)
Comment 8 Yuta Kitamura 2009-09-04 04:54:36 PDT
Sorry for uploading patches again and again (I should have been more careful).

I converted the test to a JS test, and fixed some issues as suggested in Eric's comment.
Comment 9 Eric Seidel (no email) 2009-09-24 14:05:47 PDT
Comment on attachment 39052 [details]
Escape control characters in CSS string value when it is serialized.

I think this will be much slower than the old implementation.  We should at least use StringBuilder instead of String::append() because String::append() will make a new buffer every time. :(

In general the change looks OK, assuming it does not cause a PLT regression.

Could we pre-allocate the length of the new string all at once instead of doing so many mallocs?  StringBuilder will be better, but still slower than the old replace code was I think.
Comment 10 Eric Seidel (no email) 2009-09-24 14:06:07 PDT
Vector<UChar> also would have better append behavior than String does.
Comment 11 Simon Fraser (smfr) 2010-03-30 16:09:21 PDT
It would be nice to move forward on this.
Comment 12 Yuta Kitamura 2010-03-31 02:38:39 PDT
(In reply to comment #11)
> It would be nice to move forward on this.

This patch slipped off from my mind long ago... I'll continue.
Comment 13 Yuta Kitamura 2010-04-01 21:48:59 PDT
Created attachment 52372 [details]
Patch v5.
Comment 14 Darin Adler 2010-04-02 10:08:15 PDT
Comment on attachment 52372 [details]
Patch v5.

Great work. It's good to handle these cases correctly.

Looking at the tokenizer it seems that tab characters are legal in a CSS quoted string. But we are choosing to escape them as "\9". I guess that's OK.

> +            String escape = String::format("\\%x", static_cast<int>(ch));
> +            for (unsigned j = 0; j < escape.length(); ++j)
> +                buffer[index++] = escape[j];

This is excessive memory allocation and complexity to convert a two digit number to hex.

Instead something like this:

    static const char hexDigits[17] = "0123456789ABCDEF";
    buffer[index++] = '\\';
    if (ch >= 0x10)
        buffer[index++] = hexDigits[ch >> 4];
    buffer[index++] = hexDigits[ch & 0xF];

> +            // Space characeter may be required to separate backslash-escape sequence and normal characters.

Typo: "characeter".

These functions don't belong in CSSPrimitiveValue.h. I think they should move to CSSParser.h.

review- because I'd like to see the code changed to not use String::format.
Comment 15 Yuta Kitamura 2010-04-04 19:45:52 PDT
Created attachment 52516 [details]
Patch v6 (Applying darin's comment).
Comment 16 Yuta Kitamura 2010-04-04 19:47:44 PDT
(In reply to comment #15)
> Created an attachment (id=52516) [details]
> Patch v6 (Applying darin's comment).

I think I addressed all issues raised so far. Darin, could you please take another look?

Thanks,
Comment 17 WebKit Review Bot 2010-04-04 19:49:34 PDT
Attachment 52516 [details] did not pass style-queue:

Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
WebCore/css/CSSParser.cpp:5437:  A case label should not be indented, but line up with its switch statement.  [whitespace/indent] [4]
Total errors found: 1 in 10 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 18 Yuta Kitamura 2010-04-04 19:58:21 PDT
(In reply to comment #17)
> Attachment 52516 [details] did not pass style-queue:
> 
> Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
> WebCore/css/CSSParser.cpp:5437:  A case label should not be indented, but line
> up with its switch statement.  [whitespace/indent] [4]
> Total errors found: 1 in 10 files
> 
> 
> If any of these errors are false positives, please file a bug against
> check-webkit-style.

Oops, I've forgotten to run check-webkit-style. My bad.

However, I'm wondering if I need to fix this error, because I've found there are a lot of both usages (indented and not-indented case labels).
Comment 19 Darin Adler 2010-04-04 21:26:49 PDT
Comment on attachment 52516 [details]
Patch v6 (Applying darin's comment).

> +    if (p == end || !(p[0] == '_' || p[0] >= 128 || isASCIIAlpha(p[0])))

In a later patch (I don't suggest delaying this one) we should use !isASCII(p[0]) instead of p[0] >= 128 since it’s clearer and probably more efficient as well.

> +        if (!(p[0] == '_' || p[0] == '-' || p[0] >= 128 || isASCIIAlphanumeric(p[0])))

Same here.

> +                if (c < 128)
> +                    return false;

And isASCII(c) here.

> +            static const char hexDigits[17] = "0123456789abcdef";

I personally prefer capital hex.

> +    String quoteCSSString(const String&);

I think we should make this function be private to the .cpp file instead of exporting it. I can't think of any case where we'd want other files to use it.

r=me without any changes -- you could make one or more of the changes above in a follow-up patch
Comment 20 Yuta Kitamura 2010-04-05 00:53:16 PDT
(In reply to comment #19)
> r=me without any changes -- you could make one or more of the changes above in
> a follow-up patch

Thanks, I'd be happy to write a follow-up patch.
Comment 21 WebKit Commit Bot 2010-04-05 17:08:42 PDT
Comment on attachment 52516 [details]
Patch v6 (Applying darin's comment).

Clearing flags on attachment: 52516

Committed r57105: <http://trac.webkit.org/changeset/57105>
Comment 22 WebKit Commit Bot 2010-04-05 17:08:48 PDT
All reviewed patches have been landed.  Closing bug.