Bug 29292 - [HTML5][Forms] Support for <textarea maxlength=N>
: [HTML5][Forms] Support for <textarea maxlength=N>
Status: RESOLVED FIXED
: WebKit
Forms
: 528+ (Nightly build)
: All All
: P2 Normal
Assigned To:
: http://www.whatwg.org/specs/web-apps/...
:
:
: 27454
  Show dependency treegraph
 
Reported: 2009-09-15 21:53 PST by
Modified: 2009-09-30 14:32 PST (History)


Attachments
Proposed patch (14.73 KB, patch)
2009-09-16 20:18 PST, Kent Tamura
darin: review-
Review Patch | Details | Formatted Diff | Diff
Proposed patch (rev.2) (19.64 KB, patch)
2009-09-17 19:30 PST, Kent Tamura
no flags Review Patch | Details | Formatted Diff | Diff
Proposed patch (rev.3) (19.57 KB, patch)
2009-09-18 10:54 PST, Kent Tamura
eric: commit‑queue-
Review Patch | Details | Formatted Diff | Diff
Proposed patch (rev.4) (19.57 KB, patch)
2009-09-18 13:06 PST, Kent Tamura
no flags Review Patch | Details | Formatted Diff | Diff
Proposed patch (rev.5) (19.60 KB, patch)
2009-09-18 17:53 PST, Kent Tamura
no flags Review Patch | Details | Formatted Diff | Diff
Proposed patch (rev.6) (19.59 KB, patch)
2009-09-21 09:23 PST, Kent Tamura
no flags Review Patch | Details | Formatted Diff | Diff


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2009-09-15 21:53:15 PST
As discussed in Bug#27454, the maxlength attribute will limit user-input length, and won't truncate a value set to the .value IDL attribute.
------- Comment #1 From 2009-09-16 20:18:31 PST -------
Created an attachment (id=39679) [details]
Proposed patch
------- Comment #2 From 2009-09-17 12:37:34 PST -------
(From update of attachment 39679 [details])
> +    if (!hasAttribute(maxlengthAttr))
> +        return;
> +    bool ok;
> +    unsigned maxLength = getAttribute(maxlengthAttr).string().toUInt(&ok);
> +    if (!ok)
> +        return;

It's a shame to do two hash table lookups here for the maxlength attribute. And there's no reason for that check. toUInt will give false for "ok" when maxlengthAttr is null, so I believe the hasAttribute check is unnecessary extra work.

> +    int currentLength = InputElement::numGraphemeClusters(toRenderTextControl(renderer())->text().impl());
> +    int selectionLength = InputElement::numGraphemeClusters(plainText(document()->frame()->selection()->selection().toNormalizedRange().get()).impl());

I think these functions should just be public functions, perhaps in PlatformString.h, rather than members of the InputElement class. It also seems inconvenient to have to pass them StringImpl* rather than just a String. As we make these functions more public it makes sense to me to make them more normal functions and less special-purpose.

> +    if (newLength < static_cast<int>(proposedValue.length())) {
> +        String string = proposedValue;
> +        return string.left(newLength);
> +    }
> +    return proposedValue;

This can just be written:

    return proposedValue.left(newLength);

The left function already has optimization for the case where the length is >= the length of the string, and you don't need to redo it.

> +    bool ok;
> +    unsigned maxLength = getAttribute(maxlengthAttr).string().toUInt(&ok);
> +    return ok ? maxLength : 0;

toUInt already guarantees it will return 0 if the string can't be parsed, so you can just write:

    return getAttribute(maxlengthAttr).string().toUInt();

> +    static String sanitizeUserInputValue(const String&, int);

It's not good to leave out the argument name here for the int. It's clear what this int would be. So it needs a name.

I'm going to say review- because I think at least some of my suggestions above really ought to be done.
------- Comment #3 From 2009-09-17 19:30:23 PST -------
Created an attachment (id=39745) [details]
Proposed patch (rev.2)
------- Comment #4 From 2009-09-17 19:32:53 PST -------
(In reply to comment #2)

Thank you for the helpful comments.

> (From update of attachment 39679 [details] [details])
> > +    if (!hasAttribute(maxlengthAttr))
> > +        return;
> > +    bool ok;
> > +    unsigned maxLength = getAttribute(maxlengthAttr).string().toUInt(&ok);
> > +    if (!ok)
> > +        return;
> 
> It's a shame to do two hash table lookups here for the maxlength attribute. And
> there's no reason for that check. toUInt will give false for "ok" when
> maxlengthAttr is null, so I believe the hasAttribute check is unnecessary extra
> work.

Removed hasAttribute().


> > +    int currentLength = InputElement::numGraphemeClusters(toRenderTextControl(renderer())->text().impl());
> > +    int selectionLength = InputElement::numGraphemeClusters(plainText(document()->frame()->selection()->selection().toNormalizedRange().get()).impl());
> 
> I think these functions should just be public functions, perhaps in
> PlatformString.h, rather than members of the InputElement class. It also seems
> inconvenient to have to pass them StringImpl* rather than just a String. As we
> make these functions more public it makes sense to me to make them more normal
> functions and less special-purpose.

Moved the grapheme functions to String class.


> > +    if (newLength < static_cast<int>(proposedValue.length())) {
> > +        String string = proposedValue;
> > +        return string.left(newLength);
> > +    }
> > +    return proposedValue;
> 
> This can just be written:
> 
>     return proposedValue.left(newLength);

Done.


> > +    bool ok;
> > +    unsigned maxLength = getAttribute(maxlengthAttr).string().toUInt(&ok);
> > +    return ok ? maxLength : 0;
> 
> toUInt already guarantees it will return 0 if the string can't be parsed, so
> you can just write:
> 
>     return getAttribute(maxlengthAttr).string().toUInt();

Done.


> > +    static String sanitizeUserInputValue(const String&, int);
> 
> It's not good to leave out the argument name here for the int. It's clear what
> this int would be. So it needs a name.

Added the argument name.
------- Comment #5 From 2009-09-18 09:48:00 PST -------
(From update of attachment 39745 [details])
> +    if (isEmpty())
> +        return 0;

This special case for empty and null strings is not needed.

> +    TextBreakIterator* it = characterBreakIterator(characters(), length());
> +    if (!it)
> +        return 0;

This is not new code, but it's really non-useful to return 0 here. It would be way better to just return length().

> +    if (isEmpty())
> +        return 0;

This special case for empty and null strings is not needed.

> +    TextBreakIterator* it = characterBreakIterator(characters(), length());
> +    if (!it)
> +        return 0;

This is not new code, but it's really non-useful to return 0 here. It would be better to return min(length(), numGraphemeClusters).

r=me as-is, though
------- Comment #6 From 2009-09-18 10:54:14 PST -------
Created an attachment (id=39769) [details]
Proposed patch (rev.3)

Clean up String::numGraphemeClusters() and String::numCharactersInGraphemeClusters() for Comment #5.
------- Comment #7 From 2009-09-18 11:13:35 PST -------
(From update of attachment 39769 [details])
> +    unsigned maxLength = static_cast<unsigned>(data.maxLength());  // maxLength() never be negative.

Coding style says one space before the "//".

The grammar here is wrong. It could be "can never be negative" or "never is negative" instead.

> +    unsigned appendableLength =  maxLength > baseLength ? maxLength - baseLength : 0;

Extra space here after the equal sign.

Another way to write this is:

    max(maxLength, baseLength) - baseLength

Your way is probably clearer, though.

> +    TextBreakIterator* it = characterBreakIterator(characters(), length());
> +    if (!it)
> +        return length();

The value length() is not right here. It should be min(length(), numGraphemeClusters) instead.

I'm going to say review+ because realistically characterBreakIterator will not return 0 so that comment is not important.
------- Comment #8 From 2009-09-18 11:43:36 PST -------
(From update of attachment 39769 [details])
Darin had review comments, so this can't be auto-committed unless you post a new patch.
------- Comment #9 From 2009-09-18 11:53:24 PST -------
(In reply to comment #8)
> Darin had review comments, so this can't be auto-committed unless you post a
> new patch.

My comments are optional, though. We could land it as-is if Kent wants to.
------- Comment #10 From 2009-09-18 13:06:48 PST -------
Created an attachment (id=39781) [details]
Proposed patch (rev.4)

> > +    unsigned maxLength = static_cast<unsigned>(data.maxLength());  // maxLength() never be negative.
> Coding style says one space before the "//".

Fixed.

> The grammar here is wrong. It could be "can never be negative" or "never is
> negative" instead.

Fixed.

> > +    unsigned appendableLength =  maxLength > baseLength ? maxLength - baseLength : 0;
> Extra space here after the equal sign.

Fixed.

> > +    TextBreakIterator* it = characterBreakIterator(characters(), length());
> The value length() is not right here. It should be min(length(),
> numGraphemeClusters) instead.

I don't think so.
Suppose that the string is "\ud800\udc00", of which length() is 2 and has 1 grapheme.  numCharactersInGraphmeClusters(1) with this string should return 2.  If we specified min(length(), nmGraphemeClusters) to characerBreakIterator(), the result would be 1.
------- Comment #11 From 2009-09-18 14:28:16 PST -------
(In reply to comment #10)
> Suppose that the string is "\ud800\udc00", of which length() is 2 and has 1
> grapheme.  numCharactersInGraphmeClusters(1) with this string should return 2. 
> If we specified min(length(), nmGraphemeClusters) to characerBreakIterator(),
> the result would be 1.

But the whole point is that when the iterator can't be created we have to assume that each character is a separate grapheme cluster because we don’t know any better.

Suppose that the string is "ab". numCharactersInGraphemeClusters(1) should return 1, not 2. If we specify length(), the result would be 2.

Obviously we can’t handle things correctly without the character iterator, but it seems clear that treating the entire string as one giant grapheme cluster is not the way to go. And that’s what returning length() does.
------- Comment #12 From 2009-09-18 17:53:38 PST -------
Created an attachment (id=39807) [details]
Proposed patch (rev.5)

> Suppose that the string is "ab". numCharactersInGraphemeClusters(1) should
> return 1, not 2. If we specify length(), the result would be 2.

ok, I changed it to:

unsigned String::numCharactersInGraphemeClusters(unsigned numGraphemeClusters) const
{
    TextBreakIterator* it = characterBreakIterator(characters(), length());
    if (!it)
        return min(length(), numGraphemeClusters);
------- Comment #13 From 2009-09-21 09:23:00 PST -------
Created an attachment (id=39855) [details]
Proposed patch (rev.6)

 resources/ -> script-tests/
------- Comment #14 From 2009-09-21 18:29:54 PST -------
(From update of attachment 39745 [details])
Removing Darin's r+ from this obsolete patch.
------- Comment #15 From 2009-09-21 18:30:04 PST -------
(From update of attachment 39769 [details])
Removing Darin's r+ from this obsolete patch.
------- Comment #16 From 2009-09-23 10:28:47 PST -------
(From update of attachment 39855 [details])
Darin reviewed the previous patch, so it should be easy for him to r+ this one.  I'd like to leave it for him to do that.  I would need to read up to make sure that your numCharactersInGraphemeClusters changes are OK.... especially since they're not explained in the ChangeLog. :(
------- Comment #17 From 2009-09-23 19:55:49 PST -------
(From update of attachment 39855 [details])
Clearing flags on attachment: 39855

Committed r48698: <http://trac.webkit.org/changeset/48698>
------- Comment #18 From 2009-09-27 18:01:12 PST -------
The patch was landed.
------- Comment #19 From 2009-09-30 14:32:08 PST -------
This broke Facebook.  See bug 29922.