Bug 29292 - [HTML5][Forms] Support for <textarea maxlength=N>
: [HTML5][Forms] Support for <textarea maxlength=N>
Status: RESOLVED FIXED
Product: WebKit
Classification: Unclassified
Component: Forms
: 528+ (Nightly build)
: All All
: P2 Normal
Assigned To: Kent Tamura
http://www.whatwg.org/specs/web-apps/...
:
Depends on:
Blocks: 27454
  Show dependency treegraph
 
Reported: 2009-09-15 21:53 PDT by Kent Tamura
Modified: 2009-09-30 14:32 PDT (History)
3 users (show)

See Also:


Attachments
Proposed patch (14.73 KB, patch)
2009-09-16 20:18 PDT, Kent Tamura
darin: review-
Details | Formatted Diff | Diff
Proposed patch (rev.2) (19.64 KB, patch)
2009-09-17 19:30 PDT, Kent Tamura
no flags Details | Formatted Diff | Diff
Proposed patch (rev.3) (19.57 KB, patch)
2009-09-18 10:54 PDT, Kent Tamura
eric: commit‑queue-
Details | Formatted Diff | Diff
Proposed patch (rev.4) (19.57 KB, patch)
2009-09-18 13:06 PDT, Kent Tamura
no flags Details | Formatted Diff | Diff
Proposed patch (rev.5) (19.60 KB, patch)
2009-09-18 17:53 PDT, Kent Tamura
no flags Details | Formatted Diff | Diff
Proposed patch (rev.6) (19.59 KB, patch)
2009-09-21 09:23 PDT, Kent Tamura
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Kent Tamura 2009-09-15 21:53:15 PDT
As discussed in Bug#27454, the maxlength attribute will limit user-input length, and won't truncate a value set to the .value IDL attribute.
Comment 1 Kent Tamura 2009-09-16 20:18:31 PDT
Created attachment 39679 [details]
Proposed patch
Comment 2 Darin Adler 2009-09-17 12:37:34 PDT
Comment on attachment 39679 [details]
Proposed patch

> +    if (!hasAttribute(maxlengthAttr))
> +        return;
> +    bool ok;
> +    unsigned maxLength = getAttribute(maxlengthAttr).string().toUInt(&ok);
> +    if (!ok)
> +        return;

It's a shame to do two hash table lookups here for the maxlength attribute. And there's no reason for that check. toUInt will give false for "ok" when maxlengthAttr is null, so I believe the hasAttribute check is unnecessary extra work.

> +    int currentLength = InputElement::numGraphemeClusters(toRenderTextControl(renderer())->text().impl());
> +    int selectionLength = InputElement::numGraphemeClusters(plainText(document()->frame()->selection()->selection().toNormalizedRange().get()).impl());

I think these functions should just be public functions, perhaps in PlatformString.h, rather than members of the InputElement class. It also seems inconvenient to have to pass them StringImpl* rather than just a String. As we make these functions more public it makes sense to me to make them more normal functions and less special-purpose.

> +    if (newLength < static_cast<int>(proposedValue.length())) {
> +        String string = proposedValue;
> +        return string.left(newLength);
> +    }
> +    return proposedValue;

This can just be written:

    return proposedValue.left(newLength);

The left function already has optimization for the case where the length is >= the length of the string, and you don't need to redo it.

> +    bool ok;
> +    unsigned maxLength = getAttribute(maxlengthAttr).string().toUInt(&ok);
> +    return ok ? maxLength : 0;

toUInt already guarantees it will return 0 if the string can't be parsed, so you can just write:

    return getAttribute(maxlengthAttr).string().toUInt();

> +    static String sanitizeUserInputValue(const String&, int);

It's not good to leave out the argument name here for the int. It's clear what this int would be. So it needs a name.

I'm going to say review- because I think at least some of my suggestions above really ought to be done.
Comment 3 Kent Tamura 2009-09-17 19:30:23 PDT
Created attachment 39745 [details]
Proposed patch (rev.2)
Comment 4 Kent Tamura 2009-09-17 19:32:53 PDT
(In reply to comment #2)

Thank you for the helpful comments.

> (From update of attachment 39679 [details])
> > +    if (!hasAttribute(maxlengthAttr))
> > +        return;
> > +    bool ok;
> > +    unsigned maxLength = getAttribute(maxlengthAttr).string().toUInt(&ok);
> > +    if (!ok)
> > +        return;
> 
> It's a shame to do two hash table lookups here for the maxlength attribute. And
> there's no reason for that check. toUInt will give false for "ok" when
> maxlengthAttr is null, so I believe the hasAttribute check is unnecessary extra
> work.

Removed hasAttribute().


> > +    int currentLength = InputElement::numGraphemeClusters(toRenderTextControl(renderer())->text().impl());
> > +    int selectionLength = InputElement::numGraphemeClusters(plainText(document()->frame()->selection()->selection().toNormalizedRange().get()).impl());
> 
> I think these functions should just be public functions, perhaps in
> PlatformString.h, rather than members of the InputElement class. It also seems
> inconvenient to have to pass them StringImpl* rather than just a String. As we
> make these functions more public it makes sense to me to make them more normal
> functions and less special-purpose.

Moved the grapheme functions to String class.


> > +    if (newLength < static_cast<int>(proposedValue.length())) {
> > +        String string = proposedValue;
> > +        return string.left(newLength);
> > +    }
> > +    return proposedValue;
> 
> This can just be written:
> 
>     return proposedValue.left(newLength);

Done.


> > +    bool ok;
> > +    unsigned maxLength = getAttribute(maxlengthAttr).string().toUInt(&ok);
> > +    return ok ? maxLength : 0;
> 
> toUInt already guarantees it will return 0 if the string can't be parsed, so
> you can just write:
> 
>     return getAttribute(maxlengthAttr).string().toUInt();

Done.


> > +    static String sanitizeUserInputValue(const String&, int);
> 
> It's not good to leave out the argument name here for the int. It's clear what
> this int would be. So it needs a name.

Added the argument name.
Comment 5 Darin Adler 2009-09-18 09:48:00 PDT
Comment on attachment 39745 [details]
Proposed patch (rev.2)

> +    if (isEmpty())
> +        return 0;

This special case for empty and null strings is not needed.

> +    TextBreakIterator* it = characterBreakIterator(characters(), length());
> +    if (!it)
> +        return 0;

This is not new code, but it's really non-useful to return 0 here. It would be way better to just return length().

> +    if (isEmpty())
> +        return 0;

This special case for empty and null strings is not needed.

> +    TextBreakIterator* it = characterBreakIterator(characters(), length());
> +    if (!it)
> +        return 0;

This is not new code, but it's really non-useful to return 0 here. It would be better to return min(length(), numGraphemeClusters).

r=me as-is, though
Comment 6 Kent Tamura 2009-09-18 10:54:14 PDT
Created attachment 39769 [details]
Proposed patch (rev.3)

Clean up String::numGraphemeClusters() and String::numCharactersInGraphemeClusters() for Comment #5.
Comment 7 Darin Adler 2009-09-18 11:13:35 PDT
Comment on attachment 39769 [details]
Proposed patch (rev.3)

> +    unsigned maxLength = static_cast<unsigned>(data.maxLength());  // maxLength() never be negative.

Coding style says one space before the "//".

The grammar here is wrong. It could be "can never be negative" or "never is negative" instead.

> +    unsigned appendableLength =  maxLength > baseLength ? maxLength - baseLength : 0;

Extra space here after the equal sign.

Another way to write this is:

    max(maxLength, baseLength) - baseLength

Your way is probably clearer, though.

> +    TextBreakIterator* it = characterBreakIterator(characters(), length());
> +    if (!it)
> +        return length();

The value length() is not right here. It should be min(length(), numGraphemeClusters) instead.

I'm going to say review+ because realistically characterBreakIterator will not return 0 so that comment is not important.
Comment 8 Eric Seidel 2009-09-18 11:43:36 PDT
Comment on attachment 39769 [details]
Proposed patch (rev.3)

Darin had review comments, so this can't be auto-committed unless you post a new patch.
Comment 9 Darin Adler 2009-09-18 11:53:24 PDT
(In reply to comment #8)
> Darin had review comments, so this can't be auto-committed unless you post a
> new patch.

My comments are optional, though. We could land it as-is if Kent wants to.
Comment 10 Kent Tamura 2009-09-18 13:06:48 PDT
Created attachment 39781 [details]
Proposed patch (rev.4)

> > +    unsigned maxLength = static_cast<unsigned>(data.maxLength());  // maxLength() never be negative.
> Coding style says one space before the "//".

Fixed.

> The grammar here is wrong. It could be "can never be negative" or "never is
> negative" instead.

Fixed.

> > +    unsigned appendableLength =  maxLength > baseLength ? maxLength - baseLength : 0;
> Extra space here after the equal sign.

Fixed.

> > +    TextBreakIterator* it = characterBreakIterator(characters(), length());
> The value length() is not right here. It should be min(length(),
> numGraphemeClusters) instead.

I don't think so.
Suppose that the string is "\ud800\udc00", of which length() is 2 and has 1 grapheme.  numCharactersInGraphmeClusters(1) with this string should return 2.  If we specified min(length(), nmGraphemeClusters) to characerBreakIterator(), the result would be 1.
Comment 11 Darin Adler 2009-09-18 14:28:16 PDT
(In reply to comment #10)
> Suppose that the string is "\ud800\udc00", of which length() is 2 and has 1
> grapheme.  numCharactersInGraphmeClusters(1) with this string should return 2. 
> If we specified min(length(), nmGraphemeClusters) to characerBreakIterator(),
> the result would be 1.

But the whole point is that when the iterator can't be created we have to assume that each character is a separate grapheme cluster because we don’t know any better.

Suppose that the string is "ab". numCharactersInGraphemeClusters(1) should return 1, not 2. If we specify length(), the result would be 2.

Obviously we can’t handle things correctly without the character iterator, but it seems clear that treating the entire string as one giant grapheme cluster is not the way to go. And that’s what returning length() does.
Comment 12 Kent Tamura 2009-09-18 17:53:38 PDT
Created attachment 39807 [details]
Proposed patch (rev.5)

> Suppose that the string is "ab". numCharactersInGraphemeClusters(1) should
> return 1, not 2. If we specify length(), the result would be 2.

ok, I changed it to:

unsigned String::numCharactersInGraphemeClusters(unsigned numGraphemeClusters) const
{
    TextBreakIterator* it = characterBreakIterator(characters(), length());
    if (!it)
        return min(length(), numGraphemeClusters);
Comment 13 Kent Tamura 2009-09-21 09:23:00 PDT
Created attachment 39855 [details]
Proposed patch (rev.6)

 resources/ -> script-tests/
Comment 14 Eric Seidel 2009-09-21 18:29:54 PDT
Comment on attachment 39745 [details]
Proposed patch (rev.2)

Removing Darin's r+ from this obsolete patch.
Comment 15 Eric Seidel 2009-09-21 18:30:04 PDT
Comment on attachment 39769 [details]
Proposed patch (rev.3)

Removing Darin's r+ from this obsolete patch.
Comment 16 Eric Seidel 2009-09-23 10:28:47 PDT
Comment on attachment 39855 [details]
Proposed patch (rev.6)

Darin reviewed the previous patch, so it should be easy for him to r+ this one.  I'd like to leave it for him to do that.  I would need to read up to make sure that your numCharactersInGraphemeClusters changes are OK.... especially since they're not explained in the ChangeLog. :(
Comment 17 WebKit Commit Bot 2009-09-23 19:55:49 PDT
Comment on attachment 39855 [details]
Proposed patch (rev.6)

Clearing flags on attachment: 39855

Committed r48698: <http://trac.webkit.org/changeset/48698>
Comment 18 Kent Tamura 2009-09-27 18:01:12 PDT
The patch was landed.
Comment 19 Mark Rowe (bdash) 2009-09-30 14:32:08 PDT
This broke Facebook.  See bug 29922.