Bug 27457

Summary: Add <input type=email> validation support
Product: WebKit Reporter: Peter Kasting <pkasting>
Component: FormsAssignee: Michelangelo De Simone <michelangelo>
Status: RESOLVED FIXED    
Severity: Enhancement CC: commit-queue, ddkilzer, laszlo.gombos, mike, pkasting, tkent
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
URL: http://www.whatwg.org/specs/web-apps/current-work/#e-mail-state
Bug Depends on:    
Bug Blocks: 19264    
Attachments:
Description Flags
Patch v1
none
Patch v1a
none
Patch v1, Changelogs were missing
eric: review-
Patch rev.2
none
Patch rev.2a
none
Patch rev. 2b none

Description Peter Kasting 2009-07-20 12:29:07 PDT
See spec section 4.10.4.1.5.

Basically, the only difference between an e-mail input and a normal text input is that at validation time the e-mail input can return true from validityState's typeMismatch() method if it doesn't contain a valid email address or list of addresses.  The spec covers in detail what those consist of.
Comment 1 Peter Kasting 2009-07-21 10:24:47 PDT
Note: The underlying "URL" input type and associated parser changes were made on bug 25554, so this is only about validation.
Comment 2 Michelangelo De Simone 2009-08-21 06:53:51 PDT
AFAIK the fastest and most synthetic way to check for email validity is to make use of regexp.
My intention is to divide this task into two separate patches:

Part 1: regular validation check on type=email input value
Part 2: same check when the "multiple" attribute is specified.
Comment 3 Michelangelo De Simone 2009-08-24 10:23:33 PDT
Created attachment 38484 [details]
Patch v1
Comment 4 Michelangelo De Simone 2009-08-24 10:25:41 PDT
Created attachment 38485 [details]
Patch v1a
Comment 5 Michelangelo De Simone 2009-08-24 10:32:10 PDT
Created attachment 38487 [details]
Patch v1, Changelogs were missing
Comment 6 Eric Seidel (no email) 2009-08-24 13:36:51 PDT
Comment on attachment 38487 [details]
Patch v1, Changelogs were missing

I think the test results could be made much clearer by using a little pass/fail wrapper function:

function testEmail(value, expectedValid)
{
    i.value = "someone@somewhere.com";
    shouldBe("v.typeMismatch", "false");
    var didPass = v.typeMismatch == expectedValid;
    var validText = (expectedValue ? !didPass : didPass) ? "an invalid" : "a valid";
    var resultText = "" + value + " is " + validText + " email address"
    if (didPass)
        testPassed(resultText);
    else
        testFailed(resultText);
}
Comment 7 Eric Seidel (no email) 2009-08-25 10:43:17 PDT
Comment on attachment 38487 [details]
Patch v1, Changelogs were missing

Maybe we could also break:
 31 #define RFC5322_DOTATOMTEXT "[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"

up into parts.  It's kinda confusing to read as-is.

r- for the tests being confusing.
Comment 8 Michelangelo De Simone 2009-08-26 12:32:00 PDT
Created attachment 38631 [details]
Patch rev.2

Both Eric's comments have been addressed.
Comment 9 Peter Kasting 2009-08-28 15:23:44 PDT
Comment on attachment 38631 [details]
Patch rev.2

> +bool ValidityState::isValidEmailAddress(String& email)
> +{
> +    if (email.isEmpty())
> +        return false;
> +
> +    DEFINE_STATIC_LOCAL(String, dotAtomText, (RFC5322_ATEXT));
> +    dotAtomText.append("+(?:\\.");
> +    dotAtomText.append(RFC5322_ATEXT);
> +    dotAtomText.append("+)*");

Wait a minute, won't this append these on every call, making the regex longer and longer?
What about just

#define RFC5322_ATEXT "[a-z0-9!#$%&'*+/=?^_`{|}~-]"
#define RFC5322_DOTATOMTEXT RFC5322_ATEXT "+(?:\\." RFC5322_ATEXT "+)*"
...
DEFINE_STATIC_LOCAL(AtomicString, emailPattern, RFC5322_DOTATOMTEXT "@" RFC5322_DOTATOMTEXT);
Comment 10 Michelangelo De Simone 2009-08-28 16:03:11 PDT
Created attachment 38759 [details]
Patch rev.2a

Corrected, thanks to Peter.
Comment 11 Peter Kasting 2009-08-31 00:02:25 PDT
Note that Ian has just changed the spec as to what the acceptance criteria are.
Comment 12 Michelangelo De Simone 2009-08-31 10:52:29 PDT
Comment on attachment 38759 [details]
Patch rev.2a

Canceling r? due to specs change (4.10.4.1.5); a valid email address is in format 1*( atext / "." ) "@" ldh-str 1*( "." ldh-str )
Comment 13 Michelangelo De Simone 2009-09-02 16:12:28 PDT
(In reply to comment #12)

> Canceling r? due to specs change (4.10.4.1.5); a valid email address is in
> format 1*( atext / "." ) "@" ldh-str 1*( "." ldh-str )

I'm gonna publish a patch that makes use of the following regexp to match Ian's modifications to specs:

[a-z0-9!#$%&'*+/=?^_`{|}~-](?.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)* @ [a-z0-9-]+(?.[a-z0-9-])*

It's obviously quite "verbose" but, for the sake of clearness...
Comment 14 Kent Tamura 2009-09-10 18:34:46 PDT
(In reply to comment #13)
> (In reply to comment #12)
> 
> > Canceling r? due to specs change (4.10.4.1.5); a valid email address is in
> > format 1*( atext / "." ) "@" ldh-str 1*( "." ldh-str )
> 
> I'm gonna publish a patch that makes use of the following regexp to match Ian's
> modifications to specs:
> 
> [a-z0-9!#$%&'*+/=?^_`{|}~-](?.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)* @
> [a-z0-9-]+(?.[a-z0-9-])*
> 
> It's obviously quite "verbose" but, for the sake of clearness...

It should be 
  [a-z0-9!#$%&'*+/=?^_`{|}~-.]+@[a-z0-9-]+(?\.[a-z0-9-])+
Comment 15 Michelangelo De Simone 2009-09-14 07:09:12 PDT
(In reply to comment #14)

> It should be 
>   [a-z0-9!#$%&'*+/=?^_`{|}~-.]+@[a-z0-9-]+(?\.[a-z0-9-])+

This seems slightly incorrect.:)

1*( atext / "." ) = [a-z0-9!#$%&'*+/=?^_`{|}~.-]+
ldh-str 1*( "." ldh-str ) = [a-z0-9-]+(.[a-z0-9-]+)*

[a-z0-9!#$%&'*+/=?^_`{|}~.-]+@[a-z0-9-]+(.[a-z0-9-]+)*
Comment 16 Kent Tamura 2009-09-14 08:35:33 PDT
(In reply to comment #15)
> 1*( atext / "." ) = [a-z0-9!#$%&'*+/=?^_`{|}~.-]+
> ldh-str 1*( "." ldh-str ) = [a-z0-9-]+(.[a-z0-9-]+)*
> 
> [a-z0-9!#$%&'*+/=?^_`{|}~.-]+@[a-z0-9-]+(.[a-z0-9-]+)*

The last '*' should be '+'.  We need at least 1 dot in a domain part.
Comment 17 Michelangelo De Simone 2009-09-15 08:37:28 PDT
Created attachment 39602 [details]
Patch rev. 2b
Comment 18 Adam Barth 2009-10-13 13:34:25 PDT
Comment on attachment 39602 [details]
Patch rev. 2b

This looks good.  I'm not 100% sure the regexp is right, but we can tweak that later if we're not matching HTML5.
Comment 19 WebKit Commit Bot 2009-10-13 13:53:20 PDT
Comment on attachment 39602 [details]
Patch rev. 2b

Clearing flags on attachment: 39602

Committed r49508: <http://trac.webkit.org/changeset/49508>
Comment 20 WebKit Commit Bot 2009-10-13 13:53:26 PDT
All reviewed patches have been landed.  Closing bug.