Bug 162593 - Implement URLParser::syntaxViolation
Summary: Implement URLParser::syntaxViolation
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Alex Christensen
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-09-26 22:22 PDT by Alex Christensen
Modified: 2016-10-03 16:01 PDT (History)
3 users (show)

See Also:


Attachments
Patch (69.44 KB, patch)
2016-09-26 22:32 PDT, Alex Christensen
no flags Details | Formatted Diff | Diff
Patch (71.80 KB, patch)
2016-09-26 23:52 PDT, Alex Christensen
no flags Details | Formatted Diff | Diff
Patch (73.43 KB, patch)
2016-09-27 09:23 PDT, Alex Christensen
ggaren: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alex Christensen 2016-09-26 22:22:08 PDT
Implement URLParser::syntaxViolation
Comment 1 Alex Christensen 2016-09-26 22:32:42 PDT
Created attachment 289917 [details]
Patch
Comment 2 Alex Christensen 2016-09-26 23:52:40 PDT
Created attachment 289920 [details]
Patch
Comment 3 Alex Christensen 2016-09-27 09:23:32 PDT
Created attachment 289952 [details]
Patch
Comment 4 Geoffrey Garen 2016-09-27 12:54:28 PDT
Comment on attachment 289952 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=289952&action=review

r=me

> Source/WebCore/ChangeLog:15
> +        first time, we assume that everything in the input String up to that point is equal to what it would have been

assume => know

> Source/WebCore/ChangeLog:23
> +        This is about a 15% speed improvement on my URL parsing benchmark.

Are we fast enough now?

> Source/WebCore/platform/URLParser.cpp:413
> +void URLParser::incrementIteratorSkippingTabsAndNewlines(CodePointIterator<CharacterType>& iterator, const CodePointIterator<CharacterType>& iteratorForSyntaxViolationPosition)

I would call this "advanceSkippingTabsAndNewlines" or just "advance". The iterator thing is implied by argument type.

> Source/WebCore/platform/URLParser.cpp:950
> +    if (m_seenSyntaxViolation)

Let's call this "m_didSeeSyntaxViolation" because it's a boolean.

> Source/WebCore/platform/URLParser.cpp:985
> +    for (size_t i = 0; i < unicodeCodeUnitsToCopy; ++i)

I would write this as "(i = asciiCodeUnitsToCopy; i < asciiCodeUnitsToCopy + unicodeCodeUnitsToCopy; ++i)". I think that's a little clearer because then i always points to the current cursor.

> Source/WebCore/platform/URLParser.cpp:1345
> +                if (c.atEnd())
> +                    break;
>              }
> -            break;
> +            // Skip the check for tabs which might cause a syntaxViolation.
> +            // We want to handle the syntaxViolations while actually parsing the authority or host.
> +            goto CaseAuthorityOrHost;

Let's make this "do { ... } while (!c.atEnd());".

> Source/WebCore/platform/URLParser.cpp:1603
> +                m_seenUnicodeFragmentCodePoint = true;

m_didSeeUnicodeFragmentCodePoint.

> Source/WebCore/platform/URLParser.cpp:1621
> +            if (c.atEnd())
> +                break;
> +
> +            // Skip the check for tabs which might cause a syntaxViolation.
> +            // We need to handle them differently with fragmentSyntaxViolation.
> +            goto CaseFragment;

Let's make this "do { ... } while (!c.atEnd());".
Comment 5 Alex Christensen 2016-09-27 13:10:01 PDT
http://trac.webkit.org/changeset/206457
Comment 6 Alex Christensen 2016-09-27 13:12:02 PDT
(In reply to comment #4)
> > Source/WebCore/ChangeLog:23
> > +        This is about a 15% speed improvement on my URL parsing benchmark.
> 
> Are we fast enough now?
Almost
Comment 7 Alex Christensen 2016-10-03 16:01:34 PDT
For future reference, this was prototyped in https://bugs.webkit.org/show_bug.cgi?id=162464