Bug 41797 - REGRESSION (HTML5 parser?): Impossible to get past the CAPTCHA on postcode.royalmail.com
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM
Version: 528+ (Nightly build)
Hardware: All All
Importance: P1 Normal
Assignee: Adam Barth
URL: http://postcode.royalmail.com/
Keywords: InRadar, Regression
Depends on:
Blocks: 41115
Reported: 2010-07-07 14:03 PDT by David Evans
Modified: 2010-07-12 19:05 PDT

Attachments
Patch (7.78 KB, patch)
2010-07-12 16:19 PDT, Adam Barth
Patch (3.88 KB, patch)
2010-07-12 16:29 PDT, Adam Barth

Description David Evans 2010-07-07 14:03:34 PDT
At the Royal Mail postcode finder site it is impossible to get past the CAPTCHA no matter how many times you try.

However, try it with Safari without a nightly WebKit build, or with Firefox 3.6.4, and you will get past it in one or two goes.

Safari Version 5.0 (6533.16, r62632)

Try finding the postcode for:

Building number: 5
Street:   Rochford Avenue
Town:  Shenfield

or any other reasonable UK address.
Comment 1 Alexey Proskuryakov 2010-07-08 12:00:43 PDT
How does one get to see a CAPTCHA on this site? I don't see it anywhere, all I could find mentioned was a limit of 15 searches per day.
Comment 2 David Evans 2010-07-09 03:03:40 PDT
I think you have to do a few searches before you get to see the Captcha.
Comment 3 Alexey Proskuryakov 2010-07-09 11:38:14 PDT
> I think you have to do a few searches before you get to see the Captcha.

I'm seeing the CAPTCHA today. The behavior is a little unstable, so it's hard to bisect to find out when this started, but I suspect the HTML5 parser.

The problem here is that we request the captcha image twice, so server side state gets out of sync with what is actually displayed.

The first request is made by the normal parser, and the second one is made by the preload scanner. This may or may not be a bug in the loader, but the preload scanner getting behind normal parsing makes no sense at all.

This probably affects more than just one site.
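
A minimal sketch of the failure mode described above, not taken from the site or from WebKit. It assumes a hypothetical server that generates a fresh challenge on every image request and remembers only the latest one per session, and it assumes the browser displays the response to the first request. The names CaptchaServer, request_image, verify and simulate are invented for illustration.

```python
import itertools


class CaptchaServer:
    """Hypothetical server: each image request creates a new challenge."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._expected = {}  # session id -> answer for the latest image served

    def request_image(self, session_id):
        # Every request overwrites the expected answer for this session.
        answer = f"captcha-{next(self._counter)}"
        self._expected[session_id] = answer
        return answer  # stands in for the rendered image

    def verify(self, session_id, typed):
        return self._expected.get(session_id) == typed


def simulate(double_request):
    """Return True if the user's answer is accepted."""
    server = CaptchaServer()
    # Request from the regular parser; assumed to be the image on screen.
    displayed = server.request_image("session-1")
    if double_request:
        # Duplicate request for the same image; its response is never shown,
        # but it still replaces the answer the server expects.
        server.request_image("session-1")
    # The user types what they see.
    return server.verify("session-1", displayed)
```

Under these assumptions, simulate(False) is accepted and simulate(True) is rejected every time, which would match a CAPTCHA that cannot be passed no matter how often it is retried.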
Comment 4 Alexey Proskuryakov 2010-07-09 11:38:34 PDT
<rdar://problem/8175896>
Comment 5 Adam Barth 2010-07-09 11:42:47 PDT
The second request comes from the preload scanner...  Interesting.  That could be because the preload scanner request has lower priority than the regular request.  If they get kicked off close to each other, the regular one might win.

We might need to be more aggressive about canceling preload scanning.
Comment 6 Alexey Proskuryakov 2010-07-09 13:04:58 PDT
> That could be because the preload scanner request has lower priority than the regular request.

I was talking about Loader::Host::addRequest(), sorry for making it too terse. The preload scanner actually sees the tag after the real parser does.
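
To illustrate the ordering problem, here is a rough sketch of a preload scanner that lags behind the regular parser, with and without a guard that skips tokens the parser has already passed. This is a toy model only: it is not WebKit code and not necessarily what the eventual patch does, and scan_for_preloads, total_requests and skip_stale are hypothetical names.

```python
def scan_for_preloads(image_urls, parser_position, skip_stale):
    """Return the URLs a hypothetical preload scanner would request.

    image_urls: image URLs in document order.
    parser_position: how many leading entries the regular parser has handled.
    skip_stale: if True, ignore entries the regular parser has already passed.
    """
    preloads = []
    for index, url in enumerate(image_urls):
        if skip_stale and index < parser_position:
            continue  # the regular parser already requested this one
        preloads.append(url)
    return preloads


def total_requests(image_urls, parser_position, skip_stale):
    """All requests issued: the regular parser's first, then the scanner's."""
    parser_requests = image_urls[:parser_position]
    preload_requests = scan_for_preloads(image_urls, parser_position, skip_stale)
    return parser_requests + preload_requests
```

In this model, a scanner that starts behind the parser requests the CAPTCHA image a second time unless stale tokens are skipped.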
Comment 7 Adam Barth 2010-07-11 13:47:12 PDT
I'll look at this soon.
Comment 8 Adam Barth 2010-07-12 16:19:57 PDT
Created attachment 61282 [details]
Patch
Comment 9 Eric Seidel (no email) 2010-07-12 16:22:13 PDT
Comment on attachment 61282 [details]
Patch

It would be cleaner if you could get rid of the AAAAs by generating them with an expression in PHP and sticking them in a <div> or something that you can set to display: none.
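
The suggestion above is about PHP; the same idea can be sketched in Python as follows. The helper name hidden_padding and the length passed to it are hypothetical, since the actual amount of padding in the patch is not stated here.

```python
def hidden_padding(length, char="A"):
    """Build filler text inside a hidden <div> instead of pasting it inline."""
    # The repeated character is generated by an expression, so the test
    # source stays short, and the div keeps the filler out of the rendering.
    return '<div style="display: none">' + char * length + "</div>"
```

For example, hidden_padding(4) yields a hidden div containing AAAA; a real test would pass whatever length is needed.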
Comment 10 Adam Barth 2010-07-12 16:29:38 PDT
Created attachment 61284 [details]
Patch
Comment 11 Eric Seidel (no email) 2010-07-12 16:47:03 PDT
Comment on attachment 61284 [details]
Patch

Thanks!
Comment 12 WebKit Commit Bot 2010-07-12 18:28:25 PDT
Comment on attachment 61284 [details]
Patch

Clearing flags on attachment: 61284

Committed r63154: <http://trac.webkit.org/changeset/63154>
Comment 13 WebKit Commit Bot 2010-07-12 18:28:30 PDT
All reviewed patches have been landed.  Closing bug.
Comment 14 WebKit Review Bot 2010-07-12 18:52:46 PDT
http://trac.webkit.org/changeset/63154 might have broken GTK Linux 32-bit Release
Comment 15 Eric Seidel (no email) 2010-07-12 19:05:58 PDT
Committed r63158: <http://trac.webkit.org/changeset/63158>