Bug 174633

Summary: Web Automation: implicit navigations don't cause browsing context switch
Product: WebKit Reporter: Carlos Garcia Campos <cgarcia>
Component: WebKit2Assignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: bburg, buildbot, joepeck, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 174670    
Bug Blocks:    
Attachments:
Description Flags
Patch
none
Patch bburg: review+

Description Carlos Garcia Campos 2017-07-18 09:35:50 PDT
I've noticed this debugging testShouldFocusOnTheReplacementWhenAFrameFollowsALinkToA_TopTargetedPage. The situation is something like this:

1. a frame set is loaded
2. switch frame to any frame in the set
3. get a link from the subframe and click it
4. a new page is loaded in the main frame
5. find an element that is in the new page

from 3 to 4 there's actually a browsing context switch, but web driver doesn't know about it. The subrame from which the navigation happened is still the current browsing context, and it's still referenced in WebAutomationSession (frames are never removed from the maps) and in WebAutomationSessionProxy (the script object is still protected, and pending callbacks if any are nos dispatched, it seems we are leaking the frame in the web process). So, step 5 is still trying to find the element in the previous frame, and since the frame is still alive, the evaluateJavaScriptFunction is called, but the element is not found and returns a no such element error. This would be even more problematic if the previous page had an element matching the find command. I'm not sure how to solve this, if the frame was not leaked in the wen process, at least we could get a no such frame error, and handle it as special case, switching to top level and trying again. Something like that is done in chromium:

if (attempt == 2) {
   // Switch to main frame and retry command if subframe no longer exists.                                                                                                                 
   session->SwitchToTopFrame();
}

I don't like this approach that much, but if there's no other way, at least we should do it only when error returned is no such frame, (and there's an active browsing context, of course).

Any other ideas? Did you solve that in safaridriver somehow?
Comment 1 BJ Burg 2017-07-18 15:08:55 PDT
(In reply to Carlos Garcia Campos from comment #0)
> I've noticed this debugging
> testShouldFocusOnTheReplacementWhenAFrameFollowsALinkToA_TopTargetedPage.
> The situation is something like this:
> 
> 1. a frame set is loaded
> 2. switch frame to any frame in the set
> 3. get a link from the subframe and click it
> 4. a new page is loaded in the main frame
> 5. find an element that is in the new page
> 
> from 3 to 4 there's actually a browsing context switch, but web driver
> doesn't know about it. The subrame from which the navigation happened is
> still the current browsing context, and it's still referenced in
> WebAutomationSession (frames are never removed from the maps) and in
> WebAutomationSessionProxy (the script object is still protected, and pending
> callbacks if any are nos dispatched, it seems we are leaking the frame in
> the web process). So, step 5 is still trying to find the element in the
> previous frame, and since the frame is still alive, the
> evaluateJavaScriptFunction is called, but the element is not found and
> returns a no such element error. This would be even more problematic if the
> previous page had an element matching the find command.

Great investigation!

> I'm not sure how to
> solve this, if the frame was not leaked in the web process, at least we
> could get a no such frame error, and handle it as special case, switching to
> top level and trying again. Something like that is done in chromium:
> 
> if (attempt == 2) {
>    // Switch to main frame and retry command if subframe no longer exists.  
> 
>    session->SwitchToTopFrame();
> }
> 
> I don't like this approach that much, but if there's no other way, at least
> we should do it only when error returned is no such frame, (and there's an
> active browsing context, of course).
> 
> Any other ideas? Did you solve that in safaridriver somehow?

No we haven't solved this. This is probably the cause of many nonspecific bugs we get with unexpected "element not found" errors.

What about changing the context in WebAutomationSession::navigationOccurredForPage() if needed, and sending a notification out to the driver that the current top level context changed?
Comment 2 Radar WebKit Bug Importer 2017-07-18 15:31:36 PDT
<rdar://problem/33387797>
Comment 3 Carlos Garcia Campos 2017-07-20 00:09:36 PDT
We are not actually leaking frames, it's that they are kept alive in the page cache.
Comment 4 Carlos Garcia Campos 2017-07-20 00:50:58 PDT
(In reply to Brian Burg from comment #1)
> (In reply to Carlos Garcia Campos from comment #0)
> > I've noticed this debugging
> > testShouldFocusOnTheReplacementWhenAFrameFollowsALinkToA_TopTargetedPage.
> > The situation is something like this:
> > 
> > 1. a frame set is loaded
> > 2. switch frame to any frame in the set
> > 3. get a link from the subframe and click it
> > 4. a new page is loaded in the main frame
> > 5. find an element that is in the new page
> > 
> > from 3 to 4 there's actually a browsing context switch, but web driver
> > doesn't know about it. The subrame from which the navigation happened is
> > still the current browsing context, and it's still referenced in
> > WebAutomationSession (frames are never removed from the maps) and in
> > WebAutomationSessionProxy (the script object is still protected, and pending
> > callbacks if any are nos dispatched, it seems we are leaking the frame in
> > the web process). So, step 5 is still trying to find the element in the
> > previous frame, and since the frame is still alive, the
> > evaluateJavaScriptFunction is called, but the element is not found and
> > returns a no such element error. This would be even more problematic if the
> > previous page had an element matching the find command.
> 
> Great investigation!
> 
> > I'm not sure how to
> > solve this, if the frame was not leaked in the web process, at least we
> > could get a no such frame error, and handle it as special case, switching to
> > top level and trying again. Something like that is done in chromium:
> > 
> > if (attempt == 2) {
> >    // Switch to main frame and retry command if subframe no longer exists.  
> > 
> >    session->SwitchToTopFrame();
> > }
> > 
> > I don't like this approach that much, but if there's no other way, at least
> > we should do it only when error returned is no such frame, (and there's an
> > active browsing context, of course).
> > 
> > Any other ideas? Did you solve that in safaridriver somehow?
> 
> No we haven't solved this. This is probably the cause of many nonspecific
> bugs we get with unexpected "element not found" errors.
> 
> What about changing the context in
> WebAutomationSession::navigationOccurredForPage() if needed, and sending a
> notification out to the driver that the current top level context changed?

I think we can simply clear the m_handleWebFrameMap there, because no frames can be resolved before the page loads, any frame cached there at that point is for previous page. Then, any operation on a stale frame will end up on a non such frame error.
Comment 5 Carlos Garcia Campos 2017-07-20 04:40:46 PDT
Created attachment 315979 [details]
Patch
Comment 6 Build Bot 2017-07-20 04:42:07 PDT
Attachment 315979 [details] did not pass style-queue:


ERROR: Source/WebDriver/Session.h:69:  Extra space before ( in function call  [whitespace/parens] [4]
ERROR: Source/WebDriver/Session.cpp:930:  Extra space before ( in function call  [whitespace/parens] [4]
Total errors found: 2 in 9 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 7 Carlos Garcia Campos 2017-07-20 04:44:06 PDT
Created attachment 315980 [details]
Patch

Oops, submitted the wrong patch
Comment 8 BJ Burg 2017-07-20 11:15:25 PDT
Comment on attachment 315980 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=315980&action=review

r+ with change

> Source/WebKit/UIProcess/Automation/WebAutomationSession.cpp:468
> +        m_handleWebFrameMap.clear();

You need to clear m_webFrameHandleMap too, as they are inversed lookup tables.
Comment 9 Carlos Garcia Campos 2017-07-21 03:22:37 PDT
Committed r219723: <http://trac.webkit.org/changeset/219723>