Bug 84953

Summary: [Qt][WK2] QtWebKit2 doesn't have an API for getting a page source
Product: WebKit Reporter: Jesus Sanchez-Palencia <jesus>
Component: WebKit QtAssignee: Jesus Sanchez-Palencia <jesus>
Status: RESOLVED INVALID    
Severity: Normal CC: cmarcelo, hausmann, hugo.lima, joel.parks, kenneth, menard, webkit.review.bot, zalan, zoltan
Priority: P2 Keywords: Qt, QtTriaged
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch none

Jesus Sanchez-Palencia
Reported 2012-04-26 06:54:11 PDT
Currently there is no way to implement a "View source" feature using our API if one is developing a browser. I would like to propose an experimental synchronous API for it.
Attachments
Patch (6.64 KB, patch)
2012-04-26 07:06 PDT, Jesus Sanchez-Palencia
no flags
Patch (6.81 KB, patch)
2012-04-27 08:21 PDT, Jesus Sanchez-Palencia
no flags
Patch (6.53 KB, patch)
2012-04-27 12:26 PDT, Jesus Sanchez-Palencia
no flags
Patch (6.59 KB, patch)
2012-05-02 06:55 PDT, Jesus Sanchez-Palencia
no flags
Patch (6.62 KB, patch)
2012-05-02 07:22 PDT, Jesus Sanchez-Palencia
no flags
Patch (7.89 KB, patch)
2012-05-03 14:04 PDT, Jesus Sanchez-Palencia
no flags
Patch (8.32 KB, patch)
2012-05-04 05:15 PDT, Jesus Sanchez-Palencia
no flags
Jesus Sanchez-Palencia
Comment 1 2012-04-26 07:06:40 PDT
Kenneth Rohde Christiansen
Comment 2 2012-04-26 07:09:42 PDT
Comment on attachment 138996 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=138996&action=review > Source/WebKit2/UIProcess/API/qt/qquickwebpage_p.h:39 > + Q_PROPERTY(QString source READ source FINAL) sourceText ?
Alexis Menard (darktears)
Comment 3 2012-04-26 07:12:26 PDT
Comment on attachment 138996 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=138996&action=review > Source/WebKit2/UIProcess/qt/WebPageProxyQt.cpp:120 > + process()->sendSync(Messages::WebPage::GetSourceForFrameSync(frame->frameID()), Messages::WebPage::GetSourceForFrameSync::Reply(source), m_pageID); I thought we were trying to avoid sync calls from public API?
Hugo Parente Lima
Comment 4 2012-04-26 10:17:20 PDT
>> Source/WebKit2/UIProcess/API/qt/qquickwebpage_p.h:39 >> + Q_PROPERTY(QString source READ source FINAL) > > sourceText ? I would prefer "sourceCode", the webpage "source" probably was the internet ;-P
Jesus Sanchez-Palencia
Comment 5 2012-04-27 08:21:36 PDT
zalan
Comment 6 2012-04-27 09:12:03 PDT
Comment on attachment 139208 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=139208&action=review It would be somewhat cleaner, if WebPage could deal with the view-source handling on the webprocess side. It would save the extra API + m_isInViewSourceMode etc, although I am sure there's tons of things that would break because of the double scheme. (speaking of which, wondering how QUrl deals with view-source:https://, but probably it's fine) Sigh. > Source/WebKit2/UIProcess/API/qt/qquickwebview.cpp:1246 > + bool inViewSourceMode = false; I'd simply do bool inViewSourceMode = url.scheme() == QString::fromAscii("view-source"); > Source/WebKit2/UIProcess/API/qt/qquickwebview.cpp:1256 > + d->m_isInViewSourceMode = inViewSourceMode; This will fail when webprocess is crashed in viewsource mode and view-source::url is reloaded. > Source/WebKit2/UIProcess/WebPageProxy.h:649 > + void setViewInSourceMode(bool); setViewSourceMode(bool on)? Though I am not good at naming. > Source/WebKit2/WebProcess/WebPage/WebPage.cpp:3166 > + if (m_mainFrame && m_mainFrame->coreFrame()) No need to check against m_mainframe.
Jesus Sanchez-Palencia
Comment 7 2012-04-27 12:03:09 PDT
(In reply to comment #6) > It would be somewhat cleaner, if WebPage could deal with the view-source handling on the webprocess side. It would save the extra API + m_isInViewSourceMode etc, although I am sure there's tons of things that would break because of the double scheme. (speaking of which, wondering how QUrl deals with view-source:https://, but probably it's fine) > Sigh. Yep, exactly. I thought about adding a bool parameter to loadURL instead, but since Messages don't handle default values for its parameters I would need to add a "false" parameter to the other ports where they send the loadURL message. And QUrl is handling it pretty fine, by the way. :) > > > Source/WebKit2/UIProcess/API/qt/qquickwebview.cpp:1246 > > + bool inViewSourceMode = false; > > I'd simply do bool inViewSourceMode = url.scheme() == QString::fromAscii("view-source"); D'oh, yep. Done. :) > > > Source/WebKit2/UIProcess/API/qt/qquickwebview.cpp:1256 > > + d->m_isInViewSourceMode = inViewSourceMode; > > This will fail when webprocess is crashed in viewsource mode and view-source::url is reloaded. Good point. I did this to avoid extra IPC calls but now that you spotted this it might be a bad idea. Thanks for the review, I will upload a patch soon.
Jesus Sanchez-Palencia
Comment 8 2012-04-27 12:26:31 PDT
Kenneth Rohde Christiansen
Comment 9 2012-04-29 04:04:16 PDT
Comment on attachment 139246 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=139246&action=review > Source/WebKit2/UIProcess/API/qt/qquickwebview.cpp:166 > + , m_isInViewSourceMode(false) isInSourceViewingMode ? :) > Source/WebKit2/UIProcess/API/qt/qquickwebview.cpp:1246 > + // Make sure we have a well formed URL after we remove the scheme Dot at end > Source/WebKit2/UIProcess/API/qt/qquickwebview.cpp:1247 > + QUrl urlToLoad = d->m_isInViewSourceMode ? QUrl::fromUserInput(url.toString(QUrl::RemoveScheme)) : url; Doesnt fromUserInput do more than you need?
Jesus Sanchez-Palencia
Comment 10 2012-05-02 06:41:36 PDT
(In reply to comment #9) > (From update of attachment 139246 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=139246&action=review > > > Source/WebKit2/UIProcess/API/qt/qquickwebview.cpp:166 > > + , m_isInViewSourceMode(false) > > isInSourceViewingMode ? :) It follows what we currently have in WebCore/page/Frame.h . :) > > > Source/WebKit2/UIProcess/API/qt/qquickwebview.cpp:1246 > > + // Make sure we have a well formed URL after we remove the scheme > > Dot at end Done. Patch coming. > > > Source/WebKit2/UIProcess/API/qt/qquickwebview.cpp:1247 > > + QUrl urlToLoad = d->m_isInViewSourceMode ? QUrl::fromUserInput(url.toString(QUrl::RemoveScheme)) : url; > > Doesnt fromUserInput do more than you need? I don't think so. If the user type in "view-source:http:/www.google.com" we really want to remove the first scheme ("view-source") and then treat everything else as a URL fromUserInput.
Jesus Sanchez-Palencia
Comment 11 2012-05-02 06:55:04 PDT
Alexis Menard (darktears)
Comment 12 2012-05-02 07:02:40 PDT
Comment on attachment 139805 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=139805&action=review > Source/WebKit2/UIProcess/API/qt/qquickwebview_p_p.h:197 > + bool m_isInViewSourceMode; Nitpick : Can you move the bool with the other ones so all the bools are together (maybe one day we can pack all of them).
Jesus Sanchez-Palencia
Comment 13 2012-05-02 07:22:00 PDT
Kenneth Rohde Christiansen
Comment 14 2012-05-02 07:34:15 PDT
Comment on attachment 139810 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=139810&action=review > Source/WebKit2/UIProcess/API/qt/qquickwebview.cpp:1270 > + QUrl urlToLoad = d->m_isInViewSourceMode ? QUrl::fromUserInput(url.toString(QUrl::RemoveScheme)) : url; The application (say MiniBrowser) should enforce valid urls, not us. You should just strip it here.
Jesus Sanchez-Palencia
Comment 15 2012-05-02 10:29:37 PDT
(In reply to comment #14) > (From update of attachment 139810 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=139810&action=review > > > Source/WebKit2/UIProcess/API/qt/qquickwebview.cpp:1270 > > + QUrl urlToLoad = d->m_isInViewSourceMode ? QUrl::fromUserInput(url.toString(QUrl::RemoveScheme)) : url; > > The application (say MiniBrowser) should enforce valid urls, not us. You should just strip it here. Yes, we all agree on that and the applications are usually doing it. Let me just elaborate a bit more. WE have decided that this API would be implemented through "view-source". Now we need to decide how to support it. When we get the URL in QQuickWebView::setUrl it should have been through QUrl::fromUserInput from the application already, the problem is that the way "view-source:" is evaluated by QUrl. It is considered a valid URL (i.e. isValid() returns true) but it's not "handled". What do I mean by "handled"? Let's refer to 4 use-cases: 1) User types in "www.foo.com" - isValid returns true, no matter we use QUrl ctor or ::fromUserInput() - QUrl::fromUserInput will add http:// to it and return QUrl("http://www.foo.com") 2) User types in "http:www.foo.com" - isValid returns false, no matter we use QUrl ctor or ::fromUserInput() - QUrl::fromUserInput will do nothing and return QUrl("") 3) User types in "view-source:www.foo.com" - isValid returns true, no matter we use QUrl ctor or ::fromUserInput() - QUrl::fromUserInput will do nothing and return QUrl("view-source:www.foo.com") 4) User types in "view-source:http://www.foo.com" - isValid returns true, no matter we use QUrl ctor or ::fromUserInput() - QUrl::fromUserInput will do nothing and return QUrl("view-source:http://www.foo.com") In other words, if we just strip the URL we will only support fully typed urls (i.e. "view-source:http://www.foo.com", or use case #4) and won't support the ones without scheme typed in (i.e. "view-source:www.foo.com", or use case #3). Chromium, for instance, supports both cases. If we only want to support #4, then stripping it only will be fine. If we want to support #3 as well, then we need to strip it (remove view-source:) and afterwards let QUrl "handle" whatever is left so it adds http/https/ftp/foobar... I don't believe this should be fixed in QUrl because it will then fallback to the discussion about whether #2 should work or not. What do you guys think?
Kenneth Rohde Christiansen
Comment 16 2012-05-03 02:47:27 PDT
> Yes, we all agree on that and the applications are usually doing it. Let me just elaborate a bit more. WE have decided that this API would be implemented through "view-source". Now we need to decide how to support it. > > When we get the URL in QQuickWebView::setUrl it should have been through QUrl::fromUserInput from the application already, the problem is that the way "view-source:" is evaluated by QUrl. It is considered a valid URL (i.e. isValid() returns true) but it's not "handled". > > What do I mean by "handled"? Let's refer to 4 use-cases: > > 1) User types in "www.foo.com" > - isValid returns true, no matter we use QUrl ctor or ::fromUserInput() > - QUrl::fromUserInput will add http:// to it and return QUrl("http://www.foo.com") > > 2) User types in "http:www.foo.com" > - isValid returns false, no matter we use QUrl ctor or ::fromUserInput() > - QUrl::fromUserInput will do nothing and return QUrl("") > > 3) User types in "view-source:www.foo.com" > - isValid returns true, no matter we use QUrl ctor or ::fromUserInput() > - QUrl::fromUserInput will do nothing and return QUrl("view-source:www.foo.com") > > 4) User types in "view-source:http://www.foo.com" > - isValid returns true, no matter we use QUrl ctor or ::fromUserInput() > - QUrl::fromUserInput will do nothing and return QUrl("view-source:http://www.foo.com") > > > In other words, if we just strip the URL we will only support fully typed urls (i.e. "view-source:http://www.foo.com", or use case #4) and won't support the ones without scheme typed in (i.e. "view-source:www.foo.com", or use case #3). Chromium, for instance, supports both cases. The browser (say MiniBrowser) should do this. QString userScheme; if (urlString.startsWith(QLatin1String("view-source:"))) { userScheme = QLatin1String("view-source:"); urlString.remove(0, 12); } QUrl url = ::fromUserInput(urlString); url ... The more I look at this, it makes more sense to add this to the browser and add a experimental.sourceViewingMode: true > If we only want to support #4, then stripping it only will be fine. > If we want to support #3 as well, then we need to strip it (remove view-source:) and afterwards let QUrl "handle" whatever is left so it adds http/https/ftp/foobar... > > I don't believe this should be fixed in QUrl because it will then fallback to the discussion about whether #2 should work or not. > > What do you guys think?
Jesus Sanchez-Palencia
Comment 17 2012-05-03 05:42:24 PDT
(In reply to comment #16) > The browser (say MiniBrowser) should do this. > > QString userScheme; > > if (urlString.startsWith(QLatin1String("view-source:"))) { > userScheme = QLatin1String("view-source:"); > urlString.remove(0, 12); > } > > QUrl url = ::fromUserInput(urlString); > url ... > > > The more I look at this, it makes more sense to add this to the browser and add a experimental.sourceViewingMode: true Well, then these change the entire conversation we had on IRC about our WebView supporting "view-source:" out of the box. I will upload a new patch when we have reached a final decision here. Thanks for the review!
Jesus Sanchez-Palencia
Comment 18 2012-05-03 14:04:05 PDT
Jesus Sanchez-Palencia
Comment 19 2012-05-03 14:13:15 PDT
(In reply to comment #16) > The browser (say MiniBrowser) should do this. > > QString userScheme; > > if (urlString.startsWith(QLatin1String("view-source:"))) { > userScheme = QLatin1String("view-source:"); > urlString.remove(0, 12); > } > > QUrl url = ::fromUserInput(urlString); > url ... Ok, we agree on this. The browser should clean and fix the URL, even after the view-source:. > The more I look at this, it makes more sense to add this to the browser and add a experimental.sourceViewingMode: true We disagree on this. :) Me and Caio had a good chat about this and we still think the WebView should support "view-source:" instead of yet another experimental API being set/unset. I believe the latest patch is a clean solution covering all the cases on MiniBrowser.
Alexis Menard (darktears)
Comment 20 2012-05-03 14:21:53 PDT
Comment on attachment 140088 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=140088&action=review > Source/WebKit2/UIProcess/API/qt/qquickwebview.cpp:1324 > + WTF::String url = d->m_isInViewSourceMode ? "view-source:" + mainFrame->url() : mainFrame->url(); what about? StringBuilder url; if (d->m_isInViewSourceMode) url.append("view-source:"); url.append(mainFrame->url()); return QUrl(url.toString()); > Tools/MiniBrowser/qt/utils.cpp:99 > + // Make sure we have a well formed URL after the view-source scheme. Give an example (e.g. view-source:www.google.com)
Jesus Sanchez-Palencia
Comment 21 2012-05-03 14:26:56 PDT
(In reply to comment #20) Thanks for the review. > what about? > StringBuilder url; > if (d->m_isInViewSourceMode) > url.append("view-source:"); > url.append(mainFrame->url()); > return QUrl(url.toString()); Cool! I can fix it before landing. > > > Tools/MiniBrowser/qt/utils.cpp:99 > > + // Make sure we have a well formed URL after the view-source scheme. > > Give an example (e.g. view-source:www.google.com) Yep. I thought it was clear enough after all the discussion on the bug but I agree it needs a more explicit comment. I can also fix it before landing (something like "(...) QUrl::fromUserInput doesn't sanitize urls starting with view-source:").
Alexis Menard (darktears)
Comment 22 2012-05-03 14:28:47 PDT
(In reply to comment #21) > (In reply to comment #20) > > Thanks for the review. > > > what about? > > StringBuilder url; > > if (d->m_isInViewSourceMode) > > url.append("view-source:"); > > url.append(mainFrame->url()); > > return QUrl(url.toString()); > > Cool! I can fix it before landing. > > > > > > Tools/MiniBrowser/qt/utils.cpp:99 > > > + // Make sure we have a well formed URL after the view-source scheme. > > > > Give an example (e.g. view-source:www.google.com) > > Yep. I thought it was clear enough after all the discussion on the bug but I agree it needs a more explicit comment. I can also fix it before landing (something like "(...) QUrl::fromUserInput doesn't sanitize urls starting with view-source:"). In the code it's always better than in the bug tracking system, when people refactor they don't open the bugzilla and read the history of the bug :).
Rafael Brandao
Comment 23 2012-05-03 16:03:19 PDT
Comment on attachment 140088 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=140088&action=review >> Source/WebKit2/UIProcess/API/qt/qquickwebview.cpp:1324 >> + WTF::String url = d->m_isInViewSourceMode ? "view-source:" + mainFrame->url() : mainFrame->url(); > > what about? > StringBuilder url; > if (d->m_isInViewSourceMode) > url.append("view-source:"); > url.append(mainFrame->url()); > return QUrl(url.toString()); Should "view-source" be declared static as well? > Source/WebKit2/WebProcess/WebPage/WebPage.cpp:3164 > + m_mainFrame->coreFrame()->setInViewSourceMode(viewInSourceMode); Is it possible that m_mainFrame or m_mainFrame->coreFrame() get derefered? If so, you should add null checks (or asserts).
zalan
Comment 24 2012-05-03 23:08:31 PDT
> > Source/WebKit2/WebProcess/WebPage/WebPage.cpp:3164 > > + m_mainFrame->coreFrame()->setInViewSourceMode(viewInSourceMode); > > Is it possible that m_mainFrame or m_mainFrame->coreFrame() get derefered? If so, you should add null checks (or asserts). m_mainFrame can't, but m_mainFrame->coreFrame() can get NULL. Check against only m_mainframe->coreFrame(). (assert is unnecessary here as having null coreframe is normal behavior)
Jesus Sanchez-Palencia
Comment 25 2012-05-04 05:15:31 PDT
zalan
Comment 26 2012-05-04 06:04:42 PDT
Comment on attachment 140193 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=140193&action=review > Source/WebKit2/UIProcess/WebPageProxy.cpp:3774 > + process()->send(Messages::WebPage::SetViewInSourceMode(viewInSourceMode), m_pageID, 0); Having the webprocess crashed at this point creates an interesting mismatch. If we put if (!isValid()) return; then webview and urlbar is in viewsource mode, while the reattached webprocess (the following load() ensures that) is in non-viewsource mode and there's the mismatch. OTOH, reattaching the webprocess in this setter function might look excessive. Since putting the webprocess in viewsource mode always followed by a load, it might make sense to combine these 2 into one function, something like loadURLWithViewSourceMode(url, viewsourceOn); and then do if (!isValid()) reattachToWebProcess(); but that's yet another load*() and I think we already have enough of them...or the WebPageProxy::load*() signatures could be extended to include the viewsource option, but that's not really desirable either. Most likely putting the reattach() with proper commenting is the least worst option, unless someone comes up with something better.
Simon Hausmann
Comment 27 2012-05-04 11:00:36 PDT
I'm also a bit sceptical about this. In view-source mode links are clickable, but who resets the mode on the Frame then? (and there are no tests that verify this) In a browser I think the "source view" is always a separate tab/window that can only be closed by the user - from a WebKit POV it feels immutable. Any links clicked on are probably meant to be opened in a new tab. I'm not 100% sure this is the right interface, also because it adds complexity to the url property.
Alexis Menard (darktears)
Comment 28 2012-05-07 04:38:53 PDT
(In reply to comment #27) > I'm also a bit sceptical about this. In view-source mode links are clickable, but who resets the mode on the Frame then? (and there are no tests that verify this) > > In a browser I think the "source view" is always a separate tab/window that can only be closed by the user - from a WebKit POV it feels immutable. Any links clicked on are probably meant to be opened in a new tab. Which is the behavior of Chromium, every time you click on a link in the source it opens in a new tab. > > I'm not 100% sure this is the right interface, also because it adds complexity to the url property.
Jesus Sanchez-Palencia
Comment 29 2012-05-07 12:09:29 PDT
(In reply to comment #27) > I'm not 100% sure this is the right interface, also because it adds complexity to the url property. I've tried to come up with something else, Simon, but none of the previous ideas convinced me so far. Any suggestions here? Thanks!
Simon Hausmann
Comment 30 2012-06-01 23:52:19 PDT
Comment on attachment 140193 [details] Patch I'm taking this out of the review queue. The discussion on the mailing list was not very conclusive, but I think we can safely say that adding extra state/logic in QQuickWebView is not desirable compared to having the logic on the webcore side. Where does Chromium deal with this situation and how would their approach map to ours?
Jocelyn Turcotte
Comment 31 2014-02-03 03:20:41 PST
=== Bulk closing of Qt bugs === If you believe that this bug report is still relevant for a non-Qt port of webkit.org, please re-open it and remove [Qt] from the summary. If you believe that this is still an important QtWebKit bug, please fill a new report at https://bugreports.qt-project.org and add a link to this issue. See http://qt-project.org/wiki/ReportingBugsInQt for additional guidelines.
Note You need to log in before you can comment on or make changes to this bug.