Bug 17147

Summary: [GTK] API: Stream-based loader API
Product: WebKit
Reporter: Alp Toker <alp>
Component: WebKitGTK
Assignee: Nobody <webkit-unassigned>
Status: RESOLVED WONTFIX
Severity: Normal
CC: a.butenka, acmay, christian, jmalonzo, mrobinson, pbloomfield, pmuellr, svillar, talby
Priority: P2
Keywords: Gtk
Version: 528+ (Nightly build)
Hardware: All
OS: All
Bug Depends on:
Bug Blocks: 15843

Alp Toker
Reported 2008-02-01 17:37:30 PST
We need to support applications that load custom resources into WebView. Evolution, Tinymail, Yelp and Monodoc are the apps that come to mind. Something like WebDataSource, or perhaps the ability to register URL protocol handlers, or both. GIO could be useful for stream classes, perhaps as a counterpart to NSData in the Mac API. The Mac Obj-C code is quite tangled and parts of it look deprecated, so we don't necessarily want to copy everything there.

(Bug #15843 is a request from application authors that we provide additional metadata to the loader application, such as image scale, which might be worth keeping in mind when designing a stream-based loader API.)
Attachments
Add signal before an image load, and add in API function to load in an image (17.95 KB, patch)
2008-09-30 22:32 PDT, Andrew May
no flags
José Dapena Paz
Comment 1 2008-07-15 05:17:16 PDT
I would be very interested in helping on this. Unfortunately, I still don't have enough knowledge of the parts I should touch to add this feature (the WebCore and WebKit architecture elements involved, the interaction among them, etc).

Some additional points for things we need in tinymail and modest:
* We need to be able to feed a custom stream.
* We also need WebKit to provide a way to supply a custom stream for specific URIs (for example, for loading images).

As for the implementation we currently use, gtkhtml provides:
* A signal "url-requested". This signal offers a GtkHTMLStream you can feed (from your own stream or whatever you need). This is used for loading images or other resources included inside the document (we hook to this for cid: URIs, and also for managing internally fetching or blocking external image URIs).
* An API to obtain a GtkHTMLStream from the gtkhtml widget that you can write to. This API is used for loading the document itself into the HTML widget.

Maybe it would be good if we could use a standard GIO channel for this, so that we can use as much standard glib/gtk API as possible.
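To make the two requirements above concrete, here is a rough sketch of a per-scheme stream loader, written in Python only for brevity. Nothing in it is WebKit or gtkhtml API: `StreamLoader`, `register_scheme` and `open` are hypothetical names, and `io.BytesIO` merely stands in for whatever stream type (for example a GIO input stream) a real API would use.

```python
import io
from urllib.parse import urlsplit


class StreamLoader:
    """Hypothetical registry that maps a URI scheme to a stream factory."""

    def __init__(self):
        self._handlers = {}

    def register_scheme(self, scheme, handler):
        # handler is any callable taking the URI and returning a binary stream
        self._handlers[scheme.lower()] = handler

    def open(self, uri):
        # Look up the handler by the URI's scheme, e.g. "cid" for "cid:part1"
        scheme = urlsplit(uri).scheme.lower()
        handler = self._handlers.get(scheme)
        if handler is None:
            raise LookupError("no handler registered for scheme %r" % scheme)
        return handler(uri)


# Usage: the application feeds its own data for cid: URIs.
loader = StreamLoader()
loader.register_scheme("cid", lambda uri: io.BytesIO(b"image-bytes"))
stream = loader.open("cid:part1@example.org")
```

Leaving a scheme unregistered is also how an application could block remote resources in this sketch: the lookup fails instead of fetching anything.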
Andrew May
Comment 2 2008-09-30 22:32:43 PDT
Created attachment 23967: Add signal before an image load, and add in API function to load in an image

I don't do any glib/gtk devel or WebKit stuff, so I don't expect this to be perfect or correct. There is some leftover junk in the patch with the "notimplemented" macro to help me see what else is missing. But I did try to dive into the classes and get something working. I did get this working with a patch to the unreleased mayflower plugin for claws mail. So I am looking to see if this is anywhere close to being the correct approach and what needs to be done to fix it.
Peter Bloomfield
Comment 3 2008-11-26 05:55:02 PST
I'm exploring WebKitWebView as an alternative to gtkhtml{2,3} in an email UA (Balsa). We would need some feature like this, to (a) meet "cid:" requests, and (b) optionally block loading of images or anything else from remote sites--some users have privacy concerns. I'm completely unfamiliar with the WebKit codebase, so I'm not going to be much help with coding, but I'd be very interested in helping out with testing.
Patrick Mueller
Comment 4 2009-09-12 16:03:23 PDT
I ran into Peter Bloomfield at a conference, and had another idea about how to solve a particular problem related to this bug. Peter described the problem of dealing with html email with embedded images. My suggestion was to make use of the data: url to avoid having to do anything more complicated. Reference here: http://en.wikipedia.org/wiki/Data_URI_scheme So the basic idea would be to extract the embedded images out of the email payload, and replace the <img src="xxx"> with <img src="data:yyy"> in the actual html email section. This could be done in either the primary code dealing with the message (C?), or could probably be done in JavaScript with something like an onload handler. May be off-base, but it sounded like it might be an easy way to work around this particular problem, but not a general purpose solution (I don't think).
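A minimal sketch of the rewriting step described above, in Python for brevity (a mail client would more likely do this in C). It assumes the caller has already extracted the embedded parts from the message; the function name `inline_cid_images` and the shape of the `parts` mapping are illustrative assumptions, not anything from WebKit or an existing mail library.

```python
import base64
import re


def inline_cid_images(html, parts):
    """Replace src="cid:..." references with data: URIs.

    `parts` maps a Content-ID (without angle brackets) to a
    (mime_type, raw_bytes) tuple; this layout is an assumption of the sketch.
    """
    def replace(match):
        quote, cid = match.group(1), match.group(2)
        if cid not in parts:
            return match.group(0)  # leave unknown references untouched
        mime_type, data = parts[cid]
        encoded = base64.b64encode(data).decode("ascii")
        return f"src={quote}data:{mime_type};base64,{encoded}{quote}"

    # Match src="cid:..." or src='cid:...' and keep the original quote style
    return re.sub(r"""src=(["'])cid:([^"']+)\1""", replace, html,
                  flags=re.IGNORECASE)


# Usage: one embedded part, one reference that cannot be resolved.
parts = {"logo@example": ("image/png", b"\x89PNG")}
html = '<img src="cid:logo@example"><img src="cid:missing">'
rewritten = inline_cid_images(html, parts)
```

A regular expression is enough for a sketch, but it only handles quoted `src` attributes; a real implementation would probably want a proper HTML parser. Base64 also inflates the payload by roughly a third, which matters for large attachments.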
Peter Bloomfield
Comment 5 2009-12-28 14:12:46 PST
(In reply to comment #4)
> I ran into Peter Bloomfield at a conference, and had another idea about how to
> solve a particular problem related to this bug.

Hi Patrick: I enjoyed the opportunity to chat with you about this.

I finally got around to trying some rewriting of the HTML source text. To avoid interpolating potentially large amounts of data, I first tried saving the matching message part as a temporary file, then replacing the "cid:" protocol with "file:///tmp/". But that ran afoul of WebKit's resolute refusal to use "file:" links.

So then I tried your suggestion of a "data:" URI, and it works! Well, in a small number of tests. But at least it requires *zero* patches to WebKit, which means we don't have to wait indefinitely to see a fix.
Alexander Butenko
Comment 6 2009-12-28 20:04:57 PST
WebView has the "resource-request-starting" and "navigation-policy-decision-requested" signals for redirecting a request.

webkit_network_request_set_uri(request, "file:///tmp/blabla.jpg") should do the job, as I understand it.
Peter Bloomfield
Comment 7 2009-12-29 12:21:02 PST
(In reply to comment #6)
> WebView has the "resource-request-starting" and "navigation-policy-decision-requested" signals
> for redirecting a request.
>
> webkit_network_request_set_uri(request, "file:///tmp/blabla.jpg") should do the job, as
> I understand it.

Thanks for pointing out the "resource-request-starting" signal. I must have started working with WebKit before it appeared in 1.1.14! It does indeed provide a very clean solution to the cid: problem.
Martin Robinson
Comment 8 2010-10-21 17:11:13 PDT
If I'm not mistaken, this functionality is provided will be provider by SoupURILoader in the future. Sergio, can you comment?
Martin Robinson
Comment 9 2010-10-21 17:14:54 PDT
(In reply to comment #8)
> If I'm not mistaken, this functionality is provided will be provider by SoupURILoader in the future. Sergio, can you comment?

That came out mangled: If I'm not mistaken this functionality will be provided by SoupURILoader in the future.
Sergio Villar Senin
Comment 10 2010-10-22 00:47:54 PDT
(In reply to comment #9)
> (In reply to comment #8)
> > If I'm not mistaken, this functionality is provided will be provider by SoupURILoader in the future. Sergio, can you comment?
>
> That came out mangled: If I'm not mistaken this functionality will be provided by SoupURILoader in the future.

Well, actually we *do* support this functionality right now, as we imported the SoupURILoader code into WebKit as a basis for the new HTTP cache. So we currently have a stream-based loader API for all the protocols we support.
Martin Robinson
Comment 11 2010-10-22 08:34:15 PDT
> Well, actually we *do* support this functionality right now, as we imported the SoupURILoader code into WebKit as a basis for the new HTTP cache. So we currently have a stream-based loader API for all the protocols we support.

Correct me if I'm wrong, but my understanding is that WebKit doesn't expose an API for it and it isn't officially part of libsoup yet.
Sergio Villar Senin
Comment 12 2010-10-22 08:42:06 PDT
(In reply to comment #11)
> > Well, actually we *do* support this functionality right now, as we imported the SoupURILoader code into WebKit as a basis for the new HTTP cache. So we currently have a stream-based loader API for all the protocols we support.
>
> Correct me if I'm wrong, but my understanding is that WebKit doesn't expose an API for it and it isn't officially part of libsoup yet.

Oh, my fault, I didn't read the bug properly. As you said, we do not currently expose any API.
talby
Comment 13 2011-10-07 16:47:31 PDT
I have been using the "resource-request-starting" signal to call webkit_network_request_set_uri(), updating the URI to a data: URL, which works wonderfully to embed media resources. However, embedding HTML is problematic.

I am attempting to sandbox a browser display of HTML content, but since this technique alters the URLs visible to the DOM, onDomain policies get mangled and relative URLs cannot be resolved properly.
Martin Robinson
Comment 14 2011-10-07 16:55:56 PDT
(In reply to comment #13)
> I have been using the "resource-request-starting" signal to call webkit_network_request_set_uri() updating the uri to a data: url, which works wonderfully to embed media resources, however embedding html is problematic.
>
> I am attempting to sandbox a browser display of html content, but since this technique alters the urls visible to the DOM, onDomain policies get mangled and relative urls can not be resolved properly.

Have you tried using http://webkitgtk.org/reference/webkitgtk-webkitwebview.html#webkit-web-view-load-string ?
talby
Comment 15 2011-10-08 12:21:50 PDT
(In reply to comment #14)
> Have you tried using http://webkitgtk.org/reference/webkitgtk-webkitwebview.html#webkit-web-view-load-string ?

That's a great suggestion, and yes I am, for the main frame load. However, my trouble shows up when the page has child frames (or child windows).

If I use a "resource-request-starting" handler to rewrite the URL to a data: URL, the DOM is unable to resolve relative links and onDomain policies are not honored.

If instead I use a "resource-request-starting" handler to call webkit_web_frame_load_string(), it seems to mangle the WebKitWebFrame and lead to segfaults. I think the DocumentLoader ends up in a bad state where a new load attempt has partially initiated, yet the last attempt has not finished failing. So, it doesn't seem like webkit_web_frame_load_string() is intended for use in signal handlers, at least not "resource-request-starting" (or "navigation-policy-decision-requested").

I can't find a way to satisfy a pending load attempt with a *_load_string() call, so it doesn't seem to help me with child frames where the load is initiated as a side effect of a parent load.
Martin Robinson
Comment 16 2011-10-08 12:26:11 PDT
Hrm. Perhaps you'll have better luck handling the load-started signal. The documentation claims it's deprecated, but I'm in favor of undeprecating it.

Also, the fact that calling load_data in a signal handler causes a crash sounds like a bug! Do you have a stack trace?
talby
Comment 17 2011-10-10 11:44:05 PDT
(In reply to comment #16)
> Hrm. Perhaps you'll have better luck handling the load-started signal. The documentation claims it's deprecated, but I'm in favor of undeprecating it.
>
> Also the fact that calling load_data in a signal handler causes a crash, sounds like a bug! Do you have a stack trace?

"load-started" on the WebKitWebView only seems to fire for the main frame. WebKitWebFrame doesn't emit a "load-started" signal, so I don't think that can help me.

I have since been able to call webkit_web_frame_load_string() from within "navigation-policy-decision-requested". On the first pass I didn't notice that calling webkit_web_frame_load_string() from within the signal handler emits a second "navigation-policy-decision-requested" signal, and my naive attempt was simply blowing the stack. If I don't call _load_string() in the second emit, it seems to load smoothly.

I still have a segfault attempting to use webkit_web_view_load_string() from within "resource-request-starting", and can provide the stack trace if it's still interesting to you. It may not be worth investigating because there's a workaround, but that one is not a handler recursion issue; it's something more complex.
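For anyone hitting the same handler recursion, the guard can be sketched roughly as follows. This is a toy model in Python, not WebKit code: `ToyFrame` is a hypothetical stand-in for a frame that consults its handlers on every load (including loads started from inside a handler), and `make_guarded_handler` is an invented helper name.

```python
class ToyFrame:
    """Hypothetical stand-in for a frame that emits a policy signal per load."""

    def __init__(self):
        self.handlers = []
        self.loaded = []

    def load_string(self, content):
        # Every load request first asks the handlers for a decision,
        # including loads that were started from inside a handler.
        for handler in self.handlers:
            if handler(self, content):
                return  # a handler took over this load
        self.loaded.append(content)


def make_guarded_handler(replacement):
    in_handler = False  # True while our own load_string() call is running

    def on_policy_decision(frame, content):
        nonlocal in_handler
        if in_handler:
            return False  # second emit: let our own load proceed normally
        in_handler = True
        try:
            frame.load_string(replacement)  # triggers the second emit
        finally:
            in_handler = False
        return True  # the original load has been replaced

    return on_policy_decision


# Usage: without the guard, this call would recurse until the stack overflows.
frame = ToyFrame()
frame.handlers.append(make_guarded_handler("custom content"))
frame.load_string("original content")
```

The flag is reset in a `finally` block so that a failed load does not leave the handler permanently disabled.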
Martin Robinson
Comment 18 2014-04-08 17:56:10 PDT
We have an API to implement custom protocols now.