Data URLs currently cause: Curl ERROR for url='http://data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABgAAABAAQMAAAA6ZCUhAAAABlBMVEUAAAAup+mvJKLXAAAAAnRSTlMA/1uRIrUAAAAJcEhZcwAADsQAAA7EAZUrDhsAAAAZSURBVHicY2D4z8BACf5PAFNqPkELKDMfANtJT7FBP1fLAAAAAElFTkSuQmCC', error: 'couldn't resolve host name' The Qt port already has code for this, so a solution that can be shared between ports may be a good idea.
Note that the Qt port no longer has code for this (they must have added the feature to Qt itself), though it's still in SVN history. This feature is not so hard to implement, but the state of the curl backend makes it little fun to hack on. This bug continues to be one of the most annoying, since a lot of SVG and canvas tests use data: URLs for inline images, even though it's not a feature used widely on the Web.
Created attachment 16876 [details] Add data URL support
Comment on attachment 16876 [details] Add data URL support > unsigned len = in.size(); > const char* data = in.data(); > Can these just be folded into the new base64Decode call now? >+ return base64Decode(data, len, out); >+} >+ >+bool base64Decode(const char* data, unsigned len, Vector<char>& out) >+{ >+ out.clear(); >+ if (len == 0) >+ return true; >+ >+ // If the input string is pathologically large, just return nothing. >+ if (len > UINT_MAX) >+ return false; >+ Why is this change needed? > while (len && data[len-1] == '=') > --len; > >+static void parseDataUrl(ResourceHandle* job) Please name the variable "handle" instead of "job". >+{ >+ DeprecatedString data = job->request().url().url().latin1(); Why do you need to call latin1() here? Also, we recommend against using DeprecatedString in newly written code. >+ >+ ASSERT(data.startsWith("data:")); URL schemes are case insensitive so this is not correct. >+ > void ResourceHandleManager::startJob(ResourceHandle* job) I take it that startJob is called from the main loop. > { >+ KURL kurl = job->request().url(); >+ String protocol = kurl.protocol(); >+ >+ if (protocol == "data") { URL schemes are insensitive, please use equalIgnoringCase here. >+ parseDataUrl(job); >+ return; >+ } >+ > ResourceHandleInternal* d = job->getInternal();
(In reply to comment #3) > (From update of attachment 16876 [details] [edit]) > > unsigned len = in.size(); > > const char* data = in.data(); > > > > Can these just be folded into the new base64Decode call now? Like this? return base64Decode(in.data(), in.size(), out); Sure. > > >+ return base64Decode(data, len, out); > >+} > >+ > >+bool base64Decode(const char* data, unsigned len, Vector<char>& out) > >+{ > >+ out.clear(); > >+ if (len == 0) > >+ return true; > >+ > >+ // If the input string is pathologically large, just return nothing. > >+ if (len > UINT_MAX) > >+ return false; > >+ > > Why is this change needed? These checks were in the original base64Decode(). The first is a base case and seems to still be necessary, but the second is probably redundant. > > > while (len && data[len-1] == '=') > > --len; > > > > >+static void parseDataUrl(ResourceHandle* job) > > Please name the variable "handle" instead of "job". OK. > > >+{ > >+ DeprecatedString data = job->request().url().url().latin1(); > > Why do you need to call latin1() here? Also, we recommend against using > DeprecatedString in newly written code. Its features were closest to the original code (eg. DeprecatedString::mid()). KURL's KURL::decode_string() also expects and returns DeprecatedString, which we call for percent-encoded data. > > >+ > >+ ASSERT(data.startsWith("data:")); > > URL schemes are case insensitive so this is not correct. OK. > > >+ > > void ResourceHandleManager::startJob(ResourceHandle* job) > > I take it that startJob is called from the main loop. Yeah. The job then gets queued. > > > { > >+ KURL kurl = job->request().url(); > >+ String protocol = kurl.protocol(); > >+ > >+ if (protocol == "data") { > > URL schemes are insensitive, please use equalIgnoringCase here. OK. > > >+ parseDataUrl(job); > >+ return; > >+ } > >+ > > ResourceHandleInternal* d = job->getInternal(); >
(In reply to comment #4) > (In reply to comment #3) > > (From update of attachment 16876 [details] [edit] [edit]) > > > unsigned len = in.size(); > > > const char* data = in.data(); > > > > > > > Can these just be folded into the new base64Decode call now? > > Like this? > > return base64Decode(in.data(), in.size(), out); > > Sure. Good. > > > > > >+ return base64Decode(data, len, out); > > >+} > > >+ > > >+bool base64Decode(const char* data, unsigned len, Vector<char>& out) > > >+{ > > >+ out.clear(); > > >+ if (len == 0) > > >+ return true; > > >+ > > >+ // If the input string is pathologically large, just return nothing. > > >+ if (len > UINT_MAX) > > >+ return false; > > >+ > > > > Why is this change needed? > > These checks were in the original base64Decode(). The first is a base case and > seems to still be necessary, but the second is probably redundant. > Are they still in the method that takes a vector? If so, can they be removed from there? > > > > > while (len && data[len-1] == '=') > > > --len; > > > > > > > >+static void parseDataUrl(ResourceHandle* job) > > > > Please name the variable "handle" instead of "job". > > OK. > > > > > >+{ > > >+ DeprecatedString data = job->request().url().url().latin1(); > > > > Why do you need to call latin1() here? Also, we recommend against using > > DeprecatedString in newly written code. > > Its features were closest to the original code (eg. DeprecatedString::mid()). > KURL's KURL::decode_string() also expects and returns DeprecatedString, which > we call for percent-encoded data. > String::substring is the equivalent of DeprecatedString::mid.
Created attachment 16892 [details] Add data URL support, take two This addresses the issues in the review, except for DeprecatedString. I'm not comfortable with the idea of storing binary data in a string class either way, so I don't think using String instead of DeprecatedString will be any less broken.
Created attachment 16893 [details] Add data URL support, take three (with ChangeLog entry)
Comment on attachment 16893 [details] Add data URL support, take three (with ChangeLog entry) Looks great, r=me
Landed in r27189.