Bug 24418 - transformToFragment() method from XSLTProcessor objects returning null when processing XML data read from on-disk files
Status: RESOLVED INVALID
Alias: None
Product: WebKit
Classification: Unclassified
Component: XML
Version: 528+ (Nightly build)
Hardware: Mac OS X 10.5
Importance: P2 Normal
Assignee: Nobody
Reported: 2009-03-05 21:56 PST by Juan Manuel Palacios
Modified: 2009-03-06 08:55 PST

Description Juan Manuel Palacios 2009-03-05 21:56:34 PST
I'm using r41443 of the WebKit nightly builds, the current one as of 2009-03-05, on top of the Safari 4 beta, and I'm experiencing a problem with the transformToFragment() method of XSLTProcessor objects: it always returns null when processing the sample XML and XSL files that can be reached at the following URLs:

-) http://jmpp.org/ttf_testcase/ a small index.html file with a clickable link to see the bug in action;
-) http://jmpp.org/ttf_testcase/data.xml the sample XML file;
-) http://jmpp.org/ttf_testcase/stylesheet.xsl the XSLT stylesheet;
-) http://jmpp.org/ttf_testcase/fragment_broken.js the JS code that attempts to process the previous two files.

A couple of things I'd like to note explicitly:

-) this particular combination of XML, XSLT and JS works perfectly on Firefox 3;
-) a slight variation of this test case mysteriously makes the problem go away: the XML to be processed is a string output by a server-side script (rather than an on-disk file), such as http://dykasa.com/imAppInterfaces/IArticulos.php?act=articulos_marca&marId=1&page=0

In this modified testcase, the xmlDoc JS variable that holds the XML to be processed is filled a little differently, as demonstrated in the script inlined below:

/* fragment_working.js */
var myXMLHTTPRequest = new XMLHttpRequest();
var xsltProcessor = new XSLTProcessor();
var xmlDoc;
var fragment;

// load the xslt file
myXMLHTTPRequest.open('GET', 'http://dykasa.com/imClientFormats/articulos_marca.xsl', false);
myXMLHTTPRequest.send(null);
xsltProcessor.importStylesheet(myXMLHTTPRequest.responseXML);
		
// load the xml file
myXMLHTTPRequest.open('GET', 'http://dykasa.com/imAppInterfaces/IArticulos.php?act=articulos_destacados&page=0', false);
myXMLHTTPRequest.send(null);
// different xmlDoc var here!
xmlDoc = (new DOMParser()).parseFromString(myXMLHTTPRequest.responseText, 'text/xml');
fragment = xsltProcessor.transformToFragment(xmlDoc, document);
console.log(fragment);

In this case, transformToFragment() returns a valid object from the custom-filled xmlDoc variable. This narrowing-down process leads me to believe the problem is in the responseXML property of the XHR object the second time it is used in the 'fragment_broken.js' file, i.e. when the XML file is loaded. Unfortunately, I hit a dead end with this theory, because dumping myXMLHTTPRequest.responseXML to the console just after loading the XML file ('data.xml') in 'fragment_broken.js' shows me a perfectly valid XML structure. Also, the 'data.xml' file used in 'fragment_broken.js' and the XML string produced by the URL used in 'fragment_working.js' have the exact same structure, which makes it difficult to point to the XML file as the possible culprit (e.g. being malformed). Similarly, the 'stylesheet.xsl' file used in 'fragment_broken.js' has the exact same structure as the one used in 'fragment_working.js', so it is equally difficult to point to a possibly malformed 'stylesheet.xsl' file as the culprit.

So, summarizing:

-) http://jmpp.org/ttf_testcase/data.xml and http://dykasa.com/imAppInterfaces/IArticulos.php?act=articulos_destacados&page=0 have the exact same structure, but the former is an on-disk file and the latter is a string output by a server-side script (this seems to be the only noticeable difference so far);
-) http://jmpp.org/ttf_testcase/stylesheet.xsl and http://dykasa.com/imClientFormats/articulos_marca.xsl have the exact same structure and both are on-disk files;
-) in 'fragment_broken.js', the xmlDoc JS variable is simply the 'responseXML' property of the XHR object that loads the on-disk XML file;
-) in 'fragment_working.js', the xmlDoc JS variable is the XML document resulting from a DOM parse:

xmlDoc = (new DOMParser()).parseFromString(myXMLHTTPRequest.responseText, 'text/xml')

Lastly, a search through Bugzilla led me to a similar issue, #10419, but not only has that one already been fixed, the suggestions made there don't seem to help this time round either:

-) removing or leaving in the <?xml version="1.0" encoding="UTF-8"?> declaration at the top of my stylesheet.xsl file makes no difference; the problem demonstrated in fragment_broken.js persists;
-) switching the xsl:output to <xsl:output method="xml" omit-xml-declaration="yes"/> makes no difference either; the problem demonstrated in fragment_broken.js persists.

Please don't hesitate to ping me here or on IRC, under the nick jmpp, if any of this is not clear; I'd be happy to elaborate and/or clarify.


-jmpp
Comment 1 Juan Manuel Palacios 2009-03-05 22:46:53 PST
A couple of things I'd like to clarify, after a conversation with a very helpful individual in ##JavaScript @ Freenode:

-) when I say "on-disk file" in my report, I mean a static file served by Apache ('data.xml'), as opposed to content dynamically generated by PHP and printed to the wire, which is exactly how the XML "string" fetched in fragment_working.js is constructed;

-) the 'data.xml' file is served with an appropriate "Content-Type: application/xml" header, and this is the case that works on Firefox but fails on WebKit (fragment_broken.js); the XML "string" used in fragment_working.js is served with an "erroneous" "Content-Type: text/html" header, and works on both Firefox *and* WebKit, which only confuses me more...
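
For readers following along: responseXML is generally only populated when the response carries an XML content type, which would explain why the text/html response in fragment_working.js had to go through DOMParser. That reading is an assumption and is not confirmed anywhere in this thread. A minimal sketch of a loader that covers both cases, where xmlFromResponse is a hypothetical helper name and not part of either test case:

```javascript
// Hypothetical helper (not from fragment_broken.js or fragment_working.js).
// Returns an XML document from a completed XMLHttpRequest-like object.
function xmlFromResponse(xhr, parser) {
  // Preferred path: the browser already parsed the body as XML
  // (the data.xml case, served as application/xml).
  if (xhr.responseXML) {
    return xhr.responseXML;
  }
  // Fallback path: parse the raw body explicitly as XML
  // (the approach used in fragment_working.js).
  return parser.parseFromString(xhr.responseText, 'text/xml');
}
```

In a browser this would be called as xmlFromResponse(myXMLHTTPRequest, new DOMParser()) after send() has completed.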

- jmpp
Comment 2 Alexey Proskuryakov 2009-03-06 01:13:29 PST
This is caused by a mistake in stylesheet.xsl. For the first output element, the stylesheet attempts to add an id attribute after adding a child, which is forbidden by XSLT spec, see <http://www.w3.org/TR/xslt#creating-attributes>:

-----------
The following are all errors:
- Adding an attribute to an element after children have been added to it; implementations may either signal the error or ignore the attribute.
...
-----------

Firefox silently ignores the error, but libxslt does not. I've checked that there is no way to ask libxslt to be lenient in this case.

Please note that the XSLT transformation error is printed to Mac OS X console. We certainly should make it so that errors are printed to Web Inspector console in the future for easier debugging.
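
To illustrate the kind of mistake described above, here is a minimal sketch. Both stylesheets are hypothetical and are not the reporter's stylesheet.xsl, which is not reproduced in this report; the element names and the id value are made up. The first adds an attribute after a child element, the second adds it before any children.

```javascript
// Hypothetical stylesheets for illustration only.
const XSL_HEAD =
  '<xsl:stylesheet version="1.0" ' +
  'xmlns:xsl="http://www.w3.org/1999/XSL/Transform">' +
  '<xsl:template match="/">';
const XSL_TAIL = '</xsl:template></xsl:stylesheet>';

// Error case: xsl:attribute appears after a child (<span>) was added to <div>.
const attributeAfterChild =
  XSL_HEAD +
  '<div><span>item</span><xsl:attribute name="id">list</xsl:attribute></div>' +
  XSL_TAIL;

// Corrected: xsl:attribute comes before any child content of <div>.
const attributeBeforeChild =
  XSL_HEAD +
  '<div><xsl:attribute name="id">list</xsl:attribute><span>item</span></div>' +
  XSL_TAIL;

// Browser-only helper: runs a stylesheet string against an XML string.
function transformWith(xslText, xmlText) {
  const parser = new DOMParser();
  const processor = new XSLTProcessor();
  processor.importStylesheet(parser.parseFromString(xslText, 'text/xml'));
  return processor.transformToFragment(
    parser.parseFromString(xmlText, 'text/xml'), document);
}

// Only attempt the transforms where the browser APIs exist.
if (typeof XSLTProcessor !== 'undefined' && typeof document !== 'undefined') {
  console.log(transformWith(attributeAfterChild, '<root/>'));
  console.log(transformWith(attributeBeforeChild, '<root/>'));
}
```

What the first call logs depends on the engine, since the spec lets an implementation either signal the error or ignore the attribute.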
Comment 3 Juan Manuel Palacios 2009-03-06 08:55:00 PST
(In reply to comment #2)
> This is caused by a mistake in stylesheet.xsl. For the first output element,
> the stylesheet attempts to add an id attribute after adding a child, which is
> forbidden by XSLT spec, see <http://www.w3.org/TR/xslt#creating-attributes>:
> 
> -----------
> The following are all errors:
> - Adding an attribute to an element after children have been added to it;
> implementations may either signal the error or ignore the attribute.
> ...
> -----------

Thanks for pointing me to this. I made the corrections to the stylesheet and everything is working fine now, mystery solved! I apologize for the loud noise; I was very confused by this "problem", as no analysis led me to any logical conclusion. I did look for help in #Webkit @ freenode to review the XML and XSLT in case there was anything at fault in them that I was missing, but I was asked to come here and open a bug instead.

When I said the XSLT in this testcase and the one used at the dykasa.com server (the sample used in 'fragment_working.js') had the same structure, I was referring to their <xsl:output> elements, as the review in bug #10419 led me to believe that's where I should focus my efforts. They were nevertheless different, as the latter doesn't contain the error I was making here.

> 
> Firefox silently ignores the error, but libxslt does not.

That didn't help one bit in finding the error ;)

> 
> Please note that the XSLT transformation error is printed to Mac OS X console.

Aye!

3/6/09 12:07:24 PM [0x0-0x29029].org.webkit.nightly.WebKit[423] runtime error: file http://jmpp.org/ttf_testcase/stylesheet.xsl line 9 element attribute
3/6/09 12:07:24 PM [0x0-0x29029].org.webkit.nightly.WebKit[423] xsl:attribute : node already has children

> We certainly should make it so that errors are printed to Web Inspector console
> in the future for easier debugging.


Please! That would definitely be of great help, and I guess many would say they'd expect these types of errors to be shown right in the browser, which is where we are working after all; it never occurred to me to look in the Console until you mentioned it. Firebug prints these errors to its console too (though in an extremely cryptic manner, sadly), further supporting this move.
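
In the meantime, page code can at least make a failed transform visible in the browser. The following is a minimal sketch, assuming (as observed in this report) that a failed transformation shows up as a null return value; transformOrThrow is a hypothetical helper name and the error message is my own wording.

```javascript
// Hypothetical wrapper: surfaces a null result from transformToFragment()
// as an exception instead of letting null propagate silently.
function transformOrThrow(processor, sourceDoc, ownerDoc) {
  const fragment = processor.transformToFragment(sourceDoc, ownerDoc);
  if (fragment == null) {
    // The engine gave no details here; per this report, the libxslt
    // message may only be visible in the system console.
    throw new Error(
      'transformToFragment() returned null; the XSLT transformation ' +
      'probably failed. Check the system console for details.');
  }
  return fragment;
}
```

With the variables from fragment_working.js, this would be used as fragment = transformOrThrow(xsltProcessor, xmlDoc, document).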

But I'm not complaining, I understand the nature of open source and I don't have a patch handy ;) Thanks for all the help!


-jmpp