Bug 19215 - REGRESSION: transformToDocument fails when xsl includes  
Summary: REGRESSION: transformToDocument fails when xsl includes  
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: XML (show other bugs)
Version: 528+ (Nightly build)
Hardware: Mac (Intel) OS X 10.5
: P1 Normal
Assignee: Alexey Proskuryakov
URL:
Keywords: Regression
: 19916 (view as bug list)
Depends on:
Blocks:
 
Reported: 2008-05-23 08:18 PDT by Duncan Booth
Modified: 2008-07-07 04:43 PDT (History)
4 users (show)

See Also:


Attachments
Javascript to create xml and xsl and demonstrate the problem (1.08 KB, text/javascript)
2008-05-23 08:20 PDT, Duncan Booth
no flags Details
HTML to load the previously attached javascript (158 bytes, text/html)
2008-05-23 08:20 PDT, Duncan Booth
no flags Details
proposed fix (20.80 KB, patch)
2008-05-27 00:08 PDT, Alexey Proskuryakov
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Duncan Booth 2008-05-23 08:18:59 PDT
I have an XSL stylesheet which includes an entity escape for a non-break space. Prior to webkit nightly build r32877 this worked, but from r32901 onwards it fails (transformToDocument returns null). Simply removing the   from the XSL file makes it work again.
Comment 1 Duncan Booth 2008-05-23 08:20:09 PDT
Created attachment 21311 [details]
Javascript to create xml and xsl and demonstrate the problem
Comment 2 Duncan Booth 2008-05-23 08:20:40 PDT
Created attachment 21312 [details]
HTML to load the previously attached javascript
Comment 3 Alexey Proskuryakov 2008-05-23 08:27:37 PDT
Sounds related to <http://trac.webkit.org/projects/webkit/changeset/32879>.

Confirming (although I haven't attempted to actually reproduce yet).
Comment 4 Alexey Proskuryakov 2008-05-26 03:28:49 PDT
The problem is that we replace &#160; with &nbsp; when re-serializing the stylesheet for XSLT, which makes it invalid.

We shouldn't do this for XML. Maybe we should match Firefox and also avoid doing this for HTML in cases other than innerHTML, but so far, we've been able to get away with having innerHTML behave identically to XMLSerializer, XSLT and XMLHttpRequest serialization etc.
Comment 5 Alexey Proskuryakov 2008-05-27 00:08:29 PDT
Created attachment 21358 [details]
proposed fix
Comment 6 Darin Adler 2008-05-27 09:20:29 PDT
Comment on attachment 21358 [details]
proposed fix

+    Vector<UChar> s;
+    appendEscapedContent(s, make_pair(in.characters(), in.length()), escapeNBSP);
+    return String::adopt(s);

I would use a longer name than "s" for the string buffer.

Do we want to have a path where we reuse the string when there is no text to escape?

+            // FIXME: Comment content is not escaped, but some APIs calling this should raise an exception if it includes "-->".

I don't know what "some APIs calling this" refers to.

+        case Node::DOCUMENT_TYPE_NODE: {
+            const DocumentType* n = static_cast<const DocumentType*>(node);

Moving the code inside this file seems fine, but putting it all inside the case statement instead of factored into a separate function seems like a step backwards.

     if (Document *doc = document())
         if (DocumentType *doctype = doc->doctype())
-            return doctype->toString();
+            return createMarkup(doctype);
 
     return String();

Should use early returns instead of nested ifs. Or add braces if leaving this nested.

r=me
Comment 7 Alexey Proskuryakov 2008-05-29 02:48:39 PDT
Committed <http://trac.webkit.org/changeset/34197>.

(In reply to comment #6)
> Do we want to have a path where we reuse the string when there is no text to
> escape?

As far as I can tell, this function is not used in any hot code paths - although I'm not sure if I know all of its uses.

> I don't know what "some APIs calling this" refers to.

Reworded to mention one.

> Moving the code inside this file seems fine, but putting it all inside the case
> statement instead of factored into a separate function seems like a step
> backwards.

Factored out document type case into a separate function - the others seem small enough to stay inside the switch.
Comment 8 Alexey Proskuryakov 2008-07-07 04:43:51 PDT
*** Bug 19916 has been marked as a duplicate of this bug. ***