WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 5262
XMLSerializer drops Namespace information
https://bugs.webkit.org/show_bug.cgi?id=5262
Summary
XMLSerializer drops Namespace information
Richard Vermillion
Reported
2005-10-04 14:16:28 PDT
The XMLSerializer object in Safari 2.0.1 does not serialize XML Namespace information. Specifically, no xmlns attributes are serialized and attributes are only serialized with their localName. This leads to invalid HTML when there are two or more attributes with the same localName but different namespaceURI's. The DOM implementation itself is correctly handling the namespace information, but it is not being serialized. The following script will produce the error (it works fine in Firefox): var d = document.implementation.createDocument("urn:foo-ns", "foo:root", null); if (!d.documentElement) { // This shouldn't happen, since DomImplementation.createDocument // is supposed to create the root element. But in Safari, it's required. d.appendChild(d.createElementNS("urn:foo-ns", "foo:root")); } var root = d.documentElement; root.setAttributeNS("urn:foo-ns", "foo:type", "test") var c = d.createElementNS(null, "child"); root.appendChild(c); c.setAttributeNS("urn:foo-ns", "foo:name", "one"); c.setAttributeNS("urn:bar-ns", "bar:name", "two"); window.alert("foo:name is " + c.getAttributeNS("urn:foo-ns", "name") + " and should be one"); window.alert("bar:name is " + c.getAttributeNS("urn:bar-ns", "name") + " and should be two"); var s = new XMLSerializer(); window.alert(s.serializeToString(d));
Attachments
patch for fix and test case
(10.12 KB, patch)
2007-03-02 14:38 PST
,
Lamar Goddard
darin
: review-
Details
Formatted Diff
Diff
patch for fix and test case
(10.31 KB, patch)
2007-03-05 19:02 PST
,
Lamar Goddard
darin
: review-
Details
Formatted Diff
Diff
patch for fix and test case
(10.47 KB, patch)
2007-03-07 16:22 PST
,
Lamar Goddard
ap
: review-
Details
Formatted Diff
Diff
fix bit rot
(13.41 KB, patch)
2007-03-31 02:53 PDT
,
Alexey Proskuryakov
no flags
Details
Formatted Diff
Diff
patch for fix and test case
(14.46 KB, patch)
2007-04-03 16:21 PDT
,
Lamar Goddard
darin
: review-
Details
Formatted Diff
Diff
patch for fix and test case
(15.62 KB, patch)
2007-04-04 16:51 PDT
,
Lamar Goddard
darin
: review+
Details
Formatted Diff
Diff
Show Obsolete
(4)
View All
Add attachment
proposed patch, testcase, etc.
Alexey Proskuryakov
Comment 1
2005-10-18 03:55:30 PDT
Same as
bug 3811
(which doesn't have simple steps to reproduce)?
Noah Vihinen
Comment 2
2005-12-19 00:59:35 PST
I ran into this bug as well.
Alexey Proskuryakov
Comment 3
2005-12-23 05:38:16 PST
***
Bug 3811
has been marked as a duplicate of this bug. ***
Alexey Proskuryakov
Comment 4
2005-12-23 05:38:39 PST
From
bug 3811
: Expected a namespace-savvy serializer to be used. That is, expected the serializer to preserve (ns-URI,localName) pairs by synthetizing xmlns attibutes and prefixes if necessary. Expected residual prefixes and xmlns attributes to be used as hints of preferred prefixes when possible. Expected DOM Level 1 tampering with the xml:foo attributes to get the namespace right. (IE does not do namespaces or DOM Level 1 but does have sane behavior with DOM Level 1 and xml:foo attributes.)
Alexey Proskuryakov
Comment 5
2005-12-24 02:42:56 PST
May be relevant: <
http://www.w3.org/TR/2002/WD-DOM-Level-3-Core-20021022/namespaces
- algorithms.html>
Eric Seidel (no email)
Comment 6
2005-12-24 02:53:20 PST
I thought I fixed this a while back, when doing the XSLTProcessor work. I guess not. Assigning this to "nobody" mjs can pull it back if he really wanted this assigned to him.
Eric Seidel (no email)
Comment 7
2005-12-24 02:55:04 PST
It would be nice of someone could make up a DumpRenderTree compatible test case for this.
Maciej Stachowiak
Comment 8
2007-02-23 02:55:11 PST
<
rdar://problem/5018778
>
Lamar Goddard
Comment 9
2007-03-02 14:38:08 PST
Created
attachment 13454
[details]
patch for fix and test case
Darin Adler
Comment 10
2007-03-04 22:22:50 PST
Comment on
attachment 13454
[details]
patch for fix and test case Thanks for tackling this. -------- +static bool shouldAddNamespaceElem(const Element* elem, Vector<QualifiedName*>* namespaces) Why the unused "namespaces" parameter? -------- + const String prefix = elem->prefix(); We don't normally put const on local variables like this one. I know that some people prefer to do this in their C++ code and there can be some modest benefits, but I don't see any reason to do this with all the local variables in your patch. -------- All the places you write: !!x && x.length() you don't need to do it like that. This will do the same thing, more efficiently: !x.isEmpty() -------- This is a particularly inefficient way to check this: + if (attr->name().toString() == "xmlns") { One good way to write this is: static const QualifiedName xmlnsAttr(nullAtom, "xmlns", nullAtom); if (attr->name() == xmlnsAttr) { -------- + if (attr->prefix() == "xmlns") { Why it OK to check the prefix here and not the namespace? -------- +static DeprecatedString addNamespace(String prefix, const String ns, Vector<QualifiedName*>* namespaces) These parameters should be const String&, not String type. The Vector should be a reference parameter, not a pointer parameter. Vector<QualifiedName*>&. Please don't use DeprecatedString in the interface of new functions. This should return a String. -------- + prefix = !!prefix ? prefix : ""; Are you sure the above code is needed? I think everything would work fine without it. Null strings and empty strings behave the same in most cases. -------- The * goes next to the type. + QualifiedName *item = namespaces->at(i); -------- +static DeprecatedString startMarkup(const Node*, const Range*, EAnnotateForInterchange, CSSMutableStyleDeclaration*, Vector<QualifiedName*>* = 0); +static DeprecatedString startMarkup(const Node *node, const Range *range, EAnnotateForInterchange annotate, CSSMutableStyleDeclaration *defaultStyle, Vector<QualifiedName*> *namespaces) You should not have a separate declaration here. If you want to forward declare, it goes at the top of the file. Otherwise, the definition is just fine. Just put the = 0 after the namespaces vector. +static DeprecatedString markup(Node*, bool, Vector<Node*>*, Vector<QualifiedName*>* = 0); +static DeprecatedString markup(Node* startNode, bool onlyIncludeChildren, Vector<Node*>* nodes, Vector<QualifiedName*>* namespaces) Same thing here. -------- + Vector<QualifiedName*> my_namespaces; We don't use underscore in our variable names, nor do we use "my" in variable names. All the QualifiedName objects of the namespaces leak here. You need to do something to delete them. A call to deleteAllValues would be one way to do it. -------- Are you sure the test case exercises all the code paths? I think you're using the wrong data structure for the namespaces. It seems that it should be a map from a prefix string to a namespace string. It's not a good way to use QualifiedName. I'm thinking the type you want is either this: HashMap<AtomicString, AtomicString> or this: HashMap<AtomicStringImpl*, AtomicStringImpl*> Using a vector instead of a map is only good if there's a fixed limit on the number of namespaces. Since there isn't, you should use a map.
Lamar Goddard
Comment 11
2007-03-05 19:02:22 PST
Created
attachment 13484
[details]
patch for fix and test case Ok, another attempt at a patch. I now use HashMap<AtomicStringImpl*, AtomicStringImpl*> and have included a slightly more complex test case that should hit all the new code branches.
Darin Adler
Comment 12
2007-03-07 07:01:31 PST
Comment on
attachment 13484
[details]
patch for fix and test case Thanks! This looks really good. + static const QualifiedName xmlnsAttr(nullAtom, "xmlns", xmlnsURI); + if (attr->name() == xmlnsAttr) { Is it right that the QualifiedName for the "xmlns" attribute has a namespace? I'd expect nullAtom for the namespace. Is there a test case that verifies that this line of code is working properly? +static bool shouldAddNamespaceAttr(const Attribute* attr, HashMap<AtomicStringImpl*, AtomicStringImpl*>* namespaces) +static String addNamespace(const AtomicString& prefix, const AtomicString& ns, HashMap<AtomicStringImpl*, AtomicStringImpl*>* namespaces) +static String addNamespaceElem(const Element *elem, HashMap<AtomicStringImpl*, AtomicStringImpl*>* namespaces) +static String addNamespaceAttr(const Attribute *attr, HashMap<AtomicStringImpl*, AtomicStringImpl*>* namespaces) I still think that you should pass a reference here rather than a pointer (as I suggested last time I reviewed a version of this patch). I see how it's consistent to use a * since the namespaces hash is optional at the outermost level + // Set pre to be emptyAtom if prefix is empty so that all empty AtomicStrings will map to the same key in the hash. + AtomicString pre = prefix.isEmpty() ? emptyAtom : prefix; I believe the behavior you're talking about here is already guaranteed by AtomicString; the whole point of the class is that it merges all equal strings into a single key. The only exception to this rule might be due to the difference between "null" and "empty", so the code above might be useful if you're trying to make the nullAtom be equal to the emptyAtom. Or maybe the hash table can't handle nullAtom as a key? Did you test without this? Are you sure you need that line of code? + AtomicString ns = elem->namespaceURI(); For lines of code like this one, you can actually make them slightly more efficient and not do as much reference count churning if you make the local variable "const AtomicString&". There are three caveats: 1) you can't modify the value of the variable in that case 2) you need to make sure that you do this only for functions that actually return const AtomicString&, like namespaceURI() and getAttribute() and the like 3) you have to be sure not to use the value if the object you got the string from has been deleted Anyway, not an important optimization but worth mentioning; I would have used it where possible. This patch has some tabs in it. We can't check a patch into Subversion in if it has tab characters, so it makes work for whoever is landing your patch if you use them. Please search for tab characters and make sure you're not using any before posting your next patch. + if (namespaces) + namespaceHash = *namespaces; Since the markup function does not modify the passed-in namespaces hash, then I think the namespaces parameter should be a const* parameter instead of a plain old * parameter? But I'm worried now that we're going to be doing a lot of hash table copying. Is there any way to avoid that overhead while still keeping the same semantic. I know, it's my fault for suggesting we use a hash table in the first place! Everything looks good here. I'm going to say review- just because of the tab characters, and also because there are enough comments that you might want to make some revisions. However, I don't think any of my comments besides the tab character one are mandatory changes.
Lamar Goddard
Comment 13
2007-03-07 16:22:50 PST
Created
attachment 13536
[details]
patch for fix and test case (In reply to
comment #12
)
> + static const QualifiedName xmlnsAttr(nullAtom, "xmlns", xmlnsURI); > + if (attr->name() == xmlnsAttr) { > > Is it right that the QualifiedName for the "xmlns" attribute has a namespace? > I'd expect nullAtom for the namespace. Is there a test case that verifies that > this line of code is working properly?
Yeah, I tried it with nullAtom, but it doesn't match and the "node" elem in the test case also output the namespace. I got the proper namespaceURI for xmlns attributes from
http://www.w3.org/TR/2006/REC-xml-names-20060816/#xmlReserved
> +static bool shouldAddNamespaceAttr(const Attribute* attr, > HashMap<AtomicStringImpl*, AtomicStringImpl*>* namespaces) > +static String addNamespace(const AtomicString& prefix, const AtomicString& ns, > HashMap<AtomicStringImpl*, AtomicStringImpl*>* namespaces) > +static String addNamespaceElem(const Element *elem, HashMap<AtomicStringImpl*, > AtomicStringImpl*>* namespaces) > +static String addNamespaceAttr(const Attribute *attr, > HashMap<AtomicStringImpl*, AtomicStringImpl*>* namespaces) > > I still think that you should pass a reference here rather than a pointer (as I > suggested last time I reviewed a version of this patch). I see how it's > consistent to use a * since the namespaces hash is optional at the outermost > level
I've made this change.
> + // Set pre to be emptyAtom if prefix is empty so that all empty > AtomicStrings will map to the same key in the hash. > + AtomicString pre = prefix.isEmpty() ? emptyAtom : prefix; > > I believe the behavior you're talking about here is already guaranteed by > AtomicString; the whole point of the class is that it merges all equal strings > into a single key. The only exception to this rule might be due to the > difference between "null" and "empty", so the code above might be useful if > you're trying to make the nullAtom be equal to the emptyAtom. Or maybe the hash > table can't handle nullAtom as a key? Did you test without this? Are you sure > you need that line of code?
I tested without it and with using nullAtom instead of emptyAtom and both failed so I think you're right in that the hash can't store nullAtom as a key.
> + AtomicString ns = elem->namespaceURI(); > > For lines of code like this one, you can actually make them slightly more > efficient and not do as much reference count churning if you make the local > variable "const AtomicString&". There are three caveats: > > 1) you can't modify the value of the variable in that case > 2) you need to make sure that you do this only for functions that actually > return const AtomicString&, like namespaceURI() and getAttribute() and the like > 3) you have to be sure not to use the value if the object you got the > string from has been deleted > > Anyway, not an important optimization but worth mentioning; I would have used > it where possible.
I've added this where I thought it made sense.
> This patch has some tabs in it. We can't check a patch into Subversion in if it > has tab characters, so it makes work for whoever is landing your patch if you > use them. Please search for tab characters and make sure you're not using any > before posting your next patch.
Ok, I've removed all the tabs.
> + if (namespaces) > + namespaceHash = *namespaces; > > Since the markup function does not modify the passed-in namespaces hash, then I > think the namespaces parameter should be a const* parameter instead of a plain > old * parameter?
I've made this change.
> But I'm worried now that we're going to be doing a lot of hash table copying. > Is there any way to avoid that overhead while still keeping the same semantic. > I know, it's my fault for suggesting we use a hash table in the first place!
I'm not sure how much overhead there is in copying a hash. Each element should have its own copy of the hash to pass on to its children after adding any namespaces it defines. In theory we could check startNode in WebCore::markup() for being an element that either has a prefix/namespace or has an attribute with a prefix/namespace not already in the hash before doing the copy.
Darin Adler
Comment 14
2007-03-09 06:45:07 PST
Comment on
attachment 13536
[details]
patch for fix and test case + // Set pre to be emptyAtom if prefix is empty so that all empty AtomicStrings will map to the same key in the hash. Comment should really mention null, because there are only two empty atomic strings: null and empty. The code below: + // Set pre to be emptyAtom if prefix is empty so that all empty AtomicStrings will map to the same key in the hash. + const AtomicString& pre = prefix.isEmpty() ? emptyAtom : prefix; + AtomicString foundNS = namespaces.get(pre.impl()); + if (foundNS.isEmpty() || (foundNS != ns)) { + namespaces.set(pre.impl(), ns.impl()); + return " xmlns" + (!prefix.isEmpty() ? ":" + prefix : "") + "=\"" + ns + "\""; + } + return ""; seems more natural to me, written this way: if (ns.isEmpty()) return ""; // Use emptyAtom's impl() for both null and empty strings, since the HashMap can't handle 0 as a key. AtomicStringImpl* pre = prefix.isEmpty() ? emptyAtom.impl() : prefix.impl(); AtomicStringImpl* foundNS = namespaces.get(pre.impl()); if (foundNS != ns.impl()) { namespaces.set(pre.impl(), ns.impl()); return " xmlns" + (!prefix.isEmpty() ? ":" + prefix : "") + "=\"" + ns + "\""; } return ""; Since the map is of AtomicStringImpl*, the code can work with those. With the special case for ns, you can simplify callers of addNamespace -- they don't need to check for the ns.isEmpty() special case. But those changes are optional. r=me
Alexey Proskuryakov
Comment 15
2007-03-31 02:53:48 PDT
Created
attachment 13902
[details]
fix bit rot I have tried to apply this patch, and encountered several problems: 1) It has bit-rotted. 2) Namespace attribute serialization should use escapeTextForMarkup() to avoid problems with special characters in namespaces. 3) No ChangeLog for LayoutTests, 4) Incorrect markup is produced in the original test case - one of the elements has 'xmlns="null"'. 5) I am not sure why this only fixes createMarkup(Node* ,...), and not createMarkup(Range*, ...). I fixed (1), (2) and (3), but (4) has to be fixed prior to landing, too. Attaching what I ended up with.
Alexey Proskuryakov
Comment 16
2007-03-31 02:54:50 PDT
Comment on
attachment 13536
[details]
patch for fix and test case r- per the above comment.
Lamar Goddard
Comment 17
2007-04-03 16:21:20 PDT
Created
attachment 13938
[details]
patch for fix and test case Fixed (4) by patching Document::createElementNS to use nullAtom if the _namespaceURI param == "null" as createElementNS with a null namespace should behave the same way as Document::createElement. Didn't patch JSImmediate::toString as it seems that other functions/tests assume that a null from js will be converted to the string "null". This issue may need to be split out into its own bug as "null" and null are being treated the same by Document::createElementNS and that shouldn't be. I don't think (5) needs addressing (at least by this bug) as XMLSerializer only calls createMarkup(Node* , ...).
Alexey Proskuryakov
Comment 18
2007-04-03 21:50:37 PDT
(In reply to
comment #17
)
> This issue may need to be > split out into its own bug as "null" and null are being treated the same by > Document::createElementNS and that shouldn't be.
I think you are right, this sounds like a separate bug then. It can be fixed in Document.idl: - [OldStyleObjC] Element createElementNS(in DOMString namespaceURI, + [OldStyleObjC] Element createElementNS(in [ConvertNullToNullString] DOMString namespaceURI, in DOMString qualifiedName)
Darin Adler
Comment 19
2007-04-04 09:31:14 PDT
Comment on
attachment 13938
[details]
patch for fix and test case Patch basically looks pretty good. + // Test _namespaceURI for "null" as null from js comes through as "null" That's a bug that should be fixed at JavaScript binding call site. We shouldn't convert a null to a string and then check for "null". Instead we should use [ConvertNullToNullString] in the Document.idl file. Look at DOMImplementation.idl for examples. - markups.prepend(startMarkup(parent, range, annotate)); + markups.prepend(startMarkup(parent, range, annotate, false, 0)); - markups.prepend(startMarkup(parent, range, annotate)); + markups.prepend(startMarkup(parent, range, annotate, false, 0)); - markups.prepend(startMarkup(ancestor, range, annotate, convertBlocksToInlines)); + markups.prepend(startMarkup(ancestor, range, annotate, convertBlocksToInlines, 0)); - markups.prepend(startMarkup(ancestor, range, annotate)); + markups.prepend(startMarkup(ancestor, range, annotate, false, 0)); Why are these helpful changes? Those are the default values of those parameters.
Lamar Goddard
Comment 20
2007-04-04 16:51:43 PDT
Created
attachment 13953
[details]
patch for fix and test case I've added [ConvertNullToNullString] for the namespaceURI parameters in Document's bindings of createAttributeNS, createElementNS and getElementsByTagNameNS. The dom/xhtml/level3/core/nodeisequal14.xhtml test now fails because an Attr created from createAttribute is supposed to have a null prefix, localName and namespaceURI. See:
http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#ID-1084891198
This was only passing before because the namespaceURI that was being used by createAttributeNS(null, ...) was "null". The dom/xhtml/level3/core/nodeisequal15.xhtml test now succeds as attributes with no namespace from the page and attributes created explicitly with a namespace of null have the same namespace.
Darin Adler
Comment 21
2007-04-05 08:56:18 PDT
Comment on
attachment 13953
[details]
patch for fix and test case r=me There are three minor things wrong with this patch that are not serious enough for a review-. One is that the change log for LayoutTests needs to list the files modified, and in a couple cases instead it's listing the .xhtml files of the tests with changed results. Second, the explanation of nodeisequalnode14.xhtml failure is unnecessarily confusing. It should say that it's supposed to create an Attr with localName of null, which is the salient point I think. The third problem is that we don't have nearly enough testing of the behavior of createElementNS and getElementsByTagNameNS when passed a namespace of null -- the new layout test is the only one that calls them that way, and it doesn't test the behavior very thoroughly.
Alexey Proskuryakov
Comment 22
2007-04-21 02:59:20 PDT
Committed revision 20997. See
bug 6216
for some related getElementsByTagNameNS issues.
Alexey Proskuryakov
Comment 23
2007-06-23 03:37:22 PDT
***
Bug 14322
has been marked as a duplicate of this bug. ***
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug