WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 17403
WebKit Creates Invalid Xhtml Links with Ajax
https://bugs.webkit.org/show_bug.cgi?id=17403
Summary
WebKit Creates Invalid Xhtml Links with Ajax
Charlie
Reported
2008-02-16 23:19:51 PST
In xhtml, links must be escaped like this: <a id="search_link" href="
http://www.google.com/search?q=webkit&start=10
">Original Search</a> Safari 3.x correctly handles this when loading an XHTML document, but not when replacing parts of the document using Ajax. What happens is Safari interprets the links as:
http://www.google.com/search?q=webkit&start=10
Which is wrong - the start parameter will not be sent along. I have attached an example showing the problem: 1. Open the document 2. Click the link, and you'll go to page #2 of a Google search on Webkit. 3. Hit the replace button which will replace the link in a similar way that an Ajax call will work. Note I took the replace method from the Prototype library. 4. Now click the link again - you'll see page 1 of the results, *not* page 2. And that's the bug - Safari mucks up & is xhtml links, cause all parameters after the first one to be dropped.
Attachments
Shows xhtml link handling bug
(918 bytes, application/xhtml+xml)
2008-02-16 23:20 PST
,
Charlie
no flags
Details
further reduced test case
(551 bytes, application/xhtml+xml)
2008-02-22 03:59 PST
,
Alexey Proskuryakov
no flags
Details
First version: switch DocumentFragment parsing to xmlParseContent
(11.11 KB, patch)
2008-04-12 15:07 PDT
,
Julien Chaffraix
ap
: review+
Details
Formatted Diff
Diff
test contextual parsing
(600 bytes, application/xhtml+xml)
2008-04-13 23:56 PDT
,
Alexey Proskuryakov
no flags
Details
View All
Add attachment
proposed patch, testcase, etc.
Charlie
Comment 1
2008-02-16 23:20:44 PST
Created
attachment 19168
[details]
Shows xhtml link handling bug
Charlie
Comment 2
2008-02-16 23:24:42 PST
Note by the way, the example works correctly in Firefox 2.x and Opera 9.20
Alexey Proskuryakov
Comment 3
2008-02-18 03:41:08 PST
Confirmed with
r30368
- internally, we are doubly encoding the ampersand as "&#38;".
Charlie
Comment 4
2008-02-21 01:02:58 PST
Any thoughts on fixes? This issue makes using xhtml/ajax with Safari a bit painful since any replaced links are wrong.
Alexey Proskuryakov
Comment 5
2008-02-21 01:52:45 PST
I think you would get better results by using DOM manipulation methods instead of createContextualFragment. If this is unacceptably hard to re-architecture the code you have, the next thing I'd try is fixing up URL with DOM methods after parsing with createContextualFragment.
Charlie
Comment 6
2008-02-21 19:27:18 PST
DOM manipulation really isn't the best solution in this case because XHTML is being returned from a server. Thus its a whole lot easier to use createContextualFragment - the alternative would be to use DOM Parser to first parse the content, and then copy it over to the DOM. But my guess is you'd run into the same bug anyway. I say that because we did try the trick of looping over all A elements that are inserted and reset their href values. It didn't work...the same issue occurs. This really seems like a bug in webkit since inserting html/xhtml text into the DOM works with every browser *except* Safari with xhtml (and obviously IE with xhtml). Note that we use xhtml because mapbuzz is a mapping site, and we make heavy use of SVG. In addition, in some places we atom feeds into xhtml documents to speed up some search results. Thus we depend on XML namespaces (or IE's proprietary solution of XML data islands). Thanks for the comments, and we would of course be happy to test any patches.
Alexey Proskuryakov
Comment 7
2008-02-22 03:59:27 PST
Created
attachment 19277
[details]
further reduced test case This looks like a bug in createContextualFragment(), not in URL handling.
Alexey Proskuryakov
Comment 8
2008-02-22 04:02:43 PST
DOMParser seems to work, you can try it as a workaround: var html = '<div foo="&"/>' var newDoc = (new DOMParser).parseFromString(html, 'application/xml');
Charlie
Comment 9
2008-03-18 14:44:49 PDT
Alexey, Unfortunately DOMParser does *not* work. To flesh out your example: var parser = new DOMParser() var doc = parser.parseFromString(html.stripScripts(), 'application/xhtml+xml'); var node = document.importNode(doc.documentElement, true) element.innerHTML = ' ' // the space is needed...WebKit bug element.appendChild(node) All these operations work. But what you end up with is inserted XHTML that has: * Broken images - the <img> tags are correct but no image shows up * Broken links - Safari renders the links (they have a blue underline), but if you mouse over them no links show up in the status bar and if you click on them they do not work After doing the appendChild, alert(element.innerHTML) shows the correct XHTML is inserted. But the inserted XHTML simply doesn't work... So this solution is a no-go (and seems to reveal a bunch more bugs).
Alexey Proskuryakov
Comment 10
2008-03-18 23:28:08 PDT
(In reply to
comment #9
)
> * Broken images - the <img> tags are correct but no image shows up
That's
bug 17897
(not a DOMParser issue).
> * Broken links - Safari renders the links (they have a blue underline), but if > you mouse over them no links show up in the status bar and if you click on them > they do not work
I do not remember seeing a bug for this, but it sounds related.
Charlie
Comment 11
2008-03-18 23:35:35 PDT
Ah, interesting - thanks for the pointer. Any chance of getting the issue in createContextualFragment() fixed so that the URIs come out right when setting innerHTML in an XHTML document? Thanks for the help Alexey.
Julien Chaffraix
Comment 12
2008-03-26 19:00:47 PDT
I think I have found the cause of the bug. We are failing in XMLTokenizer::parseXMLDocumentFragment. The arguments are good but the result of xmlParseBalancedChunkMemory is wrong (it converts "&" to "&") I had a look at libxml to know why we fail : when you use xmlParseBalancedChunkMemory, the library automatically converts '&' to "&" which explains the bug. A look at libxml suggests that there is two possibilities to solve the bug: correcting the values when creating the DocumentFragment or using another libxml API. Any suggestion on how to proceed here ?
Alexey Proskuryakov
Comment 13
2008-03-28 03:48:46 PDT
What is the other libxml2 API? If it's more suitable for the task, we should definitely consider using it.
Julien Chaffraix
Comment 14
2008-03-28 13:44:16 PDT
(In reply to
comment #13
)
> What is the other libxml2 API? If it's more suitable for the task, we should > definitely consider using it. >
I had no in mind when writing the mail as my libxml2 knowledge was too weak. After looking a bit more into the code and the documentation, I am pretty sure we will have to switch to another API. The candidate I am testing is xmlParseChunk as we can provide an xmlParseCtxPtr (which will give us more control over the parsing).
Julien Chaffraix
Comment 15
2008-03-29 19:23:36 PDT
> I had no in mind when writing the mail as my libxml2 knowledge was too weak. > After looking a bit more into the code and the documentation, I am pretty sure > we will have to switch to another API. The candidate I am testing is > xmlParseChunk as we can provide an xmlParseCtxPtr (which will give us more > control over the parsing). >
I was wrong about xmlParseChunk. We need a function that can parse a fragment or "content" as defined in the libxml documentation (see
http://bugs.webkit.org/show_bug.cgi?id=5142
for more explainations). There is 4 APIs that does fragment parsing: xmlParseBalancedChunkMemory, xmlParseBalancedChunkMemoryRecover, xmlParseContent and xmlParseInNodeContext. The first two are out because they are more or less the same and we are trying to find a replacement. The last one does not sound correct for what we want. I have investigated xmlParseContext with no results for the moment. I will continue to digg that API a bit more.
Julien Chaffraix
Comment 16
2008-04-12 15:02:49 PDT
As advised by Alexey, I have contacted the libxml mailing list to have their view about what we wanted to do and the way to do it (see
http://mail.gnome.org/archives/xml/2008-April/msg00017.html
). It seems that parsing a content is quite tricky and the only way is to use the internal function xmlParseContent. As precised in the link, we need to initialize the parser ourselves (we also need to interpret the errors ourselves) which will lead to some non portable code. However we already use some internal properties and the addition is quite small so it should be worth the effort of merging some of our parsing code.
Julien Chaffraix
Comment 17
2008-04-12 15:07:03 PDT
Created
attachment 20493
[details]
First version: switch DocumentFragment parsing to xmlParseContent Tested and no regression found.
Alexey Proskuryakov
Comment 18
2008-04-13 23:56:55 PDT
Created
attachment 20521
[details]
test contextual parsing The only problem I can see is that is that createContextualFragment is still not contextual - it doesn't use any context from the node it is invoked on, see this test case. Even in the original test case, the elements would end up in null namespace AFAICT. Daniel says there are many things that are contextual, besides namespaces - are there any others that we should care about? 483 // XML_PARSE_NODICT: default dictionnary option. An extra "n" in "dictionary" here.
Julien Chaffraix
Comment 19
2008-04-14 03:50:49 PDT
> The only problem I can see is that is that createContextualFragment is still > not contextual - it doesn't use any context from the node it is invoked on, see > this test case. Even in the original test case, the elements would end up in > null namespace AFAICT.
IMHO, I think you are wrong here. Check XMLTokenizer::startElementNs: there is a check for the namespace. If none is provided by libxml then we check the prefix against m_prefixToNamespaceMap (which is initialized from the Element provided as a parent). If no prefix was found, we put the node in the default namespace. (LayoutTests/fast/dom/set-innnerHTML also check for namespace and I had a hard time with it)
> Daniel says there are many things that are contextual, > besides namespaces - are there any others that we should care about?
I would say that others are internal to libxml (like tree structure or parser state to detect if the tree is well formed). I have tried to match libxml behaviour to avoid any regression here (preferring portable API whenever I could).
> 483 // XML_PARSE_NODICT: default dictionnary option. > > An extra "n" in "dictionary" here. >
Oups... There is also a memory leak as we never free m_context.
Alexey Proskuryakov
Comment 20
2008-04-14 04:22:01 PDT
Comment on
attachment 20493
[details]
First version: switch DocumentFragment parsing to xmlParseContent
> IMHO, I think you are wrong here.
Indeed, I didn't realize that this logic was in WebCore, not libxml. r=me.
Julien Chaffraix
Comment 21
2008-04-14 04:53:11 PDT
Committed in
r31860
.
Charlie
Comment 22
2008-04-14 08:13:43 PDT
Thanks everyone for the quick fix - I'll give it a spin in one of this week's nightly builds.
Lucas Forschler
Comment 23
2019-02-06 09:02:36 PST
Mass moving XML DOM bugs to the "DOM" Component.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug