Bug 3274 - document() not supported
Summary: document() not supported
Alias: None
Product: WebKit
Classification: Unclassified
Component: XML (show other bugs)
Version: 412
Hardware: All All
: P2 Normal
Assignee: Anders Carlsson
: 4055 (view as bug list)
Depends on:
Reported: 2005-06-04 20:31 PDT by Dave Hyatt
Modified: 2005-07-29 10:31 PDT (History)
1 user (show)

See Also:

Work in progress (3.78 KB, patch)
2005-06-22 15:19 PDT, Anders Carlsson
darin: review-
Details | Formatted Diff | Diff
Adress comments (6.71 KB, patch)
2005-06-23 12:52 PDT, Anders Carlsson
darin: review-
Details | Formatted Diff | Diff
Address more comments (6.88 KB, patch)
2005-06-24 08:05 PDT, Anders Carlsson
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Dave Hyatt 2005-06-04 20:31:27 PDT
The document() capability of XSLT is not supported.  XSLT's loader API (the one used to load stylesheets) 
needs to be used in order to synchronously load and hand back an XML document.  This code could be 
similar to (or possibly reuse) the sync code for XMLHttpRequests.
Comment 1 Anders Carlsson 2005-06-22 15:19:18 PDT
Created attachment 2555 [details]
Work in progress

Here's an initial stab at getting document() working. The things I'm concerned
with are:

1) The use of libxml: I see that other parts of WebCore use different parser
functions, for example XMLTokenizer::setTransformSource uses xmlReadMemory
instead, with a passed in QString. Should the QByteBuffer be encoded to a
QString (and what encoding should be used if so?) and then passed in.
Basically, what function should be used?

2) Error handling: If there's a parse error in an xml document included by
document(), the parse errors are written to stderr. I suppose the error sax
handlers need to be overridden in order to suppress those warnings.

3) Test cases: Not a big deal, but there are no real xsl test cases. Maciej
suggested that we add a fast/xsl subdir with xsl test cases. I haven't uploaded
any test cases here yet (I have a very basic one) since I wanted to get
feedback on the patch first.
Comment 2 Anders Carlsson 2005-06-22 15:20:02 PDT
Comment on attachment 2555 [details]
Work in progress

I feel kinda stupid requesting a review on this patch since I know it's not
gonna be accepted, but I would like to get feedback on my previous comment.
Comment 3 Darin Adler 2005-06-22 20:09:12 PDT
Comment on attachment 2555 [details]
Work in progress

Well, setting the review flag is indeed an effective way to get my attention!

Regarding your comments (2) and (3), sounds like you're on the right track. To
figure out what we should do in error cases, we first need to investigate what
other browsers do in such cases.

As far as issue (1) is concerned, the fact that other caller use xmlReadMemory
is due to the fact that we already have the data in memory, stored in a
QString. In the XMLTokenizer case we have already used our own decoding logic
for figuring out what character set to decode.

Whether libxml2's logic to determine what character set something is in is
acceptable or not, I don't really know. Since it doesn't get passed the
headers, I don't know how it could possibly handle that exactly right.

I think it's OK to start like this, and make a change only if we find a bug. If
we want to decode here in a way that matches the XMLTokenizer code path, it's
pretty simple:

Create a khtml::Decoder. Look at the headers and call setEncoding with the
encoding from the header, if any, passing EncodingFromHTTPHeader. See the logic
in KHTMLPart::write for other rules we might want to imitate (inheriting
encoding, user-specified encoding).

Call decode(), passing all the data. Call flush(), and append that QString to
the one you got from decode().

Then call xmlReadMemory the same way XMLTokenizer does.

I'm sure we could do a little refactoring to share code with KHTMLPart::write

But it's probably not worth doing any of that until we make a test case and see
how the other browsers handle character sets in this case.
Comment 4 Anders Carlsson 2005-06-23 12:52:36 PDT
Created attachment 2605 [details]
Adress comments

Looks like neither Gecko nor WinIE looks at the HTTP headers to see what the
encoding is, so I think we're fine using xmlReadMemory. I've also fixed the two
other issues.
Comment 5 Darin Adler 2005-06-23 23:10:58 PDT
Comment on attachment 2605 [details]
Adress comments

I'd prefer that the layout test explain what it's testing and what the expected
result is.

There's some formatting strangeness here -- space after xmlSetGenericErrorFunc
but before the parenthesis.

Also there are break statements after return statements, which seems
unnecessary to me.

Should fix typo: "somwhere".

Otherwise, looks good.
Comment 6 Anders Carlsson 2005-06-24 08:05:12 PDT
Created attachment 2628 [details]
Address more comments
Comment 7 Darin Adler 2005-06-24 08:33:17 PDT
Comment on attachment 2628 [details]
Address more comments

Comment 8 Joost de Valk (AlthA) 2005-07-03 07:25:58 PDT
Dave, could you verifiy this one? :)
Comment 9 Adele Peterson 2005-07-28 11:54:04 PDT
*** Bug 4055 has been marked as a duplicate of this bug. ***
Comment 10 Vicki Murley 2005-07-29 10:31:59 PDT
This bug is also in Radar as <rdar://4098268>