Bug 5039 - XSLT text output doesn't work
Summary: XSLT text output doesn't work
Status: VERIFIED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: XML (show other bugs)
Version: 420+
Hardware: Mac OS X 10.4
: P2 Normal
Assignee: Dave Hyatt
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-09-18 05:00 PDT by Alexey Proskuryakov
Modified: 2005-09-24 04:44 PDT (History)
0 users

See Also:


Attachments
test case (1.20 KB, application/octet-stream)
2005-09-18 05:03 PDT, Alexey Proskuryakov
no flags Details
proposed patch (1.42 KB, patch)
2005-09-18 05:04 PDT, Alexey Proskuryakov
darin: review-
Details | Formatted Diff | Diff
patch that uses an empty title (1.38 KB, patch)
2005-09-19 21:29 PDT, Alexey Proskuryakov
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alexey Proskuryakov 2005-09-18 05:00:24 PDT
This patch fixes several issues with xsl:output method="text". I couldn't find any circumstances under 
which it would work previously, thus a vague summary.

What is addressed:
1) Added an XML text declaration.
2) Corrected DTD URI.
3) Added a HEAD with a TITLE.
4) Got rid of a CDATA section. Safari wouldn't display it; besides, it wasn't used correctly (the output could 
contain "]]>" itself). Instead, replaces < and & with entities.
5) Removed a newline after <pre>.
Comment 1 Alexey Proskuryakov 2005-09-18 05:03:26 PDT
Created attachment 3934 [details]
test case

Unlike the patch, this test relies on bug 3418 being fixed (it uses non-ASCII
text).
Comment 2 Alexey Proskuryakov 2005-09-18 05:04:08 PDT
Created attachment 3935 [details]
proposed patch
Comment 3 Eric Seidel (no email) 2005-09-18 12:42:36 PDT
Comment on attachment 3935 [details]
proposed patch

I agree w/ the doctype fix.  I'm not entirely sure as why "text output"
wouldn't just get displayed as text.. instead of parsed as xhtml (hyatt might
know).

Also, why only replace amp and lt?  I would expect you would at least need to
replace gt as well.  And probably all of the predefined entities:
http://www.w3.org/TR/REC-xml/#sec-predefined-ent

I also think it makes sense to have a newline after <pre> unless it gets
displayed as part of the text.

Finally, it looks like you might have used tabs instead of spaces (or maybe the
whole method is tabs and you used spaces, which would be correct).

For the lack of gt replacement, I'm going to review: -.  Hyatt and darin may
have further opinions as they are more familiar with the xslt code.
Comment 4 Alexey Proskuryakov 2005-09-18 13:27:49 PDT
(In reply to comment #3)
> Also, why only replace amp and lt?
If these two are replaced, nothing else can be mistakengly recognized as markup.

> I also think it makes sense to have a newline after <pre> unless it gets
> displayed as part of the text.
I removed it because it does get displayed (that's the primary function of <pre> :) )

Indentation also looks correct...

Resetting review? flag after discussing this with Eric on IRC.
Comment 5 Darin Adler 2005-09-19 15:49:16 PDT
Comment on attachment 3935 [details]
proposed patch

Is it really OK to put the title in without any escaping at all? Are we somehow
guaranteed that the title won't have the string "</title>" in it?

Otherwise, this looks fine to me. Feel free to move this back to review? if you
answer that question.
Comment 6 Alexey Proskuryakov 2005-09-19 21:29:45 PDT
Created attachment 3956 [details]
patch that uses an empty title

My idea was that XML document's title is its file name - if so, a closing tag
cannot be in the title (because it includes a path separator). However,
1) an opening tag can, which I missed;
2) I've just tested and saw that m_sourceDocument->title() comes empty in my
test - and that doesn't prevent Safari from displaying the file name in the
title.

So, I think that always setting an empty path is a safer solution than escaping
it - entities in the title will not look good, if a later change makes
m_sourceDocument->title() return the document's file name.
Comment 7 Darin Adler 2005-09-20 16:52:12 PDT
Comment on attachment 3956 [details]
patch that uses an empty title

I don't understand the argument for the empty <title> element. How is this
different from any other XML document without a title? Is putting an empty
<title> in really the best way to get the result we're after?
Comment 8 Alexey Proskuryakov 2005-09-20 21:24:08 PDT
(In reply to comment #7)
The document must have a HEAD with a TITLE in it, because it's XHTML-strict (W3C validator rejected it 
otherwise, and the spec says the same IIRC).

The original XML has no intrinsic title other than its file name, and Safari still displays the file name in the 
window title in my test case. On the other hand, putting the file name in TITLE would probably be not very 
nice - it's not a real title, but a Safari courtesy. Unless I'm missing something, it cannot be made any 
better :)
Comment 9 Darin Adler 2005-09-21 11:56:59 PDT
Comment on attachment 3956 [details]
patch that uses an empty title

OK. I think <title/> would have been slightly more elegant, but I understand.

r=me
Comment 10 Darin Adler 2005-09-24 03:03:46 PDT
The idea of just displaying text as text instead of making an XML document still seems like a good future 
possibility.