<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>17353</bug_id>
          
          <creation_ts>2008-02-13 17:20:41 -0800</creation_ts>
          <short_desc>XMLTokenizer installs global libxml2 callbacks that can break client applications</short_desc>
          <delta_ts>2016-02-01 15:43:52 -0800</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>XML</component>
          <version>528+ (Nightly build)</version>
          <rep_platform>All</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          <see_also>https://bugs.webkit.org/show_bug.cgi?id=153683</see_also>
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords>Gtk</keywords>
          <priority>P1</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Alp Toker">alp</reporter>
          <assigned_to name="Nobody">webkit-unassigned</assigned_to>
          <cc>ap</cc>
    
    <cc>mrowe</cc>
    
    <cc>sam</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>70586</commentid>
    <comment_count>0</comment_count>
    <who name="Alp Toker">alp</who>
    <bug_when>2008-02-13 17:20:41 -0800</bug_when>
    <thetext>The developers of Yelp (the GNOME integrated help system) found that the XML parsing functionality of their application started to fail after they replaced Mozilla with a WebKit GTK+ WebView widget.

Their WebView code is completely separate from the XML parsing code (which uses libxml2), but both run in the same process.

I did some debugging and tracked down the issue. XMLTokenizer.cpp assigns global match/open/read/write/close functions. By modifying the behaviour of these functions, WebKit cripples XML features throughout the embedding application:

static xmlParserCtxtPtr createStringParser(xmlSAXHandlerPtr handlers, void* userData)
{
    static bool didInit = false;
    if (!didInit) {
        xmlInitParser();
        xmlRegisterInputCallbacks(matchFunc, openFunc, readFunc, closeFunc);
        xmlRegisterOutputCallbacks(matchFunc, openFunc, writeFunc, closeFunc);
        didInit = true;
    }

...

I&apos;m not familiar with XMLTokenizer.cpp so not sure how to proceed. Any thoughts?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>70597</commentid>
    <comment_count>1</comment_count>
    <who name="Mark Rowe (bdash)">mrowe</who>
    <bug_when>2008-02-13 22:15:30 -0800</bug_when>
    <thetext>The IO callbacks registered by xmlRegisterInputCallbacks/xmlRegisterOutputCallbacks go onto a global stack of callbacks that are asked in turn whether they can handle the IO for a given URL.  The fact that these can only be set globally looks to be a limitation of libxml2.  The fact that the callbacks are only provided with the URL when asked whether they can handle the load makes it challenging to have WebCore&apos;s callbacks limit themselves to handling IO for WebCore-related resources.  Not using WebCore&apos;s callbacks when globalDocLoader is null may reduce the problem a little, but is unlikely to be safe in a multithreaded application.

What is the failure mode that is occurring in applications?  My reading of the code suggests that documents may be returned as zero-length if there is no associated WebCore loader, which I would imagine to be the case for uses of libxml2 outside WebCore.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>70648</commentid>
    <comment_count>2</comment_count>
    <who name="Alp Toker">alp</who>
    <bug_when>2008-02-14 04:59:01 -0800</bug_when>
    <thetext>Original bug report:



Hi,

(Feel free to forward this to anyone you feel like / might be more
appropriate)

As you may have seen on planet.gnome.org [1], I&apos;m trying to get yelp
running with webkit.  I&apos;m hitting some issues working with libxml /
libxslt and webkit_web_view_load_string which I&apos;m using to load local
files.

The problem manifests when I set the mime type to &quot;application/xhtml+xml&quot;,
which is required for all our documents.

The problem is that when I set the mime type to application/xhtml+xml,
any subsequent calls to xslt will fail as libxml thinks the document is
empty.  I&apos;m not entirely certain why it thinks this, so I thought I&apos;d
get some expert opinion  ;) 

I&apos;ve produced a test program, based on the example browser and which can
be found at:
http://www.gnome.org/~dscorgie/main.c

Vague instructions.  Compile using:
gcc -o main main.c `pkg-config --cflags --libs WebKitGtk libxslt`

then run with:
./main text/html
to see the program working.  When you do this, click the link and press
back.  You should see output in the terminal like:
ss1: 0x807d690
ss2: 0x82261e8
which means libxslt has successfully parsed the requested stylesheet.

The output is two pointers to stylesheets: the first (ss1) is generated just before running 
the web_view and the second (ss2) is generated in the &quot;go_back&quot; callback (for lack of a better place).

To change the mime type to xhtml, you can omit the argument:
./main
which should produce the dreaded output (after clicking the link and pressing back):
ss1: 0x807d690
/usr/share/yelp/xslt/yelp-common.xsl:1: parser error : Document is empty

^
/usr/share/yelp/xslt/yelp-common.xsl:1: parser error : Start tag expected, &apos;&lt;&apos; not found

^
error
xsltParseStylesheetFile : cannot parse /usr/share/yelp/xslt/yelp-common.xsl
ss2: (nil)

which means the program now thinks /usr/share/yelp/xslt/yelp-common.xsl is empty.

I feel I&apos;m doing something stupid, but have been staring at this (and tracing through libxml, 
libxslt and yelp) for 2 weeks and still can&apos;t see it.

Oh, I should also mention you need yelp installed on your system (or change the example to point to 
a different stylesheet in both places).  Also, this is with nightly builds r29907 and r30123 (the latest).

Thanks
Don

[1] progress can be tracked in the bug:
http://bugzilla.gnome.org/show_bug.cgi?id=512827
</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>70692</commentid>
    <comment_count>3</comment_count>
      <attachid>19125</attachid>
    <who name="Alp Toker">alp</who>
    <bug_when>2008-02-14 13:44:44 -0800</bug_when>
    <thetext>Created attachment 19125
Proposed fix

This fixes the reported breakage in applications.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>70693</commentid>
    <comment_count>4</comment_count>
      <attachid>19125</attachid>
    <who name="Darin Adler">darin</who>
    <bug_when>2008-02-14 13:50:24 -0800</bug_when>
    <thetext>Comment on attachment 19125
Proposed fix

r=me, with some reservations.

Doesn&apos;t this still leave us with the problem that other libxml2 clients could call xmlRegisterInputCallbacks and clobber our callbacks with their own? Is there a &quot;save and restore&quot; approach we could use instead?

Should we file a bug asking libxml2 to come up with a way of doing this that&apos;s not global?

+    // TODO: We should restore the original global error handler as well.

We use FIXME, not TODO, for these things.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>70694</commentid>
    <comment_count>5</comment_count>
    <who name="Alp Toker">alp</who>
    <bug_when>2008-02-14 13:51:49 -0800</bug_when>
    <thetext>(In reply to comment #3)
&gt; Created an attachment (id=19125)
&gt; Proposed fix
&gt; 
&gt; This fixes the reported breakage in applications.
&gt; 

The patch deals with all potential re-entrancy I can think of but any thoughts here are welcome.

It might be worth filing a follow-up bug to track/audit other issues like this in WebCore. The failure mode is very obscure and we&apos;re quite lucky this bug was reported to us.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>70695</commentid>
    <comment_count>6</comment_count>
    <who name="Alp Toker">alp</who>
    <bug_when>2008-02-14 13:57:37 -0800</bug_when>
    <thetext>(In reply to comment #4)
&gt; (From update of attachment 19125)
&gt; r=me, with some reservations.
&gt; 
&gt; Doesn&apos;t this still leave us with the problem that other libxml2 clients could
&gt; call xmlRegisterInputCallbacks and clobber our callbacks with their own? Is
&gt; there a &quot;save and restore&quot; approach we could use instead?

Yep, totally. It&apos;s not a complete fix. Luckily very few applications seem to use xmlRegisterInputCallbacks(), so this saves our skin for now.

I tried cooking up a save/restore form of this patch but it still didn&apos;t address the reverse scenario you described. Could you elaborate or maybe give a proof of concept?

&gt; 
&gt; Should we file a bug asking libxml2 to come up with a way of doing this that&apos;s
&gt; not global?

I think dv is aware that the globals are bad. It&apos;s possible they&apos;ve added a new way to do this and we haven&apos;t seen it yet.

We should file a bug or check up on the existing one (Eric believes a bug has already been filed).
</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>70713</commentid>
    <comment_count>7</comment_count>
      <attachid>19125</attachid>
    <who name="Alp Toker">alp</who>
    <bug_when>2008-02-14 16:33:29 -0800</bug_when>
    <thetext>Comment on attachment 19125
Proposed fix

Patch landed in r30236. Will keep the bug open at least for a few days to see how things pan out. Maybe we can come up with a more complete solution.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>70753</commentid>
    <comment_count>8</comment_count>
    <who name="Alp Toker">alp</who>
    <bug_when>2008-02-15 05:18:23 -0800</bug_when>
    <thetext>Guess we can close this.</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="1"
              isprivate="0"
          >
            <attachid>19125</attachid>
            <date>2008-02-14 13:44:44 -0800</date>
            <delta_ts>2008-02-14 16:33:29 -0800</delta_ts>
            <desc>Proposed fix</desc>
            <filename>thr4.patch</filename>
            <type>text/plain</type>
            <size>3485</size>
            <attacher name="Alp Toker">alp</attacher>
            
              <data encoding="base64">ZGlmZiAtLWdpdCBhL1dlYkNvcmUvQ2hhbmdlTG9nIGIvV2ViQ29yZS9DaGFuZ2VMb2cKaW5kZXgg
MzJmZGI5Mi4uMzZlZDVmMyAxMDA2NDQKLS0tIGEvV2ViQ29yZS9DaGFuZ2VMb2cKKysrIGIvV2Vi
Q29yZS9DaGFuZ2VMb2cKQEAgLTEsMyArMSwyNSBAQAorMjAwOC0wMi0xNCAgQWxwIFRva2VyICA8
YWxwQGF0b2tlci5jb20+CisKKyAgICAgICAgUmV2aWV3ZWQgYnkgTk9CT0RZIChPT1BTISkuCisK
KyAgICAgICAgaHR0cDovL2J1Z3Mud2Via2l0Lm9yZy9zaG93X2J1Zy5jZ2k/aWQ9MTczNTMKKyAg
ICAgICAgWE1MVG9rZW5pemVyIGluc3RhbGxzIGdsb2JhbCBsaWJ4bWwyIGNhbGxiYWNrcyB0aGF0
IGNhbiBicmVhayBjbGllbnQgYXBwbGljYXRpb25zCisKKyAgICAgICAgUGF0Y2ggYnkgTWFyayBS
b3dlICh3aXRoIGEgZmV3IGNoYW5nZXMpLgorCisgICAgICAgIFRoZSB4bWxSZWdpc3RlcklucHV0
Q2FsbGJhY2tzL3htbFJlZ2lzdGVyT3V0cHV0Q2FsbGJhY2tzIGRvbmUgYXQKKyAgICAgICAgaW5p
dCBhcmUgZ2xvYmFsIHNvIHdlIG5lZWQgdG8gbWFrZSBzdXJlIHRoZXNlIGNhbGxiYWNrcyBvbmx5
IGdldCB1c2VkCisgICAgICAgIGJ5IFhNTFRva2VuaXplciBhbmQgbmV2ZXIgYnkgbGlieG1sMiBj
YWxscyBpbiB1c2VyIGFwcGxpY2F0aW9ucy4KKworICAgICAgICBUaGlzIHBhdGNoIG1vZGlmaWVz
IHRoZSBtYXRjaCBhbmQgb3BlbiBmdW5jdGlvbnMgdG8gb25seSBhcHBseSB3aGVuIHdlCisgICAg
ICAgIGFyZSBjZXJ0YWluIHRoZSBjYWxsZXIgaXMgWE1MVG9rZW5pemVyIGJ5IGNoZWNraW5nIGds
b2JhbERvY0xvYWRlciBhbmQKKyAgICAgICAgZW5zdXJpbmcgd2UncmUgb24gdGhlIGNvcnJlY3Qg
dGhyZWFkLgorCisgICAgICAgICogZG9tL1hNTFRva2VuaXplci5jcHA6CisgICAgICAgIChXZWJD
b3JlOjptYXRjaEZ1bmMpOgorICAgICAgICAoV2ViQ29yZTo6b3BlbkZ1bmMpOgorICAgICAgICAo
V2ViQ29yZTo6Y3JlYXRlU3RyaW5nUGFyc2VyKToKKwogMjAwOC0wMi0xMyAgSnVzdGluIEdhcmNp
YSAgPGp1c3Rpbi5nYXJjaWFAYXBwbGUuY29tPgogCiAgICAgICAgIFJldmlld2VkIGJ5IE9saXZl
ciBIdW50LgpkaWZmIC0tZ2l0IGEvV2ViQ29yZS9kb20vWE1MVG9rZW5pemVyLmNwcCBiL1dlYkNv
cmUvZG9tL1hNTFRva2VuaXplci5jcHAKaW5kZXggN2JhNDI0Yy4uYTNiMzRmMSAxMDA2NDQKLS0t
IGEvV2ViQ29yZS9kb20vWE1MVG9rZW5pemVyLmNwcAorKysgYi9XZWJDb3JlL2RvbS9YTUxUb2tl
bml6ZXIuY3BwCkBAIC00Nyw2ICs0Nyw3IEBACiAjaW5jbHVkZSAiUmVzb3VyY2VIYW5kbGUuaCIK
ICNpbmNsdWRlICJSZXNvdXJjZVJlcXVlc3QuaCIKICNpbmNsdWRlICJSZXNvdXJjZVJlc3BvbnNl
LmgiCisjaW5jbHVkZSAiVGhyZWFkaW5nLmgiCiAjaWZuZGVmIFVTRV9RWE1MU1RSRUFNCiAjaW5j
bHVkZSA8bGlieG1sL3BhcnNlci5oPgogI2luY2x1ZGUgPGxpYnhtbC9wYXJzZXJJbnRlcm5hbHMu
aD4KQEAgLTM0MCwxNCArMzQxLDE2IEBAIHB1YmxpYzoKIC8vIC0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tCiAKIHN0YXRpYyBpbnQgZ2xvYmFsRGVzY3JpcHRvciA9IDA7CitzdGF0aWMg
RG9jTG9hZGVyKiBnbG9iYWxEb2NMb2FkZXIgPSAwOworc3RhdGljIFRocmVhZElkZW50aWZpZXIg
bGlieG1sTG9hZGVyVGhyZWFkID0gMDsKIAogc3RhdGljIGludCBtYXRjaEZ1bmMoY29uc3QgY2hh
ciogdXJpKQogewotICAgIHJldHVybiAxOyAvLyBNYXRjaCBldmVyeXRoaW5nLgorICAgIC8vIE9u
bHkgbWF0Y2ggbG9hZHMgaW5pdGlhdGVkIGR1ZSB0byB1c2VzIG9mIGxpYnhtbDIgZnJvbSB3aXRo
aW4gWE1MVG9rZW5pemVyIHRvIGF2b2lkCisgICAgLy8gaW50ZXJmZXJpbmcgd2l0aCBjbGllbnQg
YXBwbGljYXRpb25zIHRoYXQgYWxzbyB1c2UgbGlieG1sMi4gIGh0dHA6Ly9idWdzLndlYmtpdC5v
cmcvc2hvd19idWcuY2dpP2lkPTE3MzUzCisgICAgcmV0dXJuIGdsb2JhbERvY0xvYWRlciAmJiBj
dXJyZW50VGhyZWFkKCkgPT0gbGlieG1sTG9hZGVyVGhyZWFkOwogfQogCi1zdGF0aWMgRG9jTG9h
ZGVyKiBnbG9iYWxEb2NMb2FkZXIgPSAwOwotCiBjbGFzcyBPZmZzZXRCdWZmZXIgewogcHVibGlj
OgogICAgIE9mZnNldEJ1ZmZlcihjb25zdCBWZWN0b3I8Y2hhcj4mIGIpIDogbV9idWZmZXIoYiks
IG1fY3VycmVudE9mZnNldCgwKSB7IH0KQEAgLTM3NywxNSArMzgwLDI0IEBAIHN0YXRpYyBib29s
IHNob3VsZEFsbG93RXh0ZXJuYWxMb2FkKGNvbnN0IGNoYXIqIGluVVJJKQogfQogc3RhdGljIHZv
aWQqIG9wZW5GdW5jKGNvbnN0IGNoYXIqIHVyaSkKIHsKLSAgICBpZiAoIWdsb2JhbERvY0xvYWRl
ciB8fCAhc2hvdWxkQWxsb3dFeHRlcm5hbExvYWQodXJpKSkKKyAgICBBU1NFUlQoZ2xvYmFsRG9j
TG9hZGVyKTsKKyAgICBBU1NFUlQoY3VycmVudFRocmVhZCgpID09IGxpYnhtbExvYWRlclRocmVh
ZCk7CisKKyAgICBpZiAoIXNob3VsZEFsbG93RXh0ZXJuYWxMb2FkKHVyaSkpCiAgICAgICAgIHJl
dHVybiAmZ2xvYmFsRGVzY3JpcHRvcjsKIAogICAgIFJlc291cmNlRXJyb3IgZXJyb3I7CiAgICAg
UmVzb3VyY2VSZXNwb25zZSByZXNwb25zZTsKICAgICBWZWN0b3I8Y2hhcj4gZGF0YTsKICAgICAK
LSAgICBpZiAoZ2xvYmFsRG9jTG9hZGVyLT5mcmFtZSgpKSAKLSAgICAgICAgZ2xvYmFsRG9jTG9h
ZGVyLT5mcmFtZSgpLT5sb2FkZXIoKS0+bG9hZFJlc291cmNlU3luY2hyb25vdXNseShLVVJMKHVy
aSksIGVycm9yLCByZXNwb25zZSwgZGF0YSk7CisgICAgRG9jTG9hZGVyKiBkb2NMb2FkZXIgPSBn
bG9iYWxEb2NMb2FkZXI7CisgICAgZ2xvYmFsRG9jTG9hZGVyID0gMDsKKyAgICAvLyBUT0RPOiBX
ZSBzaG91bGQgcmVzdG9yZSB0aGUgb3JpZ2luYWwgZ2xvYmFsIGVycm9yIGhhbmRsZXIgYXMgd2Vs
bC4KKworICAgIGlmIChkb2NMb2FkZXItPmZyYW1lKCkpIAorICAgICAgICBkb2NMb2FkZXItPmZy
YW1lKCktPmxvYWRlcigpLT5sb2FkUmVzb3VyY2VTeW5jaHJvbm91c2x5KEtVUkwodXJpKSwgZXJy
b3IsIHJlc3BvbnNlLCBkYXRhKTsKKworICAgIGdsb2JhbERvY0xvYWRlciA9IGRvY0xvYWRlcjsK
IAogICAgIHJldHVybiBuZXcgT2Zmc2V0QnVmZmVyKGRhdGEpOwogfQpAQCAtNDMyLDYgKzQ0NCw3
IEBAIHN0YXRpYyB4bWxQYXJzZXJDdHh0UHRyIGNyZWF0ZVN0cmluZ1BhcnNlcih4bWxTQVhIYW5k
bGVyUHRyIGhhbmRsZXJzLCB2b2lkKiB1c2VyCiAgICAgICAgIHhtbEluaXRQYXJzZXIoKTsKICAg
ICAgICAgeG1sUmVnaXN0ZXJJbnB1dENhbGxiYWNrcyhtYXRjaEZ1bmMsIG9wZW5GdW5jLCByZWFk
RnVuYywgY2xvc2VGdW5jKTsKICAgICAgICAgeG1sUmVnaXN0ZXJPdXRwdXRDYWxsYmFja3MobWF0
Y2hGdW5jLCBvcGVuRnVuYywgd3JpdGVGdW5jLCBjbG9zZUZ1bmMpOworICAgICAgICBsaWJ4bWxM
b2FkZXJUaHJlYWQgPSBjdXJyZW50VGhyZWFkKCk7CiAgICAgICAgIGRpZEluaXQgPSB0cnVlOwog
ICAgIH0KIAo=
</data>

          </attachment>
      

    </bug>

</bugzilla>