Bug 201295 - [Gtk] Inline SVG confuses local file parsing
Status: RESOLVED MOVED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK
Version: Other
Hardware: Unspecified Linux
Importance: P2 Major
Assignee: Nobody
Reported: 2019-08-29 02:30 PDT by Alexandre Franke
Modified: 2019-09-11 08:00 PDT

Description Alexandre Franke 2019-08-29 02:30:23 PDT
When I open the following HTML document in Web (Epiphany):
```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Test animation</title>
    <link rel="stylesheet" href="svg.css" />
  </head>
  <body>

    <svg xmlns="http://www.w3.org/2000/svg" version="1.1"
                                            viewBox="0 0 300 300" width="500" height="500" id="starter_svg">
      <text id="username" x="150" y="150" dx="-15" alignment-baseline="middle" text-anchor="end">alice</text>
      <text id="at-separator" class="animated" x="150" y="150" alignment-baseline="middle" text-anchor="middle">@</text>
      <text id="domain" x="150" y="150" dx="15" alignment-baseline="middle" text-anchor="start">example.com</text>
    </svg>
  </body>
</html>
```

I get the following error:
```
This page contains the following errors:

error on line 7 at column 10: Opening and ending tag mismatch: meta line 0 and head
Below is a rendering of the page up to the first error.
```

and the text “Test animation” is displayed below it.

If I remove the inline SVG, everything works fine.

Originally reported at https://gitlab.gnome.org/GNOME/epiphany/issues/886
Comment 1 Alexey Proskuryakov 2019-08-30 09:27:56 PDT
This is an XML parser error, meaning that this HTML document was parsed as XML. What MIME type is it being served with? You can check using curl with -i option.
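
To illustrate the diagnosis above, here is a minimal Python sketch (standard library only, not WebKit code) that feeds the head of the reported document to a strict XML parser and to a lenient HTML parser. The names `DOC`, `parse_as_xml`, and `TitleCollector` are made up for this example, and the document is shortened to the part that triggers the error. The unclosed `<meta>` is fine in HTML but ill-formed in XML, so the XML parser should reject the closing `</head>` on line 7, as in the report.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Head of the reported document. <meta> is never closed, which HTML allows
# (it is a void element) but XML does not.
DOC = """<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Test animation</title>
    <link rel="stylesheet" href="svg.css" />
  </head>
  <body></body>
</html>
"""


def parse_as_xml(text):
    """Return None if text is well-formed XML, else the ParseError raised."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return err
    return None


class TitleCollector(HTMLParser):
    """Collect the text inside <title> using the lenient HTML parser."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


xml_error = parse_as_xml(DOC)
collector = TitleCollector()
collector.feed(DOC)

print("XML parse error:", xml_error)
print("HTML parser saw title:", collector.title)
```

The same bytes are acceptable or fatal depending only on which parser the detected content type selects, which is why the type detection matters here.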
Comment 2 Alexandre Franke 2019-08-30 11:16:14 PDT
This is a local file (file:///) so I don’t think “being served with” makes much sense here.
Comment 3 Michael Catanzaro 2019-08-30 13:18:00 PDT
GIO detects it as an SVG file instead of an HTML file, even when saved with the filename test.html. I suspect a shared-mime-info bug.

Alexey, could you check if it works in Safari?
Comment 4 Michael Catanzaro 2019-08-30 13:21:35 PDT
Specifically, I suspect the magic priorities for the SVG and HTML mime types will need to be adjusted in shared-mime-info to make this work. If you have time, please check with Bastien (hadess) and then report it to: https://gitlab.freedesktop.org/xdg/shared-mime-info/issues

Sorry for all the bug tracker ping-pong. :)
Comment 5 Alexey Proskuryakov 2019-09-06 09:39:56 PDT
> This is a local file (file:///) so I don’t think “being served with” makes much sense here.

Thank you for the clarification.

Making the file valid HTML would almost certainly fix the symptom (i.e. remove the spurious slash at the end of <link>, xmlns and version attributes on <svg>, and anything else the HTML5 validator points out).

> Alexey, could you check if it works in Safari?

Our stack doesn't do any content sniffing, the extension is all that matters for local files. Given how many malformed XML documents exist out there, only working because they are parsed as HTML, I suggest dropping content sniffing in other ports as well.
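
As a rough analogy for the extension-only approach described above, Python's standard `mimetypes` module maps a filename to a type without ever opening the file. This is only a sketch of the idea; the helper name `type_from_extension` is hypothetical, and it says nothing about how Safari's stack is actually implemented.

```python
import mimetypes


def type_from_extension(filename):
    """Map a filename to a MIME type from its extension alone.

    The file is never opened, so its contents cannot influence the result.
    """
    mime, _encoding = mimetypes.guess_type(filename)
    return mime


print(type_from_extension("test.html"))
print(type_from_extension("drawing.svg"))
```

Under this scheme an inline `<svg>` inside `test.html` cannot change the detected type, because nothing inspects the bytes.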
Comment 6 Michael Catanzaro 2019-09-06 11:12:20 PDT
(In reply to Alexey Proskuryakov from comment #5)
> > This is a local file (file:///) so I don’t think “being served with” makes much sense here.
> 
> Thank you for the clarification.
> 
> Making the file valid HTML would almost certainly fix the symptom (i.e.
> remove the spurious slash at the end of <link>, xmlns and version attributes
> on <svg>, and anything else the HTML5 validator points out).

I tried those tweaks but didn't notice any difference.

The content sniffer isn't nearly that sophisticated anyway. I think it's rather simpler: the content sniffer, shared-mime-info, sees <svg> in the file, and that wins because it has a higher "magic priority", which is used to determine the content type. It won't be fixed without a bug report to https://gitlab.freedesktop.org/xdg/shared-mime-info/issues
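
The priority mechanism described above can be sketched as a toy sniffer in Python. Everything here is an assumption made for illustration: the rule patterns, the priority numbers, the 512-byte window, and the names `MagicRule`, `RULES`, `ADJUSTED_RULES`, and `sniff` are hypothetical and are not taken from the real shared-mime-info database, which uses more detailed offset-based matches.

```python
from typing import NamedTuple, Optional


class MagicRule(NamedTuple):
    mime: str
    priority: int   # higher wins; the numbers below are illustrative only
    needle: bytes   # byte string searched for in the sniffed prefix


# Hypothetical rules, NOT the real shared-mime-info entries.
RULES = [
    MagicRule("image/svg+xml", 80, b"<svg"),
    MagicRule("text/html", 50, b"<html"),
]

# One hypothetical adjustment; the actual upstream change is not described here.
ADJUSTED_RULES = [
    MagicRule("image/svg+xml", 40, b"<svg"),
    MagicRule("text/html", 50, b"<html"),
]


def sniff(data: bytes, rules=RULES, window: int = 512) -> Optional[str]:
    """Return the MIME type of the highest-priority rule matching the prefix."""
    head = data[:window].lower()
    matches = [rule for rule in rules if rule.needle in head]
    if not matches:
        return None
    return max(matches, key=lambda rule: rule.priority).mime


HTML_ONLY = b'<!DOCTYPE html>\n<html lang="en"><head></head><body></body></html>\n'
HTML_WITH_SVG = HTML_ONLY.replace(
    b"<body>", b'<body><svg viewBox="0 0 300 300"></svg>'
)

print(sniff(HTML_ONLY))
print(sniff(HTML_WITH_SVG))
print(sniff(HTML_WITH_SVG, rules=ADJUSTED_RULES))
```

With the first rule set, both patterns match the document containing inline SVG and the higher-priority SVG rule wins. That is the behaviour suspected in this bug. Cleaning up the HTML markup does not help, because the `<svg` substring is still present.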

Normally I'd close the issue here as not WebKit's fault, but:

> > Alexey, could you check if it works in Safari?
> 
> Our stack doesn't do any content sniffing, the extension is all that matters
> for local files. Given how many malformed XML documents exist out there,
> only working because they are parsed as HTML, I suggest dropping content
> sniffing in other ports as well.

OK, soup ports should stop content sniffing then.

It looks like I may have enabled use of SoupContentSniffer accidentally in r238805. Per the SoupSession documentation: "Additionally, sessions using the plain SoupSession class (rather than one of its deprecated subtypes) have a SoupContentDecoder by default." I'm not sure what actual impact that has, though, because WebKit doesn't intentionally use that functionality anywhere. And we've definitely done some form of content type sniffing long before this, because I remember a much older bug where shared-mime-info misidentified XHTML vs. HTML, causing breakage. I'm not sure where WebKit actually uses shared-mime-info, though. It's probably done via GIO somewhere.

The SoupContentSniffer documentation also says: "SoupContentSniffer provides support for HTML5-style response body content sniffing." I'm not sure what it means by that ("HTML5-style"), but it's clear somebody thought about it and decided this was desirable to do.
Comment 7 Michael Catanzaro 2019-09-06 11:15:28 PDT
(In reply to Michael Catanzaro from comment #6)
> It looks like I may have enabled use of SoupContentSniffer accidentally in
> r238805.

Well, I failed to read: it's explicitly added right there in that diff. OK then.

It's probably a (trivial) mistake that I didn't remove that line, because I was trying to keep only those lines that were non-default. Since SoupContentSniffer is added by default and that is an API guarantee, WebKit adding it manually is now redundant.
Comment 8 Bastien Nocera 2019-09-11 04:24:21 PDT
(In reply to Alexandre Franke from comment #0)

This will now get matched as a text/html file after:
https://gitlab.freedesktop.org/xdg/shared-mime-info/merge_requests/41
gets merged.
Comment 9 Michael Catanzaro 2019-09-11 08:00:21 PDT
That should resolve this. Thanks Bastien!