Bug 155836
| Summary: | default-charset not honored by webkit_web_view_load_html | | |
|---|---|---|---|
| Product: | WebKit | Reporter: | Jérémy Lal <kapouer> |
| Component: | WebKitGTK | Assignee: | Nobody <webkit-unassigned> |
| Status: | RESOLVED FIXED | | |
| Severity: | Normal | CC: | bugs-noreply, mcatanzaro |
| Priority: | P2 | | |
| Version: | Other | | |
| Hardware: | PC | | |
| OS: | Linux | | |
Jérémy Lal
Hi,
I'm seeing this since WebKitGTK 2.12.0.
1) Set the default-charset setting to utf-8 on the WebView
2) Using webkit_web_view_load_html, load an HTML string:
<!doctype html>
<html>
<head>
<script>console.log(document.characterSet);</script>
</head>
</html>
-> prints "iso-8859-1", expected result was "utf-8".
webkitgtk 2.10 prints the expected result.
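The two steps above can be sketched as a small GTK program. This is a minimal sketch, not the reporter's actual test code: it assumes the GTK 3 build of WebKitGTK (the webkit2gtk-4.0 pkg-config package), the `TEST_HTML` constant is a name introduced here, and enabling `enable-write-console-messages-to-stdout` is an addition so that the `console.log` output shows up in the terminal.

```c
#include <gtk/gtk.h>
#include <webkit2/webkit2.h>

/* The test document from the report (TEST_HTML is a name chosen here). */
static const char TEST_HTML[] =
    "<!doctype html>\n"
    "<html>\n"
    "<head>\n"
    "<script>console.log(document.characterSet);</script>\n"
    "</head>\n"
    "</html>\n";

int main(int argc, char **argv)
{
    gtk_init(&argc, &argv);

    GtkWidget *window = gtk_window_new(GTK_WINDOW_TOPLEVEL);
    g_signal_connect(window, "destroy", G_CALLBACK(gtk_main_quit), NULL);

    WebKitWebView *view = WEBKIT_WEB_VIEW(webkit_web_view_new());
    WebKitSettings *settings = webkit_web_view_get_settings(view);

    /* Step 1: set default-charset on the WebView's settings. */
    webkit_settings_set_default_charset(settings, "utf-8");
    /* Assumption of this sketch: forward console.log output to stdout. */
    webkit_settings_set_enable_write_console_messages_to_stdout(settings, TRUE);

    /* Step 2: load the HTML string; no base URI is given. */
    webkit_web_view_load_html(view, TEST_HTML, NULL);

    gtk_container_add(GTK_CONTAINER(window), GTK_WIDGET(view));
    gtk_widget_show_all(window);
    gtk_main();
    return 0;
}
```

It can be built against the flags reported by `pkg-config --cflags --libs webkit2gtk-4.0`. The charset logged to the terminal is the value this report is about.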
It's not only the value of that property that is wrong; the same charset is
also used by default for decoding JavaScript files (this can be seen by
putting `console.log("é");` in a test.js file and loading it).
When using webkit_web_view_load, this problem does not happen.
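To see why the decoding charset matters for a script such as `console.log("é");`, consider the bytes involved: in UTF-8, "é" is the two bytes 0xC3 0xA9, and a decoder that treats them as ISO-8859-1 produces two characters, "Ã©". The plain C sketch below only illustrates that byte-level effect. It is not WebKit's decoder, and `latin1_to_utf8` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: treat every input byte as one ISO-8859-1 character
 * and write that character to dst encoded as UTF-8.
 * dst must have room for at least 2 * len + 1 bytes.
 * Returns the number of bytes written, not counting the final NUL. */
size_t latin1_to_utf8(const unsigned char *src, size_t len, char *dst)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            /* ASCII range: one byte, unchanged. */
            dst[out++] = (char)c;
        } else {
            /* U+0080..U+00FF: two-byte UTF-8 sequence. */
            dst[out++] = (char)(0xC0 | (c >> 6));
            dst[out++] = (char)(0x80 | (c & 0x3F));
        }
    }
    dst[out] = '\0';
    return out;
}
```

Calling it on the two bytes 0xC3 0xA9 writes the four bytes 0xC3 0x83 0xC2 0xA9, which a UTF-8 terminal displays as "Ã©" instead of "é".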
Jérémy Lal
As a workaround I can still call
webkit_web_view_set_custom_charset.
Jérémy Lal
Also, I don't understand this, and it seems linked to the issue:
#if defined(LANGUAGE_OBJECTIVE_C) && LANGUAGE_OBJECTIVE_C || defined(LANGUAGE_GOBJECT) && LANGUAGE_GOBJECT
[TreatReturnedNullStringAs=Undefined, TreatNullAs=NullString] attribute DOMString charset;
#else
But if I set
document.charset = <a valid charset other than the current one>
the value doesn't change, whereas it did in version 2.10.4.
However, the code is compiled with LANGUAGE_GOBJECT=1; see the build logs at
https://buildd.debian.org/status/package.php?p=webkit2gtk
Jérémy Lal
To sum up the problem:
- webkit_web_view_load_html has not followed the 'default-charset' setting since at least 2.10, and probably 2.8 too
- I was working around that problem by setting `document.charset` in a user script
- that is no longer possible in version 2.12, a behavior that matches the specification.
Jérémy Lal
The workaround I mentioned in comment #1 is very difficult to use because it reloads the document (which is somewhat weird IMO).
A better, working way is to use webkit_web_view_load_bytes, whose encoding parameter sets the document encoding accordingly.
It feels a bit weird, though.
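Both workarounds mentioned in this thread can be sketched as follows. This is an illustration under assumptions, not code from the report: the helper names `force_charset_with_reload` and `load_html_with_encoding` are hypothetical, the content is assumed to be a NUL-terminated string already encoded in the charset that is passed, and no base URI is used.

```c
#include <string.h>
#include <glib.h>
#include <webkit2/webkit2.h>

/* Workaround from comment #1: override the charset on the view.
 * As noted in this thread, doing so reloads the current document. */
void force_charset_with_reload(WebKitWebView *view, const char *charset)
{
    webkit_web_view_set_custom_charset(view, charset);
}

/* Workaround using webkit_web_view_load_bytes: pass the encoding
 * explicitly instead of relying on the default-charset setting. */
void load_html_with_encoding(WebKitWebView *view,
                             const char *html,
                             const char *encoding)
{
    /* g_bytes_new() copies the string into a new GBytes. */
    GBytes *bytes = g_bytes_new(html, strlen(html));

    webkit_web_view_load_bytes(view, bytes, "text/html", encoding, NULL);

    /* Drop our reference; this sketch assumes the view holds its own
     * reference for as long as it needs the data. */
    g_bytes_unref(bytes);
}
```

For example, `load_html_with_encoding(view, html, "utf-8")` would stand in for `webkit_web_view_load_html(view, html, NULL)` on an affected version.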
Michael Catanzaro
I just tested this with Epiphany 3.22.3 and WebKitGTK+ 2.14.2, and it seems to work properly. I don't doubt that the issue was valid at the time you filed this bug, but I guess it's been fixed. Are you still able to reproduce it?
Jérémy Lal
Indeed, if I replace webkit_web_view_load_bytes with webkit_web_view_load_html, default-charset is still correctly applied.
It's been fixed sometime this year.