WebKit Bugzilla
RESOLVED FIXED
Bug 27262
Chromium: HTML exported isn't marked as being UTF-8
https://bugs.webkit.org/show_bug.cgi?id=27262
Avi Drissman
Reported
2009-07-14 08:13:17 PDT
When exporting HTML for the clipboard or drag/drop, the charset isn't indicated. The Windows clipboard format is explicitly documented as being UTF-8, and all Linux apps assume UTF-8. On the Mac, though, unless otherwise indicated, ISO/IEC 8859-1 is assumed, which is wrong.
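One way to picture the fix described above is to prefix the exported fragment with an explicit charset declaration before it reaches the pasteboard. The following is only a minimal sketch: the helper name `mark_html_as_utf8` and the exact `<meta>` tag are assumptions for illustration and are not taken from the attached patch.

```python
# Hypothetical sketch: the function name and the exact tag are assumptions,
# not the code from the attached patch.
UTF8_META = '<meta charset="utf-8">'


def mark_html_as_utf8(fragment: str) -> bytes:
    """Return the fragment as UTF-8 bytes with a charset declaration in front."""
    if not fragment.startswith(UTF8_META):
        fragment = UTF8_META + fragment
    return fragment.encode("utf-8")


exported = mark_html_as_utf8("<b>\u2018quoted\u2019</b>")
print(exported)
```

A consumer that honors the declaration decodes the bytes as UTF-8 instead of falling back to a legacy single-byte encoding.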
Attachments
Patch to mark clipboard HTML as UTF-8 (2.19 KB, patch), 2009-07-14 08:14 PDT, Avi Drissman; fishd: review+
Links to the bug now; no other changes (2.25 KB, patch), 2009-07-14 09:41 PDT, Avi Drissman; fishd: review-
New version; addresses jshin's comments (2.25 KB, patch), 2009-07-14 11:33 PDT, Avi Drissman; fishd: review+
Avi Drissman
Comment 1
2009-07-14 08:14:52 PDT
Created attachment 32713: Patch to mark clipboard HTML as UTF-8
This corresponds to http://codereview.chromium.org/149414
Darin Fisher (:fishd, Google)
Comment 2
2009-07-14 09:38:26 PDT
Comment on attachment 32713: Patch to mark clipboard HTML as UTF-8
> Index: WebCore/ChangeLog
...
> +2009-07-14 Avi Drissman <avi@chromium.org>
> +
> + Reviewed by NOBODY (OOPS!).
> +
> + Explicitly mark the HTML generated for the Mac as being UTF-8 encoded.
> + The Windows clipboard format is explicitly documented as being UTF-8,
> + and all Linux apps assume UTF-8. On the Mac, though, unless otherwise
> + indicated, ISO/IEC 8859-1 is assumed, which is wrong.
nit: Your ChangeLog should include a link to this bug. Otherwise, R=me
Avi Drissman
Comment 3
2009-07-14 09:41:28 PDT
Created attachment 32718: Links to the bug now; no other changes
Jungshik Shin
Comment 4
2009-07-14 10:23:50 PDT
nit: a small change to the comment and the bug description is necessary. Judging from the way it's broken without your patch, what's assumed is neither ISO-8859-1 nor MacRoman but windows-1252 (it's a bit odd to see that on Mac OS X :-)). For instance, U+2018 (Left Single Quotation Mark), whose UTF-8 representation is "0xE2, 0x80, 0x98", is converted to "U+00E2, U+20AC, U+02DC". If it were interpreted as ISO-8859-1, it would be converted to "U+00E2, U+0080, U+0098".
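The diagnosis above can be checked with a short sketch using Python's standard codecs. The helper `code_points` is a hypothetical name introduced here for illustration; `cp1252` and `latin-1` are Python's codec names for windows-1252 and ISO-8859-1.

```python
# Decode the UTF-8 bytes of U+2018 with two legacy single-byte encodings
# and compare the resulting code points.
utf8_bytes = "\u2018".encode("utf-8")  # the bytes 0xE2, 0x80, 0x98


def code_points(text: str) -> list[str]:
    """Format each character of the text as U+XXXX."""
    return [f"U+{ord(ch):04X}" for ch in text]


as_windows_1252 = code_points(utf8_bytes.decode("cp1252"))
as_iso_8859_1 = code_points(utf8_bytes.decode("latin-1"))

print(as_windows_1252)  # ['U+00E2', 'U+20AC', 'U+02DC']
print(as_iso_8859_1)    # ['U+00E2', 'U+0080', 'U+0098']
```

The windows-1252 result matches the breakage reported in this comment, while the ISO-8859-1 result does not.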
Darin Fisher (:fishd, Google)
Comment 5
2009-07-14 10:48:44 PDT
Comment on attachment 32718: Links to the bug now; no other changes
r- for revised changelog per feedback from jshin. i'll commit the next patch.
-darin
Avi Drissman
Comment 6
2009-07-14 11:33:52 PDT
Created attachment 32724: New version; addresses jshin's comments
Darin Fisher (:fishd, Google)
Comment 7
2009-07-14 15:51:14 PDT
Landed as: http://trac.webkit.org/changeset/45878
(The patch didn't apply cleanly... hand-editing in the ChangeLog portion of the diff?)
Avi Drissman
Comment 8
2009-07-14 15:54:42 PDT
(In reply to comment #7)
> (The patch didn't apply cleanly... hand-editing in the ChangeLog portion of the diff?)
Yes, that's the precise reason. Bad me; I'll not do that next time.