WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
20664
[GTK] File names are not always encodable in UTF-8
https://bugs.webkit.org/show_bug.cgi?id=20664
Summary
[GTK] File names are not always encodable in UTF-8
Marco Barisione
Reported
2008-09-05 07:14:15 PDT
On linux file names should be considered just raw strings as it's possible to have just junk non-encodable in UTF-8 inside them. This means that file names cannot be just stored as-is inside a String. Note that using g_filename_to_utf8 is not a possible solution as it's only meant to convert file names for displaying them in the UI. A possible solution would be to switch from String to CString for file names but this would mean losing the ability to use all the nice String methods on them and making the generic code and the code for other ports more complex. Another possible solution is to escape invalid characters before storing them in a String and unescaping them before passing the string to functions expecting a raw file name.
Attachments
Add String::toFilename and String::fromFilename
(8.13 KB, patch)
2008-09-05 07:25 PDT
,
Marco Barisione
darin
: review-
Details
Formatted Diff
Diff
Modified version of attachment #23170 to the String::to/fromFilename
(5.94 KB, patch)
2008-09-05 07:30 PDT
,
Marco Barisione
no flags
Details
Formatted Diff
Diff
Test
(1.83 KB, patch)
2008-09-05 07:38 PDT
,
Marco Barisione
no flags
Details
Formatted Diff
Diff
Add filenameToString and filenameFromString to FileSystemGtk.cpp
(7.19 KB, patch)
2008-09-11 08:49 PDT
,
Marco Barisione
alp
: review-
Details
Formatted Diff
Diff
Add filenameToString and filenameFromString to FileSystemGtk.cpp
(10.18 KB, patch)
2008-09-15 08:10 PDT
,
Marco Barisione
eric
: review-
Details
Formatted Diff
Diff
Add filenameToString, filenameFromString and filenameForDisplay to FileSystemGtk.cpp
(15.97 KB, patch)
2008-10-02 09:46 PDT
,
Marco Barisione
no flags
Details
Formatted Diff
Diff
Show Obsolete
(3)
View All
Add attachment
proposed patch, testcase, etc.
Marco Barisione
Comment 1
2008-09-05 07:25:52 PDT
Created
attachment 23191
[details]
Add String::toFilename and String::fromFilename 1. Add String::fromFilename static method (only for the GTK port) that creates an escape string from a file name. 2. Add String::toFilename to unescape a string create with String::fromFilename. 3. Add some debugging code to try to catch uses of String::toFilename on Strings not created through String::fromFilename. 4. Fix the functions in FileSystemGtk.cpp.
Marco Barisione
Comment 2
2008-09-05 07:30:58 PDT
Created
attachment 23192
[details]
Modified version of
attachment #23170
[details]
to the String::to/fromFilename
Marco Barisione
Comment 3
2008-09-05 07:38:40 PDT
Created
attachment 23193
[details]
Test This patch creates 2 CSS files the first named test-à.css (with the à encoded in latin-1, i.e. "\xe0") and test-[?].css (where with [?] I mean the Unicode replacement character, i.e. "\xef\xbf\xbd"), and then it tries to load test-à.css as the user-specified stylesheet. Try this with
https://bugs.webkit.org/attachment.cgi?id=23170
and WebKit will load test-[?].css because of the raw chars -> utf-8 transformation, so you will se a red background. Try this with
https://bugs.webkit.org/attachment.cgi?id=23192
and WebKit will load test-à.css, so you will se a green background.
Darin Adler
Comment 4
2008-09-05 11:31:26 PDT
Comment on
attachment 23191
[details]
Add String::toFilename and String::fromFilename Adding this capability to the String class seems clearly wrong. It's not correct layering -- we don't want the String class to know anything about the file system. We also want to limit platform-specific functions in the String class to the ones for converting to and from platform String types. You can accomplish this by building functions that operate on String objects rather than adding member functions to the String class itself. I believe this can be done entirely within the FileSystemGtk.cpp file without touching PlatformString.h or String.cpp.
Darin Adler
Comment 5
2008-09-05 11:33:32 PDT
One other question: This seems fine for cases where the filenames come from the functions in FileSystemGtk. But what effect will this have on all the code paths where the filename comes from content on the web?
Marco Barisione
Comment 6
2008-09-11 08:49:18 PDT
Created
attachment 23350
[details]
Add filenameToString and filenameFromString to FileSystemGtk.cpp (In reply to
comment #5
)
> One other question: This seems fine for cases where the filenames come from the > functions in FileSystemGtk. But what effect will this have on all the code > paths where the filename comes from content on the web?
Like? I cannot think to any case where we could get filenames causing problems, but probably I'm missing something.
Alp Toker
Comment 7
2008-09-13 09:18:59 PDT
Comment on
attachment 23350
[details]
Add filenameToString and filenameFromString to FileSystemGtk.cpp +#if PLATFORM(WIN) should be WIN_OS. g_uri_escape_string() ^ only available in GLib >= 2.16 For this change to make sense, we need to decide exactly what we expect the developer to pass to user-stylesheet-uri: a local filename (byte array) or a well formatted URI?
Marco Barisione
Comment 8
2008-09-15 08:10:29 PDT
Created
attachment 23441
[details]
Add filenameToString and filenameFromString to FileSystemGtk.cpp (In reply to
comment #7
)
> +#if PLATFORM(WIN) > > should be WIN_OS.
Fixed.
> g_uri_escape_string() > > ^ only available in GLib >= 2.16
I included a copy of that function so we can use with glib < 2.16. Is it the right way to do that?
> For this change to make sense, we need to decide exactly what we expect the > developer to pass to user-stylesheet-uri: a local filename (byte array) or a > well formatted URI?
WebCore expects a URI and considering the name I think that it should be clear that it's a URI and not a filename.
Holger Freyther
Comment 9
2008-09-21 19:10:34 PDT
(In reply to
comment #8
)
> Created an attachment (id=23441) [edit]
> > > g_uri_escape_string() > > > > ^ only available in GLib >= 2.16 > > I included a copy of that function so we can use with glib < 2.16. Is it the > right way to do that?
It might make sense to create a new file (maybe even only a header file) for such things? Besides that question it seems to be done. If we manage to land
bug #18987
first you might need to upload a new diff though.
Eric Seidel (no email)
Comment 10
2008-09-23 04:20:36 PDT
Comment on
attachment 23441
[details]
Add filenameToString and filenameFromString to FileSystemGtk.cpp 87 gchar* systemFilename = filenameFromString(m_filename.utf8().data()); That can just be: 87 gchar* systemFilename = filenameFromString(m_filename); or? I think all the imported code from GLIB should live in a separate header/.cpp pair, that way it can have its own special indentation for that file. Although I'm not sure that's really enough code (at least for is_valid) to really justify needing to be copy/pasted from glib.
Marco Barisione
Comment 11
2008-10-02 09:46:32 PDT
Created
attachment 24021
[details]
Add filenameToString, filenameFromString and filenameForDisplay to FileSystemGtk.cpp (In reply to
comment #10
)
> (From update of
attachment 23441
[details]
[edit]) > 87 gchar* systemFilename = > filenameFromString(m_filename.utf8().data()); > > That can just be: > 87 gchar* systemFilename = filenameFromString(m_filename); > or?
Yeah, I didn't see that. Fixed.
> I think all the imported code from GLIB should live in a separate header/.cpp > pair, that way it can have its own special indentation for that file. Although > I'm not sure that's really enough code (at least for is_valid) to really > justify needing to be copy/pasted from glib.
Done. is_valid is there only because it's used by another function copied from glib, I don't use it directly.
> such things? Besides that question it seems to be done. If we manage to land >
bug #18987
first you might need to upload a new diff though.
Updated. For this chance I also added filenameForDisplay that converts an encoded string to a UTF-8 string for display, useful for error messages for instance.
Holger Freyther
Comment 12
2008-10-13 10:51:32 PDT
Looks sane. One minor coding style issue with the following but it can be fixed when landing. I need another second to have a second look. + gchar *filename = filenameFromString(string); + gchar *display = g_filename_to_utf8(filename, 0, 0, 0, 0);
Holger Freyther
Comment 13
2008-10-13 11:23:27 PDT
Comment on
attachment 24021
[details]
Add filenameToString, filenameFromString and filenameForDisplay to FileSystemGtk.cpp
>+ gchar* systemFilename = filenameFromString(m_filename); >+ gchar* systemBasename = g_path_get_basename(systemFilename); >+ g_free(systemFilename); >+ stringByAdoptingFileSystemRepresentation(systemBasename, string);
vs.
>+ char* baseName = g_path_get_basename(tmpFilename); > String fileName = String::fromUTF8(baseName); > g_free(baseName); >+ g_free(tmpFilename);
should one use the above stringByAdopting... function as well? The next thing is to remove the various if (fileName) {} statements as filenameFromString is not returning a NULL String? We at least have inconsistent null handling with filenameFromString after this patch. I would be willing to land this as the first part if cleanups follow? Do you agree?
Marco Barisione
Comment 14
2008-10-14 08:57:24 PDT
(In reply to
comment #13
)
> (From update of
attachment 24021
[details]
[edit]) > >+ gchar* systemFilename = filenameFromString(m_filename); > >+ gchar* systemBasename = g_path_get_basename(systemFilename); > >+ g_free(systemFilename); > >+ stringByAdoptingFileSystemRepresentation(systemBasename, string); > > vs. > > >+ char* baseName = g_path_get_basename(tmpFilename); > > String fileName = String::fromUTF8(baseName); > > g_free(baseName); > >+ g_free(tmpFilename);
I will fix it using GOwnPtr now that it landed. I can fix this and the return value checks in a separate patch+bug report if it's ok for you, so we avoid to keep changing the same patch.
Holger Freyther
Comment 15
2008-10-15 11:23:43 PDT
Comment on
attachment 24021
[details]
Add filenameToString, filenameFromString and filenameForDisplay to FileSystemGtk.cpp We are aware that more work needs to be done but this will be done in another patch.
Holger Freyther
Comment 16
2008-10-15 13:34:28 PDT
Comment on
attachment 24021
[details]
Add filenameToString, filenameFromString and filenameForDisplay to FileSystemGtk.cpp Clearing the review field as the patch landed. guriescape had to be converted to spaces. Thanks for the patch and the next thing would be to move things to GOwnPtr?
Sung-jun, Kim
Comment 17
2008-12-02 23:00:56 PST
This patch was landed. How about change status to FIXED?
Gustavo Noronha (kov)
Comment 18
2009-01-12 15:33:53 PST
Setting to fixed since the patch was landed, please reopen if needed.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug