Summary: | VC++ 2005 Express failed to build WebKit due to raw UTF-8 string in WebKit/win/WebCoreLocalizedStrings.cpp | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Takeshi Yoshino <tyoshino> | ||||||||||
Component: | New Bugs | Assignee: | David Levin <levin> | ||||||||||
Status: | RESOLVED FIXED | ||||||||||||
Severity: | Normal | CC: | abarth, hbono, levin | ||||||||||
Priority: | P2 | ||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||
Hardware: | PC | ||||||||||||
OS: | OS X 10.5 | ||||||||||||
Bug Depends on: | |||||||||||||
Bug Blocks: | 27416 | ||||||||||||
Attachments: |
|
Description
Takeshi Yoshino
2009-06-13 12:09:18 PDT
Created attachment 31242 [details]
Proposed fix for 26375
Comment on attachment 31242 [details]
Proposed fix for 26375
LGTM. Thanks.
Comment on attachment 31242 [details]
Proposed fix for 26375
This change is incompatible with the scripts that we uses to maintain the localized string files, so while it fixes compilation for the people using these development tools, it breaks WebKit development work flow for anyone using the localization scripts, unless we make some changes to the script used to extract localizable strings.
Comment on attachment 31242 [details] Proposed fix for 26375 Marking as R- based on comment #3. I'm surprised we've been able to get away with non-ASCII source files thus far. Maybe we should change the scripts? (In reply to comment #4) > Marking as R- based on comment #3. I'm surprised we've been able to get away > with non-ASCII source files thus far. Maybe we should change the scripts? Until this point in the project, all the compilers we support were consistent in that in C strings (non-wide) they allowed non-ASCII characters, including the same raw bytes in the executable that were in the source file. This meant we were free to use UTF-8 for all strings, and not do any special escaping. I'm disappointed to find we need to support compilers incompatible with this technique. If we decide that this is no longer allowed, then we need to teach the scripts to deal with some sort of all-ASCII syntax. I think actual UTF-8 sequences with \x for each byte are a really ugly choice; it would be nice to have something that was aesthetically pleasing. I don't have any immediate good ideas at the moment about how to resolve this. Perhaps the issue is entirely with warnings. Maybe we can fix this by turning off the warning. Or is the compiler actually trying to transcode the string? Other projects I've been involved with have used XML as the source representations of localized strings (which, of course, is natively UTF-8). I see. So, for now, may I put some comments into the code to let the Windows-based developer know the workaround, e.g. URL to this bug entry? I'll send a new patch to do so, later. Created attachment 31405 [details]
Proposed fix for 26375 (rev 2)
*** Bug 21849 has been marked as a duplicate of this bug. *** Could you please post the warning number, and the description (even if it's in Japanese). (In reply to comment #11) > Could you please post the warning number, and the description (even if it's in > Japanese). > Here's the warning. 5>WebCoreLocalizedStrings.cpp 5>..\WebCoreLocalizedStrings.cpp : error C2220: warning treated as error - no 'object' file generated 5>..\WebCoreLocalizedStrings.cpp : warning C4819: The file contains a character that cannot be represented in the current code page (932). S ave the file in Unicode format to prevent data loss 5>Project : warning PRJ0018 : The following environment variables were not found: 5>$(PRODUCTION) 5>Build log was saved at "file://E:\wk1_svn_cygwin\WebKitBuild\obj\WebKit\Debug\BuildLog.htm" 5>WebKit - 1 error(s), 1 warning(s) ========== Build: 4 succeeded, 1 failed, 13 up-to-date, 0 skipped ========== Created attachment 31531 [details]
Proposed fix for 26375 (rev 3)
Included the warning messages to the patch.
What happens if we "Save the file in Unicode format"? Does it just add a UTF8 BOM to the top of the file? (In reply to comment #14) > What happens if we "Save the file in Unicode format"? Does it just add a UTF8 > BOM to the top of the file? > I couldn't find what the message "Save the file in Unicode format" in C4819 (http://msdn.microsoft.com/en-us/library/ms173715.aspx) exactly means. However, Visual Studio 2005 Professional has a "Advanced Save Options" for its editor. I saw five encoding options in the dialog with "Unicode" in them. So, I've tested building WebKit by saving WebCoreLocalizedString.cpp with each of them. Whether a BOM is added or not is also described. Build mode used: Release (because run-safari in debug mode on Windows fails too frequently) Check: Navigate to some image (http://www.google.co.jp/intl/ja_jp/images/logo.gif) and check the title Save mode: Unicode (UTF-8 with signature) - Codepage 65001 The source was encoded by UTF-8 w/ BOM Build: Successful Result: localized string not found Save mode: Unicode - Codepage 1200 The source was encoded by UTF-16 Little Endian w/o BOM Build: Successful Result: localized string not found Save mode: Unicode (Big-Endian) - Codepage 1201 The source was encoded by UTF-16 Big-Endian w/o BOM Build: Successful Result: localized string not found Save mode: Unicode (UTF-7) - Codepage 65000 The source was encoded by UTF-7 Build: Failure 4>WebCoreLocalizedStrings.cpp 4>..\WebCoreLocalizedStrings.cpp(1) : error C2059: syntax error : '/' 4>..\WebCoreLocalizedStrings.cpp(7) : error C2059: syntax error : '+' 4>..\WebCoreLocalizedStrings.cpp(13) : error C2137: empty character constant 4>..\WebCoreLocalizedStrings.cpp(35) : error C2653: 'WebCore' : is not a class or namespace name Save mode: Unicode (UTF-8 without signature) - Codepage 65001 The source was encoded by UTF-8 w/o BOM Build: Failure. We need to remove /Wx option (take warnings as errors) Build w/o /Wx: Successful Result: localized string not found None of them fixed the problem. I saw "logo.gif 276×110 pixels image" on the title bar only when I replace the character with escape sequences (\xC3\x97). Thanks, Comment on attachment 31531 [details]
Proposed fix for 26375 (rev 3)
Wow! Thank you for the awesome investigation.
Unfortunately the current patch doesn't seem to actually have the code change, just a comment?
This is the only log message you need in the ChangeLog:
WebCoreLocalizedStrings.cpp : warning C4819: The file contains a character that cannot be represented in the current code page (932). Save the file in Unicode format to prevent data loss
The rest of the build output you can remove from the ChangeLog.
r- for lack of code change. Otherwise I'd r+ this. Thanks again for the investigation!
Comment on attachment 31531 [details]
Proposed fix for 26375 (rev 3)
oops, meant to r- it until a new patch is posted with the code change (and adjusted ChangeLog)
(In reply to comment #16) > (From update of attachment 31531 [details] [review]) > Wow! Thank you for the awesome investigation. > > Unfortunately the current patch doesn't seem to actually have the code change, > just a comment? > > This is the only log message you need in the ChangeLog: > WebCoreLocalizedStrings.cpp : warning C4819: The file contains a character that > cannot be represented in the current code page (932). Save the file in Unicode > format to prevent data loss > > The rest of the build output you can remove from the ChangeLog. > OK. Thanks. > r- for lack of code change. Otherwise I'd r+ this. Thanks again for the > investigation! > As Darin said, if I make code change on WebCoreLocalizedStrings.cpp without making change on extract-localized-strings script, I'll break Localized.strings generation process. Do you mean I should make change on extract-localized-strings instead of putting a comment? Thanks, Created attachment 33143 [details]
Proposed fix for 26375 (rev 4)
Wrote a change on WebKitTools/Scripts/extract-localizable-strings so
that replacing a raw string in UI_STRING with hexadecimal escape
sequences won't break localized string generation process.
Assign to levin for landing. Committed as http://trac.webkit.org/changeset/46245 (In reply to comment #21) > Committed as http://trac.webkit.org/changeset/46245 Thank you! |