Bug 39654 - Backslash is transcoded into yen sign even when non japanese font is specified
Summary: Backslash is transcoded into yen sign even when non japanese font is specified
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Text
Version: 528+ (Nightly build)
Hardware: All All
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-05-25 00:14 PDT by Shinichiro Hamaji
Modified: 2010-07-23 00:07 PDT

See Also:


Attachments
test case (216 bytes, text/html)
2010-05-25 00:14 PDT, Shinichiro Hamaji
no flags
Patch v1 (178.44 KB, patch)
2010-05-25 01:59 PDT, Shinichiro Hamaji
no flags
Patch v1 - rebased (178.47 KB, patch)
2010-05-25 04:46 PDT, Shinichiro Hamaji
no flags
Patch v2 (226.42 KB, patch)
2010-05-31 01:47 PDT, Shinichiro Hamaji
tkent: review+
tkent: commit-queue-

Description Shinichiro Hamaji 2010-05-25 00:14:57 PDT
Created attachment 56983
test case

Bug 24906 introduced font-based transcoding. In that bug, I left a FIXME comment for the case where a web author specifies a non-Japanese font to avoid the yen sign glyph.
Comment 1 Shinichiro Hamaji 2010-05-25 01:59:12 PDT
Created attachment 56993
Patch v1
Comment 2 Shinichiro Hamaji 2010-05-25 04:46:05 PDT
Created attachment 57007
Patch v1 - rebased
Comment 3 Alexey Proskuryakov 2010-05-28 11:45:32 PDT
Comment on attachment 57007
Patch v1 - rebased

> @@ -70,8 +70,8 @@ Font::Font(const FontDescription& fd, short letterSpacing, short wordSpacing)
>      , m_letterSpacing(letterSpacing)
>      , m_wordSpacing(wordSpacing)
>      , m_isPlatformFont(false)
> -    , m_needsTranscoding(fontTranscoder().needsTranscoding(family().family().string()))
>  {
> +    m_needsTranscoding = fontTranscoder().needsTranscoding(*this);
>  }

In general, it isn't a great pattern to pass "this" to external functions from a constructor. The object may still be in some transitional state, and although it's valid C++, it may catch the programmer by surprise:
- subclass constructors haven't been invoked yet, and virtual methods table hasn't been swapped to final one yet;
- a reference counted object can still have a refcount of 0, so taking a temporary RefPtr will destroy it from within the constructor;
- post-constructor "init" functions that someone else wrote to avoid these problems haven't run yet;
- etc.
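
The first bullet can be illustrated with a minimal sketch. The names Shape, Circle and describe are hypothetical and not taken from WebKit; the point is only that a virtual call made through "this" during base-class construction resolves to the base-class version:

```cpp
#include <cassert>
#include <string>

struct Shape;
// Hypothetical external function that receives the object under construction.
std::string describe(const Shape& s);

struct Shape {
    std::string label;
    // Passes *this to an external function before any subclass constructor has run.
    Shape() { label = describe(*this); }
    virtual ~Shape() = default;
    virtual std::string name() const { return "Shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "Circle"; }
};

// While Shape's constructor is running, the dynamic type is still Shape,
// so this virtual call reaches Shape::name() even for a Circle.
std::string describe(const Shape& s) { return s.name(); }
```

Constructing a Circle here leaves label set to "Shape", while calling name() on the finished object returns "Circle".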

I'm still unclear on the "specified font" concept. The test doesn't check multiple font names (e.g. '"some-windows-only-font", "Times"', or "some-windows-only-font", "MS PGothic"). Is "specified font" inherited from parent nodes?
Comment 4 Shinichiro Hamaji 2010-05-31 01:47:55 PDT
Created attachment 57438
Patch v2
Comment 5 Shinichiro Hamaji 2010-05-31 02:12:07 PDT
Thanks for your review!

> In general, it isn't a great pattern to pass "this" to external functions from a constructor. The object may still be in some transitional state, and although it's valid C++, it may catch the programmer by surprise:
> - subclass constructors haven't been invoked yet, and virtual methods table hasn't been swapped to final one yet;
> - a reference counted object can still have a refcount of 0, so taking a temporary RefPtr will destroy it from within the constructor;
> - post-constructor "init" functions that someone else wrote to avoid these problems haven't run yet;
> - etc.

Yeah, I totally agree. Fortunately, I could easily remove the "this" by using FontDescription instead of Font in this case.

> I'm still unclear on the "specified font" concept.

Basically, whenever a web page specifies a font name via font-family or font, the font is considered "specified". If a non-Japanese font is "specified", we guess that the author of the page intended to show backslashes even with Japanese encodings. If a generic font such as serif or sans-serif is specified, the font is considered "unspecified", because IE and Firefox choose locale-specific fonts in that case. We check only the first font family. I guess this is OK for now, as most fonts have a glyph for U+005C. This limitation should be fixed if isSpecifiedFont is ever used for other purposes, though I doubt that will happen.
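
As a rough sketch of the rule described above (this is not the code in the patch): the helper names, the list of generic keywords, and the two boolean parameters are assumptions made for illustration, and the comparison is case-sensitive for simplicity.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumption: CSS generic family keywords count as "unspecified".
bool isGenericFamily(const std::string& family) {
    static const char* const generics[] = {
        "serif", "sans-serif", "cursive", "fantasy", "monospace"};
    for (const char* g : generics) {
        if (family == g)
            return true;
    }
    return false;
}

// Only the first family in the list is examined, as described above.
bool isSpecifiedFont(const std::vector<std::string>& families) {
    return !families.empty() && !isGenericFamily(families.front());
}

// Hypothetical decision function: under a Japanese encoding, keep the
// backslash only when the author explicitly named a non-Japanese font.
bool needsBackslashTranscoding(bool encodingIsJapanese,
                               bool firstFamilyIsJapanese,
                               const std::vector<std::string>& families) {
    if (!encodingIsJapanese)
        return false;
    if (isSpecifiedFont(families) && !firstFamilyIsJapanese)
        return false;
    return true;
}
```

With this sketch, "Times, serif" counts as specified because only the first family is checked, while a bare "serif" counts as unspecified.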

I updated the comment on m_isSpecifiedFont. I hope the updated comment is a bit clearer. I'm not sure the name "isSpecifiedFont" is good; suggestions for a better name would be really appreciated.

> The test doesn't check multiple font names (e.g. '"some-windows-only-font", "Times"', or "some-windows-only-font", "MS PGothic").

Very good point. Actually, "Times, serif" was handled incorrectly by the previous patch (the second font calls setIsSpecifiedFont). I have now fixed this issue and added some test cases. Thanks for catching this!

> Is "specified font" inherited from parent nodes?

Yes, just like font-family.
Comment 6 Kent Tamura 2010-07-16 07:15:40 PDT
Comment on attachment 57438
Patch v2

Looks OK.
Comment 7 Alexey Proskuryakov 2010-07-22 16:33:38 PDT
This patch has a review and can be landed, is there a reason it's commit-queue-?
Comment 8 Kent Tamura 2010-07-22 16:38:03 PDT
(In reply to comment #7)
> This patch has a review and can be landed, is there a reason it's commit-queue-?

The patch cannot be applied cleanly.
Comment 9 Shinichiro Hamaji 2010-07-22 22:39:40 PDT
Committed r63950: <http://trac.webkit.org/changeset/63950>
Comment 10 WebKit Review Bot 2010-07-22 23:19:15 PDT
http://trac.webkit.org/changeset/63950 might have broken SnowLeopard Intel Release (Tests)
Comment 11 Shinichiro Hamaji 2010-07-23 00:07:29 PDT
(In reply to comment #10)
> http://trac.webkit.org/changeset/63950 might have broken SnowLeopard Intel Release (Tests)

Sorry about this. I think I should have treated font-family "-webkit-*" as unspecified. For now, I'll land a failing test and post a fix soon afterwards.
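
A minimal sketch of what such a check might look like; the function name hasWebKitPrefix is hypothetical, and the actual fix may well be structured differently:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true when the family name starts with "-webkit-".
// Such families would then be treated as unspecified.
bool hasWebKitPrefix(const std::string& family) {
    const std::string prefix = "-webkit-";
    // compare() clamps the length to the string size, so short names are safe.
    return family.compare(0, prefix.size(), prefix) == 0;
}
```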