Steps:
1. Go to: http://www.megaupload.com/?c=abuse
Issue:
The text field under 'Description of abuse:' text area renders small. Thus shows the number as 10 instead of 1000.
Other Browsers:
IE, FF, Opera don't have this issue.
Attached is the reduction.
Hi, WebKit renders it correct regarding to the HTML 4.01 Specs.
But i think IE and Firefox use a min width of 3 characters for input fields of type "text" or "password".
This is about how you measure characters when computing the size attribute. I believe FFX deliberately uses a very wide character, whereas we use a character of average width. Our rendering certainly seems more sensible to me.
Firefox matches IE. We don't.
The other important part of this bug is that IE and FF appear to apply a minimum size of 3 characters to any text/password field (See my test case).
My latest test case seems to show that FF/IE are in fact using a slightly *smaller* char to measure width of the text field and they are padding any widths by 2 characters.
Created attachment 21964[details]
an even betterer testcase
Works in IE and normalizes fontFamily/fontSize. WebKit has different defaults than IE/FF and IE/FF do different things for different fontFamilies.
Created attachment 21965[details]
patch to match IE's width
IE does the following for width of text controls (FF mostly matches this, but not exactly):
Text inputs: avgCharWidth * size + (maxCharWidth - avgCharWidth)
Textareas: avgCharWidth * cols
As the changelog in the patch says, this makes WebKit exactly match IE for GDI fonts (I don't have access to CG hooks for avgCharWidth/maxCharWidth). There's still 2 extra pixels in WebKit that I explain in more detail in the changelog.
Two things need to be done here:
1. Decide if this is the right behavior for WebKit. Ideally Hyatt or Hixie or Adele or Darin or such would chime in here. There are obvious compatibility issues with our existing behavior. Would this new behavior introduce worse ones?
2. Once that's decided, implementing CG and SVG versions of this code path would require some time form Hyatt or Mitz or Adele or such.
I think this is about as optimal as we can get with respect to compatibility with the web that's out there. This patch only affects sites that don't explicitly set the width/height of a textarea/text-input or set it via the cols/size attribute.
I posit that sites that rely on these values are mostly older sites that are coded to and tested only in IE. That said, I don't have statistically significant data to back that up, just the handful of sites in this bug which have layout problems because of this. :)
My concern is really with the case where no font is explicitly specified. IE uses a fixed pitch default font. Safari uses Lucida Grande. We need to ensure that this algorithm works well for Lucida Grande.
Comment on attachment 21965[details]
patch to match IE's width
Oh, I see you did this for Windows only.
I really dislike that you added 2 members to SimpleFontData on every platform, but then only use them on Windows. I also don't like treating "0" for avgCharWidth as "unset."
Character widths should also be float-based (the fact that Windows rounds is a platform-specific characteristic not shared by other platforms necessarily).
Why not get the width of 0 for the other platforms? Then you don't have to special case the RenderTextControl code.
Sorry I didn't respond sooner. I didn't get emails when Hyatt responded on the 15th.
> My concern is really with the case where no font is explicitly specified. IE
> uses a fixed pitch default font. Safari uses Lucida Grande. We need to ensure
> that this algorithm works well for Lucida Grande.
I'll ensure Lucida Grande does something reasonable once the platform issues below are address. As it is now, Lucida Grande renders exactly as it did before because it goes down the CG path (even with GDI fonts enabled).
> Oh, I see you did this for Windows only.
Only because I don't know how to do this on Mac/Linux. I was hoping someone on this thread that is more familiar with Mac/Linux APIs could help me out here.
> I really dislike that you added 2 members to SimpleFontData on every platform,
> but then only use them on Windows. I also don't like treating "0" for
> avgCharWidth as "unset."
My hope was to add a proper value for this to every platform. I just am not familier with the Mac/Linux ways to get this data (if there are any). I sent this patch as is so that it would at least be submittable without breaking other platforms.
> Character widths should also be float-based (the fact that Windows rounds is a
> platform-specific characteristic not shared by other platforms necessarily).
Will fix that.
> Why not get the width of 0 for the other platforms? Then you don't have to
> special case the RenderTextControl code.
That's fine with me. Again, I was hoping someone on this thread would know how to get the actual avgCharWidth/maxCharWidth of the font on Mac/Linux.
Should maxCharWidth also be the width of a 0? That seems a bit weird, but I don't know what would be better.
Hyatt, can you provide Ojan information as to how he'd get the average char width from CG? Windows seems to have explicit APIs for this. I don't think CG exposes this information as API.
(In reply to comment #25)
> Hyatt, can you provide Ojan information as to how he'd get the average char
> width from CG? Windows seems to have explicit APIs for this. I don't think CG
> exposes this information as API.
I don't know a high-level Mac OS X API for "average character width". I am not sure what the value that the Windows API returns is based on. If it is the xAvgCharWidth field from the font's OS/2 table, then there is Mac OS X API for accessing font tables which can be used (ATSFontGetTable).
A layout test for this was posted in bug 21407, then dropped from there in favor of this one. Even if it's decided not to fix/change anything here, a test should still be added to verify intended behavior.
Actually, on the "catch unintentional changes" theory, two tests that will be affected by this were landed from bug 21407 after all, in r37665:
fast/forms/input-type-text-min-width.html
fast/forms/textarea-width.html
After about an evening + morning worth of research, we couldn't really find any concept of this on Mac in either ATSUI or CoreText. It seems to be very Microsoftian at its core.
The best we could do would be to read the "OS/2 table" present in TrueType fonts (hoping the font in question is one) and dig the xAvgCharWidth field out of it. We could fallback to '0' if there's no table present.
Can someone at infinite loop talk to the CoreText team and see if there's a better way to go about this? It sounds like such a hack. Does anyone think it's worth pursuing, or would it be rejected outright?
(In reply to comment #29)
> Can someone at infinite loop talk to the CoreText team and see if there's a
> better way to go about this?
I asked right after making comment #26 and was told that "the only path open is to parse the OS2 table yourself". Sorry about not adding this to the bug at the time.
I'm still slogging through layout test failures. I've looked at all the pixel failures so far and all the new results look correct to me. I was hoping to get an initial review of the code before doing all the rebaselining work though in case the review affects any of the metrics.
Comment on attachment 29235[details]
Make textarea metrics more closely match IEs.
I think a better fallback would be to use the space glyph if there's no "0" or no "w".
77 Vector<SVGGlyphIdentifier> numeralZeroGlyphs;
78 associatedFontElement->getGlyphIdentifiersForString(String("0", 1), numeralZeroGlyphs);
79 // FIXME: We should probably do something better than pick zero if there's no numeral 0.
80 m_avgCharWidth = numeralZeroGlyphs.isEmpty() ? 0 : static_cast<float>(numeralZeroGlyphs.first().horizontalAdvanceX * scale);
81
82 Vector<SVGGlyphIdentifier> letterWGlyphs;
83 associatedFontElement->getGlyphIdentifiersForString(String("W", 1), letterWGlyphs);
84 // FIXME: We should probably do something better than pick zero if there's no letter W.
85 m_maxCharWidth = letterWGlyphs.isEmpty() ? 0 : static_cast<float>(letterWGlyphs.first().horizontalAdvanceX * scale);
86
87 // FIXME: Should probably fill these in as we do above.
7288 m_spaceGlyph = 0;
7389 m_spaceWidth = 0;
It might be better to init m_avgCharWidth and other member variables all in one place.
103 #if !PLATFORM(MAC) && !PLATFORM(CHROMIUM) && !PLATFORM(WIN)
104 void SimpleFontData::platformCharWidthInit()
105 {
106 m_avgCharWidth = 0.0;
If there is a good OS/2 table, shouldn't we return early? I tried this out, and got a weird value for m_avgCharWidth (-1003 or something). Do you see that too?
288 CFRelease(os2TableRef);
289 }
290
291 // if there was no OS/2 table or it was malformed, we (by definition) can't
292 // match Windows metrics, so we fall back to the previous WebKit behavior of using
293 // the width of the digit '0' as a reasonable estimate. This method also sets
294 // m_maxCharWidth.
295 initCharWidthsFromZeroAndW();
More comments to come in the next few days...
(In reply to comment #40)
> (From update of attachment 29235[details] [review])
> I think a better fallback would be to use the space glyph if there's no "0" or
> no "w".
Seems reasonable to me. I'll add that. Is it safe to assume that every font has a space glyph?
> 77 Vector<SVGGlyphIdentifier> numeralZeroGlyphs;
> 78 associatedFontElement->getGlyphIdentifiersForString(String("0", 1),
> numeralZeroGlyphs);
> 79 // FIXME: We should probably do something better than pick zero if
> there's no numeral 0.
> 80 m_avgCharWidth = numeralZeroGlyphs.isEmpty() ? 0 :
> static_cast<float>(numeralZeroGlyphs.first().horizontalAdvanceX * scale);
> 81
> 82 Vector<SVGGlyphIdentifier> letterWGlyphs;
> 83 associatedFontElement->getGlyphIdentifiersForString(String("W", 1),
> letterWGlyphs);
> 84 // FIXME: We should probably do something better than pick zero if
> there's no letter W.
> 85 m_maxCharWidth = letterWGlyphs.isEmpty() ? 0 :
> static_cast<float>(letterWGlyphs.first().horizontalAdvanceX * scale);
> 86
> 87 // FIXME: Should probably fill these in as we do above.
> 7288 m_spaceGlyph = 0;
> 7389 m_spaceWidth = 0;
>
> It might be better to init m_avgCharWidth and other member variables all in one
> place.
>
> 103 #if !PLATFORM(MAC) && !PLATFORM(CHROMIUM) && !PLATFORM(WIN)
> 104 void SimpleFontData::platformCharWidthInit()
> 105 {
> 106 m_avgCharWidth = 0.0;
>
Yeah, I'm not very happy with the current organization either.
Another option would be to not have a platformCharWidthInit method at all. Just inline this code into platformInit or platformGlyphInit for mac/windows and have the linux build break? Or I could take a stab at implementing this for every platform, but I don't have a linux machine, so I have no way of testing. I would leave initCharWidthsFromZeroAndW though so that Mac/Linux could continue to share it. Does that seem better to you?
> If there is a good OS/2 table, shouldn't we return early? I tried this out,
> and got a weird value for m_avgCharWidth (-1003 or something). Do you see that
> too?
We only init avgCharWidth from the OS/2 table. So, we still need to init maxCharWidth, which I'm told is not in the OS/2 table. If m_avgCharWidth has a non-positive value, then we over-write it with the width of a 0 in initCharWidthsFromZeroAndW. My understanding is that some fonts just have garbage values here. I didn't actually write the mac implementation here though. Amanda (cc'ed) wrote it. Amanda, can you comment on this bit?
> 288 CFRelease(os2TableRef);
> 289 }
> 290
> 291 // if there was no OS/2 table or it was malformed, we (by definition)
> can't
> 292 // match Windows metrics, so we fall back to the previous WebKit
> behavior of using
> 293 // the width of the digit '0' as a reasonable estimate. This method
> also sets
> 294 // m_maxCharWidth.
> 295 initCharWidthsFromZeroAndW();
>
> More comments to come in the next few days...
>
(In reply to comment #41)
> (In reply to comment #40)
> > If there is a good OS/2 table, shouldn't we return early? I tried this out,
> > and got a weird value for m_avgCharWidth (-1003 or something). Do you see that
> > too?
>
> We only init avgCharWidth from the OS/2 table. So, we still need to init
> maxCharWidth, which I'm told is not in the OS/2 table. If m_avgCharWidth has a
> non-positive value, then we over-write it with the width of a 0 in
> initCharWidthsFromZeroAndW. My understanding is that some fonts just have
> garbage values here. I didn't actually write the mac implementation here
> though. Amanda (cc'ed) wrote it. Amanda, can you comment on this bit?
Yes. There are a number of non-Windows fonts that have OS/2 tables but do not have valid data in the avgCharWidth field (several of Apple's system-supplied fonts, for example). So as a sanity check, if the avgCharWidth from the table is clearly wrong, we discard it and fall through to the "compute a reasonable estimate" code path. The other comments in this bug about how to better compute that estimate all look quite reasonable. In the end, it's a judgement call anyway, since "number of characters wide" isn't actually a geometric measurement.
(In reply to comment #41)
> (In reply to comment #40)
> > (From update of attachment 29235[details] [review] [review])
> > I think a better fallback would be to use the space glyph if there's no "0" or
> > no "w".
>
> Seems reasonable to me. I'll add that. Is it safe to assume that every font has
> a space glyph?
No, but for some fonts (like non-roman ones) its more likely they'll have a space than "0" or "w".
> Yeah, I'm not very happy with the current organization either.
>
> Another option would be to not have a platformCharWidthInit method at all. Just
> inline this code into platformInit or platformGlyphInit for mac/windows and
> have the linux build break? Or I could take a stab at implementing this for
> every platform, but I don't have a linux machine, so I have no way of testing.
> I would leave initCharWidthsFromZeroAndW though so that Mac/Linux could
> continue to share it. Does that seem better to you?
I don't think it needs to be inline. But the ifdef'd version of platformCharWidthInit is unnecessary. Put this default implementation in the other platform files, even if you can't test. Then you could file bugs for the other platforms for what you think might be missing implementation details.
A few more comments...
- Instead of using hardcoded constants in the OS/2 table code, it would be better to use descriptively named const ints.
- Use a RetainPtr for os2TableRef - then you can avoid CFRetain and CFRelease here.
(In reply to comment #42)
> (In reply to comment #41)
> > (In reply to comment #40)
> > > If there is a good OS/2 table, shouldn't we return early? I tried this out,
> > > and got a weird value for m_avgCharWidth (-1003 or something). Do you see that
> > > too?
> >
> > We only init avgCharWidth from the OS/2 table. So, we still need to init
> > maxCharWidth, which I'm told is not in the OS/2 table. If m_avgCharWidth has a
> > non-positive value, then we over-write it with the width of a 0 in
> > initCharWidthsFromZeroAndW. My understanding is that some fonts just have
> > garbage values here. I didn't actually write the mac implementation here
> > though. Amanda (cc'ed) wrote it. Amanda, can you comment on this bit?
>
> Yes. There are a number of non-Windows fonts that have OS/2 tables but do not
> have valid data in the avgCharWidth field (several of Apple's system-supplied
> fonts, for example). So as a sanity check, if the avgCharWidth from the table
> is clearly wrong, we discard it and fall through to the "compute a reasonable
> estimate" code path. The other comments in this bug about how to better
> compute that estimate all look quite reasonable. In the end, it's a judgement
> call anyway, since "number of characters wide" isn't actually a geometric
> measurement.
>
Are there other indicators of invalid data from the OS/2 table? Is checking for a negative number good enough? For example, what if the avg is more than the maximum?
I've done all of the comments except the two below. I'm working through layout test changes now and will upload a new patch soonish.
(In reply to comment #43)
> - Use a RetainPtr for os2TableRef - then you can avoid CFRetain and CFRelease
> here.
I declare n00b-dom here. I've never used CF apis or RetainPtr. I tried to hack it based off other RetainPtr usages in this file, but I really don't know what I'm doing and couldn't get it to compile. Mind showing me what this would look like?
(In reply to comment #44)
> (In reply to comment #42)
> > (In reply to comment #41)
> > > (In reply to comment #40)
> > > > If there is a good OS/2 table, shouldn't we return early? I tried this out,
> > > > and got a weird value for m_avgCharWidth (-1003 or something). Do you see that
> > > > too?
> > >
> > > We only init avgCharWidth from the OS/2 table. So, we still need to init
> > > maxCharWidth, which I'm told is not in the OS/2 table. If m_avgCharWidth has a
> > > non-positive value, then we over-write it with the width of a 0 in
> > > initCharWidthsFromZeroAndW. My understanding is that some fonts just have
> > > garbage values here. I didn't actually write the mac implementation here
> > > though. Amanda (cc'ed) wrote it. Amanda, can you comment on this bit?
> >
> > Yes. There are a number of non-Windows fonts that have OS/2 tables but do not
> > have valid data in the avgCharWidth field (several of Apple's system-supplied
> > fonts, for example). So as a sanity check, if the avgCharWidth from the table
> > is clearly wrong, we discard it and fall through to the "compute a reasonable
> > estimate" code path. The other comments in this bug about how to better
> > compute that estimate all look quite reasonable. In the end, it's a judgement
> > call anyway, since "number of characters wide" isn't actually a geometric
> > measurement.
> >
>
> Are there other indicators of invalid data from the OS/2 table? Is checking
> for a negative number good enough? For example, what if the avg is more than
> the maximum?
Amanda, any thoughts here? I don't have a strong opinion one way or the other here for the specific example. I think it's an edge-case we're unlikely to hit in practice. Seems reasonable to err on the safe side though and do something here. Maybe we should set avg to min(avg, max) at the end or set max to max(avg, max)? It's not clear to me which of those makes sense. Or maybe we should discard the values entirely and fallback to the initCharWidths code?
(In reply to comment #45)
> Amanda, any thoughts here? I don't have a strong opinion one way or the other
> here for the specific example. I think it's an edge-case we're unlikely to hit
> in practice. Seems reasonable to err on the safe side though and do something
> here. Maybe we should set avg to min(avg, max) at the end or set max to
> max(avg, max)? It's not clear to me which of those makes sense. Or maybe we
> should discard the values entirely and fallback to the initCharWidths code?
Either of those would be reasonable--the only invalid values I've observed so far are negative values in Apple-supplied fonts; it does not seem to be a widespread problem (and no public docs give a valid interpretation of a negative value in that field). Ideally, I'd like to know what the negative value is trying to indicate and interpret it appropriately, but that appears to require docs that are not in evidence on the web.
For the specific example of avg > max (that is, values that appear valid individually but are inconsistent), I'd prefer to avoid a bunch of ad hoc checks. Since a more general "sanity check this font" capability would be useful for downloadable fonts as well as system fonts, I'd rather keep this patch as narrow as possible and open up a separate bug for the larger issue.
(In reply to comment #46)
> (In reply to comment #45)
> > Amanda, any thoughts here? I don't have a strong opinion one way or the other
> > here for the specific example. I think it's an edge-case we're unlikely to hit
> > in practice. Seems reasonable to err on the safe side though and do something
> > here. Maybe we should set avg to min(avg, max) at the end or set max to
> > max(avg, max)? It's not clear to me which of those makes sense. Or maybe we
> > should discard the values entirely and fallback to the initCharWidths code?
>
> Either of those would be reasonable--the only invalid values I've observed so
> far are negative values in Apple-supplied fonts; it does not seem to be a
> widespread problem (and no public docs give a valid interpretation of a
> negative value in that field). Ideally, I'd like to know what the negative
> value is trying to indicate and interpret it appropriately, but that appears to
> require docs that are not in evidence on the web.
Is there another way you could tell the the value is invalid? For example, is the rest of the OS/2 table valid in the cases that you have observed? <http://developer.apple.com/textfonts/TTRefMan/RM06/Chap6OS2.html> says that the first two bytes of a valid OS/2 table should be zeros. Do the tables that have a negative xAvgCharWidth still have zeros in their first two bytes? Are there any valid tables that have something other than zeros?
(In reply to comment #45)
> > - Use a RetainPtr for os2TableRef - then you can avoid CFRetain and CFRelease
> > here.
>
> I declare n00b-dom here. I've never used CF apis or RetainPtr. I tried to hack
> it based off other RetainPtr usages in this file, but I really don't know what
> I'm doing and couldn't get it to compile. Mind showing me what this would look
> like?
Something like:
RetainPtr<CFDataRef> os2TableRef(AdoptCF, CGFontCopyTableForTag(m_font.cgFont(), 'OS/2'));
if (os2TableRef && CFDataGetLength(os2TableRef.get()) >= 68) {
const UInt8* buffer = CFDataGetBytePtr(os2TableRef.get());
[...]
and *not* calling CFRelease, because the RetainPtr has ownership of the object and will release it when it goes out of scope.
I also notice that the mystery number 68 is still there in the most recent version of the patch. You should use a named constant for it.
(In reply to comment #48)
> Something like:
>
> RetainPtr<CFDataRef> os2TableRef(AdoptCF,
> CGFontCopyTableForTag(m_font.cgFont(), 'OS/2'));
> if (os2TableRef && CFDataGetLength(os2TableRef.get()) >= 68) {
> const UInt8* buffer = CFDataGetBytePtr(os2TableRef.get());
> [...]
That was it. Thanks. It's the AdoptCF bit I was missing.
> I also notice that the mystery number 68 is still there in the most recent
> version of the patch. You should use a named constant for it.
Yeah, don't look at the patch that's there now. I haven't uploaded a new one yet. Amanda did some more digging and the maxCharWidth we were getting was often too small. She gave me some new code to do it and I'm going through all the layout test results now. This now brings Mac Safari very close to the input metrics of Windows Safari / IE.
I should have a complete patch in the next half an hour or so.
(In reply to comment #46)
> (In reply to comment #45)
> > Amanda, any thoughts here? I don't have a strong opinion one way or the other
> > here for the specific example. I think it's an edge-case we're unlikely to hit
> > in practice. Seems reasonable to err on the safe side though and do something
> > here. Maybe we should set avg to min(avg, max) at the end or set max to
> > max(avg, max)? It's not clear to me which of those makes sense. Or maybe we
> > should discard the values entirely and fallback to the initCharWidths code?
>
> Either of those would be reasonable--the only invalid values I've observed so
> far are negative values in Apple-supplied fonts; it does not seem to be a
> widespread problem (and no public docs give a valid interpretation of a
> negative value in that field). Ideally, I'd like to know what the negative
> value is trying to indicate and interpret it appropriately, but that appears to
> require docs that are not in evidence on the web.
>
> For the specific example of avg > max (that is, values that appear valid
> individually but are inconsistent), I'd prefer to avoid a bunch of ad hoc
> checks. Since a more general "sanity check this font" capability would be
> useful for downloadable fonts as well as system fonts, I'd rather keep this
> patch as narrow as possible and open up a separate bug for the larger issue.
Either approach here seems totally reasonable to me. Adele, Mitz, what are your thoughts?
OK. New patch is ready for review. The only thing I'm uncomfortable with is one of the results for layout tests that I'm unable to explain: http/tests/navigation/javascriptlink-frames.html. The change to the text input metrics in that test are correct, but somehow the back-forward list is different. I don't see how my change could cause that given that it only affects text control metrics, but I'm synced to head, have no other changes in this client and I don't see the test failing on the bot. I'll keep digging to see if I come up with anything.
> I think a better fallback would be to use the space glyph if there's no "0" or
> no "w".
I've made the avgCharWidth fallback to the space width. If there's no space glyph it falls back to the xHeight. I've made maxCharWidth fallback to the ascent (matches what we do in initCharWidths).
> It might be better to init m_avgCharWidth and other member variables all in one
> place.
Done.
> - Instead of using hardcoded constants in the OS/2 table code, it would be
> better to use descriptively named const ints.
Done. I left the 256 in there. I feel like the comment is pretty clear that it's there to deal with Endian-ness. Happy to add a constant or to do the bit-shifting differently if you prefer.
Comment on attachment 29354[details]
Make textarea metrics more closely match IEs.
+ (WebCore::SimpleFontData::initCharWidthsFromZeroAndW):
This is in the change log but no longer in the patch.
+ m_adjustedSpaceWidth = m_treatAsFixedPitch ? ceilf(m_spaceWidth) : roundf(m_spaceWidth);
Can an SVG font ever be treated as fixed pitch?
+ m_avgCharWidth = widthForGlyph(glyphPageZero->glyphDataForCharacter(digitZeroChar).glyph);
What if the '0' glyph is missing?
+ m_maxCharWidth = widthForGlyph(glyphPageZero->glyphDataForCharacter(letter4E00).glyph);
This is wrong. Code point U+4E00 is not in glyph page zero. It is in glyph page 0x4e00 / GlyphPage::size. It also doesn't check for the missing glyph case. I think you should come up with a way to get the width of the U+4E00, it if exists, without instantiating a glyph page.
+ m_maxCharWidth = widthForGlyph(glyphPageZero->glyphDataForCharacter(letterWChar).glyph);
Not checking for the missing glyph.
+const uint32 kOS2CompatibilityTable = 'OS/2';
+const int kOS2CompatibilityTableMinValidLength = 68;
+const int kOS2xAvgCharWidthOffset = 2;
We don't use the 'k' prefix for constants in WebCore. I think a better name for the first constant would be OS2CompatibilityTableTag. Since these are only used in platformCharWidthInit(), why not move them into that function?
+ // retrieve the (big-endian) average character width for this font
Sentence-like comments should have sentence capitalization and end with a period.
+ SInt16 iAvgWidth = buffer[kOS2xAvgCharWidthOffset] * 256 + buffer[kOS2xAvgCharWidthOffset + 1];
+ m_avgCharWidth = scaleEmToUnits(iAvgWidth, m_unitsPerEm) * m_font.m_size;
Given that iAvgWidth may be invalid, I think you should not leave it up to the cross-platform function to discover it. You should check the validity here. Then you can set m_avgCharWidth only if iAvgWidth is valid. Have you looked into making the validity check more robust by examining the OS/2 table version field?
+ // Compute the maximum width of a character in this font.
+ m_maxCharWidth = 0.f;
+ int numGlyphsInFont = CGFontGetNumberOfGlyphs(m_font.cgFont());
+ for (Glyph glyphNumber = 0; glyphNumber < numGlyphsInFont; glyphNumber++)
+ m_maxCharWidth = std::max(m_maxCharWidth, widthForGlyph(glyphNumber));
This looks very very expensive. In addition to taking a considerable amount of time, it will allocate many GlyphWidthPage instances in the glyph width map.
+ // if there was no OS/2 table or it was malformed, we (by definition) can't
+ // match Windows metrics, so we fall back to the previous WebKit behavior of using
+ // the width of the digit '0' as a reasonable estimate.
This comment is no long accurate, because you changed the behavior to consider glyphs other than '0'. It should also be written in sentence style.
(In reply to comment #53)
> This looks very very expensive. In addition to taking a considerable amount of
> time, it will allocate many GlyphWidthPage instances in the glyph width map.
In practice, it does not seem very expensive, but it's pretty simple to trade off performance for accuracy here. Since it's only used in one place to compute some extra padding for the text, an estimate (such as the font's ascent) is perfectly reasonable if we don't need to match Windows exactly (since unfortunately it does compute the real max character width). But if we do want to match Windows metrics for the MS web fonts, and since CGFont doesn't provide this data, iterating over the glyphs appears to be the only way to compute it. But we could indeed avoid the memory hit of creating GlyphWidthPages at this stage--good observation; I'll give ojan an updated implementation of that loop that just uses CGFont APIs.
(In reply to comment #54)
> GlyphWidthPages at this stage--good observation; I'll give ojan an updated
> implementation of that loop that just uses CGFont APIs.
Scratch that--I'll have it pull pull the information straight out of the hhea/hmtx tables, which should be fast as well. As long as we're looking directly at table data anyway, might as well go all out :-).
(In reply to comment #53)
> (From update of attachment 29354[details] [review])
> + (WebCore::SimpleFontData::initCharWidthsFromZeroAndW):
>
> This is in the change log but no longer in the patch.
Sorry, I'm still not used to manually maintained changelogs as a concept. Fixed.
> + m_adjustedSpaceWidth = m_treatAsFixedPitch ? ceilf(m_spaceWidth) :
> roundf(m_spaceWidth);
>
> Can an SVG font ever be treated as fixed pitch?
Heh, no. Fixed.
> + m_avgCharWidth =
> widthForGlyph(glyphPageZero->glyphDataForCharacter(digitZeroChar).glyph);
>
> What if the '0' glyph is missing?
Fixed.
> + m_maxCharWidth =
> widthForGlyph(glyphPageZero->glyphDataForCharacter(letter4E00).glyph);
>
> This is wrong. Code point U+4E00 is not in glyph page zero. It is in glyph page
> 0x4e00 / GlyphPage::size. It also doesn't check for the missing glyph case. I
> think you should come up with a way to get the width of the U+4E00, it if
> exists, without instantiating a glyph page.
>
> + m_maxCharWidth =
> widthForGlyph(glyphPageZero->glyphDataForCharacter(letterWChar).glyph);
>
> Not checking for the missing glyph.
Got rid of both of these fallbacks. Now we just grab the max of the avgCharWidth and the ascent, which I'm told should be roughly equal to the width of a CJK character.
> +const uint32 kOS2CompatibilityTable = 'OS/2';
> +const int kOS2CompatibilityTableMinValidLength = 68;
> +const int kOS2xAvgCharWidthOffset = 2;
>
> We don't use the 'k' prefix for constants in WebCore. I think a better name for
> the first constant would be OS2CompatibilityTableTag. Since these are only used
> in platformCharWidthInit(), why not move them into that function?
>
> + // retrieve the (big-endian) average character width for this font
>
> Sentence-like comments should have sentence capitalization and end with a
> period.
Fixed.
> + SInt16 iAvgWidth = buffer[kOS2xAvgCharWidthOffset] * 256 +
> buffer[kOS2xAvgCharWidthOffset + 1];
> + m_avgCharWidth = scaleEmToUnits(iAvgWidth, m_unitsPerEm) *
> m_font.m_size;
>
> Given that iAvgWidth may be invalid, I think you should not leave it up to the
> cross-platform function to discover it. You should check the validity here.
> Then you can set m_avgCharWidth only if iAvgWidth is valid. Have you looked
> into making the validity check more robust by examining the OS/2 table version
> field?
I added the obvious check of making sure that iAvgWidth is positive. WRT doing something more complicated, here was Amanda's response on #webkit:
awalker: Well, all documented versions of the table are supersets of previous versions, and put the value we're picking out in the same place. As I noted in the bug, I don't want to put a whole chain of ad hoc sanity checks into this routine, though I think there's a place for an overall "sanity check this font" capability (see some of the stuff agl has been experimenting with)
awalker: But I think that should be a separate bug. I mean, if it's going to block landing the fix, we can add some, but we specifically *don't* need to check "version number < foo" as the table is currently documented.
awalker: IMO, anyway :-).
awalker: That said, it sure would be nice to know what those negative values of xAvgCharWidth are trying to tell us :-).
> + // Compute the maximum width of a character in this font.
> + m_maxCharWidth = 0.f;
> + int numGlyphsInFont = CGFontGetNumberOfGlyphs(m_font.cgFont());
> + for (Glyph glyphNumber = 0; glyphNumber < numGlyphsInFont; glyphNumber++)
> + m_maxCharWidth = std::max(m_maxCharWidth, widthForGlyph(glyphNumber));
>
> This looks very very expensive. In addition to taking a considerable amount of
> time, it will allocate many GlyphWidthPage instances in the glyph width map.
Replaced this whole chunk. The new code gives exactly the same value as the above code for all the fonts I tested, except for a couple Mac fonts where it gave a value 1px larger, which seems fine.
I also added a test-case with a bunch of different fonts so that we better track the expected behavior. Looking at the test-case, we still don't match IE metrics on the Mac. The value we get for maxCharWidth is often ~10px smaller than on Windows. In either case, this patch brings us closer to matching IE metrics on the Mac, so it's still a considerable improvement.
I'm CCing Adam, who has succeeded in matching Windows metrics here in Chromium's Linux port. Maybe we want to do the same here for Mac? Adam, can you comment on what you do on Linux in order to grab the values of avgCharWidth and maxCharWidth? In either case, if we decide to do this, I'd prefer to do so in a followup patch.
> + // if there was no OS/2 table or it was malformed, we (by definition)
> can't
> + // match Windows metrics, so we fall back to the previous WebKit behavior
> of using
> + // the width of the digit '0' as a reasonable estimate.
>
> This comment is no long accurate, because you changed the behavior to consider
> glyphs other than '0'. It should also be written in sentence style.
Updated the comment.
Comment on attachment 29405[details]
Make textarea metrics more closely match IEs.
WebCore/ChangeLog is out of date. git confused me. Correct patch coming in 20 seconds. :)
(In reply to comment #59)
> Created an attachment (id=29410) [review]
> Make textarea metrics more closely match IEs.
>
> LayoutTests/ChangeLog | 559 +++++++
[Another 558 lines like the above]
> WebCore/rendering/RenderTextControlSingleLine.cpp | 4 +
> 558 files changed, 5850 insertions(+), 4236 deletions(-)
What is the purpose of posting this information on Bugzilla?
Comment on attachment 29410[details]
Make textarea metrics more closely match IEs.
+#include "String.h"
#include "SVGFontData.h"
+#include "SVGFontElement.h"
#include "SVGFontFaceElement.h"
+#include "SVGGlyphElement.h"
These should be in ASCII order, so String.h should come last in the group.
+ if (m_maxCharWidth <= 0.f)
+ m_maxCharWidth = std::max(m_avgCharWidth, float(m_ascent));
Can m_ascent be negative? If not, then the max() is superfluous.
+ SInt16 iAvgWidth = buffer[OS2xAvgCharWidthOffset] * 256 + buffer[OS2xAvgCharWidthOffset + 1];
+ if (iAvgWidth > 0.f)
iAvgWidth is an integer. I think the above forces a costly conversion and float comparison. Even if it does not, you should just compare with integer 0.
As I have said before, this seems risky. You should retrieve the version field and proceed only if it is one of the versions known to contain the average width field. The result of ignoring an unknown version is not as bad as the result of misinterpreting it.
+ // Compute the maximum width of a character in this font.
+ m_maxCharWidth = [m_font.font() maximumAdvancement].width;
maximumAdvancement returns an NSSize, so it is not okay to send it to nil and use the result. m_font.font() can be nil if this is a CSS3 Web Font.
A couple questions before I upload a new patch...
(In reply to comment #62)
> + if (m_maxCharWidth <= 0.f)
> + m_maxCharWidth = std::max(m_avgCharWidth, float(m_ascent));
>
> Can m_ascent be negative? If not, then the max() is superfluous.
Do we know that m_ascent >= m_avgCharWidth? Please excuse my ignorance, I don't much about fonts.
> + SInt16 iAvgWidth = buffer[OS2xAvgCharWidthOffset] * 256 +
> buffer[OS2xAvgCharWidthOffset + 1];
> + if (iAvgWidth > 0.f)
>
> iAvgWidth is an integer. I think the above forces a costly conversion and float
> comparison. Even if it does not, you should just compare with integer 0.
>
> As I have said before, this seems risky. You should retrieve the version field
> and proceed only if it is one of the versions known to contain the average
> width field. The result of ignoring an unknown version is not as bad as the
> result of misinterpreting it.
Sorry, I wasn't ignoring your previous statement. I didn't include another comment Amanda had made, which is that every version ought to have the average character width field. So checking the version number would not help in identifying the ones giving a bad result. Why some fonts give a negative value is still unclear.
(In reply to comment #63)
> A couple questions before I upload a new patch...
>
> (In reply to comment #62)
> > + if (m_maxCharWidth <= 0.f)
> > + m_maxCharWidth = std::max(m_avgCharWidth, float(m_ascent));
> >
> > Can m_ascent be negative? If not, then the max() is superfluous.
>
> Do we know that m_ascent >= m_avgCharWidth? Please excuse my ignorance, I don't
> much about fonts.
Ignore that comment, I misread the code. I have a different comment instead:
You should just add "using namespace std;" at the top of the file, and use unqualified max here. Don't use a C-style cast to float. Either use a static_cast<float>, or use an explicit max<float>(m_avgCharWidth, m_ascent);
> > + SInt16 iAvgWidth = buffer[OS2xAvgCharWidthOffset] * 256 +
> > buffer[OS2xAvgCharWidthOffset + 1];
> > + if (iAvgWidth > 0.f)
> >
> > iAvgWidth is an integer. I think the above forces a costly conversion and float
> > comparison. Even if it does not, you should just compare with integer 0.
> >
> > As I have said before, this seems risky. You should retrieve the version field
> > and proceed only if it is one of the versions known to contain the average
> > width field. The result of ignoring an unknown version is not as bad as the
> > result of misinterpreting it.
>
> Sorry, I wasn't ignoring your previous statement. I didn't include another
> comment Amanda had made, which is that every version ought to have the average
> character width field. So checking the version number would not help in
> identifying the ones giving a bad result. Why some fonts give a negative value
> is still unclear.
Do those fonts that give a negative value have a valid ("known", "documented") version number? My understanding was that they have "garbage" in the version number field.
(In reply to comment #64)
> Do those fonts that give a negative value have a valid ("known", "documented")
> version number? My understanding was that they have "garbage" in the version
> number field.
Unfortunately, no. For example, Lucida Grande (at least the version shipped with Mac OS X 10.5.6) has a version number of 0002 and an xAvgCharWidth value of FC15. This was the font that prompted this patch, in fact, since it's the default font used for unstyled text fields in western languages.
(In reply to comment #65)
> (In reply to comment #64)
> > Do those fonts that give a negative value have a valid ("known", "documented")
> > version number? My understanding was that they have "garbage" in the version
> > number field.
>
> Unfortunately, no. For example, Lucida Grande (at least the version shipped
> with Mac OS X 10.5.6) has a version number of 0002 and an xAvgCharWidth value
> of FC15. This was the font that prompted this patch, in fact, since it's the
> default font used for unstyled text fields in western languages.
I see. Thank you.
Created attachment 29640[details]
rolled out code
Attaching the patch you get from svn di -r 42698:42697. In theory, you can apply this to r 42697 to return your tree to the state it was in tonight: patch for this bug rolled in; windows tests failing but added to a skipped list; tiger unable to build.
OK. So, to get this working on tiger we could do something like the following:
#if USE(ATSUI) && !defined(__LP64__)
if (m_atsuFontID) {
ATSFontMetrics horizMetrics;
ATSFontRef atsFont = FMGetATSFontRefFromFont(m_atsuFontID);
ATSFontGetHorizontalMetrics(atsFont, kATSOptionFlagsDefault, &horizMetrics);
m_avgCharWidth = horizMetrics.avgAdvanceWidth * pointSize;
m_maxCharWidth = horizMetrics.maxAdvanceWidth * pointSize;
return;
}
#else
... OS/2 table code as before ...
#endif
We're not sure yet whether that returns the same values as digging into the OS/2 table though. It would be great if we could just use the ATSUI version on all platforms, but FMGetATSFontRefFromFont is a 32bit only API (according to http://developer.apple.com/DOCUMENTATION/Carbon/Reference/Font_Manager/Reference/reference.html#//apple_ref/c/func/FMGetATSFontRefFromFont) .
Is there another API we could use that does something similar and is available to 64 bit? Mitz, do you know by any chance?
Grabbed more data. Including it at the bottom, but here's the summary:
-The OS/2 Table and the ATSUI tables match exactly for maxCharWidth on my Leopard machine. Presumably these are the same values you'd get out of the Tiger ATSUI tables.
-The ATSUI table *never* returns avgCharWidth, the OS/2 table does ~50% of the time.
-Computing the avgCharWidth according to http://developer.apple.com/textfonts/TTRefMan/RM06/Chap6OS2.html, which involves a weighted average over 27 specific glyphs returns numbers that are very close to the OS/2 numbers.
So, I propose the following that will get consistent rendering on Tiger and Leopard:
if (TIGER)
maxCharWidth = maxCharWidth out of the ATSUI table
else
maxCharWidth = maxCharWidth out of the OS/2 table
avgCharWidth = weighted avg according to http://developer.apple.com/textfonts/TTRefMan/RM06/Chap6OS2.html
//fallback to cross-platform initCharWidths() code if either value is not positive
I'll post a patch with this code soon. If it's OK, I'd like to leave out the layout test expected results until the code itself is approved as I'll need to generate windows expected results as well.
This will still leave us still not matching Windows metrics because Windows gets considerably (~2x) larger maxCharWidths for most fonts and is slightly different for avgCharWidth on some fonts. This does get us closer though.
Values are for fonts below are in the following order: Lucida Grande, Courier, Helvetica, Monaco, Times, Andale Mono, Arial, Comic Sans, Courier New, Georgia, Impact, Times New Roman, Trebuchet, Verdana, Webdings.
Uses:
-CGFontCopyTableForTag, then grabbing avgCharWidth out of the OS/2 Table
-m_maxCharWidth = [m_font.font() maximumAdvancement].width;
OS2. avg: 0.000000, max: 19.250000
OS2. avg: 0.000000, max: 27.539062
OS2. avg: 0.000000, max: 28.843750
OS2. avg: 0.000000, max: 9.055664
OS2. avg: 0.000000, max: 16.505371
OS2. avg: 0.000000, max: 6.670898
OS2. avg: 0.000000, max: 18.933105
OS2. avg: 6.601074, max: 6.601074
OS2. avg: 4.855469, max: 22.000000
OS2. avg: 6.601074, max: 6.977051
OS2. avg: 4.839355, max: 13.427734
OS2. avg: 4.500977, max: 14.125977
OS2. avg: 4.409668, max: 22.000000
OS2. avg: 5.591309, max: 16.736328
OS2. avg: 10.683105, max: 14.206543
Uses:
ATSFontGetHorizontalMetrics(atsFont, kATSOptionFlagsDefault, &horizMetrics);
m_avgCharWidth = horizMetrics.avgAdvanceWidth * m_font.m_size;
m_maxCharWidth = horizMetrics.maxAdvanceWidth * m_font.m_size;
ATSUI. avg: 0.000000, max: 19.250000
ATSUI. avg: 0.000000, max: 27.539062
ATSUI. avg: 0.000000, max: 28.843750
ATSUI. avg: 0.000000, max: 9.055664
ATSUI. avg: 0.000000, max: 16.505371
ATSUI. avg: 0.000000, max: 6.670898
ATSUI. avg: 0.000000, max: 18.933105
ATSUI. avg: 0.000000, max: 6.601074
ATSUI. avg: 0.000000, max: 22.000000
ATSUI. avg: 0.000000, max: 6.977051
ATSUI. avg: 0.000000, max: 13.427734
ATSUI. avg: 0.000000, max: 14.125977
ATSUI. avg: 0.000000, max: 22.000000
ATSUI. avg: 0.000000, max: 16.736328
ATSUI. avg: 0.000000, max: 14.206543
Computed according to http://developer.apple.com/textfonts/TTRefMan/RM06/Chap6OS2.html.
COMPUTE WEIGHTED. avg: 5.757157, max: 19.250000
COMPUTE WEIGHTED. avg: 6.288000, max: 28.000000
COMPUTE WEIGHTED. avg: 6.749000, max: 29.000000
COMPUTE WEIGHTED. avg: 7.000000, max: 9.000000
COMPUTE WEIGHTED. avg: 4.823000, max: 17.000000
COMPUTE WEIGHTED. avg: 7.000000, max: 7.000000
COMPUTE WEIGHTED. avg: 4.639000, max: 19.000000
COMPUTE WEIGHTED. avg: 7.000000, max: 7.000000
COMPUTE WEIGHTED. avg: 4.823000, max: 22.000000
COMPUTE WEIGHTED. avg: 7.000000, max: 7.000000
COMPUTE WEIGHTED. avg: 4.871000, max: 13.000000
COMPUTE WEIGHTED. avg: 4.596000, max: 14.000000
COMPUTE WEIGHTED. avg: 4.639000, max: 22.000000
COMPUTE WEIGHTED. avg: 5.761000, max: 17.000000
COMPUTE WEIGHTED. avg: 10.070000, max: 14.000000
Mitz, I assume you'll be reviewing this again. I'd like to get the code reviewed before uploading the patch for the LayoutTests if that's OK as the review very well might change the test results anyways. The LayoutTests changed slightly because the metrics we compute now are slightly different. No other changes from the previous patch. That said, if you want me to upload the test results as well, I can do that.
This patch is the same as the previous one, except in how it calculates m_avgCharWidth and m_maxCharWidth in SimpleFontDataMac.mm. The old method, looking at the OS/2 table via CGFontCopyTableForTag, doesn't work in Tiger and using ATSUI doesn't work no 64-bit.
The new code should work the same on Tiger and Leopard. I don't have Tiger, so I wasn't able to run the tests, but Amanda confirmed for me that the APIs we're now using all exist on Tiger and give the same results as they do in Leopard.
Created attachment 29879[details]
LayoutTests side of patch. Run on Windows, Tiger and Leopard.
711 files changed, 17317 insertions(+), 4402 deletions(-)
Comment on attachment 29764[details]
WebCore side of patch.
> + if (glyphPageZero) {
> + static int weights[] = { 64, 14, 27, 35, 100, 20, 14, 42, 63, 3, 6, 35, 20, 56, 56, 17, 4, 49, 56, 71, 31, 10, 18, 3, 18, 2, 166 };
> + float sum = 0.f;
> + int numGlyphs = 27;
> + ASSERT(numGlyphs == sizeof(weights)/sizeof(int));
Please add spaces around the / sign.
> + // Compute the weighted sum of the space character and the lowercase letters in the Latin alphabet.
> + for (int i = 0; i < numGlyphs; i++) {
> + Glyph glyph = glyphPageZero->glyphDataForCharacter((i < 26 ? i + 'a' : ' ')).glyph;
> + if (glyph)
> + sum += widthForGlyph(glyph) * weights[i];
> + }
> + m_avgCharWidth = sum / 1000;
It may be better to add up the total weight of found glyphs and divide by that instead of 1000. Consider a font that contains a space glyph but no Latin (Geeza Pro is an example, if I remember correctly).
r=me , but I suggest making the above change before landing the patch.
Comment on attachment 29879[details]
LayoutTests side of patch. Run on Windows, Tiger and Leopard.
> + On Windows, I'm not 100% confident in some of the rebaselines. Specifically, the
> + following editing ones are missing editing callbacks in the new results
A well-kept secret of Windows DumpRenderTree is that it does not dump editing delegate callback (even though it looks like most of the code to do that appears to be there). run-webkit-tests masks the difference when running the tests on Windows with Mac results.
We usually omit the long list of updated results from the change log in cases like this.
Comment on attachment 29882[details]
WebCore side of patch with final review comments implemented.
> + if (sum > 0.f && totalWeight > 0)
> + m_avgCharWidth = sum / 1000;
You should divide by totalWeight!
Comment on attachment 29882[details]
WebCore side of patch with final review comments implemented.
Ugh. Sorry about that. Thanks for being thorough mitz. Testing with the actual correct code now. Will upload a patch soon...
Created attachment 29884[details]
LayoutTests side of patch. Run on Windows, Tiger and Leopard.
711 files changed, 16587 insertions(+), 4402 deletions(-)
2007-09-28 17:10 PDT, Anantha Keesara
2008-05-08 13:51 PDT, Eric Seidel (no email)
2008-05-16 14:32 PDT, Eric Seidel (no email)
2008-06-26 18:42 PDT, Ojan Vafai
2008-06-26 18:54 PDT, Ojan Vafai
2009-03-24 18:42 PDT, Ojan Vafai
2009-03-31 16:14 PDT, Ojan Vafai
2009-03-31 17:44 PDT, Ojan Vafai
2009-04-03 12:06 PDT, Ojan Vafai
2009-04-08 18:38 PDT, Ojan Vafai
2009-04-10 16:08 PDT, Ojan Vafai
2009-04-10 16:57 PDT, Ojan Vafai
2009-04-20 11:41 PDT, Ojan Vafai
2009-04-20 22:26 PDT, Geoffrey Garen
2009-04-24 14:24 PDT, Ojan Vafai
2009-04-24 14:26 PDT, Ojan Vafai
2009-04-28 19:01 PDT, Ojan Vafai
2009-04-28 21:21 PDT, Ojan Vafai
2009-04-29 11:02 PDT, Ojan Vafai
2009-04-29 11:45 PDT, Ojan Vafai
2009-04-29 11:45 PDT, Ojan Vafai
2009-04-29 11:49 PDT, Ojan Vafai
2009-04-29 15:58 PDT, Ojan Vafai