Bug 15610

Summary: [GTK] Text rendering using Pango
Product: WebKit Reporter: Alp Toker <alp>
Component: New BugsAssignee: Sven Herzberg <sven>
Status: RESOLVED FIXED    
Severity: Normal CC: mh+webkit, pravi.a, sven, xan.lopez
Priority: P2 Keywords: Cairo, Gtk
Version: 523.x (Safari 3)   
Hardware: All   
OS: All   
Attachments:
Description Flags
Proof of concept Pango text rendering
none
0001-Update-Alp-s-patch-to-the-new-APIs-TextStyle-folded.patch
none
pangowebkit.diff
none
Rendering bug with pango.
none
pangorendering.diff
alp: review-
0001-Use-pango-to-render-complex-path.patch alp: review+

Description Alp Toker 2007-10-22 01:29:35 PDT
We should support text rendering through Pango/Cairo in addition to the traditional path.
Comment 1 Alp Toker 2007-11-12 23:23:51 PST
Created attachment 17222 [details]
Proof of concept Pango text rendering

This patch is based partly on the text rendering code from gtk-webcore. It uses Pango/Cairo for all text rendering, forcing WebCore to use the Complex CodePath all the time (which is useful for testing).

It does not support font selection but just uses the Pango defaults. There is a bug with the position text is drawn at, so I've disabled text underlining in GraphicsContext for now to make browsing bearable.

I haven't spent much time on this feature so I'm sure there is plenty of room for improvement/correction.
Comment 2 Alp Toker 2007-12-09 11:01:14 PST
Sven, is there any word on this one? I'd like to get whatever patch we have into SVN, this is becoming an important missing feature.
Comment 3 Praveen A 2007-12-27 18:30:48 PST
Open Office is using icu for text layout in addition to content manipulation. Can we use icu for text layout also in WebKit? This would enable rendering of complex text like Indic (Malayalam, Hindi, Tamil...), Arabic, Thai.
Comment 4 Praveen A 2007-12-27 18:37:08 PST
Some more info about how ICU actually handles complex text
http://www.icu-project.org/userguide/layoutEngine.html
Comment 5 Xan Lopez 2007-12-27 22:12:10 PST
Some comments on the code while I wait for WebKit to build and test it...

+// FIXME: having this code here is silly
+static gchar* convertUniCharToUTF8(const UChar* characters, int from, int to)

I assume this means: transform UChar, starting at character (and not byte) 'from', up to character 'to', to UTF8.

+    gchar *utf8 = g_utf16_to_utf8(str, to, NULL, NULL, NULL);

No error checking, but that was an easy one.

+    if (from > 0) {
+        gchar *str_left = g_strdup(utf8);
+        str_left = g_utf8_strncpy(str_left, utf8, from);

Dup utf8 to str_left and immediately overwrite it? With the 'from' first characters from utf8?

+        g_free(utf8);
+        utf8 = str_left;
+    }

Ok, so utf8 now contains its 'from' first characters?

The code to substitute line breaks for spaces comes after that. Looks ok at first sight (wonder if there's no easier way to do it), but not sure the code up to that point is ok.

Comment 6 Xan Lopez 2007-12-28 13:05:24 PST
Created attachment 18152 [details]
0001-Update-Alp-s-patch-to-the-new-APIs-TextStyle-folded.patch

Subject: [PATCH] Update Alp's patch to the new APIs (TextStyle folded into TextRun)

Also substract descent() from point.y() - ascent() in the cairo_translate
call. Seems to 'almost' fix the mispositioning of the text; still a small
offset though.
---
 WebCore/platform/graphics/Font.cpp        |    2 +-
 WebCore/platform/graphics/gtk/FontGtk.cpp |  160 +++++++++++++++++++++++++++--
 2 files changed, 152 insertions(+), 10 deletions(-)
Comment 7 Xan Lopez 2008-01-06 14:01:11 PST
Created attachment 18303 [details]
pangowebkit.diff

Ok, some more fixes. Mainly addressing behdad's review (done by email):

- Check more thoroughly the UTF16->UTF8 conversion (code by behdad from mozilla, with two small changes: break when need_copy is TRUE during validation, no need to continue iterating; s/PRUnichar/UChar/, etc).

- Try to do something sane when from > 0 in convertUniCharToUTF8.

- Do not fallback to the attr list, it's on by default.

- Use PangoLayoutLine and pango_cairo_show_layout_line in drawComplexText, so we can stop messing with ascent(), descent(), etc. The line will automatically have the coordinates we want (but see last note).

Note: there's still some glitches with latin text; capital letters seem to be cut off by a few pixels some times. Attaching shot.
Comment 8 Xan Lopez 2008-01-06 14:04:18 PST
Created attachment 18304 [details]
Rendering bug with pango.
Comment 9 Xan Lopez 2008-01-06 14:05:47 PST
Actually the fix for from > 0 will leak memory. Need to copy only the data I want, not strdup the whole string. Will fix that in the next patch when we figure out the rendering bug.
Comment 10 Xan Lopez 2008-01-08 11:08:23 PST
Created attachment 18330 [details]
pangorendering.diff

Ok, thanks to Behdad the last glitch is gone as far as I can tell. I think this is good enough for now until we switch to the low level pango APIs like Firefox3.
Comment 11 Alp Toker 2008-01-08 15:28:37 PST
Comment on attachment 18330 [details]
pangorendering.diff

This works really well. There are a few coding style issues to be sorted out before it can be landed though.

>diff --git a/WebCore/platform/graphics/gtk/FontGtk.cpp b/WebCore/platform/graphics/gtk/FontGtk.cpp
>index 3b3a228..790d762 100644
>--- a/WebCore/platform/graphics/gtk/FontGtk.cpp
>+++ b/WebCore/platform/graphics/gtk/FontGtk.cpp
>@@ -1,6 +1,10 @@
> /*
>  * Copyright (C) 2006 Apple Computer, Inc.  All rights reserved.
>  * Copyright (C) 2006 Michael Emmel mike.emmel@gmail.com 
>+ * Copyright (c) 2007 Hiroyuki Ikezoe All rights reserved.
>+ * Copyright (c) 2007 Kouhei Sutou All rights reserved.
>+ * Copyright (C) 2007 Alp Toker <alp@atoker.com>
>+ * Copyright (C) 2008 Xan Lopez <xan@gnome.org>
>  * All rights reserved.
>  *
>  * Redistribution and use in source and binary forms, with or without
>@@ -33,12 +37,144 @@
> #include "SimpleFontData.h"
> #include <cairo.h>
> 
>+#include <pango/pango.h>
>+#include <pango/pangocairo.h>
>+
> namespace WebCore {
> 
>+/* From Mozilla, with small changes */
>+
>+#define IS_HIGH_SURROGATE(u)  ((UChar)(u) >= (UChar)0xd800 && (UChar)(u) <= (UChar)0xdbff)
>+#define IS_LOW_SURROGATE(u)  ((UChar)(u) >= (UChar)0xdc00 && (UChar)(u) <= (UChar)0xdfff)
>+
>+static void
>+utf16_to_utf8 (const UChar* aText, gint aLength, char *&text, gint &length)

Should be no space before (

>+{
>+  gboolean need_copy = FALSE;
>+  int i;
>+
>+  for (i = 0; i < aLength; i++) {

You can put the int i within the for statement.

>+    if (!aText[i] || IS_LOW_SURROGATE(aText[i])) {
>+      need_copy = TRUE;
>+      break;
>+    }
>+    else if (IS_HIGH_SURROGATE(aText[i])) {
>+      if (i < aLength - 1 && IS_LOW_SURROGATE(aText[i+1]))
>+        i++;
>+      else {
>+        need_copy = TRUE;
>+        break;
>+      }
>+    }
>+  }
>+
>+  if (need_copy) {
>+
>+    /* Pango doesn't correctly handle nuls.  We convert them to 0xff. */
>+    /* Also "validate" UTF-16 text to make sure conversion doesn't fail. */
>+
>+    UChar *p = (UChar *)g_memdup(aText, aLength * sizeof(aText[0]));

Watch out for the position of *.

>+
>+    /* don't need to reset i */
>+    for (i = 0; i < aLength; i++) {
>+      if (!p[i] || IS_LOW_SURROGATE(p[i]))
>+        p[i] = 0xFFFD;
>+      else if (IS_HIGH_SURROGATE(p[i])) {
>+        if (i < aLength - 1 && IS_LOW_SURROGATE(aText[i+1]))
>+          i++;
>+        else
>+          p[i] = 0xFFFD;
>+      }
>+    }
>+
>+    aText = p;
>+  }
>+
>+  glong items_written;
>+  text = g_utf16_to_utf8(aText, aLength, NULL, &items_written, NULL);
>+  length = items_written;
>+
>+  if (need_copy)
>+    g_free((gpointer) aText);
>+
>+}
>+
>+// FIXME: having this code here is silly
>+static gchar* convertUniCharToUTF8(const UChar* characters, gint length, int from, int to)
>+{
>+    gchar *utf8 = 0;
>+    gint new_length = 0;
>+    utf16_to_utf8(characters, length, utf8, new_length);
>+    if (!utf8) return NULL;
>+
>+    if (from > 0) {
>+        // discard the first 'from' characters 
>+        // FIXME: we should do this before the conversion probably
>+        gchar *str_left = g_utf8_offset_to_pointer(utf8, from);
>+        gchar *tmp = g_strdup(str_left);
>+        g_free(utf8);
>+        utf8 = tmp;
>+    }
>+
>+    gchar *pos = utf8;
>+    gint len = strlen(pos);
>+    GString *ret = g_string_new_len(NULL, len);
>+
>+    // replace line break by space
>+    while (len > 0) {
>+        gint index, start;
>+        pango_find_paragraph_boundary(pos, len, &index, &start);
>+        g_string_append_len(ret, pos, index);
>+        if (index == start)
>+            break;
>+        g_string_append_c(ret, ' ');
>+        pos += start;
>+        len -= start;
>+    }
>+    return g_string_free(ret, FALSE);
>+}
>+
>+static void setPangoAttributes(const Font* font, const TextRun& run, PangoLayout* layout)
>+{
>+    PangoAttrList* list = pango_attr_list_new();
>+    PangoAttribute* attr;
>+
>+    attr = pango_attr_size_new_absolute((int)(font->size() * PANGO_SCALE));
>+    attr->end_index = G_MAXUINT;
>+    pango_attr_list_insert_before(list, attr);
>+
>+    if (!run.spacingDisabled()) {
>+        attr = pango_attr_letter_spacing_new(font->letterSpacing() * PANGO_SCALE);
>+        attr->end_index = G_MAXUINT;
>+        pango_attr_list_insert_before(list, attr);
>+    }
>+
>+    // Pango does not yet support synthesising small caps
>+    /*
>+    printf ("small caps: %d\n", font->isSmallCaps());
>+
>+    if (font->isSmallCaps()) {
>+        attr = pango_attr_variant_new(PANGO_VARIANT_SMALL_CAPS);
>+        attr->end_index = G_MAXUINT;
>+        pango_attr_list_insert_before(list, attr);
>+    }
>+    */

Commented-out code is against coding style, though I personally wouldn't mind leaving this snippet around so it doesn't get lost. If you want to remove it, you could replace it with a link to this bug and a comment.

>+
>+    pango_layout_set_attributes(layout, list);
>+    pango_attr_list_unref(list);
>+
>+    pango_layout_set_auto_dir(layout, FALSE);
>+
>+    PangoContext* pango_context = pango_layout_get_context(layout);
>+    PangoDirection direction = run.rtl() ? PANGO_DIRECTION_RTL : PANGO_DIRECTION_LTR;
>+    pango_context_set_base_dir(pango_context, direction);
>+}
>+
> void Font::drawGlyphs(GraphicsContext* graphicsContext, const SimpleFontData* font, const GlyphBuffer& glyphBuffer,
>                       int from, int numGlyphs, const FloatPoint& point) const
> {
>     cairo_t* context = graphicsContext->platformContext();
>+    cairo_save(context);
> 
>     // Set the text color to use for drawing.
>     float red, green, blue, alpha;
>@@ -59,26 +195,86 @@ void Font::drawGlyphs(GraphicsContext* graphicsContext, const SimpleFontData* fo
>         offset += glyphBuffer.advanceAt(from + i);
>     }
>     cairo_show_glyphs(context, glyphs, numGlyphs);
>+
>+    cairo_restore(context);
> }
> 
>-void Font::drawComplexText(GraphicsContext*, const TextRun&, const FloatPoint&, int from, int to) const
>+void Font::drawComplexText(GraphicsContext* context, const TextRun& run, const FloatPoint& point, int from, int to) const
> {
>-    notImplemented();
>+    cairo_t* cr = context->platformContext();
>+    cairo_save(cr);
>+
>+    PangoLayout* layout = pango_cairo_create_layout(cr);
>+
>+    gchar* utf8 = convertUniCharToUTF8(run.characters(), run.length(), from, to);
>+    pango_layout_set_text(layout, utf8, -1);
>+    g_free(utf8);
>+
>+    setPangoAttributes(this, run, layout);
>+
>+    // Set the text color to use for drawing.
>+    float red, green, blue, alpha;
>+    Color penColor = context->fillColor();
>+    penColor.getRGBA(red, green, blue, alpha);
>+    cairo_set_source_rgba(cr, red, green, blue, alpha);
>+
>+    // Our layouts are single line
>+    cairo_move_to(cr, point.x(), point.y());
>+    PangoLayoutLine *layout_line = pango_layout_get_line_readonly(layout, 0);
>+    pango_cairo_show_layout_line(cr, layout_line);
>+
>+    g_object_unref(layout);
>+    cairo_restore(cr);
> }
> 
>-float Font::floatWidthForComplexText(const TextRun&) const
>+float Font::floatWidthForComplexText(const TextRun& run) const
> {
>-    notImplemented();
>-    return 0.0f;
>+    if (run.length() == 0)
>+        return 0.0f;
>+
>+    // FIXME: we should create the layout with our actual context
>+    PangoFontMap* map = pango_cairo_font_map_get_default();
>+    PangoContext* pango_context = pango_cairo_font_map_create_context(PANGO_CAIRO_FONT_MAP(map));
>+    PangoLayout* layout = pango_layout_new(pango_context);
>+    g_object_unref(pango_context);

This chunk could move into a static helper function and be re-used.

Also, I'd recommend removing FIXME/TODOs or clarifying them.
Comment 12 Xan Lopez 2008-01-08 19:49:46 PST
Created attachment 18341 [details]
0001-Use-pango-to-render-complex-path.patch

Addressed comments.
Comment 13 Alp Toker 2008-01-08 20:42:42 PST
Comment on attachment 18341 [details]
0001-Use-pango-to-render-complex-path.patch

r=me

Will fix up some remaining style issues (* position, camel case, a badly indented if statement) before landing as discussed.
Comment 14 Alp Toker 2008-01-08 20:45:50 PST
Landed in r29334.