Bug 136337 - WebKit using HarfBuzz does not display Arabic script correctly
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc.
Version: 528+ (Nightly build)
Hardware: Other Other
Importance: P2 Normal
Assignee: Nobody
URL: http://www.aljazeera.net/portal
Reported: 2014-08-28 03:47 PDT by Doron
Modified: 2014-12-08 03:34 PST

Attachments
Patch (1.71 KB, patch)
2014-12-08 03:22 PST, Alberto Garcia
cgarcia: review+

Description Doron 2014-08-28 03:47:08 PDT
None of the text for http://www.aljazeera.net/portal displays correctly.

Platform: PowerPC (big-endian).

After debugging, I found that this is caused by an endianness bug in harfBuzzGetGlyph(). It can be fixed with the following patch:

diff a/Source/WebCore/platform/graphics/harfbuzz/HarfBuzzFaceCairo.cpp b/Source/WebCore/platform/graphics/harfbuzz/HarfBuzzFaceCairo.cpp
index ecaafc1..2c643b3 100644
--- a/Source/WebCore/platform/graphics/harfbuzz/HarfBuzzFaceCairo.cpp
+++ b/Source/WebCore/platform/graphics/harfbuzz/HarfBuzzFaceCairo.cpp
@@ -111,7 +111,8 @@ static hb_bool_t harfBuzzGetGlyph(hb_font_t*, void* fontData, hb_codepoint_t uni
     if (result.isNewEntry) {
         cairo_glyph_t* glyphs = 0;
         int numGlyphs = 0;
-        CString utf8Codepoint = UTF8Encoding().encode(reinterpret_cast<UChar*>(&unicode), 1, QuestionMarksForUnencodables);
+        UChar ch = unicode;
+        CString utf8Codepoint = UTF8Encoding().encode(&ch, 1, QuestionMarksForUnencodables);
         if (cairo_scaled_font_text_to_glyphs(scaledFont, 0, 0, utf8Codepoint.data(), utf8Codepoint.length(), &glyphs, &numGlyphs, 0, 0, 0) != CAIRO_STATUS_SUCCESS)
             return false;
         if (!numGlyphs)

The unfixed code works on little-endian architectures because reinterpreting the pointer to the 32-bit hb_codepoint_t as a 16-bit UChar* reads the two least significant bytes. On big-endian architectures the same cast reads the two most significant bytes, which are zero for every code point below U+10000, so the encoder is always handed U+0000.

As a knock-on effect, no glyphs are returned.
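
The difference between the two approaches can be reproduced outside WebKit. The following is a minimal standalone sketch, not WebKit code: the helper names firstHalfInMemory and lowHalfByValue are made up for illustration, uint32_t and uint16_t stand in for hb_codepoint_t and UChar, and std::memcpy is used in place of the pointer cast so the example stays well-defined C++.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the 16-bit value stored in the first two bytes
// of a 32-bit code point, which is what a read through a reinterpreted 16-bit
// pointer observes. The result depends on the byte order of the host.
static uint16_t firstHalfInMemory(uint32_t codepoint)
{
    uint16_t half;
    std::memcpy(&half, &codepoint, sizeof(half));
    return half;
}

// Hypothetical helper: converts by value, as the patch does with
// "UChar ch = unicode;". This keeps the low 16 bits on any host.
static uint16_t lowHalfByValue(uint32_t codepoint)
{
    return static_cast<uint16_t>(codepoint);
}
```

For U+0628 (ARABIC LETTER BEH), firstHalfInMemory(0x0628) yields 0x0628 on a little-endian host and 0x0000 on a big-endian one, while lowHalfByValue(0x0628) yields 0x0628 on both.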
Comment 1 Alberto Garcia 2014-12-08 03:22:06 PST
Created attachment 242796
Patch

Thanks for the patch!
Comment 2 Alberto Garcia 2014-12-08 03:34:19 PST
Committed r176945: <http://trac.webkit.org/changeset/176945>