RESOLVED FIXED 312973
webkit_uri_for_display fails on s390x
https://bugs.webkit.org/show_bug.cgi?id=312973
nathan.teodosio
Reported 2026-04-22 01:00:35 PDT
Created attachment 479239 [details]
Log of failed build

After an archive rebuild, this test in Epiphany-Browser fails only for s390x:

    static void
    test_ephy_uri_decode (void)
    {
      g_autofree char *result = NULL;
      result = ephy_uri_decode ("https://ja.wikipedia.org/wiki/%E3%83%A1%E3%82%A4%E3%83%B3%E3%83%9A%E3%83%BC%E3%82%B8");
      g_assert_cmpstr (result, ==, "https://ja.wikipedia.org/wiki/メインページ");
    }

This must have been caused by some change between WebKit 2.50.3 (where the test succeeds) and 2.51.92 (where the test fails).
Attachments
Log of failed build (72.45 KB, application/gzip)
2026-04-22 01:00 PDT, nathan.teodosio
no flags
Alberto Garcia
Comment 1 2026-04-22 12:23:19 PDT
Since this affects s390x only this is probably an endianness problem.

In Source/WTF/wtf/text/StringImpl.cpp every conversion between utf8 <=> utf16 is guarded by #if CPU(BIG_ENDIAN) except the one in StringImpl::create(std::span<const char8_t> codeUnits), so this might be the problem.

    --- a/Source/WTF/wtf/text/StringImpl.cpp
    +++ b/Source/WTF/wtf/text/StringImpl.cpp
    @@ -305,7 +305,11 @@ RefPtr<StringImpl> StringImpl::create(std::span<const char8_t> codeUnits)
         std::span<char16_t> data;
         auto string = createUninitializedInternalNonEmpty(utf16Length, data);
    +#if CPU(BIG_ENDIAN)
    +    size_t written = simdutf::convert_valid_utf8_to_utf16be(input, inputLength, data.data());
    +#else
         size_t written = simdutf::convert_valid_utf8_to_utf16le(input, inputLength, data.data());
    +#endif
         RELEASE_ASSERT_WITH_SECURITY_IMPLICATION(written == utf16Length);
         return string;

I haven't been able to verify it, however, and I'll be a few days away from my computer.

The code in the main branch is different, so I don't know if it's affected.
Michael Catanzaro
Comment 2 2026-04-22 20:29:58 PDT
That's presumably correct, so might as well land it. Bonus points if Nathan wants to check to confirm that it works.
Michael Catanzaro
Comment 3 2026-04-22 20:36:23 PDT
Actually, since we always want native byte order, we can remove the preprocessor guards and just always use simdutf::convert_valid_utf8_to_utf16, which chooses the correct one for us, instead of the be/le versions. I assume the use of separate be/le versions was just an oversight?
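To illustrate the "native byte order" point, here is a minimal sketch rather than WebKit code. It does not use simdutf, and hostByteOrder, opposite, and asciiToUtf16 are hypothetical names invented for this example. It handles ASCII input only and assumes char16_t is exactly two bytes wide. It stores each code unit as two raw bytes in an explicit byte order, the way an le or be converter fills a char16_t buffer, so the effect of picking the wrong variant can be seen directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helpers for illustration only; none of these names are
// WebKit or simdutf API.
enum class ByteOrder { Little, Big };

// Detect the host byte order by checking how a 16-bit value is laid out in memory.
inline ByteOrder hostByteOrder()
{
    const std::uint16_t probe = 0x0102;
    unsigned char bytes[sizeof probe];
    std::memcpy(bytes, &probe, sizeof probe);
    return bytes[0] == 0x01 ? ByteOrder::Big : ByteOrder::Little;
}

inline ByteOrder opposite(ByteOrder order)
{
    return order == ByteOrder::Big ? ByteOrder::Little : ByteOrder::Big;
}

// Convert ASCII-only input to UTF-16, writing each code unit as two raw bytes
// in the requested byte order (assumes sizeof(char16_t) == 2).
inline std::vector<char16_t> asciiToUtf16(const std::string& input, ByteOrder order)
{
    std::vector<char16_t> out(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(input[i]);
        unsigned char bytes[2];
        if (order == ByteOrder::Little) {
            bytes[0] = c; // low byte first
            bytes[1] = 0;
        } else {
            bytes[0] = 0; // high byte first
            bytes[1] = c;
        }
        std::memcpy(&out[i], bytes, sizeof bytes);
    }
    return out;
}
```

With these helpers, asciiToUtf16("h", hostByteOrder()) yields the code unit U+0068 on any host, while asciiToUtf16("h", opposite(hostByteOrder())) yields U+6800, which is the kind of byte-swapped value an unconditional le conversion would produce on a big-endian machine.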
Alberto Garcia
Comment 4 2026-04-23 03:30:20 PDT
> Actually, since we always want native byte order, we can remove the preprocessor guards and just always use simdutf::convert_valid_utf8_to_utf16, which chooses the correct one for us, instead of the be/le versions. I assume the use of separate be/le versions was just an oversight?

I think we have two options here:

1. Cherry-pick the fixes from main. I think that would mean (at least) 307875@main and 310857@main, but they change more things and I don't have time to test them.
2. Make the minimal change to fix this specific issue. In this case we can keep the #if CPU(BIG_ENDIAN) to keep it consistent with the rest of the file, or use simdutf::convert_valid_utf8_to_utf16 as you suggest. I think it's the same either way.

I just verified that option (2) solves the problem with a test build on s390x, so I'll prepare a pull request.

The problem was also visible in the details of the failed assertion, from the logs:

    "\346\240\200\347\220\200\347\220\200\347\200\200\347\214\200\343\250\200\342\274\200\342\274\200\346\250\200\346\204\200\342\270\200\347\234\200\346\244\200\346\254\200\346\244\200\347\200\200\346\224\200\346\220\200\346\244\200\346\204\200\342\270\200\346\274\200\347\210\200\346\234\200 [...]" == "https://ja.wikipedia.org [...])

The actual value on the left is the result of doing this conversion:

    $ printf "https://ja.wikipedia.org" | iconv -f utf8 -t utf16le | iconv -f utf16be -t utf8 | od -An -t o1
     346 240 200 347 220 200 347 220 200 347 200 200 347 214 200 343
     250 200 342 274 200 342 274 200 346 250 200 346 204 200 342 270
     200 347 234 200 346 244 200 346 254 200 346 244 200 347 200 200
     346 224 200 346 220 200 346 244 200 346 204 200 342 270 200 346
     274 200 347 210 200 346 234 200
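The iconv pipeline above can also be sketched in a few lines of C++, as an illustration only. appendUtf8 and misreadLeAsBe are hypothetical helper names, the input is assumed to be ASCII, and no surrogate handling is attempted. Each input byte c becomes the UTF-16LE byte pair {c, 0x00}; reading that pair as big-endian gives the code point c << 8, which is then encoded as UTF-8.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: append the UTF-8 encoding of a BMP code point.
// Surrogates are not handled; the ASCII-derived inputs below never produce them.
inline void appendUtf8(std::string& out, std::uint16_t cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: emulate "encode as UTF-16LE, decode as UTF-16BE,
// re-encode as UTF-8" for ASCII-only input.
inline std::string misreadLeAsBe(const std::string& ascii)
{
    std::string out;
    for (unsigned char c : ascii) {
        // The UTF-16LE bytes for c are {c, 0x00}; a big-endian reader sees (c << 8).
        appendUtf8(out, static_cast<std::uint16_t>(c << 8));
    }
    return out;
}
```

For example, misreadLeAsBe("h") returns the three bytes 346 240 200 (octal), the UTF-8 encoding of U+6800, which matches the start of the value reported in the failed assertion.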
Alberto Garcia
Comment 5 2026-04-23 03:34:29 PDT
EWS
Comment 6 2026-04-23 03:49:30 PDT
Committed 305877.448@webkitglib/2.52 (585ad22df82d): <https://commits.webkit.org/305877.448@webkitglib/2.52> Reviewed commits have been landed. Closing PR #63416 and removing active labels.
Radar WebKit Bug Importer
Comment 7 2026-04-23 03:50:12 PDT