Bug 195908 - [SOUP] webkit_web_context_allow_tls_certificate_for_host() fails for IPv6 URIs produced by SoupURI
Summary: [SOUP] webkit_web_context_allow_tls_certificate_for_host() fails for IPv6 URI...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK (show other bugs)
Version: WebKit Nightly Build
Hardware: PC Linux
: P2 Normal
Assignee: Michael Catanzaro
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-03-18 13:53 PDT by Michael Catanzaro
Modified: 2023-11-21 06:25 PST (History)
5 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Catanzaro 2019-03-18 13:53:33 PDT
webkit_web_context_allow_tls_certificate_for_host() fails for IPv6 URIs produced by SoupURI because SoupURI handles normalization differently (less-conformantly) than WTF::URI.

The problem is the [] brackets are being normalized away when stored in the misnamed clientCertificates() map (they are server certificates, not client certificates). But the brackets are still there when being checked against the map. Then [2a01:4f8:172:122a::2] is considered nonequal to 2a01:4f8:172:122a::2.

I'd argue that it's a SoupURI bug for not storing the [] brackets in the host portion of the URI, but sadly it's documented behavior:

"""
If host is an IPv6 IP address, it should not include the brackets required by the URI syntax; they will be added automatically when converting uri to a string.
"""

(The brackets are required to be a normalized URI.)

This unacceptable hack "fixes" the problem:

void SoupNetworkSession::allowSpecificHTTPSCertificateForHost(const CertificateInfo& certificateInfo, const String& host)
{
    URL urlForNormalizingHost(URL(), makeString("http://[", host, "]"));
    String normalizedHost = urlForNormalizingHost.host().toString();
WTFLogAlways("%s: host=%s normalizedHost=%s", __FUNCTION__, host.utf8().data(), normalizedHost.utf8().data());
    allowedCertificates().add(normalizedHost, HostTLSCertificateSet()).iterator->value.add(certificateInfo.certificate());
}

More discussion here:

https://gitlab.gnome.org/GNOME/epiphany/issues/451
Comment 1 Alfred Morgan 2023-11-08 20:11:06 PST
I believe you can't call it a bug if it's documented behavior. Either stop using SoupURI or use it as documented. The bug's in your court. Any chance for a fix "soon" (been ~5 years)?
Comment 2 Michael Catanzaro 2023-11-10 09:58:04 PST
SoupURI no longer exists (replaced by GUri). We'll have to check what the current status is with libsoup 3.
Comment 3 Patrick Griffis 2023-11-10 11:18:10 PST
Input: 'http://[2a01:4f8:172:122a::2]'
get_host() -> '2a01:4f8:172:122a::2'
to_string() -> 'http://[2a01:4f8:172:122a::2]'
Comment 4 Michael Catanzaro 2023-11-20 15:17:32 PST
(In reply to Alfred Morgan from comment #1)
> I believe you can't call it a bug if it's documented behavior. Either stop
> using SoupURI or use it as documented. The bug's in your court. Any chance
> for a fix "soon" (been ~5 years)?

It's your lucky day. Complaining does *sometimes* work. But only because this was easy....

(Note that different string representations of the same IPv6 address will still fail to match, but that's probably OK.)
Comment 5 Michael Catanzaro 2023-11-20 15:25:42 PST
Pull request: https://github.com/WebKit/WebKit/pull/20763
Comment 6 Alfred Morgan 2023-11-20 15:31:44 PST
Bravo sir. 👏
Comment 7 Adrian Perez 2023-11-21 00:09:28 PST
Reading RFC 3986 (section 3.2.2, “Host”) it looks like the brackets
are *not optional* in order to identify IPv6 addresses in the host
part of an URI, if I am reading the RFC correctly. Therefore, it looks
to me like SoupURI/GUri are handling that case incorrectly, and the
proper fix belongs there.

In practice, even if the bug would be fixed in GUri in a future version
of GLib (I don't expect libsoup2 would fix it, as it is legacy by now),
I agree that we ought to fix this inside WebKit now in order to have
things working.

Shoudn't we file a bug against GLib?
Comment 8 Adrian Perez 2023-11-21 00:12:42 PST
(In reply to Adrian Perez from comment #7)
> Reading RFC 3986 (section 3.2.2, “Host”) it looks like the brackets
> are *not optional* in order to identify IPv6 addresses in the host
> part of an URI, if I am reading the RFC correctly. Therefore, it looks
> to me like SoupURI/GUri are handling that case incorrectly, and the
> proper fix belongs there.

Link to the relevant section of the RFC:

  https://datatracker.ietf.org/doc/html/rfc3986#section-3.2.2

Quote:

  “A host identified by an Internet Protocol literal address, version 6
   [RFC3513] or later, is distinguished by enclosing the IP literal
   within square brackets ("[" and "]").  This is the only place where
   square bracket characters are allowed in the URI syntax.  In
   anticipation of future, as-yet-undefined IP literal address formats,
   an implementation may use an optional version flag to indicate such a
   format explicitly rather than rely on heuristic determination.

      IP-literal = "[" ( IPv6address / IPvFuture  ) "]"

      IPvFuture  = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )

   [...]

   A host identified by an IPv6 literal address is represented inside
   the square brackets without a preceding version flag.”

Then the RFC elaborates further on the syntax for versioned addresses,
IPv4 addresses, and registered names -- but there is no further mention
of brackets, and as one can see they are not deemed optional in the quoted
portion.
Comment 9 Michael Catanzaro 2023-11-21 05:28:23 PST
(In reply to Adrian Perez from comment #7)
> Reading RFC 3986 (section 3.2.2, “Host”) it looks like the brackets
> are *not optional* in order to identify IPv6 addresses in the host
> part of an URI, if I am reading the RFC correctly. Therefore, it looks
> to me like SoupURI/GUri are handling that case incorrectly, and the
> proper fix belongs there.

They *do* add the brackets to the string representation of the URI when you do g_uri_to_string(). They're just not included when you do g_uri_get_host().

(In reply to Adrian Perez from comment #7)
> Shoudn't we file a bug against GLib?

No, because the documentation of g_uri_get_host() says:

"""
If uri contained an IPv6 address literal, this value will be just that address, without the brackets around it that are necessary in the string form of the URI. Note that in this case there may also be a scope ID attached to the address. Eg, `fe80::1234%`em1 (or `fe80::1234%`25em1 if the string is still encoded).
"""
Comment 10 Adrian Perez 2023-11-21 05:59:17 PST
(In reply to Michael Catanzaro from comment #9)
> (In reply to Adrian Perez from comment #7)
> > Reading RFC 3986 (section 3.2.2, “Host”) it looks like the brackets
> > are *not optional* in order to identify IPv6 addresses in the host
> > part of an URI, if I am reading the RFC correctly. Therefore, it looks
> > to me like SoupURI/GUri are handling that case incorrectly, and the
> > proper fix belongs there.
> 
> They *do* add the brackets to the string representation of the URI when you
> do g_uri_to_string(). They're just not included when you do g_uri_get_host().
> 
> (In reply to Adrian Perez from comment #7)
> > Shoudn't we file a bug against GLib?
> 
> No, because the documentation of g_uri_get_host() says:
> 
> """
> If uri contained an IPv6 address literal, this value will be just that
> address, without the brackets around it that are necessary in the string
> form of the URI. Note that in this case there may also be a scope ID
> attached to the address. Eg, `fe80::1234%`em1 (or `fe80::1234%`25em1 if the
> string is still encoded).
> """

Oh, I now get the complete picture, thanks. A bit of a pity that GUri
went that way, but it is what it is. At any rate, thanks for the patch
to fix this issue in WebKit 🤗️
Comment 11 EWS 2023-11-21 06:25:25 PST
Committed 271013@main (bcd2223c9362): <https://commits.webkit.org/271013@main>

Reviewed commits have been landed. Closing PR #20763 and removing active labels.