Bug 196094 - [GTK][WPE] Implement a basic DNS cache
Summary: [GTK][WPE] Implement a basic DNS cache
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-03-21 11:06 PDT by Patrick Griffis
Modified: 2019-07-08 05:03 PDT (History)
7 users (show)

See Also:


Attachments
Patch (26.77 KB, patch)
2019-03-21 11:07 PDT, Patrick Griffis
no flags Details | Formatted Diff | Diff
Patch (25.20 KB, patch)
2019-03-21 11:09 PDT, Patrick Griffis
no flags Details | Formatted Diff | Diff
Patch (25.14 KB, patch)
2019-03-21 11:11 PDT, Patrick Griffis
cgarcia: review-
ews: commit-queue-
Details | Formatted Diff | Diff
Archive of layout-test-results from ews105 for mac-highsierra-wk2 (2.69 MB, application/zip)
2019-03-21 14:03 PDT, Build Bot
no flags Details
Patch (27.52 KB, patch)
2019-06-19 05:32 PDT, Carlos Garcia Campos
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Patrick Griffis 2019-03-21 11:06:22 PDT
[GTK][WPE] Implement a basic DNS cache
Comment 1 Patrick Griffis 2019-03-21 11:07:28 PDT
Created attachment 365568 [details]
Patch
Comment 2 Patrick Griffis 2019-03-21 11:09:18 PDT
Created attachment 365570 [details]
Patch
Comment 3 Patrick Griffis 2019-03-21 11:11:08 PDT
Created attachment 365571 [details]
Patch
Comment 4 Build Bot 2019-03-21 14:03:34 PDT
Comment on attachment 365571 [details]
Patch

Attachment 365571 [details] did not pass mac-wk2-ews (mac-wk2):
Output: https://webkit-queues.webkit.org/results/11603776

New failing tests:
http/wpt/mediarecorder/MediaRecorder-AV-audio-video-dataavailable.html
Comment 5 Build Bot 2019-03-21 14:03:36 PDT
Created attachment 365614 [details]
Archive of layout-test-results from ews105 for mac-highsierra-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews105  Port: mac-highsierra-wk2  Platform: Mac OS X 10.13.6
Comment 6 Carlos Garcia Campos 2019-03-22 03:23:42 PDT
Comment on attachment 365571 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=365571&action=review

Great! This patch is perfect for libsoup, but not for WebKit. We should follow the WebKit coding style, use smart pointers and our internal data structures from WTF whenever possible. I'll add more specific comments in the patch.

> Source/WebKit/NetworkProcess/soup/NetworkProcessSoup.cpp:115
> +    webkit_soup_cached_resolver_ensure_default();

We know this is called only once so we could simply:

GRefPtr<GResolver> resolver = webkitCachedResolverNew(adoptGRef(g_resolver_get_default()));
g_resolver_set_default(resolver.get());

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:27
> +#include "WebKitSoupCachedResolver.h"

This is not soup specific art all, so I would avoid the Soup in the name, I would just call this WebKitCachedResolver.

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:45
> + */

This is private object in WebKit, so we don't need documentation. If you want to keep a comment explaining what this object is for, remove the gtk-doc tags and use C++ style comment.

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:48
> +/* This version introduces querying names for ipv4/ipv6
> + * separately which we cache separately */

Use C++ style comment. It could be one line and finish with period.

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:55
> +struct _WebKitSoupCachedResolver {

It's fine to keep the instance and class structs private here but we also need to use a private instance to be able to use smart pointers and C++ objects.

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:56
> +    GResolver parent_instance;

parentInstance

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:57
> +    GResolver* wrapped_resolver;

GRefPtr<GResolver> wrappedResolver;

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:58
> +    GHashTable* dns_caches[N_CACHES];

Use HashMap from WTF instead of GHashTable. It could be

HashMap<CString, CachedResponse>

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:59
> +    GMutex dns_cache_lock;

Use WTF Lock

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:63
> +    GResolverClass parent_class;

parentClass

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:66
> +G_DEFINE_TYPE(WebKitSoupCachedResolver, webkit_soup_cached_resolver, G_TYPE_RESOLVER)

Use WebKitDefineType

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:68
> +/* These match Firefox */

C++ style comments

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:70
> +#define DNS_CACHE_EXPIRE_SECONDS 60
> +#define DNS_CACHE_MAX_SIZE 400

static const Seconds dnsCachedExpireSeconds = 60_s;
static const unsigned dnsCacheMaxSize = 400;

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:88
> +/**
> + * webkit_soup_cached_resolver_new:
> + * @wrapped_resolver: Underlying #GResolver to be cached
> + * 
> + * Note that @wrapped_resolver must not be a #WebKitSoupCachedResolver.
> + *
> + * You should generally use webkit_soup_cached_resolver_ensure_default()
> + * rather than this API directly.
> + *
> + * Returns:(transfer full): A new #WebKitSoupCachedResolver
> + */

Same comment here about the api docs.

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:90
> +static WebKitSoupCachedResolver* 
> +webkit_soup_cached_resolver_new(GResolver* wrapped_resolver)

This should be one line. I would make this public instead of webkit_soup_cached_resolver_ensure_default().

GResolver* webkitCachedResolverNew(GRefPtr<GResolver>&& wrappedResolver)

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:92
> +    g_return_val_if_fail(wrapped_resolver, nullptr);

G_IS_RESOLVER(wrapperResolver.get(), nullptr);

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:93
> +    g_return_val_if_fail(!WEBKIT_IS_SOUP_CACHED_RESOLVER(wrapped_resolver), nullptr);

We wouldn't need this check, replace it with ASSERT if you really want to check this.

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:97
> +    return static_cast<WebKitSoupCachedResolver*>(g_object_new(WEBKIT_TYPE_SOUP_CACHED_RESOLVER,
> +        "wrapped-resolver", wrapped_resolver,
> +        nullptr));

Since this is private object and won't be used by any binding, we can get rid of the construct property and assign the wrapped object here after g_object_new call.

auto* resolver = static_cast<WebKitCachedResolver*>(g_object_new(WEBKIT_TYPE_CACHED_RESOLVER, nullptr));
resolver->priv->wrappedResolver = WTFMove(wrappedResolver);

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:123
> +typedef struct {
> +    GList* addresses; /* owned */
> +    gint64 expiration;
> +} CachedResponse;

struct CachedResponse {
    Vector<GRefPtr<GInetAddress>> addresses;
    WallTime expiration;
};

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:130
> +static void
> +cached_response_free(CachedResponse* cache)
> +{
> +    g_resolver_free_addresses(cache->addresses);
> +    g_free(cache);
> +}

Then you don't need this.

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:170
> +static gpointer
> +copy_object(gconstpointer obj, gpointer user_data)
> +{
> +    return g_object_ref(G_OBJECT(obj));
> +}
> +
> +static GList* 
> +copy_addresses(GList* addresses)
> +{
> +    return g_list_copy_deep(addresses, copy_object, nullptr);
> +}

You shouldn't need this either.

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:177
> +    gint64 now = g_get_monotonic_time();

Seconds now = WallTime::now();

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:184
> +        if (cached->expiration <= now || size > DNS_CACHE_MAX_SIZE)

How can be the current cache bigger than the maximum size? Shouldn't we check the expiration time to decide what response to remove when the cache is full?

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:208
> +    /* Cleanup while we are at it. */
> +    cleanup_dns_cache(self, cache);

I think we could split the case of cache being full from the case of expired responses. Should we use a timer to remove expired resources instead of doing it only when adding new entries?

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:234
> +    if (cached && cached->expiration > now)
> +        addresses = copy_addresses(cached->addresses);

Shouldn't we remove the entry from the cache if it has expired?

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:240
> +

I think we can add a private C++ class here to handle the cache and make the GObject use it.

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:257
> +typedef struct {
> +    char* hostname;
> +    GResolverNameLookupFlags flags;
> +} LookupData;

struct LookupAsyncData {
    GUniquePtr<char> hostname;
    GResolverNameLookupFlags flags;
};
WEBKIT_DEFINE_ASYNC_DATA_STRUCT(LookupAsyncData);

That will define createLookupAsyncData() and destroyLookupAsyncData() function using fast malloc and allowing to use smart pointers.

> Source/WebKit/Shared/glib/WebKitSoupCachedResolver.cpp:325
> +    GTask* task = g_task_new(self, cancellable, callback, user_data);

GRefPtr<GTask> task = adoptGRef()

> Source/WebKit/SourcesGTK.txt:92
> +Shared/glib/WebKitSoupCachedResolver.cpp

This doesn't belong to Shared. I Shared we have things shared by multiple processes, but the dns cache is only used in the network process, so this should be in NetworkProcess/glib or NetworkProcess/soup
Comment 7 Carlos Garcia Campos 2019-06-19 05:32:26 PDT
Created attachment 372453 [details]
Patch

Just addressed my own review comments
Comment 8 Michael Catanzaro 2019-06-19 10:20:45 PDT
Comment on attachment 372453 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=372453&action=review

I have a question about the threadsafety semantics. I was originally going to ask why you are using a lock in the DNSCache implementation, since I would have expected that would only ever be called from the main thread. But I see that is wrong, because GResolver is a global singleton, and global singletons in GLib are (probably) expected to be threadsafe (even though I've never seen any documentation indicate this clearly). OK then, the DNSCache is used by your WebKitCachedResolver, so makes sense for DNSCache to be threadsafe.

But WebKitCachedResolver is not itself threadsafe. So that doesn't make sense to me.

I have a suspicion that these will both only ever be used by WebKit on its main thread, so the DNSCache lock could be removed...? Is that not correct? If it's not correct, then I think WebKitCachedResolver needs its own lock. It just seems really suspicious that one has a lock but not the other.

> Source/WebKit/NetworkProcess/glib/WebKitCachedResolver.cpp:74
> +WEBKIT_DEFINE_ASYNC_DATA_STRUCT(LookupAsyncData)

Huh, I didn't know we had this.
Comment 9 Carlos Garcia Campos 2019-06-20 00:49:04 PDT
(In reply to Michael Catanzaro from comment #8)
> Comment on attachment 372453 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=372453&action=review
> 
> I have a question about the threadsafety semantics. I was originally going
> to ask why you are using a lock in the DNSCache implementation, since I
> would have expected that would only ever be called from the main thread. But
> I see that is wrong, because GResolver is a global singleton, and global
> singletons in GLib are (probably) expected to be threadsafe (even though
> I've never seen any documentation indicate this clearly). OK then, the
> DNSCache is used by your WebKitCachedResolver, so makes sense for DNSCache
> to be threadsafe.

I have no idea. Patrick's patch was using a lock only to protect the cache itself, and that's what I've done here. 

> But WebKitCachedResolver is not itself threadsafe. So that doesn't make
> sense to me.

Why it's not? I think it is as long as the wrapped resolver is, and the default resolver is AFAIK.

> I have a suspicion that these will both only ever be used by WebKit on its
> main thread, so the DNSCache lock could be removed...? Is that not correct?
> If it's not correct, then I think WebKitCachedResolver needs its own lock.
> It just seems really suspicious that one has a lock but not the other.

I thought the same, but I'm just addressing my review comments here.

> > Source/WebKit/NetworkProcess/glib/WebKitCachedResolver.cpp:74
> > +WEBKIT_DEFINE_ASYNC_DATA_STRUCT(LookupAsyncData)
> 
> Huh, I didn't know we had this.

It uses fast malloc and allows to use C++ objects as members.
Comment 10 Patrick Griffis 2019-06-20 07:00:51 PDT
(In reply to Carlos Garcia Campos from comment #9)
> (In reply to Michael Catanzaro from comment #8)
> > Comment on attachment 372453 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=372453&action=review
> > 
> > I have a question about the threadsafety semantics. I was originally going
> > to ask why you are using a lock in the DNSCache implementation, since I
> > would have expected that would only ever be called from the main thread. But
> > I see that is wrong, because GResolver is a global singleton, and global
> > singletons in GLib are (probably) expected to be threadsafe (even though
> > I've never seen any documentation indicate this clearly). OK then, the
> > DNSCache is used by your WebKitCachedResolver, so makes sense for DNSCache
> > to be threadsafe.
> 
> I have no idea. Patrick's patch was using a lock only to protect the cache
> itself, and that's what I've done here. 
> 

All calls in my impl were directly redirected to the the existing Resolver which was threadsafe. The only thing the cached resolver did was put results in a cache so that had to be threadsafe.
Comment 11 Michael Catanzaro 2019-06-20 07:39:42 PDT
Comment on attachment 372453 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=372453&action=review

> Source/WebKit/NetworkProcess/glib/WebKitCachedResolver.cpp:38
> +typedef struct {
> +    GRefPtr<GResolver> wrappedResolver;
> +    DNSCache cache;
> +} WebKitCachedResolverPrivate;

OK you're right, I suppose WebKitCachedResolver is threadsafe because both its data members are.

I recommend adding a comment here to warn about this, otherwise it would be really really easy to mess this up in the future by adding a non-threadsafe member here without a lock.
Comment 12 WebKit Commit Bot 2019-06-20 22:37:10 PDT
Comment on attachment 372453 [details]
Patch

Clearing flags on attachment: 372453

Committed r246671: <https://trac.webkit.org/changeset/246671>
Comment 13 WebKit Commit Bot 2019-06-20 22:37:12 PDT
All reviewed patches have been landed.  Closing bug.
Comment 14 Michael Catanzaro 2019-06-21 06:18:18 PDT
(In reply to Michael Catanzaro from comment #11)
> I recommend adding a comment here to warn about this, otherwise it would be
> really really easy to mess this up in the future by adding a non-threadsafe
> member here without a lock.

I can all but promise this will break in the future, since you didn't add a comment and there's no indication whatsoever in the WebKitCachedResolver file that it is supposed to be threadsafe.
Comment 15 Claudio Saavedra 2019-07-08 02:27:31 PDT
About the time this landed, several tests started having crashes in the network process, not very consistently but consistently enough to think that the underlying cause is probably this. Here a sample:

https://build.webkit.org/results/GTK%20Linux%2064-bit%20Release%20(Tests)/r247200%20(10899)/imported/w3c/web-platform-tests/cors/redirect-preflight-2-crash-log.txt

With crash history:

https://webkit-test-results.webkit.org/dashboards/flakiness_dashboard.html#showAllRuns=true&tests=imported%2Fw3c%2Fweb-platform-tests%2Fcors%2Fredirect-preflight-2.htm
Comment 16 Claudio Saavedra 2019-07-08 05:03:38 PDT
Crashes reported here https://bugs.webkit.org/show_bug.cgi?id=199572