Bug 167941

Summary: [WPE][GTK] Enable support for CONTENT_EXTENSIONS
Product: WebKit Reporter: Adrian Perez <aperez>
Component: WebKitGTKAssignee: Adrian Perez <aperez>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, baebytasyatechiq, bob, bugs-noreply, calvaris, cgarcia, commit-queue, darin, don.olmstead, enrica, ews-watchlist, mcatanzaro, rego, thorton, zan
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=194922
Bug Depends on: 192714, 192855, 193622    
Bug Blocks: 194818, 154553, 200479    
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
WIP Patch
none
WIP Patch v2
none
WIP Patch v3
none
WIP Patch v4
none
WIP Patch v5
none
Patch v6
none
Patch v7
none
Archive of layout-test-results from ews114 for mac-highsierra
none
Patch v8
none
Patch v9
none
Patch v10
none
Patch v11
none
Patch v12
none
Patch v13 none

Description Adrian Perez 2017-02-07 09:05:20 PST
This is needed as a prerequisite before adding new WebKitGTK+ API for
filtering using content extensions, and is needed for bug #154553
Comment 1 Adrian Perez 2017-02-07 09:25:07 PST
Created attachment 300813 [details]
Patch
Comment 2 Michael Catanzaro 2017-02-07 09:32:29 PST
Comment on attachment 300813 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=300813&action=review

> Source/WTF/wtf/FeatureDefines.h:333
> +#if !defined(ENABLE_CONTENT_EXTENSIONS)
> +#define ENABLE_CONTENT_EXTENSIONS 1
> +#endif

I think it should be exposed in WebKitFeatures.cmake instead
Comment 3 Adrian Perez 2017-02-07 10:46:14 PST
Created attachment 300820 [details]
Patch
Comment 4 Adrian Perez 2017-02-07 10:51:01 PST
(In reply to comment #2)
> Comment on attachment 300813 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=300813&action=review
> 
> > Source/WTF/wtf/FeatureDefines.h:333
> > +#if !defined(ENABLE_CONTENT_EXTENSIONS)
> > +#define ENABLE_CONTENT_EXTENSIONS 1
> > +#endif
> 
> I think it should be exposed in WebKitFeatures.cmake instead

The new version of the patch uses WebKitFeatures.cmake, and enables the
feature by default in OptionsGTK.cmake.

Also, I have changed a return statement which failed to compile in the GTK+
EWS, in “Source/WebCore/contentextensions/ContentExtensionsBackend.cpp” from:

    return { willBlockLoad, willBlockCookies, willMakeHTTPS };

to:

    return BlockedStatus { willBlockLoad, willBlockCookies, willMakeHTTPS };

The version of GCC we are targeting in the GTK+ port does not seem to
support this C++11 construct yet. I hope this minor change won't be a
problem to get the patch landed :-)
Comment 5 Adrian Perez 2017-02-07 11:08:50 PST
Created attachment 300823 [details]
Patch
Comment 6 Adrian Perez 2017-02-07 11:19:37 PST
Created attachment 300824 [details]
Patch
Comment 7 Adrian Perez 2017-02-07 11:33:08 PST
Created attachment 300826 [details]
Patch
Comment 8 Michael Catanzaro 2017-02-07 15:16:56 PST
Comment on attachment 300826 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=300826&action=review

I don't want to commit this until after branching 2.16, but that should happen next week so it's not exactly far off.

> Source/WebCore/contentextensions/ContentExtensionsBackend.cpp:224
> -    return { willBlockLoad, willBlockCookies, willMakeHTTPS };
> +
> +    BlockedStatus status;
> +    status.blockedLoad = willBlockLoad;
> +    status.blockedCookies = willBlockCookies;
> +    status.madeHTTPS = willMakeHTTPS;
> +    return status;

I think you should add a constructor if there's no better way to get this to work with GCC 4.9.

> Source/WebKit2/NetworkProcess/cache/NetworkCacheDataSoup.cpp:137
> +Data adoptAndMapFile(GRefPtr<GFileIOStream> stream, size_t offset, size_t size)
> +{
> +    GInputStream* inputStream = g_io_stream_get_input_stream(G_IO_STREAM(stream.get()));
> +    int fd = g_file_descriptor_based_get_fd(G_FILE_DESCRIPTOR_BASED(inputStream));
> +    return adoptAndMapFile(fd, offset, size);
> +}

It looks like this function does not need to take ownership of stream, so it should take a GFileIOStream* instead of a GRefPtr<GFileIOStream>. That way it's self-documenting that there is no ownership transfer, and it avoids unnecessary refcount churn.

> Source/WebKit2/UIProcess/API/gtk/APIUserContentExtensionStoreGtk.cpp:36
> +    return WebCore::pathByAppendingComponent(WebCore::pathByAppendingComponent(WebCore::stringFromFileSystemRepresentation(g_get_user_cache_dir()), WebCore::stringFromFileSystemRepresentation(g_get_prgname())), "WebKitUserContentExtensionCache");

Hm, do you really have to chain together multiple calls to WebCore::pathByAppendingComponent in order to use it? There's no variadic template version? It's not good; use g_build_filename() instead and then convert it to a String. Try this:

GUniquePtr<char> path(g_build_filename(g_get_prgname(), "WebKitUserContentExtensionCache", nullptr));
return String::fromUTF8(path.data());

But also, we probably need to respect the cache directory set by the WebKitWebsiteDataManager here. At least it's important that we never save stuff outside the correct cache dir. But it's not clear to me how we should do that here, because each WebKitWebView has its own WebKitWebsiteDataManager, and clearly you don't have one to work with inside this function. So some investigation required here.
Comment 9 Adrian Perez 2017-02-08 09:29:03 PST
Created attachment 300913 [details]
Patch
Comment 10 Adrian Perez 2017-02-08 09:37:46 PST
(In reply to comment #8)
> Comment on attachment 300826 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=300826&action=review
> 
> I don't want to commit this until after branching 2.16, but that should
> happen next week so it's not exactly far off.

Sure, let's merge it after the branching.

> > Source/WebCore/contentextensions/ContentExtensionsBackend.cpp:224
> > -    return { willBlockLoad, willBlockCookies, willMakeHTTPS };
> > +
> > +    BlockedStatus status;
> > +    status.blockedLoad = willBlockLoad;
> > +    status.blockedCookies = willBlockCookies;
> > +    status.madeHTTPS = willMakeHTTPS;
> > +    return status;
> 
> I think you should add a constructor if there's no better way to get this to
> work with GCC 4.9.

Yep, I got me a Debian Jessie chroot here (debootstrap == so handy) to be
able to test. The current version of the patch has a couple of constructors
added, and I tried to make the minimum amount fo changes possible. 

> > Source/WebKit2/NetworkProcess/cache/NetworkCacheDataSoup.cpp:137
> > +Data adoptAndMapFile(GRefPtr<GFileIOStream> stream, size_t offset, size_t size)
> > +{
> > +    GInputStream* inputStream = g_io_stream_get_input_stream(G_IO_STREAM(stream.get()));
> > +    int fd = g_file_descriptor_based_get_fd(G_FILE_DESCRIPTOR_BASED(inputStream));
> > +    return adoptAndMapFile(fd, offset, size);
> > +}
> 
> It looks like this function does not need to take ownership of stream, so it
> should take a GFileIOStream* instead of a GRefPtr<GFileIOStream>. That way
> it's self-documenting that there is no ownership transfer, and it avoids
> unnecessary refcount churn.

ACK.

> > Source/WebKit2/UIProcess/API/gtk/APIUserContentExtensionStoreGtk.cpp:36
> > +    return WebCore::pathByAppendingComponent(WebCore::pathByAppendingComponent(WebCore::stringFromFileSystemRepresentation(g_get_user_cache_dir()), WebCore::stringFromFileSystemRepresentation(g_get_prgname())), "WebKitUserContentExtensionCache");
> 
> Hm, do you really have to chain together multiple calls to
> WebCore::pathByAppendingComponent in order to use it? There's no variadic
> template version? It's not good; use g_build_filename() instead and then
> convert it to a String. Try this:
> 
> GUniquePtr<char> path(g_build_filename(g_get_prgname(),
> "WebKitUserContentExtensionCache", nullptr));
> return String::fromUTF8(path.data());

Right now I am leaning more towards providing a dummy implementation first,
which only has an ASSERT_NOT_REACHED(), and add the code to build the path
later -- or not, if we decide that the GTK+ port will require setting the
path explicitly; read on below...

> But also, we probably need to respect the cache directory set by the
> WebKitWebsiteDataManager here. At least it's important that we never save
> stuff outside the correct cache dir. But it's not clear to me how we should
> do that here, because each WebKitWebView has its own
> WebKitWebsiteDataManager, and clearly you don't have one to work with inside
> this function. So some investigation required here.

We have a discussion going on in bug #154553 about how the API should look
like, which is touching the topic of whether the path should be controlled
by WebKitWebsiteDataManager or not, and one option is to not have an default
implicit location by default. Therefore I think it's better that I post a
new version of the patch with ASSERT_NOT_REACHED(), and we land this patch
with that. We would decide later on whether this would be implemented or
remain with the assertion depending on how the discussion about the API goes.
Comment 11 Adrian Perez 2017-02-08 10:04:40 PST
Created attachment 300918 [details]
Patch
Comment 12 Michael Catanzaro 2017-02-08 10:56:46 PST
Comment on attachment 300918 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=300918&action=review

We probably technically need an owner for the changes in APIUserContentExtensionStore.cpp, but since the changes there are only to work around GCC 4.9, I think it's fine.

Note: for committing *after* the 2.16 branch.

> Source/WebKit2/NetworkProcess/cache/NetworkCacheDataSoup.cpp:40
> +#if ENABLE(CONTENT_EXTENSIONS)
> +#include <gio/gfiledescriptorbased.h>
> +#endif

Don't use the CONTENT_EXTENSIONS guard here, since this code is generic and not related to CONTENT_EXTENSIONS. It's harmless to compile it in regardless.

> Source/WebKit2/NetworkProcess/cache/NetworkCacheDataSoup.cpp:137
> +#if ENABLE(CONTENT_EXTENSIONS)
> +Data adoptAndMapFile(GFileIOStream* stream, size_t offset, size_t size)

Ditto.

> Source/WebKit2/UIProcess/API/gtk/APIUserContentExtensionStoreGtk.cpp:37
> +    ASSERT_NOT_REACHED();
> +    return { };

I guess, based on our discussion earlier today, you maybe really do want to list some default path here?
Comment 13 Adrian Perez 2017-02-08 14:12:33 PST
(In reply to comment #12)
> Comment on attachment 300918 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=300918&action=review
> 
> We probably technically need an owner for the changes in
> APIUserContentExtensionStore.cpp, but since the changes there are only to
> work around GCC 4.9, I think it's fine.

Using “webkit-path upload --suggest-reviewers” already included Enrica Casucci,
though looking at the ChangeLogs it seems Alex Christensen has been poking
around “WebCore/contentextensions/” now and then, so I am CC'ing him as well.

> Note: for committing *after* the 2.16 branch.
> 
> > Source/WebKit2/NetworkProcess/cache/NetworkCacheDataSoup.cpp:40
> > +#if ENABLE(CONTENT_EXTENSIONS)
> > +#include <gio/gfiledescriptorbased.h>
> > +#endif
> 
> Don't use the CONTENT_EXTENSIONS guard here, since this code is generic and
> not related to CONTENT_EXTENSIONS. It's harmless to compile it in regardless.

Not having the guards breaks the EFL build. It seems that EFL is not using
GIO, or at least not adding the needed flags to build WebCore. For example
check this EWS failure:

  https://webkit-queues.webkit.org/patch/300826
  https://webkit-queues.webkit.org/results/3020955

> > Source/WebKit2/NetworkProcess/cache/NetworkCacheDataSoup.cpp:137
> > +#if ENABLE(CONTENT_EXTENSIONS)
> > +Data adoptAndMapFile(GFileIOStream* stream, size_t offset, size_t size)
> 
> Ditto.

They are also needed here to avoid breaking EFL.
 
> > Source/WebKit2/UIProcess/API/gtk/APIUserContentExtensionStoreGtk.cpp:37
> > +    ASSERT_NOT_REACHED();
> > +    return { };
> 
> I guess, based on our discussion earlier today, you maybe really do want to
> list some default path here?

I am leaning towards not providing a default path, and forcing developers
to provide the path: “Explicit is better than implicit.” [1]. I would want
developers using our API to be 100% aware that using content extensions will
store non-transient data on disk, and to that effect they have to decide
where their application is going to place this data.

Depending on which API we settle, this may or may not get an implementation.
If your mind is more at ease with that, I can change the ASSERT_NOT_REACHED()
into UNIMPLEMENTED().

---
[1] % python -c 'import this'
Comment 14 Alex Christensen 2017-02-08 14:15:17 PST
Comment on attachment 300918 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=300918&action=review

> Source/WebCore/contentextensions/ContentExtensionActions.h:55
> +    BlockedStatus() { }
> +    BlockedStatus(bool blockedLoad, bool blockedCookies, bool madeHTTPS)
> +        : blockedLoad(blockedLoad)
> +        , blockedCookies(blockedCookies)
> +        , madeHTTPS(madeHTTPS)
> +    {
> +    }

Is this only needed for old compiler versions? Can we not use an = default somewhere instead of verbosely listing out the members again?

> Source/WebKit2/UIProcess/API/APIUserContentExtensionStore.cpp:85
> +    ContentExtensionMetaData() { }
> +    ContentExtensionMetaData(uint32_t version, uint64_t actionsSize, uint64_t filtersWithoutDomainsBytecodeSize, uint64_t filtersWithDomainBytecodeSize, uint64_t domainFiltersBytecodeSize)

ditto
Comment 15 Michael Catanzaro 2017-02-08 15:41:47 PST
(In reply to comment #13)
> Not having the guards breaks the EFL build. It seems that EFL is not using
> GIO, or at least not adding the needed flags to build WebCore. For example
> check this EWS failure:
> 
>   https://webkit-queues.webkit.org/patch/300826
>   https://webkit-queues.webkit.org/results/3020955

GIO is not optional; there's no way EFL port could function without it. Check to see if EFL is checking gio-unix-2.0.pc for CFLAGS/LDFLAGS. It's probably not.

> Depending on which API we settle, this may or may not get an implementation.
> If your mind is more at ease with that, I can change the ASSERT_NOT_REACHED()
> into UNIMPLEMENTED().

No no, we don't want to use UNIMPLEMENTED() for this.
Comment 16 Carlos Garcia Campos 2017-02-08 22:46:39 PST
Comment on attachment 300918 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=300918&action=review

> Source/WebKit2/NetworkProcess/cache/NetworkCacheData.h:160
> +#if USE(SOUP) && ENABLE(CONTENT_EXTENSIONS)
> +Data adoptAndMapFile(GFileIOStream*, size_t offset, size_t);
> +#endif

Do not use ENABLE(CONTENT_EXTENSIONS) here, even if this is only used by content extensions code, it's not actually specific to content extensions

>>> Source/WebKit2/NetworkProcess/cache/NetworkCacheDataSoup.cpp:40
>>> +#endif
>> 
>> Don't use the CONTENT_EXTENSIONS guard here, since this code is generic and not related to CONTENT_EXTENSIONS. It's harmless to compile it in regardless.
> 
> Not having the guards breaks the EFL build. It seems that EFL is not using
> GIO, or at least not adding the needed flags to build WebCore. For example
> check this EWS failure:
> 
>   https://webkit-queues.webkit.org/patch/300826
>   https://webkit-queues.webkit.org/results/3020955

They probably need to add gio-unix-2.0.

>>> Source/WebKit2/NetworkProcess/cache/NetworkCacheDataSoup.cpp:137
>>> +Data adoptAndMapFile(GFileIOStream* stream, size_t offset, size_t size)
>> 
>> Ditto.
> 
> They are also needed here to avoid breaking EFL.

EFL doesn't use FileSystemGtk, so they don't need this at all. One possible solution to avoid the ifdefs would be to use PlatformFileHandle. Then we would need a new version for the GTK port receiving an int fd. In any case, all this is specific to the GTK+ port, not to soup, because we use a different FileSystem implementation. So, this should be #if PLATFORM(GTK) for sure.
Comment 17 Adrian Perez 2017-02-09 05:38:32 PST
(In reply to comment #14)
> Comment on attachment 300918 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=300918&action=review
> 
> > Source/WebCore/contentextensions/ContentExtensionActions.h:55
> > +    BlockedStatus() { }
> > +    BlockedStatus(bool blockedLoad, bool blockedCookies, bool madeHTTPS)
> > +        : blockedLoad(blockedLoad)
> > +        , blockedCookies(blockedCookies)
> > +        , madeHTTPS(madeHTTPS)
> > +    {
> > +    }
> 
> Is this only needed for old compiler versions? Can we not use an = default
> somewhere instead of verbosely listing out the members again?

Yes. Unfortunately GCC 4.9 does not support aggregate initialization using
list-initialization, which is one of the C++14 it lacks. We still want to
support this compiler version for the GTK+ port because it's still shipped
still in many stable distributions, like Debian Jessie.

I tried a couple of different approaches, but this is the less intrusive
in the end. It works by changing the empty constructor to:

  BlockedStatus() = default;

Unfortunately, the other constructor needs to be kept around. I'll upload an
updated version of the patch later which uses “= default” for this one, though.

> > Source/WebKit2/UIProcess/API/APIUserContentExtensionStore.cpp:85
> > +    ContentExtensionMetaData() = default;
> > +    ContentExtensionMetaData(uint32_t version, uint64_t actionsSize, uint64_t filtersWithoutDomainsBytecodeSize, uint64_t filtersWithDomainBytecodeSize, uint64_t domainFiltersBytecodeSize)
> 
> ditto

After some fiddling, I came up with the following, which looks a bit better on
my eyes than the long constructor, let me know if it looks better to you as well:

diff --git a/Source/WebKit2/UIProcess/API/APIUserContentExtensionStore.cpp b/Source/WebKit2/UIProcess/API/APIUserContentExtensionStore.cpp
index 3a740fd50a5..b166f7afaf0 100644
--- a/Source/WebKit2/UIProcess/API/APIUserContentExtensionStore.cpp
+++ b/Source/WebKit2/UIProcess/API/APIUserContentExtensionStore.cpp
@@ -81,6 +81,14 @@ static String constructedPath(const String& base, const String& identifier)

 const size_t ContentExtensionFileHeaderSize = sizeof(uint32_t) + 4 * sizeof(uint64_t);
 struct ContentExtensionMetaData {
+private:
+    ContentExtensionMetaData(uint32_t version): version(version) { }
+public:
+    static const ContentExtensionMetaData kInvalidHeader;
+    ContentExtensionMetaData() = default;
+
     uint32_t version { UserContentExtensionStore::CurrentContentExtensionFileVersion };
     uint64_t actionsSize { 0 };
     uint64_t filtersWithoutDomainsBytecodeSize { 0 };
@@ -97,6 +105,8 @@ struct ContentExtensionMetaData {
     }
 };

+const ContentExtensionMetaData ContentExtensionMetaData::kInvalidHeader { 0 };
+
 static Data encodeContentExtensionMetaData(const ContentExtensionMetaData& metaData)
 {
     WTF::Persistence::Encoder encoder;
@@ -383,9 +393,8 @@ void UserContentExtensionStore::invalidateContentExtensionVersion(const WTF::Str
     auto file = WebCore::openFile(constructedPath(m_storePath, identifier), WebCore::OpenForWrite);
     if (file == WebCore::invalidPlatformFileHandle)
         return;
-    ContentExtensionMetaData invalidHeader = {0, 0, 0, 0, 0};
-    auto bytesWritten = WebCore::writeToFile(file, reinterpret_cast<const char*>(&invalidHeader), sizeof(invalidHeader));
-    ASSERT_UNUSED(bytesWritten, bytesWritten == sizeof(invalidHeader));
+    auto bytesWritten = WebCore::writeToFile(file, reinterpret_cast<const char*>(&ContentExtensionMetaData::kInvalidHeader), sizeof(ContentExtensionMetaData::kInvalidHeader));
+    ASSERT_UNUSED(bytesWritten, bytesWritten == sizeof(ContentExtensionMetaData::kInvalidHeader));
     WebCore::closeFile(file);
 }
Comment 18 Adrian Perez 2017-02-09 08:22:44 PST
Created attachment 301041 [details]
Patch
Comment 19 Adrian Perez 2017-02-09 09:26:42 PST
Created attachment 301050 [details]
Patch
Comment 20 Michael Catanzaro 2017-02-10 20:17:09 PST
Comment on attachment 301050 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=301050&action=review

It looks good to me now. I kinda think it doesn't make sense to land this before the API is ready, though.

> Source/WebKit2/UIProcess/API/APIUserContentExtensionStore.cpp:109
> +const ContentExtensionMetaData ContentExtensionMetaData::kInvalidHeader(0);

Surely you should be able to declare it inside the function...?

(Also, prefer ALL_CAPS style constants to kGoogleStyle now that Google is gone. ;)
Comment 21 Adrian Perez 2017-02-11 11:06:09 PST
Created attachment 301263 [details]
Patch
Comment 22 Adrian Perez 2017-02-20 14:43:28 PST
Comment on attachment 301263 [details]
Patch

May we land this now that we have branched for WebKitGTK+ 2.16?
Comment 23 Carlos Garcia Campos 2017-02-21 22:51:55 PST
Comment on attachment 301263 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=301263&action=review

Ok.

> Source/WebKit2/NetworkProcess/cache/NetworkCacheDataSoup.cpp:42
> +

Remove this extra line.

> Source/WebKit2/UIProcess/API/APIUserContentExtensionStore.cpp:113
> +

And this one.

> Source/WebKit2/UIProcess/API/gtk/APIUserContentExtensionStoreGtk.cpp:36
> +    ASSERT_NOT_REACHED();

Add a FIXME if we need to implement this eventually or a comment explaining why we don't use this.
Comment 24 Adrian Perez 2017-03-01 07:15:12 PST
Created attachment 303066 [details]
Patch
Comment 25 Adrian Perez 2017-03-01 07:16:17 PST
Updated patch with nits fixed.

(In reply to comment #23)
> Comment on attachment 301263 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=301263&action=review
> 
> Ok.
> 
> > Source/WebKit2/NetworkProcess/cache/NetworkCacheDataSoup.cpp:42
> > +
> 
> Remove this extra line.

ACK.

> > Source/WebKit2/UIProcess/API/APIUserContentExtensionStore.cpp:113
> > +
> 
> And this one.

ACK.

> > Source/WebKit2/UIProcess/API/gtk/APIUserContentExtensionStoreGtk.cpp:36
> > +    ASSERT_NOT_REACHED();
> 
> Add a FIXME if we need to implement this eventually or a comment explaining
> why we don't use this.

Comment added.
Comment 26 Adrian Perez 2017-03-01 07:33:03 PST
(In reply to comment #20)
> Comment on attachment 301050 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=301050&action=review
> 
> It looks good to me now. I kinda think it doesn't make sense to land this
> before the API is ready, though.

I will have two follow-up patches: one for adding WKUserContentExtensionStore 
bits to the WK2 C API, to use it in WKTR and unskip the content extensions
layout tests; and then a second one for the actual API added to the GTK+ port.
The patch attached to this bug is needed for them both, and I already have WIP
versions for them as well (I'm opening the respective bugs in a moment :P).

All in all, I think it's not of much use to have the patch here waiting to be
landed, but it would be fine by me to wait until the one for the WK2 C API is
ready and land them together.

(Side note... Interestingly enough, the Cocoa port is using their Objective-C
API inside WKTR, which looks a bit wrong given that WKTR is supposed to have
as less port specific code as possible — hopefully by adding the missing
functions in the WK2 C API we can unify this part of WKTR!)

> > Source/WebKit2/UIProcess/API/APIUserContentExtensionStore.cpp:109
> > +const ContentExtensionMetaData ContentExtensionMetaData::kInvalidHeader(0);
> 
> Surely you should be able to declare it inside the function...?

That is how it's done in the last version of the patch :)
Comment 27 Alex Christensen 2017-03-01 10:46:01 PST
Comment on attachment 303066 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=303066&action=review

> Source/WebKit2/UIProcess/API/gtk/APIUserContentExtensionStoreGtk.cpp:37
> +    // compiler user content extensions: the clients of the public API must

compiled

> Source/WebKit2/UIProcess/API/gtk/APIUserContentExtensionStoreGtk.cpp:42
> +    ASSERT_NOT_REACHED();

RELEASE_ASSERT_NOT_REACHED() would cause bigger problems for people misusing the API without building WebKit themselves.
Comment 28 Adrian Perez 2017-03-01 14:45:39 PST
Created attachment 303124 [details]
Patch
Comment 29 Adrian Perez 2017-03-01 14:47:21 PST
Created attachment 303125 [details]
Patch
Comment 30 Adrian Perez 2017-03-01 14:50:34 PST
(In reply to comment #27)
> Comment on attachment 303066 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=303066&action=review
> 
> > Source/WebKit2/UIProcess/API/gtk/APIUserContentExtensionStoreGtk.cpp:37
> > +    // compiler user content extensions: the clients of the public API must
> 
> compiled

Ouch, typo! Fixed now :)

> > Source/WebKit2/UIProcess/API/gtk/APIUserContentExtensionStoreGtk.cpp:42
> > +    ASSERT_NOT_REACHED();
> 
> RELEASE_ASSERT_NOT_REACHED() would cause bigger problems for people misusing
> the API without building WebKit themselves.

This is now being used, thanks for the suggestion!
Comment 31 Michael Catanzaro 2017-03-01 17:01:51 PST
Comment on attachment 303125 [details]
Patch

Alex already gave you r+, so you don't need to request review again; just reupload with his name in the Reviewed by line. 'webkit-patch' would handle this automatically if you used it.

cq- from me since we should not land this until the API patches are ready.
Comment 32 Michael Catanzaro 2018-06-13 09:44:18 PDT
Comment on attachment 303125 [details]
Patch

This is stale and should be submitted in tandem with patches to add API to make it usable.
Comment 33 Adrian Perez 2018-06-19 04:54:20 PDT
(In reply to Michael Catanzaro from comment #32)
> Comment on attachment 303125 [details]
> Patch
> 
> This is stale and should be submitted in tandem with patches to add API to
> make it usable.

I'll try to get back to this in the next days. Having this instead
of a makeshift adblocker in each browser should help *also* reduce
memory usage, see bug #167941 for some discussion about how a good
amount of the memory used by the WebProcess in Epiphany comes from
Epiphany's WebExtension.
Comment 34 Adrian Perez 2018-12-12 15:25:51 PST
Created attachment 357177 [details]
WIP Patch


This is my current work in progress patch, which I think is already
good enough to take a look at the API and the implementation; but there
are a couple of things missing. I have not tested at all the WKTR bits,
and the layout tests need unskipping, too.

I will be updating this in the coming days with the missing bits, but
it seems like a good idea to submit already something to get the ball
rolling :)
Comment 35 EWS Watchlist 2018-12-12 15:28:56 PST
Attachment 357177 [details] did not pass style-queue:


ERROR: Tools/WebKitTestRunner/TestController.cpp:37:  Alphabetical sorting problem.  [build/include_order] [4]
ERROR: Tools/WebKitTestRunner/TestController.cpp:1358:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/gtk/WebKitUserContentManager.h"
WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/gtk/WebKitUserContent.h"
WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/gtk/webkit2.h"
ERROR: Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:29:  Alphabetical sorting problem.  [build/include_order] [4]
ERROR: Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:97:  One line control clauses should not use braces.  [whitespace/braces] [4]
ERROR: Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:99:  One line control clauses should not use braces.  [whitespace/braces] [4]
ERROR: Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:111:  Extra space before ( in function call  [whitespace/parens] [4]
ERROR: Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:140:  One line control clauses should not use braces.  [whitespace/braces] [4]
ERROR: Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:152:  Extra space before ( in function call  [whitespace/parens] [4]
ERROR: Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:177:  Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side.  [whitespace/operators] [4]
ERROR: Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:178:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:182:  One line control clauses should not use braces.  [whitespace/braces] [4]
WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/wpe/WebKitContentFilterManager.h"
WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/wpe/webkit.h"
WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/wpe/WebKitUserContent.h"
ERROR: Tools/MiniBrowser/wpe/main.cpp:114:  Declaration has space between type name and * in GMainLoop *mainLoop  [whitespace/declaration] [3]
ERROR: Tools/MiniBrowser/wpe/main.cpp:115:  Declaration has space between type name and * in WebKitUserContentFilter *filter  [whitespace/declaration] [3]
ERROR: Tools/MiniBrowser/wpe/main.cpp:116:  Declaration has space between type name and * in GError *error  [whitespace/declaration] [3]
ERROR: Tools/MiniBrowser/wpe/main.cpp:195:  Declaration has space between type name and * in WebKitUserContentManager *userContentManager  [whitespace/declaration] [3]
ERROR: Tools/MiniBrowser/wpe/main.cpp:199:  Declaration has space between type name and * in WebKitContentFilterManager *filterManager  [whitespace/declaration] [3]
ERROR: Tools/MiniBrowser/wpe/main.cpp:200:  Use nullptr instead of NULL.  [readability/null] [5]
ERROR: Tools/MiniBrowser/wpe/main.cpp:201:  Declaration has space between type name and * in gchar *contentFilterJSON  [whitespace/declaration] [3]
ERROR: Tools/MiniBrowser/wpe/main.cpp:201:  Use nullptr instead of NULL.  [readability/null] [5]
ERROR: Tools/MiniBrowser/wpe/main.cpp:202:  Declaration has space between type name and * in GError *error  [whitespace/declaration] [3]
ERROR: Tools/MiniBrowser/wpe/main.cpp:202:  Use nullptr instead of NULL.  [readability/null] [5]
ERROR: Tools/MiniBrowser/wpe/main.cpp:209:  Use nullptr instead of NULL.  [readability/null] [5]
WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/wpe/WebKitUserContentManager.h"
ERROR: Tools/WebKitTestRunner/TestController.h:152:  The parameter name "test" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/WebKit/UIProcess/API/glib/WebKitUserContentPrivate.h:27:  Alphabetical sorting problem.  [build/include_order] [4]
WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/gtk/WebKitContentFilterManager.h"
Total errors found: 24 in 36 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 36 Adrian Perez 2018-12-12 15:33:19 PST
(In reply to Adrian Perez from comment #34)
> Created attachment 357177 [details]
> WIP Patch
> 
> 
> This is my current work in progress patch, which I think is already
> good enough to take a look at the API and the implementation; but there
> are a couple of things missing. I have not tested at all the WKTR bits,
> and the layout tests need unskipping, too.
> 
> I will be updating this in the coming days with the missing bits, but
> it seems like a good idea to submit already something to get the ball
> rolling :)

Also the style checker will complain, I am aware and will fix the
issues reported by it myself tomorrow -- no need for reviewers to
remind me about that ;-)
Comment 37 Michael Catanzaro 2018-12-12 19:11:31 PST
Comment on attachment 357177 [details]
WIP Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=357177&action=review

I see changes to TestController, but not to layout test expectations. Usually those go hand in hand. I assume this fixes a bunch of tests, right? Are you planning to land expectations changes separately?

What about Epiphany? I see implementation in MiniBrowser, which is maybe OK to satisfy our implementation requirement I suppose, but I'm a bit disappointed we're shoehorning advanced features into MiniBrowser when Epiphany is really the proper proving ground for the new API. We can be a lot more confident that this API is sufficient if it can successfully replace Epiphany's adblocker.

> Source/WebKit/ChangeLog:19
> +        (API::createExtension): Change usge of SharedMemory::create() to

usage

> Source/WebKit/PlatformGTK.cmake:66
> +    ${WEBKIT_DIR}/UIProcess/API/gtk/WebKitContentFilterManager.h

What about PlatformWPE.cmake?

> Source/WebKit/UIProcess/API/APIContentRuleListStore.cpp:372
> -    auto sharedMemory = WebKit::SharedMemory::create(const_cast<uint8_t*>(fileData.data()), fileData.size(), WebKit::SharedMemory::Protection::ReadOnly);
> +    auto sharedMemory = fileData.tryCreateSharedMemory();

It's quite bad for a function like SharedMemory::create to do different things on different platforms. Is this fixable? If not, we should remove it altogether to avoid the semantic mismatch.

If this is the only place the function is used, then it becomes work for this patch. If not, should still be fixed as part of this project.

> Source/WebKit/UIProcess/API/C/WKUserContentExtensionStoreRef.cpp:91
> +    toImpl(userContentExtensionStore)->compileContentRuleList(toWTFString(identifier), toWTFString(jsonSource), [=](RefPtr<API::ContentRuleList> contentRuleList, std::error_code error) {

We prefer to explicitly capture what you need, rather than use [=]

> Source/WebKit/UIProcess/API/glib/APIContentRuleListStoreGLib.cpp:38
> +String ContentRuleListStore::defaultStorePath(bool)
> +{
> +    const auto cacheDir = WebCore::FileSystem::pathByAppendingComponent(WebCore::FileSystem::stringFromFileSystemRepresentation(g_get_user_cache_dir()), g_get_prgname());
> +    return WebCore::FileSystem::pathByAppendingComponent(cacheDir, "ContentExtensions");
> +}

What's this get used for? If this is a fallback that's unlikely to be used, then fine. But it's important that, in general, we store into the base-cache-directory of a WebKitWebsiteDataManager.

> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:86
> +void webkit_content_filter_manager_compile(WebKitContentFilterManager* manager, const char* identifier, const char* jsonSource, GAsyncReadyCallback callback, void* userData)

Since this will go into docs, you should write gpointer rather than void*.

> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:111
> +WebKitUserContentFilter* webkit_content_filter_manager_compile_finish (WebKitContentFilterManager *manager, GAsyncResult *result, GError **error)

This whole line is GNOME style and style checker somehow missed it, so I will complain. Ditto for several other functions in this file. Check them!

> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:231
> +GStrv webkit_content_filter_manager_fetch_identifiers_finish(WebKitContentFilterManager* manager, GAsyncResult* result)

Use char**. GStrv is for autoptrs where that doesn't work, not APIs.

> Source/WebKit/UIProcess/API/wpe/WebKitContentFilterManager.h:126
> +WEBKIT_API GStrv

Should add more space to the functions above, so that the header is aligned better initially.

> Source/cmake/WebKitFeatures.cmake:99
> +    WEBKIT_OPTION_DEFINE(ENABLE_CONTENT_EXTENSIONS "Toggle content extensions support" PRIVATE OFF)

We're now trying to keep this in sync with WebKitFeatures.pm and all these awful .xcconfig project files so please make sure to add it there too.
Comment 38 Adrian Perez 2018-12-13 00:12:04 PST
*** Bug 154553 has been marked as a duplicate of this bug. ***
Comment 39 Adrian Perez 2018-12-13 00:12:40 PST
*** Bug 169037 has been marked as a duplicate of this bug. ***
Comment 40 Adrian Perez 2018-12-13 00:31:04 PST
Created attachment 357215 [details]
WIP Patch v2


This should make the style checker happy, get gtk-doc to finish successfully
for WPE (finishing the docs content still pending, though), updates the
FeatureList.pm module, and adds the WebKitContentFilterManager.h header
to the list of public API headers for WPE.
Comment 41 Adrian Perez 2018-12-13 10:42:55 PST
(In reply to Michael Catanzaro from comment #37)
> Comment on attachment 357177 [details]
> WIP Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=357177&action=review
> 
> I see changes to TestController, but not to layout test expectations.
> Usually those go hand in hand. I assume this fixes a bunch of tests, right?
> Are you planning to land expectations changes separately?

This is WIP, I have been neglecting the code additions to WKTR in
order to debug a release assertion and get things working. The next
version of the patch will have that working.

At least this will allow to unskip most (if not all) the tests for
content extensions; maybe others with some luck. My plan is to land
split out the WKTR changes to a separate patch, and include the
updated test expectations as part of that. Let me know if you think
that it would be needed to split things in more patches than the
two I am planning to do.

> What about Epiphany? I see implementation in MiniBrowser, which is maybe OK
> to satisfy our implementation requirement I suppose, but I'm a bit
> disappointed we're shoehorning advanced features into MiniBrowser when
> Epiphany is really the proper proving ground for the new API. We can be a
> lot more confident that this API is sufficient if it can successfully
> replace Epiphany's adblocker.

I am starting to work on that in a new Epiphany branch today, I will
comment with a link for a public version of the branch once there is
something substantial to show.

> > Source/WebKit/ChangeLog:19
> > +        (API::createExtension): Change usge of SharedMemory::create() to
> 
> usage
> 
> > Source/WebKit/PlatformGTK.cmake:66
> > +    ${WEBKIT_DIR}/UIProcess/API/gtk/WebKitContentFilterManager.h
> 
> What about PlatformWPE.cmake?

Good catch, updated in the WIP Patch v2 already.

> > Source/WebKit/UIProcess/API/APIContentRuleListStore.cpp:372
> > -    auto sharedMemory = WebKit::SharedMemory::create(const_cast<uint8_t*>(fileData.data()), fileData.size(), WebKit::SharedMemory::Protection::ReadOnly);
> > +    auto sharedMemory = fileData.tryCreateSharedMemory();
> 
> It's quite bad for a function like SharedMemory::create to do different
> things on different platforms. Is this fixable? If not, we should remove it
> altogether to avoid the semantic mismatch.

Certainly doable. We would need to decide whether to keep the name as
SharedMemory::create() or use SharedMemory::wrapMap(). Alternative, the
bare minimum I would do is removing removing the broken ::create()
implementation from SharedMemoryUnix.cpp, and guard its definition in
the header with OS(DARWIN).

My preferred solution would be renaminf SharedMemory::create() for the
Cocoa port and calling it ::wrapMap() as well. To me it looks like the
naming is wrong: what is actually done is taking a chunk of data memory
mapped from a file, and creating a SharedMemory instance which *refers*
to the existing mapping data (and not creating a new area of anonymous
memory which can be shared, which is what the Unix implementation was
wrongly doing).

We could land the cleanup for this beforehand as a separate patch.

Opinions?

> If this is the only place the function is used, then it becomes work for
> this patch. If not, should still be fixed as part of this project.
> 
> > Source/WebKit/UIProcess/API/C/WKUserContentExtensionStoreRef.cpp:91
> > +    toImpl(userContentExtensionStore)->compileContentRuleList(toWTFString(identifier), toWTFString(jsonSource), [=](RefPtr<API::ContentRuleList> contentRuleList, std::error_code error) {
> 
> We prefer to explicitly capture what you need, rather than use [=]

lambda {
    will-change: captures;
}

> > Source/WebKit/UIProcess/API/glib/APIContentRuleListStoreGLib.cpp:38
> > +String ContentRuleListStore::defaultStorePath(bool)
> > +{
> > +    const auto cacheDir = WebCore::FileSystem::pathByAppendingComponent(WebCore::FileSystem::stringFromFileSystemRepresentation(g_get_user_cache_dir()), g_get_prgname());
> > +    return WebCore::FileSystem::pathByAppendingComponent(cacheDir, "ContentExtensions");
> > +}
> 
> What's this get used for? If this is a fallback that's unlikely to be used,
> then fine. But it's important that, in general, we store into the
> base-cache-directory of a WebKitWebsiteDataManager.

I have made it so webkit_content_filter_manager_new() can take NULL as
file system path argument, and in that case the returned instance will
contain a reference to the default ContentRuleListStore, which uses the
path returned by ContentRuleListStore::defaultStorePath().

Tying this to WebKitWebsiteDataManager, while possible, feels a bit weird:
it seems wrong to instantiate the whole WebKitWebsiteDataManager to only
derive a path from its “base-data-directory” property seems like overkill.
The WebKitContentFilterManager is completely independent, which for example
would allow to instantiate it earlier to get it started in compiling rule
set files early at program bootup so it does its job (compilation/lookup
is done in a background thread) while the rest of the WebKit machinery is
brought up in the main thread.

Another option would be to never use the default ContentRuleListStore
and always require that a non-NULL path is passed when instantiating
with webkit_content_filter_manager_new(). WDYT?

> > Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:86
> > +void webkit_content_filter_manager_compile(WebKitContentFilterManager* manager, const char* identifier, const char* jsonSource, GAsyncReadyCallback callback, void* userData)
> 
> Since this will go into docs, you should write gpointer rather than void*.

ACK, good catch, I will change this as well. I suppose also gchar*? (TBH,
I get always a bit confused with the mixture of styles between public
GObject style headers and the implementation files in half-WebKit style).

> > Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:111
> > +WebKitUserContentFilter* webkit_content_filter_manager_compile_finish (WebKitContentFilterManager *manager, GAsyncResult *result, GError **error)
> 
> This whole line is GNOME style and style checker somehow missed it, so I
> will complain. Ditto for several other functions in this file. Check them!

Running it locally the style checker does signal this line as bad style.
Go figure! :-)

> > Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:231
> > +GStrv webkit_content_filter_manager_fetch_identifiers_finish(WebKitContentFilterManager* manager, GAsyncResult* result)
> 
> Use char**. GStrv is for autoptrs where that doesn't work, not APIs.

Okay.

> > Source/WebKit/UIProcess/API/wpe/WebKitContentFilterManager.h:126
> > +WEBKIT_API GStrv
> 
> Should add more space to the functions above, so that the header is aligned
> better initially.

Sure.

> > Source/cmake/WebKitFeatures.cmake:99
> > +    WEBKIT_OPTION_DEFINE(ENABLE_CONTENT_EXTENSIONS "Toggle content extensions support" PRIVATE OFF)
> 
> We're now trying to keep this in sync with WebKitFeatures.pm and all these
> awful .xcconfig project files so please make sure to add it there too.

Ah, this was FeatureList.pm in the end. Do we really need to change
the .xcconfig files? They are used only by the Cocoa (iOS/macOS) ports,
which already have the feature enabled, so I don't expect they need any
change.
Comment 42 Michael Catanzaro 2018-12-13 11:07:10 PST
(In reply to Adrian Perez from comment #41)
> At least this will allow to unskip most (if not all) the tests for
> content extensions; maybe others with some luck. My plan is to land
> split out the WKTR changes to a separate patch, and include the
> updated test expectations as part of that. Let me know if you think
> that it would be needed to split things in more patches than the
> two I am planning to do.

Seems fine.

> Certainly doable. We would need to decide whether to keep the name as
> SharedMemory::create() or use SharedMemory::wrapMap(). Alternative, the
> bare minimum I would do is removing removing the broken ::create()
> implementation from SharedMemoryUnix.cpp, and guard its definition in
> the header with OS(DARWIN).
> 
> My preferred solution would be renaminf SharedMemory::create() for the
> Cocoa port and calling it ::wrapMap() as well. To me it looks like the
> naming is wrong: what is actually done is taking a chunk of data memory
> mapped from a file, and creating a SharedMemory instance which *refers*
> to the existing mapping data (and not creating a new area of anonymous
> memory which can be shared, which is what the Unix implementation was
> wrongly doing).

I would create a new bug, and CC some Apple people (at least Alex and Youenn, and anyone who's made significant changes there recently).

> I have made it so webkit_content_filter_manager_new() can take NULL as
> file system path argument, and in that case the returned instance will
> contain a reference to the default ContentRuleListStore, which uses the
> path returned by ContentRuleListStore::defaultStorePath().
> 
> Tying this to WebKitWebsiteDataManager, while possible, feels a bit weird:
> it seems wrong to instantiate the whole WebKitWebsiteDataManager to only
> derive a path from its “base-data-directory” property seems like overkill.
> The WebKitContentFilterManager is completely independent, which for example
> would allow to instantiate it earlier to get it started in compiling rule
> set files early at program bootup so it does its job (compilation/lookup
> is done in a background thread) while the rest of the WebKit machinery is
> brought up in the main thread.
> 
> Another option would be to never use the default ContentRuleListStore
> and always require that a non-NULL path is passed when instantiating
> with webkit_content_filter_manager_new(). WDYT?

Hm, my instinct would be to tie it to WebKitWebsiteDataManager somehow. WebKit really shouldn't store any data outside base-cache-directory and base-data-directory except if the browser specifically tells it to via API call.

But there is a problem: the same WebKitUserContentManager can be used in multiple WebKitWebViews, and each WebKitWebView can have its own WebKitWebsiteDataManager. So it seems unworkable.

So I don't know. Please ask Carlos Garcia for his opinion.

> ACK, good catch, I will change this as well. I suppose also gchar*? (TBH,
> I get always a bit confused with the mixture of styles between public
> GObject style headers and the implementation files in half-WebKit style).

Yeah, gchar* too, I missed that.

> > > Source/cmake/WebKitFeatures.cmake:99
> > > +    WEBKIT_OPTION_DEFINE(ENABLE_CONTENT_EXTENSIONS "Toggle content extensions support" PRIVATE OFF)
> > 
> > We're now trying to keep this in sync with WebKitFeatures.pm and all these
> > awful .xcconfig project files so please make sure to add it there too.
> 
> Ah, this was FeatureList.pm in the end. Do we really need to change
> the .xcconfig files? They are used only by the Cocoa (iOS/macOS) ports,
> which already have the feature enabled, so I don't expect they need any
> change.

Oops, yes, I meant FeatureList.pm. We're trying to keep FeatureList.pm, WebKitFeatures.cmake, and the .xcconfigs synced up. Apple ports are currently setting the option in FeatureDefines.h, not in the .xcconfigs. I agree it's a bit annoying to make unnecessary changes to XCode project files in this patch to enable a setting that's already enabled. It could be done in a separate patch. If you don't want to do it, you could talk to Don, but I think he does want it synced up.
Comment 43 Don Olmstead 2018-12-13 11:12:48 PST
Comment on attachment 357215 [details]
WIP Patch v2

View in context: https://bugs.webkit.org/attachment.cgi?id=357215&action=review

Nits

> Source/WebKit/ChangeLog:20
> +        (API::createExtension): Change usge of SharedMemory::create() to

usge -> usage

> Source/WebKit/NetworkProcess/cache/NetworkCacheData.h:101
> +#if PLATFORM(GTK) || PLATFORM(WPE)

#if USE(GLIB)

> Source/WebKit/NetworkProcess/cache/NetworkCacheDataSoup.cpp:36
> +#if PLATFORM(GTK) || PLATFORM(WPE)

#if USE(GLIB)

> Source/WebKit/NetworkProcess/cache/NetworkCacheDataSoup.cpp:133
> +#if PLATFORM(GTK) || PLATFORM(WPE)

#if USE(GLIB)

> Source/WebKit/UIProcess/API/C/WKUserContentExtensionStoreRef.cpp:63
> +    return nullptr;

Missing a UNUSED_PARAM(path)

> Source/WebKit/UIProcess/API/C/WKUserContentExtensionStoreRef.cpp:109
> +#endif

UNUSED_PARAMs missing

> Source/WebKit/UIProcess/API/C/WKUserContentExtensionStoreRef.cpp:122
> +#endif

UNUSED_PARAMs missing
Comment 44 Don Olmstead 2018-12-13 11:45:41 PST
(In reply to Michael Catanzaro from comment #42)
> (In reply to Adrian Perez from comment #41)
> > > > Source/cmake/WebKitFeatures.cmake:99
> > > > +    WEBKIT_OPTION_DEFINE(ENABLE_CONTENT_EXTENSIONS "Toggle content extensions support" PRIVATE OFF)
> > > 
> > > We're now trying to keep this in sync with WebKitFeatures.pm and all these
> > > awful .xcconfig project files so please make sure to add it there too.
> > 
> > Ah, this was FeatureList.pm in the end. Do we really need to change
> > the .xcconfig files? They are used only by the Cocoa (iOS/macOS) ports,
> > which already have the feature enabled, so I don't expect they need any
> > change.
> 
> Oops, yes, I meant FeatureList.pm. We're trying to keep FeatureList.pm,
> WebKitFeatures.cmake, and the .xcconfigs synced up. Apple ports are
> currently setting the option in FeatureDefines.h, not in the .xcconfigs. I
> agree it's a bit annoying to make unnecessary changes to XCode project files
> in this patch to enable a setting that's already enabled. It could be done
> in a separate patch. If you don't want to do it, you could talk to Don, but
> I think he does want it synced up.

There's a larger problem with the features being defined in so many different places that needs to be addressed. If you want to leave it for me to follow up its fine. If not adding to the configs would be a nice thing to do.

https://trac.webkit.org/changeset/239068/webkit shows what to do if you want to do the .xcconfigs. There's a script to sync the files from whats in the Tools\TestWebKitAPI folder.
Comment 45 Adrian Perez 2018-12-14 13:46:41 PST
(In reply to Michael Catanzaro from comment #42)
> (In reply to Adrian Perez from comment #41)
>
> [...]
>
> > Certainly doable. We would need to decide whether to keep the name as
> > SharedMemory::create() or use SharedMemory::wrapMap(). Alternative, the
> > bare minimum I would do is removing removing the broken ::create()
> > implementation from SharedMemoryUnix.cpp, and guard its definition in
> > the header with OS(DARWIN).
> > 
> > My preferred solution would be renaminf SharedMemory::create() for the
> > Cocoa port and calling it ::wrapMap() as well. To me it looks like the
> > naming is wrong: what is actually done is taking a chunk of data memory
> > mapped from a file, and creating a SharedMemory instance which *refers*
> > to the existing mapping data (and not creating a new area of anonymous
> > memory which can be shared, which is what the Unix implementation was
> > wrongly doing).
> 
> I would create a new bug, and CC some Apple people (at least Alex and
> Youenn, and anyone who's made significant changes there recently).

Done as bug #192714 -- and I have also added Don Olmstead for ACKing
the Windows parts.
Comment 46 Adrian Perez 2018-12-19 08:26:58 PST
Yesterday I found out an issue with the TestController in the GTK+
and WPE ports which will need to be fixed before we can unskip the
content extensions layout tests (bug #192855).
Comment 47 Carlos Garcia Campos 2018-12-21 00:17:25 PST
(In reply to Michael Catanzaro from comment #42)
> > I have made it so webkit_content_filter_manager_new() can take NULL as
> > file system path argument, and in that case the returned instance will
> > contain a reference to the default ContentRuleListStore, which uses the
> > path returned by ContentRuleListStore::defaultStorePath().
> > 
> > Tying this to WebKitWebsiteDataManager, while possible, feels a bit weird:
> > it seems wrong to instantiate the whole WebKitWebsiteDataManager to only
> > derive a path from its “base-data-directory” property seems like overkill.
> > The WebKitContentFilterManager is completely independent, which for example
> > would allow to instantiate it earlier to get it started in compiling rule
> > set files early at program bootup so it does its job (compilation/lookup
> > is done in a background thread) while the rest of the WebKit machinery is
> > brought up in the main thread.
> > 
> > Another option would be to never use the default ContentRuleListStore
> > and always require that a non-NULL path is passed when instantiating
> > with webkit_content_filter_manager_new(). WDYT?
> 
> Hm, my instinct would be to tie it to WebKitWebsiteDataManager somehow.
> WebKit really shouldn't store any data outside base-cache-directory and
> base-data-directory except if the browser specifically tells it to via API
> call.
> 
> But there is a problem: the same WebKitUserContentManager can be used in
> multiple WebKitWebViews, and each WebKitWebView can have its own
> WebKitWebsiteDataManager. So it seems unworkable.
> 
> So I don't know. Please ask Carlos Garcia for his opinion.

Rules aren't actually website data, so I think it makes sense to have a separate API to define its path. Applications are the ones setting the base-cache and base-data directories in WebsiteDataManager, so they can still use the same to build their own path. I would make the path mandatory for creating a new filter manager.
Comment 48 Adrian Perez 2018-12-21 05:29:14 PST
(In reply to Carlos Garcia Campos from comment #47)
> (In reply to Michael Catanzaro from comment #42)
> > > I have made it so webkit_content_filter_manager_new() can take NULL as
> > > file system path argument, and in that case the returned instance will
> > > contain a reference to the default ContentRuleListStore, which uses the
> > > path returned by ContentRuleListStore::defaultStorePath().
> > > 
> > > Tying this to WebKitWebsiteDataManager, while possible, feels a bit weird:
> > > it seems wrong to instantiate the whole WebKitWebsiteDataManager to only
> > > derive a path from its “base-data-directory” property seems like overkill.
> > > The WebKitContentFilterManager is completely independent, which for example
> > > would allow to instantiate it earlier to get it started in compiling rule
> > > set files early at program bootup so it does its job (compilation/lookup
> > > is done in a background thread) while the rest of the WebKit machinery is
> > > brought up in the main thread.
> > > 
> > > Another option would be to never use the default ContentRuleListStore
> > > and always require that a non-NULL path is passed when instantiating
> > > with webkit_content_filter_manager_new(). WDYT?
> > 
> > Hm, my instinct would be to tie it to WebKitWebsiteDataManager somehow.
> > WebKit really shouldn't store any data outside base-cache-directory and
> > base-data-directory except if the browser specifically tells it to via API
> > call.
> > 
> > But there is a problem: the same WebKitUserContentManager can be used in
> > multiple WebKitWebViews, and each WebKitWebView can have its own
> > WebKitWebsiteDataManager. So it seems unworkable.
> > 
> > So I don't know. Please ask Carlos Garcia for his opinion.
> 
> Rules aren't actually website data, so I think it makes sense to have a
> separate API to define its path. Applications are the ones setting the
> base-cache and base-data directories in WebsiteDataManager, so they can
> still use the same to build their own path. I would make the path mandatory
> for creating a new filter manager.

Great, then I think we have an agreement about this point. I will make
the path parameter required to be non-NULL, make sure to adress the rest
of review comments, split out the WKTR and C API changes into a new bug
(which will also enable the layout tests for the WPE and GTK+ ports),
and then I think we are basically ready to roll :)
comments
Comment 49 Adrian Perez 2018-12-21 05:39:56 PST
Comment on attachment 357215 [details]
WIP Patch v2

View in context: https://bugs.webkit.org/attachment.cgi?id=357215&action=review

One more doubt, also for reviewers familiar with GLib
style APIs :)

> Source/WebKit/UIProcess/API/gtk/WebKitContentFilterManager.h:90
> +                                              const gchar                *json_source,

Another question regarding API design: I was thinking that it would be
good to include a “gssize json_source_length” parameter, in bytes; if
negative the “json_source” string is assumed to be null-terminated.

The rationale for the above would be to more easily allow loading JSON
rule sets using means other than null-terminated C strings, like for
example:

- The JSON source file to load is memory mapped. 
- Using data from a GByteArray or a GBytes object.
- Loading from GResource can return a GBytes object (in this case it
  is guaranteed that the data will be NULL-terminated by GResource, tho).
- ...and so on.

WDYT?
Comment 50 Michael Catanzaro 2018-12-21 09:39:30 PST
gchar * is for NULL-terminated UTF-8 strings only. If you want to add an API to allow passing in data that could contain NULLs or non-UTF-8 data, then it's not a string. You can use guchar * instead and make the length parameter mandatory. (Or you can use GBytes or GByteArray or whatnot; I'm not very familiar with these, though.) These APIs are annoying and harder to work with than strings, but if you need to support embedded NULLs or non-UTF-8 then it's what you need to do.
Comment 51 Adrian Perez 2018-12-22 14:58:59 PST
(In reply to Michael Catanzaro from comment #50)
> gchar * is for NULL-terminated UTF-8 strings only. If you want to add an API
> to allow passing in data that could contain NULLs or non-UTF-8 data, then
> it's not a string. You can use guchar * instead and make the length
> parameter mandatory. (Or you can use GBytes or GByteArray or whatnot; I'm
> not very familiar with these, though.) These APIs are annoying and harder to
> work with than strings, but if you need to support embedded NULLs or
> non-UTF-8 then it's what you need to do.

I was thinking of the case when an input source file is big (for example,
the EasyList rules converted to JSON are ~15 MiB), an application may want
to mmap() the file and pass that to webkit_content_filter_manager_compile()
in order to avoid allocations in user space and copying from the buffer
cache in from kernel space to a string in user space, by directly asking
the kernel to map pages in the process memory. For this to be possible,
we need to be able to pass the size of the input to the fuction, becase
memory-mapping a file does not guarantee that a '\0' is present at the
end.

(Granted: there will be an extra memory copy from the parameters passed
to the function to a WTF::String, which is unavoidable with the current
design of ContentRuleListStore -- but I would rather give the possibility
of avoiding the second memory copy as outlined above.)

Also, I have found many counter examples to the “string parameters are
always zero-terminated and lengths for them never passed around”:
g_string_append_len, g_string_chunk_insert_len, g_string_erase,
g_string_insert_len, g_string_new_len, g_string_overwrite_len,
g_strstr_len, g_compute_{checksum,hmac}_for_string... In all these
functions passing a “-1” as the length of the input string means
“the input string is null-terminated”.
Comment 52 Carlos Garcia Campos 2018-12-23 02:27:06 PST
(In reply to Adrian Perez from comment #51)
> (In reply to Michael Catanzaro from comment #50)
> > gchar * is for NULL-terminated UTF-8 strings only. If you want to add an API
> > to allow passing in data that could contain NULLs or non-UTF-8 data, then
> > it's not a string. You can use guchar * instead and make the length
> > parameter mandatory. (Or you can use GBytes or GByteArray or whatnot; I'm
> > not very familiar with these, though.) These APIs are annoying and harder to
> > work with than strings, but if you need to support embedded NULLs or
> > non-UTF-8 then it's what you need to do.
> 
> I was thinking of the case when an input source file is big (for example,
> the EasyList rules converted to JSON are ~15 MiB), an application may want
> to mmap() the file and pass that to webkit_content_filter_manager_compile()
> in order to avoid allocations in user space and copying from the buffer
> cache in from kernel space to a string in user space, by directly asking
> the kernel to map pages in the process memory. For this to be possible,
> we need to be able to pass the size of the input to the fuction, becase
> memory-mapping a file does not guarantee that a '\0' is present at the
> end.
> 
> (Granted: there will be an extra memory copy from the parameters passed
> to the function to a WTF::String, which is unavoidable with the current
> design of ContentRuleListStore -- but I would rather give the possibility
> of avoiding the second memory copy as outlined above.)
> 
> Also, I have found many counter examples to the “string parameters are
> always zero-terminated and lengths for them never passed around”:
> g_string_append_len, g_string_chunk_insert_len, g_string_erase,
> g_string_insert_len, g_string_new_len, g_string_overwrite_len,
> g_strstr_len, g_compute_{checksum,hmac}_for_string... In all these
> functions passing a “-1” as the length of the input string means
> “the input string is null-terminated”.

I would use GBytes.
Comment 53 Adrian Perez 2018-12-26 11:37:07 PST
(In reply to Carlos Garcia Campos from comment #52)
> (In reply to Adrian Perez from comment #51)
> > (In reply to Michael Catanzaro from comment #50)
> > > gchar * is for NULL-terminated UTF-8 strings only. If you want to add an API
> > > to allow passing in data that could contain NULLs or non-UTF-8 data, then
> > > it's not a string. You can use guchar * instead and make the length
> > > parameter mandatory. (Or you can use GBytes or GByteArray or whatnot; I'm
> > > not very familiar with these, though.) These APIs are annoying and harder to
> > > work with than strings, but if you need to support embedded NULLs or
> > > non-UTF-8 then it's what you need to do.
> > 
> > I was thinking of the case when an input source file is big (for example,
> > the EasyList rules converted to JSON are ~15 MiB), an application may want
> > to mmap() the file and pass that to webkit_content_filter_manager_compile()
> > in order to avoid allocations in user space and copying from the buffer
> > cache in from kernel space to a string in user space, by directly asking
> > the kernel to map pages in the process memory. For this to be possible,
> > we need to be able to pass the size of the input to the fuction, becase
> > memory-mapping a file does not guarantee that a '\0' is present at the
> > end.
> > 
> > (Granted: there will be an extra memory copy from the parameters passed
> > to the function to a WTF::String, which is unavoidable with the current
> > design of ContentRuleListStore -- but I would rather give the possibility
> > of avoiding the second memory copy as outlined above.)
> > 
> > Also, I have found many counter examples to the “string parameters are
> > always zero-terminated and lengths for them never passed around”:
> > g_string_append_len, g_string_chunk_insert_len, g_string_erase,
> > g_string_insert_len, g_string_new_len, g_string_overwrite_len,
> > g_strstr_len, g_compute_{checksum,hmac}_for_string... In all these
> > functions passing a “-1” as the length of the input string means
> > “the input string is null-terminated”.
> 
> I would use GBytes.

I think it's a good option. Should we consider *also* providing a
webkit_content_filter_manager_compile_string() which takes a null
terminated UTF-8 string (a “const char*”), wraps it into a GBytes
and calls _compile()?
Comment 54 Adrian Perez 2019-01-22 01:50:22 PST
Created attachment 359725 [details]
WIP Patch v3


This version of the patch is rebased on top of the patch for bug #193622
(which needs to be applied first, so the EWS bots will fail), and it now
includes proper API documentation, changes names of functions a bit to
make things consistent (e.g. _get_identifier instead of _get_name to get
the identifier for a filter), made the style checker happy, and addresses
most of the review comments (not all, tho).

I am still undecided on the GBytes vs. char* parameters for the JSON
filter source. How about making webkit_content_filter_manager_compile()
accept a GBytes *and* also providing a helper function
webkit_content_filter_manager_compile_string() which accepts a string
and calls the GBytes version under the hood?
Comment 55 Adrian Perez 2019-01-23 09:34:02 PST
Created attachment 359895 [details]
WIP Patch v4


Rebased on top of trunk, and added missing autocleanup
for WebKitUserContentFilter.
Comment 56 Michael Catanzaro 2019-01-23 10:15:29 PST
A lot of red EWS here still.
Comment 57 Adrian Perez 2019-01-23 13:07:49 PST
(In reply to Michael Catanzaro from comment #56)
> A lot of red EWS here still.

That is to be expected because the latest versions of the patch
which I have been posting to this bug require the patch attached
to bug #193622 to be applied first.

(If having a combined patch in this bug with all that's needed,
just let me know.)
Comment 58 Adrian Perez 2019-01-23 16:32:30 PST
Created attachment 359972 [details]
WIP Patch v5


Now using GBytes for the JSON source argument to
webkit_content_filter_manager_compile(), and added
webkit_content_filter_manager_compile_string() for
convenience.
Comment 59 Michael Catanzaro 2019-01-24 08:52:28 PST
(In reply to Adrian Perez from comment #57) 
> (If having a combined patch in this bug with all that's needed,
> just let me know.)

Sorry, of course it's not needed. Let's focus on the first patch in the other bug first.
Comment 60 Adrian Perez 2019-02-01 16:04:30 PST
Created attachment 360918 [details]
Patch v6


Rebased on top of the current patch for bug #193622 (v8);
still missing ChangeLog entries; did a couple of minor cleanups.
Comment 61 Michael Catanzaro 2019-02-03 13:58:04 PST
Comment on attachment 360918 [details]
Patch v6

View in context: https://bugs.webkit.org/attachment.cgi?id=360918&action=review

All the async APIs should take cancellable parameters. Even if the internal WebKit API isn't cancellable, you should at least fake cancellation by returning G_IO_ERROR_CANCELLED in the finish() function if the user tried to cancel it before the operation completed. It's just an antipattern to provide an async API without cancellation.

I also think we should provide webkit_content_filter_manager_compile_string_finish(), even though it won't need to do anything other than call webkit_content_filter_manager_compile_finish().

I'm glad there are lots of layout tests, but there should be API tests too. r- for this. We don't rely on TestController's C API usage to test our WPE/GTK APIs. Well, we can rely on that to test most of the functionality so we don't need complex platform-specific tests, but we should still have basic checks to make sure the WPE/GTK APIs work.

Carlos Garcia needs to approve it too.

> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:66
> +    RefPtr<API::ContentRuleListStore> store { };

It doesn't need the { }

> Source/WebKit/UIProcess/API/glib/WebKitUserContent.cpp:323
> +    _WebKitUserContentFilter(RefPtr<API::ContentRuleList> contentRuleList)

Why RefPtr<API::ContentRuleList> and not RefPtr<API::ContentRuleList>&&?

> Source/WebKit/UIProcess/API/glib/WebKitUserContent.cpp:389
> +WebKitUserContentFilter* webkitUserContentFilterFromRuleList(RefPtr<API::ContentRuleList> contentRuleList)

Why RefPtr<API::ContentRuleList> and not RefPtr<API::ContentRuleList>&&?

> Source/WebKit/UIProcess/API/gtk/WebKitAutocleanups.h:86
> +G_DEFINE_AUTOPTR_CLEANUP_FUNC (WebKitUserContentFilter, webkit_user_content_filter_unref)

You're missing an autocleanup for WebKitContentFilterManager.

> Source/WebKit/UIProcess/API/gtk/WebKitError.h:149
> +/**
> + * WebKitContentFilterError:
> + * @WEBKIT_CONTENT_FILTER_ERROR_INVALID_JSON: The JSON source for a content filter is invalid.
> + * @WEBKIT_CONTENT_FILTER_ERROR_NOT_FOUND: The requested content filter could not be found.
> + */

Since: 2.24

> Source/WebKit/UIProcess/API/wpe/WebKitError.h:134
> +/**
> + * WebKitContentFilterError:
> + * @WEBKIT_CONTENT_FILTER_ERROR_INVALID_JSON: The JSON source for a content filter is invalid.
> + * @WEBKIT_CONTENT_FILTER_ERROR_NOT_FOUND: The requested content filter could not be found.
> + */

Since: 2.24
Comment 62 Adrian Perez 2019-02-12 15:08:31 PST
(In reply to Michael Catanzaro from comment #61)
> Comment on attachment 360918 [details]
> Patch v6
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=360918&action=review
> 
> All the async APIs should take cancellable parameters. Even if the internal
> WebKit API isn't cancellable, you should at least fake cancellation by
> returning G_IO_ERROR_CANCELLED in the finish() function if the user tried to
> cancel it before the operation completed. It's just an antipattern to
> provide an async API without cancellation.

It took me a while and some reading to understand this, but you are indeed
right and I will be adding the GCancellable parameters. Thanks :)

> I also think we should provide
> webkit_content_filter_manager_compile_string_finish(), even though it won't
> need to do anything other than call
> webkit_content_filter_manager_compile_finish().

Sure, let's add it for consistency.

> I'm glad there are lots of layout tests, but there should be API tests too.
> r- for this. We don't rely on TestController's C API usage to test our
> WPE/GTK APIs. Well, we can rely on that to test most of the functionality so
> we don't need complex platform-specific tests, but we should still have
> basic checks to make sure the WPE/GTK APIs work.

So far I hadn't been adding API tests because I wanted to discuss the
new API first. I'll be adding them in an upcoming version of the patch.

> Carlos Garcia needs to approve it too.
> 
> > Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:66
> > +    RefPtr<API::ContentRuleListStore> store { };
> 
> It doesn't need the { }

Okay.

> > Source/WebKit/UIProcess/API/glib/WebKitUserContent.cpp:323
> > +    _WebKitUserContentFilter(RefPtr<API::ContentRuleList> contentRuleList)
> 
> Why RefPtr<API::ContentRuleList> and not RefPtr<API::ContentRuleList>&&?

No reason; it just didn't cross my mind to use moves here because it
being a RefPtr<> there's not really much memory copied around, only
a ref/unref cycle... but yeat, let's use a move and avoid the ref/unref
as well :)

> > Source/WebKit/UIProcess/API/glib/WebKitUserContent.cpp:389
> > +WebKitUserContentFilter* webkitUserContentFilterFromRuleList(RefPtr<API::ContentRuleList> contentRuleList)
> 
> Why RefPtr<API::ContentRuleList> and not RefPtr<API::ContentRuleList>&&?

Same as above: didn't cross my mind. I'll have this changed, too.

> > Source/WebKit/UIProcess/API/gtk/WebKitAutocleanups.h:86
> > +G_DEFINE_AUTOPTR_CLEANUP_FUNC (WebKitUserContentFilter, webkit_user_content_filter_unref)
> 
> You're missing an autocleanup for WebKitContentFilterManager.

Will be added.

> > Source/WebKit/UIProcess/API/gtk/WebKitError.h:149
> > +/**
> > + * WebKitContentFilterError:
> > + * @WEBKIT_CONTENT_FILTER_ERROR_INVALID_JSON: The JSON source for a content filter is invalid.
> > + * @WEBKIT_CONTENT_FILTER_ERROR_NOT_FOUND: The requested content filter could not be found.
> > + */
> 
> Since: 2.24

D'oh! Of course :)

> > Source/WebKit/UIProcess/API/wpe/WebKitError.h:134
> > +/**
> > + * WebKitContentFilterError:
> > + * @WEBKIT_CONTENT_FILTER_ERROR_INVALID_JSON: The JSON source for a content filter is invalid.
> > + * @WEBKIT_CONTENT_FILTER_ERROR_NOT_FOUND: The requested content filter could not be found.
> > + */
> 
> Since: 2.24

Adding!
Comment 63 Adrian Perez 2019-02-12 15:25:18 PST
Created attachment 361851 [details]
Patch v7


Addresses all the review comments by Michael; but is still missing
API tests, which I am adding next and will be posting in v8 of the
patch 8-)
Comment 64 EWS Watchlist 2019-02-12 17:34:08 PST
Comment on attachment 361851 [details]
Patch v7

Attachment 361851 [details] did not pass mac-debug-ews (mac):
Output: https://webkit-queues.webkit.org/results/11128057

New failing tests:
http/tests/inspector/network/resource-initiatorNode.html
Comment 65 EWS Watchlist 2019-02-12 17:34:10 PST
Created attachment 361875 [details]
Archive of layout-test-results from ews114 for mac-highsierra

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews114  Port: mac-highsierra  Platform: Mac OS X 10.13.6
Comment 66 Carlos Garcia Campos 2019-02-13 00:59:18 PST
Comment on attachment 361851 [details]
Patch v7

View in context: https://bugs.webkit.org/attachment.cgi?id=361851&action=review

We really need unit tests

> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:54
> + */

Since: 2.24

> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:72
> +static void webkit_content_filter_manager_class_init(WebKitContentFilterManagerClass* klass)

Remove klass

> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:78
> + * @storage_path: path where data for compiled filters will be stored on disk

(nullable) and add , or %NULL and document here or bellow in the body that passing %NULL means using using the default store path. I would make this mandatory, though.

> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:89
> +    manager->priv->store = path ? adoptRef(new API::ContentRuleListStore(String::fromUTF8(path), false)) : &API::ContentRuleListStore::defaultStore(false);

The default store depends on ContentRuleListStore::defaultStorePath() to be implement but it isn't for GLib based ports. Use FileSystem::stringFromFileSystemRepresentation() instead of String::fromUTF8().

> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:97
> + * @json_source: a string containing the rule seet in JSON format

I would just use @source

> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:120
> +    GRefPtr<GBytes> bytes = adoptGRef(g_bytes_new_static(jsonSource, strlen(jsonSource)));
> +    webkit_content_filter_manager_compile(manager, identifier, bytes.get(), cancellable, callback, userData);

I think it's easier enough for the caller to create a GBytes from a string to not need to provide this API.

> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:144
> + * webkit_content_filter_manager_compile:

I think I would use add instead of compile. The fact that the source is compiled is an implementation detail. Since we have remove and lookup, add sounds better to me.

> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:153
> + * [WebKit content extesions JSON format](https://webkit.org/blog/3476/content-blockers-first-look/).

WE should add a chapter in the docs to document the JSON format instead of including a link to a blog post. I would be annoying, for example if you are reading the docs in devhelp. I'm fine to do it in a follow up patch, but before 2.24 is out, please.

> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:173
> +    auto jsonString = String::fromUTF8(static_cast<const char*>(g_bytes_get_data(jsonSource, nullptr)) ? : "", g_bytes_get_size(jsonSource));

I think this would be easier to read if we first get the data and size with g_bytes_get_data. We don't need to null-check the result, String accepts and handles null. Another thing is, do we want to allow passing an empty source?

> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:208
> + * @identifier: a filter identifier

Since add returns the WebKitUserContentFilter, I wonder if we should receive the filter here instead of the id.

> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:291
> +            g_task_return_pointer(task.get(), webkitUserContentFilterFromRuleList(WTFMove(contentRuleList)), reinterpret_cast<GDestroyNotify>(webkit_user_content_filter_unref));

This is weird. The method is called lookup(), but the pointer won't be the same as the one returned by compile(). We should document it and probably rename this method.

> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:308
> +WebKitUserContentFilter* webkit_content_filter_manager_lookup_finish(WebKitContentFilterManager* manager, GAsyncResult* result, GError** error)

Transfer full is also weird for a lookup().

> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:343
> +        result[identifiers.size()] = nullptr;

This is already nullptr, since you used g_new0.

> Source/WebKit/UIProcess/API/glib/WebKitUserContent.cpp:376
> + * @user_content_filter.

What's this?

> Source/WebKit/UIProcess/API/glib/WebKitUserContent.cpp:386
> +    return userContentFilter->contentRuleList->name().utf8().data();

You can't do this, data() is a temporary. We should either cache the id in WebKitUserContentFilter struct and return a const char* or use g_strdup() here and return a newly allocated string. The former is usually preferred.

> Source/WebKit/UIProcess/API/glib/WebKitUserContentManager.cpp:322
> + * webkit_user_content_manager_add_filter:

Now I see why using add would be even more confusing in WebKitContentFilterManager. I think we should make it clearer that WebKitContentFilterManager is about storing in disk. Maybe Manager is not the best word in this case and Store would fit better. If we keep the Manager name then I would use store() instead of add() or compile() and retrieve() instead of lookup(). I would also consider using WebKitUserContentFilter{Manmager|Store} since it creates WebKitUserContentFilters, it looks more consistent.

> Source/WebKit/UIProcess/API/glib/WebKitUserContentManager.cpp:331
> + * Filters need to be compiled from source or loaded from disk using a
> + * #WebKitContentFilterManager.

Since this receives a WebKitUserContentFilter object, there's no other way to create such an object, no?

> Source/WebKit/UIProcess/API/glib/WebKitUserContentManager.cpp:342
> +void webkit_user_content_manager_remove_filter(WebKitUserContentManager* manager, WebKitUserContentFilter* filter)

This needs to be documented. I wonder why the build doesn't fail.

> Source/WebKit/UIProcess/API/gtk/WebKitError.h:147
> + * @WEBKIT_CONTENT_FILTER_ERROR_INVALID_JSON: The JSON source for a content filter is invalid.

I would use INVALID_SOURCE

> Tools/MiniBrowser/gtk/main.c:116
> +    { "content-filter", 0, 0, G_OPTION_ARG_STRING, &contentFilter, "JSON with content filtering rules", "FILE" },

G_OPTION_ARG_FILENAME

> Tools/MiniBrowser/gtk/main.c:568
> +        if (!g_file_get_contents(contentFilter, &contentFilterJSON, NULL, &error)) {

Use g_file_new_for_commandline_arg(), so that the passed file can be a relative path. Then you could get the GBytes directly with g_file_read + g_input_stream_read_bytes (g_file_load_bytes() would be better but requires newer glib).

> Tools/MiniBrowser/gtk/main.c:571
> +            return 1;

Do we want to make this fatal?

> Tools/MiniBrowser/gtk/main.c:575
> +        filterManager = webkit_content_filter_manager_new(filtersPath);

This is leaked.

> Tools/MiniBrowser/gtk/main.c:584
> +        compilationData.mainLoop = g_main_loop_new(NULL, FALSE);
> +        g_main_loop_run(compilationData.mainLoop);
> +        g_main_loop_unref(compilationData.mainLoop);
> +        g_object_unref(filterManager);

Why are we doing this synchronously? If this is really useful we should provide _sync() methods too.

> Tools/MiniBrowser/gtk/main.c:589
> +            return 1;

Do we want to make this fatal?
Comment 67 Adrian Perez 2019-02-13 02:16:54 PST
(In reply to Carlos Garcia Campos from comment #66)
> Comment on attachment 361851 [details]
> Patch v7
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=361851&action=review
> 
> We really need unit tests

I know, and will add them in the next version of the patch as promised =)

> > Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:54
> > + */
> 
> Since: 2.24

👍

> > Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:72
> > +static void webkit_content_filter_manager_class_init(WebKitContentFilterManagerClass* klass)
> 
> Remove klass

👍

> > Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:78
> > + * @storage_path: path where data for compiled filters will be stored on disk
> 
> (nullable) and add , or %NULL and document here or bellow in the body that
> passing %NULL means using using the default store path. I would make this
> mandatory, though.

The storage path is NOT nullable, and mandatory. Please note that in
“APIContentRuleListStoreGLib.cpp” we have:

  String ContentRuleListStore::defaultStorePath(bool)
  {
      ASSERT_NOT_REACHED();
      return String();
  }

This is because first we were thinking of deriving the path from
WebKitWebsiteDataManager; but that would not make sense in the end
because filters are not “website data” (they are something the UA
applies based on *user* preference), and anyway a applications will
know where they want to store their compiled filters so making the
path mandatory is fine.

Also, side note: an application may need (or want) to use 
WebKitContentFilterManager independently from WebKitWebsiteDataManager
(or WebKitWebContext, or any other WebKit classes). The main examples
would:

 - The application wants to start compilation of filters ASAP right
   after it starts executing, asynchronously, before having created
   any other WebKit object, and then do the rest of its startup
   sequence while filters compile in background (it can take 1-3s
   to compile a big rule set; e.g. like EasyList, which is ~9 MiB
   when converted to JSON).

 - A helper program may want to modify the on-disk filters using
   WebKitContentFilterManager without using any of the other WebKit
   classes. This would be the case of Epiphany's profile migrator.

> > Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:89
> > +    manager->priv->store = path ? adoptRef(new API::ContentRuleListStore(String::fromUTF8(path), false)) : &API::ContentRuleListStore::defaultStore(false);
> 
> The default store depends on ContentRuleListStore::defaultStorePath() to be
> implement but it isn't for GLib based ports. Use
> FileSystem::stringFromFileSystemRepresentation() instead of
> String::fromUTF8().

If we make the store path a mandatory as outlined above (and you
seem to kind of suggest that as well) then this can stay as it is.

BTW, I used to have ::defaultStorePath() implemented in an initial
version of the patch, and then after agreeing with Michael that it
is better to make the “store_path” parameter mandatory this ended
up with ASSERT_NOT_REACHED() in it ;-)

> > Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:97
> > + * @json_source: a string containing the rule seet in JSON format
> 
> I would just use @source

Okay. The documentation is clear about it having to be a JSON formatted
rule set, so calling it just “source” seems fine. I'll change this.

> > Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:120
> > +    GRefPtr<GBytes> bytes = adoptGRef(g_bytes_new_static(jsonSource, strlen(jsonSource)));
> > +    webkit_content_filter_manager_compile(manager, identifier, bytes.get(), cancellable, callback, userData);
> 
> I think it's easier enough for the caller to create a GBytes from a string
> to not need to provide this API.

No strong preference. In my head the _compile_string() is nice to
have, but it is also true that creating a GBytes from a string is
a no-brainer... let's remove it, and if we really feel like it in
the future it can always be added later.

> > Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:144
> > + * webkit_content_filter_manager_compile:
> 
> I think I would use add instead of compile. The fact that the source is
> compiled is an implementation detail. Since we have remove and lookup, add
> sounds better to me.

About this, we just had a private chat, and we agreed renaming this way:

  - WebKitContentFilterManager → WebKitContentFilterStore
  - _compile() → _save()
  - _lookup() → _load()

This makes clearer that the object manipulates things on disk (it's a 
“store”) and that one can save JSON rule sets to it and retrieve them
later. Saving or loading returns a WebKitUserContentFilter, which is
opaque and that compilation happens is an implementation detail which
the user of the class does not need to be concerned about.

> > Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:153
> > + * [WebKit content extesions JSON format](https://webkit.org/blog/3476/content-blockers-first-look/).
> 
> We should add a chapter in the docs to document the JSON format instead of
> including a link to a blog post. I would be annoying, for example if you are
> reading the docs in devhelp. I'm fine to do it in a follow up patch, but
> before 2.24 is out, please.

Definitely. I will try to find out what's the license for the documentation
about the JSON format which is available at developer.apple.com and with
some luck we may be able to copy from there. Having a page titled “WebKit
JSON content filter format” in Devhelp would be definitely handy. 

> > Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:173
> > +    auto jsonString = String::fromUTF8(static_cast<const char*>(g_bytes_get_data(jsonSource, nullptr)) ? : "", g_bytes_get_size(jsonSource));
> 
> I think this would be easier to read if we first get the data and size with
> g_bytes_get_data. We don't need to null-check the result, String accepts and
> handles null. Another thing is, do we want to allow passing an empty source?

Mmmh, probably we don't want to allow an empty source, because then
compilation of the content extension would fail anyway. I will reorder
this a bit and make it fail early.

> > Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:208
> > + * @identifier: a filter identifier
> 
> Since add returns the WebKitUserContentFilter, I wonder if we should receive
> the filter here instead of the id.

There are cases in which the identifier only can be useful. Example: you
don't know the list of filters stored, so you don't know which IDs to
lookup, yet you want to remove them anyway... If we allow listing IDs,
and lookup up a filter by ID, it seems natural to allow removing a filter
by ID.

How about having both _remove() and _remove_by_identifier()?

> > Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:291
> > +            g_task_return_pointer(task.get(), webkitUserContentFilterFromRuleList(WTFMove(contentRuleList)), reinterpret_cast<GDestroyNotify>(webkit_user_content_filter_unref));
> 
> This is weird. The method is called lookup(), but the pointer won't be the
> same as the one returned by compile(). We should document it and probably
> rename this method.

Solved by the rename from _lookup() to _load() as outlined above.

> > Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:308
> > +WebKitUserContentFilter* webkit_content_filter_manager_lookup_finish(WebKitContentFilterManager* manager, GAsyncResult* result, GError** error)
> 
> Transfer full is also weird for a lookup().

Ditto, no problem it we call it _load().

> > Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:343
> > +        result[identifiers.size()] = nullptr;
> 
> This is already nullptr, since you used g_new0.

Good point... just being used to C being funny and routinely having
to be careful of shooting oneself on the foot. I'll remove this line.

> > Source/WebKit/UIProcess/API/glib/WebKitUserContent.cpp:376
> > + * @user_content_filter.
> 
> What's this?

Leftover line fro copy paste; I'll remove it.

> > Source/WebKit/UIProcess/API/glib/WebKitUserContent.cpp:386
> > +    return userContentFilter->contentRuleList->name().utf8().data();
> 
> You can't do this, data() is a temporary. We should either cache the id in
> WebKitUserContentFilter struct and return a const char* or use g_strdup()
> here and return a newly allocated string. The former is usually preferred.

D'oh! Good catch, thanks!

> > Source/WebKit/UIProcess/API/glib/WebKitUserContentManager.cpp:322
> > + * webkit_user_content_manager_add_filter:
> 
> Now I see why using add would be even more confusing in
> WebKitContentFilterManager. I think we should make it clearer that
> WebKitContentFilterManager is about storing in disk. Maybe Manager is not
> the best word in this case and Store would fit better. If we keep the
> Manager name then I would use store() instead of add() or compile() and
> retrieve() instead of lookup(). I would also consider using
> WebKitUserContentFilter{Manmager|Store} since it creates
> WebKitUserContentFilters, it looks more consistent.

Yes, let's add “User“ to the class name, for consistency. I am not a big
fan of those VeryLongJavaStyleClassNames but we already have “User” as
a prefix for the other classes, and therefore I agree it's better to
add it for consistency. 

> > Source/WebKit/UIProcess/API/glib/WebKitUserContentManager.cpp:331
> > + * Filters need to be compiled from source or loaded from disk using a
> > + * #WebKitContentFilterManager.
> 
> Since this receives a WebKitUserContentFilter object, there's no other way
> to create such an object, no?

Correct. The only way to obtain a WebKitUserContentFilter object is
through the WebKitContentFilterManager using _compile() or _lookup().

> > Source/WebKit/UIProcess/API/glib/WebKitUserContentManager.cpp:342
> > +void webkit_user_content_manager_remove_filter(WebKitUserContentManager* manager, WebKitUserContentFilter* filter)
> 
> This needs to be documented. I wonder why the build doesn't fail.

No idea... but yes, needs documenting indeed.

> > Source/WebKit/UIProcess/API/gtk/WebKitError.h:147
> > + * @WEBKIT_CONTENT_FILTER_ERROR_INVALID_JSON: The JSON source for a content filter is invalid.
> 
> I would use INVALID_SOURCE

Sure, will change.

> > Tools/MiniBrowser/gtk/main.c:116
> > +    { "content-filter", 0, 0, G_OPTION_ARG_STRING, &contentFilter, "JSON with content filtering rules", "FILE" },
> 
> G_OPTION_ARG_FILENAME
> 
> > Tools/MiniBrowser/gtk/main.c:568
> > +        if (!g_file_get_contents(contentFilter, &contentFilterJSON, NULL, &error)) {
> 
> Use g_file_new_for_commandline_arg(), so that the passed file can be a
> relative path. Then you could get the GBytes directly with g_file_read +
> g_input_stream_read_bytes (g_file_load_bytes() would be better but requires
> newer glib).

👍

> > Tools/MiniBrowser/gtk/main.c:571
> > +            return 1;
> 
> Do we want to make this fatal?

With g_fatal()? Sure.

> > Tools/MiniBrowser/gtk/main.c:575
> > +        filterManager = webkit_content_filter_manager_new(filtersPath);
> 
> This is leaked.

Yes, I need to add an _unref() here, thanks.
 
> > Tools/MiniBrowser/gtk/main.c:584
> > +        compilationData.mainLoop = g_main_loop_new(NULL, FALSE);
> > +        g_main_loop_run(compilationData.mainLoop);
> > +        g_main_loop_unref(compilationData.mainLoop);
> > +        g_object_unref(filterManager);
> 
> Why are we doing this synchronously? If this is really useful we should
> provide _sync() methods too.

To not complicate the minibrowsers much and make sure the filter is
enabled before attempting to load any web page.

> > Tools/MiniBrowser/gtk/main.c:589
> > +            return 1;
> 
> Do we want to make this fatal?
Comment 68 Carlos Garcia Campos 2019-02-13 02:30:56 PST
Comment on attachment 361851 [details]
Patch v7

View in context: https://bugs.webkit.org/attachment.cgi?id=361851&action=review

>>> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:89
>>> +    manager->priv->store = path ? adoptRef(new API::ContentRuleListStore(String::fromUTF8(path), false)) : &API::ContentRuleListStore::defaultStore(false);
>> 
>> The default store depends on ContentRuleListStore::defaultStorePath() to be implement but it isn't for GLib based ports. Use FileSystem::stringFromFileSystemRepresentation() instead of String::fromUTF8().
> 
> If we make the store path a mandatory as outlined above (and you
> seem to kind of suggest that as well) then this can stay as it is.
> 
> BTW, I used to have ::defaultStorePath() implemented in an initial
> version of the patch, and then after agreeing with Michael that it
> is better to make the “store_path” parameter mandatory this ended
> up with ASSERT_NOT_REACHED() in it ;-)

I don't understand it, if path is mandatory, why are we null-checking it? we should g_return_val_if_fail instead. ContentRuleListStore::defaultStore() will end up using the default store patch and it will assert.

>>> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:120
>>> +    webkit_content_filter_manager_compile(manager, identifier, bytes.get(), cancellable, callback, userData);
>> 
>> I think it's easier enough for the caller to create a GBytes from a string to not need to provide this API.
> 
> No strong preference. In my head the _compile_string() is nice to
> have, but it is also true that creating a GBytes from a string is
> a no-brainer... let's remove it, and if we really feel like it in
> the future it can always be added later.

I think it would be more convenient to add compile_from_file()

>>> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:173
>>> +    auto jsonString = String::fromUTF8(static_cast<const char*>(g_bytes_get_data(jsonSource, nullptr)) ? : "", g_bytes_get_size(jsonSource));
>> 
>> I think this would be easier to read if we first get the data and size with g_bytes_get_data. We don't need to null-check the result, String accepts and handles null. Another thing is, do we want to allow passing an empty source?
> 
> Mmmh, probably we don't want to allow an empty source, because then
> compilation of the content extension would fail anyway. I will reorder
> this a bit and make it fail early.

yes, we can make it fail with invalid source, I guess

>>> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:208
>>> + * @identifier: a filter identifier
>> 
>> Since add returns the WebKitUserContentFilter, I wonder if we should receive the filter here instead of the id.
> 
> There are cases in which the identifier only can be useful. Example: you
> don't know the list of filters stored, so you don't know which IDs to
> lookup, yet you want to remove them anyway... If we allow listing IDs,
> and lookup up a filter by ID, it seems natural to allow removing a filter
> by ID.
> 
> How about having both _remove() and _remove_by_identifier()?

Nah, I guess it's fine, since this would cover both cases. I didn't consider the case of fetch_ids() + remove(). We might consider to add a clear() method to remove all the ids, too.

>>> Tools/MiniBrowser/gtk/main.c:571
>>> +            return 1;
>> 
>> Do we want to make this fatal?
> 
> With g_fatal()? Sure.

I mean the opposite, if we really want to make this fatal that. Maybe we can just show a message, but still allow MB to start.
Comment 69 Adrian Perez 2019-02-13 18:40:16 PST
(In reply to Carlos Garcia Campos from comment #68)
> Comment on attachment 361851 [details]
> Patch v7
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=361851&action=review
> 
> >>> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:89
> >>> +    manager->priv->store = path ? adoptRef(new API::ContentRuleListStore(String::fromUTF8(path), false)) : &API::ContentRuleListStore::defaultStore(false);
> >> 
> >> The default store depends on ContentRuleListStore::defaultStorePath() to be implement but it isn't for GLib based ports. Use FileSystem::stringFromFileSystemRepresentation() instead of String::fromUTF8().
> > 
> > If we make the store path a mandatory as outlined above (and you
> > seem to kind of suggest that as well) then this can stay as it is.
> > 
> > BTW, I used to have ::defaultStorePath() implemented in an initial
> > version of the patch, and then after agreeing with Michael that it
> > is better to make the “store_path” parameter mandatory this ended
> > up with ASSERT_NOT_REACHED() in it ;-)
> 
> I don't understand it, if path is mandatory, why are we null-checking it? we
> should g_return_val_if_fail instead. ContentRuleListStore::defaultStore()
> will end up using the default store patch and it will assert.

I have now a null-check with g_return_val_if_fail() for the path in
the next version of the patch.

> >>> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:120
> >>> +    webkit_content_filter_manager_compile(manager, identifier, bytes.get(), cancellable, callback, userData);
> >> 
> >> I think it's easier enough for the caller to create a GBytes from a string to not need to provide this API.
> > 
> > No strong preference. In my head the _compile_string() is nice to
> > have, but it is also true that creating a GBytes from a string is
> > a no-brainer... let's remove it, and if we really feel like it in
> > the future it can always be added later.
> 
> I think it would be more convenient to add compile_from_file()

Added for the next version of the patch, but I called it _save_from_file()
for coherency with the agreed _compile() → _save() change. Question: do
we want a separate _save_from_file_finish()? Personally I think it's not
needed and feels a bit silly to have such a function which will just
call _save_finish() internally.

> >>> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:173
> >>> +    auto jsonString = String::fromUTF8(static_cast<const char*>(g_bytes_get_data(jsonSource, nullptr)) ? : "", g_bytes_get_size(jsonSource));
> >> 
> >> I think this would be easier to read if we first get the data and size with g_bytes_get_data. We don't need to null-check the result, String accepts and handles null. Another thing is, do we want to allow passing an empty source?
> > 
> > Mmmh, probably we don't want to allow an empty source, because then
> > compilation of the content extension would fail anyway. I will reorder
> > this a bit and make it fail early.
> 
> yes, we can make it fail with invalid source, I guess

I got this done for the next version of the patch, and works fine :)

> >>> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:208
> >>> + * @identifier: a filter identifier
> >> 
> >> Since add returns the WebKitUserContentFilter, I wonder if we should receive the filter here instead of the id.
> > 
> > There are cases in which the identifier only can be useful. Example: you
> > don't know the list of filters stored, so you don't know which IDs to
> > lookup, yet you want to remove them anyway... If we allow listing IDs,
> > and lookup up a filter by ID, it seems natural to allow removing a filter
> > by ID.
> > 
> > How about having both _remove() and _remove_by_identifier()?
> 
> Nah, I guess it's fine, since this would cover both cases. I didn't consider
> the case of fetch_ids() + remove(). We might consider to add a clear()
> method to remove all the ids, too.
> 
> >>> Tools/MiniBrowser/gtk/main.c:571
> >>> +            return 1;
> >> 
> >> Do we want to make this fatal?
> > 
> > With g_fatal()? Sure.
> 
> I mean the opposite, if we really want to make this fatal that. Maybe we can
> just show a message, but still allow MB to start.

Ah, got it now. These won't be fatal errors in the next version of
the patch. An error will be printed to stderr and execution continues.
Comment 70 Adrian Perez 2019-02-13 18:56:30 PST
Created attachment 361981 [details]
Patch v8


This version of the patch includes:

 - One API test (yay!)
 - The renames:
    * WebKitContentFilterManager → WebKitUserContentFilterStore
    * _compile() → _save()
	* _lookup() → _load()
 - New webkit_user_content_filter_store_save_from_file() function.
 - Content filter errors non-fatal in the MiniBrowsers.
 - NULL-check the store path in the constructor.
 - Early error return on empty JSON source input.
 - Everything should be documented now properly.

Still pending to do before landing:

 - Updating the ChangeLogs.
 - Make API tests cleanup temporary directories.
 - Maybe some new API tests e.g. TestWebKitUserContentFilterStore.

I think this is getting quite close to be ready :)
Comment 71 Carlos Garcia Campos 2019-02-14 01:01:22 PST
Comment on attachment 361851 [details]
Patch v7

View in context: https://bugs.webkit.org/attachment.cgi?id=361851&action=review

>>>>> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:120
>>>>> +    webkit_content_filter_manager_compile(manager, identifier, bytes.get(), cancellable, callback, userData);
>>>> 
>>>> I think it's easier enough for the caller to create a GBytes from a string to not need to provide this API.
>>> 
>>> No strong preference. In my head the _compile_string() is nice to
>>> have, but it is also true that creating a GBytes from a string is
>>> a no-brainer... let's remove it, and if we really feel like it in
>>> the future it can always be added later.
>> 
>> I think it would be more convenient to add compile_from_file()
> 
> Added for the next version of the patch, but I called it _save_from_file()
> for coherency with the agreed _compile() → _save() change. Question: do
> we want a separate _save_from_file_finish()? Personally I think it's not
> needed and feels a bit silly to have such a function which will just
> call _save_finish() internally.

Well, it's the common pattern and somehow expected by API users familiar with  gio async pattern, so I guess it doesn't hurt to add it.
Comment 72 Carlos Garcia Campos 2019-02-14 01:56:45 PST
Comment on attachment 361981 [details]
Patch v8

View in context: https://bugs.webkit.org/attachment.cgi?id=361981&action=review

API looks good to me now, but we need more unit tests.

> Source/WTF/wtf/glib/GRefPtr.h:243
> +template <> WTF_EXPORT_PRIVATE GMappedFile* refGPtr(GMappedFile*);
> +template <> WTF_EXPORT_PRIVATE void derefGPtr(GMappedFile*);

This requires a changelog entry in WTF.

> Source/WebKit/UIProcess/API/glib/WebKitUserContent.cpp:324
> +        : identifier(contentRuleList->name().utf8()), contentRuleList(contentRuleList), referenceCount(1)

Use a new line for every parameter. The contentRuleList should be moved with WTFMove()

> Source/WebKit/UIProcess/API/glib/WebKitUserContent.cpp:375
> + * Obtain the identifier passed to webkit_content_filter_manager_compile()

webkit_content_filter_manager_compile no longer exists. I would avoid using function names here, because we might add more. I would save the identifier used to save a filter in a #WebKitUserContentFilterStore, or something similar.

> Source/WebKit/UIProcess/API/glib/WebKitUserContent.cpp:383
> +const char*
> +webkit_user_content_filter_get_identifier(WebKitUserContentFilter* userContentFilter)

This should be in the same line.

> Source/WebKit/UIProcess/API/glib/WebKitUserContent.cpp:389
> +WebKitUserContentFilter* webkitUserContentFilterFromRuleList(RefPtr<API::ContentRuleList>&& contentRuleList)

We normally use Create for private new functiuons like this. webkitUserContentFilterCreate() in this case.

> Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:63
> +static inline GError*
> +toGError(WebKitUserContentFilterError code, const std::error_code error)

One line here.

> Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:85
> + * The path must point to a local filesystem.

And should exist, right?

> Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:93
> +    g_return_val_if_fail(storagePath != nullptr, nullptr);

Don't compare to nullptr

> Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:95
> +    store->priv->store = adoptRef(new API::ContentRuleListStore(String::fromUTF8(storagePath), false));

Use FileSystem::stringFromFileSystemRepresentation() instead of String::fromUTF8().

> Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:100
> +static void
> +webkitUserContentFilterStoreSaveBytes(GRefPtr<GTask>&& task, String&& identifier, GRefPtr<GBytes>&& source)

One line here.

> Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:152
> +    webkitUserContentFilterStoreSaveBytes(WTFMove(task), String::fromUTF8(identifier), WTFMove(sourceBytes));

I think you can do GRefPtr<GBytes>(source) directly here instead of using a local variable + move.

> Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:201
> +    // Try mapping the file in memory first, and fall-back to reading the contents as a fallback.

fall-back as a fallback? :-)

> Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:215
> +    struct TaskData {
> +        String identifier;
> +    };

Use WEBKIT_DEFINE_ASYNC_DATA_STRUCT

> Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:220
> +    g_file_load_bytes_async(file, cancellable, [](GObject* sourceObject, GAsyncResult* result, void* userData) {

We can't use this, it was added in glib 2.56, we should use g_file_read_async + g_input_stream_read_bytes_async(). Maybe it's probably ok to limit this to local files, receiving a path instead, to make it explicit.

> Tools/MiniBrowser/gtk/main.c:483
> +static void
> +filterSavedCallback(WebKitUserContentFilterStore *store, GAsyncResult *result, FilterSaveData *data)

One line here.

> Tools/MiniBrowser/gtk/main.c:487
> +    g_assert_nonnull(store);
> +    g_assert_nonnull(result);
> +    g_assert_nonnull(data);

I would use g_assert() since this is not a unit test. Or even remove them, since they won't be NULL.

> Tools/MiniBrowser/gtk/main.c:568
> +        g_clear_pointer(&filtersPath, g_free);

filtersPath is not used again, so we don't need to set it to NULL, I think g_free is enough here.

> Tools/MiniBrowser/gtk/main.c:570
> +        webkit_user_content_filter_store_save_from_file(store, "GTKMiniBrowserFilter", contentFilterFile, NULL, (GAsyncReadyCallback) filterSavedCallback, &saveData);

(GAsyncReadyCallback) filterSavedCallback -> (GAsyncReadyCallback)filterSavedCallback

> Tools/MiniBrowser/gtk/main.c:573
> +        g_clear_object(&store);

g_object_unref().

> Tools/MiniBrowser/gtk/main.c:582
> +        g_clear_pointer(&saveData.mainLoop, g_main_loop_unref);

g_main_loop_unref()

> Tools/MiniBrowser/gtk/main.c:584
> +        g_clear_pointer(&error, g_error_free);

What is this error for?

> Tools/MiniBrowser/gtk/main.c:585
> +        g_clear_object(&contentFilterFile);

g_object_unref()

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:51
> +#if ENABLE(CONTENT_EXTENSIONS)

Now that we are exposing the API we should not allow to build with content extensions disabled

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:457
> +    *tempDir = g_dir_make_tmp(nullptr, &error.outPtr());

Use Test::dataDirectory() for any temp file.

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:479
> +    g_clear_pointer(&data.mainLoop, g_main_loop_unref);

g_main_loop_unref

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:480
> +    g_clear_object(&store);

g_object_unref

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:497
> +    GUniqueOutPtr<char> filtersDir;
> +    WebKitUserContentFilter* filter = getUserContentFilter(test->m_mainLoop, &filtersDir.outPtr());

You are not using the filtersDir

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:504
> +    g_clear_pointer(&filter, webkit_user_content_filter_unref);

webkit_user_content_filter_unref

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:505
> +}

We should add cases for every new api entry, save_from_file(), load(), remove(), fetch_ids, get_identifier(), remova_all(), etc. And we should also add cases for the possible errors.
Comment 73 Adrian Perez 2019-02-14 13:09:55 PST
(In reply to Carlos Garcia Campos from comment #71)
> Comment on attachment 361851 [details]
> Patch v7
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=361851&action=review
> 
> >>>>> Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:120
> >>>>> +    webkit_content_filter_manager_compile(manager, identifier, bytes.get(), cancellable, callback, userData);
> >>>> 
> >>>> I think it's easier enough for the caller to create a GBytes from a string to not need to provide this API.
> >>> 
> >>> No strong preference. In my head the _compile_string() is nice to
> >>> have, but it is also true that creating a GBytes from a string is
> >>> a no-brainer... let's remove it, and if we really feel like it in
> >>> the future it can always be added later.
> >> 
> >> I think it would be more convenient to add compile_from_file()
> > 
> > Added for the next version of the patch, but I called it _save_from_file()
> > for coherency with the agreed _compile() → _save() change. Question: do
> > we want a separate _save_from_file_finish()? Personally I think it's not
> > needed and feels a bit silly to have such a function which will just
> > call _save_finish() internally.
> 
> Well, it's the common pattern and somehow expected by API users familiar
> with  gio async pattern, so I guess it doesn't hurt to add it.

Sure, I'll add it :)
Comment 74 Adrian Perez 2019-02-14 17:02:30 PST
(In reply to Carlos Garcia Campos from comment #72)
> Comment on attachment 361981 [details]
> Patch v8
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=361981&action=review
> 
> API looks good to me now, but we need more unit tests.

\o/

> > Source/WTF/wtf/glib/GRefPtr.h:243
> > +template <> WTF_EXPORT_PRIVATE GMappedFile* refGPtr(GMappedFile*);
> > +template <> WTF_EXPORT_PRIVATE void derefGPtr(GMappedFile*);
> 
> This requires a changelog entry in WTF.

Aye.

> > Source/WebKit/UIProcess/API/glib/WebKitUserContent.cpp:324
> > +        : identifier(contentRuleList->name().utf8()), contentRuleList(contentRuleList), referenceCount(1)
> 
> Use a new line for every parameter. The contentRuleList should be moved with
> WTFMove()

👍

> > Source/WebKit/UIProcess/API/glib/WebKitUserContent.cpp:375
> > + * Obtain the identifier passed to webkit_content_filter_manager_compile()
> 
> webkit_content_filter_manager_compile no longer exists. I would avoid using
> function names here, because we might add more. I would save the identifier
> used to save a filter in a #WebKitUserContentFilterStore, or something
> similar.

Reworded as suggested, without mentioning any particular function.

> > Source/WebKit/UIProcess/API/glib/WebKitUserContent.cpp:383
> > +const char*
> > +webkit_user_content_filter_get_identifier(WebKitUserContentFilter* userContentFilter)
> 
> This should be in the same line.

👍

> > Source/WebKit/UIProcess/API/glib/WebKitUserContent.cpp:389
> > +WebKitUserContentFilter* webkitUserContentFilterFromRuleList(RefPtr<API::ContentRuleList>&& contentRuleList)
> 
> We normally use Create for private new functiuons like this.
> webkitUserContentFilterCreate() in this case.

👍

> > Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:63
> > +static inline GError*
> > +toGError(WebKitUserContentFilterError code, const std::error_code error)
> 
> One line here.

👍

> > Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:85
> > + * The path must point to a local filesystem.
> 
> And should exist, right?

The directories are created as needed automatically, I'll add a note about
that in the documentation comment.

> > Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:93
> > +    g_return_val_if_fail(storagePath != nullptr, nullptr);
> 
> Don't compare to nullptr

👍

> > Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:95
> > +    store->priv->store = adoptRef(new API::ContentRuleListStore(String::fromUTF8(storagePath), false));
> 
> Use FileSystem::stringFromFileSystemRepresentation() instead of
> String::fromUTF8().

👍

> > Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:100
> > +static void
> > +webkitUserContentFilterStoreSaveBytes(GRefPtr<GTask>&& task, String&& identifier, GRefPtr<GBytes>&& source)
> 
> One line here.

👍

> > Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:152
> > +    webkitUserContentFilterStoreSaveBytes(WTFMove(task), String::fromUTF8(identifier), WTFMove(sourceBytes));
> 
> I think you can do GRefPtr<GBytes>(source) directly here instead of using a
> local variable + move.

That worked, thanks for the tip!

> > Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:201
> > +    // Try mapping the file in memory first, and fall-back to reading the contents as a fallback.
> 
> fall-back as a fallback? :-)

Comment reworded, heh.

> > Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:215
> > +    struct TaskData {
> > +        String identifier;
> > +    };
> 
> Use WEBKIT_DEFINE_ASYNC_DATA_STRUCT

Neat macro, done.

> > Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:220
> > +    g_file_load_bytes_async(file, cancellable, [](GObject* sourceObject, GAsyncResult* result, void* userData) {
> 
> We can't use this, it was added in glib 2.56, we should use
> g_file_read_async + g_input_stream_read_bytes_async(). Maybe it's probably
> ok to limit this to local files, receiving a path instead, to make it
> explicit.

Instead of g_file_read_async() + g_input_stream_read_bytes_async() this is
now written in terms of g_file_load_contents_async() which seems to have
since the dawn of the GIO times -- there is no “Since:” tag in the API docs.
This gives char* buffer allocated by GLib, which is then “adopted” by a
GBytes instance created using g_bytes_new_take() -- available since 2.32
which is fine because the minimum requirement is 2.40.

Limiting to local files (or “native” as GIO calls them) seems reasonable,
so I am adding a note in the documentation and also adding a check with
“g_return_if_fail(g_file_is_native(file))” at the beginning of the function.

> > Tools/MiniBrowser/gtk/main.c:483
> > +static void
> > +filterSavedCallback(WebKitUserContentFilterStore *store, GAsyncResult *result, FilterSaveData *data)
> 
> One line here.

👍

> > Tools/MiniBrowser/gtk/main.c:487
> > +    g_assert_nonnull(store);
> > +    g_assert_nonnull(result);
> > +    g_assert_nonnull(data);
> 
> I would use g_assert() since this is not a unit test. Or even remove them,
> since they won't be NULL.

Removed.

> > Tools/MiniBrowser/gtk/main.c:568
> > +        g_clear_pointer(&filtersPath, g_free);
> 
> filtersPath is not used again, so we don't need to set it to NULL, I think
> g_free is enough here.
> 
> > Tools/MiniBrowser/gtk/main.c:570
> > +        webkit_user_content_filter_store_save_from_file(store, "GTKMiniBrowserFilter", contentFilterFile, NULL, (GAsyncReadyCallback) filterSavedCallback, &saveData);
> 
> (GAsyncReadyCallback) filterSavedCallback ->
> (GAsyncReadyCallback)filterSavedCallback
> 
> > Tools/MiniBrowser/gtk/main.c:573
> > +        g_clear_object(&store);
> 
> g_object_unref().
> 
> > Tools/MiniBrowser/gtk/main.c:582
> > +        g_clear_pointer(&saveData.mainLoop, g_main_loop_unref);
> 
> g_main_loop_unref()

Personally I prefer to use g_clear_pointer() and friends with iron-hand
discipline: one never knows when code is going to be moved around, and
NULLifying the pointers makes it much less likely to accidentally use
pointers containing garbage, and also much more convenient to get
backtraces from the debugger. Extra bonus: g_steal_pointer() \m/

Anyway, I digress... Let's use the same convention as the existing code,
so I will change all those as you suggest.

> > Tools/MiniBrowser/gtk/main.c:584
> > +        g_clear_pointer(&error, g_error_free);
> 
> What is this error for?

Nothing, IIRC it is a leftover from a WIP version of the patch.

> > Tools/MiniBrowser/gtk/main.c:585
> > +        g_clear_object(&contentFilterFile);
> 
> g_object_unref()

👍

> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:51
> > +#if ENABLE(CONTENT_EXTENSIONS)
> 
> Now that we are exposing the API we should not allow to build with content
> extensions disabled

👍

> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:457
> > +    *tempDir = g_dir_make_tmp(nullptr, &error.outPtr());
> 
> Use Test::dataDirectory() for any temp file.

Wonderful, that's exactly what I needed. Thanks a lot for pointing me to this!

> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:479
> > +    g_clear_pointer(&data.mainLoop, g_main_loop_unref);
> 
> g_main_loop_unref
> 
> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:480
> > +    g_clear_object(&store);
> 
> g_object_unref
> 
> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:497
> > +    GUniqueOutPtr<char> filtersDir;
> > +    WebKitUserContentFilter* filter = getUserContentFilter(test->m_mainLoop, &filtersDir.outPtr());
> 
> You are not using the filtersDir

Noup. Removed.

> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:504
> > +    g_clear_pointer(&filter, webkit_user_content_filter_unref);
> 
> webkit_user_content_filter_unref
> 
> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:505
> > +}
> 
> We should add cases for every new api entry, save_from_file(), load(),
> remove(), fetch_ids, get_identifier(), remova_all(), etc. And we should also
> add cases for the possible errors.

For this I want to add a new TestWebKitUserContentFilterStore test suite,
which is WIP and I will add along in the next iteration of the patch,
hopefully tomorrow.

Thanks for the review and all the good hints!
Comment 75 Adrian Perez 2019-02-14 17:05:41 PST
Created attachment 362081 [details]
Patch v9


This version of the patch addresses review comment #73.

Still pending before landing: Add a new TestWebKitUserContentFilterStore
test suite with more API tests.
Comment 76 Carlos Garcia Campos 2019-02-15 02:28:30 PST
Comment on attachment 362081 [details]
Patch v9

View in context: https://bugs.webkit.org/attachment.cgi?id=362081&action=review

cq- because we still need more unit tests

> Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:95
> +    WebKitUserContentFilterStore* store = WEBKIT_USER_CONTENT_FILTER_STORE(g_object_new(WEBKIT_TYPE_USER_CONTENT_FILTER_STORE, nullptr));
> +    store->priv->store = adoptRef(new API::ContentRuleListStore(FileSystem::stringFromFileSystemRepresentation(storagePath), false));

Sorry that I didn't notice this before, but we don't recommend to do this in public new methods, because it doesn't work for bindings that use g_object_new directly, instead of the new() public method. You should make the storage path a read-only, construct-only property and create the API::ContentRuleListStore in constructed().

> Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:220
> +    g_file_load_contents_async(file, cancellable, [](GObject* sourceObject, GAsyncResult* result, void* userData) {

When I suggested to limit this to local files was because we couldn't use load_bytes() and fallback code might be to complex, but since g_file_load_contents_async() can handle any GFile I don't see any reason to limit it local files. I'm sorry I didn't make it clearer in my previous review.

> Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:228
> +

Remove this empty line.
Comment 77 Michael Catanzaro 2019-02-15 09:21:25 PST
Comment on attachment 362081 [details]
Patch v9

View in context: https://bugs.webkit.org/attachment.cgi?id=362081&action=review

> Source/WebKit/UIProcess/API/glib/WebKitUserContentManager.cpp:348
> + * Since 2.24

Carlos, are you planning to backport it?
Comment 78 Adrian Perez 2019-02-15 11:21:52 PST
(In reply to Michael Catanzaro from comment #77)
> Comment on attachment 362081 [details]
> Patch v9
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=362081&action=review
> 
> > Source/WebKit/UIProcess/API/glib/WebKitUserContentManager.cpp:348
> > + * Since 2.24
> 
> Carlos, are you planning to backport it?

I will happily backport it myself if Carlos agrees to it :)
Comment 79 Adrian Perez 2019-02-15 11:31:22 PST
(In reply to Carlos Garcia Campos from comment #76)
> Comment on attachment 362081 [details]
> Patch v9

They're coming in v10 of the patch, finally.

> View in context:
> https://bugs.webkit.org/attachment.cgi?id=362081&action=review
> 
> cq- because we still need more unit tests
>
> > Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:95
> > +    WebKitUserContentFilterStore* store = WEBKIT_USER_CONTENT_FILTER_STORE(g_object_new(WEBKIT_TYPE_USER_CONTENT_FILTER_STORE, nullptr));
> > +    store->priv->store = adoptRef(new API::ContentRuleListStore(FileSystem::stringFromFileSystemRepresentation(storagePath), false));
> 
> Sorry that I didn't notice this before, but we don't recommend to do this in
> public new methods, because it doesn't work for bindings that use
> g_object_new directly, instead of the new() public method. You should make
> the storage path a read-only, construct-only property and create the
> API::ContentRuleListStore in constructed().

No problem, I'll add a construct-only “path” property accordingly ☺
 
> > Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:220
> > +    g_file_load_contents_async(file, cancellable, [](GObject* sourceObject, GAsyncResult* result, void* userData) {
> 
> When I suggested to limit this to local files was because we couldn't use
> load_bytes() and fallback code might be to complex, but since
> g_file_load_contents_async() can handle any GFile I don't see any reason to
> limit it local files. I'm sorry I didn't make it clearer in my previous
> review.

Great, I will remove the g_return_if_fail() check and use make the code
try using GMappedFile only if g_file_is_native() returns true (there is
no point in trying to mmap non-native files anyway).

> > Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:228
> > +
> 
> Remove this empty line.

Aye.
Comment 80 Adrian Perez 2019-02-15 16:50:43 PST
Created attachment 362184 [details]
Patch v10


This version adds the “path” construct-only and lifts the g_file_is_native()
check for webkit_user_content_filter_store_save_from_file().

In the end the additional API tests wil be in v11 of the patch.
Comment 81 Carlos Garcia Campos 2019-02-16 01:49:29 PST
(In reply to Michael Catanzaro from comment #77)
> Comment on attachment 362081 [details]
> Patch v9
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=362081&action=review
> 
> > Source/WebKit/UIProcess/API/glib/WebKitUserContentManager.cpp:348
> > + * Since 2.24
> 
> Carlos, are you planning to backport it?

Yes, I plan to make .91 release next week with this patch merged.
Comment 82 Adrian Perez 2019-02-18 14:27:28 PST
Created attachment 362334 [details]
Patch v11


Now with API tests, sould be good to land. \o/
Comment 83 Adrian Perez 2019-02-18 15:00:01 PST
(In reply to Adrian Perez from comment #82)
> Created attachment 362334 [details]
> Patch v11
> 
> 
> Now with API tests, sould be good to land. \o/

The WPE EWS failure was most likely due to the bot running out
of memory. Coincidentally, that was my bot, so I'll take a look
and see if it can be made to run a bit more robustly (it failed
with OOM a handful times in the last couple of weeks, meh).
Comment 84 Carlos Garcia Campos 2019-02-19 02:56:53 PST
Comment on attachment 362334 [details]
Patch v11

View in context: https://bugs.webkit.org/attachment.cgi?id=362334&action=review

> Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:140
> +            _("Storage directory path"),
> +            _("The directory where user content filters are stored"),

You need to add this file to POTFILES.in

> Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:172
> +    g_return_val_if_fail(store, nullptr);

WEBKIT_IS_USER_CONTENT_FILTER_STORE(store)

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:68
> +

Empty line here.

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:73
> +struct AsyncData {

We normally create a Test class, in this case derived from Test. 

class UserContentFilterStoreTest: public Test {
public:
    MAKE_GLIB_TEST_FIXTURE(UserContentFilterStoreTest);

Then you can create tests by using UserContentFilterStoreTest::add().

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:78
> +    GMainLoop* mainLoop { g_main_loop_new(nullptr, FALSE) };
> +    GError* error { nullptr };
> +    char** filterIds { nullptr };
> +    WebKitUserContentFilter* filter { nullptr };
> +    gboolean ok { FALSE };

Member variables are usually at the end, after the methods.

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:103
> +static GUniquePtr<char*> storeFetchIdentifiers(WebKitUserContentFilterStore* store)

And this could be a method of UserContentFilterStoreTest, using the main loop to make it sync. We normally return const from tests methods, and the class takes care of cleaning evertything up to make the tests easier to read.

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:115
> +static WebKitUserContentFilter* storeSaveFilter(WebKitUserContentFilterStore* store, const char* filterId, const char* source, GUniqueOutPtr<GError>& error)

Ditto.

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:130
> +static WebKitUserContentFilter* storeLoadFilter(WebKitUserContentFilterStore* store, const char* filterId, GUniqueOutPtr<GError>& error)

Ditto.

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:144
> +static bool storeRemoveFilter(WebKitUserContentFilterStore* store, const char* filterId, GUniqueOutPtr<GError>& error)

Ditto.

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:161
> +static void testEmptyStore(Test* test, gconstpointer)
> +{
> +    auto store = TEST_STORE(test);
> +    auto filterIds = storeFetchIdentifiers(store.get());

This would receive a UserContentFilterStoreTest and you would only need to do:

g_assert_cmpuint(g_strv_length(test->fetchIdentifiers()), ==, 0);

I would create the store also with a function of the class createStore() that receives the path, for example.

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:167
> +static void testSaveInvalidFilter(Test* test, gconstpointer)
> +{
> +    auto store = TEST_STORE(test);

And same here and all other tests. Also use Test::assertObjectIsDeletedWhenTestFinishes() after store is created to ensure we don't leak it.

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:202
> +        webkit_user_content_filter_unref(filter);

This would be handled by the class too.

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:207
> +        auto store = TEST_STORE(test);

Why do you re-create the store?

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:304
> +static void testSaveFilterFromFilter(Test* test, gconstpointer)

FromFilter -> FromFile I guess

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:311
> +    auto savedOk = g_file_set_contents(filePath.get(), kSimpleJSONSource, strlen(kSimpleJSONSource), &error.outPtr());
> +    g_assert_true(savedOk);

g_assert_true(g_file_set_contents(filePath.get(), kSimpleJSONSource, strlen(kSimpleJSONSource), &error.outPtr());

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:322
> +

We could check also that the id is listed, for example.

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:449
> +static WebKitUserContentFilter* getUserContentFilter(WebViewTest* test)

I would use load instead of get here.

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:452
> +    WebKitUserContentFilterStore* store = webkit_user_content_filter_store_new(filtersPath.get());

Use GRefPtr

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:458
> +    Data data { g_main_loop_ref(test->m_mainLoop), nullptr, };

You don't need to ref the main loop.

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:471
> +    g_main_loop_unref(data.mainLoop);

And you don't need to unref it here either.
Comment 85 Adrian Perez 2019-02-19 09:02:39 PST
(In reply to Carlos Garcia Campos from comment #66)

> Comment on attachment 361851 [details]
> Patch v7
>
> [...]
> 
> > Source/WebKit/UIProcess/API/glib/WebKitContentFilterManager.cpp:153
> > + * [WebKit content extesions JSON format](https://webkit.org/blog/3476/content-blockers-first-look/).
> 
> We should add a chapter in the docs to document the JSON format instead of
> including a link to a blog post. I would be annoying, for example if you are
> reading the docs in devhelp. I'm fine to do it in a follow up patch, but
> before 2.24 is out, please.

Reported bug #194818 to take care of adding documentation
for the JSON source format reference.
Comment 86 Adrian Perez 2019-02-19 13:09:16 PST
(In reply to Carlos Garcia Campos from comment #84)
> Comment on attachment 362334 [details]
> Patch v11
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=362334&action=review
> 
> > Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:140
> > +            _("Storage directory path"),
> > +            _("The directory where user content filters are stored"),
> 
> You need to add this file to POTFILES.in

Good point, thanks. I see there's only POTFILES.in for the GTK port,
so probably it's safe to assume that either the same file is used
for WPE, or no i18n is setup for WPE at all.

> > Source/WebKit/UIProcess/API/glib/WebKitUserContentFilterStore.cpp:172
> > +    g_return_val_if_fail(store, nullptr);
> 
> WEBKIT_IS_USER_CONTENT_FILTER_STORE(store)

👍

> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:68
> > +
> 
> Empty line here.

👍

> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:73
> > +struct AsyncData {
> 
> We normally create a Test class, in this case derived from Test. 
> 
> class UserContentFilterStoreTest: public Test {
> public:
>     MAKE_GLIB_TEST_FIXTURE(UserContentFilterStoreTest);
> 
> Then you can create tests by using UserContentFilterStoreTest::add().

Got it, I'll rework this as suggested.
 
> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:78
> > +    GMainLoop* mainLoop { g_main_loop_new(nullptr, FALSE) };
> > +    GError* error { nullptr };
> > +    char** filterIds { nullptr };
> > +    WebKitUserContentFilter* filter { nullptr };
> > +    gboolean ok { FALSE };
> 
> Member variables are usually at the end, after the methods.

👍

> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:103
> > +static GUniquePtr<char*> storeFetchIdentifiers(WebKitUserContentFilterStore* store)
> 
> And this could be a method of UserContentFilterStoreTest, using the main
> loop to make it sync. We normally return const from tests methods, and the
> class takes care of cleaning evertything up to make the tests easier to read.

Yes, fits well with the style of other test suites which have their
own subclasses of the “Test“ class.

> [...]
> 
> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:161
> > +static void testEmptyStore(Test* test, gconstpointer)
> > +{
> > +    auto store = TEST_STORE(test);
> > +    auto filterIds = storeFetchIdentifiers(store.get());
> 
> This would receive a UserContentFilterStoreTest and you would only need to
> do:
> 
> g_assert_cmpuint(g_strv_length(test->fetchIdentifiers()), ==, 0);
> 
> I would create the store also with a function of the class createStore()
> that receives the path, for example.
> 
> [...]

I am halfway through the conversion, and this indeed makes it easier
to follow what each test case is trying to check for, yay!

> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:167
> > +static void testSaveInvalidFilter(Test* test, gconstpointer)
> > +{
> > +    auto store = TEST_STORE(test);
> 
> And same here and all other tests. Also use
> Test::assertObjectIsDeletedWhenTestFinishes() after store is created to
> ensure we don't leak it.

Good one, thanks for the tip about assertObjectIsDeletedWhenTestFinishes().

> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:207
> > +        auto store = TEST_STORE(test);
> 
> Why do you re-create the store?

The idea is to test that filters effectively persist on disk from one
instantiation of the store to the next. It sounds as if it would be
better to explicitly test for that in a new test case function instead
of mixing it here, so I'll do that.
 
> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:304
> > +static void testSaveFilterFromFilter(Test* test, gconstpointer)
> 
> FromFilter -> FromFile I guess

Indeed!

> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:311
> > +    auto savedOk = g_file_set_contents(filePath.get(), kSimpleJSONSource, strlen(kSimpleJSONSource), &error.outPtr());
> > +    g_assert_true(savedOk);
> 
> g_assert_true(g_file_set_contents(filePath.get(), kSimpleJSONSource,
> strlen(kSimpleJSONSource), &error.outPtr());

Aye!

> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:322
> > +
> 
> We could check also that the id is listed, for example.

Will add.

> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:449
> > +static WebKitUserContentFilter* getUserContentFilter(WebViewTest* test)
> 
> I would use load instead of get here.

Ok.

> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:452
> > +    WebKitUserContentFilterStore* store = webkit_user_content_filter_store_new(filtersPath.get());
> 
> Use GRefPtr

👍

> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:458
> > +    Data data { g_main_loop_ref(test->m_mainLoop), nullptr, };
> 
> You don't need to ref the main loop.
> 
> > Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:471
> > +    g_main_loop_unref(data.mainLoop);
> 
> And you don't need to unref it here either.

This is now gone after adding the UserContentFilterStoreTest class :)
Comment 87 Adrian Perez 2019-02-19 13:31:14 PST
Created attachment 362419 [details]
Patch v12


Addresses the suggestion from comment #85. This time I am also adding cq?

=)
Comment 88 Carlos Garcia Campos 2019-02-20 00:47:39 PST
Comment on attachment 362419 [details]
Patch v12

View in context: https://bugs.webkit.org/attachment.cgi?id=362419&action=review

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentFilterStore.cpp:84
> +        assertObjectIsDeletedWhenTestFinishes(G_OBJECT(m_filterStore.get()));

This is already done in newStore().

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:452
> +    WebKitUserContentFilterStore* store = webkit_user_content_filter_store_new(filtersPath.get());

test->assertObjectIsDeletedWhenTestFinishes()

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:458
> +    Data data { g_main_loop_ref(test->m_mainLoop), nullptr, };

You don't need to ref the main loop.

> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitUserContentManager.cpp:471
> +    g_main_loop_unref(data.mainLoop);

And you don't need to unref it here either.
Comment 89 Adrian Perez 2019-02-20 01:33:47 PST
Created attachment 362488 [details]
Patch v13


Addresses the suggestion from comment #88.
Comment 90 WebKit Commit Bot 2019-02-20 02:16:06 PST
Comment on attachment 362488 [details]
Patch v13

Clearing flags on attachment: 362488

Committed r241790: <https://trac.webkit.org/changeset/241790>
Comment 91 WebKit Commit Bot 2019-02-20 02:16:10 PST
All reviewed patches have been landed.  Closing bug.
Comment 92 Carlos Garcia Campos 2019-02-21 23:00:00 PST
Comment on attachment 362488 [details]
Patch v13

View in context: https://bugs.webkit.org/attachment.cgi?id=362488&action=review

> Source/WebKit/PlatformWPE.cmake:137
>      ${WEBKIT_DIR}/UIProcess/API/wpe/WebKitUserContent.h
> +    ${WEBKIT_DIR}/UIProcess/API/gtk/WebKitUserContentFilterStore.h
>      ${WEBKIT_DIR}/UIProcess/API/wpe/WebKitUserContentManager.h

ouch